AWS Cert Notes

My AWS cert notes.

This project is maintained by jangroth


Solutions Architect Professional

10/2020 - 3/2021

Following Ultimate AWS Certified Solutions Architect Professional 2021.


Exam Objectives

Content

Domain 1: Design for Organizational Complexity

Domain 2: Design for New Solutions

Domain 3: Migration Planning

Domain 4: Cost Control

Domain 5: Continuous Improvement for Existing Solutions


Identity and Federation

Overview - possible ways to manage Identity and Federation in AWS

| Scenario | Notes |
| --- | --- |
| Identity only in AWS | Users and accounts all in AWS; simplest setup |
| AWS Organizations | In case we have multiple accounts; adds consolidated billing and compliance |
| Federation with SAML | With a SAML-compliant IdP |
| Federation without SAML | With a custom IdP (STS GetFederationToken) |
| AWS SSO | For multiple accounts within AWS Organizations and SAML |
| Web Identity Federation | Not recommended; use AWS Cognito instead |
| AWS Cognito | For most web and mobile applications; has anonymous mode & MFA |
| Active Directory on AWS: Microsoft AD | Standalone, or set up a trust with on-premises AD; has MFA, seamless domain join, RDS integration |
| Active Directory on AWS: AD Connector | Proxies requests to on-premises AD |
| Active Directory on AWS: Simple AD | Standalone & cheap AD-compatible directory; no MFA, no advanced capabilities |

Identity and Access Management (Core Topic)

Overview

AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and manage AWS users and groups, and use permissions to allow and deny their access to AWS resources.

Best practices:

Users

An entity that you create in AWS to represent the person or application that uses it to interact with AWS. A user in AWS consists of a name and credentials.

Groups

A collection of IAM users. Groups let you specify permissions for multiple users, which can make it easier to manage the permissions for those users.

Roles

An IAM identity that you can create in your account that has specific permissions. An IAM role has some similarities to an IAM user. Roles and users are both AWS identities with permissions policies that determine what the identity can and cannot do in AWS. However, instead of being uniquely associated with one person, a role is intended to be assumable by anyone who needs it. Also, a role does not have standard long-term credentials such as a password or access keys associated with it. Instead, when you assume a role, it provides you with temporary security credentials for your role session, provided by STS.

Type: AWS::IAM::Role
Properties:
  RoleName: String
  Description: String
  Path: String
  AssumeRolePolicyDocument: Json (Trust policy)
  ManagedPolicyArns:
    - String (ARN)
  PermissionsBoundary: String (ARN)
  Policies:
    - Policy
  MaxSessionDuration: Integer
  Tags:
    - Tag

Policies

You manage access in AWS by creating policies and attaching them to IAM identities (users, groups of users, or roles) or AWS resources. A policy is an object in AWS that, when associated with an identity or resource, defines their permissions. AWS evaluates these policies when an IAM principal (user or role) makes a request. Permissions in the policies determine whether the request is allowed or denied. Most policies are stored in AWS as JSON documents.

Type: AWS::IAM::Policy
Properties:
  PolicyName: String
  PolicyDocument: Json (YAML in CFN)
    Version: 2012-10-17
    Statement:
      - Effect: Allow
        Action: '*'
        Resource: '*'
      - Effect: Allow
        NotAction: 's3:DeleteBucket' # <-- allows every action except s3:DeleteBucket
        Resource: '*'
      - Effect: Deny
        NotAction: 's3:DeleteBucket' # <-- denies every action except s3:DeleteBucket
        Resource: '*'
  Groups:
    - String (!Ref)
  Roles:
    - String (!Ref)
  Users:
    - String (!Ref)
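The Allow/Deny/NotAction semantics above can be sketched in plain Python. This is a deliberately simplified model of IAM's evaluation logic (explicit Deny wins, then explicit Allow, otherwise implicit deny); function names and the single-string Action shape are illustrative, not the real evaluator:

```python
import fnmatch

def matches(statement, action):
    """Check whether a statement applies to an action, honoring Action vs NotAction."""
    if "Action" in statement:
        return fnmatch.fnmatch(action, statement["Action"])
    # NotAction: the statement applies to every action EXCEPT the listed one
    return not fnmatch.fnmatch(action, statement["NotAction"])

def is_allowed(policy, action):
    """Simplified IAM evaluation: explicit Deny > explicit Allow > implicit deny."""
    applicable = [s for s in policy["Statement"] if matches(s, action)]
    if any(s["Effect"] == "Deny" for s in applicable):
        return False
    return any(s["Effect"] == "Allow" for s in applicable)

policy = {
    "Statement": [
        {"Effect": "Allow", "NotAction": "s3:DeleteBucket", "Resource": "*"},
    ]
}
# Allow + NotAction permits every action except the listed one:
print(is_allowed(policy, "ec2:StartInstances"))  # True
print(is_allowed(policy, "s3:DeleteBucket"))     # False
```

Note how the second statement pair in the snippet above behaves differently: Deny + NotAction denies everything except s3:DeleteBucket, which still needs a separate Allow to actually work.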

Policy Conditions

Policy Variables & Tags

Identity-based vs resource-based policies

Automated Scanning

Access Advisor

Access Advisor shows the services that a certain user or role can access and when those services were last accessed. Review this data to remove unused permissions.

Access Analyzer

Makes it simple for security teams and administrators to check that their policies provide only the intended access to resources. Resource policies allow customers to granularly control who is able to access a specific resource and how they are able to use it across the entire cloud environment.

IAM Access Analyzer continuously monitors policies for changes, meaning customers no longer need to rely on intermittent manual checks in order to catch issues as policies are added or updated. Using IAM Access Analyzer, customers can proactively address any resource policies that violate their security and governance best practices around resource sharing and protect their resources from unintended access. IAM Access Analyzer delivers comprehensive, detailed findings through the AWS IAM, Amazon S3, and AWS Security Hub consoles and also through its APIs. Findings can also be exported as a report for auditing purposes. IAM Access Analyzer findings provide definitive answers of who has public and cross-account access to AWS resources from outside an account.


Security Token Service (Core Topic)

Overview

AWS Security Token Service (AWS STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users).

Whenever an IAM user assumes a role, they temporarily give up their original permissions; only the role's permissions apply for the duration of the session.

STS on AWS:

Providing access to an IAM user in another AWS account that you own

Providing access to an IAM user from a third party AWS account
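STS credentials are always temporary: every response carries an Expiration timestamp, and callers are expected to refresh before it passes. A minimal sketch of that check (the dictionary fields mirror the shape of an AssumeRole response; the 5-minute refresh window and the placeholder values are arbitrary choices for illustration):

```python
from datetime import datetime, timedelta, timezone

def needs_refresh(credentials, window_minutes=5):
    """Return True if the temporary credentials are expired or about to expire."""
    expiration = credentials["Expiration"]  # a timezone-aware datetime
    return datetime.now(timezone.utc) >= expiration - timedelta(minutes=window_minutes)

creds = {
    "AccessKeyId": "ASIA...",      # placeholder; temporary key IDs start with ASIA, not AKIA
    "SecretAccessKey": "...",      # placeholder
    "SessionToken": "...",         # placeholder; must accompany every signed request
    "Expiration": datetime.now(timezone.utc) + timedelta(hours=1),
}
print(needs_refresh(creds))  # False: still ~55 minutes of headroom
```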


Amazon Cognito (Core Topic)

Overview

Amazon Cognito provides authentication, authorization, and user management for your web and mobile apps. Your users can sign in directly with a user name and password, or through a third party such as Facebook, Amazon, Google or Apple.

The two main components of Amazon Cognito are user pools and identity pools. User pools are user directories that provide sign-up and sign-in options for your app users. Identity pools enable you to grant your users access to other AWS services. You can use identity pools and user pools separately or together.

User pools

A user pool is a user directory in Amazon Cognito. With a user pool, your users can sign in to your web or mobile app through Amazon Cognito. Your users can also sign in through social identity providers like Google, Facebook, Amazon, or Apple, and through SAML identity providers. Whether your users sign in directly or through a third party, all members of the user pool have a directory profile that you can access through a Software Development Kit (SDK).

Amazon Cognito Identity Pools (Federated Identities)

Amazon Cognito identity pools (federated identities) enable you to create unique identities for your users and federate them with identity providers. With an identity pool, you can obtain temporary, limited-privilege AWS credentials to access other AWS services.


Identity Federation (Core Topic)

Overview

OAuth 2.0 is designed only for authorization, for granting access to data and features from one application to another.

OpenID Connect is built on the OAuth 2.0 protocol and uses an additional JSON Web Token (JWT), called an ID token, to standardize areas that OAuth 2.0 leaves up to choice, such as scopes and endpoint discovery. It is specifically focused on user authentication and is widely used to enable user logins on consumer websites and mobile apps.

Security Assertion Markup Language (SAML) is an open standard for exchanging authentication and authorization data between parties, in particular, between an identity provider and a service provider. SAML is independent of OAuth, relying on an exchange of messages to authenticate in XML SAML format, as opposed to JWT. It is more commonly used to help enterprise users sign in to multiple applications using a single login.

An identity provider (IdP) is a system entity that creates, maintains, and manages identity information for principals and also provides authentication services to relying applications within a federation or distributed network.

Active Directory Federation Services (ADFS) is a Single Sign-On (SSO) solution created by Microsoft. As a component of Windows Server operating systems, it provides users with authenticated access to applications that are not capable of using Integrated Windows Authentication (IWA) through Active Directory (AD).

If you already manage user identities outside of AWS, you can use IAM identity providers instead of creating IAM users in your AWS account. With an identity provider (IdP), you can manage your user identities outside of AWS and give these external user identities permissions to use AWS resources in your account. These external users then assume an IAM role to obtain that access.

Federation can have many flavors:

SAML 2.0

Custom Identity Broker

Federating users of a mobile or web-based app with Web Identity Federation (not using Amazon Cognito)

Federating users of a mobile or web-based app with Amazon Cognito

If you create a mobile or web-based app that accesses AWS resources, the app needs security credentials in order to make programmatic requests to AWS. For most mobile application scenarios, we recommend that you use Amazon Cognito.


AWS Single Sign-On

Overview

AWS Single Sign-On is a cloud-based single sign-on (SSO) service that makes it easy to centrally manage SSO access to all of your AWS accounts and cloud applications. Specifically, it helps you manage SSO access and user permissions across all your AWS accounts in AWS Organizations. AWS SSO also helps you manage access and permissions to commonly used third-party software as a service (SaaS) applications, AWS SSO-integrated applications as well as custom applications that support Security Assertion Markup Language (SAML) 2.0. AWS SSO includes a user portal where your end-users can find and access all their assigned AWS accounts, cloud applications, and custom applications in one place.


AWS Active Directory Services (Core Topic)

Overview

Active Directory (AD) is a directory service developed by Microsoft for Windows domain networks. It is included in most Windows Server operating systems as a set of processes and services.

Active Directory Federation Services (ADFS), a software component developed by Microsoft, can run on Windows Server operating systems to provide users with single sign-on access to systems and applications located across organizational boundaries.

The Lightweight Directory Access Protocol (LDAP) is an open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol network.

AWS Managed Microsoft AD

Overview

AWS Directory Service for Microsoft Active Directory, also known as AWS Managed Microsoft AD, enables your directory-aware workloads and AWS resources to use managed Active Directory (AD) in AWS. AWS Managed Microsoft AD is built on actual Microsoft AD and does not require you to synchronize or replicate data from your existing Active Directory to the cloud. You can use the standard AD administration tools and take advantage of the built-in AD features, such as group policy and single sign-on. With AWS Managed Microsoft AD, you can easily join Amazon EC2 and Amazon RDS for SQL Server instances to your domain, and use AWS End User Computing services, such as Amazon WorkSpaces, with AD users and groups.

Integrations

Connecting to on-premises AD

Synchronizing with on-premises AD

AD Connector

Simple AD


AWS Organizations (Core Topic)

Overview

AWS Organizations offers policy-based management for multiple AWS accounts. With Organizations, you can create groups of accounts, automate account creation, and apply and manage policies for those groups. Organizations enables you to centrally manage policies across multiple accounts, without requiring custom scripts and manual processes.

Using AWS Organizations, you can create Service Control Policies (SCPs) that centrally control AWS service use across multiple AWS accounts. You can also use Organizations to help automate the creation of new accounts through APIs. Organizations helps simplify the billing for multiple accounts by enabling you to set up a single payment method for all the accounts in your organization through consolidated billing. AWS Organizations is available to all AWS customers at no additional charge.

Benefits

Multi-account strategies

Best Practices

Service Control Policies

Service control policies (SCPs) are one type of policy that you can use to manage your organization. SCPs offer central control over the maximum available permissions for all accounts in your organization, allowing you to ensure your accounts stay within your organization’s access control guidelines. SCPs are available only in an organization that has all features enabled. SCPs aren't available if your organization has enabled only the consolidated billing features. SCPs do not apply to the management account itself.
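SCPs set a ceiling rather than granting anything: a principal's effective permissions are the intersection of what its IAM policies allow and what every SCP on the account's path (root → OUs → account) allows. A toy model of that intersection (the action names and SCP contents are made up for illustration):

```python
def effective_permissions(iam_allowed, scps):
    """Effective permissions = IAM allows ∩ what every applicable SCP allows."""
    effective = set(iam_allowed)
    for scp_allowed in scps:  # every SCP from the root down to the account must allow it
        effective &= set(scp_allowed)
    return effective

iam = {"s3:GetObject", "ec2:StartInstances", "iam:CreateUser"}
scp_root = {"s3:GetObject", "ec2:StartInstances"}  # hypothetical SCP attached at the root
scp_ou = {"s3:GetObject", "iam:CreateUser"}        # hypothetical SCP attached to an OU
print(effective_permissions(iam, [scp_root, scp_ou]))  # {'s3:GetObject'}
```

An action allowed by IAM but filtered out by any SCP on the path is denied, and an SCP can never grant an action the IAM policies don't allow.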

Tag Policies

Tag policies are a type of policy that can help you standardize tags across resources in your organization's accounts. In a tag policy, you specify tagging rules applicable to resources when they are tagged.
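A tag policy essentially declares, per tag key, which values count as compliant. A minimal checker capturing that idea (the rule shape here is a simplification of the real tag-policy document format, and the keys/values are invented):

```python
def noncompliant_tags(tag_policy, resource_tags):
    """Return tag keys whose values are missing or violate the allowed values."""
    violations = {}
    for key, allowed_values in tag_policy.items():
        value = resource_tags.get(key)
        if value is None or value not in allowed_values:
            violations[key] = value
    return violations

policy = {"CostCenter": {"1001", "1002"}, "Environment": {"dev", "test", "prod"}}
tags = {"CostCenter": "1001", "Environment": "production"}  # one ok, one off-policy
print(noncompliant_tags(policy, tags))  # {'Environment': 'production'}
```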

Reserved Instances

Trusted Access

You can use trusted access to enable a supported AWS service that you specify, called the trusted service, to perform tasks in your organization and its accounts on your behalf. This involves granting permissions to the trusted service but does not otherwise affect the permissions for IAM users or roles.


AWS Resource Access Manager

Overview

AWS RAM lets you share your resources with any AWS account or through AWS Organizations. If you have multiple AWS accounts, you can create resources centrally and use AWS RAM to share those resources with other accounts.


Security

CloudTrail

Overview

AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. CloudTrail provides event history of your AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. This event history simplifies security analysis, resource change tracking, and troubleshooting. In addition, you can use CloudTrail to detect unusual activity in your AWS accounts. These capabilities help simplify operational analysis and troubleshooting.

CloudTrail is enabled by default in every account. All activities in an AWS account are recorded as CloudTrail events.

Concepts

Event

Trail

Notification options

| Target | Notes |
| --- | --- |
| SNS | Can notify SQS/Lambda from there |
| S3 | Can use bucket events from there |
| Stream into CloudWatch Logs | Can utilize metric filtering and raise alarms |
| CloudWatch Events | Fastest way; works for every API call |

KMS

Overview

AWS Key Management Service (KMS) makes it easy for you to create and manage cryptographic keys and control their use across a wide range of AWS services and in your applications. AWS KMS is a secure and resilient service that uses hardware security modules that have been validated under FIPS 140-2, or are in the process of being validated, to protect your keys. AWS KMS is integrated with AWS CloudTrail to provide you with logs of all key usage to help meet your regulatory and compliance needs.

Concepts

Keys, ownership and management responsibilities

Customer master keys (CMKs)

Customer managed CMKs

Customer managed CMKs are CMKs in your AWS account that you create, own, and manage. You have full control over these CMKs, including establishing and maintaining their key policies, IAM policies, and grants, enabling and disabling them, rotating their cryptographic material, adding tags, creating aliases that refer to the CMK, and scheduling the CMKs for deletion.

AWS managed CMKs

AWS managed CMKs are CMKs in your account that are created, managed, and used on your behalf by an AWS service that is integrated with AWS KMS. Some AWS services support only an AWS managed CMK. Others use an AWS owned CMK or offer you a choice of CMKs.

AWS owned CMKs

AWS owned CMKs are a collection of CMKs that an AWS service owns and manages for use in multiple AWS accounts. Although AWS owned CMKs are not in your AWS account, an AWS service can use its AWS owned CMKs to protect the resources in your account.


SSM Parameter Store

Overview

AWS Systems Manager Parameter Store provides secure, hierarchical storage for configuration data management and secrets management. You can store data such as passwords, database strings, Amazon Machine Image (AMI) IDs, and license codes as parameter values. You can store values as plain text or encrypted data. You can reference Systems Manager parameters in your scripts, commands, SSM documents, and configuration and automation workflows by using the unique name that you specified when you created the parameter.
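The hierarchy is just a path convention (e.g. /app/environment/name), and retrieving a whole subtree, in the spirit of the GetParametersByPath API, can be modeled as a prefix match. A sketch (parameter names and values are made up):

```python
def get_parameters_by_path(store, path, recursive=True):
    """Return all parameters whose name sits under the given path prefix."""
    prefix = path.rstrip("/") + "/"
    result = {}
    for name, value in store.items():
        if not name.startswith(prefix):
            continue
        remainder = name[len(prefix):]
        if recursive or "/" not in remainder:  # non-recursive: direct children only
            result[name] = value
    return result

store = {
    "/myapp/prod/db/password": "<encrypted>",  # SecureString values are KMS-encrypted at rest
    "/myapp/prod/db/url": "prod.db.example.com",
    "/myapp/dev/db/url": "dev.db.example.com",
}
print(get_parameters_by_path(store, "/myapp/prod"))
# {'/myapp/prod/db/password': '<encrypted>', '/myapp/prod/db/url': 'prod.db.example.com'}
```

Naming parameters this way lets one call fetch all configuration for a given app/environment pair, which is why the hierarchical convention is worth adopting from the start.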


Secrets Manager

Overview

AWS Secrets Manager helps you protect secrets needed to access your applications, services, and IT resources. The service enables you to easily rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. Users and applications retrieve secrets with a call to Secrets Manager APIs, eliminating the need to hardcode sensitive information in plain text. Secrets Manager offers secret rotation with built-in integration for Amazon RDS, Amazon Redshift, and Amazon DocumentDB. Also, the service is extensible to other types of secrets, including API keys and OAuth tokens. In addition, Secrets Manager enables you to control access to secrets using fine-grained permissions and audit secret rotation centrally for resources in the AWS Cloud, third-party services, and on-premises.


RDS Security

Overview


SSL/SNI/MITM/DNSSEC

SSL Basics

SSL Handshake

1. Client sends hello with its supported cipher suites and a client random.
2. Server responds with a server random and its SSL certificate (containing the public key).
3. Client generates a symmetric master key and sends it encrypted with the server's public key.
4. Server verifies the client's SSL certificate (optional).
5. Server decrypts the master key using its private key.
6. Secure symmetric communication is in place.

Server Name Indication (SNI)

Man-In-The-Middle (MITM)


AWS Certificate Manager

Overview

AWS Certificate Manager is a service that lets you easily provision, manage, and deploy public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS services and your internal connected resources. SSL/TLS certificates are used to secure network communications and establish the identity of websites over the Internet as well as resources on private networks. AWS Certificate Manager removes the time-consuming manual process of purchasing, uploading, and renewing SSL/TLS certificates.

With AWS Certificate Manager, you can quickly request a certificate, deploy it on ACM-integrated AWS resources, such as Elastic Load Balancers, Amazon CloudFront distributions, and APIs on API Gateway, and let AWS Certificate Manager handle certificate renewals. It also enables you to create private certificates for your internal resources and manage the certificate lifecycle centrally. Public and private certificates provisioned through AWS Certificate Manager for use with ACM-integrated services are free. You pay only for the AWS resources you create to run your application. With AWS Certificate Manager Private Certificate Authority, you pay monthly for the operation of the private CA and for the private certificates you issue.


CloudHSM

Overview

AWS CloudHSM provides hardware security modules in the AWS Cloud. A hardware security module (HSM) is a computing device that processes cryptographic operations and provides secure storage for cryptographic keys.

When you use an HSM from AWS CloudHSM, you can perform a variety of cryptographic tasks:

If you want a managed service for creating and controlling your encryption keys, but you don't want or need to operate your own HSM, consider using AWS KMS.

CloudHSM vs KMS

| Feature | KMS | CloudHSM |
| --- | --- | --- |
| Tenancy | Multi-tenant key storage | Single-tenant key storage, dedicated to one customer |
| Keys | Keys owned and managed by AWS | Customer-managed keys |
| Encryption | Symmetric and asymmetric (new) encryption | Supports both symmetric and asymmetric encryption |
| Cryptographic acceleration | None | SSL/TLS acceleration, Oracle TDE acceleration |
| Key storage and management | Accessible from multiple regions; centralized management from IAM | Deployed and managed from a customer VPC; accessible and can be shared across VPCs using VPC peering; no IAM integration on user/key level |
| Free tier availability | Yes | No |

S3 Security

S3 Encryption for objects

Encryption in transit

AWS S3 exposes:

Events in S3 buckets

S3 Access Logs:

S3 Events Notifications:

Trusted Advisor:

CloudWatch Events:

S3 Security

User-based

IAM

Resource-based

Bucket policies

S3 pre-signed URLs

Can generate pre-signed URLs using SDK or CLI
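A pre-signed URL embeds an expiry and a signature computed from the signer's credentials, so the bucket can stay private while the link works for anyone who holds it, until it expires. A toy version of that idea using stdlib HMAC (real S3 pre-signed URLs use Signature Version 4 with far more inputs; the key, host name, and query parameter names here are purely illustrative):

```python
import hashlib
import hmac
import time

SECRET = b"demo-signing-key"  # stands in for the signer's secret access key

def presign(object_key, expires_in_seconds):
    """Build a URL whose signature covers the object key and expiry time."""
    expires = int(time.time()) + expires_in_seconds
    payload = f"{object_key}:{expires}".encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"https://bucket.example.com/{object_key}?Expires={expires}&Signature={signature}"

def verify(object_key, expires, signature):
    """Recompute the signature server-side and check the expiry."""
    payload = f"{object_key}:{expires}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature) and time.time() < int(expires)

url = presign("report.pdf", expires_in_seconds=3600)
print(url)  # tampering with the key or extending Expires invalidates the signature
```

Because the expiry is part of the signed payload, a client cannot extend a link's lifetime without invalidating the signature; only someone holding the signing secret can mint a fresh URL.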

S3 WORM

S3 Object Lock


Network Security, DDOS, Shield, WAF and Firewall Manager

Network Security

Security Groups

Preventing Infrastructure Attacks

Types of attacks

DDOS Protection

AWS Shield

You can use AWS WAF web access control lists (web ACLs) to help minimize the effects of a distributed denial of service (DDoS) attack. For additional protection against DDoS attacks, AWS also provides AWS Shield Standard and AWS Shield Advanced. AWS Shield Standard is automatically included at no extra cost beyond what you already pay for AWS WAF and your other AWS services. AWS Shield Advanced provides expanded DDoS attack protection for your Amazon EC2 instances, Elastic Load Balancing load balancers, CloudFront distributions, Route 53 hosted zones, and AWS Global Accelerator accelerators. AWS Shield Advanced incurs additional charges.

AWS WAF

AWS WAF is a web application firewall that lets you monitor the HTTP and HTTPS requests that are forwarded to an Amazon CloudFront distribution, an Amazon API Gateway REST API, an Application Load Balancer, or an AWS AppSync GraphQL API. AWS WAF also lets you control access to your content. Based on conditions that you specify, such as the IP addresses that requests originate from or the values of query strings, Amazon CloudFront, Amazon API Gateway, Application Load Balancer, or AWS AppSync responds to requests either with the requested content or with an HTTP 403 status code (Forbidden). You also can configure CloudFront to return a custom error page when a request is blocked.

AWS Firewall Manager

AWS Firewall Manager simplifies your administration and maintenance tasks across multiple accounts and resources for AWS WAF, AWS Shield Advanced, Amazon VPC security groups, and AWS Network Firewall. With Firewall Manager, you set up your AWS WAF firewall rules, Shield Advanced protections, Amazon VPC security groups, and Network Firewall firewalls just once. The service automatically applies the rules and protections across your accounts and resources, even as you add new resources.

Firewall Manager provides these benefits:

Summary


Blocking IP addresses


Amazon Inspector

Overview

Amazon Inspector is an automated security assessment service that helps improve the security and compliance of applications deployed on AWS. Amazon Inspector automatically assesses applications for exposure, vulnerabilities, and deviations from best practices. After performing an assessment, Amazon Inspector produces a detailed list of security findings prioritized by level of severity. These findings can be reviewed directly or as part of detailed assessment reports which are available via the Amazon Inspector console or API.

Amazon Inspector security assessments help you check for unintended network accessibility of your Amazon EC2 instances and for vulnerabilities on those EC2 instances. Amazon Inspector assessments are offered to you as pre-defined rules packages mapped to common security best practices and vulnerability definitions. Examples of built-in rules include checking for access to your EC2 instances from the internet, remote root login being enabled, or vulnerable software versions installed. These rules are regularly updated by AWS security researchers.


Config

Overview

AWS Config is a service that enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. With Config, you can review changes in configurations and relationships between AWS resources, dive into detailed resource configuration histories, and determine your overall compliance against the configurations specified in your internal guidelines. This enables you to simplify compliance auditing, security analysis, change management, and operational troubleshooting.

Config Rules

Automation

Aggregation

An aggregator is an AWS Config resource type that collects AWS Config configuration and compliance data from the following:


AWS Managed Logs (Core Topic)

| Log type | S3 | CloudWatch Logs | Description |
| --- | --- | --- | --- |
| Load Balancer Access Logs (ALB, NLB, ELB) | yes | no | Access logs for your load balancers |
| CloudTrail Logs | yes | yes | Logs for API calls made within your account |
| VPC Flow Logs | yes | yes | Information about IP traffic going to and from network interfaces in your VPC |
| Route 53 Access Logs | no | yes | Log information about the queries that Route 53 receives |
| S3 Access Logs | yes | no | Server access logging provides detailed records for the requests that are made to a bucket |
| CloudFront Access Logs | yes | no | Detailed information about every user request that CloudFront receives |
| CloudWatch Logs | no | yes | |
| AWS Config | yes | no | Provides an inventory of your AWS resources and records changes to their configuration |

GuardDuty

Overview

Amazon GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior to protect your AWS accounts and workloads. With the cloud, the collection and aggregation of account and network activities is simplified, but it can be time consuming for security teams to continuously analyze event log data for potential threats. With GuardDuty, you now have an intelligent and cost-effective option for continuous threat detection in the AWS Cloud. The service uses machine learning, anomaly detection, and integrated threat intelligence to identify and prioritize potential threats. GuardDuty analyzes tens of billions of events across multiple AWS data sources, such as AWS CloudTrail, Amazon VPC Flow Logs, and DNS logs. With a few clicks in the AWS Management Console, GuardDuty can be enabled with no software or hardware to deploy or maintain. By integrating with Amazon CloudWatch Events, GuardDuty alerts are actionable, easy to aggregate across multiple accounts, and straightforward to push into existing event management and workflow systems.


Compute & Load Balancing

AWS Solution Architectures

Web/Internet Layer

| DNS | Static Content | Dynamic Content |
| --- | --- | --- |
| Route 53 | CloudFront | Elastic LB, API Gateway, Elastic IP |

Compute Layer

| Compute | Serverless | Other |
| --- | --- | --- |
| EC2, ASG, ECS | Lambda, Fargate | Batch, EMR |

Backend

| Caching/Session Layer | Database Layer | Decoupling/Orchestration Layer | Storage Layer | Static Assets Layer (storage) / CDN Layer |
| --- | --- | --- | --- | --- |
| ElastiCache, DAX, DynamoDB, RDS | RDS, Aurora, DynamoDB, ElasticSearch, S3, Redshift | SQS, SNS, Kinesis, Amazon MQ, Step Functions | EBS, EFS, Instance Store | S3, Glacier |

EC2

Overview

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers. Amazon EC2’s simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon’s proven computing environment. Amazon EC2 offers the broadest and deepest compute platform with choice of processor, storage, networking, operating system, and purchase model. We offer the fastest processors in the cloud and we are the only cloud with 400 Gbps ethernet networking. We have the most powerful GPU instances for machine learning training and graphics workloads, as well as the lowest cost-per-inference instances in the cloud.

Instance Types

| Family | Mnemonic | Description |
| --- | --- | --- |
| F | FPGA | Can be reprogrammed on the fly and be tuned for specific applications, making them faster than traditional CPU/GPU combinations |
| I(*) | IOPS | (NVMe) SSD-backed instance storage optimized for low latency |
| G(*) | Graphics | GPU optimized |
| H | High disk throughput | HDD-based local storage |
| T | Cheap general purpose | Balance of compute, memory and networking; burstable |
| D | Density | Lowest price per disk throughput performance |
| R(*) | RAM | Lowest price for memory performance |
| M(*) | Main choice for general purpose apps | Balance of compute, memory and networking (think: medium) |
| C(*) | Compute | Lowest price for compute performance |
| P | Graphics (pics) | GPU optimized |
| X | eXtreme memory | Lowest price for memory performance |

(*) - main types

Placement Groups

| Strategy | Description | Pro | Con |
| --- | --- | --- | --- |
| Cluster | Oldest/original placement group; only certain instance types can be launched into it; should use instances with enhanced networking | Great network (10 Gbps bandwidth between instances) | If the rack fails, all instances fail at the same time |
| Spread | Opposite of the clustered placement group: spreads instances across underlying hardware, so EC2 instances are on different physical hardware; can span Availability Zones (AZs) | Minimizes risk; reduced risk of simultaneous failure | Limited to 7 instances per AZ per placement group |
| Partition | Spreads instances across many partitions (different sets of racks) within one AZ; groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions | Increased resilience; typically used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka | ./. |

On AWS

Elastic Fabric Adapter

Elastic Fabric Adapter (EFA) is a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS. Its custom-built operating system (OS) bypass hardware interface enhances the performance of inter-instance communications, which is critical to scaling these applications.

Key metrics for EC2

| Metric | Effect |
| --- | --- |
| CPUUtilization | The total CPU resources utilized within an instance at a given time |
| DiskReadOps, DiskWriteOps | The number of read (write) operations performed on all instance store volumes; applicable for instance store-backed AMI instances |
| DiskReadBytes, DiskWriteBytes | The number of bytes read (written) on all instance store volumes; applicable for instance store-backed AMI instances |
| NetworkIn, NetworkOut | The number of bytes received (sent) on all network interfaces by the instance |
| NetworkPacketsIn, NetworkPacketsOut | The number of packets received (sent) on all network interfaces by the instance |
| StatusCheckFailed, StatusCheckFailed_Instance, StatusCheckFailed_System | Reports whether the instance has passed the corresponding (both/instance/system) status check in the last minute |

EC2 Instance Recovery

You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers the instance if it becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair (StatusCheckFailed_System). Terminated instances cannot be recovered. A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata. If the impaired instance is in a placement group, the recovered instance runs in the placement group.

If your instance has a public IPv4 address, it retains the public IPv4 address after recovery.
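The recovery alarm described above can be sketched as the parameter set you would pass to CloudWatch's `put_metric_alarm` call (parameter names follow the boto3 API; the instance ID and region below are made-up placeholders):

```python
# Sketch of the CloudWatch alarm parameters that trigger EC2 instance recovery.
# The special "automate" action ARN performs the recovery; we alarm on
# StatusCheckFailed_System (hardware/AWS-side failures), not the instance check.
def recovery_alarm_params(instance_id: str, region: str) -> dict:
    return {
        "AlarmName": f"recover-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,                 # evaluate every minute
        "EvaluationPeriods": 2,       # two consecutive failed minutes
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }

params = recovery_alarm_params("i-0123456789abcdef0", "us-east-1")
```

In practice this dict would be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`.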


Auto Scaling (Core Topic)

Overview

Amazon EC2 Auto Scaling helps you ensure that you have the correct number of Amazon EC2 instances available to handle the load for your application. You create collections of EC2 instances, called Auto Scaling Groups. You can specify the minimum number of instances in each Auto Scaling Group, and Amazon EC2 Auto Scaling ensures that your group never goes below this size. You can specify the maximum number of instances in each Auto Scaling Group, and Amazon EC2 Auto Scaling ensures that your group never goes above this size. If you specify the desired capacity, either when you create the group or at any time thereafter, Amazon EC2 Auto Scaling ensures that your group has this many instances. If you specify scaling policies, then Amazon EC2 Auto Scaling can launch or terminate instances as demand on your application increases or decreases.
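The min/desired/max relationship above boils down to a clamp, sketched here as a minimal illustration (not the actual Auto Scaling implementation):

```python
# The ASG never runs fewer than min_size or more than max_size instances:
# the effective capacity is the desired value clamped to that range.
def effective_capacity(desired: int, min_size: int, max_size: int) -> int:
    return max(min_size, min(desired, max_size))

# A scaling policy asking for 12 instances in a group capped at 10:
assert effective_capacity(12, min_size=2, max_size=10) == 10
# Scale-in below the minimum is also prevented:
assert effective_capacity(0, min_size=2, max_size=10) == 2
```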

Components

Auto Scaling Group

Launch Configuration

Launch Template

Termination Policy

| Policy | Use case |
|---|---|
| Default | Designed to help ensure that your instances span Availability Zones evenly for high availability. Within the chosen AZ, terminates the instance with the oldest launch configuration, then the one closest to the next billing hour, then picks at random. |
| OldestInstance | Useful when upgrading to a new EC2 instance type |
| NewestInstance | Useful when testing a new launch configuration |
| OldestLaunchConfiguration | Useful when updating a group and phasing out instances |
| OldestLaunchTemplate | Useful when you're updating a group and phasing out the instances from a previous configuration |
| ClosestToNextInstanceHour | Terminates the instance closest to the next billing hour - useful to maximize instance use |
| AllocationStrategy | Useful when preferred instance types have changed |

Scaling Processes

You can suspend and then resume one or more of the scaling processes for your Auto Scaling Group. This can be useful for investigating a configuration problem or other issues with your web application and making changes to your application without invoking the scaling processes.

| Process | Function | Impact On Suspension |
|---|---|---|
| Launch | Adds a new EC2 instance to the group, increasing its capacity | Disrupts other processes, as there is no more scale out |
| Terminate | Removes an EC2 instance from the group, decreasing its capacity | Disrupts other processes, as there is no more scale in |
| HealthCheck | Checks the health of the instances | . |
| ReplaceUnhealthy | Terminates unhealthy instances and re-creates them | . |
| AZRebalance | Balances the number of EC2 instances across AZs | . |
| AlarmNotification | Accepts notifications from CloudWatch | Suspends actions normally triggered by alarms |
| ScheduledAction | Performs scheduled actions that you create | . |
| AddToLoadBalancer | Adds instances to the load balancer or target group | Will not automatically add instances later |

Deploying with ASGs


ECS (Core Topic)

Overview

Amazon Elastic Container Service (Amazon ECS) is a highly scalable, fast, container management service that makes it easy to run, stop, and manage Docker containers on a cluster. You can host your cluster on a serverless infrastructure that is managed by Amazon ECS by launching your services or tasks using the Fargate launch type. For more control you can host your tasks on a cluster of Amazon Elastic Compute Cloud (Amazon EC2) instances that you manage by using the EC2 launch type.

Benefits

Components

[Cluster
  [Services
    [Task Definitions
      [Family]
      [Task role/execution role]
      [Network mode]
      [Container Definitions
        [Name/Image]
        [Memory/Port Mappings]
        [Health Check]
        [Environment]
        [Network Settings]
        [Storage and Logging]
        [Security]
        [Resource Limit]
        [Docker labels]
      ]
    ]
  ]
]

Auto Scaling

Logging

Load Balancing

Application Load Balancer (ALB) has a direct integration feature with ECS called “dynamic port mapping”

Security


Fargate

Overview

AWS Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service (ECS) and Amazon Elastic Kubernetes Service (EKS). Fargate makes it easy for you to focus on building your applications. Fargate removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.

Fargate allocates the right amount of compute, eliminating the need to choose instances and scale cluster capacity. You only pay for the resources required to run your containers, so there is no over-provisioning and paying for additional servers. Fargate runs each task or pod in its own kernel providing the tasks and pods their own isolated compute environment. This enables your application to have workload isolation and improved security by design. This is why customers such as Vanguard, Accenture, Foursquare, and Ancestry have chosen to run their mission critical applications on Fargate.


Lambda (Core Topic)

Overview

AWS Lambda lets you run code without provisioning or managing servers. You pay only for the compute time you consume - there is no charge when your code is not running. With Lambda, you can run code for virtually any type of application or backend service - all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can set up your code to automatically trigger from other AWS services or call it directly from any web or mobile app.

Security/IAM

Managing Functions

triggers -> function & layers -> destinations

Versions

Aliases

Layers

Network

Database

Invoking Functions

Synchronous/Asynchronous/Event Source Invocation

Function Scaling

Logging, Monitoring and Troubleshooting

Connecting Lambdas to a VPC

You can configure a Lambda function to connect to private subnets in a VPC. Use VPC to create a private network for resources such as databases, cache instances, or internal services. Connect your function to the VPC to access private resources while the function is running.

Limits and Latencies

| . | Limit |
|---|---|
| RAM | 128 MB to 10 GB |
| CPU | Linked to RAM (cannot be set manually); 2 vCPUs are allocated after 1.5 GB of RAM |
| Timeout | Up to 15 minutes |
| /tmp storage | 512 MB (can't process BIG files) |
| Deployment package | 250 MB including layers |
| Concurrent executions | 1,000 - soft limit that can be increased |

| . | Latency |
|---|---|
| Cold Lambda invocation | ~100 ms (the “provisioned concurrency” feature from Dec 2019 reduces the number of cold starts) |
| Warm Lambda invocation | ~ms |
| API Gateway invocation | 100 ms |
| CloudFront invocation | 100 ms |
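Since CPU and billing both scale with RAM, a rough invocation cost can be estimated from GB-seconds. A hedged sketch - the prices below are the long-standing us-east-1 list prices and may have changed, so treat them as placeholders:

```python
# Lambda bills per request plus GB-seconds of compute time.
PRICE_PER_REQUEST = 0.20 / 1_000_000     # assumed: $0.20 per 1M requests
PRICE_PER_GB_SECOND = 0.0000166667       # assumed list price, verify current

def lambda_monthly_cost(invocations: int, avg_duration_ms: int, memory_mb: int) -> float:
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

# 1M invocations x 100 ms at 1 GB -> ~$1.87/month
cost = lambda_monthly_cost(1_000_000, avg_duration_ms=100, memory_mb=1024)
```

Doubling the memory doubles the compute cost, but often also halves the duration - so it is not automatically more expensive.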

Elastic Load Balancing (Core Topic)

Overview

Elastic Load Balancing automatically distributes incoming application traffic across multiple targets, such as Amazon EC2 instances, containers, IP addresses, Lambda functions, and virtual appliances. It can handle the varying load of your application traffic in a single Availability Zone or across multiple Availability Zones. Elastic Load Balancing offers four types of load balancers that all feature the high availability, automatic scaling, and robust security necessary to make your applications fault tolerant.

| . | ALB (2016) | NLB (2017) | ELB (2009) |
|---|---|---|---|
| Name | Application Load Balancer | Network Load Balancer | Classic Load Balancer |
| Layer | 7 (application layer) | 4 (transport layer) | EC2-classic network (deprecated) |
| Protocol | HTTP, HTTPS | TCP, TLS (secure TCP), UDP | TCP, SSL, HTTP, HTTPS |

ELB (Classic Load Balancer)

| Listener | Internal (to EC2) |
|---|---|
| HTTP (L7) | HTTP, or HTTPS (must install certificate on EC2) |
| HTTPS (L7) - SSL termination, must install certificate on CLB | HTTP, or HTTPS (must install certificate on EC2) |
| TCP (L4) | TCP, or SSL (must install certificate on EC2) |
| SSL secure TCP (L4) - must install certificate on CLB | TCP, or SSL (must install certificate on EC2) |

ALB

Overview

An Application Load Balancer functions at the application layer, the seventh layer of the Open Systems Interconnection (OSI) model. After the load balancer receives a request, it evaluates the listener rules in priority order to determine which rule to apply, and then selects a target from the target group for the rule action. You can configure listener rules to route requests to different target groups based on the content of the application traffic. Routing is performed independently for each target group, even when a target is registered with multiple target groups. You can configure the routing algorithm used at the target group level. The default routing algorithm is round robin; alternatively, you can specify the least outstanding requests routing algorithm.

Routing

Target Groups

SSL Certificates

NLB

Overview

A Network Load Balancer functions at the fourth layer of the Open Systems Interconnection (OSI) model. It can handle millions of requests per second. After the load balancer receives a connection request, it selects a target from the target group for the default rule. It attempts to open a TCP connection to the selected target on the port specified in the listener configuration. When you enable an Availability Zone for the load balancer, Elastic Load Balancing creates a load balancer node in the Availability Zone. By default, each load balancer node distributes traffic across the registered targets in its Availability Zone only. If you enable cross-zone load balancing, each load balancer node distributes traffic across the registered targets in all enabled Availability Zones.

Target Groups

Each target group is used to route requests to one or more registered targets. When you create a listener, you specify a target group for its default action. Traffic is forwarded to the target group specified in the listener rule. You can create different target groups for different types of requests. For example, create one target group for general requests and other target groups for requests to the microservices for your application.

You define health check settings for your load balancer on a per target group basis. Each target group uses the default health check settings, unless you override them when you create the target group or modify them later on. After you specify a target group in a rule for a listener, the load balancer continually monitors the health of all targets registered with the target group that are in an Availability Zone enabled for the load balancer. The load balancer routes requests to the registered targets that are healthy.

Proxy protocol

Network Load Balancers use proxy protocol version 2 to send additional connection information such as the source and destination. Proxy protocol version 2 provides a binary encoding of the proxy protocol header. The load balancer prepends a proxy protocol header to the TCP data. It does not discard or overwrite any existing data, including any proxy protocol headers sent by the client or any other proxies, load balancers, or servers in the network path. Therefore, it is possible to receive more than one proxy protocol header. Also, if there is another network path to your targets outside of your Network Load Balancer, the first proxy protocol header might not be the one from your Network Load Balancer.

Cross-Zone Load Balancing

| Type | Default | Costs |
|---|---|---|
| Classic | Disabled | No charges for inter-AZ data |
| ALB | Always on | No charges for inter-AZ data |
| NLB | Disabled | Charges for inter-AZ data |

Load Balancer Stickiness


API Gateway

Overview

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. With a few clicks in the AWS Management Console, you can create REST and WebSocket APIs that act as a “front door” for applications to access data, business logic, or functionality from your backend services, such as workloads running on Amazon Elastic Compute Cloud (Amazon EC2), code running on AWS Lambda, any web application, or real-time communication applications.

Limits

| . | . |
|---|---|
| Timeout | 29 sec |
| Max payload size | 10 MB |

Concepts

Endpoint

A hostname for an API in API Gateway that is deployed to a specific region. The hostname is of the form {api-id}.execute-api.{region}.amazonaws.com.
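The hostname format above, plus the stage (introduced below), make up the invoke URL. A small sketch of how the pieces compose (the api-id is a made-up placeholder):

```python
# Default execute-api invoke URL: the deployed stage is appended as the
# first path segment after the regional hostname.
def invoke_url(api_id: str, region: str, stage: str) -> str:
    return f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}"

url = invoke_url("a1b2c3d4e5", "us-east-1", "prod")
```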

The following types of API endpoints are supported:

Stage

A logical reference to a lifecycle state of your REST or WebSocket API (for example, dev, prod, beta, v2). API changes are deployed to stages.

Deployment

After creating your API, you must deploy it to make it callable by your users. To deploy an API, you create an API deployment and associate it with a stage. Each stage is a snapshot of the API and is made available for client apps to call.

Canary Deployments

Integration

Mapping Template

Model

A data schema specifying the data structure of a request or response payload.

Throttling

Caching API responses

Errors

| Code | Side | Meaning |
|---|---|---|
| 400 | Client | Bad Request |
| 403 | Client | Access Denied, WAF filtered |
| 429 | Client | Quota exceeded, Throttle |
| 502 | Server | Bad Gateway Exception - usually an incompatible output returned from a Lambda proxy integration backend, occasionally out-of-order invocations due to heavy loads |
| 503 | Server | Service Unavailable Exception |
| 504 | Server | Integration Failure - e.g. Endpoint Request Timed-out Exception (API Gateway requests time out after a 29-second maximum) |

Security & Authentication

Security

Authentication

Logging, Monitoring, Tracing


Route 53 (Core Topic)

Overview

Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service. You can use Route 53 to perform three main functions in any combination: domain registration, DNS routing, and health checking. If you choose to use Route 53 for all three functions, perform the steps in this order:

Terminology

How it works

Basic Flow

Resolving DNS queries between VPCs and your network

When you create a VPC using Amazon VPC, Route 53 Resolver automatically answers DNS queries for local VPC domain names for EC2 instances (ec2-192-0-2-44.compute-1.amazonaws.com) and records in private hosted zones (acme.example.com). For all other domain names, Resolver performs recursive lookups against public name servers.
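The EC2 DNS names Resolver answers follow a simple scheme, reproduced here for the example above (the suffix varies by region; `compute-1.amazonaws.com` matches the us-east-1 example in the text):

```python
# EC2 public DNS names encode the IPv4 address with dots replaced by dashes,
# e.g. 192.0.2.44 -> ec2-192-0-2-44.compute-1.amazonaws.com
def ec2_public_dns(ipv4: str, suffix: str = "compute-1.amazonaws.com") -> str:
    return "ec2-" + ipv4.replace(".", "-") + "." + suffix

name = ec2_public_dns("192.0.2.44")
```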

You also can integrate DNS resolution between Resolver and DNS resolvers on your network by configuring forwarding rules. Your network can include any network that is reachable from your VPC, such as the following:

Zone File & Records

The zone file stores records. Various record types exist:

| Type | Definition | Example |
|---|---|---|
| A | Maps host name to IPv4 address | px01.vc.example.com. 198.51.100.40 |
| AAAA | Maps host name to IPv6 address | px01.vc.example.com. 2a00:1450:4014:80c:0:0:0:2004 |
| CNAME | Defines an alias for a host name (maps one domain name to another) | www.dnsimple.com. dnsimple.com. |
| SOA | Start of Authority - mandatory first entry, defines various things, e.g. name servers & admin contact | ns1.dnsimple.com admin.dnsimple.com 2013022001 86400 7200 604800 300 |
| MX | Defines mail exchange | example.com. 1800 MX 10 mail1.example.com. |
| PTR | Maps IPv4 address to host name (inverse of an A record) | 10.27/1.168.192.in-addr.arpa. 1800 PTR mail.example.com. |
| SRV | Points one domain to another domain name using a specific destination port | _sip._tcp.example.com. 86400 IN SRV 0 5 5060 sipserver.example.com. |
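The SRV example above packs priority, weight, port and target into one record; a short parser makes the field order explicit:

```python
# Parse an SRV record string of the form:
#   name TTL IN SRV priority weight port target
def parse_srv(record: str) -> dict:
    name, ttl, _cls, _type, prio, weight, port, target = record.split()
    return {"name": name, "ttl": int(ttl), "priority": int(prio),
            "weight": int(weight), "port": int(port), "target": target}

srv = parse_srv("_sip._tcp.example.com. 86400 IN SRV 0 5 5060 sipserver.example.com.")
```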

Route 53 specific:

Route 53 Routing Policies

Health Checks with Route 53

Trigger automated DNS failover

Setup

Private Hosted Zones

Options:

Sharing a Private Zone across multiple VPCs


Solution Architecture Comparison

EC2 on its own with Elastic IP

EC2 with Route53

ALB + ASG

ALB + ECS on EC2

ALB + ECS on Fargate

ALB + Lambda

API Gateway + Lambda

API Gateway + AWS Service

API Gateway + HTTP backend (ex: ALB)


Storage

EBS

Overview

Amazon Elastic Block Store (EBS) is an easy to use, high-performance, block-storage service designed for use with Amazon Elastic Compute Cloud (EC2) for both throughput and transaction intensive workloads at any scale. A broad range of workloads, such as relational and non-relational databases, enterprise applications, containerized applications, big data analytics engines, file systems, and media workflows are widely deployed on Amazon EBS.

You can choose from six different volume types to balance optimal price and performance. You can achieve single-digit-millisecond latency for high-performance database workloads such as SAP HANA or gigabyte per second throughput for large, sequential workloads such as Hadoop. You can change volume types, tune performance, or increase volume size without disrupting your critical applications, so you have cost-effective storage when you need it.

Designed for mission-critical systems, EBS volumes are replicated within an Availability Zone (AZ) and can easily scale to petabytes of data. Also, you can use EBS Snapshots with automated lifecycle policies to back up your volumes in Amazon S3, while ensuring geographic protection of your data and business continuity.

Volume options

| Type | API name | IOPS / Throughput | Size |
|---|---|---|---|
| General purpose SSD (cheap) | gp2 | 100 - 16,000 IOPS (3 IOPS/GiB, so ~1 TiB reaches ~3,000 IOPS) | 1 GiB - 16 TiB |
| Provisioned IOPS SSD (expensive) | io1 | 100 - 32,000 IOPS (up to 64,000 on Nitro); size and IOPS are independent | 4 GiB - 16 TiB |
| Magnetic, throughput optimized | st1 | Frequently accessed workloads; cannot be boot volume; 500 MiB/s throughput | 500 GiB - 16 TiB |
| Magnetic, cold | sc1 | Less frequently accessed workloads; cannot be boot volume; 250 MiB/s throughput | 250 GiB - 16 TiB |
| Magnetic, standard | . | Can be boot volume | . |
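The gp2 IOPS column above follows a simple rule worth memorizing for the exam - 3 IOPS per GiB, with a floor and a cap:

```python
# gp2 baseline IOPS: 3 IOPS/GiB, minimum 100, maximum 16,000.
def gp2_baseline_iops(size_gib: int) -> int:
    return max(100, min(3 * size_gib, 16_000))

assert gp2_baseline_iops(8) == 100         # small volumes get the 100 IOPS floor
assert gp2_baseline_iops(1_024) == 3_072   # ~1 TiB -> ~3,000 IOPS
assert gp2_baseline_iops(10_000) == 16_000 # capped
```

Past the cap, switching to io1 (where size and IOPS are provisioned independently) is the way up.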

Snapshots

Moving Instances/Volumes To A Different AZ/Region

Encrypting root volumes


Local Instance Store

EBS vs Instance Store


EFS

Overview

Amazon Elastic File System (Amazon EFS) provides a simple, scalable, fully managed elastic NFS file system for use with AWS Cloud services and on-premises resources. It is built to scale on demand to petabytes without disrupting applications, growing and shrinking automatically as you add and remove files, eliminating the need to provision and manage capacity to accommodate growth.

Amazon EFS offers two storage classes: the Standard storage class, and the Infrequent Access storage class (EFS IA). EFS IA provides price/performance that's cost-optimized for files not accessed every day. By simply enabling EFS Lifecycle Management on your file system, files not accessed according to the lifecycle policy you choose will be automatically and transparently moved into EFS IA. The EFS IA storage class costs only $0.025/GB-month*.

While workload patterns vary, customers typically find that 80% of files are infrequently accessed (and suitable for EFS IA), and 20% are actively used (suitable for EFS Standard), resulting in an effective storage cost as low as $0.08/GB-month*. Amazon EFS transparently serves files from both storage classes in a common file system namespace.

Amazon EFS is designed to provide massively parallel shared access to thousands of Amazon EC2 instances, enabling your applications to achieve high levels of aggregate throughput and IOPS with consistent low latencies.

Amazon EFS is well suited to support a broad spectrum of use cases from home directories to business-critical applications. Customers can use EFS to lift-and-shift existing enterprise applications to the AWS Cloud. Other use cases include: big data analytics, web serving and content management, application development and testing, media and entertainment workflows, database backups, and container storage.

Amazon EFS is a regional service storing data within and across multiple Availability Zones (AZs) for high availability and durability. Amazon EC2 instances can access your file system across AZs, regions, and VPCs, while on-premises servers can access using AWS Direct Connect or AWS VPN.

Performance & Storage Classes


Amazon S3

Overview

Amazon Simple Storage Service (S3) is object storage with a simple web service interface to store and retrieve any amount of data from anywhere on the web. It is designed to deliver 11x9 durability and scale past trillions of objects worldwide.

Getting Data In And Out

Performance & Consistency

Versioning

S3 Events Notifications

Logging

Performance

Cross Region Replication/Same Region Replication

Replication enables automatic, asynchronous copying of objects across Amazon S3 buckets. Buckets that are configured for object replication can be owned by the same AWS account or by different accounts. Objects may be replicated to a single destination bucket or to multiple destination buckets. Destination buckets can be in different AWS Regions or within the same Region as the source bucket.

Hosting Static Websites

<bucket-name>.s3-website-<AWS-Region>.amazonaws.com
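The website endpoint pattern above composes mechanically from bucket name and region (the bucket name below is a placeholder):

```python
# S3 static-website endpoint: <bucket-name>.s3-website-<AWS-Region>.amazonaws.com
def s3_website_endpoint(bucket: str, region: str) -> str:
    return f"{bucket}.s3-website-{region}.amazonaws.com"

endpoint = s3_website_endpoint("my-site", "eu-west-1")
```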

Storage classes

| Class | Durability | Availability | AZs | Costs per GB | Retrieval Fee | Notes |
|---|---|---|---|---|---|---|
| S3 Standard | 11x9 | 4x9 | >=3 | $0.023 | No | . |
| S3 Intelligent Tiering | 11x9 | 3x9 | >=3 | $0.023 | No | Automatically moves objects between two access tiers based on changing access patterns |
| S3 IA (infrequent access) | 11x9 | 3x9 | >=3 | $0.0125 | Yes | For data that is accessed less frequently, but requires rapid access when needed |
| S3 One Zone IA (infrequent access) | 11x9 | 99.5% | 1 | $0.01 | Yes | For data that is accessed less frequently, but requires rapid access when needed |
| Glacier | 11x9 | . | >=3 | . | Yes | For archival only; retrieval is expedited (1-5 min), standard (3-5 h) or bulk (5-12 h) |
| Glacier Deep Archive | 11x9 | . | >=3 | . | Yes | Longer time span to retrieve |
| S3 RRS (reduced redundancy storage) | 4x9 | 4x9 | >=3 | $0.024 | . | Deprecated |
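The per-GB prices in the table translate directly into monthly storage cost. A rough sketch using those numbers (retrieval and request fees ignored, so this understates IA-class costs for hot data):

```python
# Monthly storage cost per class, using the per-GB-month prices from the
# table above (these are course-era prices and may have changed).
PRICES = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "ONEZONE_IA": 0.01}

def monthly_storage_cost(gb: float, storage_class: str) -> float:
    return gb * PRICES[storage_class]

# 1 TB in Standard vs One Zone IA:
standard = monthly_storage_cost(1000, "STANDARD")      # ~$23
one_zone = monthly_storage_cost(1000, "ONEZONE_IA")    # ~$10
```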

Access Control

Defaults

IAM

Bucket policies

ACLs

How to specify resources in a policy:

| Pattern | Meaning |
|---|---|
| arn:partition:service:region:namespace:relative-id | e.g. arn:aws:s3:::mybucket |
| arn:aws:s3:::* | All buckets and objects in the account |
| arn:aws:s3:::mybucket | mybucket |
| arn:aws:s3:::mybucket/* | All objects in mybucket |
| arn:aws:s3:::mybucket/mykey | mykey in mybucket |
| arn:aws:s3:::mybucket/developers/${aws:username}/ | Folder matching the accessing user's name |
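The last pattern is typically used with the `${aws:username}` policy variable in an IAM policy, so each user can only reach their own "folder". A hedged sketch of such a policy document (the bucket name `mybucket` is the table's placeholder):

```python
import json

# IAM policy granting each user read/write access only under
# developers/<their-username>/ in mybucket. The ${aws:username} variable is
# resolved by IAM at request time, not by us.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::mybucket/developers/${aws:username}/*",
    }],
}
policy_json = json.dumps(policy)
```

Note the literal `${aws:username}` must survive into the JSON - it is substituted by AWS, so don't interpolate it in code.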

Pre-signed URLs

All objects are private by default. Only the object owner has permission to access these objects. However, the object owner can optionally share objects with others by creating a pre-signed URL, using their own security credentials, to grant time-limited permission to download the objects.
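"Time-limited" means the URL embeds an issue time and a validity window. A minimal sketch of the expiry logic a client could apply before retrying a stale link (this illustrates the concept only, not S3's actual SigV4 signature check):

```python
from datetime import datetime, timedelta, timezone

# A pre-signed URL is valid from its issue time until issue time + expiry.
def is_expired(issued_at: datetime, expires_in_s: int, now: datetime) -> bool:
    return now >= issued_at + timedelta(seconds=expires_in_s)

t0 = datetime(2021, 3, 1, 12, 0, tzinfo=timezone.utc)
assert not is_expired(t0, 3600, t0 + timedelta(minutes=30))  # still valid
assert is_expired(t0, 3600, t0 + timedelta(hours=2))         # past the window
```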

Encryption

Protecting data in transit

Protecting data at rest

Etc

Pricing

Charged by

Limits

| . | . |
|---|---|
| Buckets per account | 100 |
| Bucket policy max size | 20 KB |
| Object size | 0 B to 5 TB |
| Object size in a single PUT | 5 GB |
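Because a single PUT caps out at 5 GB while objects can be 5 TB, larger objects need multipart upload. A sketch of the part-count arithmetic (the 10,000-part cap is S3's documented multipart limit):

```python
import math

GB = 1024**3

# How many parts a multipart upload needs for a given object and part size.
# S3 allows at most 10,000 parts per upload.
def multipart_parts(object_bytes: int, part_bytes: int) -> int:
    parts = math.ceil(object_bytes / part_bytes)
    if parts > 10_000:
        raise ValueError("part size too small: would exceed 10,000 parts")
    return parts

assert multipart_parts(6 * GB, GB) == 6            # 6 GB in 1 GB parts
assert multipart_parts(5 * 1024 * GB, GB) == 5120  # a full 5 TB object fits in 1 GB parts
```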

S3 Solution Architecture

Exposing Static Objects

Indexing objects in DynamoDB


S3 vs EFS vs EBS Comparison

| Amazon S3 | Amazon EBS | Amazon EFS |
|---|---|---|
| Can be publicly accessible | Accessible only via the attached EC2 instance | Accessible via several EC2 machines and AWS services |
| Web interface | File system interface | Web and file system interface |
| Object storage | Block storage | File storage |
| Scalable | Hardly scalable | Scalable |
| Slowest | Fastest | Faster than S3, slower than EBS |
| Good for storing backups | Is meant to be an EC2 drive | Good for shareable applications and workloads |
| Elastic, only pay for used storage | Fixed, pay for provisioned storage | Elastic, only pay for used storage |

Caching

CloudFront (Core Topic)

Overview

Amazon CloudFront is a web service that speeds up distribution of your static and dynamic web content, such as .html, .css, .js, and image files, to your users. CloudFront delivers your content through a worldwide network of data centers called edge locations. When a user requests content that you're serving with CloudFront, the request is routed to the edge location that provides the lowest latency (time delay), so that content is delivered with the best possible performance.

CloudFront speeds up the distribution of your content by routing each user request through the AWS backbone network to the edge location that can best serve your content. Typically, this is a CloudFront edge server that provides the fastest delivery to the viewer. Using the AWS network dramatically reduces the number of networks that your users' requests must pass through, which improves performance. Users get lower latency—the time it takes to load the first byte of the file and higher data transfer rates.

You also get increased reliability and availability because copies of your files (also known as objects) are now held (or cached) in multiple edge locations around the world.

Basic workflow

Origin

An origin is the location where content is stored, and from which CloudFront gets content to serve to viewers. To specify an origin:

Can have primary and secondary origin (HA/Failover)

| Origin | How content is accessed |
|---|---|
| S3 | CloudFront assumes an Origin Access Identity; access is granted via S3 bucket policy |
| EC2 | Instances must have public IPs |
| ALB (-> EC2) | ALB must have a public IP, EC2 instances can be private |

CloudFront vs S3 Cross Region Replication

| CloudFront | S3 Cross Region Replication |
|---|---|
| Global edge network | Must be set up for each region you want replication to happen |
| Files are cached for a TTL (maybe a day) | Files are updated in near real-time |
| Great for static content that must be available everywhere | Read only |
| . | Great for dynamic content that needs to be available at low latency in a few regions |

CloudFront Geo Restriction

Signed URL/Signed Cookies

| CloudFront Signed URL/Cookie | S3 Pre-Signed URL |
|---|---|
| Allows access to a path, no matter the origin | Issues a request as the person who pre-signed the URL |
| Account-wide key-pair, only the root user can manage it | Uses the IAM key of the signing IAM principal |
| Can filter by IP, path, date, expiration | Limited lifetime |
| Can leverage caching features | . |

Restricting access to files in Amazon S3 buckets

You can optionally secure the content in your Amazon S3 bucket so that users can access it through CloudFront but cannot access it directly by using Amazon S3 URLs. This prevents someone from bypassing CloudFront and using the Amazon S3 URL to get content that you want to restrict access to. This step isn't required to use signed URLs, but we recommend it.

Trusted Signer

To create signed URLs or signed cookies, you need a signer. A signer is either a trusted key group that you create in CloudFront, or an AWS account that contains a CloudFront key pair. We recommend that you use trusted key groups, for the following reasons:

Field-Level Encryption

Caching

CloudFront Caching vs API Gateway Caching

API Gateway now has two different kinds of endpoints. The original design is now called edge optimized, and the new option is called regional. Regional endpoints do not use front-end services from CloudFront, and may offer lower latency when accessed from EC2 within the same AWS region. All existing endpoints were categorized as edge-optimized when the new regional capability was rolled out. With a regional endpoint, the CloudFront-* headers are not present in the request, unless you use your own CloudFront distribution and whitelist those headers for forwarding to the origin.

Lambda@Edge

https

Scenario 1 (requires 2 certs)

| . | CloudFront | ALB |
|---|---|---|
| hostname | www.example.com | origin.example.com |
| ssl cert | www.example.com | origin.example.com |
| origin | origin.example.com | . |

If Host header is forwarded:

If Host header is not forwarded:

Scenario 2 (doesn't work)

| . | CloudFront | ALB |
|---|---|---|
| hostname | www.example.com | www.example.com |
| ssl cert | www.example.com | www.example.com |
| origin | www.example.com | . |

Impossible, as CloudFront distribution will loop over itself!

Scenario 3 (requires 1 cert)

| . | CloudFront | ALB |
|---|---|---|
| hostname | www.example.com | origin.example.com |
| ssl cert | www.example.com | www.example.com |
| origin | origin.example.com | . |

If Host header is forwarded:

If Host header is not forwarded:


ElastiCache (Core Topic)

Overview

Amazon ElastiCache allows you to seamlessly set up, run, and scale popular open-source compatible in-memory data stores in the cloud. Build data-intensive apps or boost the performance of your existing databases by retrieving data from high throughput and low latency in-memory data stores.

Scenarios

Memcached

Redis


Handling Extreme Rates

Traffic Management

| Service | Rate | Comment |
|---|---|---|
| Route53 | . | Handles extreme rates without problems |
| CloudFront | 100,000 requests/second | . |
| ALB | . | Can scale to extreme rates, needs warmup though |

Compute

| Service | Rate | Comment |
|---|---|---|
| ASG, ECS | . | Scales well, but slowly; requires bootstrap |
| Fargate | . | Faster than ECS |
| Lambda | 1,000 concurrent executions | Soft limit per region |

Storage

| Service | Rate | Comment |
|---|---|---|
| S3 | 3,500 PUT, 5,500 GET per prefix/s | . |
| RDS, Aurora, Elasticsearch | . | Provisioned |
| DynamoDB | . | Autoscaling on demand |
| EBS | 16k IOPS (gp2), 64k IOPS (io1) | . |
| Instance Store | millions of IOPS | . |
| EFS | . | Performance modes General, Max IO |
| Redis | <200 nodes (replica + sharding) | . |
| Memcached | 20 nodes (sharding) | . |
| DAX | 10 nodes (primary + replicas) | . |

Others

| Service | Rate | Comment |
|---|---|---|
| SQS, SNS | . | Unlimited |
| SQS FIFO | 3,000 RPS (with batching) | . |
| Kinesis | 1 MB/s in, 2 MB/s out per shard | . |
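The Kinesis per-shard limits above drive the usual shard-sizing arithmetic - the shard count is set by whichever direction is the bottleneck:

```python
import math

# Each shard supports 1 MB/s ingest and 2 MB/s egress, so size for the
# worse of the two (at least one shard).
def required_shards(in_mb_s: float, out_mb_s: float) -> int:
    return max(math.ceil(in_mb_s / 1.0), math.ceil(out_mb_s / 2.0), 1)

assert required_shards(5, 4) == 5    # ingest-bound: 5 MB/s in needs 5 shards
assert required_shards(2, 10) == 5   # egress-bound: 10 MB/s out needs 5 shards
```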

Databases

DynamoDB

Overview

Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It's a fully managed, multi-region, multi-active, durable database with built-in security, backup and restore, and in-memory caching for internet-scale applications. DynamoDB can handle more than 10 trillion requests per day and can support peaks of more than 20 million requests per second.

Basics

Important features

Keys and indexes

Partition Key

Partition Key & Sort Key

Secondary indexes

Projected attributes

Local secondary index

Global secondary index

DynamoDB DAX

DAX vs ElastiCache

Solution Architecture

Use DDB to index objects in S3


ElasticSearch (Core Topic)

Overview

Amazon Elasticsearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost effectively at scale. You can build, monitor, and troubleshoot your applications using the tools you love, at the scale you need. The service provides support for open source Elasticsearch APIs, managed Kibana, integration with Logstash and other AWS services, and built-in alerting and SQL querying. Amazon Elasticsearch Service lets you pay only for what you use – there are no upfront costs or usage requirements. With Amazon Elasticsearch Service, you get the ELK stack you need, without the operational overhead.

ELK

ElasticSearch Patterns


RDS

Overview

Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as hardware provisioning, database setup, patching and backups. It frees you to focus on your applications so you can give them the fast performance, high availability, security and compatibility they need.

Amazon RDS is available on several database instance types - optimized for memory, performance or I/O - and provides you with six familiar database engines to choose from, including Amazon Aurora, PostgreSQL, MySQL, MariaDB, Oracle Database, and SQL Server. You can use the AWS Database Migration Service to easily migrate or replicate your existing databases to Amazon RDS.

Security

RDS Backup

Multi-AZ deployments

Amazon RDS Multi-AZ deployments provide enhanced availability for database instances within a single AWS Region.

Replicating RDS

Etc


Aurora

Overview

Amazon Aurora (Aurora) is a fully managed relational database engine that's compatible with MySQL and PostgreSQL. You already know how MySQL and PostgreSQL combine the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. The code, tools, and applications you use today with your existing MySQL and PostgreSQL databases can be used with Aurora. With some workloads, Aurora can deliver up to five times the throughput of MySQL and up to three times the throughput of PostgreSQL without requiring changes to most of your existing applications.

Aurora includes a high-performance storage subsystem. Its MySQL- and PostgreSQL-compatible database engines are customized to take advantage of that fast distributed storage. The underlying storage grows automatically as needed. An Aurora cluster volume can grow to a maximum size of 128 tebibytes (TiB). Aurora also automates and standardizes database clustering and replication, which are typically among the most challenging aspects of database configuration and administration.

Aurora is part of the managed database service Amazon Relational Database Service (Amazon RDS). It is more cloud-native and usually preferred over plain RDS.

High Availability and Read Scaling

Replicas

Aurora Serverless

Global Aurora

An Aurora global database consists of one primary AWS Region where your data is mastered, and up to five read-only secondary AWS Regions. You issue write operations directly to the primary DB cluster in the primary AWS Region. Aurora replicates data to the secondary AWS Regions using dedicated infrastructure, with latency typically under a second.

Multi Master


Service Communication

Step Functions

Overview

AWS Step Functions is a serverless function orchestrator that makes it easy to sequence AWS Lambda functions and multiple AWS services into business-critical applications. Through its visual interface, you can create and run a series of checkpointed and event-driven workflows that maintain the application state. The output of one step acts as an input to the next. Each step in your application executes in order, as defined by your business logic.

Orchestrating a series of individual serverless applications, managing retries, and debugging failures can be challenging. As your distributed applications become more complex, the complexity of managing them also grows. With its built-in operational controls, Step Functions manages sequencing, error handling, retry logic, and state, removing a significant operational burden from your team.
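The orchestration model can be sketched in a few lines of Python. This is a conceptual illustration only, not the Step Functions API; the step names and retry count are hypothetical.

```python
# Minimal sketch of Step Functions-style orchestration: each state's
# output becomes the next state's input, with per-step retry handling.
# Step names and retry settings are hypothetical illustrations.

def run_workflow(steps, initial_input, max_retries=2):
    state = initial_input
    for name, fn in steps:
        for attempt in range(max_retries + 1):
            try:
                state = fn(state)  # output of one step is input to the next
                break
            except Exception:
                if attempt == max_retries:
                    raise  # surfaced as a failed execution

    return state

steps = [
    ("ValidateOrder", lambda s: {**s, "valid": s["amount"] > 0}),
    ("ChargeCard",    lambda s: {**s, "charged": s["valid"]}),
    ("SendReceipt",   lambda s: {**s, "receipt_sent": s["charged"]}),
]

result = run_workflow(steps, {"amount": 42})
print(result)  # {'amount': 42, 'valid': True, 'charged': True, 'receipt_sent': True}
```

In the real service the retry policy, error catching, and state passing are declared in the state machine definition rather than coded by hand, which is exactly the operational burden Step Functions removes.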

Tasks

Integration


Simple Workflow Service (SWF)

Overview

The Amazon Simple Workflow Service (Amazon SWF) makes it easy to build applications that coordinate work across distributed components. In Amazon SWF, a task represents a logical unit of work that is performed by a component of your application. Coordinating tasks across the application involves managing intertask dependencies, scheduling, and concurrency in accordance with the logical flow of the application. Amazon SWF gives you full control over implementing tasks and coordinating them without worrying about underlying complexities such as tracking their progress and maintaining their state.

When using Amazon SWF, you implement workers to perform tasks. These workers can run either on cloud infrastructure, such as Amazon Elastic Compute Cloud (Amazon EC2), or on your own premises. You can create tasks that are long-running, or that may fail, time out, or require restarts—or that may complete with varying throughput and latency. Amazon SWF stores tasks and assigns them to workers when they are ready, tracks their progress, and maintains their state, including details on their completion. To coordinate tasks, you write a program that gets the latest state of each task from Amazon SWF and uses it to initiate subsequent tasks. Amazon SWF maintains an application's execution state durably so that the application is resilient to failures in individual components. With Amazon SWF, you can implement, deploy, scale, and modify these application components independently.

Amazon SWF offers capabilities to support a variety of application requirements. It is suitable for a range of use cases that require coordination of tasks, including media processing, web application back-ends, business process workflows, and analytics pipelines.

Core components


Simple Queue Service (Core Topic)

Overview

Amazon Simple Queue Service (SQS) is a fully managed message queuing service that enables you to decouple and scale microservices, distributed systems, and serverless applications. SQS eliminates the complexity and overhead associated with managing and operating message oriented middleware, and empowers developers to focus on differentiating work. Using SQS, you can send, store, and receive messages between software components at any volume, without losing messages or requiring other services to be available. Get started with SQS in minutes using the AWS console, Command Line Interface or SDK of your choice, and three simple commands.

SQS offers two types of message queues. Standard queues offer maximum throughput, best-effort ordering, and at-least-once delivery. SQS FIFO queues are designed to guarantee that messages are processed exactly once, in the exact order that they are sent.
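The FIFO guarantees can be modeled in a short sketch. This only imitates the documented behavior (deduplication by id, ordering per message group); it is not the SQS API.

```python
# Minimal sketch of SQS FIFO semantics: messages with a repeated
# deduplication id are dropped, and ordering is preserved per
# message group id. Behavioral model only, not the SQS API.

class FifoQueue:
    def __init__(self):
        self.seen_dedup_ids = set()
        self.groups = {}  # group_id -> ordered list of message bodies

    def send(self, body, group_id, dedup_id):
        if dedup_id in self.seen_dedup_ids:
            return False  # duplicate within the deduplication window: dropped
        self.seen_dedup_ids.add(dedup_id)
        self.groups.setdefault(group_id, []).append(body)
        return True

q = FifoQueue()
q.send("order-1 created", group_id="order-1", dedup_id="a")
q.send("order-1 created", group_id="order-1", dedup_id="a")  # duplicate, dropped
q.send("order-1 shipped", group_id="order-1", dedup_id="b")
print(q.groups["order-1"])  # ['order-1 created', 'order-1 shipped']
```

A standard queue, by contrast, would have accepted the duplicate and made no ordering promise.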

Core features

Scenarios

Limits

. .
Max message size 256 KB (2 GB with the Extended Client Library)
Max inflight messages 120,000 (standard queues), 20,000 (FIFO queues)

Amazon MQ

Overview

Amazon MQ is a managed message broker service for Apache ActiveMQ and RabbitMQ that makes it easy to set up and operate message brokers on AWS. Amazon MQ reduces your operational responsibilities by managing the provisioning, setup, and maintenance of message brokers for you. Because Amazon MQ connects to your current applications with industry-standard APIs and protocols, you can easily migrate to AWS without having to rewrite code.


Simple Notification Service (Core Topic)

Overview

Amazon Simple Notification Service (Amazon SNS) is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.

The A2A pub/sub functionality provides topics for high-throughput, push-based, many-to-many messaging between distributed systems, microservices, and event-driven serverless applications. Using Amazon SNS topics, your publisher systems can fanout messages to a large number of subscriber systems including Amazon SQS queues, AWS Lambda functions and HTTPS endpoints, for parallel processing, and Amazon Kinesis Data Firehose. The A2P functionality enables you to send messages to users at scale via SMS, mobile push, and email.
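The fan-out pattern can be sketched as follows. This models the push delivery to every subscriber; the topic and subscriber names are illustrative, and real SNS delivery is over HTTPS/SQS/Lambda integrations rather than in-process callables.

```python
# Minimal sketch of SNS fan-out: one published message is pushed to
# every subscriber (e.g. SQS queues, Lambda functions).
# Names are illustrative; this is a behavioral model, not the SNS API.

class Topic:
    def __init__(self):
        self.subscribers = []  # each subscriber is a callable

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, message):
        for handler in self.subscribers:  # push model: delivered to all
            handler(message)

orders = Topic()
queue_a, queue_b = [], []
orders.subscribe(queue_a.append)  # e.g. an SQS queue for fulfilment
orders.subscribe(queue_b.append)  # e.g. an SQS queue for analytics
orders.publish({"order_id": 1})

print(queue_a, queue_b)  # [{'order_id': 1}] [{'order_id': 1}]
```

Subscribing SQS queues to a topic like this is the classic way to get the same event processed in parallel by independent consumers.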

Scenarios


Data Engineering

Kinesis (Core Topic)

Overview

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications. Amazon Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin.

Kinesis Data Streams

Overview

Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources such as website clickstreams, database event streams, financial transactions, social media feeds, IT logs, and location-tracking events. The data collected is available in milliseconds to enable real-time analytics use cases such as real-time dashboards, real-time anomaly detection, dynamic pricing, and more.

Kinesis Streams Shards
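How a record lands on a shard can be sketched as below. Kinesis documents that the MD5 hash of the partition key (a 128-bit integer) selects the shard whose hash-key range contains it; the even split of the hash space across shards is an assumption for illustration.

```python
# Sketch of Kinesis Data Streams record routing: MD5(partition key)
# yields a 128-bit integer, which selects the shard whose hash-key
# range contains it. Even range splits are assumed here.
import hashlib

def shard_for(partition_key, num_shards):
    h = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)  # 0 .. 2^128 - 1
    range_size = 2 ** 128 // num_shards
    return min(h // range_size, num_shards - 1)

# Records with the same partition key always land on the same shard,
# which is what gives Kinesis its per-shard ordering guarantee.
assert shard_for("device-42", 4) == shard_for("device-42", 4)
print(shard_for("device-42", 4))
```

This is also why a hot partition key creates a hot shard: all of its records share one shard's throughput limits.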

Getting Data in and out

Limits

. .
Producer 1MB/s or 1000 messages/s at write PER SHARD
“ProvisionedThroughputExceededException” otherwise
Consumer Classic 2MB/s at read PER SHARD across all consumers
5 API calls per second PER SHARD across all consumers
Consumer Enhanced Fan-Out 2MB/s at read PER SHARD, PER ENHANCED CONSUMER
No API calls needed (push model)
Data Retention 24 hours data retention by default
Can be extended to 365 days (previously 7)

Kinesis Data Firehose

Overview

Amazon Kinesis Data Firehose is the easiest way to reliably load streaming data into data lakes, data stores, and analytics services. It can capture, transform, and deliver streaming data to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, generic HTTP endpoints, and service providers like Datadog, New Relic, MongoDB, and Splunk. It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, transform, and encrypt your data streams before loading, minimizing the amount of storage used and increasing security.

You can easily create a Firehose delivery stream from the AWS Management Console, configure it with a few clicks, and start ingesting streaming data from hundreds of thousands of data sources to your specified destinations. You can also configure your data streams to automatically convert the incoming data to open, standards-based formats like Apache Parquet and Apache ORC before the data is delivered.

With Amazon Kinesis Data Firehose, there is no minimum fee or setup cost. You pay for the amount of data that you transmit through the service, if applicable, for converting data formats, and for Amazon VPC delivery and data transfer.

Firehose Buffer Sizing
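Firehose delivers in batches, flushing whenever the buffer size or the buffer interval is reached, whichever comes first. A minimal sketch of that behavior (the thresholds and byte counts here are illustrative, not Firehose's real defaults):

```python
# Sketch of Firehose buffer sizing: records accumulate until either
# the buffer size or the buffer interval is hit, then the batch is
# delivered. Thresholds are illustrative.

class Buffer:
    def __init__(self, max_bytes, max_seconds):
        self.max_bytes, self.max_seconds = max_bytes, max_seconds
        self.records, self.bytes, self.age = [], 0, 0
        self.delivered = []  # batches flushed to the destination

    def put(self, record, elapsed):
        self.records.append(record)
        self.bytes += len(record)
        self.age += elapsed
        if self.bytes >= self.max_bytes or self.age >= self.max_seconds:
            self.delivered.append(self.records)  # flush: whichever limit hits first
            self.records, self.bytes, self.age = [], 0, 0

buf = Buffer(max_bytes=10, max_seconds=300)
buf.put(b"aaaa", elapsed=1)    # 4 bytes buffered, nothing delivered yet
buf.put(b"bbbbbb", elapsed=1)  # 10 bytes -> size threshold reached, flush
print(len(buf.delivered))  # 1
```

This buffering is why Firehose is near-real-time rather than real-time: with a low-volume stream, data waits until the interval expires.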

Data Streams vs Firehose

Kinesis Data Analytics

Amazon Kinesis Data Analytics is the easiest way to transform and analyze streaming data in real time with Apache Flink. Apache Flink is an open source framework and engine for processing data streams. Amazon Kinesis Data Analytics reduces the complexity of building, managing, and integrating Apache Flink applications with other AWS services.

Amazon Kinesis Data Analytics takes care of everything required to run streaming applications continuously, and scales automatically to match the volume and throughput of your incoming data. With Amazon Kinesis Data Analytics, there are no servers to manage, no minimum fee or setup cost, and you only pay for the resources your streaming applications consume.

Streaming Architectures

Full Data Engineering Pipeline

3000 messages of 1KB/sec
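The "3000 messages of 1KB/sec" workload can be sized against the standard per-shard write limits (1 MB/s or 1000 records/s per shard) with a little arithmetic:

```python
# Back-of-envelope shard sizing, assuming the standard per-shard
# write limits of 1 MB/s or 1000 records/s.
import math

def shards_needed(records_per_sec, record_size_kb):
    by_throughput = math.ceil(records_per_sec * record_size_kb / 1024)  # 1 MB/s per shard
    by_records = math.ceil(records_per_sec / 1000)                      # 1000 records/s per shard
    return max(by_throughput, by_records)

print(shards_needed(3000, 1))  # 3000 KB/s and 3000 records/s -> 3 shards
```

Both constraints matter: many tiny records can exhaust the 1000 records/s limit long before the 1 MB/s limit.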

Comparison

. Kinesis Data Streams SQS SQS FIFO SNS DynamoDB S3
Data Immutable Immutable Immutable Immutable Mutable Mutable
Retention 1-7 days, export to S3 using KDF 1-14 days 1-14 days No retention Infinite, or can implement TTL Infinite, can set up lifecycle policies
Ordering Per shard No ordering Per group-id No ordering No ordering No ordering
Scalability Provision shards Soft limit 300 msg/s, or 3000 if batching Soft limit WCU & RCU, On-demand Infinite, 3500 PUT/5500 GET per prefix
Readers EC2, Lambda, KDF, KDA, KCL (checkpoint) EC2, Lambda EC2, Lambda HTTP, Lambda, Email, SQS... DynamoDB Streams SDK, S3 Events
Latency KDS (~200 ms), KDF (1 min) Low (10-100 ms) Low (10-100 ms) Low (10-100 ms) Low (10-100 ms) Low (10-100 ms)

AWS Batch

Overview

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted. With AWS Batch, there is no need to install and manage batch computing software or server clusters that you use to run your jobs, allowing you to focus on analyzing results and solving problems. AWS Batch plans, schedules, and executes your batch computing workloads across the full range of AWS compute services and features, such as AWS Fargate, Amazon EC2 and Spot Instances.

There is no additional charge for AWS Batch. You only pay for the AWS resources (e.g. EC2 instances or Fargate jobs) you create to store and run your batch jobs.

Batch vs Lambda

Compute Environments

Multi-Node Mode


Amazon EMR

Overview

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. Amazon EMR makes it easy to set up, operate, and scale your big data environments by automating time-consuming tasks like provisioning capacity and tuning clusters. With EMR you can run petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 3x faster than standard Apache Spark. You can run workloads on Amazon EC2 instances, on Amazon Elastic Kubernetes Service (EKS) clusters, or on-premises using EMR on AWS Outposts.

Node types & purchasing

Instance Configuration


Running Jobs on AWS

. .
EC2 instance With long-running cronjob
Reactive workflow CloudWatch Events/S3 Events/API Gateway/SQS/SNS/... -> Lambda
Fargate CloudWatch Events -> Fargate (ECS task)
CloudWatch Events and Lambda (scheduled) CloudWatch Events -> cron schedule -> Lambda
AWS Batch CloudWatch Events -> Batch
EMR .

Amazon Redshift

Overview

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. No other data warehouse makes it as easy to gain new insights from all your data. With Redshift, you can query and combine exabytes of structured and semi-structured data across your data warehouse, operational database, and data lake using standard SQL. Redshift lets you easily save the results of your queries back to your S3 data lake using open formats, like Apache Parquet, so that you can do additional analytics from other analytics services like Amazon EMR, Amazon Athena, and Amazon SageMaker.

How it works

Snapshots and DR

Redshift Spectrum

Using Amazon Redshift Spectrum, you can efficiently query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Much of the processing occurs in the Redshift Spectrum layer, and most of the data remains in Amazon S3. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.

Troubleshooting

In Redshift, if your query operation hangs or stops responding, the following are possible causes along with their corresponding solutions:


Athena

Overview

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

Athena is easy to use. Simply point to your data in Amazon S3, define the schema, and start querying using standard SQL. Most results are delivered within seconds. With Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for anyone with SQL skills to quickly analyze large-scale datasets.

Athena is out-of-the-box integrated with AWS Glue Data Catalog, allowing you to create a unified metadata repository across various services, crawl data sources to discover schemas and populate your Catalog with new and modified table and partition definitions, and maintain schema versioning.
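Since Athena bills per byte scanned, converting data to a compressed columnar format directly cuts query cost. A rough estimator, assuming the published $5-per-TB-scanned price (check current pricing) and an illustrative 10x reduction from Parquet:

```python
# Rough Athena cost estimate. The $5/TB-scanned price and the 10x
# scan reduction from a columnar format are assumptions for
# illustration; check current pricing.

def query_cost_usd(bytes_scanned, price_per_tb=5.0):
    return bytes_scanned / (1024 ** 4) * price_per_tb

raw_scan = 1 * 1024 ** 4        # 1 TB of raw CSV scanned by a query
parquet_scan = raw_scan // 10   # illustrative 10x reduction with Parquet

print(query_cost_usd(raw_scan))                # 5.0
print(round(query_cost_usd(parquet_scan), 2))  # 0.5
```

Partitioning the data in S3 (so Athena can prune whole prefixes) compounds this saving.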


QuickSight

Overview

Amazon QuickSight is a fast business analytics service to build visualizations, perform ad hoc analysis, and quickly get business insights from your data. Amazon QuickSight seamlessly discovers AWS data sources, enables organizations to scale to hundreds of thousands of users, and delivers fast and responsive query performance by using a robust in-memory engine (SPICE).


AWS Data Pipeline

AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR.

AWS Data Pipeline helps you easily create complex data processing workloads that are fault tolerant, repeatable, and highly available. You don’t have to worry about ensuring resource availability, managing inter-task dependencies, retrying transient failures or timeouts in individual tasks, or creating a failure notification system. AWS Data Pipeline also allows you to move and process data that was previously locked up in on-premises data silos.


Big Data Architecture

Analytics Layer

Big Data Ingestion

Comparison of warehousing technologies


Monitoring

CloudWatch (Core Topic)

Overview

Amazon CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events, providing you with a unified view of AWS resources, applications, and services that run on AWS and on-premises servers. You can use CloudWatch to detect anomalous behavior in your environments, set alarms, visualize logs and metrics side by side, take automated actions, troubleshoot issues, and discover insights to keep your applications running smoothly.

CloudWatch Metrics

CloudWatch Alarms

CloudWatch Dashboards

CloudWatch Events

EventBridge

S3 Events

CloudWatch Logs

Log sources

Log targets

S3 Export

Log Subscriptions

Logs Agent & Unified Agent

X-Ray

Overview

AWS X-Ray helps developers analyze and debug production, distributed applications, such as those built using a microservices architecture. With X-Ray, you can understand how your application and its underlying services are performing to identify and troubleshoot the root cause of performance issues and errors. X-Ray provides an end-to-end view of requests as they travel through your application, and shows a map of your application’s underlying components. You can use X-Ray to analyze both applications in development and in production, from simple three-tier applications to complex microservices applications consisting of thousands of services.

Deployment and Instance Management

Elastic Beanstalk (Core Topic)

Overview

AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.

You can simply upload your code and Elastic Beanstalk automatically handles the deployment, from capacity provisioning, load balancing, auto-scaling to application health monitoring. At the same time, you retain full control over the AWS resources powering your application and can access the underlying resources at any time.

Architecture models

Deployment Types

An in-place upgrade involves performing application updates on live Amazon EC2 instances. A disposable upgrade, on the other hand, involves rolling out a new set of EC2 instances and terminating the older ones.
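The contrast between the two styles can be sketched as follows (a behavioral model only; function names and batch sizes are illustrative):

```python
# Sketch contrasting an in-place rolling update (update the live fleet
# in batches, capacity temporarily reduced) with a disposable/immutable
# upgrade (bring up a full new fleet, then discard the old one).

def rolling_update(instances, new_version, batch_size):
    """Update the existing fleet in place, batch by batch."""
    for i in range(0, len(instances), batch_size):
        for j in range(i, min(i + batch_size, len(instances))):
            instances[j] = new_version  # this batch is out of service while updating
    return instances

def immutable_update(instances, new_version):
    """Provision a brand-new fleet, then terminate the old instances."""
    return [new_version] * len(instances)

fleet = ["v1", "v1", "v1", "v1"]
print(rolling_update(list(fleet), "v2", batch_size=2))  # ['v2', 'v2', 'v2', 'v2']
print(immutable_update(fleet, "v2"))                    # ['v2', 'v2', 'v2', 'v2']
```

The end state is identical; the difference is risk and capacity during the rollout: rolling serves traffic from a mixed fleet at reduced capacity, while the disposable upgrade keeps full capacity and allows instant rollback by keeping the old fleet around until cutover.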


OpsWorks Stacks (Core Topic)

Overview

AWS OpsWorks is a configuration management service that provides managed instances of Chef and Puppet. Chef and Puppet are automation platforms that allow you to use code to automate the configurations of your servers. OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments.

Components


CodeDeploy (Core Topic)

Overview

AWS CodeDeploy is a fully managed deployment service that automates software deployments to a variety of compute services such as Amazon EC2, AWS Fargate, AWS Lambda, and your on-premises servers. AWS CodeDeploy makes it easier for you to rapidly release new features, helps you avoid downtime during application deployment, and handles the complexity of updating your applications. You can use AWS CodeDeploy to automate software deployments, eliminating the need for error-prone manual operations. The service scales to match your deployment needs.

Deploys

To EC2/On-premises

Integration with Elastic Load Balancing
Integration with Auto Scaling Groups
Register on-premises instances

To Lambdas

Integration with AWS Serverless

To ECS


CloudFormation (Core Topic)

Overview

AWS CloudFormation provides a common language for you to describe and provision all the infrastructure resources in your cloud environment. CloudFormation allows you to use a simple text file to model and provision, in an automated and secure manner, all the resources needed for your applications across all regions and accounts. This file serves as the single source of truth for your cloud environment.

AWS CloudFormation is available at no additional charge, and you pay only for the AWS resources needed to run your applications.

CloudFormation and ASG

Retaining Data on Deletes

CloudFormation and IAM

Custom Resources

Cross vs Nested Stacks

Other Concepts

Service Catalog

Overview

AWS Service Catalog allows organizations to create and manage catalogs of IT services that are approved for use on AWS. These IT services can include everything from virtual machine images, servers, software, and databases to complete multi-tier application architectures. AWS Service Catalog allows you to centrally manage commonly deployed IT services, and helps you achieve consistent governance and meet your compliance requirements, while enabling users to quickly deploy only the approved IT services they need.

Components


AWS Serverless Application Model

Overview

The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications. It provides shorthand syntax to express functions, APIs, databases, and event source mappings. With just a few lines per resource, you can define the application you want and model it using YAML. During deployment, SAM transforms and expands the SAM syntax into AWS CloudFormation syntax, enabling you to build serverless applications faster.

To get started with building SAM-based applications, use the AWS SAM CLI. SAM CLI provides a Lambda-like execution environment that lets you locally build, test, and debug applications defined by SAM templates. You can also use the SAM CLI to deploy your applications to AWS.


Deployments (Core Topic)

Options

. .
Vanilla EC2 With User Data (just for the first launch)
AMI Baking For things that are slow to install (runtimes, updates, tools), and use EC2 user data for quick runtime setup
Auto Scaling Group With launch template (AMI)
CodeDeploy In-place on EC2
In-place on ASG
New instances on ASG
Traffic shifting for AWS Lambda
New task set for ECS + traffic shifting
Elastic Beanstalk In-place all at once upgrades
Rolling upgrades (with or without additional instances)
Immutable upgrades (new instances)
Blue/Green (entirely new stack)
OpsWorks For Chef/Puppet stacks only
Can manage ELB and EC2 instances
Cannot manage an ASG
SAM Framework Leverages CloudFormation & CodeDeploy

Mechanisms

. .
Runtime/container EC2 - ECS - Lambda - Elastic Beanstalk
Application deployment CodeDeploy - OpsWorks - Elastic Beanstalk
Code/deployment management CodeCommit - CodePipeline - Elastic Beanstalk
Infrastructure deployment OpsWorks - CloudFormation - Elastic Beanstalk

Per AWS Service

Strategy Auto Scaling Group CodeDeploy EC2/On-Premises CodeDeploy ECS CodeDeploy Lambda Elastic Beanstalk OpsWorks
Single Target Deployment . . . . redeploy .
All At Once AutoScalingReplacingUpdate All-at-once . . all at once .
Minimum In Service . . . . rolling .
Rolling AutoScalingRollingUpdate One-at-a-time . . rolling .
Rolling With Extra Batches . . . . rolling with extra batches .
Blue/Green . Traffic is shifted to a replacement set of instances (all-at-once, half-at-a-time, or one-at-a-time) Traffic is shifted to a replacement task set (canary, linear, or all-at-once) Traffic is shifted to a new Lambda version (canary, linear, or all-at-once) Immutable comes close, or: create a new environment and use DNS Create a new environment and use DNS
Canary . . See Blue/Green (canary) See Blue/Green (canary) Traffic splitting .

Systems Manager (Core Topic)

Overview

AWS Systems Manager gives you visibility and control of your infrastructure on AWS. Systems Manager provides a unified user interface so you can view operational data from multiple AWS services and allows you to automate operational tasks across your AWS resources. With Systems Manager, you can group resources, like Amazon EC2 instances, Amazon S3 buckets, or Amazon RDS instances, by application, view operational data for monitoring and troubleshooting, and take action on your groups of resources. Systems Manager simplifies resource and application management, shortens the time to detect and resolve operational problems, and makes it easy to operate and manage your infrastructure securely at scale.

Components

Resources groups

Insights

Parameter store

Action & Change

Instances & Nodes


Cost Control

Best practices of cost management

AWS Cost Allocation Tags


Trusted Advisor (Core Topic)

Overview

AWS Trusted Advisor is an online tool that provides you real time guidance to help you provision your resources following AWS best practices. Whether establishing new workflows, developing applications, or as part of ongoing improvement, take advantage of the recommendations provided by Trusted Advisor on a regular basis to help keep your solutions provisioned optimally.

Support Plans

. .
Basic Support included for all AWS customers and free
Developer Recommended when you are experimenting with AWS
Business If you have production workloads
Enterprise If you have mission-critical workloads

EC2 Purchasing Options & Saving Plans (Core Topic)

EC2 Purchasing Options

. . .
On-demand instances Short workloads, predictable pricing, reliable Pay for compute capacity by the hour or second, with no long-term commitment
Reserved instances . Provide a significant discount compared to On-Demand pricing and provide a capacity reservation when used in a specific Availability Zone
Up to 50% cheaper than a fully utilized on-demand instance (because we commit upfront to a certain usage)
Minimum 1 year
Guarantees that you will not run into 'insufficient instance capacity' issues in that AZ, even when AWS is short on on-demand capacity
Can resell reserved capacity on Reserved Instance Marketplace
Can transfer between AZs
Standard reserved instances Long workloads Fixed instance type
Convertible reserved instances Long workloads with flexible instances Can be exchanged against another convertible instance type
Scheduled reserved instances . deprecated
Spot instances Short workloads, for cheap, can lose instances (not reliable) Bid on spare Amazon EC2 computing capacity, not available for all instance types
Dedicated instance No other customers will share your hardware Run on hardware dedicated to your account, but no need to purchase the whole host
Dedicated hosts Book an entire physical server, control instance placement A physical server with EC2 instance capacity fully dedicated to your use
Great for software licenses that are billed at the core or socket level
Can define host affinity so that instance reboots are kept on the same host

Spot Instances

Spot Fleets

Pricing by

Savings Plan

Savings Plans is a flexible pricing model that provides savings of up to 72% on your AWS compute usage. This pricing model offers lower prices on Amazon EC2 instances usage, regardless of instance family, size, OS, tenancy or AWS Region, and also applies to AWS Fargate and AWS Lambda usage.

Savings Plans offer significant savings over On Demand, just like EC2 Reserved Instances, in exchange for a commitment to use a specific amount of compute power (measured in $/hour) for a one or three year period. You can sign up for Savings Plans for a 1- or 3-year term and easily manage your plans by taking advantage of recommendations, performance reporting and budget alerts in the AWS Cost Explorer.
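How the hourly commitment applies can be modeled roughly: usage up to the commitment is billed at the discounted rate, and anything beyond falls through to On-Demand. The 30% discount and dollar figures below are illustrative assumptions, not actual AWS rates.

```python
# Rough model of a Compute Savings Plan: the commitment (in $/hour of
# discounted spend) covers usage at the savings rate; overflow is
# billed On-Demand. The 30% discount is an illustrative assumption.

def hourly_bill(on_demand_usage_usd, commitment_usd, discount=0.30):
    # the commitment buys up to commitment/(1-discount) worth of
    # on-demand-priced usage at the discounted rate
    covered = min(on_demand_usage_usd, commitment_usd / (1 - discount))
    return covered * (1 - discount) + (on_demand_usage_usd - covered)

print(round(hourly_bill(10.0, 7.0), 2))  # all usage covered: 7.0
print(round(hourly_bill(20.0, 7.0), 2))  # 10.0 covered for 7.0, plus 10.0 On-Demand: 17.0
```

This is also why over-committing wastes money: the commitment is charged every hour whether or not matching usage exists.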

S3 Cost Savings

S3 Storage Classes

. Durability Availability AZs Costs per GB Retrieval Fee .
S3 Standard 11x9 4x9 >=3 $0.023 No .
S3 Intelligent Tiering 11x9 3x9 >=3 $0.023 No Automatically moves objects between two access tiers based on changing access patterns
S3 IA (infrequent access) 11x9 3x9 >=3 $0.0125 Yes For data that is accessed less frequently, but requires rapid access when needed
S3 One Zone IA (infrequent access) 11x9 99.5% 1 $0.01 Yes For data that is accessed less frequently, but requires rapid access when needed
Glacier 11x9 . >=3 $0.004 (min 90 days) Yes For archival only, comes as expedited (1-5min), standard (3-5h) or bulk (5-12h)
Glacier Deep Archive 11x9 . >=3 $0.00099 (min 180 days) Yes Longer time span to retrieve
S3 RRS (reduced redundancy storage) 4x9 4x9 >=3 $0.024 . Deprecated
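The per-GB prices translate into monthly storage cost as follows (prices taken from the table above; they vary by region and over time, so treat them as a snapshot):

```python
# Monthly storage cost for 1 TB in each class, using the per-GB
# prices from the table above (a snapshot; check current pricing).

prices_per_gb = {
    "Standard": 0.023,
    "Intelligent-Tiering": 0.023,
    "Standard-IA": 0.0125,
    "One Zone-IA": 0.01,
    "Glacier": 0.004,
    "Glacier Deep Archive": 0.00099,
}

gb = 1024  # 1 TB
for cls, price in prices_per_gb.items():
    print(f"{cls}: ${gb * price:.2f}/month")  # e.g. Standard: $23.55/month
```

The spread (roughly $23.55 down to about $1.01 per TB-month) is why lifecycle policies that tier cold data down to Glacier classes are such an easy cost win, as long as the minimum storage durations and retrieval fees are acceptable.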

S3 Other Cost savings


Migration

The 6R Strategies

Rehosting — Otherwise known as “lift-and-shift.”

Replatforming — I sometimes call this “lift-tinker-and-shift.”

Repurchasing — Moving to a different product.

Refactoring/Re-architecting — Re-imagining how the application is architected and developed, typically using cloud-native features.

Retire — Get rid of.

Retain — Usually this means “revisit” or do nothing (for now).


On-Premises strategies with AWS

. .
Download Amazon Linux 2 AMI as a VM (.iso format) VMWare, KVM, VirtualBox (Oracle VM), Microsoft Hyper-V
AWS Application Discovery Service Gather information about your on-premises servers to plan a migration
Server utilization and dependency mappings
Track with AWS Migration Hub
AWS VM Import/Export Migrate existing applications into EC2
Create a DR repository strategy for your on-premises VMs
Can export the VMs back from EC2 to on-premises
AWS Server Migration Service (SMS) Incremental replication of on-premises live servers to AWS
Migrates the entire VM into AWS
AWS Database Migration Service (DMS) Replicate On-premises => AWS, AWS => AWS, AWS => On-premises
Works with various database technologies (Oracle, MySQL, DynamoDB, etc..)

Storage Gateway (Core Topic)

Overview

AWS Storage Gateway connects an on-premises software appliance with cloud-based storage to provide seamless integration with data security features between your on-premises IT environment and the AWS storage infrastructure. You can use the service to store data in the AWS Cloud for scalable and cost-effective storage that helps maintain data security.

The gateway connects to AWS storage services, such as Amazon S3, Amazon Glacier, Amazon EBS, and AWS Backup, providing storage for files, volumes, snapshots, and virtual tapes in AWS.

Gateway types

File gateway (NFS, SMB)

The File Gateway presents a file interface that enables you to store files as objects in Amazon S3 using the industry-standard NFS and SMB file protocols, and access those files via NFS and SMB from your datacenter or Amazon EC2, or access those files as objects with the S3 API.

File gateway scenarios

Volume gateway (iSCSI)

The Volume Gateway presents your applications storage volumes using the iSCSI block protocol. Data written to these volumes can be asynchronously backed up as point-in-time snapshots of your volumes, and stored in the cloud as Amazon EBS snapshots. You can set the schedule for when snapshots occur or create them via the AWS Management Console or service API. Snapshots are incremental backups that capture only changed blocks. All snapshot storage is also compressed to minimize your storage charges.
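The "only changed blocks" property of incremental snapshots can be sketched as a block-level diff (a conceptual model; real EBS snapshots track changed blocks internally rather than comparing volumes):

```python
# Sketch of incremental snapshots as used by the Volume Gateway: each
# snapshot after the first stores only the blocks that changed since
# the previous one. Conceptual model, not the EBS implementation.

def incremental_snapshot(volume, previous_volume):
    """Return {block_index: data} for blocks that differ from the last snapshot."""
    return {
        i: block
        for i, (block, old) in enumerate(zip(volume, previous_volume))
        if block != old
    }

snap0 = ["A", "B", "C", "D"]     # the first snapshot captures every block
volume = ["A", "B2", "C", "D2"]  # two blocks changed since snap0

delta = incremental_snapshot(volume, snap0)
print(delta)  # {1: 'B2', 3: 'D2'}
```

Restoring any point in time then means replaying the base snapshot plus the chain of deltas, which is why storage charges stay proportional to change rate rather than volume size.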

Tape gateway (VTL)

The Tape Gateway presents itself to your existing backup application as an industry-standard iSCSI-based virtual tape library (VTL), consisting of a virtual media changer and virtual tape drives. You can continue to use your existing backup applications and workflows while writing to a nearly limitless collection of virtual tapes. Each virtual tape is stored in Amazon S3. When you no longer require immediate or frequent access to data contained on a virtual tape, you can have your backup application move it from the Storage Gateway Virtual Tape Library into an archive tier that sits on top of Amazon Glacier cloud storage, further reducing storage costs.


Snowball

Snowball is a petabyte-scale data transport solution that uses devices designed to be secure to transfer large amounts of data into and out of the AWS Cloud. Using Snowball addresses common challenges with large-scale data transfers including high network costs, long transfer times, and security concerns. Customers today use Snowball to migrate analytics data, genomics data, video libraries, image repositories, and backups, and to archive data as part of data center shutdowns, tape replacement, or application migration projects. Transferring data with Snowball is simple, fast, more secure, and can be as little as one-fifth the cost of transferring data via high-speed Internet.

This replaces AWS Import/Export, which was a manual service for shipping drives to AWS.

Snowball Process

Speeding up transfer into Snowball Edge


Database Migration Service (Core Topic)

Overview

AWS Database Migration Service helps you migrate databases to AWS quickly and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases.

AWS Database Migration Service supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle or Microsoft SQL Server to Amazon Aurora. With AWS Database Migration Service, you can continuously replicate your data with high availability and consolidate databases into a petabyte-scale data warehouse by streaming data to Amazon Redshift and Amazon S3.

Schema Conversion Tool (SCT)

Good to know

Combine Snowball & DMS


Application Discovery Service (Core Topic)

AWS Application Discovery Service helps enterprise customers plan migration projects by gathering information about their on-premises data centers.

Planning data center migrations can involve thousands of workloads that are often deeply interdependent. Server utilization data and dependency mapping are important early steps in the migration process. AWS Application Discovery Service collects and presents configuration, usage, and behavior data from your servers to help you better understand your workloads.

The collected data is retained in encrypted format in an AWS Application Discovery Service data store. You can export this data as a CSV file and use it to estimate the Total Cost of Ownership (TCO) of running on AWS and to plan your migration to AWS. In addition, this data is also available in AWS Migration Hub, where you can migrate the discovered servers and track their progress as they get migrated to AWS.


Server Migration Service

AWS Server Migration Service (SMS) is an agentless service which makes it easier and faster for you to migrate thousands of on-premises workloads to AWS. AWS SMS allows you to automate, schedule, and track incremental replications of live server volumes, making it easier for you to coordinate large-scale server migrations.

Summary:


AWS Migration Hub

AWS Migration Hub (Migration Hub) provides a single place to discover your existing servers, plan migrations, and track the status of each application migration. The Migration Hub provides visibility into your application portfolio and streamlines planning and tracking. You can visualize the connections and the status of the servers and databases that make up each of the applications you are migrating, regardless of which migration tool you are using.

Migration Hub gives you the choice to start migrating right away and group servers while migration is underway, or to first discover servers and then group them into applications. Either way, you can migrate each server in an application and track progress from each tool in the AWS Migration Hub.


AWS Cloud Adoption Readiness Tool (CART)

Helps organizations of all sizes develop efficient and effective plans for cloud adoption and enterprise cloud migrations. This 16-question online survey and assessment report details your cloud migration readiness across six perspectives including business, people, process, platform, operations, and security. Once you complete a CART survey, you can provide your contact details to download a customized cloud migration assessment that charts your readiness and what you can do to improve it. This tool is designed to help organizations assess their progress with cloud adoption and identify gaps in organizational skills and processes.


Disaster Recovery

| From | To | Comment |
| --- | --- | --- |
| On-prem | On-prem | Traditional DR, very expensive |
| On-prem | Cloud | Hybrid recovery |
| Cloud Region A | Cloud Region B | |

| Strategy | RPO | RTO | Costs | Comment | What to do for DR |
| --- | --- | --- | --- | --- | --- |
| Backup & Restore | High | High | $ | Regular backups | Restore |
| Pilot Light | Medium | Medium | $$ | Core system is always running, but cannot process requests without additional action being taken | Add non-critical systems |
| Warm Standby | Low | Low | $$$ | Full system at minimum size always running, can handle traffic immediately (at reduced capacity) | Add resources |
| Multi Site/Hot Site | Lowest | Lowest | $$$$ | Full system at production size always running | Only switch traffic |
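The trade-off in the table above can be sketched as "pick the cheapest strategy that still meets your RTO". The threshold hours below are illustrative assumptions, not AWS guidance:

```python
# Cheapest-first list of DR strategies; each entry is
# (name, worst-case RTO in hours it can meet, relative cost).
STRATEGIES = [
    ("Backup & Restore", 24.0, "$"),
    ("Pilot Light",       4.0, "$$"),
    ("Warm Standby",      1.0, "$$$"),
    ("Multi Site",        0.1, "$$$$"),
]

def cheapest_strategy(rto_hours: float) -> str:
    # Walk from cheapest to most expensive; take the first one fast enough.
    for name, max_rto, _cost in STRATEGIES:
        if max_rto <= rto_hours:
            return name
    return "Multi Site"

print(cheapest_strategy(24))  # Backup & Restore
print(cheapest_strategy(2))   # Warm Standby
```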

VPC

Virtual Private Cloud (VPC)

Overview

Amazon Virtual Private Cloud (Amazon VPC) is a service that lets you launch AWS resources in a logically isolated virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways. You can use both IPv4 and IPv6 for most resources in your virtual private cloud, helping to ensure secure and easy access to resources and applications.

As one of AWS's foundational services, Amazon VPC makes it easy to customize your VPC's network configuration. You can create a public-facing subnet for your web servers that have access to the internet. It also lets you place your backend systems, such as databases or application servers, in a private-facing subnet with no internet access. Amazon VPC lets you use multiple layers of security, including security groups and network access control lists, to help control access to Amazon EC2 instances in each subnet.

Default VPC (Amazon specific)

Non-default VPC (regular VPC)

VPC Scenarios

Components

Structure & Packet Flow

Packet flow through VPC components

VPC Flow Logs

VPC Flow Logs is a feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data can be published to Amazon CloudWatch Logs and Amazon S3. After you've created a flow log, you can retrieve and view its data in the chosen destination.

Can be created at 3 levels: VPC, subnet, or network interface (ENI)
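A flow log record (version 2, default format) is a space-separated line. A minimal parsing sketch, using the sample ACCEPT record from the AWS documentation:

```python
# Field order of the default (version 2) VPC Flow Log format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

def parse_flow_log(line: str) -> dict:
    return dict(zip(FIELDS, line.split()))

record = parse_flow_log(
    "2 123456789010 eni-1235b8ca 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
# Protocol 6 (TCP) traffic to destination port 22 (SSH) was allowed.
print(record["action"], record["dstport"])
```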

Limits

| Limit | Default |
| --- | --- |
| VPCs per region | 5 |
| Min/max VPC size | /28 to /16 |
| Subnets per VPC | 200 |
| Internet gateways per region | 5 |
| Customer gateways per region | 50 |
| Elastic IPs per account per region | 5 |
| VPN connections per region | 50 |
| Route tables per region | 200 |
| Security groups per region | 500 |
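The /28-to-/16 size limit is easy to reason about with Python's `ipaddress` module. Note that AWS reserves 5 addresses in every subnet (network address, VPC router, DNS, one reserved for future use, and broadcast):

```python
import ipaddress

AWS_RESERVED = 5  # addresses AWS reserves in each subnet

def usable_hosts(cidr: str) -> int:
    """Addresses in a subnet actually available for instances."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED

print(usable_hosts("10.0.0.0/28"))  # 11 -- the smallest allowed size
print(usable_hosts("10.0.0.0/24"))  # 251
```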

VPC Peering

Longest prefix match
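When routes overlap (e.g. peering connections to VPCs with overlapping CIDRs), the route table uses the most specific — longest — matching prefix. A sketch with made-up route targets:

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target.
ROUTES = {
    "10.0.0.0/16":    "local",
    "0.0.0.0/0":      "igw-1234",
    "192.168.0.0/16": "pcx-aaaa",  # peering to VPC A
    "192.168.1.0/24": "pcx-bbbb",  # more specific peering to VPC B
}

def resolve(dest: str) -> str:
    """Return the target of the longest prefix that contains `dest`."""
    matches = [ipaddress.ip_network(c) for c in ROUTES
               if ipaddress.ip_address(dest) in ipaddress.ip_network(c)]
    best = max(matches, key=lambda n: n.prefixlen)
    return ROUTES[str(best)]

print(resolve("192.168.1.15"))  # pcx-bbbb -- the /24 beats the /16
print(resolve("192.168.2.15"))  # pcx-aaaa
```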


Transit Gateway

Overview

AWS Transit Gateway connects VPCs and on-premises networks through a central hub. This simplifies your network and puts an end to complex peering relationships. It acts as a cloud router – each new connection is only made once.

As you expand globally, inter-Region peering connects AWS Transit Gateways together using the AWS global network. Your data is automatically encrypted, and never travels over the public internet. And, because of its central position, AWS Transit Gateway Network Manager has a unique view over your entire network, even connecting to Software-Defined Wide Area Network (SD-WAN) devices.

Previously: Transit VPC (=Software VPN)


VPC Endpoints (Core Topic)

A VPC endpoint enables private connections between your VPC and supported AWS services and VPC endpoint services powered by AWS PrivateLink. AWS PrivateLink is a technology that enables you to privately access services by using private IP addresses. Traffic between your VPC and the other service does not leave the Amazon network. A VPC endpoint does not require an internet gateway, virtual private gateway, NAT device, VPN connection, or AWS Direct Connect connection. Instances in your VPC do not require public IP addresses to communicate with resources in the service.

VPC Endpoint Gateway

A gateway endpoint is a gateway that is a target for a specified route in your route table. This type of endpoint is used for traffic destined to a supported AWS service, such as Amazon S3 or Amazon DynamoDB.

VPC Endpoint Interface

An interface endpoint is an elastic network interface with a private IP address from the IP address range of your subnet. It serves as an entry point for traffic destined to a supported AWS service or a VPC endpoint service. Interface endpoints are powered by AWS PrivateLink.

VPC Endpoint Gateway Load Balancer

A Gateway Load Balancer endpoint is an elastic network interface with a private IP address from the IP address range of your subnet. Gateway Load Balancer endpoints are powered by AWS PrivateLink.

VPC Endpoint Policies

A VPC endpoint policy is an IAM resource policy that you attach to an endpoint when you create or modify the endpoint. If you do not attach a policy when you create an endpoint, we attach a default policy for you that allows full access to the service. If a service does not support endpoint policies, the endpoint allows full access to the service. An endpoint policy does not override or replace IAM user policies or service-specific policies (such as S3 bucket policies). It is a separate policy for controlling access from the endpoint to the specified service.
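As an illustration, a hypothetical endpoint policy that allows only object reads and writes against a single bucket through the endpoint (the bucket name and action list are examples, and IAM/bucket policies still apply independently):

```python
import json

# Example endpoint policy document; the endpoint denies anything not allowed here.
endpoint_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
}
policy_json = json.dumps(endpoint_policy, indent=2)
print(policy_json)
```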


AWS PrivateLink

Overview

AWS PrivateLink enables you to connect to some AWS services, services hosted by other AWS accounts (referred to as endpoint services), and supported AWS Marketplace partner services, via private IP addresses in your VPC. The interface endpoints are created directly inside of your VPC, using elastic network interfaces and IP addresses in your VPC’s subnets. That means that VPC Security Groups can be used to manage access to the endpoints.

This approach is recommended if you want to consume services offered by another VPC securely within the AWS network, with all traffic staying on the global AWS backbone and never traversing the public internet.


VPN (Core Topic)

AWS Virtual Private Network solutions establish secure connections between your on-premises networks, remote offices, client devices, and the AWS global network. AWS VPN comprises two services: AWS Site-to-Site VPN and AWS Client VPN. Together, they deliver a highly available, managed, and elastic cloud VPN solution to protect your network traffic.

AWS Site-to-Site VPN creates encrypted tunnels between your network and your Amazon VPCs or AWS Transit Gateways. For managing remote access, AWS Client VPN connects your users to AWS or on-premises resources using a VPN software client.

| VPN connectivity option | Description |
| --- | --- |
| AWS Site-to-Site VPN | You can create an IPsec VPN connection between your VPC and your remote network. On the AWS side of the Site-to-Site VPN connection, a virtual private gateway or transit gateway provides two VPN endpoints (tunnels) for automatic failover. You configure your customer gateway device on the remote side of the Site-to-Site VPN connection. |
| AWS Client VPN | A managed client-based VPN service that enables you to securely access your AWS resources or your on-premises network. With AWS Client VPN, you configure an endpoint to which your users can connect to establish a secure TLS VPN session. This enables clients to access resources in AWS or an on-premises network from any location using an OpenVPN-based VPN client. |
| AWS VPN CloudHub | If you have more than one remote network (for example, multiple branch offices), you can create multiple AWS Site-to-Site VPN connections via your virtual private gateway to enable communication between these networks. |
| Third party software VPN appliance | You can create a VPN connection to your remote network by using an Amazon EC2 instance in your VPC that's running a third party software VPN appliance. AWS does not provide or maintain third party software VPN appliances; however, you can choose from a range of products provided by partners and open source communities. |

Site-To-Site VPN

A customer gateway device is a physical or software appliance that you own or manage in your on-premises network (on your side of a Site-to-Site VPN connection). You or your network administrator must configure the device to work with the Site-to-Site VPN connection.

Route Propagation

AWS VPN CloudHub

Building on the AWS managed VPN options described previously, you can securely communicate from one site to another using the AWS VPN CloudHub. The AWS VPN CloudHub operates on a simple hub-and-spoke model that you can use with or without a VPC. Use this approach if you have multiple branch offices and existing internet connections and would like to implement a convenient, potentially low-cost hub-and-spoke model for primary or backup connectivity between these remote offices.

Client VPN

AWS Client VPN is a fully-managed, elastic VPN service that automatically scales up or down based on user demand. Because it is a cloud VPN solution, you don’t need to install and manage hardware or software-based solutions, or try to estimate how many remote users to support at one time.

Software VPN (not AWS managed)

VPN to multiple VPCs


Direct Connect (Core Topic)

Overview

AWS Direct Connect is a cloud service solution that makes it easy to establish a dedicated network connection from your premises to AWS. Using AWS Direct Connect, you can establish private connectivity between AWS and your datacenter, office, or colocation environment, which in many cases can reduce your network costs, increase bandwidth throughput, and provide a more consistent network experience than Internet-based connections.

AWS Direct Connect lets you establish a dedicated network connection between your network and one of the AWS Direct Connect locations. Using industry standard 802.1q VLANs, this dedicated connection can be partitioned into multiple virtual interfaces. This allows you to use the same connection to access public resources such as objects stored in Amazon S3 using public IP address space, and private resources such as Amazon EC2 instances running within an Amazon Virtual Private Cloud (VPC) using private IP space, while maintaining network separation between the public and private environments. Virtual interfaces can be reconfigured at any time to meet your changing needs.

Direct Connect Virtual Interfaces (VIF)

Connection Types

Encryption

Direct Connect Gateway


Redundant connections between on-premises and AWS


Other Services

Alexa for Business

Alexa for Business is a service that enables organizations and employees to use Alexa to get more work done. With Alexa for Business, employees can use Alexa as their intelligent assistant to be more productive in meeting rooms, at their desks, and even with the Alexa devices they already use at home or on the go. IT and facilities managers can also use Alexa for Business to measure and increase the utilization of the existing meeting rooms in their workplace.


Amazon AppStream & Amazon Workspaces

Amazon AppStream 2.0

Amazon AppStream 2.0 is a fully managed non-persistent application and desktop streaming service. You centrally manage your desktop applications on AppStream 2.0 and securely deliver them to any computer. You can easily scale to any number of users across the globe without acquiring, provisioning, and operating hardware or infrastructure. AppStream 2.0 is built on AWS, so you benefit from a data center and network architecture designed for the most security-sensitive organizations. Each end user has a fluid and responsive experience because your applications run on virtual machines optimized for specific use cases, and each streaming session automatically adjusts to network conditions.

Amazon Workspaces

Amazon WorkSpaces is a managed, secure Desktop-as-a-Service (DaaS) solution. You can use Amazon WorkSpaces to provision either Windows or Linux desktops in just a few minutes and quickly scale to provide thousands of desktops to workers across the globe. You can pay either monthly or hourly, just for the WorkSpaces you launch, which helps you save money when compared to traditional desktops and on-premises VDI solutions. Amazon WorkSpaces helps you eliminate the complexity in managing hardware inventory, OS versions and patches, and Virtual Desktop Infrastructure (VDI), which helps simplify your desktop delivery strategy. With Amazon WorkSpaces, your users get a fast, responsive desktop of their choice that they can access anywhere, anytime, from any supported device.

Amazon AppStream 2.0 vs WorkSpaces


Amazon DocumentDB

Overview

Amazon DocumentDB (with MongoDB compatibility) is a fast, scalable, highly available, and fully managed document database service that supports MongoDB workloads. As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data.

Amazon DocumentDB is a non-relational database service designed from the ground up to give you the performance, scalability, and availability you need when operating mission-critical MongoDB workloads at scale. In Amazon DocumentDB, storage and compute are decoupled, allowing each to scale independently, and you can increase read capacity to millions of requests per second by adding up to 15 low-latency read replicas in minutes, regardless of the size of your data.

Amazon DocumentDB is designed for 99.99% availability and replicates six copies of your data across three AWS Availability Zones (AZs). You can use AWS Database Migration Service (DMS) for free (for six months) to easily migrate your on-premises or Amazon Elastic Compute Cloud (EC2) MongoDB databases to Amazon DocumentDB with virtually no downtime.

Amazon Lex

Amazon Lex is a service for building conversational interfaces into any application using voice and text. Amazon Lex provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text, and natural language understanding (NLU) to recognize the intent of the text, to enable you to build applications with highly engaging user experiences and lifelike conversational interactions. With Amazon Lex, the same deep learning technologies that power Amazon Alexa are now available to any developer, enabling you to quickly and easily build sophisticated, natural language, conversational bots (“chatbots”).

With Amazon Lex, you can build bots to increase contact center productivity, automate simple tasks, and drive operational efficiencies across the enterprise. As a fully managed service, Amazon Lex scales automatically, so you don’t need to worry about managing infrastructure.


Amazon Rekognition

Amazon Rekognition makes it easy to add image and video analysis to your applications using proven, highly scalable, deep learning technology that requires no machine learning expertise to use. With Amazon Rekognition, you can identify objects, people, text, scenes, and activities in images and videos, as well as detect any inappropriate content. Amazon Rekognition also provides highly accurate facial analysis and facial search capabilities that you can use to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases.

With Amazon Rekognition Custom Labels, you can identify the objects and scenes in images that are specific to your business needs. For example, you can build a model to classify specific machine parts on your assembly line or to detect unhealthy plants. Amazon Rekognition Custom Labels takes care of the heavy lifting of model development for you, so no machine learning experience is required. You simply need to supply images of objects or scenes you want to identify, and the service handles the rest.


CloudSearch

Amazon CloudSearch is a managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application.


Kinesis Video Streams

Amazon Kinesis Video Streams makes it easy to securely stream video from connected devices to AWS for analytics, machine learning (ML), playback, and other processing. Kinesis Video Streams automatically provisions and elastically scales all the infrastructure needed to ingest streaming video data from millions of devices. It durably stores, encrypts, and indexes video data in your streams, and allows you to access your data through easy-to-use APIs. Kinesis Video Streams enables you to playback video for live and on-demand viewing, and quickly build applications that take advantage of computer vision and video analytics through integration with Amazon Rekognition Video, and libraries for ML frameworks such as Apache MxNet, TensorFlow, and OpenCV. Kinesis Video Streams also supports WebRTC, an open-source project that enables real-time media streaming and interaction between web browsers, mobile applications, and connected devices via simple APIs. Typical uses include video chat and peer-to-peer media streaming.


Amazon Mechanical Turk


Device Farm


Macie

Overview

Amazon Macie is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS. Amazon Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property, and provides you with dashboards and alerts that give visibility into how this data is being accessed or moved. The fully managed service continuously monitors data access activity for anomalies, and generates detailed alerts when it detects risk of unauthorized access or inadvertent data leaks. Amazon Macie is available to protect data stored in Amazon S3.


Amazon Pinpoint

Amazon Pinpoint is a flexible and scalable outbound and inbound marketing communications service. You can connect with customers over channels like email, SMS, push, or voice. Amazon Pinpoint is easy to set up, easy to use, and is flexible for all marketing communication scenarios. Segment your campaign audience for the right customer and personalize your messages with the right content. Delivery and campaign metrics in Amazon Pinpoint measure the success of your communications. Amazon Pinpoint can grow with you and scales globally to billions of messages per day across channels.


Private Marketplace

A private marketplace controls which products users in your AWS account, such as business users and engineering teams, can procure from AWS Marketplace. It is built on top of AWS Marketplace, and enables your administrators to create and customize curated digital catalogs of approved independent software vendors (ISVs) and products that conform to their in-house policies. Users in your AWS account can find, buy, and deploy approved products from your private marketplace, and ensure that all available products comply with your organization’s policies and standards.


Amazon FSx

Amazon FSx makes it easy and cost effective to launch and run popular file systems that are fully managed by AWS. With Amazon FSx, you can leverage the rich feature sets and fast performance of widely-used open source and commercially-licensed file systems, while avoiding time-consuming administrative tasks such as hardware provisioning, software configuration, patching, and backups. It provides cost-efficient capacity with high levels of reliability, and integrates with a broad portfolio of AWS services to enable faster innovation.


Amazon WorkDocs

Amazon WorkDocs is a fully managed, secure content creation, storage, and collaboration service. With Amazon WorkDocs, you can easily create, edit, and share content, and because it’s stored centrally on AWS, access it from anywhere on any device. Amazon WorkDocs makes it easy to collaborate with others, and lets you easily share content, provide rich feedback, and collaboratively edit documents. You can use Amazon WorkDocs to retire legacy file share infrastructure by moving file shares to the cloud. Amazon WorkDocs lets you integrate with your existing systems, and offers a rich API so that you can develop your own content-rich applications. Amazon WorkDocs is built on AWS, where your content is secured on the world's largest cloud infrastructure.