
AWS - 10,000 foot overview Part 1 - Global infrastructure

Region : Geographical area, each region consists of 2 or more availability zones


Availability zone: basically a data center (one or more discrete data centers), provides redundant power, networking and connectivity
Edge Locations: endpoints for caching content, used by CloudFront, Amazon's CDN. There are more edge locations than regions. "An edge location is where end users access services located at AWS. They are located in most of the major cities around the world and are specifically used by CloudFront (CDN) to distribute content to end users to reduce latency."
● Location where content will be cached, currently over 50 edge locations.
● Not just read-only; they can be written to as well. When writing, the edge location will send the object to the origin server.
● Objects are cached for the life of the TTL. You can clear cached objects, but you will be charged.

AWS - 10,000 foot overview Part 2 - 4 - Services


Compute:
● EC2 - cloud compute, virtual machines on the AWS platform.
● EC2 Container Service (ECS) - where you run Docker containers
● Elastic Beanstalk - automatically handles deployment of your code, from capacity provisioning, load balancing and auto-scaling to health monitoring
● Lambda - upload your code and AWS executes it for you; nothing to manage, you just worry about your code
● Lightsail - VPS service, no need to worry about the infrastructure, provisions you a
server with fixed ip address and access
● Batch - batch computing
Storage:
● S3 - oldest storage, simple storage service. Object based storage, you have buckets
where you upload your files
● Efs - simple, scalable file storage for use with EC2. storage capacity is elastic, growing
and shrinking automatically. Manages all file storage infrastructure, avoid complexity of
deploying, patching and maintaining complex file system.
○ Supports the Network File System version 4 (NFSv4) protocol, so existing applications and tools work seamlessly with EFS.
○ Multiple EC2 instances can access EFS at the same time, provide common data
source for workloads and apps running on more than one instance / server
○ Pay only for storage used by your file system, no setup cost
● Glacier - data archiving
● Snowball - ship large amounts of data to AWS on a physical appliance; AWS uploads it for you, so it is fast and does not rely on the network
● Storage gateway - a virtual appliance you install on-premise that enables your on-premise applications to seamlessly use S3 storage
Database:
● RDS - relational database service, mysql, microsoft sql server, oracle, any relational
database
● dynamoDB - nonrelational database
● Elasticache - caching common queries for your database
● Redshift - data warehousing, data intelligence
Migration:
● AWS migration hub - tracking service, track the progress of your application migrations
● Application discovery service - automated tools to check your dependencies
● Database migration service DMS - easy way to migrate your databases
● Server migration service - migrate your virtual and physical servers
● Snowball - same as storage
Networking & content delivery:
● VPC - amazon virtual private cloud, configure availability zones, firewalls etc. virtual
network dedicated to single AWS account. Logically isolated from other virtual networks.
● cloudFront - content delivery network
● route53 - DNS service
● Api gateway - creating your own API for your own services
● Direct connect - way of running dedicated line from your office directly into amazon,
using VPC
Developer tools:
● Codestar - project management
● Codecommit - store your code, source control service
● Codebuild - compiles your code and runs test, produce software packages
● Codedeploy - deployment, automate application deployment
● Code pipeline - continuous delivery service, visual steps
● X-ray - debug and analyze your serverless applications, debug, find bottlenecks
● Cloud9 - IDE for AWS
Management tools:
● Cloudwatch - monitoring service
● Cloudformation - scripting infrastructure, turn your infrastructure into code to deploy
entire cloud environments via JSON script
● cloudTrail - audit service, records API calls made in your account; turned on automatically, with event history retained for 90 days
● Config - monitors configuration of aws environment, a historical overview
● Opsworks - automating your environment, config of your environment, uses Chef,
automation platform that treats server configuration as code
● Service catalog - catalog management, used by big organization
● Systems manager - interface for managing resources, normally for EC2, group your
resources
● Trusted advisor - give advice for security, tells you about aws optimization, save money,
kind of like an accountant
● Managed services - AWS manages the day-to-day operation of your infrastructure on your behalf
Media Services:
● Elastic transcoder - takes video and re-sizes it for all resolutions
Machine learning:
● Sagemaker - makes it easy for developers to build, train and deploy machine learning / deep learning models
● Comprehend - natural language processing, e.g. sentiment analysis of customer feedback on your products
● Deeplens - AI aware camera, the camera can understand what it is looking at
● Lex - powers the Alexa service, basically AI chat / conversational bots
● Machine learning - for machine learning, analyze dataset
● Polly - takes text and turns into speech
● Rekognition - does video and images, tells you what is in the image and video
● Amazon translate - machine translation, for human languages
● Amazon transcribe - for transcription
Analytics:
● Athena - run sql queries against your s3 bucket, serverless
● EMR / elastic mapreduce - process large amounts of data, like hadoop
● Cloud search - search service
● Kinesis - collating / ingesting large amounts of data streamed from multiple sources
● Kinesis video streams - ingest video streams to run processing
● Quicksight - business intelligence tool
● Data pipeline - move data between different aws services
● Glue - for extract, transform, load, for large amount of data migration
Security & Identity & Compliance
● IAM - identity access management
● Cognito - user and device authentication
● GuardDuty - protection from malicious activity (threat detection)
● Inspector - install on virtual machine to run tests for security, generates reports for
vulnerabilities
● Macie - scan s3 buckets for personal information
● Certificate manager - ssl certificates for free
● Cloudhsm - hardware security modules, to store private and public keys, can also store
other encryption key to encrypt objects
● Directory service - integrating microsoft active directory
● WAF - web application firewall, layer 7 firewall to stop scripting, injections etc.
● Shield - DDoS protection; the standard tier is included by default
● Artifact - portal for on demand access, for compliance reports
Mobile Services
● Mobile hub
● Pinpoint - targeted push notifications to drive mobile engagement
● Aws appsync - updates data in real time
● Device farm - testing app on real life devices
● Mobile analytics - analytics service for mobiles
AR / VR
● Sumerian - a common set of tools for building augmented reality, virtual reality and 3D applications
Application integration
● Step functions - manage lambda functions
● Amazon MQ - message queueing
● SNS - notification services
● SQS - oldest service launched in 2006, decoupling infrastructure
● SWF (Simple Workflow Service) - coordinates work across distributed application components; e.g. every time you order a package, a workflow job is created
Customer engagement
● Connect - contact center, like a call center
● Simple email service - email service, highly scalable
Business productivity
● Alexa for business
● Chime - like google hangouts, for video conferencing
● Work docs - like drop box
● Work mail - office 365 basically
Desktop & App Streaming
● Work spaces - VDI solutions, run operating systems virtually, can replace local desktop
environments
● AppStream 2.0 - streaming application
Internet of things
● iOT
● Iot device management
● Amazon freeRTOS - OS for microcontrollers
● Greengrass - software for data caching, machine learning, for connected devices
Game Development
● GameLift - for game development (deploy and scale multiplayer game servers)

Identity Access Management 101


What does IAM give you?
● Allows you to manage users and their level of access to the console.
● Centralized control of your account
● Shared access to your account
● Granular permissions
● Identity federation (including Active Directory, Facebook, LinkedIn)
● Multi Factor authentication
● Provide temp access for users/devices and services
● Allows you to set up your own password rotation policy
● Integrates with many different aws services
● Supports PCI DSS compliance
Critical Terms
● Users - end users
○ New users have no permissions when first created
○ New users are assigned an access key ID and secret access key when first created
● Groups - collection of users under one set of permissions / departments
● Roles - create roles and assign them to AWS resources
● Policies - document that defines one or more permissions
○ Policies are defined in JSON
● IAM does not require you to specify a region - GLOBAL
Access Types
● Programmatic access - for users that require access to the API, AWS CLI, Tools for Windows PowerShell, etc. This creates an access key for each new user.
○ The access key ID and secret access key are only used to access AWS through the CLI/API, not through the console
● AWS management console access - if user requires access to AWS management
console. Creates password for each user.
Power user access - allows access to all AWS services except management of groups and
users. Highest user role behind admin access.
Using SAML you can give your federated users single sign on access to your console
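A minimal CLI sketch of creating a user with programmatic access and power user permissions (user, group and key names are placeholders; PowerUserAccess is an AWS managed policy):
aws iam create-user --user-name alice
aws iam create-access-key --user-name alice        # returns the access key ID and secret access key
aws iam create-group --group-name developers
aws iam add-user-to-group --group-name developers --user-name alice
aws iam attach-group-policy --group-name developers --policy-arn arn:aws:iam::aws:policy/PowerUserAccess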

S3 - 101
Simple Storage Service.
Provides developers and IT teams with secure, durable, highly scalable storage; store and retrieve any amount of data from anywhere on the web.
Object based storage.
Basics
● Object based - upload files
● Files can be from 0 bytes to 5 TB
○ The largest object that can be uploaded in a single PUT is 5 GB
● Unlimited storage, paid by the gig
● Files stored in buckets, basically a folder
● Universal namespace: each bucket basically gets a DNS address, so bucket names must be globally unique
● Built for 99.99% availability; Amazon guarantees 99.9% availability in the SLA
● Amazon designs for 11 x 9s (99.999999999%) durability for s3 information
● Tiered storage available - different storage classes
● Lifecycle management - if file is over x days old, move it to another storage or archive it
● Versioning - version control for files
● Encryption
● Secure data using Access control lists and bucket policies
● You can upload files faster by enabling multipart upload, which improves the experience for larger objects by uploading them in parts. Parts can be uploaded independently, in any order, and in parallel (see the CLI sketch after this list).
● Able to have 100 buckets per account by default
● As of 2018, S3 provides support for at least 3,500 requests per second to add data, and
5,500 requests per second to retrieve data
● Choosing a region should be based on these factors:
○ Near your customers, your data centers, or other AWS resources in order to
reduce data access latencies
○ Remote from your other operations for geographic redundancy and disaster
recovery purposes
○ Enables you to address specific legal and regulatory requirements
○ Allows you to reduce storage costs. You can choose lower priced region and
save money
● For GET-intensive workloads, consider using CloudFront. By integrating CloudFront with S3, you can distribute content to users with low latency and a high data transfer rate. This also sends fewer direct requests to Amazon S3, reducing costs.
● Path-style URL: https://s3.region.amazonaws.com/bucketname/key
● Virtual-hosted-style URL: https://bucketname.s3.region.amazonaws.com/key
● S3 uses Content-MD5 checksums and cyclic redundancy checks (CRCs) to detect data corruption. It performs these checksums on data at rest and repairs corruption using redundant data. The service also calculates checksums on all network traffic to detect corruption of data packets.
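Multipart upload sketch: the high-level aws s3 commands switch to multipart upload automatically once a file passes the configured threshold (bucket and file names are placeholders):
aws configure set default.s3.multipart_threshold 64MB
aws configure set default.s3.multipart_chunksize 16MB
# anything over 64MB is now uploaded in 16MB parts, independently and in parallel
aws s3 cp ./large-backup.tar s3://my-bucket/backups/large-backup.tar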
Choosing a naming scheme
● Pay close attention to naming scheme if
○ You want consistent performance
○ Your requests per second regularly exceed 100 and include a mixture of request types
● Naming schemes help to ensure data partitioning
● Introducing random naming helps S3 partition better, resulting in better performance
● Bad examples:
○ 2016-12-15-photo1.jpg
○ 2016-12-15-photo2.jpg
○ You can see how each file starts with “2016” so they will all be written to the
same partition, this is bad!
● Increase randomness in the keyname
● You can hash the filename and use the hash as the key prefix (see the sketch below)
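A minimal sketch of hashing a key name to randomize the prefix (bucket name is a placeholder):
# take the first 4 hex characters of the MD5 of the filename and prepend them to the key
PREFIX=$(echo -n "2016-12-15-photo1.jpg" | md5sum | cut -c1-4)
aws s3 cp 2016-12-15-photo1.jpg s3://my-bucket/${PREFIX}-2016-12-15-photo1.jpg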
Data consistency model
● Read after write consistency for PUTS of new objects - if you create a new object into
s3, you can read the info immediately
● Eventual consistency for overwrite PUTS and DELETES - if you update an object or
delete an object, takes some time to propagate, takes time to replicate across availability
zones
Simple key-value store
● Objects consists of:
○ Key - name that you assign to the object.
○ value - made up of sequence of bytes, the content you are storing.
○ Version ID - important for versioning, key and version id uniquely identify the
object
○ Metadata - data about data you store
○ Subresources:
■ Access control lists - input individual permissions for your files
■ Torrent - torrenting
Storage tiers:
S3 storage classes can be configured at an object level, so a single bucket can contain objects
across s3 standard, standard IA etc.
● S3 standard: 11 x 9 durability, designed to sustain the loss of 2 facilities / availability
zones concurrently. Redundantly across multiple devices in multiple facilities. Delivers
low latency and high throughput. S3 lifecycle management for automatic migration of
objects to other storage classes.
○ Designed for 99.99% availability (99.9% SLA)
● S3 Intelligent-Tiering: for unknown or changing access storage. Automatically move
data to the most cost effective access tier without impacting performance or operational
overhead. Works by placing objects in two access tiers; tier one is for frequently
accessed objects, tier two for infrequent. Same low latency and throughput as standard
S3. small monthly monitoring and auto-tiering fee.
○ All objects are always available when needed, no retrieval fees
○ Same performance as standard
○ Designed for 99.9% availability
○ Requires minimum storage duration of 30 days
● IA (Standard - Infrequent Access): for data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3 Standard, but you are charged a retrieval fee. Ideal for long term storage, backups, and disaster recovery files.
○ Designed for 99.9% availability
○ Data deleted within 30 days will be charged for full 30 days
○ Designed for larger objects, min object storage charge of 128KB
● One Zone - IA : only stored in one availability zone, lower cost option. Ideal for storing secondary backup copies or easily recreatable data. Designed for 99.5% availability.
○ 20% less cost than Standard - IA
● Glacier : used for archiving, standard retrieval takes 3 - 5 hours. Three types: expedited,
standard, or bulk, which just gives different retrieval times.
○ Expedited retrieval : 1-5 minutes
○ Standard retrieval: 3 -5 hours
○ Bulk retrieval : 5 - 12 hours
○ Minimum of 90 days of storage, objects deleted before 90 days incur pro-rated
charge equal to storage charge for remaining days
○ Provisioned Capacity - combined with expedited retrieval, ensures the retrieval
capacity is available when you need it.
■ Each unit of capacity ensures that at least three expedited retrievals can be performed every 5 minutes
■ Should use provisioned capacity if workload requires highly reliable and
predictable access to subset of data in minutes
■ If you require expedited retrieval under any circumstances, use with
provisioned capacity
● Reduced Redundancy Storage (RRS): store noncritical, reproducible data at lower
levels of redundancy than S3 standard. Provides highly available solution for distributing
or sharing content that is durably stored elsewhere, or for storing thumbnails, transcoded
media, or processed data that can be easily reproduced. Stores objects on multiple
devices across multiple facilities, providing 400 X durability of typical disk drive, but does
not replicate objects as many times as S3 standard.
○ 99.99% durability and availability
○ Designed to sustain loss of data in a single facility
Charges:
● Charged for storage, requests, storage management pricing (basically charged for
metadata), data transfer pricing, transfer acceleration
● Charges can vary across regions.
● Based on the location of your bucket
● No data transfer fee for data transferred within S3 region via Copy request, however
there is a fee if done between different regions
● No charge for data transferred between Ec2 and S3 in same region
S3 Select
● Run sophisticated queries against data stored without need to move data into separate
analytics platform
● Increase performance and reduce cost for analytics solutions leveraging s3 as data lake
● Easy to retrieve a subset of an object's contents using SQL without retrieving the entire object.
● Retrieve subset of data, using SELECT, WHERE etc. from objects stored in JSON, CSV,
or Apache Parquet format.
● Can be used with lambda to build serverless applications, that use S3 select to retrieve
data from S3.
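A hedged CLI sketch of S3 Select against a CSV object (bucket, key and column names are made up for illustration):
aws s3api select-object-content \
  --bucket my-data-bucket \
  --key sales/2018.csv \
  --expression "SELECT s.country, s.total FROM S3Object s WHERE CAST(s.total AS FLOAT) > 100" \
  --expression-type SQL \
  --input-serialization '{"CSV": {"FileHeaderInfo": "USE"}}' \
  --output-serialization '{"CSV": {}}' \
  output.csv
Only the matching rows are returned to output.csv; the whole object is never downloaded.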
Amazon Athena
● Serverless service allowing you to analyze data in S3 using standard SQL queries.
● Don't need to load your data into Athena.
Redshift Spectrum
● Run queries against exabytes of unstructured data in S3.
● When you issue query, it goes to redshift SQL endpoint, which generates a query plan.
Determines what data is local and what data is in S3, generates plan to minimize amount
of S3 data that needs to be read.
● Queries run quickly regardless of data size
● Lets you separate storage and compute, allowing you to scale each independently.
Transfer acceleration:
● Enables fast, easy and secure transfer of files over long distances between your S3 bucket and end users. Takes advantage of CloudFront's globally distributed edge locations; as data arrives at an edge location, it is routed to Amazon S3 over an optimized network path.
● If your objects are smaller than 1gb, consider using CloudFront instead
● Optimal for large distance between source and destination
S3 Inventory
● Provides CSV, ORC or Parquet file output of your objects and corresponding metadata
on daily or weekly basis.
Encryption at Rest:
● Client side encryption - data is encrypted before uploaded to S3. you manage the
encryption locally, using your own managed keys.
○ You can use KMS master key
■ Using AWS SDK a call is made to KMS for data encryption key, KMS
supplies the data encryption key in plaintext and encrypted form. The key
is used to encrypt the data on the client side, and the encrypted data is
uploaded to S3 along with encrypted data key
● Server side encryption
○ With amazon s3 managed keys (SSE-S3) - S3 manages the key for you, each
object is encrypted with a unique data key, and the data key is then encrypted by
a master key. The master key is automatically rotated periodically. Free to use.
■ Uses AES-256
■ Must set header: “x-amz-server-side-encryption”: “AES256”
○ With KMS (SSE-KMS) - you control the keys. Creation & use of master keys and
data keys, disable or rotate the master keys. You choose which key the object is
encrypted by, and the data key is encrypted by master key. The advantage is that
the use of KMS can be audited by cloudtrail, and the keys can be used across
multiple services. There is a charge
■ Must set header: “x-amz-server-side-encryption”: “aws:kms”
○ With customer provided keys (SSE-C) - key managed by you. The symmetric
key is uploaded along with the data, S3 encrypts the data with that key and
deletes it. So to decrypt the data, you need to supply the key again. Must use
HTTPS to upload the objects. Free to use.
■ Use if you want to maintain your own keys but don't want to implement or
leverage a client side encryption library
○ S3 encrypts your data before writing it to disk and decrypts when data is read
from the disk
● Control access to buckets using either bucket ACL or bucket policies
● By default buckets are private and all objects are stored private
● Symmetric encryption - uses same secret key to perform both encryption and
decryption
● Asymmetric encryption - public key encryption, uses two keys, public key for
encryption and corresponding private key for decryption. HTTPS uses asymmetric
encryption
● S3 uses envelope encryption - process of encrypting a key with another key
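CLI sketch of uploading with the server-side encryption headers set for you (bucket, file and key alias are placeholders):
# SSE-S3: S3-managed keys, AES-256
aws s3 cp report.pdf s3://my-bucket/report.pdf --sse AES256
# SSE-KMS: encrypt with a specific KMS master key so usage is auditable in CloudTrail
aws s3 cp report.pdf s3://my-bucket/report.pdf --sse aws:kms --sse-kms-key-id alias/my-app-key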
Versioning:
● Once you enable versioning on a bucket, you cannot remove it. You can only suspend it.
To remove versioning, you would have to remove the whole bucket.
● When deleting an object, you are not actually deleting it. You are basically adding a delete marker to it, which becomes the new current version of the object.
● To permanently delete the object, set the versions view to Show and delete each individual version of the object.
● You can require multi-factor authentication to delete objects and versions (MFA Delete).
● Great backup tool.
● Integrates life cycle rules.
● A successful request returns an HTTP status of 200; if you perform a GET on an object whose current version is a delete marker, you'll get a 404 Not Found.
● You can GET objects with specific version id number to get previous version number
objects
● Only owner of amazon s3 bucket can permanently delete a version
● Remember that you pay for each version you upload
○ IE 1gb object uploaded 5 times will accrue 1gbx5 storage cost
○ Enable versioning with life cycle to mitigate the versioning cost
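Versioning sketch via the CLI (bucket, key and version id are placeholders):
aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled
# list every version, including delete markers
aws s3api list-object-versions --bucket my-bucket
# permanently delete one specific version (only the bucket owner can do this)
aws s3api delete-object --bucket my-bucket --key photo.jpg --version-id <version-id>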
Cross Region Replication
● Enables automatic, asynchronous copying of objects across buckets from different
regions. Can be owned by same aws owner or different accounts.
● Buckets must be in different regions, and must have versioning enabled
● S3 must have permissions to replicate objects from the source bucket to destination
bucket
● If owner of source bucket doesn’t own object in the bucket, must grant bucket owner
READ and READ_ACP permissions with object ACL
● To copy objects, install the CLI and copy them recursively:
○ aws s3 cp --recursive s3://source-bucket s3://destination-bucket
● Files in existing bucket are not replicated automatically. All subsequent updated files will
be replicated.
● If you delete an object from primary, it will not delete it from the bucket you are
replicating to
● If you update or add object from primary bucket, it will replicate over successfully, just
delete won’t replicate over
● Depending on the object size, replications may take several hours if the size is really big
● If you are CRR with a bucket that is owned by another AWS account, you have to set a
policy that enables access to the other bucket, see on the AWS document
● If you own the bucket, just leave the user IAM role as “create new” and it will auto create
the policy for you
● Can help with:
○ Comply with compliance requirements - S3 stores data across multiple
geographical availability zones, but if requirements state that you need to store
data at a greater distance, you can use CRR
○ Minimize latency - minimize latency in accessing objects by maintaining copies
in regions that are closer
○ Increase operational efficiency
○ Maintain object copies under different ownership
○ Data protection
■ If specific version is deleted from the source, it is not deleted from the
destination
● You cannot copy objects encrypted with SSE-KMS and with SSE-C, so it only works with
SSE-S3 encryption currently
● Automated lifecycle actions are not replicated
● If the object is already a replica, it cannot be replicated again so it cannot be chained
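A minimal replication configuration sketch (role ARN and bucket names are placeholders; both buckets must already have versioning enabled):
# replication.json
{
  "Role": "arn:aws:iam::123456789012:role/s3-crr-role",
  "Rules": [
    { "Status": "Enabled", "Prefix": "", "Destination": { "Bucket": "arn:aws:s3:::my-destination-bucket" } }
  ]
}
# apply it to the source bucket
aws s3api put-bucket-replication --bucket my-source-bucket --replication-configuration file://replication.json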

Amazon Macie
● AI powered security service that helps you prevent data loss via auto discovering,
classifying, and protecting sensitive data in S3.
● Uses machine learning to recognize sensitive data
● Continuously monitors for anomalies and delivers alerts

Life Cycle Management


● Can be applied in conjunction with versioning
● Can be applied to current and previous versions
● Can use to permanently delete after x time, or move to another class storage after x
days
● Transition to the standard - IA (requires file to be at least 128 kb and 30 days after
creation date).
● Transition to Glacier is the same, requires at least 30 days after creation
● Can enable up to 1000 life cycle rules per bucket
● Can create additional rules to versioned buckets
● Cannot transition from S3 IA to S3 standard or reduced
● Cannot transition from glacier to any other class
● Cannot create life cycle rules on MFA enabled buckets
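A lifecycle rule sketch (bucket name, prefix and day counts are illustrative, not from the notes):
# lifecycle.json - move logs/ to Standard-IA after 30 days, Glacier after 60, delete after 365
{
  "Rules": [
    {
      "ID": "archive-old-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 60, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json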

Storage Class Analysis


● Analyses storage patterns to determine when to transition data to the right storage class
● Helps improve life cycle policies by monitoring data access patterns
● Chargeable feature, currently $0.10 per million objects analyzed per month
● Data won't be fully accurate until it has been analyzed for at least 30 days

Event Notification
● Provides ability to trigger notifications when certain events happen within S3
● Enabled by adding a notification configuration to the bucket identifying the events to be
published, and destination where notifications should send
● Can send notifications to SNS, SQS and Lambda
● Events to trigger notification:
○ New object creation
○ Object removal
○ Reduced Redundancy Storage (RRS) object loss
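Notification configuration sketch that invokes a Lambda function on object creation (bucket and function ARN are placeholders; the function must also grant S3 permission to invoke it):
# notification.json
{
  "LambdaFunctionConfigurations": [
    {
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:process-upload",
      "Events": ["s3:ObjectCreated:*"]
    }
  ]
}
aws s3api put-bucket-notification-configuration --bucket my-bucket --notification-configuration file://notification.json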
Bucket policy for public access
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::BUCKET_NAME/*"
            ]
        }
    ]
}
● If user and bucket belong to same AWS account, all policies are evaluated at the same
time
○ If access is granted via one policy and denied via another, deny always wins
● In most cases you would grant access either via a user policy or a bucket policy; they provide finer-grained permissions than ACLs. However, use ACLs when you need:
○ To manage access to objects not owned by bucket owner
○ Manage access to individual objects when permissions must vary between
objects
○ Grant permission to S3 Log Delivery group on bucket
● Two types of policies
○ Resource based policies consisting of bucket policies and ACL
○ User policies
● ACL
○ Xml resource policies, granting access to both buckets and objects
○ Each bucket and object has ACL attached to it as sub resource
○ Default ACL grants resource owner full control over object IE the account owner
○ Only grant permissions to AWS accounts and pre-defined groups, cannot grant
permissions to individual users
○ Used to grant basic read/write permissions to other AWS accounts
○ Three predefined groups
■ Authenticated users - represents all AWS accounts. Granting this allows
any AWS account to access this resource
■ All Users - represents EVERYONE
■ Log Delivery - grant write permission to groups to write server logs. Used
for S3 bucket logging.
○ Canned ACL - predefined ACL used for CLI
■ --acl public-read => an example of public read ACL
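For example (bucket and file name are placeholders):
# upload an object and make it world-readable with the public-read canned ACL
aws s3 cp index.html s3://my-bucket/index.html --acl public-read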
● Bucket & User Policies
○ Grant permissions to AWS accounts or IAM users; can grant very fine-grained permissions, explicitly deny access, and apply conditional permissions
○ Bucket policies are tied down to the specific bucket
○ Policy elements:
■ Principal - account or user that is allowed access
■ Effect - the effect IE allow, deny
■ Action - the list of permissions to allow or deny
■ Resource - Bucket or object for which access applies to. Specified as
amazon resource name
○ For user based policies, there is no principal because the user is the principal
○ Deny always overrides allow
○ You can also limit VPC endpoint access via bucket policy, using the
aws:sourceVpce
● Cross account access
○ When granting access to another account, that other account can only view your
buckets via the CLI
○ You can set up cross account access via the user policy or the bucket policies
○ When granted via ACL
■ ACL grants access to an AWS account, not the user
■ The conditions cannot be enforced
■ So for users in that account, you need to set up user policies as well
○ When granting via bucket policies
■ Grants directly to IAM users in addition to the account
■ Conditions are enforced and conditional statements can be added to the
policy
● Timed url
○ Grant temp permission to an object using pre-signed url
○ Provide time-limited permission to download an object that would be otherwise
private. Valid for default 3600 seconds, and can change the timeout with
--expires-in [TIME BY SECONDS] argument
○ The bucket owner generates the url by specifying:
■ Security credentials
■ Bucketname
■ Object name
■ HTTP method
■ Expiration date and time
○ These are generated programmatically
○ aws configure set default.s3.signature_version s3v4 - to make the generated url compatible with KMS
○ aws s3 presign s3://bucket_path --expires-in 300 --region bucket_region
■ Important to specify the region of the bucket
● Cloud watch metrics
○ Daily storage metrics
■ On by default
■ One report provided each day
■ No cost
■ Get reports on number of objects per bucket, size of the bucket
■ Logs retained for 15 months
○ Request & data transfer metrics
■ Not on by default
■ Available at 1 minute intervals
■ Standard cloudwatch billing rates
■ Get reports on HTTP method requests, errors, number of bytes downloaded/uploaded, first-byte latency and total request latency
■ Enabled at bucket level for all objects
○ Delivery
■ Metrics are delivered on a best effort basis
■ Completeness and timeliness is not guaranteed
● Access logging
○ Tracks request for access to bucket
○ Records information that is useful for tracking IE time/requestor IP/keyname etc.
○ Disabled by default
○ No charge to enable it but charged for log storage
○ Delivered on best effort basis
■ Could be delivered after request was made, or not at all
○ Logs are periodically delivered
■ S3 collects logs, consolidates them and uploads them so they are not
given in real time
○ Can turn on logging by adding a logging configuration to the source bucket
■ Specify target bucket where logs should be saved
○ By default only bucket owner has full permissions
○ Don't forget to grant write permission to the S3 log delivery group on the target
bucket
○ Why enable logging?
■ For security and audit reasons
■ To help understand your bill. S3 charges not just in storage basis, but also
the amount of requests
■ To help you understand your customer base by tracking IP location
○ Consider using life cycle rules to auto delete or archive logs because they will
build up over time
○ Logging works at the bucket level, not on S3 as a whole
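Sketch of enabling access logging from the CLI (bucket names and prefix are placeholders; the target bucket must already grant write access to the Log Delivery group):
# logging.json
{
  "LoggingEnabled": {
    "TargetBucket": "my-log-bucket",
    "TargetPrefix": "access-logs/"
  }
}
aws s3api put-bucket-logging --bucket my-source-bucket --bucket-logging-status file://logging.json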
CloudFront
● CDN - system of distributed servers that deliver web pages and web content to users
based on geographic location of the user
● Origin - origin of all the files that the CDN will distribute IE s3 bucket, ec2 instance,
elastic load balancer or route 53
● Distribution - name of given CDN which consists of collection of edge locations
○ Web distribution
■ Used for websites
■ Speed up distribution of static content like html css etc.
■ Distribute media files over HTTP or HTTPS
■ Use live streaming to stream events in real time
○ RTMP - media streaming, like adobe and flash
● Cloudfront can be used to deliver website, dynamic and static websites, streaming and
interactive content using edge locations. Requests for your content are automatically
routed to nearest edge location, so content is delivered in best possible performance
● Optimized to work with other AWS services and with any non-AWS origin server, which
stores the original, definitive versions of your files. No additional charge to use custom
origin.
● Delivering all content using single cloudfront distribution helps you make sure
performance optimizations are applied to your entire website/application.
● Using AWS origins, benefit from using different origins for different types of content - e.g.
S3 for static objects, EC2 for dynamic content, paying only for what you use
● Lets you quickly obtain benefits of high performance content delivery, with pay as you go
pricing model. Benefit from tight integration with other AWS services.
● Achieve redundancy by using a backup origin that serves traffic automatically if primary
origin is down
● You can clear cached objects, however you will be charged
● TTL - important design consideration. By default the TTL is set to 24 hours; if you have a service whose files are updated every 12 hours, a 24-hour TTL means your users may not see a new file until up to 12 hours after it changed, when the cached copy expires.
○ TTL is the amount of time (in seconds) an object stays in the CloudFront cache before CloudFront forwards another request to the origin to check whether the object has been updated
○ Control how long object stays in CloudFront before making request to origin.
Reducing the duration lets you serve dynamic content. Increasing the duration
means users get better performance because objects are served from edge
cache.
● Restrict viewer access - important design consideration. By default selected as no.
basically it is a way to ensure privacy on who gets to access your material. So if you run
a subscription based service, you only want your paying users to access the objects.
Enable it and it will ensure that users use a specific url to access your objects.
● When setting up S3 with CloudFront, users can still read files directly from your bucket. To ensure they only access content through CloudFront, enable an Origin Access Identity (OAI).
● Geo Restriction Settings - choose either to whitelist or blacklist geo locations, can’t do
both.
● Invalidation - invalidating objects removes them from the cloudfront edge caches. So if
you put in something that is confidential, you can remove it for a fee
● Methods of having secure access to private files in S3:
○ Cloudfront signed cookies
○ Cloudfront signed urls
○ Cloudfront origin access identity
■ By placing a bucket policy that only authorizes access via OAI, you
ensure that clients cannot access the s3 bucket directly but have to go
through cloudfront
● If the requested resource does not exist on the cloudfront server, cloudfront will query
the origin server and then cache the resource on the edge location
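Invalidation sketch (distribution id and path are placeholders; remember you are charged for invalidations):
# remove everything under /images/ from all edge caches
aws cloudfront create-invalidation --distribution-id E1ABCDEF2GHIJK --paths "/images/*"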

S3 - Security & Encryption


● By default, all newly created buckets are private
● Setup access control using:
○ Bucket policies for whole bucket
● Buckets can be configured to create access logs which log all requests made
● Encryption
○ In Transit - when sending info from and to your bucket, secured using SSL/TLS
○ At Rest
■ Server side - using S3 managed keys, each object is encrypted using
unique keys, key is also encrypted with a master key that rotates
regularly. Uses 256 bit advanced encryption standard - SSE-S3. Can also
use AWS key management service - SSE-KMS. AWS manages data key
but you manage the master key. Provides you with audit trail of who used
the keys and who is decrypting keys. Customer provided keys - SSE-C,
you manage encryption key. You must use HTTPS, S3 will reject requests
made over http.
■ Client Side - encrypt data on client side and upload it to S3
● KMS
○ Anytime you need to share sensitive information, use KMS
○ KMS can only help in encrypting up to 4kb of data per call
○ If the data is > 4 KB, use envelope encryption
○ Customer master keys:
■ AWS managed service default keys: free
■ User keys created in KMS: $1 per month
■ User keys imported (must be 256-bit symmetric): $1 per month
■ API calls to KMS: $0.03 per 10,000 calls
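KMS sketch for a small secret (key alias and file names are placeholders; KMS encrypt is limited to 4 KB per call):
# encrypt - the CLI returns the ciphertext base64 encoded, so decode it before saving
aws kms encrypt --key-id alias/my-app-key --plaintext fileb://dbpassword.txt \
  --output text --query CiphertextBlob | base64 -d > dbpassword.enc
# decrypt - KMS works out which key was used from the ciphertext itself
aws kms decrypt --ciphertext-blob fileb://dbpassword.enc \
  --output text --query Plaintext | base64 -d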
AWS Parameter Store
● Secure storage for configuration and secrets
● Centralized store to manage configuration data, helps separate configuration and
secrets from your code
● Optional seamless encryption using KMS
● Serverless, scalable, durable, easy SDK and free
● Version tracking of configurations and secrets
● Configuration management using path & IAM
● Integration with cloudformation
● Important to set up IAM roles to enable to use the parameter store strings
● Parameter store vs secrets manager
○ Parameter store for single central solution, secrets manager for dedicated
secrets store with lifecycle management
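Parameter Store sketch (parameter path and value are placeholders; the calling IAM role needs ssm:GetParameter plus kms:Decrypt for SecureString values):
# store an encrypted secret under a hierarchical path
aws ssm put-parameter --name /myapp/prod/db-password --value 'S3cr3tP@ss' --type SecureString
# read it back, decrypted
aws ssm get-parameter --name /myapp/prod/db-password --with-decryption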

Storage Gateway
● Service that connects on-premise software appliance with cloud-based storage to
provide seamless and secure integration between on-premise IT environment and AWS
storage infrastructure. Enables you to securely store data to the cloud for scalable and
cost-effective storage.
● Typically used for backup and archiving, disaster recovery, moving data to s3 for in cloud
workloads and tiered storage
● Basically a virtual client installed into hypervisor, which replicates your info to AWS
storage.
● Available to download as a VM image that you install on your host in datacenter.
Supports either VMware ESXi or Microsoft Hyper-V. Once you install the gateway and
associate it with your AWS account, you can use management console to create storage
gateway options.
● All data transfer is encrypted via SSL. all data stored in s3 via storage gateway is
encrypted via SSE-S3.
● HIPAA eligible and PCI compliant
● File Gateway (NFS) - where you store flat files IE web files, pdf, videos etc.
○ Accessed through network file system mount point. Ownership, permissions and
timestamps are durably stored in S3, in user-metadata of the object. Once stored
in S3, they can be managed as native S3 objects.
○ Virtual client connects to AWS either through direct connect (dedicated line to
AWS), through internet, or through aws VPC.
○ Applications read and write files and directories over NFS or SMB, interfacing to
the gateway as a file server. The gateway translates file operations into object
requests onto your S3 bucket.
○ Most recently used data is cached on the gateway for low latency access, and
data transfer between your data center and AWS is fully managed and optimized.
○ Enables existing file based applications to use S3 without modification
○ Use cases:
■ Migrating on premise file data to S3, while maintaining fast local access to
recently accessed data
■ Backup on-premise file data as objects in S3
■ Hybrid cloud workflows using data generated by on premise applications
for processing by aws services
● Volume gateway (iSCSI) - block based storage, basically like a virtual hard disk (e.g. an OS disk); the data is not stored as objects you can access directly in S3. Data written to volumes can be asynchronously backed up as point-in-time snapshots of your volumes, stored in the cloud as Amazon EBS snapshots. Snapshots are incremental backups that capture only changed blocks. Supports up to 32 volumes; for cached volumes each volume can be up to 32 TB, for a maximum of 1 PB of data per gateway (32 volumes of 32 TB each). For stored volumes, the max is 16 TB per volume.
○ Stored volumes - store entire copy of data set locally, while asynchronously
backing up data to AWS. provides on premise applications with low latency
access to entire data sets, while providing durable off-site backups. Create store
volumes and mount them as iSCSI devices. Data written to your stored volume is
stored on your on premise storage hardware, then backed up to amazon s3 in
the form of amazon EBS snapshots. 1 gb - 16TB size.
○ Cached volumes - store only recently accessed data on premise, rest is backed
up to amazon, kind of like a reverse of stored volumes. Cached volumes
minimize the need to scale your on premise storage infrastructure, while
providing your applications with low latency access to frequently accessed data.
You can create storage volumes up to 32 TB and attach them as iSCSI devices
from on premise application servers. Gateway stores data that you write to these
volumes into S3 and retains recently read data on site. 1GB - 32TB for size
cached volumes.
● Tape Gateway (VTL) backup archiving solution, create archived tapes to send to S3.
durable, cost effective solution to archive data. Leverage your existing tape-based
backup application infrastructure to store data on virtual tape cartridges that you create.
Each tape gateway is preconfigured with media changer and tape devices.
● Direct Connect
○ Dedicated line from your site to AWS. reduces network cost, increase bandwidth
throughput and more consistent network experience
○ Available in 10Gbps, 1Gbps
○ However if you have an immediate need to connect to AWS, VPN is a better
option cause its quick to set up

Snowball
Before snowball:
● AWS import/export
○ Accelerates moving large amounts of data in and out of AWS cloud using
portable storage devices for transports. Transfers your data directly on and off
storage devices using AWS internal network.
○ The problem with this is that, there are many different types of hard disks or
storage devices, causes compatibility issues
● Snowball
○ Can import and export from s3.
○ Petabyte scale data transport solution that uses secure applications to transfer
large amounts of data in and out of AWS to bypass the internet. Solves
challenges of transferring large scale data for high network costs, long transfer
times and security concerns. Uses multiple layers of security designed to protect
data including tamper resistant enclosures, 256 bit encryption. Once processed
and verified, aws basically wipes the data on the snowball.
● Snowball edge
○ Same as snowball but contains 100 TB of data with on board storage and
compute capabilities. Kind of like a small aws data center, temporary storage tier
for large local datasets, or support local workloads in remote or offline locations.
Connects existing applications and infrastructure using standard storage
interfaces, streamlining data transfer process and minimizing setup and
integration. Able to perform lambda functions. Like a data center in a box to
compute datasets.
● Snowmobile
○ Massive container on a truck for huge amounts of data. Exabyte scale data
transfer service.
● Command lines:
○ ./snowball start -i <snowball network ip address> -m <path to downloaded manifest file> -u <unlock code>

S3 transfer acceleration
● Uses cloudfront edge network to accelerate uploads to s3. Instead of uploading directly
to s3 bucket, use a distinct URL to upload directly to edge location which will transfer file
to s3. Generates a new endpoint link for you to use to upload.
● In general, the further you are away, the faster transfer acceleration would be.
● Costs extra
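Sketch of turning acceleration on and uploading through the distinct accelerate endpoint (bucket and file are placeholders):
aws s3api put-bucket-accelerate-configuration --bucket my-bucket --accelerate-configuration Status=Enabled
# make the CLI use the bucketname.s3-accelerate.amazonaws.com endpoint for transfers
aws configure set default.s3.use_accelerate_endpoint true
aws s3 cp big-file.zip s3://my-bucket/big-file.zip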

EC2 - 101
Web service that provides resizable compute capacity in the cloud. Reduces the time required
to obtain and boot new server instances to minutes, allowing you to quickly scale capacity up
or down as your computing requirements change.
Allows developers to leverage amazons benefits of massive scale with no upfront investment or
performance compromises. Elastic nature allows developers to instantly scale to meet spikes in
traffic or demand. EC2 responds instantly to demands of the developers, so you can control how
your resources are used at any given point. Unlike traditional hosting where it is a fixed number
of resources for a fixed amount of time.
Terminology:
● Instance: virtual computing environments
● AMI: amazon machine images, preconfigured templates for your instances that package
the bits you need for your server
● Instance types: various configurations of CPU, memory, storage and network
capacities
● Instance store volumes: storage volumes for temporary data thats deleted when you
stop or terminate your instance. Basically local storage volumes, not persistent.
○ Block level storage, located on disks physically attached to host computer.
○ Ideal for temporary storage that changes frequently, like buffers, cache etc.
○ You can only specify instance store volumes for an instance when you launch it,
you cannot detach it and attach it to another instance.
● Amazon EBS volumes: persistent storage volumes for your resources
Options:
● On demand - pay fix rate by the hour or second with no commitment.
○ Perfect for users that want low cost and flexibility without upfront payment or
commitment
○ Applications with short term or spiky unpredictable workloads
○ Applications developed or tested on EC2 for the first time
○ Soft cap of 20 instances
● Reserved - provides you with capacity reservation and offers significant discount on
hourly charge for instance. Contract with 1 year, 3 year term
○ Applications with steady state or predictable usage
○ Applications that require reserved capacity
○ Users can make up front payments to reduce their total computing costs even
further
○ Standard reserved instances can be up to 75% off on-demand
■ Reservation time can be 1 or 3 years, and you reserve a specific instance
type
■ Recommended for steady state usage applications, like a database
○ Convertible reserved instances offer up to 54% off on-demand and feature the capability to change the attributes of the RI, as long as the exchange results in the creation of RIs of equal or greater value.
■ Useful when committing to using EC2 instances for a 3 year term, but you are uncertain about your instance needs in the future, or want to benefit from changes in price
○ Scheduled RI are available to launch within the time window you reserve. Option
allows you to match your capacity reservation to predictable recurring schedule
that only requires a fraction of a day, week or month.
○ You are not able to move a reserved instance from one region to another
● Spot - bid for whatever price you want for instance capacity, providing even greater
savings if your applications have flexible start and end times. Allows you to request
spare Amazon EC2 computing capacity for up to 90% off on the on-demand price
○ Applications that have flexible start and end times
○ Only feasible at very low compute prices
○ Great for users with urgent need for large amounts of additional computing
capacity
○ If your spot instance is terminated by EC2 (because the spot price rose above your bid), you will not be charged for the partial hour of usage. However, if you terminate the instance yourself, you will be charged for the complete hour
○ Best suited for fault-tolerant flexible workloads, like big data analysis, batch jobs
or workloads that are resilient to failures
○ Spot instances are reclaimed with a 2 minute notification warning when spot
prices goes above your bid
● You are billed by the second, with a minimum of 60 seconds, you can also pay for other
factors like storage, data transfer, fixed ip address, load balancing
● Dedicated hosts - physical ec2 server dedicated for your use. Help reduce costs by
allowing you to use existing server-bound software licenses.
○ Useful for regulatory requirements that may not support multi tenant virtualization
○ Great for licensing
○ Can be purchased on demand hourly
○ Can be purchased as a reservation for up to 70% off on the on-demand price
● EC2 Instance Types:
○ F1 - Field programmable gate array - financial analytics, real time video processing, big data etc.
○ I3 - High speed storage - noSQL DBs, Data warehousing
○ G3 - Graphics intensive - video encoding
○ H1 - high disk throughput - mapreduce-based workloads
○ T2 - lowest cost, general purpose - web servers / small DBs
○ D2 - dense storage - file servers, data warehousing, hadoop
○ R4 - memory optimized - memory intensive apps / dbs
○ M5 - general purpose - application servers
○ C5 - compute optimized - CPU intensive apps / dbs, high performance web
servers, high performance computing, scientific modelling, distributed analytics
and machine learning
○ P3 - graphics/general purpose GPU - machine learning, bit coin mining
○ X1 - memory optimized - SAP HANA/Apache spark
● EBS: virtual disk, allows you to create storage volume and attach to EC2 instances. You
can create filesystem on top of these volumes, run a database, or use them in other
ways you would use a block device. Placed in a specific availability zones, they are
automatically replicated to protect you from failure of a single component
○ General purpose ssd (GP2)
■ General purpose, balances both price and performance
■ Ratio of 3 IOPS per GB, up to 10,000 IOPS, with the ability to burst up to 3,000 IOPS for extended periods of time
■ Development and test environments, low-latency interactive apps
■ Volume size: 1 GiB - 16 TiB
○ Provisioned IOPS SSD (IO1)
■ Designed for I/O intensive applications such as large relational or noSQL
database
■ Use if you need more than 16,000 IOPS
■ Provision up to 20,000 IOPS per volume
■ Not able to burst, IOPS are provisioned
■ Maximum ratio is 50 IOPS to 1 gb
■ Volume size: 4 GiB - 16 TiB
○ Throughput optimized HDD(ST1)
■ Used for workloads requiring fast, consistent throughput at a low price
■ Max 500 IOPS
■ Able to burst
■ Ideal for throughput intensive workloads with large datasets and large I/o
sizes, streaming workloads requiring consistent, fast throughput at low
price, big data, log processing
■ Data warehouses, mapreduce, kafka, log processing
■ Volume size: 500 GiB - 16 TiB
■ Cannot be a boot volume
○ Cold HDD(Sc1)
■ Lowest cost storage for infrequently accessed workloads
■ Cannot be a boot volume
■ Able to burst
○ Magnetic (standard)
■ Lowest cost per gig of all EBS that is bootable
■ Ideal for workloads where data is accessed infrequently
● Data life cycle manager - service that lets you schedule and manage the creation and
deletion of EBS snapshots
● Subnet - one subnet equals one availability zone
● Common Commands:
○ Download your private key and make sure to set permissions so only your user can read it
○ chmod 400 privatekey.pem : restrict permissions on the key file
○ Windows: right click on the file, remove all other users under Security
○ Install the Apache web server: yum install httpd -y
○ Log in to the EC2 instance: ssh ec2-user@<public ip address> -i privatekey.pem
○ yum update -y : checks for and downloads any updates for the instance
○ chkconfig httpd on : turns on the httpd service every time you start your instance
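Launching an instance from the CLI, as a sketch (all ids and names are placeholders):
aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type t2.micro --count 1 \
  --key-name my-keypair --security-group-ids sg-0123456789abcdef0 --subnet-id subnet-0123456789abcdef0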
● System status check
○ Monitor aws system which runs your instance. Detect underlying problems with
your instance that require AWS involvement to repair, or you can repair on your
own.
○ For instances backed by EBS, you can stop and start the instance yourself,
which migrates it to a new host computer.
○ Verifies that the instance is reachable
○ Examples of why it may fail:
■ Loss of network connectivity
■ Loss of system power
■ Software issues on physical host
■ Hardware issues on physical host
● Instance status checks
○ Monitor the software and network configuration of your individual instance. AWS checks the health of the instance by sending an ARP request to the ENI. This check detects problems that require you to repair.
○ Checks to see if OS is accepting traffic
○ Examples why it may fail
■ Failed system status check
■ Incorrect networking or startup configuration
■ Exhausted memory
■ Corrupted file system
■ Incompatible kernel
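Both checks can be read from the CLI (instance id is a placeholder):
aws ec2 describe-instance-status --instance-ids i-0123456789abcdef0 \
  --query 'InstanceStatuses[].{System:SystemStatus.Status,Instance:InstanceStatus.Status}'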
● You are not able to encrypt the root device that amazon provides
○ You can use third party tools to encrypt the root volume, or create a copy and
encrypt that
● Termination protection is turned off by default, you must turn it on
● On ebs-backed instance, default action is for root eBS volume to be deleted when
instance is terminated - when you delete virtual machine, the virtual hard disk will be
deleted as well
● Underlying hypervisors for EC2 are Xen and Nitro
● Best practices:
○ Use IAM to control access to your AWS resources
○ Restrict access to only trusted hosts or networks to access ports on your
instance. You can restrict SSH access by restricting incoming traffic on port 22.
○ Review rules to ensure you apply the principle of least privilege.
○ Disable password-based logins for instances launched from your AMI.
● Stop, starting, terminating
○ When instance is stopped, instance performs a shutdown and goes into stopped
state. You are not charged while it is stopped.
○ When instance is stopped, you can attach or detach EBS volumes and create
new AMI. you can change the kernel, ram disk and instance types.
○ When terminated, the root volume is deleted by default but EBS volumes are
preserved, based on the deleteOnTermination settings. You cannot start the
instance again after that.
● EC2 Ram
○ Applications will use ram to store data and objects in memory, mainly for speed
up caching purposes
○ Small amount of ram is used by OS and running applications
○ If the RAM is not enough, usually the ram will extend to the disk (swapping)
resulting in very slow performance
○ Use the free -m command to view memory usage in megabytes
○ Use the top command to view running processes and the resources they use
■ Use shift N to view sorting options
● EC2 CPU
○ Piece of hardware that carries out instructions of program, such as logical, IO,
etc.
○ Anytime server needs to perform computation, or instruction, CPU will be used
○ In Linux system, each core will account for 100% so if you have 4 cores and all
are used, CPU will run at 400%
○ Adding more cores will not help if your application is single threaded
■ Use top and sort by CPU. if you have 2 cores and it shows 100%, you
know its single threaded because if it was multi, it would use up the other
core
○ If CPU is at 100%, review your application / program before you upgrade CPU or
add cores.
● EC2 IO
○ Read and write to the disk
○ Use iostat to find current amount of IO being used
○ If IO is not large enough it can cause timeouts, slowdowns and crashes
○ Mainly crucial for database intensive applications
● EC2 Network
○ Network in EC2 uses ethernet connection
○ Use nload or iftop to find network information
● EC2 GPU
○ Mainly used for video / audio processing, perform computations, and to perform
machine learning
○ Measured by the number of cores, usually over 1k
○ Only receive benefits from GPU if you pick a GPU instance
● General instances
○ Good balance between RAM CPU and Network
○ General purpose instance
● Burstable Instance
○ The T generation
○ Has OK CPU performance, but when the machine needs to process something unexpected (e.g. a huge spike in traffic), it can burst and provide additional resources
○ If machine bursts, it uses burst credits. If credits are gone, CPU performance
gets worse. When machine stops bursting, credits will accumulate over time
○ Amazing at handling spikes or unexpected traffic. T is the only generation able to
provide this.
○ If you keep running out of burst credits, might be a good choice to pick another
instance that provides resources to handle the traffic.
○ T2 Unlimited
■ Possible to have unlimited burst credit balance
■ Pay extra money if you go over credit balance, but you don't lose
performance
● Instance identity documents
○ JSON file that describes the instance. Accompanied by a signature and a PKCS7 signature which can be used to verify the accuracy, origin and authenticity of the information provided.
○ Generated when instance is launched, and exposed through the instance
metadata.
○ curl http://169.254.169.254/latest/dynamic/instance-identity/document
● Supported OS
○ AWS Linux, Ubuntu, Windows Server, Red Hat, SUSE Linux, Fedora, Debian,
CentOS, Gentoo Linux, Oracle Linux, FreeBSD
● Billing
○ Pay for what you use. Pricing is hourly rate depending on instance type
○ Instances that are stopped will not get billed for hourly usage or data transfer, but
the charge is still there for storage for the ebs volumes
● Instances are grouped into the following families
○ General purpose: suitable for most general purpose applications and come with
fixed performance or burstable
○ Compute optimized - more CPU than memory, suited for compute intensive
applications and high performance computing workloads
○ Memory optimized - large memory sizes for memory intensive applications,
including database and memory caching
○ Accelerating compute instances - takes advantage of parallel processing
capabilities for high performance computing and machine/deep learning
○ GPU graphics - high performance 3d graphics for applications using openGL
and directX
○ Storage optimized - very high, low latency I/O capacity using SSD based local
instance storage for I/o intensive apps.
● Each instance comes with private IP address and internet routable public IP address.
Private address is associated with network interface, and is released when instance is
terminated. The public address is associated exclusively to the instance until it is
stopped, terminated or replaced with elastic address.
● Nitro Hypervisor
○ Provides CPU and memory isolation for instances. VPC networking and EBS
storage resources are implemented by dedicated hardware components, Nitro
Cards, that are part of all current generation instances.
○ Built on core Linux Kernel based VM, but does not include general purpose OS
components
○ Provides consistent performance and increased compute and memory resources
for EC2 virtualized instances
○ Instances running on Nitro Hypervisor can support maximum of 27 additional PCI
devices for EBS and VPC ENI.
○ Instances boot from EBS volumes using NVMe interface. Instances using Xen,
will boot from an emulated IDE hard drive, then switch to XEN paravirtualized
block device drivers.
● Enhanced Networking
○ Supported EC2 instances will get higher packet per second performance, lower
inter-instance latencies and lower network jitter
○ Suitable for apps that need to benefit from high packet per second performance
and low latency networking.
○ No fee. To take advantage of enhanced networking, you need to launch
appropriate AMI on supported instance type in a VPC.
● Elastic Fabric Adapter
○ EFA brings the scalability, flexibility and elasticity of the cloud to tightly coupled HPC applications. HPC apps have access to lower and more consistent latency and higher throughput, allowing them to scale better.
○ HPC apps that distribute computational workloads across a cluster of instances for parallel processing benefit greatly from EFA.

Security Group
● A virtual firewall controlling traffic to your instance. One instance can have multiple
security groups
● Any rule that you apply to a security group applies instantly, and security groups are stateful:
○ As soon as you add an inbound rule, the corresponding return traffic is allowed out automatically
○ If you create an inbound rule allowing traffic in, that traffic is automatically allowed
back out again
● You cannot block specific ip addresses by using security groups, instead you should use
network access control lists
● You can only specify allow rules, but not deny rules
● All inbound traffic is blocked by default
● All outbound is allowed by default
● Common tips:
○ To allow SSH, enable port 22
○ To enable HTTP traffic, basically to allow people to view your web page, enable
port 80
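○ A minimal AWS CLI sketch of opening these two ports (the security group ID and CIDR below are hypothetical placeholders):
■ # allow SSH (port 22) from an admin network only
■ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr 203.0.113.0/24
■ # allow HTTP (port 80) from anywhere so people can view your web page
■ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 80 --cidr 0.0.0.0/0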
● Security groups are locked down to a region / VPC combination
● Good practice to maintain one separate security group for SSH access
● If your application is not accessible, most likely a security group issue
● Private IP ranges
○ In AWS, they range from 172.16.0.0 - 172.31.255.255 (172.16.0.0/12)
● CIDR
○ Uses base IP and subnet mask
● Possible to have security groups to reference other security groups instead of IP
references. You can reference your own security group
○ For example, EC2 to EC2 direct communication
○ Or only allow load balancer to talk to EC2 instance
○ Have flexible rules
● Elastic IPs
○ Basically static IPs for your instances; you own one as long as you don't release it
○ Can only attach it to one instance at a time
○ By default you can only have 5 Elastic IPs per account (soft limit)
○ Not recommended to rely on Elastic IPs; consider alternatives like using Route 53 to
register a DNS name to an IP
○ They are free as long as the instance they are attached to is running
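○ A minimal CLI sketch of allocating an Elastic IP and attaching it (the instance and allocation IDs are placeholders):
■ aws ec2 allocate-address --domain vpc
■ # the call returns an AllocationId; associate it with an instance
■ aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-0123456789abcdef0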
EBS volumes
● You cannot have an EC2 instance in one availability zone and the ebs volume in
another, it must be in the same availability zone
● You can modify volumes on the fly, except for magnetic storage
● To create an EC2 instance at another AZ, take a snapshot of the root volume, then you
can create a new volume and select another AZ
● To create an EC2 instance at another region, take a snapshot of the root volume then
select copy and you can choose another region, then create an image and this will
generate another ec2 instance
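○ A minimal CLI sketch of this snapshot-and-copy workflow (all IDs are placeholders):
■ # snapshot the root volume
■ aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "root volume backup"
■ # copy the snapshot to another region (run against the destination region)
■ aws ec2 copy-snapshot --source-region us-east-1 --source-snapshot-id snap-0123456789abcdef0 --region eu-west-2
■ # create a volume from the copied snapshot in whichever AZ you want
■ aws ec2 create-volume --snapshot-id snap-0abcdef1234567890 --availability-zone eu-west-2a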
● Snapshots
○ A snapshot is basically like a backup, and snapshots exist on S3. They are point-in-time
copies of the volume
○ Snapshots are incremental - only blocks that have changed since your last
snapshot are moved to S3
○ You can create AMI (amazon machine images) from EBS backed instances and
snapshots
○ Snapshots of encrypted volumes are encrypted automatically
○ Volumes restored from encrypted snapshots are encrypted automatically
○ You can share snapshots but only if unencrypted
○ EBS snapshots are versioned and you can read an older snapshot to do a
point-in-time recovery
○ Snapshots can be taken in real time while volume is in use, however it only
captures information stored in the volume so it will exclude any locally cached
data. In order to make sure the snapshot is completely consistent, detach the
volume first. If root volume, shut down the machine before taking the snapshot.
○ Snapshots are incremental, so duration of snapshot should not be different if size
is 16TB or 1TB, but factors more based on amount of data changed since last
snapshot
● EBS is recommended for data that must be quickly accessible and requires long-term persistence. Well
suited as primary storage for file systems, databases, or applications requiring fine,
granular updates.
● EBS are a network drive / virtual drive, not a physical drive
○ Uses the network to communicate to the instance, so there might be some
latency
○ Is detachable
● Size of a general purpose SSD volume: 1 GiB - 16 TiB
● RAID configuration
○ RAID 0 - increase performance
■ Basically stacks the throughput and IOPS of the EBS volumes into one,
combining them. But if one of the EBS volumes fails, the whole array fails.
■ It's not very fault tolerant, but the performance is crazy
○ RAID 1 - increase fault tolerance
■ Writes are written to all EBS volumes IE mirroring a volume
■ If one fails, the data still persist since it's on the other drive
■ Our EC2 instance needs 2x the network throughput, because we are
sending data to 2 ebs volumes at the same time
● Encryption
○ Encrypt data at rest via KMS or amazon managed keys. Encryption occurs on
servers that host ec2 instances, encrypting data as it moves between ec2
instances and ebs storage.
○ Combination of encryption and iam access policies improves defense-in-depth
strategy.
○ EBS handles key management for you. Each volume gets unique 256 bit AES
key; volumes created from encrypted snapshots share the same key.
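○ A minimal CLI sketch of working with encrypted volumes (key alias and IDs are placeholders):
■ # create a new encrypted volume
■ aws ec2 create-volume --availability-zone us-east-1a --size 100 --volume-type gp2 --encrypted --kms-key-id alias/my-ebs-key
■ # an unencrypted snapshot can be copied into an encrypted one
■ aws ec2 copy-snapshot --source-region us-east-1 --source-snapshot-id snap-0123456789abcdef0 --encrypted --kms-key-id alias/my-ebs-key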
AMI Types (EBS Vs Instance Store)
● You can select AMI based on:
○ Region
○ OS
○ Architecture
○ Launch permissions
○ Storage for the root device
■ Instance store (EPHEMERAL storage)
■ Ebs backed volumes
● With instance store, you cannot stop the instance, so if it is on a failed host, you basically
just lose the data
● To launch an instance store-backed instance, go to the community AMIs / marketplace and
find an instance store-backed AMI
● EBS: root device for an instance is launched from AMI as an EBS volume created from
an EBS snapshot
● Instance store: root device for an instance is launched from AMI as instance store
volume created from a template stored in S3.
● You can reboot both and not lose your data
● An EBS root volume can persist independently of the EC2 instance's termination if
you disable the “delete-on-termination” behavior
● When copying an AMI, you need to manually copy the user-defined tags, launch
permission and S3 bucket permissions
● Benefits of copying
○ Consistent global deployment: copying ami from one region to another
enables you to launch consistent instances in different regions
○ Scalability: easily design and build global applications that meets needs of your
users
○ Performance: increase performance by distributing your application, as well as
locating critical components of your applications in closer proximity of your users.
Take advantage of region specific features
○ High availability: design and deploy applications across regions to increase
availability
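● A minimal CLI sketch of copying an AMI to another region (IDs and names are placeholders):
○ aws ec2 copy-image --source-region us-east-1 --source-image-id ami-0123456789abcdef0 --region eu-west-2 --name web-server-ami-copy
○ # as noted above, tags, launch permissions and S3 bucket permissions must be re-applied to the copy manually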
Elastic Load Balancers
● Virtual machine that balances the load of web applications / traffic, spreading it across
web servers to avoid bottlenecks and overwhelming any single server
● All balancers will provide a DNS name but not a public IP. AWS will manage it for you
● Load balancer accepts incoming traffic from clients and routes requests to registered
targets (like EC2 instances) in one or more availability zones. Also monitors health of its
registered targets and ensures that it routes traffic only to healthy targets. Increases fault
tolerance of your applications.
● When unhealthy target is detected, stops routing to that target, and resumes only if
target is healthy again.
● Load balancers act as a traffic proxy that distributes network or application traffic across
a number of servers.
● Application load balancer
○ Best suited for HTTP and HTTPS traffic. Operate at layer 7. You can create
advanced request routing, sending specified requests to specific web servers.
○ Able to send traffic determined by host, route to target groups
■ Target groups are basically groups of instances or AWS services
○ More advanced than classic load balancer because you can distribute traffic
depending on the actual content, so if someone wants to access /image, the load
balancer will direct traffic to a specific set of servers
○ Headers may be modified
○ X-forwarded-for header contains client ip address
○ Content based routing, allowing requests to be routed to different applications
behind single load balancer
○ Allows for multiple services to be hosted behind a single load balancer
○ Fully integrated with EC2 container service (ECS), managing target groups,
paths and targets
○ Able to support WAF (Web Application Firewall)
■ Monitor web requests and protect applications from malicious requests
■ Block or allow requests based on conditions like ip addresses
■ Protection to block common attacks like cross site scripting and sql
injection
○ Price: hourly charge, number of load balancer capacity units (LCU) consumed
■ 1 LCU = 25 new connections per second
■ 1 LCU = 3000 active connections per minute
■ 1 LCU = 2.22 Mbps
■ You are charged only on the dimension with the highest usage
○ ALB does not support plain TCP, so you would go with a Network or Classic LB in that case
● Network load balancer
○ Suited for TCP and SSL traffic where extreme performance is required.
Operating at layer 4, capable of handling millions of requests per second while
maintaining low latencies.
○ Incoming client connection bound to server connection
○ No header modifications
○ Forward TCP traffic to instances
○ Support for static / elastic IP
● Classic load balancer
○ Legacy elastic load balancers. Can load balance HTTP / HTTPS applications and use layer 7
specific features, like X-Forwarded-For and sticky sessions. You can also use strict layer 4
load balancing for applications that rely purely on TCP.
○ Distributes traffic equally, not intelligent enough to determine the content of the
traffic
○ Price: hourly charge + pay per GB that goes through
● Load balancer errors
○ 504 error - gateway timeout error. Balancer is there but trouble communicating
with instance. Can be an issue with the web server. Application is not responding
within time.
● X-Forwarded-For header
○ The X-Forwarded-For header passes the client's public IP address to the EC2
instance, which otherwise only sees the load balancer's address.
○ If you need the IPv4 address of an end user, you would use X-Forwarded-For
● Rules
○ Each listener can have one or more rules for routing requests to target groups
○ Rules consist of conditions and actions
○ When a request meets condition of rule, the action is taken
○ Conditions can be specified in path pattern format
○ Can also do host based routing
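○ A minimal CLI sketch of a path-based rule (the listener and target group ARNs are placeholders):
■ aws elbv2 create-rule --listener-arn arn:aws:elasticloadbalancing:eu-west-2:123456789012:listener/app/my-alb/abc123/def456 --priority 10 --conditions Field=path-pattern,Values='/image/*' --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:eu-west-2:123456789012:targetgroup/image-servers/abc123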
● Cross-Zone load balancing
○ Each load balancer node distributes traffic across the registered targets in all
enabled AZs. If you disable cross-zone load balancing, each load balancer node
distributes traffic across the registered targets in its own AZ only.
■ So basically, if it's enabled, traffic is spread evenly across all registered targets
in all AZs. If it's disabled, each node only sends traffic to the targets in its own AZ,
which can distribute traffic between targets unevenly when AZs contain different
numbers of targets.
■ Cross Zone disabled: (diagram)
○ Always enabled in Application load balancers, disabled by default in Network
load balancers.
● Request Routing
○ Before client sends request to your LB, it resolves the LB domain name using
DNS server. DNS entry is controlled by AWS.
○ ELB scales your load balancer and updates the DNS entry as traffic to your
application changes over time.
○ The client determines which IP address to use to send requests to the load
balancer. The load balancer node that receives the request selects a healthy
registered target and sends the request to the target using private IP address
■ Routing algorithm
● For Application load balancers, load balancer node that
receives the request evaluates the listener rules in priority order to
determine which rule to apply, then selects target from target
group, using round robin routing.
● For Network load balancers, the LB node receives connection
and selects target from target group for the default rule using flow
hash algorithm.
● For Classic load balancers, LB node receives request selects
registered instance using round robin for TCP listeners and the
least outstanding requests routing algorithm for HTTP and
HTTPS.
● Internet facing
○ Nodes have a public ip address.
○ Dns name of the internet facing load balancer is publicly resolved to the public
address of the nodes.
● Internal load balancer
○ Nodes have private ip address only. DNS name of internal load balancer is
publicly resolvable to the private IP addresses of the node.
○ Can only route requests from clients with access to the VPC for the LB.
● Sticky sessions
○ For CLB, you can enable the LB to bind a user session to a specific instance.
Ensures that all requests from the user during session are sent to the same
instance.
○ Each http request-response pair between client and app happen on different TCP
connections. However if load balancer is between app and the client, the app
cannot use TCP as a way to remember conversational context.
○ Solution to this is using cookies, but in distributed system, client can be talking to
a completely different instance / node
○ Benefit of using sticky session is that you don't have to modify existing
applications to use memcache or redis
○ Key to managing sticky session is the TTL
○ Support for SSL over HTTPS
● Connection draining
○ For CLB to ensure to stop sending requests to instances that are de-registering
or deemed unhealthy, while keeping the existing connections open.
○ Complete in-flight requests made to instances that are unhealthy.
CloudWatch
● What metrics are available by default?
○ CPU
○ DISK
○ Network
○ Status Check
● If you are looking for a metric outside of the defaults, you will have to create a custom metric
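● A minimal CLI sketch of publishing a custom metric (namespace and metric name are hypothetical):
○ aws cloudwatch put-metric-data --namespace MyApp --metric-name ActiveSessions --value 42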
● Logs
○ Allow you to monitor instances at the application layer, helps you aggregate,
monitor and store logs.
○ Monitor http response codes
○ Receive alarms for errors in kernel logs
○ Count exceptions in application logs
○ Can collect logs from:
■ Elastic beanstalk
■ ECS
■ Lambda
■ Vpc flow logs
■ Api gateway
■ Cloudtrail based on filter
■ Cloudwatch log agents: example on ec2 machines
■ route53 : log dns queries
○ Logs can go to:
■ Batch exporter to s3 for archival
■ Stream to elasticsearch cluster for analysis
○ Architecture:
■ Log groups: arbitrary name
■ Log stream: instances within application, log files, containers
■ Define log expiration policies (never expire, 30 days etc…)
○ When sending logs to cloudwatch, make sure you set up IAM policies correctly!
○ Can encrypt logs using KMS at the group level
● Cloudwatch logs agent provides automated way to send log data to cloudwatch logs
from EC2 instances
● Standard monitoring = 5 minutes = free service
● Detailed monitoring = 1 minute = cost money
● With Cloudwatch, you can create dashboards with widgets that can set up custom
metrics to see what is going on with your AWS instances.
● Dashboards
○ Are global
○ Include graphs from different regions
○ You can change the time zone & time range of the dashboards, and set up
automatic refresh
○ Pricing:
■ 3 dashboards with up to 50 metrics are free
■ $3 per dashboard per month afterwards
● Alarms - allows you to set alarms to notify you of thresholds
○ Can go to auto scaling, ec2 actions, sns notifications
○ States:
■ Ok
■ Insufficient_data
■ Alarm
■ On a high resolution metric, alarms can trigger as often as every 10 seconds
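○ A minimal CLI sketch of an alarm on average CPU across an auto scaling group (the ASG name and SNS topic ARN are placeholders):
■ aws cloudwatch put-metric-alarm --alarm-name high-cpu --namespace AWS/EC2 --metric-name CPUUtilization --statistic Average --period 300 --evaluation-periods 2 --threshold 75 --comparison-operator GreaterThanThreshold --dimensions Name=AutoScalingGroupName,Value=my-asg --alarm-actions arn:aws:sns:us-east-1:123456789012:ops-alerts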
● Events - events help you respond to state changes, IE create lambda scripts that
respond to your event
● Trail - provides governance, compliance and audit for your aws account
○ Enabled by default
○ Get history of events / api calls made within your account
EC2 User Data
● Possible to bootstrap our instances using EC2 user data script
● Bootstrapping means launching commands when a machine starts, and the script is only
run once, right when the instance first starts
● EC2 user data is used to automate boot tasks such as:
○ Installing updates
○ Installing software
○ Downloading common files
● For example, when creating an instance, you can put the following in the user data:
○ #!/bin/bash
○ yum update -y
○ yum install -y httpd
○ service httpd start
○ chkconfig httpd on
AWS Command Line
● To access buckets in the CLI (to copy etc.) in regions outside of your own region, you
may have to use the --region regionname flag, so for example
○ --region eu-west-2
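○ For example, a hypothetical copy from a bucket in another region:
■ aws s3 cp s3://my-london-bucket/report.csv . --region eu-west-2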
● If you want your instance to have access to S3 or have permissions, assign it a role
instead of configuring aws user, as that opens up security concerns
● curl http://169.254.169.254/latest/meta-data/
○ This brings up metadata info about the instance, remember this url!
● aws configure
○ Used to configure your CLI to your programmatic user access
○ Creates a hidden folder .aws which contains your config of access keys and
default regions
AWS Auto Scaling
● Enables you to configure automatic scaling for AWS resources that are part of your
application.
● Scale Out (add EC2 instances) in response to increased load
● Scale In (remove EC2 instances) when load decreases
● Automation: launching one or more EC2 instances in an automated way
● Scalability: adjusting the number of EC2 instances to the current workload
● Availability: replacing failed EC2 instances automatically
● A launch configuration is used to specify the AMI, instance type and other settings for the
instances the group launches
● You can configure and manage scaling via a scaling plan, which dynamically increases
or decreases the amount of instances you need. Ensures that you add the required
computing power to handle the load and then remove it when no longer required.
● Useful for applications that experience daily or weekly variations in traffic
○ Cyclical traffic during specific business hours and low usage during night
○ On and off traffic patterns like batch processing, testing
○ Variable traffic patterns like marketing campaigns
● Scaling plans
○ Set of instructions to scale your resources. You can create one scaling plan per
application source (like an AWS CloudFormation stack, a set of tags, or specific EC2
Auto Scaling groups).
○ Dynamic scaling
■ Uses target tracking scaling policies to adjust capacity in response to live
changes to how your resources are utilized. Similar to your thermostat
that maintains temperature.
■ Example: configure the scaling plan to keep your EC2 service running at 75
percent CPU. When CPU goes above 75%, the policy triggers and adds capacity
to help out with the load (see the CLI sketch after this list)
○ Predictive scaling
■ Uses machine learning to analyze each resource historical workload and
forecasts the future load for the next two days.
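● A minimal CLI sketch of the 75% CPU target tracking policy mentioned above (the ASG name is a placeholder):
○ aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg --policy-name cpu-75-target --policy-type TargetTrackingScaling --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":75.0}'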
● The better you understand your application, the more effective your scaling plan
○ How long it takes to launch and configure server
○ Whether you have existing scaling policies from other consoles
○ What metrics have most relevance to your performance
○ Target utilization that makes sense to scale resources in your application
○ Whether the metric history is sufficiently long to use for predictive scaling. In
general, have 14 days of historical data for more accurate forecast.
● Auto scaling manages the scaling of each target group independently
● Auto scaling doesn’t always have to have an ELB
● Alarms
○ Remember that metrics are computed for OVERALL instances, not per instance!
● Auto scaling is free, you only pay for your EC2 instances
● Default termination policy
○ Auto scaler will select the AZ with the most instances, and terminate the instance
launched from the oldest launch configuration. If the instances were launched
from same launch configuration, it will select the instance closest to the next
billing hour
● Customize termination policy
○ If one AZ has more instances than the other AZs, your policy will be applied to
instances from the AZ with the most instances.
● Lifecycle Hooks
○ Enables you to perform custom actions by pausing instances as the ASG
launches or terminates them. When an instance is paused, it remains in a wait state
for some time (1 hour by default) or until you continue it.
○ You can perform custom actions:
■ Define cloudwatch events target to invoke lambda function when lifecycle
action occurs.
■ Define notification target. ASG sends message to the notification target.
Message contains information about the instance that is launching or
terminating
■ Create script that runs on instance as the instance starts.
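○ A minimal CLI sketch of adding a termination lifecycle hook (names are placeholders):
■ aws autoscaling put-lifecycle-hook --lifecycle-hook-name drain-connections --auto-scaling-group-name my-asg --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING --heartbeat-timeout 300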
● Scaling based on SQS
○ Custom metric sent to cloudwatch that measures number of messages in SQS
queue per ec2 instance in the group
○ Target tracking policy that configures your ASG to scale based on custom metric
and set target value. Cloudwatch alarms invoke the scaling policy
EC2 Placement Groups
● Clustered placement group
○ Grouping of instances within single availability zone. Recommended for
applications that need low network latency, high network throughput, or both.
○ Only certain instances can be launched into a clustered placement group
○ If the underlying hardware fails, all instances fail at the same time since they are on the
same rack: high risk
○ Use cases:
■ Big data job that needs to complete fast
■ Applications that needs low latency and high network throughput
● Spread placement group
○ Group of instances that are placed on distinct underlying hardware, spread
across different AZ
○ Recommend for applications that have small number of critical instances that
should be kept separate from each other
○ Opposite of the cluster group: ensures instances are on distinct hardware across multiple
availability zones
○ Minimum amount of risk of failure
○ Limited to 7 instances per AZ per placement group
● The name you specify for a placement group must be unique within your AWS account
● Only certain types of instances can be launched in a placement group (compute optimized,
GPU, memory optimized, storage optimized); t2.micro does not work
● Not applicable for T2 instances
● Homogeneous instances are recommended within a placement group
● You cannot merge placement groups
● You cannot move an existing instance into a placement group. You can create an AMI
from an existing instance, then launch a new instance from AMI into a placement group.
● Recommended that you launch number of instances you need in a single launch
request, and use the same instance type for all instances in the group
● If you receive capacity error when launching an instance in a group that's already
running, stop and start all the instances in the group and try the launch again
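● A minimal CLI sketch of creating a cluster placement group and launching into it (group name, AMI and instance type are placeholders):
○ aws ec2 create-placement-group --group-name hpc-cluster --strategy cluster
○ aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type c5.large --count 4 --placement GroupName=hpc-cluster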
EFS
● File storage service for elastic compute cloud instances (EC2)
● Easy to use and provides simple interface that lets you create and configure file systems
quickly and easily
● Storage capacity is elastic, growing and shrinking automatically as you need
● Features
○ You only pay for storage you use
○ Support NFSv4 protocol
○ Scale up to petabytes
○ Can support thousands of concurrent NFS connections
○ Data is stored across multiple AZ within a region
○ File-based storage (not block storage like EBS)
○ Read after write consistency
● Perfect use case is for file servers
● EFS allows multiple instances to connect to the same file system, but EBS only attaches to one instance at a time
● EFS basically gives you one shared file system that instances in other
availability zones can also mount
● To install:
○ yum install -y amazon-efs-utils
○ mount -t efs fs-0f6234a6:/ /var/www/html
■ Note that the file system ID is the ID your EFS was assigned, followed by the path
where you want to mount the file system
■ Don't forget an inbound rule allowing the NFS protocol on port 2049 on the
mount target's security group (e.g. from the EC2 instance's security group)
Lambda - 101
● Compute service where you can upload your code to create a lambda function.
● Takes care of provisioning and managing servers that you use to run the code. Don't
have to worry about operating systems, patching, scaling etc.
● You can use lambda as:
○ Event driven compute service where lambda runs your code in response to
events. These events could be changes to data in an S3 bucket or DynamoDB
table
○ Compute service where you run your code in response to http requests
● Important to understand that each lambda invocation runs in its own execution environment. So if
many users execute a trigger, the lambda function is invoked once per event, i.e. per user. Basically, 1
event = 1 function invocation
● Lambda scales OUT, not up, automatically
● Lambda is serverless
● AWS supports runtime environments:
○ Python
○ Node
○ Java
○ Go
○ .NET / C#
○ Ruby
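● A minimal CLI sketch of creating and invoking a function (function name, role ARN and zip file are placeholders):
○ aws lambda create-function --function-name my-func --runtime python3.9 --handler app.handler --role arn:aws:iam::123456789012:role/lambda-exec-role --zip-file fileb://function.zip
○ aws lambda invoke --function-name my-func response.json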
● Available triggers:
○ API gateway
○ AWS IoT
○ Alexa skills kit
○ Alexa smart home
○ cloudFront
○ Cloudwatch events
○ Cloudwatch logs
○ Codecommit
○ Cognito sync trigger
○ dynamoDB
○ Kinesis
○ S3, SNS
● How it's priced
○ Based on the number of requests
■ First 1 million requests are free; $0.20 per 1 million requests after that
○ Duration
■ Calculated from the time your code begins executing until it returns or
otherwise terminates, rounded up to the nearest 100ms
■ Price depends on the amount of memory you allocate to your function.
Charged $0.00001667 for every GB-second used
■ Important to know that your function cannot execute for longer than
15 minutes
● Execution limits:
○ Memory allocation: 128MB - 3008MB (64 mb increments)
○ Disk capacity in function container: 512 mb
○ Concurrency limits: 1000
● Deployment limits:
○ Size of function deployed: 50 mb zipped
○ Size of uncompressed function deployed: 250 mb, code + dependencies
○ Size of environment variables: 4 kb
● API Gateway:
○ Fully managed service making it easy for developers to publish, maintain,
monitor and secure APIS at any scale. Create an API that acts as a front door for
applications to access data, business logic, functionality from your backend
services.
○ No minimum fee or startup costs. You pay only for the API calls you receive and
the amount of data transferred out.
○ Supported HTTP verbs
■ GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS
○ Resource policy
■ JSON policy document that control whether a specified principal (a user
or role) can invoke the API.
■ Enable users from different accounts to access your API, or allow API to
only be invoked from specific source IP
○ Remember to enable CORS to allow gateway to interact with S3 and other
services
○ Api caching
■ Cache your endpoints response. Reduce the number of calls made to
your endpoint, improves latency of the requests to your API.
■ Caches response from your endpoint for a specific TTL. API Gateway
then responds to the request by looking up response from the cache
instead of making a request to your endpoint
○ Low cost and efficient, scales effortlessly, throttle requests to prevent attacks,
connect to cloudwatch
○ Same origin policy
■ Web browser permits scripts contained in the first web page to access
data in second web page, but only if they have the same origin/domain
■ CORS relaxes the same origin policy:
● Mechanism that allows restricted resources on a web page to be
requested from another domain outside the domain from which the
request was served
Route 53
● The name was inspired by Route 66; they chose 53 because DNS runs on port 53
● DNS
○ Translates human-readable domain names into IP addresses, like a phone book
● Route 53 provides highly scalable and available DNS, domain registration and health
checking services. Effectively connects user requests to infrastructure running in AWS,
like EC2 instances and load balancers. Each hosted zone is served by its own set of virtual
DNS servers. The DNS server names for each hosted zone are assigned by the system when
that hosted zone is created.
● Hosted zone is analogous to traditional DNS zone file; represents collection of records
that can be managed together, belonging to single parent domain name. All resource
record sets within hosted zone must have the hosted zones domain name as a suffix IE
www.example.com , www.test.example.com
● Pricing is based on usage of the service for hosted zones, queries, health checks and
domain names. Billed when created, and then on the first day of each month.
● SLA provides for service credit if customers monthly uptime percentage is below service
commitment in any billing cycle.
● Route 53 uses Anycast networking and routing technology, that helps end user DNS
queries get answered from optimal location given network conditions.
● You can create multiple hosted zones that provides you with testing environments.
● Pay $0.50 per month per hosted zone
● Common records:
○ A: url to ipv4
○ AAAA: url to IPV6
○ CNAME: url to url
○ Alias: URL to AWS resource
● CNAME vs Alias
○ Cname works only for nonroot domain IE example.mydomain.com
○ Alias works for root and nonroot domains IE mydomain.com; it's free of charge
and has native health check support
● Alias records
○ Used to map resource record sets in your hosted zone to ELB, cloud front, or s3
buckets that are configured as websites
○ Similar to a CNAME in that you can map one dns name www.example.com to
another target dnsd name elb1234.elb.amazonaws.com
○ The difference is that CNAMEs can’t be used for naked domain names (the zone apex);
the apex has to be either an A record or an Alias.
○ Always choose an Alias over a CNAME if you have the choice
○ Saves you time because Route 53 automatically recognizes changes in the
record sets that the alias record refers to.
○ Aliases are not visible to resolvers; resolvers only see the A record and the resulting
IP address of the target record
○ Alias records provide route 53-specific extension to DNS functionality
○ A CNAME record assigns an Alias name to a Canonical name
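● A minimal CLI sketch of upserting an alias A record that points the zone apex at a load balancer (zone IDs and DNS names are placeholders):
○ aws route53 change-resource-record-sets --hosted-zone-id Z1EXAMPLE --change-batch '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"mydomain.com","Type":"A","AliasTarget":{"HostedZoneId":"Z2ELBEXAMPLE","DNSName":"my-alb-123456.eu-west-2.elb.amazonaws.com","EvaluateTargetHealth":true}}}]}'
○ # note: the AliasTarget HostedZoneId is the load balancer's own hosted zone ID, not your zone's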
● ELBs do not have a pre-defined IPv4 address, you resolve them using a DNS name
● By default, you will always have 2 record sets created: the NS records and your SOA
● Start of Authority
○ The SOA record is information stored in a DNS zone about that zone. A DNS zone is the
part of a domain for which an individual DNS server is responsible. Each zone
contains a single SOA record.
● Routing Policies
○ Simple routing
■ Default routing policy. Most commonly used when you have single
resource that performs a given function for your domain, like one web
server that serves content to a website
■ User -> dns request to route 53 -> route 53 hits to ec2 instance
○ Weighted routing
■ Able to add weights to resource record sets in order to specify the
frequency with which they are served.
■ Ideal for A/B testing
■ For example, one set has weight of 3, other set has a weight of 1, the
weight of 3 will have a 75% chance of getting served
○ Latency-based routing
■ Able to route traffic to DNS that has the lowest latency for the user
■ Utilizes latency measurements between viewer networks and AWS data
centers
○ Failover routing
■ Used when you want to create an active/passive set up, so while the
active is up, traffic is routed there but if that fails over, it will point over to
the passive site
■ Provides redundancy
■ You can create a DNS failover for an ELB by creating an alias record pointing
to the ELB and setting the evaluate target health parameter to true. Route 53
will create and manage health checks for your ELB.
■ You can use DNS failover to maintain a backup site and fail over to this
site in event primary site is unreachable
■ You can create route 53 resource record that points to address outside of
AWS, you can set up health checks for parts of your application running
outside of AWS
■ By default, health check observations are conducted every 30 seconds, or
you can change it to 10 seconds however that may cause some latency
■ Active - active
● When you want all your resources to be available the majority of
the time.
● Records that have the same name, the same type, and the same
routing policy are active unless Route 53 considers them
unhealthy.
■ Active - passive
● When you want primary resources available at all times, and a
secondary resource on standby in case the primary is
unavailable.
○ Geolocation routing
■ Lets you choose where traffic will be sent based on geographic location of
your users. Sends to the appropriate endpoints.
■ Possible to customize localized content
■ You may want all queries from europe to be routed to a fleet of EC2
instances that are specifically configured for your EU customers.
■ Provides three levels of granularity: continent, country and state, and also
provides global record which is served in case where the location doesn’t
match any specifications.
■ Can combine with other routing types.
○ Multi value routing
■ Allows you to have multiple record sets
■ Up to 8 healthy records are returned for each multi-value query
■ Not a substitute for having an ELB, but a similar concept to ELB
● By default there is a limit of 50 domain names you can have, however the limit can be
increased by contacting AWS
● Traffic flow
○ Easy to use and cost effective traffic management service
○ You can improve performance and availability of your application for end users by
running multiple endpoints around the world. Connect your users to the best
endpoint based on latency, geo and endpoint health
○ Supports all route 53 DNS routing policies including latency, endpoint health,
multivalue answers, weighted round robin and geo
○ Also supports geo proximity based routing with traffic biasing
● Traffic Policy
○ Set of rules that you define to route end users requests to one of your
applications endpoints
○ By itself it does not affect the end users routing, you need to combine it with a
policy record which associates the traffic policy with the correct DNS name within
route 53 hosted zone that you own
○ No charge for traffic policy, but there is a charge for policy records
● Private DNS
○ Lets you have authoritative DNS within your VPCs without exposing DNS records
○ You can manage private ip addresses within VPC using 53 private DNS feature,
where you create a private hosted zone and 53 will only return these records
when queried from within the VPC that you have associated with the private zone
○ Private DNS uses VPC to manage visibility and provide DNS resolutions for
private DNS hosted zones, so you need to configure a vpc and migrate your
resources into it
○ You can associate multiple VPCs with a single private hosted zone
● Route 53 Resolver
○ Provides recursive DNS lookups for names hosted in EC2 as well as public
names on the internet
○ Recursive DNS
■ Route 53 is both authoritative and recursive DNS service. Authoritative
DNS contains final answer to a DNS query, generally an IP address.
Devices don’t communicate directly to authoritative DNS, but rather to the
recursive DNS service, which then finds the correct authoritative answer
for the DNS query.
■ Once an answer is found, recursive DNS server may cache the answer
for period of time
○ DNS endpoints
■ Includes one or more elastic network interfaces (ENIs) that attach to your VPC.
The ENI is assigned an IP address from the subnet space of your VPC. This
IP address then serves as a forwarding target for on-premise DNS
servers to forward queries.
AWS COGNITO
● Give users an identity so they can interact with our application
● Cognito user pools:
○ Integrate with api gateway
○ Sign in functionality for app users
○ Serverless database of users for your mobile apps
○ Simple login, and enable federated identities
○ Returns a json web token
● Identity pools (federated identity):
○ Provide direct access to aws resources from the client side
○ Log in to federated identity provider, and get temp credentials
○ Credentials come with pre defined iam policies
Databases 101
● Relational databases
○ Uses rows and columns like a spreadsheet
○ MySQL installations default to port number 3306
○ RDS provisioned IOPS storage with Microsoft SQL server DB engine, max size
RDS volume by default is 16TB
○ To look for RDS errors, you have to look for the error node in the response from
RDS API
● Non relational database
○ Collection = table
○ Document = row
○ Key value pairs = fields
● Provisioned IOPS SSD storage
○ Used for production application that requires fast and consistent I/O performance
○ Storage type that delivers predictable performance and consistently low latency.
○ Optimized for online transaction processing workloads that have consistent
performance requirements
○ When creating a DB, you specify IOPS rate and size of the volume.
○ Maria DB - 32TiB
○ SQL Server - 16 TiB
○ MySQL - 32TiB
○ Oracle - 32TiB
○ PostgreSQL - 32TiB
● Data warehousing
○ Used to pull in very large and complex data sets, usually used by management
to do queries on data
○ Used for business intelligence, tools like Cognos, Jaspersoft, SQL server
reporting services, oracle hyperion
● OLTP VS OLAP
○ OLTP online transaction processing
■ Mainly RDS
■ SQL, MySQL, PostgreSQL, Oracle, MariaDB, AWS Aurora
○ OLAP online analytical processing
■ Used mainly for data processing, to pull in large number of records
■ Queries like: sum of radios sold in the pacific, sales price of each radio
■ Since these queries take a lot of resources, usually you would copy your
production DB and store it in a data warehouse to do these intensive
queries
■ RedShift is an example of AWS service for OLAP
■ Focuses more on aggregating over columns (e.g. summing a column) than on individual rows
● Elasticache
○ Web service that makes it easy to deploy, operate and scale in memory cache in
the cloud. Improves the performance of web applications by allowing you to
retrieve information from fast, managed in memory caches instead of relying on
slower disk based database
○ Improve load and response times to user actions, reduce cost associated with
scaling web applications
○ Automates common administrative tasks required to operate a distributed
in-memory environment. Using Elasticache you can add a caching or in-memory
layer to your architecture.
○ In-memory caching can significantly improve latency and throughputs for many
read-heavy application workloads, like social networking, gaming, streaming, or
compute intensive workloads.
○ Supports two open source in memory caching engines:
■ Redis
■ Memcached
○ Can be used as primary in-memory key-value data store, providing fast data
performance, high availability and scalability
○ Fully managed service
○ Node: a fixed size chunk of secure RAM, running an instance of memcached or redis,
with its own DNS name and port.
○ Only pay for what you use, no minimum fees.
○ Reserved nodes: similar to reserved instances, provides a discount when you
commit to a one- or three-year term. The only difference to on demand is the pricing. Up
to 20 reserved nodes (soft cap).
● Db instance
○ Basic building block of RDS. An isolated database environment in the cloud. Can
contain multiple user-created databases.
○ Each DB instance runs a DB engine, and each engine has its own supported
features and versioning.
○ Computation and memory is determined by the DB instance class.
○ Comes in three types: Magnetic, General purpose (SSD) and Provisioned IOPS
■ Magnetic - for backward compatibility
■ General purpose SSD - cost effective storage ideal for broad range of
workloads. Deliver single-digit millisecond latencies and ability to burst to
3,000 IOPS. baseline performance is 3 IOPS per each GiB, so larger
volumes have better performance.
■ Provisioned IOPS - for I/O intensive workloads, particularly database
workloads that require low I/O latency and consistent I/O throughput.
● You are charged for the provisioned resources whether or not
you use them in a given month.
○ Factors that affect storage performance
■ System activities
● Multi-az standby creation
● Read replica creation
● Changing storage types
■ Database workload
● Throughput limit of underlying instance type is reached
● Queue depth is consistently less than 1 because app is not driving
enough I/O operations
● Experience query contention in database
● RDS only gives out a DNS endpoint, never a public IP
● For security permissions, don't forget to add an inbound rule on your DB instance's security
group to allow the EC2 instance in
● RDS slow queries
○ For prod databases, enable enhanced monitoring, which provides over 50 CPU,
memory, file system and disk i/o metrics.
○ High levels of CPU utilization can reduce query performance
○ For MySQL or MariaDB, enable the “slow_query_log” DB parameter and query
the mysql.slow_log table to retrieve slow running queries.
● Recommended to keep database instance upgraded to the most current minor version,
as it contains latest security and functionality fixes.
○ Use the modify DB instance command on the console or the ModifyDBInstance
API and set the DB engine version parameter to the most recent version. The upgrade will be
applied during the next maintenance window.
○ If new engine minor version contains significant bug fixes, AWS will auto upgrade
your instances if they have the auto minor version upgrade setting enabled.
● Deprecated versions
○ Minor versions are supported for at least 1 year, major versions for 3 years
○ When fully deprecated, minor versions are given a 3 month grace period before automatic
upgrade. Major versions are given 6 months before auto upgrade.
● For RDS upgrades (IE changing the instance class), the instance will be temporarily unavailable,
usually for a couple of minutes. The upgrade will occur during the maintenance
window of your DB instance, unless you specify that you want it done immediately.
● RDS Billing
○ Pay for what you use
○ Based on
■ Db instance hours - based on class and partial db hours are consumed as
full hours
■ Storage per month - if you upgrade your storage, you are pro rated
■ i/o request per month - for RDS magnetic storage and amazon aurora
only
■ Provisioned iops per month - for rds provisioned iops ssd only
■ Backup storage - associated with automatic and customer initiated
snapshots. Increasing backup retention period or taking additional
snapshots increase the price. Price is more than regular storage because
backup storage is geo-redundant replication.
■ Data transfer - internet transfer in and out of your db instance
● RDS reserved instance - option to reserve db instance for one or three year term,
significant discount
○ Functionally the same as on-demand
○ Purchase via management console for rds, or rds api, or through the CLI
○ No capacity reservations
○ You can purchase up to 40 reserved db instances, soft limit
○ To cover an existing DB instance with an RI - purchase an RI with the
same class, DB engine, multi-AZ option and license model, within the same region as the
current DB instance, and the new hourly charge goes off of the RI
● RDS uses EBS volumes for database and log storage
● Use RDS provisioned IOPS for I/O intensive, transactional OLTP database workloads
● RDS hardware & scaling
○ Memory and CPU resources are modified by changing your DB instance class; storage is
scaled by changing the allocated storage
○ Changing storage capacity incurs no downtime, however scaling compute
resources incurs downtime
● RDS security groups
○ When you add a rule to an RDS DB security group, you don't need to specify a
port number or protocol, as its automatically applied to the RDS DB security
group.
○ DB security group
■ Controls access to EC2-classic DB instances that are not in VPC
■ Each rule enables a specific source to access a DB instance that is
associated with that DB security group. When you specify an EC2
security group as the source, you allow incoming traffic from all EC2
instances that uses that security group.
■ Applies to inbound traffic only; outbound traffic is not permitted
■ You don't need to specify a destination port number.
○ VPC security group
■ Controls access to DB instances and EC2 instances inside a VPC
■ Rules can govern both inbound and outbound traffic, however outbound
traffic rules only apply if the db instances act as a client.
■ When you create rules for your VPC security group that allows access to
instances in your VPC, you must specify a port for each range of address
that rule allows access for.
○ EC2 Security group
■ Controls access to an EC2 instance
● Two different types of backups
○ Automated backups
■ Recover your DB at any point within retention period. Period can be
between one and 35 days.
■ By default retention time is 7 days
■ Takes full daily snapshot and will store transaction logs throughout the
day.
■ When you do a recovery, AWS will first choose the most recent daily
backup, and apply transaction logs relevant to that day. This allows you to do
point in time recovery.
■ Enabled by default
■ Data is stored in S3 and you get free storage space equal to size of your
database
■ Create automated backups of your DB instance during the backup
window.
■ Supports point in time recovery
■ Your DB must be in an active state
■ When you delete a DB instance, you can retain automated backups
● In multi-AZ, snapshots are taken from your standby so the
primary is not interrupted
■ Sometimes you may end up having automated db snapshots more than
number of days in retention period, this is normal
■ Backups are deleted when the db instance is deleted
○ Snapshots
■ Snapshots of the database at the point in time
■ Done manually
■ Stored even after you delete the original RDS instance, so they sit as a
standalone file
■ Performing a DB snapshot can cause a brief halt in I/O that can last a few
minutes depending on size of instance - only on Single AZ
○ Backups are stored in S3
● Migration of DB instances from inside to outside VPC is not supported
● RDS Security
○ Encryption in transit is supported for MySQL, MariaDB, SQL Server, PostgreSQL
and Oracle
■ RDS generates SSL certificates for each DB instance, however SSL is
compute intensive and will increase latency of your database connection.
■ SSL is supported for encrypting the connection between app and db
instance, not for authenticating the DB instance itself. So SSL just
encrypts the data in flight
○ Encryption at rest is supported for all database engines, using keys you manage
via KMS.
■ If enabled, data stored at rest is encrypted, so are backups, read replicas
and snapshots.
■ You can add encryption to previous unencrypted db instance by creating
a snapshot, then creating a copy of that snapshot and specifying a KMS
encryption key.
■ You can only encrypt at rest when you first create your db instance.
Otherwise, use the same procedure as an EC2 instance: take a snapshot,
copy the snapshot as encrypted, and create a new DB from that snapshot
○ You can authenticate using IAM if using MySQL or PostgreSQL. With this, you
don't need a password when connecting to a database, instead you use an
authentication token. These tokens have a lifetime of 15 minutes, and you don't
need to store credentials, because its managed by IAM. Benefits are:
■ Traffic is encrypted using SSL
■ Authentication is centrally managed by IAM
■ Easy to use EC2 instance roles to connect to the database
○ To enforce SSL:
■ PostgreSQL: set rds.force_ssl=1 in the RDS console (parameter group)
■ MySQL: within the db: GRANT USAGE ON *.* TO
'mysqluser'@'%' REQUIRE SSL;
○ Transparent Data Encryption
■ Only supported by oracle or sql server db
■ Adds another layer on top of KMS but may affect performance
● Restoring backups
○ Whenever you restore backups, new RDS instance will have a new DNS
endpoint, example restored.eu-west-1.rds so it will create a new instance
○ Changes you make to your primary DB will reflect immediately to your backup
● Backups only contain the difference in data, the first snapshot of your instance contains
all your data, in other words, subsequent snapshots are incremental
● DB parameters - act as containers for engine configuration values. Default parameter
group is used if nothing is specified, and contains engine defaults and RDS system
defaults optimized for the instance you are running.
○ If you update parameters within a db group, the changes apply to all db instances
associated with that group
○ When changing a dynamic parameter, the changes are applied immediately.
When you change a static parameter, the change takes effect after you reboot
your instance.
○ When changing the db parameter group associated with an instance, you must
manually reboot the instance
● Use AWS config to monitor record configuration changes to RDS instances, db subnet
groups, db snapshots, db security groups and event subscriptions.
● Multi-AZ
○ Provides redundancy by copying/writing data onto another AZ as a failover
○ Primary DB instance is synchronously replicated across AZs to provide data
redundancy, eliminate I/O freezes and minimize latency spikes during backups
○ Enabling Multi-AZ may increase write and commit latency compared to single AZ,
due to the synchronous data replication.
○ So if your primary endpoint fails over, the backup endpoint picks up
○ Have exact copy of db in another AZ, and happens automatically
○ For disaster recovery only! Not used for performance improvements
○ Failover times range between 60-120 seconds. Once failover is complete, it can
take additional time for RDS console UI to reflect the changes.
○ Available for RDS
○ Failovers can occur:
■ When AZ outage occurs
■ Loss of network to primary
■ Primary db instance fails
■ Instance class changed
■ Software patches
■ Manual failover
○ Fail overs will not occur in response to database operations such as long running
queries, deadlocks or database corruption errors
○ When converting single AZ instance to Multi-az
■ Snapshot of primary is taken
■ New standby instance is created in different AZ, from that snapshot
■ Synchronous replication is configured between two instances
■ No downtime is incurred
○ When failover occurs, the CNAME of your DB instance is changed to point to the standby,
which is then promoted to the primary.
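○ A minimal CLI sketch of converting an existing single-AZ instance to Multi-AZ (the identifier is a placeholder):
■ aws rds modify-db-instance --db-instance-identifier mydb --multi-az --apply-immediately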
● Read replica
○ Uses MariaDB, MySQL and PostgreSQL DB engines built in replication
functionality to create special type of DB instance called Read Replica.
○ Updates made to the source DB are asynchronously copied to the read replica.
You can reduce the load on your source DB instance by routing read queries to
the read replica.
○ If your source DB can’t take I/O requests, you can direct read traffic to read
replica, however the data on the read replica might be “stale”
○ Can have 5 read replicas per production database for MySQL, MariaDB,
PostgreSQL, and 15 for Aurora
○ Read only copy of production database, primarily used for read heavy database
workloads, used to scale out
○ Available for MySQL, PostgreSQL, Aurora, MariaDB
○ Used for performance improvements
○ Must have automatic backups turned on.
○ You can have a read replica of a read replica, but it adds replication latency; only supported
for MySQL and MariaDB
○ You can have read replicas that have multi-az enabled, this is done to support
disaster recovery and minimize downtime from engine upgrades
○ Can have read replicas in different regions
○ Read replicas can be promoted to be their own database for immediate disaster
recovery, but this breaks the replication as it will be a standalone db
■ Mainly used for implementing failure/disaster recovery
■ Sharding - involves breaking a large database into smaller databases
■ Performing DDL operations (only for MySQL and MariaDB)
■ Promoted read replica retains the backup retention period, the backup
window and the parameter group of former source. Promotion can take
several minutes.
○ No charge is incurred
○ Read replicas can be created in the same AZ, a different AZ or a new region
○ Asynchronous data replication, data will be consistent between the two nodes
eventually. Reasons why a read replica might fall behind:
■ Write i/o volume to the source exceeds the rate at which changes can be
applied to the read replica (common if the compute capacity of read
replica is less than the source)
■ Complex or long running transactions on the source can hold up the
replication
■ Network partitions or latency between source and read replica
○ When to use:
■ Scaling - for read heavy operations. Scaling beyond the compute or I/o
capacity of single DB instance
■ Source DB unavailable - when you need to continue to read the traffic
■ Reporting or data warehouse - run queries against the read replica
instead of your primary
■ Disaster recovery - promote read replica to standalone
■ Lower latency
○ Pricing:
■ Same billing as standard, however you are not charged for data transfer
○ Applications must update the connection string to leverage the read replicas
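○ A minimal CLI sketch of creating read replicas (identifiers are placeholders):
■ # same-region replica
■ aws rds create-db-instance-read-replica --db-instance-identifier mydb-replica --source-db-instance-identifier mydb
■ # cross-region replica: run against the destination region and pass the source instance's ARN
■ aws rds create-db-instance-read-replica --db-instance-identifier mydb-replica-eu --source-db-instance-identifier arn:aws:rds:us-east-1:123456789012:db:mydb --region eu-west-1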
● Scaling
○ For multi-AZ, there is minimal downtime when scaling, however for single AZ
instances, the instance will be unavailable during the scaling operation
○ Ensure correct licensing is in place
○ For multi-az, the secondary node will scale up first and become available, then
the primary node will start scaling up
● RDS Basic operational guidelines:
○ If database workload requires more I/O than provisioned, recovery after failover
or database failure will be slow. Increase the I/O capacity by:
■ Migrating to an instance class with higher I/O capacity
■ Converting from standard storage to general purpose or provisioned IOPS
■ If already on provisioned IOPS, provision additional throughput capacity
○ If the client application is caching the DNS data of your DB instances, set the TTL
value to less than 30 seconds, because the underlying IP address of an instance
can change after a failover; caching the DNS for a long time can lead to
connection failures.
○ Best practice is to allocate enough RAM so working set resides almost
completely in memory
● Performance metrics
○ CPU - percentage of computer processing capacity used
○ Memory - how much ram is available in the instance, how much swap space is
used by the instance
○ Disk space - how much disk space not in use
○ input/output operations
○ Network traffic
○ Database connections
● DynamoDB
○ no SQL database
○ Single digit millisecond latency at any scale
○ Stored on SSD storage
○ Spread across 3 geo distinct data centers / AZ
○ Can only query on primary key, sort key, or indexes, so very important design
consideration
○ Eventual consistent reads ( by default )
■ Consistency across all copies of data is reached within a second.
Repeating a read after a short time should return the updated data (best
read performance)
■ So if you write data in, it might take a second or two for it to be available
to be read
■ Half the cost of strongly consistent reads
○ Strongly consistent reads
■ Read returns a result that reflects all writes that received a successful
response prior to the read
■ Makes the data that was just written instantly available
■ Costs more
○ RCU - read capacity units
■ 1 RCU = 1 strongly consistent read of 4 kb per second
■ 1 RCU = 2 eventually consistent reads of 4 kb per second
○ Pricing
■ Write throughput: $0.0065 per hour for every 10 units
● A write capacity unit can handle 1 write per second
■ Read throughput: $0.0065 per hour for every 50 units
■ Storage costs of $0.25 per GB per month
○ Can be expensive for writes but very cheap for reads and really scalable
○ DynamoDB is really scalable, push button scaling, and scales with no down time
unlike RDS where you need to create a snapshot, which takes a while then
adjust the instance size and the size itself has a limit
○ Supported data types
■ scalar - represents exactly one value, number, string, binary, boolean and
null
■ Document - complex structure with nested attributes, like JSON. list and
map
■ Set - represent multiple scalar values. String set, number set and binary
set.
○ TTL mechanism enables you to manage web sessions of your applications. Set
specific timestamps to delete expired items from tables.
○ Stores structured data indexed by primary key to allow low latency read and write
access to items ranging from 1 byte to 400 kb.
○ Partition keys - choosing the right partition key is important for design and
building scalable and reliable applications on top of DynamoDB
■ Partition key - a simple primary key, composed of one attribute known as
the partition key.
■ Partition key and sort key - referred to as a composite key, a type of key
composed of two attributes: the first attribute is the partition key, the second
is the sort key (see the create-table sketch at the end of this DynamoDB list).
■ DynamoDB evenly distributes provisioned throughput among partitions,
and automatically supports your access patterns using the throughput you
provisioned. If your access pattern exceeds 3000 RCU or 1000 WCU for
a single partition key, your requests will be throttled with a
ProvisionedThroughputExceededException error
■ Read / write above the limit can be caused by:
● Uneven distribution of data to the wrong choice of partition key
● Frequent access of the same key in a partition (the most popular
item)
● A request rate greater than the provisioned throughput
■ Recommendations for partition keys
● High-cardinality attributes - attributes that have distinct values,
like emailid, employee_no, customerid etc.
● Use composite attributes - combine more than one attribute to
form unique key. Example: customerid+productid+countrycode as
the partition key, and order_date as the sort key
● Cache popular items - when there is high read traffic using DAX.
○ Good for using to store metadata about S3
○ Valid header attributes:
■ X-amz-date
■ X-amz-target
■ Host
■ Content-type
○ Sort key - allows for composite keys. Careful design of sort keys lets you retrieve
commonly needed group of related items using range queries. Also composite
keys lets you define one to many relationships in your data.
○ Secondary index
■ Global - index with partition key and sort key that can be different from
those on the base table. Considered global because queries on the index
can span the whole data in the base table across partitions.
■ Local - index that has same partition key as base table but different sort
key. Every partition of local secondary index is scoped to base table
partition that has same partition key value. Can only create local
secondary index at creation time.
○ Range key - alternative name for the sort key.
○ For read and write throughput capacity, you can only reduce the throughput 4
times in a 24 hour period
○ --generate-cli-skeleton : generates a skeleton json for you to fill out to create a
dynamodb table via the cli
○ “Wait” - you can use this command to have the script wait for a specific event to
take place, and then it will execute
■ aws dynamodb wait table-exists --table-name weatherstation_data --profile
nameofprofile
■ Basically waits until the table is created and then it will execute
○ DAX
■ Seamless caching, no application re-write
■ Writes go through dax to dynamodb
■ Microsecond latency for cached reads and queries, solves the hot key
problem ( too many reads)
■ 5 minute ttl for caching by default
■ Up to 10 nodes in the cluster
■ Multi az (3 nodes minimum recommended)
○ Streams
■ Changes in dynamodb are logged into a dynamodb stream
■ Stream can be read by other services, such as lambda
■ Could implement cross region replication using streams
■ Has 24 hour data retention
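A hedged sketch of a Lambda consumer for a DynamoDB stream (the event source mapping between the stream and the function is configured separately); record fields follow the standard stream record layout, and the processing shown is purely illustrative:

    def handler(event, context):
        # Each invocation receives a batch of stream records
        for record in event["Records"]:
            if record["eventName"] in ("INSERT", "MODIFY"):
                new_image = record["dynamodb"].get("NewImage", {})
                # e.g. replicate the change to another region, update a cache, or audit it
                print(record["eventName"], new_image)
            elif record["eventName"] == "REMOVE":
                print("Deleted keys:", record["dynamodb"]["Keys"])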
○ Transactions
■ All or nothing type of operations
■ Coordinated insert, update, delete across multiple tables
○ On demand
■ No capacity planning needed, scales automatically
■ 2.5x more expensive than provisioned capacity
■ Helpful when spikes are unpredictable or application is very low
throughput
○ Global Tables
■ Managed solution for deploying a multi-region, multi-master database
without having to build and maintain your own replication solution.
■ DynamoDB automatically creates identical tables in all regions and
propagates ongoing data changes.
○ Auto Scaling
■ Dynamically adjusts provisioned throughput capacity in response
to actual traffic patterns.
■ Enables a table or global secondary index to increase its provisioned read
and write capacity. When the workload decreases, auto scaling decreases
the throughput so you don't pay for unused capacity.
● Redshift
○ Fast and powerful, fully managed petabyte scale data-warehouse service
○ Can start small for $0.25 an hour with no commitments and can scale to a
petabyte
○ Configuration
■ Single node 160gb
■ Multi-node
● Leader node
○ Manages client connections and receives queries
● Compute nodes
○ Store data and perform queries and computations, up to
128 compute nodes, sends results back to leader
○ Columnar data storage
■ Organize data by column instead of series of rows
■ Ideal for data warehousing and analytics, where queries often involve
aggregates performed over large datasets
■ Since only the columns involved in a query are processed, and columnar data
is stored sequentially on the storage media, far fewer I/Os are required, greatly
improving query performance
■ When loading data into an empty table, Redshift automatically samples your data and
selects the most appropriate compression scheme
○ Advanced compression
■ Columnar data stores can be compressed much more than row based
data, because data is stored sequentially on the disk.
■ Does Not require indexing or materialized views so it uses much less
space
○ Massively parallel processing (MPP):
■ Auto distributes data and query load across all nodes.
■ Makes it easy to add nodes to your data warehouse and enables you to
maintain fast query performance
○ Encryption
■ Encrypted in transit using SSL
■ Encrypted at rest using AES-256 encryption
■ By default takes care of key management
○ Only available in 1 AZ
■ Can restore snapshots to new AZ’s in event of an outage
● Elasticache
○ Deploy, operate and scale in-memory cache in the cloud
○ Improves the performance of web apps by allowing you to retrieve info from fast
in memory caches
○ Improve latency and app performance by storing critical pieces of data in memory
for low-latency access
○ Helps make applications stateless
○ AWS takes care of OS maintenance, patching etc. just like RDS
○ Multi az with failover capability
○ Memcached
■ Widely adopted memory object caching system
■ Does not support Multi-Az
■ Cache does not survive reboots
○ Redis
■ Open source in-memory key value store that supports data structures like
stored sets and lists
■ Supports master/slave replication and Multi-AZ
■ Survives reboot by default
■ Support for read replicas
■ Supports REDIS AUTH
■ Supports SSL in-flight encryption (must be enabled to be used)
○ Application queries elasticache, if not available, get it from RDS and store it into
elasticache
○ Can be used to store user sessions IE user logs into app, app writes session
data into elasticache
○ IAM policies on elasticache are only used for AWS API level security
○ Lazy loading: all read data is cached; data can become stale in the cache
○ Write through: data is added to or updated in the cache whenever it is written to the DB,
so there is no stale data in the cache (both patterns are sketched below)
○ Session store: store temporary session data in the cache using a TTL
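A rough sketch of the lazy loading and write-through patterns above using the redis-py client; the endpoint, key names, TTL and the fetch_user_from_rds / save_user_to_rds helpers are all hypothetical:

    import json
    import redis

    cache = redis.Redis(host="my-cluster.xxxxxx.cache.amazonaws.com", port=6379)

    def get_user(user_id):
        # Lazy loading: check the cache first, fall back to RDS, then populate the cache
        cached = cache.get(f"user:{user_id}")
        if cached:
            return json.loads(cached)
        user = fetch_user_from_rds(user_id)                      # hypothetical DB call
        cache.setex(f"user:{user_id}", 3600, json.dumps(user))   # 1 hour TTL
        return user

    def update_user(user):
        # Write through: write to the database and refresh the cache in the same path
        save_user_to_rds(user)                                   # hypothetical DB call
        cache.setex(f"user:{user['id']}", 3600, json.dumps(user))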
● Aurora
○ Only runs on AWS infrastructure
○ MySQL compatible, RDS engine that combines speed and availability of high end
commercial database with simplicity and cost effectiveness of open source
○ Provides up to 5 times better performance than MySQL, at one tenth the price point of a commercial database
○ Failover is also instantaneous
○ Costs around 20% more than RDS
○ Scaling
■ Starts with 10gb, scales in 10gb increments and auto scales up to 64TB
■ Compute scales up to 32vCPU and 244 gb of memory
■ Quickly scales
■ Maintains 2 copies of your data in each availability zone, with a
minimum of 3 AZs = 6 copies of data
■ Designed to transparently handle the loss of up to 2 copies of data
without affecting database write availability and up to 3 copies without
affecting read
■ Self healing
○ 2 types of replicas
■ Aurora replicas, can have up to 15
■ mySQL read replicas (currently 5)
■ Failover happens in less than 30 seconds
○ 6 copies of your data across 3 AZ
■ 4 copies needed for writes
■ 3 copies needed for reads
■ Self healing with peer to peer replication
○ 1 master, and storage is a shared volume striped across hundreds of volumes, with replication,
self healing and auto-expansion
○ You are able to auto scale your read replicas, using a reader endpoint, and load
balancing happens at the connection level
■ The client connects to the reader endpoint which redirects traffic to
the read replicas automatically
○ Provides you with a reader endpoint, and a writer endpoint when you create
the database
○ Failover is automatically handled by Amazon Aurora
○ If you have an Amazon Aurora Replica in the same or a different Availability
Zone, when failing over, Amazon Aurora flips the canonical name record
(CNAME) for your DB Instance to point at the healthy replica, which in turn is
promoted to become the new primary. Start-to-finish, failover typically
completes within 30 seconds.
○ If you do not have an Amazon Aurora Replica (i.e. single instance), Aurora
will first attempt to create a new DB Instance in the same Availability Zone as
the original instance. If unable to do so, Aurora will attempt to create a new
DB Instance in a different Availability Zone. From start to finish, failover
typically completes in under 15 minutes.
VPC
● Basically a virtual data center in the cloud
● Every region in the world has a default VPC
● You are always going to lose 5 host ip addresses because of reservations
○ 10.0.0.0 : network address
○ 10.0.0.1 : for vpc router
○ 10.0.0.2: address of the DNS server, always the base network range + 2
○ 10.0.0.3: reserved for future use
○ 10.0.0.255: network broadcast address. Is not supported in VPC
● Definition:
○ Lets you provision logically isolated section of AWS cloud where you can launch
services in the virtual network you define. You have control over networking
environment, including selection of your own IP address range, creation of
subnets and configuration of route tables and network gateways
○ Create public facing subnet for your web servers that have access to the internet,
and place backend systems in private facing subnet
○ Leverage multiple layers of security, including security groups and network
access control lists
○ Create hardware VPN connection between corporate datacenter and your VPC
● What can you do?
○ Launch instances into a subnet of your choosing
○ Assign custom ip ranges for each subnet
○ Configure route tables between subnets
○ Create internet gateway and attach it to VPC, can only have 1 internet gateway
per VPC
○ Much better security control over your AWS resources
○ Instance security groups
○ Subnet network access control lists
■ These are stateless, requires you to open inbound and outbound
● By default, you can have 5 VPCs per region
● Default VPC vs Custom VPC
○ Default:
■ User friendly, immediately deploy instances
■ All subnets by default have a route out to the internet
■ No private subnets
■ Each EC2 instance has both public and private ip address
○ Custom:
■ Custom VPC comes with automatic ACL, route table and security group
● VPC peering
○ Allows you to connect one VPC to another via direct network route using private
ip addresses
○ Instances behave as if they are on the same private network
○ You can peer VPCs with other AWS accounts
○ Peering is in a star configuration, i.e. 1 central VPC peers with 4 others. There is no
transitive peering; you have to peer each VPC with every other VPC it needs to reach
○ Enables you to route traffic between pairing using private ip address. Instances in
either VPC can communicate with each other as long as in the same network.
You can create VPC peering connection between your own VPC, or another VPC
in another account within a single region.
○ AWS uses the existing infrastructure of a VPC to create a VPC peering connection;
it is neither a gateway nor a VPN connection, and it does not rely on a separate piece of
hardware. There is no single point of failure.
○ Transitive peering is not supported, and the two VPC’s must not share the same
CIDR block IE must not have matching or overlapping CIDR blocks
● How to give public internet access to private EC2
○ Important because you may need to install mySQL or Apache etc.
○ NAT
■ Enables instances in private subnets to connect to the internet or other
AWS services but prevents the internet from initiating connections with
the instances. NAT forwards traffic from instances in the private subnet to
the internet, and sends the response back.
○ NAT Gateway
■ Requires elastic ip provisioning
■ Does not sit behind the security group, acts as its own instance
■ Preferred by enterprise
■ Supports up to 5Gbps of bandwidth and automatically scales up to 45
Gbps. You can distribute the workload by splitting resources into multiple
subnets if you need more bandwidth.
■ Supports TCP, UDP and ICMP.
■ No need to patch
■ Automatically assigned a public ip address
■ Remember to update your route tables
■ More secure than NAT instance
■ You cannot route NAT gateway traffic through VPC peering connection,
VPN connection or AWS direct connect. NAT gateway cannot be used by
resources on the other side of these connections.
■ A failed NAT gateway is automatically deleted
○ NAT Instances
■ Create using a NAT instance, make sure to disable source/destination
check
■ Must be in the public subnet
■ Must be a route out of the private subnet to the NAT instance
■ Amount of traffic it can support depends on the instance size
■ Behind a security group
■ Requires you to create a script to manage failover
■ Requires you to maintain the instance
● Network access control list
○ Default NACL allows all traffic in and out, but if you create one, it will not allow
any traffic by default
○ You can only associate a subnet to one ACL
○ For IPv4, start rules at 100 and add new rules in increments of 100; for IPv6, start at
101 (also in increments of 100)
○ Rules are evaluated in numerical order, starting with the lowest numbered rule
first
○ Remember to allow ephemeral ports as an outbound rule only (see the sketch below)
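A boto3 sketch of the rule-numbering and ephemeral-port points above; the NACL ID is a placeholder:

    import boto3

    ec2 = boto3.client("ec2")
    acl_id = "acl-0123456789abcdef0"   # placeholder

    # Rule 100 inbound: allow HTTPS from anywhere
    ec2.create_network_acl_entry(
        NetworkAclId=acl_id, RuleNumber=100, Egress=False,
        Protocol="6", RuleAction="allow", CidrBlock="0.0.0.0/0",
        PortRange={"From": 443, "To": 443},
    )

    # Rule 100 outbound: allow ephemeral ports for the return traffic
    ec2.create_network_acl_entry(
        NetworkAclId=acl_id, RuleNumber=100, Egress=True,
        Protocol="6", RuleAction="allow", CidrBlock="0.0.0.0/0",
        PortRange={"From": 1024, "To": 65535},
    )

Because NACLs are stateless, the outbound ephemeral-port rule is what lets responses to the inbound HTTPS traffic leave the subnet.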
● VPC flow logs
○ Feature that enables you to capture information about the ip traffic going to and
from network interfaces in your VPC. log data is stored using cloudwatch logs.
○ Created at 3 levels
■ VPC - capture all ENI traffic
■ Subnet
■ Network interface
○ You cannot enable flow logs for VPCS that are peered with your VPC unless the
peer VPC is in your account
○ You cannot tag a flow log
○ After you created a flow log you cannot change its configuration
○ Not all IP traffic is monitored
■ Traffic generated by instances when they contact the amazon DNS
server. If you use your own DNS server, then all traffic will be logged
■ Traffic to and from 169.254.169.254 for instance meta data
■ DHCP traffic
■ Traffic to the reserved ip address for the default router
● NAT vs Bastion
○ Bastion host allows you to SSH or communicate with the bastion server, which
then creates a private communication link to your private instances. Used just for
administration only for SSH or RDP
○ Whenever you have the private key downloaded, run:
○ ssh -i mymaster.pem ec2-user@10.0.2.56 -o "proxycommand ssh -W %h:%p -i
mymaster.pem ec2-user@mybastion.elb.amazonaws.com"
■ To set up a tunnel into the private ec2 instance
○ A NAT is used to provide internet traffic to EC2 instances in private subnets
● VPN connection consists of the virtual private gateway, and the customer gateway
● Egress only internet gateway:
○ Allows IPv6 based traffic within a VPC access to the internet, whilst denying IPv6
based internet resources initiating a connection into a VPC.
● By default, instances in new subnets in a custom VPC can communicate with each other
across AZ
● You are not permitted to conduct your own vulnerability scans on your own VPC without
alerting AWS first
● VPN site to site - by default instances you launch in VPC can't communicate with your
own remote network. You can enable access to remote network from VPC by attaching a
virtual private gateway, creating a custom route table, updating your SG rules, and creating
an AWS site-to-site VPN connection.
○ Supports Ipsec connections
○ IPv6 traffic is currently not supported
○ Components
■ Virtual private gateway - the VPN concentrator on the AWS side of the site-to-site
connection. You can specify the private Autonomous System Number (ASN) for the
Amazon-side gateway. If you don't specify one, it defaults to the default ASN and
you cannot change it. Provides two VPN endpoints for automatic failover
■ Customer gateway - physical device or software app on client side. You
must create a customer gateway resource in AWS, that provides info to
AWS about your customer gateway device.
● Internet routable ip address (static) of gateway external interface
● Type of routing -- static or dynamic
● (dynamic routing only) border gateway protocol ASN of customer
gateway
● When creating a VPN connection, you must specify the type of routing you plan to use,
and update the routing table for your subnet
○ If VPN supports BGP, specify dynamic routing, otherwise put static
○ Each VPN consists of two tunnels for redundancy. Important to configure both
tunnels
● PRICING
○ Charged for VPN connection hourly
○ Charged for each nat gateway hourly
○ Data processing charges for each gig processed via nat gateway
○ Charges for unused or inactive elastic ip
SQS
● Oldest AWS service
● Web service that gives you access to message queue that can be used to store
messages while waiting for computer to process them
● Distributed queue system that enables applications to quickly and reliably queue
messages that one component in the application generates to be consumed by another
component.
● Scales from 1 message per second up to 10,000
● Pull based system
● Messages can be kept in queue from 1 minute to 14 days
● Default retention period is 4 days
● You can decouple components of an application so they run independently, easing
message management between components
● Any component of distributed application can store messages in queue. Messages can
contain up to 256 kb of text in any format. Any component can later retrieve the
messages programmatically using SQS API
● The queue acts as a buffer between the component producing and saving data, and the
component receiving data for processing. This means the queue resolves issues that
arise if producer is producing work faster than the consumer can process it, or producer
or consumer are only intermittently connected to the network
● Standard queue
○ Default
○ Lets you have nearly unlimited number of transactions per second
○ Guarantees that message is delivered at least once
○ However occasionally more than one copy of the message might be delivered out
of order
○ Provide best effort ordering which ensures that messages are generally delivered
in the same order as they are sent
● FIFO queue
○ The order in which messages are sent and received are preserved and
messages are delivered once and remains available until consumer processes
and deletes it, no duplicates are introduced into the queue
○ Supports message groups that allow multiple ordered message groups within
single queue
○ Limited to 300 transactions per second, but have all capabilities of standard
queues
● Visibility timeout
○ Amount of time that the message is invisible in the SQS queue after reader picks
up the message
○ Provided the job is processed before the visibility timeout expires, the message will be
deleted
○ If job is not processed within that time, message will become visible again and
another reader can process it. May result in the same message being delivered
twice
○ Default timeout is 30 seconds, you can increase it to a maximum of 12 hours
○ ChangeMessageVisibility API to change the visibility while processing a
message
● Dead letter Queue
○ If consumer fails to process message within timeout, message goes back into the
queue
○ We can set a threshold of how many times a message can go back into the queue,
called a redrive policy (see the sketch below)
○ After threshold is exceeded, message goes into dead letter queue
○ Dead letter queue first must be created
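A boto3 sketch of wiring a dead letter queue to a source queue with a redrive policy; the queue names and maxReceiveCount are placeholders:

    import json
    import boto3

    sqs = boto3.client("sqs")

    dlq_url = sqs.create_queue(QueueName="orders-dlq")["QueueUrl"]
    dlq_arn = sqs.get_queue_attributes(
        QueueUrl=dlq_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    source_url = sqs.create_queue(QueueName="orders")["QueueUrl"]
    sqs.set_queue_attributes(
        QueueUrl=source_url,
        Attributes={
            # After 5 failed receives, the message is moved to the DLQ
            "RedrivePolicy": json.dumps(
                {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "5"}
            )
        },
    )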
● SQS Delay Queue VS Visibility timeout
○ Both features make messages unavailable for consumers for a specific period of
time
○ Delay queue: message is hidden when it is first added to the queue
○ Visibility: message is hidden after it is consumed from the queue
● Inflight Messages - three basic states
○ Sent to a queue by a producer
○ Received from queue by a consumer
○ Deleted from the queue
○ For standard, there is a limit of approx 120,000 inflight messages. For FIFO its
20k.
● Long polling
○ Way to retrieve messages
○ Regular short polling returns immediately, long polling does not return a response
until a message arrives in the queue. Lets the consumer “wait” for messages to
arrive if there are none in the queue
○ Decreases the number of API calls made to SQS while increasing efficiency and
latency of your application
○ Can save you money by reducing number of empty responses when no
messages are available
○ Wait time can be between 1 and 20 seconds (20 seconds is preferable; see the sketch below)
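A small boto3 sketch of long polling, and of deleting messages once processed so they are not redelivered after the visibility timeout; the queue URL and process() function are hypothetical:

    import boto3

    sqs = boto3.client("sqs")
    queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/orders"   # placeholder

    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,      # long polling: wait up to 20s instead of returning empty
        VisibilityTimeout=60,    # hide each message for 60s while it is processed
    )
    for msg in resp.get("Messages", []):
        process(msg["Body"])     # hypothetical processing step
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])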
● SSE server side encryption lets you transmit sensitive data by protecting contents of
messages in queues using KMS
● Basic architecture:
○ Component 1 sends message A to queue, message is distributed across SQS
servers redundantly
○ Consumer 2 is ready to process message, it consumes message from queue and
message A is returned. While message is being processed, it remains in queue
and isn't returned to subsequent receive requests for the duration of the visibility timeout.
○ Consumer 2 deletes message A from the queue to prevent message from being
received and processed again when the timeout expires
SWF
● Simple workflow service, web service that makes it easy to coordinate work across
distributed application components. Enables applications for a range of use cases,
including media processing, web application back-ends, business process workflows and
analytics pipelines to be designed as a coordination of tasks
● Tasks represent invocations of various processing steps in an application which can be
performed by executable code, web service calls, human actions and scripts
● Workers
○ Programs that interact with SWF to get tasks, process them, and return the results
○ The application that can initiate a workflow is known as the workflow starter
● Decider
○ Program controls the coordination of tasks, I.E. their ordering, concurrency, and
scheduling according to the application logic
○ If something has finished in a workflow, decider decides what to do next
● Activity workers
○ Carries out the activity tasks
● AWS manages your workers and deciders, brokers the interaction between the two.
Allows decider to get views into progress of tasks, and initiate new tasks. It also assigns
tasks to workers when they are ready and monitors progress.
● Tasks are only assigned ONCE and never duplicated, unlike SQS
● Workers and deciders run independently and can scale quickly.
● Domains
○ Your workflow and activity types, workflow execution itself are scoped to a
domain.
○ Domains isolate a set of types, executions and task lists from others within the
same account
○ Parameters are specified in JSON format
● Maximum workflow can be 1 year, measured in seconds
● SWF VS SQS
○ SWF is task oriented API, SQS is message oriented
○ SWF ensures a task is assigned only once and never duplicated. With SQS you need to
handle duplicate messages and ensure a message is processed only once
○ SWF keeps track of all tasks and events in application. SQS you need to
implement your own application-level tracking
SNS
● Simple notification service
● Easy to set up, operate and send notifications from cloud
● Highly scalable, flexible and cost effective capability to publish messages from
applications and deliver them
● Follows the “publish-subscribe” paradigm, notifications delivered to clients using push
mechanism that eliminates need to check or poll for new information or updates
● No maintenance or management overhead, and pay as you go pricing
● Push mechanism
● SNS can also deliver notifications by SMS text or email, to SQS queues or any HTTP
endpoint
○ Http, HTTPS, email, email-JSON, SQS, Application, Lambda
● To prevent messages from being lost, all messages are stored redundantly across
multiple AZ
● Topics
○ Access point for allowing recipients to dynamically subscribe for identical copies
of the same notification
○ For example, one topic can support deliveries to multiple endpoint types, like
group together IOS or android subscribers. So you publish once to a topic, SNS
delivers appropriately formatted copies of your message, will send your message
to each subscriber
● Pricing
○ Pay $0.50 per 1 million SNS requests
○ $0.06 per 100,000 notification deliveries over HTTP
○ $0.75 per 100 notification deliveries over SMS
○ $2.00 per 100,000 notification deliveries over email
○ Pricing is basically broken down by delivery types
● Data format is in JSON
● SNS + SQS : Fan Out
○ Push once in SNS, receive in many SQS
○ Example: buying service -> SNS topic -> multiple SQS which sends triggers to
microservices
○ Fully decoupled
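A hedged sketch of the fan-out pattern with boto3; the topic and queue names are placeholders, and the SQS queue access policies that allow the topic to deliver messages are omitted for brevity:

    import boto3

    sns = boto3.client("sns")
    sqs = boto3.client("sqs")

    topic_arn = sns.create_topic(Name="order-events")["TopicArn"]

    for name in ("billing-queue", "shipping-queue", "analytics-queue"):
        url = sqs.create_queue(QueueName=name)["QueueUrl"]
        arn = sqs.get_queue_attributes(
            QueueUrl=url, AttributeNames=["QueueArn"]
        )["Attributes"]["QueueArn"]
        sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=arn)

    # Publish once; every subscribed queue receives its own copy of the message
    sns.publish(TopicArn=topic_arn, Message='{"orderId": 42}')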
Elastic transcoder
● Media transcoder
● Convert media files to different formats that will play on smart phones, tablets, etc.
● Provide transcoding presets for popular formats
● Price is based on minutes that you transcode and resolution you transcode
Kinesis 101
● Makes it easy to load and analyze streaming data, provide ability for you to build your
own custom applications for your business needs
● Used to consume big data
● Stream large amounts of social media, news feeds logs etc. into the cloud, and to
process data:
○ Redshift for business intelligence
○ EMR for big data processing
● Data is auto replicated to 3 AZ
● Services
○ Kinesis streams
■ Stores data from data streams for 24 hours by default, which can be increased to 7 days
■ Stored in shards, then data consumers (Ec2 instances) take data from the
shards and process the data and send it to AWS services etc.
■ Each shard gives you 5 transactions per second for reads, up to a maximum
read rate of 2 MB per second, and 1,000 records per second for writes, up to a maximum
write rate of 1 MB per second
■ A stream is made of up many different shards
■ Shard is basically a unit of throughput
■ The total capacity of the stream is the sum of capacities of its shards
■ Able to reprocess / replay data, since the data is stored for a duration and
not deleted upon consumption like SQS
■ Billing is per shard provisioned, can have as many shards as you want
■ Batching available, and the number of shards can evolve over time
■ Records are ordered per shard
■ Multiple applications can consume the same stream
■ Once data is inserted in Kinesis, it cannot be deleted
■ Choose a partition key that is highly distributed, like a user id
■ ProvisionedThroughputExceeded - returned if we go above the provisioned throughput.
Solutions: retry with backoff, increase the number of shards, and ensure your
partition key is highly distributed (see the sketch below)
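A minimal boto3 sketch of writing to a stream with a highly distributed partition key (a user id), per the note above; the stream name and record contents are hypothetical:

    import json
    import boto3

    kinesis = boto3.client("kinesis")

    event = {"user_id": "u-9137", "action": "click", "page": "/pricing"}
    kinesis.put_record(
        StreamName="clickstream",               # placeholder
        Data=json.dumps(event).encode(),
        PartitionKey=event["user_id"],          # records with the same key land on the same shard
    )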
○ Kinesis firehose
■ Capture, transform, and load streaming data into S3, Redshift, Elasticsearch
and Splunk. Enables near real-time analytics with existing business
intelligence tools and dashboards.
■ Fully managed service that auto scales to match throughput of your data
and requires no administration. Can also batch, compress and encrypt
data before loading it.
■ Don't have to worry about shards or streams etc. completely automated
■ Don't have to worry about data consumers, you can analyze data on your
own and send it to S3 -> redshift etc.
■ No automatic data retention, as soon as data comes in, its analyzed
immediately / processed
○ Kinesis analytics
■ Allows you to run sql queries in data in your firehose or streams
Amazon MQ
● When migrating to the cloud, instead of re-engineering application to use SQS and SNS,
you can use Amazon MQ = managed Apache ActiveMQ
● Doesn't scale as much as SQS / SNS
● Runs on a dedicated machine with high availability failover
● Has both queue feature and topic features of SQS and SNS
AWS Organizations
● Account management service that enables you to consolidate multiple AWS accounts
into organization that you create and manage
● Consolidated billing
○ A final billing that includes all organization accounts under root account
○ It saves money because unused reserved instances are shared across the organization.
For example, if org A uses 5 on-demand instances and org B has 5 RIs but is only using
3, with consolidated billing the 2 unused RIs are applied to org A's usage, so you pay
for 3 on-demand instances and 5 RIs. Without consolidated billing you would pay for 5
on-demand instances and 5 RIs, since each org would be treated as a separate billing
entity
● By default you can have only 20 linked accounts
● Cloudtrail is on a per account and per region basis but you can aggregate into a single
bucket in the paying account
● Central management
○ Manage multiple AWS accounts at once. Create groups of accounts then attach
policies to groups to ensure correct policies are applied across the accounts.
Enables you to centrally manage policies across multiple accounts
● Control Access
○ Service control policies that control AWS services. You can allow or deny
individual services. SCP will override IAM policies.
● Automate AWS account creation
○ Use APIs to automate creation and management of new accounts. Create new
accounts programmatically
Cross Account Access
● Makes it easier for you to work with multiple account environments by allowing you to
switch roles within the management console
● Basic steps:
○ Identify account numbers
○ Create group in IAM - Dev
○ Create user in IAM - Dev
○ Log in to production
○ Create the policy
○ Create the cross account role
○ Apply newly created policy to the role
○ Log into the developer account
○ Create a new inline policy
○ Apply it to the dev group (a sketch of the cross-account role creation follows below)
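A sketch of the "create the cross account role" step as it might look in the production account with boto3; the account ID, role name and policy contents are placeholders:

    import json
    import boto3

    iam = boto3.client("iam")

    # Trust policy: principals in the developer account may assume this role
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111111111111:root"},   # developer account (placeholder)
            "Action": "sts:AssumeRole",
        }],
    }

    iam.create_role(
        RoleName="DevCrossAccountRole",
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    # The permissions policy created earlier would then be attached to this role,
    # and developers switch to it from the console or via STS.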
Tagging & Resource Groups
● Tags
○ Key value pairs
○ Meta data
○ Tags can sometimes be inherited - autoscaling, cloudformation and elastic
beanstalk can create other resources
● Resource groups
○ Make it easier to group resources using the tags
○ Groups can contain info such as
■ Region
■ Name
■ Health checks
○ Classic resource groups - global
■ Great for seeing resources on a whole
○ Aws systems manager - per region basis, can execute commands based on the
resources
Security Token Service
● Grants users limited and temp access.
● Temp credentials can be valid between 15 minutes to 1 hour
● Federation - let users outside of AWS to assume temp role for accessing AWS
resources
● Federation (IE active directory) - combining or joining a list of users in one domain with
a list of users in another domain IE from IAM to AD
○ Uses SAML
○ Grants temp access based off users AD credentials. Does not need user in IAM
○ Single sign on
● Federation with mobile apps
○ Use facebook/google/amazon to provide log in
● Cross account access compatible
● SAML
○ Open standard that many identity providers use
○ Enables federated single sign on, so users can log in without creating an IAM
user for everyone
○ For enterprises
○ To integrate with AD / ADFS
○ Provide access to AWS console or CLI via temp credentials
○ The client app calls AssumeRoleWithSAML against STS, and STS responds with
temporary security credentials.
■ Returns a set of temporary credentials for users authenticated via SAML. By
default the credentials last for one hour, but you can change that via the
DurationSeconds parameter.
● Identity broker: a service that allows you to take an identity from point A and join it to
point B. You need to implement this yourself, as it does not come out of the box
○ Custom identity broker
■ Use only if identity provider is not compatible with SAML 2.0
■ Identity broker must determine appropriate IAM policy
● Basic steps:
○ Develop an identity broker to communicate with LDAP and AWS STS
○ Identity broker authenticates with LDAP first, then with AWS STS
○ Application then gets temp access to AWS resources
● AssumeRoleWithWebIdentity
○ Returns set of temporary credentials for users authenticated via mobile or web
app with web identity provider. Examples are Cognito, amazon, facebook, google
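A minimal sketch of requesting temporary credentials from STS and using them for a scoped boto3 session; the role ARN and session name are placeholders:

    import boto3

    sts = boto3.client("sts")
    creds = sts.assume_role(
        RoleArn="arn:aws:iam::111111111111:role/ReadOnlyAuditor",   # placeholder
        RoleSessionName="temporary-audit-session",
        DurationSeconds=3600,
    )["Credentials"]

    session = boto3.Session(
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    # session.client("s3") etc. now acts under the temporary, limited permissions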
WorkSpaces
● Cloud based replacement for traditional desktop.
● Available as a bundle of compute resources, storage space and software application
access that allows user to perform day to day tasks like using traditional desktop.
● Windows 7 experience provided by windows server 2008 R2
● By default users can personalize their workspace
● Workspaces are persistent
● All data is on D drive backed up every 12 hours
● By default you will be given local admin access
● Don't need AWS account to log into workspaces
ECS - Elastic container service
● What is docker?
○ A universal standard that allows you to build test and deploy applications quickly
○ Highly reliable, you can quickly deploy and scale applications into any
environment
○ Infinitely scalable, running docker on AWS is a great way to run distributed
applications at any scale
○ Ability to package and run an application in a loosely isolated environment called
containers
○ Packages software into standardized units called containers
■ Containers allow you to easily package an apps code, configurations,
and dependencies into easy to use building blocks that deliver
environmental consistency, operational efficiency, developer productivity
and version control
■ Lightweight, don't need the extra load of a hypervisor but run directly
within host machines kernel.
■ You can run more containers on a given hardware combination than if you
were using virtual machines.
■ Images are read-only templates; a container is a runnable instance of an image.
○ Virtualisation
■ Traditional VM
■ Guest os -> dependencies -> application -> VM
■ Wasted space due to more overhead, because each VM needs to maintain its
own copy of the guest OS
○ Containerisation
■ More lightweight, uses bare minimum to run application
■ Dependencies -> application -> docker container
■ Achieve higher density and improve portability by removing the per
container guest OS
■ Don't have to worry about dependencies
■ Isolation, performance or stability issues stay within that container
■ Much better resource management
■ Extreme code portability
■ Create microservices
○ Docker components
■ Docker image
■ Docker container
■ Layers / union file system
■ Dockerfile
■ Docker daemon / engine
■ Docker client
■ Docker registries / docker hub
● ECS - highly scalable, fast container management service that makes it easy to run, stop
and manage docker containers on cluster of Ec2 instances. Lets you launch and stop
container based applications with simple API calls
○ You can use ECS to schedule the placement of containers across your cluster based
on resource needs, isolation policies and availability requirements. Eliminates
the need for you to operate your own cluster management and configuration
system
● Regional service that simplifies running application containers across multiple AZ within
a region.
● The ECS container agent supports most flavours of Linux, like Ubuntu, Amazon
Linux, CentOS, RedHat and Debian, but not Windows or macOS
● Image is a read only template with instructions for creating a container
● Images are stored in a registry such as a dockerHub or AWS ECR
● EC2 container registry ECR is managed AWS docker registry service
● Task definition - text file in JSON format that describes one or more containers, to a
max of 10 that forms your application. Basically a blueprint for your app.
○ Specify various parameters for your app IE which container to use, which launch
type, which ports should be opened, what data volumes should be used
● Clusters - running tasks on ECS, you place them on a cluster which is a logical
grouping of resources. Can contain multiple different container instance types. Region
specific. Container instance can only be part of one cluster at a time.
● You can create IAM policies for your clusters to allow or restrict user access.
● Schedule ECS
○ Service scheduler
■ Suited for long running stateless services and applications.
■ Ensures that scheduling strategy you specify is followed and reschedule
tasks when a task has failed
○ Custom scheduler
■ Create your own scheduler that meets the needs of your business.
■ Enables you to build schedulers and integrate third party schedulers with
ECS
Disaster Recovery
● Recovery Time Objective (RTO) - maximum length of time after an outage that your
company is willing to wait for the recovery process to finish
● Recovery Point Objective (RPO) - maximum amount of data loss your company is willing to
accept, as measured in time
● Backup and restore
○ Transferring data to S3 is typically done via the network. You can use AWS
Import/Export to transfer very large data sets by shipping storage devices directly to
AWS.
○ Amazon glacier and S3 can be used in conjunction to produce backup solution.
○ AWS storage gateway enables snapshots of your on-premise data volumes to be
transparently copied into S3 for backup.
○ Storage cached volumes allows you to store primary data in S3, but keep
frequently accessed data on site. You can snapshot data volumes to give high
durable backup.
○ You can use the gateway-VTL configuration of Storage Gateway as a backup target for your
existing backup management software. Can be used as a replacement for
traditional magnetic tape.
○ If you are running on AWS, you can back up data into S3. Snapshots of EBS
volumes, RDS and Redshift data can be stored in S3.
○ Key steps:
■ Select an appropriate tool or method to back up the data
■ Ensure you have appropriate retention policy
■ Ensure appropriate security measures are in place, including encryption
and access policies
■ Regularly test recovery of data and restoration of your system
● Pilot Light
○ Minimal version of environment is always running on cloud.
○ You can maintain pilot light by configuring and running most critical core
elements of your system. When recovery is needed, you can rapidly provision a
full-scale production environment around the critical core.
○ Infrastructure typically includes your database servers replicating data to Ec2 or
RDS. to provision remainder of infrastructure to restore business-critical services,
you would have some pre-configured servers via AMI.
○ When starting recovery, AMI’s would come up quickly. With networking, you
would either:
■ Use elastic IP and associate them with your instances
■ Use ELB to distribute traffic. Then update your DNS records to point at
your Ec2 or your load balancer via CNAME.
○ Key steps:
■ Set up EC2 instances to replicate or mirror data
■ Ensure you have all supporting custom software packages available in
AWS
■ Create and maintain AMI of key servers where fast recovery is required
■ Regularly run tests and apply software updates and configurations
■ Consider automating the provisioning of AWS resources.
○ Recovery phase
■ To recover the remainder of the environment, you can start systems from AMIs. For
dynamic data servers, you can resize them to handle production volumes.
Horizontal scaling often is most cost effective and scalable approach to
add capacity. For example, you can add more web servers at peak time.
Or you can choose larger Ec2 instance types, scaling vertically.
■ After recovery, ensure redundancy is restored.
○ Key steps for recovery:
■ Start application Ec2 instances from custom AMI
■ Resize existing database/ datastores to process increased traffic
■ Add additional database /datastores to give DR site resilience in data tier;
IE turn on multi-az if using RDS
■ Change DNS to point at EC2 servers
■ Install and configure any non-AMI based systems
● Warm Standby
○ Scale down version of fully functional environment always running on cloud.
Extends pilot light elements and preparation. Further decreases recovery time
because services are always running.
○ Key steps:
■ Setup Ec2 instances to replicate or mirror data
■ Create and maintain AMI
■ Run application using minimal footprint of EC2 instances
■ Patch and update software and configuration files in line with live
environment
Exam Tips
● Cloudtrail provides event history of your AWS account activity, including actions taken
through the console, SDK, CLI and other services. Provides visibility into user activity by
recording API calls. Records important information about each API call including name of
API, identity of caller, the time of call, request parameters, and response elements
returned by the service
● CloudTrail event log files are encrypted using Amazon S3 server-side encryption
(SSE).
● When stopping an EBS backed EC2 instance, the instance performs a normal shutdown
and stops running. The EBS volume remains attached and data persists. Data stored in
the RAM of the host computer is gone and released, and in most cases, the instance is
migrated to a new underlying host computer.
○ EC2- Classic: AWS releases the public and private ip address and assigns a new
one
○ EC2-VPC: the instance retains its private ip address but the public IPv4 address is released
○ For elastic IP, for classic EC2 it is released so you have to re-associate the
elastic IP when the instance comes up, for VPC it retains the elastic IP
● AWS budgets gives ability to set custom budgets that alert you when you exceed your
costs, or are forecasted to exceed your budget.
● EBS volumes / snapshots in EC2 instances, you can increase the size, change volume
type, adjust IOPS performance without detaching it from the instance.
● Services which you can access underlying OS:
○ AWS EMR
○ Ec2
○ OpsWork
○ Elastic Beanstalk
● EMR enables customer to easily and cost effectively process vast amounts of data,
utilizing Hadoop framework on infrastructure of EC2 and S3. EMR launches number of
EC2 instances for its Hadoop engine. These EC2 instances are fully accessible and
manageable by the customer including full admin rights.
● Cloudwatch custom metrics requiring shell scripting: memory utilization, disk swap
utilization, disk space utilization, page file utilization, log collection
● Lambda encryption for environment variables is automatic via the AWS KMS. when
you first create or update lambda functions, default service key is created automatically,
and is used to encrypt the environment variables.
● Kinesis firehose streams data into S3, Redshift, ElastiSearch, and Splunk.
○ Fully managed service that auto scales to match the throughput of your data and
requires no ongoing administration.
● Amazon s3 Select makes it easy to retrieve specific data from contents of an object
using simple SQL expressions without retrieving the entire object.
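A hedged boto3 sketch of S3 Select; the bucket, key and query are placeholders and assume a CSV object with a header row:

    import boto3

    s3 = boto3.client("s3")

    resp = s3.select_object_content(
        Bucket="my-data-bucket",
        Key="sales/2021.csv",
        ExpressionType="SQL",
        Expression="SELECT s.region, s.amount FROM S3Object s WHERE CAST(s.amount AS FLOAT) > 1000",
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    for event in resp["Payload"]:          # the response is an event stream
        if "Records" in event:
            print(event["Records"]["Payload"].decode())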
● Instance store-backed EC2 instances cannot be stopped and started; they are either running or terminated
● Amazon DynamoDB Accelerator is fully managed, highly available in-memory cache
that can reduce response times from milliseconds to microseconds
● To increase the write performance of database hosted in EC2, you can set up a standard
RAID configuration or increase the size of the EC2 instance IE select an instance drive
that provides more I/O throughput. You can join multiple st1, sc1, gp2, io1 volumes
together in a RAID 0 configuration
● SAML Federation API enables federated single sign on, so users can log into console
without creating an IAM user for everyone. Microsoft active directory implements SAML
so you can set up an SAML based federation API access to your AWS.
● Transferring data from an EC2 instance to an S3 bucket, glacier, dynamoDB, SES, SQS
or SimpleDB in the same region has no cost at all.
● AWS Directory Service provides ways to use AWS cloud directory and microsoft active
directory with other AWS services. Directories store information about users, groups and
devices, and admins using them to manage access to information resources. Designed
to give you easy way to establish relationship between active directory and AWS.
● Temporary credentials are useful in scenarios that involve identity federation, delegation,
cross account access and IAM roles. AWS Security Token Service can generate
temporary tokens. In enterprise identity federation, you can authenticate users in your
organization network, then provide access to AWS without creating new AWS identities
requiring a username and pass. Use the single sign on approach for temporary access.
● SQS messages in queue will continue to exist even after it is consumed, until that
message is deleted.
● AWS redshift uses workload management WLM to define number of query queues that
are available and how queries are routed to those queues for processing. WLM is part of
a parameter group configuration. By default, WLM contains one queue that can run up to
five queries concurrently.
● AWS redshift spectrum allows you to run SQL queries against exabytes of
unstructured data in S3.
● EBS encryption offers encryption for: data at rest inside the volume, all data moving
between volume and instance, all snapshots created from the volume, all volumes
created from those snapshots.
● EBS VS EFS
○ EBS is generally ideal for databases and other low latency interactive
applications that require high IOPS or throughput.
○ EFS is designed for huge amounts of data that big data workloads and analytic
applications generate. Ideal for media processing workflows, content
management and web serving.
● API Gateway provides throttling at multiple levels including global and by service call.
Limits can be set for standard rates or for burst. Any requests over the limit will receive a
429 HTTP response. You can add caching to API calls by provisioning the cache and
specifying the size.
● Network ACL controls traffic coming in and out of your VPC network.
● Snapshots occur asynchronously which means the volume can be used as normal while
snapshot is being created.
● AWS OpsWorks - configuration management service leveraging Chef and Puppet. They
are automation platforms that allow you to code to automate configurations of your
servers, how they are deployed and managed across EC2 instances. Chef consists of
recipes to maintain consistent state.
● Cold HDD - ideal for large, sequential workloads if you require infrequent access to data
and want to save costs.
○ Performance defined in terms of throughput rather than IOPS.
○ Lower throughput limit than throughput optimized HDD
● VPC Peering does not support edge to edge routing. In other words, if either VPC in
peering relationship has following connections:
■ VPN connection or AWS direct connection to corporate network
■ Internet connection via internet gateway
■ Internet connection in private subnet through NAT device
■ VPC endpoint to AWS service, IE endpoint to S3
○ Then the VPC peering cannot use the connection to access resources on other
side of connection.
● RDS performs failover in events:
○ Loss of availability in primary AZ
○ Loss of network connection to primary
○ Compute unit failure on primary
○ Storage failure on primary
● Bastion host implementation - create a small EC2 instance with a security group that only
allows access from a particular IP for maximum security. Use a small instance as it serves only
as a jump server to connect to other instances in your VPC.
● AWS IoT Core - managed service that lets connected devices easily and securely
interact with cloud applications and other devices. Provides secure communication and
data processing across different connected devices and locations.
● Database in RAID - creating a snapshot in a RAID configuration is different. You have to
stop all I/O activity of the volume before creating the snapshot. Flush all caches to the
disk. Confirm that the EC2 instances are no longer writing to the RAID array. Then take
the snapshot. It is important that there is no data I/O from the volumes when creating the
snapshots. RAID arrays introduce data interdependencies and a level of complexity not
present in a single EBS configuration.
● By default, all data stored in storage gateway in S3 are encrypted server-side with AWS
S3 managed encryption keys at rest. Data stored in AWS Glacier is also encrypted at
rest by default.
● Perfect Forward Secrecy - feature that provides additional safeguards against
eavesdropping of encrypted data through use of unique random session keys.
Cloudfront and ELB supports it.
● Monitoring best practices:
○ Make monitoring priority to head off small problems before they become big
○ Create and implement plan that collects data from all parts of AWS solutions so
you can easily debug. Address the following:
■ What are your goals?
■ What resources will you monitor?
■ How often will you monitor?
■ What tools will you use?
○ Automate the tasks
○ Check the log files of your EC2 instances
● To allow communication between two subnets, network ACL should be set to allow
communication. The SG also needs to be configured so web server can communicate
with DB.
● Single SQS message queue can contain unlimited messages, but there is a 120k limit
for number of inflight messages for standard and 20k for FIFO.
● Important features of EBS:
○ Automatically replicated within an AZ to prevent data loss / a single point of failure
○ Can be attached to only one EC2 instance at a time
○ When you create volume, you can attach it to any EC2 instance in same AZ
○ EBS volume is off-instance storage that persist independently of life from the
instance
○ Support live configuration changes
○ 256 bit encryption
○ Offers 99.999% SLA
● Prerequisites for routing traffic to website hosted in S3:
○ S3 configured for static website.
○ Same name as domain or sub domain
○ Registered domain name
○ Route 53 as DNS service.
● ENI is a logical networking component in VPC that represents a network card. You can
attach ENI to EC2 instance:
○ While its running (hot attach)
○ While its stopped ( warm attach)
○ When its being launched (cold attach)
● S3 notification feature enables you to receive notifications based off of certain events
happening in your buckets. Can publish events to:
○ SNS
○ SQS
○ Lambda
● AWS security token service lets you create and provide trusted users with temp
security credentials. Works identical to long term access credentials, with differences:
○ Configured to last anywhere between few minutes to several hours.
○ Not stored with the user but generated dynamically
○ Works globally
● Amazon Data Lifecycle Manager allows you to automate the creation, retention and
deletion of snapshots taken to back up your EBS volumes.
○ Protect valuable data by enforcing regular backup schedule
○ Retain backups as required by auditors or internal compliance
○ Reduce storage costs by deleting outdated backups
● Auto scaling cooldown:
○ Ensures the group does not launch or terminate additional EC2 instances before
previous scaling takes effect
○ Default is 300 seconds
● Network ACL: an optional layer of security; rules are evaluated by rule number, from lowest
to highest, and executed immediately when a matching allow/deny rule is found
● EC2 has a soft limit of 20 instances per region.
● RDS failover is automatically handled. It simply flips the CNAME for your instance to
point at the standby, which is in turn promoted to become the new primary
● Enable long polling by setting ReceiveMessageWaitTimeSeconds to a number greater
than 0 if your polling is causing a large number of CPU cycles and costing a lot
● Maximum response time for business level premium support case = 1 hour
● Virtualization types available:
○ Hardware virtual machine HVM
○ Paravirtual machine PV
● OSX is not supported by EC2
● CloudHSM is a cloud based hardware security module that allows you to easily add
secure key storage and high performance crypto operations to your applications. No
upfront costs. Locating HSM appliances near your EC2 instances decrease network
latency, which improves application performance
● All EC2 instance types and operating systems are supported by CloudWatch
● Amazon will never have root level SSH access to your EC2 instances
● Converting your RDS instance from single AZ to multi-AZ:
○ Snapshot of primary is taken
○ New standby instance created in a different AZ from the snapshot
○ Synchronous replication is configured between primary and standby
● Data stored on EBS volumes are automatically redundantly stored in multiple physical
volumes in the same AZ as part of normal service
● AWS KMS provides an audit trail so you can see who used your key to access which
object and when, and view failed attempts to access data from users without permission
to decrypt data
● Request headers for S3 PUT:
○ Content-MD5
○ Content-length
○ X-amz-storage-class
○ X-amz-meta-
● Aurora has a superior read performance than MySQL, 5x throughput compared to
MySQL and 3x throughput compared to PostgreSQL
● Route 53 DNS has a security features that prevents internal DNS from being read by
external sources.
● AWS reserves both the first four and last IP address in each subnet CIDR block.
● Valid S3 Encryption
○ Server side encryption S3
○ SSE-C - use if you want to maintain your own encryption keys, but don't want to
implement or leverage a client side encryption library
○ SSE-KMS
○ Client library such as S3 encryption client
● Configuring WAF
○ Size constraint conditions
○ Ip match conditions
○ String match conditions
● For EMR, to mitigate issues with terminated spot instances, increase the bid price for
task nodes so that you have a greater threshold before the task nodes are terminated,
or change task nodes to on-demand instances
● For dedicated hosting for EC2, you can use the following modes to transition between
stopping and starting the instance
○ Host & dedicated
○ Default & dedicated
● If you are on a dedicated hosting tenancy and want to revert back to default:
○ Create an AMI of all your instances
○ Create a new VPC with default as the hosting tenancy attribute, and use them to
create new instances using default tenancy
● Elasticache Redis provides operations for Sorted Sets, Pub/Sub, and an in-memory data
store.
● SQL Server and Oracle have limits on how many DBs can run per instance. The limit is due to
the technology being proprietary.
● AWS Config - use to continuously record configuration changes to RDS DB instances,
DB subnet groups, DB snapshots, DB security groups and event subscriptions and
receive notifications of changes through SNS
● Some RDS types support multi-az read replica
● Restoring an object from Glacier, you can do so via the S3 API and using the AWS
console
● Cloudwatch stores metrics for terminated EC2 instances or deleted ELBS for 15 months
● ProvisionedThroughputExceededException - message you receive when you are
exceeding the individual partitions throughput capacity, even if it's not exceeding the
overall table capacity. DynamoDB distributes capacity evenly across all available
partitions.
● S3 409 conflict - the bucket already exists, or you are attempting to delete the bucket
without removing the contents in the bucket
● S3 400 bad request - refers to an incompleteBody, invalidDigest,
invalidBucketName
● S3 403 - refers to invalidObjectState
● AMI cannot be launched to another region unless you copy it first.
● DynamoDB LimitExceededException - happens when you are attempting to create
more than one table with a secondary index at a time.
● Using SNS JSON message generator, you can choose the appropriate endpoint types
and edit the code to send different messages depending on the endpoints
● The DynamoDB control plane lets you create and manage tables. Allows you to work with
indexes, streams, and other objects dependent on the table. It does not cover item-level
operations like create, read, update and delete - those belong to the data plane.
● One DynamoDB read capacity unit is equal to one strongly consistent read per second,
for an item up to 4 KB in size, or 2 eventually consistent reads per second. For example:
○ If the item to read is 8 KB in size, you require 2 read capacity units to sustain one strongly
consistent read per second, 1 read capacity unit if you choose eventually consistent reads,
or 4 read capacity units for a transactional read request (worked example below).
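The same arithmetic worked as a small Python helper (rounding the item size up to the next 4 KB for reads):

    import math

    def read_capacity_units(item_size_kb, mode="strong"):
        units = math.ceil(item_size_kb / 4)     # one RCU covers up to 4 KB per read
        if mode == "eventual":
            return math.ceil(units / 2)         # eventually consistent reads cost half
        if mode == "transactional":
            return units * 2                    # transactional reads cost double
        return units

    print(read_capacity_units(8, "strong"))         # 2
    print(read_capacity_units(8, "eventual"))       # 1
    print(read_capacity_units(8, "transactional"))  # 4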
● S3 Multipart is required for files larger than 5GB and recommended for files greater than
100 mb
● DynamoDB the maximum length of a sort key value is 1024 bytes, and minimum is 1
byte
● With a conditional write, an operation succeeds only if the item attribute meets one or
more expected conditions.
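A minimal boto3 sketch of a conditional write; the table, key and condition are placeholders, and a failed condition raises a ConditionalCheckFailedException:

    import boto3

    dynamodb = boto3.client("dynamodb")

    # Only insert the item if no item with this order_id exists yet
    dynamodb.put_item(
        TableName="Orders",
        Item={"order_id": {"S": "o-1001"}, "status": {"S": "NEW"}},
        ConditionExpression="attribute_not_exists(order_id)",
    )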
● A single DynamoDB table partition can support a maximum of 3000 read capacity units,
or 1000 write capacity units.
● Each SWF workflow execution / task can run for a maximum of 1 year
● To create more than one table with secondary indexes, you must do it sequentially. So
first create the table, wait for it to become active, then create the next table and wait for it
to become active.
● A global secondary index is an index with primary key that is different from the tables
primary key.
● You can use S3 transfer acceleration with multipart uploads. S3 transfer acceleration
leverages CloudFront’s globally distributed AWS edge locations, but it does not use
cloudfronts caching feature
○ Additional data transfer applies
○ Should be used only for large distances and if customers are uploading /
downloading generally 1gb +
● Protection against DDOS
○ Use cloudfront to distribute static and dynamic content
○ Use app load balancer with auto scaling to restrict direct internet traffic to your
RDS via private subnet
○ Setup cloudwatch alerts to look for high network in and cpu utilization metrics
○ Decouple your infrastructure. Decoupling applications limit internet access to
critical system components
○ Route 53 provides shuffle sharding and Anycast routing capabilities to protect
domain names from dns based attacks
● AWS Trusted Advisor provides best-practice recommendations for Cost Optimization,
Performance, Fault Tolerance, Security and Service Limits
● AWS Step Functions provides serverless orchestration for modern applications.
Centrally manages workflow by breaking it into multiple steps, adding flow logic and
tracking inputs and outputs between steps. Maintains application state, tracking workflow
steps and stores event log of data passed between components.
● DynamoDB Auto scaling uses auto scaling service to adjust and provision throughput
capacity on your behalf in response to traffic patterns. Enables a table or global
secondary index to increase its provisioned read and write capacity to handle increase in
traffic without throttling.
● AWS Database Migration Service - helps you migrate database to AWS quickly and
securely. Source database remains fully operational during migration. Supports
homogeneous migrations IE oracle to oracle, as well as heterogeneous migrations.
Migrations can be from on-premise database to AWS, or EC2 to RDS etc.
○ Heterogeneous migrations are a two step process, because the schema structure, data types
and code of source and target are different, requiring a schema transform. Use the AWS Schema
Conversion Tool to convert the source schema and code to match the target database,
then use the AWS Database Migration Service to migrate the data.
● AWS systems manager run command lets you remotely and securely manage the
configuration of your managed instances.
● AWS Aurora failover is handled automatically.
○ If you have a read replica = the CNAME will be pointed to the new healthy
replica, and will be promoted
○ If you don't have a read replica = It will first attempt to create a new DB instance
in the same AZ as original, and if it fails, it will attempt to create a new DB in a
different AZ.
● CloudTrail event logs are encrypted using SSE by default.
● All APIs created via API Gateway are exposed over HTTPS, and do not support plain HTTP.
● Sharing an AMI does not affect the ownership of that ami, however if you copy an ami
that was shared to you, you become the new owner. However, to copy an AMI that was
shared to you, the owner must grant you read permissions for the storage that backs that
AMI.
● You cannot copy an encrypted AMI that was shared to you from another account.
● You cannot copy an AMI with an associated billingProduct code that was shared with you. This includes Windows AMIs and AMIs from the AWS Marketplace. To copy an AMI with a billingProduct code, launch an EC2 instance in your account using the shared AMI, and then create an AMI from that instance.
● If one EC2 instance behind a load balancer is receiving much more traffic than the others, the issue may be that sticky sessions are enabled
● NLB gets a static IP per AZ
○ Public facing: attach an Elastic IP per AZ - this gives clients static IPs they can whitelist
○ Supports cross-zone load balancing
○ Supports SSL/TLS termination to enable HTTPS
● Load balancers use X.509 certificates as SSL/TLS server certificates and manage them via ACM, although you can upload your own certificates as well (see the sketch below)
○ You can add an optional list of certificates to support multiple domains
○ Clients can use SNI (Server Name Indication) to specify the hostname they are trying to reach
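A minimal sketch of supporting a second domain on an existing HTTPS/TLS listener with boto3 (both ARNs are placeholders): the extra ACM certificate is added to the listener, and clients that send the matching SNI hostname are served that certificate.

import boto3

elbv2 = boto3.client("elbv2")

# Attach an additional ACM certificate so the listener can serve another domain via SNI
elbv2.add_listener_certificates(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:listener/app/my-alb/abc123/def456",
    Certificates=[
        {"CertificateArn": "arn:aws:acm:us-east-1:123456789012:certificate/example-cert-id"}
    ],
)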
● An IAM role attached to an ASG (via its launch configuration) gets assigned to the EC2 instances it launches
● ASG default termination policy:
○ Find the AZ that has the largest number of instances
○ If there are multiple instances in that AZ, terminate the one with the oldest launch configuration
○ The ASG will always try to balance the number of instances across AZs
● The ASG cooldown period ensures that the ASG doesn't launch or terminate additional instances before the previous scaling activity takes effect
○ The default value is 300 seconds
● When creating an S3 bucket via the CLI, if you don't specify a region it defaults to us-east-1
● You can enable detailed monitoring for your EC2 instances to get more effective auto scaling
● Glacier and Storage Gateway encrypt data at rest by default
● You can use CloudWatch alarms to automatically stop, terminate, reboot or recover EC2 instances based on health metrics (see the sketch below)
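A minimal sketch of an auto-recovery alarm with boto3 (the instance ID and region are placeholders): if the system status check fails for two consecutive minutes, the alarm action recovers the instance onto healthy hardware.

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="recover-web-server",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed_System",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Minimum",
    Period=60,
    EvaluationPeriods=2,
    Threshold=1.0,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    # The recover action migrates the instance to new underlying hardware
    AlarmActions=["arn:aws:automate:us-east-1:ec2:recover"],
)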
● Instances in a RAID configuration have a different snapshot process
○ Steps:
■ Stop all applications writing to the RAID array
■ Flush all caches to disk
■ Confirm the EC2 instance is no longer writing to the RAID array
■ Take a snapshot of the EBS volumes in the array
● Perfect Forward Secrecy is a feature that provides additional safeguards against eavesdropping on encrypted data through the use of a unique random session key, preventing decoding of captured data. CloudFront and ELB support it.
● Allowing multiple domains to serve SSL traffic over the same IP address:
○ Generate SSL certificates with AWS Certificate Manager, associate the certificates with your CloudFront web distribution, and enable support for SNI
● Lambda supports synchronous and asynchronous invocation of Lambda functions. You can control the invocation type only when you invoke the function yourself, e.g. via the Invoke API (see the sketch below).
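A minimal sketch of both invocation types with boto3 (the function name and payload are hypothetical):

import boto3
import json

lambda_client = boto3.client("lambda")
payload = json.dumps({"orderId": 42})

# Synchronous: the caller blocks and receives the function's result
sync = lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="RequestResponse",
    Payload=payload,
)
print(json.loads(sync["Payload"].read()))

# Asynchronous: Lambda queues the event and returns immediately (HTTP 202)
lambda_client.invoke(
    FunctionName="my-function",
    InvocationType="Event",
    Payload=payload,
)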
● For durability, think S3 and DynamoDB
○ DynamoDB and AWS AppSync can be used to build collaborative apps that keep shared data updated in real time. AppSync allows the app to access data in DynamoDB, trigger Lambda functions, or run Elasticsearch queries
● An Amazon Resource Name (ARN) uniquely identifies an AWS resource, such as IAM policies, RDS tags and API calls. The general format is arn:partition:service:region:account-id:resource.
● Configure cross-region snapshots for an Amazon Redshift cluster to keep copies in another region (see the sketch below)
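A minimal sketch of enabling automated cross-region snapshot copy with boto3 (the cluster identifier, destination region and retention period are placeholders):

import boto3

redshift = boto3.client("redshift")

# Automatically copy automated snapshots of the cluster to another region
redshift.enable_snapshot_copy(
    ClusterIdentifier="analytics-cluster",
    DestinationRegion="us-west-2",
    RetentionPeriod=7,  # days to keep the copied snapshots
)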
● RDS Enhanced Monitoring provides metrics for the RDS child processes and OS processes; the metrics are delivered to CloudWatch Logs
● AWS GLUE - fully managed extract, transform, and load (ETL) service that makes it
easy for customers to prepare and load their data for analytics.
● You can use Run Command from the console to configure instances without having to log in to each instance.
● VPC endpoints:
○ There are two types of VPC endpoints: interface endpoints and gateway
endpoints. You have to create the type of VPC endpoint required by the
supported service.
○ An interface endpoint is an elastic network interface with a private IP address that serves as an entry point for traffic destined to a supported service. It supports most services, with the notable exceptions of S3 and DynamoDB
○ A gateway endpoint is a gateway that is a target for a specific route in your route table, used for traffic destined to a supported AWS service. It is important to note that for Amazon S3 and DynamoDB you have to create a gateway endpoint, and use interface endpoints for other services (see the sketch below).
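A minimal sketch of creating both endpoint types with boto3 (all IDs and the region are placeholders): a gateway endpoint for S3 that adds a route to the chosen route table, and an interface endpoint (here for Systems Manager) that places an ENI with a private IP in the chosen subnet.

import boto3

ec2 = boto3.client("ec2")

# Gateway endpoint for S3: traffic to S3 stays on the AWS network via a route entry
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)

# Interface endpoint for most other services, e.g. Systems Manager
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.ssm",
    SubnetIds=["subnet-0123456789abcdef0"],
    PrivateDnsEnabled=True,
)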
● An Elastic IP address doesn't incur charges as long as the following conditions are true:
○ The Elastic IP address is associated with an Amazon EC2 instance.
○ The instance associated with the Elastic IP address is running.
○ The instance has only one Elastic IP address attached to it.
● Web Identity Federation
○ Lets users authenticate with external identity providers like Facebook or Google, receive an authentication token, and exchange that token for temporary security credentials in AWS that map to an IAM role (see the sketch below)
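A minimal sketch of the token exchange with boto3 (the role ARN is a placeholder and the web identity token comes from the external provider): the call returns temporary credentials scoped to the IAM role.

import boto3

sts = boto3.client("sts")

# Exchange the provider's token for temporary AWS credentials mapped to a role
creds = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/WebAppUserRole",
    RoleSessionName="web-user-session",
    WebIdentityToken="<token returned by Facebook/Google/etc.>",
)["Credentials"]

print(creds["AccessKeyId"], creds["Expiration"])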
● RDP uses port 3389
● CodeDeploy is a deployment service that automates application deployments to
Amazon EC2 instances, on-premises instances, or serverless Lambda functions. It
allows you to rapidly release new features, update Lambda function versions, avoid
downtime during application deployment, and handle the complexity of updating your
applications, without many of the risks associated with error-prone manual
deployments.
● An egress-only internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows outbound communication over IPv6 from instances in your VPC to the internet, and prevents the internet from initiating an IPv6 connection with your instances (see the sketch below).
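A minimal sketch of creating one and routing IPv6 traffic through it with boto3 (the VPC and route table IDs are placeholders):

import boto3

ec2 = boto3.client("ec2")

# Create the egress-only internet gateway for the VPC
eigw = ec2.create_egress_only_internet_gateway(VpcId="vpc-0123456789abcdef0")
eigw_id = eigw["EgressOnlyInternetGateway"]["EgressOnlyInternetGatewayId"]

# Send all outbound IPv6 traffic through it; inbound IPv6 connections are blocked
ec2.create_route(
    RouteTableId="rtb-0123456789abcdef0",
    DestinationIpv6CidrBlock="::/0",
    EgressOnlyInternetGatewayId=eigw_id,
)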
● Blue/Green deployments
○ Provision a new set of instances on which CodeDeploy installs the latest version of the application. Then reroute load balancer traffic from the existing instances running the previous version to the new set of instances.
○ This allows you to test the new application version before sending production traffic to it.
○ If there is an issue with the new version, you can always roll back by sending traffic to the previous version again.
● An existing unencrypted EBS root volume cannot be encrypted in place. To encrypt it, launch the instance unencrypted, take a snapshot of the root volume (or an image of the instance), copy the snapshot or AMI with encryption enabled, and launch a new instance from the encrypted copy (see the sketch below).
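A minimal sketch of the encrypted-copy approach with boto3 (the AMI ID, name and region are placeholders): the AMI copy is made with encryption enabled, and the resulting image can be used to launch instances with encrypted root volumes.

import boto3

ec2 = boto3.client("ec2")

# Copy an unencrypted AMI, encrypting the backing EBS snapshots in the process.
# copy_image runs in the destination region; SourceRegion is where the source AMI lives.
copy = ec2.copy_image(
    Name="my-app-encrypted",
    SourceImageId="ami-0123456789abcdef0",
    SourceRegion="us-east-1",
    Encrypted=True,
)
print(copy["ImageId"])  # launch new instances from this encrypted AMI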