AWS CP Complete Notes
EC2
• provides scalable computing capacity
• Features
▪ Virtual computing environments, known as EC2 instances
▪ Preconfigured templates for EC2 instances, known as Amazon Machine Images (AMIs),
that package the bits needed for the server (including the operating system and additional
software)
▪ Various configurations of CPU, memory, storage, and networking capacity for your
instances, known as Instance types
▪ Secure login information for your instances using key pairs (public-private keys where
private is kept by user)
▪ Storage volumes for temporary data that’s deleted when you stop or terminate your instance,
known as Instance store volumes
▪ Persistent storage volumes for data using Elastic Block Store (EBS)
▪ Multiple physical locations for your resources, such as instances and EBS volumes, known
as Regions and Availability Zones
▪ A firewall to specify the protocols, ports, and source IP ranges that can reach your instances
using Security Groups
▪ Static IP addresses, known as Elastic IP addresses
▪ Metadata, known as tags, can be created and assigned to EC2 resources
▪ Virtual networks that are logically isolated from the rest of the AWS cloud, and can
optionally connect to on premises network, known as Virtual private clouds (VPCs)
• Amazon Machine Image
▪ template from which EC2 instances can be launched quickly
▪ does NOT span across regions, and needs to be copied
▪ can be shared with other specific AWS accounts or made public
• Purchasing Option
▪ On-Demand Instances
o pay for instances and compute capacity that you use by the hour
o with no long-term commitments or up-front payments
▪ Reserved Instances
o provides lower hourly running costs by providing a billing discount
o capacity reservation that is applied to instances
o suited if consistent, heavy, predictable usage
o provides benefits with Consolidated Billing
o can be modified to switch Availability Zones or the instance size within the
same instance type, given the instance size footprint (Normalization factor)
remains the same
o pay for the entire term regardless of usage, so if a question asks for the most cost-
effective solution and an answer implies Reserved Instances would be purchased
but left unused, that option can be eliminated
▪ Spot Instances
o cost-effective choice but does NOT guarantee availability
o suited for applications that are flexible about when they run and can handle
interruption by storing their state externally
o AWS gives a two-minute warning before the instance is terminated, allowing any
unsaved work to be persisted
▪ Dedicated Instances – a tenancy option that enables instances to run in a VPC on
hardware that is isolated and dedicated to a single customer
▪ Light, Medium, and Heavy Utilization Reserved Instances are no longer
available for purchase and were part of the Previous Generation AWS EC2
purchasing model
• Enhanced Networking
▪ results in higher bandwidth, higher packet per second (PPS) performance,
lower latency, consistency, scalability and lower jitter
▪ supported using Single Root I/O Virtualization (SR-IOV) only on supported
instance types
▪ is supported only with a VPC (not EC2-Classic) and the HVM virtualization type; it is
available by default on the Amazon Linux AMI but can be enabled on other AMIs as well
• Placement Group
▪ provides low-latency, high-bandwidth (10 Gbps) networking for High Performance Computing
Glacier
• suitable for archiving data, where data access is infrequent and a retrieval time of several
hours (3 to 5 hours) is acceptable (Not true anymore with enhancements from AWS)
• provides high durability by storing archives across multiple facilities and multiple devices
at a very low storage cost
• performs regular, systematic data integrity checks and is built to be automatically
self-healing
• aggregate files into larger archives before sending them to Glacier, and use range
retrievals to retrieve partial files and reduce costs
• improve speed and reliability with multipart upload
• automatically encrypts the data using AES-256
• upload or download data to Glacier via SSL encrypted endpoints
CloudFront
• provides low latency and high data transfer speeds for distribution of static, dynamic web
or streaming content to web users
• delivers the content through a worldwide network of data centers called Edge
Locations
• keeps persistent connections with the origin servers so that the files can be fetched from
the origin servers as quickly as possible.
• dramatically reduces the number of network hops that users’ requests must pass through
• supports multiple origin server options, such as AWS hosted services (e.g., S3, EC2, ELB) or
an on-premises server, which stores the original, definitive version of the objects
• a single distribution can have multiple origins; the path pattern in a cache behavior
determines which requests are routed to which origin
• supports Web Download distribution and RTMP Streaming distribution
▪ Web distribution supports static, dynamic web content, on demand using
progressive download & HLS and live streaming video content
▪ RTMP supports streaming of media files using Adobe Media Server and the Adobe Real-
Time Messaging Protocol (RTMP) ONLY
• supports HTTPS using either
▪ dedicated IP address, which is expensive as dedicated IP address is assigned to each
CloudFront edge location
▪ Server Name Indication (SNI), which is free but supported by modern browsers only
with the domain name available in the request header
• For E2E HTTPS connection,
▪ Viewers -> CloudFront needs either self signed certificate, or certificate issued by
CA or ACM
▪ CloudFront -> Origin needs certificate issued by ACM for ELB and by CA for
other origins
• Security
▪ Origin Access Identity (OAI) can be used to restrict content from an S3 origin so that it is
accessible only through CloudFront and not directly from S3
IAM
• securely control access to AWS services and resources
• helps create and manage user identities and grant permissions for those users to access
AWS resources
• helps create groups for multiple users with similar permissions
• not appropriate for application authentication
• is Global and does not need to be migrated to a different region
• helps define Policies,
▪ in JSON format
▪ all permissions are implicitly denied by default
▪ most restrictive policy wins
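A minimal sketch of what such a JSON policy might look like, created here with boto3; the policy name and bucket ARN are hypothetical placeholders, not part of the original notes:
```python
import json
import boto3

# Hypothetical policy: allow read-only access to a single S3 bucket.
# Anything not explicitly allowed remains implicitly denied.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="ExampleReadOnlyBucketPolicy",   # hypothetical name
    PolicyDocument=json.dumps(policy_document),
)
```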
• IAM Role
▪ helps grant and delegate access to users and services without the need to create
permanent credentials
▪ IAM users or AWS services can assume a role to obtain temporary security
credentials that can be used to make AWS API calls
▪ needs a Trust policy to define who can assume the role and a Permissions policy to define
what the user or service can access
▪ used with Security Token Service (STS), a lightweight web service that
provides temporary, limited-privilege credentials for IAM users or for authenticated
federated users (see the boto3 sketch at the end of this list)
▪ IAM role scenarios
o Service access for e.g. EC2 to access S3 or DynamoDB
o Cross Account access for users
o with a user within the same account
o with a user within an AWS account owned by the same owner
o with user from a Third Party AWS account with External ID for enhanced
security
o Identity Providers & Federation
o Web Identity Federation, where the user can be authenticated using external
authentication Identity providers like Amazon, Google or any OpenID IdP using
AssumeRoleWithWebIdentity
o Identity Provider using SAML 2.0, where the user can be authenticated using on
premises Active Directory, Open Ldap or any SAML 2.0 compliant IdP using
AssumeRoleWithSAML
o For other Identity Providers, use Identity Broker to authenticate and provide
temporary Credentials using Assume Role (recommended) or
GetFederationToken
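As referenced above, a minimal boto3 sketch of assuming a role via STS to obtain temporary credentials; the role ARN and session name are hypothetical placeholders:
```python
import boto3

sts = boto3.client("sts")

# Assume a role (same or cross-account); the ARN below is a placeholder.
resp = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/ExampleCrossAccountRole",
    RoleSessionName="example-session",
    DurationSeconds=3600,            # temporary, limited-lifetime credentials
)

creds = resp["Credentials"]

# Use the temporary credentials to make AWS API calls, e.g. list S3 buckets.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_buckets()["Buckets"])
```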
• IAM Best Practices
▪ Do not use Root account for anything other than billing
CloudHSM
• provides secure cryptographic key storage to customers by making hardware security
modules (HSMs) available in the AWS cloud
• single tenant, dedicated physical device to securely generate, store, and manage
cryptographic keys used for data encryption
• resides inside a VPC (not EC2-Classic) and is isolated from the rest of the network
• can use VPC peering to connect to CloudHSM from multiple VPCs
• integrated with Amazon Redshift and Amazon RDS for Oracle
• EBS volume encryption, S3 object encryption and key management can be done with
CloudHSM but requires custom application scripting
• is NOT fault tolerant on its own; a cluster needs to be built, because if a single HSM fails all the keys are lost
• expensive; prefer AWS Key Management Service (KMS) if cost is a criterion
AWS Directory Services
• gives applications in AWS access to Active Directory services
• different from SAML + AD, where the access is granted to AWS services through
Temporary Credentials
• Simple AD
▪ least expensive but does not support Microsoft AD advanced features
▪ provides a Samba 4 Microsoft Active Directory compatible standalone directory
service on AWS
▪ No single point of Authentication or Authorization, as a separate copy is maintained
▪ trust relationships cannot be setup between Simple AD and other Active Directory
domains
▪ avoid it if the requirement is to leverage access and control through a centralized
authentication service
• AD Connector
▪ acts as a hosted proxy service for instances in AWS to connect to on-premises
Active Directory
▪ enables consistent enforcement of existing security policies, such as password
expiration, password history, and account lockouts, whether users are accessing
resources on-premises or in the AWS cloud
▪ needs VPN connectivity (or Direct Connect)
▪ integrates with existing RADIUS-based MFA solutions to enable multi-factor
authentication
▪ does not cache data which might lead to latency
• Read-only Domain Controllers (RODCs)
▪ works out as a Read-only Active Directory
▪ holds a copy of the Active Directory Domain Services (AD DS) database and
responds to authentication requests
▪ they cannot be written to and are typically deployed in locations where physical
security cannot be guaranteed
▪ helps maintain a single point of authentication and authorization control, but
needs to be synced
• Writable Domain Controllers
▪ are expensive to set up
▪ operate in a multi-master model; changes can be made on any writable server in the
forest, and those changes are replicated to servers throughout the entire forest
AWS WAF
• is a web application firewall that helps monitor the HTTP/HTTPS requests forwarded to
CloudFront and allows controlling access to the content.
• helps define Web ACLs, which are combinations of Rules; each Rule is a combination of
Conditions and an Action to block or allow
• Third Party WAF
▪ act as filters that apply a set of rules to web traffic to cover exploits like XSS and SQL
injection and also help build resiliency against DDoS by mitigating HTTP GET or
POST floods
▪ WAF provides a lot of features like OWASP Top 10, HTTP rate limiting, Whitelist or
blacklist, inspect and identify requests with abnormal patterns, CAPTCHA etc.
▪ a WAF sandwich pattern can be implemented where an auto scaled WAF sits
between the Internet and Internal Load Balancer
AWS – Networking Services – Important Notes
VPC
• helps define a logically isolated dedicated virtual network within the AWS
• provides control of IP addressing using CIDR block from a minimum of /28 to
maximum of /16 block size
• Components
▪ Internet gateway (IGW) provides access to the Internet
▪ Virtual gateway (VGW) provides access to on-premises data center
through VPN and Direct Connect connections
▪ VPC can have only one IGW and VGW
▪ Route tables determine where network traffic from subnet is directed
▪ Ability to create subnets within the VPC CIDR block
▪ A Network Address Translation (NAT) server provides outbound Internet access for
EC2 instances in private subnets
▪ Elastic IP addresses are static, persistent public IP addresses
▪ Instances launched in the VPC will have a Private IP address and can have a
Public or an Elastic IP address associated with them
▪ Security Groups and NACLs help define security
▪ Flow logs – Capture information about the IP traffic going to and from network
interfaces in your VPC
• allows Tenancy option for instances
▪ shared, by default, allows instances to be launched on shared tenancy
▪ dedicated allows instances to be launched on a dedicated hardware
• NAT
▪ allows internet access to instances in private subnet
▪ performs the function of both address translation and port address translation (PAT)
▪ needs the source/destination check flag to be disabled, as it is not the actual
source or destination of the traffic
▪ NAT gateway is an AWS managed NAT service that provides better availability,
higher bandwidth, and requires less administrative effort
• Route Tables
▪ defines rules, termed routes, which determine where network traffic from the
subnet is directed
RDS
• provides Relational Database service
• supports MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server, and the new,
MySQL-compatible Amazon Aurora DB engine
• as it is a managed service, shell (root ssh) access is not provided
• manages backups, software patching, automatic failure detection, and recovery
• supports user-initiated manual backups and snapshots
• daily automated backups with database transaction logs enables Point in Time
recovery up to the last five minutes of database usage
• snapshots are user-initiated storage volume snapshots of the DB instance, backing up the
entire DB instance and not just individual databases; they can be restored as an
independent RDS instance
• supports encryption at rest using KMS as well as encryption in transit using SSL
endpoints
• for encrypted database
▪ logs, snapshots, backups, read replicas are all encrypted as well
▪ cross-region read replicas and cross-region snapshot copies do not work for
encrypted databases
• Multi-AZ deployment
▪ provides high availability and automatic failover support and is NOT a scaling
solution
▪ maintains a synchronous standby replica in a different AZ
▪ transaction success is returned only if the commit is successful both on the
primary and the standby DB
▪ Oracle, PostgreSQL, MySQL, and MariaDB DB instances use Amazon's failover
technology, while SQL Server DB instances use SQL Server Mirroring
▪ snapshots and backups are taken from the standby, eliminating I/O freezes on the primary
▪ automatic failover is seamless: RDS switches to the standby instance and updates the
DNS record to point to the standby
▪ failover can be forced with the Reboot with failover option
• Read Replicas
▪ uses the PostgreSQL, MySQL, and MariaDB DB engines’ built-in replication
ElastiCache
• managed web service that provides in-memory caching to deploy and run Memcached or
Redis protocol-compliant cache clusters
• ElastiCache with Redis,
▪ like RDS, supports Multi-AZ, Read Replicas and Snapshots
▪ Read Replicas are created across AZ within same region using Redis’s
asynchronous replication technology
▪ Multi-AZ differs from RDS as there is no standby, but if the primary goes down a
Read Replica is promoted as primary
▪ Read Replicas cannot span across regions (unlike RDS, which supports cross-region replicas)
▪ cannot be scaled out and if scaled up cannot be scaled down
▪ allows snapshots for backup and restore
▪ AOF can be enabled for recovery scenarios, to recover the data in case the node fails
or service crashes. But it does not help in case the underlying hardware fails
▪ enabling Redis Multi-AZ is a better approach to fault tolerance
• ElastiCache with Memcached
▪ can be scaled up by increasing size and scaled out by adding nodes
▪ nodes can span across multiple AZs within the same region
▪ cached data is spread across the nodes, and a node failure will always result in
some data loss from the cluster
▪ supports auto discovery
▪ every node should be homogeneous and of the same instance type
• ElastiCache Redis vs Memcached
▪ complex data objects vs simple key value storage
▪ persistent vs non persistent, pure caching
▪ automatic failover with Multi-AZ vs Multi-AZ not supported
▪ scaling using Read Replicas vs using multiple nodes
▪ backup & restore supported vs not supported
• can be used for state management to keep the web application stateless
Redshift
• fully managed, fast and powerful, petabyte scale data warehouse service
• uses replication and continuous backups to enhance availability and improve data
durability and can automatically recover from node and component failures
• provides Massive Parallel Processing (MPP) by distributing & parallelizing queries
across multiple physical resources
• uses columnar data storage, improving query performance and allowing advanced
compression techniques
• only supports Single-AZ deployments and the nodes are available within the same AZ,
if the AZ supports Redshift clusters
• spot instances are NOT an option
AWS – Application Services – Important Notes
SQS
• extremely scalable queue service and potentially handles millions of messages
• helps build fault tolerant, distributed loosely coupled applications
• stores copies of the messages on multiple servers for redundancy and high
availability
• guarantees At-Least-Once Delivery, but does not guarantee Exact One Time Delivery which
might result in duplicate messages (Not true anymore with the introduction of FIFO queues)
• does not maintain or guarantee message order, and if needed sequencing information needs
to be added to the message itself (Not true anymore with the introduction of FIFO queues)
• supports multiple readers and writers interacting with the same queue at the same time
• holds messages for 4 days by default; retention can be configured from 1 minute to 14 days,
after which the message is deleted
• message needs to be explicitly deleted by the consumer once processed
• allows send, receive and delete batching, which helps group up to 10 messages in a
single batch while being charged the price of a single request
• handles visibility of the message to multiple consumers using Visibility Timeout, where
the message once read by a consumer is not visible to the other consumers till the timeout
occurs
• can handle load and performance requirements by scaling the worker instances as the
demand changes (Job Observer pattern)
• supports short and long polling for retrieving messages (see the sketch after this list)
▪ returns immediately vs waits for fixed time for e.g. 20 secs
▪ might not return all messages as it samples a subset of servers vs returns all
available messages
▪ repetitive vs helps save cost with long connection
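A minimal boto3 sketch of the consumer side, assuming a queue already exists (the queue URL below is a placeholder): long polling via WaitTimeSeconds, processing within the visibility timeout, and an explicit delete once done.
```python
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # placeholder

# Long poll: wait up to 20 seconds instead of returning immediately (short polling).
resp = sqs.receive_message(
    QueueUrl=queue_url,
    MaxNumberOfMessages=10,      # batch up to 10 messages per call
    WaitTimeSeconds=20,          # long polling
    VisibilityTimeout=60,        # hidden from other consumers while being processed
)

for msg in resp.get("Messages", []):
    print(msg["Body"])           # process the message here
    # Messages are not removed on read; delete explicitly once processed,
    # otherwise they become visible again after the visibility timeout.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```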
• supports delay queues to make messages available only after a certain delay; can be used to
differentiate from priority queues
• supports dead letter queues, to redirect messages which failed to process after certain
attempts instead of being processed repeatedly
• Design Patterns
▪ Job Observer Pattern can help coordinate number of EC2 instances with number of
job requests (Queue Size) automatically thus Improving cost effectiveness and
performance
▪ Priority Queue Pattern can be used to setup different queues with different handling
either by delayed queues or low scaling capacity for handling messages in lower priority
queues
SNS
• delivery or sending of messages to subscribing endpoints or clients
• publisher-subscriber model
• publishers communicate asynchronously with subscribers by producing and sending a
message to a topic
• supports Email (plain or JSON), HTTP/HTTPS, SMS, SQS
• supports Mobile Push Notifications to push notifications directly to mobile devices with
services like Amazon Device Messaging (ADM), Apple Push Notification Service (APNS),
Google Cloud Messaging (GCM) etc. supported
• order is not guaranteed and No recall available
• integrated with Lambda to invoke functions on notifications
• for Email notifications, use SNS or SES directly; SQS cannot send emails
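A minimal boto3 sketch of the publisher side, assuming a topic already exists (the topic ARN is a placeholder); every subscribed endpoint (email, SQS, HTTP/S, mobile push, Lambda) receives the message.
```python
import boto3

sns = boto3.client("sns")

# Topic ARN is a placeholder; subscribers are attached to the topic separately.
sns.publish(
    TopicArn="arn:aws:sns:us-east-1:123456789012:example-topic",
    Subject="Order shipped",             # used for email subscriptions
    Message="Order 42 has shipped.",     # delivered to every subscriber
)
```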
SWF
• orchestration service to coordinate work across distributed components
• helps define tasks, stores, assigns tasks to workers, define logic, tracks and monitors the task
and maintains workflow state in a durable fashion
• helps define tasks which can be executed on AWS cloud or on-premises
• helps coordinating tasks across the application which involves managing inter task
dependencies, scheduling, and concurrency in accordance with the logical flow of the
application
• supports built-in retries, timeouts and logging
• supports manual tasks
• Characteristics
▪ delivers a task exactly once
▪ uses long polling, which reduces number of polls without results
▪ Visibility of task state via API
▪ Timers, signals, markers, child workflows
▪ supports versioning
▪ keeps workflow history for a user-specified time
• AWS SWF vs AWS SQS
▪ task-oriented vs message-oriented
▪ tracks all tasks and events vs. needs custom handling
SES
• highly scalable and cost-effective email service
• uses content filtering technologies to scan outgoing emails to check standards and
email content for spam and malware
• supports full fledged emails to be sent as compared to SNS where only the message is
sent in Email
• ideal for sending bulk emails at scale
• guarantees first hop
• eliminates the need to support custom software or applications to do heavy lifting of
email transport
AWS – Management Tools – Important Notes
CloudFormation
• gives developers and systems administrators an easy way to create and manage a collection
of related AWS resources
• Resources can be updated, deleted and modified in an orderly, controlled and predictable
fashion, in effect applying version control to AWS infrastructure the same way it is done for
software code
• CloudFormation Template is an architectural diagram, in JSON format, and Stack is the
end result of that diagram, which is actually provisioned
• template can be used to set up the resources consistently and repeatedly over and over
across multiple regions and consists of
▪ List of AWS resources and their configuration values
▪ An optional template file format version number
▪ An optional list of template parameters (input values supplied at stack creation
time)
▪ An optional list of output values like public IP address using the Fn::GetAtt
function
▪ An optional list of data tables used to lookup static configuration values for e.g., AMI
names per AZ
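A minimal sketch of such a template, with a parameter and an output, passed to CloudFormation via boto3; the stack name, key pair parameter value, and AMI ID are hypothetical placeholders.
```python
import json
import boto3

# Minimal template: one parameter, one EC2 instance, one output.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Parameters": {
        "KeyName": {"Type": "AWS::EC2::KeyPair::KeyName"}
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-0123456789abcdef0",   # placeholder AMI ID
                "InstanceType": "t2.micro",
                "KeyName": {"Ref": "KeyName"},
            },
        }
    },
    "Outputs": {
        "PublicIp": {"Value": {"Fn::GetAtt": ["WebServer", "PublicIp"]}}
    },
}

cfn = boto3.client("cloudformation")
cfn.create_stack(
    StackName="example-stack",
    TemplateBody=json.dumps(template),
    Parameters=[{"ParameterKey": "KeyName", "ParameterValue": "my-key"}],
    OnFailure="ROLLBACK",   # roll back the stack automatically on error
)
```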
• supports Chef & Puppet integration to deploy and configure right down to the
application layer
• supports Bootstrap scripts to install packages, files and services on the EC2 instances by
simply describing them in the CF template
• automatic rollback on error feature is enabled, by default, which will cause all the AWS
resources that CF created successfully for a stack up to the point where an error occurred to
be deleted
• provides a Wait Condition resource to block the creation of other resources until a
completion signal is received from an external source
• allows Deletion Policy attribute to be defined for resources in the template
▪ retain to preserve resources like S3 even after stack deletion
▪ snapshot to backup resources like RDS after stack deletion
• DependsOn attribute to specify that the creation of a specific resource follows another
• Service role is an IAM role that allows AWS CloudFormation to make calls to
resources in a stack on the user’s behalf
• supports Nested Stacks, which separate out reusable, common components into
dedicated templates that can be mixed and matched and then combined into a
single, unified stack
Elastic Beanstalk
• makes it easier for developers to quickly deploy and manage applications in the AWS
cloud.
• automatically handles the deployment details of capacity provisioning, load balancing, auto-
scaling and application health monitoring
• CloudFormation supports Elastic Beanstalk
• provisions resources to support
▪ a web application that handles HTTP(S) requests or
▪ a web application that handles background-processing (worker) tasks
• supports Out Of the Box
▪ Apache Tomcat for Java applications
▪ Apache HTTP Server for PHP applications
▪ Apache HTTP server for Python applications
▪ Nginx or Apache HTTP Server for Node.js applications
▪ Passenger for Ruby applications
▪ Microsoft IIS 7.5 for .Net applications
▪ Single and Multi Container Docker
• supports custom AMI to be used
• is designed to support multiple running environments such as one for Dev, QA, Pre- Prod
and Production.
• supports versioning and stores and tracks application versions over time allowing easy
rollback to prior version
• can provision an RDS DB instance, with connectivity information exposed to the application via
environment variables; this is NOT recommended for production setups, because the RDS
instance is tied to the Elastic Beanstalk environment lifecycle and would be deleted along
with the environment
OpsWorks
• is a configuration management service that helps to configure and operate
applications in a cloud enterprise by using Chef
• helps deploy and monitor applications in stacks with multiple layers
• supports preconfigured layers for Applications, Databases, Load Balancers, Caching
• OpsWorks Stacks features a set of lifecycle events – Setup, Configure, Deploy,
Undeploy, and Shutdown – which automatically run a specified set of recipes at the
appropriate time on each instance
• Layers depend on Chef recipes to handle tasks such as installing packages on
instances, deploying apps, running scripts, and so on
• OpsWorks Stacks runs the recipes for each layer, even if the instance belongs to
multiple layers
• supports Auto Healing and Auto Scaling to monitor instance health, and provision new
instances
CloudWatch
• allows monitoring of AWS resources and applications in real time, collect and track pre
configured or custom metrics and configure alarms to send notification or make resource
changes based on defined rules
• does not aggregate data across regions
• stores the log data indefinitely, and the retention can be changed for each log group at any
time
• alarm history is stored for only 14 days
• can be used as an alternative to S3 to store logs, with the ability to configure Alarms and
generate metrics; however, logs cannot be made public
• Alarms exist only in the created region and the Alarm actions must reside in the same
region as well
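A minimal boto3 sketch of an alarm on a pre-configured metric (EC2 CPUUtilization) that notifies an SNS topic; the instance ID and topic ARN are placeholders.
```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm when average CPU of one instance stays above 80% for two 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="high-cpu-example",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],       # placeholder
)
```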
CloudTrail
• records API calls for the AWS account made from the AWS Management
Console, SDKs, CLI and higher-level AWS services
• supports many AWS services and tracks who did what, from where, and when
• can be enabled on a per-region basis; a trail in a region can include global services (like IAM,
STS, etc.) and applies to all the supported services within that region
• log files from different regions can be sent to the same S3 bucket
• can be integrated with SNS to notify logs availability, CloudWatch logs log group for
notifications when specific API events occur
• call history enables security analysis, resource change tracking, troubleshooting and
compliance auditing
AWS – Analytics Services – Important Notes
Data Pipeline
• orchestration service that helps define data-driven workflows to automate and
schedule regular data movement and data processing activities
• integrates with on-premises and cloud-based storage systems
• allows scheduling, retry, and failure logic for the workflows
EMR
• is a web service that utilizes a hosted Hadoop framework running on the web-scale
infrastructure of EC2 and S3
• launches all nodes for a given cluster in the same Availability Zone, which improves
performance as it provides higher data access rate
• seamlessly supports Reserved, On-Demand and Spot Instances
• consists of a Master node for management and Slave nodes, which consist of Core
nodes (holding data) and Task nodes (performing tasks only)
• is fault tolerant for slave node failures and continues job execution if a slave node goes
down
• does not automatically provision another node to take over failed slaves
• supports Persistent and Transient cluster types
▪ Persistent which continue to run
▪ Transient which terminates once the job steps are completed
• supports EMRFS which allows S3 to be used as a durable HA data storage
Kinesis
• enables real-time processing of streaming data at massive scale
• provides ordering of records, as well as the ability to read and/or replay records in the same
order to multiple Kinesis applications
• data is replicated across three data centers within a region and preserved for 24 hours, by
default and can be extended to 7 days
• streams can be scaled using multiple shards, based on the partition key, with each shard
providing a capacity of 1 MB/sec data input and 2 MB/sec data output with 1000 PUT
records per second (see the sketch at the end of this section)
• Kinesis vs SQS
▪ real-time processing of streaming big data vs reliable, highly scalable hosted queue for
storing messages
▪ ordered records, as well as the ability to read and/or replay records in the same order vs
no guarantee on data ordering (with the standard queues before the FIFO queue feature
was released)
▪ data storage up to 24 hours, extended to 7 days vs up to 4 days, can be configured from
1 minute to 14 days but cleared if deleted by the consumer
▪ supports multiple consumers vs single consumer at a time and requires multiple
queues to deliver message to multiple consumers
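As referenced above, a minimal boto3 sketch of writing to a stream: the partition key is hashed to pick a shard, and records with the same key land on the same shard, preserving their relative order. The stream name and payload are placeholders.
```python
import boto3

kinesis = boto3.client("kinesis")

# Records with the same partition key go to the same shard.
kinesis.put_record(
    StreamName="example-stream",                 # placeholder stream name
    Data=b'{"sensor": "s1", "reading": 42}',
    PartitionKey="sensor-s1",
)
```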
AWS Exam Important Notes
AWS Exams cover a lot of topics and a wide range of services with minute details for features,
patterns, anti patterns and their integration with other services. This is just to have a quick
summary of all the services and key points for a quick glance before you appear for the exam
Consolidated Billing
• Paying account with multiple linked accounts
• Paying account is independent and should be only used for billing purpose
• Paying account cannot access resources of the other accounts unless given access
explicitly through cross-account roles
• All linked accounts are independent, with a soft limit of 20 linked accounts
• One bill per AWS account
• provides Volume pricing discount for usage across the accounts
• allows unused Reserved Instances to be applied across the group
• Free tier is not applicable across the accounts
Tags & Resource Groups
• are metadata, specified as key/value pairs with the AWS resources
• are for labelling purposes and help in managing and organizing resources
• can be inherited by resources created from Auto Scaling, CloudFormation,
Elastic Beanstalk, etc.
• can be used for
▪ Cost allocation to categorize and track the AWS costs
▪ Conditional Access Control policy to define permission to allow or deny access on
resources based on tags
• Resource Group is a collection of resources that share one or more tags
IDS/IPS
• Promiscuous mode is not allowed, as AWS and the Hypervisor will not deliver any traffic to
an instance that is not specifically addressed to it
• IDS/IPS strategies
▪ Host Based Firewall – Forward Deployed IDS where the IDS itself is installed on the
instances
▪ Host Based Firewall – Traffic Replication where IDS agents installed on
instances which send/duplicate the data to a centralized IDS system
▪ In-Line Firewall – Inbound IDS/IPS Tier (like a WAF configuration) which
identifies and drops suspect packets
DDOS Mitigation
• Minimize the Attack surface
▪ use ELB/CloudFront/Route 53 to distribute load
▪ maintain resources in private subnets and use Bastion servers
• Scale to absorb the attack
▪ scaling helps buy time to analyze and respond to an attack
▪ auto scaling with ELB to handle increase in load to help absorb attacks
▪ CloudFront, Route 53 inherently scales as per the demand
• Safeguard exposed resources
▪ use Route 53 for aliases to hide source IPs and Private DNS
▪ use CloudFront geo restriction and Origin Access Identity
▪ use WAF as part of the infrastructure
• Learn normal behavior (IDS/WAF)
▪ analyze and benchmark to define rules on normal behavior
▪ use CloudWatch
• Create a plan for attacks
Benefits of Cloud Computing
1. No Capital Investment
2. Only Operational costs with pay-as-go prices
3. Flexibility in Capacity
4. Speed and Agility
5. Users can concentrate more on their core business processes with no Datacenter
maintenance
6. Go Global in Minutes
Virtualization
• It is the process of creating a virtual version of resources such as hardware, software, etc.
• Virtualization is a software layer that sits between the OS and the host
machine.
Uses/Benefits
- Infrastructure-as-a-service (IaaS)
- Platform-as-a-service (PaaS)
- Software-as-a-service (SaaS)
Comparison for IaaS, PaaS, SaaS
Based on Deployment models, we have
- Public Cloud
- Private Cloud
- Hybrid Cloud
Public Cloud
Public cloud usually uses shared resources and it also means, if all parts of the application
run in the cloud which built on low-level infrastructure prices.
Private Cloud
Hybrid Cloud
Infrastructure of AWS
https://www.infrastructure.aws/
AWS Datacenter
Layers of Security
- Perimeter layer
- Infrastructure layer
- Data layer
- Environmental layer
Security
- The AWS Cloud enables a shared responsibility model. While AWS manages security of
the cloud, you are responsible for security in the cloud
- The provider must ensure that their infrastructure is secure, while the user must take
measures to fortify their application and use strong passwords and authentication
measures.
Benefits of AWS Security
- Keep your data safe
- Meet your compliance requirements
- Save money
- Scale quickly
AWS Cloud Platform
- AWS consists of many cloud services that you can use in combinations tailored to your
business or organizational needs
- To access the services, you can use
- AWS Management Console
- Command Line Interface (CLI)
- Software Development Kits (SDKs)
Available AWS Certifications
Compute Services
Elastic Compute Cloud (EC2)
Servers
- Application Server
- Database Server
- Web Server ..
Elastic Compute Cloud
- Elastic means that, if properly configured, the number of servers can automatically be
increased or decreased according to your application's demand
In AWS, servers are referred to as Instances
- The root device for your instance contains the image used to boot the instance
- When AWS introduced EC2, all AMIs were backed by the Amazon EC2 instance store, from a
template stored in S3
- After EBS was introduced, AMIs can also be backed by EBS
- This means that the root device for an instance launched from the AMI is an Amazon EBS
volume created from an Amazon EBS snapshot.
- Instances that use Amazon EBS for the root device automatically have an Amazon EBS
volume attached.
- An EBS-backed instance can be stopped and started without affecting your data
AMI – Amazon Machine Image
• Amazon EC2 enables you to share your AMIs with other AWS accounts.
• You can allow all AWS accounts to launch the AMI (make the AMI public), or only allow a
few specific accounts to launch the AMI.
• AMIs are a regional resource. Therefore, sharing an AMI makes it available in that region.
To make an AMI available in a different region, copy the AMI to the region and then
share it.
- When you launch an instance, the instance type that you specify determines the hardware
of the host computer used for your instance.
- General Purpose
- Compute Optimized
- Memory Optimized
- Storage Optimized
- Accelerated Computing
There are current generation and previous generation instance types under the instance
families. AWS recommends moving to current generation instance types for the best performance,
but the previous generation types are still supported.
Finding an Instance type
Before you can launch an instance, you must select an instance type to use. The instance
type that you choose can differ depending on your requirements for the instances that you'll
launch. For example, you might want to consider the following requirements:
- Region
- The architecture: 32-bit (i386), 64-bit (x86_64), or 64-bit ARM (arm64)
- Compute
- Memory
- Storage
- Network performance
- As your needs change, you might find that your instance is over-utilized or under-utilized.
If this is the case, you can change the size of your instance. For example, if
your t2.micro instance is too small for its workload, you can change it to another instance
type that is appropriate for the workload.
- You might also want to migrate from a previous generation instance type to a current
generation instance type to take advantage of some features; for example, support for
IPv6.
Instance Purchasing Options
Amazon EC2 provides the following purchasing options to enable you to optimize your costs
based on your needs
- On-Demand Instances – Pay, by the second, for the instances that you launch.
- Reserved Instances – Reduce your Amazon EC2 costs by making a commitment to a
consistent instance configuration, including instance type and Region, for a term of 1 or 3
years.
- Scheduled Instances – Purchase instances that are always available on the specified
recurring schedule, for a one-year term.
- Spot Instances – Request unused EC2 instances, which can reduce your Amazon EC2 costs
significantly.
- Dedicated Hosts – Pay for a physical host that is fully dedicated to running your instances,
and bring your existing per-socket, per-core, or per-VM software licenses to reduce costs.
- Dedicated Instances – Pay, by the hour, for instances that run on single-tenant hardware.
- Capacity Reservations – Reserve capacity for your EC2 instances in a specific Availability
Zone for any duration.
(Pricing example: a1.medium Reserved Instance, Linux, North Virginia region)
(Comparison of On-Demand, Reserved, and Spot Instances)
Private IP, Public IP, Elastic IP
Elastic IP: An Elastic IP address is a static IPv4 address designed for dynamic cloud
computing.
• An Elastic IP address is associated with your AWS account.
• With an Elastic IP address, you can mask the failure of an instance or software by rapidly
remapping the address to another instance in your account.
• An Elastic IP address is a public IPv4 address, which is reachable from the internet.
Private IP: When EC2 instances are launched, the primary elastic network interface is
assigned a reserved private IP address from the default VPC DHCP pool.
• The private IP address stays assigned to the network interface until it is deleted.
• The instance's primary network interface cannot be removed; it stays assigned to the
instance until the instance is deleted.
• It is not possible to remove or change the private IP address of the primary network
interface, but it is possible to add more private IP addresses to the network interface.
Public IP: A public IP address is an IPv4 address that's reachable from the Internet. You can
use public addresses for communication between your instances and the Internet. Each
instance that receives a public IP address is also given an external DNS hostname; for
example,ec2-203-0-113-25.compute-1.amazonaws.com.
Instance Lifecycle
- The lifecycle of an instance starts when it is launched and ends when it is terminated. The
purchasing option that you choose affects the lifecycle of the instance.
You can only stop an Amazon EBS-backed instance. When you stop a running instance, the
following happens
- The instance performs a normal shutdown and stops running; its status changes
to stopping and then stopped.
- Any Amazon EBS volumes remain attached to the instance, and their data persists.
- Any data stored in the RAM of the host computer or the instance store volumes of the host
computer is gone.
- In most cases, the instance is migrated to a new underlying host computer when it's
started.
- The instance retains its private IPv4 addresses and any IPv6 addresses when stopped and
started. AWS releases the public IPv4 address and assigns a new one when you start it.
- The instance retains its associated Elastic IP addresses.
Hibernate your instance
- When you hibernate an instance, AWS signals the operating system to perform hibernation
(suspend-to-disk). Hibernation saves the contents of the instance memory (RAM) to the
Amazon EBS root volume. AWS persists the instance's Amazon EBS root volume and any
attached Amazon EBS data volumes.
- You can hibernate an instance only if it's enabled for hibernation
- Root volume needs to be encrypted for hibernation
- When you start your instance:
- The Amazon EBS root volume is restored to its previous state
- The RAM contents are reloaded
- The processes that were previously running on the instance are resumed
- Previously attached data volumes are reattached and the instance retains its instance
ID
Reboot your instance
Terminate your instance
- You can delete your instance when you no longer need it. This is referred to
as terminating your instance. As soon as the state of an instance changes to shutting-
down or terminated, you stop incurring charges for that instance.
Placement Groups:
You can launch or start instances in a placement group, which determines how instances are
placed on underlying hardware. When you create a placement group, you can create one of
the following strategies for the group:
- Cluster – packs instances close together inside an Availability Zone for low network latency
- Partition – spreads instances across logical partitions that do not share underlying hardware
- Spread – strictly places each instance on distinct underlying hardware
Security groups
A security group acts as a virtual firewall that controls the traffic for one or more instances.
When you launch an instance, you can specify one or more security groups; otherwise, AWS uses
the default security group.
You can add rules to each security group that allow traffic to or from its associated
instances.
You can modify the rules for a security group at any time; the new rules are automatically
applied to all instances that are associated with the security group.
When AWS decides whether to allow traffic to reach an instance, it evaluates all the
rules from all the security groups that are associated with the instance.
When you launch an instance in a VPC, you must specify a security group that's created
for that VPC.
Security group rules are always permissive; you can't create rules that deny access.
Security groups are stateful
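A minimal boto3 sketch of creating a security group and adding an allow rule (rules are permissive only, so there is no deny variant); the VPC ID is a placeholder.
```python
import boto3

ec2 = boto3.client("ec2")

# Create a security group in a VPC (placeholder VPC ID).
sg = ec2.create_security_group(
    GroupName="web-sg-example",
    Description="Allow HTTPS from anywhere",
    VpcId="vpc-0123456789abcdef0",
)

# Add an inbound (ingress) rule; deny rules cannot be created.
ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)
```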
Key Pairs
Amazon EC2 uses public–key cryptography to encrypt and decrypt login information. Public–
key cryptography uses a public key to encrypt a piece of data, such as a password, then the
recipient uses the private key to decrypt the data. The public and private keys are known as
a key pair.
To log in to your instance, you must create a key pair, specify the name of the key pair when
you launch the instance, and provide the private key when you connect to the instance.
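A minimal boto3 sketch that creates a key pair, saves the private key locally, and launches an instance using that key pair and a security group; the AMI ID and security group ID are placeholders.
```python
import boto3

ec2 = boto3.client("ec2")

# Create a key pair; AWS keeps the public key, the private key is returned only once.
key = ec2.create_key_pair(KeyName="example-key")
with open("example-key.pem", "w") as f:
    f.write(key["KeyMaterial"])

# Launch a single instance that uses the key pair and a security group (placeholder IDs).
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t2.micro",
    KeyName="example-key",
    SecurityGroupIds=["sg-0123456789abcdef0"],
    MinCount=1,
    MaxCount=1,
)
```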
1. What is EC2 ?
2. Describe AMI?
3. Public and Private Images(AMI) ?
4. What is Instance type ?
5. Types of Instance families ?
6. What is placement group ?
7. Difference between Stop and Terminate in EC2?
8. What is Stop-Hibernate ?
9. Different purchasing options in EC2 ?
10. Public IP , Private IP & Elastic IP
11. Security Group rules ?
12. Key pair ?
Introduction to Elastic Block Store
An Amazon EBS volume is a durable, block-level storage device that you can attach to one
or more instances.
After a volume is attached to an instance, you can use it like any other physical hard drive.
EBS volumes are flexible.
For current-generation volumes attached to current-generation instance types, you can
dynamically increase size, modify the provisioned IOPS capacity, and change volume type
on live production volumes.
An EBS volume and the instance to which it attaches must be in the same Availability
Zone.
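A minimal boto3 sketch of creating a volume and attaching it to an instance in the same Availability Zone; the AZ, instance ID, and device name are placeholders.
```python
import boto3

ec2 = boto3.client("ec2")

# The volume must be created in the same AZ as the instance it will attach to.
vol = ec2.create_volume(
    AvailabilityZone="us-east-1a",   # placeholder AZ
    Size=20,                         # GiB
    VolumeType="gp2",
)

# Wait until the volume is ready before attaching it.
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

ec2.attach_volume(
    VolumeId=vol["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # placeholder instance in the same AZ
    Device="/dev/sdf",
)
```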
Amazon EBS provides the following volume types, which differ in performance characteristics
and price.
• SSD-backed volumes optimized for transactional workloads involving frequent read/write
operations with small I/O size, where the dominant performance attribute is IOPS
• HDD-backed volumes optimized for large streaming workloads where throughput
(measured in MiB/s) is a better performance measure than IOPS
Magnetic (standard)
Magnetic volumes are backed by magnetic drives and are suited for workloads where data is
accessed infrequently, and scenarios where low-cost storage for small volume sizes is
important. These volumes deliver approximately 100 IOPS on average, with burst capability
of up to hundreds of IOPS, and they can range in size from 1 GiB to 1 TiB.
Note: Magnetic is a Previous Generation Volume. For new applications, AWS recommend
using one of the newer volume types.
Amazon EBS Encryption
Amazon EBS encryption offers a simple encryption solution for your EBS volumes without the
need to build, maintain, and secure your own key management infrastructure.
Encryption is supported by all EBS types (General Purpose SSD [gp2], Provisioned IOPS
SSD [io1], Throughput Optimized HDD [st1], Cold HDD [sc1], and Magnetic [standard]).
Same IOPS performance on encrypted volumes as on unencrypted volumes, with a
minimal effect on latency.
Can access encrypted volumes the same way that you access unencrypted volumes.
When you create an encrypted EBS volume and attach it to a supported instance type, the
following types of data are encrypted:
- Data at rest inside the volume
- All data moving between the volume and the instance
- All snapshots created from the volume, and all volumes created from those snapshots
Root Volume
• When you launch an instance, the root device volume contains the image used to boot
the instance. When AWS introduced Amazon EC2, all AMIs were backed by the
Amazon EC2 instance store, which means the root device for an instance launched from
the AMI is an instance store volume created from a template stored in Amazon S3.
Data Volume
• An Amazon EBS volume is a durable, block-level storage device that you can attach to a
single EC2 instance.
• Can use EBS volumes as primary storage for data that requires frequent updates, such as
the system drive for an instance or storage for a database application.
Life Cycle Manager
You can use Amazon Data Lifecycle Manager (Amazon DLM) to automate the creation,
retention, and deletion of snapshots taken to back up your Amazon EBS volumes.
Automating snapshot management helps you to:
- Protect valuable data by enforcing a regular backup schedule
- Retain backups as required by auditors or internal compliance
- Reduce storage costs by deleting outdated backups
An EBS snapshot is a backup of a single EBS volume. The EBS snapshot contains all the data
stored on the EBS volume at the time the EBS snapshot was created.
An AMI image is a backup of an entire EC2 instance.
1. Define EBS ?
2. What are the volume types for EBS ?
3. Differences between Root volume and Data volume ?
4. Difference between AMI and EBS snapshots ?
5. How EBS volumes are persistent when compared to Instance store ?
6. Can an EBS volume attach to 2 or more instances at a time ?
7. Can EBS volume can be attached to an EC2 instance which are in different availability
zone ?
8. If there are 2 EC2 instances in different availability zones, say Instance 1 and Instance 2. Is
it possible to detach from Instance 1 and attach to Instance 2. If yes, how ? If No, why ?
9. Once you define volume size (e.g. 10 GB), is it possible to decrease the volume to 8 GB or
less ? And is it possible to Increase to 12 GB or more ?
10. Is it possible to detach the root volume from the instance ?
EFS(Elastic File System)
What is EFS ?
Amazon EFS provides a simple, scalable, elastic file system for Linux-based workloads for
use with AWS Cloud services and on-premises resources.
It is built to scale on demand to petabytes without disrupting applications, growing and
shrinking automatically as you add and remove files
It is designed to provide massively parallel shared access to thousands of Amazon EC2
instances
It is a fully managed service
There is a Standard and an Infrequent Access storage class available with Amazon EFS
Using Lifecycle Management, files not accessed for 30 days will automatically be moved
to a cost-optimized Infrequent Access storage class reducing cost up to 85%
Amazon EFS is a regional service storing data within and across multiple Availability Zones
(AZs) for high availability and durability.
Can access your file systems across AZs, regions, and VPCs and share files between
thousands of Amazon EC2 instances and on-premises servers via AWS Direct Connect or
AWS VPN.
Dynamic elasticity
Scalable performance
Shared file storage
Fully managed
Cost effective – pay for what you use (no upfront capacity planning)
Pricing
Within your first 12 months on AWS, you can use up to 5 GB/month for free.
Amazon EFS is designed to meet the performance needs of the following use cases.
Advantages of S3
Amazon S3 is intentionally built with a minimal feature set that focuses on simplicity and
robustness
- Creating buckets – Create and name a bucket that stores data. Buckets are the
fundamental container in Amazon S3 for data storage.
- Storing data – Store an infinite amount of data in a bucket. Upload as many objects as
you like into an Amazon S3 bucket. Each object can contain up to 5 TB of data.
- Downloading data – Download your data or enable others to do so. Download your data
anytime you like, or allow others to do the same.
- Permissions – Grant or deny access to others who want to upload or download data into
your Amazon S3 bucket.
Amazon S3 concepts
Buckets
- To upload your data (photos, videos, documents etc.) to Amazon S3, you must first create
an S3 bucket in one of the AWS Regions.
- A bucket is region specific
- A bucket is a container for objects stored in Amazon S3.
- Every object is contained in a bucket.
- By default, you can create up to 100 buckets in each of your AWS accounts. If you need
more buckets, you can increase your account bucket limit to a maximum of 1,000 buckets
by submitting a service limit increase.
- For example, if the object named photos/puppy.jpg is stored in the john bucket in the US
West (Oregon) Region, then it is addressable using the URL
https://john.s3.us-west-2.amazonaws.com/photos/puppy.jpg
- For Bucket name to be created, follow the naming guidelines
• Bucket name should be globally unique and the namespace is shared in all
accounts. This means that after a bucket is created, the name of that bucket
cannot be used by another AWS account in any AWS Region until the bucket is
deleted.
• Once created it cannot be changed
• Bucket names must be at least 3 and no more than 63 characters long.
• Bucket names must not contain uppercase characters or underscores.
• Bucket names must start with a lowercase letter or number.
• Bucket names must not be formatted as an IP address (for example,
192.168.5.4).
• After you create the bucket, you cannot change the name, so choose wisely.
• Choose a bucket name that reflects the objects in the bucket because the
bucket name is visible in the URL that points to the objects that you're going to
put in your bucket.
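A minimal boto3 sketch of creating a bucket and uploading an object; the bucket name below is only a placeholder, since names must be globally unique, and outside us-east-1 a LocationConstraint is required.
```python
import boto3

s3 = boto3.client("s3", region_name="us-west-2")

# Bucket names are globally unique across all AWS accounts.
s3.create_bucket(
    Bucket="example-unique-bucket-name-12345",   # placeholder, must be unique
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)

# Upload an object; the key can include a folder-like prefix.
s3.put_object(
    Bucket="example-unique-bucket-name-12345",
    Key="notes/hello.txt",
    Body=b"example content",
)
```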
Region
- You can choose the geographical AWS Region where Amazon S3 will store the buckets
that you create.
- You might choose a Region to optimize latency, minimize costs, or address regulatory
requirements.
- Objects stored in a Region never leave the Region unless you explicitly transfer them to
another Region.
- For example, objects stored in the Europe (Ireland) Region never leave it.
Object
- Amazon S3 is a simple key, value store designed to store as many objects as you want.
- You store these objects in one or more buckets.
- S3 supports object-level storage, i.e., it stores each file as a whole and does not divide it
- An object's size can be between 0 bytes and 5 TB
- When you upload an object in a bucket, it replicates itself in multiple availability zones in
the same region
An object consists of the following: a key (name), a version ID, the value (data), metadata,
subresources, and access control information.
Object Versioning
- When you re-upload the same object name in a bucket, it replaces the whole object
- You can use versioning to keep multiple versions of an object in one bucket.
- For example, you could store my-image.jpg (version 111111) and my-image.jpg (version
222222) in a single bucket.
- Versioning protects you from the consequences of unintended overwrites and deletions.
- You must explicitly enable versioning on your bucket. By default, versioning is disabled.
- Regardless of whether you have enabled versioning, each object in your bucket has a
version ID.
- If you have not enabled versioning, Amazon S3 sets the value of the version ID to null. If
you have enabled versioning, Amazon S3 assigns a unique version ID value for the object.
- When you enable versioning on a bucket, objects already stored in the bucket are
unchanged. The version IDs (null), contents, and permissions remain the same.
- Enabling and suspending versioning is done at the bucket level.
- When you enable versioning for a bucket, all objects added to it will have a unique
version ID. Unique version IDs are randomly generated.
- Only Amazon S3 generates version IDs; they cannot be edited.
- This functionality prevents you from accidentally overwriting or deleting objects and
affords you the opportunity to retrieve a previous version of an object.
- When you DELETE an object, all versions remain in the bucket and Amazon S3 inserts a
delete marker.
- The delete marker becomes the current version of the object.
You can, however, GET a noncurrent version of an object by specifying its version ID
You can permanently delete an object by specifying the version you want to delete. Only the
owner of an Amazon S3 bucket can permanently delete a version.
- Versioning is a means of keeping multiple variants of an object in the same bucket. You
can use versioning to preserve, retrieve, and restore every version of every object stored
in your Amazon S3 bucket. With versioning, you can easily recover from both unintended
user actions and application failures.
- Once you version-enable a bucket, it can never return to an unversioned state. You can,
however, suspend versioning on that bucket.
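A minimal boto3 sketch of enabling versioning on a bucket and listing the versions of a key; the bucket and key names are placeholders.
```python
import boto3

s3 = boto3.client("s3")
bucket = "example-unique-bucket-name-12345"   # placeholder

# Versioning is enabled at the bucket level; it can later be suspended but not removed.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Each overwrite of the same key now creates a new version instead of replacing it.
s3.put_object(Bucket=bucket, Key="my-image.jpg", Body=b"v1")
s3.put_object(Bucket=bucket, Key="my-image.jpg", Body=b"v2")

versions = s3.list_object_versions(Bucket=bucket, Prefix="my-image.jpg")
for v in versions.get("Versions", []):
    print(v["VersionId"], v["IsLatest"])
```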
- Server access logging provides detailed records for the requests that are made to a
bucket. Server access logs are useful for many applications.
- For example, access log information can be useful in security and access audits.
- Each access log record provides details about a single access request, such as the
requester, bucket name, request time, request action, response status, and an error
code, if relevant.
- Both the source and target S3 buckets must be owned by the same AWS account, and
the S3 buckets must both be in the same Region.
Multipart upload
AWS recommends that you use multipart uploading in the following ways:
- If you're uploading large objects over a stable high-bandwidth network, use multipart
upload to maximize your available bandwidth by uploading object parts in parallel
- If you're uploading over a spotty network, use multipart upload to increase resiliency
to network errors by avoiding upload restarts
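A minimal boto3 sketch: the high-level transfer API switches to multipart upload automatically above a configurable threshold and uploads parts in parallel; the local file and bucket names are placeholders.
```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart upload for anything larger than 100 MB, with up to 10 parallel parts.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=100 * 1024 * 1024,
    max_concurrency=10,
)

s3.upload_file(
    Filename="backup.tar.gz",                       # placeholder local file
    Bucket="example-unique-bucket-name-12345",      # placeholder bucket
    Key="backups/backup.tar.gz",
    Config=config,
)
```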
- With Amazon S3 object lock, you can store objects using a write-once-read-many (WORM)
model. You can use it to prevent an object from being deleted or overwritten for a fixed
amount of time or indefinitely.
- You can only enable Amazon S3 object lock for new buckets. If you want to turn on
Amazon S3 object lock for an existing bucket, contact AWS Support.
- When you create a bucket with Amazon S3 object lock enabled, Amazon S3 automatically
enables versioning for the bucket.
- Once you create a bucket with Amazon S3 object lock enabled, you can't disable object
lock or suspend versioning for the bucket.
- Data protection refers to protecting data while in-transit (as it travels to and from
Amazon S3) and at rest (while it is stored on disks in Amazon S3 data centers). You have
the following options for protecting data at rest in Amazon S3:
o Server-Side Encryption – Request Amazon S3 to encrypt your object before saving it
on disks in its data centers and then decrypt it when you download the objects.
o Client-Side Encryption – Encrypt data client-side and upload the encrypted data to
Amazon S3. In this case, you manage the encryption process, the encryption keys,
and related tools.
Server Side Encryption
- When you use Server-Side Encryption with Amazon S3-Managed Keys (SSE-S3), each
object is encrypted with a unique key.
- As an additional safeguard, it encrypts the key itself with a master key that it regularly
rotates.
Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key Management
Service (SSE-KMS)
- Server-Side Encryption with Customer Master Keys (CMKs) Stored in AWS Key
Management Service (SSE-KMS) is similar to SSE-S3, but with some additional benefits
and charges for using this service.
- There are separate permissions for the use of a CMK that provides added protection
against unauthorized access of your objects in Amazon S3.
- SSE-KMS also provides you with an audit trail that shows when your CMK was used and by
whom.
- With Server-Side Encryption with Customer-Provided Keys (SSE-C), you manage the
encryption keys and Amazon S3 manages the encryption, as it writes to disks, and
decryption, when you access your objects.
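A minimal boto3 sketch of requesting server-side encryption per object at upload time, showing the SSE-S3 and SSE-KMS variants; the bucket, keys, and KMS key ARN are placeholders.
```python
import boto3

s3 = boto3.client("s3")
bucket = "example-unique-bucket-name-12345"   # placeholder

# SSE-S3: S3 manages the keys (AES-256).
s3.put_object(
    Bucket=bucket,
    Key="reports/q1.csv",
    Body=b"...",
    ServerSideEncryption="AES256",
)

# SSE-KMS: encrypt with a customer master key managed in AWS KMS (placeholder key ARN).
s3.put_object(
    Bucket=bucket,
    Key="reports/q2.csv",
    Body=b"...",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="arn:aws:kms:us-east-1:123456789012:key/00000000-0000-0000-0000-000000000000",
)
```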
Client-Side Encryption
Client-side encryption is the act of encrypting data before sending it to Amazon S3. To enable
client-side encryption, you have the following options:
- Use a customer master key (CMK) stored in AWS Key Management Service (AWS KMS).
- Use a master key you store within your application.
Static Website Hosting
- You can host a static website on Amazon S3. On a static website, individual webpages
include static content.
- To host a static website, you configure an Amazon S3 bucket for website hosting and then
upload your website content to the bucket.
- This bucket must have public read access. It is intentional that everyone in the world will
have read access to this bucket.
- Depending on your Region, Amazon S3 website endpoints follow one of these two
formats:
- http://bucket-name.s3-website.Region.amazonaws.com
- http://bucket-name.s3-website-Region.amazonaws.com
- This above URL will return the default index document that you configured for the
website.
- To request a specific object that is stored at the root level in the bucket, use the following
URL structure
- http://bucket-name.s3-website.Region.amazonaws.com/object-name
- For example
- http://example-bucket.s3-website.us-west-2.amazonaws.com/photo.jpg
- To request an object that is stored in a folder in your bucket, use this URL structure
- http://bucket-name.s3-website.Region.amazonaws.com/folder-name/object-name
- For example
- http://example-bucket.s3-website.us-west-2.amazonaws.com/docs/doc1.html
- Instead of accessing the website by using an Amazon S3 website endpoint, you can use
your own domain, such as example.com, to serve your content. Amazon S3, along with
Amazon Route 53, supports hosting a website at the root domain. For example, if you
have the root domain example.com and you host your website on Amazon S3, your
website visitors can access the site from their browser by typing
either http://www.example.com or http://example.com.
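A minimal sketch of configuring a bucket for static website hosting with boto3 (the bucket and document names are assumptions, and the bucket must already allow public reads):

    import boto3

    s3 = boto3.client('s3')
    s3.put_bucket_website(
        Bucket='example-bucket',
        WebsiteConfiguration={
            'IndexDocument': {'Suffix': 'index.html'},   # default document for the website endpoint
            'ErrorDocument': {'Key': 'error.html'},
        },
    )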
Storage Classes
For performance-sensitive use cases (those that require millisecond access time) and
frequently accessed data, Amazon S3 provides the following storage class
- Standard—The default storage class. If you don't specify the storage class when you
upload an object, Amazon S3 assigns the Standard storage class.
Storage Classes for Infrequently Accessed Objects
⁻ The Standard_IA and Onezone_IA storage classes are designed for long-lived and
infrequently accessed data
⁻ Standard_IA and Onezone_IA objects are available for millisecond access (similar to the
Standard storage class)
⁻ Amazon S3 charges a retrieval fee for these objects, so they are most suitable for
infrequently accessed data.
⁻ The Standard_IA and Onezone_IA storage classes are suitable for objects larger than 128
KB that you plan to store for at least 30 days. If an object is less than 128 KB, Amazon S3
charges you for 128 KB.
⁻ Onezone_IA - Amazon S3 stores the object data in only one Availability Zone, which makes
it less expensive than Standard_IA
For example, you might choose the Standard_IA and Onezone_IA storage classes for backups and for older data that is accessed infrequently but still requires millisecond access.
⁻ The Glacier and Deep Archive storage classes are designed for low-cost data archiving
Glacier
- Long-term data archiving with retrieval times ranging from minutes to hours
- It has minimum storage duration period of 90 days
- If you delete, overwrite, or transition an object to a different storage class before the
90-day minimum, you are charged for the full 90 days.
Lifecycle Management
- To manage your objects so that they are stored cost effectively throughout their lifecycle,
configure their lifecycle.
- A lifecycle configuration is a set of rules that define actions that Amazon S3 applies to a
group of objects.
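A sketch of one such lifecycle rule applied with boto3 (bucket name, prefix, and the transition/expiration days are hypothetical choices, not values from the notes):

    import boto3

    s3 = boto3.client('s3')
    s3.put_bucket_lifecycle_configuration(
        Bucket='example-bucket',
        LifecycleConfiguration={'Rules': [{
            'ID': 'archive-logs',
            'Filter': {'Prefix': 'logs/'},
            'Status': 'Enabled',
            'Transitions': [
                {'Days': 30, 'StorageClass': 'STANDARD_IA'},   # infrequent access after 30 days
                {'Days': 90, 'StorageClass': 'GLACIER'},       # archive after 90 days
            ],
            'Expiration': {'Days': 365},                       # delete after one year
        }]},
    )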
Replication
- You can replicate objects between different AWS Regions (cross-region replication) or within the same AWS Region (same-region replication).
Deleting Objects and Buckets
- You can delete objects individually. Or you can empty a bucket, which deletes all the
objects in the bucket without deleting the bucket.
- You can also delete a bucket and all the objects contained in the bucket.
- If you want to keep using the same bucket, don't delete it; you can empty the bucket and
keep it.
- After you delete a bucket, its name becomes available for reuse, but it might not be available
for you to reuse for various reasons. For example, it might take some time before the
name can be reused, and some other account could create a bucket with that name
before you do.
CloudFront
Introduction to CloudFront distributions and CloudFront Edge locations
CloudFront is a web service that speeds up distribution of your static and dynamic web content, such as
.html, .css, .js, and image files, to your users
⁻ CloudFront delivers your content through a worldwide network of data centers called
edge locations
⁻ When a user requests content that you're serving with CloudFront, the user is routed to
the edge location that provides the lowest latency (time delay), so that content is
delivered with the best possible performance
⁻ If the content is already in the edge location with the lowest latency, CloudFront delivers
it immediately
⁻ If the content is not in that edge location, CloudFront retrieves it from an origin that
you've defined—such as an Amazon S3 bucket, a Media Package channel, or an HTTP
server (for example, a web server)
⁻ You also get increased reliability and availability because copies of your files (also known
as objects) are now held (or cached) in multiple edge locations around the world
How Cloud Front works
1. What are bucket naming guidelines ?
2. What is a bucket and object ?
3. Min and max size of an object ?
4. Maximum size of an S3 bucket ?
5. Maximum number of buckets per account (soft and hard limits) ?
6. What is multipart upload ?
7. What is versioning in S3 ?
8. Different storage classes and their uses ?
9. What is Lifecycle management ?
10. Cross region and same region replication ?
Virtual Private Cloud
Introduction to VPC
• Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a
virtual network that you've defined.
• A virtual private cloud (VPC) is a virtual network dedicated to your AWS account.
• It is logically isolated from other virtual networks in the AWS Cloud.
• You can launch your AWS resources, such as Amazon EC2 instances, into your VPC.
• You can specify an IP address range for the VPC, add subnets, associate security groups, and
configure route tables.
• A VPC spans all the Availability Zones in the region.
• After creating a VPC, you can add one or more subnets in each Availability Zone.
• Each subnet must reside entirely within one Availability Zone and cannot span zones.
When you create a VPC, you must specify a range of IPv4 addresses for the VPC in the form of
a Classless Inter-Domain Routing (CIDR) block; for example, 10.0.0.0/16. This is the primary
CIDR block for your VPC.
Default Vs. Custom VPC
• If your account supports the EC2-VPC platform only, it comes with a default VPC that has
a default subnet in each Availability Zone.
• A default VPC has the benefits of the advanced features provided by EC2-VPC, and is
ready for you to use. If you have a default VPC and don't specify a subnet when you
launch an instance, the instance is launched into your default VPC. You can launch
instances into your default VPC without needing to know anything about Amazon VPC.
• Regardless of which platforms your account supports, you can create your own VPC, and
configure it as you need. This is known as a non-default VPC. Subnets that you create in
your non-default VPC and additional subnets that you create in your default VPC are
called non-default subnets.
Accessing the Internet
- Your default VPC includes an internet gateway, and each default subnet is a public
subnet. Each instance that you launch into a default subnet has a private IPv4 address
and a public IPv4 address. These instances can communicate with the internet through
the internet gateway.
- By default, each instance that you launch into a non-default subnet has a private IPv4
address, but no public IPv4 address, unless you specifically assign one at launch, or you
modify the subnet's public IP address attribute. These instances can communicate with
each other, but can't access the internet.
VPC and Subnet sizing
- When you create a VPC, you must specify an IPv4 CIDR block for the VPC
- The allowed block size is between a /16 netmask (65,536 IP addresses) and /28 netmask
(16 IP addresses)
- When you create a VPC, AWS recommend that you specify a CIDR block (of /16 or smaller)
from the private IPv4 address ranges as specified in RFC 1918
10.0.0.0 - 10.255.255.255 (10/8 prefix)
172.16.0.0 - 172.31.255.255 (172.16/12 prefix)
192.168.0.0 - 192.168.255.255 (192.168/16 prefix)
- The CIDR block of a subnet can be the same as the CIDR block for the VPC, or a subset of
the CIDR block for the VPC (for multiple subnets)
- The allowed block size for a subnet is between a /28 netmask and /16 netmask. If you
create more than one subnet in a VPC, the CIDR blocks of the subnets cannot overlap.
- The first four IP addresses and the last IP address in each subnet CIDR block are not
available for you to use, and cannot be assigned to an instance.
- For example, in a subnet with CIDR block 10.0.0.0/24, the following five IP addresses are
reserved:
10.0.0.0: Network address.
10.0.0.1: Reserved by AWS for the VPC router.
10.0.0.2: Reserved by AWS for the IP address of the DNS server
10.0.0.3: Reserved by AWS for future use.
10.0.0.255: Network broadcast address. We do not support broadcast in a VPC
- You can associate secondary IPv4 CIDR blocks with your VPC
- The allowed block size for secondary CIDR is between a /28 netmask and /16 netmask.
- The CIDR block must not overlap with any existing CIDR block that's associated with the
VPC.
- You cannot increase or decrease the size of an existing CIDR block.
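A small worked example of the sizing rules above, using Python's standard ipaddress module (the specific CIDR ranges are illustrative):

    import ipaddress

    vpc = ipaddress.ip_network('10.0.0.0/16')      # primary CIDR block of the VPC
    subnet = ipaddress.ip_network('10.0.0.0/24')   # one subnet carved out of it

    print(subnet.subnet_of(vpc))                   # True - the subnet lies inside the VPC range
    print(subnet.num_addresses)                    # 256 addresses in a /24
    print(subnet.num_addresses - 5)                # 251 usable, since AWS reserves 5 per subnet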
The instances in the public subnet can send outbound traffic directly to the Internet,
whereas the instances in the private subnet can't. Instead, the instances in the private
subnet can access the Internet by using a network address translation (NAT) gateway that
resides in the public subnet.
VPC Flow Logs
- VPC Flow Logs is a feature that enables you to capture information about the IP traffic
going to and from network interfaces in your VPC.
- Flow log data can be published to Amazon CloudWatch Logs or Amazon S3
- You can create a flow log for a VPC, a subnet, or a network interface
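A hedged sketch of enabling a flow log for a VPC with boto3; the VPC ID and S3 bucket ARN are hypothetical:

    import boto3

    ec2 = boto3.client('ec2')
    ec2.create_flow_logs(
        ResourceIds=['vpc-0123456789abcdef0'],     # hypothetical VPC ID
        ResourceType='VPC',
        TrafficType='ALL',                         # capture accepted and rejected traffic
        LogDestinationType='s3',
        LogDestination='arn:aws:s3:::example-flow-log-bucket',
    )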
Security Groups
- A security group acts as a virtual firewall for your instance to control inbound and
outbound traffic.
- When you launch an instance in a VPC, you can assign up to five security groups to the
instance.
- Security groups act at the instance level, not the subnet level
- For each security group, you add rules that control the inbound traffic to instances, and a
separate set of rules that control the outbound traffic
- You can specify allow rules, but not deny rules
- When you create a security group, it has no inbound rules. Therefore, no inbound traffic
originating from another host to your instance is allowed until you add inbound rules to
the security group
- By default, a security group includes an outbound rule that allows all outbound traffic
- Security groups are stateful
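As an illustration of an allow rule (the security group ID is a placeholder), adding inbound HTTPS with boto3 might look like this; because security groups are stateful, the response traffic is allowed automatically:

    import boto3

    ec2 = boto3.client('ec2')
    ec2.authorize_security_group_ingress(
        GroupId='sg-0123456789abcdef0',            # hypothetical security group ID
        IpPermissions=[{
            'IpProtocol': 'tcp', 'FromPort': 443, 'ToPort': 443,
            'IpRanges': [{'CidrIp': '0.0.0.0/0', 'Description': 'HTTPS from anywhere'}],
        }],
    )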
Network Access Control Lists (NACL)
- A network access control list (ACL) is an optional layer of security for your VPC that acts as
a firewall for controlling traffic in and out of one or more subnets
- Each subnet in your VPC must be associated with a network ACL
- You can associate a network ACL with multiple subnets
- A subnet can be associated with only one network ACL at a time
- A network ACL has separate inbound and outbound rules, and each rule can either allow
or deny traffic
- Network ACLs are stateless
VPC Networking Components
Route Tables
- A route table contains a set of rules, called routes, that are used to determine where
network traffic from your subnet or gateway is directed.
- Main route table—The route table that automatically comes with your VPC. It controls the
routing for all subnets that are not explicitly associated with any other route table.
- Custom route table—A route table that you create for your VPC.
- Each subnet in your VPC must be associated with a route table
- A subnet can only be associated with one route table at a time, but you can associate
multiple subnets with the same route table
Internet Gateway
- An internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between instances in your VPC and the internet
NAT Devices
- You can use a NAT device to enable instances in a private subnet to connect to the
internet (for example, for software updates) or other AWS services, but prevent the
internet from initiating connections with the instances. A NAT device forwards traffic
from the instances in the private subnet to the internet or other AWS services, and then
sends the response back to the instances.
- AWS offers two kinds of NAT devices
- NAT Gateway : You are charged for creating and using a NAT gateway in your
account. NAT gateway hourly usage and data processing rates apply.
- NAT Instance - A NAT instance is launched from a NAT AMI
VPC Peering
- A VPC peering connection is a networking connection between two VPCs that enables
you to route traffic between them privately.
- Instances in either VPC can communicate with each other as if they are within the same
network.
- You can create a VPC peering connection between your own VPCs, with a VPC in another
AWS account, or with a VPC in a different AWS Region.
- AWS uses the existing infrastructure of a VPC to create a VPC peering connection; it is
neither a gateway nor a VPN connection, and does not rely on a separate piece of
physical hardware.
- There is no single point of failure for communication or a bandwidth bottleneck.
AWS Transit Gateway
- AWS Transit Gateway is a network transit hub that you can use to interconnect your VPCs and on-premises networks through a single gateway
Site-to-Site VPN
- By default, instances that you launch into an Amazon VPC can't communicate with your
own (remote) network
- You can optionally connect your VPC to your own corporate data center using an AWS
VPN connection, making the AWS Cloud an extension of your data center.
- A Site-to-Site VPN connection consists of a virtual private gateway attached to your VPC and a customer gateway located in your data center.
- A virtual private gateway is the VPN concentrator on the Amazon side, and a customer gateway is a physical device or software appliance on your side of the Site-to-Site VPN connection.
- It is internet based (the VPN tunnel runs over the public internet)
AWS VPN Cloud Hub
- AWS VPN Cloud Hub uses an Amazon VPC virtual private gateway with multiple customer
gateways, each using unique BGP autonomous system numbers (ASNs)
- Border Gateway Protocol (BGP) is a standardized exterior gateway protocol designed to
exchange routing and reachability information between autonomous systems (AS) on the
Internet.
Introduction to Direct Connect
• AWS Direct Connect is a cloud service solution that makes it easy to establish a dedicated
network connection from your premises to AWS.
• Using AWS Direct Connect, you can establish private connectivity between AWS and your
datacenter, which in many cases can reduce your network costs, increase bandwidth
throughput, and provide a more consistent network experience than Internet-based
connections.
VPN connection options
VPC Endpoints
- A VPC endpoint enables you to privately connect your VPC to supported AWS services
and to VPC endpoint services that are powered by AWS PrivateLink
- This method does not require an internet gateway, NAT device, VPN connection, or
AWS Direct Connect connection
- Instances in your VPC do not need to know the public IP address of the resources that they want to communicate with
- The traffic between your VPC and the service provider does not leave the Amazon network.
- Two types of VPC endpoints
- Gateway Endpoint
- Interface Endpoint
AWS PrivateLink interface endpoint
- AWS PrivateLink enables you to securely connect your VPCs to supported AWS
services, third-party services on AWS Marketplace, and your own services on AWS
- The traffic between your VPC and the target service uses private IP addresses, and it
never leaves the AWS network
1. What is VPC
2. What are the VPC Components ?
3. Difference between public and private Subnet ?
4. What is ‘Auto-assign Public IPV4 address’ in subnet ?
5. Internet Gateway ?
6. How many internet gateways can be attached to a VPC ?
7. Define Route Table?
8. Difference between Main and custom Route Table ?
9. Is it possible to associate multiple subnets under a route table ?
10. Can a subnet can be associated under multiple route tables ?
11. Differences between Security Group and NACL ?
12. What is NAT and types of NAT in VPC ?
13. Difference between NAT instance and NAT Gateway ?
14. VPC Flow Logs ?
15. VPC Peering ?
16. VPC Transit Gateway
17. VPN
18. VPN Cloud hub
19. Direct connect
20. VPC Endpoints
Introduction to database services of AWS
• AWS offers a wide range of database services to fit your application requirements. These
database services are fully managed and can be launched in minutes with just a few
clicks.
Relational
AWS Offerings
• Amazon RDS
• Amazon Redshift
Key-value
AWS Offerings
• Amazon DynamoDB
Document
AWS Offerings
• Amazon DocumentDB (with MongoDB compatibility)
In-memory
In-memory databases are used for applications that require real time access to data. By
storing data directly in memory, these databases provide microsecond latency where
millisecond latency is not enough.
AWS Offerings:
• Amazon ElastiCache for Redis
• Amazon ElastiCache for Memcached
Graph
Graph databases are used for applications that need to enable millions of users to query and
navigate relationships between highly connected, graph datasets with millisecond latency.
AWS Offerings:
• Amazon Neptune
Relational Database Services
Introduction to Amazon RDS
• Easy to set up, operate, and scale a relational database in the cloud.
• Provides cost-efficient and resizable capacity while automating time-consuming
administration tasks such as hardware provisioning, database setup, patching and
backups.
• Frees you to focus on your applications so you can give them the fast performance, high
availability, security and compatibility they need.
• RDS is an Online Transaction Processing (OLTP) type of database (INSERT, UPDATE,
DELETE)
• Primary use case is a transactional database (rather than analytical)
Benefits
Easy to administer
Highly scalable
Available and durable
Fast
Secure
Inexpensive
Concepts of Amazon RDS services
Why do you want a managed relational database service? Because Amazon RDS takes over
many of the difficult or tedious management tasks of a relational database:
• When you buy a server, you get CPU, memory, storage, and IOPS, all bundled together.
With Amazon RDS, these are split apart so that you can scale them independently.
• Manages backups, software patching, automatic failure detection, and recovery.
• As RDS is a managed service, you do not have access to the underlying EC2 instance (no
root access)
• Because it is a managed service, some system procedures and tables that require advanced
privileges are not available.
• Automated backups can be performed when you need them, or you can manually create your own
backup snapshot. You can use these backups to restore a database. The Amazon RDS
restore process works reliably and efficiently.
• High availability with a primary instance and synchronous secondary instance that you can
fail over to when problems occur.
• You can also help protect your databases by putting them in a virtual private cloud.
• Database instances are accessed via endpoints.
Different types of database engines supported in AWS RDS
• MySQL
• Maria DB
• PostgreSQL
• Oracle
• Microsoft SQL Server DB engines
• Amazon Aurora
Database machine types
General purpose
T3, T2, M5, M4
Memory Optimized
R5, R4, X1e, X1
Storage
• Storage for Amazon RDS for MySQL, Maria DB, PostgreSQL, Oracle, and SQL Server is built
on Amazon EBS, a durable, block-level storage service.
• Amazon RDS provides volume types to best meet the needs of your database workloads:
General Purpose (SSD), Provisioned IOPS (SSD).
DB Subnet Group
- Is a collection of subnets in a VPC that you allocate for DB instances launched in the VPC
- A DB subnet group should have subnets in at least two Availability Zones in a region for high
availability
- You must have a minimum of 2 subnets in a subnet group
Automated Backups
- By default, RDS automatically creates a daily snapshot of your DB instance; backup storage
up to the size of your provisioned database is included at no additional charge
- You can disable automated backups by setting the retention period to zero
- The default retention period is 7 days, with a maximum of 35 days
- Automated backups are deleted when you delete your RDS instance
- You cannot share automated backups
- If you need to share one, take a manual copy and share the copy
Manual Snapshots
- These snapshots would be created by the account owner
- Can share the manual snapshots with other accounts directly
- Manual snapshots are not deleted automatically; you must delete them yourself
Multi-AZ Deployment
- Multi-AZ for RDS provides high availability, data durability, and fault tolerance for DB
instances
- You can select the Multi-AZ option during RDS DB instance launch or modify an existing
standalone RDS instance
- AWS creates a secondary database in different availability zone in the same region for
high availability
- It is not possible to insert, update, or select data on the secondary (stand-by) RDS database
- OS patching, System upgrades and DB scaling are done on standby DB first and then on
primary
- Snapshots and automated backups are done on stand-by to avoid suspension on primary
instance for I/O
- DB engine version upgrades happen on both primary and secondary at the same time
which causes an outage
Database Read replicas
• Scale-out performance beyond the capacity constraints of a single database for read-
heavy workloads
• When the required read I/O capacity is reached but still more I/O capacity is required for
heavy/intensive read applications, RDS read replicas can be helpful
• A read replica is a replica of the primary database that can be used only for read actions
- Read replicas can be created in the same region and they must be in the same VPC not
in a peered VPC
- Read replicas can be created in different regions also then RDS creates a secure channel
between the primary DB and read replica
- RDS does not support read replicas for databases managed by the user on EC2 instances or
on-premises
- When a read replica is initiated, RDS takes a snapshot of the source (master) DB and the replica is
created from that snapshot
- The read replica’s storage type or instance class can be different from the source DB
instance
- DB engine type CANNOT be changed though, it is inherited from the source (primary)
DB instance
- If you scale the source DB instance, you should also scale the Read Replicas.
- Multi-AZ for read replica
- It is also possible to create a stand-by read replica DB instances for your read-
replica instances
- This is independent whether the primary DB is stand-alone or multi-AZ
- Primary instance becomes the source of the replication to the read replica
- Using asynchronous replication (time lag exists) data gets replicated to the read
replica
- If in multi-AZ setup, the replication is done by stand-by DB instead
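A brief sketch of creating a read replica from an existing instance with boto3; the instance identifiers and class are placeholders:

    import boto3

    rds = boto3.client('rds')
    rds.create_db_instance_read_replica(
        DBInstanceIdentifier='mydb-replica',        # name of the new replica
        SourceDBInstanceIdentifier='mydb',          # existing primary (source) instance
        DBInstanceClass='db.t3.medium',             # the replica may use a different instance class
    )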
Encrypting Amazon RDS Resources
- You can encrypt your Amazon RDS DB instances and snapshots at rest by enabling the
encryption option for your Amazon RDS DB instances. Data that is encrypted at rest
includes the underlying storage for a DB instances, its automated backups, Read Replicas,
and snapshots.
- You can’t disable encryption on an encrypted DB
- You can not enable encryption for an existing, un-encrypted database instance, but there
is an alternate way
- Create a snapshot of the DB
- Copy the snapshot and choose to encrypt it during the copy process
- Restore the encrypted copy into a New DB
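The same snapshot-copy-restore workaround sketched with boto3 (identifiers are hypothetical, and waiting for each snapshot to become available is omitted for brevity):

    import boto3

    rds = boto3.client('rds')
    # 1. Snapshot the unencrypted instance
    rds.create_db_snapshot(DBInstanceIdentifier='mydb', DBSnapshotIdentifier='mydb-snap')
    # 2. Copy the snapshot and encrypt it during the copy
    rds.copy_db_snapshot(SourceDBSnapshotIdentifier='mydb-snap',
                         TargetDBSnapshotIdentifier='mydb-snap-encrypted',
                         KmsKeyId='alias/aws/rds')
    # 3. Restore the encrypted copy into a new DB instance
    rds.restore_db_instance_from_db_snapshot(DBInstanceIdentifier='mydb-encrypted',
                                             DBSnapshotIdentifier='mydb-snap-encrypted')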
Scaling
- You can scale up the compute and storage capacity of your RDS instances; storage cannot be
scaled back down
- You can scale your storage while the DB instance is running
- You cannot scale your compute capacity while the instance is running; you will face
downtime when scaling compute
- If you are already on the largest DB instance class and still need to scale
- Use partitioning and split your RDS DB over multiple RDS instances
Alternative to Amazon RDS
If your use case isn’t supported on RDS, you can run databases on Amazon EC2.
- You can run any database you like with full control and ultimate flexibility.
- You must manage everything like backups, redundancy, patching and scaling.
- Good option if you require a database not yet supported by RDS, such as IBM DB2 or
SAP HANA.
- Good option if it is not feasible to migrate to AWS-managed database.
Introduction to Dynamo DB
Benefits
⁻ Performance at scale: DynamoDB supports some of the world’s largest scale applications
by providing consistent, single-digit millisecond response times at any scale
⁻ Serverless: there are no servers to provision, patch, or manage and no software to install,
maintain, or operate
Use cases
Customers
⁻ Nike
⁻ Samsung
⁻ Netflix
Tables and Items
Read Consistency
- Dynamo DB supports both Eventual Consistency (default) and Strong Consistency models
- Eventually consistent reads: when you read data from a Dynamo DB table, the response might
not reflect the results of a recently completed write operation
- Eventually consistent reads give the best read throughput; consistency across all copies is
usually reached within 1 second
- Strongly consistent reads: a read returns a result that reflects all writes that received a
successful response prior to the read
- Users/Applications reading from Dynamo DB tables can specify in their requests if they
want strong consistency; otherwise reads are eventually consistent (default)
- So the application dictates what is required: Strong, Eventual, or both
Read Capacity Units
- One read capacity unit represents one strongly consistent read per second, or two
eventually consistent reads per second for an item up to 4KB in size
- If you need to read an item that is larger than 4 KB, Dynamo DB will need to consume
additional read capacity units
- The total number of read capacity units required depends on the item size, and whether
you want an eventually consistent or strongly consistent read.
Write Capacity Units
- One write capacity unit represents one write per second for an item up to 1 KB in size
- If you need to write an item that is larger than 1 KB, Dynamo DB will need to consume
additional write capacity units
- The total number of write capacity units required depends on the item size.
Scalability
- It provides push-button scaling on AWS: you can increase or decrease the read/write
throughput and AWS will go ahead and scale it for you (up or down) without downtime
or performance degradation
- You can scale the provisioned capacity of your Dynamo DB table any time you want
- There is no limit to the number of items(data) you can store in a Dynamo DB table
- There is no limit on how much data you can store per Dynamo DB table
Dynamo DB Accelerator
- DynamoDB Accelerator (DAX) is a fully managed, in-memory cache for DynamoDB that delivers fast read performance for your tables at scale
Elasticache
- Amazon ElastiCache allows you to seamlessly set up, run, and scale popular open-source
compatible in-memory data stores in the cloud
- Build data-intensive apps or boost the performance of your existing databases by
retrieving data from high throughput and low latency in-memory data stores
- Elasticache can be used if data stores have areas of data that are frequently accessed but
seldom updated
- Additionally, querying a database will always be slower and more expensive than
locating a key in a key-value pair cache.
Uses cases
Session Stores
Gaming
Real-Time Analytics
Queuing
Features
⁻ Extreme performance by allowing for the retrieval of information from a fast, managed, in-
memory system (instead of reading from the DB itself)
⁻ Improves response times for user transactions and queries
⁻ It offloads the read workload from the main DB instances (less I/O load on the DB)
⁻ It does this by storing the results of frequently accessed pieces of data (or
computationally intensive calculations) in-memory
⁻ Fully managed
⁻ Scalable
⁻ Deployed using EC2 instances (AWS managed)
⁻ Elasticache nodes (deployed on EC2 instances) cannot be accessed from the internet, nor can they
be accessed by EC2 instances in other VPCs
⁻ Can be on-demand or Reserved Instances too (NOT Spot instances)
⁻ Access to Elasticache nodes is controlled by VPC security groups and Subnet groups
⁻ You need to configure VPC Subnet groups for Elasticache (VPC that hosts EC2
instances and the Elasticache cluster)
⁻ Changing the subnet group of an existing Elasticache cluster is not currently
supported
⁻ If an Elasticache node fails it is automatically replaced by AWS Elasticache (fully managed
service)
⁻ Your application connects using endpoints
⁻ A cluster can have one or more nodes included within
⁻ A cluster can be as small as one node
Supports two caching engines
- A cluster is a collection of one or more nodes using the same caching engine
(Memcached or Redis)
- Redis and Memcached are the two most popular in-memory key-value data stores.
Memcached is designed for simplicity while Redis offers a rich set of features that make it
effective for a wide range of use cases. Understand the differences between the two
engines to decide which solution better meets your needs
Amazon Elasticache for Memcached
- Is not persistent
- Can not be used as a data store
- If the node fails, the cached data (in the node) is lost
- Ideal front-end for data stores (RDS, DynamoDB…etc)
- Does not support Multi-AZ failover, replication, nor does it support Snapshots for
backup/restore
- Node failure means data loss
- You can, however, place your Memcached nodes in different AZs to minimize the impact
of an AZ failure and to contain the data loss in such an incident
Use cases
- Cache contents of a DB
- Cache data from dynamically generated webpages
Amazon Elasticache for Redis
Use cases
Web
Mobile Apps
Health care Apps
Financial Apps
Gaming
Internet of Things
Redshift
Introduction to Redshift
- Redshift is an AWS fully managed, petabyte-scale data warehouse service in the cloud
- A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing
- It usually contains historical data derived from transaction data, but it can include
data from other sources
- To perform analytics you need a data warehouse not a regular database
- OLAP (On-line Analytical Processing) is characterized by relatively low volume of
transactions
- Queries are often very complex and involve aggregations (group the data)
- RDS (MySQL.. Etc.) is an OLTP database, where there is detailed and current data,
and a schema used to store transactional data
- Amazon Redshift gives you fast querying capabilities over structured data using familiar
SQL-based clients and business intelligence (BI) tools using standard ODBC and JDBC
connections
Data Security (Exam View)
At Rest
- Supports encryption of data “at rest” using hardware-accelerated AES-256 (Advanced
Encryption Standard)
- By default, AWS Redshift takes care of encryption key management
- You can choose to manage your own keys through HSM (Hardware Security Modules), or
AWS KMS (Key Management Service)
In-Transit
- Supports SSL Encryption, in-transit, between client applications and Redshift data
warehouse cluster
- You can’t have direct access to your AWS Redshift cluster nodes, however, you can
through the applications themselves
Performance (Exam View)
- Advanced Compression
- Data is stored sequentially in columns which allows for much better compression
- Less storage space
- Redshift automatically selects a compression scheme
- Massive Parallel Processing(MPP)
- Data and query loads are distributed across all nodes
Redshift Cluster
- Amazon Redshift automatically patches and backs up (Snapshots) your data warehouse,
storing the backups for a user-defined retention period in AWS S3
- It keeps the backup by default for one day (24 hours), but you can configure the retention
from 0 to 35 days
- Automatic backups are stopped if you choose a retention period of 0
- You have access to these automated snapshots during the retention period
- If you delete the cluster
- You can choose to have a final snapshot to use later
- Manual backups are not deleted automatically, if you do not manually delete them,
you will be charged standard S3 storage rates
- AWS Redshift currently supports only one AZ (no Multi-AZ option)
- You can restore from your backup to a new Redshift cluster in the same or a different AZ
- This is helpful in case the AZ hosting your cluster fails
Availability and Durability
- Redshift automatically replicates all your data within your data warehouse cluster
- Redshift always keeps three copies of your data
- The original one
- A replica on compute nodes (within the cluster)
- A backup copy on S3
- Cross Region Replication
- Redshift can asynchronously replicate your snapshots to S3 in another region for DR
- Amazon Redshift automatically detects and replaces a failed node in your data
warehouse cluster
- The data warehouse cluster will be unavailable for the queries and updates until a
replacement node is provided and added to the DB
- Amazon Redshift makes your replacement node available immediately and loads
the most frequently accessed data from S3 first to allow you to resume querying
your data as quickly as possible
1. What is a Database
2. What is RDBMS
3. Define RDS
4. Different types of Database engines supported in RDS
5. Define Subnet Group
6. Difference between Automated backups and Manual snapshots in RDS
7. What is Multi-AZ deployment
8. Define Database Read-replicas
9. Define Dynamo DB & how it is different from RDS
10. What are Data-items and Attributes in a Dynamo-DB table
11. What are the attributes which must be defined when creating a Dynamo-DB table
12. What is the max size of an Item
13. What if the item size exceeds 400 KB
14. What is the maximum size of an Dynamo DB table and how many data-items can be
stored in a table
15. What is Eventual consistency and Strong consistency reads
16. Define Read capacity units and Write capacity units
17. What is Dynamo DB Accelerator
18. What is Elastic Cache
19. What are the Elastic Cache engines supported in AWS
20. How Memcached is different from Redis
21. What Redshift is used for
22. Redshift supports OLAP or OLTP
23. Does Redshift support a Multi-AZ option
Route53
How DNS Works
- Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web
service
- It is designed to give developers and businesses an extremely reliable and cost effective
way to route end users to Internet applications by translating names like
www.example.com into the numeric IP addresses like 192.0.2.1 that computers use to
connect to each other
- Amazon Route 53 is fully compliant with IPv6 as well
- Amazon Route 53 effectively connects user requests to infrastructure running in AWS as
well as infrastructure outside of AWS
- You can use Route 53 to perform three main functions in any combination: domain
registration, DNS routing, and health checking
1. Register domain names: Your website needs a name, such as example.com. Route 53 lets
you register a name for your website or web application, known as a domain name
2. Route internet traffic to the resources for your domain: When a user opens a web
browser and enters your domain name (example.com) or subdomain name
(acme.example.com) in the address bar, Route 53 helps connect the browser with your
website or web application
3. Check the health of your resources: Route 53 sends automated requests over the
internet to a resource, such as a web server, to verify that it's reachable, available, and
functional. You also can choose to receive notifications when a resource becomes
unavailable and choose to route internet traffic away from unhealthy resources
Benefits
- Amazon Route 53 health checks monitor the health of your resources such as web servers
and email servers. You can optionally configure Amazon Cloudwatch alarms for your health
checks, so that you receive notification when a resource becomes unavailable
Hosted Zones
A hosted zone is a container for records, and records contain information about how you
want to route traffic for a specific domain, such as example.com, and its subdomains
(acme.example.com, zenith.example.com).
- Public hosted zones contain records that specify how you want to route traffic on the
internet
- You get a public hosted zone in one of two ways:
- When you register a domain with Route 53, AWS create a hosted zone for you
automatically.
- When you transfer DNS service for an existing domain to Route 53, you start by
creating a hosted zone for the domain.
- Private hosted zones contain records that specify how you want to route traffic in an
Amazon VPC
- You create a private hosted zone, such as example.com, and specify the VPCs that
you want to associate with the hosted zone.
- You create records in the hosted zone that determine how Route 53 responds to DNS
queries for your domain and subdomains within and among your VPCs. For example,
suppose you have a database server that runs on an EC2 instance in one of the VPCs
that you associated with your private hosted zone
Amazon Route 53 Health Checks
Amazon Route 53 health checks monitor the health and performance of your web
applications, web servers, and other resources. Each health check that you create can
monitor one of the following:
Route 53 Resolver
- A Route53 DNS Resolver is provided by default when you create a VPC in AWS
- The default function is to resolve DNS queries within the VPC
- It answers DNS queries for the VPC domain names (for ELBs, EC2instances… etc. within
the VPC)
- For all other Domain names (not within the VPC, such as Public Domain names on the
internet), the Route53 Resolver will do recursive lookups against public DNS resolvers
- Route53 Resolvers are Region specific resources.
Routing Policies
- Simple routing policy – Use for a single resource that performs a given function for your
domain. If you choose the simple routing policy in the Route 53 console, you can specify
multiple values in the same record, such as multiple IP addresses. If you specify multiple
values in a record, Route 53 returns all values to the recursive resolver in random order,
and the resolver returns the values to the client (such as a web browser) that submitted
the DNS query. The client then chooses a value and resubmits the query.
- Failover routing policy – Use when you want to configure active-passive failover. Failover
routing lets you route traffic to a resource when the resource is healthy or to a different
resource when the first resource is unhealthy.
- Geo-location routing policy – Geo-location routing lets you choose the resources that
serve your traffic based on the geographic location of your users, meaning the location
that DNS queries originate from. For example, you might want all queries from Europe to
be routed to an ELB load balancer in the Frankfurt region.
- Latency routing policy – Use when you have resources in multiple AWS Regions and you
want to route traffic to the region that provides the best latency. If your application is
hosted in multiple AWS Regions, you can improve performance for your users by serving
their requests from the AWS Region that provides the lowest latency.
- Multi-value answer routing policy – Use when you want Route 53 to respond to DNS
queries with up to eight healthy records selected at random. Multi-value answer routing
lets you configure Amazon Route 53 to return multiple values, such as IP addresses for
your web servers, in response to DNS queries. Multi-value answer routing also lets you
check the health of each resource, so Route 53 returns only values for healthy resources.
- Weighted routing policy – Weighted routing lets you associate multiple resources with a
single domain name (example.com) or subdomain name (acme.example.com) and choose
how much traffic is routed to each resource
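A hedged sketch of creating one weighted record with boto3 (the hosted zone ID, record name, and weight are placeholders); a second record with the same name but a different SetIdentifier and weight would complete the split:

    import boto3

    route53 = boto3.client('route53')
    route53.change_resource_record_sets(
        HostedZoneId='Z0123456789EXAMPLE',          # hypothetical hosted zone ID
        ChangeBatch={'Changes': [{
            'Action': 'UPSERT',
            'ResourceRecordSet': {
                'Name': 'www.example.com', 'Type': 'A',
                'SetIdentifier': 'primary',         # distinguishes records that share the same name
                'Weight': 70,                       # roughly 70% of queries get this answer
                'TTL': 60,
                'ResourceRecords': [{'Value': '192.0.2.1'}],
            },
        }]},
    )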
DNS query Types
Recursive Query:
In a recursive query, a DNS client provides a hostname, and the DNS Resolver “must”
provide an answer—it responds with either a relevant resource record, or an error message
if it can't be found.
Iterative Query:
In an iterative query, a DNS client provides a hostname, and the DNS Resolver returns the
best answer it can. If the DNS resolver has the relevant DNS records in its cache, it returns
them. If not, it refers the DNS client to the Root Server, or another Authoritative Name
Server which is nearest to the required DNS zone.
Non-Recursive Query:
A non-recursive query is a query in which the DNS Resolver already knows the answer. It
either immediately returns a DNS record because it already stores it in local cache, or
queries a DNS Name Server which is authoritative for the record, meaning it definitely holds
the correct IP for that hostname.
Types of DNS Servers
The following are the most common DNS server types that are used to resolve hostnames
into IP addresses.
DNS Resolver:
A DNS resolver (recursive resolver), is designed to receive DNS queries, which include a
human-readable hostname such as “www.example.com”, and is responsible for tracking the
IP address for that hostname.
Domain registration:
When you want to get a new domain name, such as the example.com part of the URL
http://example.com, you can register it with Amazon Route 53.
You can also transfer the registration for existing domains from other registrars to Route 53
or transfer the registration for domains that you register with Route 53 to another registrar.
Domain Hosting:
Domain hosting refers to businesses that specialize in hosting domain names for individuals
and companies. Domain names are used in URLs to identify particular Web pages.
Domain transfer:
A domain transfer refers to the process of changing the designated registrar of a
domain name. ... Domain names may be transferred only if they have been registered with
the previous registrar for 60 days or more.
Global Accelerator
Global Accelerator
- AWS Global Accelerator is a service that improves the availability and performance of
your applications with local or global users
- AWS Global Accelerator uses the AWS global network to optimize the path from your
users to your applications, improving the performance of your traffic by as much as 60%
- AWS Global Accelerator continually monitors the health of your application endpoints
and redirects traffic to healthy endpoints in less than 30 seconds
Benefits
It can take many networks to reach the application. Paths to and from the application may
differ. Each hop impacts performance and can introduce risks
Adding AWS Global Accelerator removes these inefficiencies. It leverages the Global AWS
Network, resulting in improved performance.
1. Define Route53
2. What are the 3 main functions which Route53 perform
3. What is a Hosted Zone and types of Hosted zones
4. What are the different types of Routing Policies
Identity and Access Management
Introduction to IAM
⁻ AWS Identity and Access Management (IAM) is a web service that helps you securely
control access to AWS resources
⁻ Used to control who is authenticated (signed in) and authorized (has permissions) to use
resources
⁻ When you first create an AWS account, you begin with a single sign-in identity that has
complete access to all AWS services and resources in the account
⁻ This identity is called the AWS account root user and is accessed by signing in with
the email address and password that you used to create the account
IAM Features
⁻ Granular permissions
⁻ You can grant different permissions to different people for different resources
⁻ For example, you might allow some users complete access to Amazon Elastic
Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon
DynamoDB, Amazon Redshift, and other AWS services
⁻ For other users, you can allow read-only access to just some S3 buckets, or
permission to administer just some EC2 instances, or to access your billing
information but nothing else
⁻ Secure access to AWS resources for applications that run on Amazon EC2
⁻ You can use IAM features to securely provide credentials for applications that run on
EC2 instances
⁻ These credentials provide permissions for your application to access other AWS
resources. Examples include S3 buckets and DynamoDB tables
⁻ Multi-factor authentication (MFA)
⁻ You can add two-factor authentication to your account and to individual users for
extra security. With MFA you or your users must provide not only a password or
access key to work with your account, but also a code from a specially configured
device
⁻ Identity federation
⁻ You can allow users who already have passwords elsewhere—for example, in your
corporate network or with an internet identity provider—to get temporary access to
your AWS account
⁻ Eventually Consistent
⁻ IAM, like many other AWS services, is eventually consistent
⁻ IAM achieves high availability by replicating data across multiple servers within
Amazon's data centers around the world
⁻ If a request to change some data is successful, the change is committed and safely
stored. However, the change must be replicated across IAM, which can take some
time
⁻ Such changes include creating or updating users, groups, roles, or policies
⁻ Free to use
Users
For greater security and organization, you can give access to your AWS account to specific
users—identities that you create with custom permissions
Role
- An IAM role is very similar to a user, in that it is an identity with permission policies that
determine what the identity can and cannot do in AWS
- However, a role does not have any credentials (password or access keys) associated with it
- Instead, you attach the role to a trusted entity. For example, if you're creating an application
that runs on an Amazon Elastic Compute Cloud (Amazon EC2) instance and that application makes
requests to other AWS services like RDS or S3, you can attach a role to the instance so the
application obtains temporary credentials from the role
Policies
- You manage access in AWS by creating policies and attaching them to IAM identities
(users, groups of users, or roles) or AWS resources
- A policy is an object in AWS that, when associated with an identity or resource, defines
their permissions
- AWS evaluates these policies when an IAM principal (user or role) makes a request
- Permissions in the policies determine whether the request is allowed or denied
- Most policies are stored in AWS as JSON documents
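An illustrative sketch of such a JSON policy, written as a Python dictionary and created with boto3 (the policy name, bucket ARN, and actions are hypothetical):

    import json
    import boto3

    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",                       # requests are denied unless a policy allows them
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": ["arn:aws:s3:::example-bucket", "arn:aws:s3:::example-bucket/*"],
        }],
    }
    iam = boto3.client('iam')
    iam.create_policy(PolicyName='ExampleS3ReadOnly', PolicyDocument=json.dumps(policy))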
Access Keys
⁻ Access keys are long-term credentials for an IAM user or the AWS account root user. You
can use access keys to sign programmatic requests to the AWS CLI or AWS API (directly or
using the AWS SDK).
⁻ Access keys consist of two parts
An access key ID (for example, AKIAIOSFODNN7EXAMPLE)
A Secret access key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY).
Like a user name and password, you must use both the access key ID and secret access key
together to authenticate your requests.
AWS Organizations
Features
AWS Well-Architected Framework
• Creating a software system is a lot like constructing a building. If the foundation is not
solid, structural problems can undermine the integrity and function of the building.
• When architecting technology solutions, if you neglect the five pillars of operational
excellence, security, reliability, performance efficiency, and cost optimization it can
become challenging to build a system that delivers on your expectations and
requirements.
• Incorporating these pillars into your architecture will help you produce stable and
efficient systems.
1. Operational Excellence
The Operational Excellence pillar includes the ability to run and monitor systems to deliver
business value and to continually improve supporting processes and procedures.
There are six design principles for operational excellence in the cloud
2. Security
The Security pillar includes the ability to protect information, systems, and assets while
delivering business value through risk assessments and mitigation strategies.
3. Reliability
The Reliability pillar includes the ability of a system to recover from infrastructure or service
disruptions, dynamically acquire computing resources to meet demand, and mitigate
disruptions such as misconfigurations or transient network issues.
4. Performance Efficiency
The Performance Efficiency pillar includes the ability to use computing resources efficiently
to meet system requirements, and to maintain that efficiency as demand changes and
technologies evolve.
There are five design principles for performance efficiency in the cloud:
5. Cost Optimization
The Cost Optimization pillar includes the ability to run systems to deliver business value at
the lowest price point.
There are five design principles for cost optimization in the cloud:
Trusted Advisor
⁻ AWS Trusted Advisor acts like your customized cloud expert, and it helps you provision
your resources by following best practices.
⁻ Trusted Advisor inspects your AWS environment and finds opportunities to save money,
improve system performance and reliability, or help close security gaps.
How it works
Cost Optimization checks
Low Utilization Amazon EC2 Instances : Checks the Amazon Elastic Compute Cloud
(Amazon EC2) instances that were running at any time during the last 14 days and
alerts you if the daily CPU utilization was 10% or less and network I/O was 5 MB or less
on 4 or more days
Underutilized Amazon EBS Volumes: Checks Amazon Elastic Block Store (Amazon EBS)
volume configurations and warns when volumes appear to be underused
Unassociated Elastic IP Addresses: Checks for Elastic IP addresses (EIPs) that are not
associated with a running Amazon Elastic Compute Cloud (Amazon EC2) instance
Amazon RDS Idle DB Instances: Checks the configuration of your Amazon Relational
Database Service (Amazon RDS) for any DB instances that appear to be idle. If a DB
instance has not had a connection for a prolonged period of time, you can delete the
instance to reduce costs
Performance Checks
High Utilization Amazon EC2 Instances: Checks the Amazon Elastic Compute Cloud
(Amazon EC2) instances that were running at any time during the last 14 days and alerts
you if the daily CPU utilization was more than 90% on 4 or more days
Large Number of Rules in an EC2 Security Group: Checks each Amazon Elastic Compute
Cloud (EC2) security group for an excessive number of rules. If a security group has a
large number of rules, performance can be degraded
Over utilized Amazon EBS Magnetic Volumes: Checks for Amazon Elastic Block Store (EBS)
Magnetic volumes that are potentially over utilized and might benefit from a more
efficient configuration
Security Checks
Security Groups - Specific Ports Unrestricted: Checks security groups for rules that allow
unrestricted access (0.0.0.0/0) to specific ports
Amazon EBS Public Snapshots: Checks the permission settings for your Amazon Elastic
Block Store (Amazon EBS) volume snapshots and alerts you if any snapshots are marked
as public
AWS CloudTrail Logging: Checks for your use of AWS CloudTrail. CloudTrail provides
increased visibility into activity in your AWS account by recording information about AWS
API calls made on the account
Fault Tolerance Checks
Amazon EBS Snapshots: Checks the age of the snapshots for your Amazon Elastic Block
Store (Amazon EBS) volumes (available or in-use)
Amazon EC2 Availability Zone Balance: Checks the distribution of Amazon Elastic Compute
Cloud (Amazon EC2) instances across Availability Zones in a region
Amazon RDS Backups: Checks for automated backups of Amazon RDS DB instances
Amazon RDS Multi-AZ: Checks for DB instances that are deployed in a single Availability
Zone
Amazon S3 Bucket Logging: Checks the logging configuration of Amazon Simple Storage
Service (Amazon S3) buckets
Service Limit Checks
EBS Active Snapshots: Checks for usage that is more than 80% of the EBS Active Snapshots
Limit
EC2 Elastic IP Addresses: Checks for usage that is more than 80% of the EC2 Elastic IP
Addresses Limit
RDS DB Instances: Checks for usage that is more than 80% of the RDS DB Instances Limit
VPC: Checks for usage that is more than 80% of the VPC Limit
VPC Internet Gateways: Checks for usage that is more than 80% of the VPC Internet
Gateways Limit
Introduction to Cloud Trail
- AWS CloudTrail is an AWS service that helps you enable governance, compliance, and
operational and risk auditing of your AWS account
- Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail
- Events include actions taken in the AWS Management Console, AWS Command Line
Interface, and AWS SDKs and APIs
- CloudTrail is enabled on your AWS account when you create it
- When activity occurs in your AWS account, that activity is recorded in a CloudTrail
event
- You can easily view recent events in the CloudTrail console by going to Event history
- Visibility into your AWS account activity is a key aspect of security and operational best
practices
- You can use CloudTrail to view, search, download, archive, analyze, and respond to
account activity across your AWS infrastructure
- You can identify
- who or what took which action
- what resources were acted upon
- when the event occurred
- other details to help you analyze and respond to activity in your AWS account
- Can create 5 Trails per region (cannot be increased)
How Cloud Trail Works
CloudTrail Events
Management events provide information about management operations that are performed
on resources in your AWS account. These are also known as control plane operations
- Example
- Registering devices (for example, Amazon EC2 CreateDefaultVpc API operations)
- Configuring rules for routing data (for example, Amazon EC2 CreateSubnet API
operations)
Management events can also include non-API events that occur in your account. For
example, when a user signs in to your account, CloudTrail logs the ConsoleLogin event
Data Events
- Data events provide information about the resource operations performed on or in a resource
(also known as data plane operations), for example Amazon S3 object-level API activity or
AWS Lambda function invocations
Insights Events
- CloudTrail Insights events capture unusual activity in your AWS account. If you have
Insights events enabled, and CloudTrail detects unusual activity, Insights events are
logged to a different folder or prefix in the destination S3 bucket for your trail
- CloudTrail event history provides a viewable, searchable, and downloadable record of the
past 90 days of CloudTrail events. You can use this history to gain visibility into actions
taken in your AWS account in the AWS Management Console, AWS SDKs, command line
tools, and other AWS services
Organization Trails
Encryption
- By default, AWS CloudTrail encrypts all log files delivered to your specified Amazon S3
bucket using Amazon S3 server-side encryption (SSE)
- Optionally, can add a layer of security to your CloudTrail log files by encrypting the log
files with your AWS Key Management Service (AWS KMS) key
Validating CloudTrail Log File Integrity
- To determine whether a log file was modified, deleted, or unchanged after CloudTrail
delivered it, you can use CloudTrail log file integrity validation
- This feature is built using industry standard algorithms: SHA-256(Secure Hash
Algorithm) for hashing and SHA-256 with RSA for digital signing
- When you enable log file integrity validation, CloudTrail creates a hash for every log file
that it delivers
- Every hour, CloudTrail also creates and delivers a file that references the log files for
the last hour and contains a hash of each
- This file is called a digest file
- CloudTrail signs each digest file using the private key of a public and private key pair
- After delivery, you can use the public key to validate the digest file
- CloudTrail uses different key pairs for each AWS region
- The digest files are delivered to the same Amazon S3 bucket associated with your trail as
your CloudTrail log files
- If your log files are delivered from all regions or from multiple accounts into a single
Amazon S3 bucket, CloudTrail will deliver the digest files from those regions and accounts
into the same bucket
- The digest files are put into a folder separate from the log files
Cloudwatch
Introduction to Cloud Watch
- Amazon Cloudwatch monitors your Amazon Web Services (AWS) resources and the
applications you run on AWS in real time
- The Cloudwatch home page automatically displays metrics about every AWS service you
use
- You can additionally create custom dashboards to display metrics about your custom
applications by installing the CloudWatch agent on the instance/server
- You can create alarms which watch metrics and send notifications or automatically make
changes to the resources you are monitoring when a threshold is breached
- For example, you can monitor the CPU usage and disk reads and writes of your
Amazon EC2 instances and then use this data to determine whether you should
launch additional instances to handle increased load. You can also use this data to
stop under-used instances to save money
- With Cloudwatch, you gain system-wide visibility into resource utilization, application
performance, and operational health
Metric
Namespace
Alarms
⁻ You can create a CloudWatch alarm that watches a single CloudWatch metric
⁻ The alarm performs one or more actions based on the value of the metric or expression
relative to a threshold over a number of time periods.
⁻ The action can be an Amazon EC2 action, an Amazon EC2 Auto Scaling action, or a
notification sent to an Amazon SNS topic.
⁻ You can also add alarms to CloudWatch dashboards and monitor them visually.
⁻ When an alarm is on a dashboard, it turns red when it is in the ALARM state, making it
easier for you to monitor its status proactively.
⁻ Alarms invoke actions for sustained state changes only.
⁻ CloudWatch alarms do not invoke actions simply because they are in a particular state,
the state must have changed and been maintained for a specified number of periods.
⁻ After an alarm invokes an action due to a change in state, its subsequent behavior
depends on the type of action that you have associated with the alarm.
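A short sketch of such an alarm defined with boto3 (the instance ID, threshold, and SNS topic ARN are placeholders): it evaluates 5-minute CPU averages and notifies a topic after two consecutive breaches.

    import boto3

    cloudwatch = boto3.client('cloudwatch')
    cloudwatch.put_metric_alarm(
        AlarmName='high-cpu',
        Namespace='AWS/EC2', MetricName='CPUUtilization',
        Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0123456789abcdef0'}],
        Statistic='Average', Period=300,            # evaluate 5-minute averages
        EvaluationPeriods=2, Threshold=80.0,        # two consecutive periods above 80%
        ComparisonOperator='GreaterThanThreshold',
        AlarmActions=['arn:aws:sns:us-east-1:111122223333:ops-alerts'],  # hypothetical SNS topic
    )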
Simple Notification Service (SNS)
- Amazon Simple Notification Service (SNS) is a highly available, durable, secure, fully
managed pub/sub messaging service that enables you to decouple microservices,
distributed systems, and serverless applications
- Amazon SNS provides topics for high-throughput, push-based, many-to-many messaging.
- Using Amazon SNS topics, your publisher systems can fan out messages to a large
number of subscriber endpoints for parallel processing, including Amazon
SQS queues, AWS Lambda functions, and HTTP/S web hooks
- Additionally, SNS can be used to fan out notifications to end users using mobile push,
SMS, and email.
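A small boto3 sketch of this fan-out pattern, using a hypothetical topic name and subscriber endpoints (an SQS subscriber additionally needs a queue policy that allows the topic to send to it, which is omitted here):
    # Sketch: create a topic, subscribe an SQS queue and an email address, publish once to all.
    import boto3

    sns = boto3.client("sns")

    topic_arn = sns.create_topic(Name="order-events")["TopicArn"]     # placeholder topic

    # Every subscriber receives each message published to the topic (fan-out).
    sns.subscribe(TopicArn=topic_arn, Protocol="sqs",
                  Endpoint="arn:aws:sqs:us-east-1:111122223333:order-queue")   # placeholder queue ARN
    sns.subscribe(TopicArn=topic_arn, Protocol="email",
                  Endpoint="ops@example.com")                                  # placeholder address

    sns.publish(TopicArn=topic_arn, Subject="OrderCreated",
                Message='{"orderId": 42}')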
SQS
⁻ Amazon Simple Queue Service (Amazon SQS) offers a secure, durable, and available hosted queue that lets you integrate and decouple distributed software systems and components
⁻ Fully managed queuing service
⁻ Eliminates the complexity and overhead associated with managing and operating message-oriented middleware
⁻ Can send, store, and receive messages between software components at any volume,
without losing messages or requiring other services to be available
⁻ Get started with SQS in minutes
Lifecycle of an Amazon SQS message in Distributed Queues
- There are three main parts in a distributed messaging system: the components of your
distributed system, your queue (distributed on Amazon SQS servers), and the messages in
the queue
- Your system has several producers (components that send messages to the queue)
and consumers (components that receive messages from the queue). The queue (which
holds messages A through E) redundantly stores the messages across multiple Amazon
SQS servers
Step 1: A producer (component 1) sends message A to a queue, and the message is distributed across the Amazon SQS servers redundantly
Step 2: When a consumer (component 2) is ready to process messages, it consumes messages from the queue, and message A is returned
Step 3: The consumer (component 2) deletes message A from the queue to prevent the message from being received and processed again when the visibility timeout expires
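The three lifecycle steps map directly onto the basic SQS API calls; a minimal boto3 sketch (the queue name is a placeholder):
    # Sketch of the send -> receive -> delete lifecycle described above.
    import boto3

    sqs = boto3.client("sqs")
    queue_url = sqs.create_queue(QueueName="demo-queue")["QueueUrl"]   # placeholder queue

    # Step 1: a producer sends message A to the queue.
    sqs.send_message(QueueUrl=queue_url, MessageBody="message A")

    # Step 2: a consumer receives the message (long polling for up to 10 seconds).
    messages = sqs.receive_message(QueueUrl=queue_url,
                                   MaxNumberOfMessages=1,
                                   WaitTimeSeconds=10).get("Messages", [])

    # Step 3: the consumer deletes the message so it is not delivered again
    # after the visibility timeout expires.
    for msg in messages:
        print("processing:", msg["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])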
SQS Limits: Standard queues vs FIFO queues
Benefits: Standard Queue vs FIFO Queue
FIFO Queue
⁻ High Throughput - By default, FIFO queues support up to 3,000 messages per second
with batching. To request a limit increase, file a support request.
⁻ Exactly-Once Processing – A message is delivered once and remains available until a
consumer processes and deletes it. Duplicates aren't introduced into the queue.
⁻ First-In-First-Out Delivery – The order in which messages are sent and received is strictly
preserved.
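A short boto3 sketch of creating a FIFO queue and sending ordered messages (the queue name and message group ID are illustrative; FIFO queue names must end in .fifo):
    # Sketch: FIFO queue with content-based deduplication and strict ordering within a group.
    import boto3

    sqs = boto3.client("sqs")

    queue_url = sqs.create_queue(
        QueueName="orders.fifo",                          # FIFO names must end in .fifo
        Attributes={"FifoQueue": "true",
                    "ContentBasedDeduplication": "true"},
    )["QueueUrl"]

    # Messages sharing a MessageGroupId are delivered in order and processed exactly once.
    for i in range(3):
        sqs.send_message(QueueUrl=queue_url,
                         MessageBody=f"order event {i}",
                         MessageGroupId="customer-123")   # placeholder group id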
SQS Retention Period
- SQS messages can remain in the queue for up to 14 days (the SQS retention period)
- The retention period can be configured from 1 minute to 14 days (default is 4 days)
- Once the maximum retention period of a message is reached, it is deleted automatically from the queue
- Messages can be sent to the queue and read from the queue simultaneously
- SQS can be used with Redshift, DynamoDB, EC2, ECS, RDS, S3, and Lambda to build distributed/decoupled applications
- You can have multiple SQS queues with different priorities in case you want one SQS
queue messages to be handled with higher priority over other SQS queue messages
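A sketch of adjusting the retention period on an existing queue with boto3 (the queue URL is a placeholder; the attribute is expressed in seconds, and 14 days is 1209600 seconds):
    # Sketch: raise the retention period from the 4-day default to the 14-day maximum.
    import boto3

    sqs = boto3.client("sqs")

    sqs.set_queue_attributes(
        QueueUrl="https://sqs.us-east-1.amazonaws.com/111122223333/demo-queue",  # placeholder
        Attributes={"MessageRetentionPeriod": "1209600"},   # seconds: 14 * 24 * 3600
    )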
SQS Reliability
- Amazon SQS stores all message queues and messages within a single, highly available AWS region with multiple redundant Availability Zones (AZs)
- No single computer, network, or AZ failure can make messages inaccessible
How SQS is different from SNS
SQS : Queue
SNS : Topic (Pub/Sub system)
Message consumption
SQS : Pull Mechanism - Consumers poll and pull messages from SQS
- SQS is a distributed queuing system
- Messages are NOT pushed to receivers
- Receivers have to poll or pull messages from SQS
- Messages can't be received by multiple receivers at the same time
- Any one receiver can receive a message, process and delete it
- Other receivers do not receive the same message later
- Polling inherently introduces some latency in message delivery in SQS, unlike SNS, where messages are immediately pushed to subscribers
Persistence
SQS : Messages are persisted for some (configurable) duration if no consumer is available
SNS : No persistence. Whichever consumer is present at the time of message arrival gets the
message and the message is deleted. If no consumers are available then the message is lost.
1. Define SNS
2. What is the Primary Task to start with SNS
3. Clients for SNS
4. Define SQS
5. What are the Types of messaging queues in SQS
6. Which queue allows duplicate messages
7. What is the retention period of SQS messages
8. What is the maximum size of SQS message
9. How SQS is different from SNS
What is Elastic Load Balancing
- Elastic Load Balancing distributes incoming application or network traffic across multiple
targets, such as Amazon EC2 instances, containers, and IP addresses, in multiple
Availability Zones
- Elastic Load Balancing scales your load balancer as traffic to your application changes
over time. It can automatically scale to the vast majority of workloads
Benefits
- A load balancer distributes workloads across multiple compute resources, such as virtual
servers. Using a load balancer increases the availability and fault tolerance of your
applications
- You can add and remove compute resources from your load balancer as your needs
change, without disrupting the overall flow of requests to your applications
- You can configure health checks, which monitor the health of the compute resources, so
that the load balancer sends requests only to the healthy ones
- You can also offload the work of encryption and decryption to your load balancer so that
your compute resources can focus on their main work
Elastic Load Balancer Types
- There is a key difference in how the load balancer types are configured
- With Application Load Balancers and Network Load Balancers, you register targets in
target groups, and route traffic to the target groups
- With Classic Load Balancers, you register instances with the load balancer
- You can select a load balancer based on your application needs
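A boto3 sketch of the ALB/NLB model described above, where targets are registered in a target group and a listener forwards traffic to that group (the VPC ID, instance IDs, load balancer ARN, and names are placeholders, and the load balancer itself is assumed to already exist):
    # Sketch: register EC2 instances in a target group and route an existing ALB listener to it.
    import boto3

    elbv2 = boto3.client("elbv2")

    tg_arn = elbv2.create_target_group(
        Name="web-targets",                  # placeholder name
        Protocol="HTTP", Port=80,
        VpcId="vpc-0123456789abcdef0",       # placeholder VPC
        TargetType="instance",
    )["TargetGroups"][0]["TargetGroupArn"]

    elbv2.register_targets(
        TargetGroupArn=tg_arn,
        Targets=[{"Id": "i-0123456789abcdef0"},           # placeholder instances
                 {"Id": "i-0fedcba9876543210"}],
    )

    elbv2.create_listener(
        LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:"
                        "loadbalancer/app/demo-alb/50dc6c495c0c9188",   # placeholder ARN
        Protocol="HTTP", Port=80,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
    )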
Availability Zones and Load Balancer Nodes
- When you enable an Availability Zone for your load balancer, Elastic Load Balancing
creates a load balancer node in the Availability Zone
- If you register targets in an Availability Zone but do not enable the Availability Zone,
these registered targets do not receive traffic
- Your load balancer is most effective when you ensure that each enabled Availability
Zone has at least one registered target
- AWS recommends that you enable multiple Availability Zones
- This configuration helps ensure that the load balancer can continue to route traffic
- If one Availability Zone becomes unavailable or has no healthy targets, the load
balancer can route traffic to the healthy targets in another Availability Zone
- After you disable an Availability Zone, the targets in that Availability Zone remain
registered with the load balancer
- However, even though they remain registered, the load balancer does not route
traffic to them
Cross-Zone Load Balancing
- The nodes for your load balancer distribute requests from clients to registered targets
- When cross-zone load balancing is enabled, each load balancer node distributes traffic
across the registered targets in all enabled Availability Zones
- When cross-zone load balancing is disabled, each load balancer node distributes
traffic only across the registered targets in its Availability Zone
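For Network Load Balancers, where cross-zone load balancing is disabled by default, it can be toggled as a load balancer attribute; a boto3 sketch with a placeholder ARN:
    # Sketch: enable cross-zone load balancing on an NLB so each node distributes
    # traffic across registered targets in all enabled Availability Zones.
    import boto3

    elbv2 = boto3.client("elbv2")

    elbv2.modify_load_balancer_attributes(
        LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:"
                        "loadbalancer/net/demo-nlb/50dc6c495c0c9188",   # placeholder ARN
        Attributes=[{"Key": "load_balancing.cross_zone.enabled", "Value": "true"}],
    )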
Internal and Internet-facing Load Balancers
- When you create a load balancer in a VPC, you must choose whether to make it an internal load balancer or an Internet-facing load balancer
Classic Load Balancer
- Classic Load Balancer provides basic load balancing across multiple Amazon EC2 instances
- Classic Load Balancer is intended for applications that were built within the EC2-Classic network
- CLB sends requests directly to EC2 instances
- AWS recommends Application Load Balancer for Layer 7 and Network Load Balancer for Layer 4 when using a Virtual Private Cloud (VPC)
Key Features
- The basic components are listeners, rules, and target groups
- Each listener contains a default rule, and a listener can contain additional rules that route requests to different target groups
- A target can be registered with more than one target group
Network Load Balancer Overview
Benefits
- Ability to handle volatile workloads and scale to millions of requests per second
- Support for containerized applications
- Support for monitoring the health of each service independently
DNS setup for ELB
- Each Load Balancer receives a default Domain Name System (DNS) name
- This DNS name includes the name of the AWS region in which the load balancer is
created
- For example, if you create a load balancer named my-loadbalancer in the US West
(Oregon) region, your load balancer receives a DNS name such as my-loadbalancer-
1234567890.us-west-2.elb.amazonaws.com
- To access the website on your instances, you paste this DNS name into the address
field of a web browser
- However, this DNS name is not easy for your customers to remember and use
- If you'd prefer to use a friendly DNS name for your load balancer, such
as www.example.com, instead of the default DNS name, you can create a custom domain
name and associate it with the DNS name for your load balancer
- When a client makes a request using this custom domain name, the DNS server resolves it
to the DNS name for your load balancer.
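A boto3 sketch of associating a friendly name with the load balancer's DNS name using a Route 53 alias record (the hosted zone IDs, domain, and DNS name are placeholders; the load balancer's canonical hosted zone ID can be read from elbv2 describe_load_balancers):
    # Sketch: point www.example.com at the load balancer's DNS name via an alias record.
    import boto3

    route53 = boto3.client("route53")

    route53.change_resource_record_sets(
        HostedZoneId="Z0123456789EXAMPLE",    # placeholder: the domain's hosted zone
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "www.example.com",
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": "Z0987654321EXAMPLE",   # placeholder: the LB's canonical hosted zone ID
                        "DNSName": "my-loadbalancer-1234567890.us-west-2.elb.amazonaws.com",
                        "EvaluateTargetHealth": False,
                    },
                },
            }]
        },
    )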
Auto Scaling
Getting started with Auto Scaling
⁻ Amazon EC2 Auto Scaling helps you ensure that you have the correct number of Amazon
EC2 instances available to handle the load for your application
⁻ You create collections of EC2 instances, called Auto Scaling groups
⁻ You can specify the minimum number of instances in each Auto Scaling group, and
Amazon EC2 Auto Scaling ensures that your group never goes below this size
⁻ You can specify the maximum number of instances in each Auto Scaling group, and
Amazon EC2 Auto Scaling ensures that your group never goes above this size
⁻ If you specify scaling policies, then Amazon EC2 Auto Scaling can launch or terminate
instances as demand on your application increases or decreases
For example, the following Auto Scaling group has a minimum size of one instance, a desired
capacity of two instances, and a maximum size of four instances. The scaling policies that you
define adjust the number of instances, within your minimum and maximum number of
instances, based on the criteria that you specify.
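A boto3 sketch of the example above: an Auto Scaling group with a minimum of one instance, desired capacity of two, and a maximum of four (the group name, launch template, and subnet IDs are placeholders):
    # Sketch: create an Auto Scaling group sized min=1, desired=2, max=4.
    import boto3

    autoscaling = boto3.client("autoscaling")

    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="web-asg",                            # placeholder name
        LaunchTemplate={"LaunchTemplateName": "web-template",      # placeholder launch template
                        "Version": "$Latest"},
        MinSize=1,
        DesiredCapacity=2,
        MaxSize=4,
        VPCZoneIdentifier="subnet-0123456789abcdef0,subnet-0fedcba9876543210",   # placeholder subnets
    )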
Entities of Auto Scaling
⁻ AWS Auto Scaling monitors your applications and automatically adjusts capacity to
maintain steady, predictable performance at the lowest possible cost
⁻ Using AWS Auto Scaling, it's easy to set up application scaling for multiple resources across multiple services in minutes
⁻ The service provides a simple, powerful user interface that lets you build scaling plans for
resources such as Amazon EC2 instances
⁻ AWS Auto Scaling makes scaling simple with recommendations that allow you to optimize
performance, costs, or balance between them
⁻ It’s easy to get started with AWS Auto Scaling using the AWS Management Console,
Command Line Interface (CLI)
⁻ AWS Auto Scaling is available at no additional charge. You pay only for the AWS resources
needed to run your applications and Amazon CloudWatch monitoring fees
Auto Scaling Groups
⁻ An Auto Scaling group contains a collection of Amazon EC2 instances that are treated as a logical grouping for the purposes of automatic scaling and management
⁻ Size of an Auto Scaling group depends on the number of instances you set as the desired
capacity. You can adjust its size to meet demand either manually, or by using automatic
scaling
⁻ An Auto Scaling group starts by launching enough instances to meet its desired capacity.
⁻ It maintains this number of instances by performing periodic health checks on the
instances in the group
⁻ The Auto Scaling group continues to maintain a fixed number of instances even if an
instance becomes unhealthy. If an instance becomes unhealthy, the group terminates the
unhealthy instance and launches another instance to replace it
⁻ You can use scaling policies to increase or decrease the number of instances in your group
dynamically to meet changing conditions
⁻ When the scaling policy is in effect, the Auto Scaling group adjusts the desired capacity of
the group, between the minimum and maximum capacity values that you specify, and
launches or terminates the instances as needed
Launch Configurations
⁻ At any time, you can change the size of an existing Auto Scaling group
⁻ Update the desired capacity of the Auto Scaling group, or update the instances that
are attached to the Auto Scaling group.
⁻ When you configure dynamic scaling, you must define how to scale in response to
changing demand
⁻ For example, you have a web application that currently runs on two instances and
you want the CPU utilization of the Auto Scaling group to stay at around 50 percent
when the load on the application changes
⁻ This gives you extra capacity to handle traffic spikes without maintaining an excessive
amount of idle resources
⁻ You can configure your Auto Scaling group to scale automatically to meet this need
⁻ The policy type determines how the scaling action is performed.
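A boto3 sketch of the 50 percent CPU example above, expressed as a target tracking scaling policy (the group and policy names are placeholders):
    # Sketch: keep the Auto Scaling group's average CPU utilization near 50%.
    import boto3

    autoscaling = boto3.client("autoscaling")

    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",              # placeholder group name
        PolicyName="target-50-percent-cpu",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": 50.0,
        },
    )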
1. Why Elastic Load Balancer used
2. What are the functionalities of ELB other than balancing the traffic
3. What are the types of Elastic Load Balancer and how are they different from each other
4. Difference between Internet and Internal Load balancer
5. What is cross-zonal load balancing
6. Is it possible to register Targets (EC2 instances) after the creation of Elastic load balancer
with target group
7. Define Auto-scaling feature
8. What are the prerequisites to create an Auto Scaling group
9. Difference between manual and dynamic Auto scaling
10. For an application with fixed number of users at all times, do you think there is a need of
Auto Scaling
11. Can an Auto Scaling group have 2 launch configurations at a time
12. Once an Auto Scaling group is created, is it possible to change the launch configuration for that group
Other AWS Services
High Level Description
Introduction to Lambda
- AWS Lambda is a compute service that lets you run code without provisioning or
managing servers
- AWS Lambda executes your code only when needed and scales automatically, from a few
requests per day to thousands per second
- You pay only for the compute time you consume - there is no charge when your code is
not running
- With AWS Lambda, you can run code for virtually any type of application or backend
service - all with zero administration
- AWS Lambda runs your code on a high-availability compute infrastructure and performs
all of the administration of the compute resources, including server and operating system
maintenance, capacity provisioning and automatic scaling, code monitoring and logging
- All you need to do is supply your code in one of the languages that AWS Lambda
supports
- You can use AWS Lambda to run your code in response to events, such as changes to data
in an Amazon S3 bucket or an Amazon DynamoDB table
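To make the event-driven model concrete, a minimal sketch of a Lambda handler reacting to S3 object-created events (the bucket and key fields follow the standard S3 event structure; what is done with them here is purely illustrative):
    # Sketch: Lambda handler invoked by S3 object-created events.
    def lambda_handler(event, context):
        records = event.get("Records", [])
        for record in records:
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            # Illustrative only: real code would process the new object here.
            print(f"New object s3://{bucket}/{key}")
        return {"processed": len(records)}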
When Should I Use AWS Lambda?
- AWS Lambda is an ideal compute platform for many application scenarios, provided that
you can write your application code in languages supported by AWS Lambda, and run
within the AWS Lambda standard runtime environment and resources provided by
Lambda
- When using AWS Lambda, you are responsible only for your code
- AWS Lambda manages the compute fleet that offers a balance of memory, CPU, network,
and other resources
- This comes at the cost of some flexibility: you cannot log in to compute instances or customize the operating system on provided runtimes
- These constraints enable AWS Lambda to perform operational and administrative
activities on your behalf, including provisioning capacity, monitoring fleet health,
applying security patches, deploying your code, and monitoring and logging your Lambda
functions
- Lambda is a highly available service
- Languages supported by AWS Lambda are .NET, Go, Java, Node.js, Python, Ruby
- If you need to manage your own compute resources, Amazon Web Services also offers
other compute services to meet your needs
- Amazon Elastic Compute Cloud (Amazon EC2) service offers flexibility and a wide
range of EC2 instance types to choose from. It gives you the option to customize
operating systems, network and security settings, and the entire software stack, but
you are responsible for provisioning capacity, monitoring fleet health and
performance, and using Availability Zones for fault tolerance
- Elastic Beanstalk offers an easy-to-use service for deploying and scaling applications
onto Amazon EC2 in which you retain ownership and full control over the underlying
EC2 instances
AWS Lambda - Building Block
- Lambda Function: the foundation, comprised of your custom code and any dependent libraries
- Event Source: an AWS service, such as Amazon SNS, that triggers your function and executes its logic
- Log Streams: Lambda automatically monitors your function invocations and reports metrics to CloudWatch; custom logging statements in your code are delivered to CloudWatch Logs as log streams
Event Source Mapping
- In AWS Lambda, Lambda functions and event sources are the core components
- An event source is the entity that publishes events, and a Lambda function is the custom
code that processes the events
- Supported event sources refer to those AWS services that can be pre-configured to work
with AWS Lambda
- The configuration is referred to as event source mapping which maps an event source to a
Lambda function
- AWS Lambda supports many AWS services as event sources
- When you configure these event sources to trigger a Lambda function, the Lambda
function is invoked automatically when events occur
- Some of the supported AWS Event sources for Lambda functions are
- Amazon S3
- Amazon DynamoDB
- Amazon Simple Notification Service
- Amazon SQS
- Amazon CloudWatch
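Queue- and stream-based sources such as SQS are wired to a function through an event source mapping; a boto3 sketch with placeholder ARNs and names:
    # Sketch: map an SQS queue to a Lambda function so Lambda polls the queue
    # and invokes the function with batches of messages.
    import boto3

    lambda_client = boto3.client("lambda")

    lambda_client.create_event_source_mapping(
        EventSourceArn="arn:aws:sqs:us-east-1:111122223333:order-queue",   # placeholder queue ARN
        FunctionName="process-orders",                                     # placeholder function
        BatchSize=10,
    )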
Lambda Function Configuration
- You only specify the amount of memory you want to allocate for your Lambda function
- AWS Lambda allocates CPU power proportional to the memory
- You can update the configuration and request additional memory in 64 MB increments
from 128 MB to 3008 MB
- If the maximum memory use is exceeded, function invocation will be terminated
- Functions larger than 1536 MB are allocated multiple CPU threads, and multi-threaded or
multi-process code is needed to take advantage of the additional performance
- You pay for the AWS resources that are used to run your Lambda function
- To prevent your Lambda function from running indefinitely, you specify a timeout
- When the specified timeout is reached, AWS Lambda terminates your Lambda function
- Default is 3 seconds, maximum is 900 seconds (15 minutes)
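A sketch of adjusting memory and timeout on an existing function with boto3 (the function name and values are illustrative):
    # Sketch: raise a function's memory (CPU scales proportionally) and its timeout.
    import boto3

    lambda_client = boto3.client("lambda")

    lambda_client.update_function_configuration(
        FunctionName="process-orders",   # placeholder function name
        MemorySize=512,                  # MB, configurable in 64 MB increments
        Timeout=30,                      # seconds, up to 900
    )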
Lambda@Edge
- Lambda@Edge is a feature of Amazon CloudFront that lets you run code closer to users
of your application, which improves performance and reduces latency
- Lambda@Edge runs your code in response to events generated by the Amazon
CloudFront content delivery network (CDN)
- With Lambda@Edge, you can enrich your web applications by making them globally
distributed and improving their performance — all with zero server administration
- You don't have to provision or manage infrastructure in multiple locations around the world
- You pay only for the compute time you consume - there is no charge when your code is
not running
- Just upload your code to AWS Lambda, which takes care of everything required to run
and scale your code with high availability at an AWS location closest to your end user
Benefits
- You can customize your users' experience by transforming images on the fly based on the
user characteristics
- For example, you can resize images based on the viewer's device type—mobile,
desktop, or tablet
- You can also cache the transformed images at CloudFront Edge locations to further
improve performance when delivering images
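A rough sketch of a Lambda@Edge viewer-request handler that rewrites the request URI for mobile viewers; this assumes CloudFront is configured to forward the CloudFront-Is-Mobile-Viewer header, and the /images/mobile/ path convention is purely illustrative:
    # Sketch: Lambda@Edge viewer-request handler serving mobile-optimized images
    # from a separate path when the viewer is a mobile device.
    def lambda_handler(event, context):
        request = event["Records"][0]["cf"]["request"]
        headers = request["headers"]

        is_mobile = headers.get("cloudfront-is-mobile-viewer", [{}])[0].get("value") == "true"
        if is_mobile and request["uri"].startswith("/images/"):
            # Illustrative convention: /images/... -> /images/mobile/...
            request["uri"] = request["uri"].replace("/images/", "/images/mobile/", 1)

        # Returning the (possibly modified) request lets CloudFront continue processing it.
        return request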
Overview of CloudFormation
⁻ AWS CloudFormation provides a common language for you to describe and provision all the infrastructure resources in your cloud environment
⁻ CloudFormation allows you to use a simple text file to model and provision, in an automated and secure manner, all the resources needed for your applications across all regions and accounts
⁻ This file serves as the single source of truth for your cloud environment
⁻ AWS CloudFormation is available at no additional charge, and you pay only for the AWS resources needed to run your applications
Benefits
- MODEL IT ALL
AWS CloudFormation allows you to model your entire infrastructure in a text file. This template becomes the single source of truth for your infrastructure. This helps you to standardize infrastructure components used across your organization, enabling configuration compliance and faster troubleshooting.
How it Works
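A minimal sketch of the workflow: resources are described in a template (a text file), and CloudFormation creates them as a stack. The tiny template and stack name below are illustrative, not part of the original notes:
    # Sketch: create a stack from an inline YAML template describing one S3 bucket.
    import textwrap
    import boto3

    template = textwrap.dedent("""\
        AWSTemplateFormatVersion: '2010-09-09'
        Resources:
          NotesBucket:
            Type: AWS::S3::Bucket
        """)

    cloudformation = boto3.client("cloudformation")

    cloudformation.create_stack(
        StackName="demo-stack",        # placeholder stack name
        TemplateBody=template,
    )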
Introduction to FSx
- Amazon FSx makes it easy and cost effective to launch and run popular file systems
- With Amazon FSx, you can leverage the rich feature sets and fast performance of widely-
used open source and commercially-licensed file systems, while avoiding time-consuming
administrative tasks like hardware provisioning, software configuration, patching, and
backups
- It provides cost-efficient capacity and high levels of reliability, and it integrates with other
AWS services so that you can manage and use the file systems in cloud-native ways
Amazon FSx provides you with two file systems to choose from
- Amazon FSx for Windows File Server provides fully managed, highly reliable file storage that is accessible over the industry-standard Server Message Block (SMB) protocol
- It is built on Windows Server, delivering a wide range of administrative features such as
user quotas, end-user file restore, and Microsoft Active Directory (AD) integration
- It offers single-AZ and multi-AZ deployment options, fully managed backups, and
encryption of data at rest and in transit
- Amazon FSx file storage is accessible from Windows, Linux, and MacOS compute instances
and devices running on AWS or on premises
- You can optimize cost and performance for your workload needs with SSD and HDD
storage options
- Amazon FSx helps you optimize TCO with data deduplication, reducing costs by up to 50-60% on your general-purpose file shares
Amazon FSx for Lustre for high-performance file system
- Amazon FSx for Lustre makes it easy and cost effective to launch and run the world’s most
popular high-performance file system, Lustre
- Use it for workloads where speed matters, such as machine learning, high performance
computing (HPC), video processing, and financial modeling
- The open source Lustre file system was built for and is used for the world’s most
demanding workloads, and is the most widely used file system for the 500 fastest
computers in the world
- Amazon FSx brings the Lustre file system to the masses by making it easy and cost
effective for you to use Lustre for any workload where you want to process data as quickly
as possible
- You can also link your FSx for Lustre file systems to Amazon S3, making it simple to
process data stored on S3
Benefits
Migration Services
Moving your data and applications to the cloud isn't easy, but Amazon has a number of services that can take some of the load off. Some of them are:
⁻ AWS Migration Hub provides a single place to monitor migrations in any AWS region
where your migration tools are available
⁻ There is no additional cost for using Migration Hub. You only pay for the cost of the
individual migration tools you use, and any resources being consumed on AWS
Features
⁻ Allows you to import information about your on-premises servers and applications into the Migration Hub so you can track the status of application migrations.
⁻ Shows the latest status and metrics for your entire migration portfolio. This allows you
to quickly understand the progress of your migrations, as well as identify and
troubleshoot any issues that arise.
⁻ Provides all application details in a central location. This allows you to track the status of
all the moving parts across all migrations, making it easier to view overall migration
progress
AWS Database Migration Service
⁻ AWS Database Migration Service helps you migrate databases to AWS quickly and
securely
⁻ Can migrate your data to and from most widely used commercial and open-source
databases
⁻ Supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous
migrations between different database platforms, such as Oracle or Microsoft SQL Server
to Amazon Aurora
Benefits
⁻ Simple to use
⁻ Minimal downtime
⁻ Supports widely used databases
⁻ Low cost
⁻ Fast and easy to set up
⁻ Reliable
Introduction to the Snow Family
- The Snow Family of services offers a number of physical devices and capacity points,
including some with built-in compute capabilities
- These services help physically transport up to exabytes of data into and out of AWS
- The Snow Family of services are owned and managed by AWS and integrate with AWS
security, monitoring, storage management and computing capabilities
⁻ AWS Snowball is a petabyte-scale data transport solution that uses secure appliances to
transfer large amounts of data into and out of AWS
⁻ The AWS Snowball service uses physical storage devices to transfer large amounts of data
between Amazon Simple Storage Service (Amazon S3) and your onsite data storage
location at faster-than-internet speeds
⁻ Snowball uses multiple layers of security designed to protect your data including tamper-
resistant enclosures, 256-bit encryption, and an industry-standard Trusted Platform
Module (TPM) designed to ensure both security and full chain of custody of your data.
⁻ Once the data transfer job has been processed and verified, AWS performs a software
erasure of the Snowball appliance
How it works
Snowball Features
Benefits
- Cloud migration
If you have large quantities of data you need to migrate into AWS, Snowball is often much
faster and more cost-effective than transferring that data over the Internet.
- Disaster Recovery
In the event that you need to quickly retrieve a large quantity of data stored in Amazon S3,
Snowball devices can help retrieve the data much quicker than high-speed Internet.
- Content distribution
Use Snowball devices if you regularly receive or need to share large amounts of data with
clients, customers, or business associates. Snowball devices can be sent directly from AWS to
client or customer locations.
Snowmobile
- AWS Snowmobile is an exabyte-scale data transfer service used to move extremely large amounts of data to AWS
Use cases
- Datacenter decommission
There are many steps involved in decommissioning a datacenter to make sure valuable data is not lost. Snowball can help ensure that your data is securely and cost-effectively transferred to AWS during this process.
Storage Gateway
AWS Storage Gateway
AWS Storage Gateway offers file-based, volume-based, and tape-based storage solutions:
File Gateway
⁻ A file gateway supports a file interface into Amazon Simple Storage Service (Amazon S3)
and combines a service and a virtual software appliance
⁻ By using this combination, you can store and retrieve objects in Amazon S3 using industry-
standard file protocols such as Network File System (NFS) and Server Message Block
(SMB)
⁻ The gateway is deployed into your on-premises environment as a virtual machine (VM) running on VMware ESXi or Microsoft Hyper-V hypervisor
⁻ The gateway provides access to objects in S3 as files or file share mount points
Simple Architecture
Volume Gateway
A volume gateway provides cloud-backed storage volumes that you can mount as Internet
Small Computer System Interface (iSCSI) devices from your on-premises application servers.
Cached volumes
⁻ You store your data in Amazon Simple Storage Service (Amazon S3) and retain a copy of
frequently accessed data subsets locally
⁻ Cached volumes offer a substantial cost savings on primary storage
⁻ You also retain low-latency access to your frequently accessed data
Stored volumes
⁻ If you need low-latency access to your entire dataset, first configure your on-premises
gateway to store all your data locally. Then asynchronously back up point-in-time
snapshots of this data to Amazon S3
⁻ This configuration provides durable and inexpensive offsite backups that you can recover
to your local data center or Amazon EC2
⁻ For example, if you need replacement capacity for disaster recovery, you can recover the
backups to Amazon EC2
Cache Volume Architecture
Stored Volume Architecture
Tape Gateway
⁻ Can cost-effectively and durably archive backup data in Amazon S3 Glacier or S3 Glacier
Deep Archive
⁻ A tape gateway provides a virtual tape infrastructure that scales seamlessly with your
business needs and eliminates the operational burden of provisioning, scaling, and
maintaining a physical tape infrastructure
⁻ You can run AWS Storage Gateway either on-premises as a VM appliance, as a hardware
appliance, or in AWS as an Amazon Elastic Compute Cloud (Amazon EC2) instance
Tape Gateway Architecture
SSL Certificate
⁻ SSL (Secure Sockets Layer) and its successor TLS (Transport Layer Security) are the standard security technologies for establishing an encrypted link between a web server and a browser
⁻ This link ensures that all data passed between the web server and browsers remains private and retains its integrity
⁻ These are cryptographic protocols designed to provide communications security over a
computer network
⁻ AWS Certificate Manager is a service that lets you easily provision, manage, and deploy
public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for
use with AWS services and your internal connected resources
⁻ SSL/TLS certificates are used to secure network communications and establish the identity
of websites over the Internet as well as resources on private networks
⁻ AWS Certificate Manager removes the time-consuming manual process of purchasing,
uploading, and renewing SSL/TLS certificates
⁻ With AWS Certificate Manager, you can quickly request a certificate, deploy it on ACM-
integrated AWS resources, such as Elastic Load Balancers, Amazon CloudFront
distributions, and APIs on API Gateway, and let AWS Certificate Manager handle
certificate renewals
⁻ It also enables you to create private certificates for your internal resources and manage
the certificate lifecycle centrally
Use Cases
Protect and secure your website: SSL, and its successor TLS, are industry standard protocols
for encrypting network communications and establishing the identity of websites over the
Internet
Protect and Secure your internal resources: Private certificates are used for identifying and
securing communication between connected resources on private networks, such as servers,
mobile and IoT devices, and applications
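A boto3 sketch of requesting a public certificate through AWS Certificate Manager with DNS validation (the domain names are placeholders; the returned ARN can then be attached to ACM-integrated resources such as an Elastic Load Balancer or a CloudFront distribution):
    # Sketch: request a public SSL/TLS certificate and validate domain ownership via DNS.
    import boto3

    acm = boto3.client("acm")

    response = acm.request_certificate(
        DomainName="www.example.com",                   # placeholder domain
        SubjectAlternativeNames=["example.com"],        # placeholder
        ValidationMethod="DNS",
    )

    # ACM handles renewal automatically once the certificate is issued and in use.
    print("Certificate ARN:", response["CertificateArn"])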
1. Define Lambda
2. What are the languages supported by lambda
3. What are the min and max memory that can be allocated to a Lambda function
4. What are the min and max execution times (runtime environment)
5. What is event source mapping
6. If you need to manage your own compute resources, what are the AWS offerings
7. Define Lambda@Edge
8. What is the CloudFormation service
9. What are the languages supported by CloudFormation
10. What are the different types of data migration service available
11. What is storage gateway
12. What are the types of storage gateway in AWS
13. Define AWS Certificate Manager