Amazon EC2 Auto Scaling

Amazon EC2 Auto Scaling is a service that helps you maintain application availability and allows you to automatically add or remove EC2 instances according to conditions you define. You can use the fleet management features of EC2 Auto Scaling to maintain the health and availability of your fleet.

Instead of guessing how many servers you need (provisioning for peak load and wasting money during lulls), Auto Scaling adjusts capacity so that the compute power you run closely tracks what you need at any given time.

Scaling Amazon EC2 means you start with the resources you need at launch and design your architecture to scale in or out automatically in response to changing demand. As a result, you pay only for the resources you use, and you don't have to worry about running out of compute capacity to meet your customers' demand.

Core Components of Auto Scaling

To configure Auto Scaling, you need to define three main components:

1. Launch Template

This defines the configuration of the instance. It includes the AMI (OS image), Instance Type (e.g., t3.micro), Key Pair, Security Groups, and User Data (startup scripts).

Note: Launch Configurations are legacy. Always use Launch Templates for new workloads as they support versioning and mixed instance policies.
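The fields listed above map fairly directly onto the request body of the EC2 CreateLaunchTemplate API. The following is a minimal sketch that only builds the request dictionary; the helper name and all IDs (AMI, key pair, security group) are hypothetical placeholders, and the actual API call is left as a comment because it needs boto3 and AWS credentials.

```python
import base64


def build_launch_template_request(name, ami_id, instance_type, key_name,
                                  security_group_ids, user_data_script):
    """Build keyword arguments for the EC2 CreateLaunchTemplate API."""
    # Launch template user data is supplied as base64-encoded text.
    encoded_user_data = base64.b64encode(
        user_data_script.encode("utf-8")).decode("ascii")
    return {
        "LaunchTemplateName": name,
        "LaunchTemplateData": {
            "ImageId": ami_id,                      # AMI (OS image)
            "InstanceType": instance_type,          # e.g. t3.micro
            "KeyName": key_name,                    # key pair
            "SecurityGroupIds": list(security_group_ids),
            "UserData": encoded_user_data,          # startup script
        },
    }


# Hypothetical values, for illustration only.
request = build_launch_template_request(
    name="web-template",
    ami_id="ami-0123456789abcdef0",
    instance_type="t3.micro",
    key_name="my-key-pair",
    security_group_ids=["sg-0123456789abcdef0"],
    user_data_script="#!/bin/bash\nyum install -y httpd\n",
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("ec2").create_launch_template(**request)
```

Each later change to the template creates a new version, which is what makes rollbacks and gradual rollouts practical.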

2. Auto Scaling Group

This creates the logical grouping of instances. You define the VPC and Subnets where instances should launch.
You also define the capacity limits:
Minimum Capacity: The floor. AWS will never go below this (e.g., 2 for High Availability).
Maximum Capacity: The ceiling. A safeguard to prevent runaway costs.
Desired Capacity: How many instances you want running right now.
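The relationship between the three capacity values can be sketched as a small request builder for the CreateAutoScalingGroup API. This is an illustrative sketch only: the helper name, group name, template name, and subnet IDs are assumptions, and the check simply enforces that the desired capacity sits between the floor and the ceiling.

```python
def build_asg_request(name, template_name, subnet_ids,
                      min_size, max_size, desired_capacity):
    """Build keyword arguments for the CreateAutoScalingGroup API."""
    # Desired capacity must lie between the floor and the ceiling.
    if not 0 <= min_size <= desired_capacity <= max_size:
        raise ValueError(
            "expected 0 <= min_size <= desired_capacity <= max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateName": template_name,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired_capacity,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
    }


# Hypothetical names and subnet IDs, for illustration only.
asg_request = build_asg_request(
    name="web-asg",
    template_name="web-template",
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
    min_size=2,
    max_size=10,
    desired_capacity=2,
)
# With boto3 and credentials available:
#   boto3.client("autoscaling").create_auto_scaling_group(**asg_request)
```

Spreading the subnets across at least two Availability Zones is what lets a minimum of 2 provide high availability.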

3. Scaling Policies (The "When")

Scaling policies define when the group adds or removes instances. They resolve a familiar capacity trade-off:

Provisioning servers for peak traffic ensures demand is met but can lead to excess capacity and higher costs.
Allocating resources based on average demand reduces costs but may hurt performance during spikes.
EC2 Auto Scaling automatically adds or removes instances based on real-time demand.
The result is a cost-efficient architecture in which you pay only for the resources actually used.

[Figure: Capacity by day of the week]

Amazon EC2 Auto Scaling scales your EC2 resources according to incoming traffic, which helps maintain high availability and optimize EC2 costs.
With EC2 Auto Scaling you create a collection of EC2 instances called an Auto Scaling group, and a load balancer can distribute traffic across those instances. You specify the minimum, maximum, and desired capacity for the group, and EC2 Auto Scaling launches and terminates instances automatically to keep the group at the appropriate capacity.
EC2 Auto Scaling also lets you configure policies that state the threshold, such as a CPU utilization percentage or a memory usage level, at which the group should scale, so that capacity follows demand automatically. Note that memory usage is not a built-in EC2 metric; it has to be published to CloudWatch as a custom metric first.

Auto-Scaling EC2

This is where Amazon EC2 Auto Scaling comes in. You can use it to add or remove EC2 instances as your application's demand changes. By dynamically scaling in and out as needed, you maintain higher application availability.

Features of AWS Auto Scaling

Here are some of the most important features of AWS Auto Scaling:

Dynamic Scaling: Dynamic scaling adjusts EC2 instances based on demand, allowing users to match the application’s load in real time. Using metrics like CPU usage or request count per target, EC2 Auto Scaling automatically increases or decreases instances to maintain optimal performance.

Load Balancing: Load balancing involves distributing incoming traffic across multiple instances to improve performance and availability. Amazon Elastic Load Balancing (ELB) is a service that automatically distributes incoming traffic across multiple instances in one or more Availability Zones.
Multi-Availability Zone Deployment: Multi-Availability Zone (AZ) deployment involves launching instances in multiple AZs to improve availability and fault tolerance. If one AZ becomes unavailable, Amazon EC2 Auto Scaling launches replacement instances in the group's other enabled AZs to maintain availability.
Containerization: Containerization involves using containers to package and deploy applications, making them more portable and easier to manage. Amazon Elastic Container Service (ECS) is a service that makes it easy to run, stop, and manage Docker containers on a cluster of EC2 instances.

Types of AWS (Amazon Web Services) Autoscaling

AWS offers several ways to scale your infrastructure:

1. Target Tracking Scaling (Recommended)

This is the "thermostat" approach. You set a target value for a specific metric, and AWS handles the rest.

Example: "Keep Average CPU Utilization at 50%."
If CPU hits 70%, ASG adds instances. If it drops to 30%, ASG removes instances.
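The "keep average CPU at 50%" example can be expressed as a target tracking policy request. The sketch below only builds the parameters for the PutScalingPolicy API; the helper name, policy name, and group name are hypothetical.

```python
def build_target_tracking_policy(asg_name, target_cpu_percent):
    """Build keyword arguments for a target tracking PutScalingPolicy call."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            # Predefined metric: average CPU utilization across the group.
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


# Hypothetical group name; the 50% target comes from the example above.
policy = build_target_tracking_policy("web-asg", 50)
# With boto3 and credentials available:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

Only the target is specified here; the CloudWatch alarms that drive the policy are created and managed by the service.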

2. Step Scaling & Simple Scaling

Used for more advanced or "stepped" logic.

Example: "If CPU > 50%, add 1 instance. If CPU > 80%, add 3 instances."
Note: Simple scaling requires a "Cooldown Period" (a pause after scaling) to prevent oscillation. Step scaling is generally preferred over Simple scaling today.
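The stepped example can be modeled in a few lines. In a step scaling policy, the step bounds are expressed relative to the alarm threshold, so with a threshold of 50% the "CPU > 80%" rule becomes a lower bound of 30. The sketch below is a local simulation of that lookup, not the service's implementation; the function and constant names are hypothetical, while the dictionary keys follow the StepAdjustments structure of the PutScalingPolicy API.

```python
ALARM_THRESHOLD = 50.0  # the alarm fires when CPU > 50%

# Bounds are offsets from the alarm threshold, not absolute CPU values.
STEP_ADJUSTMENTS = [
    # 50% < CPU < 80%: add 1 instance
    {"MetricIntervalLowerBound": 0.0,
     "MetricIntervalUpperBound": 30.0,
     "ScalingAdjustment": 1},
    # CPU >= 80%: add 3 instances (lower bounds are inclusive)
    {"MetricIntervalLowerBound": 30.0,
     "ScalingAdjustment": 3},
]


def instances_to_add(cpu_percent, threshold=ALARM_THRESHOLD,
                     steps=STEP_ADJUSTMENTS):
    """Return the capacity change for a given CPU reading."""
    breach = cpu_percent - threshold
    if breach <= 0:
        return 0  # alarm not in breach, no scaling action
    for step in steps:
        lower = step.get("MetricIntervalLowerBound", float("-inf"))
        upper = step.get("MetricIntervalUpperBound", float("inf"))
        if lower <= breach < upper:
            return step["ScalingAdjustment"]
    return 0


for reading in (45, 65, 90):
    print(reading, "->", instances_to_add(reading))
```

Because each step covers a distinct range of breach sizes, a large spike triggers a proportionally larger response in a single action.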

3. Scheduled Scaling

Useful for predictable traffic patterns.

Example: You know traffic spikes every Monday at 9 AM. You configure the ASG to increase Min Capacity to 10 at 8:55 AM and lower it back down at 5:00 PM.
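The Monday example could be written as two recurring scheduled actions. This sketch builds parameters for the PutScheduledUpdateGroupAction API; the helper name, action names, group name, and the baseline minimum of 2 used for the evening action are assumptions, since the text does not say what value the minimum returns to.

```python
def build_scheduled_action(asg_name, action_name, cron_expression,
                           min_size, time_zone="UTC"):
    """Build keyword arguments for PutScheduledUpdateGroupAction."""
    return {
        "AutoScalingGroupName": asg_name,
        "ScheduledActionName": action_name,
        # Cron fields: minute hour day-of-month month day-of-week
        "Recurrence": cron_expression,
        "MinSize": min_size,
        "TimeZone": time_zone,
    }


# Every Monday at 8:55 AM: raise the minimum to 10.
scale_up = build_scheduled_action(
    "web-asg", "monday-morning-scale-up", "55 8 * * 1", min_size=10)

# Every Monday at 5:00 PM: return to an assumed baseline minimum of 2.
scale_down = build_scheduled_action(
    "web-asg", "monday-evening-scale-down", "0 17 * * 1", min_size=2)

# With boto3 and credentials available:
#   client = boto3.client("autoscaling")
#   client.put_scheduled_update_group_action(**scale_up)
#   client.put_scheduled_update_group_action(**scale_down)
```

The group's maximum capacity must be at least 10 for the morning action to be valid.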

4. Predictive Scaling

Uses machine learning to analyze historical traffic data and forecast future demand. It schedules scaling actions proactively so that capacity is ready before the traffic arrives.

[Figure: Types of AWS Auto Scaling]

Advanced Features

Mixed Instances Policy (Cost Optimization)

You don't have to use just one instance type or pricing model.

Spot + On-Demand: You can configure your ASG to run a baseline of reliable On-Demand instances (e.g., 20%) and fill the rest of the demand with Spot Instances (80%). Spot Instances can be priced up to 90% lower than On-Demand, but AWS can reclaim them at short notice.
Multiple Instance Types: You can list multiple acceptable instance types (e.g., t3.medium, t3a.medium, m5.large). If one is unavailable or expensive, the ASG launches another type from the list, chosen according to the configured allocation strategy.
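The 20% On-Demand / 80% Spot split and the list of instance types can both be captured in the MixedInstancesPolicy structure that is passed when creating or updating an Auto Scaling group. This is a sketch under assumptions: the helper name and template name are hypothetical, and the Spot allocation strategy shown is one reasonable choice rather than a requirement.

```python
def build_mixed_instances_policy(template_name, instance_types,
                                 on_demand_percent):
    """Build a MixedInstancesPolicy for an Auto Scaling group."""
    if not 0 <= on_demand_percent <= 100:
        raise ValueError("on_demand_percent must be between 0 and 100")
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
            # Each override names one acceptable instance type.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": 0,
            # Share of capacity above the base that runs On-Demand;
            # the remainder is requested as Spot.
            "OnDemandPercentageAboveBaseCapacity": on_demand_percent,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


mixed_policy = build_mixed_instances_policy(
    template_name="web-template",  # hypothetical template name
    instance_types=["t3.medium", "t3a.medium", "m5.large"],
    on_demand_percent=20,
)
# Passed as the MixedInstancesPolicy argument of
# create_auto_scaling_group or update_auto_scaling_group in boto3.
```

Listing several similarly sized instance types gives the group more Spot capacity pools to draw from, which tends to reduce interruptions.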

Health Checks: EC2 vs. ELB

EC2 Health Check (Default): The ASG checks if the instance is running. If the hardware fails, ASG replaces it. However, if your application crashes but the OS is still running, EC2 thinks the instance is healthy.
ELB Health Check: The ASG also uses the load balancer's health check results. If an instance fails the load balancer's health check (for example, the health check path returns a 500 error or times out), the ASG considers it unhealthy and terminates and replaces it. This is crucial for web applications.
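The difference between the two health check types can be illustrated with a toy model of the decision, followed by the settings that would enable the ELB check on a group. The function is a simplification for illustration, not the service's logic; its name, the group name, and the 300-second grace period are assumptions.

```python
def is_considered_healthy(instance_running, app_passes_elb_check,
                          health_check_type):
    """Toy model of how the group judges one instance."""
    if health_check_type == "EC2":
        # Only the instance state matters; the application is not probed.
        return instance_running
    if health_check_type == "ELB":
        # The instance must be running AND pass the load balancer check.
        return instance_running and app_passes_elb_check
    raise ValueError("health_check_type must be 'EC2' or 'ELB'")


# Scenario from the text: the OS is up but the application has crashed.
print(is_considered_healthy(True, False, "EC2"))  # True: crash goes unnoticed
print(is_considered_healthy(True, False, "ELB"))  # False: instance is replaced

# Settings that would switch a group to ELB health checks.
health_check_update = {
    "AutoScalingGroupName": "web-asg",   # hypothetical group name
    "HealthCheckType": "ELB",
    # Seconds to wait after launch before health checks count.
    "HealthCheckGracePeriod": 300,
}
# With boto3 and credentials available:
#   boto3.client("autoscaling").update_auto_scaling_group(**health_check_update)
```

A grace period long enough to cover application startup helps avoid replacing instances that are simply still booting.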

Lifecycle Hooks

These allow you to "pause" the creation or termination of an instance to perform custom actions.

Launching Hook: Pause the instance before it accepts traffic to download software or run a configuration script.
Terminating Hook: Pause the instance before it is deleted to upload logs to S3 or drain connections gracefully.
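Both hook types can be sketched as request parameters for the PutLifecycleHook API, plus the CompleteLifecycleAction call that releases a paused instance. The helper names, hook names, group name, instance ID, and timeout are hypothetical; the choice of default results (abandon a launch that never finishes setup, continue a termination) is one common convention, not the only option.

```python
TRANSITIONS = {
    "launching": "autoscaling:EC2_INSTANCE_LAUNCHING",
    "terminating": "autoscaling:EC2_INSTANCE_TERMINATING",
}


def build_lifecycle_hook(asg_name, hook_name, phase, timeout_seconds=300):
    """Build keyword arguments for PutLifecycleHook."""
    return {
        "AutoScalingGroupName": asg_name,
        "LifecycleHookName": hook_name,
        "LifecycleTransition": TRANSITIONS[phase],
        # How long the instance stays paused before the default applies.
        "HeartbeatTimeout": timeout_seconds,
        # What happens if nobody completes the action in time.
        "DefaultResult": "ABANDON" if phase == "launching" else "CONTINUE",
    }


def build_completion(asg_name, hook_name, instance_id, result="CONTINUE"):
    """Build keyword arguments for CompleteLifecycleAction."""
    return {
        "AutoScalingGroupName": asg_name,
        "LifecycleHookName": hook_name,
        "InstanceId": instance_id,
        "LifecycleActionResult": result,
    }


launch_hook = build_lifecycle_hook("web-asg", "install-software", "launching")
terminate_hook = build_lifecycle_hook("web-asg", "upload-logs", "terminating")
done = build_completion("web-asg", "upload-logs", "i-0123456789abcdef0")
# With boto3 and credentials available:
#   client = boto3.client("autoscaling")
#   client.put_lifecycle_hook(**launch_hook)
#   client.complete_lifecycle_action(**done)
```

A script on the instance would typically send the completion call once its setup or cleanup work has finished, which ends the pause early.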

AWS Auto Scaling for EC2 (Elastic Compute Cloud)

Amazon EC2 Auto Scaling automatically adjusts instances based on demand and replaces unhealthy ones. It performs three key functions:

Balances instance capacity across Availability Zones so that instances are evenly distributed
Replaces unhealthy instances to maintain service reliability
Monitors instance health and, when a load balancer is attached, registers instances so that traffic is spread across the healthy ones

[Figure: Scaling Amazon EC2]

Use Cases of AWS (Amazon Web Services) Auto Scaling

Automatic Scaling: The application scales automatically with incoming traffic. When load increases, it scales out; when load decreases, it scales in.
Scheduled Scaling: If historical data shows when traffic peaks and when it is low, you can schedule scaling actions for those times.
Integration: Auto Scaling integrates with other AWS services. In particular, machine-learning-based forecasting can predict incoming traffic so that capacity scales ahead of it.

Configuring AWS Auto Scaling Steps

Once an Auto Scaling group is configured, it:

Automatically adjusts the number of instances based on traffic or CPU load
Monitors instances in the group and maintains balanced performance
Scales out when load increases and scales in when load decreases
Replaces failed instances to maintain the desired capacity

To learn how to create an Auto Scaling group, refer to Create and Configure the Auto Scaling Group in EC2.