What is Amazon Auto-Scaling

Updated 17 January 2025

Our previous blogs introduced the AWS Elastic File System (EFS) and the AWS Elastic Load Balancer (ELB).

In later blogs, we set up our first EFS and ELB with EC2 instances.

The AWS ELB distributes incoming traffic across multiple targets to keep servers stable during traffic spikes, providing high availability and fault tolerance.

EFS acts as a centralized file system that hundreds or thousands of EC2 instances or containers can access over the network.

It lets the targets (in our case, the EC2 instances behind the load balancer) access the same file system, which ensures data consistency and persistence even if instances fail.

Although these services ensure high availability and data consistency, an architecture built on them with a fixed number of instances is not cost-effective.

Assume we set up 3 AWS EC2 instances in different availability zones within a region, each with EFS mounted as a centralized file system.

Suppose we are also going through a high-traffic phase in which all three EC2 instances receive requests from around the globe via a load balancer.

Now, what happens when incoming traffic gets lowered?

Two instances can handle the incoming requests, so the third becomes unnecessary: it wastes resources and incurs extra cost.

Conversely, if there comes a time when all 3 instances are not enough to serve a massive surge in traffic, our architecture will fail.

What we need here is flexibility, or more precisely, automated scaling of our EC2 instances according to demand.

Amazon Auto Scaling

Amazon Web Services provides Auto Scaling, which ensures that you have the correct number of Amazon EC2 instances available to handle the load for your application.

This web service automatically launches or terminates Amazon EC2 instances based on user-defined policies, schedules, and health checks.

It scales instances in and out based on the defined policies. An auto-scaling group is created with a minimum and a maximum number of instances.

The group never drops below the minimum instance count when scaling in and stops launching instances once the maximum limit is reached.

A load balancer can be attached to the auto-scaling group to distribute traffic evenly across the instances in the group.

Since newly launched instances need access to the same content, we can mount EFS on each EC2 instance at launch.

Benefits of Amazon Auto Scaling

  • Better Fault Tolerance
  • High Availability
  • Better Cost Management

Auto-scaling groups contain EC2 instances in one or more availability zones within a region, but not across regions. Each group has a launch configuration.

The auto-scaling group uses the launch configuration as a template for launching EC2 instances. It contains details such as the AMI ID, instance type, security groups, key pair, and user data. The subnets, in contrast, are specified on the auto-scaling group itself.

User data contains a set of commands that run on each instance when it launches. In short, the launch configuration defines the attributes of the group's EC2 instances.
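To make the "template" idea concrete, here is a minimal sketch in plain Python. It is only a model, not the AWS API: the `LaunchConfig` class, the `launch_instance` helper, and every identifier (AMI ID, key pair, security group) are hypothetical placeholders. It shows that every instance launched from the same launch configuration gets identical attributes.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class LaunchConfig:
    """Hypothetical stand-in for a launch configuration (not the AWS API)."""
    name: str
    image_id: str                      # AMI ID
    instance_type: str
    key_name: str                      # key pair
    security_groups: Tuple[str, ...]
    user_data: str = ""                # commands run on the instance at launch


def launch_instance(config: LaunchConfig, instance_id: str) -> Dict[str, object]:
    """Model one launch: the instance copies every attribute from the template."""
    return {
        "instance_id": instance_id,
        "image_id": config.image_id,
        "instance_type": config.instance_type,
        "key_name": config.key_name,
        "security_groups": config.security_groups,
        "user_data": config.user_data,
    }


# All identifiers below are made-up placeholders.
web_config = LaunchConfig(
    name="web-launch-config",
    image_id="ami-EXAMPLE",
    instance_type="t3.micro",
    key_name="example-key",
    security_groups=("sg-EXAMPLE",),
    user_data="#!/bin/bash\necho 'bootstrap commands go here'\n",
)

# Three launches from the same template differ only in their instance ID.
fleet = [launch_instance(web_config, f"i-example-{n}") for n in range(3)]
for instance in fleet:
    print(instance["instance_id"], instance["image_id"], instance["instance_type"])
```

The dataclass is frozen on purpose: in this model, changing an attribute means creating a new configuration rather than editing the one the group already uses.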

We can also define metrics and custom scaling policies on our auto-scaling groups. Scaling policies determine how instances are scaled out or scaled in.

But before deploying scaling policies, we need to define the conditions under which scaling will occur.

Amazon CloudWatch

This is where Amazon CloudWatch comes in. CloudWatch monitors operational and performance metrics for AWS cloud resources and applications.

Using it, we can define thresholds that trigger an alarm when crossed. Scaling policies use these CloudWatch alarms to define the scaling actions.

These alarms can be set on any of the available metrics, such as EC2, ELB, or RDS metrics.

We define thresholds for metrics (e.g., CPU utilization, latency) in the alarm, and the alarm triggers the scaling policies when it fires.
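The idea of a threshold-based alarm can be sketched in a few lines of Python. This is a simplified model, not CloudWatch itself: the `alarm_breached` function and its parameters are hypothetical, and it assumes the alarm fires only when every datapoint in the most recent evaluation window is above the threshold.

```python
from typing import Sequence


def alarm_breached(
    datapoints: Sequence[float],
    threshold: float,
    evaluation_periods: int,
) -> bool:
    """Return True if the last `evaluation_periods` datapoints all exceed `threshold`."""
    if evaluation_periods <= 0:
        raise ValueError("evaluation_periods must be positive")
    if len(datapoints) < evaluation_periods:
        # Not enough data yet to evaluate the alarm.
        return False
    recent = datapoints[-evaluation_periods:]
    return all(value > threshold for value in recent)


# Hypothetical CPU utilization samples (percent), one per period.
cpu = [62.0, 85.0, 91.0, 88.0]

print(alarm_breached(cpu, threshold=80.0, evaluation_periods=3))  # True: 85, 91, 88 all exceed 80
print(alarm_breached(cpu, threshold=80.0, evaluation_periods=4))  # False: 62 is inside the window
```

Requiring several consecutive breaches keeps a single short spike from triggering a scaling action.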

For example, CloudWatch checks the group's CPU utilization at fixed intervals. If it exceeds 80% for a certain amount of time, the alarm is triggered.

The auto-scaling group then launches a new instance, and it terminates one if CPU utilization stays below 70% for a set period.

Note that, regardless of the scaling policies, the auto-scaling group always retains at least the minimum instance count and never exceeds the maximum instance count.
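The example above can be walked through with a small simulation. This is a sketch under assumptions, not how AWS implements it: `next_capacity` and `simulate` are hypothetical helpers, each reading is treated as one already-sustained alarm period, the group changes by one instance per step, and the 80% and 70% thresholds come from the example in the text.

```python
from typing import List, Sequence


def next_capacity(
    current: int,
    avg_cpu: float,
    min_size: int,
    max_size: int,
    high: float = 80.0,
    low: float = 70.0,
    step: int = 1,
) -> int:
    """Pick the next instance count for one evaluation period."""
    if avg_cpu > high:
        desired = current + step      # scale out
    elif avg_cpu < low:
        desired = current - step      # scale in
    else:
        desired = current             # between thresholds: no change
    # The group never goes below the minimum or above the maximum.
    return max(min_size, min(desired, max_size))


def simulate(
    cpu_readings: Sequence[float],
    start: int,
    min_size: int,
    max_size: int,
) -> List[int]:
    """Return the instance count after each reading."""
    capacity = start
    history = []
    for cpu in cpu_readings:
        capacity = next_capacity(capacity, cpu, min_size, max_size)
        history.append(capacity)
    return history


# Hypothetical readings: a traffic spike followed by a quiet period.
readings = [85, 90, 95, 75, 60, 50, 40]
print(simulate(readings, start=2, min_size=2, max_size=4))
# [3, 4, 4, 4, 3, 2, 2]
```

The third reading (95%) asks for a fifth instance, but the group stays at its maximum of 4. The last reading (40%) asks to drop to one instance, but the group stays at its minimum of 2.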

EC2 Instance Lifecycle

EC2 instances in an auto-scaling group follow a path, or lifecycle, that differs from that of other EC2 instances.

The lifecycle begins when the auto-scaling group launches an instance and ends when the instance is terminated or removed from service.

We can also add lifecycle hooks to auto-scaling groups. They let us perform custom actions whenever an instance is launched or terminated.
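A rough way to picture lifecycle hooks is as callbacks that run while an instance is paused between states. The sketch below is a simplified model in plain Python: `run_lifecycle` and the two hook functions are hypothetical, the state names follow the ones used in the AWS documentation as we understand them, and side paths such as standby or detaching are left out.

```python
from typing import Callable, List, Optional

Hook = Callable[[str], None]


def run_lifecycle(
    instance_id: str,
    on_launch: Optional[Hook] = None,
    on_terminate: Optional[Hook] = None,
) -> List[str]:
    """Return the states one instance passes through, running hooks where present."""
    states = ["Pending"]
    if on_launch is not None:
        states.append("Pending:Wait")       # instance paused for the launch hook
        on_launch(instance_id)
        states.append("Pending:Proceed")
    states.append("InService")
    states.append("Terminating")
    if on_terminate is not None:
        states.append("Terminating:Wait")   # instance paused for the terminate hook
        on_terminate(instance_id)
        states.append("Terminating:Proceed")
    states.append("Terminated")
    return states


actions: List[str] = []


def install_software(instance_id: str) -> None:
    # Hypothetical custom action at launch.
    actions.append(f"configure {instance_id}")


def upload_logs(instance_id: str) -> None:
    # Hypothetical custom action before termination.
    actions.append(f"archive logs from {instance_id}")


print(run_lifecycle("i-example-0"))
print(run_lifecycle("i-example-1", on_launch=install_software, on_terminate=upload_logs))
print(actions)
```

Without hooks, the instance moves straight from pending to in service and from terminating to terminated. With hooks, it waits in between until the custom action has finished.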

The following flowchart from the AWS docs illustrates the lifecycle of EC2 instances:

We can receive notifications via Amazon Simple Notification Service whenever we use CloudWatch alarms to initiate auto-scaling actions, or when auto-scaling completes an action.

Auto-scaling groups can run On-Demand or Spot Instances, including those inside our VPC or high-performance computing clusters.

This has been a brief introduction to AWS Auto Scaling, its features, and its applicability.

In our next blog, we will begin with setting up auto-scaling and understanding its various terminologies.

If you need any help or have a query, please contact us or raise a ticket.
