AWS EC2 Auto Scaling

In a traditional IT environment, a fixed number of servers handles the application load. When the number of requests increases, the load on those servers increases as well, which causes latency and failures.

Amazon Web Services (AWS) provides Amazon EC2 Auto Scaling to address this problem. Auto Scaling ensures that enough Amazon EC2 instances are running to handle your application's load. You create an Auto Scaling group, which contains a collection of EC2 instances. You can specify a minimum number of instances for the group, and Auto Scaling ensures the group never drops below it. You can also specify a maximum number of instances for each group, and Auto Scaling ensures the group never grows beyond it.

You can also specify a desired capacity and scaling policies for the group. With a scaling policy, Auto Scaling launches or terminates EC2 instances as demand changes.
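The relationship between minimum, maximum, and requested capacity can be sketched in a few lines of Python. This is a simplified model, not the AWS API: GroupLimits and effective_capacity are hypothetical names introduced here, and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupLimits:
    """Hypothetical stand-in for an Auto Scaling group's size settings."""
    min_size: int
    max_size: int


def effective_capacity(requested: int, limits: GroupLimits) -> int:
    """Clamp a requested instance count into [min_size, max_size]."""
    if limits.min_size > limits.max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(limits.min_size, min(requested, limits.max_size))


limits = GroupLimits(min_size=2, max_size=6)
for requested in (1, 4, 10):
    # A scaling policy may ask for any count; the group stays within its bounds.
    print(requested, "->", effective_capacity(requested, limits))
```

Whatever a scaling policy asks for, the count in this model stays between the minimum and the maximum, which is the guarantee described above.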

Auto Scaling Components

  1. Groups

A group is a logical collection of EC2 instances with similar characteristics, treated as a unit for scaling and management. With an Auto Scaling group you can increase the number of instances to improve application performance, and decrease it when load drops to reduce cost. The group also maintains a fixed number of instances even if an instance becomes unhealthy.

The Auto Scaling group launches enough EC2 instances to meet the desired capacity and maintains them by performing periodic health checks on the instances in the group. If an instance becomes unhealthy, the group terminates it and launches another instance to replace it. With scaling policies, the group increases or decreases the number of running instances automatically to meet changing conditions.
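The health-check-and-replace behaviour can be pictured as a small reconciliation loop. The sketch below is a toy simulation, not how AWS implements it: Instance, launch_instance, and reconcile are hypothetical names, and the instance IDs are generated locally.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    instance_id: str
    healthy: bool = True


_ids = count(1)  # local counter used to invent simulated instance IDs


def launch_instance() -> Instance:
    """Simulate launching a fresh, healthy instance."""
    return Instance(instance_id=f"i-sim{next(_ids):04d}")


def reconcile(instances: list, desired: int) -> list:
    """Drop unhealthy instances, then launch or trim to match desired."""
    healthy = [i for i in instances if i.healthy]
    while len(healthy) < desired:
        healthy.append(launch_instance())
    return healthy[:desired]


group = reconcile([], desired=3)      # initial launch of three instances
group[1].healthy = False              # one instance fails its health check
group = reconcile(group, desired=3)   # it is terminated and replaced
print([i.instance_id for i in group])
```

After the second call the failed instance is gone and a new one has taken its place, so the group is back at its desired capacity.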

2. Launch Configuration

The launch configuration is a template that the Auto Scaling group uses to launch EC2 instances. When you create it, you specify the Amazon Machine Image (AMI), instance type, key pair, security groups, and so on. A launch configuration cannot be modified after creation; to change a setting, create a new launch configuration and update the group to use it. One launch configuration can be used by multiple Auto Scaling groups.

3. Scaling Plans

Scaling plans tell Auto Scaling when and how to scale. Amazon EC2 Auto Scaling provides several ways to scale an Auto Scaling group.

Maintaining the current instance level at all times:- You can configure the Auto Scaling group to keep a specified number of instances running at all times. To achieve this, Amazon EC2 Auto Scaling performs periodic health checks on the running instances in the group. If it finds an unhealthy instance, it terminates that instance and launches a new one to replace it.

Manual scaling:- With manual scaling, you change the maximum, minimum, or desired capacity of your Auto Scaling group yourself. Auto Scaling then launches or terminates instances to match the updated capacity.

Scale based on schedule:- In some cases you know exactly when your application traffic will be high, for example during a limited-time offer or on a particular day with peak load. In such cases you can use scheduled scaling: you create a scheduled action that tells Amazon EC2 Auto Scaling to perform a scaling action at a specific time.
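As a rough illustration of scheduled scaling, the sketch below models scheduled actions as (start time, desired capacity) pairs and looks up which capacity applies at a given moment. ScheduledAction and capacity_at are hypothetical names, the dates and capacities are invented, and the real service offers more options (such as recurring schedules) than this toy model covers.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScheduledAction:
    name: str
    start: datetime
    desired_capacity: int


def capacity_at(now: datetime, actions: list, default: int) -> int:
    """Return the capacity set by the most recent action that has started."""
    started = [a for a in actions if a.start <= now]
    if not started:
        return default
    return max(started, key=lambda a: a.start).desired_capacity


# Hypothetical one-day sale: scale out in the morning, back in at night.
actions = [
    ScheduledAction("sale-start", datetime(2024, 11, 29, 8, 0), 10),
    ScheduledAction("sale-end", datetime(2024, 11, 29, 22, 0), 2),
]

print(capacity_at(datetime(2024, 11, 29, 7, 0), actions, default=3))
print(capacity_at(datetime(2024, 11, 29, 12, 0), actions, default=3))
print(capacity_at(datetime(2024, 11, 29, 23, 0), actions, default=3))
```

Before the first action the group keeps its default size, during the sale it runs at the higher capacity, and after the second action it drops back.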

Scale based on demand:- This is the most advanced scaling model: resources scale according to a scaling policy. You define the policy on a metric such as CPU utilization or network in and out (memory utilization is not reported by EC2 by default, so it has to be published as a custom metric), and the group scales in or out based on that metric. For example, you can scale out when CPU utilization exceeds 70%. If CPU utilization crosses this threshold, Auto Scaling launches new instances using the launch configuration. With step and simple scaling you should specify two policies, one for scaling in (terminating instances) and one for scaling out (launching instances).

Types of scaling policies:-

  • Target tracking scaling:- Increases or decreases the current capacity of the Auto Scaling group to keep a specific metric at a target value.

  • Step scaling:- Increases or decreases the current capacity of the group using a set of scaling adjustments (step adjustments) that vary with the size of the alarm breach.

  • Simple scaling:- Increases or decreases the current capacity of the group based on a single scaling adjustment.
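The difference between step scaling and simple scaling is easier to see with numbers. The following is a minimal sketch, assuming a scale-out alarm at the 70% CPU threshold used in the example above; Step and step_adjustment are hypothetical names, and the step sizes are invented. Each step covers a range of how far the metric is above the threshold, with the lower bound inclusive and the upper bound exclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    lower: float            # offset above the threshold, inclusive
    upper: Optional[float]  # offset above the threshold, exclusive; None = no limit
    adjustment: int         # number of instances to add


def step_adjustment(metric: float, threshold: float, steps: list) -> int:
    """Pick the adjustment whose range contains the size of the breach."""
    breach = metric - threshold
    for step in steps:
        if breach >= step.lower and (step.upper is None or breach < step.upper):
            return step.adjustment
    return 0  # metric is below the threshold: no scale-out


# Step scaling: a larger breach triggers a larger adjustment.
step_policy = [Step(0, 10, 1), Step(10, 20, 2), Step(20, None, 3)]
# Simple scaling behaves like a single step: one adjustment for any breach.
simple_policy = [Step(0, None, 1)]

for cpu in (60, 75, 85, 95):
    print(cpu, step_adjustment(cpu, 70, step_policy),
          step_adjustment(cpu, 70, simple_policy))
```

With the step policy, 75% CPU adds one instance while 95% adds three; the simple policy adds one instance in both cases.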

Setup

As a prerequisite, you need to create an AMI of your application from the EC2 instance it is running on.

  • Setup: Launch Configuration:

  1. Go to the EC2 console and click Launch Configuration under Auto Scaling.

2. Under Choose AMI, open the My AMIs tab and select the Amazon Machine Image that you created for your web application.

3. Then select the instance type that suits your web application and click Next: Configure details.

4. On Configure details, name the launch configuration. You can also assign an IAM role if your web application needs one, and enable detailed monitoring.

5. After that, add the storage and security groups, then go to review. Note: Open the ports your application needs in order to run.

6. Click Create launch configuration, then choose an existing key pair or create a new key pair.
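The same launch configuration can also be created from code instead of the console. The sketch below assumes the boto3 library and valid AWS credentials; it only builds the request parameters and leaves the actual API call in a separate function. The function names and every ID shown (AMI, key pair, security group) are placeholders made up for illustration.

```python
def launch_configuration_params(name, ami_id, instance_type, key_name,
                                security_group_ids, detailed_monitoring=True):
    """Build keyword arguments for the create_launch_configuration call."""
    return {
        "LaunchConfigurationName": name,
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "SecurityGroups": list(security_group_ids),
        "InstanceMonitoring": {"Enabled": detailed_monitoring},
    }


def create_launch_configuration(params):
    """Send the request to AWS (needs boto3 and configured credentials)."""
    import boto3  # third-party library, imported only when actually calling AWS
    client = boto3.client("autoscaling")
    return client.create_launch_configuration(**params)


# Placeholder values; replace them with the IDs from your own account.
params = launch_configuration_params(
    name="webapp-launch-config",
    ami_id="ami-0123456789abcdef0",
    instance_type="t2.micro",
    key_name="webapp-key",
    security_group_ids=["sg-0123456789abcdef0"],
)
print(params)
```

Keeping the parameters in a plain dictionary makes them easy to review before anything is sent to AWS.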

  • Setup: Auto Scaling Group:

  1. From the EC2 console, click Auto Scaling Group, which is below the launch configuration. Then click Create Auto Scaling group.

  2. On the Auto Scaling group page, you can create the group from either a launch configuration or a launch template. Here I have used a launch configuration. You can also create a new launch configuration from this page. Since you have already created one, choose “Use an existing launch configuration”.

3. After clicking Next Step, configure the group name, the initial group size, and the VPC and subnets. You can also attach a load balancer to the Auto Scaling group under Advanced Details.

After that, click Next to configure scaling policies.

4. On the scaling policy page, specify the minimum and maximum number of instances in the group. Here you can use a target tracking policy. For the metric type you can choose, for example, CPU utilisation or network in or out, and then set the target value. The scaling policy adds or removes instances to keep the metric near that target. You can also disable scale-in from here.

You can also use step and simple scaling policies.

These work based on an alarm, so first create the alarm by clicking ‘add new alarm’.

Here the alarm is triggered when CPU utilisation rises above 65%. If CPU utilisation crosses 65%, Auto Scaling launches new instances according to the step action.

You can specify more step actions based on your load, but in a simple policy you can’t define different actions for different levels of CPU utilisation. You also need to configure scale-in policies so that instances are terminated once traffic becomes low, which reduces your bill.

5. Next, click ‘Next: Configure Notification’ to receive email notifications for events such as launch, terminate, and fail. Then enter the tags and click ‘Create auto scaling group’.
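For reference, the group and a target tracking policy from the steps above can be expressed as request parameters for the boto3 autoscaling client. This is a sketch under assumptions: the helper names, group name, and subnet IDs are placeholders, the 65% target simply reuses the CPU figure from the alarm example, and the code builds the dictionaries without calling AWS.

```python
def auto_scaling_group_params(name, launch_configuration_name,
                              min_size, max_size, desired, subnet_ids):
    """Build keyword arguments for the create_auto_scaling_group call."""
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max")
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_configuration_name,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        "VPCZoneIdentifier": ",".join(subnet_ids),  # comma-separated subnet IDs
    }


def cpu_target_tracking_params(group_name, target_cpu_percent,
                               disable_scale_in=False):
    """Build keyword arguments for the put_scaling_policy call."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
            "DisableScaleIn": disable_scale_in,
        },
    }


# Placeholder names and subnet IDs; replace them with your own.
group = auto_scaling_group_params("webapp-asg", "webapp-launch-config",
                                  min_size=2, max_size=6, desired=2,
                                  subnet_ids=["subnet-aaa", "subnet-bbb"])
policy = cpu_target_tracking_params("webapp-asg", 65)
# With boto3 installed and credentials configured, these would be sent as:
#   client = boto3.client("autoscaling")
#   client.create_auto_scaling_group(**group)
#   client.put_scaling_policy(**policy)
print(group)
print(policy)
```

A target tracking policy handles both scale-out and scale-in on its own, so no separate alarm or scale-in policy is needed unless scale-in is disabled.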
