What is autoscaling?

Autoscaling is the ability to automatically grow or shrink the compute resources allocated to your application so that capacity matches its needs at any given time. Scaling decisions are based on metrics that represent the application workload. Scaling automatically is easier in the cloud than in an on-premises deployment.
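As a rough illustration of a metric-driven scaling decision, the sketch below computes a new instance count from one workload metric. It assumes a simple proportional rule with lower and upper bounds. The function name, the parameters, and the default bounds are hypothetical and do not describe any particular provider's algorithm.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int = 1, max_size: int = 10) -> int:
    """Return an instance count meant to bring the average metric near target.

    current  -- number of instances running now (assumed >= 1)
    metric   -- observed average per-instance metric, e.g. CPU percent
    target   -- value the metric should settle around (assumed > 0)
    min_size, max_size -- hypothetical bounds on the group size
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target <= 0:
        raise ValueError("target must be positive")
    # Proportional rule: scale the group by the ratio of observed to target load.
    proposed = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum group size.
    return max(min_size, min(max_size, proposed))
```

For example, four instances averaging 90% CPU against a 60% target would grow to six, while the same four instances averaging 30% would shrink to two.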


Benefits of autoscaling

  • Fault tolerance. Autoscaling can detect when an instance is unhealthy, terminate it, and launch a new instance to replace it. You can also often configure autoscaling to use multiple availability zones. If one availability zone becomes unavailable, autoscaling can launch instances in another one to compensate.
  • Availability. Autoscaling helps ensure that an application always has the right amount of capacity to handle the current traffic demand. You can take advantage of the safety and reliability of geographic redundancy by spanning autoscaling groups across multiple availability zones within a region. When one availability zone becomes unhealthy or unavailable, autoscaling launches new instances in an unaffected availability zone. When the unhealthy availability zone returns to a healthy state, autoscaling automatically redistributes the application instances evenly across all of the designated availability zones.
  • Cost management. Autoscaling can dynamically increase and decrease capacity as needed. Because you pay for the compute instances you use, you save money by launching instances when they are needed and terminating them when they aren’t. This enables you to meet workload demands without keeping (and paying for) underutilized capacity.
  • Team focus. IT teams no longer have to scale the environment up and down manually. Automating that process lets technical staff focus on other business priorities.
  • Consistency. You can more easily offer an optimal user experience at all times regardless of the volume of traffic or amount of resources used.
  • Variable usage management. Autoscaling helps you keep up with increased resource demands, such as when you launch promotions that bring in a flood of users. When CPU load and bandwidth volume vary significantly for a web application, you need enough resources to meet peak usage, but you don’t have to pay for them when they aren’t needed. For example, a site might have steady resource usage on weekdays but see traffic spikes over the weekend.
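The availability zone behavior described in the fault tolerance and availability items can be sketched as a placement function: instances are spread evenly across the zones that are currently healthy, and they spread back out once a zone recovers. This is a minimal sketch under assumed names. The function, the zone labels, and the health map are hypothetical, and real autoscaling services apply their own rebalancing rules.

```python
from __future__ import annotations


def spread_across_zones(total: int, zone_health: dict[str, bool]) -> dict[str, int]:
    """Return how many instances to run in each zone.

    total       -- number of instances the group should run
    zone_health -- hypothetical map of zone name to True (healthy) or False
    """
    healthy = sorted(zone for zone, ok in zone_health.items() if ok)
    if not healthy:
        raise RuntimeError("no healthy availability zone to place instances in")
    # Split as evenly as possible; the first `extra` zones get one more instance.
    base, extra = divmod(total, len(healthy))
    placement = {zone: 0 for zone in zone_health}  # unhealthy zones stay at 0
    for index, zone in enumerate(healthy):
        placement[zone] = base + (1 if index < extra else 0)
    return placement


if __name__ == "__main__":
    zones = {"zone-a": True, "zone-b": True, "zone-c": True}
    print(spread_across_zones(6, zones))   # all zones healthy
    zones["zone-b"] = False
    print(spread_across_zones(6, zones))   # zone-b out, others compensate
    zones["zone-b"] = True
    print(spread_across_zones(6, zones))   # zone-b back, even spread again
```

Calling the function again after a zone's health changes models both the compensation step and the later redistribution.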


Suggested Reading and Related Topics