6 Common Mistakes in AWS EC2 and Azure Cloud VM Optimization

By Bob Farzami, Vice President, Cloud Strategy


No matter what’s driving your move to an AWS or Azure cloud, two things are true. One, you don’t want to under-provision, which could create performance and availability issues. And two, you don’t want to overpay, because no one ever wants to do that. One of the key decisions you must make is which Amazon EC2 or Microsoft Azure virtual machine instance configuration you need. It’s a scoping exercise, but several factors make this easier said than done. There are six common barriers that can prevent companies from accurately scoping their cloud VM requirements, forcing them to rely on guesswork and increasing the chances they’ll under-provision or overpay. 

1. The choice is overwhelming. 

Having options is good, and so is configurability, because both let you tailor your deployment to your needs. But when the total possible permutations number in the hundreds of thousands, you have too much choice. This can be paralyzing, especially as cloud providers introduce new types, sizes, and generations every couple of weeks. You need to be able to automate the analysis of all those combinations so you can easily find the ones that best suit your needs.

2. A CPU-centric focus provides an incomplete picture. 

There is guidance available to help you navigate the range of options, but it often focuses primarily on CPU utilization. That’s certainly important, but it’s by no means the only factor you must consider. If you don’t take other critical computing dimensions—such as memory usage, IOPS, and network bandwidth—into account, you may create a bottleneck in capacity, causing a slowdown in your application performance. 
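To illustrate why a CPU-only view can mislead, here is a minimal Python sketch that compares a workload's demand against a candidate configuration across all four dimensions and reports the tightest one. The `Shape` class, the field names, and every number are hypothetical, chosen only for illustration; they do not describe any real EC2 or Azure instance type.

```python
from dataclasses import dataclass

# Dimensions compared for every candidate (illustrative field names).
DIMENSIONS = ("vcpus", "memory_gib", "iops", "network_mbps")


@dataclass(frozen=True)
class Shape:
    """Either a workload's demand or a candidate VM's capacity."""
    name: str
    vcpus: float
    memory_gib: float
    iops: float
    network_mbps: float


def utilization(demand: Shape, candidate: Shape) -> dict[str, float]:
    """Return demand divided by capacity for each dimension."""
    return {d: getattr(demand, d) / getattr(candidate, d) for d in DIMENSIONS}


def bottleneck(demand: Shape, candidate: Shape) -> tuple[str, float]:
    """Return the dimension with the highest utilization ratio."""
    ratios = utilization(demand, candidate)
    tightest = max(ratios, key=ratios.get)
    return tightest, ratios[tightest]


# Hypothetical workload and candidate, for illustration only.
workload = Shape("workload", vcpus=2, memory_gib=12, iops=3000, network_mbps=400)
candidate = Shape("hypothetical-a", vcpus=4, memory_gib=16, iops=2500, network_mbps=1000)

dimension, ratio = bottleneck(workload, candidate)
print(f"Tightest dimension: {dimension} at {ratio:.0%} of capacity")
```

In this made-up case, CPU sits at 50 percent, so the candidate looks comfortable on CPU alone, yet the workload asks for more IOPS than the candidate provides.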

3. Average and max data are misleading. 

There are several approaches to sizing your VM. One is to use averages. This might be sufficient if there’s very little variability in your workloads, but that’s often not the case. The more variable demand is, the less representative the average is of your environment, and the greater the risk of under-provisioning your VM. Another approach is to use max, or peak, data. This is the “safest” approach from a capacity planning perspective, but if your peak is very short-lived (for example, a few minutes at midnight during the week due to backups), you could end up over-provisioning. You need to consider other statistical aggregation methods, such as the 95th percentile, to be safe without creating waste.
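The difference between these aggregation methods is easy to see on a small data set. The following Python sketch computes the average, the 95th percentile, and the peak for a made-up series of 100 CPU utilization samples with a brief backup spike. The sample values are invented, and the nearest-rank percentile used here is only one of several common percentile definitions.

```python
import math
from statistics import mean


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical CPU utilization (%): mostly quiet, some busy periods,
# and two samples taken during a short backup window.
samples = [20.0] * 90 + [45.0] * 8 + [98.0] * 2

average = mean(samples)
p95 = percentile(samples, 95)
peak = max(samples)

print(f"average={average:.2f}%  p95={p95:.0f}%  peak={peak:.0f}%")
```

With these numbers the average sits near 24 percent, the peak at 98 percent, and the 95th percentile at 45 percent: sizing to the average ignores the busy periods, while sizing to the peak pays for two short-lived samples.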

4. Cost reduction must be weighed against risk.

Because there are so many options available at different price points, it can sometimes seem as if the final decision comes down to cost. This might be true if we lived in a world of certainty and predictability, but, of course, we don’t. Instead, you need to evaluate cost and risk. If one option is cheaper by several hundred dollars but has, at a certain threshold, a much higher risk of performance bottlenecks or even downtime that would cost the organization thousands, do you still go with the lower-cost choice? Maybe. It depends on your company’s individual risk profile. The point is that you should make the decision deliberately, by quantifying the risk and comparing it with your risk tolerance.
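One simple way to put cost and risk on the same scale is an expected-cost comparison: add the price of each option to the probability of an incident multiplied by what that incident would cost. The Python sketch below does this for two options. It is a deliberately basic model under stated assumptions, not a description of how any particular tool calculates risk, and all prices and probabilities are invented.

```python
def expected_annual_cost(price: float,
                         incident_probability: float,
                         incident_cost: float) -> float:
    """Price plus the probability-weighted cost of one incident per year."""
    if not 0.0 <= incident_probability <= 1.0:
        raise ValueError("incident_probability must be between 0 and 1")
    return price + incident_probability * incident_cost


# Hypothetical options: (annual price, chance of a costly incident).
INCIDENT_COST = 8000.0  # assumed cost of one outage or severe slowdown
options = {
    "cheaper": (1200.0, 0.10),
    "safer": (1600.0, 0.01),
}

expected = {
    name: expected_annual_cost(price, probability, INCIDENT_COST)
    for name, (price, probability) in options.items()
}
best = min(expected, key=expected.get)

for name, cost in expected.items():
    print(f"{name}: expected annual cost {cost:.2f}")
print(f"Lowest expected cost: {best}")
```

Here the option that is 400 cheaper on paper ends up more expensive once its higher incident probability is priced in. A company with a different risk profile would plug in different probabilities and incident costs, and could reasonably reach the opposite conclusion.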

5. You need to plan for the future. 

Your company wants to grow and evolve (every business does), and your infrastructure must keep up, so sizing cannot be based solely on historic usage. It’s not easy to plan ahead. This is especially true in a climate of fast, transformational change, or if you are provisioning an environment ahead of your application launch, which may be part of the reason you migrated to the cloud in the first place. You need to understand how much headroom each option provides, based on your expected workloads and risk tolerance, which requires the capability to conduct what-if analysis.
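A basic what-if analysis can be sketched in a few lines of Python: project demand forward at an assumed growth rate and count the months until it crosses a utilization threshold you are comfortable with. The function name, the compound monthly growth model, the 80 percent threshold, and the figures are all assumptions made for illustration.

```python
from typing import Optional


def months_of_headroom(current_demand: float,
                       capacity: float,
                       monthly_growth: float,
                       max_utilization: float = 0.8,
                       horizon: int = 60) -> Optional[int]:
    """Return the first month in which projected demand exceeds
    capacity * max_utilization, or None if it never does within horizon."""
    limit = capacity * max_utilization
    demand = current_demand
    for month in range(horizon + 1):
        if demand > limit:
            return month
        demand *= 1 + monthly_growth  # compound growth, an assumed model
    return None


# Hypothetical scenario: demand of 4 vCPUs today, growing 5% per month.
for capacity in (8, 16):
    months = months_of_headroom(4.0, capacity, monthly_growth=0.05)
    print(f"{capacity} vCPUs: threshold crossed in month {months}")
```

Running the same projection against each candidate size, and with more than one growth assumption, shows how much time each option buys before it needs to be revisited.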

6. You need to account for existing commitments and requirements.

There may be certain conditions you want to take into account. For example, if you’ve already pre-paid or otherwise committed to a particular EC2 or VM type (Reserved Instance), you need to take full advantage of it for the length of the term. Or there may be certain types of VMs you know won’t work or otherwise don’t want to use: for example, you may need to avoid burstable CPU credits or want to stick exclusively with Intel chips (vs. AMD). You may also have varying requirements across different applications, regions, etc., where some workloads require different thresholds than others. You need to factor all of these considerations into your analysis.
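Constraints like these can be expressed as filters and ordering rules applied to the list of candidates before any cost comparison. The Python sketch below shows one possible shape for that step. The `Candidate` class, the candidate names, and the prices are hypothetical and do not correspond to real instance types or pricing.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    cpu_vendor: str      # e.g. "intel" or "amd"
    burstable: bool      # True if performance depends on CPU credits
    monthly_price: float


def apply_constraints(candidates: Iterable[Candidate],
                      allowed_vendors: Optional[set[str]] = None,
                      allow_burstable: bool = True,
                      reserved: frozenset[str] = frozenset()) -> list[Candidate]:
    """Drop excluded candidates, then list already-reserved types first
    and the remainder by ascending price."""
    kept = [
        c for c in candidates
        if (allow_burstable or not c.burstable)
        and (allowed_vendors is None or c.cpu_vendor in allowed_vendors)
    ]
    return sorted(kept, key=lambda c: (c.name not in reserved, c.monthly_price))


# Hypothetical catalog, for illustration only.
catalog = [
    Candidate("burst-small", "intel", True, 30.0),
    Candidate("general-amd", "amd", False, 55.0),
    Candidate("general-intel", "intel", False, 60.0),
    Candidate("compute-intel", "intel", False, 70.0),
]

shortlist = apply_constraints(
    catalog,
    allowed_vendors={"intel"},
    allow_burstable=False,
    reserved=frozenset({"compute-intel"}),
)
print([c.name for c in shortlist])
```

Different applications or regions could each call the same function with their own constraint values, which keeps per-workload thresholds explicit rather than buried in one global setting.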

AWS and Azure VM Optimization With CloudWisdom

Virtana’s CloudWisdom cloud management platform includes an automated recommendation tool to help companies right-size their AWS and Azure VMs. It combines granular measurement and aggregation of resource utilization with knowledge of the thousands of ever-expanding configurations for AWS and Azure resources to identify opportunities to optimize. It looks at all the critical dimensions, including CPU, memory, IOPS, and network bandwidth, based on usage over time. It allows you to set constraints, including or excluding factors to reflect your specific requirements, so you can find the ideal resource settings before making long-term reservation commitments. And it allows you to perform what-if analyses to help you scale resources to match growth. Contact us to learn more.