Cloud infrastructures have introduced increasing levels of complexity—you have to manage workloads across on-premises, private, and multiple public cloud environments. This requires you to migrate efficiently, optimize effectively, and stay rightsized on an ongoing basis, all while meeting evolving business requirements. With so many moving parts, it can be a massive challenge with lots of pitfalls that can cost you time and money and even put your business results in jeopardy. But there are ways you can set your cloud team up to minimize the headaches and maximize the business value of your hybrid cloud infrastructure.

The Different Dimensions of Cloud Infrastructures

Sometimes it can seem as if talking about cloud infrastructures is as complicated as managing them. There are a number of concepts that sound similar, but actually mean different things. The first step is to understand these different terms. 

Hybrid-Cloud Infrastructures

A hybrid-cloud infrastructure is one that uses some combination of public cloud, private cloud, and on-premises infrastructure. It’s rare nowadays for an organization to be 100% on premises, but it’s also rare to be 100% in the cloud. There are many reasons why you’d want to run some workloads on premises and some in the public cloud (five common scenarios are detailed here), and the reality is that the vast majority of enterprises are running a hybrid-cloud infrastructure.

Multi-Cloud Infrastructures

A multi-cloud infrastructure is one that uses multiple public cloud service providers (CSPs). This is different from hybrid-cloud, and it’s easy to get the two confused. You could have an infrastructure that’s hybrid-cloud but not multi-cloud (for example, you have workloads running on premises and on one CSP) or you could be multi-cloud but not hybrid-cloud (you’re running a 100% public-cloud deployment with multiple CSPs). Most organizations are both hybrid- and multi-cloud.

Cross-Cloud Capabilities

Cross-cloud capabilities are needed in a hybrid-/multi-cloud infrastructure to enable secure data sharing across cloud providers and regions. Where hybrid-cloud and multi-cloud describe where you have workloads running, cross-cloud is about enabling management and visibility across those clouds rather than treating each one as an independent silo.

Cloud Infrastructure Challenges

While there are many benefits to the cloud—agility, scalability, business continuity, efficiency, capex savings, and many more—there are also a number of challenges that come with a cloud infrastructure in any hybrid and/or multi permutation. 

Cloud migration is complex

Most enterprises have hundreds or thousands (or even hundreds of thousands) of existing workloads. The evolution to a hybrid-cloud infrastructure requires them to migrate some portion of those workloads from on premises to a public or private cloud. Doing this efficiently, in a way that minimizes the risk of failure during migration and optimizes both performance and cost in the cloud afterward, requires careful planning. (Check out our 8-part guide to successfully migrating workloads to the cloud.)

Managing the costs of your cloud infrastructure is difficult

There are lots of pricing and capacity variables that make it hard to understand the factors driving your cloud costs. Even if your CSP provides you with capabilities to analyze your spend, your ability to explore various dimensions in granular detail and compare spend over time is typically limited. Additionally, there may be large line items in your bill that you can’t parse out. For example, AWS has an “EC2 Other” cost bucket that contains multiple service-related usage types. EC2 Other can show up on your bill as one of the biggest dollar amounts, but the AWS native tools don’t let you analyze those costs in detail. And in a multi-cloud environment, you’ve got different bill analysis capabilities across your different CSPs, giving you an inconsistent view into your cloud costs. 

Cloud rightsizing is an ongoing challenge

Matching capacity with workloads for the best performance at the lowest cost and risk is difficult enough pre-migration given the thousands of configuration options for cloud resources. But your cloud infrastructure isn’t a constant environment. As your business evolves, so will your workloads and the conditions they operate in. At the same time, CSPs are continually offering new options. All of this makes rightsizing a moving target for your cloud infrastructure.

Views into cloud data don’t align with business needs

IT and cloud infrastructures exist to serve the business. And yet, in a recent survey, 66% of cloud decision makers said it’s hard to understand whether they’re delivering the service levels their business needs to succeed, and 65% reported difficulty identifying the overall business impact in the event of an issue. That’s because many of the CSP tools are built with an IT and technology focus and often don’t allow you to view your cloud data from a business perspective.

Cloud waste is rampant

By just about every account, about one-third of all cloud spending is wasted. If you’re not rightsized, as discussed above, you’re spending more money than you need to, but that’s not the only culprit. Do you know if you’re paying for unused compute instances that could be terminated? Or if you’re being billed for storage blocks that are attached to a stopped instance or no longer attached to a compute instance? Do you have unattached load balancers or idle elastic IP addresses? It can be hard to find these expensive needles in your cloud infrastructure haystack. 

Common Cloud Infrastructure Pitfalls That Are Completely Avoidable

Given all of these challenges, there are a number of common problems that frequently trip up enterprises. Here are seven common cloud infrastructure pitfalls that companies face and the fallout that results. (In the next section, we’ll discuss the capabilities you need to build so you can avoid these setbacks.)

You break dependencies during migration

Enterprises have hundreds if not thousands of applications, infrastructure components, and service sets, many of which are shared, creating a complex web of interconnected dependencies. You will need to migrate some portion of these to the cloud, and that process will likely be performed in phases. Without smart planning and careful prioritization, you could end up severing connections between dependent components, processes, or services, disrupting business operations. 

Unforeseen problems force you to repatriate migrated workloads

Cloud and on-premises are fundamentally different environments. Too often, enterprises that migrate workloads from on-site data centers to one or more public clouds are blindsided by unexpected issues severe enough to force them to hit the “undo” button. In fact, our own research found that 72% of organizations have had to repatriate migrated workloads for a variety of reasons, including unexpected cloud costs, performance degradation, technical issues associated with public cloud provisioning, and the discovery that the applications should never have been moved to a public cloud in the first place.

You don’t see expected gains post-migration

While there are a variety of ways to get workloads into the cloud, most enterprises tend to focus on approaches that minimize the changes needed to their applications. Rather than rebuild their applications from scratch to be cloud-native or replace them entirely with a new cloud-native application, they’re more likely to either replatform with minimal upgrades to take advantage of cloud benefits or refactor parts of their applications to better support the cloud environment. This means that the way those applications operate on-premises—their existing workload characteristics—will affect how they run in the cloud. What can happen is that undiagnosed application inefficiencies and issues that are relatively inconsequential in the data center end up creating real performance and/or cost penalties in the public cloud. 

You make ill-informed reservation commitments

Reserved instances enable you to get some good discounts on the resources you use by committing to a particular plan. However, if you lock yourself into buying capacity that you won’t use, those savings may not materialize. And while you might have some flexibility to modify or exchange a reserved instance, you typically can’t cancel it. 
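To make the trade-off concrete, here is a minimal sketch of the break-even arithmetic behind a reservation. It assumes a simplified model in which on-demand capacity is billed only for the hours an instance runs, while a reservation is billed for every hour of the term. The prices, the function names, and the one-year term are illustrative assumptions, not figures from any CSP price list.

```python
HOURS_PER_YEAR = 8760  # assumed one-year term


def break_even_utilization(on_demand_hourly, reserved_hourly):
    """Fraction of the term an instance must run before the reservation pays off."""
    return reserved_hourly / on_demand_hourly


def reservation_savings(on_demand_hourly, reserved_hourly, utilization,
                        hours=HOURS_PER_YEAR):
    """Savings versus on-demand; a negative value means the commitment cost money."""
    on_demand_cost = on_demand_hourly * hours * utilization  # billed only while running
    reserved_cost = reserved_hourly * hours                  # billed for the whole term
    return on_demand_cost - reserved_cost


# Hypothetical hourly prices, chosen only for illustration.
ON_DEMAND = 0.10
RESERVED = 0.06

print(f"break-even utilization: {break_even_utilization(ON_DEMAND, RESERVED):.0%}")
for utilization in (0.4, 0.6, 0.9):
    savings = reservation_savings(ON_DEMAND, RESERVED, utilization)
    print(f"utilization {utilization:.0%}: savings {savings:.2f}")
```

Under these assumptions, a workload that runs less than the break-even share of the term would have been cheaper on demand, which is the situation the pitfall above describes.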

Your cloud costs are higher than expected

The concept of “pay for what you use” in the public cloud can create a false sense of budgetary security. While it’s true that you don’t have to overprovision now to support future growth the way you do in an on-premises environment (which is what we usually mean by the phrase), it’s also true that if you’re not paying attention, or not making the optimal choices, you could be on the hook for skyrocketing cloud bills.

You can’t easily get a global view of cloud costs

When your infrastructure spans on-premises, private, and multiple public cloud environments, it can be difficult to get a single view of your entire estate and its associated costs. In fact, around two-thirds of organizations can’t easily see and manage costs across all their public clouds and an even higher number have to wait hours, days, or even longer to get an up-to-date global view of cloud costs. If you can’t easily see your cloud costs, you’re not in a good position to control those costs.

You can’t adequately support the business 

The limited visibility that prevents enterprises from getting a global view of cloud costs also hinders their ability to maximize value for the business. Too many organizations report that they are struggling to understand whether they’re delivering the service levels their business needs to succeed, and that they have a hard time identifying the overall business impact in the event of an issue. 

Here’s How To Prevent Cloud Infrastructure Headaches

The good news is that with the right approach, information, and tools, you can avoid these common cloud infrastructure pitfalls and the headaches that come with them. Here are nine things you need to reduce the stress and effort required to manage your cloud infrastructure effectively. 

1. Take a workload-centric approach

When you’re running a hybrid cloud infrastructure, you’ve got a lot of different environments to manage. But it’s your workloads that are running your business. You make the decision of whether to run them on premises or in a private or public cloud based on how that better supports the business. That’s why you need to take a workload-centric approach to migrating, managing, and then optimizing capacity, costs, and performance of your entire estate in a unified way regardless of location. 

2. Understand workload characteristics before migrating to the cloud

If you want to ensure that your workloads perform as expected after you migrate them to the cloud, you need to first understand their characteristics in their current environment. To do this, you need to perform a baseline assessment to get detailed information about the health, utilization, and performance characteristics—including seasonality—of your workloads in the on-premises infrastructure. This provides you with a reference point for comparing utilization and performance in the cloud to ensure your post-migration deployment delivers on performance and other KPIs to support the enterprise’s objectives.
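As a rough illustration of what a baseline comparison can look like, the sketch below summarizes a set of utilization or latency samples into a mean and a 95th percentile, then flags a regression when the post-migration 95th percentile exceeds the baseline by more than a tolerance. The metric, the 10% tolerance, the sample data, and the function names are all assumptions made for the example. A real baseline would need to cover a full seasonal cycle.

```python
import statistics


def summarize(samples):
    """Return the mean and nearest-rank 95th percentile of a list of samples."""
    ordered = sorted(samples)
    rank = -(-95 * len(ordered) // 100)  # ceil(0.95 * n) using integer math
    return {"mean": statistics.fmean(ordered), "p95": ordered[rank - 1]}


def regressed(baseline, current, tolerance=0.10):
    """True when the current p95 is worse than baseline by more than the tolerance."""
    return current["p95"] > baseline["p95"] * (1 + tolerance)


# Hypothetical latency samples in milliseconds.
baseline_latency_ms = list(range(1, 21))
cloud_latency_ms = [value * 1.5 for value in baseline_latency_ms]

baseline = summarize(baseline_latency_ms)
cloud = summarize(cloud_latency_ms)
print("baseline:", baseline)
print("cloud:", cloud)
print("regressed:", regressed(baseline, cloud))
```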

3. Prioritize workloads to be migrated 

With so many interdependencies among enterprise workloads, you need to be able to figure out what to migrate and in what order, and how to do it as efficiently as possible. Often, prioritization decisions are made based on business need, and this is important. But if you can understand the utilization attributes of your workloads and combine like types from that perspective to create move groups (workloads that should be targeted to move together), then you can build an intelligent, prioritized plan to ensure a smooth migration.
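One simple way to draft move groups is to treat workloads as nodes in a graph and keep everything that is connected, directly or indirectly, in the same group. The sketch below does that with a plain connected-components search. It is a starting point under stated assumptions, not a full planning method: it looks only at dependency pairs, ignores the utilization attributes discussed above, and uses made-up workload names.

```python
from collections import defaultdict


def move_groups(dependencies):
    """Group workloads that are linked, directly or indirectly, by a dependency."""
    graph = defaultdict(set)
    for source, target in dependencies:
        graph[source].add(target)
        graph[target].add(source)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk that collects every workload reachable from `start`.
        stack = [start]
        seen.add(start)
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(sorted(group))
    return groups


# Hypothetical dependency pairs (caller, callee).
deps = [("web", "api"), ("api", "orders-db"), ("reporting", "warehouse")]
groups = move_groups(deps)
for number, group in enumerate(groups, start=1):
    print(f"move group {number}: {group}")
```

Groups produced this way can then be ordered by business priority, so that no phase of the migration separates components that depend on each other.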

4. Make more informed workload placement decisions

A hybrid-cloud infrastructure gives you lots of options when it comes to workload placement. This gives you more opportunity for workload optimization—ensuring that each workload runs in an environment for the best possible performance at the lowest possible cost within your specific risk tolerance. But to maximize that opportunity, you need to be able to measure the size of your workloads, appraise the objective for each workload, and then evaluate your best-fit-for-purpose options—and make these decisions on a workload-by-workload basis.  

5. Identify unused compute instances and unattached or abandoned storage

It’s easy to spin things up and down in the cloud, which is great for agility. But elements, such as storage, can be forgotten in the down part of that process. When this happens, you end up paying a lot of money for resources that you’re not using. You need to be able to identify idle storage/disk resources or elastic IP addresses, storage blocks that are no longer attached to a compute instance or are attached to a stopped instance, and unattached load balancers. 
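The checks listed above can be sketched as a single pass over a resource inventory. The example assumes you have already exported that inventory into simple Python structures. The field names (`attached_to`, `associated_with`, `targets`) and resource IDs are hypothetical and do not correspond to any specific CSP API.

```python
def find_waste(instances, volumes, addresses, load_balancers):
    """Return (resource_id, reason) pairs for resources that look abandoned."""
    findings = []
    for volume in volumes:
        owner = volume["attached_to"]
        if owner is None:
            findings.append((volume["id"], "unattached volume"))
        elif instances.get(owner) == "stopped":
            findings.append((volume["id"], "volume attached to stopped instance"))
    for address in addresses:
        if address["associated_with"] is None:
            findings.append((address["id"], "idle IP address"))
    for balancer in load_balancers:
        if not balancer["targets"]:
            findings.append((balancer["id"], "load balancer with no targets"))
    return findings


# Hypothetical inventory snapshot.
instances = {"i-1": "running", "i-2": "stopped"}
volumes = [
    {"id": "vol-1", "attached_to": None},
    {"id": "vol-2", "attached_to": "i-1"},
    {"id": "vol-3", "attached_to": "i-2"},
]
addresses = [{"id": "eip-1", "associated_with": None}]
load_balancers = [{"id": "lb-1", "targets": []}]

waste = find_waste(instances, volumes, addresses, load_balancers)
for resource_id, reason in waste:
    print(resource_id, "->", reason)
```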

6. Get rightsizing recommendations

CSPs offer thousands, if not hundreds of thousands, of configuration options, with new ones constantly emerging. This means you have a lot of choices that may look similar at first glance, but could end up having varying impacts over the long term. You want to do smart “comparison shopping” to make the best selections for your workloads but this can be difficult. You need to be able to winnow the options with recommendations based on your usage, budgets, and risk so you can focus on a short list that makes sense for your needs. 

7. Tune sizing with “what if” analysis

The recommended short list is just a starting point. You then need to be able to tune sizing based on your organization’s risk tolerance. To do this, you need to be able to perform “what if” analysis that includes CPU, memory, I/O, and ingress and egress charges.
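Here is a minimal sketch of how a short list and a “what if” pass might fit together. It filters a catalog of instance options down to those that cover peak demand plus a headroom margin, sorts them by price, and then repeats the selection for several headroom values to show how risk tolerance changes the answer. The catalog, prices, and headroom values are invented for illustration, and the sketch covers only CPU and memory. I/O and ingress and egress charges would need to be added for a realistic comparison.

```python
def shortlist(options, peak_vcpus, peak_mem_gib, headroom):
    """Options that cover peak demand plus headroom, cheapest first."""
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_mem_gib * (1 + headroom)
    fits = [
        option for option in options
        if option["vcpus"] >= need_cpu and option["mem_gib"] >= need_mem
    ]
    return sorted(fits, key=lambda option: option["hourly"])


def what_if(options, peak_vcpus, peak_mem_gib, headrooms):
    """Cheapest fitting option name for each headroom value (None if nothing fits)."""
    result = {}
    for headroom in headrooms:
        candidates = shortlist(options, peak_vcpus, peak_mem_gib, headroom)
        result[headroom] = candidates[0]["name"] if candidates else None
    return result


# Hypothetical catalog; names and prices are not real CSP offerings.
catalog = [
    {"name": "small", "vcpus": 2, "mem_gib": 4, "hourly": 0.05},
    {"name": "medium", "vcpus": 4, "mem_gib": 8, "hourly": 0.10},
    {"name": "large", "vcpus": 8, "mem_gib": 16, "hourly": 0.20},
]

scenarios = what_if(catalog, peak_vcpus=3, peak_mem_gib=6, headrooms=(0.0, 0.2, 0.5))
for headroom, name in scenarios.items():
    print(f"headroom {headroom:.0%}: {name}")
```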

8. Optimize reservation discounts

To maximize long-term savings and avoid costly reservation mistakes, you need to ensure your cloud resource planning meets SLAs and stays on budget while accounting for both peak and non-peak usage. You want to be able to conduct “what if” analysis here, too, for potential savings in varied scenarios. And you need to be able to track the amortized value of your reservation usage at the instance level. 
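To show what tracking amortized value at the instance level can mean in practice, the sketch below spreads a reservation’s upfront fee evenly over its term, adds the recurring hourly charge, and then allocates that amortized rate to the instances that actually consumed reserved hours in a billing period. Hours that no instance used are reported separately as unused commitment. This even-spread model, the prices, and the instance IDs are assumptions for the example. Your CSP’s billing data may amortize differently.

```python
def amortized_hourly(upfront, recurring_hourly, term_hours):
    """Upfront fee spread evenly over the term, plus the recurring hourly charge."""
    return upfront / term_hours + recurring_hourly


def allocate(rate, covered_hours_by_instance, period_hours):
    """Split a period's amortized cost into per-instance usage and unused commitment."""
    used = {
        instance: hours * rate
        for instance, hours in covered_hours_by_instance.items()
    }
    unused_hours = max(period_hours - sum(covered_hours_by_instance.values()), 0)
    return used, unused_hours * rate


# Hypothetical one-year reservation and one 720-hour billing period.
rate = amortized_hourly(upfront=876.0, recurring_hourly=0.05, term_hours=8760)
used, unused = allocate(rate, {"i-a": 500, "i-b": 100}, period_hours=720)

print(f"amortized rate per hour: {rate:.4f}")
for instance, cost in used.items():
    print(f"{instance}: {cost:.2f}")
print(f"unused commitment: {unused:.2f}")
```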

9. Enable more sophisticated cost analysis

CSPs provide cost analysis capabilities, but these are often insufficient for providing the level of insight and control you need. To truly understand your cloud costs and make better decisions—for the business and for your budget—you need to be able to filter, group, and stack by cost allocation tags and to analyze them on more than one dimension. You also need to be able to unravel big “buckets” in your cloud bill that appear as a black-box spend so you can see what’s actually going on. This includes breaking down the “EC2 Other” bucket described above and the flexibility to view amortized, blended, and unblended costs. 
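As a small illustration of analyzing costs on more than one dimension, the sketch below totals billing line items by any combination of cost allocation tags. Items missing a tag fall into an explicit “untagged” bucket so that they stay visible. The line-item shape, tag keys, and amounts are hypothetical and are not the export format of any particular CSP.

```python
from collections import defaultdict


def group_costs(line_items, *tag_keys):
    """Total cost per combination of the given tag keys."""
    totals = defaultdict(float)
    for item in line_items:
        key = tuple(item["tags"].get(tag, "untagged") for tag in tag_keys)
        totals[key] += item["cost"]
    return dict(totals)


# Hypothetical billing line items.
line_items = [
    {"cost": 120.0, "tags": {"team": "payments", "env": "prod"}},
    {"cost": 30.0, "tags": {"team": "payments", "env": "dev"}},
    {"cost": 80.0, "tags": {"team": "search", "env": "prod"}},
    {"cost": 45.5, "tags": {}},
    {"cost": 20.0, "tags": {"team": "payments", "env": "prod"}},
]

by_team = group_costs(line_items, "team")
by_team_and_env = group_costs(line_items, "team", "env")

for key, total in sorted(by_team_and_env.items()):
    print(key, total)
```

Grouping by one tag answers “who is spending,” while adding a second tag shows where within that team the spend sits. The size of the untagged bucket is a quick check on tagging coverage.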

Virtana: Your Cloud Infrastructure Partner

With Virtana Platform, you can manage your entire IT infrastructure across on-premises, public cloud, and hybrid cloud deployments with precision observability—the combination of AIOps, ML, and data-driven analytics for efficient migration and ongoing performance, capacity, and cost optimization. Virtana Platform delivers all the capabilities you need to manage your cloud infrastructure effectively and efficiently—without the headaches.

Mark Heslop

Director of Product Marketing, Virtana
