Epic Electronic Health Record (EHR) Systems


As a healthcare CIO mandated with the transition to electronic health records, you’ve done your due diligence, evaluated various EHR vendors and most likely settled on Epic, a software vendor with over 25% of the US acute care hospital market. You’ve followed Epic’s guidelines for minimum hardware requirements and acquired what you believe is the right workstation, server, networking, and storage infrastructure for Epic. How do you make sure that you don’t suffer outages caused by the underlying shared infrastructure, comparable to the six-day outage reported at Boston Children’s Hospital in 2015?

Many factors unrelated to Epic itself can cause application performance issues, and even downtime.

If you are a large organization, Epic recommends a tiered database architecture based on InterSystems’ Enterprise Caché Protocol (ECP) technology. Caché is a commercialized version of MUMPS (Massachusetts General Hospital Utility Multi-Programming System), a database technology dating from the mid-1960s. A distinctive characteristic of Caché is its I/O access pattern: continuous, random database file reads, interrupted every 80 seconds by a large burst of writes. The shared SAN-attached storage array therefore needs enough free write cache to absorb the whole 80 seconds’ worth of writes and de-stage it to disk before the next write cycle from Caché arrives. If the array doesn’t have enough write cache available, it will hold off acknowledging write requests until its own cache frees up. If Epic Caché cannot finish its writes within 80 seconds, database access can exhibit latency. That is no fault of Epic; it is a consequence of the underlying shared hardware infrastructure.
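
To make the sizing argument concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a deliberately simplified model in which the whole burst lands in cache at once and is then drained at a steady rate. The function name, the burst size, the free cache figure, and the de-stage rate are all hypothetical inputs for illustration, not Epic or InterSystems guidance.

```python
WRITE_CYCLE_SECONDS = 80  # interval between Caché write bursts, as described above


def cache_can_absorb(burst_mib: float,
                     free_write_cache_mib: float,
                     destage_mib_per_s: float,
                     cycle_s: float = WRITE_CYCLE_SECONDS) -> bool:
    """Return True if the array can hold the burst and drain it before the next one.

    Simplified model (an assumption): the burst arrives instantly, and the
    array de-stages to disk at a constant rate with no competing workloads.
    """
    fits_in_cache = burst_mib <= free_write_cache_mib
    destage_seconds = burst_mib / destage_mib_per_s
    return fits_in_cache and destage_seconds < cycle_s


# Hypothetical numbers: a 4 GiB burst, 8 GiB of free write cache.
print(cache_can_absorb(4096, 8192, destage_mib_per_s=100))  # drains in about 41 s
print(cache_can_absorb(4096, 8192, destage_mib_per_s=40))   # needs about 102 s
```

On a shared array the free write cache and de-stage rate are not constants: other tenants consume both, which is exactly why a configuration that passes this check on paper can still fall behind in production.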

Why can’t you just rely on the performance monitoring tools provided by your storage vendor? The simple reason is that arrays usually trace performance in 60-second summaries. As you can imagine, a 60-second summary of a write burst that recurs every 80 seconds isn’t terribly useful, because the averaging smooths the burst away. Virtana Infrastructure Performance Management has wire-level visibility into Fibre Channel (or NAS protocol) traffic, capturing every single conversation at line rate, and offers second-by-second summaries, which can make a world of difference. In addition, it reports 99.9th and 99.99th percentile latencies by timing every single exchange on the wire and placing each result in histogram buckets that range from sub-millisecond upwards.
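
The histogram approach can be sketched in a few lines of Python. This is only an illustration of the general technique of bucketing latencies and reading percentiles off the cumulative counts; the bucket boundaries and function names are assumptions, and nothing here reflects how Virtana actually implements it.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds in milliseconds, from sub-millisecond upwards.
BUCKET_UPPER_MS = [0.5, 1, 2, 5, 10, 20, 50, 100, 500, float("inf")]


def build_histogram(latencies_ms):
    """Count each timed exchange in the first bucket whose upper bound covers it."""
    counts = [0] * len(BUCKET_UPPER_MS)
    for value in latencies_ms:
        counts[bisect_left(BUCKET_UPPER_MS, value)] += 1
    return counts


def percentile_bucket(counts, pct):
    """Return the upper bound (ms) of the bucket that contains the pct-th percentile."""
    total = sum(counts)
    if total == 0:
        raise ValueError("histogram is empty")
    target = pct / 100 * total
    running = 0
    for upper, count in zip(BUCKET_UPPER_MS, counts):
        running += count
        if running >= target:
            return upper
    return BUCKET_UPPER_MS[-1]


# Made-up sample: 10,000 exchanges, almost all fast, a handful very slow.
sample = [0.4] * 9980 + [30.0] * 18 + [150.0] * 2
hist = build_histogram(sample)
mean_ms = sum(sample) / len(sample)
print(mean_ms, percentile_bucket(hist, 99.9), percentile_bucket(hist, 99.99))
```

In this made-up sample the mean stays below half a millisecond, while the 99.9th percentile falls in the 20 to 50 ms bucket and the 99.99th in the 100 to 500 ms bucket. The tail is where clinicians feel the slowdown, and an average never shows it.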

You may be thinking: But I’ve followed all the guidelines sent to me by Epic… For instance, Epic might tell you “all average read latencies must be 12 msec or less for ECP configurations”. However, your datacenter monitoring expert might turn around and ask you “Noted, but at what granularity? Is it to be every 5 min, every 1 min or every 1 sec?” Guidelines from a software vendor are good, but the devil is in the details of how customers interpret them.
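
The granularity question is easy to demonstrate with a little arithmetic. The Python sketch below uses an entirely synthetic per-second latency series (2 ms normally, 60 ms for 10 seconds at the start of every 80-second cycle) and averages it over different windows. The series, the window sizes, and the simple unweighted average are assumptions chosen for illustration; only the 12 ms threshold comes from the example guideline quoted above.

```python
THRESHOLD_MS = 12.0  # the example guideline quoted above


def window_averages(samples, window):
    """Average consecutive, non-overlapping windows of `window` samples."""
    averages = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Synthetic series: 240 one-second samples with a 10 s spike every 80 s.
series = [60.0 if second % 80 < 10 else 2.0 for second in range(240)]

for window in (1, 60, 240):
    worst = max(window_averages(series, window))
    verdict = "within" if worst <= THRESHOLD_MS else "OVER"
    print(f"{window:>3}-second averages: worst {worst:.2f} ms, {verdict} {THRESHOLD_MS} ms")
```

With this series the worst 60-second average is about 11.7 ms, so the system "meets" the guideline at one-minute granularity, while the worst one-second average is 60 ms, five times the limit. Same data, opposite conclusions.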

You may well ask: Why can’t I just rely on Epic System Pulse to monitor Epic? The simple reason is that while System Pulse does a good job of collecting performance and health metrics across the Epic service, it has no visibility into shared SAN or shared network storage, one of the primary causes of Epic application slowdowns.

What else could go wrong? If your Clarity reporting application is running on a host connected via Fibre Channel Host Bus Adapters (HBAs) to a Brocade SAN switch that uses Virtual Channels, you might find that all the existing HBAs share the same virtual channel, resulting in significant congestion. With help from Virtana Infrastructure Performance Management, and by manually mapping FCIDs, you can determine which Virtual Channels are being used by specific device ports. This will help you decide whether ports should be reallocated and additional HBAs deployed so that they use other virtual channels. Alternatively, you might have incorrect queue depth settings on the HBAs in the hosts running Clarity, which can cause latency in the SAN fabric. Virtana Infrastructure Performance Management can help you detect this and recommend the right queue depth settings to use.
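
For the queue depth case, a rough fan-in check gives a feel for how a setting becomes "incorrect". The Python sketch below applies a widely used rule of thumb: the worst-case number of outstanding I/Os that hosts can send to one array port (per-LUN queue depth times LUN count, summed over hosts) should not exceed what that port can queue. The host names, queue depths, LUN counts, and the port limit of 2048 are all hypothetical, and this is not a Virtana, Brocade, or Epic recommendation; check your array vendor's documentation for the real limit.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    lun_queue_depth: int  # per-LUN queue depth configured on the HBA
    luns: int             # LUNs this host reaches through the array port


def worst_case_outstanding(hosts):
    """Maximum I/Os the hosts could have in flight at the array port at once."""
    return sum(h.lun_queue_depth * h.luns for h in hosts)


def max_safe_queue_depth(port_queue_limit, total_luns):
    """Largest uniform per-LUN queue depth that cannot overrun the port."""
    return port_queue_limit // total_luns


# Hypothetical Clarity hosts sharing one array port.
hosts = [Host("clarity-01", 64, 20), Host("clarity-02", 64, 20)]
PORT_QUEUE_LIMIT = 2048  # assumed value for illustration only

demand = worst_case_outstanding(hosts)
print(f"worst case {demand} vs port limit {PORT_QUEUE_LIMIT}")
if demand > PORT_QUEUE_LIMIT:
    total_luns = sum(h.luns for h in hosts)
    print("oversubscribed; uniform depth of",
          max_safe_queue_depth(PORT_QUEUE_LIMIT, total_luns), "would fit")
```

Setting the depth too low has the opposite problem, since the host throttles itself before the array is busy, so a check like this is a starting point for tuning rather than an answer.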

What could go wrong outside the server (virtualized or physical), HBA and storage? Third-party backup software (used to back up Epic, including the Caché database) may end up taking hours longer than expected. Virtana Infrastructure Performance Management can use time comparison charts to show you when there are gaps in the backup read workload. Using this trend data, your backup manager could track down the cause, which may turn out to be a timeout in the dedupe engine of the backup software.
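
Spotting such gaps in a throughput series is straightforward once you have fine-grained data. The Python sketch below scans a list of backup read throughput samples and reports stretches where reads effectively stop. The function name, the idle threshold, the minimum gap length, and the sample values are all assumptions for illustration; it shows the idea of gap detection, not the product's charts.

```python
def find_gaps(read_mbps, idle_below=1.0, min_len=3):
    """Return (start, end) index pairs of runs where throughput stays below
    `idle_below` for at least `min_len` consecutive samples."""
    gaps = []
    run_start = None
    for i, value in enumerate(read_mbps):
        if value < idle_below:
            if run_start is None:
                run_start = i  # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                gaps.append((run_start, i - 1))
            run_start = None
    # Close a quiet run that lasts until the end of the series.
    if run_start is not None and len(read_mbps) - run_start >= min_len:
        gaps.append((run_start, len(read_mbps) - 1))
    return gaps


# Hypothetical samples of backup read throughput in MB/s.
samples = [200, 210, 0, 0, 0, 0, 205, 0, 198, 0, 0, 0]
print(find_gaps(samples))
```

For these samples the function reports two gaps, at indexes 2 to 5 and 9 to 11, and ignores the single quiet sample at index 7. Gaps that recur with a regular duration are the kind of pattern that would point toward a timeout rather than a slow source.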

The recurring theme in this article is that there are many variables in the underlying infrastructure (servers, HBAs, SAN switches, networked storage, backup software), all outside the control of the Epic application, which often contribute to latency and downtime in Epic. Your goal, then, should be to catch any deviations and take proactive action before they spiral into downtime. That is the motivation behind so many healthcare firms using Virtana Infrastructure Performance Management to monitor Epic and Cerner infrastructure.
