Dan's Take

How Many Nines of Uptime Do You Need?

Service providers have many uptime guarantees. Which one you choose should be dictated on a workload-by-workload basis.

Many organizations are moving toward a more virtualized environment and, through cloud computing, running that environment in someone else's data center. Their main concern is often cost: they hope to use the service provider's mass purchasing power to reduce spending on IT staff, power, cooling and communications, as well as on systems, storage, networking and software licenses.

Many go beyond the focus on cost reduction to consider workload performance and consolidation. Few, unfortunately, also consider workload reliability or availability until they actually experience a problem.

The few that do consider availability and reliability look past the availability stories told by suppliers to the real impact of an outage or a slowdown. If a supplier proudly claims that workloads hosted in its data centers are 99.9 percent reliable, is that really good enough? Maybe -- but maybe not.

Suppliers often speak about what level of availability they're offering as an uptime percentage. Let's look at uptime to gain an understanding of what adding a "nine" will do to an organization's exposure to downtime. Take a look at Table 1.

| Monthly Uptime | Monthly Downtime | Seconds Down per Month | Minutes Down per Month | Hours Down per Month |
|---|---|---|---|---|
| 99.000% | 1.0% | 26,280.00 | 438.000 | 7.300 |
| 99.900% | 0.1% | 2,628.00 | 43.800 | 0.730 |
| 99.990% | 0.01% | 262.80 | 4.380 | 0.073 |
| 99.999% | 0.001% | 26.28 | 0.438 | 0.007 |
| 99.9999% | 0.0001% | 2.63 | 0.044 | 0.001 |

Table 1. Downtime per month at each level of "nines," based on a 730-hour month.
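For readers who want to check the arithmetic or try other uptime percentages, here is a minimal Python sketch of the calculation behind Table 1. It assumes an average month of 730 hours (365 days x 24 hours / 12 months), which is the month length the table's figures imply; the function and constant names are illustrative only.

```python
# Average month length implied by Table 1: 365 days * 24 hours / 12 months.
HOURS_PER_MONTH = 730


def monthly_downtime_seconds(uptime_percent: float) -> float:
    """Return the seconds of downtime per month allowed by an uptime percentage."""
    downtime_fraction = (100.0 - uptime_percent) / 100.0
    return downtime_fraction * HOURS_PER_MONTH * 3600


if __name__ == "__main__":
    for uptime in (99.0, 99.9, 99.99, 99.999, 99.9999):
        seconds = monthly_downtime_seconds(uptime)
        # Show the same three units used in Table 1.
        print(f"{uptime}%: {seconds:,.2f} s, {seconds / 60:.3f} min, {seconds / 3600:.3f} h")
```

Using a 30-day month (720 hours) instead would shift each figure slightly; 4 nines, for example, would come out at 4.32 minutes rather than 4.38.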

Workload uptime depends on many factors, including the uptime of physical systems, virtual systems, storage subsystems and devices, and networks and networking equipment, as well as errors made by the supplier's IT operations staff. The service supplier, of course, won't be responsible for errors made by enterprise users or the enterprise's own IT staff.
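To see why so many dependencies matter, consider a rough sketch in which a workload is up only when every layer beneath it is up. If we assume the layers fail independently (a simplification), the combined availability is the product of the individual availabilities. The layer names and the 99.9 percent figures below are hypothetical, chosen only to illustrate the effect.

```python
from math import prod


def serial_availability(component_uptimes_percent) -> float:
    """Combined uptime (percent) when every component must be up at once.

    Assumes component failures are independent of one another.
    """
    return 100.0 * prod(u / 100.0 for u in component_uptimes_percent)


# Hypothetical layers, each assumed to deliver 3 nines on its own.
layers = {
    "physical host": 99.9,
    "virtual machine": 99.9,
    "storage": 99.9,
    "network": 99.9,
}

combined = serial_availability(layers.values())
print(f"Combined uptime: {combined:.4f}%")
```

Under these assumptions, four layers at 3 nines each yield roughly 99.6 percent for the workload as a whole, which is noticeably worse than any single layer.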

If a service provider is offering 99 percent uptime, or 2 nines, it's really telling its customers that, on average, they'll experience just over 7 hours of downtime in any given month. Is that good enough? It depends; it might be fine for workloads that aren't critical to enterprise survival, but it's woefully lacking for critical workloads.

If the service provider adds another "nine" to that uptime claim -- providing 99.9 percent uptime, or 3 nines -- those same enterprise subscribers would experience about three quarters of an hour of downtime in any given month.

If we step back for a moment to consider the uptime seen with typical industry-standard (x86-based) physical systems, 95 percent to 99 percent uptime is a pretty common range. If we look at clustered industry-standard systems, the range typically falls between 99.5 percent and 99.9 percent uptime. Even if one of these solutions could offer 99.99 percent, or 4 nines, the organization would still experience about 4.38 minutes of downtime in any given month (see Table 1).

Let's think about what would happen to an enterprise in the financial marketplace if its systems actually were up 99.99 percent of the time. If its EFT or trading systems are down just a few minutes every month, it could still lose millions of dollars.

Dan's Take: Be Proactive About Your Nines
Suppliers of continuous processing systems or mainframes point out that 99.99 percent uptime, a figure often offered by cloud service providers for single workload instances, just isn't good enough for critical applications. These applications can't be allowed to experience any downtime.

Mainframe suppliers like IBM point out that their basic systems offer 99.999 percent, or 5 nines, of uptime. Continuous processing system suppliers like Stratus Technologies state that their hardware-based systems offer 99.9999 percent (6 nines) of uptime and their software-based solutions offer 99.999 percent (5 nines) of uptime. Can cloud service providers provide similar levels of availability?

The service provider may offer ways to execute workloads on multiple systems in a single data center for an additional fee. Going further and executing them in multiple data centers can increase uptime again and protect against the failure of an entire data center.
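A back-of-the-envelope way to estimate what redundancy buys is sketched below. It assumes that instances fail independently and that failover is instant and flawless, so the workload is down only when every instance is down at the same moment. Real failures are often correlated and failover takes time, so treat the result as an optimistic upper bound; the function name is illustrative.

```python
def redundant_availability(instance_uptime_percent: float, instances: int) -> float:
    """Uptime (percent) when the workload survives as long as any one instance is up.

    Assumes independent failures and perfect, instantaneous failover.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    downtime_fraction = 1.0 - instance_uptime_percent / 100.0
    # The workload is down only if all instances are down simultaneously.
    return 100.0 * (1.0 - downtime_fraction ** instances)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} instance(s) at 99.9% each: {redundant_availability(99.9, n):.7f}%")
```

Under these idealized assumptions, two instances that each deliver 3 nines would together approach 6 nines, which is why running in more than one system or data center is worth pricing out.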

Enterprises that need constant uptime should explore the options offered by their cloud service provider before a disaster hits. The extra cost will probably be reasonable, considering the criticality of certain workloads.

About the Author

Daniel Kusnetzky, a reformed software engineer and product manager, founded Kusnetzky Group LLC in 2006. He's literally written the book on virtualization and often comments on cloud computing, mobility and systems software. He has been a business unit manager at a hardware company and head of corporate marketing and strategy at a software company.
