Storage Virtualization and the Question of Balance
Focusing too much on processors leads to problems.
- By Dan Kusnetzky
The storage virtualization industry is repeating an error it made long ago, in the early days of industry standard x86 systems: a focus on processing performance to the exclusion of the other factors of balanced system design.
Let's take a stroll down memory lane and then look at the problems storage virtualization is revealing in today's industry standard systems.
In a balanced system design, resources such as processing, memory, networking and storage are consumed at about the same rate. That is, there are enough resources in each category so that when the target workload is imposed upon the system, one resource doesn't run out while others still have capacity to do more work.
The type of workload, of course, has a great deal to do with how system architectures should be balanced. A technical application might use a great deal of processing and memory, but may not use networking and storage at an equal level. A database application, on the other hand, might use less processing but more memory and storage. A service oriented architecture application might use a great deal of processing and networking power, but less storage and memory than the other types of workloads.
A properly balanced system can do more work at less cost than an unbalanced one. In short, a system with an excess of processing capability relative to its other resources might do quite a bit less work, at a higher overall system price, than a system that's better balanced.
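The balance argument can be made concrete with a little arithmetic. The following is a minimal sketch, not a sizing tool: the capacity and demand figures are invented for illustration, the units are arbitrary, and the function names are hypothetical. It treats the sustainable request rate as being capped by whichever resource runs out first, then reports how busy every resource is at that rate.

```python
# Illustrative only: all capacities and per-request demands are made-up
# numbers in arbitrary units, chosen to show the effect of imbalance.

def max_throughput(capacity, demand_per_request):
    """Return (sustainable requests/sec, name of the limiting resource)."""
    limits = {
        name: capacity[name] / demand
        for name, demand in demand_per_request.items()
        if demand > 0
    }
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


def utilization(capacity, demand_per_request, rate):
    """Fraction of each resource consumed at the given request rate."""
    return {
        name: rate * demand_per_request[name] / capacity[name]
        for name in capacity
    }


# A database-style workload: light on processing, heavy on memory and storage.
database_demand = {"cpu": 1.0, "memory": 4.0, "network": 1.0, "storage": 4.0}

# System A: lots of processing power, modest storage.
cpu_heavy = {"cpu": 4000.0, "memory": 4000.0, "network": 2000.0, "storage": 2000.0}
# System B: less processing power, resources matched to the workload.
balanced = {"cpu": 1000.0, "memory": 4000.0, "network": 1000.0, "storage": 4000.0}

for label, system in (("cpu_heavy", cpu_heavy), ("balanced", balanced)):
    rate, limiting = max_throughput(system, database_demand)
    print(label, rate, limiting, utilization(system, database_demand, rate))
```

With these made-up figures, the processor-heavy system stalls on storage at 500 requests per second while its processors sit at 12.5 percent utilization. The balanced system, with a quarter of the processing capacity, sustains 1,000 requests per second with every resource fully used.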
Mainframes to x86 Systems
Mainframe and midrange system designers worked very hard to design systems for each type of workload. Some systems offered large amounts of processing and memory capacity. Others offered more networking or storage capacity.
Eventually, Intel and its partners and competitors broke through the door of the enterprise data center with systems based on high-performance microprocessors. The processor benchmark data for these systems was impressive. The rest of the system, however, often was built using the lowest cost, off-the-shelf components.
Enterprise IT decision makers often selected systems based upon a low initial price without considering balanced design or overall cost of operation. We've seen the impact this thinking has had on the market. Systems designed with expensive error correcting memory, parallel networking and storage interconnects often lose out to low cost systems having none of those "mainframe-like" enhancements.
This means that if we walked down a row of systems in a typical data center, we'd see systems with under-utilized processing power trying to drive work through configurations that have insufficient memory, networking or storage bandwidth.
To address performance problems, enterprise IT decision makers often just purchase larger systems, even though the original systems have enough processing power and the real problem is an unbalanced storage architecture.
Enter Storage and Networking Virtualization
As industry standard systems become virtualized environments, the industry is seeing system utilization and balance come to the forefront again. Virtualization technology takes advantage of excess processing, memory, storage and networking capability to create artificial environments that offer important benefits.
While virtual processing technology is making more use of industry standard systems' excess capacity to create benefits, other forms of virtualization are stressing systems in unexpected ways.
Storage virtualization technology often uses system processing and memory to create benefits such as deduplication, compression, and highly available, replicated storage environments. Rather than put this storage-focused processing load on the main systems, some suppliers push this work onto their own proprietary storage servers.
While this approach offers benefits, it also means that the data center becomes multiple islands of proprietary storage. It can also mean that scaling up or down is complicated or costly.
Another point is that many industry standard operating systems do their best to serialize I/O; that is, do one storage task at a time. This means that only a small amount of a system's processing capability is devoted to processing storage and networking requests, even if sufficient capacity exists to do more work.
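The cost of handling requests one at a time can be illustrated with a toy comparison. This is a minimal sketch under stated assumptions: `fake_io_request` is a hypothetical stand-in that merely sleeps to model device wait time, and a thread pool stands in for whatever mechanism issues requests concurrently. It does not represent how any particular operating system or vendor product schedules I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_request(request_id, latency=0.05):
    """Hypothetical storage request; the sleep models waiting on a device."""
    time.sleep(latency)
    return request_id


def run_serial(count):
    """Issue requests one at a time, as a serializing I/O path would."""
    start = time.perf_counter()
    results = [fake_io_request(i) for i in range(count)]
    return results, time.perf_counter() - start


def run_parallel(count, workers):
    """Issue the same requests concurrently from a pool of worker threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        results = list(pool.map(fake_io_request, range(count)))
    return results, time.perf_counter() - start


serial_results, serial_time = run_serial(8)
parallel_results, parallel_time = run_parallel(8, workers=8)

print(f"serial:   {serial_time:.2f}s")
print(f"parallel: {parallel_time:.2f}s")
```

Both runs complete the same eight requests, but the serial run waits out each simulated delay in turn, while the parallel run overlaps them. Exact timings vary from machine to machine, so none are quoted here.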
Parallel I/O to the Rescue
If we look back to successful mainframe workloads, it's easy to see that the system architects made it possible to add storage and networking capability as needed. Multiple storage processors could be installed so that storage I/O could expand as needed to support the work. The same was true of network processors. Many industry standard system designs, by contrast, have a great deal of processing power, but because of the way their operating systems are designed, the software they're hosting doesn't assign that excess capacity to storage or network tasks.
DataCore has been working for quite some time on parallel storage processing technology that can utilize excess processing capability without also creating islands of storage technology. When Lenovo came to DataCore with a new, highly parallel hardware design and was looking for a way to make it perform well, DataCore's software technology came to mind. DataCore made it possible for Lenovo's systems to dynamically use their excess processing capacity to accelerate virtualized storage environments. The preliminary testing I've seen is very impressive and shows a significant reduction in cost, while also showing improved performance. I can hardly wait to see the benchmark results when they're audited and released.
Dan's Take: It's Time to Consider Parallel I/O
In my article "The Limitations of Appliance Servers," I pointed out that we've just about reached the end of deploying a special-purpose appliance for each and every function. The "herd-o'-servers" approach to computing has become too complex and too costly to manage. I would point to the emergence of "hyperconverged" systems in which functions are being brought back into the system as a case in point.
Virtual systems need virtual storage. Virtual storage needs access to processing, memory and networking capability to be effective. DataCore appears to have the technology to make this all work.
Daniel Kusnetzky, a reformed software engineer and product manager, founded Kusnetzky Group LLC in 2006. He's literally written the book on virtualization and often comments on cloud computing, mobility and systems software. He has been a business unit manager at a hardware company and head of corporate marketing and strategy at a software company.