Lies, Damned Lies and Benchmarks (Part 2)
A few months ago I saw a headline that stuck in my mind: "VMware Admins are Storage Admins." While I can remember neither the source of the article nor its author, the idea expressed in the headline is central to understanding the primary challenge of the virtualization industry today. And it is directly related to the subject of benchmarking that I promised to continue from my last post.
I am sure that by now you have seen more than your fair share of benchmark data published by the leading virtualization vendors. Sometimes the comparison of benchmark data gets very heated and even provides a certain degree of entertainment. However, if you have the patience to sift through the wordsmithing and see past the entertainment value, you'll quickly realize that there is actually very little real data there.
All of the modern hypervisor technologies are essentially identical when it comes to the quality of implementation of the actual processor virtualization layer. The reason is very simple: ever since Intel and AMD introduced virtualization-enabled processors (Intel VT-x and AMD-V, i.e., in the last five years), the hard task of virtualizing the running code has been done by the processor hardware. That is why, when you see comparative benchmarks of various hypervisors, they are nearly identical when it comes to the computational part.
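Incidentally, you can check whether your own hardware has these extensions by looking at the CPU flags. Here is a minimal sketch for Linux; it assumes the usual /proc/cpuinfo layout, so treat it as illustrative rather than portable:

    # check_virt.py -- a quick look at the CPU flags on Linux.
    # Intel VT-x advertises itself as the "vmx" flag, AMD-V as "svm".
    flags = set()
    with open("/proc/cpuinfo") as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                flags.update(line.split(":", 1)[1].split())
    if "vmx" in flags:
        print("Intel VT-x present")
    elif "svm" in flags:
        print("AMD-V present")
    else:
        print("no hardware virtualization extensions detected")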
Any time you notice a substantial difference in reported performance numbers you should look very carefully at the benchmark configuration and try to understand what exactly was measured by the benchmark. It is not always trivial.
Consider the following automotive analogy: If you drive a Yugo and a Porsche between San Francisco and San Jose at 4 p.m. on a Friday, you'd be measuring the average speed of the traffic jam on the Bayshore Freeway. What conclusions about the relative driving qualities of the two cars could you derive from that experiment? Well, you'd definitely know how comfortable the seats are in both cars, which may be all you need to know if you do most of your driving during rush hour on the Bayshore Freeway.
I often think about this analogy when reading through benchmark reports and studies published by virtualization vendors. Here is a good example: a study of the performance gains provided by dynamic resource scheduling in a cluster of several virtualized servers. This is a good, solid piece of work done by competent professionals. It shows that under certain conditions the dynamic resource scheduling software may increase the overall productivity of a four-server virtualized cluster by up to 47 percent. That means you can get about 50 percent (let's be generous and round up a few percentage points) more from your servers merely by installing a new piece of infrastructure management software. Fifty percent of four servers is two servers, so by installing the software you'd be able to save yourself the cost of buying two more servers. Awesome.
Now let's dig below the surface a little bit. The paper specifies the hardware configuration for the benchmark: four servers connected to a shared storage disk array. The server configurations are listed as well -- number of CPU cores, memory, etc. I did a quick search and figured out that in the summer of 2009 (when the paper was published) such a server would cost about $6,000. So a 50 percent saving on the server hardware is worth about $12,000. That is pretty cool! I would definitely look into buying the piece of infrastructure software that can get such spectacular savings. Even if I pay a couple of thousand for the software license, I'd still save a lot of money on the hardware side.
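To keep the arithmetic honest, here is the entire back-of-the-envelope calculation in a few lines of Python. The $6,000 figure is my own 2009 street-price estimate from above, not a number taken from the paper:

    # Back-of-the-envelope savings from the scheduling benchmark.
    servers = 4
    cost_per_server = 6000     # my 2009 price estimate, in dollars
    gain = 0.47                # "up to 47 percent" from the paper

    servers_saved = round(servers * gain)       # 1.88, call it 2 servers
    hardware_savings = servers_saved * cost_per_server
    print(f"servers saved: {servers_saved}")            # 2
    print(f"hardware savings: ${hardware_savings:,}")   # $12,000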
Now let's look at the shared storage used in the benchmark. Getting prices for high-end storage hardware is a much more involved exercise than pricing servers -- and it can be even more entertaining at times. Perhaps someone should start a WikiLeaks for storage hardware pricing, though I am not sure that inviting the wrath of the mighty storage hardware vendors would be any better than inviting that of the world's governments. But I digress.
At any rate, I went through the exercise and figured that the list price for the storage hardware involved in the benchmark was -- are you ready for this? -- around $600,000. Now, I do understand that nobody pays list prices for these puppies. So let's figure a 30 percent discount. Heck, make it 40 percent, for a nice round number of $360,000. That is still 15 times the cost of all four servers combined. Looking at the price differential, it seems to me that rather than being ecstatic about potentially saving $12,000 on the server side, you should be thinking about how to reduce the cost of storage. That is where the much greater potential savings lie.
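The same arithmetic, continued; the list price and the discount are my own estimates, as above:

    # Storage vs. server cost in the benchmark configuration.
    storage_list = 600000    # my estimate of the array's list price
    discount = 0.40          # a generous assumed street discount
    storage_street = storage_list * (1 - discount)

    server_total = 4 * 6000  # the whole four-server cluster
    print(f"storage street price: ${storage_street:,.0f}")  # $360,000
    print(f"server hardware:      ${server_total:,}")       # $24,000
    print(f"ratio: {storage_street / server_total:.0f}x")   # 15x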
If you have read my previous posts on the I/O Blender problem, you should be able to guess the reason behind this seemingly obscene price discrepancy between servers and storage. The highly random nature of the disk I/O loads generated by virtualized servers requires a storage system capable of sustaining a very large number of I/O operations per second -- IOPS. And the only way to get this with a traditional storage architecture is to increase the number of individual disks, often called the spindle count. That leads to a dramatic (I prefer to call it "obscene") increase in storage costs.
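To see how quickly the spindle count runs away from you, here is a rough sizing sketch. Both numbers are assumptions on my part: the IOPS target is an arbitrary example, and the per-drive figure is a common rule of thumb for 15K RPM drives of that era, not a vendor spec:

    import math

    # Rough spindle-count sizing for a random-I/O workload.
    required_iops = 30000    # assumed aggregate random-IOPS target
    iops_per_spindle = 180   # rule of thumb for one 15K RPM drive

    spindles = math.ceil(required_iops / iops_per_spindle)
    print(f"spindles needed: {spindles}")  # 167 drives, and that is
                                           # before RAID write penalties

Factor in RAID write penalties and hot spares, and the drive count climbs even higher. That is exactly where the storage money goes.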
Now imagine that instead of installing dynamic resource scheduling software, you had the option to install a new type of storage management software that would allow you to dramatically (it really is "dramatically" in this case) reduce the required spindle count -- perhaps by a factor of three. You'd easily save up to $400,000 on your storage. Just add up how much more raw computing power you can get with this kind of savings: at $6,000 apiece, that is more than 60 additional servers. And yes, such storage software does exist.
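One last sketch to close the loop; the three-fold spindle reduction is the hypothetical from the paragraph above, applied to my earlier list-price estimate:

    # Savings from cutting the spindle count by a factor of three.
    storage_list = 600000
    storage_savings = storage_list * (1 - 1 / 3)  # keep 1/3 of the spindles

    extra_servers = storage_savings / 6000        # at my $6,000 per server
    print(f"storage savings: ${storage_savings:,.0f}")      # $400,000
    print(f"extra servers that buys: {extra_servers:.0f}")  # ~66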
I hope this post gave you a taste of how and why to look beyond the obvious benchmark results. As always, I welcome any comments, questions, objections or other interaction.
Posted by Alex Miroshnichenko on 12/06/2010 at 12:48 PM