SSD: Huge Improvement, Not a Magic Bullet
Solid-state drives have been grabbing a lot of headlines lately, and they hold the promise of blazingly fast storage performance. Unlike traditional hard disk drives (a.k.a. "spinning pieces of rust"), they have no moving mechanical parts. The term SSD applies to devices based on a number of different technologies, but lately it has been used primarily for devices built on flash memory. That distinction matters for this discussion.
Today all the major storage vendors, plus a lot of startups you probably haven't heard of, seem to have an SSD story. Judging by the buzz at the Flash Memory Summit in August, one might think that all of the world's troubles, global warming included, could be resolved by switching all your storage to SSDs.
I am not so sure about the global warming bit, but SSDs are in fact a major breakthrough in storage technology. However, wholesale replacement of round pieces of spinning rust with square boxes full of flash memory chips is hardly a practical option.
First, using SSDs is still a very expensive proposition. Storage vendors recently announced "major price breakthroughs" by offering flash-based SSDs for $10,000 per terabyte. For comparison shopping, I suggest that you stop by your neighborhood electronics retailer to check out those two-terabyte hard drives for about $100. That works out to roughly $50 per terabyte, a 200-fold difference.
Second, there are concerns about SSD durability. The industry is addressing them, but here is a truth no one can contest: flash memory has a finite number of write cycles. So whenever you're planning a system that has to sustain a huge number of updates over its lifetime, you need to take that limit into account. It is one thing to use a flash card in a digital camera, where even a British Royal paparazzo is unlikely to reach the write-cycle limit. Using flash memory for a high-frequency financial trading application on the other side of London is quite another matter: you may burn through the write-cycle limit in a matter of months, if not weeks.
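To put that limit in perspective, here is a back-of-the-envelope sketch. Every number in it is an assumption I picked for illustration, not a vendor spec, so adjust for your own hardware and workload:

```python
# Rough endurance math for a flash SSD. All figures below are illustrative
# assumptions: a 256 GB drive, flash rated for 10,000 program/erase cycles,
# and a write amplification factor of 5 (more on that concept below).

capacity_gb = 256
pe_cycles = 10_000           # assumed program/erase cycles per cell
write_amplification = 5      # assumed internal writes per host write

# Total data the host can write before the flash wears out.
total_writable_gb = capacity_gb * pe_cycles / write_amplification

workloads_gb_per_day = {
    "digital camera": 2,         # assumed daily write volume
    "trading platform": 5_000,   # assumed daily write volume
}

for name, gb_per_day in workloads_gb_per_day.items():
    years = total_writable_gb / gb_per_day / 365
    print(f"{name}: roughly {years:,.1f} years to the write-cycle limit")
```

With these made-up but plausible numbers, the camera never comes close, while the trading system hits the wall in a bit over three months. That is the "months, if not weeks" scenario.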
Third, SSDs are not as fast as a flash memory datasheet would have you believe. Unlike good old hard drives, flash memory is highly asymmetrical when it comes to read versus write performance: writes are much more expensive in terms of time required. This is particularly true for small, random write operations, which, as we learned from my previous post, is exactly the type of load generated by virtualization. Flash memory SSDs are extremely good at random read operations but weak at small random writes.
The reason is that you can't simply overwrite a data block in flash memory; the block has to be erased first. That is the physics of flash memory devices. (I'm afraid I can't explain it any more deeply. I could have many years ago, when I had just gotten my degree in applied physics, but now we all have Google to tell us the things we used to know.)
Worse, the erase operation cannot target a single data block: it applies only to an entire erase block at once, typically hundreds of kilobytes to a few megabytes, orders of magnitude larger than a typical small write. So if the erase block happens to contain live data, and it usually does, you need to do something about that data first, namely relocate it to another part of the device. Relocating data means writing it, which means finding another erased chunk to put it in. Isn't that exactly what we were trying to do for the first write? As you can see, the problem quickly snowballs into something decidedly non-trivial.
A freshly erased flash block is easy to write to (that's what keeps the paparazzi in business), but as write operations arrive at random they fragment the free space. Soon any new write, no matter how small, can require a major reshuffling of data before it proceeds. Writes can slow to the point that overall performance ends up worse than that of a hard disk!
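To put numbers on that snowball, here is a deliberately simplified toy simulation of the relocate-then-erase dance described above. The flash translation layer it models is my own invention for illustration, not any real drive's firmware, and every constant in it is made up. It issues a long stream of small random overwrites and reports the write amplification, that is, the ratio of physical page writes the device performs to the writes the host actually asked for:

```python
import random

# Geometry of our toy drive; every number here is made up for illustration.
PAGES_PER_BLOCK = 64
NUM_BLOCKS = 256
SPARE = 0.10                                  # fraction of flash kept as spare area
LOGICAL_PAGES = int(NUM_BLOCKS * PAGES_PER_BLOCK * (1 - SPARE))

blocks = [set() for _ in range(NUM_BLOCKS)]   # valid logical pages in each block
fill = [0] * NUM_BLOCKS                       # pages programmed since last erase
where = {}                                    # logical page -> block that holds it
free = set(range(NUM_BLOCKS))                 # fully erased blocks
open_blk = free.pop()                         # block currently accepting writes
host_writes = flash_writes = 0

def write_page(lpn):
    """Program one logical page into the open block (one physical flash write)."""
    global open_blk, flash_writes
    if fill[open_blk] == PAGES_PER_BLOCK:     # open block is full: take an erased one
        open_blk = free.pop()
    old = where.get(lpn)
    if old is not None:
        blocks[old].discard(lpn)              # the old copy becomes stale garbage
    blocks[open_blk].add(lpn)
    where[lpn] = open_blk
    fill[open_blk] += 1
    flash_writes += 1

def garbage_collect():
    """Erase the block holding the least valid data, relocating the survivors."""
    victim = min((b for b in range(NUM_BLOCKS) if b != open_blk and b not in free),
                 key=lambda b: len(blocks[b]))
    survivors = list(blocks[victim])
    blocks[victim].clear()
    fill[victim] = 0                          # the erase itself
    free.add(victim)
    for lpn in survivors:                     # relocation costs extra flash writes
        write_page(lpn)

random.seed(1)
for i in range(1, 5 * LOGICAL_PAGES + 1):     # a long run of small random overwrites
    while not free:
        garbage_collect()
    write_page(random.randrange(LOGICAL_PAGES))
    host_writes += 1
    if i % LOGICAL_PAGES == 0:
        print(f"{i:7d} host writes: write amplification "
              f"{flash_writes / host_writes:.2f}")
```

The ratio starts near 1.0 while the drive is still fresh and climbs as random writes fragment the blocks and the garbage collector has to relocate more and more live data, which is precisely how a drive that benchmarks beautifully out of the box can bog down after weeks of real use.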
As you can see by now, the techniques needed to maintain acceptable performance in flash memory devices can be extremely complicated and resource-intensive. Solving these issues requires a substantial investment in software development.
Of course, hardware vendors keep devising new techniques and tricks to get around the problem, which is why "enterprise-class" SSD devices have very sophisticated controllers, and those controllers add to the cost of the drive. One common workaround is to double the amount of flash inside the drive while presenting only about half of it to you as usable space. In effect, you're paying for twice the memory you think you have.
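To see why that expensive hidden spare area pays off, consider a simplified worst-case model (my own, with made-up fractions): if every erase block is uniformly full of valid data when it is reclaimed, then freeing one page of space means relocating roughly u pages of survivors for every 1 - u pages gained, so each host write costs about 1 / (1 - u) physical writes:

```python
# Worst-case write amplification as a function of how much of the raw
# flash is exposed to the host. Simplified pessimistic model: every erase
# block is exactly "usable_fraction" full of valid data when reclaimed.

for usable_fraction in (0.97, 0.90, 0.75, 0.50):
    worst_case_wa = 1 / (1 - usable_fraction)
    print(f"expose {usable_fraction:.0%} of the flash -> "
          f"worst-case write amplification about {worst_case_wa:.0f}x")
```

In this model, exposing nearly all the flash lets the worst case balloon past 30x, while exposing only half caps it at about 2x. That is why vendors are willing to solder in twice the chips.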
Cheap SSD vendors simply ignore the problem of poor random-write performance. Early adopters of flash SSDs in laptops discovered this when they paid hundreds of dollars to upgrade from rotating disks, only to find that their computers became slower rather than faster.
Throwing an expensive box full of flash memory chips at your storage interconnect will not magically solve a virtualization storage performance bottleneck. You need a file system and storage management software that can take full advantage of SSDs' unique performance strengths while working around their weaknesses. Implemented correctly, SSDs can do wonders for your performance, but you first have to lay the proper foundation with the right tools and facilities.
Posted by Alex Miroshnichenko on 10/19/2010 at 12:48 PM