Data Storage Startup Provides Real-Time Analytics on Billions of Files

This first look shows how powerful Qumulo Core is.

Qumulo Inc. just released Qumulo Core, second-generation data-aware scale-out NAS software that uses a brand-new file system, Qumulo Scalable File System (QSFS). I was fortunate enough to spend some time with the Qumulo crew up in Seattle while working with the product for a few days.

The system behaved well, far better than what I typically expect from a new product. What impressed me the most, however, were the real-time analytics built directly into QSFS. The product I worked with was a four-node cluster of Qumulo Q0626 appliances, the first product in the Qumulo data-aware scale-out NAS portfolio.

The Qumulo storage system is based on entirely new code that was architected by the three primary inventors of Isilon OneFS. Before laying out the architecture, Qumulo interviewed hundreds of storage professionals. In doing so, the company was able to identify three major pain points: scalability, data awareness and ease of use. It addressed each of these pain points with its new product. I will do a complete evaluation in a later article, but today I wanted to give you a glimpse of the feature that most blew me away -- the ability of Qumulo Core to show real-time analytics on a file system containing billions of files.

The screen capture in Figure 1 shows the Qumulo Core dashboard, which displays the overall health and activity of the system in real time. From this dashboard you can easily monitor cluster activity, IOPS and throughput to the storage system.

Figure 1. The Qumulo Core dashboard.

For a variety of reasons, it's traditionally been very difficult to capture analytics in real time for storage systems that contain a large number of files. I have literally seen it take days to run a disk usage (du) report on a file system with 100 million files, which makes analysis and planning very difficult.
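
One of those reasons is easy to illustrate: on a conventional file system, a du-style report has to visit every directory entry under the path it is summarizing, so the work grows with the number of files. The following is a minimal Python sketch of that kind of tree walk. It is not the actual du implementation and says nothing about how Qumulo Core computes its figures; the function name disk_usage is hypothetical, and for simplicity it sums apparent file sizes (st_size) rather than allocated blocks.

```python
import os
import tempfile


def disk_usage(root):
    """Walk everything under root; return (total_bytes, entries_visited).

    Illustrative only: sums apparent sizes (st_size), not allocated blocks,
    and does not follow symbolic links.
    """
    total = 0
    visited = 0
    stack = [root]  # directories still to be scanned
    while stack:
        path = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                visited += 1  # every single entry costs at least one visit
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                else:
                    total += entry.stat(follow_symlinks=False).st_size
    return total, visited


if __name__ == "__main__":
    # Build a tiny throwaway tree and report on it.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        for name in ("a.bin", os.path.join("sub", "b.bin")):
            with open(os.path.join(root, name), "wb") as f:
                f.write(b"x" * 1000)
        print(disk_usage(root))
```

The visited counter is the point of the sketch: the walk cannot report a total until it has touched every entry, which is why such a report gets slower as the file count climbs into the hundreds of millions.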

With Qumulo Core, however, I saw the dashboard report disk usage in real time as hundreds of files and multiple directories were being created every minute. I also saw the same real-time information being updated as thousands of files were being deleted. Qumulo Core does more than just report on du in real time; it also reports on the clients that are using the most IOPS and the top consumers of storage space.

If you're still curious about the claim I made about real-time analytics on a single file system containing billions of files, the screen capture in Figure 2 was taken from a Qumulo Core system with more than 5 billion files. The left-hand side displays, in real time, a bar chart of the top consumers of storage space, as well as the top IOPS by client and by path. If you hover over a bar, the dashboard displays more detailed information (size, dates, permissions and so on) about the object. If you select a file or directory, it will drill down into the object.

Figure 2. Analytics for a single file system with more than 5 billion files.

Further solidifying its claim as a second-generation storage system are the Qumulo agile software development and delivery methodology, “API-first” philosophy, and software-defined storage.

Next time, I'll provide an exclusive in-depth first look at the Qumulo Q0626 storage system with a full evaluation.

About the Author

Tom Fenton has a wealth of hands-on IT experience gained over the past 30 years in a variety of technologies, with the past 20 years focusing on virtualization and storage. He currently works as a Technical Marketing Manager for ControlUp. He previously worked at VMware in Staff and Senior level positions. He has also worked as a Senior Validation Engineer with The Taneja Group, where he headed the Validation Service Lab and was instrumental in starting up its vSphere Virtual Volumes practice. He's on X @vDoppler.

