Reviews
        
        Data Storage Startup Provides Real-Time Analytics on Billions of Files
        This first look shows how powerful Qumulo Core is.
        
        
          
  Qumulo Inc. just released Qumulo Core, second-generation data-aware  scale-out NAS software that uses a brand-new file system, Qumulo  Scalable File System (QSFS). I was fortunate enough to spend some time with the  Qumulo crew up in Seattle while working with the product for a few days.
 
  The system behaved well, far better than what I typically expect  from a new product. What impressed me the most, however, were the real-time  analytics built directly into QSFS. The product I worked with was a four-node  cluster of Qumulo Q0626 appliances, the first product in the Qumulo data-aware  scale-out NAS portfolio.
  The Qumulo storage system is based on entirely new code that was architected by the three primary inventors of Isilon OneFS. Before laying out the architecture, Qumulo interviewed hundreds of storage professionals. In doing so, the company was able to identify three major pain points: scalability, data awareness and ease of use. It addressed each of these pain points with its new product. I will do a complete evaluation in a later article, but today I wanted to give you a glimpse of the feature that most blew me away: the ability of Qumulo Core to show real-time analytics on a file system containing billions of files.
 
  The screen capture in Figure 1 shows the Qumulo Core dashboard, which displays the overall health and activity of the system in real time. From this dashboard you can easily monitor cluster activity, IOPS and throughput to the storage system.
	
		Figure 1. The Qumulo Core dashboard.
    
	
	
 
  For a variety of reasons, it's traditionally been very difficult to capture analytics in real time for storage systems that contain a large number of files. I have literally seen it take days to run a disk usage (du) report on a file system with 100 million files, which makes analysis and planning very difficult.
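  To get a feel for why a du-style report scales so poorly, here is a minimal Python sketch of the work such a tool has to do. It is only an illustration: it is neither Qumulo code nor the actual du implementation, the function name `disk_usage` is my own, and it sums apparent file sizes rather than the allocated blocks that du reports by default.

```python
import os


def disk_usage(root):
    """Return (total_bytes, file_count) by visiting every entry under root.

    Hypothetical helper for illustration only. It sums apparent sizes
    (st_size), not allocated disk blocks, and does not follow symlinks.
    """
    total_bytes = 0
    file_count = 0
    stack = [root]  # directories still to be scanned
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Defer subdirectories instead of recursing.
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # One metadata lookup per regular file.
                    total_bytes += entry.stat(follow_symlinks=False).st_size
                    file_count += 1
    return total_bytes, file_count
```

  Every file costs at least one metadata lookup, so the run time of a walk like this grows with the number of files in the tree, and the whole walk has to be repeated each time a fresh report is needed.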
  With Qumulo Core, however, I saw the dashboard report disk usage in real time as hundreds of files and multiple directories were being  created every minute. I also saw the same real-time information being updated  as thousands of files were being deleted. Qumulo Core does more than just report on  du in real time; it also reports on the clients that are using the most  IOPS and the top consumers of storage space. 
  If you're still curious about the claim I made about real-time analytics on a single file system containing billions of files, the screen capture in Figure 2 was taken from a Qumulo Core system with more than 5 billion files. The left-hand side of the screen displays, in real time, a bar chart of the top consumers of storage space, as well as the top IOPS by client and by path. If you hover over the bars, the dashboard displays more detailed information (size, dates, permissions and so on) on the object. If you select a file or directory, it will drill down into the object.
	
		Figure 2. Analytics for a single file system with more than 5 billion files.
    
	
	
  Further solidifying its claim as a second-generation storage  system are the Qumulo agile software development and delivery methodology, “API-first”  philosophy, and software-defined storage. 
  Next time, I'll provide an exclusive in-depth first look at the  Qumulo Q0626 storage system with a full evaluation.
        
                    About the Author
                    
                
                    
                    Tom Fenton has a wealth of hands-on IT experience gained over the past 30 years in a variety of technologies, with the past 20 years focusing on virtualization and storage. He previously worked as a Technical Marketing Manager for ControlUp. He also previously worked at VMware in Staff and Senior level positions. He has also worked as a Senior Validation Engineer with The Taneja Group, where he headed the Validation Service Lab and was instrumental in starting up its vSphere Virtual Volumes practice. He's on X @vDoppler.