Scale Computing Single Node Appliance: A Review -- Virtualization Review

Scale Computing Single Node Appliance: A Review

An easy-to-use server that can be pricey.

By Trevor Pott
11/30/2016

I have occasion to review a number of products from various vendors. On a fairly regular basis, I encounter a decent product from a vendor I have zero respect for (usually Microsoft). Sometimes I get a product I think is trash from a vendor I otherwise admire. With their Single Node Appliance Configuration (SNAC), Scale Computing has presented me with a product I can respect from a company I like, although I still have deep disagreements over what it represents.

SNAC nodes are exactly what they sound like: they're a single server. To get a SNAC node you take an otherwise perfectly normal Scale node, one that normally would be part of a cluster with a three-node minimum, and tell it through the command line that it is a "cluster" of one node. This has given rise to the internal name for the SNAC nodes: single-node clusters. I call them archive nodes.

The many names for these units stem from the various perceived uses. One use case has a single SNAC node loaded up with large, slow disks and used as a replication target for another cluster. In other words: your production Scale Computing cluster takes snapshots of the running VMs on a regular basis and sends a copy of those snapshots to the SNAC node on a pre-defined schedule. There is nothing particularly special about this sort of replication. The functionality has been built into Scale's environment for a while now, and it works quite well. Originally, this was used to replicate between two full-sized (3-node minimum) Scale clusters; however, once you trick the SNAC node into thinking it's a "cluster of one," it does the job quite well.

The other use case for the SNAC nodes is as a Remote Office/Branch Office (ROBO) server. Here the roles are reversed: a workload would run on the SNAC node locally and it would be regularly replicated back to the central office to what is, presumably, a larger cluster. In this scenario, it is more than likely the SNAC node would be fitted with more performant and less capacious drives than would be seen in the archive node configuration.

A Quick SNAC
For those who haven't had the pleasure of using a Scale hyperconverged cluster, the whole experience can probably be summed up as "virtualization with crayons". The Scale nerds are obsessed with ease of use. This has the twin side effects of reduced functionality when compared to feature-festooned VMware, and the solution being simple enough that I can teach the most junior of sysadmins everything there is to know about using it in under 15 minutes.

By being built on top of the KVM hypervisor, Scale's offerings have the ability to match VMware almost feature-for-feature. VMware has the edge on a few things, and KVM on others. Scale automates as much of that as possible, and then selectively exposes to the user those features they can't automate.

The reason every single feature isn't exposed in the UI is usually that Scale hasn't figured out how to do so in an idiot-proof fashion. They won't expose a feature to users until they're convinced it's user friendly, and until they've fully instrumented every possible outcome of using that feature. In addition to ease of use, Scale is obsessed with monitoring. They have a custom-built monitoring solution that hoovers up statistics, logs, hardware states and more. This feeds into their automation layer to create that "black box voodoo" that means virtualization newbies don't need a Ph.D. in management software to make a virtual machine (VM).

Because the SNAC nodes are just regular Scale nodes, they have all the benefits and drawbacks of the above. The risks with SNAC nodes don't really emerge from their design, but from their implementation.

Size Matters
If you want a single-node ROBO solution that goes fast, deploy SNAC nodes with faster hard drives in them. Scale has a very well developed hybrid storage solution called HEAT in their new nodes; I've hammered a customer's HEAT-enabled cluster for months and have yet to be disappointed by its performance.

I have no issues with SNAC nodes in a ROBO scenario, unless you happen to need high availability for those workloads. If that's the case, no single-node anything is ever going to work. Get a proper three-node cluster; you'll still be able to replicate back to head office.

If you want a single-node archive solution that stores a lot of snapshots, then in theory you can fill a SNAC node with large 7200 RPM drives. This is where Scale and I start to have disagreements.

Scale's current node offering consists of 1U nodes with four drive bays. The biggest drives they have certified are currently 6TB drives. There are discussions in the super-secret beta Slack that 8TB drives are on the way. For me, this is a problem.

In its current configuration, Scale keeps two copies of data throughout the cluster. With a single-node cluster, this means that your four drives become two usable drives' worth of space. With 6TB drives that yields only 12TB (2x6TB); with 8TB drives, only 16TB (2x8TB). That's just not enough.

Scale's guidance is to deploy archive clusters that have double the capacity of your production clusters. These production clusters can have up to eight 2.5" drives per node, with the highest-end clusters (based on the 4150 nodes) having 6x2TB magnetic hard drives and 2x800GB SSDs, for 6.8TB usable per node.

A standard 3-node production cluster of 4150s is 20.4TB usable. Doubling that, per vendor guidance, means 40.8TB of archive capacity, which would require three of the as-yet-non-extant 8TB-drive SNAC nodes (16TB usable each) in order to provide adequate archive storage. In this case, the SNAC nodes are a little less "single" and a lot more investment.
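For anyone who wants to check the math, here is a minimal Python sketch of the capacity arithmetic above. It assumes only what is stated here: two copies of all data, archive capacity at double production, and decimal units (1TB = 1000GB). Every function and variable name is hypothetical; this is not a Scale sizing tool.

```python
# Back-of-the-envelope capacity math from the figures quoted above.
# All function and variable names are hypothetical; this is not a Scale tool.

COPIES = 2              # the article states Scale keeps two copies of data
ARCHIVE_MULTIPLIER = 2  # vendor guidance: archive = double production capacity


def usable_gb(drive_sizes_gb, copies=COPIES):
    """Usable capacity in GB when every block is stored `copies` times."""
    return sum(drive_sizes_gb) // copies


def archive_nodes_needed(production_usable_gb, archive_node_usable_gb,
                         multiplier=ARCHIVE_MULTIPLIER):
    """Whole archive nodes needed to hold `multiplier` x production capacity."""
    required_gb = production_usable_gb * multiplier
    # Ceiling division on integers, so there is no float rounding to worry about.
    return -(-required_gb // archive_node_usable_gb)


snac_6tb = usable_gb([6000] * 4)               # four 6TB drives
snac_8tb = usable_gb([8000] * 4)               # four 8TB drives (not yet shipping)
node_4150 = usable_gb([2000] * 6 + [800] * 2)  # 6x2TB HDD + 2x800GB SSD
cluster_4150 = 3 * node_4150                   # minimum 3-node cluster

print(snac_6tb, snac_8tb)                            # 12000 16000
print(node_4150, cluster_4150)                       # 6800 20400
print(archive_nodes_needed(cluster_4150, snac_8tb))  # 3
print(archive_nodes_needed(cluster_4150, snac_6tb))  # 4
```

Working in whole gigabytes keeps everything in integers. With today's certified 6TB drives, the same guidance would call for four SNAC nodes rather than three.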

The Right Stuff
Overall, I think Scale has the right idea. I think they need a swift kick in the behind to get out there and offer up solutions that can pack in a lot more storage for their archive node than they'll ever get into 1U. I also think the archive nodes need the ability to kick blocks up to ultra-cheap cloud storage, such as Backblaze's B2, and to be able to regularly dump from the archive node to external storage such as a NAS or tape array.

Scale, of course, wants to keep everything Scale-branded. They have a Disaster Recovery-as-a-Service (DRaaS) cloud offering they'd rather you used instead of Backblaze. They would rather you buy more SNAC nodes for your archive needs than repurpose your old NAS or SAN. This makes perfect business sense, but I'm a penny-pinching tightwad, and therefore always looking for the cheapest way to accomplish things.

It's a gentleman's disagreement, one that can and does take many forms but needn't be acrimonious. The above represents a discussion that we, as virtualization admins, are going to encounter with increasing frequency.

Backup and disaster recovery for our virtual environments are increasingly being delivered as hyperconverged clusters, cloud-in-a-box setups, virtual backup appliances or service delivery provided by the vendor. Vertically integrated plays are displacing do-it-yourself. It may not be long before assembling virtualization infrastructures with multiple vendors is a niche activity.

Questions Needing Answers
Are snapshots and clones from one cluster to another "good enough" to serve as a backup layer? I'm personally in the camp that says regularly exporting hypervisor-level images to a completely different storage medium offsite is a requirement. (Agent-based backups can [text censored].)

What if the destination for those snaps and clones is offsite? DRaaS and colocated archive nodes might be enough for some. Does the number of generations of snaps taken matter? How important is it to make copies of your data that exist outside the ecosystem of your infrastructure provider?

Scale's SNAC nodes are, like the rest of their products, easy to use. They do what they say on the tin, and that's still rare enough to be important. The few issues I have are the same I have with any vendor: trying to get them to optimize their offering for my use cases and trying, eternally, to grind them down on price.

Same as it ever was. Now with more point and click.

About the Author

Trevor Pott is a full-time nerd from Edmonton, Alberta, Canada. He splits his time between systems administration, technology writing, and consulting. As a consultant he helps Silicon Valley startups better understand systems administrators and how to sell to them.