The Cranky Admin
Backup and Disaster Recovery With ioSafe Server 5
It has limitations, but what it does, it does very well.
Backups and disaster recovery are well-worn topics for which there exists a nearly unlimited supply of conflicting guidance. Everyone has an opinion about how to go about it, but in the end each organization's budget and appetite for risk vary. Adding to the classic options for backups and disaster recovery are ioSafe's line of disaster-proof servers and storage, the implications of which deserve some in-depth discussion given today's changing datacenter and WAN networking environments.
Over the years, ioSafe has sent me several units for review, some of which I have gotten to keep. In 2013 my team and I set an ioSafe on fire. We burned it for 20 minutes, put the fire out with a fire extinguisher, then poured freezing cold water on it so that the whole thing would experience rapid temperature changes. We then pulled the drives out and stuffed them into a second ioSafe unit. It is now almost four years later, and those very same drives are still working perfectly.
My company also uses two ioSafe 1513+ NASes. Our most recent addition is an ioSafe Server 5. This is by no means the full extent of our IT plant; the lab is getting disconcertingly large. Using these products in production alongside a diversity of other solutions has allowed for a decent understanding of how ioSafe's offerings fare in the real world.
I've picked ioSafe for a discussion about disaster-proof hardware not because of the free toys, but because of the release of Server 5. It's a general-purpose x86 server, meaning it can be used for bare metal workloads or for virtualization. The concept of fireproof and waterproof solutions onto which we can install whatever we feel like changes the game somewhat.
ioSafe is, to my knowledge, unique in the industry, representing the polar opposite of the current trend toward disposable computing. Today's mainstream solution is to buy lots and lots of "cheap" computers, replicate data as far and wide as possible, and throw computers away when they misbehave.
So why ioSafe? How do they perform, how reliable are they and where do they fit in today's mix of storage and compute, both on-premises and off?
Disaster-Proof Storage
If an organization wanted to feel their data was safe, traditionally they would have to set up additional datacenters and arrange to replicate their data offsite. The expense and difficulty of doing so is one of the primary drivers for the popular uptake of cloud storage, which provides the additional site functionality as a service. These arrangements work fine so long as the organization can afford the fees, with the cost of WAN connectivity being an increasingly important consideration.
ioSafe offers a third way. They build fireproof and waterproof chassis, inside which they shove hard drives. The smallest of these are USB-attached drives aimed at the solo professional or field worker. The largest are midtower-sized 5x 3.5" solutions designed to act as either a dedicated NAS or a full blown x86 server with direct-attached storage.
The solution is designed to protect only the drives. By necessity, the rest of the computer -- including motherboard, CPU and RAM -- sits outside the disaster-proof "core" of the device, because in order for us to make any use of these devices, we have to plug cables into them. The data is what's considered important; if the building burns down around the ioSafe, the chassis can be recovered from the debris, and the hard drives salvaged and placed into a new chassis.
Storage Performance
The concept of ioSafe as primary storage will strike enterprise sysadmins as humorous. Clearly something that tops out at 5 SATA disks per chassis isn't going to go toe-to-toe with the latest and greatest all-flash array. If you're seeking storage to serve your fleventy-squillion transactions per second database, then ioSafe probably shouldn't be under consideration.
On the other extreme, I know people who use the ioSafe Rugged Portable SSD as Windows To Go hard drives. These individuals do a lot of field work, and even their Panasonic Toughbooks can't handle the environments they encounter. They swear by the ioSafe USB drives, calling them the "AK-47 of computers": you can drag them through mud, rivers, jungles and tundra and they'll just keep working.
Somewhere in between these extremes is where most of us live. I don't really see myself hauling a 65-pound Server 5 through any rivers, but I also don't have many customers that run workloads so demanding they need top-of-the-line all-flash arrays, either.
If a RAID of five traditional magnetic hard drives is too slow for what you do, it's worth considering a mix of three magnetic drives and two SSDs. The Synology DSM-based 5-bay units will turn that into a hybrid storage offering that's more than fast enough to saturate the 4x 1GbE ports offered. There's little point with the Synology-based units in going to five flash drives, however, because the network becomes a bottleneck very quickly.
The Server 5 is a different story. With 2x 10GbE ports, it can push 5x SATA flash drives as hard as they'll go. The motherboard inside the Server 5 is a Supermicro X10SDV-4C-TLN2F sporting a Xeon D-1521 CPU and up to 128GB RAM. Both Linux and Windows Server make this into a beautiful hybrid storage solution with minimal effort.
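To put rough numbers behind those saturation claims, here's a back-of-the-envelope sketch. The per-drive throughput figures are my own ballpark assumptions, not benchmark results:

```python
# Back-of-the-envelope throughput check: can the disks out-run the network?
# All per-drive figures are rough assumptions for illustration only.

GBE_MBPS = 125           # ~1GbE in MB/s (1000 Mbit/s / 8)
TEN_GBE_MBPS = 1250      # ~10GbE in MB/s

HDD_SEQ = 180            # assumed sequential MB/s per SATA magnetic drive
SSD_SEQ = 500            # assumed sequential MB/s per SATA SSD (SATA caps ~550)

def array_throughput(drives):
    """Naive best case: sum of per-drive sequential throughput."""
    return sum(drives)

hybrid = array_throughput([HDD_SEQ] * 3 + [SSD_SEQ] * 2)   # 3 HDD + 2 SSD mix
all_flash = array_throughput([SSD_SEQ] * 5)                # 5x SATA SSD

print(f"Hybrid array ~{hybrid} MB/s vs 4x1GbE ~{4 * GBE_MBPS} MB/s")
print(f"All-flash   ~{all_flash} MB/s vs 2x10GbE ~{2 * TEN_GBE_MBPS} MB/s")
# Hybrid: ~1540 MB/s of disk behind ~500 MB/s of network -> network-bound.
# All-flash: ~2500 MB/s of disk vs ~2500 MB/s of network -> roughly balanced,
# which is why the Server 5's 10GbE ports change the calculus.
```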
There are real-world limits to these systems beyond just network capacity. Even in all-flash configurations, all ioSafe server variants are still currently limited to classic SATA drives. This means that lots and lots of small I/Os will bog the storage down even before saturating the network; that's true of any pre-NVMe storage.
In practice, the Synology-based units should handle day-to-day storage operations for up to 25 people. The Server 5 can probably stretch that a little to 50 or 100 people, depending on what exactly it is the organization does. This isn't going to win hearts and minds for datacenter operations teams struggling to keep large enterprises humming, but it works great for small businesses or as a Remote Office/Branch Office (ROBO) unit.
The boundaries of what's possible performance-wise are important, because they inform all the possible places where today's disaster-proof solutions can be useful, as well as where they cannot.
General Compute Performance
The Synology DSM-based units can't really be used for general-purpose computing. Synology does have a marketplace built into their product, but it isn't anywhere near as intuitive to use as cloud computing offerings. Furthermore, most of the non-backup apps in that marketplace are related to standing up Web sites. If you expose a Synology NAS to the Internet for any reason, bad things will probably happen fairly quickly. Don't do it.
ioSafe's Server 5, on the other hand, is a standard Supermicro x86 server. Install whatever you wish. I've spent more than a month stress-testing the Server 5 using various flavors of Linux, Windows and ESXi. With the exception of having to inject the appropriate network drivers into ESXi's installer, it all works beautifully. (The latest ESXi 6.5 U1 finally packages the appropriate driver into the installer natively.)
With four cores (eight threads) at 2.4GHz (2.7GHz max Turbo), the Server 5's CPU is a little weedy to start with. It's also not a CPU you want to run at the red line 24/7. ioSafe's core limitation on what kind of hardware they can support using their disaster-proof technology has always been heat dissipation, and if you run your datacenter warm, the Server 5 will get stressed.
To be fair, ioSafe did their homework. They claim you can operate the unit in environments up to 35°C (95°F), and they're right. It will keep working even if you run the CPU flat-out 24/7. What will happen, however, is that the system won't allow Turbo Boost to kick in at any point, and if there is even a minor localized thermal excursion in a datacenter kept at those temperatures, the ioSafe will go into thermal shutdown.
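If you do plan to run one of these in a warm room, it's worth watching the thermals yourself instead of waiting for the shutdown. Here's a minimal watchdog sketch for a Linux install; the sysfs interface is standard, but the 85°C alert threshold is my assumption, not an ioSafe specification:

```python
#!/usr/bin/env python3
"""Minimal thermal watchdog sketch for a warm-room deployment.

Assumes a Linux host exposing the standard /sys/class/thermal interface.
The alert threshold is my own assumption; tune it to your hardware.
"""
import glob
import time

ALERT_MILLIDEGREES = 85_000   # 85C, assumed alert threshold

def read_zones():
    """Return {zone_name: millidegrees C} for every thermal zone."""
    zones = {}
    for zone in glob.glob("/sys/class/thermal/thermal_zone*"):
        try:
            with open(f"{zone}/type") as f:
                name = f.read().strip()
            with open(f"{zone}/temp") as f:
                zones[name] = int(f.read().strip())
        except OSError:
            continue   # zone disappeared or is unreadable; skip it
    return zones

while True:
    for name, temp in read_zones().items():
        if temp >= ALERT_MILLIDEGREES:
            print(f"WARNING: {name} at {temp / 1000:.1f}C -- shed load now")
    time.sleep(30)
```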
If your goal is to use these systems to run general-purpose ROBO workloads and house the data of a ROBO site, the Server 5 is ideal. Put a hypervisor on there, stand up a domain controller, a local instance of your point-of-sale software, a mail server, some backup software and maybe a unified communications controller, and you're set.
8TB drives are fairly common today. If you went with three of them in a RAID 5 alongside two flash drives, you'd be looking at 16TB of usable hybrid storage, plenty for a standard ROBO deployment today. Of course, you don't need disaster-proof storage to stand up a small compute node, so what's the point?
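For anyone who wants to sanity-check that usable-capacity arithmetic, it's a one-liner:

```python
# RAID 5 keeps (n - 1) disks' worth of data; one disk's worth goes to parity.
# The 3-HDD-plus-2-SSD split is one plausible layout, not the only option.

def raid5_usable(disk_tb, count):
    return disk_tb * (count - 1)

print(raid5_usable(8, 3))   # 16 TB usable from three 8TB drives
```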
RPOs, RTOs and ROBOs
ROBO is the easiest use case in which to discuss disaster-proof solutions. Branch offices can make commercial sense anywhere, not necessarily only where WAN bandwidth is cheap or even available. It may not be possible to run all applications from a central office or from the public cloud. It may not be possible to run applications locally and back up everything remotely within acceptable timeframes. This is where disaster-proof solutions come into play.
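To make "acceptable timeframes" concrete, a little arithmetic helps. Both figures below are assumptions picked to illustrate the problem:

```python
# How long does it take to push a nightly changeset over a modest WAN link?
# Both numbers are assumptions chosen for illustration.

change_set_gb = 200    # assumed nightly data churn at the branch
uplink_mbps = 50       # assumed usable WAN upload, megabits per second

hours = (change_set_gb * 8 * 1000) / uplink_mbps / 3600
print(f"{hours:.1f} hours to replicate {change_set_gb}GB at {uplink_mbps}Mbps")
# ~8.9 hours: workable. Drop the link to 10Mbps and it's ~44 hours --
# the backup window never closes, and keeping the authoritative copy on
# local disaster-proof storage starts to look a lot more attractive.
```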
When your backup targets are more ambitious than bandwidth will allow, the ability to have the building burn down around your IT but still recover the data starts to look attractive. ioSafe offers a recovery point objective (RPO) of 0 with a recovery time objective (RTO) of "a few days to possibly a couple of weeks," depending on how bad the disaster is and how long it takes to get at the hard drives.
In a ROBO situation, that's usually perfectly acceptable. You want an RPO of 0 so that accounting and insurance types have up-to-the-second data, but if it takes a few days to get at the data, you're probably fine. It's not like the branch office is going to be back online before it's safe enough to recover the ioSafe anyway.
Disaster-proof storage also works wonders as a cloud gateway. In a cloud storage gateway, storage transactions would be made against the local unit during office hours and then unspooled to the public cloud over the full 24-hour period. As these are typically employed in situations where the Internet connection is unable to support real-time access to remote storage, there's always a period of vulnerability where writes only exist on the local gateway device but haven't yet been sent up to the cloud. The building burning down during this period would be bad.
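The unspooling pattern itself is simple enough to sketch in a few lines. This toy version is mine, not any vendor's implementation; upload_to_cloud() is a stand-in for a real object-storage client:

```python
"""Toy sketch of the 'unspooling' pattern a cloud storage gateway uses.

Writes land locally and are acknowledged immediately; a background thread
drains them to the cloud as bandwidth allows. Everything still in the
queue is the vulnerable data the article describes.
"""
import queue
import threading
import time

pending = queue.Queue()   # acknowledged locally, not yet in the cloud

def upload_to_cloud(blob):
    time.sleep(1)         # placeholder for a slow, rate-limited upload
    print(f"uploaded {blob}")

def drain_forever():
    while True:
        blob = pending.get()   # blocks until a write is spooled
        upload_to_cloud(blob)
        pending.task_done()

threading.Thread(target=drain_forever, daemon=True).start()

for i in range(5):
    pending.put(f"write-{i}")  # local write: instant acknowledgement
    print(f"acknowledged write-{i} locally (cloud copy still pending)")

pending.join()   # if the building burns before this drains, those
                 # writes exist only on the local gateway device
```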
Doing One Thing and Doing it Well
ioSafe units seem like a natural fit here, but one must take some care to design the solution appropriately. The Synology-based solutions predominantly have consumer and SMB-oriented public cloud storage applications in the marketplace. There isn't exactly a wide variety of enterprise storage clients available.
The Server 5 will allow you to install almost any software you care for, but it doesn't take long before one starts bumping up against the limitations of the CPU. The Xeon-D line was designed to fit in between an Atom and a Xeon E3, and when we start looking at deduplication, encryption or WAN acceleration tasks, it shows.
I tried a number of different storage software solutions during my testing, and it was quickly apparent which solutions were optimized to use the AVX2 instructions in the CPU and which weren't. As a standalone cloud gateway with optimized software, the Server 5 is a capable machine. It will not, however, have enough horsepower left over to run many additional workloads.
Virtualization, the traditional solution for cramming multiple workloads into a single node, can be more curse than blessing; without exactly the right configuration, the AVX2 instructions won't be exposed to the guest OSes. Even with the CPU fully exposed to guests, attempting to run multiple virtualized cloud gateway solutions with enterprise storage features on a single unit is a questionable proposition.
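A quick way to confirm whether AVX2 actually made it through to a Linux guest is to read the CPU flags the kernel sees:

```python
# Run inside a Linux guest: checks the standard /proc/cpuinfo flags field
# for the avx/avx2 feature bits the hypervisor has chosen to expose.

def cpu_flags():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
for feature in ("avx", "avx2"):
    state = "exposed" if feature in flags else "MISSING -- check vCPU config"
    print(f"{feature}: {state}")
```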
Disaster-Proof HCI
Xeon-D systems are quite popular among the enthusiast virtualization community as home lab cluster nodes; I have a few myself. This is where I feel the Server 5 really starts to shine.
With two or three nodes in a cluster, I can not only run some cloud gateway software, but also add in some WAN acceleration VMs and support all the local compute a small site needs, all while serving demanding storage to local users.
In one deployment, I used a cluster of two non-ioSafe Xeon-D systems and an ioSafe Server 5 as a three-node hyper-converged infrastructure (HCI) cluster. I ran primary workloads on the non-ioSafe Xeon-D systems and used the ioSafe node as the archive/disaster recovery node.
Essentially, the two primary HCI nodes were configured to store real-time copies of their VMs on the ioSafe, which would then take regular snapshots. Every time a write occurred on the non-ioSafe nodes, that same write occurred on the ioSafe. To protect against ransomware, snapshots were set to be immutable. High availability protected against failure of either of the primary nodes, while the ioSafe protected against disasters. It worked well.
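The immutable-snapshot piece is less exotic than it sounds. On a Btrfs-backed volume (an option on Synology DSM), a read-only snapshot is a single command; the sketch below wraps it in Python, with hypothetical paths:

```python
#!/usr/bin/env python3
"""Sketch of the read-only snapshot idea used for ransomware protection.

Assumes a Btrfs volume; the paths are hypothetical. 'btrfs subvolume
snapshot -r' creates a read-only snapshot, which is what keeps
replicated-then-encrypted data recoverable.
"""
import subprocess
from datetime import datetime, timezone

SOURCE = "/volume1/vm-replicas"   # hypothetical replica subvolume
stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
target = f"/volume1/snapshots/vm-replicas-{stamp}"

subprocess.run(
    ["btrfs", "subvolume", "snapshot", "-r", SOURCE, target],
    check=True,   # fail loudly; a silent snapshot failure defeats the point
)
print(f"read-only snapshot created at {target}")
```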
The Big Caveats
Scale is certainly a limitation for the ioSafe lineup. Unless and until those oft-discussed -- but never seen -- 30TB+ "inexpensive" SSDs start arriving (and ioSafe moves toward NVMe), a 5-drive solution is always going to be of somewhat limited appeal. Similarly, all the disaster-proofedness in the world won't protect you against straight-up theft.
At some point, somehow, at least one copy of your data needs to make it to a different physical site. This is the only real protection against physical theft. Many administrators will caution that data should exist in at least three different locations; it isn't unheard of for disaster to befall your location and that of your offsite storage simultaneously.
Here, I have personally taken a somewhat inverted approach for one of my production networks. Rather than rely on expensive public cloud storage, or make multiple copies in multiple locations, I simply parked an ioSafe at a colo facility and use it as my backup target. I don't get RPO 0 like I would with an on-premises ioSafe, but I also don't really care if the colo gets flooded either, so I only pay for one off-site copy of my data.
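The plumbing for that arrangement doesn't need to be fancy. Something along these lines runs nightly from cron; the host and paths are hypothetical:

```python
#!/usr/bin/env python3
"""Minimal nightly push to an off-site ioSafe used as a backup target.

A sketch only: host, paths and SSH key setup are assumptions. rsync's
--delete is deliberately omitted so deletions (or a ransomware wipe) on
the source don't immediately propagate to the off-site copy.
"""
import subprocess

SOURCE = "/srv/data/"                              # hypothetical local dataset
TARGET = "backup@colo-iosafe:/volume1/backups/"    # hypothetical colo NAS

subprocess.run(
    ["rsync", "-az", "--partial", SOURCE, TARGET],  # archive, compress, resume
    check=True,
)
```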
The existence of disaster-proof storage and compute opens up some interesting options for designing small or widely distributed networks. It is by no means a panacea for all ills, and it has some very real limitations that administrators must be aware of.
On the other hand, ioSafe's technology does work. It can be trusted. There are situations where physically resilient IT solutions are called for. The addition of the general-purpose x86 Server 5 to the mix makes ioSafe a serious consideration for millions of organizations that wouldn't have given a Synology-based anything a second look.
They Take a Lickin' and Keep on Tickin'
Over the years, I have beaten up my ioSafe servers. I have run them in harsh, even hostile environments. I have done things to them that I shouldn't, and they keep working. Because I am something of a klutz, this endears them to me.
The success of ioSafe ultimately depends on the channel. There are a number of VARs, systems integrators, MSPs and even CSPs out there with industry-specific experience that could put disaster-proof x86 systems to interesting uses.
With a diskless Server 5 starting at $5,799 per node, I don't think anyone is going to rush out to replace an existing server with one of these unless they think they really need it. On the other hand, with the local ISP's connection fees for wiring a site up with fibre to the premises starting at $30,000, I think there's going to be a market for ioSafe for quite some time.