A Composability Primer -- Virtualization Review

A Composability Primer

How it's changing datacenters and the role of the IT admin.

By Trevor Pott
08/24/2017

What does "composable" mean in the context of IT? Vendors are increasingly talking about composable infrastructure or composable workloads, but how do they differ from how we've managed our datacenters for decades? To understand composable IT, we need to dive headfirst into one of the most heated debates in IT: how to administer our datacenters.

Composability, put simply, is the ability to define application configurations, operating system environments and even an entire datacenter's infrastructure in code. The goal of composability is repeatability: by creating definitions for our infrastructure, we can reconstitute any or all of it from bare installers as many times as needed, whenever we need.

Composability is designed in part to provide resiliency against failure. It is also a means of enabling automation and advancing testing, quality assurance and forensics in an era where IT operates at scales the human mind can no longer comprehend. Composability is about moving away from IT that concerns itself with how someone has configured something, and toward discussions about baselines and variances.

Automation can't be avoided or reversed. It's part of the inevitable, inexorable progress that mankind makes in all fields of endeavor. That said, we remain occupied by a singular question: what should management interfaces for datacenters look like? Attempts to answer this question are ultimately at the core of the majority of "innovations" in the tech industry today.

What Is Infrastructure?
Before diving into how we interact with infrastructure, we should take a moment to discuss what the word “infrastructure” means. Up until recently, most administrators would have said that the term means the physical hardware of a datacenter: servers, switches, UPSes and so forth.

Virtualization introduced us to the concept of "virtual infrastructure." Here you'll hear terms like virtual switches, virtual SANs and even entire virtual datacenters. Operating systems and applications were the responsibility of another team.

In a composable world, infrastructure means "everything in a datacenter that is not data." Hardware is just a means to run hypervisors, which are just a means to run operating system environments, which are just a means to run applications, which are just a means to alter data.

Data is sacrosanct. Everything else should be composable.

GUI vs. CLI
There are three primary means for administrators to interact with their infrastructure: Graphical User Interfaces (GUIs), Command Line Interfaces (CLIs) and Application Programming Interfaces (APIs). GUIs, CLIs and APIs all cater to administrators with different needs. They have different advantages and different disadvantages, and it's impossible to declare any one approach better than all others.

Modern GUIs are usually just a graphical front end that sits on top of a CLI or API: they expose some or all of the functionality of those interfaces, and don't generally have the ability to perform actions that wouldn't be possible through the other interfaces. Frequently, GUIs lack the ability to perform certain types of administrative tasks, usually those most likely to cause data loss if the administrator doesn't know what they are doing.

None of this should be considered a downside to GUIs, because the primary purpose of a GUI is discoverability. GUIs aren't the right choice for administrators who will interact with a given piece of infrastructure every single day. GUIs are for administrators who will only interact with something at a frequency measured in months or years.

For day-to-day interaction, CLIs are much faster than GUIs for virtually every operation. Unfortunately, CLIs require a lot of rote memorization to be truly useful. It's regular repetition that burns commands and their switches into an administrator's memory, and this is exactly the sort of thing that doesn't occur if you only interact with an interface a few times a year.

CLIs do tend to have help commands, tab completion and so forth, allowing for some level of discoverability. For those looking to make a single change, however, having the information visually represented, with similar commands clustered together, makes life easier. This is especially true if multiple commands, or multiple switches to a single command, must be filled out together (and correctly) in order for the command to succeed.

The GUI vs. CLI argument, then, really boils down to a debate between administrators who dedicate themselves to learning and memorizing as much as possible about a particular piece of infrastructure and those who don't. In broader strokes, it's the argument of the generalist vs. the specialist.

GUIs are the opposite of composability. Everything in a GUI is about managing in real time. You make a change and it immediately affects a workload. CLIs are more intermediate; they offer the ability to manage in real time, but their scriptability can be an important asset in making some infrastructure elements, especially older ones, composable.

APIs
Administrators do not directly use APIs. APIs are a means for scripts and applications to interact with infrastructure. Administrators making use of APIs will either write a script to perform a series of actions or create a state definition that will be used by an agent.

While it's possible, with the right tools, for administrators to query an API in real time, this is not generally how APIs are used. APIs are predominantly used either as a means of machine-to-machine communication, or to execute carefully pre-planned actions.
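
To make "carefully pre-planned actions" a little more concrete, here is a minimal Python sketch that builds a batch of API requests in advance so they can be reviewed before anything is sent. The base URL, the /vms path and the JSON fields are all hypothetical and don't belong to any real product's API.

```python
import json
import urllib.request
from typing import List

# Hypothetical endpoint, used purely for illustration.
BASE_URL = "https://infra.example.invalid/api/v1"


def plan_requests(vm_names: List[str]) -> List[urllib.request.Request]:
    """Build one POST request per VM without sending anything."""
    planned = []
    for name in vm_names:
        body = json.dumps({"name": name, "cpus": 2}).encode("utf-8")
        planned.append(
            urllib.request.Request(
                url=f"{BASE_URL}/vms",
                data=body,
                headers={"Content-Type": "application/json"},
                method="POST",
            )
        )
    return planned


def describe(plan: List[urllib.request.Request]) -> List[str]:
    """Render the plan as text so a human can review it first."""
    return [
        f"{req.get_method()} {req.full_url} {req.data.decode('utf-8')}"
        for req in plan
    ]
```

Once the plan has been reviewed, each request could be handed to urllib.request.urlopen. Keeping the planning step separate from the sending step is simply one way to reflect the "methodically prepared" style described above.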

In a sense, APIs and GUIs often rest at opposite ends of administrator interaction: the GUI being a means to execute a change immediately, the API being the preferred tool for the methodically prepared. The CLI bridges both worlds, allowing both real-time interaction and scriptability.

APIs are also on the opposite end of the spectrum from GUIs on discoverability. APIs tend to have searchable documentation, but it is very rare that API documentation clusters or categorizes commands in an intuitive fashion like a GUI. APIs tend to rely on you knowing the specific command you're looking for, and are designed for individuals who have the time to do research.

Scripting
Once you've written a script, you can reuse it. This gives scripts unparalleled utility as a means to either perform a task against a lot of “somethings” at the same time, or perform a lot of tasks against “something.”
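
Both kinds of reuse can be sketched in a few lines of Python. The patch and reboot functions below are hypothetical stand-ins that only return a message; a real script would call a CLI or API instead.

```python
from typing import Callable, Dict, Iterable, List

# A task takes a target name and returns a short status message.
Task = Callable[[str], str]


def one_task_many_targets(task: Task, targets: Iterable[str]) -> Dict[str, str]:
    """Perform a single task against a lot of 'somethings'."""
    return {target: task(target) for target in targets}


def many_tasks_one_target(tasks: Iterable[Task], target: str) -> List[str]:
    """Perform a lot of tasks, in order, against one 'something'."""
    return [task(target) for task in tasks]


# Hypothetical stand-ins for real administrative actions.
def patch(host: str) -> str:
    return f"{host}: patched"


def reboot(host: str) -> str:
    return f"{host}: rebooted"
```

Calling one_task_many_targets(patch, ["web01", "web02"]) patches every host in the list, while many_tasks_one_target([patch, reboot], "web01") patches and then reboots a single host.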

Scripts can be simple to write or nightmarishly complex and filled with terror, depending on what you're trying to accomplish and what language you're scripting in. They're generally written in advance of a task, when there is time to research, test and debug before applying against live infrastructure.

While many administrators swear by scripting as the "one true administration method," reality (as always) is a tad more complex. Administering anything except the most basic pieces of infrastructure will require a diversity of tasks to be performed.

If administering entirely by scripting, each task would be its own script, unless one writes scripts complicated enough that they are functionally a surrogate CLI, complete with optional switches. This introduces the problem of managing scripts. Scripts must be organized, categorized, updated as a product's CLI or API changes, and altered to meet the needs of an ever-changing infrastructure. This problem is magnified if an administrator has multiple infrastructure elements to manage.
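
As an illustration of how quickly a script turns into a surrogate CLI, consider this sketch using Python's standard argparse module. The tool name vmtool, its subcommands and its switches are invented for the example, and the script only reports what it would do.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Define subcommands and optional switches, much like a real CLI."""
    parser = argparse.ArgumentParser(
        prog="vmtool", description="Hypothetical VM administration script"
    )
    sub = parser.add_subparsers(dest="command", required=True)

    create = sub.add_parser("create", help="create a virtual machine")
    create.add_argument("name")
    create.add_argument("--cpus", type=int, default=2)
    create.add_argument("--memory-gb", type=int, default=4)

    delete = sub.add_parser("delete", help="delete a virtual machine")
    delete.add_argument("name")
    delete.add_argument("--force", action="store_true")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Parse the arguments and report the action that would be taken."""
    args = build_parser().parse_args(argv)
    if args.command == "create":
        return (
            f"would create {args.name} with {args.cpus} vCPU "
            f"and {args.memory_gb} GB RAM"
        )
    suffix = " (forced)" if args.force else ""
    return f"would delete {args.name}{suffix}"
```

For example, main(["create", "web01", "--cpus", "4"]) reports a four-vCPU machine with the default 4 GB of RAM. Every new subcommand or switch is more code that must be kept current as the underlying product changes, which is exactly the management burden described above.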

Scripting is an important part of composability. Most operating systems and applications are not yet fully composable. While some of their configuration can be defined at install, there is frequently some amount of configuration that will have to be injected or applied after instantiation. Scripting allows this to occur as the last part of the instantiation process.

Desired State
The desired state approach can be thought of as the ultimate expression of both composability and APIs. Administrators define how they would like infrastructure to be in an easy-to-read text file using a language such as YAML or JSON. Some piece of software, usually called an agent, reads the definition and implements the configuration.

For the most part, agents apply configurations by interacting with APIs. In some instances they'll fall back to interacting with CLIs when APIs aren't available. Agents regularly poll APIs to determine the status and configuration of infrastructure.

If the status or configuration of infrastructure is other than that which is defined, the agent will attempt to remedy this by applying the desired state. If that fails, the agent will enter an alarm state.
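
The read, compare, apply and alarm cycle described above can be sketched roughly as follows. This is a toy model rather than how any particular tool works: the setting names are made up, the definition is JSON only because Python can parse it without extra libraries, and a plain dictionary stands in for the infrastructure an agent would normally reach through an API.

```python
import json
from typing import Callable, Dict

# Hypothetical desired-state definition; the keys are illustrative only.
DESIRED = json.loads(
    '{"hostname": "web01", "ntp_server": "10.0.0.1", "ssh_enabled": true}'
)


def find_drift(desired: Dict, actual: Dict) -> Dict:
    """Return the desired settings that the actual state doesn't match."""
    return {k: v for k, v in desired.items() if actual.get(k) != v}


def reconcile(
    desired: Dict,
    read_state: Callable[[], Dict],
    apply_setting: Callable[[str, object], None],
) -> str:
    """Run one polling cycle and return 'ok', 'remediated' or 'alarm'."""
    drift = find_drift(desired, read_state())
    if not drift:
        return "ok"
    for key, value in drift.items():
        apply_setting(key, value)
    # Poll again: if the state still differs, remediation has failed.
    if find_drift(desired, read_state()):
        return "alarm"
    return "remediated"


# A dictionary standing in for real infrastructure (an assumption).
fake_server = {"hostname": "web01", "ntp_server": "192.0.2.50", "ssh_enabled": True}


def read_fake() -> Dict:
    return dict(fake_server)


def apply_fake(key: str, value: object) -> None:
    fake_server[key] = value
```

The first call to reconcile(DESIRED, read_fake, apply_fake) corrects the drifted NTP setting and a second call finds nothing left to do. If applying a setting silently fails, the second poll still shows drift and the function reports an alarm.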

Alarm states can be dealt with in multiple fashions, all depending on how reactions are defined by administrators. The simplest reaction to an agent in an alarm state would be to generate an event in a log, which could in turn lead to the administrator being alerted so that they can investigate.

In truly composable environments, however, infrastructure in an alarm state is frequently jettisoned and rebuilt. The assumption behind this approach is that the baseline definitions for infrastructure are "known good." Alarms are generated by changes (usually made for testing), compromise by malicious actors or hardware failure. Reinstantiation using the baseline state definitions should thus produce viable infrastructure, and may not require administrator intervention.

Software
Composability is at the heart of many of the tools favored by those who believe in the DevOps approach to systems administration. It's also critical to the rise of containers as a workload packaging mechanism.

Puppet, Chef, Ansible, SaltStack, Otter, CFEngine and others are desired state configuration tools. The public cloud providers all have their own versions as well. You'll find these tools at the center of all things DevOps.

From an organizational standpoint, these tools are useful not only because they make configuration of datacenters at scale feasible, but because they allow the use of standard developer tools to solve multiple problems around change management. Because infrastructure is defined in a text file, version control systems like Git can be used to track every change made. Identifying the difference between what works and what doesn't thus becomes trivial.
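
To see why text definitions make this comparison so easy, here is a small sketch using Python's standard difflib module in place of a real version control system. The two definitions are invented; in practice, a tool such as Git would produce a similar diff between two commits.

```python
import difflib
from typing import List

# Two hypothetical versions of the same infrastructure definition.
KNOWN_GOOD = """hostname: web01
ntp_server: 10.0.0.1
ssh_enabled: true
"""

CURRENT = """hostname: web01
ntp_server: 10.0.0.99
ssh_enabled: true
"""


def changed_lines(old: str, new: str) -> List[str]:
    """Return only the removed (-) and added (+) lines between versions."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="baseline",
        tofile="current",
    )
    # Skip the two file-header lines, then keep additions and removals.
    body = list(diff)[2:]
    return [line.rstrip("\n") for line in body if line.startswith(("+", "-"))]
```

Here changed_lines(KNOWN_GOOD, CURRENT) narrows the difference between the working and non-working definitions down to the single NTP setting that changed.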

Version control systems with robust Role Based Access Controls (RBAC) can be used to assist in attribution in the event of failure or compromise. They can also be useful during audits, as it's often possible to see not only who made a given change, but who authorized that change's deployment, and when.

Containers can be thought of as the logical end result of composability. x86 virtualization presumes that everything from the operating system environment up through to data must be preserved, because it was designed for workloads from an era before desired state configuration tools existed. Containers presume composability, and thus have largely eschewed the advanced resiliency capabilities of full virtualization.

Software-Defined
Software-defined solutions -- the real ones, not the ones just trying to ride the hype bandwagon -- are how composability is applied below the operating system environment. Let's consider a couple of examples.

In the old way of approaching infrastructure, in order to install an operating system, a storage administrator might have carved out a chunk of storage called a LUN. The storage administrator would have had to choose which storage device to place it on, pick the storage media within that device, make the storage available to a virtualization cluster, define backups and so on. Each storage system was different and had its own quirks to master.

With software-defined storage, at worst an administrator says they need X amount of storage that meets Y performance, replication and backup criteria. The storage fabric software figures out which physical storage devices, media and so on to use.

Software-defined networking and other software-defined infrastructure elements are very similar: administrators no longer tell infrastructure what to do. They say "I want X and it needs to obey parameters Y" and the software figures out the rest.
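
The "I want X and it needs to obey parameters Y" model can be sketched for the storage example as a simple placement function. The pool names, capacities and performance figures below are invented, and the selection rule (pick the least capable pool that still meets the request) is only one plausible policy, not a description of how any real storage fabric decides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    name: str
    free_gb: int
    iops: int
    replicas: int


@dataclass
class Request:
    size_gb: int       # the "X" the administrator asks for
    min_iops: int      # the "Y" criteria the storage must meet
    min_replicas: int


def place(request: Request, pools: List[Pool]) -> Optional[Pool]:
    """Pick a pool that satisfies the request, or None if none can."""
    candidates = [
        p for p in pools
        if p.free_gb >= request.size_gb
        and p.iops >= request.min_iops
        and p.replicas >= request.min_replicas
    ]
    if not candidates:
        return None
    # Assumed policy: keep the fastest pools free for demanding requests.
    chosen = min(candidates, key=lambda p: p.iops)
    chosen.free_gb -= request.size_gb
    return chosen


# Hypothetical inventory of physical storage.
POOLS = [
    Pool("nvme", free_gb=500, iops=100_000, replicas=3),
    Pool("sata", free_gb=4_000, iops=5_000, replicas=2),
    Pool("hybrid", free_gb=2_000, iops=20_000, replicas=3),
]
```

The administrator only ever writes the Request. Which pool ends up holding the data is the software's decision, and it can change as the inventory changes.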

Cloud
Where this all comes together is clouds. What is a cloud? A cloud is a piece of management software where a systems administrator or end user can instantiate a usable workload by defining what workload they want (or what they want the workload to do). No other interaction from the administrator or user is required.

In traditional IT, fulfilling such a request would have required multiple administrators to manually provision physical and virtual infrastructure, install software potentially including hypervisors, operating system environments and applications, and configure the entire stack. Only then would it be made available. This highly manual process was prone to human error. Scripts helped. Software-defined solutions helped. Composability helped a whole bunch.

Clouds take the humans completely out of the provisioning loop. Clouds take a human's "I want this workload" and convert that into "I want this storage," "I want this networking," "I want this operating system," "I want this application" and so on. Software-defined solutions convert "I want" into "do this."
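
As a very rough illustration of that first conversion, the sketch below expands a single workload request into the per-layer requests a cloud would hand to its software-defined components. The catalog entry and its contents are hypothetical.

```python
from typing import Dict, List

# Hypothetical catalog mapping a workload to what each layer must provide.
CATALOG: Dict[str, Dict[str, str]] = {
    "blog": {
        "storage": "50 GB, replicated twice",
        "networking": "one public HTTPS endpoint",
        "operating system": "a Linux base image",
        "application": "a blog engine and its database",
    },
}


def decompose(workload: str) -> List[str]:
    """Turn 'I want this workload' into one 'I want' per layer."""
    if workload not in CATALOG:
        raise KeyError(f"no definition for workload {workload!r}")
    return [
        f"I want this {layer}: {need}"
        for layer, need in CATALOG[workload].items()
    ]
```

Each string returned by decompose("blog") would, in a real cloud, become a request to a software-defined storage, networking or provisioning layer, which then works out the "do this" part.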

The entire thing, top to bottom, is composable. (Or, at the very least, it should be, if you want your cloud to be actually usable.) Clouds rely on composability. They rely on intelligent software-defined infrastructure.

Clouds make ordering up workloads -- with all of the intricate minutiae of provisioning, configuring and making those workloads available -- the sort of task best suited for a GUI: it's something that an individual user or administrator will likely only do infrequently, as needed. Unless they're the type of administrator that creates and destroys a great many workloads every day, in which case they'll use scripts…

And that, dear readers, is composability.