The Cranky Admin

Cloud & Virtualization Computing: Defining the Terms

To properly discuss IT-industry issues, a set of definitions needs to be established.

The tech industry is terrible at naming things. Ask 15 different people what they mean when they say "cloud" and you'll get a dozen different answers. Worse, those answers will change over time as the technology evolves. This article is about codifying some terminology.

For simplicity's sake, I need to set a few definitions in place. The alternative is irritating for everyone involved: I'd end up calling various parts of the global network by names four or five descriptors long.

It's also helpful to define terms when we're talking about slightly nebulous concepts that frequently get lost in semantic debates. I've lost track of the number of heated arguments I've had on social media that ultimately turned out to be violent agreement between both parties, entered into through disastrous misunderstandings about nomenclature.

With this in mind, let's set up some definitions. If you arrived here because this article was linked to elsewhere and you're looking for a specific term, the find command -- Ctrl-F or Command-F -- is your friend.

Cloud Computing
Cloud computing is essentially a self-service layer on top of IT services. It can be virtualization, containerization, or bare metal-as-a-service, as long as there is a self-service portal. In today's world, I'd add that in addition to the self-service GUI there should be an API so that it can be addressed programmatically.
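
To illustrate the "addressed programmatically" part, here's a minimal sketch of what self-service looks like through an API rather than a GUI. The endpoint, token and payload are entirely hypothetical; every cloud's API differs, but the shape is the same.

    import requests

    # Hypothetical self-service API endpoint and payload; every real cloud
    # exposes something equivalent, but the names and fields will differ.
    API = "https://cloud.example.com/api/v1"
    TOKEN = "replace-with-your-api-token"

    resp = requests.post(
        f"{API}/instances",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"name": "web01", "flavor": "small", "image": "ubuntu-22.04"},
    )
    resp.raise_for_status()
    print("Provisioned instance:", resp.json())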

Infrastructure-as-a-Service (IaaS)
IaaS is the ability to spin up the basic infrastructure needed to support a workload via a self-service portal. Basic infrastructure needs to include something to run a workload on or in, and a means for that workload to interact with the world. A virtual machine (VM) and virtual networking is the typical example. Technically, however, it could be a hardware box and a connected serial port; that's just a super niche example.
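
As a concrete, hedged example, here's roughly what "a VM plus its networking" looks like when driven through AWS's boto3 SDK. Any IaaS provider's SDK would do; the IDs below are placeholders, and credentials are assumed to already be configured.

    import boto3

    # A minimal IaaS sketch: one VM plus the networking it needs to talk
    # to the world. The AMI, subnet and security group IDs are placeholders.
    ec2 = boto3.client("ec2", region_name="us-east-1")

    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",            # placeholder image
        InstanceType="t3.micro",
        MinCount=1,
        MaxCount=1,
        SubnetId="subnet-0123456789abcdef0",        # the "virtual networking" piece
        SecurityGroupIds=["sg-0123456789abcdef0"],  # who the workload may talk to
    )
    print("Launched:", response["Instances"][0]["InstanceId"])

The point isn't the particular SDK; it's that infrastructure arrives via an API call rather than a ticket to the infrastructure team.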

Platform-as-a-Service (PaaS)
PaaS is a pre-canned environment. An easy example might be a VM template that includes Linux, Apache, MariaDB and PHP. A complicated example would have the words "containers" and "microservices" in it.

The goal behind PaaS is that a developer be able to instantiate an environment and immediately build or inject an application. Whereas IaaS provides infrastructure for a systems administrator to build an environment in without needing to wait on infrastructure provisioning, PaaS provides a pre-built environment so developers don't have to wait on systems administrators to build and patch it.
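
A hedged sketch of what that looks like in practice, using AWS Elastic Beanstalk driven through boto3 (the names are illustrative, and any PaaS with an API would serve): the developer asks for a PHP platform and gets back an environment that is already built and patched.

    import boto3

    # Assumes AWS credentials are configured and that the region offers at
    # least one PHP platform; application and environment names are made up.
    eb = boto3.client("elasticbeanstalk", region_name="us-east-1")

    php_stacks = [
        s for s in eb.list_available_solution_stacks()["SolutionStacks"] if "PHP" in s
    ]

    eb.create_application(ApplicationName="demo-app")
    eb.create_environment(
        ApplicationName="demo-app",
        EnvironmentName="demo-env",
        SolutionStackName=php_stacks[0],  # pick a current pre-built PHP platform
    )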

Software-as-a-Service (SaaS)
Software as a Service is the ability for an end user to instantiate an entire application without needing to wait on the developer, the systems administrator or an infrastructure team. Push button, receive bacon.

First-party vs. third-party cloud services
First-party cloud services are services offered by the cloud provider itself. Contrast this with SaaS applications made available by third-party vendors operating within the cloud provider's ecosystem; these third parties typically ply their wares through the cloud provider's marketplace.

Amazon's Aurora and Redshift are examples of first-party services. Rationally, Oracle databases running on Amazon's cloud are third-party services. Amazon's Relational Database Service (RDS), however, offers numerous third-party databases -- including Oracle's -- as managed services. Public cloud providers make clear-cut definitions of everything hard.
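
To make the blurriness concrete, here's a hedged boto3 sketch of spinning up Oracle (third-party software) as a managed RDS instance (a first-party offering). Identifiers, credentials and sizing are placeholders, and Oracle licensing options vary.

    import boto3

    # Oracle delivered as a managed first-party service through Amazon RDS.
    rds = boto3.client("rds", region_name="us-east-1")

    rds.create_db_instance(
        DBInstanceIdentifier="demo-oracle",
        Engine="oracle-se2",                 # third-party database engine
        LicenseModel="license-included",
        DBInstanceClass="db.m5.large",
        AllocatedStorage=100,
        MasterUsername="admin",
        MasterUserPassword="replace-me-please",
    )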

Bulk Data Computational Analysis (BDCA)
BDCA is my term for a collection of machine analysis and decision-making technologies, techniques and algorithms. BDCA is a catch-all term that includes -- but is not limited to -- machine learning, artificial intelligence, business intelligence and analytics.

BDCA tools can be provided as software -- or in some cases specialized hardware -- that can be used on-premises, or as part of a larger software package. The really sexy BDCA tools, however, are offered as proprietary first-party cloud services. These are typically consumed via API.
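
"Consumed via API" looks something like the following hedged sketch, here using Amazon Comprehend as a stand-in for the category; the other big providers' analytics and ML services work much the same way.

    import boto3

    # A first-party BDCA service consumed via API. Assumes AWS credentials
    # are configured; the input text is just an example.
    comprehend = boto3.client("comprehend", region_name="us-east-1")

    result = comprehend.detect_sentiment(
        Text="The tech industry is terrible at naming things.",
        LanguageCode="en",
    )
    print(result["Sentiment"], result["SentimentScore"])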

The Public Cloud
The public cloud consists of the small number of large cloud providers that incorporate all of the above concepts. A public cloud provider offers IaaS, PaaS, SaaS, as well as a list of common first-party cloud services (databases, backups and so on).

Most importantly, the public cloud vendors offer proprietary BDCA services. These offerings set them apart from lesser clouds.

BDCA tools are the ultimate lock-in. While today's BDCA offerings from the major public cloud vendors do similar things, emerging BDCA tools aren't designed, developed, architected or programmed. They're trained.

Ten years in, a given BDCA tool might be a commercially viable digital wunderkind or a jabbering wreck. You'll never know until you feed it data and let the algorithm evolve. This makes next-generation BDCA tools, by the time they're released as services for consumption, as unique as a person.

From a vendor's standpoint, unique is good. Unique means you have something you can make customers dependent on, and dependency causes friction. Friction keeps customers from leaving.

Lock-in is what separates public clouds from services providers. At present, the public cloud providers are Amazon, Microsoft, Google and IBM.

Services Provider Clouds
Services provider clouds contain some, perhaps even most, of the features of a public cloud. They may even contain some common first-party cloud services such as backup or database offerings.

Services providers do not typically build most of the services they offer. Depending on the size of the services provider, they may build the management interface and/or customer-facing self-service portal. Many of the first-party cloud services they offer, however, are accomplished by licensing the technology from third parties.

There is great diversity in the first-party cloud services offered by services provider clouds. This is in part because there is a seemingly unlimited number of startups offering software to services providers. Every variation on every cloud service you can imagine has been done -- at least twice -- and the startups that make this software are falling all over themselves to offer it up to services providers.

Major services providers include the likes of Digital Ocean and Rackspace. These companies have made customer service their unique selling point, but have failed to develop friction-generating proprietary services.

Private Clouds
A private cloud is a cloud owned and operated by the organization that uses it. Ish. This can actually get a bit murky because a company with a private cloud can turn around and open their self-service portal up to customers tomorrow, turning their private cloud into a services provider cloud.

The cloud software in play makes a big difference here. VMware can be anything you want it to be, provided you're willing to learn how to beat it into shape.

Microsoft's Hyper-V can similarly do anything you want it to do, but until recently the management tools have been appallingly bad. There's Azure Stack, which killed off Azure Pack, but it's less a way for you to offer clouds to others than a way to get you to pay to onboard yourself into Microsoft's Azure public cloud.

If you want to build a private cloud that you can easily open up to customers, Yottabyte may be your best bet. (Disclosure: Yottabyte is a customer of mine.) To my knowledge, they are the only ones offering a fully baked cloud-in-a-can that natively supports nested virtual datacenters. You can create a virtual datacenter that contains a fully isolated virtual datacenter, which can contain a fully isolated virtual datacenter, and so on and so forth.

Compare this to Nodeweaver. Out of the box, Nodeweaver offers IaaS and some limited pre-canned PaaS, but doesn't even fully offer virtual datacenter isolation. If you wanted nested Nodeweaver, you'd actually have to make a VM and load up Nodeweaver inside that VM. This gets inefficient in a right hurry.

Hybrid Cloud Services
Hybrid cloud services come in two flavors. The first is an on-premises service that utilizes the IaaS components of a cloud provider. Veeam to Azure is a fantastic example. The second version of a hybrid cloud service would be something that lives in the cloud but has an on-premises component. Unified communications services like Skype for Business would be an example.

Hybrid Infrastructure
Hybrid infrastructure vs. hybrid cloud is a source of great debate. Scale Computing will call its current Google Cloud Platform (GCP)-based Disaster Recovery-as-a-Service (DRaaS) offering a hybrid cloud. I vehemently disagree.

Scale's offering merely lights up a VM on GCP that contains an instance of Scale's hyper-convergence software and some networking glue. This allows an on-premises cluster to back itself up to Google's cloud. Once snapshots of VMs exist in the GCP Scale VM, they can be spun up as nested VMs should a disaster recovery scenario occur.

While Scale's offering is certainly useful and will absolutely find a great many customers, the only thing cloud about it is that the destination DR node is virtual and lives in GCP. As per the definition above, clouds must have self-service portals. Scale's offering does not. It has a systems administrator/virtualization administrator UI and that's it. There is no end user-facing portal.

Similarly, if I placed a VMware cluster in a colo, created a stretch cluster and vMotioned a VM from my premises to the colo, I would not have created a hybrid cloud. It's just hybrid infrastructure that is still IT-facing, controlled and consumed by a single organization/department.

Hybrid Clouds
Hybrid clouds are clouds -- proper clouds, with self-service portals -- that can move workloads between infrastructure controlled by the organization using the cloud and infrastructure controlled by another organization. The canonical example of a hybrid cloud is Azure Stack.

OpenStack can be a great example of a hybrid cloud as well. It really depends on the flavor you get. Most OpenStack is inter-compatible, and most of it runs on the Kernel-based Virtual Machine (KVM). If you have a KVM-based OpenStack cloud and your services provider also runs one, chances are you can arrange to send your workloads to them and vice versa.
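
A hedged sketch of what that inter-compatibility buys you, using the openstacksdk Python library: the same code talks to an on-premises OpenStack cloud and to a services provider's. The cloud names are hypothetical entries in a clouds.yaml file, and the image, flavor and network names are placeholders.

    import openstack

    # Two KVM-based OpenStack clouds, one local and one at a services
    # provider, driven through the same SDK and the same calls.
    onprem = openstack.connect(cloud="onprem")
    provider = openstack.connect(cloud="provider")

    print("On-prem workloads:", [s.name for s in onprem.compute.servers()])

    # Burst a workload onto the provider's cloud.
    image = provider.compute.find_image("ubuntu-22.04")
    flavor = provider.compute.find_flavor("m1.small")
    network = provider.network.find_network("tenant-net")

    server = provider.compute.create_server(
        name="burst-web01",
        image_id=image.id,
        flavor_id=flavor.id,
        networks=[{"uuid": network.id}],
    )
    provider.compute.wait_for_server(server)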

Most of the hybrid cloud offerings on the market are OpenStack-based, though there are a good number of VMware services providers out there as well. If you're willing to brave vRealize, you can make a hybrid VMware cloud.

Oh, and there's VMware on AWS. That's probably going to stay a thing.

Yottabyte deserves a mention here as well. Yottabyte has built all the network glue into its offering to ensure that it will work as a hybrid cloud offering. They're one of the few independent, non-OpenStack providers to really put the effort in to tick all the boxes.

Cloud in a Can
Cloud in a can does what it says on the tin. (I'm sorry, I couldn't resist.) Cloud in a can is a hyper-converged cluster that comes with a hypervisor and self-service cloud software pre-installed. Stratoscale is a good example, and one that can also be a hybrid cloud.

Cloud as a Service
Cloud as a service is an offering where a vendor installs and manages a cloud on the customer's premises. The vendor may or may not also run its own cloud, which may or may not be connected as another region on which the organization can instantiate workloads. The cloud-as-a-service offering may or may not be able to move, instantiate or control workloads on services provider clouds or public clouds.

Hypergrid is an example of a player here. Cisco's Metacloud is another.

Telco Cloud
A telco cloud is a cloud owned and operated by a telco. The main advantages of a telco cloud are proximity, low latency and bandwidth cost savings. Many telcos will charge less for network access and bandwidth consumption if the traffic stays "on net" and doesn't require peering to another network or the wider Internet.

Edge Computing
Edge computing is infrastructure owned and controlled by a cloud provider, hosting latency-sensitive first-party cloud services. These will most likely be BDCA services. I've written about edge computing in more detail elsewhere, including one way in which it will probably go horribly wrong.

Latency sensitivity is the key. The point of edge computing is that it's designed to offer complex services to devices at near real-time latencies: latencies low enough that life-critical computing services can be built to take advantage of first-party cloud services like BDCA tools.

The Core
The core consists of the large cloud datacenters of the world. In contrast with the edge, where modest amounts of compute are placed proximate to demand, datacenters in the core measure their compute in acres. Latency to the core is typically tens or even hundreds of milliseconds, but the compute power of the core is rivaled only by supercomputers and the super datacenters of signals intelligence agencies.

Fog Computing
See: edge computing. Cisco likes renaming things. The term has also been used to refer to an as-yet-nonexistent layer of cloud provider-controlled infrastructure that sits between the edge and the core in both compute power and latency.

The Treadmill
As is always the case, this article will likely be out of date almost as soon as it is published. Some new term will come along, or a great big Twitter spat will break out over just what exactly qualifies as hyper-convergence and I'll wish I'd done a section on that.

Tech marketing terms are a never-ending linguistic treadmill. Hopefully, however, this article will serve well enough to distinguish between key concepts for a few years. It's the concepts themselves that matter, not the terminology: the distinctions -- and the blurred lines -- that mark the tech landscape of the closing months of 2017.
