Are Computer Analytics Doing More Harm Than Good? -- Virtualization Review

The law of unintended consequences rears its ugly head -- again.

By Trevor Pott
10/17/2017

The new en vogue way to cut business costs is to replace highly trained and skilled knowledge workers with Bulk Data Computational Analysis (BDCA) tools such as Artificial Intelligence (AI), Machine Learning (ML), Business Intelligence (BI) and other forms of analytics. Right now, however, all BDCA tools are as flawed as the humans they seek to replace, and there is no solution in sight.

BDCA tools are everywhere. Many of the simple ones work quite well, and we rely on them every day. In the virtualization industry, tools like VMware's Distributed Resource Scheduler (DRS) analyze parameters of running virtual machines (VMs) and migrate them from host to host to keep a cluster load-balanced. More advanced versions of this exist, with the most popular being Turbonomic. These are simple, practical BDCA tools that mostly work.

Mostly.

DRS has its flaws. If it didn't, Turbonomic wouldn't exist. Turbonomic is sort of like DRS, but it operates both on-premises and in the public cloud, and will not only help move existing VMs around, but also provide advice on how to change the configuration of VMs to optimize utilization.

Turbonomic also has its flaws; it's based on an idealized version of an economic model that has never actually worked in the real world. If the criteria driving your VM placement and sizing decisions are simple and based on nothing more than price, Turbonomic typically saves organizations money.

Questions That Need Answers
That said, it takes me about 30 seconds in a briefing with Turbonomic to start running up against criteria that Turbonomic's algorithms simply don't consider. I worry about things like legal exposure. What data is inside a given VM -- or processed by a given VM -- and to whom is that data visible, and under what circumstances?

Maybe I have data that can't ever be placed on a service with an American legal attack surface. Maybe I have data that has to comply with the GDPR. Maybe I handle data for some customers who are more paranoid about privacy issues, while also having other customers who just don't care.

There are also interrelations to consider. VMs on this infrastructure here have these sorts of data protection characteristics, while on that infrastructure they can only have those data protection options. These VMs need to run together, except on alternate Tuesdays when I also need those VMs to run with these other VMs because they'll get together and do a huge analysis. After the Tuesday party is over, however, the two groups can split to different tiers of usage.

With work, I can beat Turbonomic into solving some of this. I can also turn to other automation tools designed for private clouds. In both cases, however, I need to have a fair amount of technical skill, because what I ultimately have to do is convert my needs into something an existing analytics algorithm can understand.
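To make "converting my needs into something an algorithm can understand" concrete, here is a minimal sketch in Python. It is not how Turbonomic or DRS work internally; the `Host` and `VM` types, the prices and the jurisdiction labels are all hypothetical, invented for illustration. It contrasts a price-only placement decision with one that first filters out hosts a VM's data may never touch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    price_per_hour: float  # hypothetical price, arbitrary currency
    jurisdiction: str      # hypothetical label for the host's legal exposure


@dataclass(frozen=True)
class VM:
    name: str
    # Jurisdictions this VM's data must never be exposed to (assumed input).
    forbidden_jurisdictions: frozenset = frozenset()


def cheapest_host(vm, hosts):
    """Price-only placement: cost is the sole criterion."""
    return min(hosts, key=lambda h: h.price_per_hour)


def cheapest_allowed_host(vm, hosts):
    """Placement that applies a legal constraint before optimizing for price."""
    allowed = [h for h in hosts
               if h.jurisdiction not in vm.forbidden_jurisdictions]
    if not allowed:
        raise ValueError(f"no host satisfies the constraints for {vm.name}")
    return min(allowed, key=lambda h: h.price_per_hour)


hosts = [
    Host("cloud-us", 0.10, "US"),
    Host("cloud-eu", 0.14, "EU"),
    Host("on-prem", 0.20, "CA"),
]
vm = VM("customer-db", forbidden_jurisdictions=frozenset({"US"}))

print(cheapest_host(vm, hosts).name)          # cheapest overall
print(cheapest_allowed_host(vm, hosts).name)  # cheapest that is legally allowed
```

The hard part is not the filter itself. Someone with both the legal knowledge and the technical skill has to work out what belongs in `forbidden_jurisdictions` for every VM, and keep it current.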

Trying to teach a VM optimization application how to care about additional criteria is comparatively simple. This is just extending how that particular BDCA tool thinks, and these are really primitive BDCA tools to start with. When we start looking at how BDCA tools are used to try to eliminate knowledge workers, however, things get more complicated.

The problem is always the same: BDCA tools are designed to handle fairly straightforward and rational criteria. The real world isn't always rational, and it's almost never straightforward.

For BDCA tools to be truly useful, they need to be capable of considering complex (and ever-changing) social, political and economic issues. This is a problem. Teaching BDCA algorithms about politics is a bit like trying to explain the Prime Directive to a cat. There's a lot of evolution that needs to take place before understanding can be achieved.

The Social Part of Today's Media
Social media companies have a lot to teach us about the fuzzy grey area where BDCA tools and politics intersect. Each has encountered real-world problems where politics has required humans to be used. Not because humans are better, but because objective application of rule sets is not what's actually desired.

All of these companies are under a great deal of pressure to combat various social problems that manifest themselves on services offered by these organizations. Bullying, hate speech and false news reporting are prime examples. While legally little more than minor irritants in the United States, all are illegal in various countries around the world, and all are perennial problems on all of today's social media platforms.

Vulnerable individuals have been driven to suicide by individuals abusing social media. Extremists and orthodoxists have used social media as a platform for radicalization and even to "dox" individuals they wish to be punished or murdered. Elections and other political events have been swayed by state-sponsored groups using bots to amplify false and misleading information as "news."

There exist some reasonably simple things that the organizations in question could do to block the bulk of such traffic, especially with today's BDCA tools. Certain uses of rhetoric, dog whistle vocabulary, targeted vulgarity and so forth are easy to identify. While there is a certain linguistic treadmill effect to compensate for, these organizations are huge, and a handful of people paying attention to Reddit could easily keep the BDCA tools up to date with the evolution of language.
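As a rough illustration of how simple the identification step can be, the sketch below flags posts against a maintained term list using Python's standard `re` module. It is an assumption-laden toy, not any platform's actual moderation system: the function names and the placeholder terms are invented, and a real tool would need far more than word matching. Updating the list and rebuilding the matcher stands in for keeping up with the linguistic treadmill.

```python
import re


def build_matcher(terms):
    """Compile a case-insensitive, whole-word matcher for the current term list."""
    if not terms:
        raise ValueError("term list must not be empty")
    # Longest terms first so multi-word phrases win over their prefixes.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b"
    return re.compile(pattern, re.IGNORECASE)


def flag(text, matcher):
    """Return the distinct listed terms found in text, lowercased and sorted."""
    return sorted({m.group(0).lower() for m in matcher.finditer(text)})


# Placeholder vocabulary; a real list would be curated by people.
terms = {"badterm", "coded phrase"}
matcher = build_matcher(terms)
print(flag("A BadTerm and a coded phrase in one post.", matcher))

# The treadmill: language shifts, so the list is updated and the matcher rebuilt.
terms.add("newcode")
matcher = build_matcher(terms)
print(flag("Now they say newcode instead.", matcher))
```

A matcher like this applies the same rule to every author, with no notion of who is speaking or what happens if they are sanctioned.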

Today's social media companies choose not to engage in this sort of "automated moderation," not because the tools don't exist, but because today's BDCA tools are blunt instruments that cannot be modified to compensate for the social, political and economic impacts of their use.

Twitter May Yet Kill Us All
Consider for a moment Donald Trump, who some fear may cause a nuclear war through his use of the social media service Twitter. This is the most extreme of outlier events, and unlikely to have been considered during the design of the service, but it is at least hypothetically possible.

In the realm of slightly more mundane considerations, a wide range of individuals have been accused of using Twitter to spread falsehoods, attack opponents and the press, and threaten others. In some cases, Twitter reacts by deleting tweets, locking out accounts or even deleting accounts; but Twitter's response to similar tweets by different individuals is hardly uniform.

Many believe that powerful individuals, including the current President of the United States of America, violate Twitter's terms of service almost daily. Despite the similarity of tweets by these powerful individuals to those which have gotten others banned, Twitter will not challenge certain individuals.

Herein lies the problem of the algorithm. Twitter absolutely could define their terms and conditions in such a fashion that they could be rigidly upheld by BDCA-backed bots. We have the technology to program tools to make objective assessments of what someone writes and determine if it is bullying, hateful, incitement, threatening or so forth. Applied objectively, however, a great many powerful individuals and groups would be sanctioned, and this would very quickly lead to the end of Twitter.

Some think Twitter CEO Jack Dorsey and all other executives at Twitter are cowards for refusing to challenge how powerful individuals use Twitter. The supporters of those powerful individuals would be just as irate should Dorsey and the Twitter board take action.

In addition to alienating political factions, Twitter must consider that they may face direct reprisals from said powerful individuals were they to regulate their use of the service. Again, Trump serves as a useful example. If Twitter were to ban Trump, he would almost certainly retaliate.

If Trump brought to bear his personal wealth, he could drown Twitter in lawsuits. Using the office of the presidency he could work to have social media outlets considered publishers and responsible for the content of their users: if he can't use Twitter, Trump certainly has the means to ensure no one else can either.

Even if Trump took the high road and eschewed the use of strong-arm tactics, he -- like many other powerful and controversial figures -- commands millions of followers around the world. A coordinated boycott egged on by a powerful demagogue could hurt Twitter right in the value proposition: Twitter's active user count.

The problems facing Twitter are human problems. As a platform where humans interact, Twitter finds itself caught between a series of nuclear options and potentially being used as a nuclear option. No amount of code-enabled BDCA pixie dust can resolve this.

Policy Inaction
BDCA use gets complicated by social and political issues rather quickly. Were a social media organization like Twitter to enforce their terms and conditions with an algorithm, they would by definition be strictly codifying them. That's a huge departure from the deliberately vague and fuzzy terms that exist today, which explicitly leave room for human oversight.

Different countries have different laws. Let's consider a case in which an individual located in the United States uses Twitter to attack someone in Germany. This individual may be entirely within their rights under the laws of the United States; however, they may be violating the rights of the individual in Germany, as seen by German law.

If Twitter codifies their terms and conditions in an algorithm, they are making a choice as to which individual's rights to uphold, and they can be sued by the party whose rights are ignored. Similarly, if Twitter were to code enforcement algorithms that ignored certain groups or certain hate terms, they would be codifying their biases and opening themselves up to an entirely different set of legal attacks.

This same argument can be made about privacy and data sovereignty. Google is currently in the process of losing lawsuits around the world over similar issues related to biases in its search algorithm.

Balancing Morality and Pragmatism
We're not collectively ready to create a world in which we're forced to judge ourselves by the same standards we wish to apply to others. Survival -- both personally and as an organization -- requires that we balance morality and pragmatism in our decision making. A balance we daren't commit to code.

There can be no solution to this, and it places an upper limit on the utility and applicability of BDCA tools. In turn, this limits the utility and applicability of public clouds; with the rise of private clouds, the unique selling feature of those public clouds is the BDCA tools they offer.

Human nature means that some things will always have to remain unsaid, uncodified, and held just outside the reach of the day's laws. Different groups will always want different outcomes, and politics will always be a problem.