How-To
Reduce the Risk Associated with AI-Based Data Analytics
It's well established that large language models can hallucinate, producing answers that sound confident but are completely wrong. What is discussed far less often, however, is a similar phenomenon that can occur when AI is used for data analytics.
AI-based data analytics is all about finding patterns in business data and then trying to derive business insight from those patterns. The risk here isn't that AI finds patterns in the data, but rather that AI feels compelled to explain those patterns, even when the patterns are nothing more than a coincidence. This can cause the AI to overreach, rationalize, and ultimately give bad advice. So what can you do about it?
Before I get into a discussion of mitigation techniques, I want to give you an example of a potentially problematic dataset. For the sake of illustration, let's pretend that a company sells widgets and that the company's sales data points to two trends.
The first of these trends is that every year, around the holidays, sales increase by 20%. For the sake of this discussion, we will assume that this is a "real" trend.
The other trend points to the idea that on the second Tuesday of the month, the company sells more blue widgets than widgets of any other color. This is what I like to think of as a junk statistic. It's real from the standpoint that the data supports it, but the increased sales of blue widgets (at least for the purposes of this example) are nothing more than a coincidence. There is no real-world factor driving the sale of blue widgets.
If you gave this data to AI for analysis, it would almost certainly recommend taking steps to aggressively promote blue widgets on Tuesdays in hopes of further increasing sales. The AI might also advise the company to order an entire truckload of blue widgets so that the company can keep pace with the demand.
In reality, this strategy might result in a few extra blue widgets being sold, but the same could be said for any product that is being aggressively promoted. The important takeaway is that in this example, AI is building an entire business strategy around a perceived trend that is really nothing more than a coincidence.
So in this type of situation, how can the business steer the AI so that it can better differentiate between actual trends (such as increased sales around the holidays) and trends that are supported by the data but are completely coincidental?
Although no single method can reliably separate real trends from spurious ones in every situation, there are techniques that can significantly improve the odds that AI will get it right.
The first technique is to perform what is known as a structured adversarial analysis. This is a fancy way of saying that you can feed the AI additional prompts that cause it to critically evaluate its own advice. This can happen visibly, or it can all take place behind the scenes.
The most appropriate follow-up questions will vary depending on what the AI is designed to do and what type of data you are working with, but here are a few sample follow-up questions (a minimal automation sketch follows the list):
- What data supports your recommendation?
- Does any of the data contradict your recommendation?
- Are there any other plausible explanations for the data? If so, please rank those explanations by plausibility, based on evidence.
- How confident are you in your recommendation?
- Have you made any assumptions that, if untrue, would lead you to a different conclusion?
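To make this concrete, here is a minimal sketch of what an automated version of this loop might look like. It assumes a hypothetical `call_model()` wrapper around whatever chat-style API you use; the wrapper takes a running message history and returns the model's reply text.

```python
# call_model() is a hypothetical wrapper: it takes a message history (a list
# of {"role": ..., "content": ...} dicts) and returns the model's reply text.

FOLLOW_UPS = [
    "What data supports your recommendation?",
    "Does any of the data contradict your recommendation?",
    "Are there any other plausible explanations for the data? "
    "If so, rank them by plausibility, based on evidence.",
    "How confident are you in your recommendation?",
    "Have you made any assumptions that, if untrue, "
    "would lead you to a different conclusion?",
]

def adversarial_review(call_model, dataset_summary: str) -> list[str]:
    """Ask for a recommendation, then challenge it with each follow-up."""
    history = [{
        "role": "user",
        "content": "Analyze this sales data and recommend actions:\n"
                   + dataset_summary,
    }]
    transcript = []

    # Get the initial recommendation.
    reply = call_model(history)
    history.append({"role": "assistant", "content": reply})
    transcript.append(reply)

    # Challenge it with each adversarial follow-up, in the same conversation.
    for question in FOLLOW_UPS:
        history.append({"role": "user", "content": question})
        reply = call_model(history)
        history.append({"role": "assistant", "content": reply})
        transcript.append(reply)
    return transcript
```

Keeping the follow-ups inside the same conversation matters: the model has to defend its original answer rather than generate a fresh one each time.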
Based on my own experience, I have found that asking these types of questions goes a long way toward forcing the AI to produce grounded responses. However, this method is not perfect. After all, both the initial recommendation and the answers to the follow-up questions are based on the same data. If that dataset is incomplete, noisy, or contradictory, the follow-up questions can end up reinforcing a mistake made in the initial recommendation.
I have seen organizations try to get around this limitation by using one AI engine to perform the initial analysis and a different AI engine to play devil's advocate. This involves telling the secondary engine something like, "Here is my data, here is the advice I was given about it; tell me why that advice is wrong."
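A sketch of that pattern might look something like this. Here, `call_primary()` and `call_critic()` are hypothetical wrappers around two different AI engines; each takes a prompt string and returns reply text.

```python
# Two hypothetical wrappers around two different AI engines.

def devils_advocate(call_primary, call_critic, dataset_summary: str):
    """Get advice from one engine, then ask a second engine to attack it."""
    advice = call_primary(
        "Analyze this sales data and recommend actions:\n" + dataset_summary
    )
    critique = call_critic(
        "Here is my data, and here is the advice I was given about it. "
        "Tell me why that advice might be wrong.\n\n"
        f"DATA:\n{dataset_summary}\n\nADVICE:\n{advice}"
    )
    return advice, critique
```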
While this approach does sometimes work, you can end up with two AI models disagreeing for the wrong reasons. The real problem is that neither AI actually knows the truth; both are simply reacting to patterns within the data. Hence, the devil's advocate approach is less truth versus critic and more one guess versus another.
A better approach to deriving meaningful insight from business data is to teach the AI to focus on strong signals and to ignore weaker signals. Going back to my previous example, sales increasing by 20% around the holidays is a strong signal. It happens every year in a very predictable manner. Conversely, selling more blue widgets on the second Tuesday of the month, because it is coincidental, would represent a weaker signal.
Weak signals depend heavily on noise and lack any real-world drivers. One way to expose them is to perturb the data: make a copy of the dataset, remove random sections, add a bit of noise, or shuffle the timestamps, and then check whether the pattern collapses. You can also break the dataset into narrow windows of time and look for the pattern in each window. If the observed pattern survives these perturbations, then the pattern may be real.
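As a sketch of the thinning variant, the snippet below re-tests the "Blue Tuesday" pattern against randomly thinned copies of the data. The `date`, `color`, and `units` column names, the 10% lift threshold, and the survival cutoff are all illustrative assumptions; noise injection and timestamp shuffling would follow the same shape.

```python
import numpy as np
import pandas as pd

# Illustrative schema: one row per sale, with 'date', 'color', and 'units'.

def second_tuesday_lift(df: pd.DataFrame) -> float:
    """Average blue-widget sales on second Tuesdays vs. all other days."""
    dates = pd.to_datetime(df["date"])
    # Tuesday is weekday 1; the second Tuesday falls on day 8 through 14.
    is_second_tuesday = (dates.dt.weekday == 1) & (dates.dt.day.between(8, 14))
    blue = df[df["color"] == "blue"]
    on_day = blue.loc[is_second_tuesday[blue.index], "units"].mean()
    off_day = blue.loc[~is_second_tuesday[blue.index], "units"].mean()
    return on_day / off_day

def pattern_survives(df: pd.DataFrame, trials: int = 100,
                     threshold: float = 1.1) -> bool:
    """Re-test the pattern on randomly thinned copies of the dataset."""
    rng = np.random.default_rng(0)
    survivals = 0
    for _ in range(trials):
        thinned = df.sample(frac=0.7, random_state=int(rng.integers(1 << 31)))
        if second_tuesday_lift(thinned) >= threshold:
            survivals += 1
    # A real signal should survive most perturbations; a coincidence won't.
    return survivals / trials > 0.9
```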
You can also gauge whether a pattern is statistically meaningful by checking whether it holds across different slices of the business. For example, are you selling more blue widgets on second Tuesdays across all of your regions and all of your customer segments? Real signals tend to generalize, while coincidences often do not.
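Building on the previous sketch, a generalization check might group the data by region and require the lift to appear in most of them. The `region` column and the 80% cutoff are, again, illustrative assumptions.

```python
def generalizes(df: pd.DataFrame, threshold: float = 1.1) -> bool:
    """Require the 'Blue Tuesday' lift to show up region by region."""
    lifts = df.groupby("region").apply(second_tuesday_lift)
    # A real signal should appear in most regions, not just in aggregate.
    return (lifts >= threshold).mean() > 0.8
```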
Another possibility is to break the sales data into multiple chunks, analyzing one chunk and then testing the resulting recommendations against the others. This is one of the best ways of validating AI-generated advice, since you are testing the advice against real data it has not yet seen.
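A sketch of that chunked validation, reusing `second_tuesday_lift()` from earlier, might look like this; the chunk count and threshold are arbitrary illustrative choices.

```python
def holdout_check(df: pd.DataFrame, n_chunks: int = 4,
                  threshold: float = 1.1) -> bool:
    """Derive the pattern from the first chunk, then test it on the rest."""
    ordered = df.sort_values("date").reset_index(drop=True)
    size = len(ordered) // n_chunks
    chunks = [ordered.iloc[i * size:(i + 1) * size] for i in range(n_chunks)]
    # "Discover" the pattern on the first chunk of history...
    if second_tuesday_lift(chunks[0]) < threshold:
        return False
    # ...then require it to reappear in every later chunk.
    return all(second_tuesday_lift(c) >= threshold for c in chunks[1:])
```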
If all else fails, there are two more ways that you may be able to differentiate between real trends and coincidences. One is to simply ask the AI to explain why the trend is happening. In the case of the 20% increase in sales around the holidays, AI should have no trouble pointing to things like seasonality or consumer behavior. Conversely, AI would be hard-pressed to come up with a good explanation for "Blue Tuesdays."
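In code, that plausibility probe could be as simple as the sketch below, reusing the hypothetical `call_model()` wrapper from the earlier example.

```python
def plausibility_probe(call_model, pattern: str) -> str:
    """Ask for a real-world driver; 'no plausible driver' is itself a signal."""
    return call_model(
        f"We observed the following pattern in our sales data: {pattern}. "
        "What real-world factors, if any, could plausibly drive it? "
        "If you cannot identify a plausible driver, say so explicitly."
    )
```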
The other thing that you can do is to design the AI to take Occam's Razor into account. To paraphrase, Occam's Razor is based on the idea that the simplest explanation is usually the correct one. When it comes to AI-based data analytics, this might mean giving preferential treatment to insights that are both simple and broad. As an example, the statement "sales increase during the holidays" is both simple and broad.
Conversely, the AI should be immediately suspicious of any rule that is highly specific, such as "we sell more blue widgets than green ones, but only on the second Tuesday of the month."
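One simple way to encode that suspicion is to score each candidate insight by how many conditions it stacks up and flag the narrow ones for extra scrutiny. The rule structure below is purely illustrative, not a real analytics API.

```python
from dataclasses import dataclass, field

@dataclass
class CandidateRule:
    statement: str
    conditions: list[str] = field(default_factory=list)

def needs_scrutiny(rule: CandidateRule, max_conditions: int = 2) -> bool:
    """Broad rules pass; narrow, multi-condition rules get flagged."""
    return len(rule.conditions) > max_conditions

seasonal = CandidateRule(
    "Sales increase during the holidays",
    ["month in November-December"],
)
blue_tuesday = CandidateRule(
    "We sell more blue widgets than green ones on the second Tuesday",
    ["color == blue", "weekday == Tuesday", "8 <= day_of_month <= 14"],
)

assert not needs_scrutiny(seasonal)
assert needs_scrutiny(blue_tuesday)
```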
Ultimately, there is no way to teach AI the difference between coincidence and truth. What you can do, however, is teach the AI to apply layers of skepticism and to discard a pattern when it does not hold up to scrutiny.
About the Author
Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.