In-Depth
AI Guardrails for Custom Business Applications
It's quickly becoming the norm for businesses to find ways of integrating AI capabilities into their custom applications. After all, when properly implemented, AI can make custom applications vastly more capable. While it is relatively easy to tie an application to a back-end AI engine, the real trick is to build guardrails into the application as a way of making sure that AI doesn't do anything that it should not.
We all know that AI is not perfect and that AI chatbots are prone to occasional hallucinations. When it comes to weaving AI into a business application, however, hallucinations are only one of several problems. Some of the problems that you will have to watch out for include inconsistent output, drifting responses that gradually violate the rules the AI was initially given, and responses that conflict with business logic or regulatory constraints.
This is where AI guardrails come into play. Contrary to what some have said, guardrails are not about limiting intelligence, but rather about avoiding problems that will inevitably occur unless you properly constrain the AI.
Another way of thinking about this is that guardrails shift the power balance. By using guardrails, you are essentially saying that AI is not in charge, your application is. Of course, the real question is, how do you build guardrails into your application?
Building Guardrails into an Application
Building guardrails into an application is, at a high level, a two-step process. The first of these two steps is prompt engineering. The need for prompt engineering stems from the fact that you cannot predict how AI will respond to a prompt. If left unchecked, the AI is likely to come up with a response that wrecks your application. It could be that the response contains a value that is outside of the expected range, but it is also possible that the response includes an invalid character or output that the application just can't parse. The trick, therefore, is to force AI to use an output format that aligns with the application's requirements. Let me give you an example.
Imagine for a moment that you work for a bank and you want to build an application that reviews loan applications and assesses the level of risk. The simple approach to building such an application would be to create an AI prompt that says, "summarize this loan application and assess the level of risk." That prompt would likely give you human-readable output that could be used when determining whether or not to approve the loan.
Although this process is likely to work, it is ultimately problematic because of its inconsistency. The application might not give you the same level of detail from one loan application to the next, thereby making it tough to justify approving or rejecting certain loans.
If the goal is to ensure consistency, then a better approach would be to create a prompt that looks more like this one:
Please review the loan application and assess the risk. Please format your response as a JSON object, adhering to this schema:
{
  "risk_score": "number (0-100)",
  "risk_level": "LOW | MEDIUM | HIGH",
  "decision": "APPROVE | REVIEW | REJECT",
  "confidence": "number (0-1)"
}
This approach shifts the AI away from creatively generating free-form text and toward producing something far more predictable: objective data in a machine-readable format.
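To make this concrete, here is a minimal Python sketch of how an application might assemble such a prompt and parse the model's reply. This is illustrative rather than tied to any particular AI service; the schema matches the one shown above, and the loan-application text is a placeholder you would supply from your own data.

```python
import json

# The same schema shown earlier, embedded so the AI knows the exact shape expected.
RISK_SCHEMA = """{
  "risk_score": "number (0-100)",
  "risk_level": "LOW | MEDIUM | HIGH",
  "decision": "APPROVE | REVIEW | REJECT",
  "confidence": "number (0-1)"
}"""

def build_prompt(application_text: str) -> str:
    # Combine the instruction, the schema, and the loan application into one prompt.
    return (
        "Please review the loan application and assess the risk. "
        "Please format your response as a JSON object, adhering to this schema:\n"
        f"{RISK_SCHEMA}\n\n"
        f"Loan application:\n{application_text}"
    )

def parse_response(raw: str) -> dict:
    # json.loads raises an error if the AI returned anything other than valid JSON,
    # which acts as the first line of defense against unparseable output.
    return json.loads(raw)
```

The raw response string would come from whatever AI back end your application talks to; the point is that once the prompt pins down the format, the reply can be parsed like any other structured data.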
It's worth noting that you may have to experiment with the prompt a bit in order to consistently get the results that your application expects. I recently wrote an application that used the same types of AI guardrails I am discussing here. In doing so, I found that I needed to include a few more rules in the prompt as a way of preventing empty fields, errant syntax, or invalid formatting.
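The sort of extra rules I am describing might look something like the following when appended to the prompt. The exact wording here is only an illustration; in practice you would tune these rules against the model you are actually using.

```python
# Additional prompt rules aimed at preventing empty fields, errant syntax,
# and invalid formatting in the AI's JSON response.
EXTRA_RULES = (
    "Rules:\n"
    "- Return ONLY the JSON object, with no surrounding text or markdown fences.\n"
    "- Every field is required; never leave a field empty or null.\n"
    "- risk_score must be an integer between 0 and 100.\n"
    "- Use double quotes for all keys and string values.\n"
)
```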
Validating AI Output Before Use
The second step in the process of building AI guardrails is to validate the JSON before using it. This process works similarly to how you might validate human input. As an example, you know that the risk score should be a number from 0 to 100, so you can write code that checks to make sure that the risk score is not blank, does not contain anything other than integers, and does not contain negative values or values greater than 100.
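As a sketch of what that validation might look like in Python, the function below checks every field of the parsed JSON against the schema from earlier, raising an error on anything out of range. The field names and limits come from the example schema; everything else is one plausible way to write the checks.

```python
VALID_LEVELS = {"LOW", "MEDIUM", "HIGH"}
VALID_DECISIONS = {"APPROVE", "REVIEW", "REJECT"}

def validate_assessment(data: dict) -> dict:
    # risk_score: must be present, an integer, and between 0 and 100.
    # (bool is excluded because bool is a subclass of int in Python.)
    score = data.get("risk_score")
    if not isinstance(score, int) or isinstance(score, bool) or not 0 <= score <= 100:
        raise ValueError("risk_score must be an integer from 0 to 100")
    # risk_level and decision: must be one of the allowed enumerated values.
    if data.get("risk_level") not in VALID_LEVELS:
        raise ValueError("risk_level must be LOW, MEDIUM, or HIGH")
    if data.get("decision") not in VALID_DECISIONS:
        raise ValueError("decision must be APPROVE, REVIEW, or REJECT")
    # confidence: must be a number between 0 and 1.
    conf = data.get("confidence")
    if not isinstance(conf, (int, float)) or isinstance(conf, bool) or not 0 <= conf <= 1:
        raise ValueError("confidence must be a number from 0 to 1")
    return data
```

Treating the AI's output exactly like untrusted user input is the key design choice here: nothing reaches your business logic until it has passed these checks.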
If invalid input is detected, then you can handle it in one of two ways. The first option is to correct the input in real-time. If your code is expecting an integer, but the AI returns a floating point number, for example, then you can round the number and convert it to an integer. You might also use regular expressions to strip away any problematic characters.
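Both of those real-time corrections can be sketched in a few lines of Python. This is just one way to write them; the "85%" string is a hypothetical example of the kind of malformed value an AI might return.

```python
import re

def coerce_risk_score(value):
    # If the AI returns a float where an integer is expected, round rather than reject.
    if isinstance(value, float):
        return round(value)
    # If it returns a string such as "85%", use a regular expression to strip
    # away the problematic characters before converting to an integer.
    if isinstance(value, str):
        digits = re.sub(r"[^0-9]", "", value)
        if digits:
            return int(digits)
    # Otherwise, hand the value back unchanged and let validation decide its fate.
    return value
```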
The second approach is to resubmit the prompt and require the AI to regenerate its response until it passes validation. If you go this route, it is a good idea to cap the number of retries so that the application cannot loop indefinitely.
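The regenerate-until-valid approach can be sketched as a simple retry loop. In this sketch, `get_ai_response` and `validate` are placeholders supplied by the caller: the first calls whatever AI back end you use, and the second raises a `ValueError` on bad output (like the validation function described above).

```python
import json

MAX_RETRIES = 3  # Cap the retries so a stubborn model can't loop forever.

def assess_with_retries(prompt: str, get_ai_response, validate) -> dict:
    last_error = None
    for attempt in range(MAX_RETRIES):
        raw = get_ai_response(prompt)
        try:
            # Parse, then validate; either step failing triggers a retry.
            return validate(json.loads(raw))
        except (ValueError, TypeError) as err:
            last_error = err  # The model is nondeterministic, so simply ask again.
    raise RuntimeError(f"AI never produced valid output: {last_error}")
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a single `except` clause catches both unparseable JSON and validation failures.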
All of this is to say that when it comes to AI integration in applications, getting the results that you want is not just about writing better prompts. It's about forcing AI to write its response in a machine-readable format and then having application-level rules in place that will validate the AI output rather than allowing the application to blindly act on it.
About the Author
Brien Posey is a 22-time Microsoft MVP with decades of IT experience. As a freelance writer, Posey has written thousands of articles and contributed to several dozen books on a wide variety of IT topics. Prior to going freelance, Posey was a CIO for a national chain of hospitals and health care facilities. He has also served as a network administrator for some of the country's largest insurance companies and for the Department of Defense at Fort Knox. In addition to his continued work in IT, Posey has spent the last several years actively training as a commercial scientist-astronaut candidate in preparation to fly on a mission to study polar mesospheric clouds from space. You can follow his spaceflight training on his Web site.