Never deploy AI without doing these 3 things

AI is hard at work delivering huge ROI and efficiency gains for businesses in every sector, but it can also fail, sometimes quite spectacularly, causing major financial losses and harm to the brand your team has worked so hard to build. If not done carefully, AI deployments can quickly turn into disasters, as the news reminds us almost every day.

In this (first!) blog post from Arthur AI, we're going to share best practices and lessons we've learned from deploying real-world, business-critical AI systems into production. Along the way, we'll give you practical steps you can take to make sure your AI efforts are a huge success.

Here’s what you need to build AI you can trust:

A business that wants reliable AI, whose decisions can be trusted, is a business that puts appropriate guardrails around its models long after they've been trained, tested, and deployed. Yet businesses seldom take real steps to ensure that, once those models are in production, they stay relevant, operational, and healthy.

There are 3 core pillars to Trusted AI:

  1. Performance

  2. Explainability

  3. Bias

We'll be updating this blog with deep explainers for all 3 of the above in the weeks to come, but to be quick about it, here's why each one matters… right now!


Performance

As the first pillar of Arthur's "Trusted AI", you might think that Performance should be the most important one, because a thing can't do its job if it's broken. And we agree!

Many AI models are notoriously "brittle": small changes to their inputs or environment can turn out to have a big impact. What if your model's accuracy degrades over time and no one notices? Without someone watching the performance and accuracy of your models, you could be losing tons of money on inaccurate predictions, and you wouldn't find out until it's time to retrain.

Or, an even lesser-known problem: what if your model stays the same while the world changes? When the input distribution shifts, it's called data drift; when the relationship between inputs and outcomes shifts, it's called concept drift. Either way, it costs businesses enormous sums every year.
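To make drift concrete, here's a minimal sketch of one common way to catch it: compare a feature's training distribution against recent production data with a two-sample Kolmogorov-Smirnov test. The data, feature name, and the 0.05 threshold below are all illustrative assumptions, not a prescription.

```python
# A minimal drift check: compare a feature's training distribution to
# recent production data. Synthetic data stands in for real logs here.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
train_ages = rng.normal(loc=35, scale=8, size=5_000)  # what the model saw in training
live_ages = rng.normal(loc=42, scale=8, size=5_000)   # what it sees in production now

stat, p_value = ks_2samp(train_ages, live_ages)
if p_value < 0.05:  # illustrative significance threshold
    print(f"Drift detected (KS statistic={stat:.3f}) -- time to investigate")
else:
    print("No significant drift")
```

In practice you'd run a check like this on a schedule, per feature, and alert when the statistic crosses a threshold you've tuned for your use case.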

Better performance = more trust in the decisions your AI makes.


Explainability

A lot of businesses find themselves in a difficult spot when deciding which modeling strategies to use. Deep learning, for instance, can be highly accurate, but it's often nearly impossible to understand why the model makes the predictions it does. The math involves far too many parameters for a human to follow, and we call these "black box" models for a reason.

If you could use deep learning to solve more business challenges, you could unlock big efficiency gains. But you'd never be able to explain why a particular decision went the way it did.

That's fine for some use cases, but not all of them. If your model decides creditworthiness, the Equal Credit Opportunity Act requires you to be able to explain to consumers what factors went into that decision.

Enter the field of Explainability - techniques that probe the conditions under which a decision was made and reverse-engineer an explanation. Ideally, all your AI should be explainable, especially if it's a black box, so you can make sure certain attributes aren't influencing your decisions when they shouldn't be. If a sensitive or protected attribute is driving the decision-making, it could one day mean a fine.
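Here's a toy sketch of the perturbation idea behind many explainability methods: nudge one input at a time and watch how much the model's score moves. The "black box" scoring function, the feature names, and the 10% nudge are all made up for illustration; real tools (SHAP, LIME, and the like) are far more principled.

```python
# Toy perturbation-based explanation: wiggle each input and observe the
# change in the model's score. black_box_score is a stand-in for any
# opaque model we can call but not inspect.
import numpy as np

def black_box_score(features):
    income, debt_ratio, age = features
    # pretend these internals are invisible to us
    return 1 / (1 + np.exp(-(0.00005 * income - 0.8 * debt_ratio + 0.01 * age)))

applicant = np.array([55_000.0, 2.0, 30.0])  # hypothetical credit applicant
baseline = black_box_score(applicant)

for i, name in enumerate(["income", "debt_ratio", "age"]):
    perturbed = applicant.copy()
    perturbed[i] *= 1.10  # nudge this one feature by 10%
    delta = black_box_score(perturbed) - baseline
    print(f"{name}: score change {delta:+.4f}")
```

The features whose nudges move the score the most are, loosely, the ones driving the decision - which is exactly what you'd want to surface for a regulator or a declined applicant.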

That's why the GDPR gives consumers what's widely read as a "right to an explanation" - a sign of how important this is, and will become, as regulatory bodies begin to enforce the lesser-known provisions of the GDPR.

In fact, we’ll be talking about the Data Protection Authorities and other facets of the GDPR in a blog post to come very soon!


Bias

All predictive models are biased. If you're in data science, you'll know this is an uncontroversial claim, but "bias" means different things to different groups.

It could mean that your algorithm is discriminating against a protected class, but it could also just mean that you have too many photos of cats in your dataset, and the model won’t work well because of it.

Mitigating bias is a hot topic in the field of AI fairness, and one that many businesses are taking up. Most of the time, though, bias checks exist only in the pre-deployment phase and get deprioritized after the model's launch. But again, changes to your business and changes in your user base could mean your model becomes biased, suddenly or gradually, and if you're not actively watching for these shifts, you're exposed to a lot of risk. What's worse, with GDPR regulation here and US privacy laws looming, you could even be exposed to a violation of the law.
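A post-deployment bias check can be as simple as comparing outcome rates across groups in a recent window of predictions. The sketch below uses demographic parity and the classic "four-fifths" rule of thumb; the group labels, decisions, and 0.8 threshold are illustrative assumptions, and the right fairness metric depends on your use case.

```python
# Minimal ongoing bias check: compare approval rates across two groups
# in a recent batch of model decisions (demographic parity).
groups = ["A", "A", "B", "B", "A", "B", "A", "B", "A", "B"]  # hypothetical group labels
approved = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]                    # hypothetical model decisions

def approval_rate(group):
    decisions = [a for g, a in zip(groups, approved) if g == group]
    return sum(decisions) / len(decisions)

rate_a, rate_b = approval_rate("A"), approval_rate("B")
ratio = min(rate_a, rate_b) / max(rate_a, rate_b)
print(f"A: {rate_a:.0%}, B: {rate_b:.0%}, ratio: {ratio:.2f}")
if ratio < 0.8:  # the four-fifths rule of thumb from US hiring guidelines
    print("Disparity flagged -- investigate before a regulator does")
```

Run on a rolling window, a check like this catches bias that appears after launch, which is exactly the gap pre-deployment-only audits leave open.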

Enter “Trusted AI”, by Arthur

A lot of marketing buzzwords have been thrown around to describe the process of transforming your AI practice into one that's reliable, trustworthy, and safe. Some people call it "Responsible", others "Ethical", but most groups agree that a healthy AI practice involves monitoring for (at least) 3 key concepts. These pillars of "Trusted AI", taken together, can increase model performance and keep a model doing its job for as long as it needs to run.

And that's what our platform does! We help businesses monitor for anomalies, track accuracy and data drift over time, and make sure all their algorithms' decisions are explainable.

Tune in next week for our first deep dive… on Explainability!

Liz O'Sullivan