From Black Box to Glass Box: Transparency in XAI

With the rise of OSS security concerns, divergence of explainability goals, and custom, proprietary XAI algorithms, is transparency still possible?

Explainable AI (XAI) refers to tools and techniques that describe how a complex model behaves in a simple, intuitive way that humans can understand. It answers why an automated decision-making tool produced a specific output that affects customers.

Market Size

The explainable AI market is projected to reach $21.8 billion by 2030, up from $4.1 billion in 2021. Gartner predicts that “by 2025, 30% of government and large enterprise contracts for the purchase of AI products and services will require the use of explainable and ethical AI.”

Regulation’s Role

So, what’s fueling the predicted market growth? One accelerant is the EU’s GDPR, whose Articles 13-15 and 22 establish rights specific to algorithmic decision making, including rights of notification and of access to meaningful information about the logic involved, as well as the significance and envisaged effects of solely automated decision making. In this context, explainability is a legal obligation: enterprises must inform regulators as well as end customers about why models made the decisions they did. End customers should be able to comprehend the explanations, which should be written simply, in their native language, and free of technical jargon.

Additionally, Article 13 (1) of the EU’s future Artificial Intelligence Act (AIA) mandates that high-risk AI systems be “sufficiently transparent to enable users to interpret the system’s output and use it appropriately.”

There are over 100 different XAI methods available to data scientists today, who often select whichever takes the least effort or time, and upcoming regulation doesn’t prescribe which explainability method should be used. The enterprise can elect to use local, global, or counterfactual explanations, but these “must be faithful to the model in the sense that they need to be an, at least approximately, correct reconstruction of the internal decision making parameters: explanation and explanandum need to match.”¹
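To make the three explanation types concrete, here is a minimal sketch on a toy linear scoring model. Everything in it (the feature names, weights, threshold, and the applicant and baseline records) is a made-up assumption for illustration, not a method taken from any regulation or library. For a linear model the local contributions add up exactly to the score difference, which is one simple way to read the "explanation and explanandum need to match" requirement.

```python
import math

# Hypothetical linear credit model: weights, bias, and threshold are invented.
WEIGHTS = {"income": 0.5, "debt": -0.8, "years_employed": 0.3}
BIAS = -1.0
THRESHOLD = 0.0  # score >= THRESHOLD means "approve"

def score(x):
    return BIAS + sum(WEIGHTS[f] * x[f] for f in WEIGHTS)

def predict(x):
    return score(x) >= THRESHOLD

def local_explanation(x, baseline):
    """Local: each feature's contribution to this one score, relative to a baseline."""
    return {f: WEIGHTS[f] * (x[f] - baseline[f]) for f in WEIGHTS}

def global_explanation(dataset):
    """Global: mean absolute contribution of each feature across a dataset."""
    mean = {f: sum(row[f] for row in dataset) / len(dataset) for f in WEIGHTS}
    return {
        f: sum(abs(WEIGHTS[f] * (row[f] - mean[f])) for row in dataset) / len(dataset)
        for f in WEIGHTS
    }

def counterfactual(x, feature, margin=1e-6):
    """Counterfactual: change one feature just enough to flip the decision."""
    w = WEIGHTS[feature]
    if w == 0:
        return None  # this feature cannot move the score
    gap = THRESHOLD - score(x)  # signed distance to the decision boundary
    target = gap + margin if gap > 0 else gap - margin  # step just past it
    changed = dict(x)
    changed[feature] = x[feature] + target / w
    return changed

def is_faithful(x, baseline):
    """Check that the local explanation reconstructs the model's score difference."""
    total = sum(local_explanation(x, baseline).values())
    return math.isclose(total, score(x) - score(baseline), abs_tol=1e-9)

applicant = {"income": 2.0, "debt": 1.5, "years_employed": 1.0}
baseline = {"income": 3.0, "debt": 1.0, "years_employed": 2.0}
```

Calling `local_explanation(applicant, baseline)` explains one decision, `global_explanation([...])` summarizes behavior over many records, and `counterfactual(applicant, "income")` returns the altered record that would have been approved. For non-linear models the faithfulness check would generally hold only approximately, if at all.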

While the U.S. has no federal regulation on explainability, the National Institute of Standards and Technology (NIST) proposed four principles in 2020 for judging how explainable an artificial intelligence’s decisions are.

More recently, the White House Blueprint for an AI Bill of Rights included a Notice & Explanation principle, stating: “Automated systems should provide explanations that are technically valid, meaningful, and useful to you and to any operators or others who need to understand the system, and calibrated to the level of risk based on the context.”

Explosion of XAI Solutions

Academic R&D labs, open-source communities, and private software enterprises alike have treated legal compliance signals as a trigger to develop XAI algorithms. The Partnership on AI (PAI) reports that “each year the number of available XAI tools (developed by both academics and industry practitioners) grows, resulting in more options than ever for those interested in using them. Here, we define ‘XAI tool’ broadly to mean any means for directly implementing an explainability algorithm. In our initial research, PAI identified more than 150 XAI-related tools published between 2015 and 2021.”

The goal of the PAI project is to give enterprises tools to make more informed decisions about which XAI tool is best to deliver value to a business and help scale explanations.

OSS Concerns

The vast majority of XAI tools are free-to-use open-source software (OSS). Public by nature, OSS offers many benefits, including crowdsourced examination for bugs, community-driven code evolution, and open ethical conversations around ML applications. While OSS XAI libraries such as LIME or SHAP have done a lot to advance a broader understanding in the industry, they also raise performance doubts and security concerns.

Some ML engineers are hesitant to integrate OSS explainability methods into an application because they can slow down MLOps workflows and AI pipeline momentum. Additionally, cybersecurity experts are voicing concern that OSS explainable models are less secure: when the internal workings of model algorithms are publicized, bad actors can potentially exploit that information via evasion, oracle, or poisoning attacks. Enterprises in competitive industries (where algorithms are treated as trade secrets or confidential IP) are worried that explainability may embolden competitors to reverse engineer ML models.

This caution about OSS is echoed by the data science community. A recent Stack Exchange post acknowledged that commonly used and widely adopted open-source ML packages are not regularly tested for reliability or debugged.

One user posted, “Quite often those packages GitHub repos have existing unresolved issues and we may not go through them to identify any pitfalls. Business will be making critical actions based on the predictions/insights we, as a data scientist provide, which in turn could be based on those packages. How can we minimize the risk in such scenarios?”

Custom Explainability Trends

Given the combination of OSS security concerns and stakeholder resistance, companies are creating their own custom or proprietary explainability methods in-house or outsourcing the task to boutique consultants.

As enterprises shift from OSS to proprietary XAI algorithms, one would assume transparency would suffer, but that’s not necessarily true. Proprietary algorithms may enable enterprises to fine-tune XAI methods to explain outcomes at the level of an individual audience and to provide more context around decision-making rationale. Additionally, they give the enterprise greater control over explanation content and verification. Ultimately, they may ensure that sufficient domain expertise is assigned to investigate models and dynamic datasets in order to fully comprehend the explanation.

Custom explainability is expected to span the model lifecycle, integrating into upstream and downstream ML team tasks. Because regulators (across GDPR, the EU AI Act, and the AI Bill of Rights) require easy-to-understand explanations for end customers and business stakeholders alike, recent work has advanced natural-language explanations over technically dense feature importance scores based on LIME or SHAP.
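As a rough illustration of the difference, the sketch below turns a set of feature attribution scores into a plain-language sentence. The attribution values, feature labels, and the `describe` helper are hypothetical; in practice the scores might come from a tool such as LIME or SHAP, and real wording would need legal and domain review.

```python
def describe(attributions, outcome, top_k=2):
    """Render the top_k strongest attributions as one plain-language sentence.

    attributions: dict mapping a human-readable feature label to a signed score
                  (positive raised the model's score, negative lowered it).
    outcome:      plain word for the decision, e.g. "declined".
    """
    # Rank features by the size of their effect, ignoring zero-effect ones.
    ranked = sorted(
        ((name, value) for name, value in attributions.items() if value != 0),
        key=lambda item: abs(item[1]),
        reverse=True,
    )[:top_k]
    if not ranked:
        return f"Your application was {outcome}."
    reasons = [
        f"your {name} {'raised' if value > 0 else 'lowered'} your score"
        for name, value in ranked
    ]
    return f"Your application was {outcome} mainly because " + " and ".join(reasons) + "."

# Hypothetical attribution scores for one applicant.
example_scores = {
    "annual income": -0.5,
    "outstanding debt": -0.4,
    "length of employment": 0.1,
}
message = describe(example_scores, "declined")
```

The design choice here is to surface only the few strongest factors and their direction, which trades completeness for readability; a fuller record of every score can still be kept for auditors and regulators.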

As much as regulators are proponents of OSS, they also accept proprietary algorithms as long as there is sufficient internal documentation and public disclosure to satisfy explainability laws.

Divergence of Stakeholder Explainability Goals

Regardless of whether XAI methods are built on OSS, proprietary code, or a combination of both, the biggest challenge facing enterprises is that internal stakeholders don’t share the same explainability objectives. Each department has its own distinct goal for what it hopes explainability will achieve.

The Brookings Institution’s article “Explainability won’t save AI” breaks down these fundamental differences.

Typically, an explainability method answers one perspective but fails to capture the broader context that spans diverse stakeholders. That is why the pursuit of explainability, whether instigated by an internal audit or an external regulatory request, is not by itself a panacea for risk management.

However, it is a starting point to shed light on the complex “black box” decision making that occurs between a machine learning system’s inputs and outputs.

Discover Arthur’s explainability features across the pre-production and post-production MLOps lifecycle, including regional importance, global importance, and feature importance.