III.

How reliable are XAI explanations?

As we learned before, the whole point of employing XAI methods is to make the workings of machine learning algorithms understandable to a human. What use would it be for Jane Hill to know that Apple’s credit card algorithm has a Shapley value of 0.03 for the input feature "gender" if she has no clue what Shapley values mean? In that case, can you really say that your XAI method was successful? In computer science, we call the ability of XAI to serve its purpose reliability. If you aim to apply XAI outside of fictional settings, it’s important to test it so you know the explanation is serving its purpose.

The challenge of providing reliable XAI explanations is made more complex by the fact that every model is different, and every person is different. Try to explain to an adult and a toddler why they should eat their vegetables, and consider the different responses you get. Indeed, it’s much easier to reason with the toddler than the adult. If you replace vegetables with algorithms, you also need to consider the different audiences your explanation might reach, from users to auditors and policymakers. This section will give you some key aspects to consider about the reliability of your explanations.

Finally, we conclude this chapter by looking at XAI's bright and not-so-bright sides. We examine XAI in the scope of trustworthiness and how the two concepts are related, looking at how explainability is only one of many dimensions of trust. After all, a contractor may be able to explain all their decisions when building your house, but would you still hire them if their work resulted in a leaky roof? Also, understanding how machine learning models work isn’t necessarily used only for good. If you know how Apple decides on credit cards, you may also use that information to game the algorithm in your favor or feed data into it that makes it perform worse.

Metrics for evaluating XAI

Here, we introduce a set of properties that can help analyze the differences between XAI methods and their explanations. To illustrate them, let’s use the example of applying LIME to a machine learning model that classifies dog and cat images. (Surprising, huh?)
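To make the running example concrete, here is a minimal sketch of what such a setup could look like in Python, assuming the lime and scikit-image packages are installed. The predict_fn and image variables below are hypothetical stand-ins for a real cat-vs-dog classifier and a real photo; they are only there so the snippet runs end to end.

```python
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_fn(images):
    # Stand-in for a real classifier: returns [p_cat, p_dog] for each image.
    # In practice this would call something like pet_model.predict(images).
    return np.tile([0.3, 0.7], (len(images), 1))

image = np.random.rand(64, 64, 3)   # placeholder for a real pet photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,
    predict_fn,
    top_labels=1,      # explain only the top predicted class
    num_samples=200,   # number of perturbed images LIME evaluates
)

# Superpixels that contributed most to the top label, overlaid on the image.
label = explanation.top_labels[0]
img, mask = explanation.get_image_and_mask(
    label, positive_only=True, num_features=5, hide_rest=False
)
overlay = mark_boundaries(img, mask)
```

The resulting overlay marks the image segments that pushed the prediction towards the top class – exactly the kind of segmented visual explanation discussed below.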

Evaluating XAI methods

These properties describe how well an XAI method performs or fits a chosen application type:

  • Expressive power: The way or structure in which the XAI method presents an explanation. In the case of LIME for our pet classification model, the visual explanation is shown as segments overlaid on the input image. These segments detail the features’ contributions to the model’s decision.

  • Translucency: How much the XAI method focuses on "looking into" the machine learning model, i.e., the training data or model structure, to produce explanations. Model-specific methods and inherently explainable ("glass-box") models have high translucency. XAI methods that rely on manipulating inputs and observing outputs, such as LIME, have low translucency. Highly translucent methods can use more information to generate an explanation but are less portable.

  • Portability: The range of different machine learning models the XAI method can be applied to. Generally, model-agnostic methods are more portable than model-specific methods. LIME is very portable: it can explain many kinds of machine learning models, although it may struggle with complex neural network representations.
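To make portability (and low translucency) concrete, here is a small hypothetical sketch on synthetic tabular data rather than the pet classifier: the very same LIME explainer is reused for two different model types, because all it ever asks of a model is a function from inputs to class probabilities.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in data: 6 numeric features, binary "cat"/"dog" label.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]

models = {
    "random_forest": RandomForestClassifier(random_state=0).fit(X, y),
    "logistic_regression": LogisticRegression(max_iter=1000).fit(X, y),
}

explainer = LimeTabularExplainer(
    X,
    feature_names=feature_names,
    class_names=["cat", "dog"],
    mode="classification",
)

# The explainer never looks inside a model; it only needs a function that maps
# inputs to class probabilities, so the same explainer serves both models.
for name, model in models.items():
    exp = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
    print(name, exp.as_list())
```

Swapping in a third model would require no change to the explainer – that is what high portability means in practice.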

Evaluating individual explanations

These are some of the most commonly used metrics to evaluate the quality of an individual explanation generated by an XAI method:

  • Accuracy: How well the explanation’s predictions on unseen data match the actual, human-expected outcomes. An explanation’s low accuracy might be acceptable if the underlying machine learning model’s accuracy is also low and the explanation’s fidelity is high. For example, if our machine learning model misclassifies a cat image as a dog and our LIME explanation does the same (low accuracy but high fidelity), the explanation can still help us explore why the model made the mistake.

  • Fidelity: How well a surrogate model's explanation approximates the original model's predictions. If a machine learning model has high accuracy and its explanation has high fidelity, the explanation also has high accuracy. Explanations with low fidelity are useless for explaining a model because they don't approximate it well enough. For example, if our main model classifies an image as a dog while the LIME explainer classifies it as a cat, the explanation doesn't help us understand why the image was originally classified as a dog. Local fidelity refers to an explanation that approximates the model's predictions well on a subset of the data, such as the neighborhood around a single prediction. (A small sketch of estimating local fidelity follows this list.)

  • Consistency: How similar the explanations are when different machine learning models produce the same output for the same input. For example, if models A and B both classify a certain image as a dog, the explanations are consistent if they are alike for A and B.

  • Stability: How similar the explanations are for similar inputs and predictions from the same machine learning model. For instance, if LIME highlights the presence of a certain pattern in a dog's form as contributing to the model's prediction, it should consistently highlight that same pattern across multiple images of dogs.

  • Comprehensibility: How well humans understand the explanations. This might be difficult to measure, but it is perhaps the most important metric. Understandably, it also depends on the audience. Simple ways to estimate our explanations’ comprehensibility could be to ask human evaluators to rate how easy they are to understand, or to see how well people learn to predict the model's behavior.
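As one possible way to put a number on fidelity, the rough sketch below fits a shallow decision tree as a local surrogate around a single input and measures how often it agrees with the original model on held-out perturbations. The data, the "black box" model, and the local_fidelity helper are invented for illustration – this is not a standard library API.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data and a "black box" model standing in for the real thing.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)

def local_fidelity(model, x, n_samples=1000, scale=0.3, seed=0):
    """Fit a shallow tree on perturbations around x, labelled by the model,
    and report how often the tree reproduces the model's predictions on
    held-out perturbations (higher = more faithful local surrogate)."""
    rng = np.random.default_rng(seed)
    neighborhood = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    labels = model.predict(neighborhood)
    half = n_samples // 2
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=seed)
    surrogate.fit(neighborhood[:half], labels[:half])
    return (surrogate.predict(neighborhood[half:]) == labels[half:]).mean()

print("local fidelity around X[0]:", local_fidelity(black_box, X[0]))
```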

What XAI provides versus what the end user would like to have

Matching XAI output with expectations can be a difficult task. As mentioned above, interpretability and understandability are important requirements for a successful explanation – and finding the right way to fulfill these requirements is further complicated by the fact that what’s expected of an explanation can differ greatly between contexts, such as the domain in which the XAI is applied and who the user is. For example, an urgent decision where time is a factor requires a very different kind of explanation than a context where automated decisions can be carefully diagnosed and analyzed. Equally, the situation and identity of the explainee can change from one context to the next, and expectations can vary greatly between developers, professionals, and laypeople.

An increasingly popular trend in XAI is to look at lessons that can be learned from the social sciences. After all, these fields already have a long history of analyzing what makes a good explanation in social contexts between humans. Another related approach is research into interactive and personalizable explanations. These would take input from the user and change what’s being displayed depending on the user’s needs in their given context, such as their prior knowledge or the aspects they’re particularly interested in. There is also the idea of an explanation as a conversation, where users can enter queries and receive more specific feedback. This type of conversation can take the form of personalized counterfactual explanations, where the user can adjust the conditions to generate their own counterfactuals (for example, an explanation of why a woman didn’t get a job could be counterfactually adjusted by asking “what if they were a man?” to see the new outcome; a bare-bones sketch of such a query is shown below). Finally, there is the literal method of employing conversations in explanations, such as utilizing text-based conversational agents as a medium between the explanation and the human.
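As a bare-bones illustration of such a “what if?” query, the sketch below flips a single feature of an input to a hypothetical hiring model and compares the two predictions. The features, data, and model are all invented for illustration; real counterfactual methods search for minimal, plausible changes rather than flipping one value by hand.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical features: [years_experience, education_level, gender (0 or 1)].
X = rng.normal(size=(200, 3))
X[:, 2] = rng.integers(0, 2, size=200)          # binary "gender" feature
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)
model = LogisticRegression().fit(X, y)

applicant = np.array([[0.2, 1.0, 0.0]])         # the original application
counterfactual = applicant.copy()
counterfactual[0, 2] = 1.0                      # "what if gender were different?"

print("original  P(hired):", model.predict_proba(applicant)[0, 1])
print("what-if   P(hired):", model.predict_proba(counterfactual)[0, 1])
```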

Finding the right way to design interface elements for XAI is an important task. Ben Shneiderman, professor of Computer Science at the University of Maryland and author of the much-acclaimed book ‘Designing the User Interface: Strategies for Effective Human-Computer Interaction’, has identified the design of algorithmic accountability interfaces that allow users to “better understand underlying computational processes in search, recommender, and other algorithms” as one of the grand challenges of human-computer interaction (HCI) research.

Mental models are also important in providing explanations that meet user expectations. Accurately matching a user’s mental model has been shown to increase the efficiency of user interaction with a system. A mental model expresses how we interpret and understand the world around us and the cause-and-effect relationships within it. For example, a person who drives a car for daily travel but has no detailed knowledge of how engines work probably has a very different mental model of how cars operate than a mechanic does. They may still drive the same kind of vehicle in their daily lives, but their perspectives and reactions may vary greatly. If the car breaks down and they both look under the hood, the mechanic will see a familiar mechanical system and know where to look for a problem, while the layperson may stare blankly, seeing only a confusing mess of mechanical parts. Equally, XAI explanations, no matter how detailed and accurate, make no sense if the users are unsure of what they’re looking at.

In Chapter 1, we introduced a set of questions that businesses could ask themselves regarding the trustworthiness of AI. We can follow a similar process for explainability:

  • Why do we need to make our AI algorithm explainable? XAI needs investment – human, monetary, and environmental (CO2 emissions). Do you really need it, or are you just using it because it sounds cool? To be fair, saying “We are using cutting-edge LIME XAI methods” does sound cool, but probably not cool enough to justify the investment.

  • Who are you explaining AI to? Explaining an algorithm to an auditor is different from explaining it to a user. Your target group will also probably determine the level of detail of the explanation, how it’s presented, and what XAI method you’re using.

  • How reliable is the XAI method? Now that you have decided to use XAI and know who you’re targeting, you should assess the reliability and usefulness of the method. After all, if the explanations don’t serve their purpose, your investments might be misplaced.

  • Could there be unintended consequences of deploying XAI? The method you chose is technically feasible, appropriate to the target group, and reliable – congratulations! However, could someone be taking advantage of it to jeopardize your algorithm? Carefully consider the risks of XAI before bringing it into a real-world environment. Everybody taking this course is certainly a very nice person, but there are some bad people out there!

Part summary

In this chapter, we took a look into the decision process implemented by AI models. We first explored the complexity of the decision process and then highlighted which tools can be used to characterize and quantify their logic.

  • Explainability can be looked at from a local or a global perspective – the local perspective gives an explanation for a specific decision, whereas global explanations look at the model's behavior overall.

  • Explainability methods can be divided into model-specific and model-agnostic approaches depending on what types of models they can explain – any, or only a specific kind.

  • The explainability of an AI solution is closely tied to what type of algorithm it uses. Some algorithms (such as neural networks) are “black box”, or very difficult to explain.

  • Some of the benefits of explainability are increasing trust and stakeholder buy-in, meeting regulations, and gaining a more comprehensive understanding of the data and its implications.

  • Explainability comes with a cost – explainable models might not perform as well as their “black box” counterparts, and explainability approaches on “black box” models might be time-consuming and difficult to set up.
