II.

Detecting and mitigating bias and unfairness

So far we’ve discussed a lot of theory – but what do bias and unfairness in AI look like in real life? In this section, we’ll show the different domains in which bias is an issue by highlighting real-life cases and then discussing how to mitigate it.

Bias and unfairness cases

Online advertising

Online advertising is often based on data collection to generate tailored content for different user demographics. This too is an algorithmic process prone to unfairness. For example, an audit of Facebook’s advertising algorithm revealed that job advertisements were displayed unequally based on gender – even though gender wasn’t an indicator of how well an individual would perform in or be suited for the position (4). This wasn’t the first time that this kind of discriminatory targeting had been uncovered in the social network’s advertising algorithms.

For example, women were given preference in job advertisements for nursing or secretarial positions, while job advertisements for janitors and taxi drivers were mainly displayed to men, particularly those from minority backgrounds. Even within the same domain, men and women were shown different ads based on pre-existing workplace gender balance. Skilled women in tech, for example, were prioritized for ads about positions at Netflix over positions at Nvidia. Netflix already has a higher proportion of female employees than Nvidia, which could be a contributing factor to such skewed targeting. Thus, the pre-existing gender (im)balance was reinforced by the advertising algorithm.

Social media giant Meta’s (earlier known as Facebook) advertising policy (5) states: “Our Advertising Standards don't allow advertisers to run ads that discriminate against individuals or groups of people based on personal attributes such as race, ethnicity, colour, national origin, religion, age, sex, sexual orientation, gender identity, family status, disability or medical or genetic condition”.

Healthcare risk assessment tools

As we discussed in Chapter 1, AI is increasingly used in healthcare to assess patient risks and help determine the duration and extent of patient care. One algorithm widely used in US hospitals to allocate health care to patients has systematically discriminated against black people (6). Researchers from the University of California, Berkeley, Brigham and Women's Hospital, Massachusetts General Hospital, and the University of Chicago demonstrated the existence of a racial bias in this algorithm, which is used throughout the US health system. According to the researchers, this bias reduced the number of black patients identified as eligible for additional care by more than half.

The research showed that the bias occurs because the algorithm uses health costs as a rough indicator of health needs. Due to pre-existing discriminatory bias in healthcare, less money is spent on black patients than on white patients with the same level of need. Because of this hidden bias in the data, the algorithm falsely concludes that black patients are healthier than equally sick white patients, and thus lowers their risk scores.

Credit card and loan applications

A good example of why transparency is an important aspect of fairness can be found in the trouble surrounding Apple and Goldman Sachs’ credit card venture, the Apple Card. Several applicants reported on social media that male applicants tended to get much higher credit limits than female applicants. This happened even within couples who had identical credit scores and shared finances. Yet when customer service was pressed on the issue, the response was that they weren’t authorized to discuss the credit assessment process, meaning there was no way to gain insight into why the assessment algorithm had made its decisions (7).

Since discriminatory algorithms are against New York law, an investigation was launched by the Department of Financial Services (DFS). In the end, the investigation found no conclusive evidence of discrimination, but it did find deficiencies in customer service procedures and transparency (8). The conclusion was partially based on the fact that the application process didn’t use “prohibited characteristics” (race or gender) as labels in the data input for the AI model. However, the investigation has also been criticized by AI researchers for not considering that certain model features often act as proxies for these protected classes, meaning that other features that correlate with the protected classes can still sneak bias into a system (9). The case also highlights the importance of consistent and up-to-date regulations to keep auditing procedures fair and relevant when things do go wrong. With the current speed of AI development, the updating will indeed need to be constant!
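To make the idea of proxy features more concrete, here is a minimal, purely illustrative sketch: even when a protected attribute such as gender is excluded from a model’s inputs, a seemingly neutral feature can correlate so strongly with it that it effectively reintroduces the protected attribute. All names and numbers below are hypothetical.

```python
# A minimal, hypothetical sketch of a proxy check: even if a protected
# attribute (here, gender) is never given to the model, other features
# may correlate with it and reintroduce bias.
import pandas as pd

# Toy applicant data -- entirely made up for illustration.
df = pd.DataFrame({
    "gender":            [0, 0, 0, 1, 1, 1, 0, 1],   # protected attribute (not a model input)
    "income":            [55, 60, 58, 41, 43, 40, 62, 39],
    "years_employed":    [6, 7, 6, 5, 6, 5, 8, 4],
    "shopping_category": [1, 1, 1, 0, 0, 0, 1, 0],   # a seemingly "neutral" feature
})

# Correlation of each candidate input feature with the protected attribute.
for feature in ["income", "years_employed", "shopping_category"]:
    corr = df[feature].corr(df["gender"])
    print(f"{feature}: correlation with gender = {corr:.2f}")

# A strongly correlated feature (like shopping_category here) can act as a
# proxy, so simply dropping the protected attribute does not guarantee fairness.
```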

Terminology

Protected and unprotected classes

In the machine learning domain, prohibited characteristics are also referred to as “protected classes” – in other words, classes that aren’t allowed to play a role in the predictive outcome. When a class is protected, the model is effectively blind to it during training. Marking classes as protected can, for example, be done in the interest of preserving equal opportunity when training data contains sensitive information that would negatively impact the fairness of the model (a small sketch of this follows the list below). Commonly protected classes include:

  • Age

  • Race

  • Nationality

  • Religion

  • Gender

  • Marital status
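As a rough illustration of the idea above, the following sketch drops protected columns from a training set so that the model never sees them. The column names and the use of scikit-learn are assumptions made for the example – and, as the Apple Card case shows, dropping protected classes alone does not guarantee fairness.

```python
# A minimal sketch (hypothetical column names) of excluding protected classes
# from a model's inputs so the model is "blind" to them during training.
import pandas as pd
from sklearn.linear_model import LogisticRegression

PROTECTED_CLASSES = ["age", "race", "nationality", "religion", "gender", "marital_status"]

def split_features(df: pd.DataFrame, target: str):
    """Separate the prediction target and drop protected columns from the inputs."""
    y = df[target]
    X = df.drop(columns=[target] + [c for c in PROTECTED_CLASSES if c in df.columns])
    return X, y

# Usage sketch, assuming a DataFrame `applications` with a "hired" column:
# X, y = split_features(applications, target="hired")
# model = LogisticRegression().fit(X, y)
# Note: proxy features can still leak information about the protected classes.
```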

Facial recognition software

A common and much-debated bias that has cropped up with the rise of facial recognition software is the difference in error rates in commercial software based on race and gender (10). The Gender Shades project examined commercial facial analysis software from IBM, Microsoft, and Face++. The much-cited research paper by Joy Buolamwini and Timnit Gebru (11) concluded that black women are far more likely to be misclassified by facial recognition software than any other intersectional group, with error rates of at least 20%. Their research also showed that 93% of all misclassified subjects had darker skin tones.

Sometimes, facial recognition software won’t detect black faces at all. Such was the case for Joy Buolamwini when she was working on an art project using AI-enabled facial recognition software: she had trouble getting her face detected consistently, while the system worked perfectly fine for her white friends. She has described in interviews how wearing a white mask improved the software’s ability to detect her face.

Part of the problem behind such unfairness is the way this software is trained. This is very much a sample bias problem, as training datasets compiled in the Western world tend to over-represent white men. Similarly, studies have shown that facial recognition software trained in Western countries and software trained in East Asian countries each work better for their respective populations. One way to surface this kind of problem is a disaggregated evaluation, as sketched below.
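The following is a minimal sketch, with entirely made-up data and group labels, of the kind of disaggregated evaluation the Gender Shades project popularized: instead of reporting one overall accuracy figure, error rates are computed separately for each demographic group.

```python
# A minimal sketch of a disaggregated evaluation: error rates per
# (hypothetical) demographic group rather than one overall number.
import numpy as np

# Made-up labels and predictions for twelve test subjects.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0])
group  = np.array(["lighter_male", "lighter_male", "lighter_female", "darker_female",
                   "darker_male", "darker_female", "lighter_female", "darker_male",
                   "lighter_male", "darker_female", "darker_female", "lighter_female"])

# Report the error rate separately for each group.
for g in np.unique(group):
    mask = group == g
    error_rate = np.mean(y_true[mask] != y_pred[mask])
    print(f"{g:15s} error rate: {error_rate:.2f}")
```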

Speaker recognition software

Speaker characteristics such as age, accent, and gender have been shown to affect how well voice AI works. Even back in 2013, research showed that speaker recognition technology generally performed worse for female speakers than for male speakers. In the 2013 Speaker Recognition Evaluation in Mobile Environments challenge, all 12 submitted systems confirmed this bias by performing worse for female speakers, with error rates on average 49.35% higher for women than for men (13).

YouTube’s caption system was also shown to contain similar gender and accent biases in a study conducted in 2017 (14). In this study, English speakers from California had the lowest error rates, while speakers from Scotland had the highest. Here too, average performance was lower for women than for men (12).

Bias detection and mitigation

OK, you might be asking yourself now: bias is a problem – but is there anything we can do about it? Thankfully, the answer is ‘yes’. We’re now going to take a closer look at how we can detect and mitigate bias. It’s important to note that when speaking of bias, most researchers and professionals speak of mitigation, not removal. This is because there are so many sources and avenues for bias to enter the picture – from the data, to the training process and labeling, to the way the software is deployed – that claiming any given countermeasure has completely removed bias is too strong a claim. Instead, we speak of mitigation, where we look for ways to reduce the severity and impact of bias.

Three metrics to detect bias

To be able to mitigate bias, the first step is to detect the bias. Detection and mitigation are the two core activities in tackling bias, and they often go hand-in-hand. Detection is usually done through methods employing one or several evaluation metrics or conditions to compare the model predictions and outcomes.

  • One example of a metric is ‘group fairness’ or statistical parity. This metric evaluates fairness statistically by looking at whether individuals from different categories of a protected class have an equal probability of a positive outcome. For example, when people apply for a job, it’s a good measure of fairness if applications from both men and women have an equal statistical likelihood of being passed through for an interview (a small sketch of this metric follows after this list).

  • Another example of a metric is ‘individual fairness’ or similarity-based fairness. This type of approach instead focuses on comparing predictive outcomes between similar individuals, the idea being that similar people should end up with similar decisions in, for example, credit approval.

  • A third type of metric is ‘counterfactual fairness’. This third type of evaluation relies on looking at what would have happened, had certain things been different. Usually, what’s tested for is how changes in protected classes would affect outcomes. If a model is counterfactually fair, then changing the race or gender of an applicant for credit approval would still yield the same decision. If it doesn’t, then race and/or gender played a role in the decision and thus the model was unfair (15, 16).
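As a rough illustration of the first metric, here is a minimal sketch of computing a statistical parity difference on hypothetical hiring decisions. The function name and the data are assumptions made for the example.

```python
# A minimal sketch of the group fairness (statistical parity) metric:
# compare the rate of positive outcomes between two groups.
import numpy as np

def statistical_parity_difference(outcomes, groups, group_a, group_b):
    """Difference in positive-outcome rates between group_a and group_b.
    A value close to 0 suggests the two groups are treated similarly."""
    rate_a = outcomes[groups == group_a].mean()
    rate_b = outcomes[groups == group_b].mean()
    return rate_a - rate_b

# Hypothetical hiring decisions (1 = invited to interview).
outcomes = np.array([1, 0, 1, 1, 0, 1, 0, 0, 0, 0])
groups   = np.array(["men", "men", "men", "men", "men",
                     "women", "women", "women", "women", "women"])

print(f"{statistical_parity_difference(outcomes, groups, 'men', 'women'):.2f}")
# 0.6 interview rate for men vs. 0.2 for women -> a difference of 0.4,
# which would point to a potential group fairness problem.
```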

Mitigating bias: pre-, in-, and post-processing methods

Once bias has been detected, there are three main categories of methods for mitigating it: pre-processing, in-processing, and post-processing.

Pre-processing methods rely on going back and adjusting the conditions that were in place before training took place (in other words, before processing). This could, for example, involve fixing labels that were previously causing problematic results, or adjusting the dataset to make sure it’s more balanced and representative. It could also involve adjusting or adding new protected classes.
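One simple pre-processing technique is rebalancing the training data. The sketch below, with hypothetical column names and counts, oversamples an underrepresented group before training; in practice this would be combined with checks that the duplicated examples don’t distort other properties of the data.

```python
# A minimal pre-processing sketch: rebalance a (hypothetical) training set by
# oversampling an underrepresented group before the model is trained.
import pandas as pd

def oversample_group(df, column, value, target_count, seed=42):
    """Duplicate rows of an underrepresented group until it reaches target_count."""
    group = df[df[column] == value]
    extra = group.sample(n=target_count - len(group), replace=True, random_state=seed)
    return pd.concat([df, extra], ignore_index=True)

# Usage sketch: if only 100 of 1,000 training rows come from group "B",
# duplicate rows from that group until it has 500 in total:
# balanced_df = oversample_group(training_df, column="group", value="B", target_count=500)
```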

In-processing methods instead attempt to mitigate bias by setting up measures within the learning process. This can take the form of regularizers that guide the model toward fairer outcomes, or constraints that enforce them. Regularizers are techniques or mathematical functions used to prevent models from overfitting (finding patterns in the data that are undesirable or unnecessary for the model’s function) or underfitting (failing to find the necessary patterns) during training. The drawback of this kind of method is that most of them require the protected or sensitive features to be known ahead of time in order for the proper functions to be added.
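As a very rough illustration of the in-processing idea, the sketch below adds a fairness penalty to an ordinary training loss so that large gaps in average predictions between two groups are discouraged. The loss formulation, group labels, and weighting factor are all assumptions made for the example.

```python
# A minimal in-processing sketch: an ordinary loss plus a fairness penalty
# that discourages large gaps in average predictions between two groups.
import numpy as np

def fairness_regularized_loss(y_true, y_pred, groups, lam=1.0):
    """Binary cross-entropy plus a penalty on the gap in average predictions
    between groups 'a' and 'b'. lam controls how strongly fairness is weighted."""
    eps = 1e-9
    bce = -np.mean(y_true * np.log(y_pred + eps) + (1 - y_true) * np.log(1 - y_pred + eps))
    gap = abs(y_pred[groups == "a"].mean() - y_pred[groups == "b"].mean())
    return bce + lam * gap

# Hypothetical predictions for six individuals from two groups.
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([0.9, 0.2, 0.8, 0.4, 0.1, 0.3])
groups = np.array(["a", "a", "a", "b", "b", "b"])
print(fairness_regularized_loss(y_true, y_pred, groups, lam=0.5))
```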

Finally, post-processing methods don’t aim to fix the data, like pre-processing methods, or guide the learning process, like in-processing methods. Instead, they focus on modifying the predictions to create fairer outcomes. For example, in the case of a recruitment tool that’s known to discriminate against women, a post-processing method of tackling this could be to adjust the scoring threshold for applications with a ‘woman’ label so that female applicants are more likely to be picked. This would effectively ‘balance out’ the bias by introducing a new kind of counter-bias. This threshold adjustment could later be tuned until a satisfactory outcome is reached.
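A minimal sketch of such a post-processing adjustment, with made-up scores and thresholds, could look like this: each group gets its own decision threshold, which would then be tuned against a fairness metric until the outcomes are acceptable.

```python
# A minimal post-processing sketch: apply group-specific decision thresholds
# to a model's scores to counteract a known bias. Values are illustrative.
import numpy as np

def group_threshold_decisions(scores, groups, thresholds):
    """Return 1 (positive decision) when a score meets its group's threshold."""
    cutoffs = np.array([thresholds[g] for g in groups])
    return (scores >= cutoffs).astype(int)

scores = np.array([0.62, 0.55, 0.48, 0.71, 0.52])
groups = np.array(["men", "women", "women", "men", "women"])
# Lower the threshold for the group known to be scored unfairly low.
decisions = group_threshold_decisions(scores, groups, {"men": 0.60, "women": 0.50})
print(decisions)  # -> [1 1 0 1 1]
```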

Why human oversight is needed

Of course, it’s important to consider the role of the person or entity deploying the algorithm and how they handle a model’s predictions.

For example, in 2020, the UK canceled A-level exams during the COVID-19 pandemic and instead opted to use an AI-based system to determine student outcomes for university qualifications. First, teachers were asked to estimate how they thought their students would have performed in the exams. Then, to counter the perceived risk that teachers would overestimate their students, an AI algorithm was used to weigh these scores by factoring in how well students from each secondary school had performed historically.

This resulted in a disaster where roughly 40% of students received lower grades than their teachers had predicted based on their previous performance. In effect, the algorithm reinforced societal biases around economic class: high-performing students from poorer areas were more likely to receive a lowered score than students from richer schools. This led to protests and later forced the government to scrap the approach altogether. The main takeaway here is that human oversight is an important element in ensuring fair outcomes. The choice to let an algorithm have the final say in a decision should be made with a high degree of consideration and responsibility.

Note that these examples given here have been broad simplifications of fairness metrics and mitigation methods. For a more detailed view, we recommend checking out the course Advanced Trustworthy AI.

Increasing trust in AI tools and ensuring fair use of data and fair access to data are important considerations for the responsible development and deployment of AI technologies. While they touch more closely on the topics of privacy and explainability, they have a strong link to fairness.

  • Fair use of data involves ensuring that data is collected, processed, and used in accordance with relevant laws and ethical principles, such as data protection laws and principles of privacy and confidentiality. This includes obtaining informed consent from individuals before collecting their data, ensuring that data is collected and used only for the purposes for which it was intended, and implementing appropriate security measures to protect data from unauthorized access or use.

  • Fair access to data means that data should be accessible to all parties who have a legitimate need for it, regardless of their size, power, or resources. This calls for mechanisms to share data between different organizations and for regulations that prevent data monopolies or other unfair practices that limit access to data.

To increase trust, AI developers and users should prioritize transparency and accountability in their use of AI tools. This includes providing clear explanations of how AI tools work, what data they use, and how they make decisions. It also includes implementing mechanisms for monitoring and auditing AI tools to ensure they’re functioning as intended and not engaging in biased or discriminatory behavior. We’ll talk about the transparency and explainability of AI models in the next chapter.

Part summary

Here’s what you’ve learned in this second chapter of this course:

  • Bias refers to a skewed perspective. In machine learning, bias refers to favored or unequal outcomes for certain groups based on skewed data or model behavior.

  • Fairness is usually talked about when bias results in ethical issues.

  • Discrimination means unfair treatment of groups or individuals based on specific characteristics such as race or gender.

  • Algorithmic bias can take the form of historical bias (making predictions based on earlier outcomes), sampling bias (when the data isn’t collected randomly enough and favors certain populations), and measurement bias (when the data is labeled in a way that doesn’t reflect real life).

  • Biased systems often self-reinforce – biased outcomes feed back into the model, which then produces more biased outcomes.

  • There are three common metrics for detecting bias: group fairness (or statistical parity), similarity-based (individual) fairness, and counterfactual fairness.

  • Bias can be mitigated in the pre-, in-, and post-processing stages.

With this knowledge, you can better appreciate the complexities of data processing, AI systems, and the impacts these have on the human population. You also know better what developers and deployers of AI systems need to look out for to ensure future systems avoid discriminatory outcomes.
