I.

Challenges in implementing fairness in AI

Nowadays, everyone isn’t just talking about AI – it seems like everyone is also implementing (or planning to implement) AI in a wide range of products and services. As AI systems become more common in society, policymakers and AI researchers are paying more attention to the fairness issues that arise when these systems are deployed in critical areas that can affect people’s lives and careers. After all, we want these systems to benefit everyone equally – but sadly, due to several factors, this isn’t always the case. In this section, we’ll look at the various ways AI algorithms can be biased or cause discrimination and what can be done to mitigate these effects.

Bias, fairness, and discrimination

The terms bias, (un)fairness, and discrimination are commonly used interchangeably to describe unethical favoritism or prejudice towards an individual or a group of people based on their attributes. In the machine learning domain, however, each of these terms has a distinct meaning, as discussed below:

Terminology

Bias, discrimination, and fairness

Bias: Bias refers to a skewed or unbalanced perspective. While bias in itself doesn’t necessarily pose a moral or ethical dilemma, it may have social consequences such as discrimination based on a person’s demographic group. In machine learning, bias can be a source of unfairness or discrimination introduced during data collection, sampling, or measurement.

Discrimination: Discrimination is an effect of an action or the outcome of a decision-making process. It’s caused by human prejudice and stereotyping based on sensitive attributes such as race or gender. In many countries and regions, non-discrimination law specifies protected and sensitive attributes. That means equal treatment of the individuals and groups defined by these attributes is legally required – and this should be considered before deploying an AI-based system.

Fairness: Fairness is a desired quality of a system, requiring it to avoid discrimination by ensuring that the system’s outcomes treat people in an equivalent manner. In machine learning, fairness is evaluated by comparing the model’s performance across groups defined by protected and unprotected attributes.
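To make that last point concrete, here’s a minimal sketch – in Python with NumPy, using made-up labels, predictions, and group memberships – of how a model’s outcomes can be compared across a protected attribute:

```python
# Minimal sketch: comparing model outcomes across a protected attribute.
# All labels, predictions, and group memberships below are invented.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])                     # true outcomes
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 0, 0])                     # model decisions
group  = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])  # protected attribute

for g in np.unique(group):
    mask = group == g
    selection_rate = y_pred[mask].mean()              # share of positive decisions
    accuracy = (y_pred[mask] == y_true[mask]).mean()  # how often decisions are correct
    print(f"group {g}: selection rate {selection_rate:.2f}, accuracy {accuracy:.2f}")

# Large gaps between groups in either number are a common warning sign of unfairness.
```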

When talking about fairness in AI, it’s important to understand the context in which it’s used. According to work by Friedman and Nissenbaum, an AI system is considered unfair if it contains bias that leads to a moral consequence (1). Fairness also appears as a core principle in the European Union’s General Data Protection Regulation (GDPR), often associated with either transparency or lawfulness, two other core principles of the regulation (2). In other words, fairness can mean anything from the AI model not being discriminatory to it obeying the law.

Bias in an AI system can refer to either a bias present in the data used by the algorithm (for example, skewed training data) or an inherent bias in the operation of the algorithm itself (for example, the selected loss function and its impact on the outcome of the model). Discrimination is a possible moral consequence of bias. In fact, the GDPR also makes this link between fairness and discrimination:

Learn more

Fairness and discrimination in GDPR

“In order to ensure fair and transparent processing in respect of the data subject, taking into account the specific circumstances and context in which the personal data are processed, the controller should:

  • use appropriate mathematical or statistical procedures for the profiling,

  • implement technical and organisational measures appropriate to ensure, in particular, that factors which result in inaccuracies in personal data are corrected and the risk of errors is minimised,

  • secure personal data in a manner that takes account of the potential risks involved for the interests and rights of the data subject,

  • prevent, among other things, discriminatory effects on natural persons on the basis of racial or ethnic origin, political opinion, religion or beliefs, trade union membership, genetic or health status or sexual orientation, or processing that results in measures having such an effect.”

- GDPR, Recital 71

In this sense, biased data or biased algorithms can lead to discriminatory outcomes when the resulting decision-making processes are put into practice.

However, when talking about bias and fairness, it’s important to point out that biased systems don’t necessarily lead to discriminatory outcomes. In certain contexts, an AI system may be intentionally designed to be biased to better reach or serve a specific target audience. For example, one could artificially tweak a model to cater to underrepresented groups so that they’re treated in the same way as the majority – as might be done in recruitment. Likewise, unbiased systems can still lead to unfair outcomes, for example if a specific demographic is in a disadvantaged position and would need more support than others instead of strictly equal treatment.
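As a rough illustration of that kind of deliberate adjustment, the sketch below (with synthetic data, assuming scikit-learn is available) weights training samples so that a small group carries as much total weight as the majority:

```python
# Minimal sketch: deliberately reweighting training so an underrepresented
# group carries as much weight as the majority. Data and group sizes are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_major, n_minor = 900, 100
X = rng.normal(size=(n_major + n_minor, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=n_major + n_minor) > 0).astype(int)
group = np.array(["majority"] * n_major + ["minority"] * n_minor)

# Each sample is weighted inversely to its group's size, so both groups
# contribute equally to the loss instead of the majority dominating it.
weights = np.where(group == "minority",
                   (n_major + n_minor) / (2 * n_minor),
                   (n_major + n_minor) / (2 * n_major))

model = LogisticRegression().fit(X, y, sample_weight=weights)
```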

In this chapter, since we're focusing on fairness, bias will always be associated with contexts within which discriminatory outcomes (unfairness) are possible, and as such we won’t be looking at intentional bias.

Algorithmic fairness

A common saying within machine learning development circles is “garbage in, garbage out” – meaning that the quality of a machine learning algorithm's predictions is only as good as the quality of its training data. This also means that faults within that data will carry on into the prediction outputs as well. In short, unfair or biased data will also lead to unfair or biased algorithms if care isn’t taken to mitigate these data flaws.

Developers using supervised learning build machine learning models based on their collected and labeled training data sets. The training data teaches the machine learning model how to make predictions about the world – so the better the labeled data, the better the predictions. Problems arise, however, when the labeled data is inaccurate, distorted, or carries inherent biases. The outcomes will differ from what is expected, and the predictive models will fail.

Problems in the data can have disastrous effects on a machine learning system’s performance. Incomplete or missing data, for example, results in predictions that don’t take the full picture into account. Incorrect data creates misleading and mistaken predictions. Similarly, biased data can result in discriminatory predictions, often perpetuating a cycle of already existing discriminatory practices. Algorithmic unfairness is thus often born out of unfair biases in the training data. Looking back at our definition of ‘unfair’ above, an algorithm can be unfair if it’s trained on biased data that results in unequal treatment of otherwise equal users or individuals.
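Some of these data problems can be caught with simple checks before any model is trained. The sketch below – assuming pandas, with a toy table whose columns are purely hypothetical – looks at missing values, group representation, and per-group label rates:

```python
# Minimal sketch: auditing a training set before fitting a model.
# The table and its columns ("gender", "years_experience", "label") are invented.
import pandas as pd

df = pd.DataFrame({
    "gender": ["f", "f", "m", "m", "m", "m", "m", None],
    "years_experience": [3, 5, 4, None, 7, 2, 6, 5],
    "label": [0, 1, 1, 1, 1, 0, 1, 1],   # 1 = positive outcome
})

print(df.isna().mean())                           # incomplete or missing data
print(df["gender"].value_counts(normalize=True))  # how well each group is represented
print(df.groupby("gender")["label"].mean())       # does the label rate differ by group?
```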

Factors affecting machine learning bias

Several factors can influence machine learning bias. Let’s take a closer look at some of the situations that create unfair bias in machine learning models.

Sampling bias

Sampling bias happens when the data used to train the algorithm is collected in such a way that the final data sample isn’t representative of the population the algorithm will be used on. In other words, certain members of the population are more likely to be part of the sample than others.

For example, imagine a machine learning application that’s designed to be used on a large population. Now imagine that the prediction model is only trained on a smaller section of this population that is in no way representative of the whole. Naturally, any proficiency the machine learning model gains will be heavily slanted toward predictions relating to the specific subsection of the population represented in the training data. In fact, this already happens. Datasets used to train facial recognition systems are often skewed, as they mainly contain images of white men. Despite this, the resulting systems are later used to identify people of all genders and skin colors, so white male faces are recognized with much higher accuracy than everyone else’s. Research even shows that facial recognition software across the board has difficulty recognizing darker-skinned individuals’ faces at all. We’ll look at a practical example of this in the next section.
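The sketch below illustrates the effect with entirely synthetic data (assuming scikit-learn): a model trained on a sample dominated by one group performs well for that group and poorly for the underrepresented one.

```python
# Minimal sketch: sampling bias. All data is synthetic; the label rule is
# deliberately different for the two groups so the effect is easy to see.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_group(n, flip):
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] > 0).astype(int)
    return X, (1 - y if flip else y)    # group B's label rule is reversed

# Training sample: 95% group A, 5% group B.
Xa, ya = make_group(950, flip=False)
Xb, yb = make_group(50, flip=True)
model = LogisticRegression().fit(np.vstack([Xa, Xb]), np.concatenate([ya, yb]))

# Evaluated on balanced test data, the model has mostly learned group A's pattern.
Xa_t, ya_t = make_group(500, flip=False)
Xb_t, yb_t = make_group(500, flip=True)
print("accuracy on group A:", model.score(Xa_t, ya_t))
print("accuracy on group B:", model.score(Xb_t, yb_t))
```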

Another example of sampling bias can be found in federated learning, where a machine learning algorithm is trained in a decentralized environment across multiple devices (for example Internet of Things devices such as smartphones or smart watches), each holding its own local data samples. If a federated learning process opts for a so-called ‘asynchronous approach’ – meaning that the centralized server doesn’t wait for all devices to report at the same time but instead aggregates their updates on a first come, first served basis – then a sampling bias can form against weaker devices. In practice, devices that are computationally slower or have a poorer network connection report their data later and thus fall behind in the aggregation. If such a learning process covers a large and diverse area, such as a city or region, the areas with poor socio-economic status or worse network connectivity will be represented less often in the training. For example, if health-related data is collected from smart wearables, devices in areas with poor network connectivity may fail to report some of their data – meaning more data is lost in these areas, and the samples from these regions end up less representative than similar data from areas with better internet access. The result, once again, is a machine learning model trained on poorly distributed data samples.
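A small simulation – plain NumPy, no real federated learning framework, with invented latencies – shows how ‘first come, first served’ aggregation ends up including fast devices far more often than slow ones:

```python
# Minimal sketch: asynchronous, 'first come, first served' aggregation.
# Devices, latencies, and rounds are simulated; no real FL framework is used.
import numpy as np

rng = np.random.default_rng(0)
n_devices, n_rounds, take_first = 20, 100, 10

# The first ten devices are fast; the rest are slow (weak hardware or poor network).
base_latency = np.where(np.arange(n_devices) < 10, 1.0, 3.0)

times_included = np.zeros(n_devices)
for _ in range(n_rounds):
    arrival = base_latency + rng.exponential(1.0, n_devices)  # when each device reports
    chosen = np.argsort(arrival)[:take_first]                 # server takes the first arrivals
    times_included[chosen] += 1

print("inclusion rate, fast devices:", times_included[:10].mean() / n_rounds)
print("inclusion rate, slow devices:", times_included[10:].mean() / n_rounds)
```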

Historical bias

Biased data can have many sources. Sampling bias, as mentioned above, is one of them. Another is the set of practices and circumstances that shaped the training data in the first place, known as historical bias. This is a bias that’s already present in the world and that has to be accounted for when selecting the training data. It’s often the case that machine learning algorithms designed for specific real-life applications are also trained on existing real-life data from previous outcomes and decisions made within the application domain. For example, a company designing a machine learning algorithm-based recruitment tool to streamline its hiring process is likely to use data based on its previous hiring outcomes. Following the previously stated ‘garbage in, garbage out’ idea, if the previous hiring practices were less than stellar, the machine learning algorithm will simply learn from their example. After all, without intervention, a machine learning algorithm simply learns from its existing role models (whatever system produced the training data). If the training data is the result of an unfair system – for example, one where racist, sexist, or classist practices have endured – then the machine learning algorithm is likely to learn the patterns these practices produce and continue applying them.

As we already learned in section 1.2, this is exactly what happened with Amazon, which abandoned an AI recruitment tool that was biased toward men for technical jobs (3). In 2014, a team at Amazon developed an automated system for reviewing job applicants' resumes. Amazon's system learned to demote resumes containing keywords referring to women and to give lower scores to graduates of all-women’s colleges. The system's gender bias stemmed from the skewed training data it was given to learn from: Amazon's workforce is unquestionably male-dominated, so the algorithm learned to treat characteristically masculine attributes as positives. The development team behind the tool tried to remove the gender bias from the system but ultimately decided to abandon the program entirely in 2017.
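The sketch below, with entirely synthetic ‘hiring’ data and assuming scikit-learn, illustrates the underlying mechanism: even when the gender column is removed, a feature that merely correlates with gender lets the model reproduce the historical pattern.

```python
# Minimal sketch: historical bias surviving through a proxy feature.
# The 'hiring' data is entirely synthetic and invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
gender = rng.integers(0, 2, n)                   # 0 and 1 stand for two genders
skill = rng.normal(size=n)
proxy = gender + rng.normal(scale=0.3, size=n)   # e.g. a resume keyword tied to gender

# Historical decisions rewarded skill but penalized one gender (the bias we inherit).
hired = (skill - 1.5 * gender + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Train WITHOUT the gender column, but with the correlated proxy feature included.
X = np.column_stack([skill, proxy])
model = LogisticRegression().fit(X, hired)

pred = model.predict(X)
print("predicted hiring rate, gender 0:", pred[gender == 0].mean())
print("predicted hiring rate, gender 1:", pred[gender == 1].mean())
```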

Measurement bias

In this type of bias, it isn’t just the data itself, but the way we label that data that has a biasing effect on the system. For example, imagine data collected on the number of fouls committed in a certain team sport across multiple games with different referees. In addition to the actual performance of the various players and teams, what the referee considers a ‘foul’ also plays a part. Some referees may be more stringent than others, meaning that what gets labeled as a ‘foul’ varies and the data reporting of actual fouls committed isn’t consistent across games.
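A tiny simulation with invented numbers makes the point: the very same incidents get different labels depending on which referee’s threshold produced them.

```python
# Minimal sketch: measurement bias from inconsistent labeling thresholds.
# The 'severity' values and referee thresholds are invented.
import numpy as np

rng = np.random.default_rng(0)
severity = rng.uniform(0, 1, size=1000)          # underlying severity of 1000 incidents

strict_labels  = (severity > 0.4).astype(int)    # a strict referee calls more fouls
lenient_labels = (severity > 0.7).astype(int)    # a lenient referee calls fewer

print("foul rate under the strict referee: ", strict_labels.mean())
print("foul rate under the lenient referee:", lenient_labels.mean())
# A dataset that pools labels from both referees treats identical incidents
# inconsistently, and a model trained on it inherits that inconsistency.
```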

Self-reinforcing feedback loops

Outcome, data, and algorithm in a circle-shaped process chart

If algorithmic bias isn’t dealt with, it can have a nasty habit of maintaining and reinforcing itself. Imagine a system consisting of training data as input, the training process generating the model, and then the model being applied to give outputs. Bias can enter this system from different directions. One direction is the training data itself, which may already be skewed by previous outcomes – that is, historical bias. In short, discriminatory data trains the model to be discriminatory. Another direction from which bias may enter the system is the training process itself, as we established when looking at sampling bias. The design of the data features and the labeling of the training data during model training can also be a source – this is measurement bias.

Example

A feedback loop in practice: the VoxCeleb 1 dataset

For example, the VoxCeleb 1 dataset, a dataset containing samples from YouTube videos of celebrity interviews, includes metadata that labels speaker nationality based on citizenship sourced from the celebrities’ Wikipedia pages. When the dataset is used to train or evaluate speaker verification software, this metadata shapes how speakers are represented in the process, and citizenship becomes conflated with the speaker’s accent. This is a problem because the population of a country includes people speaking with many different accents – important nuances that aren’t visible in the labeling.

Either way, bias has now entered the system, and the AI model starts generating biased outputs. If no one intervenes, these biased outputs lead to biased outcomes. This model, and future models, may then be continuously trained on this new outcome data, which carries the bias into the next generation. These outcomes may even create new biased practices where there were none before. The biased data thus re-enters the system and is recycled over and over, creating an automated cycle of bias – a self-reinforcing feedback loop. This is why detecting and mitigating bias isn’t just about individual instances but a continuous effort that requires standardized practices and guidelines.
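The toy simulation below – all numbers invented, plain NumPy – shows how such a loop can amplify a small initial skew: a system keeps directing attention toward whichever group its records favor, and its own outputs become the records it learns from next.

```python
# Minimal sketch: a self-reinforcing feedback loop. Both groups behave
# identically underneath; only the historical records are slightly skewed.
import numpy as np

rng = np.random.default_rng(0)
true_rate = 0.1                      # the genuine rate is the same for both groups
records = np.array([12.0, 10.0])     # slightly skewed historical records

for round_ in range(6):
    favored = np.argmax(records)                            # the model's biased output
    attention = np.where(np.arange(2) == favored, 80, 20)   # more scrutiny for the favored group
    # More attention produces more recorded outcomes for that group, and those
    # records become the training signal for the next round, widening the gap.
    records += rng.binomial(attention, true_rate)
    print(f"round {round_}: share of records for group 0 = {records[0] / records.sum():.2f}")
```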

As seen in this section, the challenges of data and fairness aren’t clear-cut, easy problems, as they often reflect biases and issues of the world we live in. Questions of bias and fairness always need to be understood in their specific cultural, societal, and regional contexts and require us as humans to reason about ethics – there’s no such thing as a neutral or correct approach to any of these problems. In the next section, let’s take a look at what situations and use cases require action to be taken to ensure the intended functioning of the systems.

Next section
II. Detecting and mitigating bias and unfairness