Recommendations for Fair Machine Learning Researchers

2022, Mar 20    

There have been several attempts to define fairness quantitatively (which are detailed in the posts on statistical and causal fairness). Some argue that the rapid growth of this new field has led to widely inconsistent motivations, terminology, and notation, presenting a serious challenge for cataloging and comparing definitions [1]. Despite much headway in the last several years, our work in fair machine learning is far from over. In fact, the field has taken several missteps that we need to remedy before we can truly call our methods “fair”. These issues include rigid categorization strategies, improper terminology, damaging assumptions and abstractions, misalignment with legal ideals, and issues of power dynamics and diversity.

Rigid “Box-Like” Categorization

In most published fairness works, fairness is enforced on rigidly structured groups or categories. For instance, many papers consider the binary categories of male or female and White or Black as the main axes along which to determine whether an algorithm is fair. These ontological assumptions, though helpful in simplifying the problem at hand, are often misplaced.

The problem with narrowing concepts like gender and race down to simple binary groups is that there is no precise definition of what a “group” or “category” is in the first place. While there is no single agreed-upon interpretation, it is widely accepted in the social sciences that groups and categories are social constructs, not rigid boxes into which a person can be placed. Constructionist ontology holds that socially salient categories such as gender and race are not defined by a shared physical trait or genealogy, but are constituted by a web of social relations and meanings [2]. Social construction does not mean that these groups are not real, but that the categories of race and gender are brought into existence and shaped into what we know them to be by historical events, social forces, political power, and/or colonial conquest [3]. When we treat these social constructs as rigidly defined attributes, rather than as structural, institutional, and relational circumstances, we minimize the structural aspects of algorithmic (un)fairness [4]. The very concept of fairness can only be understood when framed from the viewpoint of the specific social group being considered.

Specific to racial categorization, Sebastian Benthall and critical race scholar Bruce D. Haynes discuss how “racial classification is embedded in state institutions, and reinforced in civil society in ways that are relevant to the design of machine learning systems” [5]. Race is widely acknowledged in the social sciences to be a social construction tied to a specific context and point in history, rather than to a certain phenotypical property. Hanna et al. explain that the meaning of “race” at any given point in time is tied to a specific racial project: an interpretation and explanation of racial identities coupled with an effort to organize and distribute resources along particular racial lines [4]. They express that it would be more accurate to describe race as having relational qualities, with dimensions that are symbolic or based on phenotype, but that are also contingent on specific social and historical contexts [4].

When defining what a “group” or “category” is in a specific fair machine learning setting, the fair machine learning community needs to understand the multidimensional nature of concepts such as race and gender, and to seriously consider the impact that our conceptualization and operationalization of historically marginalized groups have on those groups today. “To oversimplify is to do violence, or even more, to re-inscribe violence on communities that already experience structural violence” [4]. The simplifications we make erase the social, economic, and political complexities of racial, gender, and sexuality categories. Counterfactual-based methodologies, for example, tend to treat groups as interchangeable, obscuring the unique oppression encountered by each group [2]. Overall, we cannot do meaningful work in fair machine learning without first understanding and specifying the social ontology of the human groupings we are concerned will be the basis for unfairness [2].

An avenue of fair machine learning research that would be a good start for addressing this issue is the work on intersectional fairness [6,7]. Intersectionality describes the ways in which inequalities based on marginalization attributes like gender, race, ethnicity, LGBTQIA+ identity, and/or disability “intersect” to create unique effects of discrimination. An example of intersectional fairness can be seen in the figure below. Discrimination does not operate inside a vacuum, and oftentimes discrimination based on one marginalization attribute reinforces discrimination based on another. For example, if we try to close the pay gap between men and women without including other dimensions like race, socio-economic status, or immigration status, it is very likely that our solution will actually reinforce inequalities among women [8]. While intersectional fairness would be a good start, as it allows for a finer-grained classification of an individual, it does not fix the categorization issue itself; that will require collaboration between the fair machine learning community and social scientists.

The above is a fictional example of acceptance rates to a university’s computer science department based on gender, race, and ACT score. For gender: 0 = male and 1 = female. For ACT score: 0 = poor, 1 = okay, 2 = average, 3 = excellent. For race: 0 = White, 1 = Black, 2 = Other. In the top image, the red dots represent students who were not admitted and the blue dots represent students who were admitted. In this fictional example, White men were accepted with worse ACT scores than other applicants such as Black men or White women. Additionally, no Black women were accepted. If we only took gender or race into account, we would not be able to see this trend, and our correction could reinforce inequalities within the marginalization class itself.
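To make the intersectional point concrete, here is a minimal sketch of this kind of audit. The applicant table is made up by us (its column names, encoding, and rows are illustrative assumptions, not the figure’s data); the point is only that grouping on gender or race alone can look merely uneven, while grouping on their intersection reveals a subgroup with a zero acceptance rate.

```python
import pandas as pd

# Hypothetical applicant records mirroring the figure's encoding; the specific
# rows and column names are illustrative assumptions, not the figure's data.
applicants = pd.DataFrame({
    "gender":   [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],  # 0 = male, 1 = female
    "race":     [0, 0, 0, 1, 1, 0, 0, 0, 1, 1],  # 0 = White, 1 = Black
    "admitted": [1, 1, 0, 1, 0, 1, 1, 0, 0, 0],  # 1 = admitted, 0 = rejected
})

# Marginal acceptance rates look uneven but non-zero everywhere ...
print(applicants.groupby("gender")["admitted"].mean())  # male 0.6, female 0.4
print(applicants.groupby("race")["admitted"].mean())    # White ~0.67, Black 0.25

# ... while the gender-by-race intersection shows that no Black women were admitted.
print(applicants.groupby(["gender", "race"])["admitted"].mean())
```

The first two calls are exactly the kind of single-axis audit the example above warns against; only the final, intersectional grouping exposes the subgroup that was entirely excluded.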

Unintentionally Adverse Terminology

It is natural to take words such as “bias” and “protected groups” at face value when reading a fair machine learning publication, especially when we, as technically minded researchers, would rather spend our time understanding the functionality of an algorithm than the semantics of a particular word. But “placation is an absolution” [9] and “Language shapes our thoughts” [10]. In many fair machine learning works, the term algorithmic bias is used liberally and without much thought. However, the word “bias” actively removes responsibility from the algorithm or dataset creator by obscuring the social structures and byproducts of oppressive institutions that contribute to the output of the algorithm [9]. It makes the effect of bias (i.e., an unfair model) out to be purely accidental.

So why use “bias” then? Mainly because the word oppression is strong and polarizing [9,11]. And, despite algorithmic oppression being originally proposed as an alternative by Safiya Noble in 2018 [12], the term never caught on. Algorithmic oppression as a theoretical concept acknowledges that there are systems of oppression that cannot simply be reformed, and that not every societal problem has (or should have) a technological solution. Algorithmic oppression analyzes the ways that technology has violent impacts on marginalized peoples’ lives, and in doing so it does not water down the impact to “discrimination” or “implicit bias”, because doing so fundamentally invalidates the struggles and hardships that oppressed people endure [9].

In addition to advocating for “oppression” over “bias”, Hampton also comments on the term “protected groups”. They note that calling marginalized groups such as Black people, LGBTQIA+ people, or even women “protected groups” is a “meaningless gesture, although well intentioned” [9]. This is because, in reality, these groups are not protected but oppressed and disparaged, and calling them “protected groups” does nothing to change their circumstances.

We echo the sentiments of Hampton [9]. This section is more a critique of our language than a request to overhaul the terminology of an already confusing field. Let it serve as a reminder that our choice of words has very real consequences beyond simply explaining the techniques of our methods.

Damaging Assumptions and Abstractions

Assumptions

When designing a fair machine learning model, many elements are generally assumed rather than explicitly specified. These assumptions include the societal objective we hope to fulfill by deploying a fair model, the set of individuals subjected to classification by the fair model, and the decision space available to the decision makers who will interact with the model’s final predictions [1]. These assumptions can have undesirable consequences when they do not hold in the actual usage of the model. Each assumption is a choice that fundamentally determines whether the model will ultimately advance fairness in society [1]. Additionally, it is rarely the case that the moral assumptions underlying fair machine learning metrics are made explicit [13].

Of particular importance is the assumption about the population the model will act upon, i.e., the individuals who will be subjected to classification. The way a person comes to belong in a social category or grouping may reflect underlying (objectionable) social structures, e.g., “predictive” policing that targets racial minorities for arrest [1]. A model that satisfies fairness criteria when evaluated only on the population to which it is applied may overlook unfairness in the process by which individuals came to be subject to the model in the first place [1].
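A minimal simulation can make this concrete. All numbers below are hypothetical and loosely echo the predictive policing example: the model flags the same fraction of every group it actually sees, yet the overall burden falls roughly twice as heavily on the group that the upstream stop process targets more often.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: two equally sized groups with identical underlying behavior.
n = 100_000
group = rng.integers(0, 2, size=n)            # group label 0 or 1
# Upstream selection process: group 1 is stopped twice as often as group 0.
stop_prob = np.where(group == 1, 0.20, 0.10)
stopped = rng.random(n) < stop_prob

# A model applied only to those stopped, flagging 30% of each group it sees.
flagged = np.zeros(n, dtype=bool)
flagged[stopped] = rng.random(stopped.sum()) < 0.30

for g in (0, 1):
    seen = stopped & (group == g)
    print(f"group {g}: flag rate among those stopped = {flagged[seen].mean():.2f}, "
          f"flag rate across the whole group = {flagged[group == g].mean():.3f}")
# Parity holds among those the model sees (~0.30 for both groups), yet the overall
# burden is roughly twice as high for group 1 because of the upstream stop process.
```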

Starting with clearly articulated goals can improve both fairness and accountability. Recent criticisms of fair machine learning have rightly pointed out that quantitative notions of fairness can restrict our thinking when we aim to make adjustments to a decision-making process rather than to address the underlying societal problems. While algorithmic thinking runs such risks, quantitative approaches can also force us to make our assumptions more explicit and clarify what we are treating as background conditions. In doing so, we have the opportunity to be more deliberate and have meaningful debate about the difficult policy issues that we might otherwise hand-wave away, such as “what is our objective?” and “how do we want to go about achieving it?” [1]. Additionally, developing fair machine learning metrics that consider and analyze the entire ecosystem they will operate in (i.e., procedural fairness) could offer a potential fix for the risks posed by these assumptions.

Abstractions

Abstraction is one of the cornerstones of computing. It allows a programmer to hide all but the needed information about an object to reduce complexity and increase efficiency. But abstraction can also lead to the erasure of critical social and historical contexts in problems where fair machine learning is necessary [4]. Almost all proposed fair machine learning metrics (and all those discussed in this work) bound the surrounding system tightly, considering only the machine learning model, its inputs, and its outputs, while completely abstracting away any social context [14]. By abstracting away the social context in which fair machine learning algorithms are deployed, we are no longer able to understand the broader context that determines how fair our outcome truly is.

Selbst et al. call these abstraction pitfalls traps: failure modes that occur when we fail to properly understand and account for the interactions between a technical system and the humanistic, social world around it [14]. Specifically, they name five traps that arise when we fail to consider how social concepts align with technology, which we recall below:

  1. Framing Trap: failure to model the entire system over which a social criterion, such as fairness, will be enforced.
  2. Portability Trap: failure to understand how re-purposing algorithmic solutions designed for one social context may be misleading, inaccurate, or otherwise do harm when applied to a different context.
  3. Formalism Trap: failure to account for the full meaning of social concepts such as fairness, which can be procedural, contextual, and contestable, and cannot be resolved through mathematical formalism.
  4. Ripple Effect Trap: failure to understand how the insertion of technology into an existing social system changes the behaviors and embedded values of the pre-existing system.
  5. Solutionism Trap: failure to recognize the possibility that the best solution to a problem may not involve technology.

Selbst et al.’s main proposed solution is to focus on the process of determining where and how to apply technical solutions, and on recognizing when applying a technical solution would cause more harm than good [14]. They point out that, in order to come to such a conclusion, technical researchers will need to either learn new social science skills or partner with social scientists. Additionally, we must become more comfortable with going against the computer scientist’s instinct to abstract, and be at ease with the difficult or unresolvable tensions between abstraction’s usefulness and its dangers [14].

Several works critique the alignment of current fair machine learning metrics with disparate impact and disparate treatment. Xiang and Raji note that both doctrines were developed with human discriminators in mind, and that simply replacing human decision makers with algorithmic ones is often not appropriate [15]. They state that “intent is an inherently human characteristic”, and that the common fair machine learning characterization of disparate treatment as merely not using marginalization class variables in an algorithm should be contested. They also note that demonstrating disproportionate outcomes is not enough to prove disparate impact: it is only the first step of a disparate impact case, and there is liability only if the defendant cannot justify the outcomes with non-discriminatory rationales.
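For concreteness, the kind of first-pass check for disproportionate outcomes that this discussion has in mind can be sketched as a simple selection-rate ratio. The data and group labels below are made up, and the 0.8 cutoff echoes the commonly cited four-fifths rule of thumb; as Xiang and Raji stress, falling below such a threshold would at most open the first step of a disparate impact inquiry, not establish liability.

```python
import numpy as np

def selection_rate_ratio(decisions, groups):
    """Ratio of the lowest to the highest group selection rate (1.0 = perfectly even)."""
    decisions = np.asarray(decisions)
    groups = np.asarray(groups)
    rates = {g: decisions[groups == g].mean() for g in np.unique(groups)}
    return min(rates.values()) / max(rates.values()), rates

# Made-up hiring decisions (1 = hired) for two hypothetical groups.
decisions = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]
groups    = ["a"] * 5 + ["b"] * 5

ratio, rates = selection_rate_ratio(decisions, groups)
print(rates)   # per-group selection rates: a = 0.6, b = 0.4
print(ratio)   # ~0.67, below the 0.8 "four-fifths" benchmark; this alone would only
               # prompt further scrutiny, not establish disparate impact liability.
```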

Additionally, [16] notes that while most work on fair machine learning has focused on achieving a fair distribution of decision outcomes, little to no attention has been paid to the overall decision process used to generate the outcome (i.e., procedural fairness). They note that this comes at the detriment of not incorporating human moral judgments about whether it is fair to use a given feature in a decision-making scenario. To this end, they support the use of procedural fairness, since it incorporates several considerations that are overlooked in distributive fairness, such as feature volitionality, feature reliability, feature privacy, and feature relevance.
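One rough way to picture how such procedural considerations could enter a pipeline is to screen candidate features against elicited human judgments before any training happens. The sketch below is our own illustration: the feature names, judgment scores, and threshold are hypothetical assumptions, not a method proposed in [16].

```python
# Hypothetical per-feature judgments (e.g., elicited from affected communities),
# each scored 0-1 on the procedural dimensions discussed above. All values are
# illustrative assumptions.
feature_judgments = {
    "prior_convictions": {"volitional": 0.7, "reliable": 0.8, "privacy_ok": 0.6, "relevant": 0.9},
    "zip_code":          {"volitional": 0.2, "reliable": 0.9, "privacy_ok": 0.5, "relevant": 0.3},
    "family_history":    {"volitional": 0.0, "reliable": 0.6, "privacy_ok": 0.2, "relevant": 0.4},
}

def procedurally_acceptable(judgments, threshold=0.5):
    """Keep a feature only if every procedural dimension clears the (arbitrary) threshold."""
    return all(score >= threshold for score in judgments.values())

allowed = [name for name, j in feature_judgments.items() if procedurally_acceptable(j)]
print(allowed)   # ['prior_convictions'] under these toy scores
```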

However, there has been some push-back on developing procedurally fair machine learning metrics. Xiang and Raji note that the term “procedural fairness” as described in the fair machine learning literature is a narrow and misguided view of what procedural fairness means from a legal lens [15]. Procedural justice aims to arrive at a just outcome through an iterative process, as well as through a close examination of the governing laws that guide the decision-maker to a specific decision [15,17]. They propose that the overall goal of procedural fairness in machine learning should be re-aligned with the aim of procedural justice by analyzing the system surrounding the algorithm and its use, rather than just the specifics of the algorithm itself.

Power Dynamics and Diversity

Here, we consider three important power dynamics: who does the classifying, who picks the objective function, and who gets to define what counts as science. Starting with the first, who has the power to classify, J. Khadijah Abdurahman says that “it is not just that classification systems are inaccurate or biased, it is who has the power to classify, to determine the repercussions / policies associated thereof, and their relation to historical and accumulated injustice” [18]. As mentioned above, since there is no agreed-upon definition of what a group or category is, it is ultimately up to those in power to classify people according to the task at hand. Often, this results in rigid classifications that do not align with how people would classify themselves. Additionally, because of data limitations, those in power most often employ the categories provided by the U.S. census or other taxonomies stemming from bureaucratic processes. But it is well studied that these categories are unstable, contingent, and rooted in racial inequality [4]. When we undertake the process of classifying people, we need to understand the larger implications of classifying, and how it further impacts or reinforces harmful social structures.

The second question, who chooses the final optimization function used in a fair machine learning algorithm, seems fairly intuitive. Of course, those creating fair machine learning methods do. But should we have this power? The choice of how to construct the objective function of an algorithm is intimately connected with the political economy question of who has ownership and control rights over data and algorithms [19]. It is important to keep in mind that our work is, overall, for the benefit of marginalized populations. That being the case, “it is not only irresponsible to force our ideas of what communities need, but also violent” [9]. “Before seeking new design solutions, we (should) look for what is already working at the community level” and “honor and uplift traditional, indigenous, and local knowledge and practices” [20]. This may require directly asking oppressed groups what their communities need, and what we should keep in mind when constructing the optimization objective so that it better serves them. “We must emphasize an importance of including all communities, and the voices and ideas of marginalized people must be centered as (they) are the first and hardest hit by algorithmic oppression” [9].

The final question, who gets to define what counts as science, comes from the study of the interplay of feminism with science and technology. Ruth Hubbard, the first woman to hold a tenured professorship in biology at Harvard, advocated for allowing social groups other than White men to make scientific contributions, as “whoever gets to define what counts as a scientific problem also gets a powerful role in shaping the picture of the world that results from scientific research” [21]. For a drastic example, consider R.A. Fisher: for a long period the world’s leading statistician, who practically invented large parts of the subject, he was also a eugenicist who thought that “those who did not take his word as God-given truth were at best stupid and at worst evil” [10].

Despite calls for diversity in science and technology, there are conflicting views on how to go about achieving it. Some say that including marginalized populations will provide outside perspectives that help create technology better suited to the people it will eventually be used on [1]. Others say that this is not the case, and that more diversity will not automatically solve algorithmic oppression [9]. Sociologist Ruha Benjamin points out that “having a more diverse team is an inadequate solution to discriminatory design practices that grow out of the interplay of racism and capitalism”, as it shifts responsibility from “our technologies are harming people” to “BIPOC tokens have to fix it” [9,22]. By promoting diversity as a solution to the problem of algorithmic oppression, we “obscure the fact that there are power imbalances that are deeply embedded in societal systems and institutions” [9].

Regardless of how the diversity issue is solved, it is agreed that it is important to engage with marginalized communities and educate them on what fair machine learning is and how it affects them. “We will solve nothing without involving our communities, and we must take care to ensure we do not impose elitist ideas of who can and cannot do science and engineering” [9]. It is our view that the fair machine learning community should be having conversations with BIPOC communities about how we should solve the diversity issue (as well as about what they need and actually want from our community), and about what we can do to help fix the problems machine learning created in the first place.