SciTech

Algorithms can reinforce bias instead of mitigating it

“We need to teach ethics to nerds,” is how author and data scientist Cathy O’Neil summarized her first step toward addressing the problem of algorithmic bias she outlined in her talk on Sept. 25. The event, hosted by the local non-profit media organization PublicSource, drew around 300 people, including the CMU Robotics Team, to the Lecture Hall in the Carnegie Library of Pittsburgh to hear O’Neil dispel what she sees as the myth that algorithms are inherently equitable.

O’Neil explained that algorithms are “things you use data for to predict something.” To make that prediction, “you need data and a definition of success.” In her career at a hedge fund during the 2008 financial crisis and as a data scientist at advertising start-ups, O’Neil saw these core components lead to sometimes morally dubious outcomes.

O’Neil concluded that the data used to construct or train algorithms can be used to “automate the status quo,” because, in her words, “they actually propagate whatever practices we had that the data reflected back to us.” Additionally, she observed that “everyone who builds an algorithm is in charge of defining success,” and what constitutes success looks different to different people. Presenting algorithms as unbiased, O’Neil finds, hides the question of “for whom does the model fail.”

This belief, that algorithms can at times reinforce rather than mitigate systemic bias, does not begin or end with O’Neil. Studies and news stories chronicling the impact of algorithms with flawed success criteria or datasets have abounded of late. ProPublica’s Machine Bias series reported on “risk scores” that are more likely to falsely label black prisoners as likely to re-offend, and its reporting prompted Facebook to stop allowing housing advertisers to filter their desired audience by race. A 2015 Carnegie Mellon study found that Facebook showed ads for higher-paying jobs more frequently to men than to women.

Carnegie Mellon regularly sends its graduates to large tech companies like Facebook and Google, the firms creating and implementing algorithms that touch the lives of millions. In other words, it is a hub for the “nerds” O’Neil deemed in need of ethics education.
Professor David Danks, head of the Department of Philosophy, L.L. Thurstone Professor of Philosophy and Psychology, and co-author of the paper “Algorithmic Bias in Autonomous Systems,” finds that when it comes to building algorithms, the AI and machine learning curriculum “does a good job of helping students to understand the importance of” loss functions, or what O’Neil referred to as the definition of success. The development of these loss functions or success criteria is so important, he explains, because “algorithms largely attempt to optimize whatever loss function is provided to them, and so it is critical to understand their role in an algorithm.”
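To make the stakes concrete, consider a minimal, hypothetical sketch (the records, thresholds, and both loss functions below are invented for illustration; they are not drawn from O’Neil’s talk or from CMU coursework). The same toy risk-score threshold is tuned against two different definitions of success, and the “best” model changes with the loss function chosen.

```python
# Hypothetical illustration: each record is (group, risk_score, reoffended).
# The data are invented and deliberately tiny.
records = [
    ("A", 0.2, 0), ("A", 0.4, 0), ("A", 0.6, 1), ("A", 0.8, 1),
    ("B", 0.3, 0), ("B", 0.5, 0), ("B", 0.7, 0), ("B", 0.9, 1),
]

def overall_error(threshold):
    """Definition of success #1: minimize the plain misclassification rate."""
    wrong = sum((score >= threshold) != bool(label)
                for _, score, label in records)
    return wrong / len(records)

def worst_group_false_positive(threshold):
    """Definition of success #2: minimize the worst false-positive rate
    suffered by any single group."""
    rates = []
    for group in {g for g, _, _ in records}:
        negatives = [s for g, s, label in records if g == group and label == 0]
        rates.append(sum(s >= threshold for s in negatives) / len(negatives))
    return max(rates)

candidates = [0.25, 0.45, 0.65, 0.85]
print("chosen by overall accuracy:    ", min(candidates, key=overall_error))
print("chosen by worst-group fairness:", min(candidates, key=worst_group_false_positive))
```

On this toy data the two loss functions pick different thresholds, which is the point Danks makes: the algorithm faithfully optimizes whatever definition of success its developer writes down.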

Where the curriculum is lacking, Professor Danks continued, is in teaching students how to “ensure that our training or historical data actually track the world that we want, rather than the world as it is,” and in showing that loss functions themselves are subject to human error and bias.
These effects are not easy for students to diagnose on their own. As he put it, “even for seemingly ‘objective’ challenges, the success criteria are ultimately chosen by the human developers, and that is another possible source of bias, as humans do not always know what they actually want (and so might choose the wrong loss function).” Professor Jim Herbsleb of the Institute for Software Research in the School of Computer Science, who teaches the class Ethics and Policy Issues in Computing, agreed, stating that “many students are surprised that algorithms can and often do exhibit biases that reflect and perhaps even amplify bias in the training data.”
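As a hypothetical illustration of Herbsleb’s point (the hiring history and the fit_rates helper below are invented, and the “model” is deliberately naive), a predictor fit to biased historical decisions simply learns to repeat them:

```python
# Invented historical records: (group, hired). Both groups are assumed
# equally qualified here, but group "B" was hired half as often in the past.
history = [("A", 1)] * 60 + [("A", 0)] * 40 + [("B", 1)] * 30 + [("B", 0)] * 70

def fit_rates(data):
    """'Training' here just means estimating the historical hire rate per group."""
    rates = {}
    for group in {g for g, _ in data}:
        outcomes = [hired for g, hired in data if g == group]
        rates[group] = sum(outcomes) / len(outcomes)
    return rates

model = fit_rates(history)

# The model's "prediction" for new, equally qualified applicants mirrors the
# old disparity: the status quo is automated rather than corrected.
for group in sorted(model):
    print(f"predicted hire probability for group {group}: {model[group]:.2f}")
```

Nothing in the code is malicious; the skew comes entirely from the historical data the model was given.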

Professor Danks thinks that the challenge in teaching about the societal impacts of algorithms stems from the “widespread view that we can separate technological development from the introduction of values or ethical standards.” This separation leads to a situation where “the CMU curriculum, like most undergraduate curricula, tends to focus on the technological elements in isolation. So, for example, people will learn how an algorithm functions, or what variations in the data will lead to different models. But there is relatively little discussion of values when these algorithms are being taught.”

This practice of separation becomes questionable, adds Professor Alex London, Professor Danks’s co-author on “Algorithmic Bias in Autonomous Systems” and Clara L. West Professor of Ethics and Philosophy, because as decision processes typically left to humans become automated, one kind of bias to watch for is deviation from “norms that are fundamentally ethical in nature, such as equality, fairness, and justice.” For Professor London, this means that “developers of these systems not only need to be technically proficient, they need to understand the larger social contexts in which their systems will act and the social and ethical norms to which its decisions need to conform.”

As the ability to navigate these ethical judgments becomes more central to the job of a developer, conversations like PublicSource’s Cathy O’Neil event and classes like Ethics and Policy Issues in Computing are available to those who seek them out. How many will do so remains to be seen. Professor Herbsleb observes that students tend to seek out classes that look more impressive to employers than his, which draws only a few computer science students each year. He finds this “understandable, of course. But I do worry that our graduates are not necessarily well equipped to think through the ethical, policy, and social implications of what they will be asked to do, or of their own innovative ideas.”