With the rising influence of machine learning algorithms on many important aspects of our daily lives, there are growing concerns that biases inherent in data can lead these algorithms to discriminate against certain populations. Biased data can cause data-driven algorithms to produce biased outcomes along lines of gender, race, sexual orientation, and political affiliation, with serious real-world consequences in domains such as lending and law enforcement. There is thus an urgent need for machine learning algorithms that make unbiased decisions from biased data. We propose a novel framework for measuring and correcting bias in data-driven algorithms, drawing inspiration from privacy-preserving machine learning and Bayesian probabilistic modeling. A case study on census data demonstrates the utility of our approach.
Dr. James Foulds is an Assistant Professor in the Department of Information Systems at UMBC. His research interests are in both applied and foundational machine learning, focusing on probabilistic latent variable models and the inference algorithms to learn them from data. His work aims to promote the practice of probabilistic modeling for computational social science, and to improve AI’s role in society regarding privacy and fairness. He earned his Ph.D. in computer science at the University of California, Irvine, and was a postdoctoral scholar at the University of California, Santa Cruz, followed by the University of California, San Diego. His master’s and bachelor’s degrees were earned with first class honours at the University of Waikato, New Zealand, where he also contributed to the Weka data mining system.