Brendan O’Connor (UMass Amherst) “Demographic Bias in Social Media Language Analysis: A Case Study of African-American English”

February 1, 2019 @ 12:00 pm – 1:15 pm
Hackerman Hall B17
3400 N Charles St
Baltimore, MD 21218
We conduct a case study of dialectal language in online conversational text by investigating African-American English (AAE) on Twitter, using a demographically supervised model to identify AAE-like language associated with geo-located messages. We verify that this language follows well-known AAE linguistic phenomena. Furthermore, existing tools for language identification, part-of-speech tagging, and dependency parsing fail on this AAE-like language more often than on text associated with white speakers. We leverage our model to fix racial bias in some of these tools, and discuss future implications for fairness and artificial intelligence.
Brendan O’Connor is an assistant professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He works at the intersection of computational social science and natural language processing, studying how social factors influence language technologies and how to better understand social trends with text analysis. For example, his research investigates racial bias in NLP technologies, political events reported in news, and opinions and slang on Twitter; his work has been featured in the New York Times and the Wall Street Journal. He received his PhD in 2014 from Carnegie Mellon University’s Machine Learning Department, advised by Noah Smith, and was previously a Visiting Fellow at the Harvard Institute for Quantitative Social Science and an intern with the Facebook Data Science team. He holds a BS/MS in Symbolic Systems from Stanford University.

Center for Language and Speech Processing