David Bamman (University of California, Berkeley) “Modeling the Spread of Information within Novels”

February 22, 2021 @ 12:00 pm – 1:15 pm
via Zoom


Understanding the ways in which information flows through social networks is important for questions of influence–including tracking the spread of cultural trends and disinformation and measuring shifts in public opinion.  Much work in this space has focused on networks where nodes, edges and information are all directly observed (such as Twitter accounts with explicit friend/follower edges and retweets as instances of propagation); in this talk, I will focus on the comparatively overlooked case of information propagation in *implicit* networks–where we seek to discover single instances of a message passing from person A to person B to person C, only given a depiction of their activity in text.

Literature in many ways presents an ideal domain for modeling information propagation described in text, since it depicts a largely closed universe in which characters interact and speak to each other.  At the same time, it poses several wholly distinct challenges–in particular, both the length of literary texts and the subtleties involved in extracting information from fictional works pose difficulties for NLP systems optimized for other domains.  In this talk, I will describe our work in measuring information propagation in these implicit networks, and detail an NLP pipeline for discovering it, focusing in detail on new datasets we have created for tagging characters and their coreference in text.  This is joint work with Matt Sims, Olivia Lewke, Anya Mansoor, Sejal Popat and Sheng Shen.


David Bamman is an assistant professor in the School of Information at UC Berkeley, where he works in the areas of natural language processing and cultural analytics, applying NLP and machine learning to empirical questions in the humanities and social sciences. His research focuses on improving the performance of NLP for underserved domains like literature (including LitBank and BookNLP) and exploring the affordances of empirical methods for the study of literature and culture. Before Berkeley, he received his PhD in the School of Computer Science at Carnegie Mellon University and was a senior researcher at the Perseus Project of Tufts University. Bamman’s work is supported by the National Endowment for the Humanities, National Science Foundation, an Amazon Research Award, and an NSF CAREER award.

Center for Language and Speech Processing