Text Geolocation and Dating: Light-Weight Language Grounding – Jason Baldridge (University of Texas)
It used to be that computational linguists had to collaborate with roboticists in order to work on grounding language in the real world. However, since the advent of the internet, and particularly in the last decade, the world has been brought within digital scope. People’s social and business interactions are increasingly mediated through channels dominated by text. They tweet from places, express their opinions openly, give descriptions of photos, and generally reveal a great deal about themselves in doing so, including their location, gender, age, social status, relationships and more. In this talk, I’ll discuss work on geolocation and dating of texts, that is, identifying the set of latitude-longitude pairs and time periods that a document is about or related to. These applications and the models developed for them set the stage for deeper investigations into computational models of word meaning that go beyond standard word vectors and into augmented multi-component representations, with dimensions connected to the real world through geographic, temporal, and other values.
Jason Baldridge is an associate professor in the Department of Linguistics at the University of Texas at Austin. He received his Ph.D. from the University of Edinburgh in 2002, where his doctoral dissertation was awarded the 2003 Beth Dissertation Prize from the European Association for Logic, Language, and Information. His main research interests include categorial grammars, parsing, semi-supervised learning, coreference resolution, and georeferencing. He is one of the co-creators of the Apache OpenNLP Toolkit and has been active for many years in the creation and promotion of open source software for natural language processing.