TAG-based Structured Prediction Models for Parsing and Machine Translation – Michael Collins (MIT)

March 9, 2010 all-day

In structured prediction problems the goal is to learn a function that maps input points to structured output labels: for example, strings, graphs, or trees. These problems are common in many fields—for example, natural language processing (NLP), computer vision, and computational biology—and have been the focus of a great deal of recent research in machine learning.In this talk I’ll describe models for two structured prediction problems in NLP: parsing and machine translation. Central to both approaches is a variant of Tree Adjoining Grammar (TAG) (Joshi et al.,1975), which is computationally efficient, but which also allows the use of relatively rich syntactic representations. The TAG-based parser generalizes a powerful class of discriminative models (conditional random fields) to full syntactic parsing. The TAG-based translation system makes direct use of syntactic structures in modeling differences in word order between different languages, and in modeling the grammaticality of translation output. In both cases we show improvements over state-of-the-art systems.This is joint work with Xavier Carreras and Terry Koo.
Michael Collins is an Associate Professor of Computer Science at MIT. His research is focused on topics including statistical parsing, structured prediction problems in machine learning, and applications including machine translation, dialog systems, and speech recognition. His awards include a Sloan fellowship, an NSF career award, and best paper awards at EMNLP (2002 and 2004), UAI (2004 and 2005), and CoNLL 2008.

Center for Language and Speech Processing