Improving Machine Translation by Propagating Uncertainty – Chris Dyer (University of Maryland)

September 8, 2009 all-day

NLP systems typically consist of a series of components in which the output of one module (e.g., a word segmenter) serves as input to another (e.g., a translator). Integration between components is often achieved by passing only the 1-best analysis from an upstream component as input to a downstream component. Unfortunately, this naive integration strategy compounds errors as they propagate through the pipeline (cf. Finkel et al. 2006, Dyer et al. 2008).

In this talk, I briefly review the effects of this problem in machine translation, where sources of upstream uncertainty include not only the noisy outputs of statistical preprocessors (such as word segmenters and speech-to-text systems), but also “development-time” decisions (such as determining the appropriate granularity of the lexical units or how much text normalization to perform). I show that by encoding input alternatives in a word lattice, translation quality can be improved over a 1-best baseline, with only a slight runtime cost.

I then explore in more detail the implications of modeling development-time uncertainty jointly with translation, focusing on the problem of source-language word segmentation, which I tackle in two ways. First, I present a Markov random field model of word segmentation and describe how to use it to generate lattices appropriate for translation by training it to maximize the (conditional) probability of a collection of segmentation alternatives, rather than the probability of a single correct analysis. Second, I describe generalized alignment models that align lattices in one language to strings in another, enabling the joint modeling of segmentation (or other noisy processes) and translation. Since lattice inputs break the Markov assumptions that enable efficient inference in many common word alignment models, I also present novel Monte Carlo techniques for performing word and lattice alignment.
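To make the word-lattice idea concrete, the following is a minimal sketch (not from the talk; all names and the toy string are hypothetical) of how a lattice can compactly encode several alternative segmentations of one input, each alternative being a distinct path from the start state to the final state:

```python
from collections import defaultdict

def build_lattice(edges):
    """Lattice as a DAG: maps a state to its outgoing (token, next_state) arcs."""
    lattice = defaultdict(list)
    for src, token, dst in edges:
        lattice[src].append((token, dst))
    return lattice

def paths(lattice, state, final):
    """Enumerate every token sequence along a path from `state` to `final`."""
    if state == final:
        yield []
    for token, nxt in lattice[state]:
        for rest in paths(lattice, nxt, final):
            yield [token] + rest

# Toy lattice over the string "abc": three segmentation
# alternatives share states instead of being stored separately.
edges = [
    (0, "abc", 3),               # unsegmented
    (0, "ab", 2), (2, "c", 3),   # split as "ab c"
    (0, "a", 1), (1, "bc", 3),   # split as "a bc"
]
lat = build_lattice(edges)
print([" ".join(p) for p in paths(lat, 0, 3)])
# → ['abc', 'ab c', 'a bc']
```

A decoder that accepts such a lattice can consider all three segmentations at once, deferring the 1-best decision until translation time; in practice each arc would also carry a weight reflecting the preprocessor's confidence.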
Chris Dyer is a Ph.D. candidate at the University of Maryland, College Park, in the Department of Linguistics under the supervision of Philip Resnik. His research interests include statistical machine translation, computational morphology and phonology, unsupervised learning, and scaling NLP models to deal with larger data sets using the MapReduce programming paradigm. He is graduating this spring and will be joining Noah Smith’s lab as a postdoc.

Center for Language and Speech Processing