Scalable Training for Machine Translation Made Successful for the First Time – Liang Huang (CUNY)
While large-scale discriminative training has triumphed in many NLP problems, its definitive success in machine translation has remained largely elusive. Most recent efforts along this line are not scalable: they train only on the small dev set, with an impoverished set of rather dense features. We instead present a very simple yet theoretically motivated approach that extends my recent framework of the violation-fixing perceptron to the latent-variable setting and uses forced decoding to compute the target derivations. Our method allows structured learning to scale, for the first time, to a large portion of the training data, which enables a rich set of sparse, lexicalized, and non-local features. Extensive experiments show very significant gains in BLEU (at least +2.0) over MERT and PRO baselines with the help of over 20M sparse features.
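For readers unfamiliar with the idea, the following is a minimal sketch of a single latent-variable perceptron update over sparse features. It is not the system described in the talk: it assumes whole derivations rather than the partial derivations a beam-search decoder would compare, and it assumes the set of reference-producing derivations (which forced decoding would supply) is given directly. The function names and the feature names such as "the->le" are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def score(weights, feats):
    """Dot product between the weight vector and a sparse feature dict."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def perceptron_update(weights, good, bad):
    """Perform one update if the model currently makes a violation.

    good: feature dicts of derivations that produce the reference translation
          (assumed here to come from forced decoding).
    bad:  feature dicts of derivations that produce some other translation.
    Returns True if the weights were changed.
    """
    # The derivation is latent, so pick the highest-scoring one on each side.
    best_good = max(good, key=lambda f: score(weights, f))
    best_bad = max(bad, key=lambda f: score(weights, f))

    # No violation: the best correct derivation already outscores the best wrong one.
    if score(weights, best_bad) < score(weights, best_good):
        return False

    # Violation: reward features of the correct derivation, penalize the wrong one.
    for name, value in best_good.items():
        weights[name] += value
    for name, value in best_bad.items():
        weights[name] -= value
    return True


# Toy usage with hypothetical lexicalized features.
weights = defaultdict(float)
good = [{"the->le": 1.0, "cat->chat": 1.0}]
bad = [{"the->la": 1.0, "cat->chat": 1.0}]

first = perceptron_update(weights, good, bad)   # scores tie at 0, so an update fires
second = perceptron_update(weights, good, bad)  # correct derivation now wins, no update
```

In this toy case the shared feature cancels out, so only the two distinguishing features move. Keeping the weights in a dictionary means memory grows with the features actually observed, which is one common way to handle very large sparse feature sets.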
Liang Huang is currently an Assistant Professor at the City University of New York (CUNY). He graduated in 2008 from Penn and has worked as a Research Scientist at Google and a Research Assistant Professor at USC/ISI. His work is mainly on the theoretical aspects (algorithms and formalisms) of computational linguistics, and related theoretical problems in machine learning. He has received a Best Paper Award at ACL 2008, several best paper nominations (ACL 2007, EMNLP 2008, and ACL 2010), two Google Faculty Research Awards (2010 and 2013), and a University Graduate Teaching Prize at Penn (2005).