Machine Translation Applications of Stochastic Inversion Transduction Grammars – Dekai Wu (Hong Kong University of Science & Technology, Department of Computer Science)

November 1, 1995

We have introduced and are developing the notion of bilingual language modeling, an approach that shows promise for a number of aspects of statistical machine translation. A bilingual language model simultaneously generates matched strings of two languages following a parametric distribution. The formalism we are currently investigating, the stochastic inversion transduction grammar (SITG), is context-free but incorporates an inversion constraint that reduces computational complexity while maintaining sufficient word-order flexibility. We introduce bilingual parsing with an efficient parsing algorithm for SITGs, giving useful applications in sub-sentential alignment and bracketing of parallel texts, and automatic extraction of phrasal translations. An iterative EM training algorithm for SITGs has been developed for corpus-based estimation of the probabilities. We will also discuss current and future directions.
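
As a rough illustration of the inversion constraint mentioned in the abstract, the following minimal sketch shows how a single binary tree can generate a matched pair of strings: a "straight" node concatenates its children in the same order in both languages, while an "inverted" node reverses the order in the second language only. The names `Leaf`, `Node`, and `yield_pair` and the toy tokens are hypothetical, chosen for this example and not taken from the talk; the sketch omits rule probabilities, bilingual parsing, and EM training.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """A lexical pair; an empty string marks a word with no counterpart."""
    src: str  # token in language 1
    tgt: str  # token in language 2


@dataclass(frozen=True)
class Node:
    """A binary constituent shared by both languages."""
    left: "Tree"
    right: "Tree"
    inverted: bool = False  # False: straight order; True: inverted order


Tree = Union[Leaf, Node]


def yield_pair(tree: Tree) -> Tuple[List[str], List[str]]:
    """Return the (language 1, language 2) token sequences generated by tree."""
    if isinstance(tree, Leaf):
        return ([tree.src] if tree.src else [],
                [tree.tgt] if tree.tgt else [])
    left_src, left_tgt = yield_pair(tree.left)
    right_src, right_tgt = yield_pair(tree.right)
    # Language 1 always reads left to right.
    src = left_src + right_src
    # Language 2 swaps the two children only at inverted nodes.
    tgt = right_tgt + left_tgt if tree.inverted else left_tgt + right_tgt
    return src, tgt


# Toy example: the inner constituent is inverted, the outer one is straight.
tree = Node(
    Leaf("x1", "y1"),
    Node(Leaf("x2", "y2"), Leaf("x3", "y3"), inverted=True),
)

src, tgt = yield_pair(tree)
print(" ".join(src))  # x1 x2 x3
print(" ".join(tgt))  # y1 y3 y2
```

Because reordering is only allowed by swapping the two children of a node, the set of permissible word-order changes is restricted compared with arbitrary permutations, which is the kind of restriction the abstract credits with reducing computational complexity.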

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680