Machine Translation Applications of Stochastic Inversion
Transduction Grammars
Dekai Wu
The Hong Kong University of Science and Technology
Department of Computer Science
Wed. November 1, 1995
Barton Hall 114
Abstract
--------
We have introduced and are developing the notion of {\it bilingual
language modeling}, an approach that shows promise for a number of
aspects of statistical machine translation. A bilingual language
model simultaneously generates matched strings of two languages
following a parametric distribution. The formalism we are currently
investigating, the stochastic inversion transduction grammar (SITG),
is context-free but incorporates an inversion constraint that reduces
computational complexity while maintaining sufficient word-order
flexibility. We introduce {\it bilingual parsing}, together with an
efficient parsing algorithm for SITGs, which has useful applications in
sub-sentential alignment and bracketing of parallel texts and in
automatic extraction of phrasal translations. We have also developed an
iterative EM training algorithm for SITGs for corpus-based estimation of
the grammar's probabilities.
We will also discuss current and future directions.
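
To make the idea of a bilingual language model with an inversion
constraint more concrete, the following is a minimal sketch in Python,
not the speaker's implementation. It assumes a toy derivation tree in
which every internal node combines its two children either in the same
order in both languages (straight) or in reversed order in the second
language (inverted), and in which every rule carries a probability. The
class names, the English/French word pairs, and the probability values
are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    """A lexical rule pairing a word of language 1 with a word of language 2."""
    src: str
    tgt: str
    prob: float = 1.0  # hypothetical rule probability


@dataclass
class Node:
    """A binary rule; 'inverted' reverses the children in language 2 only."""
    left: "Tree"
    right: "Tree"
    inverted: bool = False
    prob: float = 1.0  # hypothetical rule probability


Tree = Union[Leaf, Node]


def yield_pair(t: Tree) -> Tuple[List[str], List[str]]:
    """Read off the matched pair of strings generated by one derivation."""
    if isinstance(t, Leaf):
        return [t.src], [t.tgt]
    left_src, left_tgt = yield_pair(t.left)
    right_src, right_tgt = yield_pair(t.right)
    src = left_src + right_src
    # The inversion constraint: the only reordering allowed at a node is
    # swapping its two children on the language-2 side.
    tgt = right_tgt + left_tgt if t.inverted else left_tgt + right_tgt
    return src, tgt


def derivation_prob(t: Tree) -> float:
    """Probability of a derivation: the product of its rule probabilities."""
    if isinstance(t, Leaf):
        return t.prob
    return t.prob * derivation_prob(t.left) * derivation_prob(t.right)


# Toy example: "the red house" paired with "la maison rouge".
tree = Node(
    Leaf("the", "la", prob=0.5),
    Node(
        Leaf("red", "rouge", prob=0.25),
        Leaf("house", "maison", prob=0.25),
        inverted=True,   # adjective and noun swap order in French
        prob=0.5,
    ),
    inverted=False,
    prob=0.5,
)

src_words, tgt_words = yield_pair(tree)
print(" ".join(src_words), "|", " ".join(tgt_words))
print(derivation_prob(tree))
```

The sketch only generates from a fixed derivation. Bilingual parsing, as
described in the abstract, goes the other way: given a sentence pair, it
searches for the most probable such tree, and the straight and inverted
nodes of that tree supply the bracketing and the sub-sentential
alignment.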
**************************************************************************