Rachel Wicks (JHU) “To Sentences and Beyond: Paving the Way for Context-Aware Machine Translation”

When:
March 4, 2024 @ 12:00 pm – 1:15 pm
Where:
Hackerman Hall B17
3400 N. Charles Street
Baltimore
MD 21218
Cost:
Free

Abstract

Most machine translation systems operate at the sentence level, while humans write and translate within a given context. Operating on individual sentences forces error-prone sentence segmentation into the machine translation pipeline, which creates noisy training bitext and limits the upper-bound performance of these systems. Further, many grammatical features require inter-sentential context to translate correctly, which makes perfect sentence-level machine translation an impossible task. In this talk, we will cover the inherent limits of sentence-level machine translation. Following this, we will explore a key obstacle in the way of true context-aware machine translation: an abject lack of data. Finally, we will cover recent work that provides (1) a new evaluation dataset that specifically addresses the translation of context-dependent discourse phenomena and (2) reconstructed documents from large-scale sentence-level bitext that can be used to improve performance when translating these types of phenomena.

Center for Language and Speech Processing