Cross-Lingual Abstract Meaning Representations (CLAMR) for Machine Translation

Goal: Explore the potential benefit of Abstract Meaning Representation to semantics-based statistical machine translation.

This team will explore several facets of Abstract Meaning Representation (AMR): analyzing and generating AMRs, matching parallel AMRs from source and target languages, graph languages for AMRs (GLAMR), and determining semantically equivalent AMRs that can provide a greater range of matching options. The team leverages several closely related research traditions, including the Czech Tectogrammatical approach, ISI’s AMR prototyping, and longstanding syntactic and semantic modeling at Boulder, Brandeis, Rochester and elsewhere, all of which benefit from the availability of treebanks, PropBanks, and other richly annotated linguistic resources as represented by SemLink.

Both Chinese/English and Czech/English corpora and test sets have been prepared for use in the summer and beyond. The team aims to reduce English bias in existing AMRs so that cross-linguistic AMRs can be as compatible as possible. Since AMRs are graphs, the traditional tree-matching approaches used for machine translation must be extended to graph matching, which requires new approaches to keep the problem tractable. The team hopes to address the question of whether graph-matching obstacles can be overcome by generating alternative AMRs that are semantically equivalent. Team members are also interested in whether insights gained this summer will provide concrete, measurable improvements to either AMR parsing or AMR generation, both key steps in an AMR-based machine translation system.
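To make the graph-versus-tree point concrete, the following is a minimal sketch using the standard textbook AMR for "The boy wants to go". The triple encoding and the function names are illustrative assumptions, not part of the team's tooling. The sketch shows that the node for "boy" has two incoming edges (a reentrancy), so the structure is a graph rather than a tree.

```python
from collections import Counter

# Assumed encoding: an AMR as (source, relation, target) triples.
# PENMAN-style form of the same AMR:
#   (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
AMR_TRIPLES = [
    ("w", "instance", "want-01"),
    ("b", "instance", "boy"),
    ("g", "instance", "go-01"),
    ("w", "ARG0", "b"),
    ("w", "ARG1", "g"),
    ("g", "ARG0", "b"),  # "boy" is also the one who goes
]


def reentrant_nodes(triples):
    """Return variables that have more than one incoming relation edge."""
    incoming = Counter(
        target for _, relation, target in triples if relation != "instance"
    )
    return sorted(node for node, count in incoming.items() if count > 1)


def has_no_reentrancy(triples):
    """True if no node has two parents (a necessary condition for a tree)."""
    return not reentrant_nodes(triples)


print(reentrant_nodes(AMR_TRIPLES))    # ['b']
print(has_no_reentrancy(AMR_TRIPLES))  # False
```

Because `b` is shared by `want-01` and `go-01`, any matching procedure that assumes each node has a single parent would have to duplicate or drop part of the meaning.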

The team’s investigations are organized along three named, intertwined threads:

  • GLAMR, or Graph Languages for AMRs, entails looking at AMR pairs to see what operations are needed to ensure accurate mappings and meaning transfer from one language to another.
  • MATRIX, or Meaning in AMRs and Tectogrammatical Representation Interchange, entails reformatting AMRs automatically to semantically equivalent representations in search of better cross-linguistic matching.
  • PARSE entails automatically parsing English, Chinese and Czech sentences into AMRs.

 

Team Members
Team Leader
Martha Palmer, University of Colorado
Senior Members
Ondrej Bojar, Charles University in Prague
David Chiang, University of Southern California
Frank Drewes, Umea University
Daniel Gildea, University of Rochester
Jan Hajic, Charles University in Prague
Adam Lopez, Johns Hopkins University
Giorgio Satta, University of Padua
Zdenka Uresova, Charles University in Prague
Graduate Students
Wei-Te Chen, University of Colorado
Ondrej Dusek, Charles University in Prague
Jeffrey Flanigan, Carnegie Mellon University
Tim O'Gorman, University of Colorado
Xiaochang Peng, University of Rochester
Martin Popel, Charles University in Prague
Aditya Renduchintala, Johns Hopkins University
Naomi Saphra, Johns Hopkins University
Chuan Wang, Brandeis University
Yuchen Zhang, Brandeis University
Affiliate Members
Silvie Cinkova, Charles University in Prague
Sanjeev Khudanpur, Johns Hopkins University
James Pustejovsky, Brandeis University
Roman Sudarikov, Charles University in Prague

Center for Language and Speech Processing