Cross-Lingual Abstract Meaning Representations (CLAMR) for Machine Translation

Research Group of the 2014 Frederick Jelinek Memorial Summer Workshop

Goal: Explore the potential benefit of Abstract Meaning Representation to semantics-based statistical machine translation.

The team will explore several facets of using Abstract Meaning Representations (AMRs): analyzing and generating them, matching parallel AMRs from source and target languages, graph learning of AMRs (GLAMR), and determining semantically equivalent AMRs that can provide a greater range of matching options. The team draws on several closely related research traditions, including the Czech Tectogrammatical approach, ISI’s AMR prototyping, and longstanding syntactic and semantic modeling at Boulder, Brandeis, Rochester and elsewhere, all of which benefit from the availability of treebanks, PropBanks, and other richly annotated linguistic resources as represented by SemLink.

Both Chinese/English and Czech/English corpora and test sets have been prepared for use in the summer and beyond. The team aims to reduce English bias in existing AMRs so that cross-linguistic AMRs can be as compatible as possible. Since AMRs are graphs rather than trees, traditional tree-matching approaches used for machine translation must be extended to graph matching, which requires new approaches to keep the problem tractable. The team hopes to determine whether graph-matching obstacles can be overcome by generating alternative AMRs that are semantically equivalent. Team members also want to know whether insights gained this summer will yield concrete, measurable improvements to AMR parsing or AMR generation, both key steps in an AMR-based machine translation system.
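
As a small illustration of why tree-based methods do not carry over directly, consider the AMR commonly given for "The boy wants to go", in which the boy is both the wanter and the goer. The sketch below encodes that AMR as relation triples and checks whether the result is a tree. The triple encoding and the helper names (reentrant_nodes, is_tree) are assumptions made for this example; they are not part of the team's tooling.

```python
from collections import Counter

# AMR for "The boy wants to go":
#   (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
# Encoded here (an assumption of this sketch) as node labels plus
# (source, relation, target) triples.
INSTANCES = {"w": "want-01", "b": "boy", "g": "go-01"}
EDGES = [
    ("w", "ARG0", "b"),
    ("w", "ARG1", "g"),
    ("g", "ARG0", "b"),  # reentrancy: "b" is reused as an argument of "g"
]


def reentrant_nodes(edges):
    """Return the nodes that have more than one incoming edge."""
    indegree = Counter(target for _, _, target in edges)
    return sorted(node for node, count in indegree.items() if count > 1)


def is_tree(instances, edges):
    """Check that the graph is a rooted tree: exactly one root, every other
    node has exactly one parent, and every node is reachable from the root."""
    indegree = Counter(target for _, _, target in edges)
    roots = [node for node in instances if indegree[node] == 0]
    if len(roots) != 1:
        return False
    if any(indegree[node] != 1 for node in instances if node != roots[0]):
        return False
    children = {}
    for source, _, target in edges:
        children.setdefault(source, []).append(target)
    seen, stack = set(), [roots[0]]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return seen == set(instances)


print(reentrant_nodes(EDGES))       # ['b']
print(is_tree(INSTANCES, EDGES))    # False
```

Because the node for "boy" has two parents, an algorithm that assumes each node has a single parent cannot represent this meaning without duplicating or dropping information, which is the kind of obstacle that graph matching has to handle.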

The team’s investigations are organized along three named, intertwined threads:

GLAMR, or Graph Languages for AMRs, entails looking at AMR pairs to see what operations are needed to ensure accurate mappings and meaning-transfer from one language to another.
MATRIX, or Meaning in AMRs and Tectogrammatical Representation Interchange, entails reformatting AMRs automatically to semantically equivalent representations in search of better cross-linguistic matching.
PARSE entails automatically parsing English, Chinese and Czech sentences into AMRs.

Team Members
Team Leader
Martha Palmer	University of Colorado
Senior Members
Ondrej Bojar	Charles University in Prague
David Chiang	University of Southern California
Frank Drewes	Umea University
Daniel Gildea	University of Rochester
Jan Hajic	Charles University in Prague
Adam Lopez	Johns Hopkins University
Giorgio Satta	University of Padua
Zdenka Uresova	Charles University in Prague
Graduate Students
Wei-Te Chen	University of Colorado
Ondrej Dusek	Charles University in Prague
Jeffrey Flanigan	Carnegie Mellon University
Tim O'Gorman	University of Colorado
Xiaochang Peng	University of Rochester
Martin Popel	Charles University in Prague
Aditya Renduchintala	Johns Hopkins University
Naomi Saphra	Johns Hopkins University
Chuan Wang	Brandeis University
Yuchen Zhang	Brandeis University
Affiliate Members
Silvie Cinkova	Charles University in Prague
Sanjeev Khudanpur	Johns Hopkins University
James Pustejovsky	Brandeis University
Roman Sudarikov	Charles University in Prague

Center for Language and Speech Processing