Mahsa Yarmohammadi (Johns Hopkins University) “Data Augmentation for Zero-shot Cross-Lingual Information Extraction”

February 4, 2022 @ 12:00 pm – 1:15 pm
Ames 234 (presented virtually via Zoom)


In this talk, I present a multipronged strategy for zero-shot cross-lingual Information Extraction (IE), that is, the construction of an IE model for some target language given existing annotations exclusively in some other language. This work is part of the JHU team’s effort under the IARPA BETTER program. I explore data augmentation techniques, including data projection and self-training, and how different pretrained encoders impact them. We find, through extensive experiments and extensions of existing techniques, that a combination of approaches, both new and old, leads to better performance than any single cross-lingual strategy.


Mahsa Yarmohammadi is an assistant research scientist in CLSP, JHU, where she leads state-of-the-art research in cross-lingual language and speech applications and algorithms. A primary focus of her research is using deep learning techniques to transfer existing resources into other languages and to learn representations of language from multilingual data. She also works on automatic speech recognition and speech translation. Yarmohammadi received her PhD in computer science and engineering from Oregon Health & Science University in 2016 and joined CLSP as a post-doctoral fellow in 2017.

Center for Language and Speech Processing