Advances in Deterministic Dependency Parsing – Yoav Goldberg (Google Research)

November 20, 2012

Transition-based dependency parsers are fast, surprisingly accurate, and easy to implement. However, many formal aspects of these parsing systems are not well understood. Specifically, little can be said about the effect of individual parsing decisions on the global parse structure. We help bridge this gap by introducing a property which holds for many transition systems (including the popular arc-eager system) and allows us to reason about the global effects of individual parsing actions in these systems. This kind of reasoning paves the way for many interesting applications. I will describe two immediate applications: (1) a novel arc-constrained decoding algorithm (“find a tree that includes the following edges”) for transition-based parsers; and (2) a novel oracle which can return a *set* of optimal actions for *any* (configuration, gold-tree) pair, in sharp contrast to traditional oracles that return a single, static sequence of transitions. The new oracle allows for a better training procedure, which teaches the parser to respond optimally to non-optimal configurations and helps mitigate error propagation. The new oracle and training procedure produce greedy parsers that greatly outperform parsers trained with the traditional, static oracles on a wide range of datasets. This is joint work with Joakim Nivre.
Yoav Goldberg is a post-doctoral researcher at Google Research NY, working primarily on syntactic parsing and its applications. Prior to that, he completed his PhD at Ben Gurion University, where he worked with Prof. Michael Elhadad on automatic processing of Modern Hebrew, an example of a morphologically rich language. He spent a summer at USC/ISI working on Machine Translation with Kevin Knight, David Chiang and Liang Huang. This coming February, Yoav will leave Google to assume a tenure-track senior-lecturer position (“assistant professorship”) in Bar Ilan University’s Computer Science Department.

Center for Language and Speech Processing