Abstract:
Great progress in wide-coverage parsing has been made in recent years by combining rule-based grammars with statistical models of semantically relevant properties of parses, such as headword dependencies. However, such parsers have until now been based on highly overgenerating context-free covering grammars. The analyses that these grammars yield depart in important respects from interpretable structures. In particular, they fail to include the long-range "deep" semantic dependencies that are involved in relative and coordinate constructions.
The lecture reviews the reasons for capturing such dependencies in more expressive grammars and discusses the problem of providing sound statistical models for them. It uses as illustration some recent experiments with Combinatory Categorial Grammar (CCG), a "mildly context-sensitive" grammar formalism that has been applied to a wide range of languages and linguistic phenomena.