Workshop 2002
Preworkshop Lecture Saturday, September 6, 2008


Natural Language Generation in the Context of Machine Translation: Jan Hajic - 07/05/2002


slides from Jan Hajic's lecture

  • Abstract:

    The so-called "tectogrammatical" representation (TR) of sentence structure will be described. The TR will be demonstrated on the Prague Dependency Treebank, a corpus of Czech of about one million words with a rich linguistic annotation scheme. Two lower layers of sentence representation will also be presented: a morphological layer and an analytical (i.e., surface) dependency syntax layer.

    The TR represents the deepest analysis of sentence structure, at what we call the level of "linguistic meaning". Its potential advantages for a multilingual MT system will be discussed, and examples from English, Arabic and Czech will be shown.

    An outline of the Workshop'02 generation project will follow. It will be shown that the core of the problem lies in the mapping from the tectogrammatical representation of a particular sentence to the analytical one. Important issues in this process will be discussed, and possible solutions (i.e., those that will be worked on during the Workshop) will be presented.

     




The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
Telephone: (410) 516-4237, Fax: (410) 516-5050, E-mail: clsp@clsp.jhu.edu