Workshop 2005
Guest Lecture Thursday, December 4, 2008


Seminar Information
Syntactic Language Modeling
Eugene Charniak - 07/13/2005


  • Abstract:

    A language model is a probability distribution over all sentences in a language. Traditionally, language models are associated with speech recognition systems, where they help the system distinguish between word sequences that sound alike but have very different probabilities of being uttered (e.g., "the big/pig dog").
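
    As a rough illustration of the "big/pig" example, the sketch below scores two similar-sounding word sequences with a toy bigram model. It is not the parser-based model described in this talk: the bigram model is a deliberately simple stand-in, and the corpus, the add-one smoothing, and the names CORPUS, train_bigram, and log_prob are all assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real model would be trained on far more text.
CORPUS = [
    "the big dog barked",
    "the big dog slept",
    "a big dog barked",
    "the pig slept",
]

def train_bigram(sentences):
    """Count contexts and bigrams, padding each sentence with <s> and </s>."""
    contexts, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return contexts, bigrams, vocab

def log_prob(sentence, model):
    """Log-probability of a sentence under add-one smoothed bigram estimates.

    Words outside the toy vocabulary are not modelled properly here.
    """
    contexts, bigrams, vocab = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        total += math.log(p)
    return total

MODEL = train_bigram(CORPUS)

if __name__ == "__main__":
    for candidate in ("the big dog barked", "the pig dog barked"):
        print(candidate, log_prob(candidate, MODEL))
```

    Under this toy corpus the model prefers "the big dog barked" to "the pig dog barked", because the pair "pig dog" never occurs in the training sentences. A recognizer could use such scores to choose between acoustically similar hypotheses.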

    In this talk I argue for the utility of language modeling in many natural-language processing tasks. In particular, I describe a language model based on a probabilistic parser for English and its use in two quite distinct NLP tasks: machine translation and the detection of speech repairs. Most people have some idea of what machine translation is, but speech repairs are less discussed. In speech, people frequently hesitate and then rephrase something they started to say ("I need a uh want a ticket to Boston."). This is often seen as a reason why grammatical models might not be useful for speech. On the contrary, the ungrammaticality of such utterances should cause a syntactic model to assign them a very low probability compared to the same sentence without the mistake. This in turn might aid in correcting them. We show this is the case.

    This is joint work with Mark Johnson, Matt Lease, Kevin Knight, and Kenji Yamada.

  • Biography:

    Eugene Charniak is Professor of Computer Science and Cognitive Science at Brown University and is a past Chairman of the Department of Computer Science (1991-1997). He received his A.B. degree in Physics from the University of Chicago and a Ph.D. in Computer Science from M.I.T. He has published four books, the most recent being Statistical Language Learning (1993). He is a Fellow of the American Association for Artificial Intelligence and was previously a Councilor of the organization. He is on the editorial boards of several journals and was a founding editor of the journal "Cognitive Science". His research has always been in the area of language understanding and the technologies that relate to it. Over the last 15 years he has been interested in statistical techniques for language understanding, and more specifically in the use of statistical methods in syntactic parsing, speech recognition, and machine translation.




The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
*Telephone: (410) 516-4237 *Fax: (410) 516-5050 *E-mail: clsp@clsp.jhu.edu