Workshop 2002
Preworkshop Lecture Friday, August 29, 2008


Seminar Information
Statistical Natural Language Processing: Eugene Charniak - 07/01/2002
  • Abstract:

    Over the last ten years or so, the field of natural language processing (NLP) has become increasingly dominated by corpus-based methods and statistical techniques. In this research, problems are attacked by collecting statistics from a corpus (sometimes marked with correct answers, sometimes not) and then applying the statistics to new instances of the task. In this talk we give an overview of statistical techniques in four areas of NLP: parsing (finding the correct phrase structure for a sentence), lexical semantics (learning meanings and other properties of words and phrases from text), anaphora resolution (determining the intended antecedent of pronouns, and of noun phrases in general), and word-sense disambiguation (finding the correct sense, in context, of a word with multiple meanings). As a general rule, corpus-based, and particularly statistical, techniques outperform hand-crafted systems, and the rate of progress in the field is still quite high.

     

  • Biography:

    Eugene Charniak is Professor of Computer Science and Cognitive Science at Brown University. He received an A.B. degree in Physics from the University of Chicago and a Ph.D. in Computer Science from M.I.T. He has published four books: Computational Semantics, with Yorick Wilks (1976); Artificial Intelligence Programming (now in a second edition), with Chris Riesbeck, Drew McDermott, and James Meehan (1980, 1987); Introduction to Artificial Intelligence, with Drew McDermott (1985); and Statistical Language Learning (1993). He is a Fellow of the American Association for Artificial Intelligence and was previously a Councilor of the organization. His research has always been in the area of language understanding or technologies that relate to it, such as knowledge representation, reasoning under uncertainty, and learning. Over the last few years he has been interested in statistical techniques for language understanding. His research in this area has included work in the subareas of part-of-speech tagging, probabilistic context-free grammar induction, and, more recently, syntactic disambiguation through word statistics, efficient syntactic parsing, and lexical resource acquisition through statistical means.




The Center for Language and Speech Processing
The Johns Hopkins University
3400 North Charles Street, Barton Hall
Baltimore, MD 21218
Telephone: (410) 516-4237
Fax: (410) 516-5050
E-mail: clsp@clsp.jhu.edu