Grammatical Trigrams
John Lafferty
School of Computer Science
Carnegie Mellon University
April 23, 1996
========================================================
ABSTRACT
It is widely believed among speech and language researchers that the
incorporation of linguistic information should improve statistical models
of natural language and benefit applications such as speech recognition.
This belief has yet to be borne out. In this talk I will discuss some
previous attempts at building more effective language models and present
some new ideas that this past work suggests. In particular, I will give
an overview of current work at CMU to develop language modeling
techniques that combine grammatical information with n-gram statistics.
This work uses link grammar to extract structural information and
exponential models to estimate probabilities. After introducing the
relevant concepts, I will discuss areas of recent work that make this
approach practical, including techniques that help decrease the
computational burden of parameter estimation and robust parsing
algorithms that enable the approach to be applied to disfluent and
ungrammatical speech.
**************************************************************************