Smoothing for Maximum Entropy Models – Stanley Chen (Carnegie Mellon University)

February 18, 1997

Recent work has demonstrated that maximum entropy models are a promising technique for combining multiple sources of information; applications of maximum entropy models have included language modeling, prepositional phrase attachment, and machine translation. Smoothing is a technique for improving probability estimates in the presence of limited data. Smoothing has yielded substantial performance gains in a variety of applications; however, there has been very little work in developing smoothing techniques for maximum entropy models.

In this work, we compare smoothed maximum entropy n-gram models with smoothed conventional n-gram models on the task of language modeling. We show that existing smoothing algorithms for maximum entropy models compare unfavorably to smoothing algorithms for conventional models, and propose a novel method that yields comparable performance to conventional smoothing techniques.
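As a rough illustration of what smoothing does for a conventional n-gram model (this is a generic textbook technique, not the talk's proposed method), here is a minimal Jelinek-Mercer interpolated bigram sketch in Python; the corpus and the interpolation weight `lam` are illustrative choices:

```python
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over a list of tokenized sentences."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent + ["</s>"]
        uni.update(toks)
        bi.update(zip(toks, toks[1:]))
    return uni, bi

def interp_prob(w, prev, uni, bi, lam=0.7):
    """Jelinek-Mercer smoothing: interpolate the bigram MLE with the
    unigram MLE, so unseen bigrams still receive nonzero probability."""
    total = sum(uni.values())
    p_uni = uni[w] / total
    p_bi = bi[(prev, w)] / uni[prev] if uni[prev] else 0.0
    return lam * p_bi + (1 - lam) * p_uni

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
uni, bi = train_bigram(corpus)
# The bigram ("cat", "dog") never occurs, but smoothing backs off
# to the unigram estimate instead of assigning probability zero:
print(interp_prob("dog", "cat", uni, bi) > 0)  # True
```

An unsmoothed maximum-likelihood bigram model would assign unseen events probability zero; interpolation redistributes mass from the lower-order model, which is the kind of gain the abstract refers to when it says smoothing improves estimates under limited data.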

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680
