Smoothing for Maximum Entropy Models – Stanley Chen (Carnegie Mellon University)
Recent work has demonstrated that maximum entropy models are a promising technique for combining multiple sources of information; applications of maximum entropy models have included language modeling, prepositional phrase attachment, and machine translation. Smoothing is a technique for improving probability estimates in the presence of limited data. Smoothing has yielded substantial performance gains in a variety of applications; however, very little work has been done on developing smoothing techniques for maximum entropy models.
In this work, we compare smoothed maximum entropy n-gram models with smoothed conventional n-gram models on the task of language modeling. We show that existing smoothing algorithms for maximum entropy models compare unfavorably to smoothing algorithms for conventional models, and propose a novel method that yields comparable performance to conventional smoothing techniques.
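To make the idea of smoothing a maximum entropy model concrete, the sketch below fits a tiny log-linear (maximum entropy) unigram model with a Gaussian prior on its weights, one common regularization-based smoothing approach for such models. This is an illustrative assumption, not necessarily the method proposed in the work above: the model, the hyperparameters (`sigma2`, `lr`, `iters`), and the toy corpus are all invented for the example. Without the prior, the weight of an unseen word would diverge to negative infinity (zero probability); the prior keeps it finite, so unseen events retain probability mass.

```python
import math
from collections import Counter

def train_maxent_unigram(counts, vocab, sigma2=1.0, lr=0.5, iters=500):
    """Fit a maximum entropy (log-linear) unigram model by gradient ascent,
    smoothed with a Gaussian prior (variance sigma2) on the weights.
    One indicator feature (hence one weight) per vocabulary word."""
    n = sum(counts.values())
    w = {v: 0.0 for v in vocab}
    for _ in range(iters):
        # Model distribution: p(v) = exp(w_v) / Z
        z = sum(math.exp(w[v]) for v in vocab)
        p = {v: math.exp(w[v]) / z for v in vocab}
        for v in vocab:
            # Gradient of the penalized log-likelihood:
            # observed count - expected count - w_v / sigma2
            grad = counts.get(v, 0) - n * p[v] - w[v] / sigma2
            w[v] += lr * grad / n
    z = sum(math.exp(w[v]) for v in vocab)
    return {v: math.exp(w[v]) / z for v in vocab}

# Toy corpus: 'c' is in the vocabulary but never observed.
counts = Counter("aaab")
probs = train_maxent_unigram(counts, vocab=["a", "b", "c"])
```

Because the objective is strictly concave, gradient ascent converges; the smoothed model assigns the unseen word `c` a small but nonzero probability, while still ranking the observed words by frequency.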