Discriminative Classification with Incomplete Data – Tommi S. Jaakkola (Massachusetts Institute of Technology)

April 25, 2000 all-day

Effective discrimination of structured examples such as sequences is essential in many application areas, including speech recognition, image or text classification, and various problems in computational biology. While the details of these application areas differ considerably, many of the basic challenges of classification are shared across them. For instance, the examples to be classified are structured and may be only partially known. We also typically have a shortage of labeled training examples, while a large number of unannotated examples can be readily acquired. The category labels invariably denote fairly abstract properties, giving rise to diverse class-conditional densities. The challenge is to represent and estimate robust decision rules under these conditions.
As a solution to these problems, we present a new framework for discriminative classification based on various generalizations of the well-known maximum entropy principle. The framework is expressly discriminative, naturally accommodates uncertain or missing labels, and extends, e.g., to anomaly detection problems. The maximum entropy discrimination approach is fundamentally driven by large margin classification and is inherently Bayesian. A number of other standard discriminative methods such as support vector machines can be subsumed under this framework. I will motivate and explain some of the key technical ideas and details, and provide experimental results demonstrating substantial benefits that can be achieved with these methods. I will also identify the current limitations of our approach.
This is in part joint work with Marina Meila and Tony Jebara.

Tommi S. Jaakkola received the M.Sc. degree in theoretical physics from Helsinki University of Technology, Finland, and the Ph.D. degree in computational neuroscience from MIT. Following a brief postdoctoral experience in computational molecular biology (DOE/Sloan fellow, UCSC), he joined the MIT EECS faculty in 1998. His research interests include theoretical aspects of machine learning and statistical inference as well as problems in computational biology and information retrieval.

Center for Language and Speech Processing