Distance Metric Learning for Large Margin Classification – Lawrence K. Saul (University of Pennsylvania)
Many frameworks for statistical pattern recognition are based on computing Mahalanobis distances, which appear as positive semidefinite quadratic forms. I will describe how to learn the parameters of these quadratic forms (the so-called distance metrics) for two popular models of multiway classification. First, for k-nearest neighbor (kNN) classification, I will show how to learn metrics with the property that distances between differently labeled examples greatly exceed distances between nearest neighbors belonging to the same class. Second, for Gaussian mixture models (GMMs), I will show how to learn the mean and covariance parameters so that these models perform well as classifiers with large margins of error. Both of these models for multiway classification can be viewed as new types of large margin classifiers, with the traditional linear hyperplane decision boundaries of support vector machines (SVMs) replaced by the nonlinear decision boundaries induced by kNN or GMMs. Like those of SVMs, the objective functions for “large margin” kNN/GMM classifiers are convex, with no local minima. The advantages of these models over SVMs are that (i) they are naturally suited to multiway as opposed to binary classification, and (ii) the kernel trick is not required for nonlinear decision boundaries. I will describe successful applications of these models to handwritten digit recognition and automatic speech recognition. Joint work with K. Weinberger, F. Sha, and J. Blitzer.
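The large-margin criterion for kNN described above can be illustrated with a small sketch: parameterize the metric as M = LᵀL (so it stays positive semidefinite) and minimize a hinge loss that pulls same-class pairs together while requiring differently labeled points to lie at least a unit margin farther away. This is a toy illustration under assumed data and a simplified loss (it pulls on all same-class pairs rather than a fixed set of target neighbors), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: class membership is carried by feature 0; feature 1 is noise.
n = 10
X = np.vstack([
    np.column_stack([rng.normal(0.0, 0.3, n), rng.normal(0.0, 3.0, n)]),
    np.column_stack([rng.normal(2.0, 0.3, n), rng.normal(0.0, 3.0, n)]),
])
y = np.array([0] * n + [1] * n)

def loss(L, X, y, margin=1.0):
    """Large-margin metric loss: pull same-class pairs together and
    require differently labeled points to be at least `margin` farther
    away (in squared distance) than each same-class neighbor."""
    Z = X @ L.T                      # d_M(x, x') = ||Lx - Lx'||^2
    D = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    total = 0.0
    for i in range(len(X)):
        same = (y == y[i]) & (np.arange(len(X)) != i)
        diff = y != y[i]
        total += D[i, same].sum()    # pull term
        # push term: hinge over (same-class j, differently labeled l) pairs
        h = margin + D[i, same][:, None] - D[i, diff][None, :]
        total += np.maximum(h, 0.0).sum()
    return total / len(X)

def num_grad(f, L, eps=1e-5):
    """Central finite-difference gradient of f at L."""
    g = np.zeros_like(L)
    for idx in np.ndindex(*L.shape):
        E = np.zeros_like(L)
        E[idx] = eps
        g[idx] = (f(L + E) - f(L - E)) / (2 * eps)
    return g

# Gradient descent with backtracking so the loss decreases monotonically.
L = np.eye(2)
f = lambda L: loss(L, X, y)
for _ in range(40):
    g, step = num_grad(f, L), 1e-2
    while f(L - step * g) >= f(L) and step > 1e-8:
        step *= 0.5
    L = L - step * g

M = L.T @ L   # the learned Mahalanobis metric
print("metric diagonal:", np.round(np.diag(M), 3))
```

On this data the learned metric downweights the noisy second feature relative to the discriminative first one, so nearest neighbors under M are determined by the informative dimension.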
Lawrence Saul received his A.B. in Physics from Harvard in 1990 and his Ph.D. in Physics from M.I.T. in 1994. He stayed at M.I.T. for two more years as a postdoctoral fellow in the Center for Biological and Computational Learning, then joined the Speech and Image Processing Center of AT&T Labs in Florham Park, NJ. In 1999, the MIT Technology Review recognized him as one of its 100 top young innovators. He joined the faculty of the University of Pennsylvania in January 2002, where he is currently an Associate Professor in the Department of Computer and Information Science. He has received an NSF CAREER Award for his work in statistical learning, and more recently he served as Program Chair and General Chair for the 2003 and 2004 conferences on Neural Information Processing Systems. He currently serves on the Editorial Board of the Journal of Machine Learning Research. In July 2006, he will join the faculty of the Department of Computer Science and Engineering at UC San Diego.