Maximum Entropy and Species Distribution Modeling

Rob Schapire, Princeton

October 30, 2007



Abstract

Modeling the geographic distribution of a plant or animal species is a critical problem in conservation biology: to save a threatened species, one first needs to know where it prefers to live and what it requires to survive. From a machine-learning perspective, this is an especially challenging problem in which the learner is presented with no negative examples and often only a tiny number of positive examples. In this talk, I will describe the application to this problem of maximum-entropy (maxent) methods, a set of decades-old techniques that happen to fit it very cleanly and effectively. I will describe a version of maxent that we have shown enjoys strong theoretical performance guarantees, which enable it to perform effectively even with a very large number of features. I will also describe some extensive experimental tests of the method, as well as some surprising applications. This talk includes joint work with Miroslav Dudík and Steven Phillips.

Biography

Robert Schapire received his ScB in math and computer science from Brown University in 1986, and his SM (1988) and PhD (1991) from MIT under the supervision of Ronald Rivest. After a short post-doc at Harvard, he joined the technical staff at AT&T Labs (formerly AT&T Bell Laboratories) in 1991, where he remained for eleven years. At the end of 2002, he became a Professor of Computer Science at Princeton University. His awards include the 1991 ACM Doctoral Dissertation Award, the 2003 Gödel Prize, and the 2004 Kanellakis Theory and Practice Award (the latter two shared with Yoav Freund). His main research interest is in theoretical and applied machine learning.