Two Algorithms for Learning Sparse Representations – Tong Zhang (Rutgers)

When:
April 1, 2008 all-day

Abstract
We present two algorithms for sparse learning, where the goal is to estimate a target function that is a sparse linear combination of a set of basis functions.

The first method is an online learning algorithm that focuses on scalability and can handle problems with large numbers of features and training examples. We propose a general method called truncated gradient that induces sparsity in the weights of online learning algorithms with convex loss functions. The approach is theoretically motivated and can be regarded as an online counterpart of the popular L1-regularization method in the batch setting. We prove that small rates of sparsification result in only small additional regret relative to typical online learning guarantees, and empirical experiments show that the approach works well.

The second method is a batch learning algorithm that focuses on effective feature selection. Since this problem is NP-hard in the general setting, approximate solutions are necessary. Two widely used approaches are the forward and backward greedy algorithms. We first show that neither idea is adequate on its own. We then propose a novel combination called FoBa that interleaves forward and backward steps in a specific way, and show that the resulting procedure can effectively solve this NP-hard problem under quite reasonable conditions.

The first part is joint work with John Langford, Lihong Li, and Alexander Strehl.
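As a rough illustration of the truncated-gradient idea, here is a minimal sketch for online least squares: plain stochastic gradient steps, with a soft-thresholding (truncation) step applied every K updates as an online analogue of L1 regularization. The hyperparameter names (eta, g, theta, K) and the choice of squared loss are illustrative assumptions, not details from the talk.

```python
import numpy as np

def truncate(w, alpha, theta):
    """Shrink coefficients with |w_i| <= theta toward zero by alpha
    (truncating at zero); leave larger coefficients untouched."""
    out = w.copy()
    small = np.abs(w) <= theta
    out[small] = np.sign(w[small]) * np.maximum(np.abs(w[small]) - alpha, 0.0)
    return out

def truncated_gradient(X, y, eta=0.1, g=0.01, theta=np.inf, K=10):
    """Online least squares with a periodic truncation (sparsification) step.
    Every K updates, small weights are shrunk by eta * g * K."""
    n, d = X.shape
    w = np.zeros(d)
    for t in range(n):
        x_t, y_t = X[t], y[t]
        grad = (w @ x_t - y_t) * x_t      # squared-loss gradient for one example
        w -= eta * grad
        if (t + 1) % K == 0:              # sparsify every K steps
            w = truncate(w, eta * g * K, theta)
    return w
```

With theta set to infinity, the truncation step reduces to ordinary soft thresholding of all weights; smaller theta only shrinks coefficients that are already near zero.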
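The FoBa combination can likewise be sketched for least-squares feature selection: a forward step greedily adds the feature that most reduces the fitting error, and backward steps remove any selected feature whose deletion costs little relative to the last forward gain. The stopping threshold eps and the backward_ratio rule below are illustrative stand-ins for the precise conditions analyzed in the talk.

```python
import numpy as np

def foba_least_squares(X, y, eps=1e-3, backward_ratio=0.5, max_iter=50):
    """Adaptive forward-backward greedy feature selection (FoBa-style sketch).
    Forward: add the feature giving the largest error reduction; stop when
    that reduction falls below eps. Backward: after each forward step, drop
    any feature whose removal raises the error by less than
    backward_ratio times the last forward gain."""
    n, d = X.shape
    selected = []

    def fit_err(idx):
        # mean squared error of least-squares fit on the chosen columns
        if not idx:
            return y @ y / n
        beta, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        r = y - X[:, idx] @ beta
        return r @ r / n

    err = fit_err(selected)
    for _ in range(max_iter):
        # forward step: try each unused feature
        best_j, best_err = None, err
        for j in range(d):
            if j in selected:
                continue
            e = fit_err(selected + [j])
            if e < best_err:
                best_j, best_err = j, e
        if best_j is None or err - best_err < eps:
            break
        selected.append(best_j)
        gain, err = err - best_err, best_err
        # backward steps: drop features that barely help
        while len(selected) > 1:
            j_drop, e_drop = min(
                ((j, fit_err([k for k in selected if k != j])) for j in selected),
                key=lambda p: p[1])
            if e_drop - err >= backward_ratio * gain:
                break
            selected.remove(j_drop)
            err = e_drop
    return sorted(selected)
```

Because a deleted feature must cost less than half of the most recent forward gain, the procedure cannot oscillate: each forward-backward cycle strictly decreases the error, so it terminates.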

Biography
Tong Zhang received a B.A. in mathematics and computer science from Cornell University in 1994 and a Ph.D. in computer science from Stanford University in 1998. After graduation, he worked at the IBM T.J. Watson Research Center in Yorktown Heights, New York, and at Yahoo Research in New York City. He is currently an associate professor of statistics at Rutgers University. His research interests include machine learning, algorithms for statistical computation, and their mathematical analysis and applications.

Center for Language and Speech Processing