Submodularity and Big Data – Jeff Bilmes (University of Washington)

November 5, 2013 all-day

The amount of data available today is a problem not only for humans but also for computer consumers of information. At the same time, bigger is different, and discovering how is an important challenge in big data sciences. In this talk, we will discuss how submodular functions can address these problems. After giving a brief background on submodularity, we will first discuss document summarization, and how one can achieve optimal results on a number of standard benchmarks using very efficient algorithms. Next we will discuss data subset selection for speech recognition systems, and how choosing a good subset has many advantages, showing results on both the TIMIT and the Fisher corpora. We will also discuss data selection for machine translation systems. Lastly, we will discuss similar problems in computer vision. The talk will include sufficient background to make it accessible to everyone.
Jeff Bilmes is a professor in the Department of Electrical Engineering at the University of Washington, Seattle, and an adjunct professor in Computer Science & Engineering and the Department of Linguistics. He received his Ph.D. from the Computer Science Division of the Department of Electrical Engineering and Computer Science at the University of California, Berkeley. He was a 2001 NSF CAREER award winner, a 2002 CRA Digital Government Fellow, a 2008 NAE Gilbreth Lectureship award recipient, and a 2012/2013 ISCA Lecturer.

Center for Language and Speech Processing