Malcolm Slaney (Google) “Auditory Attention: From Saliency to Models to Applications”

When:
October 10, 2017 @ 12:00 pm – 1:15 pm
Where:
Hackerman Hall B17
3400 N Charles St
Baltimore, MD 21218
USA
Cost:
Free
Contact:
Center for Language and Speech Processing

Abstract

Understanding attention is key to many auditory tasks. In this talk, I would like to summarize several aspects of attention that we have studied to better understand how humans use attention in daily life. This work extends from top-down and bottom-up models of attention that are useful for solving the cocktail party problem, to the use of eye-gaze and face-pose information to better understand speech in human-machine and human-human-machine interactions. The common thread throughout all this work is the use of implicit signals such as auditory saliency, face pose, and eye gaze as part of a speech-processing system. I will show algorithms and results from speech recognition, speech understanding, addressee detection, and selecting the desired speech from a complicated auditory environment. All of this is grounded in models of auditory attention and saliency.

Biography

BSEE, MSEE, and Ph.D., Purdue University. Dr. Malcolm Slaney is a research scientist in the Machine Hearing Group at Google. He is an Adjunct Professor at Stanford CCRMA, where he has led the Hearing Seminar for more than 20 years, and an Affiliate Faculty member in the Electrical Engineering Department at the University of Washington. He is a former Associate Editor of IEEE Transactions on Audio, Speech and Signal Processing and IEEE Multimedia Magazine. He has given successful tutorials at ICASSP 1996 and 2009 on “Applications of Psychoacoustics to Signal Processing,” at SIGIR and ICASSP on “Multimedia Information Retrieval,” and at ACM Multimedia 2010 on “Web-Scale Multimedia Data.” He is a coauthor, with A. C. Kak, of the IEEE book “Principles of Computerized Tomographic Imaging.” This book was republished by SIAM in their “Classics in Applied Mathematics” series. He is coeditor, with Steven Greenberg, of the book “Computational Models of Auditory Function.” Before joining Google, Dr. Slaney worked at Bell Laboratory, Schlumberger Palo Alto Research, Apple Computer, Interval Research, IBM’s Almaden Research Center, Yahoo! Research, and Microsoft Research. For many years, he has led the auditory group at the Telluride Neuromorphic (Cognition) Workshop. Dr. Slaney’s recent work is on understanding conversational speech and general audio perception. He is a Fellow of the IEEE.

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2680