Dan Ellis (Google)

October 4, 2019 @ 12:00 pm – 1:15 pm
Hackerman B17
3400 N. Charles Street
Baltimore, MD 21218

Title: Recognizing Sound Events

Abstract: The Sound Understanding team at Google has been developing automatic sound classification tools with the ambition to cover all possible sounds – speech, music, and environmental. I will describe our application of vision-inspired deep neural networks to the classification of our ‘AudioSet’ ontology of ~600 sound events, as well as related applications in bioacoustics and cross-modal learning. With UPF Barcelona, we recently ran a Kaggle competition (part of DCASE 2019) with over 800 participants, and we will shortly release a pretrained model to make state-of-the-art generic sound recognition widely available.
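The "vision-inspired" approach mentioned above treats audio like an image: the waveform is converted to a log-mel spectrogram, a 2-D array of time frames by mel-frequency bands, which can then be fed to a convolutional network just like a picture. A minimal NumPy sketch of that front end (the frame length, hop, and mel-band parameters here are illustrative defaults, not the team's exact configuration):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def log_mel_spectrogram(waveform, sample_rate=16000, frame_len=400,
                        hop=160, n_mels=64, fmin=125.0, fmax=7500.0):
    # Frame the waveform and apply a Hann window.
    n_frames = 1 + (len(waveform) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = waveform[idx] * np.hanning(frame_len)
    # Magnitude spectrogram via the real FFT.
    spec = np.abs(np.fft.rfft(frames, axis=1))
    # Triangular mel filterbank mapping FFT bins to mel bands.
    n_bins = spec.shape[1]
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    fb = np.zeros((n_bins, n_mels))
    for i in range(n_mels):
        lo, ctr, hi = mel_pts[i], mel_pts[i + 1], mel_pts[i + 2]
        fb[:, i] = np.maximum(0.0, np.minimum(
            (fft_freqs - lo) / (ctr - lo), (hi - fft_freqs) / (hi - ctr)))
    # Log compression yields an image-like [time, mel bands] array.
    return np.log(spec @ fb + 1e-6)

# One second of noise becomes a 2-D "image" a CNN can consume.
x = np.random.default_rng(0).standard_normal(16000)
feats = log_mel_spectrogram(x)
print(feats.shape)  # (98, 64)
```

A classifier then applies standard image-style convolutional layers over this array, with one output per sound-event class in the ontology.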

Bio: Dan Ellis joined Google in 2015 after 15 years as a faculty member in the Electrical Engineering department at Columbia University, where he headed the Laboratory for Recognition and Organization of Speech and Audio (LabROSA). He has over 150 publications in the areas of audio processing, speech recognition, and music information retrieval.

Joint work with Eduardo Fonseca, Frederic Font, Matt Harvey, Shawn Hershey, Aren Jansen, Caroline Liu, Jiayang Liu, Channing Moore, Ratheet Pandya, Manoj Plakal, Rif A. Saurous

Center for Language and Speech Processing