Gautham Mysore (Adobe Research) “Simplifying the Creation of Voice-based Content”

When:
October 8, 2018 @ 12:00 pm – 1:15 pm
Where:
Hackerman Hall B17
3400 N Charles St
Baltimore, MD 21218
USA

Abstract

Voice-based content such as podcasts, radio stories, audiobooks, vlogs, and lecture videos is very prevalent these days. However, creating high-quality content can be quite challenging, especially for novices. High-quality recording equipment and recording environments are expensive and can be difficult to set up and use. Voice editing tools can have a steep learning curve. Finally, people often have limited voice acting skills, which affects the production quality. We therefore aim to dramatically simplify this process so that people can easily create high-quality content without needing to become audio engineers. In this talk, I will present our work in this space and a number of open problems. These include variations of classical speech processing problems like speech enhancement and synthesis, as well as new problems at the intersection of signal processing, machine learning, and HCI.

Biography

Gautham Mysore is a principal scientist and head of the Audio Research Group at Adobe Research in San Francisco. He is also an Adjunct Professor at Stanford University in the Center for Computer Research in Music and Acoustics (CCRMA). His research involves developing new machine learning and signal processing algorithms for a wide variety of real-world audio applications. Gautham received his Ph.D. (CCRMA), M.A. (CCRMA), and M.S. (Electrical Engineering) from Stanford University. He was previously a visiting researcher at the Gatsby Computational Neuroscience Unit at University College London. He has co-authored over 60 papers and 35 patents. He has been a member of the IEEE Technical Committee on Audio and Acoustic Signal Processing, and was a technical program co-chair of WASPAA 2017.

Johns Hopkins University

Johns Hopkins University, Whiting School of Engineering

Center for Language and Speech Processing
Hackerman 226
3400 North Charles Street, Baltimore, MD 21218-2608