DeLiang Wang (Ohio State University) “Towards Solving the Cocktail Party Problem”
3400 N Charles St
Baltimore, MD 21218
The cocktail party problem, or speech separation, has evaded a solution for decades in speech and audio processing. I have been advocating a new formulation of this old challenge that estimates an ideal time-frequency mask (binary or ratio). This formulation turns the classical signal processing problem into a machine learning problem, and deep neural networks (DNNs) are particularly well suited to this task due to their representational capacity. I will describe recent algorithms that employ deep learning for supervised speech separation, including speech enhancement and speaker separation. DNN-based mask estimation elevates speech separation performance to new levels and produces the first demonstration of substantial speech intelligibility improvements for both hearing-impaired and normal-hearing listeners in background interference. These advances represent major strides towards solving the cocktail party problem.
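To make the mask formulation above concrete, here is a minimal sketch of the two training targets as they are commonly defined in the literature: the ideal binary mask keeps a time-frequency unit when its local SNR exceeds a threshold, and the ideal ratio mask is the speech share of the unit's energy raised to a power. The function names, the nested-list layout, and the default values lc_db=0.0 and beta=0.5 are assumptions made for illustration, not details taken from the talk.

```python
import math

def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return 1 for each time-frequency unit whose local SNR (dB) exceeds lc_db, else 0.

    Inputs are nested lists of per-unit energies (frames x frequency channels).
    lc_db is a hypothetical name for the local SNR threshold.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if n == 0.0:
                # No noise in this unit: keep it only if speech is present.
                row.append(1 if s > 0.0 else 0)
            elif s == 0.0:
                # No speech in this unit: discard it.
                row.append(0)
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask

def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Return (S / (S + N)) ** beta per unit; beta=0.5 is an assumed default."""
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            total = s + n
            row.append((s / total) ** beta if total > 0.0 else 0.0)
        mask.append(row)
    return mask

# Toy energies: 2 frames x 2 frequency channels (made-up numbers).
speech = [[4.0, 1.0], [0.0, 9.0]]
noise = [[1.0, 4.0], [1.0, 0.0]]
ibm = ideal_binary_mask(speech, noise)
irm = ideal_ratio_mask(speech, noise)
```

In supervised separation, a mask like these would be computed from premixed speech and noise to serve as the learning target, and the trained network would then estimate it from the mixture alone.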
DeLiang Wang received the B.S. and M.S. degrees from Peking (Beijing) University and the Ph.D. degree in 1991 from the University of Southern California, all in computer science. Since 1991, he has been with the Department of Computer Science & Engineering and the Center for Cognitive and Brain Sciences at The Ohio State University, where he is a Professor and University Distinguished Scholar. He received the U.S. Office of Naval Research Young Investigator Award in 1996, the 2005 Best Paper Award of IEEE Transactions on Neural Networks, and the 2008 Helmholtz Award from the International Neural Network Society. He is an IEEE Fellow and Co-Editor-in-Chief of Neural Networks.