Zhizheng Wu (Facebook) “Audio Deepfake Detection: An Overview and its Challenges”
Deep learning has been successfully applied to solve various complex problems. However, deep learning algorithms have also been employed to create deepfakes (e.g. fake audio, images and videos) that spread misleading information. Deepfakes are increasingly detrimental to privacy, societal security and even democracy. This talk first introduces the state-of-the-art voice generation algorithms that can be leveraged to manipulate or create audio content. It then gives an overview of the audio deepfake detection efforts made by the research community through the ASVspoof challenges, and concludes with a discussion of the challenges in audio deepfake detection.
Zhizheng Wu is a research scientist at Facebook. He pioneered research on spoofing detection for speaker verification and initiated the first Automatic Speaker Verification Spoofing and Countermeasures (ASVspoof) challenge, which has since become a biennial event. Zhizheng also co-organized the first Voice Conversion Challenge (VCC 2016) and organized the 2019 edition of the Blizzard Challenge. He is also the creator of the SAS corpus, which established the protocol for ASVspoof research, and of Merlin, an open-source neural network-based speech synthesis toolkit.