SPS Webinar: Minor Manipulations, Major Threat: An Overview of Partially Fake Speech – Lin Zhang (JHU)
Abstract
Speech can easily be manipulated through techniques such as text-to-speech synthesis, voice conversion, replay, tampering, and adversarial attacks. However, when the manipulation is applied to only a small portion of an audio recording, the remaining real segments can have a dominant influence on human listeners and make machine detection extremely challenging. There is therefore an urgent need to explore this scenario, in which synthetic speech is embedded within otherwise real audio. The primary objective of this webinar is to review research efforts aimed at defending against such partially fake audio, with a focus on relevant databases, explainable analyses, and three core tasks (spoof detection, localization, and diarization).
Bio
Lin Zhang received the M.S. degree from Tianjin University, Tianjin, China, in 2020, and the Ph.D. degree from the Graduate University for Advanced Studies / National Institute of Informatics, Tokyo, Japan, in 2024. She is currently a Postdoctoral Fellow at the Center for Language and Speech Processing, Johns Hopkins University, USA. She has also visited and/or worked at Brno University of Technology and Duke Kunshan University. Her research interests include speech security and privacy, speech production, and machine learning.