Henry Li Xinyan (JHU) “Speech Anonymization – Towards Finding the Min-Max Optimum”
3400 N CHARLES ST
Baltimore
MD 21218
Abstract
The widespread deployment of speech technologies has highlighted the urgent need to protect the personally identifiable information of their users. In this talk, we will first introduce the currently prevailing definition of speech anonymization and give some background on prior work in this area. Next, we will discuss our submission to the Voice Privacy challenge. We found that while voice conversion systems better preserve emotional content, they struggle to conceal speaker identity in semi-white-box attack scenarios; conversely, TTS methods perform better at anonymization but worse at emotion preservation. We therefore propose a random admixture system that seeks to balance the strengths and weaknesses of the two categories of systems. Finally, we will motivate our ongoing and future work for the Voice Privacy Attacker challenge.