Voice anonymization: from vocoder drift to neural audio codecs and robust evaluation

Panariello, Michele
Thesis

The increasing popularity of speech technology raises privacy concerns: processing of speech data by unauthorized parties poses threats such as voice cloning or inference of personal information (e.g., gender, health, or emotional state) that adversaries can exploit for nefarious purposes. Voice anonymization aims to mitigate such risks by changing the speaker's voice identity into that of a pseudo-speaker, while preserving relevant linguistic and para-linguistic attributes. Ideally, the anonymized utterance can then no longer be linked back to the original speaker, which prevents the inference of sensitive information. Anonymization is typically achieved through voice conversion or text-to-speech synthesis, and its effectiveness is assessed by simulating attacks from privacy adversaries who attempt to re-identify anonymized utterances through automatic speaker verification. This dissertation explores voice anonymization from multiple perspectives. First, an analysis of recent anonymization systems is presented, with particular focus on the role of the vocoder and how its synthesis process impacts overall privacy protection. Next, novel approaches to anonymization are proposed, based on the language modeling of neural audio codec tokens and character-based conditioning of the vocoder.

The shortcomings of current evaluation methods are then examined, with particular attention to privacy overestimation, i.e., cases where the attacker performs suboptimally, leading to overly optimistic estimates of privacy and a false sense of security. Finally, concluding remarks highlight how the presented work fits within current research trends and outline possible future directions for the field.


Type:
Thesis
Date:
2025-12-15
Department:
Digital Security
Eurecom Ref:
8475
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Thesis and is available at:

PERMALINK : https://www.eurecom.fr/publication/8475