Sex conversion in speech involves privacy risks from data collection and often leaves residual sex-specific cues in outputs, even when target speaker references are unavailable. We introduce RASO (Reference-free Adversarial Sex Obfuscation). Its innovations include a sex-conditional adversarial learning framework that disentangles linguistic content from sex-related acoustic markers, and explicit regularisation that aligns fundamental frequency distributions and formant trajectories with sex-neutral characteristics learned from sex-balanced training data. RASO preserves linguistic content and, even when assessed under a semi-informed attack model, significantly outperforms a competing approach to sex obfuscation.
Reference-free adversarial sex obfuscation in speech
APSIPA 2025, 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 22-24 October 2025, Shangri-la, Singapore
Type:
Conference
City:
Shangri-la
Date:
2025-10-22
Department:
Digital Security
Eurecom Ref:
8317
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in APSIPA 2025, 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 22-24 October 2025, Shangri-la, Singapore and is available at:
PERMALINK: https://www.eurecom.fr/publication/8317