Sex conversion in speech carries privacy risks from data collection and often leaves residual sex-specific cues in its outputs, even when target speaker references are unavailable. We introduce RASO (Reference-free Adversarial Sex Obfuscation). Its innovations include a sex-conditional adversarial learning framework that disentangles linguistic content from sex-related acoustic markers, and explicit regularisation that aligns fundamental frequency distributions and formant trajectories with sex-neutral characteristics learned from sex-balanced training data. RASO preserves linguistic content and, even when assessed under a semi-informed attack model, significantly outperforms a competing approach to sex obfuscation.
Reference-free adversarial sex obfuscation in speech
Submitted to arXiv, 4 August 2025
Type: Conference
Date: 2025-08-04
Department: Digital Security
Eurecom Ref: 8317
Copyright: © EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Submitted to arXiv, 4 August 2025 and is available at:
PERMALINK: https://www.eurecom.fr/publication/8317