Automatic Speaker Verification (ASV) systems are increasingly employed to secure access to services and facilities. However, recent advances in speech deepfake generation pose serious threats to their reliability. Modern speech synthesis models can convincingly imitate a target speaker's voice and generate realistic synthetic audio, potentially enabling unauthorized access through ASV systems. To counter these threats, forensic detectors have been developed to distinguish between real and fake speech. Although these models achieve strong performance, their deep learning nature makes them susceptible to adversarial attacks, i.e., carefully crafted, imperceptible perturbations in the audio signal that make the model unable to classify correctly. In this paper, we explore adversarial attacks targeting speech deepfake detectors. Specifically, we analyze the effectiveness of Basic Iterative Method (BIM) attacks applied in both time and frequency domains under white- and black-box conditions. Additionally, we propose an ensemble-based attack strategy designed to simultaneously target multiple detection models. This approach generates adversarial examples with balanced effectiveness across the ensemble, enhancing transferability to unseen models. Our experimental results show that, although crafting universally transferable attacks remains challenging, it is possible to fool state-of-the-art detectors using minimal, imperceptible perturbations, highlighting the need for more robust defenses in speech deepfake detection.
BIM-Based Adversarial Attacks Against Speech Deepfake Detectors
W. E. Wang;D. Salvi;V. Negroni;D. U. Leonzio;P. Bestagini;S. Tubaro
2025-01-01
Abstract
Automatic Speaker Verification (ASV) systems are increasingly employed to secure access to services and facilities. However, recent advances in speech deepfake generation pose serious threats to their reliability. Modern speech synthesis models can convincingly imitate a target speaker's voice and generate realistic synthetic audio, potentially enabling unauthorized access through ASV systems. To counter these threats, forensic detectors have been developed to distinguish between real and fake speech. Although these models achieve strong performance, their deep learning nature makes them susceptible to adversarial attacks, i.e., carefully crafted, imperceptible perturbations in the audio signal that make the model unable to classify correctly. In this paper, we explore adversarial attacks targeting speech deepfake detectors. Specifically, we analyze the effectiveness of Basic Iterative Method (BIM) attacks applied in both time and frequency domains under white- and black-box conditions. Additionally, we propose an ensemble-based attack strategy designed to simultaneously target multiple detection models. This approach generates adversarial examples with balanced effectiveness across the ensemble, enhancing transferability to unseen models. Our experimental results show that, although crafting universally transferable attacks remains challenging, it is possible to fool state-of-the-art detectors using minimal, imperceptible perturbations, highlighting the need for more robust defenses in speech deepfake detection.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


