Zero-shot anomalous sound detection in domestic environments using large-scale pretrained audio pattern recognition models

Alessandro Ilic Mezza;Maximo Cobos;Fabio Antonacci
2023-01-01

Abstract

Anomalous sound detection is central to audio-based surveillance and monitoring. In a domestic environment, however, the classes of sounds to be considered anomalous are situation-dependent and cannot be determined in advance. At the same time, it is not feasible to expect a demanding labeling effort from the end user. To address these problems, we present a novel zero-shot method relying on an auxiliary large-scale pretrained audio neural network in support of an unsupervised anomaly detector. The auxiliary module is tasked with generating a fingerprint for each sound occasionally registered by the user. These fingerprints are then compared with those extracted from the input audio stream, and the resulting similarity score is used to increase or reduce the sensitivity of the base detector. Experimental results on synthetic data show that the proposed method substantially improves upon the unsupervised base detector and is capable of outperforming existing few-shot learning systems developed for machine condition monitoring without involving additional training.
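
The fingerprint comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the modulation rule, so the cosine similarity, the exponential scaling, the `gain` parameter, and the split into registered "anomalous" and "normal" fingerprint sets are all assumptions made here for illustration. Embeddings are plain lists of floats standing in for the output of a pretrained audio network.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity(embedding, fingerprints):
    """Highest similarity between an embedding and any registered fingerprint."""
    if not fingerprints:
        return 0.0
    return max(cosine_similarity(embedding, f) for f in fingerprints)


def adjusted_score(base_score, embedding, anomalous_fps, normal_fps, gain=1.0):
    """Hypothetical modulation rule (an assumption, not taken from the paper).

    The base detector's anomaly score is scaled up when the current embedding
    resembles a sound the user registered as anomalous, and scaled down when
    it resembles a sound registered as normal.
    """
    sim_anom = max(0.0, max_similarity(embedding, anomalous_fps))
    sim_norm = max(0.0, max_similarity(embedding, normal_fps))
    return base_score * math.exp(gain * (sim_anom - sim_norm))
```

With no registered fingerprints the function returns the base score unchanged, so in this sketch the unsupervised detector keeps working on its own until the user registers a sound.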
Year: 2023
Published in: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
ISBN: 978-1-7281-6327-7
Keywords: audio anomaly detection; domestic environments; pretrained audio neural networks; zero-shot learning

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1250202