A DATA-DRIVEN APPROACH FOR ACOUSTIC PARAMETER SIMILARITY ESTIMATION OF SPEECH RECORDING

Borrelli C.;Bestagini P.;Antonacci F.;Sarti A.;Tubaro S.
2022-01-01

Abstract

Speech recordings exhibit different quality and reverberation properties depending on the recording setup and environment. Speech analysis systems that work correctly on certain recordings may therefore fail on others acquired in different acoustic contexts. Being able to tell whether a track under analysis shares the acoustic characteristics of a reference track can thus help determine whether it can be successfully processed by a given speech analysis system. Alternatively, in a forensic scenario, an estimate of the acoustic parameter similarity between two tracks can be used to verify whether the recordings were likely acquired in the same environment. In this work, we propose two methods to estimate acoustic parameter similarity between a speech recording under analysis and a reference one. The first method estimates channel-based acoustic indicators, which are then compared to extract a similarity measure. The second method directly learns a parameter similarity measure through siamese neural networks.
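
As an illustration of the first idea (compare acoustic indicators to obtain a similarity measure), the following is a minimal sketch and not the method of the paper. It assumes that a room impulse response is available for each recording, whereas the paper works from speech recordings alone. The impulse responses are synthetic (exponentially decaying noise), and all function names (`synthetic_rir`, `clarity_index`, `reverberation_time`, `indicator_distance`) are hypothetical. The two indicators are the ones named in the keywords: reverberation time and clarity index.

```python
import numpy as np

def synthetic_rir(t60, fs=16000, duration=1.0, seed=0):
    """Toy impulse response (assumption): white noise with exponential decay."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    # Amplitude envelope chosen so that energy drops by 60 dB after t60 seconds.
    decay = 10.0 ** (-3.0 * t / t60)
    return rng.standard_normal(t.size) * decay

def clarity_index(rir, fs, early_ms=50.0):
    """Early-to-late energy ratio in dB (C50 when early_ms is 50)."""
    split = int(fs * early_ms / 1000.0)
    energy = rir ** 2
    return 10.0 * np.log10(energy[:split].sum() / energy[split:].sum())

def reverberation_time(rir, fs):
    """Reverberation time extrapolated from the -5 to -25 dB span of the
    Schroeder (backward-integrated) energy decay curve."""
    energy = rir ** 2
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0])
    t = np.arange(rir.size) / fs
    mask = (edc_db <= -5.0) & (edc_db >= -25.0)
    slope, _ = np.polyfit(t[mask], edc_db[mask], 1)  # dB per second
    return -60.0 / slope

def indicator_distance(rir_a, rir_b, fs):
    """Absolute difference per indicator: [reverberation time (s), clarity (dB)]."""
    a = np.array([reverberation_time(rir_a, fs), clarity_index(rir_a, fs)])
    b = np.array([reverberation_time(rir_b, fs), clarity_index(rir_b, fs)])
    return np.abs(a - b)

fs = 16000
reference = synthetic_rir(0.3, fs, seed=0)      # short reverberation
same_room = synthetic_rir(0.3, fs, seed=1)      # same decay, different noise
other_room = synthetic_rir(0.8, fs, seed=2)     # longer reverberation

print("same room :", indicator_distance(reference, same_room, fs))
print("other room:", indicator_distance(reference, other_room, fs))
```

Smaller distances suggest more similar acoustic conditions. How the per-indicator differences are combined or thresholded into a single similarity decision is left open in this sketch.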
2022
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
978-1-6654-0540-9
Acoustic similarity
clarity index
reverberation time
siamese neural networks

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1233409