Ray-Space constrained multichannel Nonnegative Matrix Factorization for Audio Source Separation

Olivieri M.; Pezzoli M.; Antonacci F.; Sarti A.
2024-01-01

Abstract

Non-negative matrix factorization (NMF) has been widely adopted for the blind separation of acoustic sources. In VR and AR applications, when several microphones are available, sound field representations such as spherical harmonics or ray space have been shown to be effective in the NMF context. In this work, we propose a dictionary-based NMF model that operates on ray-space-transformed signals. The novelty of this approach lies in explicitly modelling the frequency dependency of the sound propagation from the source positions to the sensors. Spatially constrained approaches aim to exploit spatial information to improve separation performance; however, they may not take advantage of the priors offered by the chosen representation of the data. The proposed approach allows us to exploit the source location model of the Ray Space through a predefined frequency-dependent dictionary of Ray Space patterns. Results on real recordings demonstrate that the proposed method performs competitively with state-of-the-art NMF-based algorithms.
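To make the idea of a predefined dictionary concrete, the following is a minimal single-channel sketch, not the multichannel ray-space model of the paper. It assumes a Euclidean cost and the standard Lee-Seung multiplicative update, keeps the dictionary W fixed, and estimates only the activations H. The function name `fixed_dictionary_nmf` and the random toy matrices are hypothetical and serve only as an illustration.

```python
import numpy as np


def fixed_dictionary_nmf(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that V ~ W @ H, with W held fixed.

    V: (F, T) non-negative data matrix (e.g. a magnitude spectrogram).
    W: (F, K) non-negative predefined dictionary; it is never updated.
    Assumption: Euclidean cost with the Lee-Seung multiplicative update.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive random initialization of the activations.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    # These products do not change because W is fixed.
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update: keeps H non-negative at every iteration.
        H *= WtV / (WtW @ H + eps)
    return H


def relative_error(V, W, H):
    """Frobenius-norm reconstruction error, normalized by the norm of V."""
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)


# Toy data (hypothetical): 3 dictionary atoms, 16 "frequency" bins, 40 frames.
rng = np.random.default_rng(1)
W_dict = rng.random((16, 3))
H_true = rng.random((3, 40))
V = W_dict @ H_true

H_init = fixed_dictionary_nmf(V, W_dict, n_iter=0)   # initialization only
H_est = fixed_dictionary_nmf(V, W_dict, n_iter=200)  # after the updates

err_init = relative_error(V, W_dict, H_init)
err_est = relative_error(V, W_dict, H_est)
print(f"relative error: {err_init:.3f} -> {err_est:.3f}")
```

Because the dictionary is fixed, only the activations are learned, which is how prior knowledge (in the paper, frequency-dependent Ray Space patterns) can be injected into the factorization.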
2024
European Signal Processing Conference
blind source separation (BSS)
microphone array processing
Non-negative matrix factorization (NMF)
Ray Space
Files in this item:
File: RS_MNMF_ICASSP_2024-3.pdf
Access: open access
Type: Post-Print (DRAFT or Author’s Accepted Manuscript-AAM)
Size: 366.15 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1284156