Ray-Space constrained multichannel Nonnegative Matrix Factorization for Audio Source Separation
Olivieri M.;Pezzoli M.;Antonacci F.;Sarti A.
2024-01-01
Abstract
Non-negative matrix factorization (NMF) has been widely adopted for the blind separation of acoustic sources. In VR and AR applications, when several microphones are available, sound field representations such as spherical harmonics or the ray space have been shown to be effective in the NMF context. In this work, we propose a dictionary-based NMF model that operates on ray-space-transformed signals. The novelty of the approach lies in explicitly modelling the frequency dependency of sound propagation from the source positions to the sensors. Spatially constrained approaches aim to exploit spatial information to improve separation performance; however, they may not take advantage of the priors offered by particular representations of the data. The proposed approach exploits the source location model of the ray space through a predefined, frequency-dependent dictionary of ray-space patterns. Results on real recordings demonstrate that the proposed method is competitive with state-of-the-art NMF-based algorithms.

| File | Size | Format | |
|---|---|---|---|
| RS_MNMF_ICASSP_2024-3.pdf (open access): Post-Print (DRAFT or Author’s Accepted Manuscript-AAM) | 366.15 kB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


