Diffusion Models for Recommendation: Reproducibility and Conceptual Mismatch
Benigni M.; Ferrari Dacrema M.
2025-01-01
Abstract
Recent studies have applied Denoising Diffusion Probabilistic Models (DDPMs) to recommender systems, reporting notable improvements. However, several reproducibility studies have shown that claims asserting the superiority of new methods are frequently not substantiated by rigorous evidence, as they often rely on non-reproducible experimental protocols, weak or untuned baselines, and questionable evaluation practices. This extended abstract presents key findings from the manuscript “Diffusion Recommender Models and the Illusion of Progress: A Concerning Study of Reproducibility and a Conceptual Mismatch,” which investigates whether the reported advancements of diffusion-based models in recommendation are supported by rigorous and reproducible experimental evaluation. The study re-executes the experiments of four DDPM-based models presented at SIGIR 2023 and 2024, revealing substantial methodological issues and limited reproducibility. In addition, it highlights a conceptual mismatch between the generative nature of DDPMs and the deterministic requirements of offline evaluation, underscoring the need to reconsider evaluation procedures for generative models.

| File | Access | Version | Size | Format |
|---|---|---|---|---|
| diffusion-models-for-recommendation-reproducibility-and-conceptual-mismatch.pdf | Open access | Publisher’s version | 207.03 kB | Adobe PDF |


