
Bayesian Perspectives on Offline Evaluation for Recommender Systems

Benigni M.
2025-01-01

Abstract

Offline evaluation is a fundamental component in the development and deployment of better recommender systems. In recent years, the contextual bandit framework has emerged as a valuable approach for offline and counterfactual evaluation, leading to increasing interest in estimators based on inverse propensity scoring (IPS), direct methods (DM), and doubly robust (DR) techniques. However, nearly all existing methods rely on frequentist statistics, limiting their ability to capture model uncertainty and reflect it in evaluation outcomes. This work explores the novel research direction of Bayesian statistics for off-policy evaluation (OPE) in recommendation tasks, motivated by the need for reliable estimators that are more robust to distribution shift, data sparsity, and model misspecification. Three underexplored research directions are identified: (i) using posterior uncertainty from Bayesian reward models to design adaptive hybrid estimators, (ii) explicitly modeling all components of the OPE problem (contexts, actions, and rewards) in a joint probabilistic framework, and (iii) quantifying epistemic uncertainty over policy value estimates via posterior inference. By leveraging the Bayesian framework, the aim is to improve the reliability, interpretability, and safety of offline evaluation protocols, offering a new perspective on one of the most persistent challenges in recommender systems research. This perspective is especially relevant in data-scarce or high-stakes settings, where understanding uncertainty is essential for trustworthy decision-making.
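For readers unfamiliar with the estimator families the abstract names, the sketch below gives the standard textbook forms of DM, IPS, and DR on logged bandit feedback. It is illustrative background, not code from the paper; the names `q_hat` (a fitted reward model), `pi_e_probs`, and `pi_b_probs` (target- and logging-policy action probabilities per logged context) are assumptions made here for clarity.

```python
import numpy as np

def dm_estimate(q_hat, pi_e_probs):
    # Direct method: expected model reward under the target policy,
    # averaged over the logged contexts.
    return float(np.mean(np.sum(pi_e_probs * q_hat, axis=1)))

def ips_estimate(rewards, actions, pi_e_probs, pi_b_probs):
    # Inverse propensity scoring: reweight each logged reward by the
    # importance weight pi_e(a|x) / pi_b(a|x).
    idx = np.arange(len(rewards))
    w = pi_e_probs[idx, actions] / pi_b_probs[idx, actions]
    return float(np.mean(w * rewards))

def dr_estimate(rewards, actions, q_hat, pi_e_probs, pi_b_probs):
    # Doubly robust: the DM baseline plus an importance-weighted
    # correction on the reward-model residuals.
    idx = np.arange(len(rewards))
    w = pi_e_probs[idx, actions] / pi_b_probs[idx, actions]
    baseline = np.sum(pi_e_probs * q_hat, axis=1)
    correction = w * (rewards - q_hat[idx, actions])
    return float(np.mean(baseline + correction))
```

Here `rewards` and `actions` are length-n arrays, while `q_hat`, `pi_e_probs`, and `pi_b_probs` have shape (n, n_actions). DR remains consistent if either the reward model or the logged propensities are well specified, which is why it is a natural starting point for hybrid designs.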
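The paper outlines research directions rather than a fixed algorithm, so any code for the Bayesian side is necessarily hypothetical. As a minimal sketch of directions (i) and (iii), the snippet below uses a deliberately simple conjugate Beta-Bernoulli reward model (binary rewards and context-independent action means, assumptions made purely for illustration) to turn posterior draws into a distribution over the target policy's value. The spread of that distribution is the epistemic uncertainty the abstract refers to; its variance could, for example, gate an adaptive hybrid estimator between DM and IPS.

```python
import numpy as np

rng = np.random.default_rng(0)

def value_posterior(rewards, actions, pi_e_probs, n_actions, n_draws=2000):
    # Toy Bayesian reward model: Beta(1, 1) prior on each action's mean
    # binary reward, updated conjugately from the logged feedback.
    alpha = np.ones(n_actions)
    beta = np.ones(n_actions)
    np.add.at(alpha, actions, rewards)
    np.add.at(beta, actions, 1 - rewards)
    # Each posterior draw of the per-action means induces one
    # direct-method value estimate for the target policy.
    theta = rng.beta(alpha, beta, size=(n_draws, n_actions))
    avg_pi_e = pi_e_probs.mean(axis=0)  # average target-policy distribution
    return theta @ avg_pi_e             # shape (n_draws,): posterior over V(pi_e)

# Hypothetical usage on synthetic logs:
# values = value_posterior(rewards, actions, pi_e_probs, n_actions=5)
# np.percentile(values, [2.5, 97.5])  # 95% credible interval for V(pi_e)
```

A richer model (e.g., a Bayesian regression over contexts, or a joint model of contexts, actions, and rewards as in direction (ii)) would replace the conjugate update with approximate posterior inference, but the pattern of mapping posterior draws to policy-value draws stays the same.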
RecSys2025 - Proceedings of the 19th ACM Conference on Recommender Systems
Bayesian Statistics
Offline Evaluation
Recommender Systems
Files in this product:
bayesian-perspectives-on-offline-evaluation-for-recommender-systems.pdf (Publisher's version, open access, Adobe PDF, 7.95 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1307671
Citations
  • PMC: ND
  • Scopus: 0
  • Web of Science: 0