RE.PUBLIC@POLIMI Research publications of Politecnico di Milano


Federated Learning for Cross-Dataset Generalization in Litter Detection

Baresi, Luciano;Bianco, Simone;Lestingi, Livia;Wehbe, Iyad

2025-01-01

Abstract

The accumulation of litter in natural environments poses significant ecological and social challenges, motivating the development of automated solutions for litter detection. However, collecting and centrally aggregating large-scale annotated datasets for training object detectors often raises privacy and ownership concerns. In this work, we propose a Federated Learning (FL) framework to train a lightweight litter detection model based on the YOLO architecture, which enables collaborative model development without requiring centralized access to raw data. Each participating client trains the model locally on site-specific datasets collected in the wild, and only model updates are shared with a central server for aggregation. We compare several FL process configurations that use mixed and heterogeneous training sets derived from two commonly used benchmark datasets, TACO and PlastOPol, which were collected at different locations and have very different visual data distributions. Experimental results show that the federated model, trained across these non-IID data distributions, generalizes better in cross-dataset evaluation than the corresponding centrally trained models.


Year of publication	2025
Book title	28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy – Including 14th Conference on Prestigious Applications of Intelligent Systems (PAIS 2025)
Series title	FRONTIERS IN ARTIFICIAL INTELLIGENCE AND APPLICATIONS
ISBN (International Standard Book Number)	9781643686318
Appears in categories	04.1 Conference proceedings contribution

Files in this record:

File	Size	Format
p9659.pdf (open access; description: pre-print, i.e. pre-refereeing)	2.68 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1309224
