Federated Learning for Cross-Dataset Generalization in Litter Detection

Baresi, Luciano;Lestingi, Livia;Wehbe, Iyad
2025-01-01

Abstract

The accumulation of litter in natural environments poses significant ecological and social challenges, motivating the development of automated solutions for litter detection. However, collecting and centrally aggregating large-scale annotated datasets to train object detectors often raises privacy and ownership concerns. In this work, we propose a Federated Learning (FL) framework for training a lightweight litter detection model based on the YOLO architecture, enabling collaborative model development without centralized access to raw data. Each participating client trains the model locally on a site-specific dataset collected in the wild, and only model updates are shared with a central server for aggregation. We compare FL configurations that use mixed and heterogeneous training datasets built from two commonly used benchmarks, TACO and PlastOPol, which were collected at different locations and have very different visual data distributions. Experimental results show that the federated model, trained across these non-IID data distributions, generalizes better in cross-dataset evaluation than the corresponding centrally trained models.
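The server-side aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the aggregation rule, so the example assumes plain federated averaging (FedAvg), where each client's weights are averaged in proportion to its local dataset size. The function name `federated_average`, the `Weights` type, and the toy parameter name `head` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Assumption: a model is represented as a mapping from parameter name
# to a flat list of floats. A real YOLO checkpoint would hold tensors.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its sample count.

    This is plain FedAvg, used here as an assumed aggregation rule.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Toy round: two clients with different amounts of local data.
client_a = {"head": [1.0, 2.0]}   # e.g. trained on 1 local sample
client_b = {"head": [3.0, 4.0]}   # e.g. trained on 3 local samples
global_weights = federated_average([client_a, client_b], [1, 3])
```

In this sketch only the weight dictionaries reach the server; the raw images stay with each client, which is the property the abstract relies on.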
2025
28th European Conference on Artificial Intelligence, 25-30 October 2025, Bologna, Italy – Including 14th Conference on Prestigious Applications of Intelligent Systems (PAIS 2025)
9781643686318
Files in this item:
File: p9659.pdf
Access: open access
Description: pre-print (Pre-Print or Pre-Refereeing)
Size: 2.68 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1309224