The Usage-Based Insurance paradigm, which is receiving a lot of attention in recent years, envisages computing the car policy premium based on accident risk probability, evaluated observing the past driving history and habits. However, Usage-Based Insurance strategies are usually based on simple empirical decision rules built on travelled distance. The development of intelligent systems for smart risk prediction using the stored overall driving behaviour, without the need of other insurance or socio-demographic information, is still an open challenge. This work aims at exploring a comprehensive machine learning-based approach solely based on driving-related data of private vehicles. The anonymized dataset employed in this study is provided by the telematics company UnipolTech, and contains space/time densely measured data related to trips of almost 100000 vehicles uniformly spread on the Italian territory, recorded every 2 km by on-board telematics fix devices (black boxes), from February 2018 to February 2020. An innovative feature engineering process is proposed, with the aim of uncovering novel informative quantities able to disclose complex aspects of driving behaviour. Recent and powerful learning techniques are explored to develop advanced predictive models, able to provide a reliable accident probability for each vehicle, automatically managing the critical imbalance intrinsically peculiar this kind of datasets.

Machine learning based car accident risk prediction for usage-based insurance

Strada Silvia;Formentin Simone;Savaresi Sergio
2025-01-01

Abstract

The Usage-Based Insurance paradigm, which is receiving a lot of attention in recent years, envisages computing the car policy premium based on accident risk probability, evaluated observing the past driving history and habits. However, Usage-Based Insurance strategies are usually based on simple empirical decision rules built on travelled distance. The development of intelligent systems for smart risk prediction using the stored overall driving behaviour, without the need of other insurance or socio-demographic information, is still an open challenge. This work aims at exploring a comprehensive machine learning-based approach solely based on driving-related data of private vehicles. The anonymized dataset employed in this study is provided by the telematics company UnipolTech, and contains space/time densely measured data related to trips of almost 100000 vehicles uniformly spread on the Italian territory, recorded every 2 km by on-board telematics fix devices (black boxes), from February 2018 to February 2020. An innovative feature engineering process is proposed, with the aim of uncovering novel informative quantities able to disclose complex aspects of driving behaviour. Recent and powerful learning techniques are explored to develop advanced predictive models, able to provide a reliable accident probability for each vehicle, automatically managing the critical imbalance intrinsically peculiar this kind of datasets.
2025
File in questo prodotto:
File Dimensione Formato  
Your Submission IDA-230971R2.pdf

Accesso riservato

: Publisher’s version
Dimensione 149.43 kB
Formato Adobe PDF
149.43 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1274927
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 0
social impact