A comparative study on machine learning techniques for intense convective rainfall events forecasting

M. Sangiorgio;S. Barindelli;V. Guglieri;E. Solazzo;G. Venuti;G. Guariso
2020-01-01

Abstract

In recent decades, the wide availability of data and computing power has driven the development of powerful machine learning techniques in many research areas, including those, such as meteorology, where traditional conceptual models were usually adopted. In this work, we analyze the performance of different techniques in forecasting intense rainfall events. A linear classifier, logistic regression, is used as a benchmark to fairly evaluate more complex nonlinear tools: a support vector machine, a deep neural network, and a random forest. Our analysis focuses on both the accuracy of these models and the computing effort needed to identify them. The nonlinear predictors are shown to outperform the linear baseline model. From a computational perspective, both the neural network and the random forest turn out to be more efficient than the support vector machine. The study area comprises the catchments of four rivers (Lambro, Seveso, Groane, and Olona) in the Lombardy region, Northern Italy, just upstream of the highly urbanized metropolitan area of Milan. Data on intense convective rainfall events from 2010 to 2017 (more than 600 events) were used to identify and test the four predictors.
2020
Theory and Applications of Time Series Analysis
9783030562199
Nowcasting, Logistic regression, Random forest, Support vector machines, Deep neural networks, Global navigation satellite system, Zenith tropospheric delay
Files in this record:
Sangiorgio_et_al.pdf: Post-Print (DRAFT or Author’s Accepted Manuscript, AAM), Adobe PDF, 505.03 kB, restricted access

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1167851