RE.PUBLIC@POLIMI Research publications of the Politecnico di Milano

Augmenting energy time-series for data-efficient imputation of missing values

Liguori, A;Markovic, R;Ferrando, M;Frisch, J;Causone, F;van Treeck, C

2023-01-01

Abstract

This study explores the applicability of data augmentation techniques for reconstructing missing energy time-series in limited data regimes. In particular, multiple synthetic copies of a relatively small training dataset are stacked together with pseudo-random noise. First, an existing convolutional denoising autoencoder is selected from a previous work as the base imputation model of this study. Then, an optimal augmentation rate, which minimizes the training set of the model, is chosen based on the preliminary results obtained from one building. The results showed that augmenting a nine-day training set 80 times could reduce the initial average root mean squared error (RMSE) by 37% and 48% for the continuous and random missing scenarios, respectively. Additionally, the augmented model outperformed the benchmark methods with 23% and 12% lower average RMSE. No additional tuning or calibration costs were required for the existing base imputation model. Therefore, the presented data augmentation technique could significantly reduce the expensive computational costs associated with deep learning models.
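The stacking of noisy copies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `augment_with_noise` and `rmse`, the Gaussian noise model, the `noise_std` value, the hourly resolution, and the choice to keep the original series alongside the copies are all assumptions made for illustration.

```python
import math
import random


def augment_with_noise(series, rate, noise_std=0.05, seed=0):
    """Return the original series followed by `rate` noisy copies.

    Assumption: the noise is zero-mean Gaussian with a fixed standard
    deviation; the abstract only says "pseudo-random noise".
    """
    rng = random.Random(seed)
    stacked = [list(series)]  # keep the original series as the first entry
    for _ in range(rate):
        stacked.append([x + rng.gauss(0.0, noise_std) for x in series])
    return stacked


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical nine-day series at an assumed hourly resolution (9 * 24 values).
data_rng = random.Random(1)
series = [data_rng.random() for _ in range(9 * 24)]

# Augmentation rate of 80, as quoted in the abstract.
augmented = augment_with_noise(series, rate=80)
print(len(augmented))  # 81 series: the original plus 80 noisy copies
```

The stacked series would then serve as the training set for the unchanged base imputation model, with RMSE computed between the reconstructed and the true values.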

Year of publication: 2023

Journal title: APPLIED ENERGY

Keywords: Missing data; Data augmentation; Data scarcity; Building energy data; Deep learning

Appears in categories: 01.1 Journal Article

Files in this item:

File	Size	Format
AugmentingTimeSeries.pdf (restricted access; description: publisher's version)	1.76 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1247261
