Augmenting energy time-series for data-efficient imputation of missing values

Ferrando, M.; Causone, F.
2023-01-01

Abstract

This study explores the applicability of data augmentation techniques for reconstructing missing energy time-series in limited data regimes. In particular, multiple synthetic copies of a relatively small training dataset are stacked together with pseudo-random noise. First, an existing convolutional denoising autoencoder from a previous work is selected as the base imputation model of this study. Then, an optimal augmentation rate, which minimizes the training set of the model, is chosen based on the preliminary results obtained from one building. The results showed that augmenting a nine-day training set 80 times could reduce the initial average root mean squared error (RMSE) by 37% and 48% for the continuous and random missing scenarios, respectively. Additionally, the augmented model outperformed the benchmark methods, with 23% and 12% lower average RMSE. No additional tuning or calibration costs were required for the existing base imputation model. Therefore, the presented data augmentation technique could significantly reduce the expensive computational costs associated with deep learning models.
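
The stacking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian noise, the noise scale (`noise_std`), the hourly resolution, and the choice to keep the original data alongside the noisy replicas are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def augment_with_noise(train, n_copies, noise_std=0.05, seed=0):
    """Stack the original windows with `n_copies` noisy replicas.

    Assumptions (not taken from the paper): zero-mean Gaussian noise,
    scaled by `noise_std` times the overall standard deviation of the data.
    """
    rng = np.random.default_rng(seed)
    train = np.asarray(train, dtype=float)
    scale = noise_std * train.std()
    copies = [train]  # keep the unmodified data as the first block
    for _ in range(n_copies):
        noise = rng.normal(0.0, scale, size=train.shape)
        copies.append(train + noise)
    return np.concatenate(copies, axis=0)


def rmse(y_true, y_pred):
    """Root mean squared error between two arrays of equal shape."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Hypothetical nine-day training set with 24 hourly readings per day.
rng = np.random.default_rng(42)
nine_days = rng.uniform(10.0, 50.0, size=(9, 24))

# 80 noisy replicas stacked below the original nine days.
augmented = augment_with_noise(nine_days, n_copies=80)
```

Here `augmented` holds 81 blocks of nine daily profiles each, which would then be passed to the imputation model in place of the original nine days. The noise level is the main tuning knob in this sketch; too little noise only duplicates the data, while too much distorts the load profiles.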
2023
Missing data
Data augmentation
Data scarcity
Building energy data
Deep learning
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1247261