Robustness of LSTM neural networks for multi-step forecasting of chaotic time series

M. Sangiorgio; F. Dercole
2020-01-01

Abstract

Recurrent neurons (and in particular LSTM cells) have proved effective when used as basic blocks to build sequence-to-sequence architectures, which represent the state-of-the-art approach in many sequential tasks related to natural language processing. In this work, these architectures are proposed as general-purpose, multi-step predictors for nonlinear time series. We analyze artificial, noise-free data generated by chaotic oscillators and compare LSTM nets with the benchmarks set by feed-forward, one-step-recursive and multi-output predictors. We focus on two different training methods for LSTM nets. The traditional one makes use of so-called teacher forcing, i.e., the ground-truth data are used as input at each time step ahead, rather than the outputs predicted for the previous steps. Conversely, the second feeds the previous predictions back into the recurrent neurons, as happens when the network is used for forecasting. LSTM predictors robustly combine the strengths of the two benchmark competitors, i.e., the good short-term performance of one-step-recursive predictors and greatly improved mid- to long-term predictions with respect to feed-forward, multi-output predictors. Training LSTM predictors without teacher forcing is recommended to improve accuracy and robustness, and it ensures a more uniform distribution of the predictive power within the chaotic attractor. We also show that LSTM architectures maintain good performance when the number of time lags included in the input differs from the actual embedding dimension of the dataset, a feature that is very important when working on real data.
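
The two training schemes contrasted in the abstract can be made concrete with a short sketch. Below is a minimal PyTorch example (not the authors' code; the network size, variable names, and toy data are illustrative assumptions) of a recurrent multi-step predictor that can be trained either with teacher forcing or by feeding its own predictions back into the LSTM cell, as happens at forecasting time.

```python
import torch
import torch.nn as nn

class LSTMForecaster(nn.Module):
    def __init__(self, hidden_size=32):
        super().__init__()
        self.cell = nn.LSTMCell(input_size=1, hidden_size=hidden_size)
        self.readout = nn.Linear(hidden_size, 1)

    def forward(self, warmup, targets, teacher_forcing=True):
        # warmup:  (batch, n_lags)   past values that initialise the hidden state
        # targets: (batch, horizon)  future values to be predicted
        batch = warmup.size(0)
        h = torch.zeros(batch, self.cell.hidden_size)
        c = torch.zeros(batch, self.cell.hidden_size)
        for t in range(warmup.size(1)):                    # encode the warm-up window
            h, c = self.cell(warmup[:, t:t + 1], (h, c))
        preds, x = [], warmup[:, -1:]                      # last observed value
        for t in range(targets.size(1)):                   # multi-step decoding loop
            h, c = self.cell(x, (h, c))
            y = self.readout(h)                            # one-step-ahead prediction
            preds.append(y)
            # Teacher forcing feeds the ground truth back in; the alternative
            # ("free run") feeds the network's own prediction, as in forecasting.
            x = targets[:, t:t + 1] if teacher_forcing else y
        return torch.cat(preds, dim=1)

# Toy usage: 8 sequences, 6 warm-up lags, 5-step prediction horizon.
model = LSTMForecaster()
warmup, targets = torch.randn(8, 6), torch.randn(8, 5)
loss = nn.functional.mse_loss(model(warmup, targets, teacher_forcing=False), targets)
loss.backward()  # training without teacher forcing exposes the net to its own errors
```

The only difference between the two schemes is which value is fed back at each decoding step; training in free-run mode exposes the network to its own accumulated errors, which is the property the paper links to better mid- to long-term accuracy.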
Keywords: Deterministic chaos; Recurrent neural networks; Teacher forcing; Exposure bias; Multi-step prediction; Nonlinear time series

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1146266
Citations
  • PMC: not available
  • Scopus: 116
  • Web of Science (ISI): 72