Processing data streams gained much importance in recent years. Standard machine learning algorithms do not cope well with non-stationary streaming data, where decision models evolve and generate so-called concept drift. Online adaptive algorithms emerged to solve these issues. They learn incrementally and generally require explicit forgetting mechanisms to adapt to concept drift. In this paper, we propose the application of Kalman filtering to handle evolving data streams. This novel approach addresses data stream mining and concept drift management challenges from a new perspective, directly modelling a representation suitable for the data streams. First, we study a Kalman filter based learning approach and investigate its integration into the Naive Bayes algorithm, namely KalmanNB. Additionally, we propose the Hoeffding Kalman Tree, a combination of the Hoeffding Tree with KalmanNB. Empirical results demonstrate that the Kalman filter based approach inherently manages concept drifts, and it adapts to the emerging concept more rapidly than the state-of-the-art algorithms. Moreover, it is an accurate and robust approach and requires less storage while still being faster.
Kalman Filtering for Learning with Evolving Data Streams
Ziffer, Giacomo;Bernardo, Alessio;Valle, Emanuele Della;
2021-01-01
Abstract
Processing data streams gained much importance in recent years. Standard machine learning algorithms do not cope well with non-stationary streaming data, where decision models evolve and generate so-called concept drift. Online adaptive algorithms emerged to solve these issues. They learn incrementally and generally require explicit forgetting mechanisms to adapt to concept drift. In this paper, we propose the application of Kalman filtering to handle evolving data streams. This novel approach addresses data stream mining and concept drift management challenges from a new perspective, directly modelling a representation suitable for the data streams. First, we study a Kalman filter based learning approach and investigate its integration into the Naive Bayes algorithm, namely KalmanNB. Additionally, we propose the Hoeffding Kalman Tree, a combination of the Hoeffding Tree with KalmanNB. Empirical results demonstrate that the Kalman filter based approach inherently manages concept drifts, and it adapts to the emerging concept more rapidly than the state-of-the-art algorithms. Moreover, it is an accurate and robust approach and requires less storage while still being faster.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.