SMOTE-OB: Combining SMOTE and Online Bagging for Continuous Rebalancing of Evolving Data Streams

Bernardo, Alessio; Della Valle, Emanuele
2021-01-01

Abstract

The world is constantly changing, and so is the massive amount of data it produces. However, only a few studies deal with online class imbalance learning, which combines the challenges of class-imbalanced data streams and concept drift. In this paper, we propose the Synthetic Minority Oversampling TEchnique with Online Bagging (SMOTE-OB). It is a novel cost-sensitive ensemble strategy that uses Online Bagging and a new sketched version of SMOTE to oversample the minority class and undersample the majority class. We benchmarked SMOTE-OB on synthetic and real data streams containing different concept drifts, imbalance levels, and class distributions. We provide statistical evidence that the SMOTE-OB ensemble achieves better minority class performance than state-of-the-art techniques. Moreover, we perform a time/memory consumption analysis.
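The two generic building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' SMOTE-OB implementation, and the paper's sketched SMOTE variant is not reproduced here. It only shows textbook Online Bagging, where each base learner sees an incoming sample k ~ Poisson(λ) times, and textbook SMOTE interpolation between a minority sample and a nearby minority neighbor. All function names, the class-dependent λ values, and the toy data are assumptions made for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def replay_counts(label, n_learners, lam_by_class, rng):
    """Online Bagging step: how many times each base learner trains on a sample.

    lam_by_class is a hypothetical mapping label -> lambda. A larger lambda for
    the minority class oversamples it; a smaller one undersamples the majority.
    """
    lam = lam_by_class[label]
    return [poisson(lam, rng) for _ in range(n_learners)]


def nearest(x, pool):
    """Return the point in pool closest to x (x itself, by identity, excluded)."""
    others = [p for p in pool if p is not x]
    return min(others, key=lambda p: math.dist(x, p))


def smote_point(x, neighbor, rng):
    """Textbook SMOTE: a random point on the segment between x and neighbor."""
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


if __name__ == "__main__":
    rng = random.Random(42)
    # Assumed toy setting: label 1 is the minority class.
    lam_by_class = {1: 3.0, 0: 0.5}
    minority_pool = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]

    x = minority_pool[0]
    counts = replay_counts(1, n_learners=5, lam_by_class=lam_by_class, rng=rng)
    synthetic = smote_point(x, nearest(x, minority_pool), rng)
    print("per-learner replay counts:", counts)
    print("synthetic minority sample:", synthetic)
```

In this sketch the rebalancing comes entirely from the class-dependent λ, while the interpolation step supplies new minority points instead of exact replays. How SMOTE-OB actually couples the two is described in the paper itself.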
2021
2021 IEEE International Conference on Big Data (Big Data)
978-1-6654-3902-2

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1202060