About the Effects of Data Imputation Techniques on ML Uncertainty

Cappiello C.; Cerutti F.; Sancricca C.
2023-01-01

Abstract

Data-driven culture rests on data analysis as a support for decision-making. In particular, Machine Learning (ML) technologies and tools are evolving quickly and becoming increasingly popular as an effective means of gaining insights from raw data. However, ML models often produce uncertain results, mainly because of their imperfect and statistical nature. In this paper, we focus on the fact that data preparation techniques can introduce additional uncertainty. Errors, missing values, and inconsistencies are frequently addressed with techniques that correct the data using estimates, and thus add further uncertainty. Focusing on the specific problem of incomplete data, this paper (i) investigates the effect of imputation techniques on the uncertainty of the results, and (ii) identifies the techniques that minimize this added uncertainty.
2023
Joint Workshops at the 49th International Conference on Very Large Data Bases, VLDBW 2023
Data Imputation
Data Quality
Uncertainty
Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1261164