Improving Understandability and Control in Data Preparation: A Human-Centered Approach
Pucci E.; Sancricca C.; Andolina S.; Cappiello C.; Matera M.
2024-01-01
Abstract
Data preparation is the process of normalizing, cleaning, transforming, and combining data prior to processing or analysis. It is crucial for obtaining valuable results from data analysis. However, designing the most effective data preparation pipeline is often one of the biggest challenges for data analysts, consuming up to 70–80% of their time. The work illustrated in this paper is the first step toward designing a framework that simplifies the selection and validation of data preparation tasks. It proposes an environment with diverse levels of assistance and autonomy, accommodating data analysts’ varying skills and expertise. The requirements for the design of this new framework were elicited through in-depth interviews and think-aloud sessions involving a sample of data analysts, which highlighted understandability, explainability, and continuous learning as fundamental factors. The paper discusses alternatives to enhance these factors, also considering strategies that adopt Large Language Models.