Validating Personas remains a significant challenge in the current scientific literature, because it relies on costly external validation methods involving users, experts or simulations. This study aims to address this challenge by introducing a novel method to perform Personas validation based on data already available during their development. The proposed method starts from previously developed Personas and trains a clustering model on the entire dataset. Subsequently, a train-test split is repeated twenty times, and in each iteration a second clustering model is trained on the training set only. Labels for the test set are then predicted by both models (the one trained on the entire dataset and the one trained on the training set), and their agreement is assessed through clustering consensus metrics, namely Adjusted Rand Index (ARI), Adjusted Mutual Information (AMI), homogeneity, completeness and V-measure, averaged across the twenty iterations. The proposed method is evaluated on a dataset of 1070 subjects describing their willingness to vaccinate against COVID-19. Three unbalanced Personas derived from previous studies are used as input to the method. The clustering consensus metrics reveal a fair agreement between clustering results, with an average ARI of 0.3391, average AMI of 0.3435, average homogeneity of 0.3531, average completeness of 0.3473 and average V-measure of 0.3499. In conclusion, the proposed method exhibits robust capabilities in generalizing beyond the training data by accurately classifying new samples into their respective Personas, suggesting that costs could be reduced compared with existing methods in the current literature.
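The procedure described in the abstract can be sketched in a few lines of Python with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm or the test fraction, so `KMeans`, `test_size=0.3`, the function name `persona_consensus` and the synthetic `make_blobs` data are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    adjusted_mutual_info_score,
    adjusted_rand_score,
    homogeneity_completeness_v_measure,
)
from sklearn.model_selection import train_test_split


def persona_consensus(X, n_personas=3, n_repeats=20, test_size=0.3, seed=0):
    """Average agreement between a full-data model and split-trained models."""
    # Reference model fit on the entire dataset.
    # ASSUMPTION: KMeans stands in for the unspecified clustering model.
    full_model = KMeans(n_clusters=n_personas, n_init=10, random_state=seed).fit(X)

    scores = {"ARI": [], "AMI": [], "homogeneity": [], "completeness": [], "V-measure": []}
    for i in range(n_repeats):
        # ASSUMPTION: a 70/30 split; the abstract does not state the fraction.
        X_train, X_test = train_test_split(X, test_size=test_size, random_state=seed + i)
        split_model = KMeans(n_clusters=n_personas, n_init=10, random_state=seed + i).fit(X_train)

        # Both models label the same held-out samples.
        ref_labels = full_model.predict(X_test)
        new_labels = split_model.predict(X_test)

        # Consensus metrics do not depend on how cluster ids are numbered.
        h, c, v = homogeneity_completeness_v_measure(ref_labels, new_labels)
        scores["ARI"].append(adjusted_rand_score(ref_labels, new_labels))
        scores["AMI"].append(adjusted_mutual_info_score(ref_labels, new_labels))
        scores["homogeneity"].append(h)
        scores["completeness"].append(c)
        scores["V-measure"].append(v)

    # Average each metric across the repeated splits.
    return {name: float(np.mean(values)) for name, values in scores.items()}


# Synthetic, well-separated data (NOT the COVID-19 vaccination dataset).
X_demo, _ = make_blobs(
    n_samples=300,
    centers=[[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]],
    cluster_std=1.0,
    random_state=0,
)
demo_scores = persona_consensus(X_demo)
for metric, value in demo_scores.items():
    print(f"{metric}: {value:.4f}")
```

On cleanly separated synthetic clusters the averaged scores should sit close to 1; real survey data with overlapping, unbalanced groups would be expected to score lower, as with the values reported in the abstract.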
A Data-Driven Method to Perform Personas Validation Using Clustering Consensus Metrics
Tauro, Emanuele; Caiani, Enrico Gianluca
2024-01-01
File | Access | Version | Size | Format
---|---|---|---|---
EHB2023_InTakeCare_Final.pdf | Restricted access | Publisher's version | 456.93 kB | Adobe PDF