RE.PUBLIC@POLIMI: research publications of the Politecnico di Milano

Computer-Aided Dementia Detection: How Informative Are Your Features?

E. Stoppa; G. W. Di Donato; N. Parde; M. D. Santambrogio

2022-01-01

Abstract

Alzheimer’s Disease (AD) is a progressive neurodegenerative disease with no cure. Early detection is critical to slowing its progression, but the diagnostic process is lengthy and costly. Computer-Aided Dementia Detection through Natural Language Processing is emerging as a viable route to early diagnosis. Many works in the literature use conversation transcripts from the widely used DementiaBank dataset to train and test Machine Learning models that detect dementia automatically. However, the reproducibility and comparability of previous results have been a significant problem in this research domain. To address these problems, we propose a set of curated features, a modular and extensible Feature Extraction framework, and a Performance Evaluation framework. We then evaluate the baseline performance of 12 Machine Learning algorithms on three tasks: Regression, Binary Classification, and Multiclass Classification with 3, 4, and 5 classes. The top-performing model was Gradient Boosted Decision Trees, which achieved an RMSE of 4.3 on the Regression task, an accuracy of 0.78 on the Binary Classification task, and accuracies of 0.63, 0.64, and 0.49 on the 3-, 4-, and 5-class Multiclass Classification tasks, respectively.

Year of publication: 2022
Book title: Proceedings of 2022 IEEE 7th Forum on Research and Technologies for Society and Industry Innovation (RTSI)
ISBN (International Standard Book Number): 978-1-6654-9739-8; 978-1-6654-9740-4
Appears in types: 04.1 Conference proceedings contribution

Files in this record:

File	Size	Format
Computer-Aided_Dementia_Detection_How_Informative_Are_Your_Features.pdf (restricted access; publisher’s version)	750.84 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1231638
