RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Clustering Italian medical texts: a case study on referrals

V. Torri; M. Ercolanoni; F. Bortolan; O. Leoni; F. Ieva

2023-01-01

Abstract

In the medical domain, a large amount of valuable information is stored in textual format. These unstructured data have long been ignored because of the difficulty of incorporating them into statistical models. In recent years, however, the field of Natural Language Processing (NLP) has improved considerably, with models achieving strong results in tasks such as information extraction, classification and clustering. NLP models are typically language-specific and often domain-specific, yet most work to date has focused on English, especially in the medical domain. In this work, we propose a pipeline for clustering Italian medical texts, with a case study on clinical questions reported in referrals.

Year of publication: 2023
Book title: Proceedings of the Statistics and Data Science Conference
ISBN (International Standard Book Number): 9788869521706
Keywords: Natural Language Processing, Clustering, Administrative Databases, Medical documents
Appears in types: 04.1 Contribution in conference proceedings

Files in this item:

File	Size	Format
Vittorio_Torri_ClusteringItalianMedicalTexts.pdf (open access; Post-Print: DRAFT or Author's Accepted Manuscript, AAM)	126.98 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1242777
