RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

A novel scale-invariant, dynamic method for hierarchical clustering of data affected by measurement uncertainty

Vignati, Federica;Fustinoni, Damiano;Niro, Alfonso

2018-01-01

Abstract

An enhanced technique for hierarchical agglomerative clustering is presented. Classical clustering methods suffer from non-uniqueness, which results from the adopted scaling of the data and from the arbitrary choice of the function used to measure the proximity between elements. Moreover, most classical methods cannot account for the effect of measurement uncertainty in the initial data, when present. To overcome these limitations, a weighted, asymmetric function is introduced to quantify the proximity between any two elements. The data weighting depends dynamically on the degree of advancement of the clustering procedure. The novel proximity measure is derived from a geometric approach to clustering; it both decouples the result from the data scaling and indicates the robustness of a clustering against the measurement uncertainty of the initial data. The method applies to both flat and hierarchical clustering, while retaining the computational cost of the classical methods.
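The scale dependence that the abstract attributes to classical clustering can be illustrated with a small sketch. The code below is not the authors' weighted, asymmetric proximity measure, which this record does not specify; it is a plain Euclidean single-linkage procedure written for illustration, and the function names `single_linkage` and `rescale` as well as the four sample points are assumptions introduced here.

```python
from itertools import combinations
from math import dist


def single_linkage(points, n_clusters):
    """Classical agglomerative clustering (Euclidean distance, single linkage).

    Illustrative only: this is the kind of classical method the abstract
    criticises, not the method proposed in the paper.
    """
    clusters = [{i} for i in range(len(points))]
    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose closest members are nearest.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                dist(points[i], points[j])
                for i in clusters[pair[0]]
                for j in clusters[pair[1]]
            ),
        )
        clusters[a] |= clusters[b]  # merge b into a (a < b always holds)
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


def rescale(points, factors):
    """Multiply each coordinate by its factor (e.g. a change of units)."""
    return [tuple(x * f for x, f in zip(p, factors)) for p in points]


# Hypothetical data: four points on the corners of a 1 x 3 rectangle.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (1.0, 3.0)]

original = single_linkage(points, 2)
rescaled = single_linkage(rescale(points, (10.0, 1.0)), 2)

print(original)  # [[0, 1], [2, 3]]
print(rescaled)  # [[0, 2], [1, 3]]
```

Multiplying the first coordinate by 10, as a change of units would, alters which points are grouped together even though the underlying data are the same. This is the non-uniqueness that a scale-invariant proximity measure is meant to remove.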

Year of publication: 2018

Journal title: JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS

Keywords: Computational cost; Hierarchical clustering; Non-uniqueness; Proximity measure; Uncertainty; Computational Mathematics; Applied Mathematics

Appears in types: 01.1 Journal Article

Files in this item:

File	Size	Format
paper_stat v.4 2017.11.20.pdf (restricted access; publisher's version)	397.58 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1085451
