A “Density-Based” Algorithm for Cluster Analysis Using Species Sampling Gaussian Mixture Models
ARGIENTO, RAFFAELE; GUGLIELMI, ALESSANDRA
2014-01-01
Abstract
We propose a new model for cluster analysis in a Bayesian nonparametric framework. The model combines two ingredients: species sampling mixture models of Gaussian distributions on the one hand, and a deterministic clustering procedure (DBSCAN) on the other. Two observations from the underlying species sampling mixture model share the same cluster if the distance between the densities corresponding to their latent parameters is smaller than a threshold; this yields a random partition that is coarser than the one induced by the species sampling mixture. Since the procedure depends on the value of the threshold, we suggest a strategy for choosing it. In addition, we discuss implementation and applications of the model, and we compare it with more standard clustering algorithms. Supplementary materials for the article are available online.

File | Size | Format | |
---|---|---|---|
JCGS2014_authorscopy.pdf (restricted access; description: article; Post-Print, draft or Author's Accepted Manuscript, AAM) | 816 kB | Adobe PDF | |
A “Density-Based” Algorithm for Cluster Analysis_11311-758438_Guglielmi.pdf (open access; Post-Print, draft or Author's Accepted Manuscript, AAM) | 639.68 kB | Adobe PDF | |
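
The clustering rule stated in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; several choices here are assumptions made only for illustration: the latent parameters are univariate Gaussian (mean, standard deviation) pairs, the distance between densities is the Hellinger distance, and the "share the same cluster" relation is closed transitively by taking connected components (via a small union-find), in the spirit of DBSCAN. The function names `hellinger` and `threshold_partition` and the threshold `eps` are hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two univariate Gaussians given as (mean, sd).

    Assumption: the abstract does not name the distance; Hellinger is used
    here only as one possible choice.
    """
    (m1, s1), (m2, s2) = p, q
    var_sum = s1 * s1 + s2 * s2
    # Bhattacharyya coefficient of the two Gaussian densities.
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(
        -((m1 - m2) ** 2) / (4.0 * var_sum)
    )
    return math.sqrt(max(0.0, 1.0 - bc))


def threshold_partition(params, eps):
    """Coarsen a mixture-induced partition.

    params[i] is the latent (mean, sd) of observation i. Observations i and j
    are linked when the distance between their densities is below eps, and
    clusters are the connected components of the resulting graph.
    Returns one integer label per observation, numbered by first appearance.
    """
    n = len(params)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if hellinger(params[i], params[j]) < eps:
                parent[find(i)] = find(j)

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Hypothetical latent parameters: observations 0 and 1 share the same
# mixture component, so the mixture itself induces four clusters.
params = [(0.0, 1.0), (0.0, 1.0), (0.3, 1.0), (5.0, 1.0), (5.2, 1.0)]
coarse = threshold_partition(params, eps=0.2)
fine = threshold_partition(params, eps=0.01)
```

With these made-up values, a very small threshold reproduces the partition induced by the mixture (four clusters), while `eps=0.2` merges nearby components into two clusters, which is the coarsening effect the abstract describes. How the threshold should be chosen is the subject of the article and is not reproduced here.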
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.