A “Density-Based” Algorithm for Cluster Analysis Using Species Sampling Gaussian Mixture Models

Argiento, Raffaele; Guglielmi, Alessandra
2014-01-01

Abstract

We propose a new model for cluster analysis in a Bayesian nonparametric framework. The model combines two ingredients: species sampling mixture models of Gaussian distributions on the one hand, and a deterministic clustering procedure (DBSCAN) on the other. Two observations from the underlying species sampling mixture model share the same cluster if the distance between the densities corresponding to their latent parameters is smaller than a threshold; this yields a random partition that is coarser than the one induced by the species sampling mixture. Because the procedure depends on the value of the threshold, we suggest a strategy for choosing it. We also discuss implementation and applications of the model and compare it with more standard clustering algorithms. Supplementary materials for the article are available online.
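
The coarsening step described in the abstract can be illustrated with a minimal sketch. Several choices here are assumptions, not taken from the article: the Gaussians are univariate, the distance between densities is the Hellinger distance, and components are merged transitively (DBSCAN-style chaining) with a union-find structure. The names `hellinger_gauss`, `coarsen_partition`, `labels`, `params`, and `eps` are hypothetical.

```python
import math


def hellinger_gauss(m1, s1, m2, s2):
    """Hellinger distance between N(m1, s1^2) and N(m2, s2^2) (closed form)."""
    var_sum = s1 * s1 + s2 * s2
    # Bhattacharyya coefficient of the two univariate Gaussian densities.
    bc = math.sqrt(2.0 * s1 * s2 / var_sum) * math.exp(
        -((m1 - m2) ** 2) / (4.0 * var_sum)
    )
    return math.sqrt(max(0.0, 1.0 - bc))


def coarsen_partition(labels, params, eps):
    """Merge mixture components whose densities are closer than eps.

    labels[i] is the mixture component of observation i (assumed input);
    params[k] = (mean, sd) is the latent parameter of component k.
    Returns coarser cluster labels, numbered in order of first appearance.
    """
    k = len(params)
    parent = list(range(k))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Link every pair of components whose densities are within eps.
    for a in range(k):
        for b in range(a + 1, k):
            if hellinger_gauss(*params[a], *params[b]) < eps:
                parent[find(a)] = find(b)

    # Relabel observations by the root of their component.
    roots = {}
    return [roots.setdefault(find(c), len(roots)) for c in labels]


params = [(0.0, 1.0), (0.3, 1.0), (5.0, 1.0)]  # illustrative values only
labels = [0, 1, 1, 2, 0]
print(coarsen_partition(labels, params, eps=0.2))  # [0, 0, 0, 1, 0]
```

In this toy example the first two components have nearly identical densities and are merged, so the resulting partition is coarser than the mixture's own. With a sufficiently small `eps` nothing is merged and the original partition is returned, which is why the choice of threshold matters.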
Keywords: Bayesian nonparametrics; cluster analysis; DBSCAN algorithm; Dirichlet process; species sampling mixture models
Use this identifier to cite or link to this document: https://hdl.handle.net/11311/758438