Model-based clustering with noise: Bayesian inference and estimation

H. Bensmail, J. J. Meulman

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Bensmail, Celeux, Raftery, and Robert (1997) introduced a new approach to cluster analysis based on geometric modeling of the within-group covariance in a mixture of multivariate normal distributions, using a fully Bayesian framework. This is a model-based methodology built around the structure of the covariance matrix. Similar structures had previously been used, with a maximum likelihood approach, by Banfield and Raftery (1993) for clustering data, where some parameters of the covariance matrix structure were restricted to be known. In the same framework, Dasgupta and Raftery (1998) used the same reparameterization to detect features in a spatial point process, again using a maximum likelihood approach. These approaches work well but have two limitations: not all covariance structures were considered, and some parameters of the covariance structures were fixed. This paper proposes a way of overcoming these limitations. It generalizes the model used in the previous approaches by introducing a more comprehensive portfolio of covariance matrix structures. Further, it proposes a Bayesian solution for clustering problems in the presence of noise. The performance of the proposed method is first studied by simulation; the procedure is then applied to data on species of butterflies and on diabetes patients.
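
The reparameterization referred to in the abstract is usually written as the eigenvalue decomposition Sigma_k = lambda_k D_k A_k D_k^T, where lambda_k controls the volume of cluster k, D_k its orientation, and A_k its shape. The following is a minimal sketch of that decomposition for a single covariance matrix, not the paper's estimation procedure. It assumes NumPy and one common normalization (det(A) = 1); the function name `decompose_covariance` and the example matrix are illustrative only.

```python
import numpy as np


def decompose_covariance(sigma):
    """Split a covariance matrix into volume, orientation, and shape.

    Returns (lam, D, A) such that sigma == lam * D @ A @ D.T,
    with A diagonal and det(A) == 1 (an assumed normalization).
    """
    eigvals, D = np.linalg.eigh(sigma)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, D = eigvals[order], D[:, order]
    p = sigma.shape[0]
    lam = np.prod(eigvals) ** (1.0 / p)      # volume: |sigma|^(1/p)
    A = np.diag(eigvals / lam)               # shape: scaled eigenvalues
    return lam, D, A


# Illustrative 2x2 covariance matrix (not taken from the paper's data).
sigma = np.array([[4.0, 1.0], [1.0, 2.0]])
lam, D, A = decompose_covariance(sigma)
reconstructed = lam * D @ A @ D.T
```

Constraining lambda_k, D_k, or A_k to be equal across clusters, or leaving them free, yields the different covariance structures that a model of this kind can compare.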

Original language: English
Pages (from-to): 49-76
Number of pages: 28
Journal: Journal of Classification
Volume: 20
Issue number: 1
Publication status: Published - 1 Jul 2003

Keywords

  • Bayes factor
  • Canonical discriminant analysis
  • Eigenvalue decomposition
  • Gaussian mixture
  • Gibbs sampler
  • Markov chain Monte Carlo

ASJC Scopus subject areas

  • Mathematics (miscellaneous)
  • Psychology (miscellaneous)
  • Statistics, Probability and Uncertainty
  • Library and Information Sciences
