A distance-relatedness dynamic model for clustering high dimensional data of arbitrary shapes and densities

Noha Yousri, Mohamed S. Kamel, Mohamed A. Ismail

Research output: Contribution to journal (Article)

32 Citations (Scopus)

Abstract

It is important to find the natural clusters in high dimensional data, where visualization becomes difficult. A natural cluster is a cluster of any shape and density; it should not be restricted to a globular shape, as many algorithms assume, or to a specific user-defined density, as some density-based algorithms require. This work proposes to solve the problem by maximizing the relatedness of distances between patterns in the same cluster, which makes it possible to distinguish clusters based on their distance-based densities. A novel dynamic model is proposed based on new distance-relatedness measures and clustering criteria. The proposed algorithm, "Mitosis", is able to discover clusters of arbitrary shapes and arbitrary densities in high dimensional data. Its computational complexity compares favorably with that of related algorithms. It performs very well on high dimensional data, discovering clusters that cannot be found by known algorithms. It also identifies outliers in the data as a by-product of the cluster formation process. A validity measure that depends on the main clustering criterion is also proposed to tune the algorithm's parameters. The theoretical bases of the algorithm and its steps are presented, and its performance is illustrated by comparing it with related algorithms on several data sets.

Original language: English
Pages (from-to): 1193-1209
Number of pages: 17
Journal: Pattern Recognition
Volume: 42
Issue number: 7
Publication status: Published - 1 Jul 2009
Externally published: Yes

Keywords

  • Arbitrary density clusters
  • Arbitrary shaped clusters
  • Clustering
  • Distance-relatedness
  • Dynamic model
  • High dimensional data

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
