SCHISM: A new approach to interesting subspace mining

Karlton Sequeira, Mohammed Zaki

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

High-dimensional data pose challenges to traditional clustering algorithms because the data are inherently sparse and tend to cluster in different, possibly overlapping subspaces of the full feature space. Finding such subspaces is called subspace mining. We present SCHISM, a new algorithm for mining interesting subspaces using the notions of support and Chernoff-Hoeffding bounds. We use a vertical representation of the dataset and a depth-first search with backtracking to find maximal interesting subspaces. We evaluate the algorithm's effectiveness on a number of high-dimensional synthetic and real datasets.
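The ingredients named in the abstract (support, a Chernoff-Hoeffding style threshold, a vertical layout, and depth-first search with backtracking) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The assumptions are: values are scaled to [0, 1], each dimension is split into `xi` equal-width intervals, points are uniformly distributed under the null hypothesis, and a subspace counts as "interesting" when its support fraction exceeds the expected fraction by the Hoeffding margin sqrt(ln(1/tau) / (2n)). The names `build_vertical`, `threshold`, `mine_maximal`, `xi`, and `tau` are hypothetical and chosen for illustration only.

```python
import math

def build_vertical(data, xi):
    """Vertical layout: map each (dimension, interval) item to the set of
    row ids that fall into it. Values are assumed to lie in [0, 1]."""
    vertical = {}
    for row_id, row in enumerate(data):
        for dim, value in enumerate(row):
            interval = min(int(value * xi), xi - 1)
            vertical.setdefault((dim, interval), set()).add(row_id)
    return vertical

def threshold(p, n, xi, tau):
    """Assumed interestingness threshold for a subspace constraining p
    dimensions: expected fraction under uniformity plus a Hoeffding margin."""
    return (1.0 / xi) ** p + math.sqrt(math.log(1.0 / tau) / (2.0 * n))

def mine_maximal(data, xi, tau):
    """Depth-first search with backtracking over items. Returns the
    interesting subspaces reached by the search that have no interesting
    superset among those reached."""
    n = len(data)
    vertical = build_vertical(data, xi)
    items = sorted(vertical)
    maximal = []

    def dfs(current, rows, start):
        extended = False
        for i in range(start, len(items)):
            item = items[i]
            if any(item[0] == dim for dim, _ in current):
                continue  # at most one interval per dimension
            # Support is computed by intersecting row-id sets.
            new_rows = rows & vertical[item] if current else vertical[item]
            if len(new_rows) / n >= threshold(len(current) + 1, n, xi, tau):
                extended = True
                dfs(current + [item], new_rows, i + 1)
        # Backtrack: record the subspace only if nothing extended it
        # and no already recorded subspace contains it.
        if current and not extended:
            candidate = frozenset(current)
            if not any(candidate <= m for m in maximal):
                maximal.append(candidate)

    dfs([], set(), 0)
    return maximal

# Eight identical points concentrate in one cell of a 2 x 2 grid.
points = [(0.1, 0.1)] * 8
print(mine_maximal(points, xi=2, tau=0.5))
```

Because the search only extends subspaces that are themselves interesting, a sketch like this can miss an interesting subspace whose lower-dimensional projections all fall below the threshold; the pruning strategy of the actual algorithm is described in the full article.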

Original language: English
Pages (from-to): 137-160
Number of pages: 24
Journal: International Journal of Business Intelligence and Data Mining
Volume: 1
Issue number: 2
Publication status: Published - 1 Dec 2005
Externally published: Yes

Keywords

  • Chernoff-Hoeffding bounds
  • Clustering
  • Data mining
  • Interestingness measures
  • Maximal subspaces
  • Subspace mining

ASJC Scopus subject areas

  • Management Information Systems
  • Information Systems and Management
  • Statistics, Probability and Uncertainty
