Robust outlier detection using commute time and eigenspace embedding

Nguyen Lu Dang Khoa, Sanjay Chawla

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

We present a method to find outliers using 'commute distance' computed from a random walk on a graph. Unlike Euclidean distance, the commute distance between two nodes captures both the distance between them and their local neighborhood densities. Indeed, commute distance is the Euclidean distance in the space spanned by the eigenvectors of the graph Laplacian matrix. We show by analysis and experiments that, using this measure, we can capture both global and local outliers effectively with just a distance-based method. Moreover, the method can detect outlying clusters, which traditional methods often fail to capture, and shows higher resistance to noise than local outlier detection methods. Finally, to avoid the O(n^3) direct computation of commute distance, graph component sampling and an eigenspace approximation combined with a pruning technique reduce the time to O(n log n) while preserving the outlier ranking.
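
As an illustration of the direct O(n^3) computation the abstract mentions, the following is a minimal sketch, not the authors' implementation. It assumes the standard identity c(i, j) = V_G (l+_ii + l+_jj - 2 l+_ij), where l+ is the pseudoinverse of the graph Laplacian and V_G is the sum of all node degrees. The function names, the toy graph, and the k-th-nearest-neighbor scoring rule are hypothetical choices made for the example; the sampling, eigenspace approximation, and pruning steps of the paper are not reproduced here.

```python
import numpy as np

def commute_distances(W):
    """All-pairs commute distances for a connected, undirected weighted graph.

    W is a symmetric (n, n) adjacency matrix with non-negative weights.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    volume = degrees.sum()               # V_G: sum of all node degrees
    L = np.diag(degrees) - W             # graph Laplacian
    L_pinv = np.linalg.pinv(L)           # direct O(n^3) pseudoinverse
    d = np.diag(L_pinv)
    C = volume * (d[:, None] + d[None, :] - 2.0 * L_pinv)
    np.fill_diagonal(C, 0.0)
    return np.maximum(C, 0.0)            # clip tiny negative rounding errors

def knn_outlier_scores(C, k=2):
    """Score each node by the commute distance to its k-th nearest neighbor.

    This scoring rule is an assumption for illustration only.
    """
    # After sorting each row, column 0 is the node itself (distance 0).
    return np.sort(C, axis=1)[:, k]

# Hypothetical toy graph: a triangle (nodes 0-2) plus node 3,
# attached to node 2 by a single weak edge.
W = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.1],
    [0.0, 0.0, 0.1, 0.0],
])
C = commute_distances(W)
scores = knn_outlier_scores(C, k=2)
print(scores)
```

In this toy graph the weakly attached node receives the largest score, because a random walk starting from the triangle rarely crosses the weak edge and so takes a long time to reach node 3 and return.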

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 14th Pacific-Asia Conference, PAKDD 2010, Proceedings
Pages: 422-434
Number of pages: 13
Edition: PART 2
DOIs: https://doi.org/10.1007/978-3-642-13672-6_41
Publication status: Published - 1 Dec 2010
Event: 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010 - Hyderabad, India
Duration: 21 Jun 2010 to 24 Jun 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6119 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010
Country: India
City: Hyderabad
Period: 21/6/10 to 24/6/10

Keywords

  • Commute distance
  • Eigenspace embedding
  • Nearest neighbor graph
  • Outlier detection
  • Random walk

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Khoa, N. L. D., & Chawla, S. (2010). Robust outlier detection using commute time and eigenspace embedding. In Advances in Knowledge Discovery and Data Mining - 14th Pacific-Asia Conference, PAKDD 2010, Proceedings (PART 2 ed., pp. 422-434). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6119 LNAI, No. PART 2). https://doi.org/10.1007/978-3-642-13672-6_41