Robust outlier detection using commute time and eigenspace embedding

Nguyen Lu Dang Khoa, Sanjay Chawla

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

We present a method for finding outliers using 'commute distance', computed from a random walk on a graph. Unlike Euclidean distance, the commute distance between two nodes captures both the distance between them and their local neighborhood densities. Indeed, commute distance is the Euclidean distance in the space spanned by the eigenvectors of the graph Laplacian matrix. We show by analysis and experiments that, using this measure, both global and local outliers can be captured effectively with a purely distance-based method. The method can also detect outlying clusters, which traditional methods often fail to capture, and it shows higher resistance to noise than local outlier detection methods. Finally, to avoid the O(n^3) direct computation of commute distance, graph component sampling and an eigenspace approximation combined with a pruning technique reduce the time to O(n log n) while preserving the outlier ranking.
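The relationship described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it uses the standard identity c(i, j) = V_G (L+_ii + L+_jj - 2 L+_ij), where L+ is the pseudoinverse of the graph Laplacian and V_G is the graph volume, and it computes everything directly in O(n^3) without the sampling, approximation, or pruning steps of the paper. With this definition the commute time equals the squared Euclidean distance between the embedded points. The function names, the toy graph, and the k-th nearest neighbor score are all assumptions made for illustration.

```python
import numpy as np


def commute_distance_matrix(W):
    """Commute times between all node pairs of a connected, undirected graph.

    W is a symmetric, non-negative weight matrix. Direct O(n^3) computation
    through the pseudoinverse of the Laplacian L = D - W.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    laplacian = np.diag(degrees) - W
    l_plus = np.linalg.pinv(laplacian)
    diag = np.diag(l_plus)
    volume = degrees.sum()  # V_G: sum of all node degrees
    return volume * (diag[:, None] + diag[None, :] - 2.0 * l_plus)


def eigenspace_embedding(W):
    """Embed nodes so that squared Euclidean distance equals commute time."""
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    laplacian = np.diag(degrees) - W
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    keep = eigvals > 1e-9 * eigvals.max()  # drop the zero eigenvalue
    scale = np.sqrt(degrees.sum() / eigvals[keep])
    return eigvecs[:, keep] * scale  # row i holds the coordinates of node i


def knn_outlier_scores(dist, k=1):
    """Score each node by the distance to its k-th nearest neighbor."""
    sorted_dist = np.sort(dist, axis=1)  # column 0 is the self-distance
    return sorted_dist[:, k]


# Hypothetical toy graph: nodes 0-3 form a clique with unit weights,
# node 4 hangs off node 0 through a single weak edge.
W = np.ones((5, 5)) - np.eye(5)
W[4, :] = 0.0
W[:, 4] = 0.0
W[0, 4] = W[4, 0] = 0.1

C = commute_distance_matrix(W)
Y = eigenspace_embedding(W)
squared_euclidean = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
scores = knn_outlier_scores(C, k=1)
most_outlying = int(np.argmax(scores))  # the weakly attached node 4
```

Because node 4 is reachable only through a low-weight edge, a random walk takes a long time to commute between it and the clique, so it receives the largest score even though no density estimate is computed explicitly.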

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 422-434
Number of pages: 13
Volume: 6119 LNAI
Edition: PART 2
DOIs: https://doi.org/10.1007/978-3-642-13672-6_41
Publication status: Published - 2010
Externally published: Yes
Event: 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010 - Hyderabad
Duration: 21 Jun 2010 - 24 Jun 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6119 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2010
City: Hyderabad
Period: 21/6/10 - 24/6/10

Keywords

  • Commute distance
  • Eigenspace embedding
  • Nearest neighbor graph
  • Outlier detection
  • Random walk

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Khoa, N. L. D., & Chawla, S. (2010). Robust outlier detection using commute time and eigenspace embedding. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 2 ed., Vol. 6119 LNAI, pp. 422-434). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6119 LNAI, No. PART 2). https://doi.org/10.1007/978-3-642-13672-6_41

@inproceedings{999e524931954e42be51f4795702d7b8,
title = "Robust outlier detection using commute time and eigenspace embedding",
abstract = "We present a method for finding outliers using 'commute distance', computed from a random walk on a graph. Unlike Euclidean distance, the commute distance between two nodes captures both the distance between them and their local neighborhood densities. Indeed, commute distance is the Euclidean distance in the space spanned by the eigenvectors of the graph Laplacian matrix. We show by analysis and experiments that, using this measure, both global and local outliers can be captured effectively with a purely distance-based method. The method can also detect outlying clusters, which traditional methods often fail to capture, and it shows higher resistance to noise than local outlier detection methods. Finally, to avoid the O(n^3) direct computation of commute distance, graph component sampling and an eigenspace approximation combined with a pruning technique reduce the time to O(n log n) while preserving the outlier ranking.",
keywords = "Commute distance, Eigenspace embedding, Nearest neighbor graph, Outlier detection, Random walk",
author = "Khoa, {Nguyen Lu Dang} and Sanjay Chawla",
year = "2010",
doi = "10.1007/978-3-642-13672-6_41",
language = "English",
isbn = "3642136710",
volume = "6119 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 2",
pages = "422--434",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 2",

}

TY - GEN

T1 - Robust outlier detection using commute time and eigenspace embedding

AU - Khoa, Nguyen Lu Dang

AU - Chawla, Sanjay

PY - 2010

Y1 - 2010

N2 - We present a method for finding outliers using 'commute distance', computed from a random walk on a graph. Unlike Euclidean distance, the commute distance between two nodes captures both the distance between them and their local neighborhood densities. Indeed, commute distance is the Euclidean distance in the space spanned by the eigenvectors of the graph Laplacian matrix. We show by analysis and experiments that, using this measure, both global and local outliers can be captured effectively with a purely distance-based method. The method can also detect outlying clusters, which traditional methods often fail to capture, and it shows higher resistance to noise than local outlier detection methods. Finally, to avoid the O(n^3) direct computation of commute distance, graph component sampling and an eigenspace approximation combined with a pruning technique reduce the time to O(n log n) while preserving the outlier ranking.

KW - Commute distance

KW - Eigenspace embedding

KW - Nearest neighbor graph

KW - Outlier detection

KW - Random walk

UR - http://www.scopus.com/inward/record.url?scp=79956324411&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79956324411&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-13672-6_41

DO - 10.1007/978-3-642-13672-6_41

M3 - Conference contribution

SN - 3642136710

SN - 9783642136719

VL - 6119 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 422

EP - 434

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -