A validity index for outlier detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Defining a boundary between inliers and outliers is a major challenge in unsupervised outlier detection. In the absence of labeled data, the true outliers set cannot be evaluated. This lays the burden on both the choice of an efficient outlier detection criterion, and parameter selection. While numerous unsupervised outlier detection criteria, with different parameters, have been proposed, an unsupervised evaluation of outliers is still missing. This work introduces a theoretical basis, and proposes a validity index, to evaluate the quality of outliers. This is not a trivial problem when nothing is known about the structure and density of the data. The proposed index considers the outlierness quality, the deviation between characteristics of outliers and inliers, and the data distortion. Low and high dimensional data sets are used to evaluate the proposed index.

Original language: English
Title of host publication: Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10
Pages: 325-329
Number of pages: 5
DOIs: https://doi.org/10.1109/ISDA.2010.5687245
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10 - Cairo, Egypt
Duration: 29 Nov 2010 to 1 Dec 2010

Other

Other: 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10
Country: Egypt
City: Cairo
Period: 29/11/10 to 1/12/10

Keywords

  • Outlier analysis
  • Validity index

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Hardware and Architecture

Cite this

Yousri, N. (2010). A validity index for outlier detection. In Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10 (pp. 325-329). [5687245] https://doi.org/10.1109/ISDA.2010.5687245

A validity index for outlier detection. / Yousri, Noha.

Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10. 2010. p. 325-329 5687245.

Yousri, N 2010, A validity index for outlier detection. in Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10., 5687245, pp. 325-329, 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10, Cairo, Egypt, 29/11/10. https://doi.org/10.1109/ISDA.2010.5687245
Yousri N. A validity index for outlier detection. In Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10. 2010. p. 325-329. 5687245 https://doi.org/10.1109/ISDA.2010.5687245
Yousri, Noha. / A validity index for outlier detection. Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10. 2010. pp. 325-329
@inproceedings{f646fc1eb67f4ad99dd57b3a9a3a7d5a,
title = "A validity index for outlier detection",
abstract = "Defining a boundary between inliers and outliers is a major challenge in unsupervised outlier detection. In the absence of labeled data, the true outliers set cannot be evaluated. This lays the burden on both the choice of an efficient outlier detection criterion, and parameter selection. While numerous unsupervised outlier detection criteria, with different parameters, have been proposed, an unsupervised evaluation of outliers is still missing. This work introduces a theoretical basis, and proposes a validity index, to evaluate the quality of outliers. This is not a trivial problem when nothing is known about the structure and density of the data. The proposed index considers the outlierness quality, the deviation between characteristics of outliers and inliers, and the data distortion. Low and high dimensional data sets are used to evaluate the proposed index.",
keywords = "Outlier analysis, Validity index",
author = "Noha Yousri",
year = "2010",
month = "12",
day = "1",
doi = "10.1109/ISDA.2010.5687245",
language = "English",
isbn = "9781424481354",
pages = "325--329",
booktitle = "Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10",

}

TY - GEN

T1 - A validity index for outlier detection

AU - Yousri, Noha

PY - 2010/12/1

Y1 - 2010/12/1

N2 - Defining a boundary between inliers and outliers is a major challenge in unsupervised outlier detection. In the absence of labeled data, the true outliers set cannot be evaluated. This lays the burden on both the choice of an efficient outlier detection criterion, and parameter selection. While numerous unsupervised outlier detection criteria, with different parameters, have been proposed, an unsupervised evaluation of outliers is still missing. This work introduces a theoretical basis, and proposes a validity index, to evaluate the quality of outliers. This is not a trivial problem when nothing is known about the structure and density of the data. The proposed index considers the outlierness quality, the deviation between characteristics of outliers and inliers, and the data distortion. Low and high dimensional data sets are used to evaluate the proposed index.

KW - Outlier analysis

KW - Validity index

UR - http://www.scopus.com/inward/record.url?scp=79851479602&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79851479602&partnerID=8YFLogxK

U2 - 10.1109/ISDA.2010.5687245

DO - 10.1109/ISDA.2010.5687245

M3 - Conference contribution

AN - SCOPUS:79851479602

SN - 9781424481354

SP - 325

EP - 329

BT - Proceedings of the 2010 10th International Conference on Intelligent Systems Design and Applications, ISDA'10

ER -