A PCA-based change detection framework for multidimensional data streams

Abdulhakim Qahtan, Basma Alharbi, Suojin Wang, Xiangliang Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

23 Citations (Scopus)

Abstract

Detecting changes in multidimensional data streams is an important and challenging task. In unsupervised change detection, changes are usually detected by comparing the distribution in a current (test) window with that in a reference window. It is thus essential to design divergence metrics and density estimators for comparing the data distributions, but most existing designs target univariate data. Multidimensional data streams make both density estimation and distribution comparison more difficult. In this paper, we propose a framework for detecting changes in multidimensional data streams based on principal component analysis, which projects the data into a lower-dimensional space and thus facilitates density estimation and change-score calculation. The proposed framework has further advantages over existing approaches: it reduces computational cost with an efficient density estimator, improves the change-score calculation by introducing effective divergence metrics, and minimizes the effort required from users to set the threshold parameter by using the Page-Hinkley test. Evaluation results on synthetic and real data show that our framework outperforms two baseline methods in both detection accuracy and computational cost.
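The pipeline outlined in the abstract (PCA projection, per-component density estimation, a divergence-based change score, and a Page-Hinkley test on the scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the histogram density estimator and the maximum symmetric KL divergence are simple stand-ins for the paper's estimator and metrics, and every name and parameter value (`pca_fit`, `change_score`, `PageHinkley`, `var_kept`, `bins`, `delta`, `lam`, the window sizes, the synthetic stream) is an assumption chosen for illustration.

```python
import numpy as np

def pca_fit(X, var_kept=0.99):
    """Fit PCA on the reference window; keep components covering var_kept of the variance."""
    mu = X.mean(axis=0)
    _, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(ratio, var_kept)) + 1, len(s))
    return mu, Vt[:k].T  # projection matrix of shape (d, k)

def hist_density(x, edges, pseudo=0.5):
    """Smoothed histogram estimate (stand-in for a more efficient estimator)."""
    counts, _ = np.histogram(x, bins=edges)
    p = counts + pseudo  # pseudo-count keeps every bin strictly positive
    return p / p.sum()

def change_score(ref, test, mu, W, bins=20):
    """Largest symmetric KL divergence over the retained principal components."""
    R, T = (ref - mu) @ W, (test - mu) @ W
    best = 0.0
    for j in range(W.shape[1]):
        lo = min(R[:, j].min(), T[:, j].min())
        hi = max(R[:, j].max(), T[:, j].max())
        edges = np.linspace(lo, hi, bins + 1)
        p, q = hist_density(R[:, j], edges), hist_density(T[:, j], edges)
        kl_pq = float(np.sum(p * np.log(p / q)))
        kl_qp = float(np.sum(q * np.log(q / p)))
        best = max(best, kl_pq, kl_qp)
    return best

class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a score sequence."""
    def __init__(self, delta=0.05, lam=1.0):
        self.delta, self.lam = delta, lam  # tolerance and alarm threshold (assumed values)
        self.n, self.mean, self.cum, self.min_cum = 0, 0.0, 0.0, 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n      # running mean of the scores
        self.cum += x - self.mean - self.delta     # cumulative deviation
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.lam  # True signals a change

# Synthetic 5-D stream driven by 2 latent factors; the latent mean shifts at t = 1500.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0, 0.0]])

def sample(n, shift):
    z = rng.normal(size=(n, 2)) + shift
    return z @ A + 0.05 * rng.normal(size=(n, 5))

stream = np.vstack([sample(1500, 0.0), sample(1500, 3.0)])
win, step = 500, 50
ref = stream[:win]        # reference window
mu, W = pca_fit(ref)      # projection learned from the reference window only
ph = PageHinkley()
detected_at = None
for end in range(2 * win, len(stream) + 1, step):
    score = change_score(ref, stream[end - win:end], mu, W)
    if ph.update(score):
        detected_at = end
        break
print("components kept:", W.shape[1], "| change flagged at t =", detected_at)
```

In this sketch the reference window stays fixed and the test window slides in steps of 50 points. A fuller treatment would also reset the reference window, the projection, and the Page-Hinkley statistics after each detected change.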

Original language: English
Title of host publication: KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 935-944
Number of pages: 10
Volume: 2015-August
ISBN (Electronic): 9781450336642
DOI: https://doi.org/10.1145/2783258.2783359
Publication status: Published - 10 Aug 2015
Externally published: Yes
Event: 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015 - Sydney, Australia
Duration: 10 Aug 2015 to 13 Aug 2015

Other

Other: 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2015
Country: Australia
City: Sydney
Period: 10/8/15 to 13/8/15

Keywords

  • Change detection
  • Data streams
  • Density estimation
  • Principal component analysis

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Qahtan, A., Alharbi, B., Wang, S., & Zhang, X. (2015). A PCA-based change detection framework for multidimensional data streams. In KDD 2015 - Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (Vol. 2015-August, pp. 935-944). Association for Computing Machinery. https://doi.org/10.1145/2783258.2783359
