Efficient estimation of dynamic density functions with an application to outlier detection

Abdulhakim Qahtan, Xiangliang Zhang, Suojin Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

In this paper, we propose a new method for estimating the dynamic density over data streams, named KDE-Track because it is based on the conventional and widely used Kernel Density Estimation (KDE) method. KDE-Track efficiently estimates the density with linear complexity by interpolating on a kernel model that is incrementally updated as streaming data arrive. Both theoretical analysis and experimental validation show that KDE-Track outperforms traditional KDE and a baseline method, Cluster-Kernels, in estimation accuracy on complex density structures in data streams, computing time, and memory usage. KDE-Track is also shown to capture the dynamic density of synthetic and real-world data in a timely manner. In addition, KDE-Track is used to accurately detect outliers in sensor data and is compared with two existing methods developed for detecting outliers and cleaning sensor data.
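The general idea described in the abstract (density values kept at fixed points, updated incrementally per sample, and queried by interpolation) can be illustrated with a minimal sketch. This is not the authors' KDE-Track algorithm, whose details are not given in this record; the class name `GridKDE`, the fixed uniform grid, the Gaussian kernel, the `min_weight` forgetting parameter, and the outlier `threshold` are all assumptions chosen for illustration.

```python
import math


class GridKDE:
    """Illustrative sketch (hypothetical, not KDE-Track itself): a kernel
    density estimate stored at fixed grid points, updated once per incoming
    sample, and queried by linear interpolation."""

    def __init__(self, lo, hi, num_points=101, bandwidth=0.3, min_weight=0.0):
        self.lo = lo
        self.step = (hi - lo) / (num_points - 1)
        self.grid = [lo + i * self.step for i in range(num_points)]
        self.dens = [0.0] * num_points   # density estimate at each grid point
        self.h = bandwidth
        self.min_weight = min_weight     # > 0 lets old data fade (assumption)
        self.n = 0

    def _kernel(self, u):
        # Standard Gaussian kernel.
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    def update(self, x):
        # Running average over samples; with min_weight == 0 this equals the
        # ordinary KDE evaluated at the grid points.
        self.n += 1
        w = max(1.0 / self.n, self.min_weight)
        for i, g in enumerate(self.grid):
            k = self._kernel((g - x) / self.h) / self.h
            self.dens[i] = (1.0 - w) * self.dens[i] + w * k

    def density(self, x):
        # Queries outside the grid are clamped to the nearest edge value.
        if x <= self.grid[0]:
            return self.dens[0]
        if x >= self.grid[-1]:
            return self.dens[-1]
        # Linear interpolation between the two neighbouring grid points.
        i = min(int((x - self.lo) / self.step), len(self.grid) - 2)
        t = (x - self.grid[i]) / self.step
        return (1.0 - t) * self.dens[i] + t * self.dens[i + 1]

    def is_outlier(self, x, threshold=0.01):
        # Flag points that fall in a low-density region.
        return self.density(x) < threshold


if __name__ == "__main__":
    est = GridKDE(-5.0, 5.0)
    for value in [-0.5, 0.0, 0.5, 0.2, -0.2]:
        est.update(value)
    print(est.density(0.0), est.is_outlier(4.0))
```

In this sketch each update costs a fixed amount of work (one pass over the grid) and each query is constant time, so the total cost grows linearly with the number of streamed samples. Setting `min_weight` above zero is one simple way to let the estimate follow a density that changes over time.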

Original language: English
Title of host publication: CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management
Pages: 2159-2163
Number of pages: 5
DOIs: https://doi.org/10.1145/2396761.2398593
Publication status: Published - 19 Dec 2012
Externally published: Yes
Event: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012 - Maui, HI
Duration: 29 Oct 2012 to 2 Nov 2012

Other

Other: 21st ACM International Conference on Information and Knowledge Management, CIKM 2012
City: Maui, HI
Period: 29/10/12 to 2/11/12

Fingerprint

Probability density function
Sensors
Cleaning
Interpolation
Data storage equipment

Keywords

  • data streams
  • density estimation
  • interpolation
  • outlier detection

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Qahtan, A., Zhang, X., & Wang, S. (2012). Efficient estimation of dynamic density functions with an application to outlier detection. In CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management (pp. 2159-2163) https://doi.org/10.1145/2396761.2398593

Efficient estimation of dynamic density functions with an application to outlier detection. / Qahtan, Abdulhakim; Zhang, Xiangliang; Wang, Suojin.

CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management. 2012. p. 2159-2163.


Qahtan, A, Zhang, X & Wang, S 2012, Efficient estimation of dynamic density functions with an application to outlier detection. in CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management. pp. 2159-2163, 21st ACM International Conference on Information and Knowledge Management, CIKM 2012, Maui, HI, 29/10/12. https://doi.org/10.1145/2396761.2398593
Qahtan A, Zhang X, Wang S. Efficient estimation of dynamic density functions with an application to outlier detection. In CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management. 2012. p. 2159-2163 https://doi.org/10.1145/2396761.2398593
Qahtan, Abdulhakim ; Zhang, Xiangliang ; Wang, Suojin. / Efficient estimation of dynamic density functions with an application to outlier detection. CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management. 2012. pp. 2159-2163
@inproceedings{12eaab04b5324fe4bab459f9e88a7704,
title = "Efficient estimation of dynamic density functions with an application to outlier detection",
abstract = "In this paper, we propose a new method to estimate the dynamic density over data streams, named KDE-Track as it is based on a conventional and widely used Kernel Density Estimation (KDE) method. KDE-Track can efficiently estimate the density with linear complexity by using interpolation on a kernel model, which is incrementally updated upon the arrival of streaming data. Both theoretical analysis and experimental validation show that KDE-Track outperforms traditional KDE and a baseline method Cluster-Kernels on estimation accuracy of the complex density structures in data streams, computing time and memory usage. KDE-Track is also demonstrated on timely catching the dynamic density of synthetic and real-world data. In addition, KDE-Track is used to accurately detect outliers in sensor data and compared with two existing methods developed for detecting outliers and cleaning sensor data.",
keywords = "data streams, density estimation, interpolation, outlier detection",
author = "Abdulhakim Qahtan and Xiangliang Zhang and Suojin Wang",
year = "2012",
month = "12",
day = "19",
doi = "10.1145/2396761.2398593",
language = "English",
isbn = "9781450311564",
pages = "2159--2163",
booktitle = "CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management",

}

TY - GEN

T1 - Efficient estimation of dynamic density functions with an application to outlier detection

AU - Qahtan, Abdulhakim

AU - Zhang, Xiangliang

AU - Wang, Suojin

PY - 2012/12/19

Y1 - 2012/12/19

N2 - In this paper, we propose a new method to estimate the dynamic density over data streams, named KDE-Track as it is based on a conventional and widely used Kernel Density Estimation (KDE) method. KDE-Track can efficiently estimate the density with linear complexity by using interpolation on a kernel model, which is incrementally updated upon the arrival of streaming data. Both theoretical analysis and experimental validation show that KDE-Track outperforms traditional KDE and a baseline method Cluster-Kernels on estimation accuracy of the complex density structures in data streams, computing time and memory usage. KDE-Track is also demonstrated on timely catching the dynamic density of synthetic and real-world data. In addition, KDE-Track is used to accurately detect outliers in sensor data and compared with two existing methods developed for detecting outliers and cleaning sensor data.

AB - In this paper, we propose a new method to estimate the dynamic density over data streams, named KDE-Track as it is based on a conventional and widely used Kernel Density Estimation (KDE) method. KDE-Track can efficiently estimate the density with linear complexity by using interpolation on a kernel model, which is incrementally updated upon the arrival of streaming data. Both theoretical analysis and experimental validation show that KDE-Track outperforms traditional KDE and a baseline method Cluster-Kernels on estimation accuracy of the complex density structures in data streams, computing time and memory usage. KDE-Track is also demonstrated on timely catching the dynamic density of synthetic and real-world data. In addition, KDE-Track is used to accurately detect outliers in sensor data and compared with two existing methods developed for detecting outliers and cleaning sensor data.

KW - data streams

KW - density estimation

KW - interpolation

KW - outlier detection

UR - http://www.scopus.com/inward/record.url?scp=84871035102&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84871035102&partnerID=8YFLogxK

U2 - 10.1145/2396761.2398593

DO - 10.1145/2396761.2398593

M3 - Conference contribution

AN - SCOPUS:84871035102

SN - 9781450311564

SP - 2159

EP - 2163

BT - CIKM 2012 - Proceedings of the 21st ACM International Conference on Information and Knowledge Management

ER -