KDE-Track

An Efficient Dynamic Density Estimator for Data Streams

Abdulhakim Qahtan, Suojin Wang, Xiangliang Zhang

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

Recent developments in sensors, global positioning system devices, and smart phones have increased the availability of spatiotemporal data streams. Developing models for mining such streams is challenged by the huge amount of data that cannot be stored in the memory, the high arrival speed, and the dynamic changes in the data distribution. Density estimation is an important technique in stream mining for a wide variety of applications. The construction of kernel density estimators is well studied and documented. However, existing techniques are either expensive or inaccurate and unable to capture the changes in the data distribution. In this paper, we present a method called KDE-Track to estimate the density of spatiotemporal data streams. KDE-Track can efficiently estimate the density function with linear time complexity using interpolation on a kernel model, which is incrementally updated upon the arrival of new samples from the stream. We also propose an accurate and efficient method for selecting the bandwidth value for the kernel density estimator, which increases its accuracy significantly. Both theoretical analysis and experimental validation show that KDE-Track outperforms a set of baseline methods on the estimation accuracy and computing time of complex density structures in data streams.
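The idea stated in the abstract, keeping a kernel model that is updated as each sample arrives and answering queries by interpolation, can be illustrated with a small sketch. This is not the authors' KDE-Track implementation. It is a simplified one-dimensional illustration under several assumptions: a fixed uniform grid rather than adaptive resampling, a fixed Gaussian bandwidth rather than the paper's bandwidth selection method, and an equally weighted running mean that does not forget old data. The class name `GridKDE` and all of its parameters are hypothetical.

```python
import numpy as np


class GridKDE:
    """Hypothetical 1-D sketch: Gaussian KDE values kept on a fixed grid.

    Assumptions (not taken from the paper): uniform grid, fixed bandwidth,
    and a running mean that weights every sample equally.
    """

    def __init__(self, lo, hi, num_points=201, bandwidth=0.3):
        self.grid = np.linspace(lo, hi, num_points)  # fixed evaluation points
        self.h = bandwidth                           # fixed kernel bandwidth
        self.density = np.zeros(num_points)          # current estimate on the grid
        self.count = 0                               # samples seen so far

    def update(self, x):
        """Fold one new sample into the estimate; cost is O(num_points)."""
        self.count += 1
        z = (self.grid - x) / self.h
        kernel = np.exp(-0.5 * z ** 2) / (self.h * np.sqrt(2.0 * np.pi))
        # Running mean: f_t = f_{t-1} + (K_h(grid - x_t) - f_{t-1}) / t
        self.density += (kernel - self.density) / self.count

    def query(self, x):
        """Estimate the density at x by linear interpolation between grid points."""
        return np.interp(x, self.grid, self.density)


# Usage: stream samples one at a time, then query anywhere in [lo, hi].
rng = np.random.default_rng(0)
kde = GridKDE(-5.0, 5.0)
for sample in rng.normal(0.0, 1.0, size=5000):
    kde.update(sample)
print(kde.query(0.0))
```

Each update touches only the grid values, so no samples need to be stored, and a query costs a lookup plus one interpolation. Tracking a distribution that changes over time would additionally require a sliding window or a fading factor in place of the plain running mean; that part is left out of this sketch.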

Original language: English
Article number: 7738463
Pages (from-to): 642-655
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 29
Issue number: 3
DOIs: 10.1109/TKDE.2016.2626441
Publication status: Published - 1 Mar 2017
Externally published: Yes

Fingerprint

Probability density function
Global positioning system
Interpolation
Availability
Bandwidth
Data storage equipment
Sensors

Keywords

  • Adaptive resampling
  • Bandwidth selection
  • Data streams
  • Dynamic density estimation
  • Interpolation

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

Cite this

KDE-Track: An Efficient Dynamic Density Estimator for Data Streams. / Qahtan, Abdulhakim; Wang, Suojin; Zhang, Xiangliang.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 29, No. 3, 7738463, 01.03.2017, p. 642-655.

@article{52a3c94a35bc44b69c1da889c71d4311,
title = "KDE-Track: An Efficient Dynamic Density Estimator for Data Streams",
abstract = "Recent developments in sensors, global positioning system devices, and smart phones have increased the availability of spatiotemporal data streams. Developing models for mining such streams is challenged by the huge amount of data that cannot be stored in the memory, the high arrival speed, and the dynamic changes in the data distribution. Density estimation is an important technique in stream mining for a wide variety of applications. The construction of kernel density estimators is well studied and documented. However, existing techniques are either expensive or inaccurate and unable to capture the changes in the data distribution. In this paper, we present a method called KDE-Track to estimate the density of spatiotemporal data streams. KDE-Track can efficiently estimate the density function with linear time complexity using interpolation on a kernel model, which is incrementally updated upon the arrival of new samples from the stream. We also propose an accurate and efficient method for selecting the bandwidth value for the kernel density estimator, which increases its accuracy significantly. Both theoretical analysis and experimental validation show that KDE-Track outperforms a set of baseline methods on the estimation accuracy and computing time of complex density structures in data streams.",
keywords = "Adaptive resampling, Bandwidth selection, Data streams, Dynamic density estimation, Interpolation",
author = "Abdulhakim Qahtan and Suojin Wang and Xiangliang Zhang",
year = "2017",
month = "3",
day = "1",
doi = "10.1109/TKDE.2016.2626441",
language = "English",
volume = "29",
pages = "642--655",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "3",
}

TY - JOUR

T1 - KDE-Track

T2 - An Efficient Dynamic Density Estimator for Data Streams

AU - Qahtan, Abdulhakim

AU - Wang, Suojin

AU - Zhang, Xiangliang

PY - 2017/3/1

Y1 - 2017/3/1

AB - Recent developments in sensors, global positioning system devices, and smart phones have increased the availability of spatiotemporal data streams. Developing models for mining such streams is challenged by the huge amount of data that cannot be stored in the memory, the high arrival speed, and the dynamic changes in the data distribution. Density estimation is an important technique in stream mining for a wide variety of applications. The construction of kernel density estimators is well studied and documented. However, existing techniques are either expensive or inaccurate and unable to capture the changes in the data distribution. In this paper, we present a method called KDE-Track to estimate the density of spatiotemporal data streams. KDE-Track can efficiently estimate the density function with linear time complexity using interpolation on a kernel model, which is incrementally updated upon the arrival of new samples from the stream. We also propose an accurate and efficient method for selecting the bandwidth value for the kernel density estimator, which increases its accuracy significantly. Both theoretical analysis and experimental validation show that KDE-Track outperforms a set of baseline methods on the estimation accuracy and computing time of complex density structures in data streams.

KW - Adaptive resampling

KW - Bandwidth selection

KW - Data streams

KW - Dynamic density estimation

KW - Interpolation

UR - http://www.scopus.com/inward/record.url?scp=85012300533&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012300533&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2016.2626441

DO - 10.1109/TKDE.2016.2626441

M3 - Article

VL - 29

SP - 642

EP - 655

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 3

M1 - 7738463

ER -