RHist: Adaptive summarization over continuous data streams

Lin Qiao, Divyakant Agrawal, Amr El Abbadi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

Maintaining approximate aggregates and summaries over data streams is crucial for handling the OLAP query workload that arises in applications such as network monitoring and telecommunications. Furthermore, since the entire data is not available at all times, the maintenance task must be done incrementally. We show that R(elaxed)Hist(ogram) is an appropriate summarization in the data stream scenario. In order to reduce query estimation errors, we propose adaptive approaches which not only capture the data distribution but also integrate independent query patterns. We introduce a workload decay model to efficiently capture global workload information and ensure that query patterns from the recent past are weighted more than queries that are further in the past. We verify experimentally that our approach successfully adapts to continuously changing workloads as well as data streams.
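The abstract does not say what form the workload decay model takes. As a purely illustrative sketch, and not the authors' method, the idea that recent queries count for more than older ones can be shown with a simple exponential decay over per-bucket query counts. The class name `DecayedWorkload`, the `decay` parameter, and the integer bucket identifiers are all hypothetical.

```python
from collections import defaultdict


class DecayedWorkload:
    """Exponentially decayed count of how often each histogram bucket is queried.

    Assumption: exponential decay is used here only for illustration; the
    abstract does not specify the decay function used by RHist.
    """

    def __init__(self, decay=0.5):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay
        self.weights = defaultdict(float)

    def record_query(self, buckets):
        # Age every existing weight by one step so older queries fade.
        for bucket in self.weights:
            self.weights[bucket] *= self.decay
        # Credit the buckets touched by the newest query at full weight.
        for bucket in buckets:
            self.weights[bucket] += 1.0

    def weight(self, bucket):
        # Buckets that were never queried have weight zero.
        return self.weights.get(bucket, 0.0)


w = DecayedWorkload(decay=0.5)
w.record_query([0])  # oldest query
w.record_query([0])
w.record_query([1])  # most recent query
print(w.weight(0), w.weight(1))
```

In this sketch, bucket 1 ends up with a higher weight than bucket 0 even though bucket 0 was queried twice, because its single query is the most recent one. A summary structure could use such weights to decide where to spend its limited space.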

Original language: English
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Editors: K Kalpakis, N Goharian, D Grossman
Pages: 469-476
Number of pages: 8
Publication status: Published - 1 Dec 2002
Externally published: Yes
Event: Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM 2002) - McLean, VA, United States
Duration: 4 Nov 2002 to 9 Nov 2002

Other

Other: Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM 2002)
Country: United States
City: McLean, VA
Period: 4/11/02 to 9/11/02

Fingerprint

Query
Data streams
Summarization
Workload
Telecommunications
Estimation error
Scenarios
Decay
Network monitoring
Online analytical processing

Keywords

  • Adaptive approximation
  • Data stream
  • Histogram

ASJC Scopus subject areas

  • Business, Management and Accounting (all)

Cite this

Qiao, L., Agrawal, D., & El Abbadi, A. (2002). RHist: Adaptive summarization over continuous data streams. In K. Kalpakis, N. Goharian, & D. Grossman (Eds.), International Conference on Information and Knowledge Management, Proceedings (pp. 469-476)

@inproceedings{124e336a890e46138deb277548627c81,
title = "RHist: Adaptive summarization over continuous data streams",
abstract = "Maintaining approximate aggregates and summaries over data streams is crucial for handling the OLAP query workload that arises in applications such as network monitoring and telecommunications. Furthermore, since the entire data is not available at all times, the maintenance task must be done incrementally. We show that R(elaxed)Hist(ogram) is an appropriate summarization in the data stream scenario. In order to reduce query estimation errors, we propose adaptive approaches which not only capture the data distribution but also integrate independent query patterns. We introduce a workload decay model to efficiently capture global workload information and ensure that query patterns from the recent past are weighted more than queries that are further in the past. We verify experimentally that our approach successfully adapts to continuously changing workloads as well as data streams.",
keywords = "Adaptive approximation, Data stream, Histogram",
author = "Lin Qiao and Divyakant Agrawal and {El Abbadi}, Amr",
year = "2002",
month = "12",
day = "1",
language = "English",
pages = "469--476",
editor = "K Kalpakis and N Goharian and D Grossman",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",

}

TY - GEN

T1 - RHist

T2 - Adaptive summarization over continuous data streams

AU - Qiao, Lin

AU - Agrawal, Divyakant

AU - El Abbadi, Amr

PY - 2002/12/1

Y1 - 2002/12/1

N2 - Maintaining approximate aggregates and summaries over data streams is crucial for handling the OLAP query workload that arises in applications such as network monitoring and telecommunications. Furthermore, since the entire data is not available at all times, the maintenance task must be done incrementally. We show that R(elaxed)Hist(ogram) is an appropriate summarization in the data stream scenario. In order to reduce query estimation errors, we propose adaptive approaches which not only capture the data distribution but also integrate independent query patterns. We introduce a workload decay model to efficiently capture global workload information and ensure that query patterns from the recent past are weighted more than queries that are further in the past. We verify experimentally that our approach successfully adapts to continuously changing workloads as well as data streams.

AB - Maintaining approximate aggregates and summaries over data streams is crucial for handling the OLAP query workload that arises in applications such as network monitoring and telecommunications. Furthermore, since the entire data is not available at all times, the maintenance task must be done incrementally. We show that R(elaxed)Hist(ogram) is an appropriate summarization in the data stream scenario. In order to reduce query estimation errors, we propose adaptive approaches which not only capture the data distribution but also integrate independent query patterns. We introduce a workload decay model to efficiently capture global workload information and ensure that query patterns from the recent past are weighted more than queries that are further in the past. We verify experimentally that our approach successfully adapts to continuously changing workloads as well as data streams.

KW - Adaptive approximation

KW - Data stream

KW - Histogram

UR - http://www.scopus.com/inward/record.url?scp=0037818502&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0037818502&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:0037818502

SP - 469

EP - 476

BT - International Conference on Information and Knowledge Management, Proceedings

A2 - Kalpakis, K

A2 - Goharian, N

A2 - Grossman, D

ER -