Applying the golden rule of sampling for query estimation

Y. L. Wu, D. Agrawal, A. El Abbadi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

19 Citations (Scopus)

Abstract

Query size estimation is crucial for many database system components. In particular, query optimizers need efficient and accurate query size estimation when deciding among alternative query plans. In this paper, we propose a novel sampling technique for estimating range queries, based on the golden rule of sampling introduced by von Neumann in 1947. The proposed technique randomly samples the frequency domain using the cumulative frequency distribution and yields good estimates without any a priori knowledge of the actual underlying distribution of spatial objects. We show experimentally that the proposed sampling technique gives smaller approximation error than the Min-Skew histogram-based and wavelet-based approaches for both synthetic and real datasets. Moreover, the proposed technique can be easily extended to higher-dimensional datasets.
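
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: drawing a sample by picking uniform random points along a cumulative frequency distribution (inverse-CDF sampling) and scaling the sample hit rate to estimate a one-dimensional range count. The function names (`build_cdf`, `inverse_cdf_sample`, `estimate_range_count`), the bucket layout, and the sample size are all assumptions made for illustration and are not taken from the paper.

```python
import bisect
import random
from itertools import accumulate


def build_cdf(freqs):
    """Return running (cumulative) counts for a list of per-value frequencies."""
    return list(accumulate(freqs))


def inverse_cdf_sample(values, cum, m, rng):
    """Draw m values by mapping uniform points in [0, total) through the CDF.

    Hypothetical helper: values[i] is selected with probability
    proportional to its frequency.
    """
    total = cum[-1]
    sample = []
    for _ in range(m):
        u = rng.random() * total
        # First position whose cumulative count exceeds u.
        idx = min(bisect.bisect_right(cum, u), len(values) - 1)
        sample.append(values[idx])
    return sample


def estimate_range_count(sample, total, lo, hi):
    """Scale the fraction of sampled values in [lo, hi] up to the full data size."""
    hits = sum(1 for x in sample if lo <= x <= hi)
    return total * hits / len(sample)


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so runs are repeatable
    values = list(range(100))        # assumed attribute domain 0..99
    freqs = [10] * 100               # assumed uniform frequencies, 1000 rows
    cum = build_cdf(freqs)
    sample = inverse_cdf_sample(values, cum, 5000, rng)
    print(estimate_range_count(sample, cum[-1], 0, 49))
```

Under these assumptions the estimate for the range [0, 49] should land near the true count of 500, with the error shrinking as the sample size grows. The paper's actual technique, including its handling of spatial and higher-dimensional data, may differ substantially from this sketch.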

Original language: English
Title of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data
Editors: T. Sellis, S. Mehrotra
Pages: 449-460
Number of pages: 12
Publication status: Published - 2001
Externally published: Yes
Event: 2001 ACM SIGMOD International Conference on Management of Data - Santa Barbara, CA, United States
Duration: 21 May 2001 to 24 May 2001

Other

Other: 2001 ACM SIGMOD International Conference on Management of Data
Country: United States
City: Santa Barbara, CA
Period: 21/5/01 to 24/5/01

Keywords

  • Cumulative frequency distribution
  • Query estimation
  • Random sampling
  • Range query

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Wu, Y. L., Agrawal, D., & El Abbadi, A. (2001). Applying the golden rule of sampling for query estimation. In T. Sellis, & S. Mehrotra (Eds.), Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 449-460)