Self-tuning Histograms

Building Histograms Without Looking at Data

Ashraf Aboulnaga, Surajit Chaudhuri

Research output: Contribution to journal, Article

145 Citations (Scopus)

Abstract

In this paper, we introduce self-tuning histograms. Although similar in structure to traditional histograms, these histograms infer data distributions not by examining the data or a sample thereof, but by using feedback from the query execution engine about the actual selectivity of range selection operators to progressively refine the histogram. Since the cost of building and maintaining self-tuning histograms is independent of the data size, they provide a remarkably inexpensive way to construct histograms for large data sets with little up-front cost. Self-tuning histograms are particularly attractive as an alternative to traditional multi-dimensional histograms, which capture dependencies between attributes but are prohibitively expensive to build and maintain. We describe the techniques for initializing and refining self-tuning histograms. Our experimental results show that self-tuning histograms provide a low-cost alternative to traditional multi-dimensional histograms with little loss of accuracy for data distributions with low to moderate skew.
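The feedback loop described in the abstract can be illustrated with a minimal one-dimensional sketch. The abstract does not give the update rule, so everything below is an assumption made for illustration: the class name `SelfTuningHistogram`, the equi-width buckets, the uniform initialization from a known row count, the `damping` parameter, and the choice to spread the estimation error over the overlapping buckets in proportion to their current contribution to the estimate. It is not presented as the authors' algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class SelfTuningHistogram:
    """Equi-width 1-D histogram refined only from query feedback (illustrative)."""
    lo: float
    hi: float
    n_buckets: int
    total_rows: float       # assumed known, e.g. from catalog metadata
    damping: float = 0.5    # hypothetical learning rate in (0, 1]
    freqs: list = field(init=False)

    def __post_init__(self):
        # Initialize under a uniformity assumption: no data is scanned.
        self.freqs = [self.total_rows / self.n_buckets] * self.n_buckets

    def _overlap_fractions(self, q_lo, q_hi):
        """Fraction of each bucket's width covered by the range [q_lo, q_hi)."""
        width = (self.hi - self.lo) / self.n_buckets
        fractions = []
        for i in range(self.n_buckets):
            b_lo = self.lo + i * width
            b_hi = b_lo + width
            covered = max(0.0, min(q_hi, b_hi) - max(q_lo, b_lo))
            fractions.append(covered / width)
        return fractions

    def estimate(self, q_lo, q_hi):
        """Estimated number of rows selected by the range predicate."""
        fractions = self._overlap_fractions(q_lo, q_hi)
        return sum(f * frac for f, frac in zip(self.freqs, fractions))

    def refine(self, q_lo, q_hi, actual_rows):
        """Adjust bucket frequencies using the row count observed at execution."""
        fractions = self._overlap_fractions(q_lo, q_hi)
        weights = [f * frac for f, frac in zip(self.freqs, fractions)]
        estimated = sum(weights)
        touched = sum(1 for frac in fractions if frac > 0)
        if touched == 0:
            return  # the query range lies outside the histogram domain
        error = self.damping * (actual_rows - estimated)
        for i, frac in enumerate(fractions):
            if frac == 0:
                continue
            # Share the error in proportion to each bucket's contribution;
            # fall back to an even split if the estimate was zero.
            share = weights[i] / estimated if estimated > 0 else 1 / touched
            self.freqs[i] = max(0.0, self.freqs[i] + error * share)
```

For example, with 1000 rows assumed uniform over [0, 100) in 10 buckets, the initial estimate for the range [0, 50) is 500 rows. If the execution engine reports 800 actual rows, a single `refine` call with `damping=0.5` moves the estimate for that range to 650. The cost of each update depends only on the number of buckets, not on the size of the data.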

Original language: English
Pages (from-to): 181-192
Number of pages: 12
Journal: SIGMOD Record (ACM Special Interest Group on Management of Data)
Volume: 28
Issue number: 2
Publication status: Published - 1 Jun 1999
Externally published: Yes

Fingerprint

Tuning
Costs
Refining
Engines
Feedback

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Information Systems
  • Software

Cite this

Self-tuning Histograms : Building Histograms Without Looking at Data. / Aboulnaga, Ashraf; Chaudhuri, Surajit.

In: SIGMOD Record (ACM Special Interest Group on Management of Data), Vol. 28, No. 2, 01.06.1999, p. 181-192.

@article{f5c0e9a4c13944d8b40e26bd893fe7f4,
title = "Self-tuning Histograms: Building Histograms Without Looking at Data",
author = "Ashraf Aboulnaga and Surajit Chaudhuri",
year = "1999",
month = "6",
day = "1",
language = "English",
volume = "28",
pages = "181--192",
journal = "SIGMOD Record",
issn = "0163-5808",
publisher = "Association for Computing Machinery (ACM)",
number = "2",

}

TY - JOUR

T1 - Self-tuning Histograms

T2 - Building Histograms Without Looking at Data

AU - Aboulnaga, Ashraf

AU - Chaudhuri, Surajit

PY - 1999/6/1

Y1 - 1999/6/1

UR - http://www.scopus.com/inward/record.url?scp=0005334432&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0005334432&partnerID=8YFLogxK

M3 - Article

VL - 28

SP - 181

EP - 192

JO - SIGMOD Record

JF - SIGMOD Record

SN - 0163-5808

IS - 2

ER -