SCRAP: A statistical approach for creating a database query workload based on performance bottlenecks

James Skarie, Biplob K. Debnath, David J. Lilja, Mohamed F. Mokbel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the tremendous growth in stored data, the role of database systems has become more significant than ever before. Standard query workloads, such as the TPC-C and TPC-H benchmark suites, are used to evaluate and tune the functionality and performance of database systems. Running and configuring benchmarks is a time-consuming task. It requires substantial statistical expertise due to the enormous data size and large number of queries in the workload. Subsetting can be used to reduce the number of queries in a workload. An existing workload subsetting technique selected queries based on similarities of the ranks of the queries for low-level characteristics, such as cache miss rates, or based on the execution time required on different computer systems. However, many low-level characteristics are correlated and produce similar behaviors. Also, raw execution time as a metric is too diffuse to capture important performance bottlenecks. Our goal is to select a subset of queries that can reproduce the same bottlenecks in the system as the original workload. In this paper, we propose a statistical approach for creating a database query workload based on performance bottlenecks (SCRAP). Our methodology takes a query workload and a set of system configuration parameters as inputs, and selects a subset of the queries from the workload based on the similarity of performance bottlenecks. Experimental results using the TPC-H benchmark and the PostgreSQL database system show that the reduced workload and the original workload produce similar performance bottlenecks, and that the subset accurately estimates the total execution time.

Original language: English
Title of host publication: Proceedings of the 2007 IEEE International Symposium on Workload Characterization, IISWC
Pages: 183-192
Number of pages: 10
DOIs: https://doi.org/10.1109/IISWC.2007.4362194
Publication status: Published - 1 Dec 2007
Event: 2007 IEEE International Symposium on Workload Characterization, IISWC - Boston, MA, United States
Duration: 27 Sep 2007 to 29 Sep 2007

Publication series

Name: Proceedings of the 2007 IEEE International Symposium on Workload Characterization, IISWC

Other

Other: 2007 IEEE International Symposium on Workload Characterization, IISWC
Country: United States
City: Boston, MA
Period: 27/9/07 to 29/9/07

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Skarie, J., Debnath, B. K., Lilja, D. J., & Mokbel, M. F. (2007). SCRAP: A statistical approach for creating a database query workload based on performance bottlenecks. In Proceedings of the 2007 IEEE International Symposium on Workload Characterization, IISWC (pp. 183-192). [4362194] (Proceedings of the 2007 IEEE International Symposium on Workload Characterization, IISWC). https://doi.org/10.1109/IISWC.2007.4362194