Precision-time tradeoffs

A paradigm for processing statistical queries on databases

Jaideep Srivastava, Doron Rotem

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Conventional query processing techniques are aimed at queries that access small amounts of data and require every data item for the answer. When the database is used for statistical analysis as well as operational purposes, some types of queries may need a large part of the database to compute the answer. This can lead to a data-access bottleneck caused by the excessive number of disk accesses needed to bring the data into primary memory. One example is the computation of statistical parameters, such as count, average, median, and standard deviation, which are useful for statistical analysis of the database. Another example that faces this bottleneck is verifying the truth of a set of predicates (goals) against the current database state, for the purposes of intelligent decision making. A solution to this problem is to maintain precomputed information about the database in a view or snapshot; statistical queries can then be processed against the view rather than the base database. A crucial issue is that the precision of the precomputed information in the view deteriorates over time because of the dynamic nature of the underlying database. The answers provided are therefore approximate, which is acceptable in many circumstances, especially when the error is bounded. The tradeoff is that query processing is made faster at the expense of precision in the answer. The concept of precision in the context of database queries is formalized, a data model incorporating it is developed, and algorithms are designed to maintain materialized views of data to specified degrees of precision.
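The abstract's core idea — answer statistical queries from a cached view while an error bound still holds, and refresh only when it may be violated — can be illustrated with a minimal sketch. This is a hypothetical example of the general technique, not the paper's actual data model or algorithms; the class name, the AVG aggregate, and the drift-tracking rule are all assumptions made for illustration.

```python
# Hypothetical sketch (not the paper's algorithm): a materialized
# AVG view that answers queries approximately, refreshing only when
# accumulated worst-case drift could exceed a chosen error bound.

class ApproxAverageView:
    def __init__(self, base, max_error):
        self.base = base            # underlying table: a list of numbers
        self.max_error = max_error  # precision guarantee for answers
        self._refresh()

    def _refresh(self):
        # Full scan of the base data: the expensive step being amortized.
        self.cached_avg = sum(self.base) / len(self.base)
        self.drift_bound = 0.0      # worst-case error since last refresh

    def update(self, index, new_value):
        old = self.base[index]
        self.base[index] = new_value
        # Changing one of n values by delta moves the true average by
        # exactly |delta| / n, so accumulate that as the drift bound.
        self.drift_bound += abs(new_value - old) / len(self.base)

    def average(self):
        # Answer from the view while the precision guarantee holds;
        # otherwise pay for a refresh before answering.
        if self.drift_bound > self.max_error:
            self._refresh()
        return self.cached_avg
```

A query answered from the cache is O(1) instead of a full scan, and its error is guaranteed to be at most `max_error` — the precision-time tradeoff in miniature: a looser bound means fewer refreshes and faster, less precise answers.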

Original language: English
Title of host publication: Statistical and Scientific Database Management - 4th International Working Conference, SSDBM, Proceedings
Publisher: Springer Verlag
Pages: 226-245
Number of pages: 20
ISBN (Print): 9783540505754
DOI: 10.1007/BFb0027516
Publication status: Published - 1 Jan 1989
Externally published: Yes
Event: 4th International Working Conference on Statistical and Scientific Database Management, SSDBM 1988 - Rome, Italy
Duration: 21 Jun 1988 - 23 Jun 1988

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 339 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349



ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Srivastava, J., & Rotem, D. (1989). Precision-time tradeoffs: A paradigm for processing statistical queries on databases. In Statistical and Scientific Database Management - 4th International Working Conference, SSDBM, Proceedings (pp. 226-245). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 339 LNCS). Springer Verlag. https://doi.org/10.1007/BFb0027516

