Accessing scientific data: Simpler is better

Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi, Flip Korn

Research output: Contribution to journalArticle

5 Citations (Scopus)

Abstract

A variety of index structures has been proposed for supporting fast access and summarization of large multidimensional data sets. Some of these indices are fairly involved, hence few are used in practice. In this paper we examine how to reduce the I/O cost by taking full advantage of recent trends in hard disk development which favor reading large chunks of consecutive disk blocks over seeking and searching. We present the Multiresolution File Scan (MFS) approach which is based on a surprisingly simple and flexible data structure which outperforms sophisticated multidimensional indices, even if they are bulk-loaded and hence optimized for query processing. Our approach also has the advantage that it can incorporate a priori knowledge about the query workload. It readily supports summarization using distributive (e.g., count, sum, max, min) and algebraic (e.g., avg) aggregate operators.

Original languageEnglish
Pages (from-to)214-232
Number of pages19
JournalLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume2750
Publication statusPublished - 1 Dec 2003

    Fingerprint

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this