Exploiting sequential access when declustering data over disks and MEMS-based storage

Hailing Yu, Divyakant Agrawal, Amr El Abbadi

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

Because seek time is far larger than transfer time in current disk technology, it is advantageous to perform large I/O using a single sequential access rather than multiple small random accesses. However, prior work on the optimal cost and data placement for processing range queries over two-dimensional datasets does not consider this property. In particular, these techniques ignore sequential data placement when multiple I/O blocks must be retrieved from a single device. In this paper, we reevaluate the optimal cost of range queries when two-dimensional datasets are declustered over multiple devices, and prove that, in general, this new optimal cost is impossible to achieve, because disks cannot provide the two-dimensional sequential access that it requires. We then revisit the existing data allocation schemes and show that none of them achieves the new optimal cost. Fortunately, MEMS-based storage is being developed to reduce I/O cost. We first show that the two-dimensional sequential access requirement cannot be satisfied by simply modeling MEMS-based storage as conventional disks. We then propose a new placement scheme that exploits the physical properties of MEMS-based storage to solve this problem. Our theoretical analysis and experimental results show that the new scheme achieves almost optimal I/O costs.
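
The seek-versus-transfer argument in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's formal cost model: the function names, the timing values (5000 µs seek, 100 µs per-block transfer), and the simplification that a sequential read pays exactly one seek. The helper `busiest_device_blocks` uses the familiar declustering bound that, with n query blocks spread over M devices, some device must serve at least ceil(n / M) blocks.

```python
import math


def random_access_cost(blocks, seek_us, transfer_us):
    """Cost when every block is fetched with its own seek (hypothetical model)."""
    return blocks * (seek_us + transfer_us)


def sequential_access_cost(blocks, seek_us, transfer_us):
    """Cost when all blocks are read back to back after a single seek."""
    if blocks == 0:
        return 0
    return seek_us + blocks * transfer_us


def busiest_device_blocks(query_blocks, devices):
    """Lower bound on blocks served by the most loaded device: ceil(n / M)."""
    return math.ceil(query_blocks / devices)


if __name__ == "__main__":
    # Illustrative timings in microseconds; not taken from the paper.
    SEEK_US, TRANSFER_US = 5000, 100

    # A 4 x 4 range query (16 blocks) declustered over 4 devices.
    per_device = busiest_device_blocks(4 * 4, 4)

    print("blocks on busiest device:", per_device)
    print("random accesses (us):    ",
          random_access_cost(per_device, SEEK_US, TRANSFER_US))
    print("one sequential read (us):",
          sequential_access_cost(per_device, SEEK_US, TRANSFER_US))
```

Under these assumed numbers the per-device cost is dominated by seeks when the blocks are scattered, which is why the placement of a device's blocks matters in addition to how evenly the blocks are spread across devices.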

Original language: English
Pages (from-to): 147-168
Number of pages: 22
Journal: Distributed and Parallel Databases
Volume: 19
Issue number: 2-3
DOI: 10.1007/s10619-006-8485-z
Publication status: Published - 1 May 2006
Externally published: Yes

Fingerprint

Micro-electro-mechanical Systems
MEMS
Costs
Data Placement
Range Query
Data Allocation
Physical property
Placement
Theoretical Analysis
Physical properties
Requirements
Experimental Results
Processing
Modeling

ASJC Scopus subject areas

  • Information Systems
  • Theoretical Computer Science
  • Computational Theory and Mathematics

Cite this

Exploiting sequential access when declustering data over disks and MEMS-based storage. / Yu, Hailing; Agrawal, Divyakant; El Abbadi, Amr.

In: Distributed and Parallel Databases, Vol. 19, No. 2-3, 01.05.2006, p. 147-168.

Yu, Hailing; Agrawal, Divyakant; El Abbadi, Amr. / Exploiting sequential access when declustering data over disks and MEMS-based storage. In: Distributed and Parallel Databases. 2006; Vol. 19, No. 2-3. pp. 147-168.

@article{6dcfefdec8a44bb1b45d4342890aee16,
title = "Exploiting sequential access when declustering data over disks and MEMS-based storage",
author = "Hailing Yu and Divyakant Agrawal and Amr {El Abbadi}",
year = "2006",
month = "5",
day = "1",
doi = "10.1007/s10619-006-8485-z",
language = "English",
volume = "19",
pages = "147--168",
journal = "Distributed and Parallel Databases",
issn = "0926-8782",
publisher = "Springer Netherlands",
number = "2-3",

}

TY - JOUR

T1 - Exploiting sequential access when declustering data over disks and MEMS-based storage

AU - Yu, Hailing

AU - Agrawal, Divyakant

AU - El Abbadi, Amr

PY - 2006/5/1

Y1 - 2006/5/1

UR - http://www.scopus.com/inward/record.url?scp=33744778504&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33744778504&partnerID=8YFLogxK

U2 - 10.1007/s10619-006-8485-z

DO - 10.1007/s10619-006-8485-z

M3 - Article

VL - 19

SP - 147

EP - 168

JO - Distributed and Parallel Databases

JF - Distributed and Parallel Databases

SN - 0926-8782

IS - 2-3

ER -