Data Placement and Query Processing Based on RPE Parallelisms

Yaxin Yu, Guoren Wang, Ge Yu, Gang Wu, Junan Hu, Nan Tang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

The basic idea behind parallel database systems is to perform operations in parallel in order to reduce response time and improve system throughput. Data placement is a key factor in the performance of such systems. This paper proposes two data partitioning strategies for declustering very large XML documents: the Path Schema based Path Instance Balancing (PSPIB) strategy, in which all path instances with the same path schema in a data tree are declustered evenly over all sites, and the Node Schema based Node Round-Robin (NSNRR) strategy, in which all node objects with the same node schema in a data tree are declustered over all sites in a round-robin way. Two query processing algorithms are proposed for the two partitioning methods: the Parallel Path Merge (PPM) algorithm and the Parallel Pipelining Path Join (PPPJ) algorithm. The paper also gives a performance analysis and evaluation of the two data placement strategies and their corresponding query processing algorithms.
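The two declustering ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the function names, the representation of a path instance or node object as a `(schema, id)` pair, and the reading of "evenly" as contiguous near-equal chunks per schema are all assumptions made only for illustration.

```python
from collections import defaultdict


def nsnrr_partition(nodes, num_sites):
    """Hypothetical NSNRR-style placement.

    nodes: iterable of (node_schema, node_id) pairs (assumed representation).
    Node objects sharing a node schema are dealt to sites in round-robin order.
    """
    sites = [[] for _ in range(num_sites)]
    next_site = defaultdict(int)  # per-schema round-robin cursor
    for schema, node_id in nodes:
        target = next_site[schema]
        sites[target].append((schema, node_id))
        next_site[schema] = (target + 1) % num_sites
    return sites


def pspib_partition(path_instances, num_sites):
    """Hypothetical PSPIB-style placement.

    path_instances: iterable of (path_schema, instance_id) pairs (assumed).
    Instances sharing a path schema are split into contiguous chunks whose
    sizes differ by at most one, one chunk per site.
    """
    groups = defaultdict(list)
    for schema, instance_id in path_instances:
        groups[schema].append(instance_id)

    sites = [[] for _ in range(num_sites)]
    for schema, instances in groups.items():
        base, extra = divmod(len(instances), num_sites)
        start = 0
        for site_index in range(num_sites):
            size = base + (1 if site_index < extra else 0)
            chunk = instances[start:start + size]
            sites[site_index].extend((schema, i) for i in chunk)
            start += size
    return sites


# Made-up example data: five items under one schema, three under another.
items = [("/a/b", i) for i in range(5)] + [("/a/c", i) for i in range(3)]
rr_sites = nsnrr_partition(items, 2)
balanced_sites = pspib_partition(items, 2)
```

In both sketches each site receives a near-equal share of every schema, so a query over one schema can be evaluated at all sites in parallel. The difference in this sketch is only which items land together: round-robin interleaves neighbours across sites, while the chunked split keeps neighbours on the same site.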

Original language: English
Title of host publication: Proceedings - IEEE Computer Society's International Computer Software and Applications Conference
Pages: 151-156
Number of pages: 6
Publication status: Published - 2003
Externally published: Yes
Event: Proceedings: 27th Annual International Computer Software and Applications Conference, COMPSAC 2003 - Dallas, TX, United States
Duration: 3 Nov 2003 to 6 Nov 2003

Other

Other: Proceedings: 27th Annual International Computer Software and Applications Conference, COMPSAC 2003
Country: United States
City: Dallas, TX
Period: 3/11/03 to 6/11/03

Fingerprint

Query processing
XML
Throughput

ASJC Scopus subject areas

  • Software

Cite this

Yu, Y., Wang, G., Yu, G., Wu, G., Hu, J., & Tang, N. (2003). Data Placement and Query Processing Based on RPE Parallelisms. In Proceedings - IEEE Computer Society's International Computer Software and Applications Conference (pp. 151-156)

@inproceedings{7018106a1136440cb7eb2edf2d46924f,
title = "Data Placement and Query Processing Based on RPE Parallelisms",
author = "Yaxin Yu and Guoren Wang and Ge Yu and Gang Wu and Junan Hu and Nan Tang",
year = "2003",
language = "English",
pages = "151--156",
booktitle = "Proceedings - IEEE Computer Society's International Computer Software and Applications Conference",

}

TY - GEN

T1 - Data Placement and Query Processing Based on RPE Parallelisms

AU - Yu, Yaxin

AU - Wang, Guoren

AU - Yu, Ge

AU - Wu, Gang

AU - Hu, Junan

AU - Tang, Nan

PY - 2003

Y1 - 2003

UR - http://www.scopus.com/inward/record.url?scp=0345529057&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0345529057&partnerID=8YFLogxK

M3 - Conference contribution

SP - 151

EP - 156

BT - Proceedings - IEEE Computer Society's International Computer Software and Applications Conference

ER -