CliqueSquare: Flat plans for massively parallel RDF queries

François Goasdoué, Zoi Kaoudi, Ioana Manolescu, Jorge Arnulfo Quiane Ruiz, Stamatis Zampetakis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Citations (Scopus)

Abstract

As increasing volumes of RDF data are being produced and analyzed, many massively distributed architectures have been proposed for storing and querying this data. These architectures are characterized first by their RDF partitioning and storage method, and second by their approach to distributed query optimization, i.e., determining which operations to execute on each node in order to compute the query answers. We present CliqueSquare, a novel optimization approach for evaluating conjunctive RDF queries in a massively parallel environment. We focus on reducing query response time, and thus seek to build flat plans, where the number of joins encountered on a root-to-leaf path in the plan is minimized. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest possible plans. We have deployed our algorithms in a MapReduce-based RDF platform and demonstrate experimentally the benefit of the flat plans built by our best algorithms.

Original language: English
Title of host publication: Proceedings - International Conference on Data Engineering
Publisher: IEEE Computer Society
Pages: 771-782
Number of pages: 12
Volume: 2015-May
ISBN (Print): 9781479979639
DOI: https://doi.org/10.1109/ICDE.2015.7113332
Publication status: Published - 26 May 2015
Event: 2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of
Duration: 13 Apr 2015 to 17 Apr 2015

Other

Other: 2015 31st IEEE International Conference on Data Engineering, ICDE 2015
Country: Korea, Republic of
City: Seoul
Period: 13/4/15 to 17/4/15

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing
  • Software

Cite this

Goasdoué, F., Kaoudi, Z., Manolescu, I., Quiane Ruiz, J. A., & Zampetakis, S. (2015). CliqueSquare: Flat plans for massively parallel RDF queries. In Proceedings - International Conference on Data Engineering (Vol. 2015-May, pp. 771-782). [7113332] IEEE Computer Society. https://doi.org/10.1109/ICDE.2015.7113332

CliqueSquare: Flat plans for massively parallel RDF queries. / Goasdoué, François; Kaoudi, Zoi; Manolescu, Ioana; Quiane Ruiz, Jorge Arnulfo; Zampetakis, Stamatis.

Proceedings - International Conference on Data Engineering. Vol. 2015-May IEEE Computer Society, 2015. p. 771-782 7113332.

Goasdoué, F, Kaoudi, Z, Manolescu, I, Quiane Ruiz, JA & Zampetakis, S 2015, CliqueSquare: Flat plans for massively parallel RDF queries. in Proceedings - International Conference on Data Engineering. vol. 2015-May, 7113332, IEEE Computer Society, pp. 771-782, 2015 31st IEEE International Conference on Data Engineering, ICDE 2015, Seoul, Korea, Republic of, 13/4/15. https://doi.org/10.1109/ICDE.2015.7113332
Goasdoué F, Kaoudi Z, Manolescu I, Quiane Ruiz JA, Zampetakis S. CliqueSquare: Flat plans for massively parallel RDF queries. In Proceedings - International Conference on Data Engineering. Vol. 2015-May. IEEE Computer Society. 2015. p. 771-782. 7113332 https://doi.org/10.1109/ICDE.2015.7113332
Goasdoué, François; Kaoudi, Zoi; Manolescu, Ioana; Quiane Ruiz, Jorge Arnulfo; Zampetakis, Stamatis. / CliqueSquare: Flat plans for massively parallel RDF queries. Proceedings - International Conference on Data Engineering. Vol. 2015-May IEEE Computer Society, 2015. pp. 771-782
@inproceedings{7941c879d8b24905925b4ff53522ffd7,
title = "CliqueSquare: Flat plans for massively parallel RDF queries",
abstract = "As increasing volumes of RDF data are being produced and analyzed, many massively distributed architectures have been proposed for storing and querying this data. These architectures are characterized first by their RDF partitioning and storage method, and second by their approach to distributed query optimization, i.e., determining which operations to execute on each node in order to compute the query answers. We present CliqueSquare, a novel optimization approach for evaluating conjunctive RDF queries in a massively parallel environment. We focus on reducing query response time, and thus seek to build flat plans, where the number of joins encountered on a root-to-leaf path in the plan is minimized. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest possible plans. We have deployed our algorithms in a MapReduce-based RDF platform and demonstrate experimentally the benefit of the flat plans built by our best algorithms.",
author = "Fran{\c{c}}ois Goasdou{\'e} and Zoi Kaoudi and Ioana Manolescu and {Quiane Ruiz}, {Jorge Arnulfo} and Stamatis Zampetakis",
year = "2015",
month = "5",
day = "26",
doi = "10.1109/ICDE.2015.7113332",
language = "English",
isbn = "9781479979639",
volume = "2015-May",
pages = "771--782",
booktitle = "Proceedings - International Conference on Data Engineering",
publisher = "IEEE Computer Society",

}

TY  - GEN
T1  - CliqueSquare
T2  - Flat plans for massively parallel RDF queries
AU  - Goasdoué, François
AU  - Kaoudi, Zoi
AU  - Manolescu, Ioana
AU  - Quiane Ruiz, Jorge Arnulfo
AU  - Zampetakis, Stamatis
PY  - 2015/5/26
Y1  - 2015/5/26
N2  - As increasing volumes of RDF data are being produced and analyzed, many massively distributed architectures have been proposed for storing and querying this data. These architectures are characterized first by their RDF partitioning and storage method, and second by their approach to distributed query optimization, i.e., determining which operations to execute on each node in order to compute the query answers. We present CliqueSquare, a novel optimization approach for evaluating conjunctive RDF queries in a massively parallel environment. We focus on reducing query response time, and thus seek to build flat plans, where the number of joins encountered on a root-to-leaf path in the plan is minimized. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest possible plans. We have deployed our algorithms in a MapReduce-based RDF platform and demonstrate experimentally the benefit of the flat plans built by our best algorithms.
AB  - As increasing volumes of RDF data are being produced and analyzed, many massively distributed architectures have been proposed for storing and querying this data. These architectures are characterized first by their RDF partitioning and storage method, and second by their approach to distributed query optimization, i.e., determining which operations to execute on each node in order to compute the query answers. We present CliqueSquare, a novel optimization approach for evaluating conjunctive RDF queries in a massively parallel environment. We focus on reducing query response time, and thus seek to build flat plans, where the number of joins encountered on a root-to-leaf path in the plan is minimized. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest possible plans. We have deployed our algorithms in a MapReduce-based RDF platform and demonstrate experimentally the benefit of the flat plans built by our best algorithms.
UR  - http://www.scopus.com/inward/record.url?scp=84940876019&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84940876019&partnerID=8YFLogxK
U2  - 10.1109/ICDE.2015.7113332
DO  - 10.1109/ICDE.2015.7113332
M3  - Conference contribution
AN  - SCOPUS:84940876019
SN  - 9781479979639
VL  - 2015-May
SP  - 771
EP  - 782
BT  - Proceedings - International Conference on Data Engineering
PB  - IEEE Computer Society
ER  -