Sphinx

Distributed execution of interactive SQL queries on big spatial data

Ahmed Eldawy, Mostafa Elganainy, Ammar Bakeer, Ahmed Abdelmotaleb, Mohamed Mokbel

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)

Abstract

This paper presents Sphinx, a full-fledged distributed system that processes big spatial data through a standard SQL interface. Sphinx adds spatial data types, indexes, and query processing inside the code base of Cloudera Impala for efficient processing of spatial data. Sphinx is composed of four main components: query parser, indexer, query planner, and query executor. The query parser injects spatial data types and functions into the SQL interface of Sphinx. The indexer creates spatial indexes in Sphinx by adopting a two-layered index design. The query planner uses these indexes to construct efficient query plans for range query and spatial join operations. Finally, the query executor carries out these plans on big spatial datasets in a distributed cluster. A system prototype of Sphinx running on real datasets shows up to three orders of magnitude performance improvement over traditional Impala.
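The abstract does not say what the two index layers are. As a rough illustration only, the sketch below assumes a common layout: a global layer that assigns records to grid partitions described by bounding rectangles, and a local layer that orders the records inside each partition. It is not Sphinx's code (Sphinx lives inside Impala), and every name in it (`TwoLayerIndex`, `Partition`, `cell_size`) is hypothetical. It shows how a range query can prune whole partitions with the global layer before touching any records.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def intersects(a: Rect, b: Rect) -> bool:
    """True if two closed rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Partition:
    """One partition: its bounding rectangle plus a local index (points sorted by x)."""
    mbr: Rect
    points: List[Point] = field(default_factory=list)

    def range_query(self, q: Rect) -> List[Point]:
        # Local layer: binary search narrows the x range, then y is filtered.
        lo = bisect_left(self.points, (q[0], float("-inf")))
        hi = bisect_right(self.points, (q[2], float("inf")))
        return [p for p in self.points[lo:hi] if q[1] <= p[1] <= q[3]]


class TwoLayerIndex:
    """Hypothetical two-layer index: uniform grid (global) + sorted lists (local)."""

    def __init__(self, points: List[Point], cell_size: float) -> None:
        cells: Dict[Tuple[int, int], Partition] = {}
        for x, y in points:
            i, j = int(x // cell_size), int(y // cell_size)
            if (i, j) not in cells:
                mbr = (i * cell_size, j * cell_size,
                       (i + 1) * cell_size, (j + 1) * cell_size)
                cells[(i, j)] = Partition(mbr)
            cells[(i, j)].points.append((x, y))
        for part in cells.values():
            part.points.sort()  # build the local index
        self.partitions = list(cells.values())

    def range_query(self, q: Rect) -> Tuple[List[Point], int]:
        """Return matching points and the number of partitions actually scanned."""
        hits: List[Point] = []
        scanned = 0
        for part in self.partitions:
            if intersects(part.mbr, q):  # global layer prunes partitions
                scanned += 1
                hits.extend(part.range_query(q))
        return hits, scanned


if __name__ == "__main__":
    data = [(1, 1), (2, 5), (6, 6), (9, 2), (15, 15)]
    index = TwoLayerIndex(data, cell_size=10.0)
    hits, scanned = index.range_query((0, 0, 7, 7))
    print(sorted(hits), "partitions scanned:", scanned)
```

In a distributed setting, the pruning step is where the savings would come from: partitions whose rectangles miss the query window never need to be read by any node.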

Original language: English
Title of host publication: 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015
Publisher: Association for Computing Machinery
Volume: 03-06-November-2015
ISBN (Electronic): 9781450339674
DOIs: https://doi.org/10.1145/2820783.2820869
Publication status: Published - 3 Nov 2015
Externally published: Yes
Event: 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015 - Seattle, United States
Duration: 3 Nov 2015 to 6 Nov 2015

Other

Other: 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015
Country: United States
City: Seattle
Period: 3/11/15 to 6/11/15

Keywords

  • Impala
  • Range query
  • Spatial
  • Spatial join
  • Sphinx
  • SQL

ASJC Scopus subject areas

  • Earth-Surface Processes
  • Computer Science Applications
  • Modelling and Simulation
  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Eldawy, A., Elganainy, M., Bakeer, A., Abdelmotaleb, A., & Mokbel, M. (2015). Sphinx: Distributed execution of interactive SQL queries on big spatial data. In 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015 (Vol. 03-06-November-2015). [a78] Association for Computing Machinery. https://doi.org/10.1145/2820783.2820869
