Sphinx: Distributed execution of interactive SQL queries on big spatial data

Ahmed Eldawy, Mostafa Elganainy, Ammar Bakeer, Ahmed Abdelmotaleb, Mohamed Mokbel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

This paper presents Sphinx, a full-fledged distributed system that uses a standard SQL interface to process big spatial data. Sphinx adds spatial data types, indexes, and query processing inside the code base of Cloudera Impala for efficient processing of spatial data. Sphinx is composed of four main components: the query parser, the indexer, the query planner, and the query executor. The query parser injects spatial data types and functions into the SQL interface of Sphinx. The indexer creates spatial indexes in Sphinx by adopting a two-layered index design. The query planner uses these indexes to construct efficient query plans for range query and spatial join operations. Finally, the query executor carries out these plans on big spatial datasets in a distributed cluster. A system prototype of Sphinx running on real datasets shows up to three orders of magnitude performance improvement over traditional Impala.
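
The abstract does not spell out how the two-layered index is laid out, so the following is only an illustrative sketch of the general idea of answering a range query with two index layers. It assumes a uniform grid as the global layer (mapping each cell to a partition) and an x-sorted list as the local layer inside each partition. The names `TwoLayerGridIndex`, `cell_size`, and `range_query` are hypothetical and are not taken from Sphinx or Impala.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Cell = Tuple[int, int]

INF = float("inf")


class TwoLayerGridIndex:
    """Toy two-layer index (an assumption, not the Sphinx design)."""

    def __init__(self, points: List[Point], cell_size: float) -> None:
        self.cell_size = cell_size
        # Global layer: grid cell -> partition holding the points in that cell.
        self.partitions: Dict[Cell, List[Point]] = defaultdict(list)
        for p in points:
            self.partitions[self._cell_of(p)].append(p)
        # Local layer: each partition keeps its points sorted by (x, y).
        for part in self.partitions.values():
            part.sort()

    def _cell_of(self, p: Point) -> Cell:
        # Floor division also handles negative coordinates correctly.
        return (int(p[0] // self.cell_size), int(p[1] // self.cell_size))

    def range_query(self, rect: Rect) -> List[Point]:
        xmin, ymin, xmax, ymax = rect
        cx0, cy0 = self._cell_of((xmin, ymin))
        cx1, cy1 = self._cell_of((xmax, ymax))
        result: List[Point] = []
        # Global step: visit only the cells that overlap the query rectangle.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                part = self.partitions.get((cx, cy))
                if part is None:
                    continue
                # Local step: binary search on x, then filter on y.
                lo = bisect_left(part, (xmin, -INF))
                hi = bisect_right(part, (xmax, INF))
                result.extend(p for p in part[lo:hi] if ymin <= p[1] <= ymax)
        return sorted(result)


points = [(1.0, 1.0), (2.0, 5.0), (7.0, 3.0), (12.0, 12.0), (4.5, 4.5)]
index = TwoLayerGridIndex(points, cell_size=5.0)
hits = index.range_query((0.0, 0.0, 5.0, 5.0))
```

In this sketch the global layer prunes whole partitions that cannot contain matches, and the local layer narrows the scan inside each remaining partition; a distributed system would run the local step on the nodes that store those partitions.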

Original language: English
Title of host publication: 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015
Publisher: Association for Computing Machinery
Volume: 03-06-November-2015
ISBN (Electronic): 9781450339674
DOI: https://doi.org/10.1145/2820783.2820869
Publication status: Published - 3 Nov 2015
Externally published: Yes
Event: 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015 - Seattle, United States
Duration: 3 Nov 2015 to 6 Nov 2015

Other

Other: 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015
Country: United States
City: Seattle
Period: 3/11/15 to 6/11/15

Keywords

  • Impala
  • Range query
  • Spatial
  • Spatial join
  • Sphinx
  • SQL

ASJC Scopus subject areas

  • Earth-Surface Processes
  • Computer Science Applications
  • Modelling and Simulation
  • Computer Graphics and Computer-Aided Design
  • Information Systems

Cite this

Eldawy, A., Elganainy, M., Bakeer, A., Abdelmotaleb, A., & Mokbel, M. (2015). Sphinx: Distributed execution of interactive SQL queries on big spatial data. In 23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015 (Vol. 03-06-November-2015). [a78] Association for Computing Machinery. https://doi.org/10.1145/2820783.2820869