ST-Hadoop: a MapReduce framework for spatio-temporal data

Louai Alarabi, Mohamed F. Mokbel, Mashaal Musleh

Research output: Contribution to journal, Article

13 Citations (Scopus)

Abstract

This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness into each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data types and operations. In the indexing layer, ST-Hadoop spatio-temporally loads and divides data across computation nodes in the Hadoop Distributed File System in a way that mimics spatio-temporal index structures, which results in orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and queries. In the operations layer, ST-Hadoop ships with support for three fundamental spatio-temporal queries, namely spatio-temporal range, top-k nearest neighbor, and join queries. The extensibility of ST-Hadoop allows others to add features and operations easily using approaches similar to those described in the paper. Extensive experiments conducted on a large-scale dataset of 10 TB containing over 1 billion spatio-temporal records show that ST-Hadoop achieves orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and operations. The key idea behind the performance gain in ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System.
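The indexing idea in the abstract, dividing data so that a query only touches the partitions it can overlap, can be illustrated with a small in-memory sketch. This is not ST-Hadoop's API or its actual index structure: the fixed one-day time slices, the uniform grid cells, and every name below (`partition_key`, `build_index`, `range_query`, `DAY`, `CELL`) are assumptions made purely for illustration.

```python
import math
from collections import defaultdict

DAY = 86_400   # assumed temporal slice width, in seconds
CELL = 1.0     # assumed spatial grid cell size, in coordinate units


def partition_key(x, y, t):
    """Map a record to a (time slice, grid column, grid row) partition."""
    return (math.floor(t / DAY), math.floor(x / CELL), math.floor(y / CELL))


def build_index(records):
    """Group (x, y, t) records by partition, as a stand-in for file blocks."""
    index = defaultdict(list)
    for x, y, t in records:
        index[partition_key(x, y, t)].append((x, y, t))
    return index


def range_query(index, x_min, x_max, y_min, y_max, t_min, t_max):
    """Return matching records and the number of partitions actually scanned."""
    k_min = partition_key(x_min, y_min, t_min)
    k_max = partition_key(x_max, y_max, t_max)
    hits, scanned = [], 0
    for key, bucket in index.items():
        # Prune: skip partitions that cannot overlap the query box.
        if not all(lo <= k <= hi for k, lo, hi in zip(key, k_min, k_max)):
            continue
        scanned += 1
        # Refine: check each record in the surviving partitions.
        hits.extend(
            (x, y, t) for x, y, t in bucket
            if x_min <= x <= x_max and y_min <= y <= y_max and t_min <= t <= t_max
        )
    return hits, scanned


records = [(0.5, 0.5, 10), (0.6, 0.4, DAY + 5), (5.2, 5.1, 20), (0.9, 0.1, 30)]
index = build_index(records)
hits, scanned = range_query(index, 0, 1, 0, 1, 0, 100)
```

In this toy run the four records fall into three partitions, and the query scans only the one partition whose time slice and grid cell overlap the query box. A non-indexed scan would read every record; avoiding that is the kind of saving the abstract attributes to indexing, although the paper's measured gains come from its own index design rather than from this simplified grid.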

Original language: English
Pages (from-to): 785-813
Number of pages: 29
Journal: GeoInformatica
Volume: 22
Issue number: 4
Publication status: Published - 1 Oct 2018

Keywords

  • MapReduce-based systems
  • Spatio-temporal join query
  • Spatio-temporal nearest neighbor query
  • Spatio-temporal range query
  • Spatio-temporal systems

ASJC Scopus subject areas

  • Information Systems
  • Geography, Planning and Development
