Optimizing cross-platform data movement

Sebastian Kruse, Zoi Kaoudi, Jorge Arnulfo Quiane Ruiz, Sanjay Chawla, Felix Naumann, Bertty Contreras-Rojas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing.

Original language: English
Title of host publication: Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
Publisher: IEEE Computer Society
Pages: 1642-1645
Number of pages: 4
ISBN (Electronic): 9781538674741
DOIs: https://doi.org/10.1109/ICDE.2019.00162
Publication status: Published - 1 Apr 2019
Event: 35th IEEE International Conference on Data Engineering, ICDE 2019 - Macau, China
Duration: 8 Apr 2019 to 11 Apr 2019

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2019-April
ISSN (Print): 1084-4627

Conference

Conference: 35th IEEE International Conference on Data Engineering, ICDE 2019
Country: China
City: Macau
Period: 8/4/19 to 11/4/19

Keywords

  • Cross-platform
  • Data movement
  • Polystore
  • Query optimization

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Kruse, S., Kaoudi, Z., Quiane Ruiz, J. A., Chawla, S., Naumann, F., & Contreras-Rojas, B. (2019). Optimizing cross-platform data movement. In Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019 (pp. 1642-1645). [8731354] (Proceedings - International Conference on Data Engineering; Vol. 2019-April). IEEE Computer Society. https://doi.org/10.1109/ICDE.2019.00162

@inproceedings{07018e216fd94ecc9608418e88feb818,
title = "Optimizing cross-platform data movement",
abstract = "Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing.",
keywords = "Cross-platform, Data movement, Polystore, Query optimization",
author = "Sebastian Kruse and Zoi Kaoudi and {Quiane Ruiz}, {Jorge Arnulfo} and Sanjay Chawla and Felix Naumann and Bertty Contreras-Rojas",
year = "2019",
month = "4",
day = "1",
doi = "10.1109/ICDE.2019.00162",
language = "English",
series = "Proceedings - International Conference on Data Engineering",
publisher = "IEEE Computer Society",
pages = "1642--1645",
booktitle = "Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019",

}

TY - GEN

T1 - Optimizing cross-platform data movement

AU - Kruse, Sebastian

AU - Kaoudi, Zoi

AU - Quiane Ruiz, Jorge Arnulfo

AU - Chawla, Sanjay

AU - Naumann, Felix

AU - Contreras-Rojas, Bertty

PY - 2019/4/1

Y1 - 2019/4/1

AB - Data analytics are moving beyond the limits of a single data processing platform. A cross-platform query optimizer is necessary to enable applications to run their tasks over multiple platforms efficiently and in a platform-agnostic manner. For the optimizer to be effective, it must consider data movement costs across different data processing platforms. In this paper, we present the graph-based data movement strategy used by Rheem, our open-source cross-platform system. In particular, we (i) model the data movement problem as a new graph problem, which we prove to be NP-hard, and (ii) propose a novel graph exploration algorithm, which allows Rheem to discover multiple hidden opportunities for cross-platform data processing.

KW - Cross-platform

KW - Data movement

KW - Polystore

KW - Query optimization

UR - http://www.scopus.com/inward/record.url?scp=85067927337&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067927337&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2019.00162

DO - 10.1109/ICDE.2019.00162

M3 - Conference contribution

AN - SCOPUS:85067927337

T3 - Proceedings - International Conference on Data Engineering

SP - 1642

EP - 1645

BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019

PB - IEEE Computer Society

ER -