Spinner: Scalable graph partitioning in the cloud

Claudio Martella, Dionysios Logothetis, Andreas Loukas, Georgos Siganos

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

16 Citations (Scopus)

Abstract

In this paper, we present a graph partitioning algorithm to partition graphs with trillions of edges. To achieve such scale, our solution leverages the vertex-centric Pregel abstraction provided by Giraph, a system for large-scale graph analytics. We designed our algorithm to compute partitions with high locality and fair balance, and focused on the characteristics necessary to reach wide adoption by practitioners in production. Our solution can (i) scale to massive graphs and thousands of compute cores, (ii) efficiently adapt partitions to changes in graphs and compute environments, and (iii) integrate seamlessly into existing systems without additional infrastructure. We evaluate our solution on the Facebook and Instagram graphs, as well as on other large-scale, real-world graphs. We show that it is scalable and computes partitionings whose quality is comparable to, and sometimes better than, that of existing solutions. By integrating the computed partitionings in Giraph, we speed up various real-world applications by up to a factor of 5.6 compared to the default hash partitioning.
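The abstract contrasts partitionings with high locality and fair balance against default hash partitioning. The sketch below is not the paper's algorithm; it only illustrates, on a toy graph, what those two quality measures can look like. The function names, the modulo-based stand-in for hash partitioning, and the vertex-count definition of balance are all assumptions made for illustration.

```python
from collections import Counter


def hash_partition(vertices, k):
    """Assign each vertex to partition (id mod k); a stand-in for hash partitioning (assumption)."""
    return {v: v % k for v in vertices}


def locality(edges, assignment):
    """Fraction of edges whose two endpoints fall in the same partition."""
    local = sum(1 for u, v in edges if assignment[u] == assignment[v])
    return local / len(edges)


def balance(assignment, k):
    """Largest partition size divided by the ideal size n / k (1.0 means perfectly even).

    Simplified vertex-count balance, assumed here for illustration only.
    """
    sizes = Counter(assignment.values())
    ideal = len(assignment) / k
    return max(sizes.values()) / ideal


# Toy graph: two 4-cliques joined by a single bridge edge.
clique_a = [0, 1, 2, 3]
clique_b = [4, 5, 6, 7]
edges = [
    (u, v)
    for group in (clique_a, clique_b)
    for i, u in enumerate(group)
    for v in group[i + 1:]
]
edges.append((3, 4))
vertices = clique_a + clique_b
k = 2

hashed = hash_partition(vertices, k)
clustered = {v: 0 if v in clique_a else 1 for v in vertices}

print("hash locality:     ", locality(edges, hashed))
print("clustered locality:", locality(edges, clustered))
print("hash balance:      ", balance(hashed, k))
print("clustered balance: ", balance(clustered, k))
```

On this toy graph both assignments are evenly sized, but the community-aware assignment keeps far more edges inside a partition, which is the kind of difference a locality-seeking partitioner aims for.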

Original language: English
Title of host publication: Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017
Publisher: IEEE Computer Society
Pages: 1083-1094
Number of pages: 12
ISBN (Electronic): 9781509065431
DOI: https://doi.org/10.1109/ICDE.2017.153
Publication status: Published - 16 May 2017
Event: 33rd IEEE International Conference on Data Engineering, ICDE 2017 - San Diego, United States
Duration: 19 Apr 2017 to 22 Apr 2017

Other

Other: 33rd IEEE International Conference on Data Engineering, ICDE 2017
Country: United States
City: San Diego
Period: 19/4/17 to 22/4/17

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Martella, C., Logothetis, D., Loukas, A., & Siganos, G. (2017). Spinner: Scalable graph partitioning in the cloud. In Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017 (pp. 1083-1094). [7930049] IEEE Computer Society. https://doi.org/10.1109/ICDE.2017.153

@inproceedings{39c7b95a73df48c8bcb37a14b433473e,
title = "Spinner: Scalable graph partitioning in the cloud",
abstract = "In this paper, we present a graph partitioning algorithm to partition graphs with trillions of edges. To achieve such scale, our solution leverages the vertex-centric Pregel abstraction provided by Giraph, a system for large-scale graph analytics. We designed our algorithm to compute partitions with high locality and fair balance, and focused on the characteristics necessary to reach wide adoption by practitioners in production. Our solution can (i) scale to massive graphs and thousands of compute cores, (ii) efficiently adapt partitions to changes in graphs and compute environments, and (iii) integrate seamlessly into existing systems without additional infrastructure. We evaluate our solution on the Facebook and Instagram graphs, as well as on other large-scale, real-world graphs. We show that it is scalable and computes partitionings whose quality is comparable to, and sometimes better than, that of existing solutions. By integrating the computed partitionings in Giraph, we speed up various real-world applications by up to a factor of 5.6 compared to the default hash partitioning.",
author = "Claudio Martella and Dionysios Logothetis and Andreas Loukas and Georgos Siganos",
year = "2017",
month = "5",
day = "16",
doi = "10.1109/ICDE.2017.153",
language = "English",
pages = "1083--1094",
booktitle = "Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - Spinner

T2 - Scalable graph partitioning in the cloud

AU - Martella, Claudio

AU - Logothetis, Dionysios

AU - Loukas, Andreas

AU - Siganos, Georgos

PY - 2017/5/16

Y1 - 2017/5/16

AB - In this paper, we present a graph partitioning algorithm to partition graphs with trillions of edges. To achieve such scale, our solution leverages the vertex-centric Pregel abstraction provided by Giraph, a system for large-scale graph analytics. We designed our algorithm to compute partitions with high locality and fair balance, and focused on the characteristics necessary to reach wide adoption by practitioners in production. Our solution can (i) scale to massive graphs and thousands of compute cores, (ii) efficiently adapt partitions to changes in graphs and compute environments, and (iii) integrate seamlessly into existing systems without additional infrastructure. We evaluate our solution on the Facebook and Instagram graphs, as well as on other large-scale, real-world graphs. We show that it is scalable and computes partitionings whose quality is comparable to, and sometimes better than, that of existing solutions. By integrating the computed partitionings in Giraph, we speed up various real-world applications by up to a factor of 5.6 compared to the default hash partitioning.

UR - http://www.scopus.com/inward/record.url?scp=85021232900&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85021232900&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2017.153

DO - 10.1109/ICDE.2017.153

M3 - Conference contribution

AN - SCOPUS:85021232900

SP - 1083

EP - 1094

BT - Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017

PB - IEEE Computer Society

ER -