Scalable facility location for massive graphs on Pregel-like systems

Kiran Garimella, Gianmarco Morales, Aristides Gionis, Mauro Sozio

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

We propose a new scalable algorithm for the facility-location problem. We study the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on a graph. This setting is applicable to various problems arising in the Web and social media, and allows us to leverage the inherent sparsity of such graphs. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our work builds upon previous results: a facility-location algorithm for the PRAM model, a recent distance-sketching method for massive graphs, and a parallel algorithm for finding maximal independent sets. The main challenge is to adapt those building blocks to the distributed graph setting, while maintaining the approximation guarantee and limiting the amount of distributed communication. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while, in terms of quality, being competitive with state-of-the-art sequential algorithms.
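To make the graph setting concrete, the following is a minimal sequential sketch of the objective described in the abstract: the cost of a solution is the sum of the opening costs of the chosen facilities plus, for each client, the shortest-path distance to its nearest open facility. It is not the authors' distributed algorithm. The function names, the adjacency-list format, the uniform opening costs, and the assumption that every node is a client are all hypothetical choices made for illustration.

```python
import heapq


def nearest_facility_distances(adj, facilities):
    """Multi-source Dijkstra: shortest-path distance from each node
    to its closest open facility (inf if no facility is reachable)."""
    dist = {v: float("inf") for v in adj}
    heap = []
    for f in facilities:
        dist[f] = 0.0
        heap.append((0.0, f))
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def facility_location_cost(adj, opening_cost, facilities):
    """Opening cost of the chosen facilities plus total service cost.
    Assumption: every node of the graph is a client."""
    dist = nearest_facility_distances(adj, facilities)
    return sum(opening_cost[f] for f in facilities) + sum(dist.values())


# Hypothetical toy instance: an undirected path a - b - c - d with unit weights.
adj = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
opening_cost = {v: 1.0 for v in adj}  # assumed uniform opening cost

cost_one = facility_location_cost(adj, opening_cost, {"b"})
cost_two = facility_location_cost(adj, opening_cost, {"b", "d"})
print(cost_one, cost_two)
```

On this toy path, opening a second facility at `d` raises the opening cost by one but lowers the total service cost by two, which is the trade-off a facility-location algorithm has to balance.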

Original language: English
Title of host publication: CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 273-282
Number of pages: 10
Volume: 19-23-Oct-2015
ISBN (Electronic): 9781450337946
DOI: https://doi.org/10.1145/2806416.2806508
Publication status: Published - 17 Oct 2015
Externally published: Yes
Event: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015 - Melbourne, Australia
Duration: 19 Oct 2015 to 23 Oct 2015

Other

Other: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015
Country: Australia
City: Melbourne
Period: 19/10/15 to 23/10/15

Fingerprint

Graph
Facility location
Guarantee
Shortest path
Costs
Communication
Approximation
Leverage
Social media
World Wide Web
Location problem

ASJC Scopus subject areas

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Garimella, K., Morales, G., Gionis, A., & Sozio, M. (2015). Scalable facility location for massive graphs on pregel-like systems. In CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management (Vol. 19-23-Oct-2015, pp. 273-282). Association for Computing Machinery. https://doi.org/10.1145/2806416.2806508

Scalable facility location for massive graphs on pregel-like systems. / Garimella, Kiran; Morales, Gianmarco; Gionis, Aristides; Sozio, Mauro.

CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management. Vol. 19-23-Oct-2015 Association for Computing Machinery, 2015. p. 273-282.

Garimella, K, Morales, G, Gionis, A & Sozio, M 2015, Scalable facility location for massive graphs on pregel-like systems. in CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management. vol. 19-23-Oct-2015, Association for Computing Machinery, pp. 273-282, 24th ACM International Conference on Information and Knowledge Management, CIKM 2015, Melbourne, Australia, 19/10/15. https://doi.org/10.1145/2806416.2806508
Garimella K, Morales G, Gionis A, Sozio M. Scalable facility location for massive graphs on pregel-like systems. In CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management. Vol. 19-23-Oct-2015. Association for Computing Machinery. 2015. p. 273-282 https://doi.org/10.1145/2806416.2806508
Garimella, Kiran ; Morales, Gianmarco ; Gionis, Aristides ; Sozio, Mauro. / Scalable facility location for massive graphs on pregel-like systems. CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management. Vol. 19-23-Oct-2015 Association for Computing Machinery, 2015. pp. 273-282
@inproceedings{667ff9eda02a4c8ba0fb53e9ff8239b9,
title = "Scalable facility location for massive graphs on pregel-like systems",
abstract = "We propose a new scalable algorithm for the facility-location problem. We study the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on a graph. This setting is applicable to various problems arising in the Web and social media, and allows us to leverage the inherent sparsity of such graphs. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our work builds upon previous results: a facility-location algorithm for the PRAM model, a recent distance-sketching method for massive graphs, and a parallel algorithm for finding maximal independent sets. The main challenge is to adapt those building blocks to the distributed graph setting, while maintaining the approximation guarantee and limiting the amount of distributed communication. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while, in terms of quality, being competitive with state-of-the-art sequential algorithms.",
author = "Kiran Garimella and Gianmarco Morales and Aristides Gionis and Mauro Sozio",
year = "2015",
month = "10",
day = "17",
doi = "10.1145/2806416.2806508",
language = "English",
volume = "19-23-Oct-2015",
pages = "273--282",
booktitle = "CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management",
publisher = "Association for Computing Machinery",

}

TY - GEN
T1 - Scalable facility location for massive graphs on pregel-like systems
AU - Garimella, Kiran
AU - Morales, Gianmarco
AU - Gionis, Aristides
AU - Sozio, Mauro
PY - 2015/10/17
Y1 - 2015/10/17
N2 - We propose a new scalable algorithm for the facility-location problem. We study the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on a graph. This setting is applicable to various problems arising in the Web and social media, and allows us to leverage the inherent sparsity of such graphs. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our work builds upon previous results: a facility-location algorithm for the PRAM model, a recent distance-sketching method for massive graphs, and a parallel algorithm for finding maximal independent sets. The main challenge is to adapt those building blocks to the distributed graph setting, while maintaining the approximation guarantee and limiting the amount of distributed communication. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while, in terms of quality, being competitive with state-of-the-art sequential algorithms.
AB - We propose a new scalable algorithm for the facility-location problem. We study the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on a graph. This setting is applicable to various problems arising in the Web and social media, and allows us to leverage the inherent sparsity of such graphs. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our work builds upon previous results: a facility-location algorithm for the PRAM model, a recent distance-sketching method for massive graphs, and a parallel algorithm for finding maximal independent sets. The main challenge is to adapt those building blocks to the distributed graph setting, while maintaining the approximation guarantee and limiting the amount of distributed communication. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while, in terms of quality, being competitive with state-of-the-art sequential algorithms.
UR - http://www.scopus.com/inward/record.url?scp=84958236798&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84958236798&partnerID=8YFLogxK
U2 - 10.1145/2806416.2806508
DO - 10.1145/2806416.2806508
M3 - Conference contribution
VL - 19-23-Oct-2015
SP - 273
EP - 282
BT - CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
ER -