The little engine(s) that could

Scaling online social networks

Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang, Nikolaos Laoutaris, Parminder Chhabra, Pablo Rodriguez

Research output: Contribution to journalArticle

26 Citations (Scopus)

Abstract

The difficulty of partitioning social graphs has introduced new system design challenges for scaling of online social networks (OSNs). Vertical scaling by resorting to full replication can be a costly proposition. Scaling horizontally by partitioning and distributing data among multiple servers using, for e.g., distributed hash tables (DHTs), can suffer from expensive interserver communication. Such challenges have often caused costly rearchitecting efforts for popular OSNs like Twitter and Facebook. We design, implement, and evaluate SPAR, a Social Partitioning and Replication middleware that mediates transparently between the application and the database layer of an OSN. SPAR leverages the underlying social graph structure in order to minimize the required replication overhead for ensuring that users have their neighbors' data colocated in the same machine. The gains from this are multifold: Application developers can assume local semantics, i.e., develop as they would for a single machine; scalability is achieved by adding commodity machines with low memory and network I/O requirements; and N+K redundancy is achieved at a fraction of the cost. We provide a complete system design, extensive evaluation based on datasets from Twitter, Orkut, and Facebook, and a working implementation. We show that SPAR incurs minimum overhead, can help a well-known Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, a popular key-value store database.

Original languageEnglish
Article number6172626
Pages (from-to)1162-1175
Number of pages14
JournalIEEE/ACM Transactions on Networking
Volume20
Issue number4
DOIs
Publication statusPublished - 5 Apr 2012
Externally publishedYes

Fingerprint

Engines
Systems analysis
Middleware
Redundancy
Scalability
Servers
Semantics
Throughput
Data storage equipment
Communication
Costs

Keywords

  • Algorithms
  • distributed systems
  • online social networks (OSNs)
  • scaling

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Software
  • Computer Science Applications
  • Computer Networks and Communications

Cite this

Pujol, J. M., Erramilli, V., Siganos, G., Yang, X., Laoutaris, N., Chhabra, P., & Rodriguez, P. (2012). The little engine(s) that could: Scaling online social networks. IEEE/ACM Transactions on Networking, 20(4), 1162-1175. [6172626]. https://doi.org/10.1109/TNET.2012.2188815

The little engine(s) that could : Scaling online social networks. / Pujol, Josep M.; Erramilli, Vijay; Siganos, Georgos; Yang, Xiaoyuan; Laoutaris, Nikolaos; Chhabra, Parminder; Rodriguez, Pablo.

In: IEEE/ACM Transactions on Networking, Vol. 20, No. 4, 6172626, 05.04.2012, p. 1162-1175.

Research output: Contribution to journalArticle

Pujol, JM, Erramilli, V, Siganos, G, Yang, X, Laoutaris, N, Chhabra, P & Rodriguez, P 2012, 'The little engine(s) that could: Scaling online social networks', IEEE/ACM Transactions on Networking, vol. 20, no. 4, 6172626, pp. 1162-1175. https://doi.org/10.1109/TNET.2012.2188815
Pujol JM, Erramilli V, Siganos G, Yang X, Laoutaris N, Chhabra P et al. The little engine(s) that could: Scaling online social networks. IEEE/ACM Transactions on Networking. 2012 Apr 5;20(4):1162-1175. 6172626. https://doi.org/10.1109/TNET.2012.2188815
Pujol, Josep M. ; Erramilli, Vijay ; Siganos, Georgos ; Yang, Xiaoyuan ; Laoutaris, Nikolaos ; Chhabra, Parminder ; Rodriguez, Pablo. / The little engine(s) that could : Scaling online social networks. In: IEEE/ACM Transactions on Networking. 2012 ; Vol. 20, No. 4. pp. 1162-1175.
@article{a4c3946e74a74df8afe6186bfa8ddb20,
title = "The little engine(s) that could: Scaling online social networks",
abstract = "The difficulty of partitioning social graphs has introduced new system design challenges for scaling of online social networks (OSNs). Vertical scaling by resorting to full replication can be a costly proposition. Scaling horizontally by partitioning and distributing data among multiple servers using, for e.g., distributed hash tables (DHTs), can suffer from expensive interserver communication. Such challenges have often caused costly rearchitecting efforts for popular OSNs like Twitter and Facebook. We design, implement, and evaluate SPAR, a Social Partitioning and Replication middleware that mediates transparently between the application and the database layer of an OSN. SPAR leverages the underlying social graph structure in order to minimize the required replication overhead for ensuring that users have their neighbors' data colocated in the same machine. The gains from this are multifold: Application developers can assume local semantics, i.e., develop as they would for a single machine; scalability is achieved by adding commodity machines with low memory and network I/O requirements; and N+K redundancy is achieved at a fraction of the cost. We provide a complete system design, extensive evaluation based on datasets from Twitter, Orkut, and Facebook, and a working implementation. We show that SPAR incurs minimum overhead, can help a well-known Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, a popular key-value store database.",
keywords = "Algorithms, distributed systems, online social networks (OSNs), scaling",
author = "Pujol, {Josep M.} and Vijay Erramilli and Georgos Siganos and Xiaoyuan Yang and Nikolaos Laoutaris and Parminder Chhabra and Pablo Rodriguez",
year = "2012",
month = "4",
day = "5",
doi = "10.1109/TNET.2012.2188815",
language = "English",
volume = "20",
pages = "1162--1175",
journal = "IEEE/ACM Transactions on Networking",
issn = "1063-6692",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",

}

TY - JOUR

T1 - The little engine(s) that could

T2 - Scaling online social networks

AU - Pujol, Josep M.

AU - Erramilli, Vijay

AU - Siganos, Georgos

AU - Yang, Xiaoyuan

AU - Laoutaris, Nikolaos

AU - Chhabra, Parminder

AU - Rodriguez, Pablo

PY - 2012/4/5

Y1 - 2012/4/5

N2 - The difficulty of partitioning social graphs has introduced new system design challenges for scaling of online social networks (OSNs). Vertical scaling by resorting to full replication can be a costly proposition. Scaling horizontally by partitioning and distributing data among multiple servers using, for e.g., distributed hash tables (DHTs), can suffer from expensive interserver communication. Such challenges have often caused costly rearchitecting efforts for popular OSNs like Twitter and Facebook. We design, implement, and evaluate SPAR, a Social Partitioning and Replication middleware that mediates transparently between the application and the database layer of an OSN. SPAR leverages the underlying social graph structure in order to minimize the required replication overhead for ensuring that users have their neighbors' data colocated in the same machine. The gains from this are multifold: Application developers can assume local semantics, i.e., develop as they would for a single machine; scalability is achieved by adding commodity machines with low memory and network I/O requirements; and N+K redundancy is achieved at a fraction of the cost. We provide a complete system design, extensive evaluation based on datasets from Twitter, Orkut, and Facebook, and a working implementation. We show that SPAR incurs minimum overhead, can help a well-known Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, a popular key-value store database.

AB - The difficulty of partitioning social graphs has introduced new system design challenges for scaling of online social networks (OSNs). Vertical scaling by resorting to full replication can be a costly proposition. Scaling horizontally by partitioning and distributing data among multiple servers using, for e.g., distributed hash tables (DHTs), can suffer from expensive interserver communication. Such challenges have often caused costly rearchitecting efforts for popular OSNs like Twitter and Facebook. We design, implement, and evaluate SPAR, a Social Partitioning and Replication middleware that mediates transparently between the application and the database layer of an OSN. SPAR leverages the underlying social graph structure in order to minimize the required replication overhead for ensuring that users have their neighbors' data colocated in the same machine. The gains from this are multifold: Application developers can assume local semantics, i.e., develop as they would for a single machine; scalability is achieved by adding commodity machines with low memory and network I/O requirements; and N+K redundancy is achieved at a fraction of the cost. We provide a complete system design, extensive evaluation based on datasets from Twitter, Orkut, and Facebook, and a working implementation. We show that SPAR incurs minimum overhead, can help a well-known Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, a popular key-value store database.

KW - Algorithms

KW - distributed systems

KW - online social networks (OSNs)

KW - scaling

UR - http://www.scopus.com/inward/record.url?scp=84865325520&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84865325520&partnerID=8YFLogxK

U2 - 10.1109/TNET.2012.2188815

DO - 10.1109/TNET.2012.2188815

M3 - Article

VL - 20

SP - 1162

EP - 1175

JO - IEEE/ACM Transactions on Networking

JF - IEEE/ACM Transactions on Networking

SN - 1063-6692

IS - 4

M1 - 6172626

ER -