The little engine(s) that could: Scaling online social networks

Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang, Nikos Laoutaris, Parminder Chhabra, Pablo Rodriguez

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

126 Citations (Scopus)

Abstract

The difficulty of scaling Online Social Networks (OSNs) has introduced new system design challenges that have often caused costly re-architecting for services like Twitter and Facebook. The complexity of interconnection of users in social networks has introduced new scalability challenges. Conventional vertical scaling by resorting to full replication can be a costly proposition. Horizontal scaling by partitioning and distributing data among multiple servers - e.g. using DHTs - can lead to costly inter-server communication. We design, implement, and evaluate SPAR, a social partitioning and replication middleware that transparently leverages the social graph structure to achieve data locality while minimizing replication. SPAR guarantees that for all users in an OSN, their direct neighbors' data is co-located in the same server. The gains from this approach are multi-fold: application developers can assume local semantics, i.e., develop as they would for a single server; scalability is achieved by adding commodity servers with low memory and network I/O requirements; and redundancy is achieved at a fraction of the cost. We detail our system design and an evaluation based on datasets from Twitter, Orkut, and Facebook, with a working implementation. We show that SPAR incurs minimal overhead, can help a well-known open-source Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, Facebook's DHT-based key-value store database.
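The co-location guarantee stated in the abstract can be illustrated with a small sketch. This is not SPAR's partitioning algorithm, which the abstract does not describe; it only shows the invariant (every user's direct neighbors have a copy on that user's server) and how the choice of partition affects the number of replicas needed. All function names, the toy graph, and both user-to-server assignments are hypothetical.

```python
from collections import defaultdict


def place_replicas(edges, master):
    """For each friendship edge that crosses servers, copy each endpoint's
    data onto the other endpoint's master server (illustrative rule only)."""
    replicas = defaultdict(set)
    for u, v in edges:
        if master[u] != master[v]:
            replicas[master[u]].add(v)
            replicas[master[v]].add(u)
    return replicas


def is_local(edges, master, replicas):
    """Check the invariant: every neighbor is present (as master or replica)
    on the server that holds the user's master copy."""
    def present(user, server):
        return master[user] == server or user in replicas.get(server, set())
    return all(present(v, master[u]) and present(u, master[v]) for u, v in edges)


def replication_overhead(master, replicas):
    """Average number of replicas created per user."""
    return sum(len(users) for users in replicas.values()) / len(master)


# Toy social graph (assumed data): a, b, c form a triangle-like cluster, d hangs off c.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c")]

# Assignment that ignores the graph (stand-in for hash-style partitioning).
scattered = {"a": 0, "b": 1, "c": 0, "d": 1}
# Assignment that keeps the cluster together (stand-in for a social partition).
clustered = {"a": 0, "b": 0, "c": 0, "d": 1}

scattered_replicas = place_replicas(edges, scattered)
clustered_replicas = place_replicas(edges, clustered)

scattered_overhead = replication_overhead(scattered, scattered_replicas)
clustered_overhead = replication_overhead(clustered, clustered_replicas)
```

On this toy graph both assignments satisfy the invariant, but the graph-aware one needs fewer replicas per user, which is the trade-off between data locality and replication that the abstract refers to.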

Original language: English
Title of host publication: SIGCOMM'10 - Proceedings of the SIGCOMM 2010 Conference
Pages: 375-386
Number of pages: 12
DOIs: https://doi.org/10.1145/1851182.1851227
Publication status: Published - 15 Nov 2010
Externally published: Yes
Event: 7th International Conference on Autonomic Computing, SIGCOMM 2010 - New Delhi, India
Duration: 30 Aug 2010 to 3 Sep 2010

Other

Other: 7th International Conference on Autonomic Computing, SIGCOMM 2010
Country: India
City: New Delhi
Period: 30/8/10 to 3/9/10

Fingerprint

Social Networks
Engine
Servers
Server
Scaling
Replication
Engines
System Design
Partitioning
Scalability
Data Locality
Systems analysis
Single Server
Clone
Interconnection
Middleware
Leverage
Proposition
Open Source
High Throughput

Keywords

  • partition
  • replication
  • scalability
  • social networks

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Theoretical Computer Science

Cite this

Pujol, J. M., Erramilli, V., Siganos, G., Yang, X., Laoutaris, N., Chhabra, P., & Rodriguez, P. (2010). The little engine(s) that could: Scaling online social networks. In SIGCOMM'10 - Proceedings of the SIGCOMM 2010 Conference (pp. 375-386) https://doi.org/10.1145/1851182.1851227

@inproceedings{d62e5e056e6f4ff9989699a160aac02b,
title = "The little engine(s) that could: Scaling online social networks",
abstract = "The difficulty of scaling Online Social Networks (OSNs) has introduced new system design challenges that have often caused costly re-architecting for services like Twitter and Facebook. The complexity of interconnection of users in social networks has introduced new scalability challenges. Conventional vertical scaling by resorting to full replication can be a costly proposition. Horizontal scaling by partitioning and distributing data among multiple servers - e.g. using DHTs - can lead to costly inter-server communication. We design, implement, and evaluate SPAR, a social partitioning and replication middleware that transparently leverages the social graph structure to achieve data locality while minimizing replication. SPAR guarantees that for all users in an OSN, their direct neighbors' data is co-located in the same server. The gains from this approach are multi-fold: application developers can assume local semantics, i.e., develop as they would for a single server; scalability is achieved by adding commodity servers with low memory and network I/O requirements; and redundancy is achieved at a fraction of the cost. We detail our system design and an evaluation based on datasets from Twitter, Orkut, and Facebook, with a working implementation. We show that SPAR incurs minimal overhead, can help a well-known open-source Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, Facebook's DHT-based key-value store database.",
keywords = "partition, replication, scalability, social networks",
author = "Pujol, {Josep M.} and Vijay Erramilli and Georgos Siganos and Xiaoyuan Yang and Nikos Laoutaris and Parminder Chhabra and Pablo Rodriguez",
year = "2010",
month = "11",
day = "15",
doi = "10.1145/1851182.1851227",
language = "English",
isbn = "9781450302012",
pages = "375--386",
booktitle = "SIGCOMM'10 - Proceedings of the SIGCOMM 2010 Conference",

}

TY - GEN

T1 - The little engine(s) that could

T2 - Scaling online social networks

AU - Pujol, Josep M.

AU - Erramilli, Vijay

AU - Siganos, Georgos

AU - Yang, Xiaoyuan

AU - Laoutaris, Nikos

AU - Chhabra, Parminder

AU - Rodriguez, Pablo

PY - 2010/11/15

Y1 - 2010/11/15

AB - The difficulty of scaling Online Social Networks (OSNs) has introduced new system design challenges that have often caused costly re-architecting for services like Twitter and Facebook. The complexity of interconnection of users in social networks has introduced new scalability challenges. Conventional vertical scaling by resorting to full replication can be a costly proposition. Horizontal scaling by partitioning and distributing data among multiple servers - e.g. using DHTs - can lead to costly inter-server communication. We design, implement, and evaluate SPAR, a social partitioning and replication middleware that transparently leverages the social graph structure to achieve data locality while minimizing replication. SPAR guarantees that for all users in an OSN, their direct neighbors' data is co-located in the same server. The gains from this approach are multi-fold: application developers can assume local semantics, i.e., develop as they would for a single server; scalability is achieved by adding commodity servers with low memory and network I/O requirements; and redundancy is achieved at a fraction of the cost. We detail our system design and an evaluation based on datasets from Twitter, Orkut, and Facebook, with a working implementation. We show that SPAR incurs minimal overhead, can help a well-known open-source Twitter clone reach Twitter's scale without changing a line of its application logic, and achieves higher throughput than Cassandra, Facebook's DHT-based key-value store database.

KW - partition

KW - replication

KW - scalability

KW - social networks

UR - http://www.scopus.com/inward/record.url?scp=78149334877&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78149334877&partnerID=8YFLogxK

U2 - 10.1145/1851182.1851227

DO - 10.1145/1851182.1851227

M3 - Conference contribution

AN - SCOPUS:78149334877

SN - 9781450302012

SP - 375

EP - 386

BT - SIGCOMM'10 - Proceedings of the SIGCOMM 2010 Conference

ER -