Generating synthetic decentralized social graphs with local differential privacy

Zhan Qin, Ting Yu, Yin Yang, Issa Khalil, Xiaokui Xiao, Kui Ren

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

15 Citations (Scopus)

Abstract

A large amount of valuable information resides in decentralized social graphs, where no entity has access to the complete graph structure. Instead, each user maintains locally a limited view of the graph. For example, in a phone network, each user keeps a contact list locally in her phone, and does not have access to other users' contacts. The contact lists of all users form an implicit social graph that could be very useful to study the interaction patterns among different populations. However, due to privacy concerns, one could not simply collect the unfettered local views from users and reconstruct a decentralized social network. In this paper, we investigate techniques to ensure local differential privacy of individuals while collecting structural information and generating representative synthetic social graphs. We show that existing local differential privacy and synthetic graph generation techniques are insufficient for preserving important graph properties, due to excessive noise injection, inability to retain important graph structure, or both. Motivated by this, we propose LDPGen, a novel multi-phase technique that incrementally clusters users based on their connections to different partitions of the whole population. Every time a user reports information, LDPGen carefully injects noise to ensure local differential privacy. We derive optimal parameters in this process to cluster structurally similar users together. Once a good clustering of users is obtained, LDPGen adapts existing social graph generation models to construct a synthetic social graph. We conduct comprehensive experiments over four real datasets to evaluate the quality of the obtained synthetic graphs, using a variety of metrics, including (i) important graph structural measures; (ii) quality of community discovery; and (iii) applicability in social recommendation. Our experiments show that the proposed technique produces high-quality synthetic graphs that represent the original decentralized social graphs well, and significantly outperform those from baseline approaches.
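
The abstract does not specify how LDPGen injects noise or how it partitions users, so the following is only a minimal illustrative sketch of the general idea of a user reporting noisy connection counts to each partition. It assumes the Laplace mechanism and edge-level local differential privacy (two neighbor lists are adjacent if they differ in one contact); the function names, the `partition_of` mapping, and the parameter `epsilon` are hypothetical and not taken from the paper.

```python
import random


def degree_vector(neighbors, partition_of, k):
    """Count how many of a user's neighbors fall into each of k partitions."""
    counts = [0] * k
    for v in neighbors:
        counts[partition_of[v]] += 1
    return counts


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_report(neighbors, partition_of, k, epsilon, rng):
    """Return the degree vector perturbed locally before it leaves the user.

    Assumption: adding or removing one neighbor changes exactly one count
    by 1, so the L1 sensitivity is 1 and the Laplace scale is 1 / epsilon.
    """
    true_counts = degree_vector(neighbors, partition_of, k)
    return [c + laplace_noise(1.0 / epsilon, rng) for c in true_counts]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy data: four other users split into two partitions.
    partition_of = {1: 0, 2: 1, 3: 1, 4: 0}
    contacts = [1, 2, 3]
    print(noisy_report(contacts, partition_of, k=2, epsilon=1.0, rng=rng))
```

A collector that only ever sees such perturbed vectors could then group users with similar vectors, which is the role the abstract assigns to the clustering phases; the smaller `epsilon` is, the larger the noise and the harder that grouping becomes.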

Original language: English
Title of host publication: CCS 2017 - Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
Publisher: Association for Computing Machinery
Pages: 425-438
Number of pages: 14
Volume: Part F131467
ISBN (Electronic): 9781450349468
DOIs: https://doi.org/10.1145/3133956.3134086
Publication status: Published - 30 Oct 2017
Event: 24th ACM SIGSAC Conference on Computer and Communications Security, CCS 2017 - Dallas, United States
Duration: 30 Oct 2017 to 3 Nov 2017

Other

Other: 24th ACM SIGSAC Conference on Computer and Communications Security, CCS 2017
Country: United States
City: Dallas
Period: 30/10/17 to 3/11/17

Keywords

  • Community Discovery
  • Decentralized Social Networks
  • Local Differential Privacy
  • Synthetic Graph Generation

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications

Cite this

Qin, Z., Yu, T., Yang, Y., Khalil, I., Xiao, X., & Ren, K. (2017). Generating synthetic decentralized social graphs with local differential privacy. In CCS 2017 - Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (Vol. Part F131467, pp. 425-438). Association for Computing Machinery. https://doi.org/10.1145/3133956.3134086

