Protecting sensitive labels in social network data anonymization

Mingxuan Yuan, Lei Chen, Philip S. Yu, Ting Yu

Research output: Contribution to journal (Article)

63 Citations (Scopus)

Abstract

Privacy is one of the major concerns when publishing or sharing social network data for social science research and business analysis. Recently, researchers have developed privacy models similar to k-anonymity to prevent node reidentification through structural information. However, even when these privacy models are enforced, an attacker may still be able to infer a person's private information if the nodes in a group largely share the same sensitive labels (i.e., attributes). In other words, the label-node relationship is not well protected by purely structural anonymization methods. Furthermore, existing approaches, which rely on edge editing or node clustering, may significantly alter key graph properties. In this paper, we define a k-degree-l-diversity anonymity model that protects both the structural information and the sensitive labels of individuals. We further propose a novel anonymization methodology based on adding noise nodes, and develop a new algorithm that adds noise nodes to the original graph while aiming to introduce the least distortion to graph properties. Most importantly, we provide a rigorous analysis of the theoretical bounds on the number of noise nodes added and of their impact on an important graph property. We conduct extensive experiments to evaluate the effectiveness of the proposed technique.
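The abstract names the k-degree-l-diversity model without spelling out its condition. The following is a minimal sketch of one plausible reading, not the authors' algorithm: it assumes the model requires every group of same-degree nodes to contain at least k nodes and at least l distinct sensitive labels. The function name `satisfies_kdld`, the edge-list input format, and the label values are hypothetical and chosen only for illustration; the sketch checks the condition and does not perform the noise-node anonymization itself.

```python
from collections import defaultdict


def satisfies_kdld(edges, labels, k, l):
    """Check an ASSUMED k-degree-l-diversity condition on an undirected graph.

    edges  -- iterable of (u, v) pairs; every endpoint must be a key of `labels`
    labels -- dict mapping each node to its sensitive label
    Assumed condition: each same-degree group has at least k nodes and
    at least l distinct sensitive labels.
    """
    # Compute the degree of every node, including isolated ones.
    degree = {node: 0 for node in labels}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Group nodes by degree, tracking group size and the labels seen.
    group_size = defaultdict(int)
    group_labels = defaultdict(set)
    for node, d in degree.items():
        group_size[d] += 1
        group_labels[d].add(labels[node])

    return all(
        group_size[d] >= k and len(group_labels[d]) >= l
        for d in group_size
    )


# A 4-cycle: every node has degree 2, so all four nodes form one group.
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
same_labels = {"a": "flu", "b": "flu", "c": "flu", "d": "flu"}
mixed_labels = {"a": "flu", "b": "cold", "c": "flu", "d": "cold"}

# Degree anonymity alone holds, but the shared label leaks information.
print(satisfies_kdld(cycle, same_labels, k=2, l=1))
print(satisfies_kdld(cycle, same_labels, k=2, l=2))
print(satisfies_kdld(cycle, mixed_labels, k=2, l=2))
```

The first two calls illustrate the gap described in the abstract: the graph is degree-anonymous for k = 2, yet every node in the group carries the same label, so the diversity requirement with l = 2 fails until the labels within the group differ.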

Original language: English
Article number: 6109254
Pages (from-to): 633-647
Number of pages: 15
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 25
Issue number: 3
DOI: 10.1109/TKDE.2011.259
Publication status: Published - 8 Feb 2013
Externally published: Yes

Fingerprint

Labels
Social sciences
Industry
Experiments

Keywords

  • anonymous
  • privacy
  • Social networks

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
  • Computer Science Applications

Cite this

Protecting sensitive labels in social network data anonymization. / Yuan, Mingxuan; Chen, Lei; Yu, Philip S.; Yu, Ting.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 25, No. 3, 6109254, 08.02.2013, p. 633-647.

Yuan, Mingxuan ; Chen, Lei ; Yu, Philip S. ; Yu, Ting. / Protecting sensitive labels in social network data anonymization. In: IEEE Transactions on Knowledge and Data Engineering. 2013 ; Vol. 25, No. 3. pp. 633-647.
@article{6d72809bc7a948daa76371b043b4e8b2,
title = "Protecting sensitive labels in social network data anonymization",
abstract = "Privacy is one of the major concerns when publishing or sharing social network data for social science research and business analysis. Recently, researchers have developed privacy models similar to k-anonymity to prevent node reidentification through structure information. However, even when these privacy models are enforced, an attacker may still be able to infer one's private information if a group of nodes largely share the same sensitive labels (i.e., attributes). In other words, the label-node relationship is not well protected by pure structure anonymization methods. Furthermore, existing approaches, which rely on edge editing or node clustering, may significantly alter key graph properties. In this paper, we define a k-degree-l-diversity anonymity model that considers the protection of structural information as well as sensitive labels of individuals. We further propose a novel anonymization methodology based on adding noise nodes. We develop a new algorithm by adding noise nodes into the original graph with the consideration of introducing the least distortion to graph properties. Most importantly, we provide a rigorous analysis of the theoretical bounds on the number of noise nodes added and their impacts on an important graph property. We conduct extensive experiments to evaluate the effectiveness of the proposed technique.",
keywords = "anonymous, privacy, Social networks",
author = "Mingxuan Yuan and Lei Chen and Yu, {Philip S.} and Ting Yu",
year = "2013",
month = "2",
day = "8",
doi = "10.1109/TKDE.2011.259",
language = "English",
volume = "25",
pages = "633--647",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "3",
}

TY - JOUR

T1 - Protecting sensitive labels in social network data anonymization

AU - Yuan, Mingxuan

AU - Chen, Lei

AU - Yu, Philip S.

AU - Yu, Ting

PY - 2013/2/8

Y1 - 2013/2/8

N2 - Privacy is one of the major concerns when publishing or sharing social network data for social science research and business analysis. Recently, researchers have developed privacy models similar to k-anonymity to prevent node reidentification through structure information. However, even when these privacy models are enforced, an attacker may still be able to infer one's private information if a group of nodes largely share the same sensitive labels (i.e., attributes). In other words, the label-node relationship is not well protected by pure structure anonymization methods. Furthermore, existing approaches, which rely on edge editing or node clustering, may significantly alter key graph properties. In this paper, we define a k-degree-l-diversity anonymity model that considers the protection of structural information as well as sensitive labels of individuals. We further propose a novel anonymization methodology based on adding noise nodes. We develop a new algorithm by adding noise nodes into the original graph with the consideration of introducing the least distortion to graph properties. Most importantly, we provide a rigorous analysis of the theoretical bounds on the number of noise nodes added and their impacts on an important graph property. We conduct extensive experiments to evaluate the effectiveness of the proposed technique.

KW - anonymous

KW - privacy

KW - Social networks

UR - http://www.scopus.com/inward/record.url?scp=84873308092&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873308092&partnerID=8YFLogxK

U2 - 10.1109/TKDE.2011.259

DO - 10.1109/TKDE.2011.259

M3 - Article

VL - 25

SP - 633

EP - 647

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 3

M1 - 6109254

ER -