Low-rank mechanism

Optimizing batch queries under differential privacy

Ganzhao Yuan, Zhenjie Zhang, Marianne Winslett, Xiaokui Xiao, Yin Yang, Zhifeng Hao

Research output: Contribution to journal › Article

54 Citations (Scopus)

Abstract

Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably the matrix mechanism [16], has suggested that processing a batch of correlated queries as a whole can potentially achieve considerable accuracy gains, compared to answering them individually. However, as we point out in this paper, the matrix mechanism is mainly of theoretical interest; in particular, several inherent problems in its design limit its accuracy in practice, which almost never exceeds that of naïve methods. In fact, we are not aware of any existing solution that can effectively optimize a query batch under differential privacy. Motivated by this, we propose the Low-Rank Mechanism (LRM), the first practical differentially private technique for answering batch queries with high accuracy, based on a low-rank approximation of the workload matrix. We prove that the accuracy provided by LRM is close to the theoretical lower bound for any mechanism to answer a batch of queries under differential privacy. Extensive experiments using real data demonstrate that LRM consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.
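
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract only says that LRM builds on a low-rank approximation of the workload matrix, so everything specific below is an assumption made for illustration. The sketch assumes the data is a histogram x of counts, that neighboring datasets differ by 1 in a single count, that the workload is a matrix W with one query per row, and that noise is Laplace noise scaled to L1 sensitivity. A truncated SVD stands in for whatever factorization W ≈ B·L the paper actually computes, and all function names are hypothetical.

```python
import numpy as np


def l1_sensitivity(M):
    # Largest column L1 norm: the most M @ x can change (in L1 norm) when a
    # single count in x changes by 1. This neighbor model is an assumption.
    return np.abs(M).sum(axis=0).max()


def laplace_baseline(W, x, epsilon, rng):
    # Naive approach: answer every query directly and perturb each result.
    scale = l1_sensitivity(W) / epsilon
    return W @ x + rng.laplace(scale=scale, size=W.shape[0])


def truncated_svd_factors(W, rank):
    # Stand-in factorization W ~= B @ L (NOT the paper's optimization).
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank, :]


def low_rank_answer(B, L, x, epsilon, rng):
    # Perturb the few intermediate results L @ x, then recombine them with B.
    scale = l1_sensitivity(L) / epsilon
    return B @ (L @ x + rng.laplace(scale=scale, size=L.shape[0]))


def expected_squared_error(B, L, epsilon):
    # Laplace noise with scale b has variance 2 * b**2 per coordinate;
    # multiplying by B gives total expected squared error 2 * b**2 * ||B||_F^2.
    b = l1_sensitivity(L) / epsilon
    return 2.0 * b**2 * np.sum(B**2)


rng = np.random.default_rng(0)
basis = rng.integers(0, 2, size=(4, 32)).astype(float)      # 4 underlying queries
W = rng.integers(0, 3, size=(64, 4)).astype(float) @ basis  # 64 correlated queries
x = rng.integers(0, 100, size=32).astype(float)             # hypothetical histogram

B, L = truncated_svd_factors(W, rank=4)
noisy = low_rank_answer(B, L, x, epsilon=1.0, rng=rng)
print("baseline expected error:", expected_squared_error(np.eye(64), W, 1.0))
print("low-rank expected error:", expected_squared_error(B, L, 1.0))
```

Whether the factored form has lower error than the baseline depends on the chosen B and L. A plain SVD gives no guarantee either way, which is presumably why a dedicated optimization is needed.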

Original language: English
Pages (from-to): 1352-1363
Number of pages: 12
Journal: Proceedings of the VLDB Endowment
Volume: 5
Issue number: 11
Publication status: Published - Jul 2012
Externally published: Yes

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Yuan, G., Zhang, Z., Winslett, M., Xiao, X., Yang, Y., & Hao, Z. (2012). Low-rank mechanism: Optimizing batch queries under differential privacy. Proceedings of the VLDB Endowment, 5(11), 1352-1363.

@article{9a0fad424a3144868c215dba330b27bb,
title = "Low-rank mechanism: Optimizing batch queries under differential privacy",
abstract = "Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably the matrix mechanism [16], has suggested that processing a batch of correlated queries as a whole can potentially achieve considerable accuracy gains, compared to answering them individually. However, as we point out in this paper, the matrix mechanism is mainly of theoretical interest; in particular, several inherent problems in its design limit its accuracy in practice, which almost never exceeds that of na{\"i}ve methods. In fact, we are not aware of any existing solution that can effectively optimize a query batch under differential privacy. Motivated by this, we propose the Low-Rank Mechanism (LRM), the first practical differentially private technique for answering batch queries with high accuracy, based on a low-rank approximation of the workload matrix. We prove that the accuracy provided by LRM is close to the theoretical lower bound for any mechanism to answer a batch of queries under differential privacy. Extensive experiments using real data demonstrate that LRM consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.",
author = "Ganzhao Yuan and Zhenjie Zhang and Marianne Winslett and Xiaokui Xiao and Yin Yang and Zhifeng Hao",
year = "2012",
month = "7",
language = "English",
volume = "5",
pages = "1352--1363",
journal = "Proceedings of the VLDB Endowment",
issn = "2150-8097",
publisher = "Very Large Data Base Endowment Inc.",
number = "11",

}

TY - JOUR

T1 - Low-rank mechanism

T2 - Optimizing batch queries under differential privacy

AU - Yuan, Ganzhao

AU - Zhang, Zhenjie

AU - Winslett, Marianne

AU - Xiao, Xiaokui

AU - Yang, Yin

AU - Hao, Zhifeng

PY - 2012/7

Y1 - 2012/7

AB - Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result, such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results, while satisfying the privacy guarantees. Previous work, notably the matrix mechanism [16], has suggested that processing a batch of correlated queries as a whole can potentially achieve considerable accuracy gains, compared to answering them individually. However, as we point out in this paper, the matrix mechanism is mainly of theoretical interest; in particular, several inherent problems in its design limit its accuracy in practice, which almost never exceeds that of naïve methods. In fact, we are not aware of any existing solution that can effectively optimize a query batch under differential privacy. Motivated by this, we propose the Low-Rank Mechanism (LRM), the first practical differentially private technique for answering batch queries with high accuracy, based on a low-rank approximation of the workload matrix. We prove that the accuracy provided by LRM is close to the theoretical lower bound for any mechanism to answer a batch of queries under differential privacy. Extensive experiments using real data demonstrate that LRM consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.

UR - http://www.scopus.com/inward/record.url?scp=84872862526&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872862526&partnerID=8YFLogxK

M3 - Article

VL - 5

SP - 1352

EP - 1363

JO - Proceedings of the VLDB Endowment

JF - Proceedings of the VLDB Endowment

SN - 2150-8097

IS - 11

ER -