Optimizing batch linear queries under exact and approximate differential privacy

Ganzhao Yuan, Zhenjie Zhang, Marianne Winslett, Xiaokui Xiao, Yin Yang, Zhifeng Hao

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Differential privacy is a promising privacy-preserving paradigm for statistical query processing over sensitive data. It works by injecting random noise into each query result such that it is provably hard for the adversary to infer the presence or absence of any individual record from the published noisy results. The main objective in differentially private query processing is to maximize the accuracy of the query results while satisfying the privacy guarantees. Previous work, notably Li et al. [2010], has suggested that, with an appropriate strategy, processing a batch of correlated queries as a whole achieves considerably higher accuracy than answering them individually. However, to our knowledge there is currently no practical solution to find such a strategy for an arbitrary query batch; existing methods either return strategies of poor quality (often worse than naive methods) or require prohibitively expensive computations for even moderately large domains. Motivated by this, we propose a low-rank mechanism (LRM), the first practical differentially private technique for answering batch linear queries with high accuracy. LRM works for both exact (i.e., ε-) and approximate (i.e., (ε, δ)-) differential privacy definitions. We derive the utility guarantees of LRM and provide guidance on how to set the privacy parameters, given the user's utility expectation. Extensive experiments using real data demonstrate that our proposed method consistently outperforms state-of-the-art query processing solutions under differential privacy, by large margins.
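The gain from answering a correlated batch through a strategy can be illustrated with a small sketch. It is not the LRM optimization from the paper: the factorization W = BL below is hand-picked for a toy workload, and the names `l1_sensitivity`, `laplace_answers`, `factored_answers`, and the histogram `x` are assumptions made for illustration. The sketch assumes ε-differential privacy, the Laplace mechanism, and neighboring datasets whose count vectors differ by 1 in a single entry.

```python
import numpy as np

def l1_sensitivity(M):
    """Largest column L1 norm: the most M @ x can change (in L1) when one count changes by 1."""
    return np.abs(M).sum(axis=0).max()

def laplace_answers(W, x, eps, rng):
    """Baseline: add Laplace noise directly to every query result in W @ x."""
    scale = l1_sensitivity(W) / eps
    return W @ x + rng.laplace(0.0, scale, size=W.shape[0])

def factored_answers(B, L, x, eps, rng):
    """Answer the intermediate queries L @ x privately, then recombine with B (post-processing)."""
    scale = l1_sensitivity(L) / eps
    return B @ (L @ x + rng.laplace(0.0, scale, size=L.shape[0]))

def expected_sq_error_direct(W, eps):
    # Each of the m answers carries independent Laplace noise with variance 2 * scale^2.
    return 2.0 * W.shape[0] * (l1_sensitivity(W) / eps) ** 2

def expected_sq_error_factored(B, L, eps):
    # E||B @ noise||^2 = ||B||_F^2 * 2 * scale^2 for i.i.d. Laplace noise.
    return 2.0 * np.sum(B ** 2) * (l1_sensitivity(L) / eps) ** 2

# Toy workload (assumed): four identical "total count" queries over a domain of size 4.
W = np.ones((4, 4))
B = np.ones((4, 1))   # hand-picked factors with W = B @ L, rank 1
L = np.ones((1, 4))

eps = 1.0
x = np.array([10.0, 20.0, 30.0, 40.0])   # hypothetical histogram of the sensitive data
rng = np.random.default_rng(0)

direct = laplace_answers(W, x, eps, rng)
factored = factored_answers(B, L, x, eps, rng)
err_direct = expected_sq_error_direct(W, eps)
err_factored = expected_sq_error_factored(B, L, eps)
```

In this toy case the direct approach must scale its noise to a sensitivity of 4, while the factored approach perturbs a single intermediate query of sensitivity 1, so the expected total squared error drops from 128/ε² to 8/ε². Choosing B and L well for an arbitrary workload is the hard part, and it is that problem the abstract describes LRM as solving.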

Original language: English
Article number: 11
Journal: ACM Transactions on Database Systems
Volume: 40
Issue number: 2
DOIs: https://doi.org/10.1145/2699501
Publication status: Published - 1 Jun 2015
Externally published: Yes

Keywords

  • Algorithms
  • Experimentation
  • Theory

ASJC Scopus subject areas

  • Information Systems

Cite this

Optimizing batch linear queries under exact and approximate differential privacy. / Yuan, Ganzhao; Zhang, Zhenjie; Winslett, Marianne; Xiao, Xiaokui; Yang, Yin; Hao, Zhifeng.

In: ACM Transactions on Database Systems, Vol. 40, No. 2, 11, 01.06.2015.

@article{6f60cd548f2a450da2dca0f7ac2247fe,
title = "Optimizing batch linear queries under exact and approximate differential privacy",
keywords = "Algorithms, Experimentation, Theory",
author = "Ganzhao Yuan and Zhenjie Zhang and Marianne Winslett and Xiaokui Xiao and Yin Yang and Zhifeng Hao",
year = "2015",
month = "6",
day = "1",
doi = "10.1145/2699501",
language = "English",
volume = "40",
journal = "ACM Transactions on Database Systems",
issn = "0362-5915",
publisher = "Association for Computing Machinery (ACM)",
number = "2",

}

TY - JOUR
T1 - Optimizing batch linear queries under exact and approximate differential privacy
AU - Yuan, Ganzhao
AU - Zhang, Zhenjie
AU - Winslett, Marianne
AU - Xiao, Xiaokui
AU - Yang, Yin
AU - Hao, Zhifeng
PY - 2015/6/1
Y1 - 2015/6/1
KW - Algorithms
KW - Experimentation
KW - Theory
UR - http://www.scopus.com/inward/record.url?scp=84934766644&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84934766644&partnerID=8YFLogxK
U2 - 10.1145/2699501
DO - 10.1145/2699501
M3 - Article
AN - SCOPUS:84934766644
VL - 40
JO - ACM Transactions on Database Systems
JF - ACM Transactions on Database Systems
SN - 0362-5915
IS - 2
M1 - 11
ER -