Association Rule Hiding

Vassilios S. Verykios, Ahmed Elmagarmid, Elisa Bertino, Yucel Saygin, Elena Dasseni

Research output: Contribution to journal › Article

332 Citations (Scopus)

Abstract

Large repositories of data contain sensitive information that must be protected against unauthorized access. Protecting the confidentiality of this information has been a long-term goal for the database security research community and for government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects, and in some way modifies, true data values and relationships. In this paper, we investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules, which is characterized as sensitive. A rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and the impact that they have on the original database.
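
The abstract does not describe the three strategies or five algorithms in detail, so the following is only a rough sketch of the general idea of hiding a rule, not the authors' method. It assumes that disclosure risk is measured by rule confidence, that the antecedent and consequent are disjoint, and that a rule is hidden by deleting one consequent item from supporting transactions until confidence is no longer above the threshold. The names `count`, `confidence`, `hide_rule`, `threshold`, and `victim` are hypothetical and chosen for illustration.

```python
from typing import List, Set

Transaction = Set[str]


def count(db: List[Transaction], itemset: Set[str]) -> int:
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in db if itemset <= t)


def confidence(db: List[Transaction], antecedent: Set[str], consequent: Set[str]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    base = count(db, antecedent)
    return count(db, antecedent | consequent) / base if base else 0.0


def hide_rule(
    db: List[Transaction],
    antecedent: Set[str],
    consequent: Set[str],
    threshold: float,
    victim: str,
) -> List[Transaction]:
    """Return a modified copy of `db` in which the rule's confidence is
    pushed down to `threshold` or below, if enough supporting transactions exist.

    Assumption (not from the source): `victim` is an item of the consequent,
    and it is deleted from transactions that fully support the rule.
    """
    if victim not in consequent:
        raise ValueError("victim must be an item of the consequent")
    sanitized = [set(t) for t in db]  # leave the original database untouched
    full = antecedent | consequent
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) <= threshold:
            break  # the rule is no longer above the threshold
        if full <= t:
            t.discard(victim)  # this transaction stops supporting the rule
    return sanitized


# Toy data, invented for this example.
db = [{"bread", "milk"}, {"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"}]
sanitized = hide_rule(db, {"bread"}, {"milk"}, threshold=0.5, victim="milk")
print(confidence(db, {"bread"}, {"milk"}), confidence(sanitized, {"bread"}, {"milk"}))
```

Every deleted item is a change to the true data, which is the trade-off the abstract points to: lowering the rule's confidence also alters values that legitimate users of the released database may rely on.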

Original language: English
Pages (from-to): 434-447
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 16
Issue number: 4
DOI: https://doi.org/10.1109/TKDE.2004.1269668
Publication status: Published - 1 Apr 2004
Externally published: Yes

Fingerprint

Association rules
Learning algorithms
Data mining
Learning systems
Industry

Keywords

  • Association rule mining
  • Privacy preserving data mining
  • Sensitive rule hiding

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Information Systems

Cite this

Association Rule Hiding. / Verykios, Vassilios S.; Elmagarmid, Ahmed; Bertino, Elisa; Saygin, Yucel; Dasseni, Elena.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 4, 01.04.2004, p. 434-447.

Verykios, VS, Elmagarmid, A, Bertino, E, Saygin, Y & Dasseni, E 2004, 'Association Rule Hiding', IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 4, pp. 434-447. https://doi.org/10.1109/TKDE.2004.1269668
Verykios, Vassilios S.; Elmagarmid, Ahmed; Bertino, Elisa; Saygin, Yucel; Dasseni, Elena. / Association Rule Hiding. In: IEEE Transactions on Knowledge and Data Engineering. 2004; Vol. 16, No. 4. pp. 434-447.
@article{0290fe53266948daae6ca1b26a6a61f1,
title = "Association Rule Hiding",
abstract = "Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for the government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, and still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects, in some way, and modifies true data values and relationships. In this paper, we investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules, which is characterized as sensitive. One rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and the impact that they have in the original database.",
keywords = "Association rule mining, Privacy preserving data mining, Sensitive rule hiding",
author = "Verykios, {Vassilios S.} and Ahmed Elmagarmid and Elisa Bertino and Yucel Saygin and Elena Dasseni",
year = "2004",
month = "4",
day = "1",
doi = "10.1109/TKDE.2004.1269668",
language = "English",
volume = "16",
pages = "434--447",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "4",

}

TY - JOUR
T1 - Association Rule Hiding
AU - Verykios, Vassilios S.
AU - Elmagarmid, Ahmed
AU - Bertino, Elisa
AU - Saygin, Yucel
AU - Dasseni, Elena
PY - 2004/4/1
Y1 - 2004/4/1

AB - Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for the government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, and still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs of the data users. Every disclosure limitation method affects, in some way, and modifies true data values and relationships. In this paper, we investigate confidentiality issues of a broad category of rules, the association rules. In particular, we present three strategies and five algorithms for hiding a group of association rules, which is characterized as sensitive. One rule is characterized as sensitive if its disclosure risk is above a certain privacy threshold. Sometimes, sensitive rules should not be disclosed to the public since, among other things, they may be used for inferring sensitive data, or they may provide business competitors with an advantage. We also perform an evaluation study of the hiding algorithms in order to analyze their time complexity and the impact that they have in the original database.

KW - Association rule mining
KW - Privacy preserving data mining
KW - Sensitive rule hiding
UR - http://www.scopus.com/inward/record.url?scp=2142754478&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=2142754478&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2004.1269668
DO - 10.1109/TKDE.2004.1269668
M3 - Article
AN - SCOPUS:2142754478
VL - 16
SP - 434
EP - 447
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
SN - 1041-4347
IS - 4
ER -