Scalable algorithms for association mining

Mohammed J. Zaki

Research output: Contribution to journal › Article

928 Citations (Scopus)

Abstract

Association rule discovery has emerged as an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent itemsets and then forming conditional implication rules among them. In this paper, we present efficient algorithms for the discovery of frequent itemsets, which forms the compute-intensive phase of the task. The algorithms utilize the structural properties of frequent itemsets to facilitate fast discovery. The items are organized into a subset lattice search space, which is decomposed into small independent chunks, or sublattices, that can be solved in memory. Efficient lattice traversal techniques are presented that quickly identify all the long frequent itemsets and, if required, their subsets. We also present the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques. We experimentally compare the new algorithms against previous approaches, obtaining improvements of more than an order of magnitude on our test databases.
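
As a rough illustration of the ideas in the abstract, the following Python sketch mines frequent itemsets by splitting the subset lattice into prefix-based sublattices and searching each one depth-first. It is a minimal sketch under stated assumptions, not the paper's implementation: the vertical (item to transaction-id set) layout is just one possible database layout, the prefix-based split is one possible decomposition, and the names `vertical_layout`, `mine_class`, and `frequent_itemsets` are hypothetical.

```python
from collections import defaultdict


def vertical_layout(transactions):
    """Map each item to the set of transaction ids (tidset) that contain it."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def mine_class(prefix_class, min_support, out):
    """Depth-first search over one prefix-based sublattice.

    prefix_class is a list of (itemset, tidset) pairs that share a prefix.
    Each member is extended only with the members listed after it, so the
    sublattice rooted at each member is explored independently of the rest.
    """
    for i, (itemset, tids) in enumerate(prefix_class):
        out[itemset] = len(tids)
        child_class = []
        for other_itemset, other_tids in prefix_class[i + 1:]:
            shared = tids & other_tids  # support via tidset intersection
            if len(shared) >= min_support:
                child_class.append((itemset + (other_itemset[-1],), shared))
        if child_class:
            mine_class(child_class, min_support, out)


def frequent_itemsets(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets."""
    tidsets = vertical_layout(transactions)
    # Assumption: items are ordered lexicographically to fix the prefix order.
    root = [((item,), tids) for item, tids in sorted(tidsets.items())
            if len(tids) >= min_support]
    out = {}
    mine_class(root, min_support, out)
    return out


# Toy database (made up for illustration only).
transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
result = frequent_itemsets(transactions, min_support=3)
```

With a minimum support of 3 transactions, the toy database yields the three single items and the three pairs as frequent, while {A, B, C} is pruned because it appears in only two transactions. Because each recursive call touches only the tidsets of one prefix class, a class can in principle be processed on its own once its tidsets fit in memory.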

Original language: English
Pages (from-to): 372-390
Number of pages: 19
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 12
Issue number: 3
DOIs: 10.1109/69.846291
Publication status: Published - 3 Dec 2000
Externally published: Yes

Fingerprint

Association reactions
Data mining
Association rules
Structural properties
Decomposition
Data storage equipment

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Artificial Intelligence
  • Information Systems

Cite this

Scalable algorithms for association mining. / Zaki, Mohammed J.

In: IEEE Transactions on Knowledge and Data Engineering, Vol. 12, No. 3, 03.12.2000, p. 372-390.

@article{f6ed23c6634c4f08bc3b2556d42290fc,
title = "Scalable algorithms for association mining",
author = "Zaki, {Mohammed J.}",
year = "2000",
month = "12",
day = "3",
doi = "10.1109/69.846291",
language = "English",
volume = "12",
pages = "372--390",
journal = "IEEE Transactions on Knowledge and Data Engineering",
issn = "1041-4347",
publisher = "IEEE Computer Society",
number = "3",
}

TY - JOUR

T1 - Scalable algorithms for association mining

AU - Zaki, Mohammed J.

PY - 2000/12/3

Y1 - 2000/12/3

UR - http://www.scopus.com/inward/record.url?scp=0033718951&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033718951&partnerID=8YFLogxK

U2 - 10.1109/69.846291

DO - 10.1109/69.846291

M3 - Article

AN - SCOPUS:0033718951

VL - 12

SP - 372

EP - 390

JO - IEEE Transactions on Knowledge and Data Engineering

JF - IEEE Transactions on Knowledge and Data Engineering

SN - 1041-4347

IS - 3

ER -