Localized algorithm for parallel association mining

Mohammed Javeed Zaki, Srinivasan Parthasarathy, Wei Li

Research output: Contribution to conference, Paper

34 Citations (Scopus)

Abstract

Discovery of association rules is an important database mining problem. Mining for association rules involves extracting patterns from large databases and inferring useful rules from them. Several parallel and sequential algorithms have been proposed in the literature to solve this problem. Almost all of these algorithms make repeated passes over the database to determine the commonly occurring patterns or itemsets (sets of items), thus incurring high I/O overhead. In the parallel case, these algorithms do a reduction at the end of each pass to construct the global patterns, thus incurring high synchronization cost. In this paper we describe a new parallel association mining algorithm. Our algorithm is the result of a detailed study of the available parallelism and the properties of associations. The algorithm uses a scheme to cluster related frequent itemsets together, and to partition them among the processors. At the same time it also uses a different database layout which clusters related transactions together, and selectively replicates the database so that the portion of the database needed for the computation of associations is local to each processor. After the initial set-up phase, the algorithm eliminates the need for further communication or synchronization. The algorithm further scans the local database partition only three times, thus minimizing I/O overheads. Unlike previous approaches, the algorithm uses simple intersection operations to compute frequent itemsets and does not have to maintain or search complex hash structures. Our experimental testbed is a 32-processor DEC Alpha cluster interconnected by the Memory Channel network. We present results on the performance of our algorithm on various databases, and compare it against a well-known parallel algorithm. Our algorithm outperforms it by more than an order of magnitude.
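The abstract's remark that frequent itemsets can be computed with simple intersection operations can be illustrated with a small sketch. The code below is not the paper's algorithm: it assumes a generic vertical layout in which each item maps to the set of transaction ids containing it, and it omits the clustering, partitioning, and replication steps entirely. The function names (`vertical_layout`, `support`, `frequent_pairs`) and the toy transactions are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def vertical_layout(transactions):
    """Map each item to the set of transaction ids (its tid-list) containing it."""
    tidlists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists


def support(itemset, tidlists):
    """Support of an itemset: size of the intersection of its items' tid-lists."""
    return len(set.intersection(*(tidlists[item] for item in itemset)))


def frequent_pairs(transactions, min_support):
    """Find frequent 2-itemsets using only tid-list intersections (no hash tree)."""
    tidlists = vertical_layout(transactions)
    # Only items that are frequent on their own can appear in a frequent pair.
    frequent_items = sorted(
        item for item, tids in tidlists.items() if len(tids) >= min_support
    )
    result = {}
    for a, b in combinations(frequent_items, 2):
        count = len(tidlists[a] & tidlists[b])
        if count >= min_support:
            result[(a, b)] = count
    return result


# Toy data, assumed purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
pairs = frequent_pairs(transactions, min_support=2)
```

In this sketch the support of a candidate is obtained by intersecting sets of transaction ids, so no candidate hash structure has to be built or searched; how the paper organizes and distributes these computations across processors is not shown here.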

Original language: English
Pages: 321-330
Number of pages: 10
Publication status: Published - 1 Jan 1997
Event: Proceedings of the 1997 9th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA - Newport, RI, USA
Duration: 22 Jun 1997 to 25 Jun 1997

Other

Other: Proceedings of the 1997 9th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA
City: Newport, RI, USA
Period: 22/6/97 to 25/6/97

ASJC Scopus subject areas

  • Software
  • Safety, Risk, Reliability and Quality

Cite this

Zaki, M. J., Parthasarathy, S., & Li, W. (1997). Localized algorithm for parallel association mining. 321-330. Paper presented at Proceedings of the 1997 9th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA, Newport, RI, USA.