Parallel Sequence Mining on Shared-Memory Machines

Mohammed J. Zaki

Research output: Contribution to journal › Article

68 Citations (Scopus)

Abstract

We present pSPADE, a parallel algorithm for fast discovery of frequent sequences in large databases. pSPADE decomposes the original search space into smaller suffix-based classes. Each class can be solved in main memory using efficient search techniques and simple join operations. Furthermore, each class can be solved independently on each processor, requiring no synchronization. However, dynamic interclass and intraclass load balancing must be exploited to ensure that each processor gets an equal amount of work. Experiments on a 12-processor SGI Origin 2000 shared-memory system show good speedup and excellent scaleup results.
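The decomposition idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes sequences are plain tuples of items, uses class size as a stand-in cost estimate, and replaces pSPADE's dynamic load balancing with a simple static greedy assignment. The names `suffix_classes` and `assign_classes` are hypothetical.

```python
from collections import defaultdict
import heapq


def suffix_classes(sequences, k=1):
    """Group sequences (tuples of items) by their last k items."""
    classes = defaultdict(list)
    for seq in sequences:
        classes[tuple(seq[-k:])].append(seq)
    return dict(classes)


def assign_classes(classes, n_procs):
    """Greedy assignment: give the next-largest class to the least-loaded processor.

    Assumption: the number of sequences in a class approximates its cost.
    """
    heap = [(0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    # Largest classes first; ties broken by suffix for a deterministic order.
    ordered = sorted(classes.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for suffix, members in ordered:
        load, p = heapq.heappop(heap)
        assignment[p].append(suffix)
        heapq.heappush(heap, (load + len(members), p))
    return assignment


# Toy input: six 2-sequences that fall into three suffix classes.
sequences = [("A", "B"), ("C", "B"), ("D", "B"),
             ("A", "C"), ("B", "C"), ("A", "D")]
classes = suffix_classes(sequences)
assignment = assign_classes(classes, n_procs=2)
```

Because each class holds every sequence sharing a suffix, a processor can work through its assigned classes without consulting the others. A static assignment like this one can still leave processors idle when the cost estimates are poor, which is the problem that dynamic balancing is meant to address.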

Original language: English
Pages (from-to): 401-426
Number of pages: 26
Journal: Journal of Parallel and Distributed Computing
Volume: 61
Issue number: 3
Publication status: Published - 1 Mar 2001
Externally published: Yes

Keywords

  • Knowledge discovery; data mining; sequential patterns; frequent sequences; temporal association rules

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture
  • Control and Systems Engineering
