### Abstract

Recent research in frequent pattern mining (FPM) has shifted from obtaining the complete set of frequent patterns to generating only a representative (summary) subset of frequent patterns. Most existing approaches to this problem adopt a two-step solution: in the first step, they obtain all the frequent patterns, and in the second step, some form of clustering is used to obtain the summary pattern set. However, the two-step method is inefficient and sometimes infeasible, since the first step itself may fail to finish in a reasonable amount of time. In this paper, we propose an alternative approach to mining frequent pattern representatives based on uniform sampling of the output space. Our new algorithm, MUSK, obtains representative patterns by sampling uniformly from the pool of all maximal frequent patterns; uniformity is achieved by a variant of the Markov Chain Monte Carlo (MCMC) algorithm. MUSK simulates a random walk on the frequent pattern partial order graph with a prescribed transition probability matrix, whose values are computed locally during the simulation. In the stationary distribution of the random walk, all maximal frequent pattern nodes in the partial order graph are sampled uniformly. Experiments on various kinds of graph and itemset databases validate the effectiveness of our approach.
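The random walk idea can be illustrated on a toy itemset database. The sketch below is not the MUSK transition matrix from the paper, which the abstract does not specify. It is a generic Metropolis-Hastings walk over frequent itemsets, under the assumption that every maximal pattern gets target weight 1 and every non-maximal pattern gets a constant weight `eps`. All names (`support`, `neighbors`, `sample_maximal`, `eps`) and the data are hypothetical. As in the abstract, the acceptance probability uses only local information: the degrees and maximality of the current node and the proposed node.

```python
import random
from collections import Counter

def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def neighbors(itemset, items, transactions, minsup):
    """Frequent patterns one item away in the partial order graph."""
    # Sub-patterns of a frequent itemset are frequent (downward closure).
    subs = [itemset - {i} for i in itemset]
    supers = [itemset | {i} for i in items - itemset
              if support(itemset | {i}, transactions) >= minsup]
    return subs, supers

def weight(supers, eps):
    # Assumed target weight: 1 for maximal nodes (no frequent super-pattern),
    # eps for all other nodes. This is an illustrative choice, not the paper's.
    return 1.0 if not supers else eps

def sample_maximal(transactions, minsup, steps, eps=0.5, seed=0):
    """Metropolis-Hastings walk; counts visits to maximal frequent itemsets."""
    rng = random.Random(seed)
    items = frozenset().union(*transactions)
    x = frozenset()                      # start at the empty pattern
    counts = Counter()
    for _ in range(steps):
        subs_x, supers_x = neighbors(x, items, transactions, minsup)
        nbrs_x = subs_x + supers_x
        if not nbrs_x:                   # only the empty pattern is frequent
            counts[x] += 1
            continue
        y = rng.choice(nbrs_x)           # propose a uniformly chosen neighbor
        subs_y, supers_y = neighbors(y, items, transactions, minsup)
        deg_x, deg_y = len(nbrs_x), len(subs_y) + len(supers_y)
        # Acceptance ratio for target weights w under a uniform-neighbor proposal.
        ratio = (weight(supers_y, eps) * deg_x) / (weight(supers_x, eps) * deg_y)
        if rng.random() < min(1.0, ratio):
            x, supers_x = y, supers_y
        if not supers_x:                 # record only maximal patterns
            counts[x] += 1
    return counts

# Hypothetical toy database; with minsup=2 the maximal patterns are {a,b,c} and {c,d}.
transactions = [frozenset(t) for t in ("abc", "abc", "ab", "cd", "cd", "d")]
counts = sample_maximal(transactions, minsup=2, steps=20000)
```

Because the stationary distribution of this walk is proportional to the assumed weights, the visits recorded at maximal nodes are spread evenly across them in the long run, without first enumerating every frequent pattern.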

Original language | English |
---|---|
Title of host publication | Society for Industrial and Applied Mathematics - 9th SIAM International Conference on Data Mining 2009, Proceedings in Applied Mathematics |
Pages | 646-657 |
Number of pages | 12 |
Volume | 2 |
Publication status | Published - 31 Dec 2009 |
Externally published | Yes |
Event | 9th SIAM International Conference on Data Mining 2009, SDM 2009 - Sparks, NV, United States; Duration: 30 Apr 2009 → 2 May 2009 |

### Other

Other | 9th SIAM International Conference on Data Mining 2009, SDM 2009 |
---|---|
Country | United States |
City | Sparks, NV |
Period | 30/4/09 → 2/5/09 |

### ASJC Scopus subject areas

- Computational Theory and Mathematics
- Software
- Applied Mathematics
