Efficiently mining frequent embedded unordered trees

Mohammed J. Zaki

Research output: Contribution to journal › Article

74 Citations (Scopus)

Abstract

Mining frequent trees is useful in domains such as bioinformatics, web mining, and the mining of semi-structured data. In this paper we introduce SLEUTH, an efficient algorithm for mining frequent, unordered, embedded subtrees in a database of labeled trees. The key contributions of our work are as follows: (1) we give the first algorithm that enumerates all embedded, unordered trees; (2) we propose a new equivalence class extension scheme to generate all candidate trees; (3) we extend the notion of scope-list joins to compute the frequency of unordered trees; and (4) we conduct a performance evaluation on several synthetic and real datasets to show that SLEUTH is an efficient algorithm whose performance is comparable to that of TreeMiner, which mines only ordered trees.
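To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what it means for a pattern to occur as an embedded, unordered subtree, and of how its support in a database could be counted. It is not SLEUTH and uses neither equivalence classes nor scope-list joins. The tuple representation and the function names (flatten, contains, support) are hypothetical, and it assumes that sibling nodes of the pattern must map to nodes with disjoint subtrees, which the abstract does not spell out.

```python
# Illustrative brute force only; NOT the SLEUTH algorithm from the paper.
# A tree is a (label, [children]) tuple. All names here are hypothetical.

def flatten(tree):
    """Return (labels, ends) in preorder; ends[i] is the preorder index
    of the last node in the subtree rooted at node i."""
    labels, ends = [], []

    def visit(node):
        label, children = node
        i = len(labels)
        labels.append(label)
        ends.append(i)
        for child in children:
            visit(child)
        ends[i] = len(labels) - 1

    visit(tree)
    return labels, ends


def contains(pattern, tree):
    """True if pattern occurs in tree as an embedded, unordered subtree.

    Embedded: a pattern edge may map to any ancestor-descendant pair.
    Unordered: sibling order in the pattern is ignored.
    Assumption: pattern siblings map to nodes with disjoint subtrees."""
    labels, ends = flatten(tree)

    def match(p, i):
        # Map pattern node p to tree node i, then place p's children below i.
        return labels[i] == p[0] and place(p[1], 0, i, [])

    def place(children, k, i, used):
        if k == len(children):
            return True
        for j in range(i + 1, ends[i] + 1):  # proper descendants of i
            disjoint = all(ends[j] < u or ends[u] < j for u in used)
            if disjoint and match(children[k], j) \
                    and place(children, k + 1, i, used + [j]):
                return True
        return False

    return any(match(pattern, i) for i in range(len(labels)))


def support(pattern, database):
    """Number of database trees containing at least one occurrence."""
    return sum(contains(pattern, t) for t in database)


t1 = ("A", [("B", [("C", []), ("D", [])]), ("E", [])])
t2 = ("A", [("C", []), ("X", [("D", [])])])
t3 = ("A", [("C", [])])
p = ("A", [("D", []), ("C", [])])  # D and C are only descendants of A in t1

print(support(p, [t1, t2, t3]))  # prints 2
```

The search is exponential in the worst case and is meant only to pin down the definitions. A pattern is frequent when its support reaches a user-chosen minimum threshold.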

Original language: English
Pages (from-to): 33-52
Number of pages: 20
Journal: Fundamenta Informaticae
Volume: 66
Issue number: 1-2
Publication status: Published - 7 Sep 2005
Externally published: Yes

Keywords

  • Embedded Trees
  • Tree Mining
  • Unordered Trees

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Software
  • Applied Mathematics
  • Safety, Risk, Reliability and Quality

Cite this

Efficiently mining frequent embedded unordered trees. / Zaki, Mohammed J. In: Fundamenta Informaticae, Vol. 66, No. 1-2, 07.09.2005, p. 33-52.

@article{c45d0ce0352047639534d60e44bec595,
title = "Efficiently mining frequent embedded unordered trees",
abstract = "Mining frequent trees is very useful in domains like bioinformatics, web mining, mining semi-structured data, and so on. In this paper we introduce SLEUTH, an efficient algorithm for mining frequent, unordered, embedded subtrees in a database of labeled trees. The key contributions of our work are as follows: We give the first algorithm that enumerates all embedded, unordered trees. We propose a new equivalence class extension scheme to generate all candidate trees. We extend the notion of scope-list joins to compute frequency of unordered trees. We conduct performance evaluation on several synthetic and real datasets to show that SLEUTH is an efficient algorithm, which has performance comparable to TreeMiner, that mines only ordered trees.",
keywords = "Embedded Trees, Tree Mining, Unordered Trees",
author = "Zaki, {Mohammed J.}",
year = "2005",
month = "9",
day = "7",
language = "English",
volume = "66",
pages = "33--52",
journal = "Fundamenta Informaticae",
issn = "0169-2968",
publisher = "IOS Press",
number = "1-2",
}

TY - JOUR
T1 - Efficiently mining frequent embedded unordered trees
AU - Zaki, Mohammed J.
PY - 2005/9/7
Y1 - 2005/9/7

N2 - Mining frequent trees is very useful in domains like bioinformatics, web mining, mining semi-structured data, and so on. In this paper we introduce SLEUTH, an efficient algorithm for mining frequent, unordered, embedded subtrees in a database of labeled trees. The key contributions of our work are as follows: We give the first algorithm that enumerates all embedded, unordered trees. We propose a new equivalence class extension scheme to generate all candidate trees. We extend the notion of scope-list joins to compute frequency of unordered trees. We conduct performance evaluation on several synthetic and real datasets to show that SLEUTH is an efficient algorithm, which has performance comparable to TreeMiner, that mines only ordered trees.

KW - Embedded Trees
KW - Tree Mining
KW - Unordered Trees
UR - http://www.scopus.com/inward/record.url?scp=24044516553&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=24044516553&partnerID=8YFLogxK
M3 - Article
VL - 66
SP - 33
EP - 52
JO - Fundamenta Informaticae
JF - Fundamenta Informaticae
SN - 0169-2968
IS - 1-2
ER -