Clique percolation method for finding naturally cohesive and overlapping document clusters

Wei Gao, Kam Fai Wong, Yunqing Xia, Ruifeng Xu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering results tend to be unnatural and stray from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) to document clustering. In this method, highly cohesive maximal document cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM can outperform some typical algorithms on benchmark data sets, and they shed light on its advantages for natural document clustering.
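The abstract's two steps (enumerate maximal cliques, then merge strongly adjacent ones into overlapping clusters) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it follows the commonly described k-clique percolation rule, in which maximal cliques of size at least k are merged when they share at least k - 1 documents. The similarity matrix, the `threshold` used to create edges, the value of `k`, and all function names are assumptions made for illustration.

```python
from itertools import combinations


def build_graph(similarity, threshold):
    """Link documents i and j when their similarity meets the (assumed) threshold."""
    n = len(similarity)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if similarity[i][j] >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the Bron-Kerbosch algorithm (with pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_percolation(adj, k):
    """Merge maximal cliques (size >= k) that share at least k - 1 documents."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(range(len(cliques)), 2):
        if len(cliques[a] & cliques[b]) >= k - 1:
            parent[find(a)] = find(b)

    clusters = {}
    for i, clique in enumerate(cliques):
        clusters.setdefault(find(i), set()).update(clique)
    return sorted(sorted(c) for c in clusters.values())


# Toy data: six hypothetical documents, 0.9 = similar, 0.1 = dissimilar.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]
sim = [[0.1] * 6 for _ in range(6)]
for i, j in edges:
    sim[i][j] = sim[j][i] = 0.9

adj = build_graph(sim, threshold=0.5)
clusters = clique_percolation(adj, k=3)
print(clusters)  # → [[0, 1, 2, 3], [3, 4, 5]]
```

In this toy graph, document 3 belongs to both clusters, which is the kind of overlap the abstract refers to. A smaller `k` or a lower `threshold` yields larger, looser clusters; how the paper itself chooses these settings is not stated in the abstract.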

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 97-108
Number of pages: 12
Volume: 4285 LNAI
DOIs: https://doi.org/10.1007/11940098_10
Publication status: Published - 1 Dec 2006
Externally published: Yes
Event: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006 - Singapore, Singapore
Duration: 17 Dec 2006 to 19 Dec 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4285 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006
Country: Singapore
City: Singapore
Period: 17/12/06 to 19/12/06

Fingerprint

Clique
Overlapping
Document Clustering
Experiments
Clustering
Random Graphs
Grouping
Adjacent
Tend
Benchmark
Graph in graph theory
Experiment
Model

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Gao, W., Wong, K. F., Xia, Y., & Xu, R. (2006). Clique percolation method for finding naturally cohesive and overlapping document clusters. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4285 LNAI, pp. 97-108). https://doi.org/10.1007/11940098_10

@inproceedings{4eb65a76a0fe46769db1b8764bd63bdc,
title = "Clique percolation method for finding naturally cohesive and overlapping document clusters",
abstract = "Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering results tend to be unnatural and stray from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) to document clustering. In this method, highly cohesive maximal document cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM can outperform some typical algorithms on benchmark data sets, and they shed light on its advantages for natural document clustering.",
author = "Wei Gao and Wong, {Kam Fai} and Yunqing Xia and Ruifeng Xu",
year = "2006",
month = "12",
day = "1",
doi = "10.1007/11940098_10",
language = "English",
isbn = "354049667X",
volume = "4285 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "97--108",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Clique percolation method for finding naturally cohesive and overlapping document clusters

AU - Gao, Wei

AU - Wong, Kam Fai

AU - Xia, Yunqing

AU - Xu, Ruifeng

PY - 2006/12/1

Y1 - 2006/12/1

N2 - Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering results tend to be unnatural and stray from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) to document clustering. In this method, highly cohesive maximal document cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM can outperform some typical algorithms on benchmark data sets, and they shed light on its advantages for natural document clustering.

UR - http://www.scopus.com/inward/record.url?scp=77049119723&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77049119723&partnerID=8YFLogxK

U2 - 10.1007/11940098_10

DO - 10.1007/11940098_10

M3 - Conference contribution

AN - SCOPUS:77049119723

SN - 354049667X

SN - 9783540496670

VL - 4285 LNAI

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 97

EP - 108

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -