TRICLUSTER

An effective algorithm for mining coherent clusters in 3D microarray data

Lizhuang Zhao, Mohammed J. Zaki

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

135 Citations (Scopus)

Abstract

In this paper we introduce a novel algorithm, TRICLUSTER, for mining coherent clusters in three-dimensional (3D) gene expression datasets. TRICLUSTER can mine arbitrarily positioned and overlapping clusters, and depending on the parameter values, it can mine different types of clusters, including those with constant or similar values along each dimension, as well as scaling and shifting expression patterns. TRICLUSTER relies on a graph-based approach to mine all valid clusters. For each time slice, i.e., a gene×sample matrix, it constructs the range multigraph, a compact representation of all similar value ranges between any two sample columns. It then searches for constrained maximal cliques in this multigraph to yield the set of biclusters for this time slice. TRICLUSTER then constructs another graph using the biclusters (as vertices) from each time slice; mining cliques from this graph yields the final set of triclusters. Optionally, TRICLUSTER merges or deletes clusters that have large overlaps. We present a useful set of metrics to evaluate the clustering quality, and we show that TRICLUSTER can find significant triclusters in real microarray datasets.
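The first step in the abstract, finding similar value ranges between any two sample columns of one time slice, can be illustrated with a small sketch. This is not the authors' implementation. It assumes that "similar" means the per-gene ratios of the two columns lie within a relative tolerance `eps`, that all expression values are positive, and that ranges are formed by a simple greedy pass over the sorted ratios. The function name `similar_ratio_groups` and the parameters `eps` and `min_genes` are hypothetical.

```python
from itertools import combinations


def similar_ratio_groups(matrix, eps=0.05, min_genes=2):
    """Group genes by similar column ratios for every pair of sample columns.

    matrix: list of rows (genes), each a list of positive expression values
            (one value per sample) for a single time slice.
    Returns a dict mapping a column pair (a, b) to a list of gene-index groups.
    Assumption: a group is a run of sorted ratios whose values stay within
    a factor (1 + eps) of the smallest ratio in the run (greedy, non-overlapping).
    """
    n_samples = len(matrix[0])
    edges = {}
    for a, b in combinations(range(n_samples), 2):
        # Ratio of the two sample columns for every gene, sorted ascending.
        ratios = sorted((row[a] / row[b], gene) for gene, row in enumerate(matrix))
        groups, current = [], [ratios[0]]
        for ratio, gene in ratios[1:]:
            if ratio <= current[0][0] * (1 + eps):
                current.append((ratio, gene))  # still inside the current range
            else:
                groups.append(current)         # close the range, start a new one
                current = [(ratio, gene)]
        groups.append(current)
        # Keep only ranges supported by enough genes.
        edges[(a, b)] = [sorted(g for _, g in grp)
                         for grp in groups if len(grp) >= min_genes]
    return edges


# Toy time slice: genes 0-2 follow a scaling pattern across samples, gene 3 does not.
time_slice = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 8.0],
    [3.0, 6.0, 12.0],
    [5.0, 1.0, 7.0],
]
print(similar_ratio_groups(time_slice))
```

In this toy slice, genes 0 to 2 share the same ratio for every column pair, so each pair carries one group containing those three genes. A gene set that appears on the edges between all columns of a sample subset is the kind of candidate the clique search described in the abstract would then assemble into a bicluster.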

Original language: English
Title of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data
Editors: J. Widom, F. Ozcan, R. Chirkova
Pages: 694-705
Number of pages: 12
Publication status: Published - 2005
Externally published: Yes
Event: SIGMOD 2005: ACM SIGMOD International Conference on Management of Data - Baltimore, MD, United States
Duration: 14 Jun 2005 - 16 Jun 2005

Fingerprint

Microarrays
Gene expression

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Zhao, L., & Zaki, M. J. (2005). TRICLUSTER: An effective algorithm for mining coherent clusters in 3D microarray data. In J. Widom, F. Ozcan, & R. Chirkova (Eds.), Proceedings of the ACM SIGMOD International Conference on Management of Data (pp. 694-705).

@inproceedings{198318b466b34bd5b727573324961e25,
title = "TRICLUSTER: An effective algorithm for mining coherent clusters in 3D microarray data",
abstract = "In this paper we introduce a novel algorithm, TRICLUSTER, for mining coherent clusters in three-dimensional (3D) gene expression datasets. TRICLUSTER can mine arbitrarily positioned and overlapping clusters, and depending on the parameter values, it can mine different types of clusters, including those with constant or similar values along each dimension, as well as scaling and shifting expression patterns. TRICLUSTER relies on a graph-based approach to mine all valid clusters. For each time slice, i.e., a gene×sample matrix, it constructs the range multigraph, a compact representation of all similar value ranges between any two sample columns. It then searches for constrained maximal cliques in this multigraph to yield the set of biclusters for this time slice. TRICLUSTER then constructs another graph using the biclusters (as vertices) from each time slice; mining cliques from this graph yields the final set of triclusters. Optionally, TRICLUSTER merges or deletes clusters that have large overlaps. We present a useful set of metrics to evaluate the clustering quality, and we show that TRICLUSTER can find significant triclusters in real microarray datasets.",
author = "Zhao, Lizhuang and Zaki, {Mohammed J.}",
year = "2005",
language = "English",
pages = "694--705",
editor = "J. Widom and F. Ozcan and R. Chirkova",
booktitle = "Proceedings of the ACM SIGMOD International Conference on Management of Data",

}

TY - GEN

T1 - TRICLUSTER

T2 - An effective algorithm for mining coherent clusters in 3D microarray data

AU - Zhao, Lizhuang

AU - Zaki, Mohammed J.

PY - 2005

Y1 - 2005

N2 - In this paper we introduce a novel algorithm, TRICLUSTER, for mining coherent clusters in three-dimensional (3D) gene expression datasets. TRICLUSTER can mine arbitrarily positioned and overlapping clusters, and depending on the parameter values, it can mine different types of clusters, including those with constant or similar values along each dimension, as well as scaling and shifting expression patterns. TRICLUSTER relies on a graph-based approach to mine all valid clusters. For each time slice, i.e., a gene×sample matrix, it constructs the range multigraph, a compact representation of all similar value ranges between any two sample columns. It then searches for constrained maximal cliques in this multigraph to yield the set of biclusters for this time slice. TRICLUSTER then constructs another graph using the biclusters (as vertices) from each time slice; mining cliques from this graph yields the final set of triclusters. Optionally, TRICLUSTER merges or deletes clusters that have large overlaps. We present a useful set of metrics to evaluate the clustering quality, and we show that TRICLUSTER can find significant triclusters in real microarray datasets.

UR - http://www.scopus.com/inward/record.url?scp=29844442223&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=29844442223&partnerID=8YFLogxK

M3 - Conference contribution

SP - 694

EP - 705

BT - Proceedings of the ACM SIGMOD International Conference on Management of Data

A2 - Widom, J.

A2 - Ozcan, F.

A2 - Chirkova, R.

ER -