Discovering consensus patterns in biological databases

Mohamed Y. ElTabakh, Walid G. Aref, Mourad Ouzzani, Mohamed H. Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Consensus patterns, like motifs and tandem repeats, are highly conserved patterns with very few substitutions where no gaps are allowed. In this paper, we present a progressive hierarchical clustering technique for discovering consensus patterns in biological databases over a certain length range. This technique can discover consensus patterns with various requirements by applying a post-processing phase. The progressive nature of the hierarchical clustering algorithm makes it scalable and efficient. Experiments to discover motifs and tandem repeats on real biological databases show significant performance gain over non-progressive clustering techniques.
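
The definition in the abstract (gapless occurrences that differ from a shared pattern by only a few substitutions) can be illustrated with a minimal sketch. This is not the authors' progressive hierarchical clustering technique; it only shows one plausible reading of the definition, using a per-column majority vote and Hamming distance. The function names, the `max_subs` threshold, and the sample sequences are all hypothetical.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count substitutions between two equal-length strings (no gaps)."""
    if len(a) != len(b):
        raise ValueError("gapless comparison requires equal lengths")
    return sum(x != y for x, y in zip(a, b))


def consensus(occurrences: list[str]) -> str:
    """Build a consensus string by majority vote in each column."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*occurrences)
    )


def is_consensus_pattern(occurrences: list[str], max_subs: int) -> bool:
    """True if every occurrence is within max_subs substitutions of the consensus."""
    pattern = consensus(occurrences)
    return all(hamming(pattern, occ) <= max_subs for occ in occurrences)


if __name__ == "__main__":
    # Hypothetical occurrences of one motif, each six characters long.
    sample = ["ACGTAC", "ACGTTC", "ACCTAC"]
    print(consensus(sample))
    print(is_consensus_pattern(sample, max_subs=1))
```

Under this reading, tightening `max_subs` makes the pattern more conserved; a post-processing step like the one the abstract mentions could apply other requirements to the same candidate groups.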

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 170-184
Number of pages: 15
Volume: 4316 LNBI
Publication status: Published - 1 Dec 2006
Externally published: Yes
Event: 1st International Workshop on Data Mining and Bioinformatics, VDMB 2006 - Seoul, Korea, Republic of
Duration: 11 Sep 2006 → 11 Sep 2006

Other

Other: 1st International Workshop on Data Mining and Bioinformatics, VDMB 2006
Country: Korea, Republic of
City: Seoul
Period: 11/9/06 → 11/9/06

Fingerprint

Tandem Repeat Sequences
Cluster Analysis
Databases
Hierarchical Clustering
Clustering algorithms
Substitution reactions
Processing
Post-processing
Clustering Algorithm
Substitution
Experiments
Clustering
Requirements
Range of data
Experiment

ASJC Scopus subject areas

  • Computer Science(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Theoretical Computer Science

Cite this

ElTabakh, M. Y., Aref, W. G., Ouzzani, M., & Ali, M. H. (2006). Discovering consensus patterns in biological databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4316 LNBI, pp. 170-184)

Discovering consensus patterns in biological databases. / ElTabakh, Mohamed Y.; Aref, Walid G.; Ouzzani, Mourad; Ali, Mohamed H.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 4316 LNBI 2006. p. 170-184.

ElTabakh, MY, Aref, WG, Ouzzani, M & Ali, MH 2006, Discovering consensus patterns in biological databases. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 4316 LNBI, pp. 170-184, 1st International Workshop on Data Mining and Bioinformatics, VDMB 2006, Seoul, Korea, Republic of, 11/9/06.
ElTabakh MY, Aref WG, Ouzzani M, Ali MH. Discovering consensus patterns in biological databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 4316 LNBI. 2006. p. 170-184
ElTabakh, Mohamed Y. ; Aref, Walid G. ; Ouzzani, Mourad ; Ali, Mohamed H. / Discovering consensus patterns in biological databases. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 4316 LNBI 2006. pp. 170-184
@inproceedings{e711e31efafe49c1b3d3624d95a3b755,
title = "Discovering consensus patterns in biological databases",
abstract = "Consensus patterns, like motifs and tandem repeats, are highly conserved patterns with very few substitutions where no gaps are allowed. In this paper, we present a progressive hierarchical clustering technique for discovering consensus patterns in biological databases over a certain length range. This technique can discover consensus patterns with various requirements by applying a post-processing phase. The progressive nature of the hierarchical clustering algorithm makes it scalable and efficient. Experiments to discover motifs and tandem repeats on real biological databases show significant performance gain over non-progressive clustering techniques.",
author = "ElTabakh, {Mohamed Y.} and Aref, {Walid G.} and Ouzzani, Mourad and Ali, {Mohamed H.}",
year = "2006",
month = "12",
day = "1",
language = "English",
isbn = "3540689702",
volume = "4316 LNBI",
pages = "170--184",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY - GEN
T1 - Discovering consensus patterns in biological databases
AU - ElTabakh, Mohamed Y.
AU - Aref, Walid G.
AU - Ouzzani, Mourad
AU - Ali, Mohamed H.
PY - 2006/12/1
Y1 - 2006/12/1
N2 - Consensus patterns, like motifs and tandem repeats, are highly conserved patterns with very few substitutions where no gaps are allowed. In this paper, we present a progressive hierarchical clustering technique for discovering consensus patterns in biological databases over a certain length range. This technique can discover consensus patterns with various requirements by applying a post-processing phase. The progressive nature of the hierarchical clustering algorithm makes it scalable and efficient. Experiments to discover motifs and tandem repeats on real biological databases show significant performance gain over non-progressive clustering techniques.

UR - http://www.scopus.com/inward/record.url?scp=34547546788&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547546788&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:34547546788
SN - 3540689702
SN - 9783540689706
VL - 4316 LNBI
SP - 170
EP - 184
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -