The ParTriCluster algorithm for gene expression analysis

Renata Braga Araújo, Guilherme Henrique Trielli Ferreira, Gustavo Henrique Orair, Wagner Meira, Renato Antônio Celso Ferreira, Dorgival Olavo Guedes Neto, Mohammed Javeed Zaki

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Analyzing gene expression patterns is becoming a highly relevant task in the Bioinformatics area. This analysis makes it possible to determine the behavior patterns of genes under various conditions, which is fundamental information for treating diseases, among other applications. A recent advance in this area is the Tricluster algorithm, the first algorithm capable of determining 3D clusters (genes × samples × timestamps), that is, groups of genes that behave similarly across samples and timestamps. However, even though biological experiments collect an increasing amount of data to be analyzed and correlated, the triclustering problem remains a bottleneck because it is NP-complete, so its parallelization seems to be an essential step towards obtaining feasible solutions. In this work we propose and evaluate the implementation of a parallel version of the Tricluster algorithm using the filter-labeled-stream paradigm supported by the Anthill parallel programming environment. The results show that our parallelization scales well with the data size and is able to handle the severe load imbalances that are inherent to the problem. Furthermore, the parallelization strategy is applicable to any depth-first search.
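The abstract's closing point, that a depth-first search can be parallelized despite uneven subtrees, can be illustrated with a minimal sketch. This is not the ParTriCluster implementation and does not use Anthill or the filter-labeled-stream paradigm; it substitutes a plain Python thread pool, and every name in it (`expand`, `dfs`, `parallel_dfs`, `WEIGHTS`, `BUDGET`, `within_budget`) is hypothetical. The toy problem, enumerating index subsets whose total weight fits a budget, stands in for the real search space only because it also yields subtrees of very different sizes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: item weights and a budget for a valid subset.
WEIGHTS = [5, 3, 2, 1]
BUDGET = 6


def within_budget(node):
    """A node is a tuple of item indices; it is valid if its weight fits."""
    return sum(WEIGHTS[i] for i in node) <= BUDGET


def expand(node, n_items):
    """Children of a node: extend the subset with one larger item index."""
    start = node[-1] + 1 if node else 0
    return [node + (i,) for i in range(start, n_items)]


def dfs(node, n_items, is_valid, out):
    """Sequential depth-first search that collects every valid node."""
    for child in expand(node, n_items):
        if is_valid(child):  # prune: invalid nodes are never expanded
            out.append(child)
            dfs(child, n_items, is_valid, out)


def parallel_dfs(n_items, is_valid, workers=4):
    """Search each valid top-level subtree as an independent task."""
    roots = [c for c in expand((), n_items) if is_valid(c)]

    def search_subtree(root):
        found = [root]
        dfs(root, n_items, is_valid, found)
        return found

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # A worker that finishes a small subtree picks up the next pending
        # one, so uneven subtrees are spread across workers dynamically.
        results = pool.map(search_subtree, roots)
        return [node for found in results for node in found]
```

Because `pool.map` returns results in submission order, `parallel_dfs(len(WEIGHTS), within_budget)` yields the same nodes in the same order as the sequential `dfs` started from the empty tuple. Threads in CPython share one interpreter lock, so the sketch shows only the task decomposition, not a speedup.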

Original language: English
Pages (from-to): 226-249
Number of pages: 24
Journal: International Journal of Parallel Programming
Volume: 36
Issue number: 2
DOIs: https://doi.org/10.1007/s10766-007-0067-9
Publication status: Published - 1 Apr 2008
Externally published: Yes

Keywords

  • Bioinformatics
  • Clustering
  • Depth-first search
  • Parallel programming

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computational Theory and Mathematics

Cite this

Araújo, R. B., Ferreira, G. H. T., Orair, G. H., Meira, W., Ferreira, R. A. C., Neto, D. O. G., & Zaki, M. J. (2008). The ParTriCluster algorithm for gene expression analysis. International Journal of Parallel Programming, 36(2), 226-249. https://doi.org/10.1007/s10766-007-0067-9