eCEO: An efficient cloud epistasis computing model in genome-wide association study

Zhengkui Wang, Yue Wang, Kian Lee Tan, Limsoon Wong, Divyakant Agrawal

Research output: Contribution to journal › Article

23 Citations (Scopus)

Abstract

Motivation: Recent studies suggest that a combination of multiple single nucleotide polymorphisms (SNPs) can have a more significant association with a specific phenotype than any single SNP alone. However, discovering epistasis, that is, the epistatic interactions among SNPs, across a large number of SNPs is a computationally challenging task. We are therefore motivated to develop efficient and effective solutions for identifying epistatic interactions of SNPs.

Results: In this article, we propose an efficient Cloud-based Epistasis cOmputing (eCEO) model for large-scale epistatic interaction analysis in genome-wide association studies (GWAS). Given a large number of SNP combinations, the eCEO model distributes them so that the load is balanced across the processing nodes. Moreover, the eCEO model can efficiently process each combination of SNPs to determine the significance of its association with the phenotype. We have implemented and evaluated the eCEO model on our own cluster of more than 40 nodes. The experimental results demonstrate that the eCEO model is computationally efficient, flexible, scalable and practical. In addition, we have deployed the eCEO model on the Amazon Elastic Compute Cloud. Our study further confirms its efficiency and ease of use in a public cloud.
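The two steps named in the abstract (spreading SNP combinations evenly over processing nodes, then scoring each combination against the phenotype) can be illustrated with a minimal Python sketch. This is not the eCEO implementation: the abstract does not specify the partitioning scheme or the test statistic, so the round-robin assignment, the Pearson chi-square score, and the names `assign_pairs` and `chi_square_pair` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def assign_pairs(num_snps, num_nodes):
    """Assign every SNP pair to a node in round-robin order.

    Assumed scheme for illustration only; bucket sizes differ by at most one.
    """
    buckets = [[] for _ in range(num_nodes)]
    for k, pair in enumerate(combinations(range(num_snps), 2)):
        buckets[k % num_nodes].append(pair)
    return buckets


def chi_square_pair(geno_a, geno_b, phenotype):
    """Pearson chi-square statistic of the joint genotype vs. the phenotype.

    geno_a, geno_b: per-sample genotype codes (e.g. 0, 1, 2) for two SNPs.
    phenotype: per-sample labels (e.g. 0 = control, 1 = case).
    Assumed statistic; the abstract does not name the measure eCEO uses.
    """
    n = len(phenotype)
    joint = list(zip(geno_a, geno_b))
    observed = Counter(zip(joint, phenotype))  # contingency table cells
    row_totals = Counter(joint)                # counts per genotype combination
    col_totals = Counter(phenotype)            # counts per phenotype label
    stat = 0.0
    for cell, r in row_totals.items():
        for label, c in col_totals.items():
            expected = r * c / n
            stat += (observed.get((cell, label), 0) - expected) ** 2 / expected
    return stat
```

In a distributed setting, each node would evaluate `chi_square_pair` only for the pairs in its own bucket; a larger statistic indicates a stronger association between that SNP pair and the phenotype.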

Original language: English
Article number: btr091
Pages (from-to): 1045-1051
Number of pages: 7
Journal: Bioinformatics
Volume: 27
Issue number: 8
DOIs: https://doi.org/10.1093/bioinformatics/btr091
Publication status: Published - 1 Apr 2011
Externally published: Yes

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability
  • Medicine(all)

Cite this

Wang, Z., Wang, Y., Tan, K. L., Wong, L., & Agrawal, D. (2011). eCEO: An efficient cloud epistasis computing model in genome-wide association study. Bioinformatics, 27(8), 1045-1051. [btr091]. https://doi.org/10.1093/bioinformatics/btr091