Large-scale machine learning based on functional networks for biomedical big data with high performance computing platforms

Emad Elsebakhi, Frank Lee, Eric Schendel, Anwar Haque, Nagarajan Kathireason, Tushar Pathare, Najeeb Syed, Rashid Al-Ali

Research output: Contribution to journal (Article)

27 Citations (Scopus)

Abstract

The exponential growth of biomedical data, together with the complexity of managing high dimensionality, imbalanced distributions, and sparse attributes, makes it difficult to apply functional networks effectively as a new large-scale predictive modeling approach in healthcare and biomedicine. This article proposes functional networks based on propensity scores and Newton-Raphson maximum-likelihood optimization as a new large-scale machine learning classifier, with the aim of improving performance on these challenges in big biomedical data. Several use-case scenarios based on integrated phenotypic and genomic big biomedical data are considered, drawing on real-life biomedical data and simulation: (i) optimal design of cancer chemotherapy; (ii) identification of inpatient admissions among individuals with a primary diagnosis of cancer; (iii) identification of children with severe asthma exacerbation using integrated phenotypic and SNP repository data; and (iv) mixture-model simulation studies. Comparative studies were carried out to evaluate the new paradigm against common state-of-the-art machine learning, data mining, and statistical schemes. The performance of the new classifier and of the most common classifiers on the four benchmark databases is reported in tables and graphs. The new classifier outperforms most existing state-of-the-art statistical machine learning schemes, with reliable and efficient performance. It also reduces computation time while maintaining reliable performance, and offers an avenue for future extension to next-generation sequencing data on high performance computing platforms.
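The abstract names Newton-Raphson maximum-likelihood optimization and propensity scores without giving details. As a rough illustration only, and not the authors' functional-network classifier, the sketch below fits a one-covariate logistic propensity model P(t = 1 | x) by Newton-Raphson on the log-likelihood. The function names, the single-covariate setup, and the toy data are all assumptions made for this example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_propensity_newton(x, t, iters=25, tol=1e-10):
    """Fit P(t=1 | x) = sigmoid(b0 + b1*x) by Newton-Raphson.

    Hypothetical helper for illustration: one covariate plus an intercept,
    so the 2x2 Newton system can be solved in closed form.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient (score) of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of X^T W X (negative Hessian)
        for xi, ti in zip(x, t):
            p = sigmoid(b0 + b1 * xi)
            r = ti - p
            w = p * (1.0 - p)
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break  # information matrix is numerically singular; stop
        # Newton step: beta <- beta + (X^T W X)^{-1} X^T (t - p)
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy, non-separable data (made up for this sketch).
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
t = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_propensity_newton(x, t)
scores = [sigmoid(b0 + b1 * xi) for xi in x]  # estimated propensity scores
```

At convergence the score equations hold, so the fitted probabilities sum to the number of observed positives. A large-scale version would accumulate the gradient and Hessian sums per data partition and combine them, which is the kind of computation that maps onto MapReduce or Spark.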

Original language: English
Pages (from-to): 69-81
Number of pages: 13
Journal: Journal of Computational Science
Volume: 11
Publication status: Published - 1 Nov 2015


Keywords

  • Big data
  • Biomedical data and healthcare
  • Functional networks
  • Google Sibyl
  • Large scale high performance computing
  • Machine learning
  • MapReduce
  • Propensity score
  • Spark

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
  • Modelling and Simulation
