Large-scale machine learning based on functional networks for biomedical big data with high performance computing platforms

Emad Elsebakhi, Frank Lee, Eric Schendel, Anwar Haque, Nagarajan Kathireason, Tushar Pathare, Najeeb Syed, Rashid Al-Ali

Research output: Contribution to journal (Article)

27 Citations (Scopus)

Abstract

The exponential growth of biomedical data, together with the complexities of managing high dimensionality, imbalanced distributions, and sparse attributes, makes it difficult to apply functional networks effectively as a new large-scale predictive modeling approach in healthcare and biomedicine. This article proposes functional networks based on propensity score and Newton-Raphson maximum-likelihood optimization as a new large-scale machine learning classifier, to enhance their performance in addressing these challenges within big biomedical data. Several use-case scenarios based on integrated phenotypic and genomic big biomedical data are considered, including real-life biomedical data: (i) optimal design of cancer chemotherapy; (ii) identification of inpatient admissions of individuals with a primary diagnosis of cancer; (iii) identification of children with severe asthma exacerbation using integrated phenotypic and SNP repository data; and (iv) mixture-model simulation studies. Comparative studies were carried out to compare the performance of the new paradigm with common state-of-the-art machine learning, data mining, and statistical schemes. The performance of the new classifier and of the most common classifiers on the four benchmark databases is reported in tables and graphs. The new classifier outperforms most existing state-of-the-art statistical machine learning schemes, with reliable and efficient performance. The new predictive modeling classifier saves computational time and performs reliably, and it offers a future avenue for extension to next-generation sequencing data on high performance computing platforms.

Original language: English
Pages (from-to): 69-81
Number of pages: 13
Journal: Journal of Computational Science
Volume: 11
DOI: 10.1016/j.jocs.2015.09.008
Publication status: Published - 1 Nov 2015

Fingerprint

Learning systems
Machine Learning
Classifiers
High Performance
Classifier
Computing
Predictive Modeling
Cancer
Chemotherapy
Asthma
Propensity Score
Statistical Learning
Newton-Raphson
Maximum likelihood
Exponential Growth
Data mining
Use Case
Mixture Model
Repository
Healthcare

Keywords

  • Big data
  • Biomedical data and healthcare
  • Functional networks
  • Google Sibyl
  • Large scale high performance computing
  • Machine learning
  • MapReduce
  • Propensity score
  • Spark

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)
  • Modelling and Simulation

Cite this

Large-scale machine learning based on functional networks for biomedical big data with high performance computing platforms. / Elsebakhi, Emad; Lee, Frank; Schendel, Eric; Haque, Anwar; Kathireason, Nagarajan; Pathare, Tushar; Syed, Najeeb; Al-Ali, Rashid.

In: Journal of Computational Science, Vol. 11, 01.11.2015, p. 69-81.

@article{32bc72b68ec44f31939afad3a3663909,
title = "Large-scale machine learning based on functional networks for biomedical big data with high performance computing platforms",
abstract = "The exponential growth of biomedical data, together with the complexities of managing high dimensionality, imbalanced distributions, and sparse attributes, makes it difficult to apply functional networks effectively as a new large-scale predictive modeling approach in healthcare and biomedicine. This article proposes functional networks based on propensity score and Newton-Raphson maximum-likelihood optimization as a new large-scale machine learning classifier, to enhance their performance in addressing these challenges within big biomedical data. Several use-case scenarios based on integrated phenotypic and genomic big biomedical data are considered, including real-life biomedical data: (i) optimal design of cancer chemotherapy; (ii) identification of inpatient admissions of individuals with a primary diagnosis of cancer; (iii) identification of children with severe asthma exacerbation using integrated phenotypic and SNP repository data; and (iv) mixture-model simulation studies. Comparative studies were carried out to compare the performance of the new paradigm with common state-of-the-art machine learning, data mining, and statistical schemes. The performance of the new classifier and of the most common classifiers on the four benchmark databases is reported in tables and graphs. The new classifier outperforms most existing state-of-the-art statistical machine learning schemes, with reliable and efficient performance. The new predictive modeling classifier saves computational time and performs reliably, and it offers a future avenue for extension to next-generation sequencing data on high performance computing platforms.",
keywords = "Big data, Biomedical data and healthcare, Functional networks, Google Sibyl, Large scale high performance computing, Machine learning, MapReduce, Propensity score, Spark",
author = "Emad Elsebakhi and Frank Lee and Eric Schendel and Anwar Haque and Nagarajan Kathireason and Tushar Pathare and Najeeb Syed and Rashid Al-Ali",
year = "2015",
month = "11",
day = "1",
doi = "10.1016/j.jocs.2015.09.008",
language = "English",
volume = "11",
pages = "69--81",
journal = "Journal of Computational Science",
issn = "1877-7503",
publisher = "Elsevier",

}

TY - JOUR

T1 - Large-scale machine learning based on functional networks for biomedical big data with high performance computing platforms

AU - Elsebakhi, Emad

AU - Lee, Frank

AU - Schendel, Eric

AU - Haque, Anwar

AU - Kathireason, Nagarajan

AU - Pathare, Tushar

AU - Syed, Najeeb

AU - Al-Ali, Rashid

PY - 2015/11/1

Y1 - 2015/11/1

AB - The exponential growth of biomedical data, together with the complexities of managing high dimensionality, imbalanced distributions, and sparse attributes, makes it difficult to apply functional networks effectively as a new large-scale predictive modeling approach in healthcare and biomedicine. This article proposes functional networks based on propensity score and Newton-Raphson maximum-likelihood optimization as a new large-scale machine learning classifier, to enhance their performance in addressing these challenges within big biomedical data. Several use-case scenarios based on integrated phenotypic and genomic big biomedical data are considered, including real-life biomedical data: (i) optimal design of cancer chemotherapy; (ii) identification of inpatient admissions of individuals with a primary diagnosis of cancer; (iii) identification of children with severe asthma exacerbation using integrated phenotypic and SNP repository data; and (iv) mixture-model simulation studies. Comparative studies were carried out to compare the performance of the new paradigm with common state-of-the-art machine learning, data mining, and statistical schemes. The performance of the new classifier and of the most common classifiers on the four benchmark databases is reported in tables and graphs. The new classifier outperforms most existing state-of-the-art statistical machine learning schemes, with reliable and efficient performance. The new predictive modeling classifier saves computational time and performs reliably, and it offers a future avenue for extension to next-generation sequencing data on high performance computing platforms.

KW - Big data

KW - Biomedical data and healthcare

KW - Functional networks

KW - Google Sibyl

KW - Large scale high performance computing

KW - Machine learning

KW - MapReduce

KW - Propensity score

KW - Spark

UR - http://www.scopus.com/inward/record.url?scp=84944450355&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944450355&partnerID=8YFLogxK

U2 - 10.1016/j.jocs.2015.09.008

DO - 10.1016/j.jocs.2015.09.008

M3 - Article

AN - SCOPUS:84944450355

VL - 11

SP - 69

EP - 81

JO - Journal of Computational Science

JF - Journal of Computational Science

SN - 1877-7503

ER -