An unsupervised disease module identification technique in biological networks using novel quality metric based on connectivity, conductance and modularity [version 1; peer review: 2 approved with reservations]

Raghvendra Mall, Ehsan Ullah, Khalid Kunji, Michele Ceccarelli, Halima Bensmail

Research output: Contribution to journal › Article

3 Citations (Scopus)

Abstract

Disease processes are usually driven by several genes interacting in molecular modules or pathways leading to the disease. The identification of such modules in gene or protein networks is central to computational methods in biomedical research. Against this background, the Disease Module Identification (DMI) DREAM Challenge was initiated as an effort to systematically assess module identification methods on a panel of 6 diverse genomic networks. In this paper, we propose a generic refinement method, based on merging and splitting the hierarchical tree obtained from any community detection technique, for constrained DMI in biological networks. The only constraint is that the size of each community lies in the range [3, 100]. We propose a novel model evaluation metric, called F-score, computed from several unsupervised quality metrics such as modularity, conductance and connectivity, to determine the quality of a graph partition at a given level of the hierarchy. We also propose a quality measure, namely Inverse Confidence, which ranks modules and prunes insignificant ones to obtain a curated list of candidate disease modules (DM) for a biological network. The predicted modules are evaluated on the basis of the total number of unique candidate modules that are associated with complex traits and diseases from over 200 genome-wide association study (GWAS) datasets. During the competition, we identified 42 modules, ranking 15th at the official false discovery rate (FDR) cut-off of 0.05 for identifying statistically significant DM in the 6 benchmark networks. However, for the more stringent FDR cut-offs of 0.025 and 0.01, the proposed method identified 31 DM (rank 9) and 16 DM (rank 10) respectively. In additional analysis, our proposed approach detected a total of 44 DM in the networks, in comparison to 60 for the winner of the DREAM Challenge. Interestingly, for several individual benchmark networks, our performance was better than or competitive with that of the winner.
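To make the quality metrics named in the abstract concrete, the following is a minimal sketch of the standard definitions of modularity and conductance for a partition of an undirected, unweighted graph without self-loops, together with the [3, 100] size constraint. It is not the authors' F-score or Inverse Confidence measure, whose formulas are not given here; the function names and the toy graph (two triangles joined by one edge) are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs (no self-loops assumed)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def modularity(adj, modules):
    """Newman modularity: sum over modules of m_c/m - (d_c / 2m)^2."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # total number of edges
    q = 0.0
    for nodes in modules:
        nodes = set(nodes)
        internal = sum(len(adj[u] & nodes) for u in nodes) / 2  # edges inside the module
        degree = sum(len(adj[u]) for u in nodes)                # total degree of the module
        q += internal / m - (degree / (2 * m)) ** 2
    return q


def conductance(adj, nodes):
    """Cut edges divided by the smaller of the two volumes (lower is better)."""
    nodes = set(nodes)
    cut = sum(len(adj[u] - nodes) for u in nodes)
    vol_in = sum(len(adj[u]) for u in nodes)
    vol_out = sum(len(nbrs) for nbrs in adj.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def within_size_range(modules, lo=3, hi=100):
    """Keep only modules that satisfy the challenge's size constraint."""
    return [mod for mod in modules if lo <= len(mod) <= hi]


# Hypothetical toy network: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
adj = build_adjacency(edges)
modules = [{"a", "b", "c"}, {"d", "e", "f"}]

print(modularity(adj, modules))
print(conductance(adj, modules[0]))
print(within_size_range(modules + [{"a", "b"}]))
```

A partition with high modularity and low per-module conductance is the kind of candidate that a combined score would favour; how the paper weights these quantities against connectivity is not specified in the abstract.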

Original language: English
Article number: 378
Journal: F1000Research
Volume: 7
DOIs
Publication status: Published - 1 Jan 2018

Keywords

  • Biological Networks
  • Co-expression
  • Community Detection
  • Conductance
  • Disease Module Identification
  • GWAS
  • Modularity
  • PPI

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Pharmacology, Toxicology and Pharmaceutics (all)
