Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data

Jan Krumsiek, Karsten Suhre, Thomas Illig, Jerzy Adamski, Fabian J. Theis

Research output: Contribution to journal (Article)

145 Citations (Scopus)

Abstract

Background: With the advent of high-throughput targeted metabolic profiling techniques, the question of how to interpret and analyze the resulting vast amount of data becomes more and more important. In this work we address the reconstruction of metabolic reactions from cross-sectional metabolomics data, that is, without the requirement for time-resolved measurements or specific system perturbations. Previous studies in this area mainly focused on Pearson correlation coefficients, which, however, are generally incapable of distinguishing between direct and indirect metabolic interactions.

Results: In our new approach we propose the application of a Gaussian graphical model (GGM), an undirected probabilistic graphical model estimating the conditional dependence between variables. GGMs are based on partial correlation coefficients, that is, pairwise Pearson correlation coefficients conditioned against the correlation with all other metabolites. We first demonstrate the general validity of the method and its advantages over regular correlation networks with computer-simulated reaction systems. Then we estimate a GGM on data from a large human population cohort, covering 1020 fasting blood serum samples with 151 quantified metabolites. The GGM is much sparser than the correlation network, shows a modular structure with respect to metabolite classes, and is stable to the choice of samples in the data set. Using human fatty acid metabolism as an example, we demonstrate for the first time that high partial correlation coefficients generally correspond to known metabolic reactions. This feature is evaluated both manually, by investigating specific pairs of high-scoring metabolites, and systematically, on a literature-curated model of fatty acid synthesis and degradation. Our method detects many known reactions along with possibly novel pathway interactions, representing candidates for further experimental examination.

Conclusions: In summary, we demonstrate strong signatures of intracellular pathways in blood serum data, and provide a valuable tool for the unbiased reconstruction of metabolic reactions from large-scale metabolomics data sets.
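The difference between Pearson and partial correlation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the sample covariance matrix is invertible (more samples than variables), and the function name partial_correlations and the three-variable chain A -> B -> C are hypothetical toy choices, not data from the study. Full-order partial correlations are obtained by inverting the covariance matrix and rescaling, zeta_ij = -omega_ij / sqrt(omega_ii * omega_jj).

```python
import numpy as np


def partial_correlations(data):
    """Full-order partial correlations for a (samples x variables) array.

    Assumes the sample covariance matrix is invertible, i.e. there are
    more samples than variables and no exactly collinear columns.
    """
    # Precision matrix: inverse of the sample covariance matrix.
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    # Rescale: zeta_ij = -omega_ij / sqrt(omega_ii * omega_jj).
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Hypothetical toy chain A -> B -> C (illustration only, not study data):
# A influences C only through B.
rng = np.random.default_rng(0)
n_samples = 1000
a = rng.normal(size=n_samples)
b = a + 0.5 * rng.normal(size=n_samples)
c = b + 0.5 * rng.normal(size=n_samples)
data = np.column_stack([a, b, c])

pearson = np.corrcoef(data, rowvar=False)
pcor = partial_correlations(data)

# The indirect pair A-C is strongly Pearson-correlated, but its partial
# correlation should be close to zero once B is conditioned on.
print("Pearson A-C:", round(float(pearson[0, 2]), 2))
print("Partial A-C:", round(float(pcor[0, 2]), 2))
print("Partial A-B:", round(float(pcor[0, 1]), 2))
```

In this toy setting the direct pairs A-B and B-C keep a clearly nonzero partial correlation, while the indirect pair A-C does not, which is the property that makes a GGM sparser than a plain correlation network.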

Original language: English
Article number: 21
Journal: BMC Systems Biology
Volume: 5
DOI: 10.1186/1752-0509-5-21
Publication status: Published - 31 Jan 2011
Externally published: Yes

Fingerprint

Graphical Modeling
Metabolomics
High Throughput
Pathway
Metabolites
Fatty Acids
Throughput
Graphical Models
Correlation coefficient
Cross Reactions
Gaussian Model
Statistical Models
Serum
Partial Correlation
Fasting
Pearson Correlation
Fatty acids
Blood
Population
Demonstrate

ASJC Scopus subject areas

  • Structural Biology
  • Modelling and Simulation
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data. / Krumsiek, Jan; Suhre, Karsten; Illig, Thomas; Adamski, Jerzy; Theis, Fabian J.

In: BMC Systems Biology, Vol. 5, 21, 31.01.2011.

@article{5f045bf2eefe4954b02ee715b0c86ddf,
title = "Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data",
author = "Jan Krumsiek and Karsten Suhre and Thomas Illig and Jerzy Adamski and Fabian J. Theis",
year = "2011",
month = "1",
day = "31",
doi = "10.1186/1752-0509-5-21",
language = "English",
volume = "5",
journal = "BMC Systems Biology",
issn = "1752-0509",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Gaussian graphical modeling reconstructs pathway reactions from high-throughput metabolomics data

AU - Krumsiek, Jan

AU - Suhre, Karsten

AU - Illig, Thomas

AU - Adamski, Jerzy

AU - Theis, Fabian J.

PY - 2011/1/31

Y1 - 2011/1/31

UR - http://www.scopus.com/inward/record.url?scp=79251567361&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79251567361&partnerID=8YFLogxK

U2 - 10.1186/1752-0509-5-21

DO - 10.1186/1752-0509-5-21

M3 - Article

VL - 5

JO - BMC Systems Biology

JF - BMC Systems Biology

SN - 1752-0509

M1 - 21

ER -