ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci

David Stacey, Eric B. Fauman, Daniel Ziemek, Benjamin B. Sun, Eric L. Harshfield, Angela M. Wood, Adam S. Butterworth, Karsten Suhre, Dirk S. Paul

Research output: Contribution to journal › Article

Abstract

Quantitative trait locus (QTL) mapping of molecular phenotypes such as metabolites, lipids and proteins through genome-wide association studies represents a powerful means of highlighting molecular mechanisms relevant to human diseases. However, a major challenge of this approach is to identify the causal gene(s) at the observed QTLs. Here, we present a framework for the 'Prioritization of candidate causal Genes at Molecular QTLs' (ProGeM), which incorporates biological domain-specific annotation data alongside genome annotation data from multiple repositories. We assessed the performance of ProGeM using a reference set of 227 previously reported and extensively curated metabolite QTLs. For 98% of these loci, the expert-curated gene was one of the candidate causal genes prioritized by ProGeM. Benchmarking analyses revealed that 69% of the causal candidates were nearest to the sentinel variant at the investigated molecular QTLs, indicating that genomic proximity is the most reliable indicator of 'true positive' causal genes. In contrast, cis-gene expression QTL data led to three false positive candidate causal gene assignments for every one true positive assignment. We provide evidence that these conclusions also apply to other molecular phenotypes, suggesting that ProGeM is a powerful and versatile tool for annotating molecular QTLs. ProGeM is freely available via GitHub.

Original language: English
Pages (from-to): e3
Journal: Nucleic Acids Research
Volume: 47
Issue number: 1
DOIs: https://doi.org/10.1093/nar/gky837
Publication status: Published - 10 Jan 2019

Fingerprint

Quantitative Trait Loci
Genes
Phenotype
Benchmarking
Genome-Wide Association Study
Genome
Lipids
Gene Expression
Proteins

ASJC Scopus subject areas

  • Genetics

Cite this

Stacey, D., Fauman, E. B., Ziemek, D., Sun, B. B., Harshfield, E. L., Wood, A. M., ... Paul, D. S. (2019). ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci. Nucleic Acids Research, 47(1), e3. https://doi.org/10.1093/nar/gky837

ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci. / Stacey, David; Fauman, Eric B.; Ziemek, Daniel; Sun, Benjamin B.; Harshfield, Eric L.; Wood, Angela M.; Butterworth, Adam S.; Suhre, Karsten; Paul, Dirk S.

In: Nucleic Acids Research, Vol. 47, No. 1, 10.01.2019, p. e3.

Stacey, D, Fauman, EB, Ziemek, D, Sun, BB, Harshfield, EL, Wood, AM, Butterworth, AS, Suhre, K & Paul, DS 2019, 'ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci', Nucleic Acids Research, vol. 47, no. 1, p. e3. https://doi.org/10.1093/nar/gky837
Stacey, David; Fauman, Eric B.; Ziemek, Daniel; Sun, Benjamin B.; Harshfield, Eric L.; Wood, Angela M.; Butterworth, Adam S.; Suhre, Karsten; Paul, Dirk S. / ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci. In: Nucleic Acids Research. 2019; Vol. 47, No. 1. p. e3.
@article{fc183061d4aa4df7afa950856602e062,
title = "ProGeM: a framework for the prioritization of candidate causal genes at molecular quantitative trait loci",
author = "David Stacey and Fauman, {Eric B.} and Daniel Ziemek and Sun, {Benjamin B.} and Harshfield, {Eric L.} and Wood, {Angela M.} and Butterworth, {Adam S.} and Karsten Suhre and Paul, {Dirk S.}",
year = "2019",
month = "1",
day = "10",
doi = "10.1093/nar/gky837",
language = "English",
volume = "47",
pages = "e3",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "1",

}

TY - JOUR

T1 - ProGeM

T2 - a framework for the prioritization of candidate causal genes at molecular quantitative trait loci

AU - Stacey, David

AU - Fauman, Eric B.

AU - Ziemek, Daniel

AU - Sun, Benjamin B.

AU - Harshfield, Eric L.

AU - Wood, Angela M.

AU - Butterworth, Adam S.

AU - Suhre, Karsten

AU - Paul, Dirk S.

PY - 2019/1/10

Y1 - 2019/1/10

UR - http://www.scopus.com/inward/record.url?scp=85059795521&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059795521&partnerID=8YFLogxK

U2 - 10.1093/nar/gky837

DO - 10.1093/nar/gky837

M3 - Article

C2 - 30239796

AN - SCOPUS:85059795521

VL - 47

SP - e3

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 1

ER -