The human gene damage index as a gene-level approach to prioritizing exome variants

Yuval Itan, Lei Shang, Bertrand Boisson, Etienne Patin, Alexandre Bolze, Marcela Moncada-Vélez, Eric Scott, Michael J. Ciancanelli, Fabien G. Lafaille, Janet G. Markle, Ruben Martinez-Barricarte, Sarah Jill De Jong, Xiao Fei Kong, Patrick Nitschke, Abdelaziz Belkadi, Jacinta Bustamante, Anne Puel, Stéphanie Boisson-Dupuis, Peter D. Stenson, Joseph G. Gleeson, David N. Cooper, Lluis Quintana-Murci, Jean Michel Claverie, Shen Ying Zhang, Laurent Abel, Jean Laurent Casanova

Research output: Contribution to journal (Article)

81 Citations (Scopus)

Abstract

The protein-coding exome of a patient with a monogenic disease contains about 20,000 variants, only one or two of which are disease causing. We found that 58% of rare variants in the protein-coding exome of the general population are located in only 2% of the genes. Prompted by this observation, we aimed to develop a gene-level approach for predicting whether a given human protein-coding gene is likely to harbor disease-causing mutations. To this end, we derived the gene damage index (GDI): a genome-wide, gene-level metric of the mutational damage that has accumulated in the general population. We found that the GDI was correlated with selective evolutionary pressure, protein complexity, coding sequence length, and the number of paralogs. We compared GDI with the two leading gene-level approaches, genic intolerance and de novo excess, and demonstrated that GDI performed best for the detection of false positives (i.e., removing exome variants in genes irrelevant to disease), whereas genic intolerance and de novo excess performed better for the detection of true positives (i.e., assessing de novo mutations in genes likely to be disease causing). The GDI server, data, and software are freely available to noncommercial users from lab.rockefeller.edu/casanova/GDI.
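The concentration figure quoted in the abstract (58% of rare variants in 2% of genes) is a statistic of the following general kind. This is a minimal sketch in Python, not the authors' pipeline: the function name `share_in_top_genes`, the gene labels, and the variant counts are all hypothetical and chosen only to illustrate the arithmetic.

```python
def share_in_top_genes(counts_by_gene, top_fraction=0.02):
    """Share of all variants carried by the most variant-rich genes.

    counts_by_gene: dict mapping a gene label to its variant count.
    top_fraction:   fraction of genes, ranked by count, to treat as "top".
    """
    if not counts_by_gene:
        raise ValueError("no genes supplied")
    # Rank genes from most to fewest variants.
    counts = sorted(counts_by_gene.values(), reverse=True)
    # Always include at least one gene in the top group.
    n_top = max(1, round(top_fraction * len(counts)))
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:n_top]) / total


# Made-up toy data: 100 hypothetical genes, two of which are variant-rich.
toy_counts = {f"GENE{i}": 1 for i in range(98)}
toy_counts["GENE_A"] = 80
toy_counts["GENE_B"] = 60

share = share_in_top_genes(toy_counts, top_fraction=0.02)
print(f"Share of variants in the top 2% of genes: {share:.1%}")
```

A skewed distribution like this toy one is the kind of observation that motivates scoring genes, rather than individual variants, when filtering an exome.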

Original language: English
Pages (from-to): 13615-13620
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 112
Issue number: 44
DOI: https://doi.org/10.1073/pnas.1518646112
Publication status: Published - 3 Nov 2015
Externally published: Yes

Fingerprint

Exome
Genes
Proteins
Mutation
Population

Keywords

  • Gene prioritization
  • Gene-level
  • Mutational damage
  • Next generation sequencing
  • Variant prioritization

ASJC Scopus subject areas

  • General

Cite this

The human gene damage index as a gene-level approach to prioritizing exome variants. / Itan, Yuval; Shang, Lei; Boisson, Bertrand; Patin, Etienne; Bolze, Alexandre; Moncada-Vélez, Marcela; Scott, Eric; Ciancanelli, Michael J.; Lafaille, Fabien G.; Markle, Janet G.; Martinez-Barricarte, Ruben; De Jong, Sarah Jill; Kong, Xiao Fei; Nitschke, Patrick; Belkadi, Abdelaziz; Bustamante, Jacinta; Puel, Anne; Boisson-Dupuis, Stéphanie; Stenson, Peter D.; Gleeson, Joseph G.; Cooper, David N.; Quintana-Murci, Lluis; Claverie, Jean Michel; Zhang, Shen Ying; Abel, Laurent; Casanova, Jean Laurent.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 112, No. 44, 03.11.2015, p. 13615-13620.


Itan, Y, Shang, L, Boisson, B, Patin, E, Bolze, A, Moncada-Vélez, M, Scott, E, Ciancanelli, MJ, Lafaille, FG, Markle, JG, Martinez-Barricarte, R, De Jong, SJ, Kong, XF, Nitschke, P, Belkadi, A, Bustamante, J, Puel, A, Boisson-Dupuis, S, Stenson, PD, Gleeson, JG, Cooper, DN, Quintana-Murci, L, Claverie, JM, Zhang, SY, Abel, L & Casanova, JL 2015, 'The human gene damage index as a gene-level approach to prioritizing exome variants', Proceedings of the National Academy of Sciences of the United States of America, vol. 112, no. 44, pp. 13615-13620. https://doi.org/10.1073/pnas.1518646112
@article{c5e0310b515941d9af2b0ca3575e0ee5,
title = "The human gene damage index as a gene-level approach to prioritizing exome variants",
abstract = "The protein-coding exome of a patient with a monogenic disease contains about 20,000 variants, only one or two of which are disease causing. We found that 58{\%} of rare variants in the protein-coding exome of the general population are located in only 2{\%} of the genes. Prompted by this observation, we aimed to develop a gene-level approach for predicting whether a given human protein-coding gene is likely to harbor disease-causing mutations. To this end, we derived the gene damage index (GDI): A genome-wide, gene-level metric of the mutational damage that has accumulated in the general population. We found that the GDI was correlated with selective evolutionary pressure, protein complexity, coding sequence length, and the number of paralogs. We compared GDI with the leading gene-level approaches, genic intolerance, and de novo excess, and demonstrated that GDI performed best for the detection of false positives (i.e., removing exome variants in genes irrelevant to disease), whereas genic intolerance and de novo excess performed better for the detection of true positives (i.e., assessing de novo mutations in genes likely to be disease causing). The GDI server, data, and software are freely available to noncommercial users from lab.rockefeller.edu/casanova/GDI.",
keywords = "Gene prioritization, Gene-level, Mutational damage, Next generation sequencing, Variant prioritization",
author = "Yuval Itan and Lei Shang and Bertrand Boisson and Etienne Patin and Alexandre Bolze and Marcela Moncada-V{\'e}lez and Eric Scott and Ciancanelli, {Michael J.} and Lafaille, {Fabien G.} and Markle, {Janet G.} and Ruben Martinez-Barricarte and {De Jong}, {Sarah Jill} and Kong, {Xiao Fei} and Patrick Nitschke and Abdelaziz Belkadi and Jacinta Bustamante and Anne Puel and St{\'e}phanie Boisson-Dupuis and Stenson, {Peter D.} and Gleeson, {Joseph G.} and Cooper, {David N.} and Lluis Quintana-Murci and Claverie, {Jean Michel} and Zhang, {Shen Ying} and Laurent Abel and Casanova, {Jean Laurent}",
year = "2015",
month = "11",
day = "3",
doi = "10.1073/pnas.1518646112",
language = "English",
volume = "112",
pages = "13615--13620",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "44",

}