Whole-exome sequencing to analyze population structure, parental inbreeding, and familial linkage

Exome/Array Consortium

Research output: Contribution to journal (Article)

15 Citations (Scopus)

Abstract

Principal component analysis (PCA), homozygosity rate estimations, and linkage studies in humans are classically conducted through genome-wide single-nucleotide variant arrays (GWSA). We compared whole-exome sequencing (WES) and GWSA for this purpose. We analyzed 110 subjects originating from different regions of the world, including North Africa and the Middle East, which are poorly covered by public databases and have high consanguinity rates. We tested and applied a number of quality control (QC) filters. Compared with GWSA, we found that WES provided an accurate prediction of population substructure using variants with a minor allele frequency > 2% (correlation = 0.89 with the PCA coordinates obtained by GWSA). WES also yielded highly reliable estimates of homozygosity rates using runs of homozygosity with a 1,000-kb window (correlation = 0.94 with the estimates provided by GWSA). Finally, homozygosity mapping analyses in 15 families including a single offspring with high homozygosity rates showed that WES provided 51% less genome-wide linkage information than GWSA overall but 97% more information for the coding regions. At the genome-wide scale, 76.3% of linked regions were found by both GWSA and WES, 17.7% were found by GWSA only, and 6.0% were found by WES only. For coding regions, the corresponding percentages were 83.5%, 7.4%, and 9.1%, respectively. With appropriate QC filters, WES can be used for PCA and adjustment for population substructure, estimating homozygosity rates in individuals, and powerful linkage analyses, particularly in coding regions.

Original language: English
Pages (from-to): 6713-6718
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 113
Issue number: 24
DOIs: 10.1073/pnas.1606460113
Publication status: Published - 14 Jun 2016
Externally published: Yes

Fingerprint

Exome
Inbreeding
Genome
Nucleotides
Population
Principal Component Analysis
Quality Control
Consanguinity
Northern Africa
Middle East
Gene Frequency
Databases

Keywords

  • Exome sequencing
  • Genotyping array
  • Homozygosity mapping
  • Linkage analysis
  • Population structure

ASJC Scopus subject areas

  • General

Cite this

Whole-exome sequencing to analyze population structure, parental inbreeding, and familial linkage. / Exome/Array Consortium.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 113, No. 24, 14.06.2016, p. 6713-6718.

@article{f33cd8ad63504bc3a826797f9bed5045,
title = "Whole-exome sequencing to analyze population structure, parental inbreeding, and familial linkage",
abstract = "Principal component analysis (PCA), homozygosity rate estimations, and linkage studies in humans are classically conducted through genome-wide single-nucleotide variant arrays (GWSA). We compared whole-exome sequencing (WES) and GWSA for this purpose. We analyzed 110 subjects originating from different regions of the world, including North Africa and the Middle East, which are poorly covered by public databases and have high consanguinity rates. We tested and applied a number of quality control (QC) filters. Compared with GWSA, we found that WES provided an accurate prediction of population substructure using variants with a minor allele frequency > 2{\%} (correlation = 0.89 with the PCA coordinates obtained by GWSA). WES also yielded highly reliable estimates of homozygosity rates using runs of homozygosity with a 1,000-kb window (correlation = 0.94 with the estimates provided by GWSA). Finally, homozygosity mapping analyses in 15 families including a single offspring with high homozygosity rates showed that WES provided 51{\%} less genome-wide linkage information than GWSA overall but 97{\%} more information for the coding regions. At the genome-wide scale, 76.3{\%} of linked regions were found by both GWSA and WES, 17.7{\%} were found by GWSA only, and 6.0{\%} were found by WES only. For coding regions, the corresponding percentages were 83.5{\%}, 7.4{\%}, and 9.1{\%}, respectively. With appropriate QC filters, WES can be used for PCA and adjustment for population substructure, estimating homozygosity rates in individuals, and powerful linkage analyses, particularly in coding regions.",
keywords = "Exome sequencing, Genotyping array, Homozygosity mapping, Linkage analysis, Population structure",
author = "{Exome/Array Consortium} and Abdelaziz Belkadi and Vincent Pedergnana and Aur{\'e}lie Cobat and Yuval Itan and Vincent, {Quentin B.} and Avinash Abhyankar and Lei Shang and {El Baghdadi}, Jamila and Aziz Bousfiha and Waleed Al-Herz and Cigdem Arikan and Peter Arkwright and Cigdem Aydogmus and Olivier Bernard and Lizbeth Blancas-Galicia and St{\'e}phanie Boisson-Dupuis and Damien Bonnet and Stambouli, {Omar Boudghene} and Lobna Boussafara and Jeannette Boutros and Jacinta Bustamante and Michael Ciancanelli and Theresa Cole and Antonio Condino-Neto and Mukesh Desai and Claire Fieschi and {Luis Franco}, Jos{\'e} and Philippe Ichai and Emmanuelle Jouanguy and Melike Keser-Emiroglu and Kilic, {Sara S.} and {Alireza Mahdaviani}, Seyed and Nizar Malhaoui and Davood Mansouri and Nima Parvaneh and Capucine Picard and Anne Puel and Didier Raoult and Nima Rezaei and Ozden Sanal and Ramon, {Silvia Sanchez} and Fran{\c c}ois Vandenesch and Guillaume Vogt and Zhang, {Shen Ying} and Alexandre Alcais and Bertrand Boisson and Casanova, {Jean Laurent} and Laurent Abel",
year = "2016",
month = "6",
day = "14",
doi = "10.1073/pnas.1606460113",
language = "English",
volume = "113",
pages = "6713--6718",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "24",

}

TY - JOUR

T1 - Whole-exome sequencing to analyze population structure, parental inbreeding, and familial linkage

AU - Exome/Array Consortium

AU - Belkadi, Abdelaziz

AU - Pedergnana, Vincent

AU - Cobat, Aurélie

AU - Itan, Yuval

AU - Vincent, Quentin B.

AU - Abhyankar, Avinash

AU - Shang, Lei

AU - El Baghdadi, Jamila

AU - Bousfiha, Aziz

AU - Al-Herz, Waleed

AU - Arikan, Cigdem

AU - Arkwright, Peter

AU - Aydogmus, Cigdem

AU - Bernard, Olivier

AU - Blancas-Galicia, Lizbeth

AU - Boisson-Dupuis, Stéphanie

AU - Bonnet, Damien

AU - Stambouli, Omar Boudghene

AU - Boussafara, Lobna

AU - Boutros, Jeannette

AU - Bustamante, Jacinta

AU - Ciancanelli, Michael

AU - Cole, Theresa

AU - Condino-Neto, Antonio

AU - Desai, Mukesh

AU - Fieschi, Claire

AU - Luis Franco, José

AU - Ichai, Philippe

AU - Jouanguy, Emmanuelle

AU - Keser-Emiroglu, Melike

AU - Kilic, Sara S.

AU - Alireza Mahdaviani, Seyed

AU - Malhaoui, Nizar

AU - Mansouri, Davood

AU - Parvaneh, Nima

AU - Picard, Capucine

AU - Puel, Anne

AU - Raoult, Didier

AU - Rezaei, Nima

AU - Sanal, Ozden

AU - Ramon, Silvia Sanchez

AU - Vandenesch, François

AU - Vogt, Guillaume

AU - Zhang, Shen Ying

AU - Alcais, Alexandre

AU - Boisson, Bertrand

AU - Casanova, Jean Laurent

AU - Abel, Laurent

PY - 2016/6/14

Y1 - 2016/6/14

N2 - Principal component analysis (PCA), homozygosity rate estimations, and linkage studies in humans are classically conducted through genome-wide single-nucleotide variant arrays (GWSA). We compared whole-exome sequencing (WES) and GWSA for this purpose. We analyzed 110 subjects originating from different regions of the world, including North Africa and the Middle East, which are poorly covered by public databases and have high consanguinity rates. We tested and applied a number of quality control (QC) filters. Compared with GWSA, we found that WES provided an accurate prediction of population substructure using variants with a minor allele frequency > 2% (correlation = 0.89 with the PCA coordinates obtained by GWSA). WES also yielded highly reliable estimates of homozygosity rates using runs of homozygosity with a 1,000-kb window (correlation = 0.94 with the estimates provided by GWSA). Finally, homozygosity mapping analyses in 15 families including a single offspring with high homozygosity rates showed that WES provided 51% less genome-wide linkage information than GWSA overall but 97% more information for the coding regions. At the genome-wide scale, 76.3% of linked regions were found by both GWSA and WES, 17.7% were found by GWSA only, and 6.0% were found by WES only. For coding regions, the corresponding percentages were 83.5%, 7.4%, and 9.1%, respectively. With appropriate QC filters, WES can be used for PCA and adjustment for population substructure, estimating homozygosity rates in individuals, and powerful linkage analyses, particularly in coding regions.

AB - Principal component analysis (PCA), homozygosity rate estimations, and linkage studies in humans are classically conducted through genome-wide single-nucleotide variant arrays (GWSA). We compared whole-exome sequencing (WES) and GWSA for this purpose. We analyzed 110 subjects originating from different regions of the world, including North Africa and the Middle East, which are poorly covered by public databases and have high consanguinity rates. We tested and applied a number of quality control (QC) filters. Compared with GWSA, we found that WES provided an accurate prediction of population substructure using variants with a minor allele frequency > 2% (correlation = 0.89 with the PCA coordinates obtained by GWSA). WES also yielded highly reliable estimates of homozygosity rates using runs of homozygosity with a 1,000-kb window (correlation = 0.94 with the estimates provided by GWSA). Finally, homozygosity mapping analyses in 15 families including a single offspring with high homozygosity rates showed that WES provided 51% less genome-wide linkage information than GWSA overall but 97% more information for the coding regions. At the genome-wide scale, 76.3% of linked regions were found by both GWSA and WES, 17.7% were found by GWSA only, and 6.0% were found by WES only. For coding regions, the corresponding percentages were 83.5%, 7.4%, and 9.1%, respectively. With appropriate QC filters, WES can be used for PCA and adjustment for population substructure, estimating homozygosity rates in individuals, and powerful linkage analyses, particularly in coding regions.

KW - Exome sequencing

KW - Genotyping array

KW - Homozygosity mapping

KW - Linkage analysis

KW - Population structure

UR - http://www.scopus.com/inward/record.url?scp=84974723289&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84974723289&partnerID=8YFLogxK

U2 - 10.1073/pnas.1606460113

DO - 10.1073/pnas.1606460113

M3 - Article

VL - 113

SP - 6713

EP - 6718

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 24

ER -