Sequence and analysis of a whole genome from Kuwaiti population subgroup of Persian ancestry

Gaurav Thareja, Sumi Elsa John, Prashantha Hebbar, Kazem Behbehani, Thangavel Alphonse Thanaraj, Osama Alsmadi

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

BACKGROUND: The 1000 Genomes Project paved the way for sequencing diverse human populations. New genome projects are being established to sequence underrepresented populations, helping in understanding human genetic diversity. The Kuwait Genome Project is an initiative to sequence individual genomes from the three subgroups of the Kuwaiti population, namely Saudi Arabian tribe, "tent-dwelling" Bedouin, and Persian, which attribute their ancestry to different regions of the Arabian Peninsula and to modern-day Iran (West Asia). These subgroups are in line with settlement history and are confirmed by genetic studies. In this work, we report the whole genome sequence of a Kuwaiti native from the Persian subgroup at >37X coverage.

RESULTS: We document 3,573,824 SNPs, 404,090 insertions/deletions, and 11,138 structural variations. Of the reported SNPs and indels, 85,939 are novel. We identify 295 'loss-of-function' and 2,314 'deleterious' coding variants, some of which carry homozygous genotypes in the sequenced genome; the associated phenotypes include pharmacogenomic traits such as greater triglyceride-lowering ability with fenofibrate treatment and the requirement of a high warfarin dosage to elicit an anticoagulation response. A total of 6,328 non-coding SNPs associate with 811 phenotype traits. In congruence with the participant's medical history of Type 2 diabetes and β-Thalassemia, and with the family history of migraine, 72 (of 159 known) Type 2 diabetes, 3 (of 4) β-Thalassemia, and 76 (of 169) migraine variants are seen in the genome. Intergenome comparisons based on shared disease-causing variants position the sequenced genome between Asian and European genomes, in congruence with the geographical location of the region. On comparison, bead arrays perform better than sequencing platforms in correctly calling genotypes in low-coverage regions of the sequenced genome; however, a novel SNP or indel near the genotype-calling position can lead to false calls with bead arrays.
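
The counts in the RESULTS paragraph can be turned into proportions with a few lines of Python. This is only an illustrative sketch: the numbers are copied from the abstract, while the variable names and the choice of denominator for the novel fraction (SNPs plus indels, as the abstract's wording suggests) are assumptions and not part of the authors' analysis.

```python
# Counts copied from the RESULTS paragraph of the abstract.
# All names below are illustrative, not taken from the paper.
snps = 3_573_824
indels = 404_090
novel = 85_939  # novel variants among the reported SNPs and indels

# (variants seen in this genome, known associated variants) per trait
trait_variants = {
    "Type 2 diabetes": (72, 159),
    "beta-Thalassemia": (3, 4),
    "Migraine": (76, 169),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Assumption: the novel fraction is taken over SNPs + indels combined.
novel_pct = percent(novel, snps + indels)
trait_pct = {
    trait: percent(seen, known)
    for trait, (seen, known) in trait_variants.items()
}

print(f"Novel SNPs and indels: {novel_pct:.2f}%")
for trait, pct in trait_pct.items():
    print(f"{trait} variants seen: {pct:.1f}%")
```

Under these assumptions, roughly 2% of the reported SNPs and indels are novel, and a little under half of the known Type 2 diabetes and migraine variants appear in the sequenced genome.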

CONCLUSIONS: We report, for the first time, a reference genome resource for the population of Persian ancestry. The resource provides a starting point for designing large-scale genetic studies in the Peninsula, including Kuwait, and in the Persian population. Such efforts on populations under-represented in global genome variation surveys help augment current knowledge of human genome diversity.

Original language: English
Number of pages: 1
Journal: BMC Genomics
Volume: 16
DOIs: 10.1186/s12864-015-1233-x
Publication status: Published - 1 Jan 2015
Externally published: Yes

Fingerprint

Sequence Analysis
Genome
Population
Single Nucleotide Polymorphism
Kuwait
Thalassemia
Genotype
Migraine Disorders
Type 2 Diabetes Mellitus
Fenofibrate
Phenotype
Pharmacogenetics
Medical Genetics
Human Genome
Warfarin
Iran
Triglycerides
History

ASJC Scopus subject areas

  • Biotechnology
  • Genetics

Cite this

Sequence and analysis of a whole genome from Kuwaiti population subgroup of Persian ancestry. / Thareja, Gaurav; John, Sumi Elsa; Hebbar, Prashantha; Behbehani, Kazem; Thanaraj, Thangavel Alphonse; Alsmadi, Osama.

In: BMC Genomics, Vol. 16, 01.01.2015.

@article{8ad01618e0654abd90080ef81aff98de,
title = "Sequence and analysis of a whole genome from Kuwaiti population subgroup of Persian ancestry",
author = "Gaurav Thareja and John, {Sumi Elsa} and Prashantha Hebbar and Kazem Behbehani and Thanaraj, {Thangavel Alphonse} and Osama Alsmadi",
year = "2015",
month = "1",
day = "1",
doi = "10.1186/s12864-015-1233-x",
language = "English",
volume = "16",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",

}

TY - JOUR

T1 - Sequence and analysis of a whole genome from Kuwaiti population subgroup of Persian ancestry

AU - Thareja, Gaurav

AU - John, Sumi Elsa

AU - Hebbar, Prashantha

AU - Behbehani, Kazem

AU - Thanaraj, Thangavel Alphonse

AU - Alsmadi, Osama

PY - 2015/1/1

Y1 - 2015/1/1

UR - http://www.scopus.com/inward/record.url?scp=84965089097&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84965089097&partnerID=8YFLogxK

U2 - 10.1186/s12864-015-1233-x

DO - 10.1186/s12864-015-1233-x

M3 - Article

VL - 16

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

ER -