Estimating the single nucleotide polymorphism genotype misclassification from routine double measurements in a large epidemiologic sample

Iris M. Heid, Claudia Lamina, Helmut Küchenhoff, Guido Fischer, Norman Klopp, Melanie Kolz, Harald Grallert, Caren Vollmert, Stefanie Wagner, Cornelia Huth, Julia Müller, Martina Müller, Steven Hunt, Annette Peters, Bernhard Paulweber, H. Erich Wichmann, Florian Kronenberg, Thomas Illig

Research output: Contribution to journal › Article

10 Citations (Scopus)

Abstract

Previously, estimation of genotype misclassification of single nucleotide polymorphisms (SNPs) as encountered in epidemiologic practice and involving thousands of subjects was lacking. The authors collected representative data on approximately 14,000 subjects from 8 studies and 646,558 genotypes assessed in 2005 by means of matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Overall discordance among 57,805 double genotypes from routine quality control was 0.36%. Fitting different misclassification models by maximum likelihood assuming identical misclassification for all SNPs, the estimated misclassification probabilities ranged from 0.0000 to 0.0035. When applying the misclassification simulation and extrapolation (MC-SIMEX) method for the first time to genetic data to account for the misclassification in a reanalysis of adiponectin-encoding (APM1) gene SNP associations with plasma adiponectin in 1,770 subjects, the authors found no impact of this small error on association estimates but increased estimates for a more substantial error. This study is the first to provide large-scale epidemiologic data on SNP genotype misclassification. The estimated misclassification in this example was small and negligible for association estimates, which is reassuring and essential for detecting SNP associations. In situations with more substantial error, the presented approach using duplicate genotyping and the MC-SIMEX method is practical and helpful for quantifying the genotyping error and its impact.
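As a rough illustration of how a discordance rate among duplicate genotypes relates to a per-measurement misclassification probability, the sketch below uses a deliberately simplified model that is an assumption and is not taken from the article: each of the two measurements is misclassified independently with the same probability e, and a misclassified genotype is equally likely to be either of the two other genotypes. The function names are hypothetical. The article itself fits several misclassification models by maximum likelihood, which this sketch does not reproduce.

```python
import math


def discordance_from_error_rate(e: float) -> float:
    """Expected discordance between two independent measurements.

    Assumed model (not the article's): each measurement is wrong with
    probability e, and a wrong call picks one of the two other
    genotypes with equal probability.
    """
    if not 0.0 <= e <= 2.0 / 3.0:
        raise ValueError("e must lie in [0, 2/3] for this sketch")
    # The two calls agree if both are correct, or if both are wrong
    # and happen to land on the same wrong genotype (probability 1/2).
    agree = (1.0 - e) ** 2 + e ** 2 / 2.0
    return 1.0 - agree


def error_rate_from_discordance(d: float) -> float:
    """Invert d = 2e - 1.5e^2 for the smaller root, valid for d in [0, 2/3]."""
    if not 0.0 <= d <= 2.0 / 3.0:
        raise ValueError("d must lie in [0, 2/3] for this sketch")
    # Solve 1.5*e^2 - 2*e + d = 0; max() guards against tiny negative
    # values from floating-point rounding at the upper bound.
    return (2.0 - math.sqrt(max(0.0, 4.0 - 6.0 * d))) / 3.0


if __name__ == "__main__":
    d_observed = 0.0036  # overall discordance of 0.36% from the abstract
    e_hat = error_rate_from_discordance(d_observed)
    print(f"implied per-measurement error rate: {e_hat:.4f}")
```

Under these assumptions, a discordance of 0.36% corresponds to a per-measurement misclassification probability of roughly 0.0018, about half the discordance, because a discordant pair almost always reflects a single erroneous call. This is only a back-of-the-envelope figure and should not be read as one of the article's estimates.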

Original language: English
Pages (from-to): 878-889
Number of pages: 12
Journal: American Journal of Epidemiology
Volume: 168
Issue number: 8
DOIs: https://doi.org/10.1093/aje/kwn208
Publication status: Published - Oct 2008
Externally published: Yes

Fingerprint

Single Nucleotide Polymorphism
Genotype
Adiponectin
Quality Control
Mass Spectrometry
Lasers
Genes

Keywords

  • Bias (epidemiology)
  • Genetics
  • Genotype
  • Likelihood functions
  • Polymorphism, single nucleotide

ASJC Scopus subject areas

  • Epidemiology

Cite this

Estimating the single nucleotide polymorphism genotype misclassification from routine double measurements in a large epidemiologic sample. / Heid, Iris M.; Lamina, Claudia; Küchenhoff, Helmut; Fischer, Guido; Klopp, Norman; Kolz, Melanie; Grallert, Harald; Vollmert, Caren; Wagner, Stefanie; Huth, Cornelia; Müller, Julia; Müller, Martina; Hunt, Steven; Peters, Annette; Paulweber, Bernhard; Wichmann, H. Erich; Kronenberg, Florian; Illig, Thomas.

In: American Journal of Epidemiology, Vol. 168, No. 8, 10.2008, p. 878-889.

Heid, IM, Lamina, C, Küchenhoff, H, Fischer, G, Klopp, N, Kolz, M, Grallert, H, Vollmert, C, Wagner, S, Huth, C, Müller, J, Müller, M, Hunt, S, Peters, A, Paulweber, B, Wichmann, HE, Kronenberg, F & Illig, T 2008, 'Estimating the single nucleotide polymorphism genotype misclassification from routine double measurements in a large epidemiologic sample', American Journal of Epidemiology, vol. 168, no. 8, pp. 878-889. https://doi.org/10.1093/aje/kwn208
Heid, Iris M. ; Lamina, Claudia ; Küchenhoff, Helmut ; Fischer, Guido ; Klopp, Norman ; Kolz, Melanie ; Grallert, Harald ; Vollmert, Caren ; Wagner, Stefanie ; Huth, Cornelia ; Müller, Julia ; Müller, Martina ; Hunt, Steven ; Peters, Annette ; Paulweber, Bernhard ; Wichmann, H. Erich ; Kronenberg, Florian ; Illig, Thomas. / Estimating the single nucleotide polymorphism genotype misclassification from routine double measurements in a large epidemiologic sample. In: American Journal of Epidemiology. 2008 ; Vol. 168, No. 8. pp. 878-889.
@article{92213bc18d1145f4bfcc4b27a9d4536c,
title = "Estimating the single nucleotide polymorphism genotype misclassification from routine double measurements in a large epidemiologic sample",
abstract = "Previously, estimation of genotype misclassification of single nucleotide polymorphisms (SNPs) as encountered in epidemiologic practice and involving thousands of subjects was lacking. The authors collected representative data on approximately 14,000 subjects from 8 studies and 646,558 genotypes assessed in 2005 by means of matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Overall discordance among 57,805 double genotypes from routine quality control was 0.36{\%}. Fitting different misclassification models by maximum likelihood assuming identical misclassification for all SNPs, the estimated misclassification probabilities ranged from 0.0000 to 0.0035. When applying the misclassification simulation and extrapolation (MC-SIMEX) method for the first time to genetic data to account for the misclassification in a reanalysis of adiponectin-encoding (APM1) gene SNP associations with plasma adiponectin in 1,770 subjects, the authors found no impact of this small error on association estimates but increased estimates for a more substantial error. This study is the first to provide large-scale epidemiologic data on SNP genotype misclassification. The estimated misclassification in this example was small and negligible for association estimates, which is reassuring and essential for detecting SNP associations. In situations with more substantial error, the presented approach using duplicate genotyping and the MC-SIMEX method is practical and helpful for quantifying the genotyping error and its impact.",
keywords = "Bias (epidemiology), Genetics, Genotype, Likelihood functions, Polymorphism, single nucleotide",
author = "Heid, {Iris M.} and Claudia Lamina and Helmut K{\"u}chenhoff and Guido Fischer and Norman Klopp and Melanie Kolz and Harald Grallert and Caren Vollmert and Stefanie Wagner and Cornelia Huth and Julia M{\"u}ller and Martina M{\"u}ller and Steven Hunt and Annette Peters and Bernhard Paulweber and Wichmann, {H. Erich} and Florian Kronenberg and Thomas Illig",
year = "2008",
month = "10",
doi = "10.1093/aje/kwn208",
language = "English",
volume = "168",
pages = "878--889",
journal = "American Journal of Epidemiology",
issn = "0002-9262",
publisher = "Oxford University Press",
number = "8",

}

TY - JOUR

T1 - Estimating the single nucleotide polymorphism genotype misclassification from routine double measurements in a large epidemiologic sample

AU - Heid, Iris M.

AU - Lamina, Claudia

AU - Küchenhoff, Helmut

AU - Fischer, Guido

AU - Klopp, Norman

AU - Kolz, Melanie

AU - Grallert, Harald

AU - Vollmert, Caren

AU - Wagner, Stefanie

AU - Huth, Cornelia

AU - Müller, Julia

AU - Müller, Martina

AU - Hunt, Steven

AU - Peters, Annette

AU - Paulweber, Bernhard

AU - Wichmann, H. Erich

AU - Kronenberg, Florian

AU - Illig, Thomas

PY - 2008/10

Y1 - 2008/10

N2 - Previously, estimation of genotype misclassification of single nucleotide polymorphisms (SNPs) as encountered in epidemiologic practice and involving thousands of subjects was lacking. The authors collected representative data on approximately 14,000 subjects from 8 studies and 646,558 genotypes assessed in 2005 by means of matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Overall discordance among 57,805 double genotypes from routine quality control was 0.36%. Fitting different misclassification models by maximum likelihood assuming identical misclassification for all SNPs, the estimated misclassification probabilities ranged from 0.0000 to 0.0035. When applying the misclassification simulation and extrapolation (MC-SIMEX) method for the first time to genetic data to account for the misclassification in a reanalysis of adiponectin-encoding (APM1) gene SNP associations with plasma adiponectin in 1,770 subjects, the authors found no impact of this small error on association estimates but increased estimates for a more substantial error. This study is the first to provide large-scale epidemiologic data on SNP genotype misclassification. The estimated misclassification in this example was small and negligible for association estimates, which is reassuring and essential for detecting SNP associations. In situations with more substantial error, the presented approach using duplicate genotyping and the MC-SIMEX method is practical and helpful for quantifying the genotyping error and its impact.

KW - Bias (epidemiology)

KW - Genetics

KW - Genotype

KW - Likelihood functions

KW - Polymorphism, single nucleotide

UR - http://www.scopus.com/inward/record.url?scp=53749088195&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=53749088195&partnerID=8YFLogxK

U2 - 10.1093/aje/kwn208

DO - 10.1093/aje/kwn208

M3 - Article

VL - 168

SP - 878

EP - 889

JO - American Journal of Epidemiology

JF - American Journal of Epidemiology

SN - 0002-9262

IS - 8

ER -