Reproducibility of genotypes as measured by the Affymetrix GeneChip® 100K Human Mapping Array set

Brooke L. Fridley, Stephen T Turner, Arlene B. Chapman, Andrei S. Rodin, Eric Boerwinkle, Kent R Bailey

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Genotyping errors that go undetected in genome-wide association studies using single nucleotide polymorphisms (SNPs) may reduce the likelihood of detecting true positive associations. To estimate the frequency of genotyping errors and assess the reproducibility of genotype calls, we analyzed two sets of duplicate data from a genome-wide association study using Affymetrix GeneChip® 100K Human Mapping Arrays: one dataset containing twenty blind duplicates and another containing twenty-eight nonrandom duplicates. For the twenty blind duplicates, the overall agreement in genotype calls, as measured by the kappa statistic, was 0.997, with a discordance rate of 0.27%. For the twenty-eight nonrandom duplicates, the overall agreement was lower, 0.95, with a higher discordance rate of 4.53%. The accuracy and probability of concordance were inversely related to the genotyping uncertainty score; that is, as the uncertainty score increased, the concordance and the probability of concordant calls decreased. Lowering the uncertainty score threshold for rejecting genotype calls from the Affymetrix-recommended value of 0.25 to 0.20 increased the predicted accuracy from 92.6% to 95%, with a slight increase in the "No Call" rate from 1.81% to 2.33%. Hence, we suggest using a lower uncertainty score threshold, such as 0.20, which yields more accurate calls at the cost of a modest decrease in the call rate.
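The agreement measures named in the abstract can be illustrated with a minimal sketch. It computes Cohen's kappa, the discordance rate, and the "No Call" rate for one pair of duplicate samples at a given uncertainty score threshold. The function names, the toy genotype calls, and the toy scores are all hypothetical, and two choices are assumptions rather than the authors' stated method: a call is rejected when its score is strictly above the threshold, and a SNP is dropped from the comparison when either duplicate is a "No Call".

```python
from collections import Counter

NO_CALL = "NoCall"  # placeholder label for a rejected call (assumption)


def apply_threshold(calls, scores, threshold):
    """Turn calls whose uncertainty score exceeds the threshold into NO_CALL."""
    return [c if s <= threshold else NO_CALL for c, s in zip(calls, scores)]


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa: chance-corrected agreement between two call vectors."""
    n = len(calls_a)
    if n == 0 or n != len(calls_b):
        raise ValueError("call vectors must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    # Agreement expected by chance from the marginal genotype frequencies.
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if expected == 1.0:  # both vectors hold one identical genotype throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


def duplicate_agreement(calls_a, scores_a, calls_b, scores_b, threshold):
    """Summarize agreement between duplicates after thresholding the scores."""
    kept_a = apply_threshold(calls_a, scores_a, threshold)
    kept_b = apply_threshold(calls_b, scores_b, threshold)
    # Assumption: compare only SNPs that were called in both duplicates.
    pairs = [(a, b) for a, b in zip(kept_a, kept_b) if NO_CALL not in (a, b)]
    if not pairs:
        raise ValueError("no SNP was called in both duplicates")
    a_calls, b_calls = zip(*pairs)
    return {
        "kappa": cohen_kappa(a_calls, b_calls),
        "discordance": sum(a != b for a, b in pairs) / len(pairs),
        "no_call_rate": (kept_a + kept_b).count(NO_CALL)
        / (len(kept_a) + len(kept_b)),
    }


# Toy data for eight SNPs typed twice on the same sample (not from the study).
calls_a = ["AA", "AB", "BB", "AB", "AA", "BB", "AB", "AA"]
calls_b = ["AA", "AB", "BB", "AA", "AA", "BB", "AB", "AA"]
scores_a = [0.01, 0.05, 0.02, 0.22, 0.03, 0.10, 0.04, 0.02]
scores_b = [0.02, 0.03, 0.01, 0.24, 0.02, 0.08, 0.05, 0.01]

for threshold in (0.25, 0.20):
    print(threshold, duplicate_agreement(calls_a, scores_a, calls_b, scores_b, threshold))
```

In this toy data the only discordant pair also carries the highest uncertainty scores, so tightening the threshold from 0.25 to 0.20 removes it: agreement rises while the "No Call" rate goes up, which is the same direction of trade-off the abstract reports.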

Original language: English (US)
Pages (from-to): 5367-5374
Number of pages: 8
Journal: Computational Statistics and Data Analysis
Volume: 52
Issue number: 12
DOI: 10.1016/j.csda.2008.05.020
State: Published - Aug 15 2008

Fingerprint

Reproducibility
Genotype
Uncertainty
Genome
Genes
Single Nucleotide Polymorphism
Nucleotides
Polymorphism
Rejection
Likelihood
High Accuracy
Statistics
Decrease
Human
Estimate

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Statistics, Probability and Uncertainty
  • Electrical and Electronic Engineering
  • Computational Mathematics
  • Numerical Analysis
  • Statistics and Probability

Cite this

Reproducibility of genotypes as measured by the Affymetrix GeneChip® 100K Human Mapping Array set. / Fridley, Brooke L.; Turner, Stephen T; Chapman, Arlene B.; Rodin, Andrei S.; Boerwinkle, Eric; Bailey, Kent R.

In: Computational Statistics and Data Analysis, Vol. 52, No. 12, 15.08.2008, p. 5367-5374.

@article{12f943ff95d5450195d9c8167d80c499,
title = "Reproducibility of genotypes as measured by the {Affymetrix} {GeneChip}{\circledR} {100K} {Human} {Mapping} {Array} set",
author = "Fridley, {Brooke L.} and Turner, {Stephen T} and Chapman, {Arlene B.} and Rodin, {Andrei S.} and Boerwinkle, Eric and Bailey, {Kent R}",
year = "2008",
month = "8",
day = "15",
doi = "10.1016/j.csda.2008.05.020",
language = "English (US)",
volume = "52",
pages = "5367--5374",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
number = "12",

}

TY  - JOUR
T1  - Reproducibility of genotypes as measured by the Affymetrix GeneChip® 100K Human Mapping Array set
AU  - Fridley, Brooke L.
AU  - Turner, Stephen T
AU  - Chapman, Arlene B.
AU  - Rodin, Andrei S.
AU  - Boerwinkle, Eric
AU  - Bailey, Kent R
PY  - 2008/8/15
Y1  - 2008/8/15
UR  - http://www.scopus.com/inward/record.url?scp=47749095521&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=47749095521&partnerID=8YFLogxK
U2  - 10.1016/j.csda.2008.05.020
DO  - 10.1016/j.csda.2008.05.020
M3  - Article
AN  - SCOPUS:47749095521
VL  - 52
SP  - 5367
EP  - 5374
JO  - Computational Statistics and Data Analysis
JF  - Computational Statistics and Data Analysis
SN  - 0167-9473
IS  - 12
ER  -