TY - JOUR
T1 - Reproducibility of genotypes as measured by the Affymetrix GeneChip® 100K Human Mapping Array set
AU - Fridley, Brooke L.
AU - Turner, Stephen T.
AU - Chapman, Arlene B.
AU - Rodin, Andrei S.
AU - Boerwinkle, Eric
AU - Bailey, Kent R.
N1 - Funding Information:
This work is supported by U.S. Public Health Service grants R01 HL74735 and R01 HL53330. We would like to thank Prabin Thapa, Jodie Van De Rostyne, Jeremy Palbicki and Zhiying Wang for their help with the genotyping, sample management, and the analysis of the data. We would also like to thank Dr. Ramanath Majumdar for the use of 8 Xba1 chips and 1 tube of PCR primer.
PY - 2008/8/15
Y1 - 2008/8/15
N2 - Genotyping errors that are undetected in genome-wide association studies using single nucleotide polymorphisms (SNPs) may reduce the likelihood of detecting true positive associations. To estimate the frequency of genotyping errors and assess the reproducibility of genotype calls, we analyzed two sets of duplicate data, one dataset containing twenty blind duplicates and another dataset containing twenty-eight nonrandom duplicates, from a genome-wide association study using Affymetrix GeneChip® 100K Human Mapping Arrays. For the twenty blind duplicates, the overall agreement in genotype calls, as measured by the kappa statistic, was 0.997, with a discordancy rate of 0.27%. For the twenty-eight nonrandom duplicates, the overall agreement was lower, 0.95, with a higher discordancy rate of 4.53%. The accuracy and probability of concordancy were inversely related to the genotyping uncertainty score, i.e., as the genotyping uncertainty score increased, the concordancy and probability of concordant calls decreased. Lowering the uncertainty score threshold for rejection of genotype calls from the Affymetrix recommended value of 0.25 to 0.20 increased the predicted accuracy from 92.6% to 95%, with a slight increase in the "No Call" rate from 1.81% to 2.33%. Hence, we suggest using a lower uncertainty score threshold, say 0.20, which will result in higher accuracy in calls at the cost of a modest decrease in the call rate.
AB - Genotyping errors that are undetected in genome-wide association studies using single nucleotide polymorphisms (SNPs) may reduce the likelihood of detecting true positive associations. To estimate the frequency of genotyping errors and assess the reproducibility of genotype calls, we analyzed two sets of duplicate data, one dataset containing twenty blind duplicates and another dataset containing twenty-eight nonrandom duplicates, from a genome-wide association study using Affymetrix GeneChip® 100K Human Mapping Arrays. For the twenty blind duplicates, the overall agreement in genotype calls, as measured by the kappa statistic, was 0.997, with a discordancy rate of 0.27%. For the twenty-eight nonrandom duplicates, the overall agreement was lower, 0.95, with a higher discordancy rate of 4.53%. The accuracy and probability of concordancy were inversely related to the genotyping uncertainty score, i.e., as the genotyping uncertainty score increased, the concordancy and probability of concordant calls decreased. Lowering the uncertainty score threshold for rejection of genotype calls from the Affymetrix recommended value of 0.25 to 0.20 increased the predicted accuracy from 92.6% to 95%, with a slight increase in the "No Call" rate from 1.81% to 2.33%. Hence, we suggest using a lower uncertainty score threshold, say 0.20, which will result in higher accuracy in calls at the cost of a modest decrease in the call rate.
UR - http://www.scopus.com/inward/record.url?scp=47749095521&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47749095521&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2008.05.020
DO - 10.1016/j.csda.2008.05.020
M3 - Article
AN - SCOPUS:47749095521
SN - 0167-9473
VL - 52
SP - 5367
EP - 5374
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
IS - 12
ER -