Improving Single-Nucleotide Polymorphism-Based Fetal Fraction Estimation of Maternal Plasma Circulating Cell-Free DNA Using Bayesian Hierarchical Models

Nicholas Larson, Chen Wang, Jie Na, Ross A. Rowsey, W Edward Jr. Highsmith, Nicole L. Hoppman, Jean-Pierre Kocher, Eric W Klee

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Recent advances in next-generation sequencing (NGS) technologies have enabled the development of effective high-throughput noninvasive prenatal screening (NIPS) assays for fetal genetic abnormalities using maternal circulating cell-free DNA (ccfDNA). An important NIPS quality-assurance step is quantifying the fetal proportion of the sampled ccfDNA. For methods using allelic read count ratios from targeted sequencing of single-nucleotide polymorphisms (SNPs), systematic biases and errors may reduce accuracy and diminish assay performance. We collected ccfDNA NIPS MiSeq sequencing data from an amplicon-based 92-SNP panel, along with complementary low-depth whole-genome sequencing (WGS), on 243 normal male fetus pregnancies and an additional 144 nonpregnant female donor samples. Using fetal fraction estimates based on X and Y chromosome WGS coverage as the gold standard, we compared an existing SNP-based approach, FetalQuant, to a more flexible Bayesian hierarchical modeling strategy that borrows information across interrogated SNPs to characterize SNP-level error rates and biases and thereby improve fetal fraction estimates. Posterior distributions for SNP-level model parameters indicate that most SNPs exhibited modest to moderate extrabinomial variation and a consistent underrepresentation of fetal alleles, with some extreme outliers in both regards. Fetal fraction estimates using FetalQuant, which is naive to these SNP properties, were relatively poor (R² = 0.14, root mean squared error [RMSE] = 0.050), particularly when the true fetal fraction was low (<5%). In contrast, by quantifying SNP-level biases and error rates, our proposed approach demonstrated improved performance by reducing the bias and variability in fetal fraction estimates (R² = 0.794, RMSE = 0.025). Using high-depth targeted SNP sequencing data, we identified a high degree of variability in distributional properties across SNP allelic read counts. These results highlight the benefits of leveraging hierarchical modeling for SNP-based fetal quantification assays (FQAs) and the need to properly calibrate FQAs dependent on NGS allelic ratio data.
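As a rough illustration of the quantities named in the abstract, the sketch below computes a naive pooled fetal fraction from allelic read counts and the two reported agreement metrics. It is not the authors' Bayesian hierarchical model, nor FetalQuant. It assumes every SNP passed in is one where the mother is homozygous for the reference allele and the fetus is heterozygous, so the expected alternate-allele read fraction is half the fetal fraction. It also assumes R² is the squared Pearson correlation, which the abstract does not specify. All function names are hypothetical.

```python
import math


def naive_fetal_fraction(alt_counts, total_counts):
    """Pooled estimate over informative SNPs (assumed: mother homozygous
    reference, fetus heterozygous), where E[alt / total] = f / 2."""
    alt = sum(alt_counts)
    total = sum(total_counts)
    if total == 0:
        raise ValueError("no reads at informative SNPs")
    return 2.0 * alt / total


def rmse(truth, est):
    """Root mean squared error between reference and estimated values."""
    if len(truth) != len(est) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, est)) / len(truth))


def r_squared(truth, est):
    """Squared Pearson correlation (an assumed definition of R^2)."""
    if len(truth) != len(est) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    mean_t = sum(truth) / n
    mean_e = sum(est) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(truth, est))
    var_t = sum((t - mean_t) ** 2 for t in truth)
    var_e = sum((e - mean_e) ** 2 for e in est)
    if var_t == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_t * var_e)


# Made-up read counts for three informative SNPs in one sample.
f_hat = naive_fetal_fraction([50, 40, 60], [1000, 1000, 1000])
```

A pooled estimator of this kind treats every SNP as unbiased and binomially distributed. The extrabinomial variation and underrepresentation of fetal alleles described above are exactly the properties it ignores, which is the motivation the abstract gives for modeling SNP-level parameters.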

Original language: English (US)
Pages (from-to): 1040-1049
Number of pages: 10
Journal: Journal of Computational Biology
Volume: 25
Issue number: 9
DOI: 10.1089/cmb.2018.0056
State: Published - Sep 1 2018

Keywords

  • Bayesian hierarchical models
  • cell-free DNA
  • next-generation sequencing
  • noninvasive prenatal screening

ASJC Scopus subject areas

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics

Cite this

Improving Single-Nucleotide Polymorphism-Based Fetal Fraction Estimation of Maternal Plasma Circulating Cell-Free DNA Using Bayesian Hierarchical Models. / Larson, Nicholas; Wang, Chen; Na, Jie; Rowsey, Ross A.; Highsmith, W Edward Jr.; Hoppman, Nicole L.; Kocher, Jean-Pierre; Klee, Eric W.

In: Journal of Computational Biology, Vol. 25, No. 9, 01.09.2018, p. 1040-1049.

@article{e5b0e93e16b74055bffac4559427b39d,
title = "Improving Single-Nucleotide Polymorphism-Based Fetal Fraction Estimation of Maternal Plasma Circulating Cell-Free DNA Using Bayesian Hierarchical Models",
abstract = "Recent advances in next-generation sequencing (NGS) technologies have enabled the development of effective high-throughput noninvasive prenatal screening (NIPS) assays for fetal genetic abnormalities using maternal circulating cell-free DNA (ccfDNA). An important NIPS quality-assurance step is quantifying the fetal proportion of the sampled ccfDNA. For methods using allelic read count ratios from targeted sequencing of single-nucleotide polymorphisms (SNPs), systematic biases and errors may reduce accuracy and diminish assay performance. We collected ccfDNA NIPS MiSeq sequencing data from an amplicon-based 92-SNP panel, along with complementary low-depth whole-genome sequencing (WGS), on 243 normal male fetus pregnancies and an additional 144 nonpregnant female donor samples. Using fetal fraction estimates based on X and Y chromosome WGS coverage as the gold standard, we compared an existing SNP-based approach, FetalQuant, to a more flexible Bayesian hierarchical modeling strategy that borrows information across interrogated SNPs to characterize SNP-level error rates and biases and thereby improve fetal fraction estimates. Posterior distributions for SNP-level model parameters indicate that most SNPs exhibited modest to moderate extrabinomial variation and a consistent underrepresentation of fetal alleles, with some extreme outliers in both regards. Fetal fraction estimates using FetalQuant, which is naive to these SNP properties, were relatively poor (R2 = 0.14, root mean squared error [RMSE] = 0.050), particularly when the true fetal fraction was low (<5{\%}). In contrast, by quantifying SNP-level biases and error rates, our proposed approach demonstrated improved performance by reducing the bias and variability in fetal fraction estimates (R2 = 0.794, RMSE = 0.025). Using high-depth targeted SNP sequencing data, we identified a high degree of variability in distributional properties across SNP allelic read counts. These results highlight the benefits of leveraging hierarchical modeling for SNP-based fetal quantification assays (FQAs) and the need to properly calibrate FQAs dependent on NGS allelic ratio data.",
keywords = "Bayesian hierarchical models, cell-free DNA, next-generation sequencing, noninvasive prenatal screening",
author = "Nicholas Larson and Chen Wang and Jie Na and Rowsey, {Ross A.} and Highsmith, {W Edward Jr.} and Hoppman, {Nicole L.} and Jean-Pierre Kocher and Klee, {Eric W}",
year = "2018",
month = "9",
day = "1",
doi = "10.1089/cmb.2018.0056",
language = "English (US)",
volume = "25",
pages = "1040--1049",
journal = "Journal of Computational Biology",
issn = "1066-5277",
publisher = "Mary Ann Liebert Inc.",
number = "9",

}
