A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers

Jonathan D. Mosley, Qi Ping Feng, Quinn S. Wells, Sara L. Van Driest, Christian M. Shaffer, Todd L. Edwards, Lisa Bastarache, Wei Qi Wei, Lea K. Davis, Catherine A. McCarty, Will Thompson, Christopher G. Chute, Gail P. Jarvik, Adam S. Gordon, Melody R. Palmer, David R. Crosslin, Eric B. Larson, David S. Carrell, Iftikhar Jan Kullo, Jennifer A. Pacheco, Peggy L. Peissig, Murray H. Brilliant, James G. Linneman, Bahram Namjou, Marc S. Williams, Marylyn D. Ritchie, Kenneth M. Borthwick, Shefali S. Verma, Jason H. Karnes, Scott T. Weiss, Thomas J. Wang, C. Michael Stein, Josh C. Denny, Dan M. Roden

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with electronic health record (EHR) populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study. We use the SNP weightings to compute predicted biomarker values in an EHR population and test associations with 1139 diagnoses. Here we report 116 associations meeting a Bonferroni level of significance. A false discovery rate (FDR)-based significance threshold reveals more known and undescribed associations across a broad range of biomarkers, including biometric measures, plasma proteins and metabolites, functional assays, and behaviors. We confirm an inverse association between LDL-cholesterol level and septicemia risk in an independent epidemiological cohort. This approach efficiently discovers biomarker-disease associations.
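The workflow described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the function names, the toy SNP weights, and the toy p-values are hypothetical. It assumes a predicted biomarker value is a weighted sum of allele dosages, that the Bonferroni correction counts one test per biomarker-diagnosis pair (53 x 1139), and that Benjamini-Hochberg stands in for the FDR-based threshold, which the abstract does not name.

```python
# Illustrative sketch only: toy data and hypothetical names, not the study's code.

def predicted_biomarker(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies) using per-SNP weights."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q):
    """Indices of p-values rejected by the Benjamini-Hochberg procedure at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank k with p_(k) <= k * q / m; reject ranks 1..k.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Toy individual: three SNPs with made-up weights.
score = predicted_biomarker([0, 1, 2], [0.5, -0.25, 0.1])

# Assumption: one test per biomarker-diagnosis pair.
n_tests = 53 * 1139
threshold = bonferroni_threshold(0.05, n_tests)

# Toy p-values; Benjamini-Hochberg is an assumed choice of FDR procedure.
rejected = benjamini_hochberg([0.001, 0.04, 0.03, 0.5], q=0.05)
```

Under these assumptions the Bonferroni threshold is 0.05 divided by 60367 tests, which is far stricter than an FDR-based cutoff and is consistent with the abstract's note that the FDR threshold reveals additional associations.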

Original language: English (US)
Article number: 3522
Journal: Nature Communications
Volume: 9
Issue number: 1
DOI: https://doi.org/10.1038/s41467-018-05624-4
State: Published - Dec 1 2018

Fingerprint

Electronic Health Records
biomarkers
Biomarkers
health
Health
electronics
Single Nucleotide Polymorphism
epidemiology
arteriosclerosis
biometrics
Epidemiology
metabolites
cholesterol
Biometrics
Metabolites
LDL Cholesterol
Population
Blood Proteins
Assays
Sepsis

ASJC Scopus subject areas

  • Chemistry(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Physics and Astronomy(all)

Cite this

Mosley, J. D., Feng, Q. P., Wells, Q. S., Van Driest, S. L., Shaffer, C. M., Edwards, T. L., ... Roden, D. M. (2018). A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers. Nature Communications, 9(1), [3522]. https://doi.org/10.1038/s41467-018-05624-4

A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers. / Mosley, Jonathan D.; Feng, Qi Ping; Wells, Quinn S.; Van Driest, Sara L.; Shaffer, Christian M.; Edwards, Todd L.; Bastarache, Lisa; Wei, Wei Qi; Davis, Lea K.; McCarty, Catherine A.; Thompson, Will; Chute, Christopher G.; Jarvik, Gail P.; Gordon, Adam S.; Palmer, Melody R.; Crosslin, David R.; Larson, Eric B.; Carrell, David S.; Kullo, Iftikhar Jan; Pacheco, Jennifer A.; Peissig, Peggy L.; Brilliant, Murray H.; Linneman, James G.; Namjou, Bahram; Williams, Marc S.; Ritchie, Marylyn D.; Borthwick, Kenneth M.; Verma, Shefali S.; Karnes, Jason H.; Weiss, Scott T.; Wang, Thomas J.; Stein, C. Michael; Denny, Josh C.; Roden, Dan M.

In: Nature Communications, Vol. 9, No. 1, 3522, 01.12.2018.

Mosley, JD, Feng, QP, Wells, QS, Van Driest, SL, Shaffer, CM, Edwards, TL, Bastarache, L, Wei, WQ, Davis, LK, McCarty, CA, Thompson, W, Chute, CG, Jarvik, GP, Gordon, AS, Palmer, MR, Crosslin, DR, Larson, EB, Carrell, DS, Kullo, IJ, Pacheco, JA, Peissig, PL, Brilliant, MH, Linneman, JG, Namjou, B, Williams, MS, Ritchie, MD, Borthwick, KM, Verma, SS, Karnes, JH, Weiss, ST, Wang, TJ, Stein, CM, Denny, JC & Roden, DM 2018, 'A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers', Nature Communications, vol. 9, no. 1, 3522. https://doi.org/10.1038/s41467-018-05624-4
Mosley, Jonathan D. ; Feng, Qi Ping ; Wells, Quinn S. ; Van Driest, Sara L. ; Shaffer, Christian M. ; Edwards, Todd L. ; Bastarache, Lisa ; Wei, Wei Qi ; Davis, Lea K. ; McCarty, Catherine A. ; Thompson, Will ; Chute, Christopher G. ; Jarvik, Gail P. ; Gordon, Adam S. ; Palmer, Melody R. ; Crosslin, David R. ; Larson, Eric B. ; Carrell, David S. ; Kullo, Iftikhar Jan ; Pacheco, Jennifer A. ; Peissig, Peggy L. ; Brilliant, Murray H. ; Linneman, James G. ; Namjou, Bahram ; Williams, Marc S. ; Ritchie, Marylyn D. ; Borthwick, Kenneth M. ; Verma, Shefali S. ; Karnes, Jason H. ; Weiss, Scott T. ; Wang, Thomas J. ; Stein, C. Michael ; Denny, Josh C. ; Roden, Dan M. / A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers. In: Nature Communications. 2018 ; Vol. 9, No. 1.
@article{9a9fa8cb43d84448b860b592d60e6ebb,
title = "A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers",
abstract = "Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with electronic health record (EHR) populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study. We use the SNP weightings to compute predicted biomarker values in an EHR population and test associations with 1139 diagnoses. Here we report 116 associations meeting a Bonferroni level of significance. A false discovery rate (FDR)-based significance threshold reveals more known and undescribed associations across a broad range of biomarkers, including biometric measures, plasma proteins and metabolites, functional assays, and behaviors. We confirm an inverse association between LDL-cholesterol level and septicemia risk in an independent epidemiological cohort. This approach efficiently discovers biomarker-disease associations.",
author = "Mosley, {Jonathan D.} and Feng, {Qi Ping} and Wells, {Quinn S.} and {Van Driest}, {Sara L.} and Shaffer, {Christian M.} and Edwards, {Todd L.} and Bastarache, Lisa and Wei, {Wei Qi} and Davis, {Lea K.} and McCarty, {Catherine A.} and Thompson, Will and Chute, {Christopher G.} and Jarvik, {Gail P.} and Gordon, {Adam S.} and Palmer, {Melody R.} and Crosslin, {David R.} and Larson, {Eric B.} and Carrell, {David S.} and Kullo, {Iftikhar Jan} and Pacheco, {Jennifer A.} and Peissig, {Peggy L.} and Brilliant, {Murray H.} and Linneman, {James G.} and Namjou, Bahram and Williams, {Marc S.} and Ritchie, {Marylyn D.} and Borthwick, {Kenneth M.} and Verma, {Shefali S.} and Karnes, {Jason H.} and Weiss, {Scott T.} and Wang, {Thomas J.} and Stein, {C. Michael} and Denny, {Josh C.} and Roden, {Dan M.}",
year = "2018",
month = "12",
day = "1",
doi = "10.1038/s41467-018-05624-4",
language = "English (US)",
volume = "9",
journal = "Nature Communications",
issn = "2041-1723",
publisher = "Nature Publishing Group",
number = "1",

}

TY - JOUR

T1 - A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers

AU - Mosley, Jonathan D.

AU - Feng, Qi Ping

AU - Wells, Quinn S.

AU - Van Driest, Sara L.

AU - Shaffer, Christian M.

AU - Edwards, Todd L.

AU - Bastarache, Lisa

AU - Wei, Wei Qi

AU - Davis, Lea K.

AU - McCarty, Catherine A.

AU - Thompson, Will

AU - Chute, Christopher G.

AU - Jarvik, Gail P.

AU - Gordon, Adam S.

AU - Palmer, Melody R.

AU - Crosslin, David R.

AU - Larson, Eric B.

AU - Carrell, David S.

AU - Kullo, Iftikhar Jan

AU - Pacheco, Jennifer A.

AU - Peissig, Peggy L.

AU - Brilliant, Murray H.

AU - Linneman, James G.

AU - Namjou, Bahram

AU - Williams, Marc S.

AU - Ritchie, Marylyn D.

AU - Borthwick, Kenneth M.

AU - Verma, Shefali S.

AU - Karnes, Jason H.

AU - Weiss, Scott T.

AU - Wang, Thomas J.

AU - Stein, C. Michael

AU - Denny, Josh C.

AU - Roden, Dan M.

PY - 2018/12/1

Y1 - 2018/12/1

N2 - Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with electronic health record (EHR) populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study. We use the SNP weightings to compute predicted biomarker values in an EHR population and test associations with 1139 diagnoses. Here we report 116 associations meeting a Bonferroni level of significance. A false discovery rate (FDR)-based significance threshold reveals more known and undescribed associations across a broad range of biomarkers, including biometric measures, plasma proteins and metabolites, functional assays, and behaviors. We confirm an inverse association between LDL-cholesterol level and septicemia risk in an independent epidemiological cohort. This approach efficiently discovers biomarker-disease associations.

UR - http://www.scopus.com/inward/record.url?scp=85052679446&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85052679446&partnerID=8YFLogxK

U2 - 10.1038/s41467-018-05624-4

DO - 10.1038/s41467-018-05624-4

M3 - Article

C2 - 30166544

AN - SCOPUS:85052679446

VL - 9

JO - Nature Communications

JF - Nature Communications

SN - 2041-1723

IS - 1

M1 - 3522

ER -