Performance of an electronic health record-based phenotype algorithm to identify community associated methicillin-resistant Staphylococcus aureus cases and controls for genetic association studies

Kathryn L. Jackson, Michael Mbagwu, Jennifer A. Pacheco, Abigail S. Baldridge, Daniel J. Viox, James G. Linneman, Sanjay K. Shukla, Peggy L. Peissig, Kenneth M. Borthwick, David A. Carrell, Suzette J Bielinski, Jacqueline C. Kirby, Joshua C. Denny, Frank D. Mentch, Lyam M. Vazquez, Laura J. Rasmussen-Torvik, Abel N. Kho

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Background: Community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) is one of the most common causes of skin and soft tissue infections in the United States, and a variety of genetic host factors are suspected to be risk factors for recurrent infection. Based on the CDC definition, we developed and validated an electronic health record (EHR)-based CA-MRSA phenotype algorithm that uses both structured and unstructured data. Methods: The algorithm was validated at three eMERGE consortium sites, and positive predictive value (PPV), negative predictive value, and sensitivity were calculated. The algorithm was then run, and data were collected, across a total of seven sites. The resulting data were used in a genome-wide association study (GWAS) analysis. Results: Across seven sites, the CA-MRSA phenotype algorithm identified a total of 349 cases and 7761 controls among the genotyped European and African American biobank populations. PPV ranged from 68 to 100% for cases and 96 to 100% for controls; sensitivity ranged from 94 to 100% for cases and 75 to 100% for controls. Frequency of cases in the populations varied widely by site. There were no plausible GWAS-significant (p < 5 × 10⁻⁸) findings. Conclusions: Differences in EHR data representation and screening patterns across sites may have affected identification of cases and controls and accounted for varying frequencies across sites. Future work identifying these patterns is necessary.
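The validation measures named in the abstract can be illustrated with a short sketch. The Python below is not the authors' code: the class name `ChartReview` and the counts passed to it are hypothetical, chosen only so that the resulting case PPV falls at the low end of the reported 68 to 100% range. Only the totals of 349 cases and 7761 controls come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartReview:
    """Counts from comparing algorithm labels with manual chart review.

    Hypothetical structure for illustration; not taken from the paper.
    """
    true_pos: int   # algorithm says case, reviewer agrees
    false_pos: int  # algorithm says case, reviewer disagrees
    true_neg: int   # algorithm says control, reviewer agrees
    false_neg: int  # algorithm says control, reviewer says case

    def ppv(self) -> float:
        # Share of algorithm-identified cases confirmed by review.
        return self.true_pos / (self.true_pos + self.false_pos)

    def npv(self) -> float:
        # Share of algorithm-identified controls confirmed by review.
        return self.true_neg / (self.true_neg + self.false_neg)

    def sensitivity(self) -> float:
        # Share of reviewer-confirmed cases that the algorithm found.
        return self.true_pos / (self.true_pos + self.false_neg)


# Hypothetical counts for one validation site (NOT real study data).
review = ChartReview(true_pos=17, false_pos=8, true_neg=48, false_neg=1)
print(f"PPV         = {review.ppv():.2%}")
print(f"NPV         = {review.npv():.2%}")
print(f"Sensitivity = {review.sensitivity():.2%}")

# Totals reported in the abstract, pooled across the seven sites.
CASES, CONTROLS = 349, 7761
case_fraction = CASES / (CASES + CONTROLS)
print(f"Pooled case fraction = {case_fraction:.1%}")
```

With the pooled totals, cases make up roughly 4.3% of the identified subjects. The abstract notes that the site-level frequency varied widely around such a pooled figure.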

Original language: English (US)
Article number: 684
Journal: BMC Infectious Diseases
Volume: 16
Issue number: 1
DOI: https://doi.org/10.1186/s12879-016-2020-2
State: Published - Nov 17 2016

Fingerprint

Electronic Health Records
Genetic Association Studies
Methicillin-Resistant Staphylococcus aureus
Phenotype
Genome-Wide Association Study
Soft Tissue Infections
Centers for Disease Control and Prevention (U.S.)
African Americans
Population
Skin
Infection

Keywords

  • Ca-MRSA Phenotype
  • Ca_MRSA
  • Electronic Health Record
  • GWAS
  • Phenotyping

ASJC Scopus subject areas

  • Infectious Diseases

Cite this

Performance of an electronic health record-based phenotype algorithm to identify community associated methicillin-resistant Staphylococcus aureus cases and controls for genetic association studies. / Jackson, Kathryn L.; Mbagwu, Michael; Pacheco, Jennifer A.; Baldridge, Abigail S.; Viox, Daniel J.; Linneman, James G.; Shukla, Sanjay K.; Peissig, Peggy L.; Borthwick, Kenneth M.; Carrell, David A.; Bielinski, Suzette J; Kirby, Jacqueline C.; Denny, Joshua C.; Mentch, Frank D.; Vazquez, Lyam M.; Rasmussen-Torvik, Laura J.; Kho, Abel N.

In: BMC Infectious Diseases, Vol. 16, No. 1, 684, 17.11.2016.

Jackson, KL, Mbagwu, M, Pacheco, JA, Baldridge, AS, Viox, DJ, Linneman, JG, Shukla, SK, Peissig, PL, Borthwick, KM, Carrell, DA, Bielinski, SJ, Kirby, JC, Denny, JC, Mentch, FD, Vazquez, LM, Rasmussen-Torvik, LJ & Kho, AN 2016, 'Performance of an electronic health record-based phenotype algorithm to identify community associated methicillin-resistant Staphylococcus aureus cases and controls for genetic association studies', BMC Infectious Diseases, vol. 16, no. 1, 684. https://doi.org/10.1186/s12879-016-2020-2
@article{db33db867b16437d9948a0d79ede1d78,
title = "Performance of an electronic health record-based phenotype algorithm to identify community associated methicillin-resistant Staphylococcus aureus cases and controls for genetic association studies",
abstract = "Background: Community associated methicillin-resistant Staphylococcus aureus (CA-MRSA) is one of the most common causes of skin and soft tissue infections in the United States, and a variety of genetic host factors are suspected to be risk factors for recurrent infection. Based on the CDC definition, we have developed and validated an electronic health record (EHR) based CA-MRSA phenotype algorithm utilizing both structured and unstructured data. Methods: The algorithm was validated at three eMERGE consortium sites, and positive predictive value, negative predictive value and sensitivity, were calculated. The algorithm was then run and data collected across seven total sites. The resulting data was used in GWAS analysis. Results: Across seven sites, the CA-MRSA phenotype algorithm identified a total of 349 cases and 7761 controls among the genotyped European and African American biobank populations. PPV ranged from 68 to 100{\%} for cases and 96 to 100{\%} for controls; sensitivity ranged from 94 to 100{\%} for cases and 75 to 100{\%} for controls. Frequency of cases in the populations varied widely by site. There were no plausible GWAS-significant (p<5 E -8) findings. Conclusions: Differences in EHR data representation and screening patterns across sites may have affected identification of cases and controls and accounted for varying frequencies across sites. Future work identifying these patterns is necessary.",
keywords = "Ca-MRSA Phenotype, Ca_MRSA, Electronic Health Record, GWAS, Phenotyping",
author = "Jackson, {Kathryn L.} and Michael Mbagwu and Pacheco, {Jennifer A.} and Baldridge, {Abigail S.} and Viox, {Daniel J.} and Linneman, {James G.} and Shukla, {Sanjay K.} and Peissig, {Peggy L.} and Borthwick, {Kenneth M.} and Carrell, {David A.} and Bielinski, {Suzette J} and Kirby, {Jacqueline C.} and Denny, {Joshua C.} and Mentch, {Frank D.} and Vazquez, {Lyam M.} and Rasmussen-Torvik, {Laura J.} and Kho, {Abel N.}",
year = "2016",
month = "11",
day = "17",
doi = "10.1186/s12879-016-2020-2",
language = "English (US)",
volume = "16",
journal = "BMC Infectious Diseases",
issn = "1471-2334",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Performance of an electronic health record-based phenotype algorithm to identify community associated methicillin-resistant Staphylococcus aureus cases and controls for genetic association studies

AU - Jackson, Kathryn L.

AU - Mbagwu, Michael

AU - Pacheco, Jennifer A.

AU - Baldridge, Abigail S.

AU - Viox, Daniel J.

AU - Linneman, James G.

AU - Shukla, Sanjay K.

AU - Peissig, Peggy L.

AU - Borthwick, Kenneth M.

AU - Carrell, David A.

AU - Bielinski, Suzette J

AU - Kirby, Jacqueline C.

AU - Denny, Joshua C.

AU - Mentch, Frank D.

AU - Vazquez, Lyam M.

AU - Rasmussen-Torvik, Laura J.

AU - Kho, Abel N.

PY - 2016/11/17

Y1 - 2016/11/17

AB - Background: Community associated methicillin-resistant Staphylococcus aureus (CA-MRSA) is one of the most common causes of skin and soft tissue infections in the United States, and a variety of genetic host factors are suspected to be risk factors for recurrent infection. Based on the CDC definition, we have developed and validated an electronic health record (EHR) based CA-MRSA phenotype algorithm utilizing both structured and unstructured data. Methods: The algorithm was validated at three eMERGE consortium sites, and positive predictive value, negative predictive value and sensitivity, were calculated. The algorithm was then run and data collected across seven total sites. The resulting data was used in GWAS analysis. Results: Across seven sites, the CA-MRSA phenotype algorithm identified a total of 349 cases and 7761 controls among the genotyped European and African American biobank populations. PPV ranged from 68 to 100% for cases and 96 to 100% for controls; sensitivity ranged from 94 to 100% for cases and 75 to 100% for controls. Frequency of cases in the populations varied widely by site. There were no plausible GWAS-significant (p<5 E -8) findings. Conclusions: Differences in EHR data representation and screening patterns across sites may have affected identification of cases and controls and accounted for varying frequencies across sites. Future work identifying these patterns is necessary.

KW - Ca-MRSA Phenotype

KW - Ca_MRSA

KW - Electronic Health Record

KW - GWAS

KW - Phenotyping

UR - http://www.scopus.com/inward/record.url?scp=84997017373&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84997017373&partnerID=8YFLogxK

U2 - 10.1186/s12879-016-2020-2

DO - 10.1186/s12879-016-2020-2

M3 - Article

C2 - 27855652

AN - SCOPUS:84997017373

VL - 16

JO - BMC Infectious Diseases

JF - BMC Infectious Diseases

SN - 1471-2334

IS - 1

M1 - 684

ER -