Natural Language Processing for Asthma Ascertainment in Different Practice Settings

Chung Il Wi, Sunghwan Sohn, Mir Ali, Elizabeth Krusemark, Euijung Ryu, Hongfang D Liu, Young J Juhn

Research output: Contribution to journal (Article)

8 Citations (Scopus)

Abstract

Background: We developed and validated NLP-PAC, a natural language processing (NLP) algorithm based on predetermined asthma criteria (PAC) for asthma ascertainment using electronic health records at Mayo Clinic.

Objective: To adapt NLP-PAC in a different health care setting, Sanford Children Hospital, by assessing its external validity.

Methods: The study was designed as a retrospective cohort study that used a random sample of 2011-2012 Sanford Birth cohort (n = 595). Manual chart review was performed on the cohort for asthma ascertainment on the basis of the PAC. We then used half of the cohort as a training cohort (n = 298) and the other half as a blind test cohort to evaluate the adapted NLP-PAC algorithm. Association of known asthma-related risk factors with the Sanford-NLP algorithm-driven asthma ascertainment was tested.

Results: Among the eligible test cohort (n = 297), 160 (53%) were males, 268 (90%) white, and the median age was 2.3 years (range, 1.5-3.1 years). NLP-PAC, after adaptation, and the human abstractor identified 74 (25%) and 72 (24%) subjects, respectively, with 66 subjects identified by both approaches. Sensitivity, specificity, positive predictive value, and negative predictive value for the NLP algorithm in predicting asthma status were 92%, 96%, 89%, and 97%, respectively. The known risk factors for asthma identified by NLP (eg, smoking history) were similar to the ones identified by manual chart review.

Conclusions: Successful implementation of NLP-PAC for asthma ascertainment in 2 different practice settings demonstrates the feasibility of automated asthma ascertainment leveraging electronic health record data with a potential to enable large-scale, multisite asthma studies to improve asthma care and research.
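The reported accuracy figures can be reproduced from the counts given in the Results. The sketch below is illustrative only and is not the authors' code: it assumes the manual chart review is the reference standard and that all 297 test subjects were classified by both approaches. The function names are hypothetical.

```python
def confusion_from_counts(n_total, n_algorithm_pos, n_reference_pos, n_both_pos):
    """Derive a 2x2 confusion matrix from marginal counts and their overlap.

    Assumes every subject was classified by both the algorithm and the
    reference standard (here, manual chart review).
    """
    tp = n_both_pos                       # flagged by both approaches
    fp = n_algorithm_pos - n_both_pos     # flagged by the algorithm only
    fn = n_reference_pos - n_both_pos     # flagged by chart review only
    tn = n_total - tp - fp - fn           # flagged by neither
    return tp, fp, fn, tn


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures as fractions in [0, 1]."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts taken from the abstract: n = 297, NLP-PAC positives = 74,
# chart-review positives = 72, identified by both = 66.
tp, fp, fn, tn = confusion_from_counts(297, 74, 72, 66)
metrics = diagnostic_metrics(tp, fp, fn, tn)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumptions the counts work out to 66 true positives, 8 false positives, 6 false negatives, and 217 true negatives, and the four measures round to the 92%, 96%, 89%, and 97% stated in the abstract.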

Original language: English (US)
Journal: Journal of Allergy and Clinical Immunology: In Practice
DOI: 10.1016/j.jaip.2017.04.041
State: Accepted/In press - Nov 9 2016

Fingerprint

Natural Language Processing
Asthma
Electronic Health Records

Keywords

  • Algorithm adaptability
  • Asthma ascertainment
  • Electronic health records
  • Epidemiology
  • Informatics
  • Natural language processing
  • Retrospective study
  • Validation

ASJC Scopus subject areas

  • Immunology and Allergy

Cite this

@article{0853b7b9b9c6441cb4ae5a7a25eeb575,
title = "Natural Language Processing for Asthma Ascertainment in Different Practice Settings",
abstract = "Background: We developed and validated NLP-PAC, a natural language processing (NLP) algorithm based on predetermined asthma criteria (PAC) for asthma ascertainment using electronic health records at Mayo Clinic. Objective: To adapt NLP-PAC in a different health care setting, Sanford Children Hospital, by assessing its external validity. Methods: The study was designed as a retrospective cohort study that used a random sample of 2011-2012 Sanford Birth cohort (n = 595). Manual chart review was performed on the cohort for asthma ascertainment on the basis of the PAC. We then used half of the cohort as a training cohort (n = 298) and the other half as a blind test cohort to evaluate the adapted NLP-PAC algorithm. Association of known asthma-related risk factors with the Sanford-NLP algorithm-driven asthma ascertainment was tested. Results: Among the eligible test cohort (n = 297), 160 (53{\%}) were males, 268 (90{\%}) white, and the median age was 2.3 years (range, 1.5-3.1 years). NLP-PAC, after adaptation, and the human abstractor identified 74 (25{\%}) and 72 (24{\%}) subjects, respectively, with 66 subjects identified by both approaches. Sensitivity, specificity, positive predictive value, and negative predictive value for the NLP algorithm in predicting asthma status were 92{\%}, 96{\%}, 89{\%}, and 97{\%}, respectively. The known risk factors for asthma identified by NLP (eg, smoking history) were similar to the ones identified by manual chart review. Conclusions: Successful implementation of NLP-PAC for asthma ascertainment in 2 different practice settings demonstrates the feasibility of automated asthma ascertainment leveraging electronic health record data with a potential to enable large-scale, multisite asthma studies to improve asthma care and research.",
keywords = "Algorithm adaptability, Asthma ascertainment, Electronic health records, Epidemiology, Informatics, Natural language processing, Retrospective study, Validation",
author = "Wi, {Chung Il} and Sohn, Sunghwan and Ali, Mir and Krusemark, Elizabeth and Ryu, Euijung and Liu, {Hongfang D} and Juhn, {Young J}",
year = "2016",
month = "11",
day = "9",
doi = "10.1016/j.jaip.2017.04.041",
language = "English (US)",
journal = "Journal of Allergy and Clinical Immunology: In Practice",
issn = "2213-2198",
publisher = "Elsevier",
}

TY - JOUR
T1 - Natural Language Processing for Asthma Ascertainment in Different Practice Settings
AU - Wi, Chung Il
AU - Sohn, Sunghwan
AU - Ali, Mir
AU - Krusemark, Elizabeth
AU - Ryu, Euijung
AU - Liu, Hongfang D
AU - Juhn, Young J
PY - 2016/11/9
Y1 - 2016/11/9
AB - Background: We developed and validated NLP-PAC, a natural language processing (NLP) algorithm based on predetermined asthma criteria (PAC) for asthma ascertainment using electronic health records at Mayo Clinic. Objective: To adapt NLP-PAC in a different health care setting, Sanford Children Hospital, by assessing its external validity. Methods: The study was designed as a retrospective cohort study that used a random sample of 2011-2012 Sanford Birth cohort (n = 595). Manual chart review was performed on the cohort for asthma ascertainment on the basis of the PAC. We then used half of the cohort as a training cohort (n = 298) and the other half as a blind test cohort to evaluate the adapted NLP-PAC algorithm. Association of known asthma-related risk factors with the Sanford-NLP algorithm-driven asthma ascertainment was tested. Results: Among the eligible test cohort (n = 297), 160 (53%) were males, 268 (90%) white, and the median age was 2.3 years (range, 1.5-3.1 years). NLP-PAC, after adaptation, and the human abstractor identified 74 (25%) and 72 (24%) subjects, respectively, with 66 subjects identified by both approaches. Sensitivity, specificity, positive predictive value, and negative predictive value for the NLP algorithm in predicting asthma status were 92%, 96%, 89%, and 97%, respectively. The known risk factors for asthma identified by NLP (eg, smoking history) were similar to the ones identified by manual chart review. Conclusions: Successful implementation of NLP-PAC for asthma ascertainment in 2 different practice settings demonstrates the feasibility of automated asthma ascertainment leveraging electronic health record data with a potential to enable large-scale, multisite asthma studies to improve asthma care and research.
KW - Algorithm adaptability
KW - Asthma ascertainment
KW - Electronic health records
KW - Epidemiology
KW - Informatics
KW - Natural language processing
KW - Retrospective study
KW - Validation
UR - http://www.scopus.com/inward/record.url?scp=85020504387&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85020504387&partnerID=8YFLogxK
U2 - 10.1016/j.jaip.2017.04.041
DO - 10.1016/j.jaip.2017.04.041
M3 - Article
C2 - 28634104
AN - SCOPUS:85020504387
JO - Journal of Allergy and Clinical Immunology: In Practice
JF - Journal of Allergy and Clinical Immunology: In Practice
SN - 2213-2198
ER -