Ascertainment of Delirium Status Using Natural Language Processing from Electronic Health Records

Sunyang Fu; Guilherme S. Lopes; Sandeep R. Pagali; Bjoerg Thorsteinsdottir; Nathan K. Lebrasseur; Andrew Wen; Hongfang Liu; Walter A. Rocca; Janet E. Olson; Jennifer St. Sauver; Sunghwan Sohn

doi:10.1093/gerona/glaa275

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Delirium is underdiagnosed in clinical practice and is not routinely coded for billing. Manual chart review can be used to identify the occurrence of delirium; however, it is labor-intensive and impractical for large-scale studies. Natural language processing (NLP) has the capability to process raw text in electronic health records (EHRs) and determine the meaning of the information. We developed and validated NLP algorithms to automatically identify the occurrence of delirium from EHRs.

Methods: This study used a randomly selected cohort from the population-based Mayo Clinic Biobank (N = 300, age ≥65). We adopted the standardized evidence-based framework confusion assessment method (CAM) to develop and evaluate NLP algorithms to identify the occurrence of delirium using clinical notes in EHRs. Two NLP algorithms were developed based on CAM criteria: one based on the original CAM (NLP-CAM; delirium vs no delirium) and another based on our modified CAM (NLP-mCAM; definite, possible, and no delirium). The sensitivity, specificity, and accuracy were used for concordance in delirium status between NLP algorithms and manual chart review as the gold standard. The prevalence of delirium cases was examined using International Classification of Diseases, 9th Revision (ICD-9), NLP-CAM, and NLP-mCAM.

Results: NLP-CAM demonstrated a sensitivity, specificity, and accuracy of 0.919, 1.000, and 0.967, respectively. NLP-mCAM demonstrated sensitivity, specificity, and accuracy of 0.827, 0.913, and 0.827, respectively. The prevalence analysis of delirium showed that the NLP-CAM algorithm identified 12 651 (9.4%) delirium patients, the NLP-mCAM algorithm identified 20 611 (15.3%) definite delirium cases, and 10 762 (8.0%) possible cases.

Conclusions: NLP algorithms based on the standardized evidence-based CAM framework demonstrated high performance in delineating delirium status in an expeditious and cost-effective manner.
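For readers less familiar with the concordance metrics named in the abstract, the following is a minimal Python sketch of how sensitivity, specificity, and accuracy can be computed when algorithm output is compared with a gold-standard label for a binary task (delirium vs no delirium). It is not the authors' code: the function name `binary_concordance`, the label strings, and the ten toy labels are all hypothetical and chosen only for illustration. They do not reproduce the study data or its reported values.

```python
def binary_concordance(gold, predicted, positive="delirium"):
    """Compare predicted labels against gold-standard labels (binary case).

    `gold` and `predicted` are equal-length sequences of label strings.
    Any label other than `positive` is treated as the negative class.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        if g == positive and p == positive:
            tp += 1  # gold positive, predicted positive
        elif g == positive:
            fn += 1  # gold positive, predicted negative
        elif p == positive:
            fp += 1  # gold negative, predicted positive
        else:
            tn += 1  # gold negative, predicted negative

    nan = float("nan")
    return {
        # Share of gold-standard positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of gold-standard negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Share of all cases where prediction and gold standard agree.
        "accuracy": (tp + tn) / len(gold) if len(gold) else nan,
    }


# Hypothetical toy labels (NOT study data): 4 gold positives, 6 gold negatives.
gold = ["delirium"] * 4 + ["none"] * 6
pred = ["delirium", "delirium", "delirium", "none",
        "none", "none", "none", "none", "none", "delirium"]

metrics = binary_concordance(gold, pred)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false negative, 1 false positive, and 5 true negatives, so sensitivity is 3/4, specificity is 5/6, and accuracy is 8/10. A three-level output such as the one described for NLP-mCAM (definite, possible, no delirium) would need a multi-class extension, which this sketch does not attempt.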

Original language	English (US)
Pages (from-to)	524-530
Number of pages	7
Journal	Journals of Gerontology - Series A Biological Sciences and Medical Sciences
Volume	77
Issue number	3
DOIs	https://doi.org/10.1093/gerona/glaa275
State	Published - Mar 1 2022

Keywords

Confusion assessment method
Delirium
Electronic health records
Natural language processing

ASJC Scopus subject areas

General Medicine

Cite this

Fu, S., Lopes, G. S., Pagali, S. R., Thorsteinsdottir, B., Lebrasseur, N. K., Wen, A., Liu, H., Rocca, W. A., Olson, J. E., St. Sauver, J., & Sohn, S. (2022). Ascertainment of Delirium Status Using Natural Language Processing from Electronic Health Records. Journals of Gerontology - Series A Biological Sciences and Medical Sciences, 77(3), 524-530. https://doi.org/10.1093/gerona/glaa275

Fu, S, Lopes, GS, Pagali, SR, Thorsteinsdottir, B, Lebrasseur, NK, Wen, A, Liu, H, Rocca, WA, Olson, JE, St. Sauver, J & Sohn, S 2022, 'Ascertainment of Delirium Status Using Natural Language Processing from Electronic Health Records', Journals of Gerontology - Series A Biological Sciences and Medical Sciences, vol. 77, no. 3, pp. 524-530. https://doi.org/10.1093/gerona/glaa275

@article{73727d9e7993410e84635f201b4f4ce9,

title = "Ascertainment of Delirium Status Using Natural Language Processing from Electronic Health Records",

abstract = "Background: Delirium is underdiagnosed in clinical practice and is not routinely coded for billing. Manual chart review can be used to identify the occurrence of delirium; however, it is labor-intensive and impractical for large-scale studies. Natural language processing (NLP) has the capability to process raw text in electronic health records (EHRs) and determine the meaning of the information. We developed and validated NLP algorithms to automatically identify the occurrence of delirium from EHRs. Methods: This study used a randomly selected cohort from the population-based Mayo Clinic Biobank (N = 300, age ≥65). We adopted the standardized evidence-based framework confusion assessment method (CAM) to develop and evaluate NLP algorithms to identify the occurrence of delirium using clinical notes in EHRs. Two NLP algorithms were developed based on CAM criteria: one based on the original CAM (NLP-CAM; delirium vs no delirium) and another based on our modified CAM (NLP-mCAM; definite, possible, and no delirium). The sensitivity, specificity, and accuracy were used for concordance in delirium status between NLP algorithms and manual chart review as the gold standard. The prevalence of delirium cases was examined using International Classification of Diseases, 9th Revision (ICD-9), NLP-CAM, and NLP-mCAM. Results: NLP-CAM demonstrated a sensitivity, specificity, and accuracy of 0.919, 1.000, and 0.967, respectively. NLP-mCAM demonstrated sensitivity, specificity, and accuracy of 0.827, 0.913, and 0.827, respectively. The prevalence analysis of delirium showed that the NLP-CAM algorithm identified 12 651 (9.4%) delirium patients, the NLP-mCAM algorithm identified 20 611 (15.3%) definite delirium cases, and 10 762 (8.0%) possible cases. Conclusions: NLP algorithms based on the standardized evidence-based CAM framework demonstrated high performance in delineating delirium status in an expeditious and cost-effective manner.",

keywords = "Confusion assessment method, Delirium, Electronic health records, Natural language processing",

author = "Sunyang Fu and Lopes, {Guilherme S.} and Pagali, {Sandeep R.} and Bjoerg Thorsteinsdottir and Lebrasseur, {Nathan K.} and Andrew Wen and Hongfang Liu and Rocca, {Walter A.} and Olson, {Janet E.} and {St. Sauver}, Jennifer and Sunghwan Sohn",

note = "Publisher Copyright: {\textcopyright} 2020 The Author(s) 2020. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.",

year = "2022",

month = mar,

day = "1",

doi = "10.1093/gerona/glaa275",

language = "English (US)",

volume = "77",

pages = "524--530",

journal = "Journals of Gerontology - Series A Biological Sciences and Medical Sciences",

issn = "1079-5006",

publisher = "Oxford University Press",

number = "3",

}

TY - JOUR

T1 - Ascertainment of Delirium Status Using Natural Language Processing from Electronic Health Records

AU - Fu, Sunyang

AU - Lopes, Guilherme S.

AU - Pagali, Sandeep R.

AU - Thorsteinsdottir, Bjoerg

AU - Lebrasseur, Nathan K.

AU - Wen, Andrew

AU - Liu, Hongfang

AU - Rocca, Walter A.

AU - Olson, Janet E.

AU - St. Sauver, Jennifer

AU - Sohn, Sunghwan

N1 - Publisher Copyright: © 2020 The Author(s) 2020. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

PY - 2022/3/1

Y1 - 2022/3/1

N2 - Background: Delirium is underdiagnosed in clinical practice and is not routinely coded for billing. Manual chart review can be used to identify the occurrence of delirium; however, it is labor-intensive and impractical for large-scale studies. Natural language processing (NLP) has the capability to process raw text in electronic health records (EHRs) and determine the meaning of the information. We developed and validated NLP algorithms to automatically identify the occurrence of delirium from EHRs. Methods: This study used a randomly selected cohort from the population-based Mayo Clinic Biobank (N = 300, age ≥65). We adopted the standardized evidence-based framework confusion assessment method (CAM) to develop and evaluate NLP algorithms to identify the occurrence of delirium using clinical notes in EHRs. Two NLP algorithms were developed based on CAM criteria: one based on the original CAM (NLP-CAM; delirium vs no delirium) and another based on our modified CAM (NLP-mCAM; definite, possible, and no delirium). The sensitivity, specificity, and accuracy were used for concordance in delirium status between NLP algorithms and manual chart review as the gold standard. The prevalence of delirium cases was examined using International Classification of Diseases, 9th Revision (ICD-9), NLP-CAM, and NLP-mCAM. Results: NLP-CAM demonstrated a sensitivity, specificity, and accuracy of 0.919, 1.000, and 0.967, respectively. NLP-mCAM demonstrated sensitivity, specificity, and accuracy of 0.827, 0.913, and 0.827, respectively. The prevalence analysis of delirium showed that the NLP-CAM algorithm identified 12 651 (9.4%) delirium patients, the NLP-mCAM algorithm identified 20 611 (15.3%) definite delirium cases, and 10 762 (8.0%) possible cases. Conclusions: NLP algorithms based on the standardized evidence-based CAM framework demonstrated high performance in delineating delirium status in an expeditious and cost-effective manner.

KW - Confusion assessment method

KW - Delirium

KW - Electronic health records

KW - Natural language processing

UR - http://www.scopus.com/inward/record.url?scp=85119831466&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85119831466&partnerID=8YFLogxK

U2 - 10.1093/gerona/glaa275

DO - 10.1093/gerona/glaa275

M3 - Article

C2 - 33125037

AN - SCOPUS:85119831466

SN - 1079-5006

VL - 77

SP - 524

EP - 530

JO - Journals of Gerontology - Series A Biological Sciences and Medical Sciences

JF - Journals of Gerontology - Series A Biological Sciences and Medical Sciences

IS - 3

ER -
