Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing

Sungrim Moon; Sijia Liu; Christopher G. Scott; Sujith Samudrala; Mohamed M. Abidian; Jeffrey B. Geske; Peter A. Noseworthy; Jane L. Shellum; Rajeev Chaudhry; Steve R. Ommen; Rick A. Nishimura; Hongfang Liu; Adelaide M. Arruda-Olson

doi:10.1016/j.ijmedinf.2019.05.008

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Background: The management of hypertrophic cardiomyopathy (HCM) patients requires knowledge of risk factors associated with sudden cardiac death (SCD). SCD risk factors such as syncope and family history of SCD (FH-SCD), as well as family history of HCM (FH-HCM), are documented in electronic health records (EHRs) as clinical narratives. Automated extraction of risk factors from clinical narratives by natural language processing (NLP) may expedite the management workflow of HCM patients. The aim of this study was to develop and deploy NLP algorithms for automated extraction of syncope, FH-SCD, and FH-HCM from clinical narratives.

Methods and Results: We randomly selected 200 patients from the Mayo HCM registry for development (n = 100) and testing (n = 100) of NLP algorithms for extraction of syncope, FH-SCD, and FH-HCM from clinical narratives of EHRs. The clinical reference standard was manually abstracted by 2 independent annotators. Performance of the NLP algorithms was compared to aggregation and summarization of data entries in the HCM registry for syncope, FH-SCD, and FH-HCM. We also compared the NLP algorithms with billing codes for syncope and with responses to patient survey questions for FH-SCD and FH-HCM. These analyses demonstrated that NLP had superior sensitivity (0.96 vs 0.39, p < 0.001) and comparable specificity (0.90 vs 0.92, p = 0.74) and positive predictive value (PPV) (0.90 vs 0.83, p = 0.37) compared to billing codes for syncope. For FH-SCD, NLP outperformed survey responses for all parameters (sensitivity: 0.91 vs 0.59, p = 0.002; specificity: 0.98 vs 0.50, p < 0.001; PPV: 0.97 vs 0.38, p < 0.001). NLP also achieved superior sensitivity (0.95 vs 0.24, p < 0.001) with comparable specificity (0.95 vs 1.0, p-value not calculable) and PPV (0.92 vs 1.0, p = 0.09) compared to survey responses for FH-HCM.

Conclusions: Automated extraction of syncope, FH-SCD, and FH-HCM using NLP is feasible and has promise to increase the efficiency of workflow for providers managing HCM patients.
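The sensitivity, specificity, and PPV reported in the abstract are standard confusion-matrix ratios computed against a reference standard. The following is a minimal sketch of that arithmetic, not the authors' code: the names Confusion, tally, sensitivity, specificity, and ppv are hypothetical, and the labels in the usage example are made-up toy values rather than study data.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Confusion:
    """Counts of agreement between a method and the reference standard."""
    tp: int  # method positive, reference positive
    fp: int  # method positive, reference negative
    tn: int  # method negative, reference negative
    fn: int  # method negative, reference positive


def tally(reference: Sequence[bool], predicted: Sequence[bool]) -> Confusion:
    """Count TP/FP/TN/FN for per-patient labels (True = risk factor present)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if ref and pred:
            tp += 1
        elif ref and not pred:
            fn += 1
        elif not ref and pred:
            fp += 1
        else:
            tn += 1
    return Confusion(tp=tp, fp=fp, tn=tn, fn=fn)


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return None when the ratio is undefined (zero denominator)."""
    return numerator / denominator if denominator else None


def sensitivity(c: Confusion) -> Optional[float]:
    """Fraction of reference-positive patients the method flags."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> Optional[float]:
    """Fraction of reference-negative patients the method leaves unflagged."""
    return _ratio(c.tn, c.tn + c.fp)


def ppv(c: Confusion) -> Optional[float]:
    """Fraction of method-positive patients who are reference-positive."""
    return _ratio(c.tp, c.tp + c.fp)


if __name__ == "__main__":
    # Toy labels for eight hypothetical patients (NOT data from the study).
    reference = [True, True, True, False, False, False, False, True]
    predicted = [True, True, False, False, False, True, False, True]
    counts = tally(reference, predicted)
    print(counts)
    print(sensitivity(counts), specificity(counts), ppv(counts))
```

Applying the same three functions to each method's labels (NLP, billing codes, survey responses) against the same reference standard gives directly comparable figures. The sketch does not cover the significance testing behind the reported p-values, since the abstract does not name the test used.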

Original language	English (US)
Pages (from-to)	32-38
Number of pages	7
Journal	International Journal of Medical Informatics
Volume	128
DOIs	https://doi.org/10.1016/j.ijmedinf.2019.05.008
State	Published - Aug 2019

Keywords

Electronic health records
Hypertrophic cardiomyopathy
Natural language processing
Sudden cardiac death

ASJC Scopus subject areas

Health Informatics

Cite this

Moon, S., Liu, S., Scott, C. G., Samudrala, S., Abidian, M. M., Geske, J. B., Noseworthy, P. A., Shellum, J. L., Chaudhry, R., Ommen, S. R., Nishimura, R. A., Liu, H., & Arruda-Olson, A. M. (2019). Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing. International Journal of Medical Informatics, 128, 32-38. https://doi.org/10.1016/j.ijmedinf.2019.05.008

Moon, S, Liu, S, Scott, CG, Samudrala, S, Abidian, MM, Geske, JB, Noseworthy, PA, Shellum, JL, Chaudhry, R, Ommen, SR, Nishimura, RA, Liu, H & Arruda-Olson, AM 2019, 'Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing', International Journal of Medical Informatics, vol. 128, pp. 32-38. https://doi.org/10.1016/j.ijmedinf.2019.05.008

@article{b27604839a63429dbc57e0398069d847,

title = "Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing",

keywords = "Electronic health records, Hypertrophic cardiomyopathy, Natural language processing, Sudden cardiac death",

author = "Sungrim Moon and Sijia Liu and Scott, {Christopher G.} and Sujith Samudrala and Abidian, {Mohamed M.} and Geske, {Jeffrey B.} and Noseworthy, {Peter A.} and Shellum, {Jane L.} and Rajeev Chaudhry and Ommen, {Steve R.} and Nishimura, {Rick A.} and Hongfang Liu and Arruda-Olson, {Adelaide M.}",

note = "Publisher Copyright: {\textcopyright} 2019 The Authors",

year = "2019",

month = aug,

doi = "10.1016/j.ijmedinf.2019.05.008",

language = "English (US)",

volume = "128",

pages = "32--38",

journal = "International Journal of Medical Informatics",

issn = "1386-5056",

publisher = "Elsevier Ireland Ltd",

}

TY - JOUR

T1 - Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing

AU - Moon, Sungrim

AU - Liu, Sijia

AU - Scott, Christopher G.

AU - Samudrala, Sujith

AU - Abidian, Mohamed M.

AU - Geske, Jeffrey B.

AU - Noseworthy, Peter A.

AU - Shellum, Jane L.

AU - Chaudhry, Rajeev

AU - Ommen, Steve R.

AU - Nishimura, Rick A.

AU - Liu, Hongfang

AU - Arruda-Olson, Adelaide M.

PY - 2019/8

Y1 - 2019/8

KW - Electronic health records

KW - Hypertrophic cardiomyopathy

KW - Natural language processing

KW - Sudden cardiac death

UR - http://www.scopus.com/inward/record.url?scp=85065874206&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85065874206&partnerID=8YFLogxK

U2 - 10.1016/j.ijmedinf.2019.05.008

DO - 10.1016/j.ijmedinf.2019.05.008

M3 - Article

C2 - 31160009

AN - SCOPUS:85065874206

SN - 1386-5056

VL - 128

SP - 32

EP - 38

JO - International Journal of Medical Informatics

JF - International Journal of Medical Informatics

ER -
