TY - JOUR
T1 - Automated extraction of sudden cardiac death risk factors in hypertrophic cardiomyopathy patients by natural language processing
AU - Moon, Sungrim
AU - Liu, Sijia
AU - Scott, Christopher G.
AU - Samudrala, Sujith
AU - Abidian, Mohamed M.
AU - Geske, Jeffrey B.
AU - Noseworthy, Peter A.
AU - Shellum, Jane L.
AU - Chaudhry, Rajeev
AU - Ommen, Steve R.
AU - Nishimura, Rick A.
AU - Liu, Hongfang
AU - Arruda-Olson, Adelaide M.
N1 - Funding Information:
Research reported in this publication was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health (award K01HL124045), National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health (award R01EB19403), National Center for Advancing Translational Sciences of the National Institutes of Health (award U01TR02062) and by a Mayo Clinic K2R award. The content is solely the responsibility of the authors and does not necessarily represent official views of the National Institutes of Health.
Publisher Copyright:
© 2019 The Authors
PY - 2019/8
Y1 - 2019/8
N2 - Background: The management of hypertrophic cardiomyopathy (HCM) patients requires the knowledge of risk factors associated with sudden cardiac death (SCD). SCD risk factors such as syncope and family history of SCD (FH-SCD) as well as family history of HCM (FH-HCM) are documented in electronic health records (EHRs) as clinical narratives. Automated extraction of risk factors from clinical narratives by natural language processing (NLP) may expedite management workflow of HCM patients. The aim of this study was to develop and deploy NLP algorithms for automated extraction of syncope, FH-SCD, and FH-HCM from clinical narratives. Methods and Results: We randomly selected 200 patients from the Mayo HCM registry for development (n = 100) and testing (n = 100) of NLP algorithms for extraction of syncope, FH-SCD as well as FH-HCM from clinical narratives of EHRs. The clinical reference standard was manually abstracted by 2 independent annotators. Performance of NLP algorithms was compared to aggregation and summarization of data entries in the HCM registry for syncope, FH-SCD, and FH-HCM. We also compared the NLP algorithms with billing codes for syncope as well as responses to patient survey questions for FH-SCD and FH-HCM. These analyses demonstrated NLP had superior sensitivity (0.96 vs 0.39, p < 0.001) and comparable specificity (0.90 vs 0.92, p = 0.74) and PPV (0.90 vs 0.83, p = 0.37) compared to billing codes for syncope. For FH-SCD, NLP outperformed survey responses for all parameters (sensitivity: 0.91 vs 0.59, p = 0.002; specificity: 0.98 vs 0.50, p < 0.001; PPV: 0.97 vs 0.38, p < 0.001). NLP also achieved superior sensitivity (0.95 vs 0.24, p < 0.001) with comparable specificity (0.95 vs 1.0, p-value not calculable) and positive predictive value (PPV) (0.92 vs 1.0, p = 0.09) compared to survey responses for FH-HCM.
Conclusions: Automated extraction of syncope, FH-SCD and FH-HCM using NLP is feasible and has promise to increase efficiency of workflow for providers managing HCM patients.
AB - Background: The management of hypertrophic cardiomyopathy (HCM) patients requires the knowledge of risk factors associated with sudden cardiac death (SCD). SCD risk factors such as syncope and family history of SCD (FH-SCD) as well as family history of HCM (FH-HCM) are documented in electronic health records (EHRs) as clinical narratives. Automated extraction of risk factors from clinical narratives by natural language processing (NLP) may expedite management workflow of HCM patients. The aim of this study was to develop and deploy NLP algorithms for automated extraction of syncope, FH-SCD, and FH-HCM from clinical narratives. Methods and Results: We randomly selected 200 patients from the Mayo HCM registry for development (n = 100) and testing (n = 100) of NLP algorithms for extraction of syncope, FH-SCD as well as FH-HCM from clinical narratives of EHRs. The clinical reference standard was manually abstracted by 2 independent annotators. Performance of NLP algorithms was compared to aggregation and summarization of data entries in the HCM registry for syncope, FH-SCD, and FH-HCM. We also compared the NLP algorithms with billing codes for syncope as well as responses to patient survey questions for FH-SCD and FH-HCM. These analyses demonstrated NLP had superior sensitivity (0.96 vs 0.39, p < 0.001) and comparable specificity (0.90 vs 0.92, p = 0.74) and PPV (0.90 vs 0.83, p = 0.37) compared to billing codes for syncope. For FH-SCD, NLP outperformed survey responses for all parameters (sensitivity: 0.91 vs 0.59, p = 0.002; specificity: 0.98 vs 0.50, p < 0.001; PPV: 0.97 vs 0.38, p < 0.001). NLP also achieved superior sensitivity (0.95 vs 0.24, p < 0.001) with comparable specificity (0.95 vs 1.0, p-value not calculable) and positive predictive value (PPV) (0.92 vs 1.0, p = 0.09) compared to survey responses for FH-HCM.
Conclusions: Automated extraction of syncope, FH-SCD and FH-HCM using NLP is feasible and has promise to increase efficiency of workflow for providers managing HCM patients.
KW - Electronic health records
KW - Hypertrophic cardiomyopathy
KW - Natural language processing
KW - Sudden cardiac death
UR - http://www.scopus.com/inward/record.url?scp=85065874206&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065874206&partnerID=8YFLogxK
U2 - 10.1016/j.ijmedinf.2019.05.008
DO - 10.1016/j.ijmedinf.2019.05.008
M3 - Article
C2 - 31160009
AN - SCOPUS:85065874206
SN - 1386-5056
VL - 128
SP - 32
EP - 38
JO - International Journal of Medical Informatics
JF - International Journal of Medical Informatics
ER -