Natural language processing for identification of hypertrophic cardiomyopathy patients from cardiac magnetic resonance reports

Nakeya Dewaswala; David Chen; Huzefa Bhopalwala; Vinod C. Kaggal; Sean P. Murphy; J. Martijn Bos; Jeffrey B. Geske; Bernard J. Gersh; Steve R. Ommen; Philip A. Araoz; Michael J. Ackerman; Adelaide M. Arruda-Olson

doi:10.1186/s12911-022-02017-y

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Cardiac magnetic resonance (CMR) imaging is important for diagnosis and risk stratification of hypertrophic cardiomyopathy (HCM) patients. However, collection of information from large numbers of CMR reports by manual review is time-consuming, error-prone and costly. Natural language processing (NLP) is an artificial intelligence method for automated extraction of information from narrative text including text in CMR reports in electronic health records (EHR). Our objective was to assess whether NLP can accurately extract diagnosis of HCM from CMR reports.

Methods: An NLP system with two tiers was developed for information extraction from narrative text in CMR reports; the first tier extracted information regarding HCM diagnosis while the second extracted categorical and numeric concepts for HCM classification. We randomly allocated 200 HCM patients with CMR reports from 2004 to 2018 into training (100 patients with 185 CMR reports) and testing sets (100 patients with 206 reports).

Results: NLP algorithms demonstrated very high performance compared to manual annotation. The algorithm to extract HCM diagnosis had accuracy of 0.99. The accuracy for categorical concepts included HCM morphologic subtype 0.99, systolic anterior motion of the mitral valve 0.96, mitral regurgitation 0.93, left ventricular (LV) obstruction 0.94, location of obstruction 0.92, apical pouch 0.98, LV delayed enhancement 0.93, left atrial enlargement 0.99 and right atrial enlargement 0.98. Accuracy for numeric concepts included maximal LV wall thickness 0.96, LV mass 0.99, LV mass index 0.98, LV ejection fraction 0.98 and right ventricular ejection fraction 0.99.

Conclusions: NLP identified and classified HCM from CMR narrative text reports with very high performance.
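
As an illustration of the two-tier idea and of scoring against manual annotation, the following is a minimal Python sketch. It is not the authors' system: the abstract does not describe the NLP rules, so the regular expressions, the negation cues, and the function names (`mentions_hcm`, `extract_lvef`, `accuracy`) are all assumptions made for this example, and accuracy is taken here to mean the fraction of extracted values that agree with the annotated values.

```python
import re
from typing import Optional, Sequence

# All patterns below are hypothetical; the abstract does not give the real rules.
HCM_PATTERN = re.compile(r"\bhypertrophic\s+cardiomyopathy\b|\bHCM\b", re.IGNORECASE)
NEGATION_PATTERN = re.compile(r"\b(?:no|without|negative for)\b", re.IGNORECASE)
LVEF_PATTERN = re.compile(
    r"\b(?:LV|left ventricular)\s+ejection\s+fraction\b"
    r"[^0-9.]{0,20}(\d{1,3}(?:\.\d+)?)\s*%",
    re.IGNORECASE,
)


def mentions_hcm(report: str) -> bool:
    """Tier 1 (assumed): True if any sentence asserts HCM without a preceding negation cue."""
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        match = HCM_PATTERN.search(sentence)
        # Only look for negation cues in the text before the HCM mention.
        if match and not NEGATION_PATTERN.search(sentence[: match.start()]):
            return True
    return False


def extract_lvef(report: str) -> Optional[float]:
    """Tier 2 (assumed): first LV ejection fraction percentage in the report, or None."""
    match = LVEF_PATTERN.search(report)
    return float(match.group(1)) if match else None


def accuracy(predicted: Sequence[object], annotated: Sequence[object]) -> float:
    """Fraction of NLP outputs that equal the manual annotation for the same report."""
    if not annotated or len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must be non-empty and the same length")
    agreements = sum(p == a for p, a in zip(predicted, annotated))
    return agreements / len(annotated)
```

In this sketch a report would pass through `mentions_hcm` first, and concept extractors such as `extract_lvef` would run only on reports flagged as HCM; `accuracy` would then be computed per concept on the held-out testing set.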

Original language	English (US)
Article number	272
Journal	BMC Medical Informatics and Decision Making
Volume	22
Issue number	1
DOIs	https://doi.org/10.1186/s12911-022-02017-y
State	Published - Dec 2022

Keywords

Cardiac magnetic resonance imaging
Hypertrophic cardiomyopathy
Natural language processing
Radiology reports

ASJC Scopus subject areas

Health Policy
Health Informatics
Computer Science Applications


Cite this

Dewaswala, N., Chen, D., Bhopalwala, H., Kaggal, V. C., Murphy, S. P., Bos, J. M., Geske, J. B., Gersh, B. J., Ommen, S. R., Araoz, P. A., Ackerman, M. J., & Arruda-Olson, A. M. (2022). Natural language processing for identification of hypertrophic cardiomyopathy patients from cardiac magnetic resonance reports. BMC Medical Informatics and Decision Making, 22(1), Article 272. https://doi.org/10.1186/s12911-022-02017-y

Dewaswala, N, Chen, D, Bhopalwala, H, Kaggal, VC, Murphy, SP, Bos, JM, Geske, JB, Gersh, BJ, Ommen, SR, Araoz, PA, Ackerman, MJ & Arruda-Olson, AM 2022, 'Natural language processing for identification of hypertrophic cardiomyopathy patients from cardiac magnetic resonance reports', BMC Medical Informatics and Decision Making, vol. 22, no. 1, 272. https://doi.org/10.1186/s12911-022-02017-y

@article{482e77247809423e8c7475be6ce70e16,

title = "Natural language processing for identification of hypertrophic cardiomyopathy patients from cardiac magnetic resonance reports",

keywords = "Cardiac magnetic resonance imaging, Hypertrophic cardiomyopathy, Natural language processing, Radiology reports",

author = "Nakeya Dewaswala and David Chen and Huzefa Bhopalwala and Kaggal, {Vinod C.} and Murphy, {Sean P.} and Bos, {J. Martijn} and Geske, {Jeffrey B.} and Gersh, {Bernard J.} and Ommen, {Steve R.} and Araoz, {Philip A.} and Ackerman, {Michael J.} and Arruda-Olson, {Adelaide M.}",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s).",

year = "2022",

month = dec,

doi = "10.1186/s12911-022-02017-y",

language = "English (US)",

volume = "22",

journal = "BMC Medical Informatics and Decision Making",

issn = "1472-6947",

publisher = "BioMed Central",

number = "1",

}

TY - JOUR

T1 - Natural language processing for identification of hypertrophic cardiomyopathy patients from cardiac magnetic resonance reports

AU - Dewaswala, Nakeya

AU - Chen, David

AU - Bhopalwala, Huzefa

AU - Kaggal, Vinod C.

AU - Murphy, Sean P.

AU - Bos, J. Martijn

AU - Geske, Jeffrey B.

AU - Gersh, Bernard J.

AU - Ommen, Steve R.

AU - Araoz, Philip A.

AU - Ackerman, Michael J.

AU - Arruda-Olson, Adelaide M.

PY - 2022/12

Y1 - 2022/12

KW - Cardiac magnetic resonance imaging

KW - Hypertrophic cardiomyopathy

KW - Natural language processing

KW - Radiology reports

UR - http://www.scopus.com/inward/record.url?scp=85140056555&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85140056555&partnerID=8YFLogxK

U2 - 10.1186/s12911-022-02017-y

DO - 10.1186/s12911-022-02017-y

M3 - Article

C2 - 36258218

AN - SCOPUS:85140056555

SN - 1472-6947

VL - 22

JO - BMC Medical Informatics and Decision Making

JF - BMC Medical Informatics and Decision Making

IS - 1

M1 - 272

ER -
