Natural language processing of radiology reports for identification of skeletal site-specific fractures

Yanshan Wang; Saeed Mehrabi; Sunghwan Sohn; Elizabeth J. Atkinson; Shreyasee Amin; Hongfang Liu

doi:10.1186/s12911-019-0780-5

Natural language processing of radiology reports for identification of skeletal site-specific fractures

Yanshan Wang, Saeed Mehrabi, Sunghwan Sohn, Elizabeth J. Atkinson, Shreyasee Amin, Hongfang Liu

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.

Original language	English (US)
Article number	73
Journal	BMC Medical Informatics and Decision Making
Volume	19
DOIs	https://doi.org/10.1186/s12911-019-0780-5
State	Published - Apr 4 2019

Keywords

Electronic health records
Fracture identification
Natural language processing
Radiology reports

ASJC Scopus subject areas

Health Policy
Health Informatics

Access to Document

10.1186/s12911-019-0780-5

Cite this

@article{6f7487605eb349a9b124fdc27dfff7b0,

title = "Natural language processing of radiology reports for identification of skeletal site-specific fractures",

abstract = "Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.",

keywords = "Electronic health records, Fracture identification, Natural language processing, Radiology reports",

author = "Yanshan Wang and Saeed Mehrabi and Sunghwan Sohn and Atkinson, {Elizabeth J.} and Shreyasee Amin and Hongfang Liu",

note = "Publisher Copyright: {\textcopyright} 2019 The Author(s).",

year = "2019",

month = apr,

day = "4",

doi = "10.1186/s12911-019-0780-5",

language = "English (US)",

volume = "19",

journal = "BMC Medical Informatics and Decision Making",

issn = "1472-6947",

publisher = "BioMed Central",

}

TY - JOUR

T1 - Natural language processing of radiology reports for identification of skeletal site-specific fractures

AU - Wang, Yanshan

AU - Mehrabi, Saeed

AU - Sohn, Sunghwan

AU - Atkinson, Elizabeth J.

AU - Amin, Shreyasee

AU - Liu, Hongfang

PY - 2019/4/4

Y1 - 2019/4/4

N2 - Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.

AB - Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.

KW - Electronic health records

KW - Fracture identification

KW - Natural language processing

KW - Radiology reports

UR - http://www.scopus.com/inward/record.url?scp=85063931750&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063931750&partnerID=8YFLogxK

U2 - 10.1186/s12911-019-0780-5

DO - 10.1186/s12911-019-0780-5

M3 - Article

C2 - 30943952

AN - SCOPUS:85063931750

SN - 1472-6947

VL - 19

JO - BMC Medical Informatics and Decision Making

JF - BMC Medical Informatics and Decision Making

M1 - 73

ER -

Natural language processing of radiology reports for identification of skeletal site-specific fractures

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this