Natural language processing of radiology reports for identification of skeletal site-specific fractures

Yanshan Wang, Saeed Mehrabi, Sunghwan Sohn, Elizabeth J. Atkinson, Shreyasee Amin, Hongfang D Liu

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures have a significant impact on reducing the cost of care by preventing future fractures and their corresponding complications.

Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research.

Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, and 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%).

Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions for those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.
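Illustrative sketch (not from the article): the abstract describes regular-expression rules applied to radiology reports and evaluated with micro-averaged metrics. The Python snippet below is a minimal sketch of that general idea only. It does not use MedTagger or UIMA, and the two site names, the patterns, the simple sentence-level negation check, and the toy reports and gold labels are all hypothetical assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical patterns for two illustrative sites; the study's actual
# MedTagger rules for its twenty sites are not given in the abstract.
FRACTURE = re.compile(r"\bfractur\w*", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:no|without|negative for)\b", re.IGNORECASE)
SITES = {
    "hip": re.compile(r"\b(?:hip|femoral neck)\b", re.IGNORECASE),
    "wrist": re.compile(r"\b(?:wrist|distal radius)\b", re.IGNORECASE),
}


def detect_sites(report):
    """Return the set of sites with a non-negated fracture mention."""
    found = set()
    for sentence in report.split("."):
        # Skip sentences with no fracture term or with a negation cue.
        if not FRACTURE.search(sentence) or NEGATION.search(sentence):
            continue
        found.update(s for s, pat in SITES.items() if pat.search(sentence))
    return found


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def micro_metrics(predicted, gold, sites):
    """Pool TP/FP/FN/TN over every (report, site) pair, then compute metrics."""
    c = Counter()
    for pred, true in zip(predicted, gold):
        for site in sites:
            if site in pred and site in true:
                c["tp"] += 1
            elif site in pred:
                c["fp"] += 1
            elif site in true:
                c["fn"] += 1
            else:
                c["tn"] += 1
    tp, fp, fn, tn = c["tp"], c["fp"], c["fn"], c["tn"]
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy reports and gold labels, invented for this sketch.
reports = [
    "Acute fracture of the left femoral neck.",
    "No evidence of fracture in the right wrist.",
    "Healed distal radius fracture. Hip joint spaces are preserved.",
    "Comminuted break of the wrist.",
]
gold = [{"hip"}, set(), {"wrist"}, {"wrist"}]

predicted = [detect_sites(r) for r in reports]
metrics = micro_metrics(predicted, gold, list(SITES))
print(predicted)
print(metrics)
```

In this toy data the last report is missed because it says "break" rather than "fracture", which is the kind of gap that iterative rule refinement with physicians, as described in the abstract, would be expected to close.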

Original language: English (US)
Article number: 73
Journal: BMC Medical Informatics and Decision Making
Volume: 19
DOI: 10.1186/s12911-019-0780-5
State: Published - Apr 4 2019

Fingerprint

Natural Language Processing
Radiology
Osteoporosis
Population Surveillance
Physicians
Information Management
Electronic Health Records
Information Storage and Retrieval
Research Ethics Committees
Public Health
Costs and Cost Analysis
Sensitivity and Specificity

Keywords

  • Electronic health records
  • Fracture identification
  • Natural language processing
  • Radiology reports

ASJC Scopus subject areas

  • Health Policy
  • Health Informatics

Cite this

Natural language processing of radiology reports for identification of skeletal site-specific fractures. / Wang, Yanshan; Mehrabi, Saeed; Sohn, Sunghwan; Atkinson, Elizabeth J.; Amin, Shreyasee; Liu, Hongfang D.

In: BMC Medical Informatics and Decision Making, Vol. 19, 73, 04.04.2019.


@article{6f7487605eb349a9b124fdc27dfff7b0,
title = "Natural language processing of radiology reports for identification of skeletal site-specific fractures",
abstract = "Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85{\%}). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. 
The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.",
keywords = "Electronic health records, Fracture identification, Natural language processing, Radiology reports",
author = "Yanshan Wang and Saeed Mehrabi and Sunghwan Sohn and Atkinson, {Elizabeth J.} and Shreyasee Amin and Liu, {Hongfang D}",
year = "2019",
month = "4",
day = "4",
doi = "10.1186/s12911-019-0780-5",
language = "English (US)",
volume = "19",
journal = "BMC Medical Informatics and Decision Making",
issn = "1472-6947",
publisher = "BioMed Central",
}

TY - JOUR

T1 - Natural language processing of radiology reports for identification of skeletal site-specific fractures

AU - Wang, Yanshan

AU - Mehrabi, Saeed

AU - Sohn, Sunghwan

AU - Atkinson, Elizabeth J.

AU - Amin, Shreyasee

AU - Liu, Hongfang D

PY - 2019/4/4

Y1 - 2019/4/4

N2 - Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. 
The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.

AB - Background: Osteoporosis has become an important public health issue. Most of the population, particularly elderly people, are at some degree of risk of osteoporosis-related fractures. Accurate identification and surveillance of patient populations with fractures has a significant impact on reduction of cost of care by preventing future fractures and its corresponding complications. Methods: In this study, we developed a rule-based natural language processing (NLP) algorithm for identification of twenty skeletal site-specific fractures from radiology reports. The rule-based NLP algorithm was based on regular expressions developed using MedTagger, an NLP tool of the Apache Unstructured Information Management Architecture (UIMA) pipeline to facilitate information extraction from clinical narratives. Radiology notes were retrieved from the Mayo Clinic electronic health records data warehouse. We developed rules for identifying each fracture type according to physicians' knowledge and experience, and refined these rules via verification with physicians. This study was approved by the institutional review board (IRB) for human subject research. Results: We validated the NLP algorithm using the radiology reports of a community-based cohort at Mayo Clinic with the gold standard constructed by medical experts. The micro-averaged results of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1-score of the proposed NLP algorithm are 0.930, 1.0, 1.0, 0.941, 0.961, respectively. The F1-score is 1.0 for 8 fractures, and above 0.9 for a total of 17 out of 20 fractures (85%). Conclusions: The results verified the effectiveness of the proposed rule-based NLP algorithm in automatic identification of osteoporosis-related skeletal site-specific fractures from radiology reports. 
The NLP algorithm could be utilized to accurately identify the patients with fractures and those who are also at high risk of future fractures due to osteoporosis. Appropriate care interventions to those patients, not only the most at-risk patients but also those with emerging risk, would significantly reduce future fractures.

KW - Electronic health records

KW - Fracture identification

KW - Natural language processing

KW - Radiology reports

UR - http://www.scopus.com/inward/record.url?scp=85063931750&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063931750&partnerID=8YFLogxK

U2 - 10.1186/s12911-019-0780-5

DO - 10.1186/s12911-019-0780-5

M3 - Article

VL - 19

JO - BMC Medical Informatics and Decision Making

JF - BMC Medical Informatics and Decision Making

SN - 1472-6947

M1 - 73

ER -