Using Natural Language Processing of Free-Text Radiology Reports to Identify Type 1 Modic Endplate Changes

Hannu T. Huhdanpaa, W. Katherine Tan, Sean D. Rundell, Pradeep Suri, Falgun H. Chokshi, Bryan A. Comstock, Patrick J. Heagerty, Kathryn T. James, Andrew L. Avins, Srdjan S. Nedeljkovic, David R. Nerenz, David F. Kallmes, Patrick H. Luetmer, Karen J. Sherman, Nancy L. Organ, Brent Griffith, Curtis P. Langlotz, David Carrell, Saeed Hassanpour, Jeffrey G. Jarvik

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Electronic medical record (EMR) systems provide easy access to radiology reports and offer great potential to support quality improvement efforts and clinical research. Harnessing the full potential of the EMR requires scalable approaches, such as natural language processing (NLP), to convert text into variables used for evaluation or analysis. Our goal was to determine the feasibility of using NLP to identify patients with Type 1 Modic endplate changes from clinical reports of magnetic resonance (MR) imaging examinations of the spine. Identifying patients with Type 1 Modic change who may be eligible for clinical trials matters because these findings may be important targets for intervention. Four annotators identified all reports that contained Type 1 Modic change among N = 458 randomly selected lumbar spine MR reports. We then implemented a rule-based NLP algorithm in Java using regular expressions. The prevalence of Type 1 Modic change in the annotated dataset was 10%. Results were recall (sensitivity) 35/50 = 0.70 (95% confidence interval (C.I.) 0.52–0.82), specificity 404/408 = 0.99 (0.97–1.0), precision (positive predictive value) 35/39 = 0.90 (0.75–0.97), negative predictive value 404/419 = 0.96 (0.94–0.98), and F1-score 0.79 (0.43–1.0). Our evaluation shows the efficacy of a rule-based NLP approach for identifying patients with Type 1 Modic change when the emphasis is on identifying only relevant cases and there is low concern regarding false negatives. As expected, specificity is higher than recall: the enormous variability of lumbar spine reporting makes it inherently difficult to elicit all possible keywords, which decreases recall, while the availability of good negation algorithms improves specificity.
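The approach summarized in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the Java used in the study, and it is not the authors' algorithm: the keyword pattern, the negation cues, and the function names `has_type1_modic` and `metrics` are all assumptions made for illustration, since the abstract does not list the actual rules. The `metrics` function only restates the standard definitions of the reported measures in terms of confusion-matrix counts.

```python
import re

# Hypothetical keyword rule (not from the paper): matches "Type 1 Modic"
# or "Modic type 1", also accepting "I" or "one" for the type number.
MODIC_TYPE1 = re.compile(
    r"\b(?:type\s*(?:1|i|one)\s+modic|modic\s+type\s*(?:1|i|one))\b",
    re.IGNORECASE,
)

# Hypothetical negation cues (not from the paper): a cue negates a mention
# that follows it within the same sentence.
NEGATION = re.compile(
    r"\b(?:no evidence of|negative for|without|no)\b[^.;]*$",
    re.IGNORECASE,
)


def has_type1_modic(report: str) -> bool:
    """Return True if any sentence holds a non-negated Type 1 Modic mention."""
    for sentence in re.split(r"(?<=[.;])\s+", report):
        for match in MODIC_TYPE1.finditer(sentence):
            # Look for a negation cue in the text before the mention.
            if not NEGATION.search(sentence[: match.start()]):
                return True
    return False


def metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Point estimates of the measures reported in the abstract."""
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)       # positive predictive value
    npv = tn / (tn + fn)             # negative predictive value
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


if __name__ == "__main__":
    print(has_type1_modic("Type 1 Modic endplate changes at L4-L5."))
    print(has_type1_modic("No evidence of Modic type 1 changes."))
    # Counts implied by the fractions in the abstract:
    # 35 true positives, 15 false negatives, 4 false positives, 404 true negatives.
    for name, value in metrics(tp=35, fn=15, fp=4, tn=404).items():
        print(f"{name}: {value:.2f}")
```

With the counts implied by the abstract's fractions (35/50, 404/408, 35/39, 404/419), `metrics` reproduces the reported point estimates; the confidence intervals are not recomputed here because the abstract does not say how they were derived. The sketch also shows the trade-off the abstract describes: a report that phrases the finding in a way the keyword pattern does not anticipate is missed, which lowers recall, while the negation check removes explicitly negated mentions, which protects specificity.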

Original language: English (US)
Pages (from-to): 1-7
Number of pages: 7
Journal: Journal of Digital Imaging
DOI: https://doi.org/10.1007/s10278-017-0013-3
State: Accepted/In press - Aug 14 2017

Keywords

  • Lumbar spine imaging
  • Modic classification
  • Natural language processing
  • Radiology reporting

ASJC Scopus subject areas

  • Radiological and Ultrasound Technology
  • Radiology, Nuclear Medicine and Imaging
  • Computer Science Applications

Cite this

Huhdanpaa, H. T., Tan, W. K., Rundell, S. D., Suri, P., Chokshi, F. H., Comstock, B. A., ... Jarvik, J. G. (Accepted/In press). Using Natural Language Processing of Free-Text Radiology Reports to Identify Type 1 Modic Endplate Changes. Journal of Digital Imaging, 1-7. https://doi.org/10.1007/s10278-017-0013-3

@article{85246a1b71c24273a2ec532fee2d319d,
title = "Using Natural Language Processing of Free-Text Radiology Reports to Identify Type 1 Modic Endplate Changes",
keywords = "Lumbar spine imaging, Modic classification, Natural language processing, Radiology reporting",
author = "Huhdanpaa, {Hannu T.} and Tan, {W. Katherine} and Rundell, {Sean D.} and Pradeep Suri and Chokshi, {Falgun H.} and Comstock, {Bryan A.} and Heagerty, {Patrick J.} and James, {Kathryn T.} and Avins, {Andrew L.} and Nedeljkovic, {Srdjan S.} and Nerenz, {David R.} and Kallmes, {David F} and Luetmer, {Patrick H} and Sherman, {Karen J.} and Organ, {Nancy L.} and Brent Griffith and Langlotz, {Curtis P.} and David Carrell and Saeed Hassanpour and Jarvik, {Jeffrey G.}",
year = "2017",
month = "8",
day = "14",
doi = "10.1007/s10278-017-0013-3",
language = "English (US)",
pages = "1--7",
journal = "Journal of Digital Imaging",
issn = "0897-1889",
publisher = "Springer New York",

}

TY - JOUR

T1 - Using Natural Language Processing of Free-Text Radiology Reports to Identify Type 1 Modic Endplate Changes

AU - Huhdanpaa, Hannu T.

AU - Tan, W. Katherine

AU - Rundell, Sean D.

AU - Suri, Pradeep

AU - Chokshi, Falgun H.

AU - Comstock, Bryan A.

AU - Heagerty, Patrick J.

AU - James, Kathryn T.

AU - Avins, Andrew L.

AU - Nedeljkovic, Srdjan S.

AU - Nerenz, David R.

AU - Kallmes, David F

AU - Luetmer, Patrick H

AU - Sherman, Karen J.

AU - Organ, Nancy L.

AU - Griffith, Brent

AU - Langlotz, Curtis P.

AU - Carrell, David

AU - Hassanpour, Saeed

AU - Jarvik, Jeffrey G.

PY - 2017/8/14

Y1 - 2017/8/14

KW - Lumbar spine imaging

KW - Modic classification

KW - Natural language processing

KW - Radiology reporting

UR - http://www.scopus.com/inward/record.url?scp=85028527814&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028527814&partnerID=8YFLogxK

U2 - 10.1007/s10278-017-0013-3

DO - 10.1007/s10278-017-0013-3

M3 - Article

C2 - 28808792

AN - SCOPUS:85028527814

SP - 1

EP - 7

JO - Journal of Digital Imaging

JF - Journal of Digital Imaging

SN - 0897-1889

ER -