MedXN

An open source medication extraction and normalization tool for clinical text

Sunghwan Sohn, Cheryl Clark, Scott R. Halgrim, Sean P. Murphy, Christopher G. Chute, Hongfang D Liu

Research output: Contribution to journal, Article

19 Citations (Scopus)

Abstract

Objective: We developed the Medication Extraction and Normalization (MedXN) system to extract comprehensive medication information and normalize it to the most appropriate RxNorm concept unique identifier (RxCUI) as specifically as possible. Methods: Medication descriptions in clinical notes were decomposed into medication name and attributes, which were separately extracted using RxNorm dictionary lookup and regular expressions. Then, each medication name and its attributes were combined according to RxNorm convention to find the most appropriate RxNorm representation. To do this, we employed serialized hierarchical steps implemented in Apache's Unstructured Information Management Architecture (UIMA). We also performed synonym expansion, removed false medications, and employed inference rules to improve the medication extraction and normalization performance. Results: An evaluation on test data of 397 medication mentions showed F-measures of 0.975 for medication name and over 0.90 for most attributes. The RxCUI assignment produced F-measures of 0.932 for medication name and 0.864 for full medication information. Most false negative RxCUI assignments in full medication information were due to human assumption of missing attributes and medication names in the gold standard. Conclusions: The MedXN system (http://sourceforge.net/projects/ohnlp/files/MedXN/) was able to extract comprehensive medication information with high accuracy and demonstrated good normalization capability to RxCUI as long as explicit evidence existed. More sophisticated inference rules might result in further improvements to specific RxCUI assignments for incomplete medication descriptions.
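The decomposition described in the Methods can be illustrated with a minimal sketch. This is not the MedXN implementation (which runs as a UIMA pipeline); it only shows the general idea of finding a medication name by dictionary lookup, pulling attributes with regular expressions, and combining them in a name, strength, dose form order loosely modeled on RxNorm naming. The dictionary entries, the placeholder identifiers (not real RxCUIs), the regular expressions, and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical toy dictionary: lowercase medication name -> placeholder ID.
# These identifiers are invented for illustration and are NOT real RxCUIs.
DRUG_DICTIONARY = {
    "aspirin": "CUI-0001",
    "metformin": "CUI-0002",
}

# Illustrative attribute patterns (assumptions, far simpler than a real system).
STRENGTH_RE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg|mcg|g|ml)\b", re.IGNORECASE)
FORM_RE = re.compile(r"\b(tablet|capsule|solution)s?\b", re.IGNORECASE)
FREQUENCY_RE = re.compile(
    r"\b(once daily|twice daily|daily|bid|tid|qid)\b", re.IGNORECASE
)


def extract_medications(sentence):
    """Return one record per dictionary name found in the sentence."""
    results = []
    lowered = sentence.lower()
    for name, placeholder_id in DRUG_DICTIONARY.items():
        # Step 1: dictionary lookup for the medication name.
        match = re.search(r"\b" + re.escape(name) + r"\b", lowered)
        if match is None:
            continue
        # Step 2: regular expressions over the text following the name.
        tail = sentence[match.end():]
        strength = STRENGTH_RE.search(tail)
        form = FORM_RE.search(tail)
        frequency = FREQUENCY_RE.search(tail)
        strength_text = (
            f"{strength.group(1)} {strength.group(2).upper()}" if strength else None
        )
        form_text = form.group(1).lower() if form else None
        # Step 3: combine name and attributes into one normalized string.
        parts = [name] + [p for p in (strength_text, form_text) if p]
        results.append(
            {
                "name": name,
                "id": placeholder_id,
                "strength": strength_text,
                "form": form_text,
                "frequency": frequency.group(1).lower() if frequency else None,
                "normalized": " ".join(parts),
            }
        )
    return results


if __name__ == "__main__":
    for record in extract_medications(
        "Patient takes aspirin 81 mg tablet once daily."
    ):
        print(record)
```

In a real system the normalized string would then be matched against RxNorm to select the most specific RxCUI that the extracted evidence supports; that lookup is omitted here because it depends on the RxNorm release in use.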

Original language: English (US)
Pages (from-to): 858-865
Number of pages: 8
Journal: Journal of the American Medical Informatics Association
Volume: 21
Issue number: 5
DOI: 10.1136/amiajnl-2013-002190
State: Published - 2014

Fingerprint

RxNorm
Names
Information Management

ASJC Scopus subject areas

  • Health Informatics

Cite this

MedXN: An open source medication extraction and normalization tool for clinical text. / Sohn, Sunghwan; Clark, Cheryl; Halgrim, Scott R.; Murphy, Sean P.; Chute, Christopher G.; Liu, Hongfang D.

In: Journal of the American Medical Informatics Association, Vol. 21, No. 5, 2014, p. 858-865.

@article{8e8be0144ff644c0b04b2138be5ee6e6,
title = "MedXN: An open source medication extraction and normalization tool for clinical text",
abstract = "Objective: We developed the Medication Extraction and Normalization (MedXN) system to extract comprehensive medication information and normalize it to the most appropriate RxNorm concept unique identifier (RxCUI) as specifically as possible. Methods: Medication descriptions in clinical notes were decomposed into medication name and attributes, which were separately extracted using RxNorm dictionary lookup and regular expressions. Then, each medication name and its attributes were combined according to RxNorm convention to find the most appropriate RxNorm representation. To do this, we employed serialized hierarchical steps implemented in Apache's Unstructured Information Management Architecture (UIMA). We also performed synonym expansion, removed false medications, and employed inference rules to improve the medication extraction and normalization performance. Results: An evaluation on test data of 397 medication mentions showed F-measures of 0.975 for medication name and over 0.90 for most attributes. The RxCUI assignment produced F-measures of 0.932 for medication name and 0.864 for full medication information. Most false negative RxCUI assignments in full medication information were due to human assumption of missing attributes and medication names in the gold standard. Conclusions: The MedXN system (http://sourceforge.net/projects/ohnlp/files/MedXN/) was able to extract comprehensive medication information with high accuracy and demonstrated good normalization capability to RxCUI as long as explicit evidence existed. More sophisticated inference rules might result in further improvements to specific RxCUI assignments for incomplete medication descriptions.",
author = "Sunghwan Sohn and Cheryl Clark and Halgrim, {Scott R.} and Murphy, {Sean P.} and Chute, {Christopher G.} and Liu, {Hongfang D}",
year = "2014",
doi = "10.1136/amiajnl-2013-002190",
language = "English (US)",
volume = "21",
pages = "858--865",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "5",

}

TY - JOUR

T1 - MedXN

T2 - An open source medication extraction and normalization tool for clinical text

AU - Sohn, Sunghwan

AU - Clark, Cheryl

AU - Halgrim, Scott R.

AU - Murphy, Sean P.

AU - Chute, Christopher G.

AU - Liu, Hongfang D

PY - 2014

Y1 - 2014

AB - Objective: We developed the Medication Extraction and Normalization (MedXN) system to extract comprehensive medication information and normalize it to the most appropriate RxNorm concept unique identifier (RxCUI) as specifically as possible. Methods: Medication descriptions in clinical notes were decomposed into medication name and attributes, which were separately extracted using RxNorm dictionary lookup and regular expressions. Then, each medication name and its attributes were combined according to RxNorm convention to find the most appropriate RxNorm representation. To do this, we employed serialized hierarchical steps implemented in Apache's Unstructured Information Management Architecture (UIMA). We also performed synonym expansion, removed false medications, and employed inference rules to improve the medication extraction and normalization performance. Results: An evaluation on test data of 397 medication mentions showed F-measures of 0.975 for medication name and over 0.90 for most attributes. The RxCUI assignment produced F-measures of 0.932 for medication name and 0.864 for full medication information. Most false negative RxCUI assignments in full medication information were due to human assumption of missing attributes and medication names in the gold standard. Conclusions: The MedXN system (http://sourceforge.net/projects/ohnlp/files/MedXN/) was able to extract comprehensive medication information with high accuracy and demonstrated good normalization capability to RxCUI as long as explicit evidence existed. More sophisticated inference rules might result in further improvements to specific RxCUI assignments for incomplete medication descriptions.

UR - http://www.scopus.com/inward/record.url?scp=84906323732&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906323732&partnerID=8YFLogxK

U2 - 10.1136/amiajnl-2013-002190

DO - 10.1136/amiajnl-2013-002190

M3 - Article

VL - 21

SP - 858

EP - 865

JO - Journal of the American Medical Informatics Association : JAMIA

JF - Journal of the American Medical Informatics Association : JAMIA

SN - 1067-5027

IS - 5

ER -