Evaluating the UMLS as a source of lexical knowledge for medical language processing.

C. Friedman, Hongfang D Liu, L. Shagina, S. Johnson, G. Hripcsak

Research output: Contribution to journal › Article

31 Citations (Scopus)

Abstract

Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.
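
For readers unfamiliar with the three measures reported above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally computed from confusion-matrix counts. The function name `confusion_metrics` and the counts in the usage lines are hypothetical illustrations; they are not taken from the paper, which does not give its raw counts in this abstract.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: positive cases that were found / missed.
    tn/fp: negative cases that were correctly rejected / wrongly flagged.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


# Hypothetical counts for illustration only (NOT the study's data).
metrics = confusion_metrics(tp=8, fp=1, tn=9, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these definitions, a lexicon with lower coverage would be expected to miss more findings (more false negatives), which lowers sensitivity while leaving specificity largely unaffected.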

Original language: English (US)
Pages (from-to): 189-193
Number of pages: 5
Journal: Proceedings / AMIA ... Annual Symposium. AMIA Symposium
State: Published - 2001
Externally published: Yes

Fingerprint

Unified Medical Language System
Language
Terminology
Sensitivity and Specificity

Cite this

Evaluating the UMLS as a source of lexical knowledge for medical language processing. / Friedman, C.; Liu, Hongfang D; Shagina, L.; Johnson, S.; Hripcsak, G.

In: Proceedings / AMIA ... Annual Symposium. AMIA Symposium, 2001, p. 189-193.

@article{a282ff88d92e4baa88a1a7b5e70cccfe,
title = "Evaluating the {UMLS} as a source of lexical knowledge for medical language processing",
abstract = "Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.",
author = "C. Friedman and Liu, {Hongfang D} and L. Shagina and S. Johnson and G. Hripcsak",
year = "2001",
language = "English (US)",
pages = "189--193",
journal = "Proceedings / AMIA ... Annual Symposium. AMIA Symposium",
issn = "1531-605X",
publisher = "Hanley \& Belfus",

}

TY - JOUR

T1 - Evaluating the UMLS as a source of lexical knowledge for medical language processing.

AU - Friedman, C.

AU - Liu, Hongfang D

AU - Shagina, L.

AU - Johnson, S.

AU - Hripcsak, G.

PY - 2001

Y1 - 2001

N2 - Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.

AB - Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.

UR - http://www.scopus.com/inward/record.url?scp=0035753772&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035753772&partnerID=8YFLogxK

M3 - Article

C2 - 11825178

AN - SCOPUS:0035753772

SP - 189

EP - 193

JO - Proceedings / AMIA ... Annual Symposium. AMIA Symposium

JF - Proceedings / AMIA ... Annual Symposium. AMIA Symposium

SN - 1531-605X

ER -