Evaluating the UMLS as a source of lexical knowledge for medical language processing.

C. Friedman; H. Liu; L. Shagina; S. Johnson; G. Hripcsak

Digital Health Sciences

Research output: Contribution to journal › Article › peer-review

31 Scopus citations

Abstract

Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.
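For readers unfamiliar with the three measures reported in the abstract, the following is a minimal sketch of their standard definitions in terms of confusion-matrix counts. The abstract does not say how the study tallied its counts, so the `ConfusionCounts` class, the function names, and the example numbers are all hypothetical and serve only to illustrate the formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical tallies of system output against a reference standard."""
    tp: int  # true positives: finding present and reported
    fp: int  # false positives: finding absent but reported
    tn: int  # true negatives: finding absent and not reported
    fn: int  # false negatives: finding present but missed


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all decisions that were correct."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly present findings that were reported."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly absent findings that were correctly not reported."""
    return c.tn / (c.tn + c.fp)


# Illustrative counts only; these are NOT taken from the paper.
example = ConfusionCounts(tp=8, fp=2, tn=18, fn=2)
print(accuracy(example), sensitivity(example), specificity(example))
```

Under these definitions, a lexicon with incomplete coverage would be expected to lower sensitivity (more missed findings) while leaving specificity largely unaffected, which is the pattern the abstract reports for the UMLS-based lexicon.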

Original language	English (US)
Pages (from-to)	189-193
Number of pages	5
Journal	Proceedings / AMIA ... Annual Symposium. AMIA Symposium
State	Published - 2001

ASJC Scopus subject areas

General Medicine

Cite this

@article{a282ff88d92e4baa88a1a7b5e70cccfe,

title = "Evaluating the UMLS as a source of lexical knowledge for medical language processing.",

abstract = "Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.",

author = "C. Friedman and H. Liu and L. Shagina and S. Johnson and G. Hripcsak",

year = "2001",

language = "English (US)",

pages = "189--193",

journal = "Proceedings / AMIA ... Annual Symposium. AMIA Symposium",

issn = "1531-605X",

publisher = "Hanley \& Belfus",

}

TY - JOUR

T1 - Evaluating the UMLS as a source of lexical knowledge for medical language processing.

AU - Friedman, C.

AU - Liu, H.

AU - Shagina, L.

AU - Johnson, S.

AU - Hripcsak, G.

PY - 2001

Y1 - 2001

N2 - Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.

AB - Medical language processing (MLP) systems rely on specialized lexicons in order to recognize, classify, and normalize medical terminology, and the performance of an MLP system is dependent on the coverage and quality of such lexicons. However, the acquisition of lexical knowledge is expensive and time-consuming. The UMLS is a comprehensive resource that can be used to acquire lexical knowledge needed for medical language processing. This paper describes methods that use these resources to automatically create lexical entries and generate two lexicons. The first lexicon was created primarily using the UMLS, whereas the second was created by supplementing the lexicon of an existing MLP system called MedLEE with entries based on the UMLS. We subsequently carried out a study, which is the primary focus of this paper, using MedLEE with each of the two lexicons and also the current MedLEE lexicon to measure performance. Overall accuracy, sensitivity, and specificity using the lexicon primarily based on the UMLS were .86, .60, and .96, respectively. Those measures using the MedLEE lexicon alone were .93, .81, and .93, which was significantly better except for specificity; performance using the supplemental lexicon was exactly the same as performance using solely the MedLEE lexicon.

UR - http://www.scopus.com/inward/record.url?scp=0035753772&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0035753772&partnerID=8YFLogxK

M3 - Article

C2 - 11825178

AN - SCOPUS:0035753772

SN - 1531-605X

SP - 189

EP - 193

JO - Proceedings / AMIA ... Annual Symposium. AMIA Symposium

JF - Proceedings / AMIA ... Annual Symposium. AMIA Symposium

ER -