A study of abbreviations in MEDLINE abstracts.

Hongfang D Liu, Alan R. Aronson, Carol Friedman

Research output: Contribution to journal, Article

43 Citations (Scopus)

Abstract

Abbreviations are widely used in writing, and the understanding of abbreviations is important for natural language processing applications. Abbreviations are not always defined in a document and they are highly ambiguous. A knowledge base that consists of abbreviations with their associated senses and a method to resolve the ambiguities are needed. In this paper, we studied the UMLS coverage, textual variants of senses, and the ambiguity of abbreviations in MEDLINE abstracts. We restricted our study to three-letter abbreviations which were defined using parenthetical expressions. When grouping similar expansions together and representing senses using groups, we found that after ignoring senses where the total number of occurrences within the corresponding group was less than 100, 82.8% of the senses matched the UMLS, covered over 93% of occurrences that were considered, and had an average of 7.74 expansions for each sense. Abbreviations are highly ambiguous: 81.2% of the abbreviations were ambiguous, and had an average of 16.6 senses. However, after ignoring senses with occurrences of less than 5, 64.6% of the abbreviations were ambiguous, and had an average of 4.91 senses.
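The kind of measurement the abstract describes (three-letter abbreviations defined in parentheses, then ambiguity statistics with a frequency cutoff) can be illustrated with a minimal Python sketch. It is not the authors' method. The regular expression, the initial-letter matching rule, and the function names `find_definitions`, `sense_counts`, and `ambiguity` are all assumptions made for illustration. The sketch also treats each distinct lowercased expansion as its own sense, whereas the study grouped similar expansions together, and the averaging convention used here (mean over all retained abbreviations) is likewise an assumption.

```python
import re
from collections import Counter, defaultdict

# Assumed pattern: a three-letter uppercase abbreviation in parentheses,
# e.g. "magnetic resonance imaging (MRI)".
PAREN_ABBR = re.compile(r"\(([A-Z]{3})\)")


def find_definitions(text):
    """Return (abbreviation, expansion) pairs defined parenthetically in text."""
    pairs = []
    for match in PAREN_ABBR.finditer(text):
        abbr = match.group(1)
        # Take the words immediately preceding the parenthesis.
        words = re.findall(r"[A-Za-z][A-Za-z-]*", text[:match.start()])
        candidate = words[-len(abbr):]
        # Simplifying assumption: accept only if word initials spell the abbreviation.
        if len(candidate) == len(abbr) and all(
            word[0].upper() == letter for word, letter in zip(candidate, abbr)
        ):
            pairs.append((abbr, " ".join(candidate).lower()))
    return pairs


def sense_counts(abstracts):
    """Count how often each expansion (treated as a sense) occurs per abbreviation."""
    counts = defaultdict(Counter)
    for text in abstracts:
        for abbr, expansion in find_definitions(text):
            counts[abbr][expansion] += 1
    return counts


def ambiguity(counts, min_occurrences=1):
    """Return (fraction of ambiguous abbreviations, mean senses per abbreviation).

    Senses seen fewer than min_occurrences times are ignored, and
    abbreviations left with no senses are dropped.
    """
    kept = {}
    for abbr, senses in counts.items():
        frequent = [s for s, n in senses.items() if n >= min_occurrences]
        if frequent:
            kept[abbr] = frequent
    if not kept:
        return 0.0, 0.0
    ambiguous = sum(1 for senses in kept.values() if len(senses) > 1)
    mean_senses = sum(len(senses) for senses in kept.values()) / len(kept)
    return ambiguous / len(kept), mean_senses
```

Calling `ambiguity(sense_counts(abstracts), min_occurrences=5)` on a list of abstract strings would mirror the abstract's "fewer than 5 occurrences" cutoff, under the simplifications stated above.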

Original language: English (US)
Pages (from-to): 464-468
Number of pages: 5
Journal: Proceedings / AMIA ... Annual Symposium. AMIA Symposium
State: Published - 2002
Externally published: Yes

Fingerprint

Unified Medical Language System
MEDLINE
Natural Language Processing
Knowledge Bases

Cite this

A study of abbreviations in MEDLINE abstracts. / Liu, Hongfang D; Aronson, Alan R.; Friedman, Carol.

In: Proceedings / AMIA ... Annual Symposium. AMIA Symposium, 2002, p. 464-468.

@article{78dce4d2bdc2435db7b7d6fcfe0ce0dd,
title = "A study of abbreviations in MEDLINE abstracts.",
abstract = "Abbreviations are widely used in writing, and the understanding of abbreviations is important for natural language processing applications. Abbreviations are not always defined in a document and they are highly ambiguous. A knowledge base that consists of abbreviations with their associated senses and a method to resolve the ambiguities are needed. In this paper, we studied the UMLS coverage, textual variants of senses, and the ambiguity of abbreviations in MEDLINE abstracts. We restricted our study to three-letter abbreviations which were defined using parenthetical expressions. When grouping similar expansions together and representing senses using groups, we found that after ignoring senses where the total number of occurrences within the corresponding group was less than 100, 82.8{\%} of the senses matched the UMLS, covered over 93{\%} of occurrences that were considered, and had an average of 7.74 expansions for each sense. Abbreviations are highly ambiguous: 81.2{\%} of the abbreviations were ambiguous, and had an average of 16.6 senses. However, after ignoring senses with occurrences of less than 5, 64.6{\%} of the abbreviations were ambiguous, and had an average of 4.91 senses.",
author = "Liu, {Hongfang D} and Aronson, {Alan R.} and Friedman, Carol",
year = "2002",
language = "English (US)",
pages = "464--468",
journal = "Proceedings / AMIA ... Annual Symposium. AMIA Symposium",
issn = "1531-605X",
publisher = "Hanley \& Belfus",

}

TY - JOUR

T1 - A study of abbreviations in MEDLINE abstracts.

AU - Liu, Hongfang D

AU - Aronson, Alan R.

AU - Friedman, Carol

PY - 2002

Y1 - 2002

N2 - Abbreviations are widely used in writing, and the understanding of abbreviations is important for natural language processing applications. Abbreviations are not always defined in a document and they are highly ambiguous. A knowledge base that consists of abbreviations with their associated senses and a method to resolve the ambiguities are needed. In this paper, we studied the UMLS coverage, textual variants of senses, and the ambiguity of abbreviations in MEDLINE abstracts. We restricted our study to three-letter abbreviations which were defined using parenthetical expressions. When grouping similar expansions together and representing senses using groups, we found that after ignoring senses where the total number of occurrences within the corresponding group was less than 100, 82.8% of the senses matched the UMLS, covered over 93% of occurrences that were considered, and had an average of 7.74 expansions for each sense. Abbreviations are highly ambiguous: 81.2% of the abbreviations were ambiguous, and had an average of 16.6 senses. However, after ignoring senses with occurrences of less than 5, 64.6% of the abbreviations were ambiguous, and had an average of 4.91 senses.

AB - Abbreviations are widely used in writing, and the understanding of abbreviations is important for natural language processing applications. Abbreviations are not always defined in a document and they are highly ambiguous. A knowledge base that consists of abbreviations with their associated senses and a method to resolve the ambiguities are needed. In this paper, we studied the UMLS coverage, textual variants of senses, and the ambiguity of abbreviations in MEDLINE abstracts. We restricted our study to three-letter abbreviations which were defined using parenthetical expressions. When grouping similar expansions together and representing senses using groups, we found that after ignoring senses where the total number of occurrences within the corresponding group was less than 100, 82.8% of the senses matched the UMLS, covered over 93% of occurrences that were considered, and had an average of 7.74 expansions for each sense. Abbreviations are highly ambiguous: 81.2% of the abbreviations were ambiguous, and had an average of 16.6 senses. However, after ignoring senses with occurrences of less than 5, 64.6% of the abbreviations were ambiguous, and had an average of 4.91 senses.

UR - http://www.scopus.com/inward/record.url?scp=0036357372&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0036357372&partnerID=8YFLogxK

M3 - Article

C2 - 12463867

AN - SCOPUS:0036357372

SP - 464

EP - 468

JO - Proceedings / AMIA ... Annual Symposium. AMIA Symposium

JF - Proceedings / AMIA ... Annual Symposium. AMIA Symposium

SN - 1531-605X

ER -