Generating quality word sense disambiguation test sets based on MeSH indexing.

Jung Wei Fan, Carol Friedman

Research output: Contribution to journalArticle

1 Citation (Scopus)

Abstract

Word sense disambiguation (WSD) determines the correct meaning of a word that has more than one meaning, and is a critical step in biomedical natural language processing, as interpretation of information in text can be correct only if the meanings of their component terms are correctly identified first. Quality evaluation sets are important to WSD because they can be used as representative samples for developing automatic programs and as referees for comparing different WSD programs. To help create quality test sets for WSD, we developed a MeSH-based automatic sense-tagging method that preferentially annotates terms being topical of the text. Preliminary results were promising and revealed important issues to be addressed in biomedical WSD research. We also suggest that, by cross-validating with 2 or 3 annotators, the method should be able to efficiently generate quality WSD test sets. Online supplement is available at: http://www.dbmi.columbia.edu/~juf7002/AMIA09.

Original languageEnglish (US)
Pages (from-to)183-187
Number of pages5
JournalAMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
Volume2009
StatePublished - Dec 1 2009
Externally publishedYes

Fingerprint

Natural Language Processing
Research

ASJC Scopus subject areas

  • Medicine(all)

Cite this

Generating quality word sense disambiguation test sets based on MeSH indexing. / Fan, Jung Wei; Friedman, Carol.

In: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium, Vol. 2009, 01.12.2009, p. 183-187.

Research output: Contribution to journalArticle

@article{529c3da431164d39a7794f1c665cda50,
title = "Generating quality word sense disambiguation test sets based on MeSH indexing.",
abstract = "Word sense disambiguation (WSD) determines the correct meaning of a word that has more than one meaning, and is a critical step in biomedical natural language processing, as interpretation of information in text can be correct only if the meanings of their component terms are correctly identified first. Quality evaluation sets are important to WSD because they can be used as representative samples for developing automatic programs and as referees for comparing different WSD programs. To help create quality test sets for WSD, we developed a MeSH-based automatic sense-tagging method that preferentially annotates terms being topical of the text. Preliminary results were promising and revealed important issues to be addressed in biomedical WSD research. We also suggest that, by cross-validating with 2 or 3 annotators, the method should be able to efficiently generate quality WSD test sets. Online supplement is available at: http://www.dbmi.columbia.edu/~juf7002/AMIA09.",
author = "Fan, {Jung Wei} and Carol Friedman",
year = "2009",
month = "12",
day = "1",
language = "English (US)",
volume = "2009",
pages = "183--187",
journal = "AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium",
issn = "1559-4076",
publisher = "American Medical Informatics Association",

}

TY - JOUR

T1 - Generating quality word sense disambiguation test sets based on MeSH indexing.

AU - Fan, Jung Wei

AU - Friedman, Carol

PY - 2009/12/1

Y1 - 2009/12/1

N2 - Word sense disambiguation (WSD) determines the correct meaning of a word that has more than one meaning, and is a critical step in biomedical natural language processing, as interpretation of information in text can be correct only if the meanings of their component terms are correctly identified first. Quality evaluation sets are important to WSD because they can be used as representative samples for developing automatic programs and as referees for comparing different WSD programs. To help create quality test sets for WSD, we developed a MeSH-based automatic sense-tagging method that preferentially annotates terms being topical of the text. Preliminary results were promising and revealed important issues to be addressed in biomedical WSD research. We also suggest that, by cross-validating with 2 or 3 annotators, the method should be able to efficiently generate quality WSD test sets. Online supplement is available at: http://www.dbmi.columbia.edu/~juf7002/AMIA09.

AB - Word sense disambiguation (WSD) determines the correct meaning of a word that has more than one meaning, and is a critical step in biomedical natural language processing, as interpretation of information in text can be correct only if the meanings of their component terms are correctly identified first. Quality evaluation sets are important to WSD because they can be used as representative samples for developing automatic programs and as referees for comparing different WSD programs. To help create quality test sets for WSD, we developed a MeSH-based automatic sense-tagging method that preferentially annotates terms being topical of the text. Preliminary results were promising and revealed important issues to be addressed in biomedical WSD research. We also suggest that, by cross-validating with 2 or 3 annotators, the method should be able to efficiently generate quality WSD test sets. Online supplement is available at: http://www.dbmi.columbia.edu/~juf7002/AMIA09.

UR - http://www.scopus.com/inward/record.url?scp=79953784452&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79953784452&partnerID=8YFLogxK

M3 - Article

C2 - 20351846

AN - SCOPUS:79953784452

VL - 2009

SP - 183

EP - 187

JO - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

JF - AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium

SN - 1559-4076

ER -