Fusion of knowledge-intensive and statistical approaches for retrieving and annotating textual genomics documents

Alan R. Aronson, Dina Demner-Fushman, Susanne M. Humphrey, Jimmy Lin, Hongfang Liu, Patrick Ruch, Miguel E. Ruiz, Lawrence H. Smith, Lorraine K. Tanabe, W. John Wilbur

Research output: Contribution to journal, Conference article, peer-review

2 Scopus citations

Abstract

This paper represents a continuation of research into the retrieval and annotation of textual genomics documents (both MEDLINE® citations and full text articles) for the purpose of satisfying biologists' real information needs. The overall approach taken here for both the ad hoc retrieval and categorization tasks within the TREC genomics track in 2005 was one combining the results of several NLP, statistical and ML methods, using a fusion method for ad hoc retrieval and ensemble methods for categorization. The results show that fusion approaches can improve the final outcome for the ad hoc and the categorization tasks, but that care must be taken in order to take advantage of the strengths of the constituent methods.

Original language: English (US)
Journal: NIST Special Publication
State: Published - 2005
Event: 14th Text REtrieval Conference, TREC 2005 - Gaithersburg, MD, United States
Duration: Nov 15 2005 to Nov 18 2005

Keywords

  • Genomics
  • Information retrieval
  • MEDLINE/PubMed
  • Machine learning
  • MeSH
  • Statistical natural language processing
  • Thematic analysis
  • Vector space models

ASJC Scopus subject areas

  • General Engineering
