A vocabulary development and visualization tool based on natural language processing and the mining of textual patient reports

Carol Friedman, Hongfang Liu, Lyudmila Shagina

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

Medical terminologies are critical for automated healthcare systems. Some terminologies, such as the UMLS and SNOMED, are comprehensive, whereas others specialize in limited domains (e.g., BIRADS) or are developed for specific applications. Important features of a terminology are comprehensive coverage of relevant clinical terms and ease of use by its users, which include computerized applications. We have developed a method for facilitating vocabulary development and maintenance that uses natural language processing to mine large collections of clinical reports in order to obtain information on terminology as expressed by physicians. Once the reports are processed and the terms structured and collected into an XML representational schema, it is possible to determine information about terms, such as frequency of occurrence, compositionality, relations to other terms (such as modifiers), and correspondence to a controlled vocabulary. This paper describes the method and discusses how it can be used as a tool to help vocabulary builders navigate through the terms physicians use, visualize their relations to other terms via a flexible viewer, and determine their correspondence to a controlled vocabulary.
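
The kind of term statistics mentioned in the abstract can be illustrated with a minimal sketch. It assumes a small, hypothetical XML layout (the element names `reports`, `report`, `term`, and `modifier`, the sample content, and the vocabulary codes are all invented for illustration and are not the schema or data used in the paper) and computes term frequency, modifier co-occurrence, and which terms have no entry in a toy controlled vocabulary.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical structured output; element and attribute names are
# illustrative assumptions, not the schema described in the paper.
SAMPLE = """
<reports>
  <report id="r1">
    <term name="mass">
      <modifier type="size">small</modifier>
      <modifier type="bodyloc">breast</modifier>
    </term>
    <term name="calcification"/>
  </report>
  <report id="r2">
    <term name="mass">
      <modifier type="bodyloc">breast</modifier>
    </term>
  </report>
</reports>
"""

# Toy controlled vocabulary (term -> made-up code), for illustration only.
CONTROLLED_VOCAB = {"mass": "V001"}


def summarize(xml_text, controlled_vocab):
    """Return term frequencies, per-term modifier counts, and unmapped terms."""
    root = ET.fromstring(xml_text.strip())
    freq = Counter()
    modifiers = defaultdict(Counter)
    for term in root.iter("term"):
        name = term.get("name")
        freq[name] += 1
        # Count each (modifier type, modifier value) pair seen with this term.
        for mod in term.findall("modifier"):
            value = (mod.text or "").strip()
            modifiers[name][(mod.get("type"), value)] += 1
    # Terms found in reports that the controlled vocabulary does not cover.
    unmapped = sorted(name for name in freq if name not in controlled_vocab)
    return freq, modifiers, unmapped


freq, modifiers, unmapped = summarize(SAMPLE, CONTROLLED_VOCAB)
for name, count in freq.most_common():
    print(name, count, dict(modifiers[name]))
print("not in controlled vocabulary:", unmapped)
```

Exact name matching is the simplest possible notion of vocabulary correspondence; a real tool would likely need normalization or synonym handling, which this sketch leaves out.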

Original language: English (US)
Pages (from-to): 189-201
Number of pages: 13
Journal: Journal of Biomedical Informatics
Volume: 36
Issue number: 3
DOIs
State: Published - Jun 2003

Keywords

  • Controlled vocabulary
  • Medical terminology
  • Natural language processing
  • Text mining
  • XML-based graphical user interface

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics
