Benchmarking ontologies: Bigger or better?

Lixia Yao, Anna Divoli, Ilya Mayzus, James A. Evans, Andrey Rzhetsky

Research output: Contribution to journal (Article)

19 Scopus citations

Abstract

A scientific ontology is a formal representation of knowledge within a domain, typically including central concepts, their properties, and relations. With the rise of computers and high-throughput data collection, ontologies have become essential to data mining and sharing across communities in the biomedical sciences. Powerful approaches exist for testing the internal consistency of an ontology, but not for assessing the fidelity of its domain representation. We introduce a family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain. We then test these metrics using (1) four of the most common medical ontologies with respect to a corpus of medical documents and (2) seven of the most popular English thesauri with respect to three corpora that sample language from medicine, news, and novels. Here we show that our approach captures the quality of ontological representation and guides efforts to narrow the breach between ontology and collective discourse within a domain. Our results also demonstrate key features of medical ontologies, English thesauri, and discourse from different domains. Medical ontologies have a small intersection, as do English thesauri. Moreover, dialects characteristic of distinct domains vary strikingly as many of the same words are used quite differently in medicine, news, and novels. As ontologies are intended to mirror the state of knowledge, our methods to tighten the fit between ontology and domain will increase their relevance for new areas of biomedical science and improve the accuracy and power of inferences computed across them.

Original language: English (US)
Article number: e1001055
Journal: PLoS Computational Biology
Volume: 7
Issue number: 1
DOI: https://doi.org/10.1371/journal.pcbi.1001055
State: Published - Jan 2011

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Ecology
  • Molecular Biology
  • Genetics
  • Cellular and Molecular Neuroscience
  • Computational Theory and Mathematics

Cite this

Yao, L., Divoli, A., Mayzus, I., Evans, J. A., & Rzhetsky, A. (2011). Benchmarking ontologies: Bigger or better? PLoS Computational Biology, 7(1), Article e1001055. https://doi.org/10.1371/journal.pcbi.1001055