Using Ontology Fingerprints to disambiguate gene name entities in the biomedical literature

Guocai Chen, Jieyi Zhao, Trevor Cohen, Cui Tao, Jingchun Sun, Hua Xu, Elmer V. Bernstam, Andrew Lawson, Jia Zeng, Amber M. Johnson, Vijaykumar Holla, Ann M. Bailey, Humberto Lara-Guerra, Beate Litzenburger, Funda Meric-Bernstam, W. Jim Zheng

Research output: Contribution to journal › Article

9 Scopus citations

Abstract

Ambiguous gene names in the biomedical literature are a barrier to accurate information extraction. To overcome this hurdle, we generated Ontology Fingerprints for selected genes that are relevant for personalized cancer therapy. These Ontology Fingerprints were used to evaluate the association between genes and biomedical literature to disambiguate gene names. We obtained 93.6% precision for the test gene set and an area under the receiver operating characteristic curve of 80.4% for gene and article association. The core algorithm was implemented using a graphics processing unit (GPU)-based MapReduce framework to handle big data and to improve performance. We conclude that Ontology Fingerprints can help disambiguate gene names mentioned in text and analyse the association between genes and articles.
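
The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of the general idea of matching an article against per-gene fingerprints. It assumes a fingerprint can be modelled as a mapping from ontology terms to weights and that association is scored by the weighted share of fingerprint terms found in the article text. The names `association_score` and `disambiguate`, the gene labels, the terms, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Optional

# Hypothetical representation: ontology term -> weight
# (a higher weight means the term is more characteristic of the gene).
Fingerprint = Dict[str, float]


def association_score(fingerprint: Fingerprint, article_text: str) -> float:
    """Weighted fraction of fingerprint terms that occur in the article text."""
    text = article_text.lower()
    total = sum(fingerprint.values())
    if total == 0:
        return 0.0
    matched = sum(w for term, w in fingerprint.items() if term.lower() in text)
    return matched / total


def disambiguate(
    candidates: Dict[str, Fingerprint],
    article_text: str,
    threshold: float = 0.0,
) -> Optional[str]:
    """Return the candidate gene whose fingerprint best matches the article.

    Returns None when no candidate scores above the threshold.
    """
    best_gene, best_score = None, threshold
    for gene, fingerprint in candidates.items():
        score = association_score(fingerprint, article_text)
        if score > best_score:
            best_gene, best_score = gene, score
    return best_gene


# Toy data (illustrative only): two candidates that share an ambiguous name.
candidates = {
    "GENE_A": {"kinase activity": 0.9, "cell migration": 0.6},
    "GENE_B": {"lipid transport": 0.8, "cholesterol homeostasis": 0.7},
}
article = "The protein shows kinase activity and promotes cell migration in tumours."
resolved = disambiguate(candidates, article)
```

In this toy setup the article mentions only terms from the first fingerprint, so the ambiguous mention resolves to `GENE_A`. Substring matching is used here purely for brevity. Because each gene and article pair is scored independently, a computation of this shape parallelises naturally, which is consistent with the MapReduce implementation the abstract mentions.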

Original language: English (US)
Article number: bav034
Journal: Database
Volume: 2015
DOI: https://doi.org/10.1093/database/bav034
State: Published - Jan 1 2015

ASJC Scopus subject areas

  • Information Systems
  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Chen, G., Zhao, J., Cohen, T., Tao, C., Sun, J., Xu, H., Bernstam, E. V., Lawson, A., Zeng, J., Johnson, A. M., Holla, V., Bailey, A. M., Lara-Guerra, H., Litzenburger, B., Meric-Bernstam, F., & Zheng, W. J. (2015). Using Ontology Fingerprints to disambiguate gene name entities in the biomedical literature. Database, 2015, [bav034]. https://doi.org/10.1093/database/bav034