Dynamically generating a protein entity dictionary using online resources

Hongfang Liu; Zhangzhi Hu; Cathy Wu

doi:10.3115/1225753.1225758

Digital Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With the overwhelming amount of biological knowledge stored in free text, natural language processing (NLP) has received much attention recently to make the task of managing information recorded in free text more feasible. One requirement for most NLP systems is the ability to accurately recognize biological entity terms in free text and the ability to map these terms to corresponding records in databases. Such a task is called biological named entity tagging. In this paper, we present a system that automatically constructs a protein entity dictionary, which contains gene or protein names associated with UniProt identifiers using online resources. The system can run periodically to always keep up-to-date with these online resources. Using online resources that were available on Dec. 25, 2004, we obtained 4,046,733 terms for 1,640,082 entities. The dictionary can be accessed from the following website: http://biocreative.ifsm.umbc.edu/biothesaurus/.

Original language	English (US)
Title of host publication	ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher	Association for Computational Linguistics (ACL)
Pages	17-20
Number of pages	4
ISBN (Print)	1932432515, 9781932432510
DOIs	https://doi.org/10.3115/1225753.1225758
State	Published - 2005
Event	43rd Annual Meeting of the Association for Computational Linguistics, ACL-05 - Ann Arbor, MI, United States Duration: Jun 25 2005 → Jun 30 2005

Publication series

Name	ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Other

Other	43rd Annual Meeting of the Association for Computational Linguistics, ACL-05
Country/Territory	United States
City	Ann Arbor, MI
Period	6/25/05 → 6/30/05

ASJC Scopus subject areas

Language and Linguistics
Linguistics and Language

Access to Document

10.3115/1225753.1225758

Cite this

Liu, H., Hu, Z., & Wu, C. (2005). Dynamically generating a protein entity dictionary using online resources. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 17-20). (ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1225753.1225758

Dynamically generating a protein entity dictionary using online resources. / Liu, Hongfang; Hu, Zhangzhi; Wu, Cathy.
ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. Association for Computational Linguistics (ACL), 2005. p. 17-20 (ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference).

Liu, H, Hu, Z & Wu, C 2005, Dynamically generating a protein entity dictionary using online resources. in ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, Association for Computational Linguistics (ACL), pp. 17-20, 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05, Ann Arbor, MI, United States, 6/25/05. https://doi.org/10.3115/1225753.1225758

Liu H, Hu Z, Wu C. Dynamically generating a protein entity dictionary using online resources. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. Association for Computational Linguistics (ACL). 2005. p. 17-20. (ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference). doi: 10.3115/1225753.1225758

Liu, Hongfang ; Hu, Zhangzhi ; Wu, Cathy. / Dynamically generating a protein entity dictionary using online resources. ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. Association for Computational Linguistics (ACL), 2005. pp. 17-20 (ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference).

@inproceedings{bff847158d7c46f1b926f1bec8e55a62,

title = "Dynamically generating a protein entity dictionary using online resources",

abstract = "With the overwhelming amount of biological knowledge stored in free text, natural language processing (NLP) has received much attention recently to make the task of managing information recorded in free text more feasible. One requirement for most NLP systems is the ability to accurately recognize biological entity terms in free text and the ability to map these terms to corresponding records in databases. Such a task is called biological named entity tagging. In this paper, we present a system that automatically constructs a protein entity dictionary, which contains gene or protein names associated with UniProt identifiers using online resources. The system can run periodically to always keep up-to-date with these online resources. Using online resources that were available on Dec. 25, 2004, we obtained 4,046,733 terms for 1,640,082 entities. The dictionary can be accessed from the following website: http://biocreative.ifsm.umbc.edu/biothesaurus/.",

author = "Hongfang Liu and Zhangzhi Hu and Cathy Wu",

year = "2005",

doi = "10.3115/1225753.1225758",

language = "English (US)",

isbn = "1932432515",

series = "ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference",

publisher = "Association for Computational Linguistics (ACL)",

pages = "17--20",

booktitle = "ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference",

note = "43rd Annual Meeting of the Association for Computational Linguistics, ACL-05 ; Conference date: 25-06-2005 Through 30-06-2005",

}

TY - GEN

T1 - Dynamically generating a protein entity dictionary using online resources

AU - Liu, Hongfang

AU - Hu, Zhangzhi

AU - Wu, Cathy

PY - 2005

Y1 - 2005

N2 - With the overwhelming amount of biological knowledge stored in free text, natural language processing (NLP) has received much attention recently to make the task of managing information recorded in free text more feasible. One requirement for most NLP systems is the ability to accurately recognize biological entity terms in free text and the ability to map these terms to corresponding records in databases. Such a task is called biological named entity tagging. In this paper, we present a system that automatically constructs a protein entity dictionary, which contains gene or protein names associated with UniProt identifiers using online resources. The system can run periodically to always keep up-to-date with these online resources. Using online resources that were available on Dec. 25, 2004, we obtained 4,046,733 terms for 1,640,082 entities. The dictionary can be accessed from the following website: http://biocreative.ifsm.umbc.edu/biothesaurus/.

AB - With the overwhelming amount of biological knowledge stored in free text, natural language processing (NLP) has received much attention recently to make the task of managing information recorded in free text more feasible. One requirement for most NLP systems is the ability to accurately recognize biological entity terms in free text and the ability to map these terms to corresponding records in databases. Such a task is called biological named entity tagging. In this paper, we present a system that automatically constructs a protein entity dictionary, which contains gene or protein names associated with UniProt identifiers using online resources. The system can run periodically to always keep up-to-date with these online resources. Using online resources that were available on Dec. 25, 2004, we obtained 4,046,733 terms for 1,640,082 entities. The dictionary can be accessed from the following website: http://biocreative.ifsm.umbc.edu/biothesaurus/.

UR - http://www.scopus.com/inward/record.url?scp=84859895200&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859895200&partnerID=8YFLogxK

U2 - 10.3115/1225753.1225758

DO - 10.3115/1225753.1225758

M3 - Conference contribution

AN - SCOPUS:84859895200

SN - 1932432515

SN - 9781932432510

T3 - ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

SP - 17

EP - 20

BT - ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

PB - Association for Computational Linguistics (ACL)

T2 - 43rd Annual Meeting of the Association for Computational Linguistics, ACL-05

Y2 - 25 June 2005 through 30 June 2005

ER -
