TY - JOUR
T1 - CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines
AU - Soysal, Ergin
AU - Wang, Jingqi
AU - Jiang, Min
AU - Wu, Yonghui
AU - Pakhomov, Serguei
AU - Liu, Hongfang
AU - Xu, Hua
N1 - Funding Information: This work was supported in part by grants from the National Institute of General Medical Sciences, GM102282 and GM103859, the National Library of Medicine, LM 010681, the National Cancer Institute, CA194215, and the Cancer Prevention and Research Institute of Texas, R1307.
N1 - Publisher Copyright: © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
PY - 2018/3/1
Y1 - 2018/3/1
N2 - Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.
KW - Clinical text processing
KW - Machine learning
KW - Natural language processing
UR - http://www.scopus.com/inward/record.url?scp=85043312385&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85043312385&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocx132
DO - 10.1093/jamia/ocx132
M3 - Article
C2 - 29186491
AN - SCOPUS:85043312385
SN - 1067-5027
VL - 25
SP - 331
EP - 336
JO - Journal of the American Medical Informatics Association : JAMIA
JF - Journal of the American Medical Informatics Association : JAMIA
IS - 3
ER -