CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines

Ergin Soysal; Jingqi Wang; Min Jiang; Yonghui Wu; Serguei Pakhomov; Hongfang Liu; Hua Xu

doi:10.1093/jamia/ocx132

Digital Health Sciences

Research output: Contribution to journal › Article › peer-review

88 Scopus citations

Abstract

Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.

Original language	English (US)
Pages (from-to)	331-336
Number of pages	6
Journal	Journal of the American Medical Informatics Association
Volume	25
Issue number	3
DOIs	https://doi.org/10.1093/jamia/ocx132
State	Published - Mar 1 2018

Keywords

Clinical text processing
Machine learning
Natural language processing

ASJC Scopus subject areas

Health Informatics

Cite this

@article{aad9b8b7277d44e09b87851d6f3aa487,

title = "{CLAMP} - a toolkit for efficiently building customized clinical natural language processing pipelines",

abstract = "Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.",

keywords = "Clinical text processing, Machine learning, Natural language processing",

author = "Ergin Soysal and Jingqi Wang and Min Jiang and Yonghui Wu and Serguei Pakhomov and Hongfang Liu and Hua Xu",

year = "2018",

month = mar,

day = "1",

doi = "10.1093/jamia/ocx132",

language = "English (US)",

volume = "25",

pages = "331--336",

journal = "Journal of the American Medical Informatics Association",

issn = "1067-5027",

publisher = "Oxford University Press",

number = "3",

}

TY - JOUR

T1 - CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines

AU - Soysal, Ergin

AU - Wang, Jingqi

AU - Jiang, Min

AU - Wu, Yonghui

AU - Pakhomov, Serguei

AU - Liu, Hongfang

AU - Xu, Hua

PY - 2018/3/1

Y1 - 2018/3/1

N2 - Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.

KW - Clinical text processing

KW - Machine learning

KW - Natural language processing

UR - http://www.scopus.com/inward/record.url?scp=85043312385&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85043312385&partnerID=8YFLogxK

U2 - 10.1093/jamia/ocx132

DO - 10.1093/jamia/ocx132

M3 - Article

C2 - 29186491

AN - SCOPUS:85043312385

SN - 1067-5027

VL - 25

SP - 331

EP - 336

JO - Journal of the American Medical Informatics Association

JF - Journal of the American Medical Informatics Association

IS - 3

ER -
