Developing customizable cancer information extraction modules for pathology reports using CLAMP

Ergin Soysal; Jeremy L. Warner; Jingqi Wang; Min Jiang; Krysten Harvey; Sandeep Kumar Jain; Xiao Dong; Hsing Yi Song; Harish Siddhanamatha; Liwei Wang; Qi Dai; Qingxia Chen; Xianglin Du; Cui Tao; Ping Yang; Joshua Charles Denny; Hongfang Liu; Hua Xu

doi:10.3233/SHTI190383

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.

Original language	English (US)
Title of host publication	MEDINFO 2019
Subtitle of host publication	Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics
Editors	Brigitte Seroussi, Lucila Ohno-Machado
Publisher	IOS Press
Pages	1041-1045
Number of pages	5
ISBN (Electronic)	9781643680026
DOIs	https://doi.org/10.3233/SHTI190383
State	Published - Aug 21 2019
Event	17th World Congress on Medical and Health Informatics, MEDINFO 2019 - Lyon, France
Duration	Aug 25 2019 → Aug 30 2019

Publication series

Name	Studies in Health Technology and Informatics
Volume	264
ISSN (Print)	0926-9630
ISSN (Electronic)	1879-8365

Conference

Conference	17th World Congress on Medical and Health Informatics, MEDINFO 2019
Country/Territory	France
City	Lyon
Period	8/25/19 → 8/30/19

Keywords

Electronic Health Records
Information Storage and Retrieval
Natural Language Processing

ASJC Scopus subject areas

Biomedical Engineering
Health Informatics
Health Information Management

Access to Document

10.3233/SHTI190383

Cite this

Soysal, E., Warner, J. L., Wang, J., Jiang, M., Harvey, K., Jain, S. K., Dong, X., Song, H. Y., Siddhanamatha, H., Wang, L., Dai, Q., Chen, Q., Du, X., Tao, C., Yang, P., Denny, J. C., Liu, H., & Xu, H. (2019). Developing customizable cancer information extraction modules for pathology reports using CLAMP. In B. Seroussi & L. Ohno-Machado (Eds.), MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics (pp. 1041-1045). (Studies in Health Technology and Informatics; Vol. 264). IOS Press. https://doi.org/10.3233/SHTI190383

Developing customizable cancer information extraction modules for pathology reports using CLAMP. / Soysal, Ergin; Warner, Jeremy L.; Wang, Jingqi et al.
MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics. ed. / Brigitte Seroussi; Lucila Ohno-Machado. IOS Press, 2019. p. 1041-1045 (Studies in Health Technology and Informatics; Vol. 264).

Soysal, E, Warner, JL, Wang, J, Jiang, M, Harvey, K, Jain, SK, Dong, X, Song, HY, Siddhanamatha, H, Wang, L, Dai, Q, Chen, Q, Du, X, Tao, C, Yang, P, Denny, JC, Liu, H & Xu, H 2019, Developing customizable cancer information extraction modules for pathology reports using CLAMP. in B Seroussi & L Ohno-Machado (eds), MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics. Studies in Health Technology and Informatics, vol. 264, IOS Press, pp. 1041-1045, 17th World Congress on Medical and Health Informatics, MEDINFO 2019, Lyon, France, 8/25/19. https://doi.org/10.3233/SHTI190383

Soysal E, Warner JL, Wang J, Jiang M, Harvey K, Jain SK et al. Developing customizable cancer information extraction modules for pathology reports using CLAMP. In Seroussi B, Ohno-Machado L, editors, MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics. IOS Press. 2019. p. 1041-1045. (Studies in Health Technology and Informatics). doi: 10.3233/SHTI190383

Soysal, Ergin ; Warner, Jeremy L. ; Wang, Jingqi et al. / Developing customizable cancer information extraction modules for pathology reports using CLAMP. MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics. editor / Brigitte Seroussi ; Lucila Ohno-Machado. IOS Press, 2019. pp. 1041-1045 (Studies in Health Technology and Informatics).

@inproceedings{1d982fcc96d445c68bd68a6d8d844eff,

title = "Developing customizable cancer information extraction modules for pathology reports using {CLAMP}",

abstract = "Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.",

keywords = "Electronic Health Records, Information Storage and Retrieval, Natural Language Processing",

author = "Ergin Soysal and Warner, {Jeremy L.} and Jingqi Wang and Min Jiang and Krysten Harvey and Jain, {Sandeep Kumar} and Xiao Dong and Song, {Hsing Yi} and Harish Siddhanamatha and Liwei Wang and Qi Dai and Qingxia Chen and Xianglin Du and Cui Tao and Ping Yang and Denny, {Joshua Charles} and Hongfang Liu and Hua Xu",

note = "Publisher Copyright: {\textcopyright} 2019 International Medical Informatics Association (IMIA) and IOS Press.; 17th World Congress on Medical and Health Informatics, MEDINFO 2019 ; Conference date: 25-08-2019 Through 30-08-2019",

year = "2019",

month = aug,

day = "21",

doi = "10.3233/SHTI190383",

language = "English (US)",

series = "Studies in Health Technology and Informatics",

publisher = "IOS Press",

pages = "1041--1045",

editor = "Brigitte Seroussi and Lucila Ohno-Machado",

booktitle = "MEDINFO 2019",

}

TY - GEN

T1 - Developing customizable cancer information extraction modules for pathology reports using CLAMP

AU - Soysal, Ergin

AU - Warner, Jeremy L.

AU - Wang, Jingqi

AU - Jiang, Min

AU - Harvey, Krysten

AU - Jain, Sandeep Kumar

AU - Dong, Xiao

AU - Song, Hsing Yi

AU - Siddhanamatha, Harish

AU - Wang, Liwei

AU - Dai, Qi

AU - Chen, Qingxia

AU - Du, Xianglin

AU - Tao, Cui

AU - Yang, Ping

AU - Denny, Joshua Charles

AU - Liu, Hongfang

AU - Xu, Hua

PY - 2019/8/21

Y1 - 2019/8/21

N2 - Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.

KW - Electronic Health Records

KW - Information Storage and Retrieval

KW - Natural Language Processing

UR - http://www.scopus.com/inward/record.url?scp=85071496267&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071496267&partnerID=8YFLogxK

U2 - 10.3233/SHTI190383

DO - 10.3233/SHTI190383

M3 - Conference contribution

C2 - 31438083

AN - SCOPUS:85071496267

T3 - Studies in Health Technology and Informatics

SP - 1041

EP - 1045

BT - MEDINFO 2019

A2 - Seroussi, Brigitte

A2 - Ohno-Machado, Lucila

PB - IOS Press

T2 - 17th World Congress on Medical and Health Informatics, MEDINFO 2019

Y2 - 25 August 2019 through 30 August 2019

ER -
