Developing customizable cancer information extraction modules for pathology reports using CLAMP

Ergin Soysal, Jeremy L. Warner, Jingqi Wang, Min Jiang, Krysten Harvey, Sandeep Kumar Jain, Xiao Dong, Hsing Yi Song, Harish Siddhanamatha, Liwei Wang, Qi Dai, Qingxia Chen, Xianglin Du, Cui Tao, Ping Yang, Joshua Charles Denny, Hongfang D Liu, Hua Xu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.

Original language: English (US)
Title of host publication: MEDINFO 2019
Subtitle of host publication: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics
Editors: Brigitte Seroussi, Lucila Ohno-Machado
Publisher: IOS Press
Pages: 1041-1045
Number of pages: 5
ISBN (Electronic): 9781643680026
DOIs: https://doi.org/10.3233/SHTI190383
State: Published - Aug 21 2019
Event: 17th World Congress on Medical and Health Informatics, MEDINFO 2019 - Lyon, France
Duration: Aug 25 2019 to Aug 30 2019

Publication series

Name: Studies in Health Technology and Informatics
Volume: 264
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Conference

Conference: 17th World Congress on Medical and Health Informatics, MEDINFO 2019
Country: France
City: Lyon
Period: 8/25/19 to 8/30/19

Keywords

  • Electronic Health Records
  • Information Storage and Retrieval
  • Natural Language Processing

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management

Cite this

Soysal, E., Warner, J. L., Wang, J., Jiang, M., Harvey, K., Jain, S. K., ... Xu, H. (2019). Developing customizable cancer information extraction modules for pathology reports using CLAMP. In B. Seroussi & L. Ohno-Machado (Eds.), MEDINFO 2019: Health and Wellbeing e-Networks for All - Proceedings of the 17th World Congress on Medical and Health Informatics (pp. 1041-1045). (Studies in Health Technology and Informatics; Vol. 264). IOS Press. https://doi.org/10.3233/SHTI190383

@inproceedings{1d982fcc96d445c68bd68a6d8d844eff,
title = "Developing customizable cancer information extraction modules for pathology reports using {CLAMP}",
abstract = "Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.",
keywords = "Electronic Health Records, Information Storage and Retrieval, Natural Language Processing",
author = "Ergin Soysal and Warner, {Jeremy L.} and Jingqi Wang and Min Jiang and Krysten Harvey and Jain, {Sandeep Kumar} and Xiao Dong and Song, {Hsing Yi} and Harish Siddhanamatha and Liwei Wang and Qi Dai and Qingxia Chen and Xianglin Du and Cui Tao and Ping Yang and Denny, {Joshua Charles} and Liu, {Hongfang D} and Hua Xu",
year = "2019",
month = "8",
day = "21",
doi = "10.3233/SHTI190383",
language = "English (US)",
series = "Studies in Health Technology and Informatics",
publisher = "IOS Press",
pages = "1041--1045",
editor = "Brigitte Seroussi and Lucila Ohno-Machado",
booktitle = "MEDINFO 2019",

}

TY - GEN

T1 - Developing customizable cancer information extraction modules for pathology reports using CLAMP

AU - Soysal, Ergin

AU - Warner, Jeremy L.

AU - Wang, Jingqi

AU - Jiang, Min

AU - Harvey, Krysten

AU - Jain, Sandeep Kumar

AU - Dong, Xiao

AU - Song, Hsing Yi

AU - Siddhanamatha, Harish

AU - Wang, Liwei

AU - Dai, Qi

AU - Chen, Qingxia

AU - Du, Xianglin

AU - Tao, Cui

AU - Yang, Ping

AU - Denny, Joshua Charles

AU - Liu, Hongfang D

AU - Xu, Hua

PY - 2019/8/21

Y1 - 2019/8/21

AB - Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.

KW - Electronic Health Records

KW - Information Storage and Retrieval

KW - Natural Language Processing

UR - http://www.scopus.com/inward/record.url?scp=85071496267&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071496267&partnerID=8YFLogxK

U2 - 10.3233/SHTI190383

DO - 10.3233/SHTI190383

M3 - Conference contribution

C2 - 31438083

AN - SCOPUS:85071496267

T3 - Studies in Health Technology and Informatics

SP - 1041

EP - 1045

BT - MEDINFO 2019

A2 - Seroussi, Brigitte

A2 - Ohno-Machado, Lucila

PB - IOS Press

ER -