TY - GEN
T1 - Developing customizable cancer information extraction modules for pathology reports using CLAMP
AU - Soysal, Ergin
AU - Warner, Jeremy L.
AU - Wang, Jingqi
AU - Jiang, Min
AU - Harvey, Krysten
AU - Jain, Sandeep Kumar
AU - Dong, Xiao
AU - Song, Hsing Yi
AU - Siddhanamatha, Harish
AU - Wang, Liwei
AU - Dai, Qi
AU - Chen, Qingxia
AU - Du, Xianglin
AU - Tao, Cui
AU - Yang, Ping
AU - Denny, Joshua Charles
AU - Liu, Hongfang
AU - Xu, Hua
N1 - Publisher Copyright: © 2019 International Medical Informatics Association (IMIA) and IOS Press.
PY - 2019/8/21
Y1 - 2019/8/21
N2 - Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.
AB - Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.
KW - Electronic Health Records
KW - Information Storage and Retrieval
KW - Natural Language Processing
UR - http://www.scopus.com/inward/record.url?scp=85071496267&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85071496267&partnerID=8YFLogxK
U2 - 10.3233/SHTI190383
DO - 10.3233/SHTI190383
M3 - Conference contribution
C2 - 31438083
AN - SCOPUS:85071496267
T3 - Studies in Health Technology and Informatics
SP - 1041
EP - 1045
BT - MEDINFO 2019
A2 - Seroussi, Brigitte
A2 - Ohno-Machado, Lucila
PB - IOS Press
T2 - 17th World Congress on Medical and Health Informatics, MEDINFO 2019
Y2 - 25 August 2019 through 30 August 2019
ER -