Identification of Patients with Family History of Pancreatic Cancer-Investigation of an NLP System Portability

Saeed Mehrabi, Anand Krishnan, Alexandra M. Roch, Heidi Schmidt, Dingcheng Li, Joe Kesterson, Chris Beesley, Paul Dexter, Max Schmidt, Mathew Palakal, Hongfang Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

In this study, we developed a rule-based natural language processing (NLP) system to identify patients with a family history of pancreatic cancer. The algorithm was developed in an Unstructured Information Management Architecture (UIMA) framework and consisted of section segmentation, relation discovery, and negation detection. The system was evaluated on data from two institutions. Family history identification precision was consistent across the institutions, shifting from 88.9% on the Indiana University (IU) dataset to 87.8% on the Mayo Clinic dataset. Customizing the algorithm on the Mayo Clinic data increased its precision to 88.1%. Family member relation discovery achieved precision, recall, and F-measure of 75.3%, 91.6%, and 82.6%, respectively. Negation detection resulted in a precision of 99.1%. The results show that rule-based NLP approaches for specific information extraction tasks are portable across institutions; however, customization of the algorithm on the new dataset improves its performance.
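
As a quick check on the relation discovery figures, the reported F-measure can be recomputed from the reported precision and recall. The sketch below assumes the abstract uses the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract reports F1; other weightings are possible.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Relation discovery figures from the abstract, expressed as fractions.
precision = 0.753
recall = 0.916

f1 = f_measure(precision, recall)
print(f"Recomputed F-measure: {f1:.4f}")
```

The recomputed value is roughly 0.8265, which agrees with the reported 82.6% up to rounding of the published precision and recall.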

Original language: English (US)
Title of host publication: MEDINFO 2015
Subtitle of host publication: eHealth-Enabled Health - Proceedings of the 15th World Congress on Health and Biomedical Informatics
Editors: Andrew Georgiou, Indra Neil Sarkar, Paulo Mazzoncini de Azevedo Marques
Publisher: IOS Press
Pages: 604-608
Number of pages: 5
ISBN (Electronic): 9781614995630
State: Published - 2015
Event: 15th World Congress on Health and Biomedical Informatics, MEDINFO 2015 - Sao Paulo, Brazil
Duration: Aug 19 2015 to Aug 23 2015

Publication series

Name: Studies in Health Technology and Informatics
Volume: 216
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Other

Other: 15th World Congress on Health and Biomedical Informatics, MEDINFO 2015
Country/Territory: Brazil
City: Sao Paulo
Period: 8/19/15 to 8/23/15

Keywords

  • Family History
  • Natural language processing
  • Pancreatic cancer
  • Unstructured Information Management Architecture

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management
