Disorder concept identification from clinical notes: an experience with the ShARe/CLEF 2013 challenge

Jung Wei Fan, Navdeep Sood, Yang Huang

Research output: Contribution to journal › Conference article › peer-review



We participated in both tasks 1a and 1b of the ShARe/CLEF 2013 NLP Challenge, where task 1a was on detecting disorder concept boundaries and task 1b was on assigning concept IDs to the entities from 1a. An existing NLP system developed at Kaiser Permanente was modified to output concepts that were close to the disorder definition of the Challenge. The core pipeline involved deterministic section detection, tokenization, sentence chunking, probabilistic POS tagging, rule-based phrase chunking, terminology look-up (using UMLS 2012AB), rule-based concept disambiguation, and post-coordination. The system originally identified findings (both normal and abnormal), procedures, anatomies, etc., so a post-filter was created to subset the concepts to the source vocabulary (SNOMED) and semantic types expected by the Challenge. A list of frequency-ranked CUIs was extracted from the training corpus to help break ties when multiple concepts were proposed for a single span. However, no retraining/customization was made to meet the boundary annotation preference specified in the challenge guidelines. Our best settings achieved an F-score of 0.503 (0.684 with relaxed boundary penalty) in task 1a, and a best accuracy of 0.443 (0.865 with relaxed boundaries) in task 1b.
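The post-filtering and tie-breaking steps described in the abstract could be sketched roughly as follows. This is a minimal illustration only: the function names, the accepted semantic-type set, and the frequency counts are assumptions for the example, not details from the paper.

```python
# Hypothetical sketch of two post-processing steps: (1) keep only
# candidate concepts whose source vocabulary is SNOMED CT and whose
# semantic type fits a disorder definition, and (2) break ties among
# multiple candidate CUIs for one span via training-corpus frequency.

# An assumed (incomplete) set of disorder-like semantic types.
DISORDER_TYPES = {"Disease or Syndrome", "Sign or Symptom",
                  "Injury or Poisoning"}

def post_filter(candidates):
    """Subset candidate concepts to SNOMED-sourced disorder types."""
    return [c for c in candidates
            if c["source"] == "SNOMEDCT"
            and c["semtype"] in DISORDER_TYPES]

def break_tie(cuis, freq):
    """Pick the CUI most frequent in the training corpus (0 if unseen).

    A lexicographic fallback keeps the choice deterministic when
    counts are equal.
    """
    return min(cuis, key=lambda c: (-freq.get(c, 0), c))

# Illustrative candidates proposed for one text span.
candidates = [
    {"cui": "C0018681", "source": "SNOMEDCT", "semtype": "Sign or Symptom"},
    {"cui": "C0239110", "source": "SNOMEDCT", "semtype": "Sign or Symptom"},
    {"cui": "C1281594", "source": "MSH", "semtype": "Finding"},
]
train_freq = {"C0018681": 150, "C0239110": 3}  # made-up counts

kept = post_filter(candidates)            # drops the non-SNOMED concept
best = break_tie([c["cui"] for c in kept], train_freq)
print(best)  # → C0018681
```

The frequency ranking only resolves ambiguity left after the semantic-type filter; it does not override the filter itself.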

Original language: English (US)
Journal: CEUR Workshop Proceedings
State: Published - 2013
Event: 2013 Cross Language Evaluation Forum Conference, CLEF 2013 - Valencia, Spain
Duration: Sep 23, 2013 - Sep 26, 2013


Keywords
  • Concept boundary detection
  • Concept normalization
  • Medical language processing

ASJC Scopus subject areas

  • Computer Science (all)


