Short forms of medical concepts or expressions (i.e., acronyms and abbreviations) are prevalent in clinical documentation. Because the number of possible short forms is limited, they are also highly ambiguous. Resolving this ambiguity is essential in clinical natural language processing (NLP), but doing so first requires a sense inventory. This paper outlines our process of identifying 141 potential short forms using randomly sampled phrases from a large clinical corpus. We assessed various features for their ability to disambiguate medical and non-medical usages. We identified 68% of our short forms as primarily serving medical usages, whereas 12% had non-medical usages. The remaining 19% showed usage that alternated with case form. Our short forms had an average of 3.58 senses. Usages could be distinguished using basic trigram/bigram/line information. Our initial findings will be applicable to automatic usage/sense resolution.
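As an illustration of the kind of trigram/bigram/line information mentioned above, the following is a minimal sketch, not the authors' implementation. The function name `context_features`, the feature names, the tokenization, and the sample sentence are all assumptions made for this example; the paper's actual feature set is not specified here.

```python
import re


def context_features(line, short_form):
    """Collect simple context features for each occurrence of a short form.

    Hypothetical helper: tokenization and feature names are illustrative only.
    """
    tokens = re.findall(r"\w+", line)
    features = []
    for i, tok in enumerate(tokens):
        # Match case-insensitively so that case form can itself be a feature.
        if tok.lower() != short_form.lower():
            continue
        left = tokens[max(0, i - 1):i]   # zero or one token to the left
        right = tokens[i + 1:i + 2]      # zero or one token to the right
        if tok.isupper():
            case = "upper"
        elif tok.islower():
            case = "lower"
        else:
            case = "mixed"
        features.append({
            "case": case,
            "bigram_left": " ".join(left + [tok]).lower(),
            "bigram_right": " ".join([tok] + right).lower(),
            "trigram": " ".join(left + [tok] + right).lower(),
            "line_length": len(tokens),  # a crude line-level feature
        })
    return features


if __name__ == "__main__":
    # Invented sample sentence, not drawn from the clinical corpus.
    for feats in context_features("Pt seen in ED for chest pain", "ed"):
        print(feats)
```

Features of this sort could feed any standard classifier that separates medical from non-medical usages; which classifier and which features work best is a question for the paper itself.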
|Original language||English (US)|
|Number of pages||10|
|Journal||AMIA ... Annual Symposium proceedings. AMIA Symposium|
|State||Published - 2017|