Comparison of natural language processing biosurveillance methods for identifying influenza from encounter notes

Peter L. Elkin, David A. Froehling, Dietlind L. Wahner-Roedler, Steven H. Brown, Kent R. Bailey

Research output: Contribution to journal, Article, peer-review

47 Scopus citations

Abstract

Background: An effective national biosurveillance system expedites outbreak recognition and facilitates response coordination at the federal, state, and local levels. The BioSense system, used at the Centers for Disease Control and Prevention, incorporates chief complaints but not data from the whole encounter note into its surveillance algorithms.

Objective: To evaluate whether biosurveillance by using data from the whole encounter note is superior to that using data from the chief complaint field alone.

Design: 6-year retrospective case-control cohort study.

Setting: Mayo Clinic, Rochester, Minnesota.

Participants: 17 243 persons tested for influenza A or B virus between 1 January 2000 and 31 December 2006.

Measurements: The accuracy of a model based on signs and symptoms to predict influenza virus infection in patients with upper respiratory tract symptoms, and the ability of a natural language processing technique to identify definitional clinical features from free-text encounter notes.

Results: Surveillance based on the whole encounter note was superior to the chief complaint field alone. For the case definition used by surveillance of the whole encounter note, the normalized partial area under the receiver-operating characteristic curve (specificity, 0.1 to 0.4) for surveillance using the whole encounter note was 92.9% versus 70.3% for surveillance with the chief complaint field (difference, 22.6%; P < 0.001). Comparison of the 2 models at the fixed specificity of 0.4 resulted in sensitivities of 89.0% and 74.4%, respectively (P < 0.001). The relative risk for missing a true case of influenza was 2.3 by using the chief complaint field model.

Limitations: Participants were seen at 1 tertiary referral center. The cost of comprehensive biosurveillance monitoring was not studied.

Conclusion: A biosurveillance model for influenza using the whole encounter note is more accurate than a model that uses only the chief complaint field. Because case-defining signs and symptoms of influenza are commonly available in health records, the investigators believe that the national strategy for biosurveillance should be changed to incorporate data from the whole health record.

Primary Funding Source: Centers for Disease Control and Prevention.
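The reported relative risk of 2.3 appears consistent with the ratio of the two models' miss rates (1 minus sensitivity) at the fixed specificity of 0.4. The short Python sketch below reproduces that arithmetic from the figures in the abstract. The function and variable names are illustrative, and the assumption that the relative risk was derived this way is an inference from the published numbers, not a statement of the authors' method.

```python
# Sketch: check the abstract's summary figures against each other.
# Assumption (not confirmed by the abstract): the relative risk of missing
# a true case is the ratio of the two models' false-negative rates.

def miss_rate(sensitivity: float) -> float:
    """Fraction of true influenza cases a model fails to flag."""
    return 1.0 - sensitivity


def relative_risk_of_missing(sens_reference: float, sens_comparison: float) -> float:
    """Miss rate of the comparison model divided by that of the reference model."""
    return miss_rate(sens_comparison) / miss_rate(sens_reference)


# Sensitivities reported at the fixed specificity of 0.4.
SENS_WHOLE_NOTE = 0.890
SENS_CHIEF_COMPLAINT = 0.744

rr = relative_risk_of_missing(SENS_WHOLE_NOTE, SENS_CHIEF_COMPLAINT)

# Normalized partial AUC values (percent) reported for specificity 0.1 to 0.4.
pauc_difference = 92.9 - 70.3

print(f"Relative risk of missing a case: {rr:.1f}")
print(f"Partial AUC difference: {pauc_difference:.1f} percentage points")
```

Rounded to one decimal place, both values agree with the abstract: a relative risk of 2.3 and a partial AUC difference of 22.6 percentage points.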

Original language: English (US)
Pages (from-to): 11-18
Number of pages: 8
Journal: Annals of Internal Medicine
Volume: 156
Issue number: 1
State: Published - 2012

ASJC Scopus subject areas

  • Internal Medicine
