Enabling Comparative Effectiveness Research in Silent Brain Infarction Through Natural Language Processing and Big Data

Project: Research project

Project Details


Neuroimaging scans obtained in the course of routine clinical care commonly reveal a prior brain infarction in patients with no history of stroke or transient ischemic attack. Indeed, epidemiologic studies indicate that silent brain infarctions (SBI) are far more common than strokes; MRI-defined SBI can be detected in ~20% of the healthy elderly. In these studies of screened patients, SBI has been shown to be associated with subtle, typically unrecognized deficits in physical and cognitive function. These imaging findings are also strong, independent risk factors for future stroke and dementia. Despite the very high prevalence of SBI in screened populations and its serious consequences, little is known about the significance or the appropriate management of SBI when it is discovered incidentally in routine care. While there is strong evidence that antiplatelet therapy and statin therapy are effective in preventing recurrent stroke in patients with prior stroke, the degree to which these results apply to patients with SBI is unclear. Because patients with SBI fall between primary and secondary stroke prevention, the approach to these patients is marked by uncertainty and practice variation, which makes SBI an ideal condition for observational comparative effectiveness research. Nonetheless, the study of SBI faces serious challenges. Because patients have no overt symptoms, recruitment into a trial can be problematic. Comparative effectiveness research on SBI that uses routinely collected data and leverages the variation in care of these patients is impeded by two facts: there are no ICD codes for SBI, and SBI is generally not recorded in structured fields of electronic health records (EHR) because it is typically considered an incidental finding.
To establish the comparative effectiveness of treatment strategies for SBI in a large, heterogeneous population, we propose to develop Natural Language Processing (NLP) algorithms to identify individuals with SBI through the automated review of neuroradiology reports. We have performed preliminary work to demonstrate that such an approach is feasible. We will then apply state-of-the-science observational comparative effectiveness methods in a massive database with linked EHR and claims data to examine the effectiveness of statins and antiplatelet agents in preventing future stroke and dementia. Thus, our aims are:
Aim 1: We will develop NLP algorithms that can accurately identify cases of SBI and white matter disease (WMD) in two different health systems.
Aim 2: We will port, refine, and validate the NLP algorithms in a large, heterogeneous database including more than 600 hospitals and 6,500 clinics, and identify a large cohort of patients with routinely discovered SBI and WMD.
Aim 3: We will characterize the cohort with respect to age-specific prevalence of SBI, management patterns, and the risk of future stroke and dementia associated with SBI and WMD.
Aim 4: We will estimate the comparative effectiveness of preventive therapies (statins and antiplatelet agents) on the risk of future stroke and dementia in patients with incidentally discovered SBI.
Status: Not started