Frequent Causal Pattern Mining: A Computationally Efficient Framework for Estimating Bias-Corrected Effects

Pranjul Yadav, Michael Steinbach, M. Regina Castro, Pedro J. Caraballo, Vipin Kumar, Gyorgy Simon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Our aging population increasingly suffers from multiple chronic diseases simultaneously, necessitating the comprehensive treatment of these conditions. Finding the optimal set of drugs for a combinatorial set of diseases is a combinatorial pattern exploration problem. Association rule mining is a popular tool for such problems, but health care requires causal, rather than merely associative, patterns, which renders association rule mining unsuitable. To address this issue, we propose a novel framework based on the Rubin-Neyman causal model for extracting causal rules from observational data, correcting for a number of common biases. Specifically, given a set of interventions and a set of items that define subpopulations (e.g., diseases), we wish to find all subpopulations in which effective intervention combinations exist and, in each such subpopulation, all intervention combinations such that dropping any intervention from the combination reduces the efficacy of the treatment. A key aspect of our framework is the concept of closed intervention sets, which extends the quantification of the effect of a single intervention to a set of concurrent interventions. Closed intervention sets also allow for a pruning strategy that is strictly more efficient than the traditional pruning strategy used by the Apriori algorithm. To implement our ideas, we introduce and compare five methods of estimating causal effect from observational data, rigorously evaluate them on synthetic data, and, where possible, prove mathematically why they work. We also evaluated our causal rule mining framework on the Electronic Health Records (EHR) data of a large cohort of 152,000 patients from Mayo Clinic and showed that the extracted patterns are sufficiently rich to explain the controversial findings in the medical literature regarding the effect of a class of cholesterol drugs on Type-II Diabetes Mellitus (T2DM).

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1981-1990
Number of pages: 10
ISBN (Electronic): 9781728108582
DOIs: https://doi.org/10.1109/BigData47090.2019.9005977
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: Dec 9 2019 to Dec 12 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country: United States
City: Los Angeles
Period: 12/9/19 to 12/12/19

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management

Cite this

Yadav, P., Steinbach, M., Castro, M. R., Caraballo, P. J., Kumar, V., & Simon, G. (2019). Frequent Causal Pattern Mining: A Computationally Efficient Framework for Estimating Bias-Corrected Effects. In C. Baru, J. Huan, L. Khan, X. T. Hu, R. Ak, Y. Tian, R. Barga, C. Zaniolo, K. Lee, & Y. F. Ye (Eds.), Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019 (pp. 1981-1990). [9005977] (Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData47090.2019.9005977