TY - JOUR
T1 - A Computational Method for Learning Disease Trajectories from Partially Observable EHR Data
AU - Oh, Wonsuk
AU - Steinbach, Michael S.
AU - Castro, M. Regina
AU - Peterson, Kevin A.
AU - Kumar, Vipin
AU - Caraballo, Pedro J.
AU - Simon, Gyorgy J.
N1 - Funding Information:
Manuscript received August 24, 2020; revised November 16, 2020, December 28, 2020, April 19, 2021, and June 4, 2021; accepted June 6, 2021. Date of publication June 15, 2021; date of current version July 20, 2021. This work was supported by NIH Award LM011972 and NSF Awards IIS 1602394 and IIS 1602198. The views expressed in this paper are those of the authors and do not necessarily reflect the views of the funding agencies. (Corresponding author: Gyorgy Simon.) Wonsuk Oh is with the Institute for Health Informatics, University of Minnesota, Minneapolis, MN 55455 USA, and also with the Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029 USA (e-mail: ohxxx215@umn.edu).
Publisher Copyright:
© 2013 IEEE.
PY - 2021/7
Y1 - 2021/7
N2 - Diseases can show different courses of progression even when patients share the same risk factors. Recent studies have revealed that trajectories, the order in which diseases manifest throughout life, can be predictive of the course of progression. In this study, we propose a novel computational method for learning disease trajectories from EHR data. The proposed method consists of three parts: first, an algorithm for extracting trajectories from EHR data; second, three criteria for filtering trajectories; and third, a likelihood function for assessing the risk of developing a set of outcomes given a trajectory set. We applied our methods to extract a set of disease trajectories from Mayo Clinic EHR data and evaluated the set internally based on log-likelihood, which can be interpreted as the trajectories' ability to explain the observed (partial) disease progressions. We then externally evaluated the trajectories on EHR data from an independent health system, M Health Fairview. The proposed algorithm extracted a comprehensive set of disease trajectories that can explain the observed outcomes substantially better than competing methods, and the proposed filtering criteria selected a small, highly interpretable subset of disease trajectories that suffered only a minimal (relative 5%) loss in the ability to explain disease progression in both the internal and external validation.
AB - Diseases can show different courses of progression even when patients share the same risk factors. Recent studies have revealed that trajectories, the order in which diseases manifest throughout life, can be predictive of the course of progression. In this study, we propose a novel computational method for learning disease trajectories from EHR data. The proposed method consists of three parts: first, an algorithm for extracting trajectories from EHR data; second, three criteria for filtering trajectories; and third, a likelihood function for assessing the risk of developing a set of outcomes given a trajectory set. We applied our methods to extract a set of disease trajectories from Mayo Clinic EHR data and evaluated the set internally based on log-likelihood, which can be interpreted as the trajectories' ability to explain the observed (partial) disease progressions. We then externally evaluated the trajectories on EHR data from an independent health system, M Health Fairview. The proposed algorithm extracted a comprehensive set of disease trajectories that can explain the observed outcomes substantially better than competing methods, and the proposed filtering criteria selected a small, highly interpretable subset of disease trajectories that suffered only a minimal (relative 5%) loss in the ability to explain disease progression in both the internal and external validation.
KW - Disease trajectories
KW - electronic health records
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85111694327&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85111694327&partnerID=8YFLogxK
U2 - 10.1109/JBHI.2021.3089441
DO - 10.1109/JBHI.2021.3089441
M3 - Article
C2 - 34129510
AN - SCOPUS:85111694327
SN - 2168-2194
VL - 25
SP - 2476
EP - 2486
JO - IEEE Journal of Biomedical and Health Informatics
JF - IEEE Journal of Biomedical and Health Informatics
IS - 7
M1 - 9456038
ER -