Estimating Disease Onset Time by Modeling Lab Result Trajectories via Bayes Networks

Wonsuk Oh, Pranjul Yadav, Vipin Kumar, Pedro Caraballo, M. Regina Castro, Michael S. Steinbach, Gyorgy J. Simon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The true onset time of a disease, particularly of slow-onset diseases like Type 2 diabetes mellitus (T2DM), is rarely observable in electronic health records (EHRs). However, it is critical for time-to-event analysis and for studying sequences of diseases. The aim of this study is to demonstrate a method for estimating the onset time of such diseases from intermittently observable laboratory results in the specific context of T2DM. A retrospective observational study design is used. A cohort of 5,874 non-diabetic patients from a large healthcare system in the Upper Midwest United States was constructed with a three-year follow-up period. The HbA1c level of each patient was collected from the earliest and the latest follow-up. We modeled the patients' HbA1c trajectories through Bayesian networks to estimate the onset time of diabetes. Due to non-random censoring and interventions unobservable from EHR data (such as lifestyle changes), naïve modeling of HbA1c through linear regression, or modeling time-to-event through a proportional hazards model, leads to a clinically infeasible model with no or limited ability to predict the onset time of diabetes. Our model is consistent with clinical knowledge and estimated the onset of diabetes with less than a six-month error for almost half the patients for whom the onset time could be clinically ascertained. To our knowledge, this is the first study to model long-term HbA1c progression in non-diabetic patients and to estimate the onset time of diabetes.
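
For orientation only, the kind of naïve linear baseline that the abstract describes as inadequate can be sketched as follows. This is not the authors' Bayesian network model. The function name, argument names, and the 6.5% HbA1c cutoff (a commonly used diagnostic threshold, not stated in the abstract) are assumptions made for illustration.

```python
from typing import Optional

# Assumed cutoff in percent; a commonly used diagnostic threshold,
# not a value taken from the abstract.
DIABETES_HBA1C_THRESHOLD = 6.5


def naive_onset_time(t_first: float, hba1c_first: float,
                     t_last: float, hba1c_last: float,
                     threshold: float = DIABETES_HBA1C_THRESHOLD) -> Optional[float]:
    """Return the time (same units as t_first/t_last) at which a straight
    line through two HbA1c measurements reaches the threshold.

    Returns None when the line is flat or falling and so never reaches it.
    The line is extended past t_last if the crossing lies beyond it.
    """
    if t_last <= t_first:
        raise ValueError("t_last must be later than t_first")
    if hba1c_first >= threshold:
        # Already at or above the cutoff at the first measurement.
        return t_first
    slope = (hba1c_last - hba1c_first) / (t_last - t_first)
    if slope <= 0:
        return None
    return t_first + (threshold - hba1c_first) / slope
```

A two-point line like this ignores censoring and unobserved interventions such as lifestyle changes, which is the limitation the abstract raises against naïve linear modeling.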

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Healthcare Informatics, ICHI 2017
Editors: Mollie Cummins, Julio Facelli, Gerrit Meixner, Christophe Giraud-Carrier, Hiroshi Nakajima
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 374-379
Number of pages: 6
ISBN (Electronic): 9781509048816
DOIs: https://doi.org/10.1109/ICHI.2017.41
State: Published - Sep 8 2017
Event: 5th IEEE International Conference on Healthcare Informatics, ICHI 2017 - Park City, United States
Duration: Aug 23 2017 to Aug 26 2017

Other

Other: 5th IEEE International Conference on Healthcare Informatics, ICHI 2017
Country: United States
City: Park City
Period: 8/23/17 to 8/26/17

Fingerprint

Electronic Health Records
Type 2 Diabetes Mellitus
Aptitude
Proportional Hazards Models
Observational Studies
Life Style
Linear Models
Retrospective Studies
Delivery of Health Care

Keywords

  • Disease Progression
  • Disease Trajectory
  • Electronic Health Records
  • Hemoglobin A1c
  • Onset time
  • Type 2 diabetes

ASJC Scopus subject areas

  • Health Informatics

Cite this

Oh, W., Yadav, P., Kumar, V., Caraballo, P., Castro, M. R., Steinbach, M. S., & Simon, G. J. (2017). Estimating Disease Onset Time by Modeling Lab Result Trajectories via Bayes Networks. In M. Cummins, J. Facelli, G. Meixner, C. Giraud-Carrier, & H. Nakajima (Eds.), Proceedings - 2017 IEEE International Conference on Healthcare Informatics, ICHI 2017 (pp. 374-379). [8031177] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICHI.2017.41

