The true onset time of a disease, particularly of slow-onset diseases such as Type 2 diabetes mellitus (T2DM), is rarely observable in electronic health records (EHRs). However, it is critical for time-to-event analysis and for studying sequences of diseases. The aim of this study is to demonstrate a method for estimating the onset time of such diseases from intermittently observable laboratory results, in the specific context of T2DM. A retrospective observational study design was used. A cohort of 5,874 non-diabetic patients from a large healthcare system in the Upper Midwest United States was constructed with a three-year follow-up period. The HbA1c level of each patient was collected from the earliest and the latest follow-up. We modeled the patients' HbA1c trajectories through Bayesian networks to estimate the onset time of diabetes. Due to non-random censoring and interventions unobservable from EHR data (such as lifestyle changes), naïve modeling of HbA1c through linear regression, or of time-to-event through a proportional hazards model, leads to a clinically infeasible model with little or no ability to predict the onset time of diabetes. Our model is consistent with clinical knowledge and estimated the onset of diabetes with less than six months of error for almost half of the patients for whom the onset time could be clinically ascertained. To our knowledge, this is the first study to model long-term HbA1c progression in non-diabetic patients and to estimate the onset time of diabetes.