Context Quality-adjusted life year (QALY) estimation is a well-known but little used technique to compare survival adjusted for complications. Lack of calibration and interpretation guidance hinders implementation of QALY analyses. Objectives We conducted simulation studies to assess the impact of differences in survival, toxicity rates, and utility values on QALY results. Methods Survival comparisons used both log-rank and Wilcoxon testing. We examined power considerations for a North Central Cancer Treatment Group Phase III lung cancer clinical trial (89-20-52). Results Sample sizes of 100 events per treatment have low power to generate a statistically significant difference in QALYs unless the toxicity rate is 44% higher in one arm. For sample sizes of 200 per arm and equal survival times, toxicity needs to be at least 38% more in one arm for the result to be statistically significant, using a utility of 0.3 for days with toxicity. Sample sizes of 300 (500)/arm provide 80% power if there is a 31% (25%) toxicity difference. If the overall survival hazard ratio between the two treatment arms is 1.25, then samples of at least 150 patients and 13% increased toxicity are necessary to have 80% power to detect QALY differences. In study 89-20-52, there was only 56% power to determine the statistical significance of the observed QALY differences, clarifying the enigmatic conclusion of no statistically significant difference in QALY despite an observed 14.5% increase in toxicity between treatments. Conclusion This calibration allows researchers to interpret the clinical significance of QALY analyses and facilitates QALY inclusion in clinical trials through improved study design.
- quality of life
- quality-adjusted life year
ASJC Scopus subject areas
- Clinical Neurology
- Anesthesiology and Pain Medicine