Comparison of futility monitoring guidelines using completed phase III oncology trials

Qiang Zhang, Boris Freidlin, Edward L. Korn, Susan Halabi, Sumithra J Mandrekar, James J. Dignam

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Background: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to be shown to be sufficiently effective were the trial to continue to its final analysis. A number of analytical approaches are used to construct futility monitoring boundaries. The most common are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably in the level of evidence required to recommend stopping the study.
Purpose: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group.
Methods: We considered published superiority phase III trials with survival endpoints initiated after 1990. Fifty-two studies from different disease sites were available for this analysis. Total sample size and maximum number of events (statistical information) for each study were calculated using the protocol-specified effect size and type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequencies, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on the sample size, study duration, and information time scales.
Results: For negative studies, the futility approaches based on testing the alternative hypothesis and on repeated confidence intervals yielded smaller savings than the other two rules (conditional power and the linear inefficacy boundary). These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, none of the futility rules would have stopped the trials.
Conclusion: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.
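
The abstract mentions two calculations without giving formulas: the maximum number of events implied by a protocol's effect size and error rates, and conditional power at an interim look. The sketch below illustrates both using standard textbook approximations (Schoenfeld's formula for a 1:1 log-rank comparison, and conditional power under the design alternative on the B-value scale). It is not the authors' code. The function names, the one-sided alpha of 0.025, the 90% power default, and the sign convention (positive Z favors the experimental arm) are all assumptions made for illustration, and no stopping threshold from the paper is implied.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def required_events(hazard_ratio, alpha=0.025, power=0.90):
    """Approximate total events for a 1:1 log-rank test (Schoenfeld).

    alpha is one-sided; hazard_ratio is the protocol-specified
    alternative (for example 0.75). Hypothetical helper for illustration.
    """
    if hazard_ratio <= 0 or hazard_ratio == 1:
        raise ValueError("hazard_ratio must be positive and not equal to 1")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


def conditional_power(z_interim, info_fraction, alpha=0.025, power=0.90):
    """Conditional power under the design alternative.

    z_interim is the interim standardized test statistic (positive values
    favor the experimental arm) and info_fraction is the proportion of
    the maximum number of events observed so far, strictly between 0 and 1.
    """
    if not 0 < info_fraction < 1:
        raise ValueError("info_fraction must lie strictly between 0 and 1")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    # Drift: expected value of the final Z statistic under the alternative.
    drift = z_alpha + _STD_NORMAL.inv_cdf(power)
    b_value = z_interim * sqrt(info_fraction)  # B(t) = Z(t) * sqrt(t)
    remaining = 1 - info_fraction
    # B(1) - B(t) is normal with mean drift * remaining and variance remaining.
    threshold = (z_alpha - b_value - drift * remaining) / sqrt(remaining)
    return 1 - _STD_NORMAL.cdf(threshold)


if __name__ == "__main__":
    print("events needed for HR 0.75:", required_events(0.75))
    for z in (-0.5, 0.0, 0.5, 1.0):
        cp = conditional_power(z, info_fraction=0.5)
        print(f"Z = {z:+.1f} at 50% information: conditional power = {cp:.3f}")
```

A conditional power rule of the kind compared in the abstract would recommend stopping when this quantity falls below a prespecified cutoff at an interim analysis; the cutoff itself is a design choice and is not given in the abstract.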

Original language: English (US)
Pages (from-to): 48-58
Number of pages: 11
Journal: Clinical Trials
Volume: 14
Issue number: 1
DOI: 10.1177/1740774516666502
State: Published - Feb 1 2017

Fingerprint

Medical Futility
Guidelines
Phase III Clinical Trials
Sample Size
Confidence Intervals
Neoplasms
Radiation Oncology
Appointments and Schedules
Leukemia
Radiotherapy
Therapeutics
Clinical Trials

Keywords

  • clinical trials
  • conditional power
  • Futility monitoring
  • linear inefficacy boundary
  • oncology
  • repeated confidence intervals
  • testing alternative hypothesis

ASJC Scopus subject areas

  • Medicine(all)
  • Pharmacology

Cite this

Comparison of futility monitoring guidelines using completed phase III oncology trials. / Zhang, Qiang; Freidlin, Boris; Korn, Edward L.; Halabi, Susan; Mandrekar, Sumithra J; Dignam, James J.

In: Clinical Trials, Vol. 14, No. 1, 01.02.2017, p. 48-58.

@article{4d73c1b3f43542a4bd2b09ce5c1dbcfb,
title = "Comparison of futility monitoring guidelines using completed phase III oncology trials",
keywords = "clinical trials, conditional power, Futility monitoring, linear inefficacy boundary, oncology, repeated confidence intervals, testing alternative hypothesis",
author = "Qiang Zhang and Boris Freidlin and Korn, {Edward L.} and Susan Halabi and Mandrekar, {Sumithra J} and Dignam, {James J.}",
year = "2017",
month = "2",
day = "1",
doi = "10.1177/1740774516666502",
language = "English (US)",
volume = "14",
pages = "48--58",
journal = "Clinical Trials",
issn = "1740-7745",
publisher = "SAGE Publications Ltd",
number = "1",

}

TY - JOUR

T1 - Comparison of futility monitoring guidelines using completed phase III oncology trials

AU - Zhang, Qiang

AU - Freidlin, Boris

AU - Korn, Edward L.

AU - Halabi, Susan

AU - Mandrekar, Sumithra J

AU - Dignam, James J.

PY - 2017/2/1

Y1 - 2017/2/1

KW - clinical trials

KW - conditional power

KW - Futility monitoring

KW - linear inefficacy boundary

KW - oncology

KW - repeated confidence intervals

KW - testing alternative hypothesis

UR - http://www.scopus.com/inward/record.url?scp=85012050883&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012050883&partnerID=8YFLogxK

U2 - 10.1177/1740774516666502

DO - 10.1177/1740774516666502

M3 - Article

VL - 14

SP - 48

EP - 58

JO - Clinical Trials

JF - Clinical Trials

SN - 1740-7745

IS - 1

ER -