Comparison of futility monitoring guidelines using completed phase III oncology trials

Qiang Zhang; Boris Freidlin; Edward L. Korn; Susan Halabi; Sumithra Mandrekar; James J. Dignam

doi:10.1177/1740774516666502

Comparison of futility monitoring guidelines using completed phase III oncology trials

Qiang Zhang, Boris Freidlin, Edward L. Korn, Susan Halabi, Sumithra Mandrekar, James J. Dignam

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Background: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to demonstrate to be sufficiently effective if the trial were to continue to its final analysis. There are a number of analytical approaches that are used to construct futility monitoring boundaries. The most common approaches are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably with respect to the level of evidence required for recommending stopping the study. Purpose: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group. Methods: We considered published superiority phase III trials with survival endpoints initiated after 1990. There are 52 studies available for this analysis from different disease sites. Total sample size and maximum number of events (statistical information) for each study were calculated using protocol-specified effect size, type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequency, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on sample size, study duration, and information time scales. Results: For negative studies, our results show that the futility approaches based on testing the alternative hypothesis and repeated confidence interval rules yielded less savings (compared to the other two rules). These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, we demonstrated that none of the futility rules would have stopped the trials. Conclusion: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.

Original language	English (US)
Pages (from-to)	48-58
Number of pages	11
Journal	Clinical Trials
Volume	14
Issue number	1
DOIs	https://doi.org/10.1177/1740774516666502
State	Published - Feb 1 2017

Keywords

Futility monitoring
clinical trials
conditional power
linear inefficacy boundary
oncology
repeated confidence intervals
testing alternative hypothesis

ASJC Scopus subject areas

Pharmacology

Access to Document

10.1177/1740774516666502

Cite this

@article{4d73c1b3f43542a4bd2b09ce5c1dbcfb,

title = "Comparison of futility monitoring guidelines using completed phase III oncology trials",

abstract = "Background: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to demonstrate to be sufficiently effective if the trial were to continue to its final analysis. There are a number of analytical approaches that are used to construct futility monitoring boundaries. The most common approaches are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably with respect to the level of evidence required for recommending stopping the study. Purpose: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group. Methods: We considered published superiority phase III trials with survival endpoints initiated after 1990. There are 52 studies available for this analysis from different disease sites. Total sample size and maximum number of events (statistical information) for each study were calculated using protocol-specified effect size, type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequency, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on sample size, study duration, and information time scales. Results: For negative studies, our results show that the futility approaches based on testing the alternative hypothesis and repeated confidence interval rules yielded less savings (compared to the other two rules). These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, we demonstrated that none of the futility rules would have stopped the trials. Conclusion: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.",

keywords = "Futility monitoring, clinical trials, conditional power, linear inefficacy boundary, oncology, repeated confidence intervals, testing alternative hypothesis",

author = "Qiang Zhang and Boris Freidlin and Korn, {Edward L.} and Susan Halabi and Sumithra Mandrekar and Dignam, {James J.}",

note = "Publisher Copyright: {\textcopyright} The Society for Clinical Trials.",

year = "2017",

month = feb,

day = "1",

doi = "10.1177/1740774516666502",

language = "English (US)",

volume = "14",

pages = "48--58",

journal = "Clinical Trials",

issn = "1740-7745",

publisher = "SAGE Publications Ltd",

number = "1",

}

TY - JOUR

T1 - Comparison of futility monitoring guidelines using completed phase III oncology trials

AU - Zhang, Qiang

AU - Freidlin, Boris

AU - Korn, Edward L.

AU - Halabi, Susan

AU - Mandrekar, Sumithra

AU - Dignam, James J.

N1 - Publisher Copyright: © The Society for Clinical Trials.

PY - 2017/2/1

Y1 - 2017/2/1

N2 - Background: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to demonstrate to be sufficiently effective if the trial were to continue to its final analysis. There are a number of analytical approaches that are used to construct futility monitoring boundaries. The most common approaches are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably with respect to the level of evidence required for recommending stopping the study. Purpose: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group. Methods: We considered published superiority phase III trials with survival endpoints initiated after 1990. There are 52 studies available for this analysis from different disease sites. Total sample size and maximum number of events (statistical information) for each study were calculated using protocol-specified effect size, type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequency, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on sample size, study duration, and information time scales. Results: For negative studies, our results show that the futility approaches based on testing the alternative hypothesis and repeated confidence interval rules yielded less savings (compared to the other two rules). These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, we demonstrated that none of the futility rules would have stopped the trials. Conclusion: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.

AB - Background: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to demonstrate to be sufficiently effective if the trial were to continue to its final analysis. There are a number of analytical approaches that are used to construct futility monitoring boundaries. The most common approaches are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably with respect to the level of evidence required for recommending stopping the study. Purpose: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group. Methods: We considered published superiority phase III trials with survival endpoints initiated after 1990. There are 52 studies available for this analysis from different disease sites. Total sample size and maximum number of events (statistical information) for each study were calculated using protocol-specified effect size, type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequency, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on sample size, study duration, and information time scales. Results: For negative studies, our results show that the futility approaches based on testing the alternative hypothesis and repeated confidence interval rules yielded less savings (compared to the other two rules). These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, we demonstrated that none of the futility rules would have stopped the trials. Conclusion: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.

KW - Futility monitoring

KW - clinical trials

KW - conditional power

KW - linear inefficacy boundary

KW - oncology

KW - repeated confidence intervals

KW - testing alternative hypothesis

UR - http://www.scopus.com/inward/record.url?scp=85012050883&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85012050883&partnerID=8YFLogxK

U2 - 10.1177/1740774516666502

DO - 10.1177/1740774516666502

M3 - Article

C2 - 27590208

AN - SCOPUS:85012050883

SN - 1740-7745

VL - 14

SP - 48

EP - 58

JO - Clinical Trials

JF - Clinical Trials

IS - 1

ER -

Comparison of futility monitoring guidelines using completed phase III oncology trials

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this