Design of Phase II cancer trials evaluating survival probabilities
- L Douglas Case^{1} and
- Timothy M Morgan^{1}
DOI: 10.1186/1471-2288-3-6
© Case and Morgan; licensee BioMed Central Ltd. 2003
Received: 30 September 2002
Accepted: 3 April 2003
Published: 3 April 2003
Abstract
Background
Phase II cancer studies are undertaken to assess the activity of a new drug or a new treatment regimen. Activity is sometimes defined in terms of a survival probability, a binary outcome such as one-year survival that is derived from a time-to-event variable. Phase II studies are usually designed with an interim analysis so they can be stopped if early results are disappointing. Most designs that allow for an interim look are not appropriate for monitoring survival probabilities since many patients will not have enough follow-up by the time of the interim analysis, thus necessitating an inconvenient suspension of accrual while patients are being followed.
Methods
Two-stage phase II clinical trial designs are developed for evaluating survival probabilities. These designs are compared to fixed sample designs and to existing designs developed to monitor binomial probabilities to illustrate the expected reduction in sample size or study length possible with the use of the proposed designs.
Results
Savings can be realized in both the duration of accrual and the total study length, with the expected savings increasing as the accrual rate decreases. Misspecifying the underlying survival distribution and the accrual rate during the planning phase can adversely influence the operating characteristics of the designs.
Conclusion
Two-stage phase II trials for assessing survival probabilities can be designed that do not require prolonged suspension of patient accrual. These designs are more efficient than single stage designs and more practical than existing two-stage designs developed for binomial outcomes, particularly in trials with slow accrual.
Background
Phase II clinical trials are usually conducted to assess the activity of a new drug or treatment regimen. Activity in many phase II cancer trials is quantified by tumor response, and a treatment is considered successful for an individual patient if his or her tumor burden is reduced by 50% or more. Here, the response rate is the number of patients responding divided by the total number of evaluable patients. In other trials, activity might be quantified by a time-to-event variable such as remission-free or overall survival, and the outcome might be the proportion of patients remission-free or alive at one or two years. Regimens showing sufficient activity in the phase II setting might be evaluated subsequently in a phase III comparative trial. Phase II trials typically test the null hypothesis H_{0}: p ≤ p_{0} that the true response rate of the treatment under consideration is no greater than some level (p_{0}) that would be deemed too low for further consideration. Studies are designed so that the probability of falsely rejecting H_{0} (i.e., considering the treatment worthy of further investigation when in fact it is not) is α and the probability of rejecting H_{0} when p = p_{1}, a response rate high enough to be of interest (i.e., considering the drug worthy of further investigation when in fact it is), is 1-β.
Due to ethical concerns, interim analyses are done in phase II trials to ensure that patients are not receiving a treatment that is clearly inferior to other available options. Usually these trials are not stopped early when a treatment has shown better than expected activity because there is no ethical dilemma, and there is usually interest in obtaining better estimates of the treatment's activity. On the other hand, if the treatment is performing poorly, physicians want to explore other options for subsequent patients. Numerous frequentist and Bayesian multi-stage phase II designs have been developed for monitoring binary outcomes [see, for example, [1–10]]. Simon [4] developed two-stage plans that minimized the maximum sample size and plans that minimized the expected sample size under the null hypothesis. He felt one interim analysis was usually sufficient since it is frequently impractical to analyze the data multiple times, and two-stage designs realize much of the savings possible with sequential or group sequential designs [11]. For example, Chen [7] extended Simon's designs to three stages and found a mean reduction in sample size of only 10% with the addition of an extra stage.
When the response of interest is based on a time to event outcome (e.g., one-year survival), censoring becomes an issue during the interim analysis, as it is unclear how subjects without sufficient follow-up should be handled. The simple proportion of all subjects surviving the required time is a biased estimate of the survival probability if some subjects have incomplete follow-up, and restricting the estimate to those subjects followed for the required time results in an inefficient estimate. On the other hand, suspending accrual while all the subjects are followed the required length of time is impractical. Long trial suspensions can ruin a trial's momentum, increase study length, and increase costs, and it is potentially unclear how to treat new subjects during the suspension.
Example
In 1998, investigators in the Comprehensive Cancer Center of Wake Forest University wanted to design a phase II study to assess the activity of a new chemo-radiation combination for patients with resectable pancreatic cancer. Pancreatic cancer is the most deadly of all the major cancers. When this trial was planned, 29,000 new cases of pancreatic cancer and 28,900 deaths from the disease were expected in the United States [12]. Patients with resectable pancreatic cancer have a better prognosis than those with unresectable disease, but the long-term outlook for these patients is still bleak. Investigators felt that Gemcitabine, a radiation sensitizer that had shown activity in pancreatic cancer [13–16], combined with external beam radiation, could improve the prognosis of these patients. Since the tumors were resected in these patients, objective tumor response was not a possible outcome, and one-year survival was chosen as a clinically meaningful outcome. The investigators decided that the treatment would be considered unsuccessful if the one-year survival was 35% or less, and it would be considered active enough to pursue further if the one-year survival was 50% or greater.
The fixed sample size required for a single-stage study to test this hypothesis, based on an exact binomial test, would be 72. If 31 or more of the patients live longer than one year, the null hypothesis would be rejected. This design has type I and II errors of .096 and .097, respectively. Assuming accrual of 24 patients per year (for a duration of accrual of 3 years), each patient would be followed until failure or for a maximum of one year and the single analysis would be done at 4 years.
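The operating characteristics quoted for the single-stage design can be checked directly from the binomial distribution. The following is a minimal sketch, not the authors' code; the helper name `binom_upper_tail` is introduced here purely for illustration.

```python
from math import comb

def binom_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n, cutoff = 72, 31      # sample size and rejection threshold given in the text
p0, p1 = 0.35, 0.50     # null and alternative one-year survival
rate = 24               # patients accrued per year, as assumed in the text

type_1 = binom_upper_tail(n, cutoff, p0)       # reject H0 although p = p0
type_2 = 1 - binom_upper_tail(n, cutoff, p1)   # fail to reject although p = p1
accrual_years = n / rate                       # duration of accrual
analysis_time = accrual_years + 1              # last patient followed for one year

print(round(type_1, 3), round(type_2, 3), accrual_years, analysis_time)
```

Both error rates should come out just under 0.10, in line with the .096 and .097 reported above, and the timing reproduces the 3 years of accrual and the analysis at 4 years.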
Table 1. Characteristics of Single Stage, Simon Two-Stage, and Optimal Two-Stage Designs *
Design | ESS (%) | MDA (%) | ETSL (%) | MTSL (%) |
---|---|---|---|---|
Single Stage | 72.0 (100) | 3.00 (100) | 4.00 (100) | 4.00 (100) |
Simon – no interim accrual | 53.2 (74) | 3.38 (112) | 3.63 (91) | 5.38 (134) |
Simon – interim accrual | 67.4 (94) | 3.38 (112) | 3.22 (80) | 4.38 (109) |
Proposed – minimize ETSL | 63.5 (88) | 3.44 (115) | 3.00 (75) | 4.44 (111) |
Proposed – minimize EDA | 62.0 (86) | 3.27 (109) | 3.08 (77) | 4.27 (107) |
One might be tempted to use Simon's design, but to accrue additional patients while the initial 34 are followed. Characteristics for this design are also shown in Table 1. For our example, 24 additional patients would be accrued while the first 34 are being followed, and the trial would have an expected sample size under H_{0} of 67.4 (or 94% of the fixed sample value). The expected and maximum study lengths would be 3.22 and 4.38 years, respectively (or 80% and 109% of the fixed sample values). One sees that accruing additional patients while the initial patients are followed results in a greater expected sample size, but shorter expected and maximum total study lengths. The two main problems with this design are 1) information from patients who have been followed less than one year during the interim analysis is ignored, and 2) information collected from the additional patients is potentially never used. Additionally, if accrual is fast, the total required recruitment could be completed by the time of the interim review, possibly resulting in more patients than necessary being treated with what turns out to be an ineffective regimen. Note that Herndon [9] proposes a hybrid phase II design that allows for interim accrual but uses information collected on all patients by delaying the decision to stop the trial until all data are reviewed.
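The arithmetic behind the two Simon rows of Table 1 can be sketched as follows. Two inputs are assumptions that the text does not state: a maximum sample size of 81 (inferred from the tabled MDA of 3.38 years at 24 patients per year) and a first-stage stopping probability under H_{0} of about 0.59 (backed out from the tabled expected sample size of 53.2). The function name is hypothetical.

```python
def simon_characteristics(n1, n_max, rate, x_star, p_stop, interim_accrual):
    """Expected sample size (ESS), expected and maximum total study length
    under H0 for a binomial two-stage design with uniform accrual."""
    t_interim = n1 / rate + x_star   # first-stage patients all followed x_star years
    if interim_accrual:
        # accrual continues while the first-stage patients are followed
        n_at_interim = min(n_max, n1 + rate * x_star)
        mtsl = n_max / rate + x_star
    else:
        # accrual is suspended until the interim analysis
        n_at_interim = n1
        mtsl = t_interim + (n_max - n1) / rate + x_star
    ess = n_at_interim + (1 - p_stop) * (n_max - n_at_interim)
    etsl = t_interim + (1 - p_stop) * (mtsl - t_interim)
    return ess, etsl, mtsl

# n_max=81 and p_stop=0.591 are assumptions inferred from Table 1, not stated values
suspended = simon_characteristics(34, 81, 24, 1.0, 0.591, interim_accrual=False)
continued = simon_characteristics(34, 81, 24, 1.0, 0.591, interim_accrual=True)
print([round(v, 2) for v in suspended])
print([round(v, 2) for v in continued])
```

Under these assumed inputs the two calls land close to the tabled values (53.2, 3.63, 5.38 without interim accrual and 67.4, 3.22, 4.38 with it), which makes the trade-off explicit: continuing accrual shortens the study but exposes more patients.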
Several researchers have proposed using the Kaplan-Meier [17] or Nelson-Aalen [18] estimates of the survival probability during interim analyses to account for the information available from those with partial follow-up without necessitating trial suspension. Jennison and Turnbull [19] give an example of assessing the effect of a drug on the mother to infant HIV transmission rate among HIV infected breast-feeding mothers, where the proportion of infants infection-free at two years would be a meaningful outcome. They suggest monitoring such a trial using Kaplan-Meier [17] estimates and the spending function approach originally described by Lan and DeMets [20]. Lin et al [21] discuss the design of a study in young children with Wilms cancer for which two-year relapse-free survival is the primary outcome, and Nelson-Aalen [18] estimates are used during the interim analyses.
In the next section, we apply results presented by Lin et al [21] to show how to design efficient phase II studies for monitoring survival probabilities without the aforementioned drawbacks. We develop designs that minimize either the expected duration of accrual or the expected total study length under H_{0}. We choose optimal designs under the null hypothesis since we want to minimize the number of patients treated with an ineffective regimen. Numerical results are presented to illustrate the effect of different choices for the design parameters, including the type I and II errors and the duration of accrual (or, alternatively, the accrual rate). We illustrate these methods in detail for our particular trial and follow with a discussion of possible limitations and extensions.
Methods
Lin et al [21] derived the asymptotic joint distribution of the Nelson-Aalen [18] estimates of survival calculated at different calendar times during a study and applied the results to longitudinally monitor a National Wilms Tumor Study Group protocol. A brief summary of their relevant results is given below. Following their notation, assume n patients are accrued to a trial at times Y_{1},...,Y_{n}. Let T_{1},...,T_{n} denote the failure times and C_{1},...,C_{n} the censoring times since study entry. At time t, we observe for each individual either a failure time or a censoring time and an indicator specifying which. That is, we observe the time X_{i}(t) = min(T_{i}, C_{i}, max(0, t - Y_{i})) and the failure indicator Δ_{i}(t) = I{T_{i} ≤ min(C_{i}, max(0, t - Y_{i}))}. Let \hat{S}(x;t) denote the estimate of x-year survival at time t, based on the Nelson-Aalen [18] estimate of the cumulative hazard function, \hat{\Lambda}(x;t):
$$\hat{S}(x;t) = \exp\{-\hat{\Lambda}(x;t)\}, \qquad \hat{\Lambda}(x;t) = \sum_{i=1}^{n} \frac{\Delta_i(t)\, I\{X_i(t) \le x\}}{\sum_{j=1}^{n} I\{X_j(t) \ge X_i(t)\}}.$$
With uniform accrual and no loss to follow-up, n^{1/2}\hat{\Lambda}(x;t) has asymptotic variance
$$\sigma^2(x;t) = \int_0^x \frac{\lambda(u)}{S(u)\,\min\{(t-u)/\mathrm{MDA},\,1\}}\,du, \qquad (1)$$
where λ(u) is the hazard function and MDA is the maximum duration of accrual. Lin et al [21] recommend assessing the hypothesis H_{0}: S(x) = S_{0}(x) at time t using the asymptotically standard normal test statistic
$$Z(x;t) = \frac{n^{1/2}\{\Lambda_0(x) - \hat{\Lambda}(x;t)\}}{\hat{\sigma}(x;t)}, \qquad (2)$$
where Λ_{0}(x) = -log S_{0}(x) and
$$\hat{\sigma}^2(x;t) = n \sum_{i=1}^{n} \frac{\Delta_i(t)\, I\{X_i(t) \le x\}}{\left[\sum_{j=1}^{n} I\{X_j(t) \ge X_i(t)\}\right]^2}. \qquad (3)$$
Let I(x;t_{i}) denote the information available for estimating S(x) at time t_{i}. The joint distribution of Z(x;t) calculated over the course of the study is multivariate normal with correlations given by \{I(x;t_i)/I(x;t_j)\}^{1/2}, where t_{i} ≤ t_{j}.
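As an illustration of how such a statistic might be computed at a given calendar time, here is a minimal sketch. It assumes the usual Nelson-Aalen estimator (a sum of 1/at-risk over observed failures up to x), the matching variance estimate (a sum of 1/at-risk squared), no loss to follow-up, and a sign convention under which large Z favors survival above S_{0}(x). The function name and the toy data are hypothetical.

```python
import math

def interim_statistic(entry, failure, t, x, s0):
    """Return (S_hat, Z) for x-year survival at calendar time t.

    entry   -- calendar times of study entry (Y_i)
    failure -- failure times measured from entry (T_i)
    s0      -- null survival probability S_0(x)
    Assumes no loss to follow-up, so censoring is administrative only.
    """
    # follow-up time and failure indicator as observed at calendar time t
    observed = []
    for y, fail_time in zip(entry, failure):
        follow_up = max(0.0, t - y)
        observed.append((min(fail_time, follow_up), fail_time <= follow_up))

    cum_hazard = 0.0   # Nelson-Aalen estimate of Lambda(x)
    var_sum = 0.0      # estimated variance of the Nelson-Aalen estimate
    for x_i, failed in observed:
        if failed and x_i <= x:
            at_risk = sum(1 for x_j, _ in observed if x_j >= x_i)
            cum_hazard += 1.0 / at_risk
            var_sum += 1.0 / at_risk**2

    s_hat = math.exp(-cum_hazard)
    if var_sum == 0.0:
        return s_hat, math.inf   # no failures yet: no evidence against the treatment
    z = (-math.log(s0) - cum_hazard) / math.sqrt(var_sum)
    return s_hat, z

# toy data: four patients entered at time 0, one failure at 0.5 years
s_hat, z = interim_statistic([0, 0, 0, 0], [0.5, 2.0, 2.0, 2.0], t=1.5, x=1.0, s0=0.35)
print(round(s_hat, 4), round(z, 3))
```

Patients who have not yet entered by time t have zero follow-up and never appear in a risk set, so the same function can be called at any interim calendar time.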
Now consider the design of a two-stage phase II trial for testing H_{0}: S(x*) = S _{0}(x*), where x* denotes the survival time of interest and S(.) denotes the survival function which can be estimated as described above (or by using the Kaplan-Meier estimator). As illustrated below, let t _{1} and t _{2} denote the duration of the first and second accrual periods, and let MTSL denote the maximum total study length (MTSL = t _{1} + t _{2} + x*).
We will use the following notation:
x* = survival time of interest
t _{ i }= duration of accrual for the i^{th} stage
n _{ i }= sample size accrued during the i^{th} stage; n = n _{1} + n _{2}
ν = constant rate of accrual
P _{ s }= probability of stopping at t _{1}
DA = duration of accrual
EDA = expected duration of accrual = t _{1} + (1 - P _{ s })t _{2}, calculated under the null hypothesis
ESS = expected sample size = n _{1} + (1 - P _{ s })n _{2} = νEDA, calculated under the null hypothesis
MDA = maximum duration of accrual = t _{1} + t _{2}
ETSL = expected total study length = t _{1} + (1 - P _{ s })(t _{2} + x*), calculated under the null hypothesis
MTSL = maximum total study length = t _{1} + t _{2}+ x*
I _{1} = information available at t _{1}
I _{ max }= information available at MTSL
A two-stage design proceeds as follows.
Stage 1. Accrue n _{1} patients between time 0 and time t _{1}. Each patient will be followed until failure or for x* years or until time t _{1}, whichever is less. Calculate Z _{1}(x*;t _{1}) as given in (2). If Z _{1}(x*;t _{1}) < C_{1} stop the study and "accept" H_{0}; otherwise, continue to the next stage.
Stage 2. Accrue n _{2} additional patients between times t _{1} and t _{1} + t _{2} (= MDA). Follow all patients until failure or for x* years, calculate Z _{2}(x*;MTSL), and reject H_{0} if Z _{2}(x*;MTSL) > C_{2}.
An interim analysis could be done anytime after x* years and before time MTSL. The expected duration of accrual (EDA) under the null hypothesis is given by EDA = t_{1} + (1 - P_{s})t_{2}, where P_{s} = Φ(C_{1}) and Φ(.) denotes the standard normal distribution function. The expected total study length (ETSL) under the null hypothesis is given by ETSL = t_{1} + (1 - P_{s})(t_{2} + x*). The maximum amount of information for estimating Λ(x*), I_{max}, is available whenever t ≥ MTSL. The joint distribution of Z_{1} and Z_{2} is bivariate normal with correlation ρ = (I_{1}/I_{max})^{1/2}.
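The expected durations defined above follow directly from the stage lengths and the first-stage critical value. A minimal sketch, with a hypothetical function name, using the ETSL-optimal row of the application table as input (t_{1} = 2.2, MDA = 3.44 so t_{2} = 1.24, C_{1} = .375, 24 patients per year):

```python
from statistics import NormalDist

def expected_durations(t1, t2, x_star, c1, rate):
    """EDA, ESS, ETSL and MTSL under H0 for a two-stage design."""
    p_stop = NormalDist().cdf(c1)          # P_s = Phi(C_1)
    eda = t1 + (1 - p_stop) * t2           # expected duration of accrual
    return {
        "P_s": p_stop,
        "EDA": eda,
        "ESS": rate * eda,                 # expected sample size = rate * EDA
        "ETSL": t1 + (1 - p_stop) * (t2 + x_star),
        "MTSL": t1 + t2 + x_star,
    }

design = expected_durations(t1=2.2, t2=1.24, x_star=1.0, c1=0.375, rate=24)
print({k: round(v, 2) for k, v in design.items()})
```

Because the tabled inputs are rounded, the results agree with the published EDA, ESS and ETSL only to about two decimal places.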
During the design stage, estimates of the information as a function of time can be obtained by making assumptions regarding the expected survival distribution. Once the survival distribution is specified, the expected information at any time (I(x*;t) = 1/σ^{2}(x*;t)) can be easily obtained by numerically evaluating equation 1. The Weibull distribution is a flexible and simple choice, as the survival distributions are completely specified by the null and alternative survival probabilities for any given shape parameter.
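A sketch of this design-stage calculation is given below. It assumes the variance takes the form σ^{2}(x;t) = ∫_{0}^{x} λ(u) / [S(u) min{(t - u)/MDA, 1}] du (uniform accrual, no loss to follow-up), a Weibull survival function S(u) = exp(-a u^{k}) with a fixed by the target survival probability, and a simple midpoint rule in place of whatever quadrature the authors used. All function names are hypothetical.

```python
import math

def nelson_aalen_variance(x, t, mda, surv_at_x, shape=1.0, steps=2000):
    """sigma^2(x;t) for Weibull survival with S(x) = surv_at_x and uniform accrual."""
    if t <= x:
        return math.inf                      # nobody can have x years of follow-up yet
    a = -math.log(surv_at_x) / x**shape      # Weibull rate so that S(x) = surv_at_x
    h = x / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h                    # midpoint rule avoids u = 0 when shape < 1
        hazard = a * shape * u**(shape - 1)
        survival = math.exp(-a * u**shape)
        accrued = min((t - u) / mda, 1.0)    # fraction of patients entered by time t - u
        total += hazard / (survival * accrued)
    return total * h

def information_fraction(x, t, mda, surv_at_x, shape=1.0):
    """I(x;t) / I_max, with I = 1 / sigma^2 and I_max reached at t = MDA + x."""
    return (nelson_aalen_variance(x, mda + x, mda, surv_at_x, shape)
            / nelson_aalen_variance(x, t, mda, surv_at_x, shape))

# null one-year survival of .35, exponential case (shape = 1)
print(round(information_fraction(1.0, 2.2, 3.44, 0.35), 3))
print(round(information_fraction(1.0, 1.9, 3.27, 0.35), 3))
```

With the exponential case these two calls come out near the information fractions of .46 and .38 tabled for the designs with t_{1} = 2.2 and t_{1} = 1.9; changing `shape` shows how sensitive the fraction is to the assumed survival distribution.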
For two-stage phase II designs for survival probabilities, there are four unknowns – n _{1} or t _{1}, n _{2} or t _{2}, C _{1}, and C _{2} – and two constraints (type I and II errors). We assume the accrual rate is fixed. As there are more unknowns than constraints, there will be an infinite number of solutions. We will choose those solutions that minimize either the expected duration of accrual (i.e., expected sample size) or the expected total study length under the null hypothesis. Note that this is the same paradigm as used by Simon [4] in selecting optimal designs for binomial outcomes, where he only allows early acceptance of the null hypothesis and minimizes the expected sample size under the null hypothesis. Specifically, we choose to minimize the EDA or the ETSL given that B(C _{1},C _{2},ρ) = α and B(C _{1} - ρu, C _{2} -u, ρ) = 1 - β, where u = n ^{1/2}(μ - μ_{0})/σ, μ and σ are the mean and standard deviation of the test statistic, and B(C _{1},C _{2},ρ) denotes the bivariate normal probability that Z _{1} >C _{1} and Z _{2} >C _{2}, given a correlation between Z _{1} and Z _{2} of ρ. Numerical integration of the bivariate normal distribution is accomplished using a double precision Fortran function written by Donnelly [22]. For initially chosen values for C _{2} and ρ, values of C _{1} and n are found to satisfy the two error constraints. The parameter values that result in an optimal design are found by iterating over C _{2} and ρ using a combination golden-section search and parabolic interpolation minimization routine described by Brent [23]. A given choice of n and ρ corresponds to a specific t _{1}, which is obtained by solving equation 1 under the null hypothesis. This, in turn, corresponds to another ρ under the alternative hypothesis, obtained by solving equation 1 under the alternative specifications. This latter ρ is used in calculating the sample size (or duration of accrual) that satisfies the type II error constraint. 
For practical purposes this step is probably unnecessary since the correlation is very similar under the null and alternative hypotheses, even though the information is different at each stage under the two hypotheses. Fortran code implementing this algorithm is available from the first author upon request. Run times vary between 5 and 10 minutes on an IBM compatible Pentium 3 600 MHz PC; the run time can be decreased by decreasing the precision specified in the program.
Results
Application
Consider the example mentioned earlier. That is, suppose we would like to design a phase II study to assess the effectiveness of adjuvant Gemcitabine and external beam radiation for the treatment of patients with resectable pancreatic cancer. The principal outcome measure used to quantify treatment effect will be one-year survival, and we will test the null hypothesis that one-year survival is 35% or less. We desire to have 90% power at an alternative one-year survival of 50% for testing this hypothesis at the 10% one-sided level of significance.
Table 2. Optimal (under H_{0}) two-stage parameters for testing H_{0}: S(1) ≤ .35 vs H_{1}: S(1) > .35. Power is 90% at S(1) = .5 *
Minimized | t_{1} | C_{1} | C_{2} | I_{1}/I_{max} | EDA (%) | MDA (%) | ETSL (%) | MTSL (%) |
---|---|---|---|---|---|---|---|---|
ETSL | 1.8 | .137 | 1.164 | .31 | 2.65 (88) | 3.71 (124) | 3.10 (77) | 4.71 (118) |
 | 2.0 | .270 | 1.164 | .38 | 2.63 (88) | 3.59 (120) | 3.02 (76) | 4.59 (115) |
 | 2.2 ^{@} | .375 | 1.172 | .46 | 2.65 (88) | 3.44 (115) | 3.00 (75) | 4.44 (111) |
 | 2.4 | .464 | 1.184 | .53 | 2.70 (90) | 3.32 (111) | 3.02 (75) | 4.32 (108) |
 | 2.6 | .550 | 1.198 | .61 | 2.78 (93) | 3.22 (107) | 3.07 (77) | 4.22 (105) |
EDA | 1.5 | -.313 | 1.223 | .25 | 2.66 (89) | 3.36 (112) | 3.28 (82) | 4.36 (109) |
 | 1.7 | -.125 | 1.218 | .31 | 2.60 (87) | 3.33 (111) | 3.15 (79) | 4.33 (108) |
 | 1.9 ^{@} | .004 | 1.220 | .38 | 2.58 (86) | 3.27 (109) | 3.08 (77) | 4.27 (107) |
 | 2.1 | .109 | 1.227 | .46 | 2.60 (87) | 3.20 (107) | 3.06 (76) | 4.20 (105) |
 | 2.3 | .189 | 1.237 | .53 | 2.65 (88) | 3.13 (104) | 3.08 (77) | 4.13 (103) |
Misspecification of Survival Distributions and Accrual
Table 3. Characteristics of Optimal (for ETSL) Design Assuming Misspecification of the Survival Distributions
Weibull Shape Parameter | α | 1-β |
---|---|---|
0.25 | .106 | .918 |
0.50 | .103 | .910 |
0.75 | .101 | .904 |
1.00 | .100 | .900 |
2.00 | .097 | .889 |
3.00 | .096 | .884 |
4.00 | .095 | .880 |
Table 4. Characteristics of Optimal (for ETSL) Design Assuming Misspecification of Accrual
Actual/Anticipated Accrual | Scenario 1*: α | Scenario 1*: 1 - β | Scenario 2*: α | Scenario 2*: 1 - β | Scenario 3*: α | Scenario 3*: 1 - β |
---|---|---|---|---|---|---|
0.25 | .1 | .512 | .070 | .687 | .109 | .924 |
0.50 | .1 | .711 | .082 | .800 | .106 | .918 |
1.00 | .1 | .900 | .100 | .900 | .100 | .900 |
1.50 | .1 | .963 | .114 | .934 | .092 | .863 |
2.00 | .1 | .986 | .115 | .970 | .079 | .779 |
General Results
Table 5. Optimal two-stage parameters for designs that minimize either the ETSL (top line) or the EDA (bottom line) for testing H_{0}: S(1) ≤ .35. Power is specified for S(1) = .5.^{#}
α | 1 - β | DA/x* | t_{1} | C_{1} | C_{2} | I_{1}/I_{max} | EDA | MDA | ETSL | MTSL |
---|---|---|---|---|---|---|---|---|---|---|
.1 | .9 | 1.50 | 1.01 | .488 | 1.154 | .49 | 1.05 | 1.15 | .76 | 1.09 |
 | | | .81 | -.489 | 1.257 | .32 | .96 | 1.03 | .85 | 1.02 |
 | | 1.75 | .93 | .457 | 1.159 | .48 | 1.00 | 1.15 | .76 | 1.10 |
 | | | .76 | -.291 | 1.245 | .34 | .94 | 1.05 | .82 | 1.03 |
 | | 2 | .87 | .434 | 1.163 | .47 | .96 | 1.15 | .75 | 1.10 |
 | | | .72 | -.179 | 1.237 | .36 | .92 | 1.06 | .80 | 1.04 |
 | | 3 | .74 | .375 | 1.172 | .46 | .88 | 1.15 | .75 | 1.11 |
 | | | .63 | .004 | 1.220 | .38 | .86 | 1.09 | .77 | 1.07 |
 | | 4 | .67 | .346 | 1.176 | .45 | .84 | 1.15 | .75 | 1.12 |
 | | | .59 | .073 | 1.212 | .39 | .83 | 1.10 | .76 | 1.08 |
.05 | .95 | 1.50 | .99 | .680 | 1.536 | .46 | 1.03 | 1.17 | .72 | 1.10 |
 | | | .81 | -.274 | 1.622 | .32 | .95 | 1.04 | .81 | 1.02 |
 | | 1.75 | .91 | .648 | 1.541 | .46 | .98 | 1.16 | .72 | 1.10 |
 | | | .75 | -.084 | 1.612 | .34 | .92 | 1.06 | .78 | 1.04 |
 | | 2 | .85 | .624 | 1.545 | .45 | .94 | 1.16 | .71 | 1.11 |
 | | | .72 | .024 | 1.605 | .35 | .89 | 1.07 | .76 | 1.05 |
 | | 3 | .72 | .563 | 1.553 | .44 | .85 | 1.16 | .71 | 1.12 |
 | | | .63 | .202 | 1.591 | .38 | .83 | 1.10 | .72 | 1.08 |
 | | 4 | .66 | .531 | 1.557 | .44 | .80 | 1.15 | .70 | 1.12 |
 | | | .59 | .266 | 1.586 | .39 | .79 | 1.11 | .71 | 1.09 |
As the duration of accrual increases relative to x*, the interim analysis is done earlier relative to the fixed sample duration of accrual. As the duration of accrual approaches infinity, the optimal design becomes the Simon design (or, in our case, the normal approximation to the Simon design) since it becomes increasingly likely that each patient reaches his or her end point before the next patient is accrued. Designs optimized to minimize the EDA have their interim analyses earlier than the respective designs optimized to minimize the ETSL. This difference becomes smaller as the duration of accrual increases. As DA/x* increases, ρ decreases slightly for designs that minimize the ETSL but increases for designs that minimize the EDA. One observes that the maximum sample size is typically 15–16% greater than the fixed sample size, regardless of the design parameters for designs that minimize the ETSL. The maximum sample size is smaller and increases as DA/x* increases for designs that minimize the EDA. We derived results similar to those shown in Table 5 for other choices of survival under the null and alternative hypotheses (results not shown). One notes that neither the designs which minimize the ETSL nor the ones which minimize the EDA are substantially affected by the choice of survival probabilities under the null and alternative hypotheses, for constant values of DA/x*.
Simulations
Table 6. Realized α and 1 - β using simulations of the optimal designs for testing H_{0}: S(1) ≤ .35. Power is specified for S(1) = .5.
DA/x* | ETSL: α (desired .1) | ETSL: 1 - β (desired .9) | EDA: α (desired .1) | EDA: 1 - β (desired .9) | ETSL: α (desired .05) | ETSL: 1 - β (desired .95) | EDA: α (desired .05) | EDA: 1 - β (desired .95) |
---|---|---|---|---|---|---|---|---|
1.50 | .103 | .909 | .100 | .907 | .049 | .953 | .043 | .948 |
1.75 | .104 | .909 | .114 | .918 | .049 | .954 | .045 | .951 |
2 | .126 | .919 | .089 | .898 | .050 | .954 | .051 | .955 |
3 | .092 | .904 | .116 | .918 | .045 | .952 | .059 | .961 |
4 | .092 | .903 | .091 | .903 | .045 | .954 | .048 | .955 |
Discussion
In many phase II cancer trials, activity might be quantified by a time to event variable such as remission-free or overall survival, and the outcome might be a survival probability such as the proportion of patients remission-free or alive at one or two years. It could be that this is the primary outcome of interest due to clinical relevance or it might be that tumor response, the typical outcome in phase II cancer trials, is not an option as all disease may have been irradiated or removed during surgery or the trial may be done only in patients who have experienced a complete response. Although designs developed for monitoring binomial proportions or modifications of these as described by Herndon [9] could also be used for these outcomes, they are not optimal. We have shown how results described by Lin et al [21] could be used to design efficient phase II trials for monitoring survival probabilities. We presented designs that minimized either the expected duration of accrual or the expected total study length under the null hypothesis. The costs of these designs are maximum sample sizes and total study lengths that are greater than the fixed sample values. However, the maximum duration of accrual and total study length are reached fairly quickly and then decrease with increasing time to the interim analysis. By considering all possible designs, one can choose a design that is almost optimal but which has a smaller sample size and maximum total study length than that of the fully optimal design.
The frequentist approach presented here represents one possible strategy for monitoring survival probabilities in a phase II trial. Other researchers have proposed rather elegant Bayesian approaches to this problem. Follmann and Albert [24] use a Dirichlet prior distribution for describing the probabilities of failure at discrete times. They show that the posterior distribution incorporating censored data is a mixture of Dirichlets, and they use simulations to estimate the posterior probability that an event rate exceeds some threshold. Cheung and Thall [25] present a method for monitoring survival probabilities based on an approximate posterior, and they apply stopping rules described in Thall and Simon [6] based on the approximate probabilities. In addition, they extend their methods to the more complicated case of composite events. Both of these methods have the advantage that they can be applied continuously.
Another approach to the design of phase II trials for monitoring survival probabilities would be to use parametric methods. The survival probability at a given time would be a function of the parameters of the parametric model. For example, assuming an exponential model, monitoring a survival probability at a given time is equivalent to monitoring the hazard parameter. Methods proposed by Case and Morgan [26] could then be used to obtain optimal durations of accrual and follow-up. This approach has the advantage of using events that occur before or after the time of interest, but the disadvantage of relying on a specific model.
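To make the exponential equivalence concrete: under S(x) = exp(-λx), a survival probability at time x maps one-to-one onto the hazard λ = -log S(x)/x, so H_{0}: S(x) ≤ S_{0} is the same hypothesis as λ ≥ λ_{0}. A minimal sketch with a hypothetical helper name, using the one-year survival values from the example:

```python
import math

def exponential_hazard(surv_prob, x):
    """Hazard rate of the exponential model whose survival at time x is surv_prob."""
    return -math.log(surv_prob) / x

hazard_null = exponential_hazard(0.35, 1.0)   # H0: one-year survival of 35%
hazard_alt = exponential_hazard(0.50, 1.0)    # H1: one-year survival of 50%
median_null = math.log(2) / hazard_null       # implied median survival in years
median_alt = math.log(2) / hazard_alt

print(round(hazard_null, 4), round(hazard_alt, 4))
print(round(median_null, 2), round(median_alt, 2))
```

The lower hazard corresponds to the better survival probability, so a one-sided test on the hazard runs in the opposite direction from the test on S(x).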
In the designs discussed above, we assumed a parametric distribution to obtain estimates of the information over time. Were we to have access to data or simply the Kaplan-Meier estimate of the survival distribution from a previous trial, we could use those data to obtain a nonparametric estimate of the information using equation (3). In that equation, the denominator would need to be multiplied by min[(t - X _{ i })/MDA,1]. This fully nonparametric approach has the conceptual advantage that when we design a study to compare nonparametric estimates of survival distributions, we frequently do not want to make assumptions about the form of S(.). Although one would like to know S(.) with certainty in planning a study, often we are faced with an estimate of a summary statistic such as a percentile, median, or mean and maybe an idea of the shape from a Kaplan-Meier plot without access to the data. Sometimes we have the data from a previous study but do not have confidence that the population is relevant to the planned study. Often the choice of developing a design using the parametric or nonparametric approach will depend on the existence of previous raw data and, if such data exist, our relative confidence in the parametric assumptions versus the applicability of the population in the previous study.
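The nonparametric route described here might be sketched as follows. The code assumes that the variance estimate is n times the sum, over failures observed by time x in the previous trial, of one over the squared number at risk, and that each term's denominator is multiplied once by min[(t - X_{i})/MDA, 1], as the paragraph above states. The function name and the toy data are hypothetical.

```python
import math

def projected_variance(times, events, x, t, mda):
    """Projected variance of the cumulative hazard estimate at calendar time t,
    built from fully followed data of a previous trial.

    times  -- follow-up times X_i from the previous trial
    events -- 1 if X_i is a failure, 0 if censored
    """
    n = len(times)
    total = 0.0
    for x_i, failed in zip(times, events):
        if failed and x_i <= x:
            at_risk = sum(1 for x_j in times if x_j >= x_i)
            accrued = min((t - x_i) / mda, 1.0)   # share of patients with >= x_i years on study
            if accrued <= 0:
                return math.inf                   # too early: no information yet
            total += 1.0 / (at_risk**2 * accrued)
    return n * total

# toy previous trial: one failure at 0.5 years, three patients censored at 1.2 years
prior_times, prior_events = [0.5, 1.2, 1.2, 1.2], [1, 0, 0, 0]
var_final = projected_variance(prior_times, prior_events, x=1.0, t=4.0, mda=3.0)
var_interim = projected_variance(prior_times, prior_events, x=1.0, t=2.0, mda=3.0)
print(var_final, var_interim, var_final / var_interim)
```

The ratio of the final to the interim variance is the projected information fraction at the interim look, which is the quantity needed to set the correlation between the two test statistics.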
Although the approach described above is applicable when the duration of accrual is long relative to the survival time of interest, it is not practical when the duration of accrual is short (say less than 1.5 times the survival time of interest). This is because there is little information available for estimating the survival probability before most of the patients are accrued to the study. If the accrual period is less than the time of interest, there is no design that leads to an expected duration of accrual that is less than the fixed sample value. In our experience of doing phase II trials in a single institution, it has been rare that fast accrual would have limited the applicability of these designs. However, it could be more of a problem with multicenter or cooperative group trials. A possible solution includes monitoring a different (earlier) time during the interim analysis as Lin et al [21] did in their example. Unfortunately, the connection with the primary end point of interest is not always clear.
Misspecification of the accrual rate can have a major effect on the operating characteristics of these designs. The misspecification of survival during the design phase has much less impact on the type I and II errors. However, it is clear that one needs to monitor the ongoing accrual (as is typically done in clinical trials) and make modifications to the design once the accrual rate is ascertained. The ultimate critical values can be modified once the actual estimate of the information is obtained during the interim analysis, and this strategy should still result in fairly efficient designs.
Declarations
Acknowledgements
Supported in part by grants P30-CA-12127 and U10-CA-81851 from the Public Health Service, National Institutes of Health.
References
- Gehan EA: The determination of the number of patients required in a preliminary and a follow-up trial of a new chemotherapeutic agent. Journal of Chronic Diseases. 1961, 13: 346-353.
- Fleming TR: One sample multiple testing procedure for phase II clinical trials. Biometrics. 1982, 38: 143-151.
- Chang MN, Therneau TM, Wieand HS, Cha SS: Designs for group sequential phase II clinical trials. Biometrics. 1987, 43: 865-874.
- Simon R: Optimal two-stage designs for phase II clinical trials. Controlled Clinical Trials. 1989, 10: 1-10. 10.1016/0197-2456(89)90015-9.
- Green SJ, Dahlberg S: Planned versus attained design in phase II clinical trials. Statistics in Medicine. 1992, 11: 853-862.
- Thall PF, Simon R: Practical Bayesian guidelines for phase IIB clinical trials. Biometrics. 1994, 50: 337-349.
- Chen TT: Optimal three-stage designs for phase II cancer clinical trials. Statistics in Medicine. 1997, 16: 2701-2711. 10.1002/(SICI)1097-0258(19971215)16:23<2701::AID-SIM704>3.0.CO;2-1.
- Heitjan DF: Bayesian interim analysis of phase II cancer clinical trials. Statistics in Medicine. 1997, 16: 1791-1802. 10.1002/(SICI)1097-0258(19970830)16:16<1791::AID-SIM609>3.3.CO;2-5.
- Herndon JE: A design alternative for two-stage, phase II, multicenter cancer clinical trials. Controlled Clinical Trials. 1998, 19: 440-450. 10.1016/S0197-2456(98)00012-9.
- Chen TT, Ng T-H: Optimal flexible designs in phase II clinical trials. Statistics in Medicine. 1998, 17: 2301-2312. 10.1002/(SICI)1097-0258(19981030)17:20<2301::AID-SIM927>3.0.CO;2-X.
- Colton T, McPherson K: Two-stage plans compared with fixed-sample-size and Wald SPRT plans. J Am Stat Assoc. 1976, 71: 80-86.
- Landis SH, Murray T, Bolden S, Wingo PA: Cancer Statistics, 1998. CA: A Cancer Journal for Clinicians. 1998, 48: 6-29.
- Lawrence T: Gemcitabine as a radiation sensitizer. Sem Oncol. 1995, 22: 68-71.
- Casper ES, Green MR, Kelsen DP, Heelan RT, Brown TD, Flombaum CD, Trochanowski B, Tarassoff PG: Phase II trial of gemcitabine (2'2'-difluoro-2'-deoxycytidine) in patients with adenocarcinoma of the pancreas. Investigational New Drugs. 1994, 12: 29-34.
- Burris HA, Moore MJ, Andersen J, Green MR, Rothenberg ML, Modiano MR, Cripps MC, Portenoy RK, Storniolo AM, Tarassoff R, Nelson R, Dorr FA, Stephens CD, Von Hoff DD: Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: a randomized trial. J Clin Oncol. 1997, 15: 2403-2413.
- Blackstock AW, Bernard SA: Twice weekly gemcitabine and concurrent radiation: laboratory studies supporting phase I clinical trials in pancreatic cancer. Can Conf. 1999, 3: 2-6.
- Kaplan EL, Meier P: Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958, 53: 457-481.
- Nelson W: Hazard plotting for incomplete failure data. J Quality Technology. 1969, 1: 27-52.
- Jennison C, Turnbull BW: Group Sequential Designs with Applications to Clinical Trials. Boca Raton, Chapman & Hall/CRC. 2000.
- Lan KKG, DeMets DL: Discrete sequential boundaries for clinical trials. Biometrika. 1983, 70: 659-663.
- Lin DY, Shen L, Ying Z, Breslow NE: Group sequential designs for monitoring survival probabilities. Biometrics. 1996, 52: 1033-1042.
- Donnelly TG: Algorithm 462 – Bivariate normal distribution [S15]. Comm ACM. 1973, 16: 638. 10.1145/362375.362414.
- Brent R: Algorithms for Minimization without Derivatives. Englewood Cliffs, Prentice Hall. 1973.
- Follmann DA, Albert PS: Bayesian monitoring of event rates with censored data. Biometrics. 1999, 55: 603-607.
- Cheung YK, Thall PF: Monitoring the rates of composite events with censored data in phase II clinical trials. Biometrics. 2002, 58: 89-97.
- Case LD, Morgan TM: Duration of accrual and follow-up for two-stage clinical trials. Lifetime Data Analysis. 2001, 7: 21-37. 10.1023/A:1009621009283.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/3/6/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.