
# Sample size re-assessment leading to a raised sample size does not inflate type I error rate under mild conditions

- Per Broberg

**13**:94

https://doi.org/10.1186/1471-2288-13-94

© Broberg; licensee BioMed Central Ltd. 2013

**Received:** 25 November 2012

**Accepted:** 15 July 2013

**Published:** 19 July 2013

## Abstract

### Background

One major concern with adaptive designs, such as the sample size adjustable designs, has been the fear of inflating the type I error rate. In (Stat Med 23:1023-1038, 2004) it is however proven that when observations follow a normal distribution and the interim results show promise, meaning that the conditional power exceeds 50%, the type I error rate is protected. This bound and the distributional assumptions may seem to impose undesirable restrictions on the use of these designs. In (Stat Med 30:3267-3284, 2011) the possibility of going below 50% is explored, and a region that permits an increased sample size without inflation is defined in terms of the conditional power at the interim.

### Methods

A criterion which is implicit in (Stat Med 30:3267-3284, 2011) is derived by elementary methods and expressed in terms of the test statistic at the interim to simplify practical use. Mathematical and computational details concerning this criterion are exhibited.

### Results

Under very general conditions the type I error rate is preserved under sample size adjustable schemes that permit a raise. The main result states that for normally distributed observations raising the sample size when the result looks promising, where the definition of promising depends on the amount of knowledge gathered so far, guarantees the protection of the type I error rate. Also, in the many situations where the test statistic approximately follows a normal law, the deviation from the main result remains negligible. This article provides details regarding the Weibull and binomial distributions and indicates how one may approach these distributions within the current setting.

### Conclusions

There is thus reason to consider such designs more often, since they offer a means of adjusting an important design feature at little or no cost in terms of error rate.


## Background

In the last few years interest in various adaptive trial designs has surged [1]. A greater flexibility of clinical study design and conduct has followed from the application of these new ideas [2]. In [1] adaptive designs are defined as:

“...a clinical trial design that allows adaptations or modifications to aspects of the trial after its initiation without undermining the validity or integrity of the trial.”

More and more trials of this sort are being reported, and regulatory bodies take an increasingly favourable view of them [3]. All stand to win if these designs come to optimal use [4]. However, some concerns have been raised. One of these involves the risk of inflating the type I error rate. The current article will assess that risk in the context of sample size adjustable (SSA) designs that allow choosing between raising the sample size, continuing as originally planned, or closing the trial due to futility.

The following will recapitulate parts of [5], add detail and draw conclusions for trial procedures. In that reference the authors show that if the interim results look promising, no inflation of type I error rate occurs. Here ’promising’ means that the conditional power at the current parameter estimate, i.e. the power updated by the accumulated knowledge at interim, amounts to at least 50%. This article will show that a less strict bound applies, in agreement with [6], exhibit the bound in terms of a test statistic, and present mathematical as well as computational aspects of it.

## Methods

### Assumptions

Denote the planned final sample size by $N_0$, the number of patients available at the pre-planned interim analysis by $n$, and the possible raise determined at the interim taking conditional power into account by $r$. Let us consider a one-sided test at level $\alpha$ based on observing $X_1,\dots,X_{N_{\text{final}}}$. Here $N_{\text{final}} = N_0$ or $N_{\text{final}} = N = N_0 + r$, depending on a decision taken during the course of the trial. The main result assumes a normal distribution, but as will be outlined, it still holds true for more general distributions. Further, assume the $X_i$ to be independent normal with mean $\theta$ and variance 1. The null hypothesis states that $\theta = 0$. Define the normalised test statistic $Z^{(x)}$ by $Z^{(x)}=\sum_{i=1}^{x}X_i/\sqrt{x}$. The test rejects if $Z^{(N_{\text{final}})}>z_\alpha$, where $z_\alpha$ is the $100 \times (1-\alpha)$ percentile of the standard normal distribution: $\Phi(z_\alpha) = 1-\alpha$ ($\Phi$ being the cumulative distribution function of the standard normal distribution).

The normalised test statistic $Z^{(n)}=\sum_{i=1}^{n}X_i/\sqrt{n}$ is observed when $n$ patients have provided data, and the Data Monitoring Committee (DMC) will in part base its recommendations on the observed value. At this interim analysis an adaptation may lead to closing the study due to futility, continuing the study without changes, or raising the sample size by recruiting an extra $r$ subjects, yielding a total of $N = N_0 + r$ subjects. Closing the study due to futility may only decrease the type I error rate. So let us, for the sake of argument, disregard that possibility, and show that the type I error rate still remains protected.

The study protocol will specify $n$ and $N_0$, and at the interim we will consider raising the final sample size based on the conditional power evaluated at the current parameter estimate. Since the objective is to assess if the interim results are promising, the current estimate of the parameter of interest gives the appropriate information [6]. As pointed out by Müller and Schäfer in [7], the over-all type I error can be preserved unconditionally under any general adaptive change, provided the conditional type I error that would have been obtained had there been no adaptation is preserved. This article however only considers the case of SSA. Unlike the situation in [8] the design does not permit sequential testing. Also, the article only considers the conventional hypothesis tests and p-values without adjustments.

We assess the conditional error rate as a function of *r*. By showing that the conditional type I error rate is bounded by the error rate which arises from the design without adaptation the unconditional error rate is proven to be controlled at a pre-specified level *α*.

### Derivation of the main result

We use the notation $X \sim N(\mu,\sigma^2)$ to signify that $X$ follows a normal law with mean $\mu$ and variance $\sigma^2$.

The change in type I error rate conditional on a sample size increase decided at the interim equals

$$G(r)=\Pr\left(Z^{(N_0+r)}>z_\alpha \mid Z^{(n)}=z,\theta=0\right)-\Pr\left(Z^{(N_0)}>z_\alpha \mid Z^{(n)}=z,\theta=0\right).$$

The conditional distribution equals $(Z^{(N)} \mid Z^{(n)}=z,\theta=0)\sim N(\rho z, 1-\rho^2)$, where

$$\rho=\mathrm{Cov}\left(\sum_{i=1}^{N}X_i/\sqrt{N},\ \sum_{j=1}^{n}X_j/\sqrt{n}\right)=\sum_{i=1}^{n}1/\sqrt{nN}=\sqrt{n/N},$$

and similarly for $Z^{(N_0)}$.
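The conditional error change $G(r)$ can be evaluated directly from the conditional normal distribution given above. The following R sketch is a minimal illustration only; the function name `G.normal`, its argument names and the design values are assumptions introduced here and are not taken from the article.

```r
# Change in conditional type I error when the final sample size is raised
# from N0 to N0 + r, given the interim statistic Z^(n) = z and theta = 0.
# Uses (Z^(N) | Z^(n) = z) ~ N(rho * z, 1 - rho^2) with rho = sqrt(n / N).
# G.normal is a hypothetical helper name.
G.normal <- function(z, n, N0, r, alpha = 0.025) {
  z.alpha <- qnorm(1 - alpha)
  cond.reject <- function(N) {
    rho <- sqrt(n / N)
    # Pr(Z^(N) > z.alpha | Z^(n) = z) under the null hypothesis
    pnorm((z.alpha - rho * z) / sqrt(1 - rho^2), lower.tail = FALSE)
  }
  cond.reject(N0 + r) - cond.reject(N0)
}

# Illustrative values only: interim after 55 of 110 planned observations
G.normal(z = 0.0, n = 55, N0 = 110, r = 40)  # unpromising interim result
G.normal(z = 1.5, n = 55, N0 = 110, r = 40)  # promising interim result
```

A positive value indicates that a raise would increase the conditional error rate at that interim value, while a non-positive value indicates that it would not.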

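Writing out both conditional probabilities by means of this distribution gives

$$G(r)=\Phi\left(\frac{z_\alpha-z\sqrt{n/N_0}}{\sqrt{1-n/N_0}}\right)-\Phi\left(\frac{z_\alpha-z\sqrt{n/(N_0+r)}}{\sqrt{1-n/(N_0+r)}}\right).$$

Now, in order to show this difference to be less than or equal to zero, it may equivalently be shown that the difference of the arguments is negative (in the sense of non-positive), since $\Phi$ is increasing. Denote this difference of arguments by $H(r)$. Obviously $H(0) = 0$.

Put $q = n/(N_0+r)$ and $V = (N_0+r)/N_0$ for arbitrary $n$, $N_0$ and $r$ satisfying $N_0 > n > 0$ and $r > 0$. Please note $qV = n/N_0$. Then we aim to show

$$H(r)=\frac{z_\alpha-z\sqrt{qV}}{\sqrt{1-qV}}-\frac{z_\alpha-z\sqrt{q}}{\sqrt{1-q}}\le 0.$$

Solve this inequality for $z$, to obtain $z \ge z_\alpha b(q,V)$ with

$$b(q,V)=\frac{\sqrt{1-q}-\sqrt{1-qV}}{\sqrt{qV(1-q)}-\sqrt{q(1-qV)}},$$

and set out to prove $b(q,V)\le \sqrt{qV}$. By subtracting $\sqrt{qV}$ from both sides and equating denominators we have

$$\frac{(1-qV)\sqrt{1-q}-(1-q\sqrt{V})\sqrt{1-qV}}{\sqrt{qV(1-q)}-\sqrt{q(1-qV)}}\le 0.$$

The denominator is positive because $V > 1$, so the numerator must be non-positive, that is $\sqrt{(1-q)(1-qV)}\le 1-q\sqrt{V}$. Both sides are positive, since $q\sqrt{V}=\sqrt{q \cdot qV}<1$. Squaring both sides, cancelling common terms and dividing by $q$, we finally have

$$(1-\sqrt{V})^2\ge 0,$$

which is true for all positive $V$.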

Now regard $b$ as a function of $r$ for $n$ and $N_0$ fixed. One may show that $b(r)\nearrow \sqrt{n/N_0}$ asymptotically as $r \searrow 0$. Further, $b$ decreases as $r$ grows, in a close to linear fashion. Also, $b(r)\searrow (1-\sqrt{1-qV})/\sqrt{qV}$ when $r$ tends to infinity.

Please note that since $b(q,V)\le \sqrt{qV}=\sqrt{n/N_0}$, a sufficient but not necessary condition is $z\ge z_\alpha\sqrt{n/N_0}$, which will be seen to give the conditional power 50% (the simple criterion). Consequently, this new criterion is less restrictive than the one presented in [5], and, importantly, changes with $r$. The reference [6] provides an example where the type I error rate remains intact although the conditional power drops to 36%.

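Given $Z^{(n)} = z$ and evaluated at the current estimate $\theta = z/\sqrt{n}$, the statistic $Z^{(N_0)}$ is normal with mean $z\sqrt{N_0/n} = z/\sqrt{qV}$ and variance $1-qV$. The conditional power at the boundary of the region $z > b(q,V)z_\alpha$ therefore equals

$$1-\Phi\left(\frac{z_\alpha\left(1-b(q,V)/\sqrt{qV}\right)}{\sqrt{1-qV}}\right).$$

With $b$ replaced by its upper bound $\sqrt{qV}$ this expression reduces to $1-\Phi(0)$, that is 50%.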

From the definition of *G*(*r*) it follows that one cannot go further without increasing the conditional error rate. In this sense the bound is optimal.
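As a numerical cross-check of this optimality remark, the sketch below locates the interim value at which the conditional error change vanishes and compares it with $b(q,V)z_\alpha$, where $b(q,V)=(\sqrt{1-q}-\sqrt{1-qV})/(\sqrt{qV(1-q)}-\sqrt{q(1-qV)})$ is the expression obtained by solving $H(r)\le 0$ for $z$. The function names `b.qV` and `G.of.z` and the design values are illustrative assumptions, not part of the article.

```r
# b(q, V): boundary factor obtained by solving H(r) <= 0 for z
b.qV <- function(q, V) {
  (sqrt(1 - q) - sqrt(1 - q * V)) /
    (sqrt(q * V * (1 - q)) - sqrt(q * (1 - q * V)))
}

# Conditional error change G as a function of the interim statistic z
G.of.z <- function(z, n, N0, r, z.alpha) {
  p.rej <- function(N) {
    pnorm((z.alpha - z * sqrt(n / N)) / sqrt(1 - n / N), lower.tail = FALSE)
  }
  p.rej(N0 + r) - p.rej(N0)
}

n <- 55; N0 <- 110; r <- 40          # illustrative design values
z.alpha <- qnorm(0.975)
q <- n / (N0 + r); V <- (N0 + r) / N0

# Root of G in z, found numerically, versus the closed-form boundary
root <- uniroot(G.of.z, interval = c(0, 3), n = n, N0 = N0, r = r,
                z.alpha = z.alpha, tol = 1e-12)$root
boundary <- b.qV(q, V) * z.alpha

# Bounds stated for b: (1 - sqrt(1 - qV)) / sqrt(qV) <= b <= sqrt(qV)
lower <- (1 - sqrt(1 - q * V)) / sqrt(q * V)
upper <- sqrt(q * V)
c(root = root, boundary = boundary, lower = lower, b = b.qV(q, V), upper = upper)
```

If the closed form is right, `root` and `boundary` should agree up to the tolerance of the root finder, and `b` should lie between `lower` and `upper`.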

### Weibull distributed survival time points

We will now study the situation where survival times follow a Weibull distribution and right censoring time points are exponentially distributed.

In [9] the details of an Edgeworth expansion of the distribution of the product limit estimator are given: $\Phi(x)-n^{-1/2}\varphi(x)\left(\tilde{\kappa}_3(x^2-1)+3\sigma_1\right)/6$. First some notation: $X$ = lifetime, $T$ = left truncation time point, $Y$ = right censoring time point, $Z = \min(X,Y)$, $\delta = I(X \le Y)$. Further, put $C(z) := P(T \le z \le Z \mid T \le Z)$. But since $T \equiv 0$ this probability equals $P(Z \ge z) = P(X \ge z, Y \ge z) = P(X \ge z)P(Y \ge z)$. Then $W_1(y) := P(Z \le y, \delta = 1) = P(X \le y, Y \ge X)$. With $\sigma_1(z)=\int_0^z \frac{dW_1(u)}{C^2(u)}$, the constant $\tilde{\kappa}_3$ in the Edgeworth expansion equals $\sigma_1^{-1}\left(-7.5\sigma_1^4+\int_0^z C^{-3}(t)\,dW_1(t)\right)$. As stated we assume $X \sim \mathit{Weib}(\lambda,\beta)$ and $Y \sim \mathit{Exp}(\mu)$. From this it follows that $C(y) = \exp(-\mu y-\lambda y^{\beta})$ and $W_1(y)=\int_0^y e^{-\mu x}\lambda\beta x^{\beta-1}e^{-\lambda x^{\beta}}\,dx$. Thus one may at the interim use parameter estimates to calculate a normal approximation to the conditional power. Alternatively, one may simulate the remainder of the trial.

A third option is to base the procedure on the logrank test, whose statistic converges to a normal distribution. Consider the situation where the time to some event is compared between patients in an active treatment group and those in a control group. Let $r_i$ refer to the number of patients remaining at time $i$ and $o_i$ refer to the number of observed events. Further, let $A$ refer to the active treatment group and $C$ to the control group. If $T=\sum_{i=1}^{k}\frac{r_{iA}o_{iC}-r_{iC}o_{iA}}{r_i}$ and $V=\sum_{i=1}^{k}\frac{o_i(r_i-o_i)r_{iA}r_{iC}}{(r_i-1)r_i^2}$, then $z=T/\sqrt{V}$ will asymptotically be standard normal, e.g. [10]. Hence one may apply the simple criterion to $z$ observed at the interim.
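The logrank-based statistic $z=T/\sqrt{V}$ can be computed from raw follow-up data along the following lines. This is a sketch under stated assumptions: the function name `logrank.z`, the argument names (`time`, `status`, `group`) and the simulated Weibull and exponential parameters are all hypothetical choices made here for illustration.

```r
# Logrank-type statistic z = T / sqrt(V) as defined in the text.
# time:   observed follow-up time (event or censoring)
# status: 1 = event observed, 0 = right-censored
# group:  "A" (active) or "C" (control)
logrank.z <- function(time, status, group) {
  event.times <- sort(unique(time[status == 1]))
  T.stat <- 0
  V.stat <- 0
  for (tt in event.times) {
    at.risk <- time >= tt
    r.A <- sum(at.risk & group == "A")   # at risk in the active group
    r.C <- sum(at.risk & group == "C")   # at risk in the control group
    o.A <- sum(time == tt & status == 1 & group == "A")
    o.C <- sum(time == tt & status == 1 & group == "C")
    r.i <- r.A + r.C
    o.i <- o.A + o.C
    T.stat <- T.stat + (r.A * o.C - r.C * o.A) / r.i
    if (r.i > 1) {  # the variance term is undefined when one patient remains
      V.stat <- V.stat + o.i * (r.i - o.i) * r.A * r.C / ((r.i - 1) * r.i^2)
    }
  }
  T.stat / sqrt(V.stat)
}

# Simulated illustration: Weibull lifetimes with exponential censoring
set.seed(1)
m <- 100
group <- rep(c("A", "C"), each = m)
lifetime <- c(rweibull(m, shape = 1.2, scale = 1.5),   # active
              rweibull(m, shape = 1.2, scale = 1.0))   # control
censor <- rexp(2 * m, rate = 0.3)
time <- pmin(lifetime, censor)
status <- as.integer(lifetime <= censor)
z <- logrank.z(time, status, group)
z
```

With this sign convention a positive $z$ corresponds to more events than expected in the control group, so the simple criterion would be applied to $z$ as computed.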

### Binomial proportion

For the sake of simplicity of exposition we focus attention on a single binomial proportion $p$ and a one-sided test at the 5% level. Let the null hypothesis and alternative hypothesis be $H_0 : p = p_0$, $H_1 : p > p_0$. Please note that for $X_{N_0}\sim \mathit{Bin}(N_0,p_0)$ the conditional distribution given $\{X_n = k\}$ is the same as that of $X_{N_0-n}+k$ with $X_{N_0-n}\sim \mathit{Bin}(N_0-n,p_0)$, and similarly for $X_{N_0+r}$.

Consider the following test statistics computed at the interim:

1. the score test statistic: $z=\sqrt{n}(\hat{p}-p_0)/\sqrt{p_0(1-p_0)}$, with $\hat{p}=k/n$
2. the log-odds: $z=\sqrt{n\hat{p}(1-\hat{p})}\left(\log(\hat{p}/(1-\hat{p}))-\log(p_0/(1-p_0))\right)$ [12]

The simple criterion would then say that if $z$ as above exceeds $\sqrt{n/N_0}\,z_\alpha$, then the procedure protects the type I error rate (unconditionally). But we set out to find a more accurate approximation.
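The two statistics and the simple criterion translate into a few lines of R. The sketch below is illustrative only: the function names `score.z`, `logodds.z` and `simple.criterion`, as well as the interim data, are assumptions introduced here.

```r
# Interim test statistics for H0: p = p0 versus H1: p > p0, given k events
# among n interim observations. Function names are illustrative assumptions.
score.z <- function(k, n, p0) {
  p.hat <- k / n
  sqrt(n) * (p.hat - p0) / sqrt(p0 * (1 - p0))
}

logodds.z <- function(k, n, p0) {
  p.hat <- k / n   # requires 0 < k < n so that the log-odds is finite
  sqrt(n * p.hat * (1 - p.hat)) *
    (log(p.hat / (1 - p.hat)) - log(p0 / (1 - p0)))
}

# Simple criterion: z must exceed sqrt(n / N0) * z.alpha
simple.criterion <- function(z, n, N0, alpha = 0.05) {
  z > sqrt(n / N0) * qnorm(1 - alpha)
}

# Hypothetical interim data: 18 events among n = 40, N0 = 80, p0 = 0.3
z1 <- score.z(18, 40, 0.3)
z2 <- logodds.z(18, 40, 0.3)
simple.criterion(c(score = z1, logodds = z2), n = 40, N0 = 80)
```

The log-odds version is undefined when $k = 0$ or $k = n$, which is one practical reason to prefer the score statistic at small interim sample sizes.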

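The change in conditional type I error rate given $\{X_n = k\}$ equals

$$G(r)=\Pr\left(X_{N_0+r-n}+k>q_{\alpha,N_0+r}\right)-\Pr\left(X_{N_0-n}+k>q_{\alpha,N_0}\right),$$

where $q_{\alpha,m}$ is the $100 \times (1-\alpha)$ percentile of $\mathit{Bin}(m,p_0)$.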

$X_n \sim \mathit{Bin}(n,p)$ admits a normal approximation of the pivotal statistic $U = (X_n - E[X_n])/SD(X_n)$, which coincides with the score test statistic above, such that

in terms of the third cumulant of *U*, which picks up the skewness. As a rule of thumb it is often said that the normal approximation is quite accurate when *np* and *n*(1 − *p*) both exceed 5. But this statement holds even without the correction with respect to skewness.

$G(r)$ defined above by

$\mu_n = np_0$, $\sigma_0=\sqrt{p_0(1-p_0)}$, and the third cumulant $(1-2p_0)/\sigma_0$ by $\gamma_0$. This quantity will deviate less than 1 from the true percentile for $n$ from 20 to 200, and $\min\{np_0, n(1-p_0)\} > 5$. Let us consider $G(r)$ through the pivotal quantities

Denote by $n_1$ the larger of the two sample sizes and by $n_2$ the smaller. After equating the denominators and noting $\mu_{n_i}-\mu_{n_i-n}=np_0$, $i=1,2$, the difference equals:

Please note that the first term corresponds to the expectation of the null distribution. Further, the second term will be negative if $p_0 > 0.5$, and the third will always be negative under the conditions of this paper. From this it follows that the normal approximation of $G(r)$ is non-positive for $k$ satisfying the above condition. Finally, invoke the fast convergence of the binomial distribution towards a normal law, which means that already 20 observations will make the normal approximation quite accurate, provided $\min\{np, n(1-p)\} > 5$. Simulations indicate that this decision rule is accurate already at an interim sample size $n$ as low as 20. However, the rule does not guarantee preservation of the conditional type I error rate for all $p$. Thus the conclusion is that for the binomial distribution there is no inflation of the unconditional type I error rate under the above conditions.

A total of 900000 simulations with $n$ from 20 to 100, $p_0$ picked randomly in $[5/n, 1-5/n]$, $k$ randomly generated from $\mathit{Bin}(n,p_0)$, $N_0 = 2n$ and $r = n$ gave a median and mean of $G(r)$ equal to −0.004762 and −0.004574, respectively, over the set defined by the inequality above. A set of similar simulations using the simple criterion ($k>n\left(p_0+\sqrt{p_0(1-p_0)/N_0}\,z_\alpha\right)$) gave median and mean equal to −0.02429 and −0.02389, respectively. Thus the simple criterion will be on the conservative side.
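A simulation of this kind under the simple criterion can be sketched in R as follows. The code is not the author's: the helper name `G.binom`, the seed, the much smaller number of replicates and the choice $\alpha = 0.05$ (taken from the 5% level mentioned earlier in this section) are assumptions, and the exact conditional error change is computed here as the difference of two binomial tail probabilities, with rejection when the final count exceeds the $100\times(1-\alpha)$ percentile of $\mathit{Bin}(m,p_0)$.

```r
# Exact change in conditional type I error for a binomial proportion when the
# sample size is raised from N0 to N0 + r, given X_n = k at the interim.
# G.binom is a hypothetical helper name.
G.binom <- function(k, n, N0, r, p0, alpha = 0.05) {
  p.rej <- function(m) {
    q.m <- qbinom(1 - alpha, m, p0)
    # Pr(k + X_{m - n} > q.m) with X_{m - n} ~ Bin(m - n, p0)
    pbinom(q.m - k, m - n, p0, lower.tail = FALSE)
  }
  p.rej(N0 + r) - p.rej(N0)
}

set.seed(2013)
n.sim <- 5000                      # far fewer replicates than in the article
n  <- sample(20:100, n.sim, replace = TRUE)
p0 <- runif(n.sim, 5 / n, 1 - 5 / n)
k  <- rbinom(n.sim, n, p0)
N0 <- 2 * n
r  <- n

# Keep only interim outcomes that satisfy the simple criterion
z.alpha <- qnorm(0.95)
keep <- k > n * (p0 + sqrt(p0 * (1 - p0) / N0) * z.alpha)
G <- G.binom(k[keep], n[keep], N0[keep], r[keep], p0[keep])
summary(G)
```

Because the seed, the number of replicates and possibly the test convention differ from those of the article, the summary need not reproduce the reported median and mean exactly.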

## Results

### Main result

In **Methods** the following result was derived.

#### A conditional power that guarantees preservation of nominal significance level

If the conditional power at an interim analysis, which uses $n$ out of $N_0$ planned observations and leads to a raise of $r$, equals at least

$$1-\Phi\left(\frac{z_\alpha\left(1-b(q,V)/\sqrt{qV}\right)}{\sqrt{1-qV}}\right), \tag{2}$$

where $q = n/(N_0+r)$, $V = (N_0+r)/N_0$, and

$$b(q,V)=\frac{\sqrt{1-q}-\sqrt{1-qV}}{\sqrt{qV(1-q)}-\sqrt{q(1-qV)}}, \tag{3}$$

then the type I error rate is preserved. The function $b$ satisfies the inequalities $(1-\sqrt{1-qV})/\sqrt{qV}\le b(q,V)\le \sqrt{qV}=\sqrt{n/N_0}$.

A more practical criterion, or rule of thumb, may be to derive a test statistic $z$ with close to a standard normal distribution under the null hypothesis, and check whether $z>\sqrt{n/N_0}\,z_\alpha$. This will be referred to as the simple criterion, and stems from [5]. More generally, the condition $z > b(q,V)z_\alpha$ suffices (cf. equations (11) and (12) in [6]). The conditional power bound in (2) decreases as $r$ increases, but the lower bound on $b$ implies a limit.

### Example

Take the example of $n = 55$, $N_0 = 110$, $r = 40$ and $\alpha = 0.025$, $z_\alpha = 1.96$. Then the minimum conditional power equals 43%, see next subsection. Thus a conditional power of considerably less than 50% is permissible from the point of view of type I error rate preservation. This may be good to know if the original sample size calculation was grossly wrong. Then recruiting more subjects than planned may resolve the issue without jeopardising the type I error rate. On the other hand, in such a situation the validity of the scientific hypotheses on which the trial design rests may be questioned, and the sponsor will have to judge whether the updated hypotheses suggest a commercially viable route. Nevertheless, in some cases raising the sample size will make sense, and may save the trial from unnecessary disaster.

Above we assume the variance to be known. If it is not we may estimate it and use for instance a *t*-test statistic which quickly converges to a normal as the sample size increases.

Examination of the *t*-test has provided evidence of a small degree of inflation [14]. In [15] further details of when inflation occurs are given. However, already at a sample size of 30 the *t*-distribution and the normal distribution appear almost identical.
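How close the two distributions are can be gauged by comparing one-sided critical values. The snippet below is a small illustration; taking the degrees of freedom as $n - 1$ assumes a one-sample $t$-test, which is an assumption made here and not a statement from the article.

```r
# One-sided critical values at alpha = 0.025: t-distribution versus normal.
# df = n - 1 assumes a one-sample t-test (illustrative assumption).
alpha <- 0.025
n <- c(10, 30, 100, 1000)
crit <- data.frame(n = n,
                   t.crit = qt(1 - alpha, df = n - 1),
                   z.crit = qnorm(1 - alpha))
crit
```

The gap between the two columns shrinks as $n$ grows, which is the sense in which the $t$-test statistic converges to a normal.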

### Calculations in R

One may regard $z_\alpha b(q,V)$ as a function of $(n, N_0, r)$ instead. We may explore the bound $z_\alpha b(n,N_0,r)$ through the R function *B.func*.

So, for instance, *CP.min(alpha1 = 0.025, n = 55, N0 = 110, r = 40)* = 0.43, and *B.func(n = 55, N0 = 110, r = 0.01)* = 0.7070907, which approximately equals $\sqrt{n/N_0}=\sqrt{55/110}\approx 0.7071068$. Also *CP.min(alpha1 = 0.025, n = 55, N0 = 110, r = 110)* = 0.3575873.
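The function bodies themselves do not appear in this text. The R sketch below is a reconstruction under two assumptions: that *B.func* returns $b$ as a function of $(n, N_0, r)$, and that *CP.min* returns the conditional power for the planned $N_0$ at the boundary $z = b\,z_\alpha$, with $\theta$ estimated by $z/\sqrt{n}$. Only the function and argument names follow the calls quoted above; the bodies are not the author's code.

```r
# b as a function of (n, N0, r); q = n / (N0 + r), so that q * V = n / N0
B.func <- function(n, N0, r) {
  q  <- n / (N0 + r)
  qV <- n / N0
  (sqrt(1 - q) - sqrt(1 - qV)) /
    (sqrt(qV * (1 - q)) - sqrt(q * (1 - qV)))
}

# Minimum conditional power that still preserves the type I error rate:
# conditional power for the planned N0 at z = b * z.alpha, with theta
# estimated by z / sqrt(n)
CP.min <- function(alpha1, n, N0, r) {
  z.alpha <- qnorm(1 - alpha1)
  b  <- B.func(n, N0, r)
  qV <- n / N0
  pnorm(z.alpha * (1 - b / sqrt(qV)) / sqrt(1 - qV), lower.tail = FALSE)
}

CP.min(alpha1 = 0.025, n = 55, N0 = 110, r = 40)
B.func(n = 55, N0 = 110, r = 0.01)
CP.min(alpha1 = 0.025, n = 55, N0 = 110, r = 110)
```

Under these assumptions the three calls agree with the values quoted above (about 0.43, 0.7070907 and 0.3575873), which supports the reconstruction without proving that it matches the original implementation.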

### Deviations from normal distribution

If we use non-normal data such as survival type of data, then it is often possible to approximate the test statistic by a normal variate. Many test statistics, e.g. those derived by the maximum likelihood method, converge quickly to a normal distribution when the sample size increases. This feature extends the relevance of the main result to measurements following other distributions than the normal.

In **Methods** we looked into the situation where a Kaplan-Meier (KM) estimate is used. The Edgeworth expansion of the distribution of the (standardised) KM estimator has the form $\Phi(x)-n^{-1/2}\varphi(x)\tilde{\kappa}_3(x^2-1)/6$, where $\tilde{\kappa}_3$ is specified in **Methods** [9, 16], $\Phi$ is the cumulative distribution function of a standard normal variate and $\varphi$ its frequency function. So if we express the change in conditional error rate ($G(r)$ in **Methods**) in terms of this expansion, the correction term to the difference between normal distribution functions will approach zero as $1/\sqrt{n}$. Assuming some parametric distribution, such as the Weibull distribution, one may work out the details regarding this approximation. Or, one may assess the deviation from normality through a simulation procedure.

For a single binomial proportion $p$ and a one-sided test of the null hypothesis $H_0 : p = p_0$ versus the alternative hypothesis $p > p_0$, it holds that if we at the interim observe $X_n = k$ satisfying

with $\sigma_0 = \sqrt{p_0(1-p_0)}$, $\gamma_0 = (1-2p_0)/\sigma_0$ and $n_1 = N_0 + r$, then inflation of the type I error rate will not occur. More precisely put: on average, over all possible outcomes, the procedure will preserve the type I error rate. However, the conditional error rate will not always fall below the nominal one.

## Discussion

There are operational issues with adaptive designs that must be addressed during the planning stage. In order to safeguard the integrity of the trial and avoid operational bias following an unblinded interim, precautions need to be put in place to limit access to both the results and even the analysis plans. The latter will specify the output and decision rules, but will leave open the possibility of including other information, such as external factors, in the final decision whether to stop for futility or to continue, and if so, whether or not to raise the sample size.

Further, a number of concerns have been raised involving the risk of violating statistical principles or a lack of efficiency compared to group sequential designs, e.g. [17–19].

However valid these objections may be, more and more practitioners have felt that the challenges are tractable and have found SSA designs an attractive option. For small biotechnology companies this option gives the possibility of starting a trial with rather limited resources, followed by an additional investment conditional on the interim results being promising. Also, the SSA design makes a lot of sense whenever a fixed-size design would have to rely on a quite limited amount of information regarding the primary variable.

Several references have argued the superiority of seamless phase II/III designs over the traditional phase II and III trials. Merging the two phases produces gains in valuable time [20], and, under reasonable conditions, saves sample size [21].

Earlier research has established that a conditional power at the interim analysis exceeding 50% implies that the conditional, and hence also the unconditional, type I error rate is preserved, cf. [5, 7]. Further, the reference [6] builds on [8] and others to identify a more general region where this happens. The region is identified through equations (11) and (12) in [6]. The derivation of the region relies on results for Brownian motion. Together these two equations implicitly define a bound that coincides with *b* in (3) above.

Further, one cannot use a lower bound without risking inflation of the conditional error rate, and thus one may not rely on the Müller-Schäfer principle of conditional error functions [7] (new does not exceed the original) to prove preservation of unconditional error rate^{1}. By virtue of the Müller-Schäfer principle of conditional error functions any interim decision rule, pre-defined or not, that does not violate this fundamental requirement will permit a redesign of the trial. So from this perspective the SSA designs described here are well behaved and offer great flexibility.

## Conclusions

This article has shown that the risk of compromising the nominal significance level of a statistical test by allowing a sample size increase during the course of a trial remains low and controllable. The conditional error rate and power provide key decision tools.

## Endnotes

^{1} Also, by reversing the order of terms in *G*(*r*) and tracing the same line of thought one may conclude that a sample size decrease is permissible when results are discouraging. But then it may make more sense to discontinue the trial due to futility.

## Declarations

### Acknowledgements

Thanks are due to the referee who found an error in an earlier version of the main result and drew my attention to crucial references I had overlooked.


## References

1. Chang M: Adaptive Design Theory and Implementation Using SAS and R. 2007, Boca Raton, Florida: Chapman and Hall/CRC.
2. Bretz F, Schmidli H, Konig F, Racine A, Maurer W: Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: general concepts. Biom J. 2006, 48 (4): 623-634. 10.1002/bimj.200510232.
3. FDA: Guidance for industry: adaptive design clinical trials for drugs and biologics. 2010, [http://www.fda.gov/downloads/Drugs/.../Guidances/ucm201790.pdf].
4. PhRMA: Working group on adaptive designs. Full white paper. Drug Inf J. 2006, 40: 421-484.
5. Chen YHJ, DeMets DL, Gordon Lan KK: Increasing the sample size when the unblinded interim result is promising. Stat Med. 2004, 23 (7): 1023-1038. 10.1002/sim.1688. [http://dx.doi.org/10.1002/sim.1688].
6. Mehta CR, Pocock SJ: Adaptive increase in sample size when interim results are promising: A practical guide with examples. Stat Med. 2011, 30 (28): 3267-3284. 10.1002/sim.4102. [http://dx.doi.org/10.1002/sim.4102].
7. Muller HH, Schafer H: A general statistical principle for changing a design any time during the course of a trial. Stat Med. 2004, 23 (16): 2497-2508. 10.1002/sim.1852. [http://dx.doi.org/10.1002/sim.1852].
8. Gao P, Ware J, Mehta C: Sample size re-estimation for adaptive sequential design in clinical trials. J Biopharmaceutical Stat. 2008, 18 (6): 1184-1196. 10.1080/10543400802369053. [http://www.tandfonline.com/doi/abs/10.1080/10543400802369053].
9. Wang Q, Jing BY: Edgeworth expansion and bootstrap approximation for studentized product-limit estimator with truncated and censored data. Commun Stat - Theory and Methods. 2006, 35 (4): 609-623. 10.1080/03610920500498840. [http://www.informaworld.com/10.1080/03610920500498840].
10. Whitehead J: The Design and Analysis of Sequential Clinical Trials. 1992, Chichester, West Sussex: Ellis Horwood, second edition.
11. R Development Core Team: R: A Language and Environment for Statistical Computing. 2010, R Foundation for Statistical Computing, Vienna, Austria, [http://www.R-project.org] [ISBN 3-900051-07-0].
12. Zhou X, Li C, Yang Z: Improving interval estimation of binomial proportions. Philos Trans R Society A: Math, Phys Eng Sci. 2008, 366 (1874): 2405-2418. 10.1098/rsta.2008.0037. [http://rsta.royalsocietypublishing.org/content/366/1874/2405].
13. DiCiccio TJ, Efron B: Bootstrap confidence intervals. Stat Sci. 1996, 11 (3): 189-212. [http://www.jstor.org/stable/2246110].
14. Friede T, Kieser M: Sample size recalculation in internal pilot study designs: a review. Biom J. 2006, 48 (4): 537-555. 10.1002/bimj.200510238. [http://dx.doi.org/10.1002/bimj.200510238].
15. Graf AC, Bauer P: Maximum inflation of the type 1 error rate when sample size and allocation rate are adapted in a pre-planned interim look. Stat Med. 2011, 30 (14): 1637-1647. 10.1002/sim.4230. [http://dx.doi.org/10.1002/sim.4230].
16. Chang MN: Edgeworth expansion for the kaplan-meier estimator. Commun Stat - Theory and Methods. 1991, 20 (8): 2479-2494. 10.1080/03610929108830645. [http://www.informaworld.com/10.1080/03610929108830645].
17. Burman CF, Sonesson C: Are flexible designs sound?. Biometrics. 2006, 62 (3): 664-669. 10.1111/j.1541-0420.2006.00626.x. [http://dx.doi.org/10.1111/j.1541-0420.2006.00626.x].
18. Tsiatis AA, Mehta C: On the inefficiency of the adaptive design for monitoring clinical trials. Biometrika. 2003, 90 (2): 367-378. 10.1093/biomet/90.2.367. [http://biomet.oxfordjournals.org/content/90/2/367.abstract].
19. Jennison C, Turnbull BW: Mid-course sample size modification in clinical trials based on the observed treatment effect. Stat Med. 2003, 22 (6): 971-993. 10.1002/sim.1457. [http://dx.doi.org/10.1002/sim.1457].
20. Walton M: Adaptive designs: opportunities, challenges and scope in drug development. PhRMA-FDA Workshop. 2006, [http://www.innovation.org/documents/File/Adaptive_Designs_Presentations/04_Walton_Adaptive_Designs_Trial_ Issues_Goals_and_Needs.pdf].
21. Bischoff W, Miller F: A seamless phase II/III design with sample-size re-estimation. J Biopharmaceutical Stat. 2009, 19 (4): 595-609. 10.1080/10543400902963193. [http://www.tandfonline.com/doi/abs/10.1080/10543400902963193].

### Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/13/94/prepub

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.