# Sequential boundaries approach in clinical trials with unequal allocation ratios

- Peyman Jafari^{1} (corresponding author)
- Seyyed Mohammad Taghi Ayatollahi^{1}†
- Javad Behboodian^{2}†

**6**:1

https://doi.org/10.1186/1471-2288-6-1

© Jafari et al; licensee BioMed Central Ltd. 2006

**Received:** 21 March 2005

**Accepted:** 13 January 2006

**Published:** 13 January 2006

## Abstract

### Background

In clinical trials, both unequal randomization and sequential analysis have ethical and economic advantages. In the single-stage design (SSD), however, if the sample size is not adjusted for unequal randomization, the power of the trial decreases, whereas with sequential analysis the power remains constant. Our aim was to compare the sequential boundaries approach with the SSD when the allocation ratio (R) differed from 1.

### Methods

We evaluated the influence of R, the ratio of the patients in experimental group to the standard group, on the statistical properties of two-sided tests, including the two-sided single triangular test (TT), double triangular test (DTT) and SSD by multiple simulations. The average sample size numbers (ASNs) and power (1-β) were evaluated for all tests.

### Results

Our simulation study showed that choosing R = 2 instead of R = 1 increases the sample size of the SSD by 12% and the ASN of the TT and DTT by the same proportion. Moreover, when R = 2, the TT and DTT achieve, relative to the adjusted SSD, the same well-known reductions in ASN that are observed relative to the SSD when R = 1. In addition, when R = 2, the TT and DTT yield smaller reductions in ASN relative to the unadjusted SSD than when R = 1, but they maintain the power of the test at its planned value.

### Conclusion

This study indicates that when the allocation ratio is not equal among the treatment groups, sequential analysis could indeed serve as a compromise between ethicists, economists and statisticians.

## Background

One of the key reasons for using sequential methods instead of the single-stage design (SSD) in planning clinical trials is that the expected number of patients is reduced while the pre-specified significance level and power are maintained. Sequential designs have become common practice in the interim monitoring of clinical trials because of their ethical and economic advantages. Investigators planning a clinical trial now have a wide range of sequential methods to choose from. These methods fall into two types: the boundaries approach and repeated significance tests [1–6]. In this paper we consider only the boundaries approach, including the triangular test (TT) and the double triangular test (DTT), because of their interesting properties [2, 7–9]. Sebille and Bellissant gave a clear account of the properties of the TT and DTT [7, 8, 10]. However, the properties of these methods are still unknown in some cases. In particular, sequential designs with unequal randomization ratios have not been dealt with in previous articles.

In a randomized controlled clinical trial with two treatments, it is a standard practice to have approximately equal-sized treatment groups since it maximizes the statistical power for a given total sample size. However, there are several research papers with topics on unequal randomization that have demonstrated the efficiency of this method in clinical trials [11–16]. They showed that in some trials, which compare a new treatment against a standard, using unequal randomization could be helpful from the ethical and economic viewpoints. Yet there is no consensus between ethicists and economists on the issue. On the other hand, since sequential designs have ethical and economic advantages per se, it seems reasonable to use these methods when investigators decide to randomize patients to the experimental and standard treatment groups in unequal ratios.

Hence, the purpose of this study is to assess, by multiple simulations, the effect of unequal randomization on the statistical properties of the TT, the DTT and the SSD adjusted for an unequal allocation ratio (SSD_{adj}), and on those of the unadjusted SSD using the formulas given by Pocock [16]. For all of these methods, the power and average sample size numbers (ASNs) were computed when patients were allocated to the experimental and standard treatments in different ratios.

## Methods

We follow notation similar to that used by Sebille and Bellissant [8]. Let θ be a measure of the difference between the experimental and standard treatments. The clinical trial can be viewed as a test of the null hypothesis of no treatment difference H_{0} (θ = 0) against the alternative that there is a difference H_{1} (θ ≠ 0). This parameter is defined such that θ = 0 when the treatments are equivalent, θ > 0 (${\text{H}}_{\text{1}}^{\text{+}}$) when the experimental treatment is better than the standard one, and θ < 0 (${\text{H}}_{\text{1}}^{\text{-}}$) when the experimental treatment is worse.

The trial considered here only involves the comparison of two normally distributed responses in two-sided tests. We defined the effect size as the difference between treatments in units of standard deviation, θ_{R} = (μ_{2}-μ_{1})/σ where μ_{1} and μ_{2} are the means for the standard and experimental groups, respectively, and σ is the common standard deviation (σ_{1} = σ_{2} = σ).

### Single stage design (SSD)

The traditional statistical approach in the analysis of clinical trials is the SSD with equal numbers of patients in each group. In this method, the sample size is computed at the design phase from the significance level (α), the difference of clinical interest (θ_{R}) and the power (1-β). In a two-group comparative study where the response measure is normally distributed, the total sample size is:

$N = 4\left(\frac{Z_{1-\alpha/2} + Z_{1-\beta}}{\theta_{R}}\right)^{2} \quad (1),$

where $\theta_{R} = \frac{\mu_{2} - \mu_{1}}{\sigma}$ and Z_{1-α/2} is the 100(1-α/2)th percentile of *N*(0,1), that is, Φ(Z_{1-α/2}) = 1 - α/2. Z_{1-β} is defined similarly.
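As a quick check of Formula (1), the following minimal sketch evaluates it with Python's standard library. The function name `ssd_total_sample_size` is only illustrative and does not come from the paper, and rounding to the nearest integer is an assumption made here because it appears to agree with the SSD (R = 1) values reported in the Results.

```python
from statistics import NormalDist


def ssd_total_sample_size(theta_r, alpha=0.05, beta=0.05):
    """Total sample size N from Formula (1): two-sided test, equal allocation."""
    quantile = NormalDist().inv_cdf  # standard normal quantile function
    z_alpha = quantile(1 - alpha / 2)  # Z_{1-alpha/2}
    z_beta = quantile(1 - beta)  # Z_{1-beta}
    return 4 * ((z_alpha + z_beta) / theta_r) ** 2


for theta_r in (0.4, 0.7, 1.0):
    n = ssd_total_sample_size(theta_r)
    # Rounding to the nearest integer is an assumption of this sketch.
    print(f"theta_R = {theta_r}: N = {n:.2f}, rounded to {round(n)}")
```

With θ_{R} = 0.7 and α = β = 0.05, this gives N of roughly 106, the value used for the SSD later in the paper.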

If N_{E} and N_{S} denote the numbers of patients assigned to the experimental and standard treatments, with N_{E} + N_{S} = N fixed, and $r = \frac{N_{E}}{N}$ denotes the proportion on the experimental treatment, then the power under H_{1} is given by [16]:

$\text{power} = \Phi\left\{2\left[\Phi^{-1}(1-\alpha) + \Phi^{-1}(1-\beta)\right] \times \sqrt{r(1-r)} - \Phi^{-1}(1-\alpha)\right\} \quad (2).$

In this formula Φ(·) denotes the cumulative distribution function of the standard normal distribution *N*(0,1).
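To illustrate how Formula (2) behaves, the sketch below evaluates it for a few allocation ratios. The helper names `power_fixed_total` and `proportion_from_ratio` are hypothetical, and the conversion r = R/(R + 1) assumes that R is the ratio of experimental to standard patients, as defined later in the Methods.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_fixed_total(r, alpha=0.05, beta=0.05):
    """Power from Formula (2) when N is fixed and a proportion r is on the experimental arm."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return STD_NORMAL.cdf(2 * (z_alpha + z_beta) * sqrt(r * (1 - r)) - z_alpha)


def proportion_from_ratio(ratio):
    """Convert the allocation ratio R = N_E / N_S into r = N_E / N."""
    return ratio / (ratio + 1)


for ratio in (1, 2, 3, 4):
    r = proportion_from_ratio(ratio)
    print(f"R = {ratio}: r = {r:.3f}, power = {power_fixed_total(r):.3f}")
```

With r = 1/2 the formula returns the planned power 1-β exactly, and the power falls as r moves away from 1/2 in either direction, since r(1-r) is symmetric about 1/2.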

However, if the investigator decides to allocate patients in unequal ratio and aims to achieve the pre-specified power, then the total sample size for SSD should be adjusted by a factor dependent on the allocation ratio. Therefore, the total sample size for SSD_{adj} is equal to [2]:

$N_{\text{adj}} = \frac{(R+1)^{2}}{R}\left(\frac{Z_{1-\alpha/2} + Z_{1-\beta}}{\theta_{R}}\right)^{2} = \frac{(R+1)^{2} N}{4R} \quad (3),$

where R is the ratio of patients in the experimental group to the standard group or the reverse ratio.
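The adjustment factor in Formula (3) can be motivated by a short variance argument. What follows is a sketch of the standard reasoning, not necessarily the derivation given in [2]. It assumes N_{E} = RN/(R+1), N_{S} = N/(R+1) and a common known σ, and the sample means $\bar{X}_{E}$ and $\bar{X}_{S}$ are notation introduced here for illustration. The variance of the difference in sample means is

$\operatorname{Var}(\bar{X}_{E} - \bar{X}_{S}) = \sigma^{2}\left(\frac{1}{N_{E}} + \frac{1}{N_{S}}\right) = \sigma^{2}\,\frac{(R+1)^{2}}{R\,N},$

which reduces to 4σ^{2}/N when R = 1. Requiring an unequal design of total size N_{adj} to have the same variance as an equal design of total size N gives

$\frac{(R+1)^{2}}{R\,N_{\text{adj}}} = \frac{4}{N} \quad \Longrightarrow \quad N_{\text{adj}} = \frac{(R+1)^{2} N}{4R},$

in agreement with Formula (3). For R = 2 the factor is 9/8, an increase of 12.5%, and replacing R by 1/R leaves the factor unchanged.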

Once the data have been collected, the statistical analysis is conducted. Based on the SSD or SSD_{adj} we cannot stop an ongoing trial before inclusion of a predetermined sample size, even if the early data show a clear difference between treatments.

### Boundaries approach: triangular and double triangular tests (TT and DTT)

The triangular tests can be divided into two classes based on their power function. The power function, denoted by C(θ), is defined as the probability that H_{0} is rejected when the true value of the parameter is θ. When the true treatment difference is θ, C^{+}(θ) and C^{-}(θ) are the probabilities of concluding that the experimental treatment is significantly better and significantly worse than the standard, respectively. Based on this definition, two alternative power requirements can be specified: power requirement I and power requirement II. The TT is designed to satisfy power requirement I. In this situation C^{+}(θ_{R}) = 1-β, but no specification is made for C^{-}(-θ_{R}), and C^{-}(θ_{R}) is usually negligible. The DTT, on the other hand, is designed to satisfy power requirement II. In this situation C^{+}(θ_{R}) = C^{-}(-θ_{R}) = 1-β, and both C^{+}(-θ_{R}) and C^{-}(θ_{R}) are negligible [2, 17].

### Simulation study

We studied the ASN for the TT and DTT by multiple simulations in PEST3 [17]. Our simulation design was very similar to that used by Sebille and Bellissant [8]. For each situation studied, we generated 30,000 independent comparative trials in which patient responses were drawn from a normal distribution with a standard deviation of 5 and a mean μ_{1} of 10 in the standard group. The influences of different values of β and θ_{R} (and hence μ_{2}) on the statistical properties of all tests were evaluated. The total number of patients at each interim analysis (n) was 12. We also evaluated the influence of the allocation ratio (R), defined as the ratio of the number of patients in the experimental group to the number in the standard group. Specifically, we chose two values for β (0.05 and 0.1), seven values for θ_{R} (0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0), one value for n (n = 12) and two values for R (1 and 2). The value of α was set to 0.05 for all simulated trials. We also calculated the required sample size for the SSD and SSD_{adj} for the same values of θ_{R}, R, β and α as for the TT and DTT, using Formulas (1) and (3), respectively. Moreover, we simulated the required ASN and power for the TT (n = 12), DTT (n = 12) and two-sided SSD_{adj} for different values of R when θ_{R} = 0.7 and β = α = 0.05. For the SSD, the required sample size and power were calculated for the same values of R, θ_{R}, α and β using Formulas (1) and (2).
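The sequential simulations were run in PEST3 and are not reproduced here. As a rough illustration of the fixed-sample part of the design only, the sketch below estimates by Monte Carlo the power of an unadjusted two-sided SSD under unequal allocation, using the stated μ_{1} = 10 and σ = 5. Several choices are assumptions of this sketch rather than details from the paper: a z-test with known σ, drawing each arm's sample mean directly instead of individual responses, the rounding of group sizes, the seed, and the function name `simulate_ssd_power`.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_ssd_power(total_n, ratio, theta_r, mu_standard=10.0, sigma=5.0,
                       alpha=0.05, n_trials=20000, seed=1):
    """Estimate the power of a two-sided z-test with a fixed total sample size."""
    rng = random.Random(seed)
    # Split the fixed total between arms; the rounding rule is an assumption.
    n_standard = round(total_n / (ratio + 1))
    n_experimental = total_n - n_standard
    mu_experimental = mu_standard + theta_r * sigma
    std_error = sigma * sqrt(1 / n_experimental + 1 / n_standard)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_trials):
        # Draw each arm's sample mean directly from its sampling distribution.
        mean_e = rng.gauss(mu_experimental, sigma / sqrt(n_experimental))
        mean_s = rng.gauss(mu_standard, sigma / sqrt(n_standard))
        if abs(mean_e - mean_s) / std_error > critical:
            rejections += 1
    return rejections / n_trials


power_equal = simulate_ssd_power(total_n=106, ratio=1, theta_r=0.7)
power_unequal = simulate_ssd_power(total_n=106, ratio=2, theta_r=0.7)
print(f"R = 1: {power_equal:.3f}, R = 2: {power_unequal:.3f}")
```

Under these assumptions the estimated power is close to the planned 0.95 when R = 1 and noticeably lower when R = 2 with the same total of 106 patients, which is the loss that the adjusted sample size is meant to prevent.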

## Results

Table 1 shows the sample sizes and ASNs under H_{0}, H_{1} and θ = θ_{R}/2 for different values of θ_{R} and β for the SSD when R = 1, the SSD when R = 2 (SSD_{adj}), the TT when R = 1 and when R = 2, and the DTT when R = 1 and when R = 2. The ASNs under H_{0}, H_{1} and θ = θ_{R}/2 were smaller for all sequential tests than for the SSD (R = 1), whatever values of θ_{R}, β and R were considered. Indeed, as compared with the SSD (R = 1), there were decreases of approximately 39% and 20% under H_{0}, 29% and 28% under H_{1}, and 14% and 12% under θ = θ_{R}/2 in the ASNs for the TT (R = 2) and DTT (R = 2), respectively. Moreover, Table 1 shows that choosing R = 2 instead of R = 1 increases the sample size of the SSD by approximately 12% and the ASN of the TT and DTT by the same proportion under H_{0}, H_{1} and θ = θ_{R}/2. On the other hand, as compared with the SSD when R = 2 (SSD_{adj}), there were decreases of approximately 46% and 29% under H_{0}, 37% and 36% under H_{1}, and 24% and 21% under θ = θ_{R}/2 in the ASNs for the TT (R = 2) and DTT (R = 2), respectively.

Table 1. ASN required to reach a conclusion under H_{0}/H_{1} and when θ = θ_{R}/2 for the TT (R = 1), TT (R = 2), DTT (R = 1) and DTT (R = 2), and sample size for the SSD (R = 1) and SSD (R = 2), for different values of θ_{R} and β (α = 0.05, n = 12).

θ_{R} | β | SSD (R = 1) | SSD (R = 2) | TT (R = 1): ASN (H_{0}/H_{1}) | TT (R = 1): ASN (θ_{R}/2) | TT (R = 2): ASN (H_{0}/H_{1}) | TT (R = 2): ASN (θ_{R}/2) | DTT (R = 1): ASN (H_{0}/H_{1}) | DTT (R = 1): ASN (θ_{R}/2) | DTT (R = 2): ASN (H_{0}/H_{1}) | DTT (R = 2): ASN (θ_{R}/2)
---|---|---|---|---|---|---|---|---|---|---|---
0.4 | 0.05 | 325 | 366 | 168/185 | 239 | 188/208 | 269 | 220/184 | 247 | 248/208 | 273
0.4 | 0.1 | 263 | 296 | 135/164 | 191 | 152/184 | 215 | 179/165 | 199 | 201/185 | 224
0.5 | 0.05 | 208 | 234 | 108/120 | 155 | 122/134 | 173 | 143/120 | 159 | 160/135 | 179
0.5 | 0.1 | 168 | 189 | 88/107 | 124 | 98/120 | 139 | 116/108 | 129 | 129/120 | 145
0.6 | 0.05 | 144 | 163 | 77/85 | 109 | 86/95 | 122 | 100/86 | 112 | 113/95 | 125
0.6 | 0.1 | 117 | 131 | 62/76 | 87 | 69/85 | 98 | 82/76 | 90 | 92/85 | 102
0.7 | 0.05 | 106 | 119 | 57/64 | 81 | 64/71 | 91 | 75/64 | 84 | 84/72 | 93
0.7 | 0.1 | 86 | 97 | 47/57 | 65 | 52/64 | 73 | 61/57 | 68 | 68/64 | 76
0.8 | 0.05 | 81 | 91 | 45/50 | 63 | 50/56 | 70 | 59/50 | 65 | 63/56 | 71
0.8 | 0.1 | 66 | 74 | 37/45 | 51 | 41/50 | 57 | 46/44 | 51 | 53/50 | 59
0.9 | 0.05 | 64 | 72 | 36/41 | 51 | 40/45 | 56 | 46/41 | 51 | 52/45 | 58
0.9 | 0.1 | 52 | 58 | 30/37 | 41 | 33/41 | 46 | 39/37 | 42 | 43/41 | 48
1.0 | 0.05 | 52 | 58 | 30/34 | 42 | 34/38 | 47 | 39/35 | 43 | 44/38 | 48
1.0 | 0.1 | 42 | 47 | 25/31 | 34 | 28/34 | 38 | 29/29 | 32 | 36/34 | 39
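The percentage reductions quoted in the Results can be checked against the tabulated values. The sketch below does this for one column, the ASN of the TT (R = 2) under H_{0}, assuming that each quoted percentage is a simple average over the 14 rows of the table. The text does not state how the figures were aggregated, so that averaging rule and the helper name `mean_percent_reduction` are assumptions.

```python
# (SSD R = 1, SSD R = 2, TT R = 2 under H0), copied from the table, one tuple per row.
ROWS = [
    (325, 366, 188), (263, 296, 152),
    (208, 234, 122), (168, 189, 98),
    (144, 163, 86), (117, 131, 69),
    (106, 119, 64), (86, 97, 52),
    (81, 91, 50), (66, 74, 41),
    (64, 72, 40), (52, 58, 33),
    (52, 58, 34), (42, 47, 28),
]


def mean_percent_reduction(pairs):
    """Average of 100 * (1 - asn / reference) over (reference, asn) pairs."""
    reductions = [100 * (1 - asn / reference) for reference, asn in pairs]
    return sum(reductions) / len(reductions)


vs_ssd = mean_percent_reduction([(ssd, tt) for ssd, _, tt in ROWS])
vs_ssd_adj = mean_percent_reduction([(adj, tt) for _, adj, tt in ROWS])
print(f"TT (R = 2) under H0: {vs_ssd:.1f}% below SSD, {vs_ssd_adj:.1f}% below SSD_adj")
```

Under this averaging assumption the two results land close to the approximately 39% and 46% reported for the TT (R = 2) under H_{0}.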

The sample size of the SSD_{adj} and the ASNs of the TT (n = 12) and DTT (n = 12) were examined as a function of R when θ_{R} = 0.7 and β = 0.05. The ASN curves for the TT and DTT under ${\text{H}}_{\text{1}}^{\text{+}}$ always stayed beneath the sample size required by the SSD_{adj} and were similar to one another. Indeed, as compared with the SSD_{adj}, there were decreases of approximately 39.5%, 39.5%, 40.4%, 41.5%, 42% and 42.7% in the ASNs for the TT and DTT when R was equal to 1, 2, 3, 4, 5 and 9, respectively. Also, for R ≤ 4, the ASN curves of the TT and DTT stayed beneath the sample size required by the SSD (N = 106). Indeed, as compared with the SSD, the ASNs for the TT and DTT decreased by approximately 39.5%, 32%, 21% and 8.5% when R was equal to 1, 2, 3 and 4, respectively. On the other hand, for R ≥ 2 the sample size curve of the SSD_{adj} remained above the sample size required by the SSD. Indeed, as compared with the SSD, the sample size of the SSD_{adj} increased by approximately 12.3%, 33%, 56.6%, 80% and 178% when R was equal to 2, 3, 4, 5 and 9, respectively.
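The increases reported for the SSD_{adj} relative to the SSD follow directly from Formula (3), since N_{adj}/N = (R + 1)^{2}/(4R). A minimal sketch, with the illustrative function name `inflation_factor`, is:

```python
def inflation_factor(ratio):
    """N_adj / N from Formula (3): (R + 1)^2 / (4R)."""
    return (ratio + 1) ** 2 / (4 * ratio)


for ratio in (1, 2, 3, 4, 5, 9):
    increase = 100 * (inflation_factor(ratio) - 1)
    print(f"R = {ratio}: SSD_adj is {increase:.1f}% larger than SSD")
```

The exact factors correspond to increases of about 12.5%, 33%, 56%, 80% and 178% for R = 2, 3, 4, 5 and 9. The small differences from the reported figures presumably come from rounding sample sizes to whole patients, although the text does not say so.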

For the TT, DTT and SSD_{adj}, where there is no limit on patient recruitment, the power curves remain constant.

## Discussion

Sequential methods and unequal randomization are two different techniques in clinical trials, each with its own ethical and economic advantages. However, no previous study has combined unequal randomization with sequential analyses. In other words, the debates concerning unequal randomization were restricted to the SSD [11–16], and sequential analyses were only discussed in situations where patients were equally randomized between the treatment groups [7, 8]. Sebille and Bellissant [8] showed that, of the one-sided sequential tests, the one-sided TT (R = 1) offers a substantial decrease in sample size compared with the one-sided SSD (R = 1), namely 40% under H_{0} and H_{1} and 25% under θ = θ_{R}/2. In addition, they showed that the two-sided TT (R = 1) reaches a two-sided conclusion with far fewer patients than the DTT (R = 1) and the two-sided SSD (R = 1), but at the expense of a large loss of power under ${\text{H}}_{\text{1}}^{\text{-}}$ [7]. On the other hand, according to Avins [13], Edwards [14] and Pocock [16], unbalanced randomization has ethical advantages since more patients are randomized to what is thought to be the superior therapy. Also, Torgerson and Campbell [11, 12] showed that when research costs differ between treatments and there is no constraint on total sample size, it is more cost-effective to randomize more patients to the less expensive treatment. However, Pocock [16] showed that, when there is a ceiling on total sample size, unequal randomization leads to a reduction in statistical power. Hence, we expected that, in the practical situation in which the sample size of the SSD cannot be adjusted, the TT (R = 2) and DTT (R = 2) would, compared with the unadjusted SSD, decrease the sample size while maintaining the power of the trial at its planned value. As an important result, our simulation study showed that even at the maximum ASN, which occurs at θ = θ_{R}/2, the TT (R = 2) and DTT (R = 2) have smaller ASNs than the SSD (R = 1).

Also, before the start of the work, we could not estimate how much using R = 2 instead of R = 1 would increase the sample size in the TT, DTT and SSD. However, based on our findings, choosing R = 2 instead of R = 1 increases the sample size equally in the sequential methods and in the SSD, by up to 12%. Nevertheless, when the costs of the two treatments are very different, allocating more patients to the cheaper treatment in the TT and DTT will compensate for this increase in sample size and substantially decrease the total cost of the trial.

However, some characteristics of our study should be noted. Firstly, to present a fair comparison, we evaluated the statistical properties of the two-sided TT and DTT only under H_{0}, ${\text{H}}_{\text{1}}^{\text{+}}$ and θ = θ_{R}/2, because under these hypotheses their power functions are identical [7]. Secondly, we did not compare the one-sided TT with the two-sided tests, because such a comparison is quite controversial in the literature [18, 19].

## Conclusion

This study shows that if patients are allocated unequally among the treatment groups in the SSD and the sample size ceiling cannot be raised to maintain the power of the trial because of economic restrictions, then combining sequential analysis with unequal randomization can, compared with the SSD, be a compromise between statistical, ethical and economic requirements.

## Declarations

### Acknowledgements

We are thankful to Keivan Shalileh for his comments and reading the final draft. We are also thankful to the referees for their invaluable comments. The financial support for this study was provided by Shiraz University of Medical Sciences, Shiraz, Iran.

## References

1. Jennison C, Turnbull BW: Group Sequential Methods with Applications to Clinical Trials. 2000, London: Chapman and Hall
2. Whitehead J: The design and analysis of sequential clinical trials. 1997, Chichester: John Wiley
3. Wang SK, Tsiatis AA: Approximately optimal one-parameter boundaries for group sequential trials. Biometrics. 1987, 43: 193-199.
4. Kim K, DeMets DL: Design and analysis of group sequential tests based on the type I error spending rate function. Biometrika. 1987, 74: 149-154.
5. DeMets DL, Lan KKG: Interim Analysis: The alpha spending function approach. Stat Med. 1994, 13: 1341-1352.
6. Pampallona S, Tsiatis AA: Group sequential designs for one-sided and two-sided hypothesis testing with provision for early stopping in favor of the null hypothesis. J Stat Plann Inf. 1994, 42: 19-35. 10.1016/0378-3758(94)90187-2.
7. Sebille V, Bellissant E: Comparison of the two-sided single triangular test to the double triangular test. Control Clin Trials. 2001, 22: 503-514. 10.1016/S0197-2456(01)00154-4.
8. Sebille V, Bellissant E: Comparison of four sequential methods allowing for early stopping of comparative clinical trials. Clin Sci. 2000, 98: 569-578. 10.1042/CS19990336.
9. Whitehead J, Todd S: The double triangular test in practice. Pharmaceut Statist. 2004, 3: 39-50. 10.1002/pst.91.
10. Sebille V, Bellissant E: Sequential methods and group sequential designs for comparative clinical trials. Fundam Clin Pharmacol. 2003, 17: 505-516. 10.1046/j.1472-8206.2003.00192.x.
11. Torgerson DJ, Campbell MK: Use of unequal randomisation to aid the economic efficiency of clinical trials. BMJ. 2000, 321: 759. 10.1136/bmj.321.7263.759.
12. Torgerson DJ, Campbell MK: Unequal randomisation can improve economic efficiency of clinical trials. J Health Serv Res Policy. 1997, 2: 81-85.
13. Avins AL: Can unequal be more fair? Ethics, subject allocation, and randomised clinical trials. J Med Ethics. 1998, 24: 401-408.
14. Edwards S, Braunholtz D: Can unequal be more fair? A response to Andrew Avins. J Med Ethics. 2000, 26: 179-182. 10.1136/jme.26.3.179.
15. Sposto M, Krailo MD: Use of unequal allocation in survival trials. Stat Med. 1987, 6: 119-126.
16. Pocock SJ: Allocation of patients to treatment in clinical trials. Biometrics. 1979, 35: 183-197.
17. Brunier H, Whitehead J: PEST 3.0 Operating Manual. 1993, Reading University
18. Sebille V, Bellissant E: Letter to the Editor. Control Clin Trials. 2002, 23: 423-424. 10.1016/S0197-2456(02)00219-2.
19. Whitehead J: Letter to the Editor. Control Clin Trials. 2002, 23: 422-423. 10.1016/S0197-2456(02)00211-8.

### Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/6/1/prepub

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.