Skip to main content

Evaluation of biases present in the cohort multiple randomised controlled trial design: a simulation study

An Erratum to this article was published on 22 March 2017



The cohort multiple randomised controlled trial (cmRCT) design provides an opportunity to incorporate the benefits of randomisation within clinical practice; thus reducing costs, integrating electronic healthcare records, and improving external validity. This study aims to address a key concern of the cmRCT design: refusal to treatment is only present in the intervention arm, and this may lead to bias and reduce statistical power.


We used simulation studies to assess the effect of this refusal, both random and related to event risk, on bias of the effect estimator and statistical power. A series of simulations were undertaken that represent a cmRCT trial with time-to-event endpoint. Intention-to-treat (ITT), per protocol (PP), and instrumental variable (IV) analysis methods, two stage predictor substitution and two stage residual inclusion, were compared for various refusal scenarios.


We found the IV methods provide a less biased estimator for the causal effect when refusal is present in the intervention arm, with the two stage residual inclusion method performing best with regards to minimum bias and sufficient power. We demonstrate that sample sizes should be adapted based on expected and actual refusal rates in order to be sufficiently powered for IV analysis.


We recommend running both an IV and ITT analyses in an individually randomised cmRCT as it is expected that the effect size of interest, or the effect we would observe in clinical practice, would lie somewhere between that estimated with ITT and IV analyses. The optimum (in terms of bias and power) instrumental variable method was the two stage residual inclusion method. We recommend using adaptive power calculations, updating them as refusal rates are collected in the trial recruitment phase in order to be sufficiently powered for IV analysis.

Peer Review reports


Clinical trials are facing increasing costs, complexities, and poor accrual, all threatening the viability of the current clinical trial model [1]. There is a vital need for trial designs that promote increased external validity and cost efficiency so we can continue to deliver medical benefits to patients [2]. To this end, there have been frequent calls for more pragmatic generalizable trials [3, 4]. A simplification of trial eligibility criteria and integration of routine electronic health records to monitor trial outcomes should play a key role in the development of pragmatic trials [1, 3]. The cohort multiple randomised controlled trial design [5] (cmRCT, also referred to as trial within cohorts design) is a simple and efficient pragmatic trial design that could address some of the issues discussed. In a cmRCT design we identify a large observational cohort, and at the outset consent is obtained from participants both for random allocation to interventions in the future, and to use their medical records data. For a given intervention, eligible patients are identified from the cohort and a random selection are offered the intervention. The remaining eligible patients are assigned to the control group and are not formally recruited into the trial, thus addressing some of the issues with trial accrual.

The cmRCT design offers several advantages. First, the use of a large observational cohort enables multiple trials over time and as patients all come from the same cohort comparability of these trials is improved [5]. Second, electronic healthcare records reduce recruitment and follow-up costs [6], making it possible to have a large control arm requiring minimal additional costs for follow-up. Third, both the intervention and standard of care are delivered at the point of care enabling us to assess the effectiveness of both interventions in real life conditions as opposed to under strict trial conditions.

A cmRCT design has several limitations [5, 7]. The use of routine data will require a mix of skill sets from both clinical trial and observational research, possibly increasing costs. Loss to follow-up may be a significant issue if linkage of routine data linkage is not possible. The comparator is standard of care, this may contain a variety of treatments with some self-selected by the patient so is somewhat an unknown. The feasibility of standardised outcome measures may also be challenging when using routine data. Blinding of participants in the intervention is not possible in the cmRCT design. If the cohort of eligible patients is too small in a cmRCT and refusal is high, it is not possible to keep recruiting until the required power is achieved as this will start oversampling from the control arm leading to a less powerful trial than the standard two arm. A notable limitation is that only patients randomly selected to receive the experimental intervention have the option to refuse the allocated treatment. This refusal may differ from the refusal we see in routine care due to the experimental nature of the study. Refusal can potentially lead to bias in the treatment effect estimate and loss of power. Though the cmRCT design is a recent proposal, its uptake has been fairly broad in application [812] and ongoing cmRCTs have scarcely considered the matter of refusal. Where mention of intended analysis methods were available, intention-to-treat was always used. Only one trial made explicit assumptions about the proportions of refusers in the intervention arm, assuming a 20% refusal rate and planned to use complier average causal effect (CACE) [13] analysis if refusal in the intervention arm was large [10], another noting CACE analysis as a secondary analysis method [9]. CACE analysis provides an unbiased treatment effect estimate for individuals who comply with the protocol, unlike intention-to-treat or per-protocol; however, there is generally a loss of power.

The effects of non-uptake of the experimental intervention in cmRCT are unclear. In this study, we undertook a simulation study to test the validity of the cmRCT design in the presence of treatment refusal. The objectives were to assess the bias present in the cmRCT design due to differential refusals and any loss in statistical power. We identify scenarios where cmRCT are viable designs for point-of-care trials and suggest alternate statistical methods, such as instrumental variable analysis, which can be used to reduce bias.

Randomised trials are often divided into two types, individually randomised – individuals randomized to trial arm - or cluster randomized trials – clusters (groups of individuals or communities) randomized to trial arm. In certain scenarios cluster trials can be more appropriate (for instance when we anticipate contamination between trial arms), they can also improve accrual and reduce costs. However, cluster trials require specific design, statistical analysis and interpretation and are not always beneficial. In cluster trials we generally anticipate an inflation of the variance due to correlation among individuals in the same cluster, therefore, they require larger sample sizes than individually randomized trials designed to answer the same research question (if an individually randomized trial is possible) and more complex analysis to account for the correlation of outcomes within a cluster. This paper focusses on individually randomised cmRCTs, representative of the randomization used in the original cmRCT design and existing cmRCT studies [5]. The case of a cluster cmRCT has been investigated by Pate et al. [14]; differing effects of the refusal on power were found between the cluster and individual trial scenarios. As sample sizes for trials using the cmRCT design will be limited to the size of the cohort, the benefits and requirements for cluster level randomization should be considered with caution.


Data were simulated to represent hypothetical trials using a cmRCT design with randomisation of individual patients. The cohort of patients was those receiving standard of care for a particular condition. The trial assessed the effect of an intervention compared with standard of care, with an outcome of time to cardiovascular disease (CVD) event and effect estimate the hazard ratio. The simulations replicated a cmRCT study which assessed the effect of an intervention compared with standard of care for patients at risk of cardiovascular disease (CVD) and eligible for lipid lowering drugs [14]. The outcome of interest was the time to CVD event. We define treatment effect, the parameter of interest, as the complier average causal effect (defined counterfactually) of the treatment (taken per protocol) on (hazard of) outcome. Refusal of the intervention and correlation between this refusal and baseline risk were varied to create different trial scenarios. It was assumed that the underlying risk for the outcome of interest varied between patients. We used bias (error in the estimation of the causal treatment effect), statistical power, and standard error of the mean treatment effect as measures of performance.

Analysis methods

We assessed the bias and variance of effect estimators associated with different analysis methods when treatment refusal was present in the intervention arm of a cmRCT study. The methods included: intention-to-treat (ITT), per protocol (PP) and two instrumental variable (IV) analysis techniques. Current trial guidance recommends ITT analysis [15]; however, this dilutes the treatment effect in the presence of refusal which may result in biased estimates of the causal effect of a treatment, according to our specific definition of treatment effect above [16]. PP analysis excludes people from the study who do not follow protocol, generally yielding biased results as adherence to protocols is often non-random [17]. IV analysis estimates the effect of exposure (X) on the outcome (Y), accounting for all unmeasured confounding. The IV technique is a two stage analysis; the first stage assesses how an unbiased instrument (randomisation, Z) predicts the exposure (treatment received, X); the second stage uses this information to understand how the exposure predicts the outcome (Y). The performance measures used to assess the four methods were the bias of the estimate of treatment effect, statistical power and standard error.

Figure 1 represents the causal relationships required for an IV analysis in a trial. The three main assumptions for the validity of an instrumental variable are: it is strongly associated with exposure, it has no effect on the outcome other than through the exposure, and it is independent of confounders of the exposure outcome relationship [18]. In an RCT randomisation (Z) meets these assumptions; however, the strength of association with exposure depends upon the non-compliance rates. The performance of two IV approaches were compared, the two stage predictor substitution (2SPS) method and the two stage residual inclusion (2SRI) method [19, 20]. The 2SPS method is a straightforward two stage regression approach. First, regress IV on exposure, Z on X, and let \( \widehat{\mathrm{X}}= E\left( X\Big| Z\right) \). In the second stage fit a model for the outcome, Y, including \( \widehat{\mathrm{X}} \) as a covariate. The coefficient of \( \widehat{\mathrm{X}} \) from the second stage model is the IV estimated treatment effect of interest. Similarly, in the first stage of the 2SRI method regress Z on X, and then calculate the residual term R = X − E(X|Z). In the second stage, fit a model for the outcome, Y, including both treatment received, X, and residuals, R, as covariates in the model. The coefficient of X in the second stage model is the IV estimated treatment effect.

Fig. 1
figure 1

Directed acrylic graph representing randomised trial with treatment refusal

Simulation overview

Simulations were undertaken using statistical software SAS version 9.4 (SAS Institute Inc, Cary, NC) and plots in R version 3.1.2 followed standard guidelines [21]. Simulations replicated a cmRCT study evaluating the effect of an intervention compared with standard of care for patients at risk of CVD [14]. The outcome was time until CVD event, which is of direct importance to the patient, as opposed to some surrogate endpoints [22]. Simulated patient characteristics included: baseline risk of a CVD event (for example, acute myocardial infarction/stroke (fatal or non-fatal) and death attributable to CVD), time to non-CVD mortality, and probability of refusing the intervention. If non-CVD mortality event occurred before the CVD they were treated as censored. Clinician probability of refusing to offer the intervention was also simulated. The data generating models for time to both CVD event and mortality were Weibull distributions, with distribution parameters chosen to give a correlation of 0.25 between CVD and mortality to represent informative censoring. The distribution parameters were chosen so the mean and standard deviation of the ten year CVD risk were representative of the population eligible for lipid lowering drugs, 21.1 and 8.7% respectively [14]. Refusal was only present in the intervention arm. We assume the treatment has no effect on the competing risk of death from other causes than CVD event.

We used Weibull baseline hazard functions for all individuals for time until CVD event, T c , and time until mortality, T m . Individual risks were created by incorporating individual random effects to the Weibull baseline hazard function. Given that the shape parameter of a Weibull distribution is fixed, then the proportional hazards assumption holds. The intervention effect used was an average reduction in ten year CVD risk by 25%. The trial follow-up used, T max, was three years as a plausible follow-up period in a survival trial. All simulation variables are presented in Table 1.

Table 1 Variables included in the simulation study, generating models, and notation

Different trial scenarios were generated through varying both the average refusal probability and the correlation of refusal with individual risk. The average patient refusal probability, p, and average clinician refusal probability, q, took values of 0.1, 0.2, or, 0.3. Different correlation scenarios were created by defining an upper limit (UL) and lower limit (LL) that all individuals’ refusal probability lies between and assigning refusal probabilities equally spaced between these values. Patients were ordered (ascending or descending) by their individual random effect ε i . Correlation between CVD risk and refusal probability took values defined as zero, low, medium, or high using upper and lower limits for individual refusal probabilities \( \left( LL, UL\right)\in \Big\{\left( p, p\right),\left(\frac{2 p}{3},\frac{4 p}{3}\right),\left(\frac{p}{3},\frac{5 p}{3}\right),\left(0,2 p\right) \). For each scenario we simulated 1000 realisations of the trial data.

Sample sizes were calculated under two different recruitment methods: recruitment without refusal assumed no refusal and recruitment with refusal used the expected refusal rate in the sample size calculation (this assumes refusal is non-informative for the sample size calculation – not for estimating effects). Both recruitment methods were used for all simulated trial scenarios. We calculated the required sample size using bootstrap methods [23], an unequal randomisation method of 4:1 standard of care:intervention, 0.80 statistical power, a type I error rate of 0.05 and a minimum treatment effect tested of an average reduction in ten year CVD risk by 25% (taken as the minimum clinically important effect). If a trial budget is fixed but there is no limitation to the total sample size then an increased statistical power can be obtained by sampling more patients from the control arm [24], hence the unequal randomisation ratio. Cox proportional hazard survival models were used to provide effect estimates under each analysis method.

Simulation steps

We simulated 1000 trials each with size N obs , i = 1, …, N obs for each different combination of parameters. For j = 1, …, 1000 the following procedure was undertaken to simulate hypothetical time to event trial data:

  1. 1)

    Individuals risks: Generate individual random effects, ε i s for each patient, i = 1, …, N obs .

  2. 2)

    Refusal probabilities: Order patients by their individual random effect, ε i . Generate refusals either randomly or related to underlying CVD risk by relating them with individual random effect order ε i . The probability of a patient refusing intervention if offered it is p i and the probability of clinician refusing to offer the intervention to the patient is q i .

  3. 3)

    Randomisation: Randomise to standard of care Z i =0, or intervention, Z i =1, on a 4:1 basis.

  4. 4)

    Treatment: Generate the treatment actually received, X i =0/1 for standard of care/intervention. All those in control arm (Z i  = 0) receive standard of care (X i  = 0) and those randomised to intervention arm (Z i  = 1) receive either intervention or standard of care dependent upon their refusal, X i  = min {Bernoulli(1 − p i ), Bernoulli(1 − q i )}.

  5. 5)

    Hazard rate: Apply intervention effect, \( \beta =-0.32 \), and individual random effects to the Weibull baseline hazard function to give \( h\left({t}_{i c}\right) = {h}_0\left({t}_c\right)\ {e}^{\left(\beta {X}_i+{\varepsilon}_i\right)},\mathrm{and}\kern0.5em h\ \left({t}_{i m}\right)={h}_0\left({t}_m\right)\ {e}^{\left({\varepsilon}_i\right)}. \)

  6. 6)

    Time to event: Simulate individual time to event T ic and T im from corresponding Weibull distributions.

  7. 7)

    Censoring: Generate C i the censoring indicator C i =0/1 if outcome is an observed event/censored. We observe the continuous outcome Y i  = min(T ic , T im , T max ) and the observed trial data is then {Y i , C i , Z i , X i }, a set of censored survival data, treatment allocations, and treatments received.

  8. 8)

    Analysis: Using ITT, PP, 2SPS, and 2SRI methods fit a Cox proportional hazards model to estimate the effect estimate \( {\widehat{\beta}}_j \) and the corresponding p-value, s j for each of the four different estimators.

The same set of simulated data sets were used to compare different methods for the same scenario; however, for each scenario a different set of data sets was generated [21]. This enables differences between methods to be detected. The mean effect estimate \( \overline{\beta}={\displaystyle \sum_j}{\widehat{\beta}}_j/1000 \), relative bias \( \left(\overline{\beta}-\beta \right)/\beta \), and statistical power Ij(p j  < 0.05)/1000 were calculated for each scenario where j = 1, …, 1000. The empirical standard error of the mean effect estimate, \( \overline{\beta}, \) over all simulations was calculated as \( S. E.\left(\overline{\beta}\right)= S. D.\left({\widehat{\beta}}_j\right)/1000 \).


Figure 2 shows the influences of refusal on the causal effect estimate of the treatment (taken per protocol) on hazard of outcome and power (using recruitment without refusal). This was done for a positive correlation between refusal probability and underlying risk for the outcome. PP increasingly overestimates the effect estimate as correlation strengthens, refusal probabilities of 0.1 correlated to risk led to bias of between 3.3 and 12.8%; the corresponding power increased. ITT estimates were biased; average refusal probabilities of 0.2 underestimated the treatment effect by between 21.5 and 29.0%. As the correlation between refusal and risk strengthened the ITT estimate underestimated the effect with increasing magnitude. Both IV methods were less biased than PP or ITT estimates in all scenarios in Fig. 2. An average refusal of 0.2 and high correlation resulted in bias of −11.2% with 2SPS and 8.1% with 2SRI, compared to −29.0% with PP and 32.3% with ITT. The 2SRI method provided the most accurate effect estimate in all scenarios when there was correlation between refusal and risk. The IV and ITT analyses resulted in a lower power than the expected 0.80, the power of 2SRI fluctuated between a 0.65 and 0.70. Statistical power of both 2SPS and ITT were equal and generally decreased as refusal and correlations increased.

Fig. 2
figure 2

Percentage bias (top) and power (bottom) against correlation patient risk and refusal probability using recruitment without refusal and positive correlation between refusal and risk: both plots are paneled by the refusal probabilities and present results for all four analysis methods. Black reference lines represent empirical power and zero bias

The bias of the effect estimate and statistical power for negative correlations between refusal and event risk are presented in Fig. 3 (using recruitment without refusal). PP analysis provides underestimated effect estimates, increasing as both refusal and correlation increased; the corresponding power decreased. An average refusal of 0.1 negatively correlated with risk yielded an underestimate of the effect estimate of between 4.0 and 13.4%, respective power between 0.78 and 0.68. The ITT analysis again underestimated the effect estimate, however, here with decreasing magnitude as correlations strengthened. Both IV methods followed similar patterns to one another, overestimation and power generally increasing as correlation strengthened. It is the correlation that most affects the IV analyses, with bias only reaching greater than 10% when refusal is 0.3 and correlation is high.

Fig. 3
figure 3

Percentage bias (top) and power (bottom) against correlation patient risk and refusal probability using recruitment without refusal and negative correlation between refusal and risk: both plots are paneled by the refusal probabilities and present results for all four analysis methods. Black reference lines represent empirical power and zero bias

Figure 4 presents the bias of the effect estimate of increasing refusal probabilities and positive correlations under recruitment with refusal. The biases are consistent with those under recruitment without refusal, with some slight variation between the two due to random variation in the sample. However, the main effect of recruitment with refusal compared to recruitment without refusal is, understandably, an increase in statistical power. The 2SRI analyses resulted in overestimation of the effect by between 0.06 and 13.5% and a power which fluctuated between 0.81 and 0.90. Figure 5 shows the effects on bias and power when increasing refusal probabilities and strengthening negative correlations under recruitment with refusal. Again, the biases follow consistent patterns across both recruitment methods. The ITT and IV analyses were above the desired statistical power for all scenarios. Of all approaches the 2SPS performed best in terms of minimum absolute bias, between −4.9 and 11.7%.

Fig. 4
figure 4

Percentage bias (top) and power (bottom) against correlation patient risk and refusal probability using recruitment with refusal and positive correlation between refusal and risk: both plots are paneled by the refusal probabilities and present results for all four analysis methods. The black reference lines represent empirical power and zero bias

Fig. 5
figure 5

Percentage bias (top) and power (bottom) against correlation patient risk and refusal probability using recruitment with refusal and negative correlation between refusal and risk: both plots are paneled by the refusal probabilities and present results for all four analysis methods. The black reference lines represent empirical power and zero bias

Scenarios with both clinician refusal and patient refusal greater than zero were simulated, with IV methods again providing the least biased effect estimates. The bootstrapped standard Fs were calculated to estimate precision of the effect estimate. For both IV analyses standard errors were greater than the ITT and PP analyses in all scenarios, with little difference between the standard errors of 2SPS and 2SRI. Considering refusal at only the patient level, as seen in Figs. 2, 3, 4, 5, recruitment without refusal produced standard errors of up to 0.11 under ITT analysis and 0.16 under 2SPS and 2SRI analyses. Recruitment with refusal produced consistent standard errors regardless of the correlation of around 0.08 under ITT analysis and 0.11 under 2SPS and 2SRI analyses.


ITT and PP analysis are equivalent and unbiased with perfect compliance and IV analysis not necessary. When refusal is present, the intervention arm comprises both individuals on intervention and individuals on standard of care. When this refusal is random and uncorrelated with event risk, ITT estimates were biased through dilution bias, bias increasing as refusal rates increased. The makeup of the event risks of the individuals in the intervention arm depends upon the direction and strength of the correlation between refusal and event risk thus inducing bias in the effect estimates. Therefore, the bias in effect estimate when using ITT analysis is also affected by this correlation between refusal and risk, small refusal rates resulted in large bias and reduction in power if correlation was high.

IV analyses provided more accurate effect estimates when refusal was correlated to event risk. Effect estimates with the smallest absolute bias of all methods were given by the 2SRI method when correlation between refusal and risk was positive and the 2SPS method when correlation between refusal and risk was negative. However, when 2SPS is less biased, it is only by a small amount, and when its more biased it is by a large amount. The 2SPS estimate was also highly effected by direction of correlation, whereas, 2SRI was relatively similar irrespective of correlation direction. The standard errors of IV estimates are larger than ITT or PP due to the two stage modelling procedure.

Sample size ignoring refusal did not provide the desired power for all methods and all scenarios, except the PP analysis and positive correlation. The large decrease or increase in statistical power under PP analyses as correlations strengthened was actually indicative of the large bias present in the effect estimate. Sample size acknowledging refusal maintained statistical power for all scenarios when using the 2SRI method and relatively small bias, only greater than 10% when correlation between risk and refusal was high.

The simulation results from this study have relevance to clinical research. Since the development of the cmRCT design uptake has been increasing in conjunction with the movement towards developing more pragmatic trial approaches in order to tackle some of the current challenges present in the clinical trial model [1]. As new cmRCT cohorts continue to emerge it is crucial that the design and analysis of these novel trials remains rigorous. Refusal of an intervention is pragmatic in the sense that that there will likely be some level of refusal to treatment in clinical practice. A certain proportion of the refusal present in a cmRCT trial will be present in real life, some individuals would refuse said treatment whether it was offered as part of a trial or not. However, refusal may be related to the fact that the intervention is offered as part of a trial. It is expected that the effect size of interest, or the effect we would observe in clinical practice, would be somewhere between that estimated with ITT and IV analyses. All present cmRCT planned ITT as the main analysis method; future research would benefit from using actual refusal rates seen in these trials and thus considering if the refusal rates used in these simulations are realistic.

It is expected that there will be patients who do not refuse the intervention, but are noncompliant to some degree. If the non-compliance present in a trial is also expected in clinical practice, we believe the intervention effect estimate can just ignore it, it is realistic of clinical practice. In a cmRCT, given the passive follow-up of the control arm, it would not make sense to use any compliance increasing tactics for the intervention arm, and its likely it would be hard to assess compliance using the routinely collected data, so lack of compliance should ideally be treated as part of treatment effect.

A key limitation of this study is that heterogeneous treatment effects were not considered and such results are not generalizable to studies with a heterogeneous treatment effect. Secondly, adding time dependent covariates to the risks may provide a more realistic scenario of how individual risks change over time. Further work could consider more complex trial scenarios that may occur in the cmRCT design. For example, when the new treatment or control treatment is harmful in terms of non-CVD mortality and how this competing risk affects the effect estimate and power of the trial mortality. The effect of multiple trials within the same cohort, which is an interesting design feature of the cmRCT, could also be investigated to assess how possible correlation of control arm affects hypothesis testing and error rates [25]. Due to the use of a cohort in a cmRCT for recruitment it would also be of interest to investigate more closely the effect of refusal on power. For example, considering a trial simulation with a fixed cohort of size n and different refusal rates, this could be used to provide an idea of scenarios when it may be more powerful to do a standard two arm trial than a cmRCT and thus preferable. As previously discussed if the cohort of eligible patients in a cmRCT is too small and the refusal high, there is the risk of oversampling the control arm leading to a less powerful trial than the standard two arm RCT.


We have presented simulation results that show both the biases and statistical power of a cmRCT design are altered through refusal, particularly non-random refusal. We recommend a cmRCT design to assess probable refusal rates at the outset as well as the effect estimate, and to incorporate these into power calculations. We recommend using adaptive power calculations, updating them as refusal rates are collected in the trial recruitment phase. The instrumental variable methods provide a less biased effect estimate when refusal is present in the intervention arm. On average the 2SRI method provides effect estimates with the smallest absolute bias of all methods and sufficient power when using recruitment with refusal. We recommend running both an ITT and 2SRI analyses as it is expected that the effect size of interest, or the effect we would observe in clinical practice, would lie somewhere between those estimated with ITT and IV analyses. We demonstrate that sample sizes should be adapted based on expected and actual refusal rates in order to be sufficiently powered for instrumental variable analysis.



Two stage predictor substitution


Two stage residual inclusion


Cohort multiple randomised control trial


Cardiovascular disease




Instrumental variable




Randomised controlled trial


Trials within cohorts


  1. Vickers AJ. Clinical trials in crisis: four simple methodologic fixes. Clin trials. 2014;11(6):615–21.

    Article  PubMed  PubMed Central  Google Scholar 

  2. Institue of Medicine. The Learning Healthcare System: Workshop Summary (IOM Roundtable on Evidence-Based Medicine). Washington, DC: National Academies Press (US); 2007.

    Google Scholar 

  3. van Staa T-P, Klungel O, Smeeth L. Use of electronic healthcare records in large-scale simple randomized trials at the point of care for the documentation of value-based medicine. J Intern Med. 2014;275(6):562–9.

    Article  PubMed  Google Scholar 

  4. Treweek S, Zwarenstein M. Making trials matter: pragmatic and explanatory trials and the problem of applicability. Trials. 2009;10(1):37.

  5. Relton C, Torgerson D, O’Cathain A, Nicholl J. Rethinking pragmatic randomised controlled trials: introducing the ‘cohort multiple randomised controlled trial’ design. BMJ. 2010;340(7753):963–7.

    Google Scholar 

  6. Van Staa T-P, Goldacre B, Gulliford M, Cassell J, Pirmohamed M, Taweel A, Delaney B, Smeeth L. Pragmatic randomised trials using routine electronic health records: putting them to the test. BMJ. 2012;344:e55.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Clegg A, Relton C, Young J, Witham M. Improving recruitment of older people to clinical trials: use of the cohort multiple randomised controlled trial design. Age and Ageing. 2015;44(4):547–550.

  8. R. Uher, “Skills for Wellness: Cognitive-behavioural Skills Training for Psychotic-like Experiences, Basic Symptoms, Affective Lability and Anxiety in Youth.” [Online]. Available: Accessed 29 June 2015.

  9. Kwakkenbos L, Jewett LR, Baron M, Bartlett SJ, Furst D, Gottesman K, Khanna D, Malcarne VL, Mayes MD, Mouthon L, Poiraudeau S, Sauve M, Nielson WR, Poole JL, Assassi S, Boutron I, Ells C, van den Ende CH, Hudson M, Impens A, Körner A, Leite C, Costa Maia A, Mendelson C, Pope J, Steele RJ, Suarez-Almazor ME, Ahmed S, Coronado-Montoya S, Delisle VC, Gholizadeh S, Jang Y, Levis B, Milette K, Mills SD, Razykov I, Fox RS, Thombs BD. The Scleroderma Patient-centered Intervention Network (SPIN) Cohort: protocol for a cohort multiple randomised controlled trial (cmRCT) design to support trials of psychosocial and rehabilitation interventions in a rare disease contex. BMJ Open. 2013;3(8):1–11.

    Article  Google Scholar 

  10. Burbach JPM, Verkooijen HM, Intven M, Kleijnen J-PJ, Bosman ME, Raaymakers BW, van Grevenstein WM, Koopman M, Seravalli E, van Asselen B, Reerink O. RandomizEd controlled trial for pre-operAtive dose-escaLation BOOST in locally advanced rectal cancer (RECTAL BOOST study): study protocol for a randomized controlled trial. Trials. 2015;16(1):58.

    Article  PubMed  PubMed Central  Google Scholar 

  11. M. van Vulpen, “Randomized Trial Comparing Conventional Radiotherapy With Stereotactic Radiotherapy in Patients With Spinal Metastases - VERTICAL Study - Full Text View -” [Online]. Available: Accessed 13 July 2015.

  12. Mitchell N, Hewitt C, Adamson J, Parrott S, Torgerson D, Ekers D, Holmes J, Lester H, McMillan D, Richards D, Spilsbury K. A randomised evaluation of CollAborative care and active surveillance for Screen-Positive EldeRs with sub-threshold depression (CASPER): study protocol for a randomized controlled trial. Trials. 2011;12(1):225.

  13. Hewitt CE, Torgerson DJ, Miles JNV. Is there another way to take account of noncompliance in randomized controlled trials? CMAJ. 2006;175(4):347–8.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Wu J, Zhu S, Yao GL, Mohammed MA, Marshall T. Patient factors influencing the prescribing of lipid lowering drugs for primary prevention of cardiovascular disease in UK general practice: a national retrospective cohort study. PLoS One. 2013;8(7):1–10.

    CAS  Google Scholar 

  15. Cobos-Carbó A, Augustovski F. CONSORT 2010 Declaration: updated guideline for reporting parallel group randomised trials. Med Clin (Barc). 2011;137(5):213–5.

    Article  Google Scholar 

  16. Nagelkerke N, Fidler V, Bernsen R, Borgdor M. Estimating treatment effects in randomized clinical trials in the presence of non-compliance. Stat Med. 2000;19:1849–64.

    Article  CAS  PubMed  Google Scholar 

  17. Sussman JB, Hayward RA. An IV for the RCT: using instrumental variables to adjust for treatment contamination in randomised controlled trials. BMJ. 2010;340:c2073.

    Article  PubMed  PubMed Central  Google Scholar 

  18. Ali MS, Uddin MJ, Groenwold RHH, Pestman WR, Belitser SV, Hoes AW, de Boer A, Roes KCB, Klunge OH. Quantitative falsification of instrumental variables assumption using balance measures. Epidemiology. 2014;25(5):770–2.

    Article  PubMed  Google Scholar 

  19. Tchetgen EJ, Walter S, Vansteelandt S, Martinussen T, Glymour M. Instrumental variable estimation in a survival context. Epidemiology. 2015;26(3):402–410.

  20. Terza JV, Basu A, Rathouz PJ. Two-stage residual inclusion estimation: addressing endogeneity in health econometric modeling. J Health Econ. 2008;27(3):531–43.

    Article  PubMed  Google Scholar 

  21. Burton A, Altman DG, Royston P, Holder RL. The design of simulation studies in medical statistics. Stat Med. 2006;35:4279–92.

    Article  Google Scholar 

  22. Weintraub WS, Lüscher TF, Pocock S. The perils of surrogate endpoints. Eur Heart J. 2015;36(33):2212–8.

    Article  PubMed  PubMed Central  Google Scholar 

  23. K. Kleinman and S. S. Huang, “Calculating power by bootstrap, with an application to cluster-randomized trials,” arXiv:1410.3515v1 [stat.AP], 2014.

  24. Torgerson DJ, Campbell MK. Use of unequal randomisation to aid the economic efficiency of clinical trials. BMJ. 2000;321(7263):759.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  25. Howard DR, Brown JM, Todd S, Gregory WM. Recommendations on multiple testing adjustment in multi-arm trials with a shared control group. Stat Methods Med Res; 2016.

Download references


The research leading to these results was conducted as part of Work Package 2 of the GetReal consortium. For further information please refer to This paper only reflects the personal views of the stated authors.


The work leading to these results has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement number 115546, resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme (FP7/20072013) and EFPIA companies' in-kind contribution. The study was also partly supported by the University of Manchester's Health eResearch Centre (HeRC) funded by the Medical Research Council (MRC) Grant MR/K006665/1.

Availability of data and material

All data used was simulated, simulations code is available from on request.

Authors’ contributions

The idea for the paper was originally developed by TvS. JC ran the simulations and drafted the manuscript. MS provided consultation when running the simulations. AP provided assistance and consultation in running the simulations. TvS, MS, and AP revised the paper critically. TvS, JC, AP and MS approved the final version.

Competing interest

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Author information

Authors and Affiliations



Corresponding author

Correspondence to Jane Candlish.

Additional information

An erratum to this article is available at

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Candlish, J., Pate, A., Sperrin, M. et al. Evaluation of biases present in the cohort multiple randomised controlled trial design: a simulation study. BMC Med Res Methodol 17, 17 (2017).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: