Characteristics and practices of school-based cluster randomised controlled trials for improving health outcomes in pupils in the United Kingdom: a methodological systematic review

Background Cluster randomised trials (CRTs) are increasingly used to evaluate non-pharmacological interventions for improving child health. Although methodological challenges of CRTs are well documented, the characteristics of school-based CRTs with pupil health outcomes have not been systematically described. Our objective was to describe methodological characteristics of these studies in the United Kingdom (UK).
Methods MEDLINE was systematically searched from inception to 30th June 2020. Included studies used the CRT design in schools and measured primary outcomes on pupils. Study characteristics were described using descriptive statistics.
Results Of 3138 articles identified, 64 were included. CRTs with pupil health outcomes have been increasingly used in the UK school setting since the earliest included paper was published in 1993; 37 (58%) studies were published after 2010. Of the 44 studies that reported information, 93% included state-funded schools. Thirty six (56%) were exclusively in primary schools and 24 (38%) exclusively in secondary schools. Schools were randomised in 56 studies, classrooms in 6 studies, and year groups in 2 studies. Eighty percent of studies used restricted randomisation to balance cluster-level characteristics between trial arms, but few provided justification for their choice of balancing factors. Interventions covered 11 different health areas; 53 (83%) included components that were necessarily administered to entire clusters. The median (interquartile range) number of clusters and pupils recruited was 31.5 (21 to 50) and 1308 (604 to 3201), respectively. In half the studies, at least one cluster dropped out. Only 26 (41%) studies reported the intra-cluster correlation coefficient (ICC) of the primary outcome from the analysis; this was often markedly different to the assumed ICC in the sample size calculation. The median (range) ICC for school clusters was 0.028 (0.0005 to 0.21).
Conclusions The increasing pool of school-based CRTs examining pupil health outcomes provides methodological knowledge and highlights design challenges. Data from these studies should be used to identify the best school-level characteristics for balancing the randomisation. Better information on the ICC of pupil health outcomes is required to aid the planning of future CRTs. Improved reporting of the recruitment process will help to identify barriers to obtaining representative samples of schools.


Background
Cluster randomised trials (CRTs) are studies in which groups, or clusters, of individuals are allocated to trial arms rather than the individuals themselves [1]. The clusters may be geographic areas, health organisations or social units. CRTs are used when the intervention is delivered to the entire cluster or there is a chance of contamination between trial arms if individuals are randomised [2].
CRTs can be more complex to design and analyse than individually randomised controlled trials. The most documented methodological consideration for CRTs is that observations on participants from the same cluster are more likely to be similar to each other than those on participants from different clusters [2]. This similarity is quantified by the intra-cluster correlation coefficient (ICC), defined as the proportion of the total variability in the trial outcome that is between clusters as opposed to between individuals within clusters [3]. The statistical dependence between observations within clusters needs to be taken account of when calculating the sample size and analysing data in CRTs [1]. The use of standard methods may result in the sample size being too small to detect the intervention effect, and analysis results that exaggerate the evidence for a true intervention effect. Estimates of the ICC or coefficient of variation of clusters for the outcome from previous studies are required to calculate the design effect, the factor by which the number of individuals that would be required in an individually randomised trial needs to be inflated to account for within-cluster correlation in the sample size calculation. In addition, when calculating the sample size in CRTs, a degrees of freedom correction should be incorporated to take account of the uncertainty with which variability in the outcome across clusters is estimated in the analysis [4], and a further inflation of the sample size should be considered to allow for loss of efficiency that results from recruiting unequal numbers of participants from the clusters [5]. 
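The inflation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming equal cluster sizes and the standard design effect 1 + (m - 1) x ICC; the function names and the input values (400 pupils, 30 pupils per cluster, ICC of 0.05) are hypothetical and are not taken from any included study. The sketch does not include the degrees of freedom correction or the adjustment for unequal cluster sizes.

```python
import math


def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal cluster sizes: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1.0) * icc


def inflate_sample_size(n_individual: int, cluster_size: int, icc: float):
    """Inflate an individually randomised sample size for clustering.

    Returns the design effect, the inflated number of pupils and the
    number of clusters needed (both rounded up).
    """
    de = design_effect(cluster_size, icc)
    # round() guards against floating-point noise before taking the ceiling
    n_total = math.ceil(round(n_individual * de, 6))
    n_clusters = math.ceil(round(n_total / cluster_size, 6))
    return de, n_total, n_clusters


# Hypothetical inputs, for illustration only
de, n_total, n_clusters = inflate_sample_size(400, 30, 0.05)
print(f"design effect={de:.2f}, pupils={n_total}, clusters={n_clusters}")
```

With these illustrative inputs the design effect is 2.45, so 400 pupils become 980 pupils spread over 33 clusters.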
When estimating the intervention effect from the resulting trial data, the main analytical approaches are either to apply standard statistical methods to summary statistics that represent the cluster response (cluster-level analyses) or to use methods at the individual participant level that account for within-cluster correlation in the model or by weighting the analysis. Another important methodological consideration in CRTs is the potential for recruitment bias that might occur in studies where the participating individuals are recruited after the clusters are randomised. Finally, when using meta-analysis to pool findings from studies that use the CRT design, there is the need to consider how best to incorporate estimated effects from studies that did not allow for clustering in the analysis, and to consider the extent to which differences in the types of clusters that were randomised are a source of heterogeneity. These considerations are detailed in several textbooks [1,2,6-8].
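As an illustration of the cluster-level approach mentioned above, the following sketch reduces each cluster to its mean and then applies an ordinary two-sample t statistic to those means. It is a simplified example under stated assumptions (unweighted cluster means, pooled variance, two arms); the function name and the toy data are hypothetical, and it is not presented as the method used by any included study.

```python
import math
from statistics import mean, variance


def cluster_level_t(arm_a, arm_b):
    """Unweighted cluster-level analysis of a two-arm CRT.

    arm_a, arm_b: lists of clusters, each cluster a list of pupil outcomes.
    Returns (difference in arm means, standard error, t statistic, df).
    """
    means_a = [mean(cluster) for cluster in arm_a]  # one summary per cluster
    means_b = [mean(cluster) for cluster in arm_b]
    k_a, k_b = len(means_a), len(means_b)
    diff = mean(means_a) - mean(means_b)
    # Pooled variance of the cluster means (not of the pupil-level data)
    pooled = ((k_a - 1) * variance(means_a) + (k_b - 1) * variance(means_b)) / (k_a + k_b - 2)
    se = math.sqrt(pooled * (1 / k_a + 1 / k_b))
    return diff, se, diff / se, k_a + k_b - 2


# Toy data: three schools per arm, unequal numbers of pupils
intervention = [[5, 7], [6, 8, 10], [9, 9]]
control = [[4, 6], [5, 5], [6, 8, 7]]
diff, se, t_stat, df = cluster_level_t(intervention, control)
print(f"difference={diff:.2f}, SE={se:.3f}, t={t_stat:.2f}, df={df}")
```

The degrees of freedom depend on the number of clusters, not the number of pupils, which is why trials with few clusters have limited power regardless of how many pupils are recruited.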
CRTs are increasingly used to evaluate non-pharmacological interventions for improving child health outcomes [9][10][11]. Although the use of CRTs to evaluate the effectiveness of interventions for improving educational outcomes is long established [12,13], their use to evaluate health interventions in schools is more recent [10]. Schools provide a natural environment to recruit, deliver public health interventions to and measure outcomes on children, due to the amount of time they spend there [10]. Cluster randomisation is consistent with the natural clustering found within school settings (i.e., classrooms within year groups within schools). School-based CRTs share common challenges with other settings, but specific considerations may be more challenging when schools are randomised, for example, consent procedures [10,14].
In 2011, a methodological systematic review on the characteristics and quality of reporting of CRTs involving children reported a marked increase in such studies [9]; three quarters of the included studies randomised schools. To date, no systematic review has focussed specifically on the characteristics of school-based CRTs for improving pupil health outcomes. Such a review would help identify common methodological challenges, obtain estimates of parameters (e.g., the ICC) that are of use to researchers planning similar trials and inform the design of simulation studies that use synthetic data to evaluate the properties of statistical methods applied in the context of school-based CRTs with health outcomes.
The aim of this methodological systematic review is to describe the characteristics and practices of school-based CRTs for improving health outcomes in pupils in the United Kingdom (UK).
Keywords: Child and adolescent health, Cluster randomised trials, Public health, Randomised trials, Research methods, Schools, Systematic review

Methods
This is a systematic review of school-based CRTs with pupil health outcomes that were conducted in the UK. The review was focussed on the UK to align with constraints on available resources and collect richer data on CRT methodology in a single education system.

Data sources and search methods
The systematic review was registered with PROSPERO (CRD42020201792) and the protocol has been published [15]. After extensive scoping of the subject area, a pragmatic decision was made to search MEDLINE (through Ovid) in order to make the review more time-efficient and align with available resources. MEDLINE was exclusively searched from inception to 30th June 2020 for peer-reviewed articles of school-based CRTs. The search strategy (Table 1) was developed in consultation with information specialists, based on a sensitive MEDLINE search strategy for identifying CRTs [16]. Cluster design-related terms 'cluster*', 'group*' and 'communit*' were combined with the terms 'random' and 'trial', along with the 'Schools' Medical Subject Heading (MeSH) term. The search was limited to English language.

Inclusion and exclusion criteria
The systematic review included school-based definitive CRTs of the effectiveness of an intervention versus a comparison group that evaluated health outcomes on pupils. The population of interest was children in full-time education in the UK. Studies that took place outside the UK were excluded. The pragmatic decision was made to limit the population to educational settings within the UK as it made the review more focussed and applicable to a specific setting. Eligible studies included pupils in preschool, primary school and secondary school. The types of eligible clusters included schools themselves, year groups, classes, teachers or any other relevant school-related unit. All school types were eligible, including special schools. Any health-related intervention(s) and control groups were considered. The primary outcome had to be related to pupils' health. Studies for which the primary outcome was not health-based (e.g., academic attainment) were excluded. All types of CRT design were eligible including parallel group, factorial, crossover and stepped wedge studies.
If more than one publication of the primary outcome result for an eligible CRT was identified, a key study (index) report was designated and used for data extraction. Papers that did not report the primary outcome were excluded along with pilot/feasibility studies, protocol/design articles, process evaluations, economic evaluations/cost-effectiveness studies, statistical analysis plans, commentaries and mediation/mechanism analyses.

Sifting and validation
Two reviewers (KP and OU) independently screened the titles and abstracts of all references (downloaded into EndNote [17]) for eligibility against the inclusion criteria. Any studies whose eligibility the reviewers were uncertain about were taken to full-text screening. Full-text articles were evaluated by the same reviewers based on the inclusion criteria using a pre-piloted coding method. Any discrepancies that could not be resolved through discussion were sent to a third reviewer (ZMX) for a decision.

Data extraction and analysis
For each eligible study, data were extracted using a pre-piloted form in Microsoft Excel. Data were extracted by two reviewers (KP and OU), and any discrepancies that could not be resolved through discussion were sent to a third reviewer (ZMX) for a final decision. Missing information that was not available in the index papers was sought from corresponding protocol papers and other "sibling" publications.
The items of information extracted are listed as follows:
Publication details: year of publication and journal name.
Setting characteristics: country/region, school level and type of school.
Intervention: health area and intervention type.
Primary outcome: name, health area, reporter of outcome and method of data collection.
Study design and analysis methods: unit of randomisation (i.e., type of cluster), justification for using the cluster trial design, method used to sample schools, method used to balance the randomisation, length and number of follow-ups, design of follow-up (cohort versus repeated cross-sectional design) and method used to account for clustering in the analysis.
Sample size calculation: target sample size (i.e., number of clusters and pupils) and assumptions underlying the sample size calculation (e.g., assumed ICC, percentage loss to follow-up).
Ethics and consent procedures: activities covered by the consent agreements and use of "opt-out" consent.
Other study characteristics of methodological interest: number of clusters and pupils that were recruited and lost to follow-up, estimate of the ICC of the primary outcome.
Study characteristics were described using medians, interquartile ranges (IQRs) and ranges for continuous variables, and numbers and percentages for categorical variables, using Stata software [18]. Formal quality assessment was not performed as it was not an objective of this review to estimate intervention effects in the included studies. Some information relevant to the quality of CRTs was, however, extracted and summarised as part of the review.

Search results
After deduplication, 3103 articles were identified through MEDLINE, 159 were full-text screened and 64 were included in the review. Of 95 excluded studies, 88 did not meet the inclusion criteria, and 7 studies met inclusion criteria but were subsequently excluded because they were sibling reports of an index paper. The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram is in Fig. 1.

Study characteristics
The included papers were published in 36 different journals, including: British Medical Journal (n = 9 papers); BMC Public Health (n = 4); International Journal of Behavioural Nutrition and Physical Activity (n = 4); Archives of Disease in Childhood (n = 3); BMJ Open (n = 3); Journal of Epidemiology and Community Health (n = 3); Public Health Nutrition (n = 3); and The Lancet (n = 3). The CRT design has been increasingly used in the UK school setting to evaluate health interventions for pupils since the first paper was published in 1993 (Fig. 2). Twenty three papers were published between 2001 and 2010, compared to 37 between January 2011 and June 2020. Table 2 summarises the characteristics of included studies.

Setting
Almost three quarters of the studies were conducted exclusively in England (n = 47; 73%); most studies (50 of the 52 studies that provided the data) took place in one or two geographic regions (e.g., West Midlands). Just over half the studies (56%) were based exclusively in primary schools (age 5-11 years), and 38% were exclusively in secondary schools (age 11-16 years). Of the 44 studies that reported information on the types [83] of schools recruited, 93% included state-funded schools.

Intervention type
Eighteen (28%) studies evaluated interventions that targeted nutrition, 15 (23%) physical activity, 15 (23%) socioemotional function and its influences, 7 (11%) dental health, 5 (8%) smoking and 5 (8%) injury, amongst others. Physical activity interventions are increasingly prominent (13 published since 2011 in contrast to just 2 prior to then). Of the 15 studies targeting socioemotional function and its influences, 13 were published since 2011, highlighting increasing use of the CRT design in this area. Of the 7 CRTs related to dental health, the most recent one was published in 2011. The vast majority of interventions were in primary prevention (94%).

Primary outcome
Health areas assessed by the primary outcomes are summarised in Table 2. In 53% of the studies pupils reported the primary outcome, with researchers reporting primary outcomes in 20%, teachers in 8%, and parents in 8%. In 28% of the studies the primary outcome reporter was blind to allocation status (some authors specifically commented on the challenges of blinding trial arm status [33,36,56,60]), and 22% measured the outcome using an objective method.

Study design and analysis methods
Explicit justification for use of the CRT design was provided in only 17 (27%) studies; the most common reason was to avoid contamination (13 studies altogether). Most studies (n = 56; 88%) randomised school clusters, while classes and year groups were allocated in 6 (9%) and 2 (3%) studies, respectively. The authors of two studies stated that classes were randomised instead of schools in order to maintain power, and that this may have led to contamination between the intervention and control arms [22,28]. Nearly all studies used a parallel group design (n = 61; 95%); the remaining 3 used a factorial design [21,37,39]. Of the 46 studies with sufficient information to establish the approach used to sample schools, 33 initially invited all potentially eligible schools to participate, 5 used random sampling, 4 used purposive sampling, 3 used convenience sampling, and 1 used a mixed random/convenience sampling approach.
Eighty percent of studies reported using a restricted allocation method to balance cluster-level characteristics between the trial arms. Most commonly a measure of socio-economic status (SES) was balanced on (48%), with a third of studies (21/64) specifically balancing  Table 3. Few studies gave justification for their choice of balancing factors.
One of the challenges of CRTs is to avoid recruitment bias that might occur if participants are recruited after the clusters are randomised [88,89]. One third (33%) of studies avoided this by recruiting pupils before the clusters were randomised; furthermore, 25% collected baseline data before randomisation. This information, however, was unclear in many studies (41% and 33%, respectively). Generally, insufficient information was provided on whether recruitment bias was avoided in studies where pupils were recruited after randomisation of clusters. A notable exception was one study [57] where recruitment bias was avoided because allocation was not revealed to the schools until after recruitment and baseline assessment.
Nearly all studies used the cohort design as their method of follow-up (n = 62, 97%), where the same pupils provided data at each study wave. One study used a repeated cross-sectional design where different pupils provided data at each wave [46], and one used an a priori mixed design incorporating elements of the cohort and repeated cross-sectional designs, with only a subset of participating pupils providing data at each wave [49].
Seventy two percent of studies analysed their data using individual-level methods that allow for clustering, 16% used cluster-level analysis methods, and 12% did not allow for clustering in their analysis.

Sample size calculation
Seventy eight percent of studies accounted for clustering in their sample size calculation and 72% reported the ICC or coefficient of variation [90] that was assumed for the outcome. None of the studies made a degrees of freedom correction to the sample size calculation. Only two studies [57,63] allowed for unequal cluster sizes in their sample size calculation, and only one of these [57] specified the anticipated variation in the number of pupils across clusters. The median (range) assumed ICC for school clusters was 0.05 (0.005 to 0.175) based on the 37 studies that provided these data. Of the 3 studies that specified the coefficient of variation of the outcome, 2 assumed it to be 0.2 [42,60] and 1 assumed it to be 0.25 [19]. The median (range) assumed design effect was 2.21 (1.22 to 8.11). The median targeted sample size was 30 clusters and 964 pupils. Most studies (94%) did not state whether their sample size calculation allowed for loss to follow-up of clusters.
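To show why allowing for unequal cluster sizes matters, the sketch below uses a widely cited approximation in which the design effect becomes 1 + ((cv^2 + 1) x mean cluster size - 1) x ICC, where cv is the coefficient of variation of cluster size. Whether this is the exact adjustment used in the two studies noted above, or in the cited reference, is an assumption; the function name and the cluster sizes are hypothetical.

```python
from statistics import mean, pvariance


def design_effect_unequal(cluster_sizes, icc: float) -> float:
    """Approximate design effect allowing for variable cluster sizes.

    Uses 1 + ((cv**2 + 1) * m_bar - 1) * icc, where m_bar is the mean
    cluster size and cv is the coefficient of variation of cluster size.
    With equal cluster sizes (cv = 0) it reduces to 1 + (m_bar - 1) * icc.
    """
    m_bar = mean(cluster_sizes)
    cv_squared = pvariance(cluster_sizes) / m_bar ** 2
    return 1.0 + ((cv_squared + 1.0) * m_bar - 1.0) * icc


# Hypothetical cluster sizes with the same mean (30 pupils) but different spread
equal = design_effect_unequal([30, 30, 30, 30], icc=0.05)
unequal = design_effect_unequal([10, 20, 30, 60], icc=0.05)
print(f"equal sizes: {equal:.2f}, unequal sizes: {unequal:.2f}")
```

In this illustration the design effect rises from 2.45 with equal cluster sizes to about 3.03 with the unequal sizes, even though the mean cluster size and the ICC are unchanged.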

Ethics and consent procedures
Table 2 (continued)
From whom was consent/assent sought for pupil participation? (N = 64)
  Parents and pupils, n (%): 40 (63)
  Parents only, n (%): 15 (23)
  Pupils only, n (%): 2 (3)
  Not stated / Neither parent nor pupil, n (%): 7 (11)
Opt-out consent/assent procedure used for either parent/guardian or pupils (N = 64)
  Yes, n (%): 29 (45)
  Not stated / No, n (%): 35 (55)
Table footnotes:
Some studies included more than one school type.
This is the number of studies that included specific types of school.
State schools receive funding through their local authority or directly from the government. The most common ones are local authority, foundation and voluntary aided schools, which are all funded by the local authority. Academies are run by government and not-for-profit trusts, and are independent of local authority. Grammar schools are run by local authorities but intake is based on assessment of the pupils' academic ability. Special schools cater for pupils with special educational needs. Faith schools follow the national curriculum but can decide what they teach in religious studies. Independent schools follow the national curriculum but charge fees for attending pupils.
d Some interventions targeted more than one health area.
e Includes mental health, behaviour, ADHD, wellbeing, quality of life, bullying, social and emotional learning, and self-esteem.
f Intervention type was summarised based on the typology described by Eldridge and colleagues [1]. 'Individual-cluster' interventions include components that are directed at individual participants (e.g., pupils) on whom outcomes are measured. 'Professional-cluster' interventions include components for training professionals in the cluster (e.g., teachers in schools) to deliver the intervention. 'External-cluster' interventions involve additional staff outside the cluster to deliver the intervention (e.g., researchers, trained facilitators). 'Cluster-cluster' interventions include components that necessarily have to be administered to entire clusters (e.g., school policy). 'Multifaceted' interventions include components across more than one of the 'individual-cluster', 'professional-cluster', 'external-cluster' and 'cluster-cluster' categories.
g Includes mental health, behaviour, hyperactivity/inattention (ADHD), wellbeing, quality of life, bullying, social and emotional learning, and self-esteem (body image).
h, i, j, k Summary excludes the two CRTs that did not use the cohort design.
Parker et al. BMC Med Res Methodol (2021) 21:152

Other study characteristics of methodological interest
A median (IQR) of 31.5 (21 to 50) clusters, 29 (15 to 50) schools and 1308 (604 to 3201) pupils were recruited. Among the studies that used a cohort design and reported both targeted and achieved recruitment figures, 89% achieved their recruitment target at the cluster level (n = 45) and 77% at the pupil level (n = 43). Some authors noted challenges with recruitment at the cluster [45,47,50] and pupil [24,55] levels. Based on the 33 studies that provided data, the median (IQR) percentage of pupils categorised as "White" was 76.8% (51.5% to 86.2%). Thirty out of 62 (48%) studies that provided information reported that at least one cluster was lost to follow-up. Missing data resulting from entire school drop-out was highlighted as a problem in some reports (e.g., [42,48,54]). The ICCs reported for the primary outcome (26 studies) are listed in Table 4. The median (range) ICC for school clusters was 0.028 (0.0005 to 0.21). For many studies that reported both values there was a marked difference between the observed school-level ICC in the study data and the corresponding assumed value of the ICC in the sample size calculation (Fig. 3).
The median (range) of the differences between the observed ICC and the assumed ICC was -0.006 (-0.117 to 0.16) indicating that: on average, the observed ICC was slightly smaller than the assumed ICC; at one extreme, the observed ICC in one study was 0.117 smaller than the assumed value [25]; and at the other extreme, the observed ICC in one study was 0.16 larger than the assumed value [68]. The intra-class correlation coefficient of agreement between the observed and assumed ICCs was 0.24.
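The observed ICCs compared above are estimated from trial data, and the review does not state how each study obtained its estimate. As a hedged illustration, one common option for continuous outcomes is the one-way analysis of variance (ANOVA) estimator sketched below; the function name and the toy data are hypothetical.

```python
from statistics import mean


def anova_icc(clusters):
    """One-way ANOVA estimate of the intra-cluster correlation coefficient.

    clusters: list of clusters, each a list of continuous outcomes.
    ICC = (MSB - MSW) / (MSB + (n0 - 1) * MSW), where n0 is the
    'average' cluster size adjusted for unequal cluster sizes.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    grand_mean = mean(y for c in clusters for y in c)
    # Between-cluster and within-cluster sums of squares
    ssb = sum(n * (mean(c) - grand_mean) ** 2 for n, c in zip(sizes, clusters))
    ssw = sum((y - mean(c)) ** 2 for c in clusters for y in c)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return (msb - msw) / (msb + (n0 - 1) * msw)


# Toy data with strong clustering, chosen so the arithmetic is easy to check
print(round(anova_icc([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 3))
```

The estimator can return negative values when the between-cluster variation is smaller than expected by chance, and with few clusters it is imprecise, which is one reason observed and assumed ICCs can differ.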

Seven studies [24,26,44,59,68,71,74] that reported ICCs had a binary primary outcome, but none of these stated whether the ICC was calculated on the proportions scale or the logistic scale [3]. It is possible that five of these studies [24,26,68,71,74], which used mixed effects ("multi-level") models [91] to analyse the data, reported the ICC on the logistic scale, which could potentially account for some of the differences between the observed and assumed ICCs. Further scrutiny of the data, however, revealed marked differences for only two of the aforementioned studies: 0.21 for the observed ICC versus 0.05 for the assumed ICC in Mulvaney and colleagues [68], and 0.028 versus 0.1, respectively, in Obsuth and colleagues [71].
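For readers unfamiliar with the logistic-scale ICC mentioned above, the sketch below shows the usual latent-variable definition for a random-intercept logistic model, in which the pupil-level variance is fixed at pi^2/3. That the five studies used this definition is an assumption, and the between-school variance in the example is hypothetical.

```python
import math


def icc_logistic_scale(between_cluster_variance: float) -> float:
    """ICC on the logistic (latent variable) scale.

    For a random-intercept logistic model the residual variance is taken
    to be pi**2 / 3, so ICC = var_u / (var_u + pi**2 / 3).
    """
    residual_variance = math.pi ** 2 / 3
    return between_cluster_variance / (between_cluster_variance + residual_variance)


# Hypothetical between-school variance of 0.5 on the log-odds scale
print(round(icc_logistic_scale(0.5), 3))
```

An ICC computed this way is not directly comparable with one computed on the proportions scale, so sample size calculations should use an ICC on the same scale as the planned analysis.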

Discussion
The number of UK school-based CRTs evaluating the effects of interventions on pupil health outcomes has increased in recent years, reflecting growing recognition of the role that schools can play in improving the health of children [10,92-95]. The findings of this systematic review indicate a number of methodological considerations that are worthy of reflection.

Interpretation
Seventy two percent of the studies reported the level of clustering assumed in their sample size calculation, a little more than the 62% observed in a 2015 review of the reporting of sample size calculations in CRTs [96]. Our review found that the observed ICC in the study data often differed markedly from the ICC assumed in the sample size calculation. This will be partly due to sampling variation and adjustment for prognostic factors in the analysis, but it may also reflect the lack of availability of good estimates of the ICC at the time of sample size calculation. Knowledge of the ICC for pupil health outcomes in the school setting is less well established than for patient health outcomes in the primary care setting where general practices are allocated as clusters [1,97]. It has been reported that general practice-level ICCs for health outcomes are generally less than 0.05 [98]; in our review, only 13 of 23 studies that randomised school clusters and reported observed ICCs had values that were less than 0.05. School-based ICC estimates are widely available for educational outcomes [99], but these are markedly higher than those reported in this review for pupil health outcomes; this is to be expected given that the primary role of the school is to provide education. The importance of reporting ICCs from study data for planning future similar CRTs has long been established [100] and the 2012 CONSORT extension to CRTs includes a specific reporting item for this [101]. Only two-fifths (41%) of studies in this review, however, reported the ICC for the primary outcome; this figure rises to 48% (16/33) for studies published after 2012.
Table 4 Reported intra-cluster correlation coefficients for primary outcomes (N = 26)
Improved reporting of the ICC in the increasing number of CRTs in the school-based setting, and further papers written specifically to report ICCs [102,103], will provide valuable knowledge. This review focussed on CRTs in the UK setting; a useful area to investigate is the extent to which school-based ICC estimates for health outcomes from other countries (e.g., [102,104]) are similar to those in the UK.
Representativeness of school and pupil characteristics in school-based trials is important for external validity and inclusiveness. For most studies in this review, schools were recruited from only one or two geographic regions/counties. A median 23% of participating pupils were in a minority ethnic group, lower than the national percentages reported by the UK Department for Education (33.5% of primary school pupils and 31.3% of secondary school pupils) [105]. The study reports generally provided little information on specific aspects of the recruitment process, such as why some schools declined to participate and details of their characteristics. Many of the studies evaluated interventions that involved classroom lessons and necessitated teachers being trained to deliver the intervention. Additionally, the teachers reported pupil outcomes in some studies [32,34,60,73,82]. Insufficient school resources to deliver the intervention and the wider trial may be a barrier to participation and result in lack of representation of certain types of schools.
Eighty percent of the studies used some form of restricted allocation to balance the randomisation on cluster-level characteristics, which is higher than the percentages reported in previous methodological reviews of CRTs [106][107][108][109]. The percentage of pupils in the school that are eligible for free school meals was often used as a balancing factor, perhaps partly because this information is readily available from the UK Department for Education [110]. School characteristics that are predictive of the study outcomes, account for within-cluster correlation or influence effectiveness of the intervention are candidates on which to balance the randomisation [1,111]; previous school-based CRTs could be used to identify such factors.
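One simple form of restricted allocation is stratified randomisation, sketched below for a single balancing factor. This is an illustration only: the included studies also used other approaches, the review does not prescribe this procedure, and the function name, school labels, free school meal percentages and seed are all hypothetical.

```python
import math
import random


def stratified_allocation(schools, n_strata=2, seed=2021):
    """Allocate schools to two arms within strata of a balancing factor.

    schools: dict mapping school name to the percentage of pupils
    eligible for free school meals (the balancing factor in this sketch).
    Schools are ranked on the factor, cut into n_strata groups, shuffled
    within each group and then allocated alternately to the two arms.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    ranked = sorted(schools, key=schools.get)
    stratum_size = math.ceil(len(ranked) / n_strata)
    allocation = {}
    for start in range(0, len(ranked), stratum_size):
        stratum = ranked[start:start + stratum_size]
        rng.shuffle(stratum)
        for position, school in enumerate(stratum):
            allocation[school] = "intervention" if position % 2 == 0 else "control"
    return allocation


# Hypothetical schools and free school meal percentages
example = {"A": 5, "B": 8, "C": 12, "D": 15, "E": 22, "F": 30, "G": 41, "H": 55}
for school, arm in sorted(stratified_allocation(example).items()):
    print(school, arm)
```

Because allocation alternates within each stratum, each arm receives the same number of low and high free school meal schools when stratum sizes are even, which is the balance that restricted randomisation is intended to achieve.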

Strengths
This systematic review used a defined search strategy tailored to identify school-based CRTs. The strategy was developed following an iterative process and allowed us to achieve the right balance of sensitivity and specificity relevant to our available resources. Identifying reports of CRTs is a challenge given that many articles do not use the term 'cluster' in their title or abstract. Therefore, a search strategy was used which included terms such as 'group' and 'community' to improve sensitivity. The 'Schools' MeSH term was also used to identify publications that randomised any type of school-related unit. The piloting of our screening procedure and data extraction were conducted by two independent reviewers, improving accuracy. The review identified school-based CRTs with interventions spanning a variety of different health conditions/areas.

Limitations
A potential limitation of the review is that the search was limited to one database. MEDLINE was used because the focus of the review was on describing the characteristics of trials that evaluate the impact of health interventions on pupils' health outcomes, but it is possible that we have not identified eligible publications that are not indexed in MEDLINE. However, translating our search to the EMBASE, DARE, PsycINFO and ERIC databases to look for potentially eligible studies published in the last 3 years revealed only one additional eligible school-based CRT.
Given resource constraints, we focussed the review on the UK, making the decision to collect rich data on CRT methodology in a single education system. As a result, the findings are readily applicable to a specific context. Despite being focussed on the UK, the findings of this review will be of global interest. Other high income countries, such as Australia, have a similar school system to the UK, and many of our findings may be applicable in those settings. Furthermore, some of the methodological challenges in the design of CRTs will be similar across different settings.

Future directions
The results provide a summary of the methodological characteristics of school-based CRTs with pupil health outcomes in the UK. To our knowledge, there has been no systematic review of the characteristics of school-based CRTs for evaluating interventions for improving education outcomes, despite the fact that the use of the CRT design is more established in that area. A comparison of methodology between health-based CRTs and education-based CRTs in the school setting would be valuable to both areas. The results in our review indicate that better information on the ICC is needed to design school-based CRTs with health outcomes. Cataloguing of ICCs from previous studies will help researchers choose better values for the assumed ICC when calculating sample size.

Conclusions
CRTs are increasingly used in the school setting for evaluating interventions for improving children's health and wellbeing. The emerging pool of published trials in the UK provides investigators and methodologists with relevant experiential knowledge for the design of future similar studies. This review of school-based CRTs has highlighted the need for more information on the ICCs to calculate the required sample size. Better reporting of the recruitment process in CRTs will help to identify common barriers to obtaining representative samples of schools and pupils. Finally, previous school-based CRTs may provide a useful source of data to identify the school-level characteristics that are strong predictors of pupil health outcomes and, therefore, potentially good factors on which to balance the randomisation.