 Research article
 Open Access
Comparison of nuisance parameters in pediatric versus adult randomized trials: a meta-epidemiologic empirical evaluation
BMC Medical Research Methodology volume 18, Article number: 7 (2018)
Abstract
Background
We wished to compare the nuisance parameters of pediatric vs. adult randomized trials (RCTs) and determine if the latter can be used in sample size computations of the former.
Methods
In this meta-epidemiologic empirical evaluation we examined meta-analyses from the Cochrane Database of Systematic Reviews with at least one pediatric RCT and at least one adult RCT. Within each meta-analysis of binary efficacy outcomes, we calculated the pooled control-group event rate (CER) separately across all pediatric trials and all adult trials, using random-effects models (REMs), and subsequently calculated the control-group event rate risk ratio (CERRR) of the pooled pediatric CERs vs. adult CERs. Within each meta-analysis with continuous outcomes, we calculated the pooled control-group effect standard deviation (CESD) separately across all pediatric trials and all adult trials, and subsequently calculated the CESD ratio of the pooled pediatric CESDs vs. adult CESDs. We then calculated across all meta-analyses the pooled CERRRs and pooled CESD ratios (primary endpoints) and the pooled magnitude of effect sizes of CERRRs and CESD ratios using REMs. A ratio < 1 indicates that pediatric trials have smaller nuisance parameters than adult trials.
Results
We analyzed 208 meta-analyses (135 for binary outcomes, 73 for continuous outcomes). For binary outcomes, pediatric RCTs had on average 10% smaller CERs than adult RCTs (summary CERRR: 0.90; 95% CI: 0.83, 0.98). For mortality outcomes the summary CERRR was 0.48 (95% CI: 0.31, 0.74). For continuous outcomes, pediatric RCTs had on average 26% smaller CESDs than adult RCTs (summary CESD ratio: 0.74).
Conclusions
Clinically relevant differences in nuisance parameters between pediatric and adult trials were detected. These differences have implications for the design of future studies. Extrapolation of nuisance parameters for sample size calculations from adult trials to pediatric trials should be done cautiously.
Background
For sample size calculations for randomized controlled trials (RCTs), some parameters, like the treatment difference (effect size) between the experimental and control intervention, have to be estimated. The investigators also have to estimate parameters that are not of direct interest but are required for the computations. These parameters are often termed nuisance parameters, the most common of which are the control-group event rate (CER) for binary outcomes and the standard deviation (SD) for continuous outcomes.
Determining an appropriate sample size for an RCT has always been a challenge because these nuisance parameters are unknown and need to be estimated [1, 2]. Usually, they can only be estimated from previous studies on the same topic, which is a problem when the new study is the first trial on the topic. Erroneously estimated nuisance parameters lead to underestimation or overestimation of the required sample size. In the former case the result is an underpowered study that may fail to reach a definitive conclusion [3]; in the latter case it is unnecessarily high study costs and longer recruitment periods [4].
Investigators sometimes extrapolate evidence on nuisance parameters from randomized trials in adults for sample size calculations in pediatric trials [1, 5, 6]. However, differences in clinical effects between adults and children do exist as has been previously shown by systematic empirical evaluations of the comparative effectiveness and comparative safety of medical interventions between adults and children [7,8,9,10]. A systematic empirical evaluation of nuisance parameters in pediatric RCTs, as compared to adult RCTs has not been previously performed.
We performed a meta-epidemiologic empirical evaluation to investigate whether nuisance parameters differ between pediatric and adult trials on the same topics, for the same compared interventions and for the same clinical outcomes. We also studied whether there are differences in the nuisance parameters according to the types of outcomes: binary versus continuous efficacy outcomes and mortality versus non-mortality outcomes.
Methods
Selection of meta-analyses
We addressed the above questions by examining 106 systematic reviews from the Cochrane Database of Systematic Reviews (CDSR) (Issue 1, 2007) that had previously been analyzed [9] in an empirical evaluation of the comparative effectiveness of medical interventions in children versus adults. These systematic reviews included 135 meta-analyses on diverse medical interventions with binary efficacy outcomes, with at least one adult RCT and at least one pediatric RCT per meta-analysis. In this prior analysis, the following types of meta-analyses were excluded: those involving surgical, psychological, behavioral, or social interventions, or evaluations of medical devices; those focusing exclusively on harms, without any primary efficacy outcome; meta-analyses with only continuous outcomes; meta-analyses for which it was not possible to discriminate between an experimental and a control intervention; those without any quantitative data synthesis; and those that did not cover both age groups and did not have any complementary systematic review focusing on the other age group. When a systematic review addressed different types of eligible comparisons of experimental versus control interventions, each comparison was considered for eligibility separately [9].
We further screened these 106 systematic reviews to identify additional eligible meta-analyses with continuous outcomes that had included at least one adult RCT and at least one pediatric RCT per meta-analysis. Furthermore, we screened 79 reviews previously excluded in the Contopoulos-Ioannidis et al. study since they did not contain binary efficacy primary outcomes. Four authors (BV, IT, MJW, and SW) extracted all continuous-outcome meta-analyses up to a maximum of five per review; if more than five continuous-outcome meta-analyses were reported, we chose the five with the largest number of participants. We excluded meta-analyses that used a standardized mean difference as their method of pooling, since the underlying standard deviations would be expected to differ (i.e. different scales for different trials). A total of 73 continuous-outcome meta-analyses (from both sources) were included in our continuous-outcome analysis (Fig. 1).
Data extraction
From each eligible meta-analysis with binary efficacy outcomes we extracted the following data from the included RCTs: a) compared interventions (experimental vs. control); b) event rate (events/total) in the control group; c) control group sample size; and d) age group of study participants (adults versus children). From each eligible meta-analysis with continuous outcomes we extracted the following study-level data: a) standard deviation of the effect size in the control group; b) control group sample size; c) mean of the effect size in the control group; and d) age group of study participants. To identify the experimental and control intervention when two active interventions were compared, we used the interpretation of the authors of the Cochrane review. For the study age group categorization, we used the classification reported in the Cochrane review. If this was not described, we used the age group classification rules previously applied by Contopoulos-Ioannidis et al. In brief, a study was characterized as “adult” if all included patients were >12 years and patients >20 years were also included, and as “pediatric” if all patients were <20 years and patients <12 years were also included.
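The fallback age-group rule can be written as a small function. This is only a sketch of the rule as stated in the text: the function name and the use of the minimum and maximum enrolled age as inputs are illustrative assumptions, and a study that fits neither definition is left unclassified.

```python
from typing import Optional


def classify_age_group(min_age: float, max_age: float) -> Optional[str]:
    """Classify a trial from the youngest and oldest enrolled ages (years).

    Hypothetical helper: the inputs are an assumption about how the age
    range of a trial would be recorded.
    """
    # "adult": all patients >12 years and some patients >20 years
    if min_age > 12 and max_age > 20:
        return "adult"
    # "pediatric": all patients <20 years and some patients <12 years
    if max_age < 20 and min_age < 12:
        return "pediatric"
    # Mixed or ambiguous age range: neither definition applies
    return None


print(classify_age_group(18, 65))
print(classify_age_group(2, 16))
```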
Data synthesis
Primary endpoints
Binary efficacy outcomes
Pooling of CER risk ratios (summary CERRRs): First, we calculated the CER for each individual study by dividing the number of events in the control group by the total sample size of the control group. Second, for each of these CERs we calculated its standard error using the normal scores method [11]. Third, separately for all pediatric and all adult RCTs, we used a random-effects model [12] within each meta-analysis to estimate a pooled pediatric CER and a pooled adult CER, respectively, together with its standard error and the heterogeneity statistic I-squared [13]. Fourth, for each meta-analysis, we computed the logarithm of the CERRR of the pediatric CER to the adult CER with its associated standard error. Finally, we calculated the summary CERRR between pediatric and adult trials and its 95% confidence interval across all meta-analyses by synthesizing the logarithms of the CERRRs from each meta-analysis, again using the random-effects model [14]. The logarithms were converted back to CERRRs for presentation purposes. A CERRR < 1 indicated that pediatric trials had smaller CERs than adult trials.
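The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' SAS/Stata code: the standard error uses the adjusted proportion of Agresti and Coull [11] as one reading of the second step, the pooling is done on the untransformed proportion scale, the standard error of the log ratio comes from a first-order (delta-method) approximation, and the trial counts are invented.

```python
import math


def cer_with_se(events, n, z=1.96):
    """CER and standard error for one trial (Agresti-Coull adjustment).

    Assumption: this may differ from the authors' exact computation.
    """
    n_adj = n + z ** 2
    p_adj = (events + z ** 2 / 2) / n_adj
    return p_adj, math.sqrt(p_adj * (1 - p_adj) / n_adj)


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling; returns (pooled, se, tau2, i2_percent)."""
    weights = [1 / se ** 2 for se in std_errors]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / total
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = total - sum(w ** 2 for w in weights) / total
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    re_weights = [1 / (se ** 2 + tau2) for se in std_errors]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / re_total
    if df == 0:
        i2 = None  # I-squared is undefined for a single study
    elif q > 0:
        i2 = max(0.0, (q - df) / q) * 100
    else:
        i2 = 0.0
    return pooled, math.sqrt(1 / re_total), tau2, i2


def log_cer_ratio(ped_cer, ped_se, adult_cer, adult_se):
    """Log of pediatric/adult CER ratio with a delta-method standard error."""
    log_rr = math.log(ped_cer / adult_cer)
    se = math.sqrt((ped_se / ped_cer) ** 2 + (adult_se / adult_cer) ** 2)
    return log_rr, se


# Invented (events, control-group size) pairs for one meta-analysis
ped_trials = [(5, 50), (8, 60)]
adult_trials = [(20, 80), (15, 70), (30, 100)]

ped_cer, ped_se, _, _ = dersimonian_laird(
    *zip(*(cer_with_se(e, n) for e, n in ped_trials)))
adult_cer, adult_se, _, _ = dersimonian_laird(
    *zip(*(cer_with_se(e, n) for e, n in adult_trials)))
example_log_rr, example_se = log_cer_ratio(ped_cer, ped_se, adult_cer, adult_se)
print(math.exp(example_log_rr))
```

The final step, pooling across meta-analyses, would apply the same `dersimonian_laird` function to the list of log ratios and their standard errors and then exponentiate the result.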
Exploratory analyses
Pooling of magnitude of absolute CERRRs: Here our interest was in the magnitude of the difference in estimated nuisance parameters, not its direction. The pooling of the CERs as described above also takes into account directional differences in CERRRs between pediatric and adult studies, so differences in opposite directions can cancel out. A minimal estimated difference in that analysis would therefore not reduce concerns about the comparability of nuisance parameters in any individual situation. It was thus important also to assess the magnitude of the differences by performing a separate meta-analysis of the “absolute CERRRs”. The methodology was the same as above, except that before the final step of pooling the CERRRs, we replaced any RR smaller than 1 with its reciprocal. This is mathematically equivalent to replacing the logarithm of the RR in the final meta-meta-analysis with its absolute value. This pooled estimate tells us how large (on average) the CERRRs were, regardless of which group (pediatric or adult) had the larger CER. For example, CERRRs of 1.25 and 0.80 will be transformed to absolute CERRRs of 1.25 and 1.25 respectively.
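The equivalence between taking reciprocals and taking absolute values of the log ratios can be checked with a short sketch. The function names are illustrative, and the unweighted mean is used only to show the cancellation effect, not to reproduce the random-effects pooling.

```python
import math


def absolute_ratio(rr):
    """Replace a ratio below 1 by its reciprocal, via |log(rr)|."""
    return math.exp(abs(math.log(rr)))


def geometric_mean(ratios):
    """Unweighted mean on the log scale, converted back to a ratio."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


ratios = [1.25, 0.80]
# Directional pooling: the two ratios cancel out
print(geometric_mean(ratios))
# Magnitude pooling: both ratios count as a 1.25-fold difference
print(geometric_mean([absolute_ratio(r) for r in ratios]))
```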
Subgroup analysis
The binary-outcome meta-analyses were subgrouped by mortality and non-mortality outcomes. We computed a separate summary CERRR for each of these groups.
Continuous efficacy outcomes
Pooling of control-group effect SD ratios (CESD ratios): First, we extracted the SD of the estimate in the control group of each pediatric and adult RCT. Second, we calculated the weighted average of the SDs for all pediatric RCTs (pooled pediatric CESD) and all adult RCTs (pooled adult CESD), respectively, within each meta-analysis, weighting by the square root of each study’s sample size. Third, we computed the ratio (CESD ratio) of the pediatric vs. adult control-group effect SDs within each meta-analysis by dividing the weighted average of the pediatric SDs (pooled pediatric CESD) by the weighted average of the adult SDs (pooled adult CESD). Finally, we calculated the summary CESD ratio between pediatric and adult trials across all meta-analyses as the weighted average of the logarithms of these ratios. The summary log SD ratio was exponentiated to get the summary CESD ratio. A CESD ratio < 1 indicates that pediatric trials had smaller SDs than adult trials.
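A minimal sketch of this weighting scheme follows. The function names are illustrative, and the weights for the final step are passed in by the caller because the text does not specify which sample size enters the across-meta-analysis weights.

```python
import math


def pooled_sd(sds, sample_sizes):
    """Weighted average of SDs, weighting by sqrt(control-group size)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    return sum(w * s for w, s in zip(weights, sds)) / sum(weights)


def cesd_ratio(ped_sds, ped_ns, adult_sds, adult_ns):
    """Pooled pediatric CESD divided by pooled adult CESD."""
    return pooled_sd(ped_sds, ped_ns) / pooled_sd(adult_sds, adult_ns)


def summary_ratio(ratios, weights):
    """Weighted average of log ratios, exponentiated back to a ratio."""
    mean_log = sum(w * math.log(r) for w, r in zip(weights, ratios)) / sum(weights)
    return math.exp(mean_log)


# Invented SDs and control-group sizes for one meta-analysis
ratio = cesd_ratio([4.0, 6.0], [25, 36], [9.0, 11.0], [49, 64])
print(ratio)
```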
Descriptive analyses
For graphical comparison of the CESDs of adult vs. pediatric RCTs within each meta-analysis, we divided the CESD of each individual RCT by the maximum SD in that meta-analysis. This gave a standardized CESD for each adult and pediatric RCT, allowing comparison of adult and pediatric CESDs both within each meta-analysis and across meta-analyses.
Software
Summary statistics and weighted averages were computed in SAS 9.3 (SAS Institute Inc., Cary NC). Meta-analyses were performed using the metan module in Stata 11.2 (StataCorp, College Station TX). Graphs were produced using Review Manager, S-Plus 8.2 (Tibco Software Inc.), and Microsoft Excel.
Results
We examined 208 meta-analyses, 135 with binary primary efficacy outcome data and 73 with continuous outcome data, from a total of 185 systematic reviews. The meta-analyses comprised a total of 2110 RCTs: 1515 adult RCTs (1126 with binary and 389 with continuous outcomes) and 595 pediatric RCTs (355 with binary and 240 with continuous outcomes). Each study could have contributed data to more than one meta-analysis within each systematic review (e.g. for different meta-analyses with different compared interventions or outcomes).
Binary outcome analyses
Summary pooled CERRRs
The summary pooled CERRR of pediatric CERs versus adult CERs across all 135 meta-analyses showed that pediatric RCTs had on average 10% smaller CERs (summary CERRR: 0.90; 95% CI: 0.83, 0.98). The individual CERRRs within each meta-analysis are shown in Fig. 2 and their distribution in Fig. 3. Overall, 13.3% of the examined meta-analyses had pediatric CERs that were at least fivefold smaller than the adult CERs; the opposite (adult RCTs with at least fivefold smaller CERs than pediatric RCTs) occurred in only 2.2% of cases. Moreover, in 60.7% of the meta-analyses, the pediatric CERs were smaller than the adult CERs.
I^{2} values were calculated for 172 of the 270 meta-analyses (135 each for pediatric and adult trials); for the remaining meta-analyses, I^{2} was not defined because they included only a single study. The distribution of these I^{2} values is shown in Table 1. Since these meta-analyses do not mix pediatric and adult data, these results show high heterogeneity among studies even when the pediatric/adult factor is taken out of the analysis.
Subgroup analysis: Mortality and non-mortality outcomes
There were 21 meta-analyses with mortality outcomes and 114 meta-analyses with non-mortality outcomes. Among the 21 meta-analyses with mortality outcomes, the summary pooled CERRR of pediatric CERs vs. adult CERs was 0.48 (95% CI: 0.31, 0.74). In 18 of these 21 meta-analyses the mortality CER in adult trials was larger than that in pediatric trials.
Among the 114 meta-analyses with non-mortality outcomes, the summary CERRR of pediatric vs. adult trials was 1.00 (95% CI: 0.92, 1.09).
Pooling using magnitude (absolute CERRR)
When we repeated the primary analysis using the “absolute CERRR” (Fig. 4) instead of the CERRR, the summary absolute CERRR was 1.46 (95% CI: 1.37, 1.56). This indicates that CERs in pediatric trials are on average either 1.5 times larger or 1.5 times smaller than in adult trials.
Continuous outcome analyses
Summary pooled CESD ratios
The summary CESD ratio, i.e. the weighted average (weighted on the log scale by the square root of the sample size) of the CESD ratios of pediatric CESDs vs. adult CESDs for the 73 meta-analyses with continuous efficacy outcomes, was 0.74. This indicates that on average the CESDs of pediatric RCTs were 26% smaller than those of their adult counterparts. The distribution of the CESD ratios is shown in Fig. 5. In 27.1% of the meta-analyses the CESD ratios between pediatric and adult RCTs differed by at least 2-fold in either direction. Furthermore, ignoring the direction of the difference and looking only at its size, the weighted magnitude of the CESD ratio was 1.76, which means that, on average, the CESDs in adult and pediatric RCTs differed by a factor of 1.76.
The distribution of the standardized SDs of the effect sizes in the control groups of all adult and pediatric studies within each meta-analysis is shown in Fig. 6. The SDs of the effect sizes in the control groups of pediatric vs. adult trials varied greatly both within meta-analyses and between meta-analyses.
Discussion
The control-group event rate nuisance parameter in pediatric trials was on average 10% smaller than that in adult trials. In our secondary analyses, when we considered the magnitude of the difference in control-group event rates rather than its direction, pediatric trials had an average control-group event rate that was either 1.5 times larger or 1.5 times smaller than that in adult RCTs. The reason for also considering the magnitude of the differences, ignoring their direction, is that an important issue to address is whether nuisance parameters in pediatric studies are likely to differ from those estimated from adult studies. We could have large differences in nuisance parameters (some overestimating, some underestimating) that average out to no difference. By analyzing magnitudes, we are presupposing a difference and trying to estimate how large that difference might be.
In over 60% of meta-analyses the control-group event rates in pediatric RCTs were smaller than those in adult trials, and in 36% of the meta-analyses relative differences in control-group event rates of at least 2-fold, in either direction, were identified. Specifically, for mortality outcomes, the control-group mortality rate in pediatric trials was on average about 50% lower than that in adult trials.
Large variation was also seen between pediatric and adult trials when continuous efficacy outcomes were considered. The pediatric control-group SD was on average 26% smaller than that of adult trials, and in 27% of the meta-analyses the relative difference in SDs between pediatric and adult trials was at least 2-fold in either direction. Moreover, when only the magnitude of the difference in control-group SDs was considered, pediatric trial SDs were on average either 1.8 times larger or 1.8 times smaller than adult trial SDs.
Large differences were seen among many studies with regard to nuisance parameters. To demonstrate how substantially an erroneous estimate of a nuisance parameter can affect the sample size computation, we take two examples from the meta-analyses included in this study. In the review “Antibiotics for the common cold and acute purulent rhinitis”, for the primary meta-analysis of the persisting symptoms outcome, we had an estimated CER in the adult population of 0.48. If one wishes to conduct a pediatric trial on the same topic, with a type I error probability of 0.05, 80% power, and an assumption that a 30% reduction in the number of patients with persisting symptoms would be required to demonstrate a clinically relevant antibiotic effect, the required sample size for the pediatric study would be 182 patients per arm. Under the assumption of a CER of 0.048, as was actually seen in the pediatric trials, the required sample size would be over 16 times larger, at 2962 patients per arm. To give an example using a continuous outcome, we take the review “Early emergency department treatment of acute asthma with systemic corticosteroids”. For the outcome of final PEFR, we had observed a mean SD in the adult studies of 32 L/min. Suppose we plan to conduct a pediatric trial on the same topic, using a type I error probability of 0.05, a power of 80%, and a minimal clinically important difference threshold of 5 L/min. Using the adult estimated SD of 32 L/min, we would compute that we require 643 patients in each of two groups. If we assume an SD estimate of 4.3 L/min, as was observed in the pediatric trials, we would only require a sample of 12 patients in each group, a sample size that is less than 2% of the originally computed sample size.
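The four sample sizes quoted above can be reproduced with standard textbook formulas. The sketch below is an assumption about the computation, since the text does not name the formula used: it takes a two-sided test, a normal approximation with a pooled variance under the null hypothesis for the binary outcome, and a minimal clinically important difference of 5 L/min for PEFR, which is the value consistent with the reported 643 and 12 patients per group.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm_binary(p_control, p_treat, alpha=0.05, power=0.80):
    """Patients per arm for comparing two proportions (two-sided test)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treat) / 2
    # Pooled variance under the null, separate variances under the alternative
    numerator = (
        z_a * sqrt(2 * p_bar * (1 - p_bar))
        + z_b * sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    ) ** 2
    return ceil(numerator / (p_control - p_treat) ** 2)


def n_per_arm_continuous(sd, delta, alpha=0.05, power=0.80):
    """Patients per arm for comparing two means with a common SD."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_a + z_b) * sd / delta) ** 2)


# Binary example: 30% relative reduction from the control-group event rate
print(n_per_arm_binary(0.48, 0.48 * 0.7))    # adult CER
print(n_per_arm_binary(0.048, 0.048 * 0.7))  # pediatric CER
# Continuous example: difference of 5 L/min in final PEFR
print(n_per_arm_continuous(32, 5))           # adult SD
print(n_per_arm_continuous(4.3, 5))          # pediatric SD
```

Under these assumptions the four calls give 182, 2962, 643 and 12 patients per arm, matching the figures in the text.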
It is clear from these examples that the erroneous estimations of these nuisance parameters can have important implications in the sample size computations, which can lead to either inappropriately powered studies that would not be able to answer the clinical question, or, on the other hand, to unnecessary waste of valuable clinical and financial resources. A third unwanted consequence might be that a proposed trial is not conducted because the erroneous estimate for the sample size is too large to be feasible.
We did observe a trend in both binary and continuous outcome data for pediatric RCTs to have smaller values of nuisance parameters (both CERs and CESDs) than their adult counterparts. Thus, when one does use these parameters from adult studies as surrogates for pediatric studies, the nuisance parameter is more likely to be overestimated than underestimated. This relationship has been well documented and graphed [2]. In the case of continuous data, an overestimation of the SD will always result in an overestimation of the sample size. The situation for binary data is more nuanced, as the sample size depends upon the ratio of the CER to the treatment-group event rate (the closer the ratio is to 1, the larger the required sample size), so an underestimation of the CER could lead to either an underestimated or an overestimated sample size. For example, if a pediatric population had an actual CER of 0.3 with a treatment-group event rate of 0.2, then underestimating the CER (say as 0.25) would result in a larger than required sample. However, if the treatment-group event rate was 0.4, then this underestimate of the CER would result in a sample that was too small.
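The two directions of error in the binary case can be illustrated numerically. The sketch below assumes the same normal-approximation formula for two proportions (two-sided alpha of 0.05, 80% power) and treats the treatment-group event rate as fixed while the assumed CER changes; both choices are assumptions made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_arm_binary(p_control, p_treat, alpha=0.05, power=0.80):
    """Patients per arm for comparing two proportions (two-sided test)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treat) / 2
    numerator = (
        z_a * sqrt(2 * p_bar * (1 - p_bar))
        + z_b * sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    ) ** 2
    return ceil(numerator / (p_control - p_treat) ** 2)


true_cer, assumed_cer = 0.30, 0.25

for treat_rate in (0.20, 0.40):
    needed = n_per_arm_binary(true_cer, treat_rate)
    planned = n_per_arm_binary(assumed_cer, treat_rate)
    verdict = "larger than required" if planned > needed else "too small"
    print(f"treatment rate {treat_rate}: needed {needed}, "
          f"planned {planned} ({verdict})")
```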
The discrepancies in the nuisance parameters between pediatric and adult trials were more prominent with mortality outcomes. In 86% of meta-analyses with mortality outcomes, the mortality CER in adult trials was larger than that in pediatric trials. On average, the control-group mortality rates in adult trials were two times larger than in pediatric trials. Mortality seems to be an outcome where extrapolation of adult control-group event rates for the estimation of pediatric trial sample sizes may give inaccurate results.
We should acknowledge some study limitations. Traditional meta-analysis of standard deviations was not feasible in the analysis of continuous outcomes since the systematic reviews did not provide enough information to ascertain variances around these nuisance parameters. Meta-analyses were done on a variety of outcomes, and thus the standard deviations were reported in different units and were not comparable across meta-analyses without standardization. In the analyses of both binary and continuous outcomes we observed considerable heterogeneity in nuisance parameters, not only between meta-analyses but also within them. We assumed that studies included within the same meta-analysis of a Cochrane review would have populations sufficiently similar to use them to impute nuisance parameters. However, extremely high between-study heterogeneity (I^{2} > 80%) was seen in more than half of the meta-analyses, which implies that even within studies of the same age group (i.e. adult or pediatric) we cannot expect nuisance parameters to routinely be similar. This suggests that not only should we be wary of extrapolating nuisance parameters for pediatric studies from adult studies, but we should be almost equally wary of extrapolating them from other pediatric studies.
With these limitations in mind and given the results we have seen here, it would be interesting to do a further and more refined analysis as to which factors may lead to better concordance between the nuisance parameters of pediatric and adult studies. This would be a difficult endeavor, however, since these factors would likely be specific to a subject area, and not necessarily generalizable. Analysis would then have to be limited to those areas where there are enough studies to do it properly.
Conclusion
This study provides evidence to raise awareness among investigators planning trials in children that, when the available data on nuisance parameters come mostly from adult studies, significant differences between pediatric and adult trials do exist. Extrapolation of nuisance parameters from adult trials to guide sample size calculations for pediatric trials should be done cautiously. Inappropriate extrapolations of nuisance parameters from adult trials to pediatric trials can lead to erroneous sample size calculations. Significant overestimation or underestimation of the required pediatric sample sizes can occur, particularly when the outcome is mortality. When there is doubt about the similarity between the population from which the estimates are derived and the prospective study population, either a blinded sample size review during the early phase of the new trial (internal pilot study [15]), a more flexible (sequential) design and analysis [1, 2], or use of a standardized effect size [16] should be considered to maximize trial efficiency.
Abbreviations
 CDSR:

Cochrane database of systematic reviews
 CER:

Control-group event rate
 CERRR:

Control-group event rate risk ratio
 CESD:

Control-group effect standard deviation
 RCT:

Randomized controlled trial
 SD:

Standard deviation
References
 1.
van der Tweel I, Askie L, Vandermeer B, Ellenberg S, Fernandes RM, Saloojee H, et al. Standards for research in (Star) child health; standard 4: determining adequate sample size. Pediatrics. 2012;129:S138–45.
 2.
Nikolakopoulos S, Roes K, van der Lee JH, van der Tweel I. Sample size calculations in pediatric clinical trials conducted in an ICU: a systematic review. Trials. 2014;15:274.
 3.
van der Lee JH, Tanck MW, Wesseling J, Offringa M. Pitfalls in the design and analysis of paediatric clinical trials: a case of a “failed” multicentre study, and potential solutions. Acta Paediatr. 2009;98:385–91.
 4.
Koletsi D, Fleming PS, Seehra J, Bagos PG, Pandis N. Are Sample Sizes Clear and Justified in RCTs Published in Dental Journals? PLoS ONE. 2014. https://doi.org/10.1371/journal.pone.0085949.
 5.
Goodman SN, Sladky JT. A Bayesian approach to randomized controlled trials in children using information from adults: the case of Guillain-Barré syndrome. Clinical Trials. 2005;2:305–10.
 6.
Schoenfeld DA, Zheng H, Finkelstein DM. Bayesian design using adult data to augment pediatric trials. Clinical Trials. 2009;6:297–304.
 7.
Klassen TP, Hartling L, Craig JC, Offringa M. Children are not just small adults: the urgent need for high-quality trial evidence in children. PLoS Med. 2008. https://doi.org/10.1371/journal.pmed.0050172.
 8.
Caldwell PHY, Murphy SB, Butow PN, Craig JC. Clinical trials in children. Lancet. 2004;364:803–11.
 9.
Contopoulos-Ioannidis DG, Baltogianni MS, Ioannidis JPA. Comparative effectiveness of medical interventions in adults versus children. J Pediatr. 2010;157:322–30.
 10.
Lathyris D, Panagiotou OA, Baltogianni M, Ioannidis JPA, Contopoulos-Ioannidis D. Safety of medical interventions in children versus adults. Pediatrics. 2014;133:e666–73.
 11.
Agresti A, Coull BA. Approximate is better than “exact” for interval estimation of binomial proportions. Am Stat. 1998;52:119–26.
 12.
DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clin Trials. 1986;7:177–88.
 13.
Higgins J, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21:1539–58.
 14.
Zhang Z. Meta-epidemiological study: a step by step approach by using R. J Evid Based Med. 2016. https://doi.org/10.1111/jebm.12191.
 15.
Wittes J, Brittain E. The role of internal pilot studies in increasing the efficiency of clinical trials. Stat Med. 1990;9(1–2):65–72.
 16.
Cohen J. A power primer. Psychol Bull. 1992;112(1):155–9.
Acknowledgements
Not applicable.
Funding
The research leading to these results has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 261060 (Global Research in Paediatrics, GriP network of excellence). Marijke Jansen-van der Weide, Stephanie Weinreich, and Paola Baiardi received funding from the EU FP7. Additional funding was received by the Alberta Research Centre for Health Evidence (ARCHE) and StaR Child Health (www.starchildhealth.org). These funding bodies had no role in the design, data collection, analysis, and interpretation of data for the study.
Availability of data and materials
The datasets analysed during this study are available from the corresponding author on reasonable request.
Author information
Contributions
The individual contributions of each of the authors were as follows: Study Conception and Design: BV, IT, DB, RF, LA, HS, PB, SE, JL. Acquisition of Data: BV, IT, MJW, SW, DCI. Analysis and Interpretation of Data: BV, IT, DCI, RF, SE, JL. Drafting of Manuscript: BV. Critical Revision: BV, IT, MJW, SW, DCI, DB, RF, LA, HS, PB, SE, JL. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Johanna H. van der Lee.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Keywords
 Nuisance parameters
 Extrapolation
 Sample size computations
 Pediatric trials
 Adult trials