The ratio of means method as an alternative to mean differences for analyzing continuous outcome variables in meta-analysis: A simulation study
 Jan O Friedrich^{1, 2, 3},
 Neill KJ Adhikari^{2, 4} and
 Joseph Beyene^{5, 6}
DOI: 10.1186/1471-2288-8-32
© Friedrich et al; licensee BioMed Central Ltd. 2008
Received: 07 March 2008
Accepted: 21 May 2008
Published: 21 May 2008
Abstract
Background
Meta-analysis of continuous outcomes traditionally uses mean difference (MD) or standardized mean difference (SMD; mean difference in pooled standard deviation (SD) units). We recently used an alternative ratio of mean values (RoM) method, calculating RoM for each study and estimating its variance by the delta method. SMD and RoM allow pooling of outcomes expressed in different units and comparisons of effect sizes across interventions, but RoM interpretation does not require knowledge of the pooled SD, a quantity generally unknown to clinicians.
Objectives and methods
To evaluate performance characteristics of MD, SMD and RoM using simulated data sets and representative parameters.
Results
MD was relatively bias-free. SMD exhibited bias (~5%) towards no effect in scenarios with few patients per trial (n = 10). RoM was bias-free except for some scenarios with broad distributions (SD 70% of mean value) and medium-to-large effect sizes (0.5–0.8 pooled SD units), for which bias ranged from -4 to 2% (negative sign denotes bias towards no effect). Coverage was as expected for all effect measures in all scenarios with minimal bias. RoM scenarios with bias towards no effect exceeding 1.5% demonstrated lower coverage of the 95% confidence interval than MD (89–92% vs. 92–94%). Statistical power was similar. Compared to MD, simulated heterogeneity estimates for SMD and RoM were lower in scenarios with bias because of decreased weighting of extreme values. Otherwise, heterogeneity was similar among methods.
Conclusion
Simulation suggests that RoM exhibits comparable performance characteristics to MD and SMD. Favourable statistical properties and potentially simplified clinical interpretation justify the ratio of means method as an option for pooling continuous outcomes.
Background
Meta-analysis is a method of statistically combining results of similar studies [1]. For binary outcome variables both difference and ratio methods are commonly used. For each study, the risk difference is the difference in proportions of patients experiencing the outcome of interest between the experimental and control groups, the risk ratio is the ratio of these proportions, and the odds ratio is the ratio of the odds. Meta-analytic techniques are used to combine each study's effect measure to generate a pooled effect measure. Standard meta-analytic procedures for each of these effect measures also estimate heterogeneity, which is the variability in treatment effects of individual trials beyond that expected by chance. Each effect measure (risk difference, risk ratio, odds ratio) has advantages and disadvantages in terms of consistency, mathematical properties, and ease of interpretation, implying that none is universally optimal [2].
In contrast, for continuous outcome variables, only difference methods are commonly used for group comparison studies [3]. If the outcome of interest is measured in identical units across trials, then the effect measure for each trial is the difference in means, and the pooled effect measure is the mean difference (MD), which more accurately should be described as the weighted mean of mean differences. If the outcome of interest is measured in different units, then each trial's effect measure is the difference in mean values divided by the pooled standard deviation of the two groups, and the pooled effect measure is the standardized mean difference (SMD), which more accurately should be described as the weighted mean of standardized mean differences. Normalizing the differences using the standard deviation allows pooling of such results, in addition to allowing comparison of effect sizes across unrelated interventions. By convention [4], SMDs of 0.2, 0.5, and 0.8 are considered "small", "medium", and "large" effect sizes, respectively. When trials in meta-analyses are weighted by the inverse of the variance of the effect measure (the weighting scheme generally used for MD and SMD), the pooled SMD has the unfavorable statistical property of negative bias (i.e. towards the null value) [5, 6]. Alternative methods of estimating the variance of individual trial SMDs used in the inverse variance method have been proposed to minimize this bias [5, 6].
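The per-trial difference measures described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the pooled standard deviation uses the usual degrees-of-freedom weighting, and no small-sample correction factor is applied.

```python
import math


def pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl):
    """Pooled standard deviation of two independent groups."""
    numerator = (n_exp - 1) * sd_exp ** 2 + (n_ctrl - 1) * sd_ctrl ** 2
    return math.sqrt(numerator / (n_exp + n_ctrl - 2))


def trial_md_and_smd(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (difference in means, uncorrected standardized difference)."""
    md = mean_exp - mean_ctrl
    smd = md / pooled_sd(sd_exp, n_exp, sd_ctrl, n_ctrl)
    return md, smd


# Illustrative trial: control mean 100, experimental mean 120, SD 40 in both arms.
md, smd = trial_md_and_smd(120.0, 40.0, 100, 100.0, 40.0, 100)
print(md, smd)
```

With these illustrative inputs the difference of 20 units corresponds to half a pooled standard deviation, a "medium" effect under the convention cited above.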
Table 1. Renal Physiological Parameters from Low-Dose Dopamine Meta-Analysis 1 Day After Starting Therapy [7].

| Parameter | Number of Trials | Number of Patients | Statistic | MD | SMD | RoM |
|---|---|---|---|---|---|---|
| Urine Output | 33 | 1654 | Estimate | | 0.49 | 1.24 |
| | | | 95% CI | | 0.29 to 0.69 | 1.14 to 1.35 |
| | | | p-value | | <0.001 | <0.001 |
| | | | I^{2} | | 71% | 77% |
| Serum Creatinine | 32 | 1807 | Estimate | -3.51 | -0.28 | 0.96 |
| | | | 95% CI | -6.71 to -0.23 | -0.51 to -0.06 | 0.93 to 0.99 |
| | | | p-value | 0.04 | 0.01 | 0.01 |
| | | | I^{2} | 73% | 79% | 73% |
| Creatinine Clearance | 22 | 1077 | Estimate | | 0.10 | 1.06 |
| | | | 95% CI | | -0.02 to 0.22 | 1.01 to 1.11 |
| | | | p-value | | 0.10 | 0.02 |
| | | | I^{2} | | 0% | 0% |
Given the similarity of these results, the objective of this current study was to test the hypothesis that MD, SMD, and RoM methods exhibit comparable performance characteristics in terms of bias, coverage and statistical power, using simulated data sets with a range of parameters commonly encountered in meta-analyses.
Methods
The RoM Effect Measure
For mean difference meta-analysis, one calculates a difference in mean values between the experimental and control groups for each study. (A review of the inverse-variance weighted fixed and random effects models and calculation of the point estimates and variances for MD and SMD using standard methods [including a correction factor for small samples for SMD] can be found in the Appendix). Instead of calculating a difference in mean values between the experimental and control groups, one can calculate a ratio of mean values. The following uses the natural logarithm scale to carry out such calculations, similar to statistical procedures for binary effect measures (risk ratio and odds ratio), due to its desirable statistical properties [14].
Log transformation of the ratio of mean values, which is not normally distributed, yields an approximately normally distributed quantity, so the 95% confidence interval can be approximated on the log scale. This approach is similar to that applied to other ratio measures such as the odds ratio (OR) and risk ratio (RR), used for binary group comparison studies.
As the ratio of means is unitless, this method can be used irrespective of the units used in trial outcome measures. Using the delta method limited to first-order terms results in a straightforward formula to estimate the variance of the ratio. Second-order terms, which are raised to the fourth power, are omitted because they contribute little to the variance estimate. For example, even with simulation parameters chosen to maximize the contribution of these second-order terms (ratio of the standard deviation to the mean equal to 0.7, and n = 10 patients per trial arm [see below]), including them would increase the variance estimate by less than 2.5%.
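A minimal sketch of the per-trial RoM calculation follows. It assumes the standard first-order delta-method approximation for the variance of the log of a ratio of two independent sample means, Var[ln(RoM)] ≈ (1/n_exp)(sd_exp/mean_exp)^2 + (1/n_ctrl)(sd_ctrl/mean_ctrl)^2. The function names and the normal-quantile value 1.96 are illustrative choices, not taken from the text.

```python
import math


def log_rom_and_variance(mean_exp, sd_exp, n_exp, mean_ctrl, sd_ctrl, n_ctrl):
    """Return ln(RoM) and its first-order delta-method variance for one trial."""
    if mean_exp * mean_ctrl <= 0:
        # The log of a zero or negative ratio is undefined.
        raise ValueError("group means must be non-zero and share the same sign")
    log_rom = math.log(mean_exp / mean_ctrl)
    variance = ((sd_exp / mean_exp) ** 2) / n_exp + ((sd_ctrl / mean_ctrl) ** 2) / n_ctrl
    return log_rom, variance


def rom_confidence_interval(log_rom, variance, z=1.96):
    """Approximate 95% CI computed on the log scale, then exponentiated."""
    half_width = z * math.sqrt(variance)
    return math.exp(log_rom - half_width), math.exp(log_rom + half_width)


# Illustrative trial: control mean 100, experimental mean 120, SD 40, 100 per arm.
log_rom, variance = log_rom_and_variance(120.0, 40.0, 100, 100.0, 40.0, 100)
low, high = rom_confidence_interval(log_rom, variance)
print(math.exp(log_rom), low, high)
```

Because the variance depends only on each arm's ratio of standard deviation to mean and its sample size, the calculation is unchanged if both arms are rescaled to different units.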
Design of the Simulation Study
Table 2. Parameter Values Used in the Simulated Data Sets

| Varied Parameter | Assigned Values |
|---|---|
| Standard Deviation (percentage of control mean value) | 10%, 40%, 70% |
| Number of Trials | 5, 10, 30 |
| Number of Experimental and Control Patients Per Trial Arm | 10, 100 |
| Effect Size (in standard deviation units) | 0.2, 0.5, 0.8 |
| Heterogeneity of Mean Values (in standard deviation units) | 0, 0.5 |
For each simulated scenario, k simulated study means and standard deviations were calculated from a collection of n individual values randomly sampled from a normal distribution. This was done independently for the control and experimental groups. For the control group the normal distribution from which values were randomly sampled had a mean value set to 100, resulting in a standard deviation of 10, 40, or 70. For the experimental group the normal distribution from which values were randomly sampled had a mean value of [100 + (effect size) × (standard deviation)] and the same standard deviation as the control group. Using the simulated study mean values and standard deviations, meta-analysis was carried out using MD, SMD, and RoM, with inverse variance weighting and a random effects model as described in the Appendix. With the parameters described above, the expected MD = (effect size) × (standard deviation), the expected SMD = effect size, and the expected RoM = 1 + [(effect size) × (standard deviation)/(mean value in control group [= 100])], where effect size varies as 0.2, 0.5 and 0.8, and the standard deviation varies as 10, 40, and 70.
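The data-generating step and the expected effect measures can be sketched as follows. This is an illustration in Python rather than the SAS code used for the study; the function names, the dictionary layout, and the seed are hypothetical, and the pooling step is not shown.

```python
import random
import statistics

CONTROL_MEAN = 100.0  # control-group mean stated in the text


def expected_effects(effect_size, sd, control_mean=CONTROL_MEAN):
    """Expected MD, SMD and RoM for a scenario (effect size in SD units)."""
    md = effect_size * sd
    smd = effect_size
    rom = 1.0 + md / control_mean
    return md, smd, rom


def simulate_trial(n, effect_size, sd, rng, control_mean=CONTROL_MEAN):
    """Draw n normal values per arm and summarise each arm by mean and SD."""
    ctrl = [rng.gauss(control_mean, sd) for _ in range(n)]
    exp = [rng.gauss(control_mean + effect_size * sd, sd) for _ in range(n)]
    return {
        "mean_exp": statistics.mean(exp),
        "sd_exp": statistics.stdev(exp),
        "mean_ctrl": statistics.mean(ctrl),
        "sd_ctrl": statistics.stdev(ctrl),
        "n": n,
    }


# One simulated meta-analysis data set: k = 5 trials, n = 10 per arm,
# effect size 0.5 SD units, SD 40% of the control mean.
rng = random.Random(2008)  # arbitrary seed, for reproducibility only
trials = [simulate_trial(10, 0.5, 40.0, rng) for _ in range(5)]
print(expected_effects(0.5, 40.0))
```

For an effect size of 0.5 and a standard deviation of 40, the expected values are MD = 20, SMD = 0.5 and RoM = 1.2.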
Heterogeneity for each scenario was introduced by setting τ = 0.5 standard deviation units. This was achieved by introducing an additional study-specific standard deviation equal to 0.5/√2 standard deviation units to both the experimental and the control groups, since the study-specific standard deviation of the difference between experimental and control groups is given by √[(0.5/√2)^{2} + (0.5/√2)^{2}] = 0.5 standard deviation units. In other words, study-specific variance was added to experimental and control group means but the baseline difference and ratio in mean values was held constant. Since a given degree of result heterogeneity may be reflected differently in the difference methods (MD and SMD) compared to the ratio method (RoM), heterogeneity was added at the level of the individual mean values rather than the level of the treatment effects to ensure that the degree of heterogeneity added was comparable between the three methods. Heterogeneity of each meta-analysis scenario is presented using I^{2}. Since I^{2} = τ^{2}/(τ^{2} + s^{2}), where s^{2} is the variance of the effect measure, as described in the Appendix, the expected value for I^{2} for τ = 0.5 can be calculated to be 56% when n = 10 patients per trial arm, corresponding to the introduction of a moderate (i.e. I^{2} = 50–75%) degree of heterogeneity, and 93% when n = 100 patients per trial arm, corresponding to a high (i.e. I^{2} > 75%) degree of heterogeneity [11, 12].
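The expected I^{2} values quoted above can be reproduced with a short calculation. The sketch assumes that, in standard deviation units and with equal arms, the within-study variance of a difference in means is s^{2} = 1/n + 1/n. That assumption is ours, not a formula stated here, but it recovers the 56% and 93% figures. Function names are illustrative.

```python
import math


def study_shift_sd(tau):
    """Per-arm study-specific SD so that the between-arm difference has SD tau."""
    return tau / math.sqrt(2)


def expected_i_squared(tau, n_per_arm):
    """I^2 = tau^2 / (tau^2 + s^2), all quantities in standard deviation units."""
    # Assumption: variance of a difference in two means with unit SD, equal arms.
    s_squared = 2.0 / n_per_arm
    return tau ** 2 / (tau ** 2 + s_squared)


per_arm = study_shift_sd(0.5)
print(math.hypot(per_arm, per_arm))           # combined SD of the difference
print(round(100 * expected_i_squared(0.5, 10)))
print(round(100 * expected_i_squared(0.5, 100)))
```

Under the same assumption, 1000 patients per trial arm gives an expected I^{2} of about 99%.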
The baseline scenarios assumed equal numbers of participants in both the experimental and control arms and were constructed by randomly selecting data points from normally distributed data. Separate sensitivity analyses were also carried out to determine 1) the effect of unequal numbers of participants (chosen to have a 2:1 and 1:2 experimental:control arm ratio but keeping the total number of participants constant (i.e. 14:6 instead of 10:10 and 134:66 instead of 100:100)) and 2) the effect of selecting the data points from an underlying skewed distribution. The skewed distribution was empirically constructed by mixing a combination of 3 normal distributions with identical standard deviations (0.24) centered at 0.84, 1.42 and 1.92 and weighted 77%, 17%, and 6% respectively in the overall mixed skewed distribution. This created a graphical distribution appearing markedly skewed on visual inspection with an overall mean of unity and overall standard deviation similar to that of the middle normally distributed data scenario (i.e. 40% of the control mean value), but skewness (third standardized moment about the mean [15]) of 0.88.
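The stated properties of the skewed distribution can be checked analytically from the mixture components. The sketch below uses the standard moment formulas for a mixture of normal distributions with a common component standard deviation; the function and constant names are illustrative.

```python
import math

WEIGHTS = (0.77, 0.17, 0.06)   # mixing proportions from the text
CENTERS = (0.84, 1.42, 1.92)   # component means from the text
COMPONENT_SD = 0.24            # common component standard deviation


def mixture_moments(weights, centers, sd):
    """Return (mean, standard deviation, skewness) of a normal mixture."""
    mean = sum(w * c for w, c in zip(weights, centers))
    # Variance = within-component variance + between-component variance.
    variance = sd ** 2 + sum(w * (c - mean) ** 2 for w, c in zip(weights, centers))
    # Third central moment of each shifted normal: d^3 + 3 * d * sd^2.
    third = sum(
        w * ((c - mean) ** 3 + 3 * (c - mean) * sd ** 2)
        for w, c in zip(weights, centers)
    )
    return mean, math.sqrt(variance), third / variance ** 1.5


mean, sd, skewness = mixture_moments(WEIGHTS, CENTERS, COMPONENT_SD)
print(round(mean, 2), round(sd, 2), round(skewness, 2))
```

The computed mean is close to unity, the standard deviation close to 0.40, and the skewness close to 0.88, in line with the description above.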
For each scenario, data points were generated and analyzed 10,000 times and performance characteristics of each effect measure were assessed. These consisted of bias (expressed as a percentage of the true parameter value, directed away or towards the null value [zero for MD and SMD, and one for RoM]), coverage (of the 95% confidence interval of the simulated result, i.e. the percentage of simulations in which the true parameter value falls within the 95% confidence interval of the simulated result), statistical power (the percentage of simulations in which the 95% confidence interval of the simulated result yields a significant treatment effect, by excluding zero for MD and SMD or one for RoM), and heterogeneity (expressed as I^{2}). Simulations were programmed and carried out using SAS (version 8.2, Cary, NC).
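The three replicate-level performance measures can be sketched as below. This is one reading of the definitions above rather than the authors' code: bias is taken as the mean estimate minus the true value, as a percentage of the true value, with a negative sign meaning towards the null. The function name and the small input lists are hypothetical.

```python
def performance(estimates, ci_lows, ci_highs, true_value, null_value):
    """Summarise simulated pooled results as percent bias, coverage and power."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    # Sign convention: negative means the estimate is pulled towards the null.
    direction = 1.0 if true_value > null_value else -1.0
    bias_pct = 100.0 * direction * (mean_estimate - true_value) / abs(true_value)
    covered = sum(lo <= true_value <= hi for lo, hi in zip(ci_lows, ci_highs))
    significant = sum(
        lo > null_value or hi < null_value for lo, hi in zip(ci_lows, ci_highs)
    )
    return {
        "bias_pct": bias_pct,
        "coverage_pct": 100.0 * covered / n,
        "power_pct": 100.0 * significant / n,
    }


# Four made-up pooled RoM results for a scenario with true RoM = 1.2 (null = 1).
summary = performance(
    estimates=[1.18, 1.20, 1.16, 1.22],
    ci_lows=[1.05, 0.98, 1.01, 1.21],
    ci_highs=[1.30, 1.45, 1.19, 1.40],
    true_value=1.2,
    null_value=1.0,
)
print(summary)
```

In the study each scenario was summarised over 10,000 replicates; the four values here only demonstrate the bookkeeping.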
Results
Table 3. Simulation Results (Normal Distribution, Equal Experimental and Control Groups, Standard Deviation 40% of Control Mean Value).
% Bias  % Coverage  % Statistical Power  I^{2} (%)  
τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0.5s  
Δ  n (exp/contr)  k  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM 
SMD = 0.2  10/10  5  0  4  0  0  5  1  95  97  95  90  92  90  15  11  14  15  13  14  59  48  55 
10  0  5  0  1  5  0  95  97  95  92  93  92  27  22  26  19  18  19  60  48  56  
30  0  5  0  1  6  0  94  96  95  94  94  94  66  61  64  39  38  38  60  48  56  
MD = 8  100/100  5  0  0  0  0  0  1  96  97  96  88  88  87  82  82  82  21  21  21  93  92  92 
RoM = 1.08  10  0  0  0  0  0  0  96  96  96  92  92  91  99  99  99  27  27  28  93  92  92  
30  0  0  0  0  0  0  96  96  96  94  94  94  100  100  100  56  58  59  93  92  92  
SMD = 0.5  10/10  5  0  4  0  0  5  0  95  97  95  90  92  90  64  57  62  43  40  42  59  48  56 
10  0  5  0  0  5  0  95  97  95  92  93  92  91  89  90  66  64  65  60  48  56  
30  0  5  0  0  6  0  94  96  94  94  93  93  100  100  100  98  97  97  60  47  57  
MD = 20  100/100  5  0  0  0  0  0  1  96  97  96  88  88  88  100  100  100  61  61  61  93  92  92 
RoM = 1.2  10  0  0  0  0  0  0  96  96  96  92  92  91  100  100  100  85  86  86  93  92  92  
30  0  0  0  0  1  0  96  96  96  94  94  93  100  100  100  100  100  100  93  92  92  
SMD = 0.8  10/10  5  0  4  0  0  5  0  95  97  95  90  92  90  95  94  95  75  72  73  59  47  56 
10  0  5  0  0  5  0  95  96  95  92  92  92  100  100  100  95  95  95  60  47  57  
30  0  5  1  0  6  0  94  94  94  94  92  93  100  100  100  100  100  100  60  46  57  
MD = 32  100/100  5  0  0  0  0  0  1  96  96  96  88  88  87  100  100  100  91  91  91  93  92  92 
RoM = 1.32  10  0  0  0  0  0  1  96  96  96  92  92  91  100  100  100  100  100  100  93  91  92  
30  0  0  0  0  1  0  96  96  96  94  94  93  100  100  100  100  100  100  93  91  92 
Table 4. Simulation Results (Normal Distribution, Equal Experimental and Control Groups, Standard Deviation 10% of Control Mean Value).
% Bias  % Coverage  % Statistical Power  I^{2} (%)  
τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0.5s  
Δ  n (exp/contr)  k  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM 
SMD = 0.2  10/10  5  0  4  0  0  5  0  95  97  95  90  92  90  15  11  15  15  13  15  59  48  59 
10  0  5  0  1  5  0  95  97  95  92  93  92  27  22  27  19  18  19  60  48  60  
30  0  5  0  1  6  0  94  96  94  94  94  94  66  61  66  39  38  39  60  48  60  
MD = 2  100/100  5  0  0  0  0  0  0  96  97  96  88  88  88  82  82  82  21  21  21  93  92  93 
RoM = 1.02  10  0  0  0  0  0  0  96  96  96  92  92  92  99  99  99  27  27  27  93  92  93  
30  0  0  0  0  0  0  96  96  96  94  94  94  100  100  100  56  58  57  93  92  93  
SMD = 0.5  10/10  5  0  4  0  0  5  0  95  97  95  90  92  90  64  57  63  43  40  43  59  48  59 
10  0  5  0  0  5  0  95  97  95  92  93  92  91  89  91  66  64  66  60  48  60  
30  0  5  0  0  6  0  94  96  94  94  93  94  100  100  100  98  97  98  60  47  60  
MD = 5  100/100  5  0  0  0  0  0  0  96  97  96  88  88  88  100  100  100  61  61  61  93  92  93 
RoM = 1.05  10  0  0  0  0  0  0  96  96  96  92  92  91  100  100  100  85  86  85  93  92  93  
30  0  0  0  0  1  0  96  96  96  94  94  94  100  100  100  100  100  100  93  92  93  
SMD = 0.8  10/10  5  0  4  0  0  5  0  95  97  95  90  92  90  95  94  95  75  72  74  59  47  59 
10  0  5  0  0  5  0  95  96  95  92  92  92  100  100  100  95  95  95  60  47  60  
30  0  5  0  0  6  0  94  94  94  94  92  94  100  100  100  100  100  100  60  46  60  
MD = 8  100/100  5  0  0  0  0  0  0  96  96  96  88  88  88  100  100  100  91  91  91  93  92  93 
RoM = 1.08  10  0  0  0  0  0  0  96  96  96  92  92  92  100  100  100  100  100  100  93  91  93  
30  0  0  0  0  1  0  96  96  96  94  94  94  100  100  100  100  100  100  93  91  93 
Table 5. Simulation Results (Normal Distribution, Equal Experimental and Control Groups, Standard Deviation 70% of Control Mean Value).
% Bias  % Coverage  % Statistical Power  I^{2} (%)  
τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0.5s  
Δ  n (exp/contr)  k  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM 
SMD = 0.2  10/10  5  0  4  0  1  5  0  95  97  96  90  92  91  15  11  12  15  13  12  59  48  45 
10  0  5  1  0  5  1  95  97  96  92  93  93  27  22  23  19  18  16  60  48  45  
30  0  5  1  0  6  1  94  96  95  94  94  93  66  61  60  38  38  34  60  47  46  
MD = 14  100/100  5  0  0  0  0  1  2  96  97  96  88  88  87  82  82  82  21  21  22  93  92  91 
RoM = 1.14  10  0  0  0  0  0  1  96  96  96  92  92  90  99  99  99  27  27  31  93  92  91  
30  0  0  0  0  0  1  96  96  96  94  94  92  100  100  100  56  58  62  93  92  91  
SMD = 0.5  10/10  5  0  4  1  0  5  1  95  97  95  90  92  91  64  57  58  43  40  38  59  48  47 
10  0  5  2  0  5  2  95  97  95  92  93  92  91  89  88  66  64  61  60  47  47  
30  0  5  2  0  6  3  94  96  92  94  93  91  100  100  100  98  97  97  60  47  48  
MD = 35  100/100  5  0  0  0  0  0  2  96  97  96  88  88  87  100  100  100  61  61  63  93  92  91 
RoM = 1.35  10  0  0  0  0  0  1  96  96  96  92  92  90  100  100  100  85  86  88  93  92  91  
30  0  0  0  0  1  1  96  96  95  94  94  92  100  100  100  100  100  100  93  92  91  
SMD = 0.8  10/10  5  0  4  1  0  5  2  95  97  95  90  92  90  95  94  93  74  72  71  59  47  48 
10  0  5  2  0  5  3  95  96  94  92  92  91  100  100  100  95  95  94  60  46  48  
30  0  5  3  0  6  4  94  94  90  94  92  89  100  100  100  100  100  100  60  46  49  
MD = 56  100/100  5  0  0  0  0  0  2  96  96  96  88  88  87  100  100  100  91  91  92  93  92  91 
RoM = 1.56  10  0  0  0  0  0  2  96  96  96  92  92  90  100  100  100  100  100  100  93  91  91  
30  0  0  0  0  1  1  96  96  95  94  94  92  100  100  100  100  100  100  93  91  91 
Table 6. Simulation Results (Skewed Distribution, Equal Experimental and Control Groups, Standard Deviation 40% of Control Mean Value).
% Bias  % Coverage  % Statistical Power  I^{2} (%)  
τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0.5s  
Δ  n (exp/contr)  k  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM 
SMD = 0.2  10/10  5  0  2  0  1  5  1  95  97  95  89  92  90  16  12  16  15  13  15  60  50  60 
10  0  3  0  0  4  0  94  97  94  92  93  92  29  22  28  19  17  19  61  49  61  
30  0  4  0  0  5  0  94  96  94  94  94  94  68  62  67  38  37  36  61  49  61  
MD = 8  100/100  5  0  0  0  1  0  1  96  96  96  88  88  88  82  81  82  21  21  21  93  92  92 
RoM = 1.08  10  0  1  0  0  0  0  96  96  96  92  92  91  99  99  99  27  28  28  93  92  92  
30  0  0  0  0  0  0  96  96  96  94  93  93  100  100  100  56  58  59  93  92  92  
SMD = 0.5  10/10  5  0  2  1  1  4  1  95  97  95  89  92  90  65  59  64  43  40  42  60  49  60 
10  0  3  0  0  4  0  94  97  94  92  93  92  92  90  91  65  63  63  61  49  61  
30  0  4  0  0  5  0  94  96  94  94  94  94  100  100  100  98  98  98  61  48  61  
MD = 20  100/100  5  0  0  0  0  0  1  96  96  96  88  88  88  100  100  100  61  61  61  93  92  92 
RoM = 1.2  10  0  0  0  0  0  0  96  96  96  92  91  91  100  100  100  85  86  86  93  92  92  
30  0  0  0  0  0  0  96  96  96  94  93  93  100  100  100  100  100  100  93  92  92  
SMD = 0.8  10/10  5  0  2  1  0  3  1  95  97  95  89  92  90  95  94  95  74  72  73  60  48  60 
10  0  3  1  0  4  0  94  96  94  92  92  92  100  100  100  95  95  95  61  48  61  
30  0  4  0  0  5  0  94  95  94  94  93  94  100  100  100  100  100  100  61  48  62  
MD = 32  100/100  5  0  0  0  0  0  1  96  96  96  88  88  88  100  100  100  91  91  91  93  92  92 
RoM = 1.32  10  0  0  0  0  0  1  96  96  96  92  92  91  100  100  100  100  100  100  93  92  92  
30  0  0  0  0  0  0  96  96  96  94  93  93  100  100  100  100  100  100  93  91  92 
Bias
The MD method exhibits minimal bias (less than 0.5%) in almost all scenarios. In contrast, there is one principal source of bias for the SMD method and two for the RoM method.
SMD Bias Towards No Effect with Smaller Trials
SMD is biased towards zero or no effect, with the bias more prominent when the number of patients per study is small. Table 3 shows this bias to be (-)4 to 6% in the baseline scenario with 10 patients per trial, regardless of the number of trials. The bias decreases in the 100-patient-per-trial scenarios. The lower weighting of extreme values (i.e. values far from no effect [zero]) in the SMD method also results in decreased heterogeneity (I^{2}) in the scenarios with 10 patients per trial, where the bias is largest (discussed in the heterogeneity section below). Sampling variance alone results in bias toward zero, but this bias is even larger when heterogeneity is present, since heterogeneity further increases the dispersion (or the effective variance) of the results. These findings are consistent with theoretical considerations (see Appendix).
RoM Bias
In contrast, the RoM bias depends on the relative effects of two competing sources of bias. The first is a negative bias towards unity or no effect due to properties of the variance of ln(RoM) and is most pronounced when the number of patients per trial is small. The second is a bias away from unity or no effect occurring when heterogeneity is present, due to properties of RoM. Although bias from both sources is absent or less than 0.5% in all scenarios with 100 patients per trial and no heterogeneity, one or both sources of bias can be significant in other scenarios. These are described in more detail below.
RoM Bias Towards No Effect with Smaller Trials
Table 7. Simulation Results (Normal Distribution, 2:1 Experimental to Control Group Sizes, Standard Deviation 40% of Control Mean Value).
% Bias  % Coverage  % Statistical Power  I^{2} (%)  
τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0.5s  
Δ  n (exp/contr)  k  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM 
SMD = 0.2  14/6  5  1  5  0  0  5  0  94  97  94  90  92  90  15  10  11  14  12  12  59  44  54 
10  1  4  1  2  4  1  94  97  93  92  93  91  25  18  18  18  16  14  60  44  55  
30  0  5  1  1  6  1  94  97  91  93  94  93  59  53  44  36  35  27  60  44  55  
MD = 8  134/66  5  0  0  0  0  1  0  96  97  97  88  88  88  78  77  76  21  21  21  92  91  91 
RoM = 1.08  10  0  0  0  0  1  0  96  96  96  92  91  91  98  98  97  27  27  27  92  91  91  
30  0  0  0  0  1  0  96  96  96  94  94  94  100  100  100  56  57  57  92  91  91  
SMD = 0.5  14/6  5  0  4  1  0  5  0  94  97  94  90  92  90  57  50  50  41  37  36  59  44  54 
10  0  5  1  1  5  1  94  97  93  92  93  91  86  83  79  62  61  56  60  44  55  
30  0  5  1  0  6  1  94  96  91  93  93  92  100  100  100  96  97  94  60  43  56  
MD = 20  134/66  5  0  0  0  0  1  1  96  96  97  88  88  88  100  100  100  60  61  60  92  91  91 
RoM = 1.2  10  0  0  0  0  1  0  96  96  96  92  92  91  100  100  100  85  86  85  92  91  91  
30  0  0  0  0  1  0  96  96  95  94  94  94  100  100  100  100  100  100  92  91  91  
SMD = 0.8  14/6  5  0  4  1  0  5  0  94  97  93  90  92  90  91  89  87  71  69  66  59  43  54 
10  0  5  1  0  5  1  94  96  92  92  93  91  100  100  99  93  93  91  60  43  55  
30  0  5  2  0  6  1  94  94  90  93  92  92  100  100  100  100  100  100  60  42  56  
MD = 32  134/66  5  0  0  0  0  1  1  96  96  96  88  88  88  100  100  100  91  91  91  92  91  91 
RoM = 1.32  10  0  0  0  0  1  0  96  96  96  92  92  91  100  100  100  100  100  100  92  91  91  
30  0  0  0  0  1  0  96  96  95  94  94  93  100  100  100  100  100  100  92  91  91 
Table 8. Simulation Results (Normal Distribution, 1:2 Experimental to Control Group Sizes, Standard Deviation 40% of Control Mean Value).
% Bias  % Coverage  % Statistical Power  I^{2} (%)  
τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0s  τ = 0.5s  τ = 0.5s  
Δ  n (exp/contr)  k  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM  MD  SMD  RoM 
SMD = 0.2  6/14  5  1  3  1  1  3  2  94  97  94  90  93  90  15  10  17  14  12  16  59  44  55 
MD = 8  10  0  4  1  2  4  1  94  97  94  92  94  92  25  18  30  18  16  22  60  44  56  
RoM = 1.08  30  0  5  1  1  5  1  93  97  93  94  94  93  59  53  69  36  35  44  60  44  56  
66/134  5  0  0  0  0  1  1  96  96  96  88  88  87  77  77  79  20  20  21  92  91  92  
10  0  1  0  0  1  0  96  96  96  92  92  91  98  98  98  26  27  28  92  91  92  
30  0  0  0  0  1  0  96  96  96  94  94  94  100  100  100  56  57  59  92  91  92  
SMD = 0.5  6/14  5  0  3  1  1  4  1  94  97  95  90  92  90  57  50  61  41  37  43  59  44  56 
MD = 20  10  0  4  1  1  5  1  94  97  94  92  93  92  86  83  89  62  60  66  60  44  57  
RoM = 1.2  30  0  5  1  0  6  1  93  96  94  94  94  94  100  100  100  96  97  98  60  43  57  
66/134  5  0  0  0  0  1  1  96  96  96  88  88  88  100  100  100  61  61  61  92  91  92  
10  0  0  0  0  1  0  96  96  96  92  92  91  100  100  100  84  85  86  92  91  92  
30  0  0  0  0  1  0  96  96  96  94  94  94  100  100  100  100  100  100  92  91  92  
SMD = 0.8  6/14  5  0  4  0  0  4  1  94  97  95  90  92  90  91  89  93  72  69  74  59  43  56 
MD = 32  10  0  5  0  0  5  1  94  96  95  92  93  92  100  100  100  93  93  94  60  43  57  
RoM = 1.32  30  0  5  0  0  6  0  93  95  94  94  92  94  100  100  100  100  100  100  60  42  57  
66/134  5  0  0  0  0  1  1  96  96  96  88  88  88  100  100  100  90  91  91  92  91  92  
10  0  0  0  0  1  1  96  96  96  92  92  92  100  100  100  100  100  100  92  91  92  
30  0  0  0  0  1  0  96  96  96  94  94  94  100  100  100  100  100  100  92  91  92 
RoM Bias Away from No Effect Due to Heterogeneity
The second RoM bias is a bias away from unity (or no effect) that occurs only in the scenarios with heterogeneity and is due to the effects of heterogeneity on the RoM. It is most apparent in the scenarios with heterogeneity and higher standard deviations (70% of the mean control value, as shown in Table 5) when the number of patients per trial is 100, since in the 100-patient-per-trial scenarios the simultaneously present bias towards unity (or no effect) discussed above is less than 0.5%. (In the scenarios with heterogeneity and 10 patients per trial shown in Table 5, the overall bias is due to the combined effect of both the bias towards unity discussed in the previous section and the bias away from unity discussed in this section.) This bias away from unity ranges up to 1–2% in the scenarios with 100 patients per trial and occurs for the following reason. Heterogeneity is introduced in the simulations by shifting individual trial experimental and control mean values upwards or downwards. When RoM > 1, as in the scenarios presented, the RoM value is decreased in trials where both the experimental and control means are shifted upwards, and increased in trials where both means are shifted downwards. For upward and downward shifts of equal magnitude, the increase in the RoM for downward shifts is greater than the decrease in the RoM for upward shifts. This results in a pooled RoM value that is greater in the presence of heterogeneity, which produces the bias away from unity. Table 5 demonstrates that this bias is higher for increasing effect sizes (i.e. larger ratios). Comparing the results where the standard deviation is higher (70% of the mean control value, Table 5) to those where the standard deviation is lower (e.g. 40% of the mean control value, Table 3) demonstrates that this bias increases with increasing standard deviations due to an increased range of individual trial RoM estimates.
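The asymmetry described above can be seen with a small numerical example. The values are illustrative (control mean 100, experimental mean 120, a common shift of 20 units applied to both arms) and are not taken from the simulations.

```python
import math


def rom_after_shift(mean_exp, mean_ctrl, shift):
    """RoM when both group means are shifted by the same amount."""
    return (mean_exp + shift) / (mean_ctrl + shift)


base = rom_after_shift(120.0, 100.0, 0.0)     # unshifted ratio
up = rom_after_shift(120.0, 100.0, 20.0)      # both means shifted upwards
down = rom_after_shift(120.0, 100.0, -20.0)   # both means shifted downwards

decrease = base - up      # how far the ratio falls for the upward shift
increase = down - base    # how far the ratio rises for the downward shift

# The same asymmetry holds on the log scale, where pooling is carried out.
mean_log = (math.log(up) + math.log(down)) / 2
print(decrease, increase, mean_log > math.log(base))
```

The downward shift raises the ratio by more than the equal upward shift lowers it, so averaging the two shifted trials gives a value above the unshifted ratio.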
Coverage
The proportion of the scenarios for which the 95% confidence interval contains the true effect size is relatively similar among the three methods for most scenarios. The coverage is close to 95%, as expected, for the scenarios with no heterogeneity, but decreases when heterogeneity is introduced. The lowest coverage of 87–88% is equally low with all three methods and occurs when heterogeneity is present with 5 trials and 100 patients per trial arm. This low coverage occurs because, with the degree of heterogeneity in these scenarios (I^{2} = 92–93%), the mean values can be widely variable. With only 5 trials, the pooled value of these mean values can be far from the true value, and because of the large number of patients per trial, the confidence intervals for the individual trials are relatively narrow, resulting in missed coverage of the true value. Increasing the number of patients to 1000 per trial arm still results in coverage rates of 87–88% (results not shown), because the degree of missed coverage is dominated by the degree of heterogeneity, and the increase in I^{2} from 92–93% in the scenarios with 100 patients per trial arm to 99% in the scenarios with 1000 patients per trial arm is relatively small.
Statistical Power to Detect a Significant Treatment Effect
As expected, statistical power (the proportion of scenarios yielding a significant treatment effect) increases with increasing effect size, number of patients, and number of trials, and decreases with more heterogeneity. Power also decreases with imbalanced patient allocation between groups (Tables 7 and 8) because confidence intervals are wider compared to balanced allocation scenarios. Statistical power is similar among the three methods in most scenarios. In scenarios where SMD or RoM are biased towards no effect, the power decreases compared to MD, and in the scenarios where RoM is biased away from unity, power increases compared to MD. Overall, the effect of these biases is small so that the proportion of scenarios yielding significant treatment effects is within 5 percentage points for almost all scenarios.
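The effect of imbalanced allocation on confidence interval width can be illustrated for the difference in means. The sketch assumes a common, known standard deviation in both arms and considers only the within-trial standard error of MD; the function name is illustrative.

```python
import math


def se_mean_difference(sd, n_exp, n_ctrl):
    """Standard error of a difference in means with a common SD in both arms."""
    return sd * math.sqrt(1.0 / n_exp + 1.0 / n_ctrl)


balanced = se_mean_difference(1.0, 10, 10)
imbalanced = se_mean_difference(1.0, 14, 6)    # same total of 20 patients
large_balanced = se_mean_difference(1.0, 100, 100)
large_imbalanced = se_mean_difference(1.0, 134, 66)

print(imbalanced / balanced, large_imbalanced / large_balanced)
```

With the total number of patients held constant, the 14:6 split gives a standard error roughly 9% larger than the 10:10 split, and the 134:66 split roughly 6% larger than 100:100, which widens the confidence interval by the same proportion.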
Heterogeneity
For the scenarios with heterogeneity, I^{2} is around 55–60% and greater than 90% for scenarios with n = 10 and n = 100 patients per trial, respectively, close to the expected values. In scenarios where SMD and RoM are biased, I^{2} is lower compared to MD, which is relatively free of bias (for example, scenarios with 10 patients per trial in Tables 3, 4, 5 [SMD] or Table 5 [RoM]). This occurs because bias decreases the weighting of values greatly deviating from no effect (zero for SMD and one for RoM), decreasing I^{2}. In the scenarios exhibiting less bias, I^{2} among all methods is similar (for example, scenarios with 100 patients per trial in Tables 3, 4, 5).
Discussion
This study examines the use of a new effect measure for meta-analysis of continuous outcomes that we call the ratio of means (RoM). In this method, the ratio of the mean value in the experimental group to that of the control group is calculated. The natural logarithm-transformed delta method approximated to first-order terms provides a straightforward equation estimating the variance of the RoM for each study. Using this formulation, we performed simulations to compare the performance of RoM to traditionally used difference of means methods, MD and SMD.
Each method performed well within the simulated parameters with low bias and high coverage, even in scenarios with moderate or high heterogeneity. The methods had similar statistical power to detect significant treatment effects. SMD exhibited some bias towards zero or no effect, especially with smaller studies, as previously described [5, 6], whereas MD was relatively bias-free. RoM gave acceptable results, with bias usually less than 2–3%.
As discussed earlier, SMD and RoM, unlike MD, allow pooling of studies expressed in different units and allow comparisons regarding relative effect sizes across different interventions. However, interpreting the results of a meta-analysis that uses SMD to determine the expected treatment effect in a specific patient population requires knowledge of the pooled standard deviation. This information is frequently unknown to clinicians. In contrast, interpretation of the results of a meta-analysis that uses RoM does not require knowledge of the pooled standard deviation and may permit clinicians to more readily estimate treatment effects for their patients. Moreover, RoM provides a result similar in form to a risk ratio, a binary effect measure preferred by clinicians [16]. Thus, overall RoM may be easier for clinicians to interpret.
One limitation of RoM is that the mean values of the intervention and control groups must have the same sign (both positive or both negative), since the logarithm of a negative ratio is undefined. All simulations assumed positive mean values in both groups. This limitation may be less important for biological variables since these generally have positive values. Another related limitation inherent to ratio methods occurs for a normally distributed control variable with a very broad distribution (i.e. a significant proportion of expected negative values) or for a control variable with only positive values but a distribution heavily skewed towards zero. In both such distributions, a high proportion of the control mean values will be very small. These small values in the denominator of the RoM can result in a high proportion of exceedingly large ratios, which could bias results towards higher values.
In addition to statistical properties, the choice between a difference or a ratio method for a specific situation should be determined by the biological effect of the treatment as either additive or relative for different control group values. Unfortunately, this information is frequently not known in advance. For binary outcomes, empirical comparisons between difference methods (risk difference) and ratio methods (risk ratio and odds ratio) using published meta-analyses have shown that the risk difference exhibits less consistency compared to ratio methods, resulting in increased heterogeneity [17, 18]. This suggests that for binary outcomes, relative differences are more preserved than absolute differences as baseline risk varies. It is unclear whether this is also the case for continuous outcomes, but such an empirical comparison between difference and ratio methods can be performed using our description of RoM.
Conclusion
The results of our meta-analytic simulation studies suggest that the RoM method compares favorably to MD and SMD in terms of bias, coverage, and statistical power. Similar to binary outcome analysis for which both ratio and difference methods are available, this straightforward method provides researchers the option of using a ratio method in addition to difference methods for analyzing continuous outcomes.
Appendix
This appendix briefly reviews the inverse-variance weighted fixed and random effects models and the determination of the point estimate and variance for the continuous outcome measures, MD and SMD. The derivation of the point estimate and variance for RoM is described in the main text.
The inverse-variance weighted fixed and random effects models
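The inverse-variance weighted fixed effects pooled effect estimate is
Θ_{IV(FE)} = (Σ_{i = 1, k} w_{i} × Θ_{i}) / (Σ_{i = 1, k} w_{i})
where Θ_{IV(FE)} is the inverse-variance weighted fixed effects pooled effect estimate for k total studies, Θ_{i} is the effect measure estimate for study i, and weighting w_{i} = 1/variance(Θ_{i}).
The random effects pooled effect estimate incorporates the between-study variance τ^{2} into the weighting of each study:
Θ_{IV(RE)} = (Σ_{i = 1, k} w_{i}* × Θ_{i}) / (Σ_{i = 1, k} w_{i}*)
where w_{i}* = 1/(w_{i}^{-1} + τ^{2}). One estimate of τ^{2} uses the Q statistic:
Q = Σ_{i = 1, k} w_{i} × (Θ_{i} - Θ_{IV(FE)})^{2}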
When there is no between-trial heterogeneity (τ^{2} = 0), the Q statistic has an expected value of k - 1, and the ratio Q/(k - 1) [12] has an expected value of unity. Under these circumstances the random effects model is equivalent to the fixed effects model. In situations with heterogeneity (τ^{2} > 0), Q/(k - 1) is expected to exceed 1, and the proportion of variation in study-level estimates of treatment effect due to between-study heterogeneity can be expressed using the I^{2} measure, reported as a percentage. I^{2} can be expressed in terms of Q and k - 1, where I^{2} = [Q/(k - 1) - 1]/[Q/(k - 1)], which simplifies to [Q - (k - 1)]/Q [12]. I^{2} can also be expressed as τ^{2}/(τ^{2} + s^{2}), where s^{2} is the variance of the effect measure, and s^{2} = (k - 1) × Σ_{i = 1, k} w_{i} / [(Σ_{i = 1, k} w_{i})^{2} - Σ_{i = 1, k} w_{i}^{2}] [12]. When the variance (and thus weighting) of each trial is identical, as is the case with all the simulated scenarios in this study, then the variance of the effect measure, or s^{2}, reduces to the variance of a single trial.
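The agreement between the two expressions for I^{2} can be checked with a short derivation. This is a sketch that assumes τ^{2} is estimated by the usual method-of-moments (DerSimonian and Laird) estimator, which is an assumption here because the expression for τ^{2} is not reproduced in this text; the constant C is introduced only for convenience.

```latex
\begin{align*}
C &= \sum_{i=1}^{k} w_i - \frac{\sum_{i=1}^{k} w_i^2}{\sum_{i=1}^{k} w_i}
   = \frac{\left(\sum_{i=1}^{k} w_i\right)^2 - \sum_{i=1}^{k} w_i^2}{\sum_{i=1}^{k} w_i},
  \qquad s^2 = \frac{k-1}{C}, \\
\hat{\tau}^2 &= \frac{Q-(k-1)}{C}
  \quad \text{(assumed moment estimator, for } Q > k-1\text{)}, \\
\frac{\hat{\tau}^2}{\hat{\tau}^2 + s^2}
  &= \frac{Q-(k-1)}{Q-(k-1)+(k-1)} = \frac{Q-(k-1)}{Q} = I^2, \\
w_i = w \ \text{for all } i:\quad
C &= kw - \frac{k w^2}{k w} = (k-1)\,w \;\Longrightarrow\; s^2 = \frac{1}{w}.
\end{align*}
```

The last line recovers the statement that, with identical weights, s^{2} equals the variance of a single trial (1/w).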
Thus, carrying out a random effects meta-analysis requires calculating the effect measure and its variance for each study to be combined. First the fixed effects pooled effect measure is calculated, which is then used to estimate Q and τ^{2}, and finally τ^{2} is used to estimate the random effects pooled effect measure and its variance.
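The sequence of steps just described can be sketched in a few lines of code. This is a minimal illustration, not the authors' simulation code: it assumes the method-of-moments (DerSimonian and Laird [13]) estimate of τ^{2}, truncated at zero, and takes the variance of each pooled estimate as the reciprocal of the sum of the weights. The function name, the returned keys, and the example inputs are hypothetical.

```python
def random_effects_pool(effects, variances):
    """Pool study estimates: fixed effects first, then Q, tau^2, and random effects."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with one variance per effect")

    # Step 1: inverse-variance fixed effects estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    theta_fe = sum(wi * ti for wi, ti in zip(w, effects)) / sum_w

    # Step 2: Q statistic and (assumed) moment estimate of tau^2, truncated at zero.
    q = sum(wi * (ti - theta_fe) ** 2 for wi, ti in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Step 3: random effects weights w* = 1 / (1/w + tau^2) and pooled estimate.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    theta_re = sum(wi * ti for wi, ti in zip(w_star, effects)) / sum_w_star

    # I^2 = (Q - (k - 1)) / Q, truncated at zero.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    return {
        "theta_fe": theta_fe, "var_fe": 1.0 / sum_w,
        "q": q, "tau2": tau2, "i2": i2,
        "theta_re": theta_re, "var_re": 1.0 / sum_w_star,
    }


# Hypothetical example: four trials with identical variances, as in the simulations.
result = random_effects_pool([0.1, 0.3, 0.5, 0.7], [0.04, 0.04, 0.04, 0.04])
print(result)
```

In this example the weights are identical, so the fixed and random effects point estimates coincide (about 0.4) while the random effects variance is larger; Q is about 5, giving τ^{2} of roughly 0.027 and I^{2} of about 40%.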
The Mean Difference Effect Measure
Using the measured values, the mean difference effect measure for each study (MD_{i}) is estimated as:
MD_{i} = mean_{exp} - mean_{contr}
with estimated variance,
Var(MD_{i}) = Var(mean_{exp}) + Var(mean_{contr}) = (sd_{exp}/√n_{exp})^{2} + (sd_{contr}/√n_{contr})^{2}
where the subscripts "exp" and "contr" refer to the experimental and control groups, respectively, mean to the mean value, sd to the standard deviation, and n to the number of patients in each group. The individual effect measures and their variances are combined as described previously. All studies need to be reported in identical units for mean difference to be used as the effect measure.
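The mean difference and its variance can be computed directly from the reported group summaries. The sketch below uses a hypothetical function name and hypothetical example values; the second part illustrates why, for a fixed total number of patients and equal standard deviations, imbalanced allocation yields a larger variance (and hence a wider confidence interval) than balanced allocation.

```python
def mean_difference(mean_exp, sd_exp, n_exp, mean_contr, sd_contr, n_contr):
    """Return MD_i and Var(MD_i) for one study from group summary statistics."""
    md = mean_exp - mean_contr
    # Variance of each group mean is sd^2 / n; independent groups, so variances add.
    var_md = sd_exp ** 2 / n_exp + sd_contr ** 2 / n_contr
    return md, var_md


# Hypothetical study: experimental 120 (SD 30), control 100 (SD 25), 25 patients per group.
md, var_md = mean_difference(120.0, 30.0, 25, 100.0, 25.0, 25)
print(md, var_md, 1.0 / var_md)  # estimate, variance, inverse-variance weight

# Same total N = 50 and equal SDs, balanced versus imbalanced allocation.
_, var_balanced = mean_difference(110.0, 25.0, 25, 100.0, 25.0, 25)
_, var_imbalanced = mean_difference(110.0, 25.0, 10, 100.0, 25.0, 40)
print(var_balanced, var_imbalanced)
```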
The Standardized Mean Difference Effect Measure
The individual effect measures and their variances are combined as described previously. As SMD_{i} assumes more extreme positive or negative values deviating from zero, Var(SMD_{i}) increases, resulting in a smaller weighting for such trials. This means that in general SMD is biased towards zero or no effect [5, 6]. This bias towards zero is independent of the number of studies in the meta-analysis, and decreases for larger N. Using a random effects model instead of a fixed effects model can reduce this bias because the between-study variance, estimated by τ^{2}, tends to equalize the study weights. However, this advantage is offset by a lower Q (used to estimate τ^{2}), which depends on the inverse of the variance for each study and therefore is also biased towards lower values. Alternate weighting methods have been proposed to address this bias [5, 6].
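The dependence of the weighting on the size of SMD_{i} can be illustrated numerically. The sketch below is built on stated assumptions rather than on the expressions used in this study, which are not reproduced in this text: it takes SMD_{i} as the mean difference divided by the pooled standard deviation, with no small-sample correction, and uses the common large-sample approximation Var(SMD_{i}) ≈ N/(n_exp × n_contr) + SMD_{i}^{2}/(2N). Function names and example values are hypothetical.

```python
import math


def pooled_sd(sd_exp, n_exp, sd_contr, n_contr):
    """Pooled standard deviation of the experimental and control groups."""
    numerator = (n_exp - 1) * sd_exp ** 2 + (n_contr - 1) * sd_contr ** 2
    return math.sqrt(numerator / (n_exp + n_contr - 2))


def smd_and_variance(mean_exp, sd_exp, n_exp, mean_contr, sd_contr, n_contr):
    """Return SMD_i and an approximate variance (assumed large-sample formula)."""
    smd = (mean_exp - mean_contr) / pooled_sd(sd_exp, n_exp, sd_contr, n_contr)
    n_total = n_exp + n_contr
    # The second term grows with SMD^2, so more extreme estimates get larger variances.
    var_smd = n_total / (n_exp * n_contr) + smd ** 2 / (2 * n_total)
    return smd, var_smd


# Control mean 100, SD 10 in both groups, 10 patients per group;
# shifts of 2, 5 and 8 correspond to SMDs of 0.2, 0.5 and 0.8.
weights = []
for shift in (2.0, 5.0, 8.0):
    smd, var_smd = smd_and_variance(100.0 + shift, 10.0, 10, 100.0, 10.0, 10)
    weights.append(1.0 / var_smd)
    print(f"SMD = {smd:.1f}, variance = {var_smd:.5f}, weight = {1.0 / var_smd:.3f}")
```

Under these assumptions the inverse-variance weight falls as the estimated SMD moves away from zero, which is the mechanism behind the bias towards no effect described above.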
Abbreviations
CI: confidence interval
FE: fixed effects
IV: inverse variance
i: counter ranging from 1 to the number of trials in each meta-analysis (k)
I^{2}: I^{2} heterogeneity measure
k: number of trials in each meta-analysis
MD: mean difference
mean_{contr}: mean value in the control group
mean_{exp}: mean value in the experimental group
n: number of experimental or number of control patients per trial
n_{contr}: number of control patients per trial
n_{exp}: number of experimental patients per trial
N: total number of patients per trial (N = n_{contr} + n_{exp})
Q: Cochran's Q statistic for heterogeneity
RE: random effects
s: standard deviation
s^{2}: sampling variance of the effect measure
SMD: standardized mean difference
SD: standard deviation
sd_{contr}: standard deviation in the control group
sd_{exp}: standard deviation in the experimental group
sd_{pool}: pooled standard deviation of the control and experimental groups
τ^{2}: variance due to heterogeneity
Var: variance
w_{i}: weighting of study i
w_{i}*: weighting of study i incorporating the variance due to heterogeneity
Θ_{i}: MD, SMD, or RoM effect measure estimate for study i
Declarations
Acknowledgements
The study received no specific funding. JF is supported by a Clinician Scientist Award from the Canadian Institutes of Health Research (CIHR), and JB by CIHR Grant No. 84392. CIHR had no involvement in the conduct of this study.
References
1. Egger M, Davey Smith G, Altman DG, editors: Systematic Reviews in Health Care: Meta-Analysis in Context. 2001, London: BMJ Books.
2. Deeks JJ, Altman DG: Effect measures for meta-analysis of trials with binary outcomes. Systematic Reviews in Health Care: Meta-Analysis in Context. Edited by: Egger M, Davey Smith G, Altman DG. 2001, London: BMJ Books, 313-335.
3. Deeks JJ, Altman DG, Bradburn MJ: Statistical methods for examining heterogeneity and combining results from several studies in meta-analysis. Systematic Reviews in Health Care: Meta-Analysis in Context. Edited by: Egger M, Davey Smith G, Altman DG. 2001, London: BMJ Books, 285-312.
4. Cohen J: Statistical Power Analysis for the Behavioral Sciences. 1988, Hillsdale, New Jersey: Lawrence Erlbaum Associates, 247. Second edition.
5. Hedges LV, Olkin I: Statistical Methods for Meta-Analysis. 1985, Orlando, Florida: Academic Press.
6. van den Noortgate W, Onghena P: Estimating the mean effect size in meta-analysis: bias, precision, and mean squared error of different weighting methods. Behavior Research Methods, Instruments, & Computers. 2003, 35: 504-511.
7. Friedrich JO, Adhikari N, Herridge MS, Beyene J: Meta-analysis: low-dose dopamine increases urine output but does not prevent renal dysfunction or death. Ann Intern Med. 2005, 142 (7): 510-524.
8. Adhikari NKJ, Burns KEA, Friedrich JO, Granton JT, Cook DJ, Meade MO: Nitric oxide improves oxygenation but not mortality in acute lung injury: meta-analysis. BMJ. 2007, 334: 779.
9. Sud S, Sud M, Friedrich JO, Adhikari NKJ: Effect of mechanical ventilation in the prone position on clinical outcomes in patients with acute hypoxemic respiratory failure: a systematic review and meta-analysis. CMAJ. 2008, 178: 1153-1161.
10. Armitage P, Colton T, editors: Encyclopedia of Biostatistics. 1998, Chichester, United Kingdom: John Wiley & Sons, 3731-3737.
11. Higgins JPT, Thompson SG, Deeks JJ, Altman DG: Measuring inconsistency in meta-analysis. BMJ. 2003, 327: 557-560.
12. Higgins JPT, Thompson SG: Quantifying heterogeneity in a meta-analysis. Statistics in Medicine. 2002, 21: 1539-1558.
13. DerSimonian R, Laird N: Meta-analysis in clinical trials. Controlled Clinical Trials. 1986, 7: 177-188.
14. Fleiss JL: The statistical basis of meta-analysis. Statistical Methods in Medical Research. 1993, 2: 121-145.
15. Ghahramani S: Fundamentals of Probability. 2000, Upper Saddle River, United States: Prentice-Hall, 41-62.
16. Schwartz LM, Woloshin S, Welch HG: Misunderstandings about the effects of race and sex on physician's referrals for cardiac catheterization. NEJM. 1999, 341: 279-283.
17. Engels EA, Schmid CH, Terrin N, Olkin I, Lau J: Heterogeneity and statistical significance in meta-analysis: an empirical study of 125 meta-analyses. Statistics in Medicine. 2000, 19: 1707-1728.
18. Deeks JJ: Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Statistics in Medicine. 2002, 21: 1575-1600.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/14712288/8/32/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.