
Table 1 Summary of methods identified for replacing missing variance/SD/SE

From: Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review

Method Category Description Statistics required Assumptions Software implementation
Abrams et al. (2005) 3 Bayesian meta-analysis estimates within-patient correlation between baseline and follow-up; enables imputation of mean change from baseline and its SD when only baseline and follow-up means and SDs reported • baseline mean/SD
• follow-up mean/SD
• change from baseline mean/SD in some included studies
SD at baseline same as SD at follow-up; within-subject correlation comes from same distribution for all studies and treatment arms; careful choice of prior distribution for variance parameters Example WinBUGS [24] code provided in paper
Hozo et al. (2005) 1 Missing variance estimated for example as: \( Var\approx \frac{n+1}{48n{\left(n-1\right)}^2}\left(\left({n}^2+3\right){\left(a-2m+b\right)}^2+4{n}^2{\left(b-a\right)}^2\right) \) • minimum
• median
• maximum
• sample size
Data normally distributed Excel spreadsheet provided by Wan et al. (2014)
Sung et al. (2006) 3 Imputation of missing variances within Bayesian meta-analysis, assuming each sample variance is distributed as true variance × \( {\chi}_{n-1}^2/\left(n-1\right) \), where the true variance follows a log-normal distribution
• variances reported for other included studies Assume missing variances come from same lognormal distribution as reported variances Implemented in WinBUGS; code supplied in online supplement to article
Walter and Yao (2007) 1 Improved version of “range” method which calculates SD = (b-a)/4 • sample size
• range or min/max
Approximate normality Lookup table in paper could readily be implemented in standard software; RevMan [8] could accommodate in update
Ma et al. (2008) 2 Impute weighted average of variances observed in other studies; or calculate a range of pooled estimates for efficacy based on the smallest and largest variances observed • sample size
• variances of other studies in meta-analysis
Unobserved and observed variances come from the same underlying distribution Could readily be implemented in any statistical software
Nixon et al. (2009) 3 Impute missing change from baseline SD values in Bayesian random effects meta-regression • baseline SD
• follow-up SD
Log transform of baseline SD, follow-up SD and change from baseline SD follow trivariate normal distribution. Where follow-up SD is based on complete cases, imputation assumes non-informative drop-out Applied in WinBUGS
Dakin et al. (2010) 3 Bayesian hierarchical modelling estimating SD values in context of network meta-analysis. SD assumed to follow gamma distribution; parameters estimated from studies reporting SDs • observed SDs Observed and missing SD values come from the same gamma distribution WinBUGS code provided in publication
MacNeil et al. (2010) 3 Impute missing SDs in hierarchical Bayesian meta-analysis based on posterior predictive distribution • observed SDs Observed, missing SDs arise from same gamma distribution Implemented in PyMC Markov chain Monte Carlo (MCMC) toolkit [31] of Python [32]; code given in online supplement
Stevens (2011), Stevens et al. (2012) 3 Bayesian network meta-analysis that enables imputation of missing SDs via posterior predictive distribution (variances assumed to follow gamma distribution) • observed variances Variances follow gamma distribution; log(SD) given weak uniform prior distribution WinBUGS code provided
Boucher (2012) 3 Emax model of SDs; implemented using either maximum likelihood or hierarchical Bayesian model • observed SDs over time in longitudinal study Longitudinal modelling of SDs using Emax mixed effects model; differences by treatment group permitted in SDs; weak uniform prior for SD used in Bayesian approach SAS (SAS Institute Inc., Cary, NC) PROC NLMIXED and WinBUGS code provided for maximum likelihood and Bayesian approaches respectively
Wan et al. (2014)* 1 \( SD\approx \frac{q3-q1}{2{\Phi}^{-1}\left(\frac{0.75n-0.125}{n+0.25}\right)} \) • lower quartile
• upper quartile
• sample size
Data normally distributed Excel spreadsheet provided by Wan et al. (2014)
Bland (2015) 1 Missing variance estimated as: \( Var\approx \frac{1}{n-1}\left(\frac{2\left(n+3\right)\left({q1}^2+{m}^2+{q3}^2\right)+2\left(n-5\right)\left(a\cdot q1+m\cdot q1+m\cdot q3+q3\cdot b\right)+\left(n+11\right)\left({a}^2+{b}^2\right)}{16}-n{\overline{x}}^2\right) \) • minimum
• lower quartile
• median
• upper quartile
• maximum
• sample mean
Data normally distributed Excel spreadsheet provided by Wan et al. (2014)
Kwon and Reis (2015) 1 Approximate Bayesian computation to estimate SD • available summary statistics Underlying distribution of data must be specified R code provided
Chowdhry et al. (2016) 2 Meta-regression assuming sample variances follow gamma distribution • observed variances from other studies in meta-analysis
• study covariates
Variances missing at random (MAR) and follow a gamma distribution Can be fitted in SAS PROC NLMIXED
a = minimum value, q1 = lower quartile, m = median, q3 = upper quartile, b = maximum, n = sample size, \( \overline{x} \) = sample mean
*Wan et al. (2014) also provide formulae for scenarios where only a, b and n are available, or where a, b, q1, q3 and n are available; see Results section for details
Key to category numbers:
1 = Methods to derive the variance/SD/SE algebraically from parametric test statistics, p-values, etc.
2 = Summary statistic level imputation of variance/SD/SE, for example substituting SD data from other studies, using coefficient of variation, non-parametric summaries, or correlation data
3 = Meta-analysis level strategies, for example multiple imputation or bootstrapping
4 = Methods to meta-analyse effects on continuous outcomes without using individual study variance/SD/SE
5 = Methods to impute effect size, from which variance/SD/SE could be derived
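The closed-form approximations listed in Table 1 can be sketched in Python as shown here. This is a minimal illustration, not the code or spreadsheet supplied with any of the cited papers, and all function names are hypothetical. It rests on four assumptions. First, in the Hozo et al. expression the bracketed sum is read as multiplying the leading fraction, which keeps the result in squared data units. Second, the sample mean in the Bland expression is supplied by the caller. Third, the weighted average for Ma et al. uses weights of n − 1, a choice the table does not specify. Fourth, `range_rule_sd` is the basic (b − a)/4 rule and not the Walter and Yao lookup table.

```python
from statistics import NormalDist


def _check_n(n):
    # The formulas divide by (n - 1) or by a quantile that is zero at n = 1.
    if n < 2:
        raise ValueError("sample size n must be at least 2")


def hozo_variance(a, m, b, n):
    """Variance from minimum a, median m, maximum b and sample size n.

    Assumption: the bracketed sum multiplies the leading fraction.
    """
    _check_n(n)
    scale = (n + 1) / (48 * n * (n - 1) ** 2)
    spread = (n ** 2 + 3) * (a - 2 * m + b) ** 2 + 4 * n ** 2 * (b - a) ** 2
    return scale * spread


def range_rule_sd(a, b):
    """Basic range rule, SD = (b - a) / 4 (not the Walter and Yao table)."""
    return (b - a) / 4


def wan_sd_from_quartiles(q1, q3, n):
    """SD from lower quartile q1, upper quartile q3 and sample size n."""
    _check_n(n)
    # Phi^{-1} is the standard normal quantile function.
    z = NormalDist().inv_cdf((0.75 * n - 0.125) / (n + 0.25))
    return (q3 - q1) / (2 * z)


def bland_variance(a, q1, m, q3, b, n, xbar):
    """Variance from the five-number summary, sample size n and mean xbar.

    Assumption: xbar is passed in by the caller.
    """
    _check_n(n)
    total = (
        2 * (n + 3) * (q1 ** 2 + m ** 2 + q3 ** 2)
        + 2 * (n - 5) * (a * q1 + m * q1 + m * q3 + q3 * b)
        + (n + 11) * (a ** 2 + b ** 2)
    )
    return (total / 16 - n * xbar ** 2) / (n - 1)


def weighted_average_variance(variances, sample_sizes):
    """Weighted average of variances observed in other studies.

    Assumption: weights are the degrees of freedom, n_i - 1.
    """
    weights = [n - 1 for n in sample_sizes]
    return sum(w * v for w, v in zip(weights, variances)) / sum(weights)
```

For example, `wan_sd_from_quartiles(q1=4.0, q3=9.0, n=50)` returns an SD estimate for a study that reports only quartiles, and `weighted_average_variance` could supply a value for a study that reports no measure of spread at all. All of the quantile-based functions inherit the normality assumption stated in the table.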