Table 1 Summary of methods identified for replacing missing variance/SD/SE

From: Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review

Method

Category

Description

Statistics required

Assumptions

Software implementation

Abrams et al. (2005)

3

Bayesian meta-analysis estimates within-patient correlation between baseline and follow-up; enables imputation of mean change from baseline and its SD when only baseline and follow-up means and SDs reported

• baseline mean/SD

• follow-up mean/SD

• change from baseline mean/SD in some included studies

SD at baseline same as SD at follow-up; within-subject correlation comes from same distribution for all studies and treatment arms; careful choice of prior distribution for variance parameters

Example WinBUGS [24] code provided in paper

Hozo et al. (2005)

1

Missing variance estimated, for example, as:

\( Var\approx \frac{n+1}{48n{\left(n-1\right)}^2}\left(\left({n}^2+3\right){\left(a-2m+b\right)}^2+4{n}^2{\left(b-a\right)}^2\right) \)

• minimum

• median

• maximum

• sample size

Data normally distributed

Excel spreadsheet provided by Wan et al. (2014)

Sung et al. (2006)

3

Imputation of missing variances within Bayesian meta-analysis, assuming sample variances are distributed as true variance × \( {\chi}_{n-1}^2/\left(n-1\right) \), where the true variance follows a log-normal distribution

• variances reported for other included studies

Assume missing variances come from same lognormal distribution as reported variances

Implemented in WinBUGS; code supplied in online supplement to article

Walter and Yao (2007)

1

Improved version of the “range” method, which calculates SD = (b-a)/4

• sample size

• range or min/max

Approximate normality

Lookup table in paper could readily be implemented in standard software; RevMan [8] could accommodate it in a future update

Ma et al. (2008)

2

Impute weighted average of variances observed in other studies; or calculate a range of pooled estimates for efficacy based on the smallest and largest variances observed

• sample size

• variances of other studies in meta-analysis

Unobserved and observed variances come from the same underlying distribution

Could readily be implemented in any statistical software

Nixon et al. (2009)

3

Impute missing change from baseline SD values in Bayesian random effects meta-regression

• baseline SD

• follow-up SD

Log transform of baseline SD, follow-up SD and change from baseline SD follow trivariate normal distribution. Where follow-up SD is based on complete cases, imputation assumes non-informative drop-out

Applied in WinBUGS

Dakin et al. (2010)

3

Bayesian hierarchical modelling estimating SD values in context of network meta-analysis. SD assumed to follow gamma distribution; parameters estimated from studies reporting SDs

• observed SDs

Observed and missing SD values come from the same gamma distribution

WinBUGS code provided in publication

MacNeil et al. (2010)

3

Impute missing SDs in hierarchical Bayesian meta-analysis based on posterior predictive distribution

• observed SDs

Observed and missing SDs arise from the same gamma distribution

Implemented in the PyMC Markov chain Monte Carlo (MCMC) toolkit [31] for Python [32]; code given in online supplement

Stevens (2011), Stevens et al. (2012)

3

Bayesian network meta-analysis that enables imputation of missing SDs via posterior predictive distribution (variances assumed to follow gamma distribution)

• observed variances

Variances follow gamma distribution; log(SD) given weak uniform prior distribution

WinBUGS code provided

Boucher (2012)

3

Emax model of SDs; implemented using either maximum likelihood or hierarchical Bayesian model

• observed SDs over time in longitudinal study

Longitudinal modelling of SDs using Emax mixed effects model; differences by treatment group permitted in SDs; weak uniform prior for SD used in Bayesian approach

SAS (SAS Institute Inc., Cary, NC) PROC NLMIXED and WinBUGS code provided for maximum likelihood and Bayesian approaches respectively

Wan et al. (2014)*

1

\( SD\approx \frac{q3-q1}{2{\Phi}^{-1}\left(\frac{0.75n-0.125}{n+0.25}\right)} \)

• lower quartile

• upper quartile

• sample size

Data normally distributed

Excel spreadsheet provided by Wan et al. (2014)

Bland (2015)

1

Missing variance estimated as:

\( Var\approx \frac{1}{n-1}\left(\frac{2\left(n+3\right)\left({q}_1^2+{m}^2+{q}_3^2\right)+2\left(n-5\right)\left(a{q}_1+m{q}_1+m{q}_3+{q}_3b\right)+\left(n+11\right)\left({a}^2+{b}^2\right)}{16}-n{\overline{x}}^2\right) \)

• minimum

• lower quartile

• median

• upper quartile

• maximum

• sample mean

Data normally distributed

Excel spreadsheet provided by Wan et al. (2014)

Kwon and Reis (2015)

1

Approximate Bayesian computation to estimate SD

• available summary statistics

Underlying distribution of the data must be specified

R code provided

Chowdhry et al. (2016)

2

Meta-regression assuming sample variances follow gamma distribution

• observed variances from other studies in meta-analysis

• study covariates

Variances missing at random (MAR) and follow a gamma distribution

Can be fitted in SAS PROC NLMIXED

a: minimum value; q1: lower quartile; m: median; q3: upper quartile; b: maximum; n: sample size; \( \overline{x} \): sample mean

*Also provide formulae for scenarios where only a, b and n are available; or where a, b, q1, q3, and n are available; see Results section for details

Key to category numbers:

  1. Methods to derive the variance/SD/SE algebraically from parametric test statistics, p-values, etc.
  2. Summary statistic level imputation of variance/SD/SE, for example substituting SD data from other studies, using coefficient of variation, non-parametric summaries, or correlation data
  3. Meta-analysis level strategies, for example multiple imputation or bootstrapping
  4. Methods to meta-analyse effects on continuous outcomes without using individual study variance/SD/SE
  5. Methods to impute effect size, from which variance/SD/SE could be derived
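The closed-form approximations listed in the table for Hozo et al. (2005), Wan et al. (2014) and Bland (2015), together with the simple range rule SD = (b-a)/4, can be sketched in Python using the symbols from the table legend. This is only a minimal illustration under stated assumptions: it is not the Excel spreadsheet or any code from the cited papers, the function names are hypothetical, and the Bland function takes the sample mean as an argument because the table lists it among the required statistics. In the Hozo formula the leading fraction is read as a multiplier of the bracketed term.

```python
from statistics import NormalDist


def range_rule_sd(a, b):
    """Simple "range" method: SD = (b - a) / 4."""
    return (b - a) / 4


def hozo_variance(a, m, b, n):
    """Variance from minimum a, median m, maximum b and sample size n
    (formula listed for Hozo et al. 2005)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # The prefactor multiplies the whole bracketed term.
    scale = (n + 1) / (48 * n * (n - 1) ** 2)
    bracket = (n**2 + 3) * (a - 2 * m + b) ** 2 + 4 * n**2 * (b - a) ** 2
    return scale * bracket


def wan_sd_from_quartiles(q1, q3, n):
    """SD from lower quartile q1, upper quartile q3 and sample size n
    (formula listed for Wan et al. 2014)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Sample-size-adjusted probability passed to the inverse normal CDF.
    p = (0.75 * n - 0.125) / (n + 0.25)
    return (q3 - q1) / (2 * NormalDist().inv_cdf(p))


def bland_variance(a, q1, m, q3, b, n, xbar):
    """Variance from the five-number summary, sample size n and
    sample mean xbar (formula listed for Bland 2015)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Approximate sum of squared observations.
    sum_sq = (
        2 * (n + 3) * (q1**2 + m**2 + q3**2)
        + 2 * (n - 5) * (a * q1 + m * q1 + m * q3 + q3 * b)
        + (n + 11) * (a**2 + b**2)
    ) / 16
    return (sum_sq - n * xbar**2) / (n - 1)


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    print(range_rule_sd(2.0, 14.0))
    print(hozo_variance(2.0, 7.0, 14.0, 40))
    print(wan_sd_from_quartiles(5.0, 9.5, 40))
    print(bland_variance(2.0, 5.0, 7.0, 9.5, 14.0, 40, 7.4))
```

All four functions assume approximately normal data, as stated in the Assumptions column; with strongly skewed data the estimates may be poor, and the Bland formula can even return a negative value if the supplied mean is inconsistent with the five-number summary.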