
Table 4 Reporting of MI procedure in articles using multiple imputation

From: The rise of multiple imputation: a review of the reporting and implementation of the method in medical research

  Type of studies  
Characteristics reported Trials Observational studies All studies
  N (%)* N (%)* N (%)*
  (n = 73) (n = 30) (n = 103)
Imputation details    
Any imputation details provideda 60 (82) 27 (90) 87 (85)
Imputation method stated 29 (40) 9 (30) 38 (37)
    MI using chained equations (MICE) 14 6 20
    MI using multivariate normal model (MVNI)b 7 1 8
    MI using predictive mean matching (PMM) 1 0 1
    MI using regression-based imputationc 4 1 5
    MI using MICE & PMMd 1 1 2
    MI using propensity score 1 0 1
    MI using propensity score or regression modellinge 1 0 1
General procedure/command specified 5 (7) 2 (7) 7 (7)
    Proc MI 4 1 5
    MI command 0 1 1
    Model-based MIf 1 0 1
Imputation method inferred 11 (15) 10 (33) 21 (20)
    MICE (SAS- IVEware) 1 2 3
    MICE (Stata- pre V11) 1 2 3
    MICE (Multiple packageg) 1 0 1
    MVNI (SAS- pre V9.3-imputed more than 1 variable) 5 1 6
    MVNI (R-Amelia II) 0 2 2
    MVNI (S-plus) 2 0 2
    Regression-based imputation (SAS pre V9.3-imputed 1 categorical variable) 1 3 4
Non-normal variables transformed prior to imputation 6 (8) 6 (20) 12 (12)
    Log transformationh 4 4 8
    Logit transformation 0 1 1
    General comment about applying normalising transformation 2 1 3
Provided details on the variables included in the imputation model 26 (36) 13 (43) 39 (38)
    Included auxiliary variable(s) 6 4 10
    Included interaction term(s) 2 2 4
    Included auxiliary variable and interaction 3 2 5
    No information provided on auxiliary variables and interaction terms 15 5 20
Number of imputations 28 (38) 19 (63) 47 (46)
    ≤5 8 3 11
    10 6 3 9
    11-50 8 6 14
    100 4 6 10
    >100 2 1 3
Carried out diagnostic checks of the imputation modeli 0 (0) 2 (7) 2 (2)
Assessed differences between results obtained from CC/LOCF and MI in the text/tablej 45 (62) 17 (57) 62 (60)
Software details    
Imputation software statedk,l 51 (70) 25 (83) 76 (74)
    SAS 23 10 33
    Stata 18 9 27
    R 6 6 12
    Other packages (SOLAS, S-plus, SPSS) 4 0 4
Analysis status of MI    
MI used in the primary analysis 26 (36) 12 (40) 38 (37)
MI used as a secondary analysis 47 (64) 19 (63) 66l (64)
    Methods used for primary analysis if MI applied as a secondary analysis    
      Complete case analysis (CC)m,n 43 19 62
      Last observation carried forward (LOCF) 4 0 4
Sensitivity analysis following MI 3 (4) 0 (0) 3 (3)
    Pattern-mixture model approach 1 0 1
    Selection model approach 0 0 0
    Performed but the method not statedo 2 0 2
  1. *Unless otherwise stated.
  2. Abbreviations: MI- multiple imputation, MICE- multiple imputation by chained equations, MVNI- multivariate normal imputation, PMM- predictive mean matching, MCMC- Markov chain Monte Carlo, CC- complete case, LOCF- last observation carried forward.
  3. aAny information provided by the authors with regard to the imputation process. Note: a general procedure/command stated by the authors, and the imputation methods that were inferred by the reviewers are not included in this category.
  4. bIn five articles [35,61,68,90] MI via MCMC algorithm was used for imputing missing data.
  5. cIn three articles [40,47,84] logistic regression, and in two articles [39,113] linear regression, was stated as the imputation method for handling missing data.
  6. dTwo articles [61,93] imputed one or two variables with missing data under PMM (because of non-normality), and imputed other incomplete variables under MICE.
  7. eOne article [91] stated that MI was used on the basis of either propensity scoring or regression modelling for imputation of missing data in the primary and secondary outcome measures.
  8. fOne article [51] stated that model-based MI was used to account for missing data in the clinical outcome.
  9. gIn one article [77] multiple packages were used for the analyses, i.e. SPSS version 15.0 and Stata version 10.1. The default imputation method in either of these packages (given the specified versions) was chained equations.
  10. hOne article [93] used both the square root and log transformations for non-normally distributed variables.
  11. iBoth articles [82,130] compared the observed and imputed data.
  12. jThe MI estimates were not provided in 6 articles [34,37,81,85,87,120]; instead, the text commented on how the results compared between the different approaches for dealing with the missing data (e.g. the analysis of complete cases and the imputed data provided the same results).
  13. kFor eight articles [59,77,81,88,94,96,115,127] it was not possible to extract this information because multiple packages for the statistical analyses were mentioned with no explicit statement regarding which package was used for imputation.
  14. lThose articles that did not provide the name of the imputation software (R, Stata, SAS, etc.), but instead gave the name of the procedure/application used for imputing missing data (e.g. Amelia II, IVEware) were also included here.
  15. mOne article [99] used both MI and CC for the primary analysis, with MI imputing only the missing confounder values (no imputation of missing data in the exposure or outcome). It then used MI again as a sensitivity analysis, imputing missing data in all confounders and the outcome (but not the exposure), alongside a CC analysis.
  16. nTwo articles [40,100] used LOCF for the secondary analysis as well as MI; one of them described the MI as a sensitivity analysis.
  17. oA general statement was made about performing a sensitivity analysis, but neither the results nor the details were provided.
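
The percentages in the table appear to be each count divided by the number of articles in that column (73 trials, 30 observational studies, 103 in total), rounded to the nearest whole number. The following is a minimal sketch under that assumption; the names `pct`, `format_row`, `N` and `rows` are illustrative and do not come from the article, and the counts are copied from three rows of the table.

```python
# Column denominators taken from the table header.
N = {"trials": 73, "observational": 30, "all": 103}

# Counts copied from three table rows, in column order
# (trials, observational studies, all studies).
rows = {
    "Imputation method stated": (29, 9, 38),
    "Number of imputations": (28, 19, 47),
    "Imputation software stated": (51, 25, 76),
}

def pct(count, n):
    """Percentage of n, rounded to the nearest whole number (assumed rule)."""
    return round(100 * count / n)

def format_row(counts):
    """Render counts as 'N (%)' cells in the table's column order."""
    return " ".join(f"{c} ({pct(c, n)})" for c, n in zip(counts, N.values()))

for label, counts in rows.items():
    print(label, format_row(counts))
```

For these three rows the recomputed cells match the published ones. The same check can be run on any other row that reports percentages, by adding its counts to `rows`.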