Table 1 Summary of imputation approaches for handling missing data in longitudinal studies available in standard software

From: A comparison of multiple imputation methods for missing data in longitudinal studies

MI approaches

Method

Details

Software

Joint modelling (JM)

(Assumes a joint multivariate distribution for all the variables in the imputation model)

JM-MVN

• Repeated measurements of time-dependent variables are imputed as distinct variables.

• Assumes a joint multivariate normal distribution for all incomplete variables.

• Binary variables are imputed as continuous variables.

• Categorical variables can be imputed as a continuous variable or as a series of dummy variables.

SAS [7], SPSS [42], Stata [8], Mplus [43] and R [9]

JM-MLMM

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• All incomplete variables are imputed using a joint multivariate LMM.

• Binary variables are imputed as continuous variables.

• Categorical variables can be imputed as a continuous variable or as a series of dummy variables.

• A constant residual error variance is assumed for all individuals.

Mplus, R package pan [42].

JM-MLMM-LN

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• All incomplete variables are imputed using a joint multivariate LMM.

• Binary and categorical incomplete variables are imputed using latent normal variables.

• Can be fitted assuming either a constant or a subject-specific residual error variance.

Realcom-impute [43], R package jomo [16].

Fully conditional specification (FCS)

(Imputes using a univariate conditional model for each variable with missing data)

FCS-Standard

• Repeated measurements of time-dependent variables are imputed as distinct variables.

• Imputes variables using conditional univariate regression models for each incomplete variable, conditional on the time-dependent variables at all waves.

SAS, SPSS, Stata, Mplus and R

FCS-Twofold

• Repeated measurements of time-dependent variables are imputed as distinct variables.

• Imputes variables using a univariate regression model for each incomplete variable, conditional on a subset of all time-dependent variables in the data based on a window period.

• Imputation is carried out in a two-step iterative process.

Stata

FCS-MTW

• Repeated measurements of time-dependent variables are imputed as distinct variables.

• Imputes variables using univariate regression models for each incomplete variable, conditional on a subset of all time-dependent variables in the data based on a window period.

• Imputation is carried out in a single-step iterative process.

Stata

FCS-LMM

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• Assumes a conditional LMM for each incomplete variable.

• Binary variables are imputed as continuous variables.

• Categorical variables can be imputed as a continuous variable or as a series of dummy variables.

• A constant residual error variance is assumed for all individuals.

R package mice (mice.impute.2l.pan) [44].

FCS-LMM-het

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• Assumes a conditional LMM for each incomplete variable.

• Binary and categorical variables are imputed as continuous variables.

• The model assumes a subject-specific residual error variance.

R package mice (mice.impute.2l.norm) [44].

FCS-GLMM

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• Assumes a conditional GLMM for incomplete binary and categorical variables.

• A constant residual error variance is assumed for all individuals.

R package micemd [33]

FCS-MLMM-LN

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• In each iteration, only one variable is treated as missing; it is imputed using a joint LMM similar to JM-MLMM-LN, using the imputed values of the other incomplete variables. This process is repeated for each incomplete variable in turn.

• Binary and categorical incomplete variables are imputed using a latent normal variable.

• Can be fitted using either a constant or a subject-specific residual error variance.

Mplus, R package micemd

FCS-LMM-LN

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• Assumes a conditional LMM for incomplete variables.

• Binary and categorical incomplete variables are imputed using a latent normal variable.

• Can be fitted using either a constant or a subject-specific residual error variance.

Blimp [45]

FCS-LMM-PMM

• Repeated measurements of time-dependent variables are imputed using hierarchical models.

• Imputes each incomplete value with a draw from a pool of observed values whose predicted means are closest to that of the incomplete case.

R package miceadds [46]

  1. The following abbreviations are used to denote the different MI methods. MVN: multivariate normal imputation; MLMM: multivariate linear mixed-effects model; MLMM-LN: multivariate linear mixed-effects model with latent normal variables; LMM: linear mixed-effects model; PMM: predictive mean matching; GLMM: generalised linear mixed-effects model; MTW: moving time window
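
The matching step described for FCS-LMM-PMM in the table can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not part of the table: a single-level simple linear regression stands in for the hierarchical model, the regression parameters are not redrawn from their posterior (as a proper multiple imputation procedure would do), and the names fit_simple_ols, pmm_impute and the donor pool size k are hypothetical. It is not the miceadds implementation.

```python
import random


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y, k=3, rng=None):
    """Return a copy of y in which each None is replaced by an observed value
    drawn from the k observed cases with the closest predicted means."""
    if rng is None:
        rng = random.Random()
    observed = [i for i, yi in enumerate(y) if yi is not None]
    intercept, slope = fit_simple_ols(
        [x[i] for i in observed], [y[i] for i in observed]
    )
    # Predicted mean for every case, observed or missing.
    predicted = [intercept + slope * xi for xi in x]
    completed = list(y)
    for i, yi in enumerate(y):
        if yi is None:
            # Donor pool: observed cases ranked by distance in predicted mean.
            donors = sorted(
                observed, key=lambda j: abs(predicted[j] - predicted[i])
            )[:k]
            # The imputed value is always a value that was actually observed.
            completed[i] = y[rng.choice(donors)]
    return completed


# Illustrative data (made up for this sketch): y is missing for two cases.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.3, None, 3.9, None, 6.2]
completed = pmm_impute(x, y, k=2, rng=random.Random(0))
```

Because every imputed value is copied from an observed case, imputations stay within the range of the observed data, which is the usual motivation for matching on predicted means rather than imputing the predicted mean itself.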