 Research article
 Open Access
 Open Peer Review
 Published:
Multiple imputation methods for handling missing values in a longitudinal categorical variable with restrictions on transitions over time: a simulation study
BMC Medical Research Methodologyvolume 19, Article number: 14 (2019)
Abstract
Background
Longitudinal categorical variables are sometimes restricted in terms of how individuals transition between categories over time. For example, with a timedependent measure of smoking categorised as neversmoker, exsmoker, and currentsmoker, currentsmokers or exsmokers cannot transition to a neversmoker at a subsequent wave. These longitudinal variables often contain missing values, however, there is little guidance on whether these restrictions need to be accommodated when using multiple imputation methods. Multiply imputing such missing values, ignoring the restrictions, could lead to implausible transitions.
Methods
We designed a simulation study based on the Longitudinal Study of Australian Children, where the target analysis was the association between (incomplete) maternal smoking and childhood obesity. We set varying proportions of data on maternal smoking to missing completely at random or missing at random. We compared the performance of fully conditional specification with multinomial and ordinal logistic imputation, and predictive mean matching, twofold fully conditional specification, indicator based imputation under multivariate normal imputation with projected distancebased rounding, and continuous imputation under multivariate normal imputation with calibration, where each of these multiple imputation methods were applied, accounting for the restrictions using a semideterministic imputation procedure.
Results
Overall, we observed reduced bias when applying multiple imputation methods with restrictions, and fully conditional specification with predictive mean matching performed the best. Applying fully conditional specification and twofold fully conditional specification for imputing nominal variables based on multinomial logistic regression had severe convergence issues. Both imputation methods under multivariate normal imputation produced biased estimates when restrictions were not accommodated, however, we observed substantial reductions in bias when restrictions were applied with continuous imputation under multivariate normal imputation with calibration.
Conclusion
In a similar longitudinal setting we recommend the use of fully conditional specification with predictive mean matching, with restrictions applied during the imputation stage.
Background
The problem of missing data is prominent in longitudinal studies as these studies involve gathering information from respondents at multiple waves over a long period of time [1]. One approach for handling such missing data is multiple imputation (MI), which has become a frequently used method for handling missing data in observational epidemiological studies [2]. MI is a two stage process [3]. In the first stage, the incomplete dataset is replicated multiple times, with the missing values replaced by values drawn from an appropriate imputation model. In the second stage, the analysis of interest is performed on each of the imputed datasets and resulting parameter estimates are combined using Rubin’s rules [3]. Multivariate normal imputation (MVNI), and fully conditional specification (FCS), are widely available MI methods that have been used in longitudinal studies [4, 5] to impute missing values.
MVNI imputes missing values by fitting a joint imputation model for all the variables with missing data, assuming that these variables follow a multivariate normal distribution [6]. FCS uses univariate regression models fitted to each variable with missing data depending on the type of variable with missing data [7, 8]. When handling missing values in longitudinal data, standard implementations of MVNI and FCS can be applied by treating repeated measurements of the same variable at different time points as distinct variables, sometimes referred to as the “Just Another Variable” approach [9]. For example, measurements of quality of life at different time points are treated as separate variables. This needs to be done for all the longitudinal variables. This approach does not explicitly model the longitudinal structure of the data, although it does allow for the correlations between the repeated measurements. The twofold FCS algorithm is a recently proposed version of FCS that takes into consideration the longitudinal structure of the data by imputing missing values in a variable at a certain time point, using information only from the specific time point and immediately adjacent time points [9, 10]. Twofold FCS may help to reduce convergence issues encountered with FCS in longitudinal studies with large numbers of waves and incomplete variables [9].
In many epidemiological studies, variables are collected that involve several restrictions. One example is that of restrictedtransition variables. These are categorical variables where the set of possible future states depends on its current and previous states. For example, with a timedependent measure of smoking categorised as never, ex, and currentsmoker, current or exsmokers cannot transition to a neversmoker at a subsequent wave. Oral contraceptive use measured repeatedly as a neveruser, exuser or currentuser is another example of a timedependent variable which is restricted such that an ex or currentuser cannot transition into a neveruser at a subsequent wave. However, neverusers may start using oral contraceptives at any time.
Guidance on how MI methods should be applied for handling missing data in such variables is limited in the statistical literature. For incomplete smoking data (non, ex and currentsmoker), Welch et al. [9] focused on a simulation scenario where nonsmokers at baseline did not transition into other smoking categories, and used deterministic imputation for the nonsmoking category in this simulation study. Specifically, all respondents observed as nonsmokers at any of the time points, were imputed as nonsmokers for missing time points. Missing values for the remaining respondents were imputed stochastically, as either a currentsmoker or exsmoker [9]. Although this semideterministic approach is appealing, it may not always be appropriate as in realworld situations some nonsmokers may start smoking. Similarly, in the contraceptive use example, neverusers may start using oral contraceptives over time. Another simulation study by Kalaycioglu et al. [5] explored a number of scenarios for handling missing values in longitudinal data, including a categorical treatment variable, which had transition restrictions. However, little information was available on how missing values were handled in this variable.
While the primary goal of MI is to obtain valid inferences, and not to replace the actual missing values per se [11], it is important to assess the impact of implausible imputation values on the parameter estimates of interest [6, 7, 12]. Therefore, the aim of this paper was to evaluate the performance of possible MI approaches (namely MVNI, FCS, and twofold FCS algorithm) for handling missing values in a longitudinal categorical variable with restrictions on transitions over time. We report the findings of a case study from the Longitudinal Study of Australian Children (LSAC), and a simulation study based on the LSAC [13] where approximately 65% of data on maternal smoking were set to missing completely at random (MCAR) or missing at random (MAR). In this study, maternal smoking was a timedependent categorical exposure variable with restrictions, measured repeatedly over six time points.
Methods
Motivating example: Longitudinal study of Australian children (LSAC)
The Longitudinal Study of Australian Children (LSAC) is a prospective study of 10,000 children, involving two cohorts, the infant cohort (B) and the child cohort (K). Data collected at six time points, from 2004 to 2014 [13] was available for this study. LSAC obtained written informed consent from the caregiver on behalf of each of the study children, as the children were minors at the time of data collection and was approved by the Australian Institute of Family Studies Ethics Committee.
Epidemiological analysis of interest
Childhood obesity is a growing epidemic in most developed countries, and a common problem among Australian children [14]. Many severe health diseases are attributable to childhood obesity [15]. Importantly, exposure to maternal smoking has been found to be an important risk factor of childhood obesity [16,17,18,19]. The motivating example for our simulation study was to quantify the relationship between exposure to maternal smoking and body mass index (BMI).
Target analysis model
The analysis of interest was the association between maternal smoking measured at one wave and BMI for age zscores (BMIz) measured at the subsequent wave, estimated using a linear mixedeffects model with a random intercept and adjusted for child’s current age, birthweight, and sex, breastfeeding, maternal age at child birth, maternal education, and family socioeconomic status (see Eq. 1 and Table 1 for description of the variables, and Fig. 1a for the causal diagram).
where i = 1,..., N, and N = 1000 for waves j = 1,…,6; ∈_{ij} is identically and independently distributed as \( {\in}_{ij}\sim N\ \left(0,{\sigma}_{\in}^2\right) \); a = 0 (neversmoker – reference category), 1 (exsmoker) and 2 (currentsmoker); β_{0} is the population parameter for mean BMIz when other covariates are set to zero and b_{0i} is the random intercept for individual i, assumed to be normally distributed with mean zero, and constant variance; β_{1} − β_{8} are the population parameters for the mean change in BMIz associated with the covariates.
Simulation of complete data
The simulation study was based on six waves of the LSAC infant cohort, which had a participation of 5107 children at wave 1 (see Additional file 1: Table S1). Data were generated as specified below based on the casual diagram in Fig. 1a. This process was repeated to generate 1000 complete datasets. A detailed description of the simulation procedure is provided in the Additional file 1.
After simulating the timeindependent variables, the timedependent exposure (m_smoking) and outcome (BMIz) were simulated as follows:

Maternal smoking at wave 0 (i.e. during pregnancy) (m_smoking_{i,0}) was generated from a multinomial logistic regression model:

Maternal smoking at waves j = 1,…,6 (m_smoking_{i,j}) was generated in two stages.

Stage 1: Maternal smoking was generated for respondents who were neversmokers at the previous wave using the multinomial logistic regression model:

Stage 2: Maternal smoking for the remaining respondents (current or exsmoker) was generated using the logistic regression model:

BMI for age zscores (BMIz_{i,j}) were generated for waves j = 2,…,6 using the linear mixedeffects model in Eq. 1 so that the chosen values for β_{1,a} (a = 1, 2) of Eq. 1 are the true values for the parameters of interest.
We considered β_{1,1} = 0.10 and β_{1,2} = 0.15. In general, parameter values used in the simulation process were chosen to mimic the LSAC data (see Additional file 1: Table S2).
Generation of missing data
For each of the 1000 simulated datasets, and at each wave, maternal smoking values were randomly assigned to missing such that for some individuals, measurements in all subsequent waves were also missing (i.e. dropout) while for others future values of maternal smoking could be missing or observed (i.e. intermittent missingness). The proportions of missingness per waves were as in the LSAC (see Additional file: Fig. S1). Missingness was generated under an MCAR mechanism, or either of two MAR mechanisms, representing weak or strong associations between the probability of missingness and predictors of missingness (see Table 2).
Specifically, under each MAR mechanism, it was assumed the probability of missingness in maternal smoking followed a logistic regression model dependent on BMIz and the auxiliary variable maternal depression (Fig. 1b). The dseparation criterion [20] was used to show that missingness is independent of unobserved data conditional on maternal depression at wave j (m_depression_{j}) and BMIz measured at the subsequent wave (BMIz_{j + 1}) that is, the MAR assumption holds given these variables (see Additional file 1). The models used to generate missing values in maternal smoking were:
Model A: missing for all subsequent waves
Model A introduces monotone missingness, such that, if the measurement at wave j is specified as missing using model A, then the individual will have measurements missing for all subsequent waves j + 1, … , 5.
Model B: intermittent missingness between waves j1 and j
(6)
where R_{i, j} is an indicator variable of missingness, and maternal smoking was assigned to missing for respondent i at wave j if R_{i, j} = 1.
Model B was only applied to the respondents who were not specified as missing using model A. The strong MAR scenario was obtained by doubling the log of the odds ratios used in the weak MAR scenario (see Table 2 for parameter values).
For each mechanism (MCAR or MAR), the overall missingness proportion for maternal smoking was set at 45 % and 65%, representing realistic and extreme scenarios respectively [21], resulting in 6 simulation scenarios.
Methods to handle missing data
For comparison with MI methods, we first performed a complete case analysis (CCA), excluding all respondents with missing values for maternal smoking at any of the 5 waves, and an available case analysis (ACA), including available data at each wave in the analyses [22]. These approaches are commonly used due to simplicity [2, 22,23,24]. CCA and ACA are expected to produce biased estimates under the MAR scenarios explored in this study. Both CCA and ACA condition on the missingness indicator R_{j} (see Fig. 1b). This missingness indicator is a collider as it lies in the pathway ‘m_depression_{j}→R_{j}← BMIz_{j + 1}’, opening a backdoor path between the exposure and outcome of interest that is not blocked in the analysis model given that maternal depression is an auxiliary variable not included in the target analysis. Therefore, in principle we expect biased estimates under CCA and ACA [25], although this bias may be small.
We then assessed three MI methods, MVNI, FCS, and twofold FCS, to multiply impute missing values in maternal smoking at waves 1 to 5. Given that the missingness mechanism generated satisfies the MAR assumption given m_depression_{j} and BMIz_{j + 1}, as explained previously, we expect in principle that appropriate MI methods incorporating the target analysis variables as well as the auxiliary maternal depression variable to produce unbiased estimates under the missing data scenarios considered. Specifically, we considered two versions of each of these MI methods; the standard version, and the restrictionadapted version that accounts for restrictions in transitions over time.
Standard version
In the standard implementation of MVNI and FCS, repeated measurements of maternal smoking were included as distinct variables in the imputation model (i.e. one variable for each time point). This ‘singlelevel’ imputation was used to impute missing data at all the time points. The correlation between the repeated measures is captured in this approach [4, 5], However, treating repeated measurements of the same variable as distinct variables fails to account for the temporal ordering of the data which may affect imputation [9].
With MVNI, due to the assumption of multivariate normality, the imputed values for maternal smoking could take noninteger values. Therefore, we used two methods for imputation; maternal smoking imputed as indicators using MVNI, followed by projected distancebased rounding (indicatorPDBR) [26], and maternal smoking imputed as a continuous variable using MVNI, followed by calibration (continuouscalibration) [27, 28], to recategorise imputed values into the original categories (see Additional file 1, Figure S2 and S3).
Within the FCS framework we considered three univariate imputation methods: multinomial logistic regression, ordinal logistic regression (treating the smoking variable as continuous based on the numerical codes 0, 1, 2), and predictive mean matching (PMM) (using a linear prediction model to obtain predicted values and k = 5 and 10 for randomly drawing from k^{th} nearest observed values to the predicted value) [29].
With the twofold FCS algorithm, missing values in maternal smoking were imputed using information from only specific and immediately adjacent time points, and assuming a multinomial logistic imputation model (ordinal logistic regression is not available in current implementation of twofold FCS) [30].
We used a linear mixedeffects model with a random intercept as our analysis model. Even though we used a multilevel analysis model, missing data were imputed using singlelevel fixedeffect imputation methods. These singlelevel fixedeffect MI methods allow an unstructured correlation structure between the repeated measurements. This indicates that no unnecessary assumptions are made about the correlations, which makes the singlelevel fixedeffect MI methods more general than a multilevel MI method. Furthermore, all imputation models included all variables in the analysis model as predictors, as well as the timedependent auxiliary variable maternal depression [31]. Hence the MI methods considered are approximately compatible with the analysis model. Even though singlelevel fixedeffect MI may lead to increased precision, the statistical literature has highlighted limitations of this method: it can inflate the sampling variance, lead to low coverage probabilities, and may be computationally demanding. These issues are discussed by Enders et al. [32].
Restrictionadapted version
We used a semideterministic approach, where missing values in maternal smoking at waves 1 to 5 were imputed according to a threestage process, as follows:

Stage 1: If a respondent was observed as a neversmoker at a specific wave, any missing values in all previous waves were deterministically assigned to be a neversmoker (Fig. 2a).

Stage 2: If a respondent was observed as a current or exsmoker at a specific wave, any missing values in all subsequent waves were imputed stochastically as current or exsmokers (i.e. as a binary variable) (Fig. 2b).

Stage 3: For the remaining scenarios (Fig. 2c), the missing values were imputed stochastically as never, current or exsmokers.
In stage 3, it is inevitable that a small proportion of imputed values will violate the restrictions. However, we accepted these implausible values as it would be difficult to further introduce restrictions within the already existing restrictions.
Performance measures for evaluating different methods
We estimated the target analysis parameters (β_{1, a} (a = 1,2) of Eq. 1) by fitting the linear mixedeffects model in Eq. 1.
We compared the performances of CCA, ACA, and the different MI methods (standard and restrictionadapted versions) using the absolute bias (difference between true value and average of MI estimates calculated from 1000 simulations); empirical standard error (square root of variance of 1000 estimates); and coverage of 95% confidence interval (proportion of simulated datasets in which the true parameter value was contained in the estimated 95% confidence interval). The relative bias (bias relative to true parameter value), the modelbased standard error (average of standard errors of 1000 estimates) and mean square error (MSE), which is a combined measure of bias and efficiency [33], were also reported. The Monte Carlo errors for the MI estimates were used to assess the variation in estimated parameters across the simulations [34].
Case study analysis
In addition to the simulation study, we also provide an empirical comparison of the methods considered, using the data from the LSAC infant cohort. We used wavespecific measures of whether the mother currently smoked or not to derive the neversmoker, exsmoker and currentsmoker at waves 1 through 6 (see Additional file 1).
Stata 13 statistical software [35] was used for all analyses.
Results
Results from simulation study
The standard and twofold FCS methods with multinomial logistic regression imputation models failed to converge in all 1000 simulations for each of the 6 simulation scenarios. Standard FCS with ordinal logistic regression imputation showed extremely high nonconvergence rates (up to 95%). The results for standard and twofold FCS methods with multinomial logistic regression imputation, and FCS with ordinal regression are no longer considered in the following description of the results.
As expected we observed minimal bias under CCA and ACA when data were MCAR, with the relative bias not exceeding 3% (Figs. 3a and 4a). In both MCAR scenarios, the MI methods (FCS with PMM, indicatorPDBR and continuouscalibration) produced more biased estimates than CCA and ACA (a minimum relative bias of 0.05% produced by CCA and for the MI methods a maximum relative bias of 19.01% produced by continuouscalibration without restrictions). However, when data were MAR, the CCA resulted in more bias than most MI approaches, particularly in the strong MAR scenario (Figs. 3c and 4c). ACA still produced low bias (relative bias less than 10%) (Additional file 1: Tables S4S7) and performed better than all of the MI methods in nearly all scenarios. FCS with PMM performed better than the other MI methods in terms of bias, in most MAR scenarios, under the standard implementation of MI, and we observed further reductions in bias under the restrictionadapted version, with the relative bias remaining under 10% for all missingness scenarios. Both imputation approaches under MVNI resulted in a high level of bias under the standard version. Convergence issues in up to 0.3% of the simulations across the 6 missingness mechanisms when no restrictions were applied were observed with indicatorPDBR. Little difference was observed in bias for indicatorPDBR with restrictions compared to the standard version; however, the nonconvergence was lowered to a maximum of 0.1% across the 6 scenarios. We observed substantial reductions in bias for continuouscalibration (a reduction of relative bias of up to 26%, Fig. 4c) under the restrictionadapted version compared to the standard implementation.
For all MI methods with no issues of convergence, we found substantial gains in precision compared to CCA. However, for ACA we observed slightly larger empirical standard errors compared to these MI approaches. Across these MI methods, there was minimal difference in precision irrespective of the imputation approach and whether it was applied with or without restrictions. The gain in precision for MI compared to CCA and ACA was also reflected in the MSE, in which the MI methods produced a substantially lower MSE compared to CCA, and a slightly lower MSE compared to ACA. FCS with PMM performed better in terms of MSE than the other imputation approaches in most missingness scenarios when no restrictions were applied, however, we did not observe much difference in MSE when restrictions were applied.
The coverage was within 93.6 and 96.4% for the nominal level of 95% (expected range for coverage based on 1000 simulations) for most scenarios. However, a slight overcoverage was reported by both continuouscalibration and FCS with PMM for parameter estimates corresponding to exsmokers relative to neversmokers, under both standard and restrictionadapted versions.
Results from case study
Similar to the simulation study, the multinomial and ordinal logistic imputation models fitted under the FCS methods (both with and without restrictions) did not converge. Additionally, indicatorPDBR with restrictions, which showed some convergence issues in the simulation study, did not converge with the real data.
As shown in Fig. 5, the CCA produced slightly large estimates for the mean differences and wider confidence intervals compared to the ACA and the MI methods that converged. The ACA gave smaller standard errors and narrower confidence intervals than all MI methods. Continuouscalibration and FCS with PMM were the only MI methods with restrictions that converged. We observed minimal differences in the estimates and confidence intervals when these methods were used with restrictions compared to without restrictions (Additional file 1: Table S3).
Discussion
We compared the performance of MI methods, MVNI, FCS, and twofold FCS, applied with and without restrictions, in addition to CCA and ACA, for handling missing data in a categorical variable with restrictions over time. We considered 6 different scenarios of missing data in maternal smoking, a longitudinal categorical exposure with three levels; neversmoker, exsmoker and currentsmoker, where an ex or currentsmoker at a specific wave is restricted from transitioning into a neversmoker.
Consistent with previously published studies [9, 21, 36,37,38], CCA and ACA produced negligible bias under MCAR. CCA excluded all individuals with missing data in at least one wave from the analysis. Missing data in maternal smoking were generated such that missingness was dependent on the outcome, BMI for age zscores, after conditioning on the variables of the target analysis model. Therefore, as expected CCA produced biased estimates when data were MAR, with larger bias in the stronger MAR scenario. In contrast, in nearly all missingness scenarios investigated, ACA produced less biased estimates than MI without restrictions. This may be due to ACA accounting for most of the missingness mechanism due to the correlation between the repeated measurements. The imputation of implausible transitions under standard MI without restrictions is a possible reason for why this method produced more biased estimates than ACA. Furthermore, standard MVNI and FCS methods do not account for the temporal ordering of the repeated measurements as they treat repeated measurements of the same variable as distinct variables [9], which may explain the underperformance. However, simulation studies by Kalaycioglu et al. [5] and De Silva et al. [4] have shown that both MVNI and FCS may not be to susceptible to this issue as they have both been shown to have very good performance when including as much information as possible (i.e. all the repeated measurements) in the imputation model, as implemented in our study. Conversely, Kalaycioglu et al. [5] reported more biased estimates using ACA compared with MI without restrictions in the presence of multiple longitudinal variables with missing data, many of which were not restricted. In terms of precision, we observed substantial and slight gains with MI in both standard and restrictionadapted versions compared to CCA and ACA respectively, consistent with previous studies [4, 5]. This was presumably because we used maternal depression (a fully observed timedependent variable) in the imputation models, which was a strong predictor of missingness [4, 21, 31, 39, 40].
The standard FCS approach imputing smoking using multinomial or ordinal logistic regression imputation failed to converge in 95–100% of the simulated datasets. Our findings agree with the results of simulation studies by Welch et al. [9] and Kalaycioglu et al. [5], which reported convergence issues in FCS, albeit of smaller proportions. Welch et al. [9], assumed that nonsmokers at baseline remained nonsmokers throughout, and only current and exsmokers transitioned between the two categories, thus converting the imputation of maternal smoking into a binary imputation. Despite this, approximately 25% of the simulated datasets did not converge with standard FCS [9]. Of note, application of the twofold FCS in our simulation study, which reduced the number of categorical predictor variables in each univariate imputation model [30] where imputation of smoking was performed using multinomial logistic regression, still did not overcome the convergence issues. We observed similar convergence issues as seen in the simulation study with the real data.
Multinomial logistic regression faces difficulties of convergence when the imputation model includes a large number of categorical variables with rare categories and/or high collinearity. In our study, under FCS, six categorical smoking variables (one for each time point) were included in the multinomial logistic imputation model, and only a small number of exsmokers were present in the simulated data mimicking the real cohort. Even though under the twofold FCS algorithm only four categorical smoking variables (current and immediately adjacent time points, and smoking during pregnancy) were included in the multinomial logistic imputation model, all of these variables had a rare category leading to convergence issues.
FCS with PMM imputation produced the least biased estimates when compared to other MI methods irrespective of whether restrictions were applied. It also produced the smallest MSE across the 6 missing mechanisms, gaining precision over ACA, which performed best in terms of bias. While all other MI methods either failed to converge for all simulated datasets or resulted in large bias, PMM performed well both with and without restrictions. PMM replaces missing values with observed values [29, 41], therefore, even without restrictions, the proportion of implausible transitions imputed was low. PMM also avoids the problems arising from rounding methods related to MVNI. Slight issues of overcoverage were observed under PMM. Rodwell et al. [42] also reported issues with coverage when using PMM for imputing limited range variables, due to the matching algorithm used in Stata for PMM imputation. PMM uses three different types (0, 1 and 2) of matching to calculate a predictive distance between an observed value and a value obtained from a linear predictor, and identifies k observations which minimise this predictive distance. The `mi impute pmm’ command in Stata uses type 2 matching. PMM can also be implemented in R using the `mice’ package which uses type 1 matching. Type 2 matching differs from type 1 matching in that it does not adequately account for the uncertainty around the parameter of the imputation model when computing the predictive distance. A simulation study by Morris et al. [41] reported undercoverage for PMM under both type 1 and type 2 matching, with type 2 matching leading to slightly worse coverage probabilities for this reason. Therefore, the coverage probabilities may have been better when implementing PMM using the `mice’ package in R compared to the `mi impute pmm’ command in Stata.
Simulation studies by Kalaycioglu et al. [5] and De Silva et al. [4] have shown that MVNI can have very good performance when used to impute missing longitudinal data. However, the underlying assumption of multivariate normality is not plausible in our study as maternal smoking is a categorical variable. While MVNI can result in valid inferences despite the departure from multivariate normality [6, 43], adoption of a suitable rounding method to deal with noninteger imputed smoking values is required for the analysis of interest. There are number of rounding techniques available for categorical variables at a single time point [44, 45], rounding methods in the context of longitudinal data are yet to be explored [32]. We observed high biases with both MVNI approaches under different scenarios, especially without restrictions. Presumably because, indicatorPDBR uses an indicator based approach for imputation followed by projected distancebased rounding, which does not aim to preserve the marginal proportion in each category, and continuouscalibration imputes maternal smoking as a continuous variable, followed by calibration for rounding, which distorts the association between the exposure and outcome, even though it aims to preserve the marginal proportion in each category [44, 45]. Continuouscalibration resulted in substantial reductions in bias when restrictions were applied, and there were slight gains in MSE from continuouscalibration compared to indicatorPDBR, which agrees with the findings of Galati et al. [45]. It should, however, be noted that continuouscalibration was originally proposed for ordinal variables [44], while maternal smoking is technically a nominal variable. IndicatorPDBR also faced some convergence issues, presumably because it uses an indicatorbased approach for imputation [44].
The threestage restriction procedure employed in our study is an extension of the semideterministic approach used by Welch et al. [9], where they simplified the imputation to ex and currentsmokers as discussed previously. We observed moderate to substantial reductions in bias for PMM and continuouscalibration, and fewer convergence issues for indicatorPDBR, when restrictions were applied. However, when restrictions were applied, we observed that the empirical standard errors either slightly increased or remained the same compared with the standard implementation of MI. The MSE was greatly influenced by the empirical standard error due to its relatively large magnitude compared with absolute bias, therefore, even in scenarios which showed substantial improvements in bias, little or no change in empirical standard errors resulted in no changes in MSE, when restrictions were applied.
There is currently limited guidance on the imputation of missing values in timedependent categorical variables even without restrictions. With standard FCS often facing convergence issues in the presence of categorical variables with rare categories, and unsatisfactory rounding methods for MVNI, this area warrants further research. Enders et al. [32] suggested using a joint imputation procedure with latent variable formulation for categorical variables, available in the MLwiN software [46]. The ‘jomo’ package in R is designed for multilevel joint modelling MI [47], but to date has not been widely adopted. Our study was limited to currently available methods in the Stata statistical software and multilevel MI methods such as ‘jomo’ are currently not available in Stata. Additionally, further research is required to examine how to implement restrictions within these multilevel imputation methods, and this was beyond the scope of this study.
Our simulation study was designed based on the LSAC infant cohort to assess the performance of MI methods in a realistic setting [4, 21, 36]. We also provide a case study for an empirical illustration of what we observed in the simulation study. This simulation study was designed based on a single cohort, and the performance of the methods may vary with changes in various factors including, magnitude and structure of the correlations between the repeated measurements, and magnitudes of the parameters used in the simulation models [21]. Therefore, caution is required when generalising these results.
Conclusion
The findings from this study, which was based on a longitudinal cohort study, indicate that among the MI methods available in Stata (which are all singlelevel fixedeffect models), FCS with PMM, applied with restrictions, performs best in terms of bias and precision, when handling up to 65% missing values in a timedependent categorical exposure variable with restrictions on transitioning over time. In a similar longitudinal setting, we would recommend the use of PMM within the FCS framework with a suitable procedure to implement restrictions within the imputations.
Abbreviations
 ACA:

available case analysis
 BMI:

body mass index
 BMIz:

BMI for age zscores
 CCA:

complete case analysis
 continuouscalibration:

imputation as a continuous variable using multivariate normal imputation with calibration
 FCS:

fully conditional specification
 indicatorPDBR:

indicator based imputation using multivariate normal imputation with projected distancebased rounding
 LSAC:

Longitudinal Study of Australian Children
 MAR:

missing at random
 MCAR:

missing completely at random
 MI:

multiple imputation
 MSE:

Mean square error
 MVNI:

multivariate normal imputation
 PMM:

predictive mean matching
 twofold FCS:

twofold fully conditional specification
References
 1.
Karahalios A, Baglietto L, Carlin JB, English DR, Simpson JA. A review of the reporting and handling of missing data in cohort studies with repeated assessment of exposure measures. BMC Med Res Methodol. 2012;12:96–106.
 2.
Hayati Rezvan P, Lee KJ, Simpson JA. The rise of multiple imputation: a review of the reporting and implementation of the method in medical research. BMC Med Res Methodol. 2015;15:30–43.
 3.
Rubin DB. Multiple imputation for nonresponse in surveys. New York: Wiley; 1987.
 4.
De Silva AP, MorenoBetancur M, De Livera AM, Lee KJ, Simpson JA. A comparison of multiple imputation methods for handling missing values in longitudinal data in the presence of a timevarying covariate with a nonlinear association with time: a simulation study. BMC Med Res Methodol. 2017;17(1):114–24.
 5.
Kalaycioglu O, Copas A, King M, Omar RZ. A comparison of multipleimputation methods for handling missing data in repeated measurements observational studies. Royal Statistical Society. 2015;179(3):683–706.
 6.
Schafer JL. Analysis of incomplete multivariate data. United States of America: CRC Press; 1997.
 7.
Raghunathan TE, Lepkowsi JM, Van Hoewyk J, Solenberger P. A multivariate technique for multiply imputing missing values using a sequence of regression models. Sampling Methodology. 2001;27(1):85–95.
 8.
Van Buuren S, Brand JPL, GroothuisOudshoorn CGM, Rubin DB. Fully conditional specification in multivariate imputation. J Stat Comput Simul. 2006;76(12):1049–64.
 9.
Welch C, Petersen I, Bartlett JW, White IR, Marston L, Morris RW, Nazareth I, Walters K, Carpenter J. Evaluation of twofold fully conditional specification multiple imputation for longitudinal electronic health record data. Stat Med. 2014;33(21):3725–37.
 10.
Nevalainen J, Kenward MG, Virtanen SM. Missing values in longitudinal dietary data: a multiple imputation approach based on a fully conditional specification. Stat Med. 2009;28(29):3657–69.
 11.
Rubin DB. Multiple imputation after 18+ years. Journal of American Statistical Association. 1996;91(434):473–89.
 12.
Hussain S, Mohammed MA, Haque MS, Holder R, Macleod J, Hobbs R. A simple method to ensure plausible multiple imputation for continuous multivariate data. Communications in Statistics  Simulation and Computation. 2010;39(9):1779–84.
 13.
Australian Institute of Family Studies. The Longitudinal Study of Australian Children: An Australian Government Initiative, Data User Guide. 2013.
 14.
Halfon N, Larson K, Lu M, Tullis E, Russ S. Lifecourse health development: past, present and future. Matern Child Health J. 2014;18(2):344–65.
 15.
De Stavola BL, Nitsch D, dos Santos Silva I, McCormack V, Hardy R, Mann V, Cole TJ, Morton S, Leon DA. Statistical issues in life course epidemiology. Am J Epidemiol. 2006;163(1):84–96.
 16.
Al Mamun A, Lawlor DA, Alati R, O'Callaghan MJ, Williams GM, Najman JM. Does maternal smoking during pregnancy have a direct effect on future offspring obesity? Evidence from a prospective birth cohort study. Am J Epidemiol. 2006;164(4):317–25.
 17.
Moller SE, Ajslev TA, Andersen CS, Dalgard C, Sorensen TI. Risk of childhood overweight after exposure to tobacco smoking in prenatal and early postnatal life. PLoS One. 2014;9(10):e109184.
 18.
Robinson O, Martinez D, Aurrekoetxea JJ, Estarlich M, Somoano AF, Iniguez C, SantaMarina L, Tardon A, Torrent M, Sunyer J, et al. The association between passive and active tobacco smoke exposure and child weight status among Spanish children. Obesity. 2016;24(8):1767–77.
 19.
Thurber KA, Dobbins T, Kirk M, Dance P, Banwell C. Early life predictors of increased body mass index among indigenous Australian children. PLoS One. 2015;10(6):e0130039.
 20.
Thoemmes F, Mohan K. Graphical representation of missing data problems. Struct Equ Model Multidiscip J. 2015;22(4):631–42.
 21.
Karahalios A, Baglietto L, Lee KJ, English DR, Carlin JB, Simpson JA. The impact of missing data on analyses of a timedependent exposure in a longitudinal cohort: a simulation study. Emerging Themes in Epidemiology. 2013;10(1):6–17.
 22.
Pigott TD. A review of methods for missing data. Educ Res Eval. 2001;7(4):353–83.
 23.
Bartlett JW, Seaman SR, White IR, Carpenter JR. For the Alzheimer's disease neuroimaging I: multiple imputation of covariates by fully conditional specification: accommodating the substantive model. Stat Methods Med Res. 2014;24(4):462–87.
 24.
Romaniuk H, Patton GC, Carlin JB. Multiple imputation in a longitudinal cohort study: a case study of sensitivity to imputation methods. Am J Epidemiol. 2014;180(9):920–32.
 25.
Williamson EJ, Aitken Z, Lawrie J, Dharmage SC, Burgess JA, Forbes AB. Introduction to causal diagrams for confounder selection. Respirology. 2014;19(3):303–11.
 26.
Allison PD. Missing data. Thousand Oaks, Calif. London: SAGE Publications; 2002.
 27.
Yucel RM, He Y, Zaslavsky AM. Gaussianbased routines to impute categorical variables in health samplings. Stat Med. 2011;30(29):3447–60.
 28.
Yucel RM, He Y, Zaslavsky AM. Using calibration to improve rounding in imputation. Am Stat. 2008;62(2):125–9.
 29.
Little RJA. Missingdata adjustments in large samplings. J Bus Econ Stat. 1988;6(3):287–96.
 30.
Welch C, Bartlett J, Peterson I. Application of multiple imputation using the twofold fully conditional specification algorithm in longitudinal clinical data. Stata J. 2014;14(2):418–31.
 31.
White IR, Royston P, Wood AM. Multiple imputation using chained equations: issues and guidance for practice. Stat Med. 2011;30(4):377–99.
 32.
Enders CK, Mistler SA, Keller BT. Multilevel multiple imputation: a review and evaluation of joint modeling and chained equations imputation. Psychol Methods. 2016;21(2):222–40.
 33.
Burton A, Altman DG, Royston P, Holder RL. The design of simulation studies in medical statistics. Stat Med. 2006;25(24):4279–92.
 34.
White IR. Simsum: analyses of simulation studies including Monte Carlo error. Stata J. 2010;10(3):369–85.
 35.
StataCorp. Stata statistical software, release 13. College Station: StataCorp LP; 2013.
 36.
Marshall A, Altman DG, Royston P, Holder RL. Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study. BMC Med Res Methodol. 2010;10:7–22.
 37.
Demissie S, LaValley MP, Horton NJ, Glynn RJ, Cupples LA. Bias due to missing exposure data using completecase analysis in the proportional hazards regression model. Stat Med. 2003;22(4):545–57.
 38.
Knol MJ, Janssen KJ, Donders AR, Egberts AC, Heerdink ER, Grobbee DE, Moons KG, Geerlings MI. Unpredictable bias when using the missing indicator method or complete case analysis for missing confounder values: an empirical example. J Clin Epidemiol. 2010;63(7):728–36.
 39.
Graham JW, Collins LM. Using modern missing data methods with auxiliary variables to mitigate the effects of attrition on statistical power. In: Missing Data. Edn. New York: Springer; 2012. p. 253–75.
 40.
Collins LM, Schafer JL, Kam CM. A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychol Methods. 2001;6(4):330–51.
 41.
Morris TP, White IR, Royston P. Tuning multiple imputation by predictive mean matching and local residual draws. BMC Med Res Methodol. 2014;14:75–98.
 42.
Rodwell L, Lee KJ, Romaniuk H, Carlin JB. Comparison of methods for imputing limitedrange variables: a simulation study. BMC Med Res Methodol. 2014;14:57–75.
 43.
Lee KJ, Carlin JB. Multiple imputation for missing data: fully conditional specification versus multivariate Normal imputation. Am J Epidemiol. 2010;171(5):624–632.44.
 44.
Lee KJ, Galati JC, Simpson JA, Carlin JB. Comparison of methods for imputing ordinal data using multivariate normal imputation: a case study of nonlinear effects in a large cohort study. Stat Med. 2012;31(30):4164–74.
 45.
Galati JC, Seaton KA, Lee KJ, Simpson JA, Carlin JB. Rounding nonbinary categorical variables following multivariate normal imputation: evaluation of simple methods and implications for practice. J Stat Comput Simul. 2012;84(4):798–811.
 46.
Goldstein H, Carpenter J, Kenward MG, Levin KA. Multilevel models with multivariate mixed response types. Statistical Modelling: An International Journal. 2009;9(3):173–97.
 47.
Quartagno M, Carpenter J. Package 'jomo'. R statistical software package. 2016.
Acknowledgements
Not Applicable.
Funding
This work was supported by funding from the National Health and Medical Research Council: a Centre of Research Excellence grant, ID 1035261, awarded to the Victorian Centre of Biostatistics (ViCBiostat); and a Senior Research Fellowship ID 1104975 (JAS) and Career Development Fellowship ID 1053609 (KJL). APDS is funded by a Victorian International Research Scholarship and a Melbourne International Fee Remission Scholarship.
Availability of data and materials
All data generated and analysed during the current study are available from the corresponding author on reasonable request.
Author information
Affiliations
Contributions
Authors APDS and JAS designed the study with critical review from AMDL, MMB and KJL. APDS performed the simulation study and statistical analyses under the supervision of JAS and AMDL. APDS drafted the paper with input from JAS, MMB, AMDL and KJL. All authors were responsible for critical revision of the manuscript and have approved the final version to be published.
Corresponding author
Correspondence to Anurika Priyanjali De Silva.
Ethics declarations
Ethics approval and consent to participate
For the simulation study, data were completely simulated, which did not require approval from the ethics committee or consent from participants. The case study example used in this study was based on the infant cohort of LSAC which has been provided ethical clearance by the Australian Institute of Family Studies Ethics Committee. The authors are approved users of the LSAC data and were granted access to the data through The University of Melbourne’s Organisational Deed of License. Written informed consent was obtained from the caregiver on behalf of each of the study children, as the children were minors at the time of data collection. The signed consent forms are retained by the field agency (Australian Bureau of Statistics).
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file
Additional file 1:
Comprehensive details and findings of simulation study including Stata code. (DOCX 162 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Received
Accepted
Published
DOI
Keywords
 Fully conditional specification
 Longitudinal categorical data
 Missing data
 Multiple imputation
 Multivariate normal imputation
 Restricted transitions