Does pattern mixture modelling reduce bias due to informative attrition compared to fitting a mixed effects model to the available cases or data imputed using multiple imputation?: a simulation study

Background Informative attrition occurs when the reason participants drop out from a study is associated with the study outcome. Analysing data with informative attrition can bias longitudinal study inferences. Approaches exist to reduce bias when analysing longitudinal data with monotone missingness (once participants drop out they do not return). However, findings may differ when using these approaches to analyse longitudinal data with non-monotone missingness. Methods Different approaches to reduce bias due to informative attrition in non-monotone longitudinal data were compared. To achieve this aim, we simulated data from a Whitehall II cohort epidemiological study, which used the slope coefficients from a linear mixed effects model to investigate the association between smoking status at baseline and subsequent decline in cognition scores. Participants with lower cognitive scores were thought to be more likely to drop out. By using a simulation study, a range of scenarios using distributions of variables which exist in real data were compared. Informative attrition that would introduce a known bias to the simulated data was specified and the estimates from a mixed effects model with random intercept and slopes when fitted to: available cases; data imputed using multiple imputation (MI); imputed data adjusted using pattern mixture modelling (PMM) were compared. The two-fold fully conditional specification MI approach, previously validated for non-monotone longitudinal data under ignorable missing data assumption, was used. However, MI may not reduce bias because informative attrition is non-ignorable missing. Therefore, PMM was applied to reduce the bias, usually unknown, by adjusting the values imputed with MI by a fixed value equal to the introduced bias. Results With highly correlated repeated outcome measures, the slope coefficients from a mixed effects model were found to have least bias when fitted to available cases. However, for moderately correlated outcome measurements, the slope coefficients from fitting a mixed effects model to data adjusted using PMM were least biased but still underestimated the true coefficients. Conclusions PMM may potentially reduce bias in studies analysing longitudinal data with suspected informative attrition and moderately correlated repeated outcome measurements. Including additional auxiliary variables in the imputation model may also reduce any remaining bias. Electronic supplementary material The online version of this article (10.1186/s12874-018-0548-0) contains supplementary material, which is available to authorized users.


Background
Informative attrition is a potential source of bias in longitudinal data analysis, which occurs when participants drop out of a study and the reason for drop out is associated with the study outcome [1]. Analysing longitudinal data ignoring informative attrition may bias findings due to selection bias. For example, if lower cognitive functioning, the outcome, is associated with drop out, participants with lower cognitive function are more likely to be missing. Informative attrition can only be assumed to exist from our knowledge of the data; but this association cannot be determined since the required data are missing.
A few different approaches can reduce bias due to missing data in longitudinal studies. Firstly, fitting a mixed effects model (with random intercept and slope) to the available cases (AC) will exclude participants with missing outcome or exposure values at all data collection phases but will analyse participants who have values missing only at some phases by allowing for the missing data using the within and between participant correlations. However, the mixed effects model may not reduce all the bias due to informative attrition if not enough information exists in the data.
The second approach for handling missing data is multiple imputation (MI) [2]. MI repeatedly selects random values from the missing data distribution, given the observed data, defined using an imputation model. The repeated draws generate many imputed datasets and the mixed effects model is fitted to each dataset separately and these results combined using Rubin's rules [3]. An approach, validated to impute missing values in non-monotone, longitudinal data is the two-fold fully conditional specification (FCS) algorithm, which imputes missing values at each phase sequentially, conditional on the observed information at adjacent phases [4,5]. One benefit of MI compared to the AC analysis is that MI can use additional information, as auxiliary variables, in the imputation model to reduce bias. However, to achieve unbiased results from imputed data, the data needs to have a plausible ignorable missingness mechanism, that is, the probability of the data being missing is not associated with the missing values, conditional on the observed data [2]. If informative attrition is present, the missingness mechanism is non-ignorable and MI alone cannot completely reduce the bias.
A third approach is pattern mixture modelling (PMM) [3], which can be used as a sensitivity analysis if the missingness mechanism is thought to be non-ignorable. The procedure first assumes an ignorable missingness mechanism and uses MI to impute missing values, generating multiple datasets. Next, to use the PMM approach, adjust the imputed values by a fixed value.
Larger adjustment values suggest greater ignorable missing assumption violation. Finally, the mixed effects model is fitted using these adjusted values in each dataset and the results combined using Rubin's rules. Other sensitivity analyses exist, for example selection modelling, which specifies a selection distribution for those who drop out [3] or inverse probability weighting [6] which can correct for the bias due to missing data. However, we will focus on PMM since we can begin by assuming an ignorable missingness mechanism and then incorporate non-ignorable assumptions into the model.
Many clinical trial studies with longitudinal data recommend using PMM as a sensitivity analysis [7][8][9][10]. In general, clinical trial data have a monotone missingness pattern (non-response at a given phase will be missing at all later phases), which simplifies MI and, therefore, PMM. However, in observational longitudinal studies, data are often missing due to non-response as well as attrition, giving the data a non-monotone missingness pattern. In addition, the missingness mechanism for participants with repeated non-response status, who have not officially withdrawn from the study, may be more similar to participants with attrition status compared to participants who alternate between response and non-response. In this context, using MI to impute missing values may be more complex compared to clinical trials.
For this analysis, a simulation study was designed to evaluate these different approaches by comparing PMM to an available case mixed effects model and multiple imputation. Fully observed datasets, with known distributions and associations are generated, and a mixed effects model fitted to these datasets to obtain 'true' coefficients and standard errors (SE). Then, informative attrition is defined in the dataset using a non-ignorable missingness mechanism of our choice.
By replacing selected values with missing values, each approach can be used to account for bias due to informative attrition by analysing the data and comparing the coefficients and SE to 'true' estimates to assess bias and precision [11]. For this simulation study, we used distributions and associations in the Whitehall II study [12]. Data were first collected for over 10,000 civil servants in 1985 and data collection phases were repeated every 2-3 years. Participants completed a health and lifestyle questionnaire and, at alternate phases, attended a screening clinic. Over 30 years, analyses of the Whitehall II study have resulted in many publications. One investigated the association between smoking status at baseline (Phase 5) and 10-year cognitive decline using a mixed effects model with random intercept and slope [13]. This analysis was used as the basis for our simulation study to investigate whether informative attrition of participants with reduced cognitive function, who may have been unable to continue participation in the study, could give rise to bias in the estimates of association.
The aim of our study was to compare bias and precision of fitting the mixed effects model to the AC with an analysis which imputes data using the two-fold FCS algorithm and uses PMM to reduce bias due to informative attrition. With informative attrition, we expect least biased results when we apply methods which assume non-ignorable missingness such as PMM. However, by using a simulation study, we can assess whether the results are as expected and also quantify the difference in bias for PMM compared to approaches which assume ignorable missingness. In addition, simulation allows the effects of different percentages of informative attrition and different types of covariates with missing data to be assessed.
For our study, 1,000 fully observed datasets were simulated, each with 10,000 participants, having the same distributions and associations as observed in the Whitehall II study. From the missing data distributions observed in the Whitehall II study, an ignorable missingness mechanism for participants without attrition status and a non-ignorable missingness mechanism for participants with attrition status were first created. As it is not known how many with non-response status have a non-ignorable missingness mechanism, the analysis was repeated, generating a non-ignorable missingness mechanism for all participants with attrition or non-response status. We used a sensitivity analysis to investigate how results change if missing values are imputed for the time-independent covariate education at Phase 5 instead of time-dependent covariate smoking status.

Study design
Longitudinal records exist for i = 1,...,N independent participants with Y i,t the outcome values for participant i at phase t (a time period when data collection occurs). It is assumed that explanatory variable X exists with values at t = 1,...,T (typically equally spaced) phases. Let X i,t denote the value of variable X for individual i at phase t.
The substantive model (model of interest) is a linear mixed effects model adjusted for the explanatory variables' main effect and their interaction with data collection phase, together with random intercept β 0i and slope β 1i : The two-fold fully conditional specification algorithm Historically, to impute missing data for more than one variable, random draws were selected from a multivariate normal conditional distribution for the variables with missing values, conditional on the observed data, to obtain a complete dataset of observed and imputed values [14]. The approach generated multiple datasets, each analysed separately and the results combined using Rubin's rules [2]. In many cases, the multivariate normal model is difficult to define, for example if the rows or columns are ordered (such as with longitudinal data), or are not multivariate normally distributed, for example with different variable types (such as categorical). A more flexible approach, fully conditional specification (FCS) [15], selects random draws from separate conditional, univariate imputation models for each variable with missing data, repeatedly cycling through each variable in turn. Compared to fitting a multivariate normal model, FCS is computationally convenient and, despite a lack of theoretical justification, simulation studies found using FCS to impute missing values achieves similar results compared to using a multivariate model [16,17]. For longitudinal data with t = 1,...,T phases and j = 1,...,J variables measured at each phase, FCS imputes missing values for the J variables at each phase t. FCS repeats these imputations at each phase and the J T imputations constitute one iteration. However, the imputed data may lose the correlation structure between phases and biased estimates may be observed from analysing data imputed using FCS by not conditioning on measurements at other phases. The J T variables could be imputed simultaneously but, with many highly correlated repeated measurements, this may cause convergence problems due to collinearity, particularly for categorical variables [18].
Collinearity issues can be avoided and the correlation structure maintained in the longitudinal data by imputing using the two-fold FCS algorithm [4], which imputes missing values at phase t using FCS conditional on values at phase t and adjacent phases; a within-time iteration. The two-fold FCS algorithm repeats b W within-time iterations at each phase, generally in time order, and completes one among-time iteration when all phases are imputed. This is repeated for b A among-time iterations. Once the specified within-time and among-time iterations are complete, the first imputed dataset consists of current imputed and observed values. This is repeated M times to create M imputed datasets, which are each analysed separately and the results combined using Rubin's rules [2]. By conditioning on only adjacent phases, the two-fold FCS algorithm is more efficient compared to approaches which do not use information at other phases and can impute missing values in large datasets with many participants, phases and variables [5].
Missing outcome (Y) and covariate (X) values were imputed using the two-fold FCS algorithm, each imputed dataset analysed using substantive model (Eq. 1) and the results combined using Rubin's rules. The imputation model included all variables specified in Eq. 1, including the outcome, but no additional auxiliary variables were included to simplify the interpretation of the results. No interactions with time were specified since the two-fold FCS algorithm includes interactions with time by imputing each time point separately. The two-fold FCS algorithm was used, with 5 within-time iterations and 20 among-time iterations, to impute missing values at Phases 3, 5, 7, 9 and 11. Smoking status was conditioned on, at baseline only (no other phases), when smoking status was not missing, in order to avoid convergence issues due to collinearity. Twenty imputed datasets were generated, the substantive model fitted to each dataset, and the results combined using Rubin's rules.

Pattern mixture modelling
Analysing data imputed using the two-fold FCS algorithm can achieve unbiased findings if an ignorable missingness mechanism can be assumed for the data. However, in longitudinal observational studies, attrition is often associated with the missing outcome values Y (informative) and non-ignorable missingness [19]. If an ignorable missingness mechanism cannot be assumed, fitting a mixed effects model (Eq. 1) to the AC or data imputed using the two-fold FCS algorithm may still produce biased results. An approach that reduces the bias due to informative attrition is required.
In this situation, the outcome distributions may differ, depending on the missing data patterns. For example, there may be different patterns of missing observations, each potentially with a different joint distribution of partially observed and fully observed data with the overall density being the average of these patterns [3]. For each pattern, the joint distribution of the partially and fully observed variables is specified, which implies, within each pattern, a conditional distribution exists for the partially observed data given the fully observed data. To apply PMM, an ignorable missingness is assumed initially and missing values imputed using the two-fold FCS algorithm. These imputed values are then changed to reflect explicit assumptions about the difference between the observed and conditional distribution when the variables are unobserved [3].
In our data we have two missing data patterns, observed outcomes and missing outcomes. The distribution of the observed outcome pattern is given by Eq. 1. For the missing outcome pattern, we define an attrition indicator R i,t for participant i who leaves the study at phase t and add this to the observed outcome pattern: where k is the assumed mean difference between the imputed outcome distribution and the unknown true distribution which cannot be estimated from the observed data. If k = 0, the missingness mechanism is ignorable, otherwise for k ≠ 0 the mechanism is non-ignorable. Larger k suggests a greater violation of the ignorability assumption.
The PMM steps are; first, use the two-fold FCS algorithm to impute the missing data and generate M imputed datasets. For each imputed dataset, change the already imputed outcome values Y i,t , missing due to attrition R i,t , by k. Finally, fit the substantive model (Eq. 1) to the imputed dataset with updated outcome values.

Data generation and simulation process
A simulation study was designed using the Whitehall II study data. Exposure-outcome relationships were simulated using an existing epidemiological investigation of the association between smoking status at baseline (Phase 5) and 10-year cognitive decline using cognitive function measured at Phases 5, 7 and 9, each 5 years apart [13]. In the original study, Sabia, et al., stratified by sex and derived a 4 category smoking status variable. To simplify the analysis for the simulation study, only male participants and 3 smoking status categories (current smokers, ex-smokers and never smokers) were used. The distribution and associations of the variables and missing data were replicated.
For each of the 1,000 simulations, the following steps were used and are described in detail later in this section: if not missing due to attrition -change observations to missing using an ignorable missingness mechanism at each phase. if missing due to attrition -change observations to missing using a non-ignorable missingness mechanism at each phase.
4. Fit substantive model to AC, record parameter coefficients and SE. 5. Impute missing data using the two-fold FCS algorithm and fit the model of interest to each imputed dataset, combine the results using Rubin's rules [2] and record the imputation-based parameter coefficients and SE. 6. Apply PMM to the imputed datasets from step 5, adjust imputed values by a fixed value, re-analyse and record the imputation-based parameter coefficients and SE.

Data generation mechanism
We generated data at Phases 3, 5, 7, 9 and 11 because smoking status and cognitive function were recorded at these clinic phases. The substantive model was fitted to data collected at Phases 5, 7 and 9 but we also generate data at Phases 3 and 11 to inform the imputation of missing values at the phases in between. We generated the following time-independent categorical variables at baseline: age in years (5 categories 1. Short term verbal memory -20 one-or two-syllable words presented at 2 sec intervals that the participants had 2 min to recall in writing. 2. Vocabulary -Mill Hill Vocabulary Test [20] in its multiple-choice format consisting of a list of 33 stimulus words ordered by increasing difficulty and 6 response choices. 3. Reasoning -Alice Heim 4-I test, total verbal and mathematical reasoning tasks completed in 10 min (out of 65) [21]. 4. Phonemic fluency -total words beginning with 'S' recalled verbally in 1 min [22]. 5. Semantic fluency -total animals recalled verbally in 1 min [22].
A global cognitive score using all 5 cognitive function tests was created to minimize problems due to measurement error [23,24]. The scores on each test for the entire cohort were standardised to z scores (mean [SD] = 0 [1]) using the mean and standard deviation at Phase 5 (baseline). To calculate the global cognitive function, the z scores were averaged to create a global cognitive score and standardised again using the mean and standard deviation at Phase 5.
We compared results for two time-dependent outcome measures with different size correlations among repeated measurements; standardised memory score (correlations 0.45) and standardised global score (correlations 0.97). Each outcome was generated using two different mixed effects models with random intercept and slope fitted to data collected at Phases 5, 7 and 9, conditional on variable measurements at baseline (Phase 5): smoking status, age, occupational position and education. The models also included an interaction between each variable and time.
The data generation details are described in the Additional file 1: Appendix.

Parameters used for data generation
To derive the model parameters used for each data generation step, the mixed effects models were fitted to data from the cohort of Whitehall II study participants to obtain coefficients, considered to be the 'true' estimates in the simulation study. Any phases with missing smoking status were replaced with 'never smoker' if participants only had 'never smoker' smoking status recorded and, otherwise, were imputed as either 'current smokers' or 'ex-smokers'. Welch, et al., found using this approach reduced the missing data, ensured consistent smoking status recording and simplified MI using the two-fold FCS algorithm [5]. Any male participants who died or withdrew from the study before Phase 5 or those with missing cognitive function score or smoking status at Phases 5, 7 or 9 were excluded.

Missingness mechanism
Two different missingness mechanisms were investigated to compare results from imputing missing values for time-independent and time-dependent covariates. For the first missingness mechanism, a fixed percentage of the cognitive function measures (outcome) and smoking status (exposure) at each phase were changed to missing. For the second missingness mechanism, a percentage of the cognitive function measures (outcome) at each phase and education at baseline (covariate) were changed to missing. For these variables, the percentage of values changed to missing was similar to the percentage missing observed in the Whitehall II study.
One of the following participation statuses was generated for each participant at each phase: Response -participated at a given phase, but may have missing values for some variables (item nonresponse). Non-response -does not participate at a given phase so all variables have missing values (unit non-response). Death -before phase, confirmed by death certificate. Attrition -informed Whitehall II study they no longer wish to participate before the phase.
At Phases 3 and 5, only response or non-response status levels were generated, since all participants who died or dropped out before phase 5 were excluded. All four participation statuses were assigned at Phases 7, 9 and 11. For the first missingness mechanism (missing cognitive function and smoking status), a probability, p i , of non-response at Phase t= 3 was generated to be ignorable conditional on age, occupational grade and education, by choosing values for β 0 , β 1 , β 2 and β 3 so the proportion with non-response status was the same as in the Whitehall II study data: From exploring the associations between participation statuses at adjacent phases, we found participants with non-response status at the phase before were more likely to non-respond at the next phase, and participants with response status at the phase before were more likely to respond at the next phase. Therefore, the probability of non-response at Phase 5 was generated separately for response and non-response status at Phase 3, using Eq. 3.
At Phase t=7, separately for response and non-response status at Phase 5, a probability of each participation status (s= response, non-response, death or attrition) was generated to be ignorable, conditional on age, occupational grade and education, again choosing values for β 0ts , β 1ts , β 2ts and β 3ts so the proportion with each participation status was the same as Whitehall II study data: Any participants with died or attrition status at Phase 7 were assigned these statuses at later phases because, by definition, they do not return to the study (a monotone missingness pattern). This approach using Eq. 4 was repeated at Phases 9 and 11. Some participants status alternates between response and non-response; a non-monotone missingness pattern. Some missing values were also assigned to participants with response status at each phase with an ignorable missingness mechanism conditional on age (item non-response).
Next, the mean cognitive function score for participants with attrition status at Phases 7, 9 and 11 was examined. Currently, attrition status was generated with an ignorable missingness mechanism. To create a non-ignorable missingness mechanism, the probability of attrition p i,j was generated by conditioning on the cognitive function values at the same phase y i,j : Values for λ m0 and λ m1 were chosen so that the mean cognitive function scores were 0.5 less than the mean scores for an ignorable missingness mechanism, but ensured the proportion of participants with attrition status remained similar to the proportion observed in the Whitehall II study. Using this approach, k, from Eq. 2 was assigned the value -0.5. For the first missingness mechanism (missing cognitive function and missing smoking status), we changed cognitive function and smoking status values to missing for participants assigned attrition status at Phases 7, 9 and 11.
For the second missingness mechanism (missing cognitive function and missing education) the same method described above was used, except, to ensure an ignorable missingness mechanism, smoking status at baseline, instead of education, was conditioned on in Eqs. 3 and 4.
As a sensitivity analysis, the effect of increasing the percentage of participants with non-ignorable missingness mechanism was investigated by changing non-response and attrition status to non-ignorable missing at Phases 7,9 and 11. In summary, PMM in eight different settings was investigated, defined by the following criteria: c. Groups assigned missing values using non-ignorable missingness mechanism.

Statistics used in the evaluation
Letθ m denote the parameter estimate for each simulation m = 1,...,M. From Rubin's conditions for proper imputation [2],θ m is normally distributed with mean θ and variance σ 2 . For θ, the true parameter value used in the data generation mechanism, the following statistics were calculated: where the average imputed mean across simulations is given by: Smaller variance suggests greater precision (more accurate result).

Mean square error (MSE)
Smaller MSE suggest less bias. We calculate a ratio of each MSE and the AC analysis MSE for comparison. that include the true value, θ. δ m is the degrees of freedom calculated using Rubin's rules. A 95% level of confidence was used, so 95% of the confidence intervals were expected to contain θ.
To aid understanding of the results, we also assessed the correlations between variables in the simulated data, the data with missing values, the data imputed using the two-fold FCS algorithm and the imputed data adjusted using PMM. We performed the analysis using Stata 14 (StatCorp LP, Texas, USA) (www.stata.com) and the two-fold FCS algorithm using the Stata command twofold [26]. Table 1 shows the characteristics of the participants in the simulated dataset at Phases 5, 7, and 9. The greatest proportions of participants came from the two younger age categories, highest employment grades and education categories. Due to the study design, 49.4% of participants were never smokers at all phases, while the percentage of smokers decreased between Phase 5 (7.2%) and Phase 9 (5.0%). The global and memory cognitive function scores (standardised using mean and SD from Phase 5) decreased between Phase 5 and Phase 9 by 0.42SD and by 0.26SD respectively. The standardised SD for both cognitive function scores was 1 in the Whitehall II study cohort used to generate the simulated data, but less than 1 in the simulated data. Most responded at each phase but attrition (informed Whitehall II they no longer wished to participate) increased from 4.3% at Phase 7 to 6.0% at Phase 9. It was assumed that those participants (approximately 5%) with missing data due to attrition were informative. Approximately 17% were missing due to non-response or death and it was assumed that these were non-informative. In total, approximately 22% of participants had missing values. The analysis was repeated assuming missing due to attrition or non-response were informative, but the results were not reported here.

Results
The bias and precision of the intercept coefficients were similar across the three estimation methods and we have, therefore, restricted our description to the slope coefficients and SE from the mixed effects models. High correlations among repeated global cognitive measures (≈ 0.97) and repeated smoking status measures in the simulated data (≈ 0.95) were observed ( Table 2).
The correlations between repeated global cognitive measures and smoking status measures ranged from -0.0467 to -0.0978 and correlations between other variables had similar low correlations ( Table 2). The mixed effects substantive model was fitted to each full simulated dataset and the slope coefficients and SE averaged to estimate global cognitive function change over time. The slope coefficients from the full simulated data analysis closely replicated the slope coefficients observed in the Whitehall II study, and were precise due to high correlations among repeated global cognitive function measures (Table 3). Standardised cognitive function (SD), mean (SD) After global cognitive function and smoking status were replaced with missing values, the slope coefficients were slightly underestimated when the mixed effects model was fitted to the AC, had greater underestimation when analysing data imputed using the two-fold FCS algorithm and showed overestimation when analysing imputed data adjusted using PMM (Table 3). The slope coefficients from the AC were less precise compared to full simulated data, but more precise compared to fitting the mixed model to data imputed using the two-fold FCS algorithm or Table 2 Correlations among variables in full simulated global cognitive function data and differences compared to correlations among variables in available case analyses data, data imputed using multiple imputation and after applying pattern mixture modelling imputed data adjusted using PMM (Table 3). Figure 1 shows the bias, MSE and coverage of the different methods and confirms that fitting the mixed effects model to the AC achieved the least biased results. Again, the coefficients from fitting the mixed effects model to the AC when global cognitive function and education are missing were similar, but less precise, compared to the coefficients from the full simulated data analysis (Table 3). All slope coefficients from fitting the mixed effects model to data imputed using the two-fold FCS algorithm were similar but more precise compared to the AC with education missing, but the slope coefficients were less precise than when smoking status was missing for both variables with and without missing data. With global cognitive function and education missing, less bias was again observed in the slope coefficients from fitting the mixed effects model to the AC compared to data imputed using the two-fold FCS algorithm or imputed data adjusted using PMM (Fig. 1). The AC coefficients and precision were similar in both analyses with smoking status or education missing. However, % bias (Fig. 1a) and MSE (Fig. 1b) were smaller and coverage was closer to 95% (Fig. 1c) for the slope coefficients from fitting the mixed effects model to the AC compared to analysing data imputed using the two-fold FCS algorithm or imputed data adjusted using PMM (Fig. 1).
The correlations among repeated memory cognitive function were approximately 0.45, less than half of those observed for global cognitive function (Table 4). Correlations among repeated smoking status measurements had similar high correlations to the global cognitive function data (Table 2). However, correlations between all other covariates were low (Table 4). With missing memory cognitive function and smoking status, the mixed effects model fitted to the AC, gave larger underestimated slope coefficients and were less precise (Table 5) compared to those with  global cognitive function, due to lower correlations among the repeated memory measures. The slope coefficients from fitting the mixed effects model to data imputed using the two-fold FCS algorithm were generally more underestimated, but also more precise, compared to AC (Table 5) due to higher correlations in the imputed data (Table 4). However, the slope coefficients from fitting the mixed effects model to imputed data adjusted using PMM were similar to the full data analysis coefficients compared to fitting mixed effects model to the AC or data imputed using the two-fold FCS algorithm (Table 5). Figure 2a confirms that slope coefficients from analysing imputed data adjusted using PMM had the least bias, smallest MSE (Fig. 2b) and coverages closest to 95% (Fig. 2c). Using the two-fold FCS algorithm to impute the missing values increased precision, but some bias still existed in the coefficients, which was reduced, but not completely removed, using PMM (Fig. 2).
When memory cognitive function and education were missing, more precise slope coefficients but greater underestimation was observed when fitting the mixed effects model to data imputed using the two-fold FCS algorithm compared to missing memory cognitive function and smoking status since we did not condition on repeated education measurements at other phases to reduce bias. Due to the larger bias in the coefficients from analysing data imputed using the two-fold FCS algorithm, adjusting the imputed data using PMM did not reduce this to less than the AC analysis slope coefficients, which had least bias (Fig. 2a), smallest MSE (Fig. 2b) and coverages closest to 95% (Fig. 2c).

Discussion
This study described a PMM approach to account for bias due to informative attrition in a longitudinal, cohort study with non-monotone missing data. We found adjusting imputed data using PMM unnecessary when using a mixed effects substantive model and data with highly correlated repeated outcome measurements. The mixed effects model fitted to AC gave least bias slope coefficient because enough information was available in the repeated measurements.
The mixed effects model slope coefficients from the AC had more bias and less precision with lower to moderate correlations among the repeated outcome measurements, because not enough information is available in the observed data to adequately use the between and within participant correlations to adjust for the missing values. The two-fold FCS algorithm included additional information, not available in the AC, to reduce bias and precision. From fitting the mixed effects model to data imputed using the two-fold FCS algorithm, the time-dependent smoking status explanatory variable slope coefficient increased precision because the two-fold FCS algorithm conditions on highly correlated smoking status measurements at other phases. However, using the two-fold FCS algorithm to impute missing values for the time-independent explanatory variable education did not increase precision as much for variables with and without missing data possibly because no highly correlated repeated measurements exist to condition on. We may have observed greater bias reduction since education had a higher correlation with the outcome compared to smoking status [27]. Some bias in the slope coefficients from analysing data imputed using the two-fold FCS algorithm was still observed, but this reduced after adjusting the imputed data using PMM. However, the slope coefficients from fitting the mixed effects model to imputed data adjusted using PMM still underestimated the full data analysis slope coefficients. Although it is unlikely that such high correlations, as seen among repeated global cognitive function, will be a b c Fig. 1 Slope coefficient bias for smoking status and education categories with global cognitive function outcome when imputing the outcome together with either smoking status or education. a -% bias, b -mean square error ratio, ccoverage. a -% bias closest to zero indicates least bias approach. b -Ratios relative to available case analysis MSE. Ratio less than one indicates less bias compared to the available case analysis. c -Coverage close to 95% indicate proper control of the type I error rate for testing a null hypothesis of no effect observed in existing epidemiological datasets, it is more likely that correlations like those seen for repeated memory cognitive function will be observed. However, the results from analysing data with high and moderate correlations among repeated outcome measurements can be compared. The intercept coefficients were not reported in the results since fitting the mixed effect model to the AC and Table 4 Correlations among variables in full simulated memory cognitive score data and differences compared to correlations among variables in available case analyses data, data imputed using multiple imputation and after applying pattern mixture modelling    Fig. 2 Slope coefficient bias for smoking status and education categories with memory cognitive function outcome when imputing the outcome together with either smoking status or education. a -% bias, b -mean square error ratio, ccoverage. a -% bias closest to zero indicates least bias approach. b -Ratios relative to available case analysis MSE. Ratio less than one indicates less bias compared to the available case analysis. c -Coverage close to 95% indicate proper control of the type I error rate for testing a null hypothesis of no effect data imputed using the two-fold FCS algorithm showed similar bias and precision. For some analyses, for example with the global cognitive function outcome, slightly more precise and less bias intercept coefficients were observed when analysing data imputed using the two-fold FCS algorithm compared to AC analysis, but the AC analysis achieved the least bias slope estimates. However, the difference in bias between analysing the AC and imputed data was small, and it may be preferable to analyse the AC in practice. For memory cognitive function outcome, the least bias intercept and slope coefficients were observed from fitting mixed effects model to imputed data adjusted using PMM. Some participants chose to stop participation in the Whitehall II study, but did not formally withdraw, so may contribute to the bias due to informative attrition. Initially, it was assumed that all participants who formally withdrew were due to informative attrition. The analyses were repeated, overestimating the bias by assuming all participants with attrition and non-response status contributed to the bias due to informative attrition, which increased the percentage with non-ignorable missingness from 5 to 20%. The coefficients had larger bias compared to 5%, but the general findings were the same (results not shown).
Historically, the literature recommended using MI to impute missing values and then delete imputed outcome values before analysis. If both outcome and explanatory variables have missing values, imputing both outcome and explanatory variables will provide some information for the substantive model, by improving prediction of missing explanatory variables with observed outcome [14], but cases with imputed outcome contain no information about the regression of the outcome on explanatory variables [28]. However, more recent research does not recommended deletion since analysing data imputed using an imputation model with auxiliary variables associated with missing outcome found biased coefficients when observations with imputed outcomes were removed from the analysis [29]. We therefore chose to use MI to impute all the missing values and analyse the imputed data without any deletion.
For this paper, auxiliary variables were not included in the imputation model to simplify the analysis and interpretation. Slope coefficients from fitting mixed effects model to imputed data adjusted using PMM for moderately correlated time-dependent outcomes were underestimated. To increase precision, auxiliary variables highly correlated with the outcome values could be included, and this also reduces bias if the auxiliary variables are also correlated with the probability the variable is missing [30]. A monotone observational study investigated MI and a joint model of the cross-sectional and longitudinal data [31]. Under non-ignorable missingness, both methods resulted in biased estimates. However, including auxiliary variables correlated with the variables with missing values reduced the bias. Wang, et al., recommend future work to evaluate the effectiveness of auxiliary variables to impute missing values in non-monotone missingness data [31].
A prospective cohort study investigated the association between diabetes diagnosis and cognitive decline using a mixed effects model (which implicitly assumes the same distribution for those who drop out and for those who stay in the study) and used generalised estimating equations to avoid the implicit imputation [32]. In the study, they imputed missing scores due to drop out for both alive and deceased and investigated the effect of including auxiliary variables associated with cognitive function and the probability of drop out and death. Rawlings, et al., found similar results for mixed effects model and generalised estimating equations [32]. However, when auxiliary variables associated with drop out were included in the imputation model, MI effectively reduced the bias in the estimates. Some clinical trial studies with monotone missing data have also investigated incorporating reason for drop out. Standard PMM assumes non-random drop out and Lotz, et al., found that, when additionally stratifying standard PMM by random and non-random reasons for drop-out, the results had less bias [33]. For the purposes of this simulation study, a simple missingness mechanism was chosen where the outcome was associated with drop out. However, in reality, it is likely to be more complex and associated with other variables such as smoking status. Mein, et al., investigated risk factors associated with drop out in the Whitehall II study, which could be incorporated in the analysis [34].
Biering, et al., investigated a similar MI approach to the two-fold FCS algorithm when imputing missing values in a non-monotone missing longitudinal observational data; the Mental Component Summary from Short Form 12-item survey [35]. This study imputed missing data due to non-response and attrition at each time conditional on measurements at adjacent time and death, but did not impute measurements missing due to death. Biering, et al., changed imputed outcome values by a fixed amount, similar to PMM, which also effectively accounted for a non-ignorable missingness mechanism.
The main strength of this study was using a large, complex cohort in a real-life epidemiological setting to describe PMM in non-monotone missing cohort data; the findings are likely to be generalisable to other longitudinal studies. Using the two-fold FCS algorithm is another strength because it is the appropriate approach for imputing non-monotone longitudinal data, particularly for time-dependent variables, which imputes missing values for each phase sequentially conditional on observed information at adjacent phases [4]. This approach avoids possible convergence problems due to collinearity by restricting imputation to a short time window. Also, the results from a simulation study found increased precision when analysing time-dependent explanatory variables imputed using the two-fold FCS algorithm compared to standard approaches, such as AC [5]. The two-fold FCS algorithm can reduce bias and increase precision by conditioning on correlated repeated measures of the time-dependent outcome variable at adjacent phases. Therefore, using PMM to adjust data imputed using the two-fold FCS algorithm may be most suitable for longitudinal studies with many measurement phases, participants and variables. A previous study that used the two-fold FCS to impute missing outcome and explanatory variables found similar results to other MI approaches, but these analyses were restricted to 3 data collection waves [36].
A limitation of using the two-fold FCS algorithm is that it overestimates the random slope (results not shown) because it does not correctly consider multilevel structure by conditioning on the random intercept and slope in the imputation [37]. However, methods described in this paper are suitable for fixed effects. Demirtas and Shchafer investigated using MI to average marginal estimates from each pattern [8]. The authors observed under coverage in the results because of uncertainty due to model misspecification was not taken into account. However, they repeated the imputations using a three-level linear mixed effects imputation model which included a random level due to each pattern, accommodating model uncertainty in the imputation process [38].
A potential limitation is that k, the mean difference between the imputed outcome values and the unknown missing outcome values, was assumed to be constant, and this may be unrealistic [3]. For a more general specification in future studies, a distribution for k could be specified and a sensitivity analysis performed to investigate the effects of changing the variance of k as well its mean. Also, PMM may not reduce bias for outcome-dependent, non-ignorable missingness if large residual errors exist since the probability of missingness depends on residual errors as well as true outcome values [39]. For instance, participants with high observed outcome scores who are more likely to drop out may also have high measurement error values and, therefore, the mean of measurement errors within each missing pattern may no longer be zero. This may be an issue with the Whitehall II cognitive function data. Participants know of the tests in advance, since the tests repeat at each phase, so participants can prepare [40] and a higher than expected cognitive functioning in participants has been observed. However, in the data simulated for this paper, the residual error for each missing pattern was examined and the means were close to zero.
The National Research Council panel for handling missing data in clinical trials, USA, recom-mended undertaking more research to understand appropriate methods to impute missing values in non-monotone data [41], so this study adds to the evidence base.

Conclusions
Our findings suggest that with moderate correlations in the repeated outcome measurements and a linear mixed effects substantive model, using PMM reduces bias and increases precision but may still underestimate the true slope coefficient. With high correlations between repeated outcome measurements, the linear mixed effects model fitted to the available cases can adequately recover information. We recommend a few considerations for further analysis when using PMM, which may reduce bias and increase precision. First, select appropriate auxiliary variables for the imputation model with highly correlated repeated measurements or correlated with the outcome. Also, incorporate the reason for drop out in the imputation model.