Using a real dataset to provide a suitable structure for simulating the datasets, as in this study, simplifies the data generation procedures, avoids arbitrary choices and can aid the generalisability of the results. The simulated data were not an exact replica of the original, but provided sufficiently strong similarities to the original data to not warrant using more complicated semi-continuous distributions for PGR level (X_{3}), and ER level (X_{4}). Approximating the patterns of missingness seen in an incomplete dataset provided a realistic framework for simulating the missing data. The levels of missingness imposed reflected those seen in a review of prognostic modelling studies [1]. The effects of using MI when fitting prognostic models were unknown. Therefore this simulation study restricted the modelling process to including all covariates in the prognostic model and assuming linear relationships for all continuous covariates, both in the generation of the survival times and in the fitting of the prognostic model. Extensions of this research could include investigating the possible approaches for performing variable selection and fitting fractional polynomials after MI (e.g. [34]).

In this simulation study, performing a CC analysis with any multivariate missingness produced less biased regression coefficient estimates with better coverage rates than using SI or MI. However, this benefit was at the expense of larger standard errors and hence loss of efficiency due to the reduction in the sample being analysed [8]. This loss in efficiency affected the significance of the covariates in the prognostic model of the marginal prognostic covariates, making them appear non-significant with 25% or more missingness when in fact they were prognostically important. It is advisable to use a CC analysis only when fitting a Cox proportional hazards model with a reasonably small amount of missing data. Previous evidence [2, 3] suggested that imposing univariate MAR missingness associated with outcome would result in biased regression coefficient estimates when using CC analysis. Demissie et al [3] found large biases when the MAR mechanisms were associated with longer survival times or event status and the covariates had large hazard ratios for survival, but not with a hazard ratio of 1 or when the missingness was associated with shorter survival times. Relatively unbiased results were found in our simulations with multivariate MAR missingness where the missingness of X_{5}, the binary covariate with 20% of the total missingness, was associated with shorter survival times and the hazard ratio for X_{5} in the prognostic model was 0.7. Biased regression estimates may be more evident with more missing covariate data that is highly dependent on the outcome, especially longer survival times and event status, and with large hazard ratios.

With multivariate missing covariate data, using SI is not recommended with more than 10% missingness, due to its underestimation of the variability and corresponding detrimental effect on the coverage rates.

The results from performing simulations based on the German breast cancer dataset showed some bias, especially with over 25% missingness, for all mechanisms and MI approaches. The average SEs for all MI approaches and mechanisms were similar. They fell between the inflated estimate from the CC analysis and the underestimated SE from SI, as expected from previous research [8, 35]. The coverage was unaffected and remained around the nominal 95% level for all mechanisms and covariates, except for the skewed covariates. Tang et al [36] also found that the coverage may be poorer after MI for highly skewed data. Better coverage rates were seen using MICE-PMM without transformations or the '*aregImpute*' function than with the other MI approaches and also when a MAR mechanism was imposed rather than a MNAR mechanism.

Researchers have suggested that MI approaches are fairly robust to departures from normality due to the separation of the imputation and analysis phases [12, 13, 37]. Any deficiencies in the assumptions and implementation of the imputation model will only affect the incomplete component of the dataset and not the whole sample [38]. Having skewed continuous data and an outcome of survival time as in this simulation study may have affected the performance of the methods under investigation, especially those which assumed an underlying normal distribution for the continuous covariates, e.g. NORM, MIX and MICE. This study highlighted the problems that can exist when the imputation and analysis models differ and the model assumptions may not be fully satisfied. The bias seen in this simulation study even when the MAR mechanism assumption was correct may be an artefact of the transformations used in the imputation process [39]. Not only are the incomplete covariates transformed for imputation and then back-transformed prior to analysis, but the survival times are also transformed in the imputation model and then fitted using an alternative model. Imputing without transformation can reduce the bias in the mean estimate but distort other aspects of the distributional shape [39]. Log transformations were used for the continuous covariates in the data generation process. However, as the simulated data were then truncated to resemble the real data, applying the same transformations in the imputation process failed to satisfy normality. No other simple power transformations sufficiently improved normality or provided more plausible imputations.

The inclusion in the imputation model of survival time after log transformation and event status may not be the optimal choice to account for the censoring process and thus may have also introduced bias into the results. Using the Nelson-Aalen estimate of the cumulative hazard of the survival time may be more appropriate in future [40]. If the hazard rates for the survival and censoring times differ then it may be sensible to consider these times separately in the imputation model.

The MI procedures using MICE-PMM or the *'aregImpute'* function, which rely on the distributional assumptions only to match complete and incomplete responders, performed better for all missingness mechanisms than the other MI approaches. This confirmed the results from Faris et al [15] that with incomplete skewed data, MICE-PMM would be preferred to other MI Markov Chain Monte Carlo type approaches. However, caution is needed when using the *'aregImpute'* function, especially when the missingness is highly related to survival, as although the estimates for the incomplete covariates may exhibit little bias, the estimates for other prognostically important covariates may display more bias than seen with other MI approaches. Both MICE-PMM and the *'aregImpute'* function identify suitable matches from the observed data and therefore additional caution is required with small samples and with covariates with rare events as there may be a limited number of available cases to be selected as imputed values. With skewed data, values of a few cases have a lot of leverage that may distort the imputations and influence the results. Therefore it is essential to examine the distributions of the covariates requiring imputation to determine whether transformations are likely to provide reasonable estimates for the data to be analysed. With MICE-PMM, transforming the continuous covariates produced worse estimates than simply using the covariate values on their original skewed scale. Our findings suggest that if suitable transformations do not improve normality it is better to use MICE-PMM without transformations. With a fully observed normally distributed outcome and more normally distributed incomplete covariates and hence compatible imputation and analysis models, MICE-PMM may not remain the best MI approach.

In this simulation study, truncating imputed values for the continuous covariates to within the plausible range produced less bias than allowing implausible values. Schafer [31] suggested rounding values for the incomplete binary covariate to the observed values. In these simulations, where only 20% of the total missingness was imposed on the binary covariate, that approach did not produce any more bias than using the correct distributional assumption, e.g. fitting logistic regression models. Biases may be more apparent when the binary covariate has an uneven split or greater missingness [39].

From this simulation study, with incomplete skewed data, MI using MICE-PMM without transformations produced precise unproblematic estimates [12] within the allowable 10% accuracy with up to 25% missingness, but would not be recommended with 50% or more missingness for any missing data mechanism. Furthermore, with a MNAR mechanism, MI performed poorly with 25% or more overall missingness. Including variables in the imputation model that help to explain the missingness or are highly associated with the incomplete covariates themselves, can reduce the effect of an MNAR missing data mechanism [8]. With less enriched imputation models, and datasets where there is little correlation between variables, the results from the MNAR may be even more extreme than seen here. Further research is required to assess whether alternative MI procedures or fully Bayesian approaches that can model the skewness of the covariate distribution and the missing data mechanism may be more appropriate when there is more than 25% missingness.

The true performance of the various missing data methods is likely to vary in relation to the underlying distribution of the covariates, the correlations between these variables as well as with different missing data mechanisms and associations between the outcome and the covariates with missing data. Therefore, the generalisability of the results from this simulation study, however rigorous, is limited due to reflecting the data from a single real prognostic study and imposing a restricted number of missing data mechanisms. Confirmatory investigations are required to examine the extent to which these findings are consistent across alternative populations, distributions and clinical contexts.