An application of a pattern-mixture model with multiple imputation for the analysis of longitudinal trials with protocol deviations
BMC Medical Research Methodology volume 19, Article number: 10 (2019)
Abstract
Background
The benefit of a given treatment can be evaluated via a randomized clinical trial design. However, protocol deviations may severely compromise the estimation of the treatment effect, since such deviations often lead to missing values. The assumption that a method of analysis can account for the missing data cannot be justified, and hence methods of analysis based on plausible assumptions should be used. Alternatives to the simple imputation methods also require unverifiable assumptions about the missing data. Therefore sensitivity analysis should be performed to investigate the robustness of statistical inferences to alternative assumptions about the missing data.
Aims
In this paper, we investigate the effect of tuberculosis pericarditis treatment (prednisolone) on CD4 count changes over time and draw inferences in the presence of missing data. The data come from a multicentre clinical trial (the IMPI trial).
Methods
We investigate the effect of prednisolone on CD4 count changes by adjusting for baseline and time-dependent covariates in the fitted model. To draw inferences in the presence of missing data, we investigate the sensitivity of statistical inferences to missing data assumptions using the pattern-mixture model with multiple imputation (PMMI) approach. We also performed a simulation experiment to evaluate the performance of the imputation approaches.
Results
Our results showed that the prednisolone treatment has no significant effect on CD4 count changes over time and that the prednisolone treatment does not interact with time and antiretroviral therapy (ART). Also, patients’ CD4 count levels significantly increase over the study period and patients on ART treatment have higher CD4 count levels compared with those not on ART. The results also showed that older patients had lower CD4 count levels compared with younger patients, and parameter estimates under the MAR assumption are robust to NMAR assumptions.
Conclusions
Since the parameter estimates under the MAR analysis are robust to NMAR analyses, the process that generated the missing data in the CD4 count measurements is missing at random (MAR). The implication is that valid inferences can be obtained using either likelihood-based methods or multiple imputation approaches.
Background
The benefit of trial medication may be evaluated through a randomized clinical trial design. Randomized clinical trials with longitudinal follow-up are central to the evaluation of treatments. However, statistical inference from the resulting analysis is almost always complicated because subjects might deviate from the protocol [1]. The study protocol sets out the objective and procedure of conducting the trial.
Given the trial setting and the specific question, such deviations may include poor compliance with, or withdrawal from, the intervention; unblinding, either of intervention or evaluation; and loss to follow-up, so that no further information on the patient is available [1]. These deviations complicate the analysis because, to address both primary and secondary questions, assumptions must be made about the unobserved data [1, 2]. These assumptions are often not verifiable. There is now increasing awareness that such assumptions have the potential to introduce implicit ambiguity into the inferences that can be drawn [1, 3, 4]. In addition, inappropriate assumptions about the unobserved data may lead to biased estimates of the treatment effect. The extent to which such inappropriate assumptions are practical issues will depend both on the precise question and on how the extent and nature of deviations from the protocol affect this question [1]. Most often, regulators and analysts will require some level of confidence that inferences are robust to plausible departures from the primary assumptions that govern the main analysis. This indicates that such inferences require sensitivity analyses [1].
It is known that missing data may severely compromise statistical inferences from clinical trials. However, missing data have received little attention in clinical trials research [5], and existing regulatory guidelines [6] on the design, conduct, and analysis of clinical trials give limited advice on how to handle missing data. The National Research Council (NRC) report [3] outlined recommendations for handling missing data in clinical trials.
There is now increasing attention to the importance of conducting sensitivity analysis in biomedical research. For instance, Section 7 of the new EMA guideline on missing data in confirmatory clinical trials [7] is devoted to this issue. It states “The sensitivity analyses should show how different assumptions influence the results obtained.” In addition, recommendation 15 of the NRC report [3] states that “sensitivity analyses should be part of the primary reporting of findings from clinical trials. The sensitivity to the assumptions about the missing data mechanism should be a mandatory component of reporting.”
Sensitivity analysis for missing data can be conducted based on three modeling frameworks [8]. In the selection modeling (SeM) framework, the joint distribution of the measurement and the dropout processes is factored as the marginal distribution of the measurement process and the conditional distribution of the dropout process, given the measurements [8, 9]. The pattern-mixture model (PMM) is a reverse factorization of the SeM, defined as the marginal distribution of the dropout process and the conditional distribution of the measurement process given the dropout process [2, 8]. For the shared-parameter model (SPM), a set of latent variables (random effects) is assumed to be shared between the measurement and the dropout processes [8, 10]. It is conventionally assumed that, conditional on this set of random effects, no further dependency exists between the measurement and the dropout process, although this can be generalized [11]. Yuan and Little [12] proposed the mixed-effect hybrid models (MEHMs) framework, where the joint distribution of the measurement process and dropout process is factorized into the marginal distribution of random effects, the dropout process conditional on random effects, and the outcome process conditional on dropout patterns and random effects. Carpenter and colleagues [1] proposed the pattern-mixture model with multiple imputation (PMMI) approach in order to conduct sensitivity analysis. Ratitch and colleagues [13] considered sensitivity analysis approaches based on the pattern-mixture model, and Mallinckrodt and colleagues [14] considered selection model based approaches [9] and the PMMI approach [1] to conduct sensitivity analyses. Permutt and colleagues [15] examined previous ideas of sensitivity analysis with a view to explaining how the NRC panel’s recommendations are different and possibly better suited to coping with present problems of missing data in the regulatory setting.
They also discussed, in more detail than the NRC report, the relevance of sensitivity analysis to decision-making, both for researchers and for regulators. In this paper, we applied the PMMI approach of Carpenter and colleagues to investigate the effect of prednisolone treatment on CD4 count changes over time and to investigate the sensitivity of inferences to missing data assumptions [8, 16, 17] using the incomplete CD4 count data from the IMPI trial [18, 19]. Carpenter and colleagues [1] applied the PMMI approach to longitudinal data but used models which assumed independent observations, i.e., they fitted models to values at the last visit, whereas in the models in this paper we considered CD4 count measurements at all visits.
In “Description of the IMPI trial data” section, we give a brief description of the IMPI trial data. In “Estimands for primary and sensitivity analyses” section, we define the estimands and their associated deviations as well as some key assumptions relevant for the PMMI approach. In “Standard pattern-mixture model and the pattern-mixture model with multiple imputation” section, we briefly review the standard pattern-mixture model and then discuss the pattern-mixture model with multiple imputation (PMMI) [1]. This is followed by a brief discussion of the assumptions (sensitivity analysis) that allow us to obtain missing post-deviation data under the PMMI approach in “Constructing joint distributions of pre-deviation and post-deviation outcome data” section. We then apply the PMMI approach to the incomplete CD4 count data from the IMPI trial in “Application of the PMMI approach to the IMPI trial CD4 count data” section. In “Simulation study” section, we perform simulation studies to evaluate the performance of the PMMI approach. Finally, we give a discussion of the results and concluding remarks in “Discussion and conclusion” section.
Description of the IMPI trial data
In this paper, we used data from the IMPI trial [18, 19]. The IMPI trial was a multicentre, international, randomized, double-blind, placebo-controlled 2 × 2 factorial study. The IMPI trial tested prednisolone and Mycobacterium indicus pranii (M. indicus pranii) immunotherapy treatments in TB pericarditis patients in Africa. TB pericarditis leads to high mortality, especially in countries with limited resources and with concomitant epidemics of human immunodeficiency virus (HIV) infection [18, 19]. Tuberculous pericarditis is associated with high morbidity and mortality even if antituberculosis treatment is taken as directed [19]. A reduction in the strength of the inflammatory response in TB pericarditis may improve patients’ condition by reducing cardiac tamponade and pericardial constriction. However, whether the use of adjunctive immunomodulation with corticosteroids and M. indicus pranii can safely reduce mortality and morbidity is uncertain [19]. To investigate this question, Mayosi and colleagues set up the IMPI trial [18, 19].
In total, 1400 patients with definite or probable tuberculosis pericardial effusion, from 19 centers in 9 African countries, were enrolled in the four-year trial. Eligible patients were randomly assigned to receive oral prednisolone for 6 weeks and M. indicus pranii or placebo for 3 months. Patients were followed up at weeks 2, 4, and 6 and months 3 and 6 during the intervention period and 6-monthly thereafter for up to 4 years [18].
The main aim of the IMPI trial was to assess the effectiveness and safety of oral prednisolone and M. indicus pranii injection in reducing the time to first occurrence of the primary composite outcome of death, pericardial constriction, or cardiac tamponade requiring pericardial drainage in patients with TB pericardial effusion [19]. In this paper, we assessed the effect of trial medication (prednisolone) on CD4 count changes over time. A large proportion (42%) of the TB pericarditis (TBP) patients were also coinfected with HIV, hence the interest in investigating the effect of prednisolone among HIV positive (denoted as HIV+) patients. We restricted our analysis to HIV+ patients who had at least two observed CD4 count values. In the IMPI trial, patients who were confirmed HIV+ at the time of randomization, or confirmed to be HIV+ during the trial, were given the standard of care (ART) and their CD4 counts were measured at some visits. The results of Mayosi and colleagues [19] showed that oral prednisolone and M. indicus pranii do not interact, and hence the treatment arms were analyzed separately with their corresponding placebo arms. Their results also showed that prednisolone reduces the risk of constriction, whereas M. indicus pranii was not effective. We considered analysis of the CD4 count measurements under the prednisolone treatment and its corresponding placebo arm only. The analysis of CD4 count data is restricted to the mandated periods for CD4 count measurements: baseline, week 2, and months 1, 3 and 6. However, most South African centres continued to measure CD4 count at the scheduled visits at months 24, 36 and 48. These data were excluded from this analysis. A majority of patients had unobserved CD4 counts at these visits, with missingness proportions of 72%, 84% and 93% at months 24, 36 and 48, respectively.
In this paper, we applied the PMMI approach to non-monotone and monotone missing data patterns. In a non-monotone missing data pattern, a patient can be missing at any scheduled visit and then be observed at a subsequent visit. In a monotone missing data pattern, if the i^{th} patient is missing at scheduled visit j, then the same patient is also missing at the next scheduled visit j+1, and hence at all later visits.
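To illustrate the distinction, the following minimal Python sketch classifies a patient's vector of observation indicators (1 = observed, 0 = missing) as complete, monotone, or non-monotone. The function name `classify_pattern` and the example vectors are hypothetical and are not taken from the IMPI trial data.

```python
def classify_pattern(observed):
    """Classify a sequence of 0/1 observation indicators over scheduled visits.

    'complete'     : every visit observed
    'monotone'     : once a visit is missing, all later visits are missing
    'non-monotone' : a missing visit is followed by an observed one
    """
    flags = [bool(v) for v in observed]
    if all(flags):
        return "complete"
    first_missing = flags.index(False)
    # Monotone dropout: nothing is observed from the first missing visit onward.
    if not any(flags[first_missing:]):
        return "monotone"
    return "non-monotone"


# Hypothetical indicators for visits at baseline, week 2, months 1, 3 and 6.
examples = {
    "A": [1, 1, 1, 1, 1],
    "B": [1, 1, 0, 0, 0],
    "C": [1, 0, 1, 1, 0],
}
labels = {pid: classify_pattern(r) for pid, r in examples.items()}
print(labels)
```

In this sketch, patient B dropped out after week 2 and patient C has an intermittent gap at week 2, so only B would belong to a monotone data set.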
Non-monotone data
Out of 587 HIV+ patients, 294 are in the placebo arm and 293 are in the prednisolone arm. Some of the patients have missing values within the selected scheduled visits. The left panel of Fig. 1 shows profile plots of the observed \(\sqrt {\text {CD4}}\) count measurements for each patient. Some patients’ CD4 count values are missing at months 0.5, 1, or 3 after the baseline measurements are taken, whereas other patients completed the study with their values observed from baseline up to month 6. Because the left panel of Fig. 1 contains too many patients to be informative, we have provided observed profile plots of 29 (5%) patients in Fig. 2. It can be observed from these plots that some patients completed the study (observed from baseline to month 6) while others have missing values (incomplete cases). The right panel of Fig. 1 shows the profile plots of the mean \(\sqrt {\text {CD4}}\) count measurements by treatment arm. The mean profile plots showed a slight reduction in CD4 count level among patients in the prednisolone arm compared with those in the placebo arm. Figure 3 displays standard error bars around the mean profiles. These plots show an overlap between confidence intervals, which suggests comparable ART benefit to patients in the placebo and the prednisolone arms. There are 25 missingness patterns, presented in Table 1. A missingness pattern identifies the time points at which the values of a group of patients are missing or observed. Table 1 summarizes the \(\sqrt {\text {CD4}}\) count data by treatment group, showing the mean \(\sqrt {\text {CD4}}\) count for each of the missingness patterns at each visit. The proportion of patients with missing values in the prednisolone arm (84%) is approximately the same as that in the placebo arm (85%).
The distribution of the missingness patterns does not differ between the two treatment groups (chi-squared test statistic = 29.97, p = 0.1858).
Monotone data
The monotone CD4 count data consisted of 137 HIV positive patients: 64 in the placebo arm and 73 in the prednisolone arm.
The left panel of Fig. 4 shows profile plots of the observed \(\sqrt {\text {CD4}}\) count measurements for each HIV positive patient. Some of these patients dropped out at months 0.5, 1, or 3 after the baseline measurements were taken, whereas others completed the study with their values observed from baseline up to month 6. The right panel of Fig. 4 shows profile plots of the mean \(\sqrt {\text {CD4}}\) count measurements by treatment arm, where it can be observed that there is a slight reduction in CD4 count level for patients in the prednisolone arm compared with those in the placebo arm. Figure 5 displays standard error bars around the mean profiles. These plots show an overlap between confidence intervals, which suggests comparable ART benefit to patients in the placebo and the prednisolone arms.
Table 2 gives the number and proportion of patients remaining at each visit by treatment arm. The completion rate is higher in the placebo arm, 44 (69%), than in the prednisolone arm, 46 (63%). There are four deviation patterns. A deviation pattern represents the time point at which a group of patients dropped out of the study. Deviation pattern 4 represents completers (patients who completed the study without missing values), and patterns 3, 2 and 1 represent those who dropped out at months 3, 1 and 0.5, respectively. Table 3 shows the mean \(\sqrt {\text {CD4}}\) count for each of the deviation patterns at each visit by treatment arm. The proportion of patients deviating (not completing the study) in the prednisolone arm (37%) is higher than the proportion deviating in the placebo arm (31%). The distributions of the deviation patterns do not differ between the two treatment groups (chi-squared test statistic = 5.15, p = 0.161).
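The completion and deviation percentages quoted above can be checked directly from the arm sizes and completer counts reported in this section. The short Python sketch below does only that arithmetic; the variable names are illustrative.

```python
# Arm sizes and numbers of completers, as reported in the text for the monotone data.
arm_sizes = {"placebo": 64, "prednisolone": 73}
completers = {"placebo": 44, "prednisolone": 46}

rates = {}
for arm, n in arm_sizes.items():
    completed = 100 * completers[arm] / n
    # Store (percent completing, percent deviating), rounded to whole percentages.
    rates[arm] = (round(completed), round(100 - completed))
    print(f"{arm}: {rates[arm][0]}% completed, {rates[arm][1]}% deviated")
```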
Figure 6 shows the profile plots of the mean \(\sqrt {\text {CD4}}\) count of the four deviation patterns for patients in the placebo and prednisolone groups. This figure indicates that the \(\sqrt {\text {CD4}}\) count increases over time. Figure 6 agrees with the mean profiles in Figs. 1 and 4. That is, there is a slight increase in the \(\sqrt {\text {CD4}}\) count among patients in the placebo arm compared with those in the prednisolone arm.
Estimands for primary and sensitivity analyses
Since the focus of this paper is to draw statistical inferences in the presence of missing data, this section discusses the de jure and de facto estimands [1]. This discussion is necessary because our primary analysis model is based on the de jure estimand, and the sensitivity analysis models are based on the de facto estimand [1]. The primary analysis (as specified in the statistical analysis plan) addresses the main objective of the study, whereas the sensitivity analysis considers models making alternative assumptions (trial protocol) that, in one way or another, may influence statistical inferences under the primary analysis model. We discuss the de jure estimand in “De jure estimand hypothesis” section and then the de facto estimand in “De facto estimand hypothesis” section. We also discuss deviations associated with each estimand in “Deviations associated with estimands” section.
De jure estimand hypothesis
The de jure estimand estimates the effect of treatment on patients assuming that patients adhered to the trial protocol without deviating [1, 14]. The de jure estimand hypothesis is analogous to the MAR mechanism. This hypothesis assumes that the conditional distribution of observations later in the follow-up, given observations earlier in the follow-up, is independent of whether deviation occurs. In this case, patients are expected to obtain the full benefit of the treatment and the question of interest is whether the treatment works under the best case scenario. In this study, the de jure primary analysis is based on multiple imputation under missing at random (MAR) [8, 17, 20]. The choice of primary analysis method varies from trial to trial. Guidelines on how to decide on an appropriate primary analysis for a given trial can be found in the NRC panel report [3] and elsewhere [14, 15].
De facto estimand hypothesis
The de facto estimand concerns what would be the effect of treatment seen in practice if treatment were allocated to the target population of eligible patients as defined by the trial inclusion criteria. In addressing this question, we may ask: what would have been the effect of treatment seen at the end of the study if those who deviated moved to the equivalent of the active treatment arm (prednisolone treatment in this study)? However, this may underestimate the benefit of active treatment in trials where more benefit is expected from the active treatment. This is because the estimand equates the treatment benefit of those failing on the placebo arm to that of those opting for active treatment. In this instance, the fairer comparison might be to move those who deviate from the prednisolone arm onto the placebo arm. In the case of the IMPI trial, since all patients in both the prednisolone and placebo arms were given ART, we expect no significant difference in their response to ART treatment unless there is an interaction between prednisolone and ART treatment.
We discuss four de facto options for obtaining post-deviation data in “Constructing joint distributions of pre-deviation and post-deviation outcome data” section. These options make assumptions about the missing post-deviation data. These assumptions are alternative plausible assumptions, which depart from the MAR assumption under the primary analysis. In this way, it is assumed that the data are not missing at random (NMAR) and we assess the robustness of inferences under the MAR primary analysis to the alternative assumptions under the de facto options (sensitivity analyses).
Deviations associated with estimands
It is important to define clearly, in the study protocol, the deviations associated with each estimand. This is because clarity about the deviations associated with each estimand is vital for the primary analysis and for framing relevant sensitivity analyses [1]. The exact definition of a deviation will depend on the trial setting and may also vary between separate analyses [1]. In the IMPI trial, the following situations can be regarded as deviations associated with the de jure estimand: unblinding of treatment arms and unobserved CD4 count measurements. Deviations associated with the de facto estimand are unblinding, such as of treatment allocation; loss to follow-up such that no further treatment is taken; and the influence of the trial prednisolone treatment on ART.
Given the estimands and their associated deviations, it is assumed that each patient has longitudinal follow-up data until either the patient deviates or reaches the final visit, and that the nature or reason of each deviation is known. This approach further assumes that for each deviation, or group of similar deviations occurring in a dataset due to similar reasons, an appropriate post-deviation distribution can be built taking into consideration (1) the patient’s pre-deviation data, (2) pre-deviation and post-deviation data from other patients in the trial, (3) the nature of the deviation, and (4) the reason for the deviation [1].
Standard pattern-mixture model and the pattern-mixture model with multiple imputation
In this section, we give a brief review of the standard pattern-mixture model (PMM) and then discuss the pattern-mixture model with multiple imputation (PMMI) of Carpenter and colleagues. In “Link between the pattern-mixture model and the pattern-mixture model with multiple imputation” section, we give the link between these approaches.
Standard pattern-mixture model
We have mentioned in the “Background” section that the pattern-mixture modeling framework is a reverse factorization of the selection model [2, 8, 9, 21]. The selection model can be viewed as a multivariate model where one component represents the marginal density of the measurement process and the other component represents the conditional density of the missingness process, given the outcomes. The PMM approach, on the other hand, is defined as the product of the conditional distribution of the responses Y_{i} for patient i, i = 1, 2, …, N, given the nonresponse patterns R_{i}, and the model for the nonresponse R_{i} [10, 22, 23]; that is
\[\Pr \left (\mathbf {Y}_{i}, \mathbf {R}_{i} \mid \mathbf {X}_{i}, \boldsymbol {\theta }, \boldsymbol {\psi }\right) = \Pr \left (\mathbf {Y}_{i} \mid \mathbf {X}_{i}, \mathbf {R}_{i}, \boldsymbol {\theta }\right) \Pr \left (\mathbf {R}_{i} \mid \mathbf {X}_{i}, \boldsymbol {\psi }\right), \qquad (1)\]
where R_{i} = 1 if the response is observed and 0 otherwise, X_{i} is the design matrix of covariates, and θ and ψ represent the parameters of the measurement model and the dropout model respectively.
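For comparison, the selection model factorization described in the “Background” section can be sketched in the same notation, with θ again indexing the measurement model and ψ the dropout model. This display is an illustration of the standard factorization and is not numbered among the equations of the paper:

\[\Pr \left (\mathbf {Y}_{i}, \mathbf {R}_{i} \mid \mathbf {X}_{i}, \boldsymbol {\theta }, \boldsymbol {\psi }\right) = \Pr \left (\mathbf {Y}_{i} \mid \mathbf {X}_{i}, \boldsymbol {\theta }\right) \Pr \left (\mathbf {R}_{i} \mid \mathbf {Y}_{i}, \mathbf {X}_{i}, \boldsymbol {\psi }\right).\]

Both factorizations describe the same joint distribution; they differ in which conditional distribution is modeled directly.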
The PMM has desirable properties, especially where the data are NMAR (the probability that a response will be missing depends on R_{i} and Y_{i}). For instance, where it is not substantively reasonable to consider nonresponses as missing data, it may be desirable to limit the inferences to the subpopulation of patients whose responses are observed. Thus, it is more meaningful to consider the distribution of Y_{i} given R_{i} = 1 rather than the marginal distribution of Y_{i} [8]. In contrast to the selection model, \(\Pr \left (\mathbf {Y}^{m}_{i} \mid \mathbf {Y}^{o}_{i},\mathbf {X}_{i}, \mathbf {R}_{i} \right)\) is modeled directly in the pattern-mixture model, where \(\mathbf {Y}^{o}_{i}\) is a vector of observed responses and \(\mathbf {Y}^{m}_{i}\) is a vector of the missing responses.
One important feature of the pattern-mixture model (1) is that it fits a different response model for each pattern of missingness, such that the observed data are a mixture of patterns weighted by their respective pattern probabilities. That is, the first component in the PMM (1), Pr(Y_{i} ∣ X_{i}, R_{i}, θ), fits a response model for each pattern of missingness and Pr(R_{i} ∣ X_{i}, ψ) represents the dropout probability for each pattern. It follows that if there are U missingness patterns in a data set, then the marginal distribution of Y_{i} is the mixture \(\Pr \left (\mathbf {Y}_{i} \mid \mathbf {X}_{i}, \boldsymbol {\theta }\right) = \sum \limits _{u = 1}^{U} \Pr \left (\mathbf {Y}_{i} \mid \mathbf {R}_{i} = \mathbf {R}^{u}_{i}, \mathbf {X}_{i}, \boldsymbol {\theta }^{u}\right) \pi _{u}\), where π_{u} = Pr(R_{i} = u ∣ X_{i}, ψ), R_{i} indexes the U patterns, and θ^{u} represents the parameters of the density of Y_{i} in the u^{th} pattern. It can be observed that in the pattern-mixture model, the parameters {θ^{1}, …, θ^{U}} can have different dimensions. A logistic model is often assumed for the dropout probabilities and a linear mixed effects model (LMM) [24] for the measurement process.
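As a small numerical illustration of the mixture, the marginal mean profile is the average of the pattern-specific mean profiles weighted by the pattern probabilities π_{u}. The Python sketch below uses hypothetical means and probabilities for U = 2 patterns (they are not estimates from the IMPI data), and the function name `marginal_mean` is an assumption made for this example.

```python
def marginal_mean(pattern_means, pattern_probs):
    """Mix pattern-specific mean profiles using the pattern probabilities pi_u."""
    if abs(sum(pattern_probs) - 1.0) > 1e-9:
        raise ValueError("pattern probabilities must sum to 1")
    n_visits = len(pattern_means[0])
    return [
        sum(p * means[j] for p, means in zip(pattern_probs, pattern_means))
        for j in range(n_visits)
    ]


# Hypothetical values: U = 2 patterns, 3 visits.
pattern_means = [
    [10.0, 12.0, 14.0],  # pattern 1, e.g. completers
    [8.0, 9.0, 10.0],    # pattern 2, e.g. early dropouts (later means are assumed)
]
pattern_probs = [0.75, 0.25]
mixed = marginal_mean(pattern_means, pattern_probs)
print(mixed)
```

For a dropout pattern, the means at visits after dropout are not identified by the observed data, so in practice they come from an assumption about the missing data rather than from a direct estimate.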
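The linear mixed effects model (LMM) [24] assumed for the measurement process is given by
\[\mathbf {Y}_{i} = \mathbf {X}_{i} \boldsymbol {\beta } + \mathbf {Z}_{i} \mathbf {b}_{i} + \boldsymbol {\varepsilon }_{i}, \quad \mathbf {b}_{i} \sim N\left (\mathbf {0}, \mathbf {G}_{i}(\boldsymbol {\rho })\right), \quad \boldsymbol {\varepsilon }_{i} \sim N\left (\mathbf {0}, \mathbf {R}_{i}(\boldsymbol {\sigma })\right), \qquad (2)\]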
where b_{i} is a q-dimensional vector of random effects, Z_{i} and X_{i} are n_{i}×q and n_{i}×p dimensional matrices of known covariates, β is a p-dimensional vector containing the fixed effects, ε_{i} is an n_{i}-dimensional vector of residual components, G_{i}(ρ) and R_{i}(σ) are q×q and n_{i}×n_{i} covariance matrices respectively, and σ and ρ are c×1 and s×1 (with s ≤ n_{i}(n_{i}+1)/2) vectors of unknown variance parameters corresponding to ε_{i} and b_{i} respectively.
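To make the structure of the LMM concrete, the Python sketch below simulates one patient's responses from the simplest special case, a random-intercept model (q = 1, Z_{i} a column of ones, independent residuals), and builds the implied marginal covariance Z_{i} G_{i} Z_{i}′ + R_{i}. The fixed effects, variance components and function names are hypothetical and chosen only for illustration; they are not the model fitted to the IMPI data.

```python
import random


def simulate_patient(times, treat, beta, var_b, var_e, rng):
    """Draw one patient's responses from a random-intercept LMM.

    Y_ij = beta0 + beta1 * t_j + beta2 * treat + b_i + e_ij,
    with b_i ~ N(0, var_b) and e_ij ~ N(0, var_e).
    """
    b_i = rng.gauss(0.0, var_b ** 0.5)  # random intercept shared by all visits
    return [
        beta[0] + beta[1] * t + beta[2] * treat + b_i + rng.gauss(0.0, var_e ** 0.5)
        for t in times
    ]


def marginal_cov(n_visits, var_b, var_e):
    """Covariance Z G Z' + R when Z is a column of ones and R = var_e * I."""
    return [
        [var_b + (var_e if j == k else 0.0) for k in range(n_visits)]
        for j in range(n_visits)
    ]


rng = random.Random(2019)
times = [0.0, 0.5, 1.0, 3.0, 6.0]  # visit times in months
y = simulate_patient(times, treat=1, beta=(15.0, 0.5, -0.3), var_b=4.0, var_e=1.0, rng=rng)
V = marginal_cov(len(times), var_b=4.0, var_e=1.0)
print(y)
```

The shared random intercept is what makes measurements on the same patient correlated: every off-diagonal entry of `V` equals the random-intercept variance.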
The pattern-mixture model (1) is well understood using the second MAR assumption. The second MAR assumption states that the observations that would have been recorded for a patient in the future, given the observed history of that patient, have the same statistical behavior whether or not they are actually observed. This feature of the pattern-mixture model makes it possible for multiple imputation to provide a practical approach to estimation and inference. In addition, this feature provides a framework for the formulation of the pattern-mixture model with multiple imputation [1].
Pattern-mixture model with multiple imputation methodology
In this section, we describe the pattern-mixture model with multiple imputation (PMMI) methodology [1]. Consider a randomized clinical trial with two treatment arms and predictors of a continuous response Y_{i} (Y_{ij}) for each patient. Let Y_{ij} be the measurement of the i^{th} patient at the j^{th} occasion in treatment arm T_{i}, where j = 0 represents the baseline measurement and j = n_{i} denotes the last observation time prior to a deviation for the i^{th} patient. It is assumed that all patients were observed at baseline. Let (1) \(\mathbf {Y}^{o}_{i} = \left ({Y}_{i0}, \ldots, {Y}_{{in}_{i}} \right)'\) denote the vector of the i^{th} patient’s observed responses at the scheduled visits j = 0, …, n_{i}, (2) \(\mathbf {Y}^{m}_{i} = \left ({Y}_{{in}_{i}+1}, \ldots, {Y}_{in} \right)'\) denote the vector of the i^{th} patient’s missing post-deviation responses at the scheduled visits j = n_{i}+1, …, n, where n is the last scheduled visit, (3) \(\mathbf {Y}^{m} = \left (\mathbf {Y}^{m'}_{1}, \ldots, \mathbf {Y}^{m'}_{N} \right)'\) denote the column vector stacking all patients’ missing post-deviation response profiles, and (4) \(\mathbf {Y}^{o} = \left (\mathbf {Y}^{o'}_{1}, \ldots, \mathbf {Y}^{o'}_{N} \right)'\) denote the column vector stacking all patients’ observed response profiles. It follows that the distribution of each patient’s post-deviation responses \(\mathbf {Y}^{m}_{i}\), given the patient’s pre-deviation responses \(\mathbf {Y}^{o}_{i}\) and the deviation time n_{i}, is defined by
\[\Pr \left (\mathbf {Y}^{m}_{i} \mid \mathbf {Y}^{o}_{i}, T_{i}, n_{i}, \boldsymbol {\theta }\right), \qquad (3)\]
where T_{i} denotes the binary treatment arm indicator (whether the patient is in the prednisolone or the placebo arm). The parameter vector θ has to be estimated before we can impute missing post-deviation data by drawing from the conditional distribution (3).
Link between the pattern-mixture model and the pattern-mixture model with multiple imputation
If post-deviation data are assumed to be MAR (that is, the probability that the responses are missing depends only on the observed data), the distribution (3) is independent of the deviation time n_{i}. Hence the distribution (3) can be written as
\[\Pr \left (\mathbf {Y}^{m}_{i} \mid \mathbf {Y}^{o}_{i}, T_{i}, \boldsymbol {\theta }\right).\]
Under this assumption, direct maximum likelihood estimation [8, 25] or multiple imputation under MAR can be used to obtain valid inference [8, 17, 26]. However, if data are NMAR, the distribution (3) depends on the deviation time n_{i} in a manner that could be different for each patient. This feature of the distribution (3) is analogous to the standard pattern-mixture model (1), where a response model is fitted for each pattern of missingness such that the observed data are a mixture of patterns weighted by their respective probabilities of missingness.
It follows that for each patient or group of patients, a specific form of the conditional distribution (3) is defined to reflect a specific assumption appropriate to their treatment arm T_{i}, deviation time n_{i} and other relevant information or covariates. Given this information, multiple imputation is used to impute the missing post-deviation data from the distribution (3) to create complete data sets. Estimation and inference are then performed by fitting a standard method of analysis (that is, a method that yields valid inferences when no data are missing) to the completed data sets [16, 17]. Thus, for inferences about θ in the presence of deviations, multiple imputation is used to create K “completed” data sets.
To obtain post-deviation data from the distribution (3), Carpenter and colleagues [1] suggested the following.
Step A: Assume a multivariate normal distribution for the observed data Y^{o}.
Step B: Draw samples of the parameters β and α from the Bayesian posterior distribution defined as Pr(β′, α′ ∣ Y^{o}), where β is a vector of the means and α′ = (σ′, ρ′)′ is a parameter vector of the variance components in the measurement model. The Markov chain Monte Carlo (MCMC) method is used to draw samples of β and α from this posterior.
Step C: Update the Markov chain sufficiently after each draw in order to avoid correlation between draws in each of the parameter estimates β and α.
Step D: After each draw of β and α, for each patient who deviates before the end of the trial, the drawn β and α are used to build the joint distribution of that patient’s pre-deviation and post-deviation data. We discuss different options for building this joint distribution in “Constructing joint distributions of pre-deviation and post-deviation outcome data” section.
Step E: The joint model in Step D is then used to build the conditional distribution of each patient’s missing post-deviation data, given the pre-deviation data (3). The missing post-deviation data in the conditional distribution (3) are obtained using the parameter estimates β and α obtained from Step D.
Step F: Repeat Steps B to E K times to create K “complete” data sets. Thereafter, any method of analysis that yields valid inferences in the absence of missing data can be applied to the complete data sets.
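The draw-and-impute cycle in the steps above can be sketched in Python for the simplest case of two visits, where the baseline value is always observed and the follow-up value may be missing. This is a simplified illustration under stated assumptions and not the authors' implementation: the MCMC posterior draw of Step B is replaced by a bootstrap resample of the complete cases, the imputation model is the MAR one (a deviator's conditional distribution is built from all complete cases rather than from a chosen reference arm), and the function names and data are hypothetical.

```python
import random
import statistics


def fit_bivariate(pairs):
    """Means, variances and covariance of (y0, y1) pairs (Step A: bivariate normal)."""
    y0 = [p[0] for p in pairs]
    y1 = [p[1] for p in pairs]
    m0, m1 = statistics.fmean(y0), statistics.fmean(y1)
    v0, v1 = statistics.variance(y0), statistics.variance(y1)
    c01 = sum((a - m0) * (b - m1) for a, b in pairs) / (len(pairs) - 1)
    return m0, m1, v0, v1, c01


def impute_once(data, rng):
    """Return one completed copy of `data`, a list of (y0, y1) with y1 = None if missing."""
    complete = [p for p in data if p[1] is not None]
    # Stand-in for Step B: bootstrap the complete cases instead of an MCMC posterior draw.
    boot = [rng.choice(complete) for _ in complete]
    m0, m1, v0, v1, c01 = fit_bivariate(boot)
    slope = c01 / v0 if v0 > 0 else 0.0
    cond_sd = max(v1 - slope * c01, 0.0) ** 0.5  # SD of y1 given y0
    filled = []
    for y0, y1 in data:
        if y1 is None:
            # Steps D and E: draw from the conditional normal of y1 given y0.
            y1 = rng.gauss(m1 + slope * (y0 - m0), cond_sd)
        filled.append((y0, y1))
    return filled


def multiply_impute(data, K, seed=0):
    """Step F: repeat the draw-and-impute cycle K times."""
    rng = random.Random(seed)
    return [impute_once(data, rng) for _ in range(K)]


# Hypothetical (baseline, follow-up) values; None marks a missing follow-up.
data = [(14.0, 15.2), (12.5, 13.1), (16.0, 17.4), (11.0, None),
        (13.5, 14.0), (15.0, None), (12.0, 12.9), (17.0, 18.1)]
imputed = multiply_impute(data, K=5, seed=1)
```

Each element of `imputed` is a completed data set in which observed values are unchanged and missing follow-up values are replaced by draws that vary across the K copies. Under a de facto (NMAR) option, only the construction of the conditional distribution in Steps D and E would change.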
Carpenter and colleagues [1] considered the treatment benefit at the last scheduled visit, where they fitted a linear regression model that assumed that observations are independent. This paper considers the treatment benefit over time, and hence the linear mixed effects model [24] is assumed for the measurement process. This model is then fitted to each of the K imputed data sets. This analysis produced K sets of estimates of the parameters β and α. Estimates from each of the K completed data sets were then combined to produce single estimates with their associated standard errors using Rubin’s rules [17].
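The combination step can be written in a few lines. The Python sketch below applies Rubin's rules to a single scalar parameter: the pooled estimate is the mean of the K estimates, and the total variance is the average within-imputation variance plus (1 + 1/K) times the between-imputation variance. The function name `pool_rubin` and the input values are hypothetical.

```python
import statistics


def pool_rubin(estimates, variances):
    """Combine K completed-data estimates and their squared standard errors."""
    K = len(estimates)
    q_bar = statistics.fmean(estimates)       # pooled point estimate
    within = statistics.fmean(variances)      # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance (K - 1 divisor)
    total = within + (1 + 1 / K) * between
    return q_bar, total ** 0.5


# Hypothetical estimates of one coefficient from K = 4 imputed data sets.
est = [1.0, 2.0, 3.0, 2.0]
var = [0.5, 0.5, 0.5, 0.5]
q, se = pool_rubin(est, var)
print(q, se)
```

The sketch omits the degrees of freedom used for confidence intervals and tests, which would be needed in a full analysis.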
Constructing joint distributions of predeviation and postdeviation outcome data
In this section, we discuss the four de facto options for obtaining the missing postdeviation data [1]. These options make alternative, plausible assumptions about the missing data, such that the de facto (NMAR sensitivity analysis) assumptions depart from the de jure (MAR primary analysis) assumption. They are used to assess whether inferences under the MAR primary analysis are sensitive to the alternative assumptions of the NMAR sensitivity analyses, and hence whether the process that generated the missing CD4 count data is more plausibly MAR or NMAR. This distinction matters because the type of missing data mechanism has implications for both the analysis and the interpretation of the results [27]. We also discuss how to choose the reference arm (“Choosing the reference arm” section) and the implications of the de facto options for the IMPI trial (“De facto options under the IMPI trial” section).
Carpenter and colleagues proposed the following options for constructing the joint distribution of each patient’s pre- and postdeviation outcome data, where each option represents a possible de jure or de facto assumption about the postdeviation data. The assumptions differ in how the unavailable information for a patient who deviates is borrowed, or estimated, from other groups of patients in the same trial [1]. Two treatment arms, placebo and active (prednisolone in our study), are considered, and one of them is chosen as the reference arm from which the unavailable information can be “borrowed”. The reference arm can be either the placebo or the active arm, depending on the hypothesis being addressed. In this study, we used each arm in turn as the reference arm to explore how the treatment effect is affected. We refer to the arm not chosen as the reference as the other arm.
A: Jump to reference (J2R): Under this assumption, after a patient stops taking the treatment of the randomized arm, the patient’s mean response distribution is taken to be the same as that of the “reference” group of patients. Typically, such a patient is assumed to receive the treatment of the control or placebo arm. However, this need not be the case, since the choice of the reference arm may depend on the trial setting. In a trial where more benefit is expected in the active arm, such a change may be seen as extreme, and choosing the placebo group as the reference may be viewed as a worst-case scenario in terms of reducing any treatment benefit, since withdrawn patients on active treatment lose the effect of their period on treatment. In this study, the postdeviation data in the reference arm are imputed under randomized-arm MAR.
B: Copy difference in reference (CDR): Under this de facto option, after the patient deviates, it is assumed that the patient’s postdeviation mean increments copy those from the reference arm. For instance, if the placebo arm is chosen as the reference arm, the patient’s mean profile after deviation tracks that of the mean profile in the placebo arm, but starting from the benefit already obtained from the active arm.
C: Last mean carried forward (LMCF): Under LMCF, it is assumed that after deviation the patient’s postdeviation mean stays at the marginal mean of the randomized treatment arm at the patient’s last predeviation visit.
D: Copy reference (CR): The “copy reference” de facto option assumes that a patient’s whole distribution, both predeviation and postdeviation, is the same as that of the reference arm.
Whereas the above assumptions for constructing postdeviation data have proven practical and provide relevant, accessible assumptions for framing primary and sensitivity analyses, the PMMI approach depends on the relevance of the assumptions about missing postdeviation data to the context of the trial at hand [1]. In this study, we apply the PMMI approach in the context of the IMPI trial setting (see “Choosing the reference arm” and “De facto options under the IMPI trial” sections).
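To make the difference between the options concrete, the following sketch builds only the mean vector of a deviating patient's joint distribution under each option. It is a simplified illustration under stated assumptions: the function name `deviation_mean_profile`, its arguments and the arm means in the usage note are hypothetical, and the covariance structure that the full approach of Carpenter and colleagues [1] also specifies is not constructed here.

```python
def deviation_mean_profile(option, own_mean, ref_mean, last_obs):
    """Mean vector of a deviating patient's pre/postdeviation distribution.

    own_mean, ref_mean: marginal visit means of the randomized and reference arms.
    last_obs: index of the patient's last predeviation visit.
    """
    if len(own_mean) != len(ref_mean):
        raise ValueError("arm mean profiles must cover the same visits")
    if not 0 <= last_obs < len(own_mean):
        raise ValueError("last_obs must index an existing visit")
    pre = list(own_mean[: last_obs + 1])        # predeviation means from own arm
    post = range(last_obs + 1, len(own_mean))   # postdeviation visit indices
    if option == "MAR":    # randomized-arm MAR: stay on own arm's profile
        return list(own_mean)
    if option == "J2R":    # jump to the reference arm's means after deviation
        return pre + [ref_mean[j] for j in post]
    if option == "CDR":    # copy the reference arm's increments from own last mean
        return pre + [own_mean[last_obs] + ref_mean[j] - ref_mean[last_obs] for j in post]
    if option == "LMCF":   # hold own arm's mean at the last predeviation visit
        return pre + [own_mean[last_obs] for _ in post]
    if option == "CR":     # whole profile copied from the reference arm
        return list(ref_mean)
    raise ValueError(f"unknown option: {option}")
```

With illustrative arm means `own_mean = [10, 12, 14, 16]`, `ref_mean = [10, 11, 12, 13]` and a last predeviation visit at index 1, J2R gives `[10, 12, 12, 13]`, CDR gives `[10, 12, 13, 14]`, LMCF gives `[10, 12, 12, 12]` and CR gives `[10, 11, 12, 13]`.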
Choosing the reference arm
For the “jump to reference”, “copy reference” and “copy difference in reference” de facto options, we discuss the implications of the choice of the reference arm. In the IMPI trial, the reference could be either the placebo or the prednisolone arm, because we expect similar statistical behavior for patients in either arm. Suppose that one wishes to address the de facto question corresponding to the assumption that, after deviation (when CD4 count measurements are unobserved), (1) patients in the placebo arm obtain a treatment equivalent to that of the active (prednisolone) arm, and (2) the prednisolone-treated patients continue on treatment and adhere to the study protocol, so that their postdeviation data can be imputed assuming randomized-arm MAR. In such a case, we specify the prednisolone arm as the reference. In the IMPI trial, HIV+ patients in both the placebo and the prednisolone arms were given ART, and thus patients whose CD4 counts were unobserved are expected to have a treatment benefit equivalent to that of patients whose CD4 counts were observed, unless prednisolone treatment influences ART treatment. Since we hypothesized that patients’ responses to ART treatment in the placebo and the prednisolone arms are comparable, we also present results where the placebo arm is used as the “reference”. Thus dropouts in the prednisolone arm obtain treatment equivalent to the placebo arm, so that their postdeviation data (unobserved CD4 count measurements) can be imputed under randomized-arm MAR. This latter assumption might be appropriate where no alternative treatment is generally available or where patients in both arms receive treatment but responses were unobserved (as in the IMPI trial).
De facto options under the IMPI trial
A simple interpretation of the PMMI approach is that, within the same trial, it is used to “borrow” or estimate unavailable information for one group of patients from another group of patients. As stated earlier, in the IMPI trial HIV+ patients in both the active treatment (prednisolone) arm and the placebo arm were given ART, and hence we expect a similar benefit of ART treatment unless prednisolone treatment interacts with the ART treatment. One research question to address in the IMPI trial is therefore whether the prednisolone treatment interacts with the ART treatment. If they do interact, patients’ responses to ART treatment in the active and placebo arms will differ; otherwise they will be comparable. Also, in the IMPI trial, CD4 counts were mostly unobserved because of inadequate resources, not necessarily because the patient dropped out before the end of the trial. In other words, CD4 count measurements were missing at some scheduled visits mostly for administrative reasons, and the missingness is likely to have been generated by a random process. In fact, only 6% of the patients dropped out (genuine dropout) in the IMPI trial. This means that most of the patients did not drop out of the study, but their CD4 count values could not be measured because of inadequate resources. Thus, patients whose CD4 counts are unobserved are expected to have CD4 count levels similar to those of patients whose counts were observed. Of the 294 HIV-positive patients in the placebo arm, approximately 78% were already on ART at the time of randomization, and of the 293 HIV-positive patients in the prednisolone arm, approximately 80% were already on ART at baseline.
For the de facto question, since we do not expect a significant difference in treatment effect between patients whose CD4 counts were observed and those whose CD4 counts were unobserved, the jump to reference and copy reference options are the most plausible options for assessing the sensitivity of inferences to the MAR assumption.
The CD4 count data introduced in the “Description of the IMPI trial data” section are analyzed under the de jure MAR and de facto NMAR assumptions. In the measurement model (2), we included an intercept and the following covariates as fixed effects: prednisolone (1 for subjects randomized to prednisolone and 0 for subjects randomized to placebo), time (months), age, ART status at each scheduled visit (1 if the subject received ART and 0 otherwise), and the interactions between prednisolone and time and between prednisolone and ART. Age and time were continuous variables. Our fitted linear mixed model is defined as
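\[\sqrt{\text{CD4}_{ij}} = \beta_{0} + \beta_{1}\,\text{Prednisolone}_{i} + \beta_{2}\,\text{Time}_{ij} + \beta_{3}\,\text{Age}_{i} + \beta_{4}\,\text{ART}_{ij} + \beta_{5}\,\text{Prednisolone}_{i}\times\text{Time}_{ij} + \beta_{6}\,\text{Prednisolone}_{i}\times\text{ART}_{ij} + \mathrm{b}_{i} + \epsilon_{ij} \qquad (5)\]
where \(\sqrt {\text {CD4}_{ij}}\) is the square root of the CD4 count for the i^{th} patient at the j^{th} visit, for i = 1,…,N and j = 1,…,n_{i}, b_{i} represents the patient-specific random effect, and ε_{ij} is the residual error. It is assumed that b_{i} and ε_{ij} are independently distributed as \(\mathrm {b}_{i} \sim N\left (0,\sigma ^{2}_{b}\right)\) and \({\epsilon }_{ij} \sim N\left (0,\sigma ^{2}_{\epsilon }\right)\) respectively.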
Application of the PMMI approach to the IMPI trial CD4 count data
In this section, we apply the PMMI approach to the incomplete CD4 count data. We implemented the PMMI approach using the Stata mimix package developed by Cro of the London School of Hygiene and Tropical Medicine (LSHTM), UK. This package imputes missing continuous outcomes for a longitudinal trial with protocol deviations under distinct reference-group-based assumptions for the unobserved data, following the procedure proposed by Carpenter and colleagues [1].
To address the de jure hypothesis, we performed multiple imputation of the unobserved CD4 counts under the MAR mechanism using the ice package in Stata [28]. We also imputed the postdeviation data under the LMCF, J2R, CDR and CR de facto options to obtain complete data sets. The linear mixed effect model (5) was then fitted to each of the completed data sets, and the parameter estimates were combined, with their corresponding standard errors, using Rubin’s rules [17, 28].
Monotone data
This section presents the PMMI analyses of the monotone CD4 count data. We consider the jump to reference option for illustration. Fig. 7 shows profile plots of the mean \(\sqrt {\text {CD4}}\) count measurements for the complete data sets, for each deviation pattern, by placebo arm (Treatment = 0) and prednisolone arm (Treatment = 1). The left panel of Fig. 7 shows the complete data profiles of the placebo reference arm, with missing postdeviation values obtained under MAR, whereas the right panel shows the complete data profiles of the prednisolone arm patients, with missing postdeviation data “borrowed” from the placebo arm (left panel of Fig. 7). We then used the prednisolone arm as the reference; the complete data profiles are shown in Fig. 8. The right panel of Fig. 8 shows the complete data profiles of the prednisolone reference arm, with missing postdeviation values obtained under MAR, whereas the left panel shows the complete data profiles of the placebo arm patients, with missing postdeviation data obtained from the prednisolone arm (right panel of Fig. 8). It can be observed that treatment seems to reduce CD4 count slightly, so the imputed data for placebo under MAR lie above those obtained when the placebo patient jumps to the prednisolone arm. Hence, we investigate the significance of this reduction in CD4 count level using the parameter estimates associated with the prednisolone-ART interaction (see Table 4). Similar plots for LMCF, CDR and CR can be found in Appendix A. After imputing the missing postdeviation data under LMCF, J2R, CDR and CR, we fit the linear mixed effect model (5) to the completed data sets and combine the parameter estimates from each data set using Rubin’s rules to produce parameter estimates with their associated standard errors for the final inferences. The parameter estimates from these analyses are shown in Table 4.
Table 4 shows that inferences from the MAR primary analysis (MI), which addresses the de jure hypothesis, are robust to the different assumptions of the NMAR sensitivity analyses under the de facto estimand hypotheses (LMCF, J2R, CR, and CDR). This result supports the assumption that the mechanism that generated the missing data in the CD4 count measurements from the IMPI trial is missing at random (MAR). The implication is that the direct maximum likelihood and multiple imputation methods under MAR can be used to provide valid inferences when assessing the effect of prednisolone and ART treatments on changes in CD4 count level among different treatment groups. The results show that there is no significant prednisolone effect. The prednisolone-ART interaction effect is also not significant, which is consistent with our hypothesis that prednisolone treatment does not influence ART treatment. However, there seems to be a slight reduction of CD4 count level in the prednisolone arm. Patients’ CD4 count levels increased significantly with time, and patients who are permanently on ART have significantly higher CD4 count levels than those who are never on ART treatment. The prednisolone-time interaction results show a very slight increase in CD4 count level in the placebo arm compared with the prednisolone arm over time. However, this increase is not significant. The near-zero estimates of the prednisolone-time interaction effect suggest that the effect of treatment in the two arms does not differ significantly over time. The results also show that older patients are more likely to have lower CD4 counts; that is, CD4 count decreases significantly with increasing age. These results agree with the mean \(\sqrt {\text {CD4}}\) count profile plots in Figs. 1 and 4. This is because CD4 count in both the prednisolone and placebo arms increases at the same rate (no significant prednisolone effect, and prednisolone does not influence ART treatment), and CD4 count increases with time, with the same increase in both arms (no prednisolone-time effect).
Combined monotone and nonmonotone data
This section presents the PMMI analyses of the combined monotone and nonmonotone data. Parameter estimates from these analyses are shown in Table 5. The results agree with those in Table 4. They also indicate that inferences from the MAR primary analysis (MI), which addresses the de jure hypothesis, are robust to the different assumptions of the NMAR sensitivity analyses under the de facto estimand hypotheses (LMCF, J2R, CR, and CDR). These analyses support the assumption that the mechanism that generated the missing data in the CD4 count measurements from the IMPI trial is missing at random (MAR). This means that the direct maximum likelihood and multiple imputation methods under MAR can be used to provide valid inferences when assessing the effect of prednisolone and ART treatments on changes in CD4 count level among different treatment groups.
It can be observed from these analyses that there is no significant prednisolone effect and that the prednisolone-ART interaction effect is also not significant. This implies that prednisolone treatment does not influence ART treatment. We also found a reduction of CD4 count level in the prednisolone arm. However, this reduction is not significant. As expected, patients’ CD4 count levels increase significantly with time, and patients who are on ART at each scheduled visit have significantly higher CD4 count levels than those who are not on ART at each scheduled visit. The near-zero estimates of the prednisolone-time interaction effect suggest that the effect of treatment in the two arms does not differ significantly over time. The results also show that older patients are more likely to have lower CD4 values; that is, CD4 count decreases significantly with increasing age. These results agree with the mean \(\sqrt {\text {CD4}}\) count profile plots in Figs. 1 and 4. This is because CD4 count in both the prednisolone and placebo arms increases at the same rate (no significant prednisolone effect, and prednisolone does not influence ART treatment), and CD4 count increases with time, with the same increase in both arms (no prednisolone-time effect).
Simulation study
In this section, we perform a simulation experiment to evaluate the performance of the PMMI approach, comparing the de facto options with the usual MI method for imputation of missing data and with the likelihood-based method (ML). MI and ML are known to provide valid inferences when data are missing at random (MAR) [8, 17].
The simulated datasets were generated using the R software. The R code for the simulation experiment is available from the first author upon request. The simulation experiment was performed according to the linear mixed effect model defined by
The initial values for β_{0},β_{1},β_{2}, and β_{3} are 13, 0.75, 0.11, 0.19, 0.20 respectively. The initial value for the standard deviation σ of the random effect b_{i} is 4.57. In generating these data sets, we assumed that (1) the measurement at the first time point (j = 0) from the original data set is completely observed, (2) the missing data mechanism is MCAR or MAR, (3) the missing pattern is monotone, and (4) there are different dropout rates. We generated the data sets in two steps, which we call the M-step and the D-step. In the M-step we generated the longitudinal measurements, and in the D-step we generated missing data according to the MAR and MCAR mechanisms.
M-step: We generated five repeated measurements for each patient by drawing from a multivariate normal distribution, using parameter estimates obtained from fitting a linear mixed effect model to the data. We repeated this process 1000 times for 200 patients. Patients were randomly assigned to the two arms (treatment and placebo) in a 1:1 ratio.
D-step: We generated missing data according to the MCAR and MAR mechanisms through a logistic regression model, with a different assumption about the dropout mechanism in each case. For MAR, missing data were generated by dropping observations according to a logistic regression model relating the probability of dropout at a particular time point to the change from baseline to the previous time point. For MCAR, missing data were generated by dropping observations at random according to a logistic regression model. Specific values for the logistic regression parameters were chosen to yield the desired dropout rates under a given missing data mechanism. Under each of the missing data mechanisms, we generated overall dropout rates of 5%, 10%, 20%, 30% and 50%. Thereafter, we performed analyses using the ML, MI, LMCF, J2R, CDR and CR approaches and assessed the performance of these methods in estimating the treatment effect.
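The D-step can be sketched as follows. This is a minimal illustration, not the authors' R code: the function name `simulate_dropout` and the parameters `intercept` and `slope` are hypothetical, and the paper does not report the logistic regression coefficients it used. Setting `slope = 0` makes dropout independent of the data (MCAR); a nonzero `slope` lets dropout depend on the observed change from baseline to the previous visit (MAR).

```python
import math
import random


def simulate_dropout(y, intercept, slope=0.0, rng=None):
    """Return a copy of y with monotone dropout; None marks a missing value.

    The first measurement is always kept. At each later visit the patient drops
    out with probability expit(intercept + slope * change), where change is the
    observed change from baseline to the previous visit.
    """
    rng = rng or random.Random()
    out = list(y)
    for j in range(1, len(y)):
        change = y[j - 1] - y[0]  # uses observed values only, so MAR when slope != 0
        p_drop = 1.0 / (1.0 + math.exp(-(intercept + slope * change)))
        if rng.random() < p_drop:
            for k in range(j, len(y)):  # monotone pattern: all later visits missing
                out[k] = None
            break
    return out
```

In practice the intercept would be tuned, for example by trial and error over simulated data sets, until the overall dropout rate matches the target rate.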
The results from the simulation study under the MCAR and MAR mechanisms are shown in Appendix B. The MCAR results are shown in Table 6 and the MAR results in Table 7. Under the MCAR mechanism, all the methods produced unbiased parameter estimates at the different missingness rates. The root mean square error (RMSE) of the prednisolone effect estimates produced by each method at the different missingness rates is often higher than that of the time and treatment-time interaction effects. Most of the methods yielded unbiased estimates of the treatment effect, which may imply that the process that generated the missing data is likely to be random. The simulation results under the MAR mechanism showed that each of the methods yielded unbiased estimates of the prednisolone effect, with slightly biased estimates of the treatment effect when the missingness rates are 5%, 20% and 50%. All the methods yielded unbiased estimates of the time effect at the different missingness rates, except that at a missingness rate of 50% the LMCF and CDR methods yielded slightly biased estimates of the time effect. Each of the methods showed no bias for the treatment-time interaction slope when the missingness rates were 5%, 10% and 30%, and some bias when the missingness rates were 20% and 50%. However, ML and MI yielded unbiased estimates of the treatment-time interaction. These results suggest that the four de facto assumptions proposed by Carpenter and colleagues [1] are suitable for handling the missing data in the IMPI clinical trial and in other trials with similar settings.
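The performance measures referred to above can be computed from the estimates obtained across simulated data sets. The sketch below uses the standard definitions of bias and RMSE, which are assumed here because the paper does not state its formulas; the function name `bias_and_rmse` is hypothetical.

```python
import math


def bias_and_rmse(estimates, true_value):
    """Bias and root mean square error of replicate estimates of one parameter."""
    n = len(estimates)
    if n == 0:
        raise ValueError("need at least one estimate")
    bias = sum(estimates) / n - true_value  # mean estimate minus true value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, rmse
```

An estimator can be unbiased and still have a large RMSE, since the RMSE also reflects the variability of the estimates around the true value.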
Discussion and conclusion
In this paper, we investigated the effect of TB pericarditis treatment (prednisolone) on CD4 count changes over time. We also conducted a sensitivity analysis to investigate the sensitivity of statistical inferences under the MAR analysis (de jure option) to alternative plausible assumptions under NMAR (de facto options) using the PMMI approach [1]. These principles and methods quantify the robustness of inferences to departures from the primary analysis assumptions. We recognize that this case study cannot cover the broad range of types and designs of clinical trials, and that the literature on sensitivity analysis is still evolving. The primary objective of this paper is to assert the importance of conducting some form of sensitivity analysis and to illustrate the principles in the IMPI trial setting.
The study results show that inferences under the de jure (MAR primary analysis) assumption are robust to the de facto (NMAR sensitivity analysis) assumptions. This finding indicates that the mechanism that generated the missing values in the CD4 count measurements from the IMPI trial is likely to be missing at random (MAR). The implications are that (1) the observed data are a random sample from the population of patients with TB pericarditis and (2) either the direct maximum likelihood (ML) approach or the multiple imputation approach, under the assumption that the data are MAR, can be used to produce valid inferences.
The investigation of the sensitivity of statistical inferences to missing data is important, and the use of such methods must be encouraged, because sensitivity analysis gives readers of a clinical report additional information with which to interpret the results. This means that clinical reports should describe the primary and the sensitivity analyses in terms accessible to non-statisticians. This requires that assumptions about missing data are articulated in a transparent manner so that researchers and practicing clinicians can assess their validity for the study at hand [1]. Carpenter and colleagues [1] encourage such sensitivity analysis, stating that “assumptions need to be assessable, so that in the context of the trial at hand all stakeholders can understand whether they are plausible. Then, departures from these assumptions also need to be relevant in the context of the trial at hand, so that stakeholders can see if they require investigation.” When data are missing, readers of a clinical report may doubt its conclusions unless the conclusions are supported by sensitivity analysis.
Our study results from both the monotone and the combined monotone and nonmonotone analyses showed that there is no significant prednisolone effect in any of the analyses. The prednisolone-time interaction results show a very slight reduction in CD4 count level among the patients in the prednisolone arm compared with the placebo arm over time. However, this reduction is not significant. As expected, there is a significant time effect, indicating that CD4 count level increases with time. Patients who are on ART treatment at each scheduled visit are likely to have significantly higher CD4 count levels than those who are not always on ART at each visit. The results also show that older patients are more likely to have a lower CD4 count level. Also, there is no prednisolone-ART interaction effect in any of the analyses. However, the prednisolone effects under the combined monotone and nonmonotone analyses are negative because the overall reduction in CD4 count levels among patients in the prednisolone arm is more pronounced than that among patients in the placebo arm (see Fig. 1). In contrast, the treatment effects under the nonmonotone analyses are positive because the overall reduction in CD4 count levels among patients in the prednisolone arm is less pronounced than that among patients in the placebo arm (see Fig. 4).
The IMPI trial was a cardiology trial in which HIV-related data were also collected. However, the HIV data were not collected as they would have been in an HIV-focused clinical trial, and hence there are missing CD4 counts. Despite the fact that the IMPI trial is a cardiology trial, our analyses of the HIV data provide reasonable information regarding the effect of prednisolone on CD4 count changes over time.
In the IMPI trial the prednisolone effect was not significant, and hence patients’ CD4 count levels in the treatment arms are comparable. If the prednisolone effect had been significant, CD4 count levels for patients in the treatment arms would have been different.
The missingness of CD4 values might be informative; for example, later values of CD4 count might be missing because patients died. Addressing this would require joint modeling of the CD4 count and time to death.
Appendix A
This section presents the complete data profile plots for the CDR, CR, and LMCF de facto hypotheses.
Appendix B
This section presents simulation results under MCAR and MAR mechanisms with varying missingness rates 5%, 10%, 20%, 30%, and 50% in Tables 6 and 7 respectively.
Abbreviations
ART: Antiretroviral therapy
CDR (P+): Copy difference in reference active arm
CDR (P): Copy difference in reference placebo arm
CR (P+): Copy reference active arm
CR (P): Copy reference placebo arm
J2R (P+): Jump to reference active arm
J2R (P): Jump to reference placebo arm
LMCF: Last mean carried forward
MAR: Missing at random
MCAR: Missing completely at random
MI: Multiple imputation under MAR
NMAR: Not missing at random
PMMI: Pattern-mixture model with multiple imputation
References
1. Carpenter JR, Roger JH, Kenward MG. Analysis of longitudinal trials with protocol deviation: A framework for relevant, accessible assumptions, and inference via multiple imputation. J Biopharm Stat. 2013; 23(6):1352–71.
2. Little R, Yau L. Intent-to-treat analysis for longitudinal studies with dropouts. Biometrics. 1996; 52(4):1324–33.
3. Council NR. The Prevention and Treatment of Missing Data in Clinical Trials, Panel on Handling Missing Data in Clinical Trials, Committee on National Statistics, Division of Behavioral and Social Sciences and Education. 2010.
4. Committee for medicinal products for human use (CHMP) guideline on the choice of the non-inferiority margin. Stat Med. 2006; 25(10):1628.
5. Little RJ, D’Agostino R, Cohen ML, Dickersin K, Emerson SS, Farrar JT, Frangakis C, Hogan JW, Molenberghs G, Murphy SA, et al. The prevention and treatment of missing data in clinical trials. N Engl J Med. 2012; 367(14):1355–60.
6. O’Neill R, Temple R. The prevention and treatment of missing data in clinical trials: an FDA perspective on the importance of dealing with it. Clin Pharmacol Ther. 2012; 91(3):550–4.
7. European Medicines Agency. Committee for Medicinal Products for Human Use (CHMP). Guideline on Missing Data in Confirmatory Clinical Trials. EMA/CPMP/EWP/1776/99 Rev.1. http://www.ema.europa.edu/docs/en_GB/document_library/Scientific_guideline/2010/09/WC500096793.pdf. Published July 2, 2010.
8. Rubin DB. Inference and missing data. Biometrika. 1976; 63(3):581–92.
9. Diggle P, Kenward MG. Informative dropout in longitudinal data analysis. Appl Stat. 1994; 43(1):49–93.
10. Wu MC, Carroll RJ. Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. Biometrics. 1988; 44(1):175–88.
11. Creemers A, Hens N, Aerts M, Molenberghs G, Verbeke G, Kenward MG. Generalized shared-parameter models and missingness at random. Stat Model. 2011; 11(4):279–310.
12. Yuan Y, Little RJ. Mixed-effect hybrid models for longitudinal data with nonignorable dropout. Biometrics. 2009; 65(2):478–86.
13. Ratitch B, O’Kelly M, Tosiello R. Missing data in clinical trials: from clinical assumptions to statistical analysis using pattern mixture models. Pharm Stat. 2013; 12(6):337–47.
14. Mallinckrodt C, Roger J, Chuang-Stein C, Molenberghs G, O’Kelly M, Ratitch B, Janssens M, Bunouf P. Recent developments in the prevention and treatment of missing data. Therapeutic Innovation and Regulatory Science. 2013.
15. Permutt T. Sensitivity analysis for missing data in regulatory submissions. Stat Med. 2016; 35(17):2876–9.
16. Little RJ, Rubin DB. Statistical analysis with missing data. Wiley. 2014.
17. Rubin DB. Multiple imputation after 18+ years. J Am Stat Assoc. 1996; 91(434):473–89.
18. Mayosi BM, Ntsekhe M, Bosch J, Pogue J, Gumedze F, Badri M, Jung H, Pandie S, Smieja M, Thabane L, et al. Rationale and design of the investigation of the management of pericarditis (IMPI) trial: A 2 × 2 factorial randomized double-blind multicenter trial of adjunctive prednisolone and mycobacterium w immunotherapy in tuberculous pericarditis. Heart J. 2012; 165:109–15.
19. Mayosi BM, Ntsekhe M, Bosch J, Pandie S, Jung H, Gumedze F, Pogue J, Thabane L, Smieja M, Francis V, et al. Prednisolone and mycobacterium indicus pranii in tuberculous pericarditis. N Engl J Med. 2014.
20. Rubin DB. The calculation of posterior distributions by data augmentation: Comment: A noniterative sampling/importance resampling alternative to the data augmentation algorithm for creating a few imputations when fractions of missing information are modest: The SIR algorithm. J Am Stat Assoc. 1987; 82(398):543–6.
21. Heckman JJ. The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Natl Bur Econ Res. 1976; 5(4):475–92.
22. Wu MC, Bailey K. Analysing changes in the presence of informative right censoring caused by death and withdrawal. Stat Med. 1988; 7(12):337–46.
23. Wu MC, Bailey KR. Estimation and comparison of changes in the presence of informative right censoring: conditional linear model. Biometrics. 1989; 45(3):939–55.
24. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982; 38(4):963–74.
25. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B. 1977; 39(1):1–38.
26. Rubin DB. Formalizing subjective notions about the effect of nonrespondents in sample surveys. J Am Stat Assoc. 1977; 72(359):538–43.
27. Molenberghs G, Beunckens C, Sotto C, Kenward MG. Every missingness not at random model has a missingness at random counterpart with equal fit. J R Stat Soc Ser B. 2008; 70(2):371–88.
28. Royston P. Multiple imputation of missing values: update of ice. Stata J. 2005; 5(4):527.
Acknowledgments
AI would like to thank the South African Center for Epidemiological Modeling and Analysis (SACEMA) for funding the project. The authors would also like to thank The Academy of Medical Sciences and the National Research Foundation for partially funding this research. We thank the Mayosi Research Group, Department of Medicine, University of Cape Town, for providing the data for the study. The authors would like to thank the following for their invaluable comments: Prof James Carpenter of the London School of Hygiene and Tropical Medicine, UK, and Prof Jane Hutton, University of Warwick, Department of Statistics, UK. The authors would also like to thank Suzie Cro of the London School of Hygiene and Tropical Medicine, UK, for making the software available and for offering valuable suggestions on its implementation.
Availability of data and materials
We do not have permission to distribute the data.
Author information
Contributions
AI carried out the literature review, statistical analyses, and wrote the manuscript. FG also contributed to the writing and the reviewing of the manuscript and also provided consultation regarding analysis and interpretation of findings. Both authors read and approved the final version of the manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Iddrisu, AK., Gumedze, F. An application of a pattern-mixture model with multiple imputation for the analysis of longitudinal trials with protocol deviations. BMC Med Res Methodol 19, 10 (2019). https://doi.org/10.1186/s12874-018-0639-y
Keywords
 Likelihood-based methods
 Missing at random
 Multiple imputation
 Not missing at random
 Pattern-mixture model
 Protocol deviation
 Sensitivity analysis