 Research article
 Open Access
Pattern-mixture model in network meta-analysis of binary missing outcome data: one-stage or two-stage approach?
BMC Medical Research Methodology volume 21, Article number: 12 (2021)
Abstract
Background
Trials with binary outcomes can be synthesised using the within-trial exact likelihood or an approximate normal likelihood, in one-stage or two-stage approaches, respectively. The performance of the one-stage and two-stage approaches has been documented extensively in the literature. However, little is known about how these approaches behave in the presence of missing outcome data (MOD), which are ubiquitous in clinical trials. In this work, we compare the one-stage versus the two-stage approach via a pattern-mixture model in network meta-analysis using Bayesian methods to handle MOD appropriately.
Methods
We used 29 published networks to empirically compare the two approaches concerning the relative treatment effects of several competing interventions and the between-trial variance (τ^{2}), while considering the extent and level of balance of MOD in the included trials. We additionally conducted a simulation study to compare the competing approaches regarding the bias and width of the 95% credible interval of the (summary) log odds ratios (OR) and τ^{2} in the presence of moderate and large MOD.
Results
The empirical study did not reveal any systematic bias between the compared approaches regarding the log OR, but showed systematically larger uncertainty around the log OR under the one-stage approach for networks with at least one small trial or low event risk and moderate MOD. For these networks, the simulation study revealed that the bias in log OR for comparisons with the reference intervention in the network was relatively higher in the two-stage approach. Contrariwise, the bias in log OR for the remaining comparisons was relatively higher in the one-stage approach. Overall, bias increased for large MOD. For these networks, the empirical results revealed slightly higher τ^{2} estimates under the one-stage approach irrespective of the extent of MOD. The one-stage approach also led to less precise log OR and τ^{2} estimates when compared with the two-stage approach for large MOD.
Conclusions
Due to considerable bias in the log ORs overall, especially for large MOD, neither of the competing approaches was superior. Until a more competent model is developed, researchers may prefer the one-stage approach to handle MOD, while acknowledging its limitations.
Background
To address aggregate binary missing participant outcome data (MOD) in pairwise and network meta-analysis, researchers usually resort to simple data-handling approaches, such as exclusion or imputation. Both approaches are popular due to their simplicity [1,2,3], yet notorious for the implausibility of their assumptions. A more appropriate approach, both statistically and conceptually, is to model MOD simultaneously with the observed outcomes. This approach naturally accounts for the uncertainty due to MOD, and it may also safeguard against biased results by adjusting the within-trial results (treatment effect and standard error) for MOD [4]. Since modelling of MOD does not require any data manipulation before analysis, it supersedes both exclusion and imputation.
The pattern-mixture model is the most commonly described model in the methodological literature for pairwise and network meta-analysis to address binary MOD [4,5,6,7]. It consists of two parts: a model for the outcome conditional on being missing or observed, and a model for the probability of MOD [8]. The pattern-mixture model incorporates an informative missingness parameter, which, in the case of binary data, is known as the informative missingness odds ratio (IMOR) and quantifies departures from the missing at random (MAR) assumption [4, 6, 7, 9]. The IMOR parameter is defined as the ratio of the odds of an event among MOD to the odds of an event among participants completing the trial. The IMOR parameter is naturally unknown, and we can only make clinically plausible assumptions about its value. Under the Bayesian framework, the IMOR is commonly assigned a normal prior distribution on the logarithmic scale, with mean and variance indicating our average prior belief and uncertainty about the missingness mechanism, respectively [4, 6].
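As an illustration of the definition above, the IMOR can be computed directly from the event risks in the missing and completing subgroups (a minimal sketch; the risks below are made-up values):

```python
import math

def imor(p_missing: float, p_observed: float) -> float:
    """IMOR: odds of an event among missing participants divided by
    the odds of an event among completers."""
    odds_missing = p_missing / (1 - p_missing)
    odds_observed = p_observed / (1 - p_observed)
    return odds_missing / odds_observed

# Equal risks in the two subgroups imply MAR: IMOR = 1, log IMOR = 0.
print(imor(0.30, 0.30))                 # 1.0
# Missing participants at higher risk: IMOR > 1, log IMOR > 0.
print(round(imor(0.40, 0.25), 2))       # 2.0
print(math.log(imor(0.40, 0.25)) > 0)   # True
```

A normal prior on log(IMOR) with mean 0 then centres the analysis on MAR while allowing for uncertainty about the missingness mechanism.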
The pattern-mixture model can be applied under both the exact and approximate likelihood methods. The former (more frequently – but not exclusively – applied using Bayesian methods) commonly assumes a within-trial binomial distribution and thus uses logistic regression to estimate the within-trial log ORs and their corresponding standard errors in a single step (hereafter, the one-stage pattern-mixture (PM) approach). Under this approach, the log IMOR is assigned a normal prior distribution with various options regarding its structure (e.g. identical, exchangeable, or independent across trials; trial-specific; intervention-specific), rendering this approach very appealing and flexible [4, 6]. Under the approximate likelihood methods (hereafter, the two-stage PM approach), log ORs and standard errors are first calculated in each trial after adjusting for a scenario about the missingness process (e.g. MAR as a starting point), expressed via the mean and variance of the log IMORs. Then, the adjusted log ORs are pooled using inverse-variance weighting [7, 10].
Albeit more straightforward to apply, the two-stage PM approach has several shortcomings inherent to the within-trial normal approximation assumption. By fixing the within-trial results to the assumed mean and variance of the log IMOR, the two-stage PM approach does not allow the observed data to contribute to the estimation of the log IMOR and thus to offer further insights into the missingness process in the collected trials [4]. Furthermore, the adjusted within-trial treatment effects and variances – the latter assumed known, although estimated, under the normal distribution – comprise the dataset for the second stage of the two-stage PM approach. In the presence of zero cells, a continuity correction is thus required – a suboptimal approach that has been criticised for leading to biased results [11, 12]. In a typical systematic review, where large and numerous studies are not the norm, it is hard to justify the within-trial normal approximation [13, 14]. Consequently, the application of the two-stage PM approach may compromise the accuracy of the summary results (especially when the included trials are small or the outcome is sparse [15]), and hence the conclusions delivered to the end-users of systematic reviews.
The advantages of the exact likelihood (one-stage approach) over the approximate normal likelihood (two-stage approach) for the synthesis of trials are well documented in the literature for pairwise meta-analysis [16,17,18] and, recently, for network meta-analysis (NMA) [19]. However, little is known about how much the presence of MOD can challenge the behaviour of these two approaches. To our knowledge, there are only two simulation studies on the performance of the pattern-mixture approach in evidence synthesis of binary outcome data [5, 20]. However, they have considered only scenarios that allow for the approximate normality assumption. In this work, we investigate the implications of applying the one-stage and two-stage PM approaches on the relative treatment effects and the between-trial variance in NMA. In Section “Methods”, we introduce the one-stage and two-stage PM approaches for binary MOD in Bayesian random-effects NMA, and we briefly describe the empirical study. The results of the empirical study appear in the homonymous section, followed by Section “Simulation study”, where we describe the setup and the various scenarios of our simulation study. In Section “Results of the simulation study”, we present the results of the simulation study. Discussion of the findings from the empirical and simulation studies can be found in Section “Discussion”, and brief conclusions and recommendations follow in Section “Conclusions”.
Methods
Consider a network of N trials that compare different sets of interventions regarding a binary outcome, where a_{i} represents the number of interventions (from now on called arms) investigated in trial i (i = 1, 2, …, N). In arm k = 1, 2, …, a_{i} of trial i, \( {r}_{ik}^o \) represents the number of participants who experienced the outcome conditional on the completers (i.e. participants who completed the trial), m_{ik} represents the number of missing participants, and n_{ik} represents the total number of randomised participants.
One-stage pattern-mixture model
By convention, \( {r}_{ik}^o \) and m_{ik} are assumed to follow the corresponding binomial distributions:

\( {r}_{ik}^o\sim \mathrm{Binomial}\left({n}_{ik}-{m}_{ik},\ {p}_{ik}^o\right),\kern1em {m}_{ik}\sim \mathrm{Binomial}\left({n}_{ik},\ {q}_{ik}\right) \)

where \( {p}_{ik}^o \) is the probability of an event conditional on the completers (i.e. n_{ik} − m_{ik}), and q_{ik} is the probability of MOD in arm k of trial i [4, 6].
Under the pattern-mixture model, the randomised participants are distinguished into those completing and those dropping out of the trial early. Within each subgroup, the participants are further distinguished into those experiencing and those not experiencing the event. Then, the underlying probability of an event, p_{ik}, can be written as a function of these subgroups using conditional probabilities [4]:

\( {p}_{ik}={p}_{ik}^m{q}_{ik}+{p}_{ik}^o\left(1-{q}_{ik}\right) \)   (1)
with \( {p}_{ik}^m \) being the probability of an event conditional on those dropping out of arm k in trial i. Then, the IMOR parameter is defined as follows [7]:

\( {\delta}_{ik}=\frac{{p}_{ik}^m/\left(1-{p}_{ik}^m\right)}{{p}_{ik}^o/\left(1-{p}_{ik}^o\right)} \)

with \( {\varphi}_{ik}=\log \left({\delta}_{ik}\right) \) being the log IMOR in arm k of trial i.
The relationship between \( {p}_{ik}^m \) and \( {p}_{ik}^o \) follows from the formula of the IMOR parameter:

\( {p}_{ik}^m=\frac{{\delta}_{ik}{p}_{ik}^o}{{\delta}_{ik}{p}_{ik}^o+1-{p}_{ik}^o} \)

In particular:

- if \( {p}_{ik}^m={p}_{ik}^o \), then δ_{ik} = 1 (and φ_{ik} = 0), which corresponds to the MAR assumption;
- if \( {p}_{ik}^m>{p}_{ik}^o \), then δ_{ik} > 1 (and φ_{ik} > 0), which suggests a deviation from the MAR assumption and indicates that the odds of an event among the missing participants are higher than the odds of an event among the completers; and
- if \( {p}_{ik}^m<{p}_{ik}^o \), then δ_{ik} < 1 (and φ_{ik} < 0), which also suggests a deviation from the MAR assumption and indicates that the odds of an event among the missing participants are lower than the odds of an event among the completers.
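The three cases above can be checked numerically by inverting the IMOR relationship to obtain the risk among missing participants from the completer risk (a sketch with hypothetical values; `delta` denotes the IMOR):

```python
def p_missing_from_imor(p_obs: float, delta: float) -> float:
    """Event risk among missing participants implied by the completer
    risk p_obs and the IMOR delta."""
    return delta * p_obs / (delta * p_obs + 1 - p_obs)

# delta = 1 recovers MAR; delta > 1 raises the risk; delta < 1 lowers it.
print(p_missing_from_imor(0.2, 1.0))          # 0.2
print(p_missing_from_imor(0.2, 2.0) > 0.2)    # True
print(p_missing_from_imor(0.2, 0.5) < 0.2)    # True
```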
In the present work, we considered independent φ_{ik} to agree with the structure of φ_{ik} in the two-stage PM approach (Section “Two-stage pattern-mixture model”):

\( {\varphi}_{ik}\sim N\left(0,1\right) \)
where we assume on average MAR in each arm of every trial. Since the true missingness mechanism is not known, we consider the MAR assumption to be a reasonably plausible assumption following the recommendations of the relevant literature [4, 7, 9].
Random-effects network meta-analysis model
Then, the logit link function with random effects is applied:

\( \mathrm{logit}\left({p}_{ik}\right)={u}_i+{\theta}_{ik}\cdot I\left(k\ne 1\right),\kern1em {\theta}_{ik}\sim N\left({d}_{{t}_{i1},{t}_{ik}},\ {\tau}^2\right) \)

with u_{i} = logit(p_{i1}) being the log odds of the baseline arm, θ_{ik} being the log OR of an event between arm k (k ≠ 1) and the baseline arm in trial i, and \( {d}_{{t}_{i1},{t}_{ik}} \) being the corresponding summary log OR. Index t_{ik} indicates the intervention studied in arm k of trial i, that is, t_{ik} ∈ {A, B, …}. Typically, τ^{2} is assumed common for all observed comparisons; this corresponds to a correlation equal to 0.5 between any two θ_{ik} (with k ≠ 1) in a multi-arm trial [21].
Under the consistency assumption (i.e. agreement between the direct and indirect sources of evidence [22]), we can obtain all possible pairwise comparisons as linear combinations of the summary log ORs of the basic parameters (i.e. the comparisons with the reference intervention in the network [23]):

\( {d}_{jl}={d}_{Al}-{d}_{Aj} \)

where A is the reference intervention in the network with a set of interventions T = {A, B, C, …} and j ≠ l ∈ T ∖ {A} are the non-reference interventions of the network. Using the basic parameters, we can also obtain several measures of hierarchy to order the interventions from best to worst [24]. However, intervention hierarchy is beyond the scope of the present study.
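As a small numerical illustration of the consistency equations, every pairwise summary log OR follows from the basic parameters (the values in `d` below are arbitrary):

```python
# Basic parameters: summary log ORs versus the reference A (made-up values).
d = {"A": 0.0, "B": 0.5, "C": 0.8, "D": -0.2}

def consistency_logor(j: str, l: str) -> float:
    """Functional parameter d_jl = d_Al - d_Aj under consistency."""
    return d[l] - d[j]

print(round(consistency_logor("B", "C"), 2))  # 0.3 = 0.8 - 0.5
```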
Two-stage pattern-mixture model
In the first stage, we adjust the within-trial log ORs using the pattern-mixture model (eq. (1)). In line with the one-stage PM model, we considered the log IMORs to be on average MAR in each arm of every trial (i.e. ω_{ik} = 0), where p_{ik} corresponds to \( {r}_{ik}^o/\left({n}_{ik}-{m}_{ik}\right) \). Then, the log OR of an event between arm k (k ≠ 1) and the baseline arm in trial i is estimated as:

\( {y}_{ik}=\mathrm{logit}\left({p}_{ik}\right)-\mathrm{logit}\left({p}_{i1}\right) \)

with a continuity correction applied in the presence of zero cells,
where the term ‘zero cells’ refers to observing no events in either arm of a trial. Under the pattern-mixture model, the within-trial variance of the log OR, v_{ik}, is partitioned into the variance due to sampling error and the variance arising from φ_{ik}. In the present work, to approximate the variance due to sampling error, we applied a Taylor series (eq. (13) in White et al. [7]), and for the variance due to φ_{ik} we used eq. (16) in White et al. [7], assuming zero correlation between the φ_{ik} of the compared arms in each trial and \( {\sigma}_{ik}^2 \) equal to 1. By convention, v_{ik} is treated as known, based on the central limit theorem under the assumption that trials are sufficiently large so that y_{ik} approximates the normal distribution with variance equal to v_{ik}.
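The sampling-error part of the first stage can be sketched as follows. Note the simplifications: the Woolf (inverse-cell) variance stands in for the Taylor-series expression of eq. (13) in White et al., and the extra variance term arising from the log IMORs (eq. (16)) is omitted, so this is an illustration rather than the authors' exact implementation:

```python
import math

def first_stage_logor(r1, m1, n1, r2, m2, n2):
    """Completer-based log OR of arm 2 versus arm 1 and a sketch of its
    sampling-error variance, with a 0.5 continuity correction applied
    when any cell is zero."""
    a, b = r2, (n2 - m2) - r2  # events / non-events among completers, arm 2
    c, d = r1, (n1 - m1) - r1  # events / non-events among completers, arm 1
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    y = math.log((a * d) / (b * c))
    v = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf (inverse-cell) variance
    return y, v

y, v = first_stage_logor(r1=5, m1=4, n1=30, r2=10, m2=6, n2=30)
print(round(y, 3))  # log(3) ≈ 1.099
```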
Random-effects network meta-analysis model
In the second stage, following the contrast-based parameterisation described by Dias et al. [25] (Example 7(a) in the Appendix, there), in a multi-arm trial, the within-trial log ORs are sampled from the following multivariate normal distribution:

\( {\boldsymbol{y}}_i\sim N\left({\boldsymbol{\theta}}_i,{\boldsymbol{S}}_i\right) \)
with \( {\boldsymbol{y}}_i={\left({y}_{i2},{y}_{i3},\dots, {y}_{i{a}_i}\right)}^{\prime } \) and \( {\boldsymbol{\theta}}_i={\left({\theta}_{i2},{\theta}_{i3},\dots, {\theta}_{i{a}_i}\right)}^{\prime } \) referring to all pairwise comparisons with the baseline arm of trial i, and

\( {\boldsymbol{S}}_i=\left(\begin{array}{ccc}{v}_{i2}& \cdots & \mathrm{cov}\left({y}_{i2},{y}_{i{a}_i}\right)\\ {}\vdots & \ddots & \vdots \\ {}\mathrm{cov}\left({y}_{i{a}_i},{y}_{i2}\right)& \cdots & {v}_{i{a}_i}\end{array}\right) \)

being the variance-covariance matrix of trial i, with cov(y_{ij}, y_{il}) = 1/(n_{i1}p_{i1}(1 − p_{i1})), j ≠ l ∈ {2, 3, …, a_{i}}, which is the variance of the log odds of the baseline arm (obtained using the delta method). Then, the vector θ_{i} of correlated random effects in trial i is assumed to follow either a multivariate normal distribution (eq. (10) in Dias et al. [25]) or conditional univariate normal distributions on θ_{ik} with k > 2 given all other arms from 2 to a_{i} − 1 (eq. (11) in Dias et al. [25]). Using the consistency equations (Section “One-stage pattern-mixture model”), we can obtain the summary log ORs for all possible comparisons in the network.
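A minimal sketch of assembling S_i for a multi-arm trial, with the baseline-arm log-odds variance as the common off-diagonal covariance (the helper name and the numbers are ours, for illustration only):

```python
def baseline_cov_matrix(variances, n1, p1):
    """S_i: within-trial variances v_ik on the diagonal and the variance
    of the baseline-arm log odds, 1/(n1*p1*(1-p1)), off the diagonal."""
    cov = 1.0 / (n1 * p1 * (1 - p1))
    k = len(variances)
    return [[variances[i] if i == j else cov for j in range(k)]
            for i in range(k)]

S = baseline_cov_matrix([0.25, 0.30], n1=100, p1=0.5)
print(S)  # [[0.25, 0.04], [0.04, 0.3]]
```

The shared off-diagonal term is what induces the correlation between the log ORs of a multi-arm trial, since they are all contrasts against the same baseline arm.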
In summary, the one-stage PM approach uses the information extracted from each arm of every trial as input data (i.e. \( {r}_{ik}^o \), m_{ik}, and n_{ik}) and incorporates the pattern-mixture model (eq. (1)) into the hierarchical model of NMA. Contrariwise, the two-stage PM approach uses the estimated within-trial results as input data (i.e. y_{ik} and v_{ik}) to perform NMA. These results have been derived by applying the pattern-mixture model (eq. (1)) under a specific assumption about φ_{ik} (the ‘on average MAR’ assumption, here) to obtain the p_{ik}.
Factors that may affect the within-trial normality approximation
We used the database of 29 networks from several health-related fields considered in previous work [6]. Detailed information on the MOD per network can be found elsewhere [6]. For this study, we considered a sample size of fewer than 50 participants to represent small trials, and an event risk below 5% to be low. We characterised a network as ‘susceptible’ to the within-trial normality approximation (hereinafter called a ‘susceptible’ network) when there was at least one trial with fewer than 50 participants and/or at least one trial arm with an observed event risk of less than 5%. Otherwise, the network was characterised as ‘non-susceptible’ to the within-trial normality approximation (hereinafter called a ‘non-susceptible’ network). We acknowledge that these two categorisations of the networks may not be universally accepted.
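The classification rule above amounts to a simple check per network (a sketch with hypothetical inputs):

```python
def is_susceptible(trial_sizes, arm_event_risks):
    """'Susceptible' network: at least one trial with fewer than 50
    participants and/or at least one trial arm with an observed event
    risk below 5%."""
    return any(n < 50 for n in trial_sizes) or any(
        p < 0.05 for p in arm_event_risks
    )

print(is_susceptible([40, 200, 150], [0.30, 0.20]))   # True: one small trial
print(is_susceptible([200, 300], [0.01, 0.20]))       # True: one rare-event arm
print(is_susceptible([200, 300], [0.30, 0.20]))       # False
```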
Extent and balance of MOD per trial and network
We used the ‘five-and-twenty rule’ by Sackett et al. [26], which classifies MOD in a trial as resulting in little, intermediate, or serious attrition bias, alongside our definition of unbalanced MOD [6], to classify the trials and networks as having:

- low MOD (i.e. a trial with a percentage of MOD of less than 5%; a network with a median percentage of MOD of less than 5%);
- moderate and balanced MOD: moderate MOD (i.e. a trial with a percentage of MOD between 5 and 20%; a network with a median percentage of MOD between 5 and 20%) which are balanced in the compared arms (i.e. a trial with a difference in the percentage of MOD between the compared arms of up to 6.5%; a network with a median difference in the percentage of MOD between the compared arms of up to 6.5%);
- moderate and unbalanced MOD: moderate MOD which are unbalanced in the compared arms (i.e. a trial with a difference in the percentage of MOD between the compared arms above 6.5%; a network with a median difference in the percentage of MOD between the compared arms above 6.5%);
- large and balanced MOD: large MOD (i.e. a trial with a percentage of MOD over 20%; a network with a median percentage of MOD over 20%) which are balanced in the compared arms; and
- large and unbalanced MOD: large MOD which are unbalanced in the compared arms.
The ‘percentage of MOD for a trial’ is defined as the ratio of the number of MOD across all compared arms of the trial to the total number of randomised participants in that trial. The ‘median percentage of MOD for a network’ refers to the median of these percentages across all trials of the network.
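These two definitions can be sketched directly (the arm-level counts below are hypothetical):

```python
from statistics import median

def trial_mod_percent(missing_per_arm, randomised_per_arm):
    """Percentage of MOD for a trial: missing participants over all
    randomised participants, across all compared arms."""
    return 100 * sum(missing_per_arm) / sum(randomised_per_arm)

def network_mod_percent(trials):
    """Median percentage of MOD for a network, across its trials."""
    return median(trial_mod_percent(m, n) for m, n in trials)

trials = [([5, 3], [50, 50]),      # 8% MOD
          ([20, 10], [100, 100]),  # 15% MOD
          ([1, 1], [40, 40])]      # 2.5% MOD
print(network_mod_percent(trials))  # 8.0 (median of 8, 15, 2.5)
```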
Overall, 37% of the 539 trials in our dataset had low MOD, followed by 30% with moderate and balanced MOD, 18% with moderate and unbalanced MOD, 9% with large and unbalanced MOD, and 6% with large and balanced MOD. Almost half of the networks were classified as having moderate and balanced MOD (Table 1), followed by low MOD (41%). Overall, three networks were found to be more problematic in terms of MOD: two networks of moderate and unbalanced MOD, and one network of large and unbalanced MOD. None of the networks was classified as having large and balanced MOD.
Characteristics of the analysed networks
Eleven of the 29 networks (38%) were categorised as ‘susceptible’ and 18 as ‘non-susceptible’ (Table 1; Supplementary Table 1, Additional file 1). The former group included considerably more trials (median: 21, range: 11–104) and, therefore, more trials per comparison (median: 2, range: 1–13) than the latter group (median: 9, range: 4–15 for trials; median: 1, range: 1–10 for trials per comparison) (Table 1). Of the 11 ‘susceptible’ networks, the majority (72%) had trials with moderate and balanced MOD, whereas the majority (55%) of ‘non-susceptible’ networks had trials with low MOD (Table 1). There were three networks with the most severe cases of MOD overall: one ‘susceptible’ network with moderate and unbalanced MOD, one ‘non-susceptible’ network with moderate and unbalanced MOD, and one ‘non-susceptible’ network with large and unbalanced MOD. The sample size of the trials was moderate overall (median: 204 and 364 in ‘susceptible’ and ‘non-susceptible’ networks, respectively; Table 1); however, nine of the ‘susceptible’ networks included at least one trial with fewer than 50 participants (Supplementary Table 1, Additional file 1). The median event risk indicated frequent events in both network categories (median: 0.58 and 0.66 in ‘susceptible’ and ‘non-susceptible’ networks, respectively; Table 1). Four of the ‘susceptible’ networks included at least one trial with an event risk of less than 5% (Supplementary Table 1, Additional file 1). Nine ‘susceptible’ networks had at least one trial with zero events or non-events (median number of zero cells: 1, range: 1–4; Table 1; Supplementary Table 1, Additional file 1).
Model implementation and presentation of results
Both approaches were implemented in JAGS via the R package R2jags [27] (statistical software R, version 3.6.1 [28]). Technical details on the specification of the models (i.e. prior distributions, convergence inspection, number of chains and iterations) can be found in Additional file 1. We created scatterplots to illustrate the agreement between the results from the one-stage versus the two-stage PM approach for the following three model parameters: a) the posterior mean of the within-trial log ORs, b) the posterior mean of the NMA log ORs for comparisons with the reference intervention in each network, and c) the posterior median of τ^{2}. We also compared the approaches in terms of the posterior standard deviation of the parameters mentioned above. Agreement was inferred when the points were aligned with the diagonal line. To quantify the agreement, we used the concordance correlation coefficient (CCC) [29] via the R package epiR [30]. The R package ggplot2 was used to draw the scatterplots [31]. The dataset and the code to perform the empirical study are available online at https://github.com/LoukiaSpin/OnestagevstwostagePMmodels.git.
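For readers who wish to reproduce the agreement measure outside R, Lin's CCC has a simple closed form (a sketch using population moments; the empirical study itself used the epiR package):

```python
from statistics import fmean

def ccc(x, y):
    """Lin's concordance correlation coefficient: agreement of paired
    estimates with the diagonal (45-degree) line."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx2 = sum((a - mx) ** 2 for a in x) / n
    sy2 = sum((b - my) ** 2 for b in y) / n
    return 2 * sxy / (sx2 + sy2 + (mx - my) ** 2)

print(ccc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))       # 1.0: perfect agreement
print(ccc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]) < 1)   # True: a systematic shift
```

Unlike the Pearson correlation, the CCC penalises both a location shift (the (mx − my)² term) and a scale difference, which is why it suits the diagonal-line comparison of the two approaches.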
Results of the empirical study
The first panel of Fig. 1 shows the posterior mean and standard deviation of the within-trial log ORs across the 11 ‘susceptible’ networks (404 points, Fig. 1 a) and the 18 ‘non-susceptible’ networks (172 points, Fig. 1 b) for different amounts of MOD. The second panel of Fig. 1 presents the posterior mean and standard deviation of the log ORs for the basic parameters of each ‘susceptible’ (104 points, Fig. 1 a) and ‘non-susceptible’ network (80 points, Fig. 1 b), and the third panel illustrates the posterior median and standard deviation of τ^{2} in the ‘susceptible’ networks (11 points, Fig. 1 a) and ‘non-susceptible’ networks (18 points, Fig. 1 b) for different amounts of MOD. Results on the posterior mean of the residual deviance are presented in Additional file 1 (Table S2) to investigate whether each model fits the data satisfactorily for each network.
Posterior mean or median
For the ‘susceptible’ networks, the one-stage and two-stage PM approaches overall agreed concerning the posterior mean of the within-trial log ORs (CCC: 0.99) and the posterior mean of the NMA log ORs (CCC: 0.99) across the different scenarios of MOD (Fig. 1 a, first and second panels). Agreement could also be inferred for the posterior median of τ^{2} (CCC: 0.90), except for four networks with moderate and balanced MOD, whose τ^{2} estimates were higher under the one-stage PM approach (Fig. 1 a, third panel). In more detail, from the left to the right of the plot, the posterior median of τ^{2} under the two-stage PM approach was 0.14, 0.26, 0.37, and 0.71 versus 0.20, 0.40, 0.66, and 0.93 under the one-stage PM approach, respectively. These τ^{2} estimates corresponded to moderate statistical heterogeneity (network 14; the posterior median of τ^{2} was lower than the third quartile of the corresponding predictive distribution for τ^{2}) and large statistical heterogeneity (networks 11, 22, and 27; the posterior median of τ^{2} was larger than the third quartile of the corresponding predictive distributions for τ^{2}). Note that the remaining ‘susceptible’ networks had low statistical heterogeneity, as the posterior median of τ^{2} was lower than the median of the corresponding predictive distributions for τ^{2}. Therefore, in ‘susceptible’ networks with large statistical heterogeneity, the compared approaches did not agree in the estimation of τ^{2}, as the estimated τ^{2} tended to be larger under the one-stage PM approach than under the two-stage PM approach.
Contrariwise, in ‘non-susceptible’ networks, the compared approaches agreed perfectly concerning the posterior mean of the within-trial log ORs and the posterior mean of the NMA log ORs across the different scenarios of MOD (Fig. 1 b, first and second panels). The agreement was almost perfect for the posterior median of τ^{2} (CCC: 0.97), apart from one network with moderate and balanced MOD that showed a slightly larger posterior median of τ^{2} under the one-stage PM approach (Fig. 1 b, third panel). Specifically, the τ^{2} estimates were 0.19 and 0.27 under the two-stage and one-stage PM approaches, respectively – both estimates indicated moderate statistical heterogeneity, being lower than the third quartile of the selected predictive distribution for τ^{2}.
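The heterogeneity labels used above follow a simple rule: place the posterior median of τ^{2} against the median and third quartile of the selected predictive distribution. A sketch (the function name and the example cut points are ours):

```python
def heterogeneity_label(tau2_median, pred_median, pred_q3):
    """'Low' below the predictive median, 'moderate' up to the third
    quartile, 'large' beyond it."""
    if tau2_median < pred_median:
        return "low"
    if tau2_median < pred_q3:
        return "moderate"
    return "large"

# Hypothetical predictive distribution with median 0.11 and Q3 0.40:
print(heterogeneity_label(0.05, 0.11, 0.40))  # low
print(heterogeneity_label(0.19, 0.11, 0.40))  # moderate
print(heterogeneity_label(0.66, 0.11, 0.40))  # large
```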
Posterior standard deviation
In ‘susceptible’ networks, the posterior standard deviation of the NMA log ORs was systematically larger under the one-stage PM approach, especially in networks with moderate and balanced MOD (Fig. 1 a, second panel). This was expected, as the one-stage PM approach accounts for the uncertainty in the estimation of all parameters in the pattern-mixture model (eq. (1)). Therefore, the uncertainty increased when the available information was limited; namely, when the included trials were small with low events and substantial MOD. Contrariwise, in ‘non-susceptible’ networks, the agreement was almost perfect for the posterior standard deviation of the NMA log ORs (Fig. 1 b, second panel).
Regarding the posterior standard deviation of τ^{2}, the agreement was higher in ‘non-susceptible’ networks overall (CCC: 0.96, 95% confidence interval (CI): 0.91 to 0.99) than in the ‘susceptible’ networks (CCC: 0.91, 95% CI: 0.82 to 0.95). In the ‘susceptible’ networks, the one-stage PM approach resulted in a larger posterior standard deviation of τ^{2} for the four networks mentioned above (Section “Posterior mean or median”) (Fig. 1 a and b, third panel). Therefore, in ‘susceptible’ networks with large statistical heterogeneity, the one-stage PM approach also tended to estimate τ^{2} with larger uncertainty than the two-stage PM approach.
Simulation study
We simulated 1000 triangle networks of two-arm trials and three interventions: a new intervention, an old intervention, and placebo. Our main interest was the comparison of the former two interventions; however, for completeness, we also present the results on the basic parameters (i.e. the comparisons with placebo). The ultimate goal of the simulation study was to compare the performance of the two PM approaches in a setting where the normality approximation is compromised (i.e. small trials with low events), while considering the recommended ‘on average MAR’ assumption as the primary analysis to model informative MOD. In practice, it is more plausible for MOD to be informative; however, note that we cannot know the exact missingness mechanism.
Simulation setup
The simulation setup was in line with a previous study on MOD in NMA [5]. Briefly, we assumed a larger beneficial underlying log OR for ‘new intervention versus placebo’ than for ‘old intervention versus placebo’, and we used the consistency equation to obtain the underlying log OR for ‘new versus old intervention’ (Table 2). To generate the number of events in each arm of every trial, we considered the data-generating model of Hartung and Knapp for a random-effects pairwise meta-analysis [34]. For a brief description of the data-generating model, the reader can refer to Additional file 1. To obtain the event risks among the completers in each arm of every trial, we used the linkage function of Turner et al. [4] (eq. (7), there), which is a function of the IMOR parameter, the underlying event risks, and the probability of MOD in each arm of every trial.
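The linkage function ties together the underlying risk, the MOD probability, and the IMOR. As an illustration (a numeric bisection stand-in written by us, not the closed-form eq. (7) of Turner et al.), the completer risk solves p = p^m·q + p^o·(1 − q), with p^m implied by the IMOR:

```python
def completer_risk(p_true, q, delta, tol=1e-10):
    """Solve for the completer risk p_o consistent with the underlying
    risk p_true, the MOD probability q, and the IMOR delta, by bisection
    on the monotone pattern-mixture identity."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        po = (lo + hi) / 2
        pm = delta * po / (delta * po + 1 - po)  # risk among the missing
        if pm * q + po * (1 - q) < p_true:
            lo = po
        else:
            hi = po
    return (lo + hi) / 2

# Under MAR (delta = 1) the completer risk equals the underlying risk.
print(round(completer_risk(p_true=0.3, q=0.2, delta=1.0), 6))  # 0.3
# An IMOR of 2 shifts events towards the missing, so completers sit lower.
print(completer_risk(p_true=0.3, q=0.2, delta=2.0) < 0.3)      # True
```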
Simulation scenarios
In the present work, we considered only a ‘typical loop’ with one trial comparing ‘new versus old intervention’, three trials comparing the ‘new intervention with placebo’, and four trials comparing the ‘old intervention with placebo’ [32] (Table 2). The simulation scenarios were constructed to explore the impact of four key factors: the sample size of the trials, the frequency of events, the extent of MOD, and the degree of τ^{2}. With respect to sample size, we considered a trial as having a small sample size if n < 50, and a moderate sample size if n > 100, equally distributed across the compared arms (Table 2). For the event risk in the control arm, a maximum of 15% was considered low, and at least 27% was considered frequent (Table 2). Initially, we considered a maximum of 5% as a low event risk in the control arm. However, this scenario generated networks with zero events in both arms for the majority of trials, particularly for the scenario of fewer than 50 participants, thus creating serious convergence issues in both approaches. We focused on scenarios of unbalanced MOD with more MOD in the control arm and cases of moderate and large MOD (Table 2). A previous study revealed that moderate and large MOD (unbalanced in the compared arms) affected the performance of the one-stage PM approach in terms of the posterior standard deviation of the log OR and τ^{2} [5]. We considered an informative missingness process for all interventions: IMOR equal to 2 for the new and old interventions (i.e. the odds of an event given MOD are twice the odds of an event given the completers) and IMOR equal to 1/2 for placebo. We considered τ^{2} equal to 0.02 and 0.07 to reflect small and substantial true statistical heterogeneity, respectively. These values correspond to the medians of the predictive log-normal distributions for all-cause mortality (95% prior interval: 0.001–0.26) and a generic health setting (95% prior interval: 0.002–2.67), respectively [33].
Table 2 illustrates the scenarios considered in the simulation.
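The 16 scenarios are the full crossing of the four two-level factors (factor labels paraphrased from the text; the exact cell values are in Table 2):

```python
from itertools import product

factors = {
    "sample size": ["small (n < 50)", "moderate (n > 100)"],
    "control-arm event risk": ["low (<= 15%)", "frequent (>= 27%)"],
    "extent of MOD": ["moderate", "large"],
    "tau^2": [0.02, 0.07],
}
scenarios = list(product(*factors.values()))
print(len(scenarios))  # 16 = 2^4
```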
Model implementation and presentation of results
For each of the 16 scenarios (four factors with two levels each), we performed a Bayesian random-effects NMA with consistency equations using the one-stage and two-stage PM approaches to analyse the generated networks. All analyses were performed under the ‘on average MAR’ assumption as the recommended primary analysis [4, 7, 9, 35, 36]. In line with the empirical study, we considered a non-informative normal prior distribution with zero mean and variance equal to 10,000 on all location parameters for both PM approaches. We assigned a predictive prior distribution to τ^{2} that refers to the improvement of symptoms for a pharmacological versus placebo comparison (median: 0.11, 95% prior interval: 0.01–2.13) and aligns with the beneficial outcome considered in the simulation study [33]. We preferred this prior distribution to a weakly informative prior distribution, such as a half-normal prior distribution on τ with variance one (median: 0.67, 95% prior interval: 0.03–2.24), as the latter compromised the estimation of the parameters for the scenario of low events (Supplementary Tables 3–6, Additional file 1).
For each scenario, we calculated the bias of the (NMA) log OR as the difference between the posterior mean of the log OR and the underlying log OR. The bias of τ^{2} was calculated as the difference between the posterior median of τ^{2} and the underlying τ^{2}. The width of the 95% credible interval (CrI) of a parameter (log OR or τ^{2}) was calculated as the difference between the 97.5th and 2.5th percentiles of the simulated parameter. The bias of the posterior mean and the width of the 95% CrIs of the log OR for every comparison are illustrated using dot plots. The posterior mean and posterior standard deviation of the log ORs are presented in tables in Additional file 1 (Supplementary Tables 7–9). The posterior median and posterior standard deviation of τ^{2}, alongside the bias and the width of the 95% CrI, are presented in Table 3. Regarding the bias and the width of the 95% CrIs of the log OR, we present only the results for small τ^{2}, as the behaviour of the compared approaches was similar under small and substantial true τ^{2}.
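The two performance measures can be sketched from a vector of posterior draws (an illustration of the definitions above, not the authors' code; the toy 'posterior' is made up):

```python
from statistics import fmean, quantiles

def bias_and_width(posterior_draws, truth):
    """Bias of the posterior mean and width of the 95% credible
    interval (97.5th minus 2.5th percentile of the draws)."""
    cuts = quantiles(posterior_draws, n=40)  # cut points at 2.5%, ..., 97.5%
    lower, upper = cuts[0], cuts[-1]
    return fmean(posterior_draws) - truth, upper - lower

draws = [i / 100 for i in range(101)]  # toy 'posterior' on [0, 1]
bias, width = bias_and_width(draws, truth=0.5)
print(abs(bias) < 1e-9)  # True for this symmetric toy sample
```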
To demonstrate that there is an association between the within-trial log OR and its standard error when the normality approximation cannot be defended (i.e. small trials with low events), we used the simulated triangles to estimate the covariance between the within-trial log OR and its standard error at the first stage of the two-stage PM approach. We created a scatterplot for each scenario, where we plotted the estimated within-trial standard error of the log OR against the within-trial log OR, and we used different colours to illustrate the magnitude of the covariance. We present the results for ‘new versus old intervention’ in the main text and the results for the comparisons with placebo in Supplementary Figs. 1–2 (Additional file 1).
For each simulation, we used three parallel chains with different initial values, a thinning interval of 10, 80,000 updates, and a burn-in of 20,000 Markov chain Monte Carlo samples. Simulations and analyses were performed in R [28]. The dot plots and scatterplots were created using the R package ggplot2 [31]. The code and necessary material to generate and analyse the triangles are available online at https://github.com/LoukiaSpin/OnestagevstwostagePMmodels.git.
Results of the simulation study
Bias and width of 95% credible interval of log ORs
For the scenario of low events and small trial size, we encountered convergence issues in a small fraction of simulations for the log OR of all comparisons under the one-stage PM approach alone (the results of 1 to 4% of simulations were discarded). In both approaches, the absolute bias of the posterior mean of the log OR for ‘new versus old intervention’ was smaller under all scenarios as compared to the bias of the posterior mean of the log OR for both basic parameters (Fig. 2). The one-stage PM approach overestimated the posterior mean of the log OR for ‘new versus old intervention’ in the presence of small trials with low event frequency, and notably for large MOD (Fig. 2). On the contrary, the bias in the two-stage PM approach was very low for those scenarios (bias equal to 0.03). In the remaining scenarios, the bias of the posterior mean of the log OR for ‘new versus old intervention’ was similar in both approaches.
Interestingly, the posterior mean of the log OR for both basic parameters was substantially underestimated in both approaches in the presence of large MOD (Fig. 2). For low events and small trial size, both basic parameters had a smaller bias under the one-stage PM approach. The exception was the case of large MOD, where the log OR for ‘old intervention versus placebo’ was slightly more biased under the one-stage approach (Fig. 2; Supplementary Tables 8–9, Additional file 1). In the remaining scenarios, the bias of the posterior mean of the log OR for the basic parameters was similar in both models (Fig. 2; Supplementary Tables 8–9, Additional file 1).
The relatively high negative bias in the basic parameters under both approaches may be attributed to residual bias after applying the MAR assumption to analyse informative MOD, which were assumed to be moderate or large in all included trials. To investigate whether the extent of MOD may indeed explain this bias, we re-ran the simulation study considering low attrition (%MOD < 5) in all included trials. Under this best-case situation, the bias in the log OR of the basic parameters was reduced in both approaches. Specifically, the bias ranged from − 0.1 (moderate trial size with frequent events and substantial τ^{2}) to 0.07 (small trials with low events and small τ^{2}) under the one-stage PM approach, and from − 0.19 (small trials with low events and substantial τ^{2}) to − 0.05 (frequent events and small τ^{2}) under the two-stage PM approach (Supplementary Fig. 3, Additional file 1). Therefore, increasing the amount of MOD increased the bias in both basic parameters, particularly under the two-stage PM approach. Note that in each network of our database, the percentage of MOD (%MOD) ranged from very low levels (indicating low attrition bias; %MOD < 5) to moderate or large levels (indicating serious attrition bias; %MOD > 20) across the trials. Thus, we consider our simulation study to reflect a rather worst-case situation; in a ‘typical’ network, the bias in the log OR of the basic parameters would be lower.
The 95% CrIs of the log OR were wider under the one-stage PM approach for all comparisons, especially for small trials with low events and large MOD (range: 5.79–6.85 under the one-stage PM approach; range: 3.53–4.45 under the two-stage PM approach) (Fig. 3). Under these scenarios, the available information was limited, and therefore, both approaches estimated the log OR with greater uncertainty as compared to scenarios with more information (e.g. moderate trial size and/or frequent events). However, since the one-stage PM approach inherently treats all parameters of the pattern-mixture model (eq. (1)) as random variables, the uncertainty around the estimation of the log OR was larger under this approach in ‘susceptible’ networks with considerable MOD. Contrariwise, the two-stage PM approach estimated the within-trial log ORs and their standard errors at the first stage (via the pattern-mixture model). Therefore, the two-stage PM approach ‘disregarded’ the uncertainty in the estimation of the within-trial log ORs at the second stage, leading to spuriously more precise summary log ORs even in the presence of large MOD.
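The precision difference can be made concrete: the second stage of the two-stage approach is effectively an inverse-variance synthesis that conditions on the first-stage standard errors as known constants. A minimal common-effect sketch follows; the four within-trial estimates are hypothetical, and a random-effects second stage would additionally add τ^{2} to each within-trial variance:

```python
import math

# Hypothetical within-trial log ORs and standard errors after a first-stage
# pattern-mixture adjustment (values are assumptions for illustration only).
log_ors = [0.40, -0.10, 0.65, 0.20]
ses = [0.35, 0.50, 0.60, 0.45]

# Second stage: inverse-variance pooling that treats each within-trial SE as
# a known constant, so the estimation uncertainty of the first stage never
# propagates into the summary estimate.
weights = [1 / se**2 for se in ses]
pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
print(f"pooled log OR {pooled:.3f} (SE {pooled_se:.3f})")  # -> 0.285 (SE 0.224)
```

Because the weights are fixed numbers rather than random quantities, the pooled SE reflects only sampling variability around fixed within-trial variances, which is the source of the spurious precision discussed above.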
Ad hoc analysis: association between the within-trial log OR and its standard error
Figure 4 illustrates a panel of scatterplots of the within-trial standard error of the log OR for ‘new versus old intervention’ (y) against the within-trial log OR for that comparison (x) for each simulation scenario. For positive values of x, the covariance between x and y was positive, and therefore, trials with larger positive x corresponded to larger y and received smaller weight, whereas trials with smaller positive x corresponded to smaller y and received larger weight. On the contrary, for negative values of x, the covariance between x and y was negative, and therefore, the pooled log OR was biased upwards. This pattern was observed for trials with small size and/or low event frequency, regardless of τ^{2}, and became more evident for large MOD (Fig. 4). The conclusions were the same for the comparisons of the new and old interventions versus placebo (Supplementary Figs. 1–2, Additional file 1).
Bias and width of 95% credible interval of common τ^{2}
Both approaches achieved convergence in all scenarios regarding τ^{2}. Under small true τ^{2}, both approaches estimated a similarly low posterior median of τ^{2} for moderate trial size and frequent events that approached the truth regardless of the MOD scenario (Table 3). In the remaining scenarios, both approaches overestimated τ^{2} similarly for moderate and large MOD, though the bias was slightly larger under the one-stage PM approach (from 0.05 to 0.11) as compared to the two-stage PM approach (from 0.05 to 0.08), especially for small trials with low events. The overestimation may be attributed to the small true τ^{2} lying close to the first quartile of the prior predictive distribution that we assigned to τ^{2} in both approaches (equal to 0.02), so that the prior pulled the posterior upwards.
The conclusions were similar for substantial true τ^{2} (Table 3). As expected, the posterior median of τ^{2} was slightly larger in both approaches in most scenarios compared to the posterior median of τ^{2} under small true τ^{2}. However, bias was lower under substantial true τ^{2} in all scenarios as compared to small true τ^{2}. A plausible explanation may be that the substantial true τ^{2} was closer to the median of the prior predictive distribution for τ^{2} in both approaches (equal to 0.11), and hence, the magnitude of overestimation was relatively smaller under substantial true τ^{2} than under small true τ^{2}.
Overall, the one-stage PM approach led to wider 95% CrIs for τ^{2} as compared to the two-stage PM approach, especially for small trials with low events (Table 3). As expected, in both approaches, the 95% CrIs for τ^{2} were wider under substantial τ^{2} (range: 0.57–3.29 in the one-stage PM approach; 0.56–1.21 in the two-stage PM approach) when compared with small τ^{2} (range: 0.31–3.06 in the one-stage PM approach; 0.31–1.19 in the two-stage PM approach), as well as under large MOD (range: 0.40–3.29 in the one-stage PM approach; 0.42–1.21 in the two-stage PM approach) as compared to moderate MOD (range: 0.31–2.58 in the one-stage PM approach; 0.31–1.15 in the two-stage PM approach). In the case of moderate trial size with frequent events, both approaches led to a very similar width of the 95% CrI for τ^{2} regardless of the MOD scenario.
Discussion
We compared the one-stage approach with the two-stage approach in the presence of MOD via the pattern-mixture model using Bayesian random-effects NMA. We performed an empirical and a simulation study to investigate the behaviour of the NMA log OR and τ^{2} under moderate or large MOD and under design factors that challenge the within-trial approximate normality assumption in the two-stage approach (i.e. sample size and event frequency).
The empirical study revealed that in the case of ‘susceptible’ networks with moderate MOD, the posterior standard deviation of the NMA log OR was systematically larger under the one-stage PM approach. The simulation study indicated that this behaviour was more evident in the presence of small trials with low events and was exacerbated for large MOD. This is a situation where the available information is limited, and therefore, the uncertainty around the estimated NMA log OR increases. Our results are in line with Stijnen et al. [17], although those authors applied binomial-normal and hypergeometric-normal models in the absence of MOD.
Furthermore, the empirical study did not indicate any systematic differences in the posterior mean of the within-trial log ORs and NMA log ORs (for the basic parameters) between the compared approaches across the different amounts of MOD. Nevertheless, the simulation study revealed that in networks of small trials with low events and large MOD, the one-stage PM approach resulted in a relatively higher positive bias of the NMA log OR for ‘new versus old intervention’ (functional parameter) as compared to the two-stage PM approach. This behaviour may be an artefact of the consistency equation, which also carries over to the bias of the NMA log OR for ‘new versus old intervention’ (see Additional file 1). Presenting only the simulation results for the functional parameters of interest may be misleading if there is substantial bias in at least one of the basic parameters, because the bias in the functional parameters may be cancelled out to a great extent through the consistency equation, especially in the two-stage PM approach.
In the presence of large statistical heterogeneity, the empirical study revealed that the one-stage PM approach tended to provide a larger estimate of τ^{2} as compared to the two-stage PM approach for ‘susceptible’ networks. This held also for moderate and balanced MOD. Neither a previous study [5] nor the present simulation study indicated any implications of the amount of MOD for the estimation of τ^{2}; however, large MOD led to greater uncertainty in the estimation of τ^{2} (a behaviour we also observed in our simulation study), more notably for the one-stage PM approach [5]. However, our simulation study did not reveal the same large discrepancy between the compared approaches concerning the estimation of τ^{2} in networks of small trials with low events and substantial true τ^{2}. A plausible explanation may be that the substantial true τ^{2} was much lower than the τ^{2} estimated in the empirical study (minimum equal to 0.20), too low to capture a larger discrepancy between the compared approaches. Both true values for τ^{2} referred to a ‘typical’ meta-analysis with small or substantial statistical heterogeneity. Therefore, we consider the four networks with a large estimate of τ^{2} under the one-stage PM approach to represent a rather extreme situation.
The parameter τ^{2} is a nuisance parameter in the random-effects model, and it has no intuitive clinical interpretation as opposed to the log OR. Nevertheless, τ^{2} is an important parameter for the evaluation of the certainty of the evidence in the context of inconsistency using the GRADE framework [37, 38]. The magnitude of τ^{2} affects our decision on whether to downgrade the evidence for inconsistency (and by how many levels): the larger the τ^{2}, the more likely we are to downgrade the evidence for the investigated outcome. Therefore, how accurately τ^{2} is estimated in a model is of critical importance. Our simulation study revealed that both approaches had a similar behaviour overall, except for networks of small trials with low events, where the one-stage PM approach led to a slightly larger bias in the estimation of τ^{2}. Nonetheless, this should not be viewed as a reason to prefer the two-stage PM approach over the one-stage PM approach, because, in such networks, the two-stage PM approach cannot be considered reliable, as it relies on the normality approximation.
Ignoring the inherent correlation between the within-trial log OR and the within-trial standard error, in conjunction with considerable MOD, also raises concerns about the credibility of the results from the two-stage PM approach, particularly in networks of small trials with low event frequency [15, 39]. As illustrated in Fig. 4, there is a positive association between the within-trial log OR and its standard error when the within-trial log ORs are positive. However, there is a negative association when the within-trial log ORs are negative, and this pattern was obvious in networks of small trials with low events. Stijnen et al. [17] noted that a positive or negative association between the within-trial log OR and its standard error would result in a downward or upward bias in the log OR, respectively. In our study, this implication was obvious only for the basic parameters. As already mentioned, imposing consistency on the bias for ‘new versus old intervention’ led to a smaller (yet positive) bias when compared with the bias for the basic parameters.
The flexibility of the one-stage PM approach comes at a high computational cost, as it appeared 10-fold more computationally demanding than the two-stage PM approach. Not surprisingly, convergence issues occurred for the estimation of the NMA log OR in the networks of small trials and low event frequency only under the one-stage PM approach. The use of a continuity correction seems to aid the convergence of the two-stage PM approach for the NMA log OR in this particular scenario. Nevertheless, both approaches share a common limitation: the assumption of normally distributed random effects, which, if deemed inappropriate (e.g. there are outlying trials in the synthesised dataset [15]), may compromise the validity of the results [17]. Using a simulation study to compare seven models for random-effects meta-analysis in the frequentist framework, Jackson et al. [18] demonstrated that both the binomial-normal (one-stage approach) and normal-normal (two-stage approach) models performed poorly overall. The authors suggested alternative model parameterisations (models 4, 6 and 7, there), especially when events are low or there is considerable statistical heterogeneity according to visual inspection of the forest plot. Extending these models to incorporate the pattern-mixture model in the Bayesian framework may offer proper alternatives to the current one-stage and two-stage PM approaches.
We did not perform a sensitivity analysis for different assumptions about the missingness mechanisms in the compared interventions, as we were interested in investigating the performance of the competing models in the presence of MOD rather than in drawing inferences about the relative effectiveness of the compared interventions in the studied networks. We investigated the performance of the competing models assuming MAR, which is the recommended starting point according to the relevant published literature [4, 7, 9, 35, 36]; the competing models will behave the same, in terms of performance, under any shared assumption of informative missingness. To encourage good practice in the analysis of MOD, we advise researchers to systematically apply a sensitivity analysis over a series of increasingly stringent yet clinically plausible scenarios for the missingness mechanisms in the compared interventions, to investigate whether inferences deviate from those under the MAR assumption, the recommended primary analysis.
Furthermore, in the empirical study, we did not perform a sensitivity analysis for different prior distributions for the between-trial variance, as we were not interested in the relative effectiveness of the competing interventions in the analysed networks but in the performance of the competing models. Our simulation study revealed that the competing models maintained their performance under the weakly-informative prior (Tables S3–S5 in Additional file 1). However, the posterior standard deviation increased in both models when compared with the results under the predictive prior (Tables S7–S9 in Additional file 1), particularly for scenarios that compromised the approximate normality assumption, as expected. In Bayesian analysis, it is good practice to consider different plausible prior distributions for the between-trial variance that align with the type and frequency of the investigated outcome as well as with the intervention-comparison type. Researchers can then investigate the sensitivity of the conclusions from the primary analysis to different prior distributions for the between-trial variance. The recently updated NICE Guide to the Methods of Technology Appraisal provides recommendations for selecting an appropriate prior distribution for the between-trial variance in NMA [40].
Conclusions
The two-stage PM approach is straightforward to implement owing to its simpler parameterisation, absence of convergence issues, and shorter run time as compared to the one-stage PM approach. Nevertheless, the well-known statistical shortcomings of this approach, which relate to its approximate normal likelihood assumption and its inability to learn about the missingness mechanisms (since the missingness parameter is fixed rather than estimated), render it less appealing overall for the analysis of MOD. The one-stage PM approach tackles these limitations, and thus, it may be considered the more appropriate approach. However, the simulation study did not demonstrate the one-stage PM approach to be superior, as it also yielded considerable bias in the NMA log ORs, especially for large MOD, which was only slightly lower than or similar to the corresponding bias under the competing approach. Until a more competent model is developed, we advise researchers to apply the one-stage PM approach to handle MOD, especially in situations that make the approximate normality assumption difficult to defend, provided that the limitations of this approach (as demonstrated in the present empirical and simulation study) are fully acknowledged in the discussion of the NMA results.
Availability of data and materials
The datasets generated and analysed during the current study are available online at https://github.com/LoukiaSpin/OnestagevstwostagePMmodels.git.
Abbreviations
CCC: Concordance correlation coefficient
CI: Confidence interval
CrI: Credible interval
IMOR: Informative missingness odds ratio
MAR: Missing at random
MOD: Missing outcome data
NMA: Network meta-analysis
OR: Odds ratio
References
1. Akl EA, Carrasco-Labra A, Brignardello-Petersen R, Neumann I, Johnston BC, Sun X, et al. Reporting, handling and assessing the risk of bias associated with missing participant data in systematic reviews: a methodological survey. BMJ Open. 2015;5(9):e009368.
2. Kahale LA, Diab B, Brignardello-Petersen R, Agarwal A, Mustafa RA, Kwong J, et al. Systematic reviews do not adequately report or address missing outcome data in their analyses: a methodological survey. J Clin Epidemiol. 2018;99:14–23.
3. Spineli LM, Pandis N, Salanti G. Reporting and handling missing outcome data in mental health: a systematic review of Cochrane systematic reviews and meta-analyses. Res Synth Methods. 2015;6(2):175–87.
4. Turner NL, Dias S, Ades AE, Welton NJ. A Bayesian framework to account for uncertainty due to missing binary outcome data in pairwise meta-analysis. Stat Med. 2015;34(12):2062–80.
5. Spineli LM, Kalyvas C, Pateras K. Participants’ outcomes gone missing within a network of interventions: Bayesian modeling strategies. Stat Med. 2019;38(20):3861–79.
6. Spineli LM. An empirical comparison of Bayesian modelling strategies for missing binary outcome data in network meta-analysis. BMC Med Res Methodol. 2019;19(1):86.
7. White IR, Higgins JPT, Wood AM. Allowing for uncertainty due to missing data in meta-analysis, part 1: two-stage methods. Stat Med. 2008;27(5):711–27.
8. Little RJA. Pattern-mixture models for multivariate incomplete data. J Am Stat Assoc. 1993;88(421):125–34.
9. Higgins JP, White IR, Wood AM. Imputation methods for missing outcome data in meta-analysis of clinical trials. Clin Trials. 2008;5(3):225–39.
10. Chaimani A, Mavridis D, Higgins JPT, Salanti G, White IR. Allowing for informative missingness in aggregate data meta-analysis with continuous or binary outcomes: extensions to metamiss. Stata J. 2018;18(3):716–40.
11. Bradburn MJ, Deeks JJ, Berlin JA, Localio AR. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Stat Med. 2007;26(1):53–77.
12. Sweeting MJ, Sutton AJ, Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Stat Med. 2004;23(9):1351–75.
13. Davey J, Turner RM, Clarke MJ, Higgins JPT. Characteristics of meta-analyses and their component studies in the Cochrane database of systematic reviews: a cross-sectional, descriptive analysis. BMC Med Res Methodol. 2011;11:160.
14. Nikolakopoulou A, Chaimani A, Veroniki AA, Vasiliadis HS, Schmid CH, Salanti G. Characteristics of networks of interventions: a description of a database of 186 published networks. PLoS One. 2014;9(1):e86754.
15. Jackson D, White IR. When should meta-analysis avoid making hidden normality assumptions? Biom J. 2018;60(6):1040–58.
16. Hamza TH, van Houwelingen HC, Stijnen T. The binomial distribution of meta-analysis was preferred to model within-study variability. J Clin Epidemiol. 2008;61(1):41–51.
17. Stijnen T, Hamza TH, Özdemir P. Random effects meta-analysis of event outcome in the framework of the generalized linear mixed model with applications in sparse data. Stat Med. 2010;29(29):3046–67.
18. Jackson D, Law M, Stijnen T, Viechtbauer W, White IR. A comparison of seven random-effects models for meta-analyses that estimate the summary odds ratio. Stat Med. 2018;37(7):1059–85.
19. Seide SE, Jensen K, Kieser M. A comparison of Bayesian and frequentist methods in random-effects network meta-analysis of binary data. Res Synth Methods. 2020;11(3):363–78.
20. Spineli LM, Kalyvas C. Comparison of exclusion, imputation and modelling of missing binary outcome data in frequentist network meta-analysis. BMC Med Res Methodol. 2020;20(1):48.
21. Higgins JP, Whitehead A. Borrowing strength from external trials in a meta-analysis. Stat Med. 1996;15(24):2733–49.
22. Salanti G. Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: many names, many benefits, many concerns for the next generation evidence synthesis tool. Res Synth Methods. 2012;3(2):80–97.
23. Lu G, Ades AE. Assessing evidence inconsistency in mixed treatment comparisons. J Am Stat Assoc. 2006;101(474):447–59.
24. Salanti G, Ades AE, Ioannidis JPA. Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial. J Clin Epidemiol. 2011;64(2):163–71.
25. Dias S, Sutton AJ, Ades AE, Welton NJ. Evidence synthesis for decision making 2: a generalized linear modeling framework for pairwise and network meta-analysis of randomized controlled trials. Med Decis Mak. 2013;33(5):607–17.
26. Sackett DL, Richardson WS, Rosenberg WM, Haynes RB. Evidence-based medicine: how to practice and teach EBM. New York: Churchill Livingstone; 1997.
27. Su Y, Yajima M. R2jags: Using R to Run ‘JAGS’. R package version 0.5–7. 2015. https://cran.rproject.org/package=R2jags.
28. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2019. https://www.rproject.org.
29. Lin LI. A concordance correlation coefficient to evaluate reproducibility. Biometrics. 1989;45(1):255–68.
30. Stevenson M. epiR: Tools for the Analysis of Epidemiological Data. R package version 1.0–15. 2020. https://cran.rproject.org/package=epiR.
31. Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York: Springer-Verlag; 2009.
32. Veroniki AA, Mavridis D, Higgins JPT, Salanti G. Characteristics of a loop of evidence that affect detection and estimation of inconsistency: a simulation study. BMC Med Res Methodol. 2014;14:106.
33. Turner RM, Jackson D, Wei Y, Thompson SG, Higgins JPT. Predictive distributions for between-study heterogeneity and simple methods for their application in Bayesian meta-analysis. Stat Med. 2015;34(6):984–98.
34. Hartung J, Knapp G. A refined method for the meta-analysis of controlled clinical trials with binary outcome. Stat Med. 2001;20(24):3875–89.
35. White IR, Welton NJ, Wood AM, Ades AE, Higgins JPT. Allowing for uncertainty due to missing data in meta-analysis, part 2: hierarchical models. Stat Med. 2008;27(5):728–45.
36. White IR, Carpenter J, Horton NJ. Including all individuals is not enough: lessons for intention-to-treat analysis. Clin Trials. 2012;9(4):396–407.
37. Zhang Y, Akl EA, Schünemann HJ. Using systematic reviews in guideline development: the GRADE approach. Res Synth Methods. 2018. https://doi.org/10.1002/jrsm.1313.
38. Brignardello-Petersen R, Bonner A, Alexander PE, Siemieniuk RA, Furukawa TA, Rochwerg B, et al. Advances in the GRADE approach to rate the certainty in estimates from a network meta-analysis. J Clin Epidemiol. 2018;93:36–44.
39. Chang BH, Hoaglin DC. Meta-analysis of odds ratios: current good practices. Med Care. 2017;55(4):328–35.
40. Welton NJ, Phillippo DM, Owen R, Jones HJ, Dias S, Bujkiewicz S, Ades AE, Abrams KR. DSU Report. CHTE2020 Sources and Synthesis of Evidence; Update to Evidence Synthesis Methods. March 2020.
Acknowledgements
Chrysostomos Kalyvas is employed by Merck Sharp & Dohme. Katerina Papadimitropoulou is a PhD candidate at the Department of Clinical Epidemiology of Leiden University Medical Center and an employee of Danone Nutricia Research in Utrecht, The Netherlands. The authors alone are responsible for the views expressed in this article, and these views should not be construed as the views, decisions, or policies of the institutions with which they are affiliated.
Funding
This work was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft under grant number SP 1664/1–1) to LMS. The funder was not involved in the study design; in the collection, analysis and interpretation of data; in the writing of the report; and in the decision to submit the article for publication. Open Access funding enabled and organized by Projekt DEAL.
Author information
Affiliations
Contributions
LMS conceived the idea of the study. All authors designed the study. LMS analysed the empirical data. LMS and CK performed the simulations. CK and KP checked the code for correctness. LMS drafted the article. All authors revised the article critically for important intellectual content and approved the final version of the article.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1: Table S1. Comparison size and factors that affect the within-trial normal approximation. Table S2. The posterior mean of the residual deviance of each model per network. Table S3. Posterior mean (95% CrI) and bias (width of 95% CrI) for log OR (new versus old) under a half-normal prior distribution on τ. Table S4. Posterior mean (95% CrI) and bias (width of 95% CrI) for log OR (new versus placebo) under a half-normal prior distribution on τ. Table S5. Posterior mean (95% CrI) and bias (width of 95% CrI) for log OR (old versus placebo) under a half-normal prior distribution on τ. Table S6. Posterior median (95% CrI) and bias (width of 95% CrI) for common τ^{2} under a half-normal prior distribution on τ. Table S7. Posterior mean (95% CrI) and bias (width of 95% CrI) for log OR (new versus old intervention) under an empirical prior distribution on τ. Table S8. Posterior mean (95% CrI) and bias (width of 95% CrI) for log OR (new versus placebo) under an empirical prior distribution on τ. Table S9. Posterior mean (95% CrI) and bias (width of 95% CrI) for log OR (old versus placebo) under an empirical prior distribution on τ. Fig. S1. A panel of scatterplots of the within-trial standard error of log OR for ‘new intervention versus placebo’ (y-axis) against the within-trial log OR for that comparison (x-axis) for each simulation scenario. The colour key indicates the magnitude of the covariance between the within-trial standard error of log OR and the within-trial log OR for that comparison. MOD, missing outcome data; OR, odds ratio. Fig. S2. A panel of scatterplots of the within-trial standard error of log OR for ‘old intervention versus placebo’ (y-axis) against the within-trial log OR for that comparison (x-axis) for each simulation scenario. The colour key indicates the magnitude of the covariance between the within-trial standard error of log OR and the within-trial log OR for that comparison. MOD, missing outcome data; OR, odds ratio. Fig. S3. Dot plots on the bias of the posterior mean of the NMA log OR for all pairwise comparisons under the one-stage and two-stage PM approaches while accounting for low missing outcome data, the size of trials (small, moderate), the event frequency (low, frequent) and the extent of τ^{2} (small, substantial).
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Spineli, L.M., Papadimitropoulou, K. & Kalyvas, C. Pattern-mixture model in network meta-analysis of binary missing outcome data: one-stage or two-stage approach? BMC Med Res Methodol 21, 12 (2021). https://doi.org/10.1186/s12874020012056
Keywords
 Network meta-analysis
 Missing outcome data
 Pattern-mixture model
 Bayesian methods
 One-stage approach
 Two-stage approach
 Simulation study