 Research article
 Open Access
 Published:
Multivariate network metaanalysis incorporating class effects
BMC Medical Research Methodology volume 20, Article number: 184 (2020)
Abstract
Background
Network metaanalysis synthesises data from a number of clinical trials in order to assess the comparative efficacy of multiple healthcare interventions in similar patient populations. In situations where clinical trial data are heterogeneously reported i.e. data are missing for one or more outcomes of interest, synthesising such data can lead to disconnected networks of evidence, increased uncertainty, and potentially biased estimates which can have severe implications for decisionmaking. To overcome this issue, strength can be borrowed between outcomes of interest in multivariate network metaanalyses. Furthermore, in situations where there are relatively few trials informing each treatment comparison, there is a potential issue with the sparsity of data in the treatment networks, which can lead to substantial parameter uncertainty. A multivariate network metaanalysis approach can be further extended to borrow strength between interventions of the same class using hierarchical models.
Methods
We extend the trivariate network metaanalysis model to incorporate the exchangeability between treatment effects belonging to the same class of intervention to increase precision in treatment effect estimates. We further incorporate a missing data framework to estimate uncertainty in trials that did not report measures of variability in order to maximise the use of all available information for healthcare decisionmaking. The methods are applied to a motivating dataset in overactive bladder syndrome. The outcomes of interest were mean change from baseline in incontinence, voiding and urgency episodes. All models were fitted using Bayesian Markov Chain Monte Carlo (MCMC) methods in WinBUGS.
Results
All models (univariate, multivariate, and multivariate models incorporating class effects) produced similar point estimates for all treatment effects. Incorporating class effects in multivariate models often increased precision in treatment effect estimates.
Conclusions
Multivariate network metaanalysis incorporating class effects allowed for the comparison of all interventions across all outcome measures to ameliorate the potential impact of outcome reporting bias, and further borrowed strength between interventions belonging to the same class of treatment to increase the precision in treatment effect estimates for healthcare policy and decisionmaking.
Background
In a health technology assessment (HTA) setting, it is imperative that decisions regarding healthcare policy are formulated with consideration to all outcomes associated with the decision question, using all relevant evidence, and appropriately taking uncertainty in to account. In many healthcare conditions, the effects of new interventions are reported on multiple outcomes and these effects are correlated. Taking in to account the dependence between correlated treatment effects, as well as heterogeneous reporting of treatment effects, may have a substantial impact on the ability to appropriately estimate costeffectiveness for new interventions [1], especially in terms of capturing uncertainty.
Network metaanalysis (NMA) is widely used in an evidence synthesis setting due to the attractive nature of utilizing all relevant information from both direct evidence (obtained from headtohead trials) and indirect evidence (obtained from trials that share a common treatment comparator) [2–4]. Multivariate network metaanalysis (MVNMA) builds on this framework and has the ability to simultaneously model treatment effects for multiple outcomes, as well as estimate the correlation between them. Consequently, for the studies that did not report certain outcomes of interest, it is possible to obtain a predictive value for missing data by modelling the treatment effects on multiple outcomes jointly, taking into account the correlation between them. This methodology can not only increase the information for decision making, but it can also limit the potential for outcome reporting bias in the original trials [5, 6], and consequently increase precision and confidence in the treatment effect estimates [7].
There are two types of correlations that need to be incorporated in the estimation of treatment effect estimates for multiple outcomes  one at the betweenstudy level, and another at the withinstudy level. Betweenstudy correlations occur due to variability betweenstudies. Betweenstudy variability occurs in situations with which studies have a different distribution of potential treatment effect modifiers, such as differences in patient characteristics, trial designs, and baseline patient severity. Withinstudy correlations occur between treatment effects on multiple outcomes within a trial, and are a consequence of differences in patientlevel characteristics. These correlations are indicative of the association between treatment effects measured on multiple outcomes on the same patients within a study. Withinstudy correlations are often difficult to estimate and they are seldom reported in clinical trials. Thus, estimation is often required using individual participant data [8], for example by bootstrapping [9], or elicited from expert opinion [10].
The ability to simultaneously model treatment effects on multiple outcomes in a multivariate analysis is an appealing feature in many HTA settings, as interest frequently lies in multiple and correlated outcome measures [11]. In the last 20 years, evidence synthesis methods have witnessed a rapid increase in methodological developments and applications of multivariate analyses to assess interventions with two or more outcomes of interest [9, 11–19]. It is often desirable to account for the correlation between treatment effects on multiple outcomes in a metaanalysis framework as this has the ability to borrow strength between all reported outcomes and studies in order to inform treatment effect estimates [9]. Such an approach is commonly referred to as multivariate metaanalysis. In more recent developments, multivariate metaanalyses have been extended to incorporate multiple treatment comparisons or network metaanalyses [7, 10, 20]. Ades et al. [20] simultaneously modelled mutually exclusive, competing risk outcomes, using a multinomial likelihood whereby the withinstudy correlations were accounted for but the betweenstudy correlations were assumed to be zero. Efthimiou et al. [10] proposed a MVNMA model that accounts for both the between and withinstudy correlations of binary outcomes. Specifically, Efthimiou et al. [10] incorporated the withinstudy correlations at the studyspecific treatment contrast level, which can be problematic when incorporating multiarm trials. Achana et al. [7] used a more natural modelling approach for armlevel data whereby the withinstudy correlations were incorporated at the treatmentarm level. Using this approach, Achana et al. [7] considered the treatment arms to be independent as a consequence of randomisation, which greatly eases computation of the likelihood for multiarm trials. Furthermore, Achana et al. [7] developed this methodology to borrow information across outcomes in order to predict an estimate for missing data. This methodology allows disconnected interventions to be incorporated in to the analyses if they belong to a connected network for one or more additional outcomes; thereby, allowing all interventions to be evaluated across all outcome measures.
In situations with which there are a large number of interventions in the treatment networks and relatively few trials informing each treatment comparison, there is a potential issue with sparsity of data, which can lead to substantial parameter uncertainty. Collapsing the intervention arms into their respective treatment classes, also known as “lumping" interventions, increases the evidence base and precision in the effect estimates, but with such a classbased approach, the direct interpretation of individual intervention effects (especially those of dose or formulation effects) are lost, which can hinder decisionmaking. To overcome this issue, exchangeability between the same interventions, but with different formulations and/or treatment regimes can be incorporated in a threelevel hierarchical NMA to borrow information within the classes of interventions, strengthening inferences, and potentially reducing the uncertainty around the individual intervention effects [21].
Furthermore, in situations where clinical trial data are heterogeneously reported and/or missing, synthesising such data can be problematic. Metaanalysis methods require specification of both a measure of effect and a measure of variability in the original trial reports. In the absence of such data, trials are often excluded from the analysis [22]. In a decision making context, missing data can have a detrimental impact on the overall decision, as many of the interventions of interest have to be excluded from the analyses. For this reason, when metaanalysing data, it is good practice to estimate missing standard errors for all models, based on additional measures of uncertainty [23].
This paper extends the methodology of Achana et al. [7] to incorporate the exchangeability between interventions of the same class [21], and further accounts for correlations between baseline and followup for missing measures of variability. This approach makes use of all available data by borrowing strength within classes of interventions as well as across outcomes. Borrowing information in this way has the potential to increase the precision in treatment effect estimates, as well as allowing for the comparison of all interventions across all outcome measures, to aid decision making. The remainder of this article is set out as follows: the first section describes the motivating example in overactive bladder (OAB). This is followed by a description of the multivariate network metaanalysis methods, model developments, and model estimation, before applying these methods to the motivating dataset and presenting the results. The article concludes with a discussion and final remarks.
Motivating dataset
The cardinal symptoms of OAB syndrome are urinary incontinence, increased daytime frequency (or voiding), and urgency. A gold standard tool for assessing the quantitative measure of OAB symptoms are patientreported bladder diaries whereby individuals with OAB record the frequency and severity of OAB symptoms on a daily basis [24]. These patientreported bladder diaries are often used as the primary outcome in trials assessing new interventions for OAB. However, many of the key symptoms are often underreported. For example, in the current literature urgency is defined as “the cardinal symptom" of OAB [25], but there are far fewer studies evaluating interventions for urgency (n=62) compared to incontinence (n=117) and voiding episodes (n=124). This poses several limitations for decision makers; most notably, it is difficult to estimate both clinical and costeffectiveness of interventions across the entire symptom syndrome.
In this motivating dataset, interest lies in evaluating the change from baseline for several symptomatic responses. Whilst symptoms at baseline and followup are commonly reported, change from baseline is often unreported; and thus, whilst it is possible to calculate the mean change from baseline, the variance is often missing. Table 1 illustrates the different scenarios for trial reporting in the included studies. In this example, there were no trials that reported followup data alone.
In total, 143 studies reported one of the three cardinal symptoms of OAB. Of these, only 51 (36%) reported treatment effects for all 3 outcomes. A total of 117, 124, and 62 trials evaluated 101, 108, and 58 different interventions for incontinence, voiding, and urgency episodes, respectively. Figures 1, 2 and 3 illustrate the networks of evidence for incontinence, voiding and urgency episodes, with corresponding treatment codes given in Additional file 1. Nearly one third of all studies did not report measures of variability in the mean treatment effects. In total, 44 (38%), 42 (34%), and 18 (29%) studies solely reported mean effects and gave no measure of uncertainty or variability for incontinence, voiding, and urgency episodes, respectively.
For analyses incorporating class effects, interventions were grouped according to expert clinical opinion (DGT). Figure 4 demonstrates the treatment classification of each of the individual interventions, where the central node represents the classes of treatments and the linked arms represent each of the individual interventions within those classes.
Methods
In this section, we begin by recalling the MVNMA model described by Achana et al. [7] (Model 1) and develop this framework further to incorporate the exchangeability between interventions of the same class of interventions (Model 2). We then describe a missing data framework to incorporate studies with missing measures of uncertainty or variability.
Model 1: multivariate network metaanalysis
Following the random effects MVNMA described by Achana et al. [7], let Y_{ij}=(y_{ij1},y_{ij2},y_{ij3}), be the observed vector of effects for intervention j of the i^{th} study (i=1,2,...,n_{s}) for each of the outcomes of interest (1 = urinary incontinence, 2 = voiding frequency, 3 = urgency episodes). Let Y_{ij} follow a multivariate normal (MVN) distribution such that:
where θ_{ij} is a vector of true treatment effects, S_{ij} is the treatmentspecific withinstudy covariance matrix assumed known, μ_{i} is a vector of true baseline effects in study i with baseline treatment b, and δ_{i,bk} is a vector of treatmentspecific effects in the k^{th} arm relative to the baseline treatment in arm 1 of study i. The elements of S_{ij} are expressed as:
where se_{ij(l)} denotes the observed standard errors of intervention j for outcome, l=1,2,3, and \(\rho _{w_{(qr)}}\) denotes the withinstudy correlations, for q=1,2 and r=2,3. For 2arm trials, the treatment effect differences of δ_{i,bk} are assumed to be normally distributed and correlated:
where
where \(d_{t_{ib}t_{ik}}(l)\) represents the pooled effect of the treatment in arm k relative to the treatment in arm b in study i for each outcome, l. Σ is the betweenstudy covariance matrix under an assumption of homogeneous betweenstudy variances [2] and correlations across treatment contrasts [7]. This notation can easily be extended to account for multiarm trials [26], which is further described in Additional file 2.
The relative effect of the studyspecific reference treatment in arm 1 (the control arm) relative to itself for outcome l, δ_{i,b1(l)}, is set to 0 and as such the set of conditional univariate distributions begin with the relative effect of the intervention in arm 2 relative to the control arm, δ_{i,b2(l)}. The studyspecific treatment comparisons, δ_{i,b(k−1)(l)}, are expressed in terms of the basic parameters of the pooled treatment effects for the intervention in arm k, \(d_{1,t_{ik}(l)}\) and the basic parameters of the pooled treatment effects for the intervention in the reference treatment arm, \(d_{1,t_{ib}(l)}\). For example, for trials that compare interventions A and B for outcome l, the pooled treatment effect, \(d_{t_{ib}t_{ik}(l)}= d_{AB(l)}\), is given by:
For d_{AB} where A>1 and B>1, the relative effects are expressed in terms of the basic parameters described in Eq. (5). The intervention effect of the reference treatment for the entire treatment network, j=1(l), usually a placebo or control intervention, is set to 0 for every outcome l, such that d_{11(l)}=0.
In order to predict treatment effect estimates for missing data for trials that did not report all outcomes of interest, following Achana et al. [7], we assumed that the pooled effects of intervention j=2,...,n_{t} relative to a reference treatment, d_{1j(l)}, for outcome, l, can be expressed as a sum of treatmentspecific effects, α_{j}, and outcomespecific effect, γ_{l}, such that:
The parameter ζ indicates the deviation of treatment effect profiles across outcomes. If ζ was close to zero, this would indicate a high degree of similarity between outcomes. In situations where ζ was particularly large, this would indicate a substantial deviation between treatment effect profiles across outcomes.
Noninformative prior distributions were specified for μ_{il}, α_{j}∼Normal(0,10^{3}), σ_{l}∼Uniform(0,2), and ζ∼Uniform(0,2), for l=1,2,3. The spherical parameterisation technique based on Cholesky decomposition [27] was used to express the betweenstudies variancecovariance matrix, Σ, and discussed in more detail in Additional file 3.
Model 2: multivariate network metaanalysis incorporating class effects
To utilise the additional similarity between the same interventions with different regimes, Model 1 was developed further to incorporate class effects. For study i evaluating intervention j belonging to class m, \(\boldsymbol {Y_{ij_{m}}} = (y_{ij_{m}1}\textrm {,} y_{ij_{m}2}\textrm {,} y_{ij_{m}3})\), denotes the observed vector of effects for each of the outcomes of interest, and expressed in the following way:
where μ_{i} and δ_{i,bk} have the same interpretation as in Model 1. The parameter \(\boldsymbol {\theta _{ij_{m}}}\), represents a vector of true treatment effects, and \(\boldsymbol {S_{ij_{m}}}\) represents the withinstudy covariance matrix, for intervention j belonging to a broader class of interventions m, for study i, such that:
As in Model 1, the elements of δ_{i,bk} for 2arm trials are assumed to be drawn from a multivariate normal distribution as described in Eq. 3 (and generalised to multiarm trials as described in Additional file 2). Here δ_{i,bk} is expressed in terms of the pooled effect, \(d_{t_{ib}t_{ik}}(l)\), of the treatment in arm k relative to the treatment in arm b in study i for each outcome, l. For example, when we incorporate class effects, the pooled treatment effect of treatment A belonging to class c, relative to the pooled treatment effect of treatment B belonging to class g, \(d_{A_{c}B_{g}(l)}\), is given by:
The intervention effect of the reference treatment for the entire treatment network, j=1_{1}(l), usually a placebo or control intervention, is set to 0 for every outcome l, such that \(d_{1_{1}1_{1}(l)}=0\). The basic parameters for relative treatment effects, \(d_{1_{1}j_{m}(l)}\), of intervention j within class m, relative to the reference treatment, were assumed to follow a normal distribution with mean equal to the treatmentspecific effect, \(\alpha _{j_{m}}\), plus the outcomespecific effect, γ_{l}, and variance, ζ^{2}:
where γ_{l} and ζ^{2} have the same interpretation as Model 1.
In order to incorporate the exchangeability between treatmentspecific effects, \(\alpha _{j_{m}}\), within class m, \(\alpha _{j_{m}}\) was assumed to follow a normal distribution with mean equal to the pooled effect estimate for the m^{th} class of interventions, β_{m}, with class specific betweenintervention variance, \(\nu _{m}^{2}\), [21] such that:
Noninformative prior distributions were specified for μ_{il}, σ_{l}∼Uniform(0,2) for l=1,2,3, ζ∼Uniform(0,2), β_{m}∼Normal(0,10^{3}) and ν_{m}∼Uniform(0,2). The spherical parameterisation technique based on Cholesky decomposition [27] was used to construct a prior distribution for the betweenstudy variancecovariance matrix, Σ, and discussed in more detail in Additional file 3.
In all multivariate analyses, prior distributions for the variance parameters (i.e. σ_{l}, ζ, and ν_{m}) were restricted to a Uniform(0,2) distribution on the standard deviation scale. For example, a value of 2 for the classspecific betweenintervention variance, ν_{m}, suggests that for a random pair of interventions, the difference in the mean change from baseline could be as large as 2.2 events on average. Uniform(0,2) prior distributions were considered for variance components in multivariate analyses in order to aid computation of the variancecovariance matrix. However, there is an argument to suggest that the variance parameters in a Bayesian model can be decomposed into the sum of several other variance components [28]. Thus, in hierarchical NMAs with additional variance components at the classlevel, more informative prior distributions for variance parameters may be reasonable.
Missing data framework
In situations where the observed standard errors of treatment effects, se_{ij}, were not reported, but baseline and followup variances were available (scenario 5 of Table 1), the correlation, ξ, between variance at baseline, \({sd_{\text {baseline}_{ij}}}^{2}\), and followup, \({sd_{\text {followup}_{ij}}}^{2}\), were used to impute estimates of the variance for change from baseline, \({sd_{\text {change}_{ij}}}^{2}\). This is calculated as:
Using external information from trials that report all variance terms (scenario 1 of Table 1), an informative prior distribution was placed on the correlation, ξ, using Fisher’s Ztransformation [23]. For trials that do not report the variability at followup (scenario 6 of Table 1), a linear predictor with baseline variance as a covariate was included such that:
where υ represents a constant term, and λ the regression coefficient. Trials that did not provide a measure of variability (scenario 7 of Table 1) were excluded from the analyses.
Model estimation
All models were estimated using Markov Chain Monte Carlo implemented in WinBUGS 1.4.3 [29]. Example WinBUGS code for the MVNMA incorporating class effects and the missing data framework is given in Additional file 4. Samples were collected for 150,000 MCMC iterations with the first 10,000 iterations discarded in the form of a ‘burnin.’ Convergence plots were assessed for a random sample of parameters of interest including treatment effect estimates and betweenstudy standard deviations [30]. BrooksGelmanRubin statistics, autocorrelation, history, and trace plots were used to detect nonconvergence for three individual MCMC chains with disparate starting values. Sensitivity analyses were undertaken to assess the impact of the choice of prior distributions for variance parameter. For both variance parameters, two alternative distributions were considered. For prior specification of the elements of V^{1/2}, (1) Gamma(0.001,0.001) on the precision scale, and (2) Halfnormal(0,1) on the standard deviation scale were considered. For prior specification of ζ, (1) Gamma(0.01,0.01) on the precision scale, and (2) Halfnormal(0,1) on the standard deviation scale were considered.
Results
Multivariate network metaanalysis
A multivariate approach allowed for the inclusion of 143 studies evaluating 115 interventions for OAB across all 3 outcomes. Previously, univariate analyses of urinary incontinence, voiding frequency and urgency episodes included 115, 119, and 60 studies evaluating 97, 100, and 54 interventions, respectively. Results for univariate analyses are given in Additional file 5 for completeness. Figure 5 illustrates the network of evidence for MVNMA evaluating incontinence, voiding, and urgency episodes. Incorporating a multivariate approach increased the ability to include more interventions for all three outcomes, compared to that of univariate analyses displayed in Figs. 1, 2 and 3. This was particularly apparent for urgency episodes, with which all interventions evaluated for incontinence and voiding could now be evaluated for urgency. Furthermore, for interventions that were evaluated for urgency in the original trials but disconnected from the univariate network of evidence (Fig. 3), e.g. reflexology (71), a MVNMA could borrow information between outcomes for which they were connected, in order to complete the network of evidence. However, eight treatments remained disconnected from the multivariate network of evidence: darifenacin 7.515mg once daily + BT (88), tolterodine + BT (94), darifenacin ER 7.515mg once daily (104), tolterodine (105), lidocaine gel 2x6ml (129), emepronium bromide immediate release 200mg three times a day (130), sacral nerve stimulation + tolterodine extended release 2mg once daily (136), tolterodine extended release 2mg once daily (137). These interventions were disconnected in all networks of evidence for univariate NMAs evaluating each of the outcomes of interest; and thus, borrowing information between outcomes had little impact on the inclusion of these particular interventions in multivariate analyses.
Table 2 displays the estimated posterior median reduction and 95% credible intervals for change from baseline in incontinence, voiding, and urgency episodes. Treatment effect estimates were first ranked according to their effectiveness in reducing incontinence episodes, then by voiding frequency, and finally by urgency episodes. Sacral nerve stimulation appeared to be the most effective intervention for the management of urinary incontinence, voiding and urgency with an estimated posterior median reduction of 7.43 (95%CrI: 9.59,4.73), 7.64 (95%CrI: 9.82,4.91) and 7.96 (95%CrI: 10.19,5.29) episodes, relative to placebo, respectively.
Adopting a multivariate approach changed the overall clinical decision for urgency episodes. In univariate analyses, electrostimulation with vaginal oestrogen cream 1.25mg/day appeared to be the most effective intervention for reducing urgency (see Additional file 5). Accounting for the correlation between outcomes using MVNMA, sacral nerve stimulation appeared to be the most promising intervention overall.
Multivariate network metaanalysis incorporating class effects
Table 3 displays the estimated posterior median reduction and 95% credible intervals for change from baseline in incontinence, voiding, and urgency episodes obtained from multivariate network metaanalysis incorporating class effects. Sacral nerve stimulation appeared to be the most effective intervention for reducing incontinence, voiding and urgency episodes with an estimated posterior median reduction of 8 (95%CrI: 9.54, 6.27), 8.19 (95%CrI: 9.69,6.49) and 8.49(95%CrI: 10.11,6.78) episodes, relative to placebo, respectively.
Incorporating class effects in MVNMAs broadly increased the precision in treatment effect estimates (Table 3) compared to MVNMAs without incorporating class effects (Table 2). For example, for sacral nerve stimulation incorporating a hierarchical structure further increased precision in the treatment effect estimates by approximately 33%, 35%, and 28% for urinary incontinence, voiding and urgency episodes, respectively.
Borrowing information between outcomes generally increased the precision in treatment effect estimates compared to univariate analyses (Figs. 6, 7, and 8). This finding was particularly apparent for sacral nerve stimulation, with which a multivariate hierarchical approach incorporating class effects increased precision in the estimated treatment effects by approximately 60% and 160% for urinary incontinence (Fig. 6) and voiding frequency (Fig. 7) respectively, compared to treatment effect estimates obtained from univariate NMAs (see additional file 5). For all interventions included in each of the univariate analyses, point estimates obtained from multivariate analyses were comparable with point estimates obtained from univariate analyses for all three outcomes (Figs. 6, 7, and 8).
Treatment profiles
Figure 9 illustrates the treatment profiles for all of the cardinal symptoms of OAB. Use of MVNMA allowed for the comparison of all interventions across all outcome measures, and thus completed the intervention profiles for efficacy outcomes. Generally, the treatment rankings were broadly similar to those obtained from univariate analyses (see Additional file 6). Sacral nerve stimulation appeared to be the most effective intervention across all three outcomes. Using multivariate analyses, estriol 1mg intravesically appeared to be amongst the top ten interventions (Fig. 9). Previously, estriol 1mg intravesically was only evaluated for voiding, and ranked amongst the top interventions (see Additional file 5). Borrowing information between outcomes allowed for estimation of treatment effects for both urinary incontinence and urgency episodes, and consequently estriol 1mg intravesically was ranked in tenth place across all outcome measures. However, the remaining top ten interventions remained unchanged.
Convergence diagnostics
Convergence diagnostic plots for a small selection of basic parameters, d_{1j}, the pooled effect estimate of intervention j relative to placebo, are given in Additional file 7. Both the between and within chain variability appeared to reach stability, and the ratio appeared to converge to 1. All plots appeared to show reducing autocorrelation with increased lag. History and trace plots took the appearance of random noise, with no obvious difference between multiple MCMC chains with very different starting values. Overall all diagnostic plots appeared to suggest that there was no evidence of nonconvergence and both d_{1j} and ζ were estimated from samples which appeared to have a reasonable degree of mixing of the chains.
Sensitivity analysis
Sensitivity analyses assessing different choices of prior distributions for the betweenstudy standard deviations, the elements of the matrix V^{1/2}, and the variability of treatment effect profiles across outcomes, ζ, are given in Additional file 8. Different choices of prior distributions for variance parameters had very little impact in the treatment effect estimates for all three outcomes, and therefore the overall clinical decisions regarding interventions effectiveness remained the same. Thus suggesting that treatment effect estimates were robust to choice of prior distribution.
Discussion
This paper extends the MVNMA framework described by Achana et al. [7], to incorporate the exchangeability of interventions belonging to the same class of interventions. This approach makes use of the correlations between multiple outcomes in order to predict and impute treatment effect estimates for missing data, and therefore, has the potential to limit the impact of outcome reporting bias, as well as, borrows strength between interventions belonging to the same class of treatment, which has the potential to increase precision in the treatment effect estimates.
In the OAB example, the datasets used in univariate analyses for each outcome independently, included 115, 119 and 60 studies evaluating 97, 100, and 54 interventions for incontinence, voiding, and urgency episodes, respectively. Despite urgency being documented as “the cardinal symptom" of OAB [25], it was sufficiently underreported in the original trials, and consequently fewer interventions were able to be evaluated in univariate analyses. Adopting a multivariate approach, we were able to borrow information across outcomes and consequently included 143 studies evaluating all 115 interventions for the management of incontinence, voiding and urgency. Using this methodology completed the treatment profiles for all prominent symptoms of OAB, which in turn allows decision makers to make inferences regarding the potential treatment benefit for all interventions, across all salient outcomes. In doing so, sacral nerve stimulation appeared to be the most effective intervention for reducing incontinence, voiding and urgency episodes with an estimated posterior median reduction of 8 (95%CrI: 9.54, 6.27), 8.19 (95%CrI: 9.69,6.49) and 8.49(95%CrI: 10.11,6.78) episodes, relative to placebo, respectively. However, due to the limited number of studies and participants in which sacral nerve stimulation was evaluated, these results should be interpreted with caution.
Sacral nerve stimulation was not evaluated for urgency in the original trials and thus could not be assessed in univariate analyses. Using a multivariate approach therefore changed the overall clinical decision for the management of urgency, where electrostimulation in combination with vaginal oestrogen cream 1.25mg/day was found to be the most effective intervention from univariate analyses. Similarly, from univariate analyses, estriol 1mg intravesically appeared to be a promising intervention for voiding, but there was no data for all other outcomes. Using a multivariate approach, estriol 1mg intravesically ranked in the top 10 interventions for all three cardinal symptoms of OAB.
A key assumption of MVNMA is that the data is multivariately normally distributed. In situations with which we wish to synthesise multiple binomial outcomes, or a mixture of binomial and continuous outcomes, it is common to assume a normal approximation of binomial data on the log odds ratio (or log odds) scale [9]. This assumption has been shown to hold well when the proportions of events are close to 0.5, but poorly when the proportions of events are close to 0 or 1 [31]. In situations such as these more sophisticated methods, such as copula models, are required to appropriately capture the withinstudy variability using binomial data directly [32].
In this paper, we describe a missing data framework to incorporate estimates of uncertainty for trials that did not report any measure of variability in the mean treatment effects. It could be argued that trials reporting incomplete outcome data are at a higher risk of bias [33], and the impact of including such trials in metaanalyses should be thoroughly explored in sensitivity analyses.
A further assumption of MVNMA is that all data are assumed to be missing at random [17], which in the case of the OAB example may not be plausible [34]. It is likely that there is an element of selective reporting in the original trials, such that outcomes with which interventions perform particularly well are more likely to be reported [35]. In this situation, treatment effects may be exaggerated [17], though recent simulation studies suggest that a multivariate metaanalysis can lead to a more appropriate estimate of treatment effect in the presence of outcome reporting bias [5, 6] under a variety of missing data scenarios, including missing at random and missing not at random [6]. In order to obtain more accurate estimates of treatment effects for decision making, data are needed for all interventions, across all outcome measures. Following the Core Outcome Measures in Effectiveness Trials (COMET) initiative [36], there is a clear need to define a core outcome set (COS) for the future reporting of OAB trials [34].
Previous simulation studies in multivariate metaanalysis have shown that in situations when there are a large proportion of missing data, a multivariate approach results in increased borrowing of information and increased precision in treatment effect estimates [6, 16]. Thus, multivariate methods are of most use in scenarios with a larger proportion of missing data, and less use in scenarios with complete data. In a Bayesian framework, with noninformative prior distributions specified for hyperparameters (as described in this paper), it has previously been shown that the total number of missing values must not exceed ((J−1)×N)−((J−1)+N), for J interventions and N outcomes. In situations with which the total number of missing values exceeds this number, the hyperparameters will become unidentifiable and informative prior distributions will be required to improve model convergence [7].
To ameliorate the impact of outcome reporting bias, the correlation between outcomes were used to obtain a predictive value for missing data. In order to achieve this, an assumption of constant relative effectiveness across outcomes was assumed for the basic parameters of the pooled treatment effect estimates, d_{(1j)l} and \(d_{(1_{1}j_{m})l}\), as described in Eqs. (6) and (10) for MVNMA and MVNMA incorporating class effects, respectively. If interest lies in the difference between active interventions, this may be a strong assumption as the outcomespecific effect, γ_{l}, will cancel. For example, the relative treatment effects of intervention A relative to intervention B obtained from MVNMA are expressed in terms of the basic parameters such that d_{AB}=(d_{(1B)l}−d_{(1A)l})∼Normal(α_{B}−α_{A},2ζ^{2}). Thus, in these situations, alternative methods should be explored. An assumption of constant relative effectiveness across outcomes is of less importance if interest lies in the relative rankings of the interventions, as these are calculated based on the basic parameters, d_{(1j)l}.
As with all random effects models, in situations when the deviation of treatment effect profiles across outcomes, ζ, is large, it may not be sensible to combine data across all outcomes. In this scenario, we encourage the use of sensitivity analyses to; i) assess the impact of potentially outlying observations in univariate and multivariate analyses, and ii) assess the difference in mean treatment effects, and variability in mean treatment effects across outcomes, using a series of univariate analyses. Reducing the model to a bivariate or univariate analysis in sensitivity analyses will allow investigation of the impact of outcomes on the magnitude of ζ, as well as the influence of missing outcome data on the robustness of the overall conclusions.
In this example, the outcomespecific effect, γ_{l}, was assumed to be constant across all interventions. Further work could extend this model to incorporate the exchangeability of outcomespecific effects within classes of interventions, such that \(\gamma _{l_{m}}\sim \textit {Normal}(\kappa _{m},\Psi _{m}^{2})\), where κ_{m} denotes the pooled outcome effect for the m^{th} class of interventions, and \(\Psi _{m}^{2}\) denotes the classspecific between intervention variance. However this approach is likely to substantially increase the number of parameters to be estimated in the model and could lead to computational difficulties. Furthermore, if there is evidence to suggest that there is a disparity in outcomespecific effects between classes, the assumption of homogeneous betweenstudy correlations is also unlikely to be satisfied and alternative parameterisations of the betweenstudy covariance matrix would need to be considered.
One limitation of implementing MVNMA in WinBUGS, is the difficultly in calculating deviance statistics for the assessment of model fit and comparison. Residual deviance could be calculated by monitoring the estimated true treatment effects, θ_{ij}, for each study i=1,...,n_{s} and intervention j=1,...,n_{t}, and calculating the difference of these true treatment effects relative to the observed treatment effects, Y_{ij}, using the following equation: total residual deviance = \(\sum _{i=1,j=1}^{i=n_{s},j=n_{t}}(\boldsymbol {Y_{ij}}\boldsymbol {\theta _{ij}})^{2}\boldsymbol {S_{ij}}\), where S_{ij} denotes the treatmentspecific withinstudy covariance matrix. In this paper, the withinstudy correlations were incorporated at the treatmentarm level in order to appropriately account for multiarm trials [7], thus the withinstudy model was parametrised at the armlevel whereas the betweenstudy model, with which θ_{ij} is estimated, was parametrised at the studylevel. This makes calculation of residual deviances more difficult, especially in the presence of large amounts of missing data. In the OAB example, there were a large number of observed datapoints with intermittently missing data, therefore, such an approach would be both computational, and timeintensive. Further work is needed to explore reparameterisation of MVNMA models, and alternative methods in order to adequately assess model fit and comparison.
In this example, use of a MVNMA incorporating class effects was illustrated using the three cardinal symptoms of OAB (incontinence, voiding, and urgency). However, interest lies in both efficacy and safety outcomes. Incorporating additional outcomes results in an exponential increase in the number of parameters to be estimated in the model. This is particularly true for estimation of the parameters involved in both the withinstudy, and betweenstudy covariance matrix. This substantial increase in the number of parameters can often result in computational difficulties for complex multivariate models such MVNMAs. Furthermore, estimating the withinstudy correlation structures can be particularly difficult for mixed [9] and binary outcomes [18]. This is because an analytic solution is not possible [18]. A further limitation of using MVNMA for imputing missing data with mixed outcomes, is the assumption that intervention effects are exchangeable across outcomes. This assumption may not be reasonable if the outcomes differ in an important way, e.g, if the outcomes were measured on different scales. For example, binary outcomes on a logodds scale and continuous outcomes on a mean difference scale will differ in terms of the uncertainty with which they were estimated.
In this analysis, a homogeneity of correlations assumption was used to simplify the number of parameters in the model and to aid computation. In situations with which there are fewer interventions, and more information for each treatment comparison, it may be desirable to incorporate treatmentspecific betweenstudy correlations. It may also be desirable to incorporate treatmentspecific withinstudy correlations, however, data for every pair of treatment comparisons may be difficult to obtain, and thus withinstudy correlations may be difficult to estimate. In this example, the withinstudy correlations were estimated from individual patient data obtained from the RELAX trial [37]. Uncertainty in estimating the withinstudy correlations needs to be further accounted for. For continuous outcomes that follow a multivariate normal distribution, it would be possible to obtain estimates of the withinstudy correlations directly from the covariance matrix [9]. Estimates of the withinstudy correlations, together with their uncertainty, could be incorporated in to the MVNMA model by applying prior distributions to the withinstudy correlation parameters, pw_{xy}, using a bootstrapping method [9].
Conclusions
Accounting for the correlation between outcomes, MVNMAs incorporating class effects allowed for the evaluation of all interventions across all outcome measures. In the OAB example, including this additional information changed the overall clinical conclusions. Estriol 1mg intravesically was ranked in the top 10 interventions for the management of all cardinal symptoms of OAB, and sacral nerve stimulation was found to be the most effective intervention for reducing urgency. Borrowing information across outcomes using MVNMAs generally increased the precision in the treatment effect estimates. This precision was further increased by incorporating a hierarchical structure where similarities between interventions that belong to the same class of interventions were also accounted for. Overall, MVNMAs can provide a flexible framework to model a mixture of outcomes and are of most use in situations when there is a large proportion of missing data, ameliorating the impact of outcome reporting bias. MVNMAs incorporating class effects, applied judiciously, have proven to be a useful methodology. These methods have the potential to aid health technology assessment decision making by increasing precision in treatment effect estimates, as well as allowing for the complete evaluation of all outcomes and interventions of interest for multimorbid, or syndromic conditions.
Availability of data and materials
The dataset analysed during this study is available from the corresponding author on reasonable request.
Abbreviations
 CrI:

Credible interval
 HTA:

Health technology assessment
 MVNMA:

Multivariate network metaanalysis
 NMA:

Network metaanalysis
 OAB:

Overactive bladder
References
 1
Sterne JA, White IR, Carlin JB, Spratt M, Royston P, Kenward MG, Wood AM, Carpenter JR. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls. Bmj. 2009; 338:2393.
 2
Lumley T. Network metaanalysis for indirect treatment comparisons. Stat Med. 2002; 21(16):2313–24.
 3
Lu G, Ades A. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med. 2004; 23(20):3105–24.
 4
Caldwell DM, Ades A, Higgins J. Simultaneous comparison of multiple treatments: combining direct and indirect evidence. Br Med J. 2005; 7521:897.
 5
Kirkham JJ, Riley RD, Williamson PR. A multivariate metaanalysis approach for reducing the impact of outcome reporting bias in systematic reviews. Stat Med. 2012; 31(20):2179–95.
 6
Hwang H, DeSantis SM. Multivariate network metaanalysis to mitigate the effects of outcome reporting bias. Stat Med. 2018; 37(22):3254–66.
 7
Achana FA, Cooper NJ, Bujkiewicz S, Hubbard SJ, Kendrick D, Jones DR, Sutton AJ. Network metaanalysis of multiple outcome measures accounting for borrowing of information across outcomes. BMC Med Res Methodol. 2014; 14(1):1.
 8
Riley RD. Multivariate metaanalysis: the effect of ignoring withinstudy correlation. J R Stat Soc Ser A (Stat Soc). 2009; 172(4):789–811.
 9
Bujkiewicz S, Thompson JR, Sutton AJ, Cooper NJ, Harrison MJ, Symmons DP, Abrams KR. Multivariate metaanalysis of mixed outcomes: a bayesian approach. Stat Med. 2013; 32(22):3926–43.
 10
Efthimiou O, Mavridis D, Cipriani A, Leucht S, Bagos P, Salanti G. An approach for modelling multiple correlated outcomes in a network of interventions using odds ratios. Stat Med. 2014; 33(13):2275–87.
 11
Berkey C, Hoaglin D, AntczakBouckoms A, Mosteller F, Colditz G. Metaanalysis of multiple outcomes by regression with random effects. Stat Med. 1998; 17(22):2537–50.
 12
Daniels MJ, Hughes MD. Metaanalysis for the evaluation of potential surrogate markers. Stat Med. 1997; 16(17):1965–82.
 13
Van Houwelingen HC, Arends LR, Stijnen T. Advanced methods in metaanalysis: multivariate approach and metaregression. Stat Med. 2002; 21(4):589–624.
 14
Nam IS, Mengersen K, Garthwaite P. Multivariate metaanalysis. Stat Med. 2003; 22(14):2309–33.
 15
Arends LR, Vokó Z, Stijnen T. Combining multiple outcome measures in a metaanalysis: an application. Stat Med. 2003; 22(8):1335–53.
 16
Riley RD, Abrams K, Lambert P, Sutton A, Thompson J. An evaluation of bivariate randomeffects metaanalysis for the joint synthesis of two correlated outcomes. Stat Med. 2007; 26(1):78–97.
 17
Jackson D, Riley R, White IR. Multivariate metaanalysis: Potential and promise. Stat Med. 2011; 30(20):2481–98.
 18
Wei Y, Higgins J. Bayesian multivariate metaanalysis with multiple outcomes. Stat Med. 2013; 32(17):2911–34.
 19
Schwarzer G, Carpenter JR, Rücker G. Multivariate metaanalysis. In: MetaAnalysis with R. Springer: 2015. p. 165–85. https://doi.org/10.1007/9783319214160_7.
 20
Ades A, Mavranezouli I, Dias S, Welton NJ, Whittington C, Kendall T. Network metaanalysis with competing risk outcomes. Value Health. 2010; 13(8):976–83.
 21
Owen RK, Tincello DG, Keith RA. Network metaanalysis: development of a threelevel hierarchical modeling approach incorporating doserelated constraints. Value Health. 2015; 18(1):116–26.
 22
Chowdhry AK, Dworkin RH, McDermott MP. Metaanalysis with missing studylevel sample variance data. Stat Med. 2016. https://doi.org/10.1002/sim.6908.
 23
Abrams KR, Gillies CL, Lambert PC. Metaanalysis of heterogeneously reported trials assessing change from baseline. Stat Med. 2005; 24(24):3823–44.
 24
Homma Y, Kakizaki H, Yamaguchi O, Yamanishi T, Nishizawa O, Yokoyama O, Takeda M, Seki N, Yoshida M. Assessment of overactive bladder symptoms: comparison of 3day bladder diary and the overactive bladder symptoms score. Urology. 2011; 77(1):60–4.
 25
Cardozo L, Chapple C, Wein A. Urgency as the cardinal symptom of overactive bladder: a critical analysis. World J Urol. 2009; 27(6):701–3.
 26
Dias S, Welton NJ, Sutton AJ, Ades A. Nice dsu technical support document 2: a generalised linear modelling framework for pairwise and network metaanalysis of randomised controlled trials. London, UK: National Institute for Health and Clinical Excellence; 2011.
 27
Watkins DS. Fundamentals of matrix computations. 1991.
 28
Geweke J, Amisano G. Analysis of variance for bayesian inference. Econ Rev. 2014; 33(14):270–88.
 29
Spiegelhalter D, Thomas A, Best N, Lunn D. WinBUGS User Manual, Version 1.4 MRC Biostatistics Unit. Cambridge; 2003.
 30
Lunn D, Jackson C, Best N, Thomas A, Spiegelhalter D. The BUGS Book: A Practical Introduction to Bayesian Analysis: CRC press; 2012. https://doi.org/10.1201/b13613.
 31
Hamza TH, van Houwelingen HC, Stijnen T. The binomial distribution of metaanalysis was preferred to model withinstudy variability. J Clin Epidemiol. 2008; 61(1):41–51.
 32
Papanikos T, Thompson JR, Abrams KR, Bujkiewicz S. A novel approach to bivariate metaanalysis of binary outcomes and its application in the context of surrogate endpoints. 2020. available from https://arxiv.org/abs/2004.02007.
 33
Higgins JP, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, Savović J, Schulz KF, Weeks L, Sterne JA. The cochrane collaboration’s tool for assessing risk of bias in randomised trials. Bmj. 2011; 343:5928.
 34
Globerman D, Robert M. Heterogeneity in postintervention prolapse and urinary outcome reporting: a oneyear review of the international urogynecology journal. Int Urogynecol J. 2015; 26(9):1373–8.
 35
Chan AW, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG. Empirical evidence for selective reporting of outcomes in randomized trials: comparison of protocols to published articles. Jama. 2004; 291(20):2457–65.
 36
Williamson PR, Altman DG, Blazeby JM, Clarke M, Gargon E. The comet (core outcome measures in effectiveness trials) initiative. Trials. 2011; 12(Suppl 1):70.
 37
Tincello DG, Kenyon S, Abrams KR, Mayne C, ToozsHobson P, Taylor D, Slack M. Botulinum toxin a versus placebo for refractory detrusor overactivity in women: a randomised blinded placebocontrolled trial of 240 women (the relax study). Eur Urol. 2012; 62(3):507–14.
 38
Mavridis D, Salanti G. A practical introduction to multivariate metaanalysis. Stat Methods Med Res. 2012:0962280211432219. https://doi.org/10.1177/0962280211432219.
 39
Lu G, Ades A. Modeling betweentrial variance structure in mixed treatment comparisons. Biostatistics. 2009; 10(4):792–805.
 40
Barnard J, McCulloch R, Meng XL. Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage. Stat Sin. 2000:1281–311.
Acknowledgements
Not applicable
Funding
This article presents independent research funded by the National Institute for Health Research (NIHR). The views expressed are those of the author(s) and not necessarily those of the National Health Service, the NIHR, or the Department of Health. Rhiannon Owen was funded by an NIHR Doctoral Research Fellowship (grant no. NIDRF201306062). Keith Abrams is partially supported by Health Data Research (HDR) UK, the UK National Institute for Health Research (NIHR) Applied Research Collaboration East Midlands (ARC EM), and as a NIHR Senior Investigator Emeritus (NFSI051210159). Sylwia Bujkiewicz was funded by the Medical Research Council (grant no. MR/L009854/1). The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Author information
Affiliations
Contributions
RKO, SB, DGT and KRA conceived the idea for the study. RKO developed and implemented all statistical analyses and drafted the manuscript. SB, DGT and KRA contributed to interim drafts of the manuscript and all authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
RKO has served as a paid consultant, providing methodological advice to Astellas, CreativCeutical, Cogentia, Daiichi Sankyo, NICE, Roche, and Vifor. She is a member of the NICE Technology Appraisal Committee. SB has served as a paid consultant, providing methodological advice, to NICE and Roche, and has received research funding from European Federation of Pharmaceutical Industries & Associations (EFPIA) and Johnson & Johnson. DGT has served as a paid consultant with Femeda, SynerMed, Cambridge Medical Robotics, and Allergan in the last 10 years. KRA has served as a paid consultant, providing methodological advice, to; Abbvie, Amaris, Allergan, Astellas, AstraZeneca, Boehringer Ingelheim, BristolMeyers Squibb, CreativCeutical, GSK, ICON/Oxford Outcomes, Ipsen, Janssen, Eli Lilly, Merck, NICE, Novartis, NovoNordisk, Pfizer, PRMA, Roche and Takeda, and has received research funding from Association of the British Pharmaceutical Industry (ABPI), European Federation of Pharmaceutical Industries & Associations (EFPIA), Pfizer and Sanofi and Swiss Precision Diagnostics. He is a Partner and Director of Visible Analytics Limited, a healthcare consultancy company.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1
Treatment codes.
Additional file 2
Multiarm correction.
Additional file 3
Estimating betweenstudy and withinstudy correlations.
Additional file 4
WinBUGS code.
Additional file 5
Univariate network metaanalyses results for change from baseline in incontinence, voiding and urgency episodes.
Additional file 6
Treatment profiles obtained from univariate network metaanalyses for change from baseline in incontinence, voiding and urgency episodes.
Additional file 7
Convergence diagnostics.
Additional file 8
Sensitivity analyses.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Owen, R.K., Bujkiewicz, S., Tincello, D.G. et al. Multivariate network metaanalysis incorporating class effects. BMC Med Res Methodol 20, 184 (2020). https://doi.org/10.1186/s12874020010258
Received:
Accepted:
Published:
Keywords
 Multivariate
 Network metaanalysis
 Mixed treatment comparisons
 Metaanalysis
 Class effect