 Research article
 Open Access
Network meta-analysis combining individual patient and aggregate data from a mixture of study designs with an application to pulmonary arterial hypertension
BMC Medical Research Methodology volume 15, Article number: 34 (2015)
Abstract
Background
Network meta-analysis (NMA) is a methodology for indirectly comparing, and strengthening direct comparisons of, two or more treatments for the management of disease by combining evidence from multiple studies. Treatment comparisons are sometimes not possible because evidence networks restricted to randomized controlled trials (RCTs) may be disconnected. We propose a Bayesian NMA model that allows the inclusion of single-arm, before-and-after, observational studies to complete these disconnected networks. We illustrate the method with an indirect comparison of treatments for pulmonary arterial hypertension (PAH).
Methods
Our method uses a random effects model for placebo improvements to include single-arm observational studies in a general NMA. Building on recent research for binary outcomes, we develop a covariate-adjusted continuous-outcome NMA model that combines individual patient data (IPD) and aggregate data from two-arm RCTs with the single-arm observational studies. We apply this model to a complex comparison of therapies for PAH combining IPD from a phase III RCT of imatinib as add-on therapy for PAH and aggregate data from RCTs and single-arm observational studies, both identified by a systematic review.
Results
Through the inclusion of observational studies, our method allowed the comparison of imatinib as add-on therapy for PAH with other treatments. This comparison had not been previously possible due to the limited RCT evidence available. However, the credible intervals of our posterior estimates were wide, so the overall results were inconclusive. The comparison should be treated as exploratory and should not be used to guide clinical practice.
Conclusions
Our method for the inclusion of single-arm observational studies allows the performance of indirect comparisons that had previously not been possible due to incomplete networks composed solely of available RCTs. We also built on many recent innovations to enable researchers to use both aggregate data and IPD. This method could be used in similar situations where treatment comparisons have not been possible due to restrictions to RCT evidence and where a mixture of aggregate data and IPD is available.
Background
Decision making bodies for national health care providers, such as the National Institute for Health and Care Excellence (NICE) for the NHS in England and Wales or the Pharmaceutical Benefits Advisory Committee (PBAC) in Australia, have a need to consider all available treatments when making recommendations for clinical practice. There is rarely a single definitive study comparing these treatments and it is often necessary to synthesise the best available evidence to come to a decision [1].
Network meta-analysis (NMA) for indirect and mixed treatment comparisons of multiple treatments is a generalization of standard meta-analysis, which is used to combine the results of multiple studies, to the comparison of two or more treatments. This has become a well-established methodology for evidence synthesis [2,3] and is routinely used and recommended by NICE [4,5]. The gold standard of evidence to be included in an NMA is the randomized controlled trial (RCT), which includes a control arm and whose population is randomized to reduce bias and improve precision. The results are usually available from the literature only as aggregate data. Individual patient data (IPD), when available, can be used to understand the relationship between covariates and outcomes [6,7]. Methods for the inclusion of IPD in pairwise meta-analysis have been developed by Sutton et al. [8] and Riley et al. [9,10] and these were extended to the network meta-analysis of binary outcomes by Saramago et al. [7] and Donegan et al. [6]. This model can easily be adapted to continuous outcomes and provides a covariate-adjusted NMA model combining IPD and aggregate data.
One of the requirements for performing an NMA is a connected network [4], which can be challenging when not enough RCTs are available, as illustrated in Figure 1 for the case of a NICE technology assessment of follicular lymphoma [11]. This is often a problem in new indications for small populations or orphan diseases [12]. However, a decision on the most appropriate treatment is still needed, and including non-randomized studies to complete the network and conduct the comparison is a potential solution [13]. A commonly available type of non-randomized study is the single-arm observational study, or before-and-after study [14], in which outcomes in a group of patients are investigated before and after an intervention.
Several methods have been proposed to incorporate such observational studies [15]. One approach is the three-level hierarchical model, which allows the incorporation of evidence from many different study designs [16,17]. An example of such a model consists of an overall effect for each treatment j, which can be labelled d_{ j }. Treatment effects for each different type of study, such as an RCT effect φ_{j1}, a before-and-after study effect φ_{j2}, and a case-control study effect φ_{j3}, could then be normally distributed around this overall effect. At the bottom level of the hierarchy are the individual study effects δ_{ jki } for each treatment j, study type k, and study i, which could be normally distributed around the study type treatment effects φ_{ jk }. This approach has the advantage of keeping the inference from each type of study separate but is not applicable when the number of studies per study type per treatment is small.
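The three levels described above can be written compactly as follows. This is only a sketch of the structure the paragraph describes: the variance symbols \( {\tau}_{\phi}^2 \) and \( {\tau}_{\delta}^2 \) are notation assumed here for illustration and do not appear in the source.

```latex
% Level 1: overall effect d_j for treatment j
% Level 2: study-type effects (k = 1 RCT, k = 2 before-and-after, k = 3 case-control)
\phi_{jk} \sim \mathrm{Normal}\left(d_j,\ \tau_{\phi}^{2}\right)
% Level 3: individual study effects for treatment j, study type k, study i
\delta_{jki} \sim \mathrm{Normal}\left(\phi_{jk},\ \tau_{\delta}^{2}\right)
```

With few studies per study type and treatment, the data carry little information about the two variance terms, which is the limitation noted in the text.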
An alternative approach to including observational studies, and thus connecting the network, is that of propensity scores, which are the probabilities that a patient would be given a particular treatment on the basis of their background characteristics [18-20]. These probabilities are often estimated using logistic regression. However, a different propensity score model is required for each treatment, and a great many studies are therefore required for each treatment. This is a particular drawback if IPD is not available for most of the treatments. Another disadvantage is the difficulty of incorporating propensity scores into the existing covariate-adjusted NMA models.
A final alternative for including observational studies in disconnected networks is the method of constructing empirical priors informed by these observational studies [15,21]. These empirical priors inform parameter estimation via:
\[ P\left(\theta \mid \mathrm{RCTs},\mathrm{Obs}\right) \propto L\left(\theta \mid \mathrm{RCTs}\right) \times {\left[L\left(\theta \mid \mathrm{Obs}\right)\right]}^{\alpha} \times P\left(\theta \right) \]
where L(θ | RCTs) is the likelihood on the basis of the RCT evidence, L(θ | Obs) is the likelihood on the basis of the observational evidence, P(θ) is the prior, and α is a parameter representing the strength given to the observational evidence. If α = 1, for example, the observational evidence would be given the same weight as the RCT evidence. This approach shares the advantage of the hierarchical method in that it explicitly separates the RCT and observational evidence, but it also shares the disadvantage of the propensity scores method in that it is difficult to merge with existing NMA models.
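The role of α can be illustrated with a minimal sketch in the simplest conjugate setting: a single parameter θ with Normal likelihoods of known variance. Raising a Normal likelihood to the power α multiplies its precision by α, so α = 0 discards the observational evidence and α = 1 weights it like the RCT evidence. The function name and all numerical inputs are hypothetical and are not taken from the paper.

```python
def power_prior_posterior(prior_mean, prior_var, rct_mean, rct_var,
                          obs_mean, obs_var, alpha):
    """Posterior mean and variance of theta when the observational
    likelihood is raised to the power alpha (Normal, known variances)."""
    if alpha < 0.0:
        raise ValueError("alpha must be non-negative")
    # A Normal likelihood raised to the power alpha has its precision scaled by alpha.
    precisions = [1.0 / prior_var, 1.0 / rct_var, alpha / obs_var]
    means = [prior_mean, rct_mean, obs_mean]
    post_prec = sum(precisions)
    post_mean = sum(p * m for p, m in zip(precisions, means)) / post_prec
    return post_mean, 1.0 / post_prec


# Hypothetical inputs: vague prior, one RCT estimate, one observational estimate.
for a in (0.0, 0.5, 1.0):
    print(a, power_prior_posterior(0.0, 4000.0, 10.0, 100.0, 40.0, 400.0, a))
```

As α increases, the posterior mean moves towards the observational estimate and the posterior variance shrinks.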
The method we chose to build upon is the construction of control arms for before-and-after studies by matching their baseline characteristics to those of control arms in included RCTs [20,22]. This analysis of covariance method uses regression models to estimate the effect of treatments not included in the study. We adapted this in a natural fashion to covariate-adjusted NMA models through an assumption of exchangeability (random effects) on the placebo effects of study arms. A similar approach was originally applied to meta-analysis [23] and has recently been proposed for the construction of baseline natural history models in NMA [24]. However, recent work has been critical of placing random effects on the trial-level baseline improvements, namely the placebo effect [24,25], as it interferes with the randomization of the RCTs and learns information across trials. Despite these concerns, cases such as the application we will discuss necessitate this approach, as it would otherwise not be possible to compare the treatments of interest due to the disconnectedness of the evidence network.
Illustrative example: mixed treatment comparison of combination therapies for pulmonary arterial hypertension
We will illustrate our method for the inclusion of before-and-after studies in a mixed treatment comparison of therapies for pulmonary arterial hypertension (PAH). PAH is a rare disease characterised by progressive elevation of pulmonary vascular resistance leading to right heart failure and death [26]. Current treatments include endothelin receptor antagonists (ERA), phosphodiesterase-5 inhibitors (PDE5i), and prostacyclin analogues (Pr) [27]. These drugs are often used in combination to try to improve outcomes [28,29]. The anticancer therapy imatinib is an oral therapy which has also recently been studied in PAH and its use is of interest to clinicians. No systematic comparison of available monotherapies and combination therapies for PAH has been conducted and, in particular, imatinib as add-on to other combination therapies has not been investigated. Imatinib was being evaluated as an alternative to prostacyclins as additional therapy for patients on a combination of ERA and PDE5i. The comparison of imatinib with prostacyclins for this patient group was not previously possible on the basis of direct evidence or through indirect NMA comparisons restricted to RCTs and we thus took it as our primary objective for treatment comparison. However, our comparison should be viewed as illustrative and should not be used to guide clinical practice, as the evidence is indirect and the analysis relies on a number of model assumptions that were necessary to facilitate the comparison.
Our primary evidence base for the NMA was the IPD from the IMPRES trial [30]. This was a randomized placebo-controlled Phase III trial to investigate the efficacy and safety of imatinib as an add-on to combination therapy for the treatment of PAH. The study included patients with severe PAH who were receiving two or more PAH-specific treatments. Patients were initially on one of four combination treatments, namely ERA + PDE5i, ERA + Pr, PDE5i + Pr, or ERA + PDE5i + Pr, were randomized within these background treatment groups to either imatinib or placebo, and were followed up for at least 24 weeks. Patient group characteristics are reported in Table 1, where heterogeneity in baseline characteristics between randomized groups is exhibited. The high dropout rates in this trial reflect the severity of the disease and the side effects of the treatments.
The continuous outcome of short term change in 6-minute walk distance (6MWD) from baseline, in meters, is used in licensing decisions by agencies such as the Food and Drug Administration [31] and as a result is the primary outcome in nearly all Phase III trials in PAH. Although adjusting final 6MWD for baseline 6MWD is the recommended approach when analysing trial results [32], this was not possible as we had only aggregate data from most of the studies and many only reported the change outcome. We therefore chose change in 6MWD as our efficacy measure. Short term was defined as 12 weeks to 1 year, as clinical opinion was that patients would derive maximal benefit from treatments within 12 weeks.
Six covariates of interest were identified by a mixture of exploratory analysis of the IMPRES data and expert clinical opinion. The covariates identified were: the age at baseline (AGE), an indicator for whether a patient is male (SEX), the 6MWD at baseline (WALK), the World Health Organization/New York Heart Association functional class (STATUS) and pulmonary vascular resistance (PVR). STATUS categorises the severity of PAH into one of four increasingly severe categories, ranging from no limitation of activity and no symptoms with ordinary physical activity to marked limitation of activity and symptoms with any activity, even at rest. PVR is a measure of the resistance of the pulmonary vasculature calculated from the pressure drop across the pulmonary vascular bed divided by the pulmonary blood flow. Means of the covariates were used for the aggregate data.
Systematic literature review of studies in the literature
The results of a systematic literature review were available and were used to identify a network of studies to be included in the analysis. In this review, the MEDLINE® and EMBASE® databases were searched simultaneously. Patient Intervention Comparator Outcome Study type (PICOS) [33] criteria were followed and the quality assessment was performed according to the NICE checklist for RCTs [34]. Details of the PICOS terms are included in Additional file 1. Search terms included a combination of free-text and thesaurus terms relevant to PAH, ERA, prostacyclins, PDE5i, and RCTs, although case-control and cohort studies were also included. The Cochrane Central Register of Controlled Trials was also searched using a similar strategy. The relevance of each citation identified from the databases was assessed from its title and abstract according to the PICOS criteria. As we wanted to explore the effects of covariates, studies that did not report two or more of the 6 covariates of interest were excluded; when only one covariate was missing, we used IMPRES data to perform single imputation. From this review, we identified and included 5 monotherapy [35-39] and 4 combination therapy [28,40-42] RCTs, summary statistics for which are provided in Table 2 and Table 3, respectively. Additionally, 6 before-and-after studies investigating monotherapies and combination therapies were included [43-48] and their summary statistics are reported in Table 4. PRISMA flowcharts are provided for the systematic searches in Figure 2 and Figure 3 and a PRISMA checklist is provided in Additional file 2 [49]. Although there were substantial differences across trials in the doses of the administered treatments, as recorded in Tables 2, 3 and 4, clinical opinion was such that their effects would be comparable.
Only two-arm RCTs were identified, with each arm involving the addition of some treatment or placebo to a group of patients who were either treatment-naïve or on some baseline treatment. In these studies, included patients had been on the baseline treatment for a time period before randomization (e.g. ERA for at least 4 months prior to randomization [42]), which was assumed sufficient to derive maximal benefit from the baseline treatment. This assumption implies that any improvement was due to the additional treatment or the placebo effect.
The before-and-after studies were single-arm observational studies which reported the 6MWD of a group of patients on a particular background therapy before and after administering a new treatment. For example, Mathai et al. [47] studied the effect of initiating additional PDE5i therapy on a group of patients already on ERA monotherapy, thus providing evidence on the additional benefit of adding PDE5i over ERA alone.
Methods
The final evidence network of observational studies and RCTs for the NMA is shown in Figure 4. The treatment effects are labelled β_{ i } and are the expected short term improvement in 6MWD. Arrow directions indicate the interpretation of these parameters, e.g. a positive β_{2} means that prostacyclins are more effective than placebo. The network of primary interest is highlighted in bold, and the comparison of primary interest β_{8} − β_{6}, the effectiveness of imatinib against prostacyclins as an add-on to ERA + PDE5i, is highlighted by a bold, dashed, indirect link. This illustrates the necessity of including observational evidence, as this network would be disconnected had it been restricted to RCTs. Although this indirect comparison could have been conducted with evidence from only IMPRES and the Jacobs et al. studies, the inclusion of a wider range of evidence strengthens our estimates of covariate adjustments and the short term placebo improvements in 6MWD in PAH patients. The following sections explain our development of an NMA model to estimate the parameters β_{ i } by synthesizing all available evidence. As only two-arm RCTs and single-arm observational studies were identified, the models we develop are not designed for trials with more than two arms. This model development is summarized in Table 5.
Model M1: network meta-analysis of aggregate data from RCTs and observational studies
The first model we considered was a simple network meta-analysis of aggregate data from the IMPRES study and aggregate data from the literature. The mean short term change in 6MWD for each study i and arm j, \( {\overline{Y}}_{ij} \), was modelled as:
\[ {\overline{Y}}_{ij} \sim \mathrm{Normal}\left({\alpha}_i + {\theta}_{ij},\ S{E}_{ij}^2\right) \qquad (1) \]
where SE_{ij} is the standard error of the observed change in 6MWD in arm j of study i. It should be noted that this parameterization is slightly different to that used in other network meta-analyses [15,50], as we are using a trial level placebo effect α_{ i } in combination with a trial level effect of treatment, θ_{ ij }. The placebo effect is the mean improvement in 6MWD that a group of patients would experience if they entered trial i and received only placebo, in addition to their background therapy, and is assumed to be the same for each of the arms of the trial. The effect of treatment in arm j is a linear combination of the effects of additional treatments initiated in that arm at the start of the trial:
\[ {\theta}_{ij} \sim \mathrm{Normal}\left({f}_{ij}\left(\boldsymbol{\beta}\right),\ {\sigma}_{\beta}^2\right) \qquad (2) \]
θ_{ ij } = effect of additional treatment initiated in j^{th} arm of i^{th} study
β = vector of treatment effects
f_{ ij } = linear function with coefficients +1 or −1
In Equation (2), random effects with common variance \( {\sigma}_{\beta}^2 \) were placed on the treatment effects θ_{ ij } in the i^{th} study and j^{th} arm, as they were assumed to be exchangeable and independent. The entries of the treatment effects vector β are the treatment effect parameters β_{ l }, which we assumed to be fixed effects. For arms receiving only a placebo, it was assumed that θ_{ iC } = 0 so that the improvement in 6MWD is only the placebo effect α_{ i }. Two-arm trials with no placebo arm would have mean improvements of α_{ i } + θ_{i1} and α_{ i } + θ_{i2}, where θ_{i1} and θ_{i2} are the effects of the treatment combinations in the first and second arms, respectively.
Observational studies consist of only one arm and their inclusion required an assumption about their α_{ i }. We assumed that these α_{ i }, the placebo improvement in a trial, which would subsume the placebo effect, would be exchangeable across trials. In the model above, we expressed this by placing a Normal random effect with common mean α and variance \( {\sigma}_{\alpha}^2 \) on the α_{ i } s:
\[ {\alpha}_i \sim \mathrm{Normal}\left(\alpha,\ {\sigma}_{\alpha}^2\right) \]
This use of random effects enables evidence from all RCTs and before-and-after observational studies to inform the estimate of the expected change in 6MWD. Note that this assumption possibly interferes with randomization, as the α_{ i } will be drawn towards the mean α and thus the treatment effects β may be biased. An alternative would be to treat the α_{ i } as fixed effects [24,25,51] and thus preserve randomization, but this would not allow the inclusion of before-and-after studies.
The linear functions f_{ ij }() were almost always single values, e.g. β_{6} for Jacobs et al., as the only additional treatment was prostacyclin analogues [43]. In the BREATHE-2 study [40], labelled study i for convenience, arm j = 1 was a treatment-naïve group started on bosentan (ERA) and intravenous epoprostenol (Pr) while arm j = 2 was a treatment-naïve group started on bosentan (ERA) alone. This was represented by the functions:
which could be read from Figure 4. The β_{ i } are our analogues of the basic parameters in the standard indirect treatment comparison model described in Dias et al. [5], while f_{ ij }(β) are our analogues of the functional parameters.
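The way a linear function f_{ ij }(β) with coefficients +1 or −1 combines basic parameters can be sketched as below. This is an illustration only: the parameter values, the choice of indices for the two-treatment arm, and the function names are hypothetical, and they are not the BREATHE-2 mapping, which has to be read from Figure 4. The sketch returns the expected arm mean α_{ i } + f_{ ij }(β), ignoring the random effect on θ_{ ij }.

```python
def linear_effect(coeffs, beta):
    """f_ij(beta): sum of c_l * beta_l, with every coefficient c_l equal to +1 or -1."""
    for c in coeffs.values():
        if c not in (1, -1):
            raise ValueError("coefficients must be +1 or -1")
    return sum(c * beta[l] for l, c in coeffs.items())


def expected_arm_mean(alpha_i, coeffs, beta):
    """alpha_i + f_ij(beta); a placebo-only arm has no coefficients, so the effect is 0."""
    return alpha_i + linear_effect(coeffs, beta)


beta = {1: 30.0, 2: 20.0, 6: 25.0}  # hypothetical treatment effects in meters
alpha_i = 5.0                       # hypothetical placebo improvement in meters

print(expected_arm_mean(alpha_i, {6: 1}, beta))        # single added treatment
print(expected_arm_mean(alpha_i, {1: 1, 2: 1}, beta))  # two treatments started together
print(expected_arm_mean(alpha_i, {}, beta))            # placebo-only arm
```

Storing each arm as a sparse map of signed coefficients keeps the basic parameters β_{ l } separate from the functional parameters derived from them.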
The choice of priors for α and the β_{ l } s was based on the assumption that no patient would change their walking distance by more than 400 meters, which implied a standard deviation of 200 meters. Assuming that the smallest study had at least 10 patients, this gave \( SE = 200/\sqrt{10} \) and therefore a prior variance for effects on the mean of SE^{2} = 4000. We represented these prior beliefs via Normal distributions, which were judged appropriate in the context of changes in 6MWD through exploratory analysis of the IMPRES data and expert clinical opinion. For \( {\sigma}_{\beta}^2 \) and \( {\sigma}_{\alpha}^2 \), the vague assumptions that σ_{ β } ≤ 50 meters and σ_{ α } ≤ 50 meters were used, which expressed the belief that study-level effects would not differ from the mean improvement in 6MWD by more than 100 meters. Following the recommendation of Lambert et al. [52], a uniform prior representing this belief was placed on each standard deviation. These considerations gave the priors:
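\[ \alpha \sim \mathrm{Normal}\left(0, 4000\right), \quad {\beta}_l \sim \mathrm{Normal}\left(0, 4000\right), \quad {\sigma}_{\alpha} \sim \mathrm{Uniform}\left(0, 50\right), \quad {\sigma}_{\beta} \sim \mathrm{Uniform}\left(0, 50\right) \]
which completed the specification of an NMA model for aggregate data only.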
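The arithmetic behind the prior variance of 4000 can be checked with a short sketch. Reading the 400 meter limit as two standard deviations is the interpretation assumed here, and the function name is hypothetical.

```python
import math


def prior_variance_for_mean(patient_sd, min_study_size):
    """Squared standard error of a study mean: (patient_sd / sqrt(n)) ** 2."""
    se = patient_sd / math.sqrt(min_study_size)
    return se ** 2


max_change = 400.0             # meters, the limit stated in the text
patient_sd = max_change / 2.0  # 200 m, assuming the limit spans two standard deviations
print(prior_variance_for_mean(patient_sd, 10))  # approximately 4000
```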
Model M2: network meta-analysis of IPD and aggregate data from RCTs and observational studies
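We extended the aggregate data model described by Equation (1) in Section 2.1 to include individual patient data through the relation:
\[ {Y}_{ijk} \sim \mathrm{Normal}\left({\alpha}_i + {\theta}_{ij},\ {\sigma}^2\right) \]
for the change in 6MWD for patient k of arm j and study i, where σ^{2} is a common variance parameter to be fit to the data. Although in general we would use a separate σ^{2}, with a subscript, for each IPD trial, we have dropped the subscript to simplify the notation as our application only includes a single IPD trial. The treatment effects and placebo effects were as in the aggregate data model M1:
\[ {\theta}_{ij} \sim \mathrm{Normal}\left({f}_{ij}\left(\boldsymbol{\beta}\right),\ {\sigma}_{\beta}^2\right), \qquad {\alpha}_i \sim \mathrm{Normal}\left(\alpha,\ {\sigma}_{\alpha}^2\right) \]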
Normal prior distributions were again assumed for the means of the Normal distributions and Uniforms were placed on the standard deviations. As in the specification of priors for α and the β_{ l } s in model M1, we reasoned that if a patient was assumed not to have an improvement exceeding 400 meters, their standard deviations should be 200 meters and therefore have variances of 40000. These assumptions resulted in the priors:
As the evidence for the treatment effects β_{ l } came from both individual patient and aggregate (mean) level data, the ‘vaguer’ prior was used. The priors for the placebo effect α and for the standard deviations σ_{ α } and σ_{ β } were kept the same as in the aggregate data models in Section 2.1. This was appropriate as they have the same meaning in both the IPD and aggregate data models.
Model M3: across-study and within-study covariate adjustments on the placebo effect
To account for across-study heterogeneity, we extended the model to include covariate adjustments on the placebo effects, the α_{ i } s. A further advantage was that these adjustments for differences in the patient populations led to better assessments of the placebo improvement in the single-arm before-and-after studies due to their better explanation of the heterogeneity. We also adjusted for heterogeneity within the studies for which we had IPD, that is, between-patient heterogeneity. The model was defined for a mean covariate \( {\overline{X}}_{ij} \) and individual covariate X_{ ijk } as follows:
The last two equations are as in models M1 and M2. In this model, φ was the effect of the mean and accounted for across-study differences, while π was the effect of an individual’s covariate and accounted for within-study differences.
Note that the difference between π and φ in Equation (8) quantifies ecological bias, a bias that arises when the effect of the mean of a covariate is different from the effect of the covariate itself, and that if π = φ then there would be no ecological bias.
Priors for α, β_{ l }, σ, σ_{ α }, and σ_{ β } were as in the model with no covariate adjustments of Section 2.2, while a vague Normal distribution for mean effects was used for φ and a vague Normal distribution for individual effects was used for π:
which completed the NMA model combining IPD and aggregate data with covariate adjustments on the placebo effect.
Model M4: within-study covariate adjustments on treatment effects
Our final extension was to include covariate adjustments for the effect of patient characteristics on the efficacy of treatments, the β_{ l } s in the models. Such a model would be useful for predicting efficacy and evaluating cost-effectiveness in patient subgroups with specific baseline characteristics. As only a small number of studies were available in our example for each treatment effect, it was not practical to account for across-study heterogeneity. We therefore restricted treatment effect covariate adjustments to the within-study level, and thus to only the treatment effect of imatinib for which IPD was available. The model was defined as
Where Equations (7) and (14) are modifications of Equations (8) and (2) to include patient specific treatment effects. The elements γ_{ l } of γ were the effects of the covariate on the treatment effect β_{ l }. The linear functions f_{ ij }() therefore acted on linear combinations of the treatment effects β and their covariate adjustments γ.
The same priors as before were used for α, β_{ l }, σ, σ_{ α }, σ_{ β }, φ and π, while a Normal distribution for individual patient-level effects was used for the γ_{ l }, i.e.
This completed the specification of an NMA model for combining IPD and aggregate data from RCTs and observational studies with covariate adjustments on the placebo effect and on the treatment effect of imatinib. The models described in these sections are summarized in Table 5 and we applied them to the PAH example.
Covariate selection via DIC-based forward stepwise selection
Model M4 potentially includes covariates at three different levels and the full model space can be quite large. In our PAH example there are 6 possible covariates, giving a total of 2^{18} possible models. Although a model that includes all of these covariates would be highly adaptable to populations in which predictions are desired, it is necessary to avoid overfitting to the data. To avoid overfitting and produce robust predictions, we use the Deviance Information Criterion (DIC) [53]. This is a predictive criterion that balances fit and complexity. It is computationally infeasible to investigate the full model space, so we instead apply DIC-based forward stepwise selection [54,55]. This allows us to search through the space of models using the following steps:

1. Initially chosen model has no covariates.
2. Fit extended models with one extra covariate from chosen model.
3. Choose minimum DIC model from original and extended models.
4. Return to step 2.
Initially, for the PAH example with 6 covariates of interest, Step 2 involves a search of 18 possible models. The second time through involves 17 possible models, and so on. This leads to a maximum of 171 models to search, which is computationally feasible.
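The search described above can be sketched generically as follows. This is an illustration under stated assumptions, not the authors' implementation: the criterion function stands in for fitting a model and reading off its DIC, the toy criterion and covariate labels X1 to X6 are hypothetical, and stopping as soon as the original model has the minimum criterion value is our reading of steps 3 and 4.

```python
def forward_stepwise(candidates, criterion):
    """Greedy forward selection: repeatedly add the single term that most
    lowers the criterion; stop when no extension beats the current model."""
    chosen = frozenset()
    best = criterion(chosen)
    n_fits = 1
    remaining = set(candidates)
    while remaining:
        scored = [(criterion(chosen | {c}), c) for c in sorted(remaining)]
        n_fits += len(scored)
        score, cand = min(scored)
        if score >= best:  # the original model is still the minimum
            break
        chosen, best = chosen | {cand}, score
        remaining.remove(cand)
    return chosen, best, n_fits


# Six covariates at three levels give 18 candidate terms.
levels = ["placebo_across", "placebo_within", "treatment_within"]
covariates = ["X1", "X2", "X3", "X4", "X5", "X6"]  # placeholder names
candidates = [(lv, cv) for lv in levels for cv in covariates]

# Hypothetical stand-in for DIC: useful terms lower it, other terms raise it.
useful = {("placebo_within", "X1"), ("placebo_within", "X4")}


def toy_dic(model):
    return 100.0 - 5.0 * len(model & useful) + 2.0 * len(model - useful)


chosen, best, n_fits = forward_stepwise(candidates, toy_dic)
print(sorted(chosen), best, n_fits)
print(2 ** 18, sum(range(1, 19)))  # full model space versus stepwise upper bound
```

The last line reproduces the counts in the text: 2^{18} = 262144 models in the full space against at most 18 + 17 + ... + 1 = 171 extended models in the stepwise search.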
Results
All results presented here are from an implementation of the models described in Section 2, and summarized in Table 5, in the WinBUGS [56] software package. This is Windows-based software for Bayesian inference using Gibbs sampling. The code for these models is provided in Additional file 3 and the authors are happy to respond to any queries about its use. All results were sampled from 250 000 iterations of a single Markov chain Monte Carlo (MCMC) chain following a burn-in of 100 000 iterations. We also sampled a second chain from alternate initial values and confirmed that 250 000 iterations was sufficient for convergence on the basis of the Gelman-Rubin statistic [57].
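The convergence check can be illustrated with a minimal sketch of the classic variance-based Gelman-Rubin statistic for one parameter. This is an assumption-laden illustration: it is not the WinBUGS implementation, which may report a different variant of the diagnostic, and the simulated chains are hypothetical.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    w = mean([variance(c) for c in chains])  # mean within-chain variance
    b = n * variance(chain_means)            # between-chain variance
    var_hat = (n - 1) / n * w + b / n        # pooled estimate of the posterior variance
    return (var_hat / w) ** 0.5


random.seed(1)
same = [[random.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]
shifted = [same[0], [x + 3.0 for x in same[1]]]  # second chain stuck elsewhere
print(gelman_rubin(same), gelman_rubin(shifted))
```

Values close to 1 indicate that the chains agree, while the shifted pair gives a value well above 1.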
Results of models M1 and M2: NMA with no covariate adjustments
Summary statistics of the posterior distributions of the placebo and treatment effects, on the scale of change in 6MWD in meters, for the comparison of imatinib against prostacyclin analogues as add-on to ERA and PDE5i from model M1, described in Section 2.1, are presented in Table 6 and Figure 5. This NMA used only summary statistics from the IMPRES trial and did not make use of the available IPD. The posterior means and 95% credible intervals are comfortably within the prior ranges specified in Section 2.1. The summary of the placebo effect α implies that a randomly selected group of patients would be expected to have a mean 6MWD improvement of 4.78 meters, and for this mean to lie between −4.8 and 14.6 meters with a probability of 95%, were they to enter a placebo arm of one of the studies. This is not unreasonable on the basis of the means and standard errors of the observed changes in 6MWD in the control arms of the RCTs, reported in Table 2 and Table 3. The wide and inconclusive 95% credible intervals for the treatment effects and comparison are indicative of the weakness of the evidence. Also provided in Table 6 and Figure 5 are the results of model M2, described in Section 2.2, which combined available aggregate data with IPD from the IMPRES study. The means of the posterior distributions do not change very much but the credible intervals for parameters based on IPD from IMPRES, the α (imatinib to E + P5) and treatment effect β_{8}, shrink. This reduction in the width of the credible intervals is due to the complex interaction between the vague priors in the different parameterisation of model M2 from M1 and is not due to any improvement in the use of the evidence. Even vague priors are somewhat informative and this is illustrated by the reduction in the credible intervals.
Results of model M3 and M4: NMA of IPD and aggregate data with covariate adjustments
Summary statistics of the results of the application of the covariate-adjusted NMA model of Section 2.3, model M3, to combining IPD from the IMPRES trial with aggregate data from the literature are reported in Table 6 and Figure 5, while further parameter estimates are provided in Additional file 4. We used DIC-based forward stepwise selection to choose the covariate adjustments at across-study and within-study level on the placebo effects and within-study level on the treatment effect of imatinib. It was found that the DIC-minimizing model had no across-study covariate adjustments on the placebo effect but had within-study adjustments for AGE, STATUS and PVR on the placebo effect.
The benefit of including IPD is again indicated by the reduction in the 95% credible interval for the treatment effect of imatinib added to ERA and PDE5i (β_{8}) from that of model M1 of aggregate data. The 95% credible interval for the indirect comparison of imatinib against prostacyclins as add-on to ERA and PDE5i, (−76.65, 85.24) from model M3, remains approximately the same width as in the aggregate data model, (−83.70, 89.27) from model M1, as illustrated in Figure 5. This is because the effect of additional prostacyclins is based on only aggregate data. The directions of the effects of AGE (−0.90), STATUS (−11.89), and PVR (−0.05) on the expected 6MWD improvement of a patient in the IMPRES trial imply that older and sicker patients have a lower expected improvement, which is reasonable. The non-selection of across-study covariates indicates that the imputed values for missing covariates, such as PVR in Jacobs et al., have no effect on the results. That some values were imputed may affect the DIC-based selection but this is unlikely to be a strong effect as the covariate adjustments were generally found to have little impact.
We further fit model M4, described in Section 2.4, and applied DIC-based stepwise selection to choose covariate adjustments on the treatment effect of imatinib. However, no such covariate adjustments were included, so the chosen model M4 was identical to M3.
In addition to applying our NMA methodology to the PAH example, we also tested the impact of its assumptions through sensitivity analyses.
Sensitivity analysis, model S1: downweighting the observational studies
In our standard NMA models M1, M2, M3 and M4, we gave equal weight to the results of the before-and-after observational studies and those of two-arm RCTs. An alternative to this assumption is to downweight the results, recognizing internal bias due to lack of rigor, through a multiplicative adjustment to the standard errors of the results in either or both arms of the aggregate data, i.e. replacing SE_{ij} in the aggregate data likelihood by
\[ S{E}_{ij}/{\delta}_i \]
where δ_{ i } is the quality weight of the i^{th} study, based on a subjective assessment. This is similar to the weighting of the empirical priors derived from observational evidence discussed in the background section [15,21]. If δ_{ i } = 1, it would represent a study that was judged to be of the highest quality, and its evidence would be given full weight. This was the value we assigned to RCT data. Using the covariate-adjusted model of Section 3.2, we repeated the simulations with δ_{ i } = 0.1 for the observational studies, increasing their observed standard errors by a factor of 10 and thus downweighting them, relative to RCTs, to represent their poorer quality.
For example, the observed change in 6MWD from baseline in Jacobs et al. was 41 meters with a standard error of 38, as reported in Table 4. This sensitivity analysis would assume that this standard error had been 380, substantially larger than any of the observed standard errors reported in Tables 2, 3 or 4 (maximum was about 50). We can therefore conclude that if our analysis is robust to downweighting by a factor of δ_{ i } = 0.1, it is likely to be robust to most levels of uncertainty we could plausibly observe.
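The arithmetic of this adjustment can be checked with a short sketch. The function name `downweight_se` is an assumption made for illustration; the only inputs taken from the text are the Jacobs et al. standard error of 38 and the quality weight of 0.1. The sketch also shows the standard consequence that an inverse-variance weight, 1/SE², is multiplied by δ².

```python
def downweight_se(se: float, delta: float) -> float:
    """Inflate a reported standard error by dividing by a quality weight in (0, 1]."""
    if not 0.0 < delta <= 1.0:
        raise ValueError("delta must lie in (0, 1]")
    return se / delta


se_jacobs = 38.0  # standard error reported for Jacobs et al.
delta_obs = 0.1   # quality weight used for observational studies

se_adjusted = downweight_se(se_jacobs, delta_obs)

# Relative inverse-variance weight after adjustment: (1/se_adj^2) / (1/se^2).
relative_precision = (se_jacobs / se_adjusted) ** 2

print(round(se_adjusted, 6), round(relative_precision, 6))
```

With δ = 0.1 the study therefore retains about 1% of the inverse-variance weight it would have had at full weight.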
The results from this sensitivity analysis, labeled model S1, are presented in Table 5 and Figure 5. The main change from the results of the models without downweighting of the observational studies, models M1, M2 and M3 in Table 5, was the increased range of the 95% credible intervals for treatment effects estimated on the basis of observational studies, such as β_{6}. The range of the 95% credible interval of the comparison of imatinib against prostacyclins as add-on to ERA + PDE5i was also increased, by a large factor, as illustrated in Figure 5, due to its reliance on the downweighted observational studies. The magnitude of the comparative effectiveness (β_{8} − β_{6}) also increased substantially, but this is most likely due to the increased random variation illustrated by the expanded credible intervals.
This decrease in the accuracy of the treatment effect estimates and indirect comparisons indicates the influence of the observational studies. As the effect was largely on the accuracy of these estimates and not on their direction, it could be concluded that the NMA methodology was robust to the downweighting of the observational studies, although its reliance on possibly weak and biased observational evidence was highlighted.
Sensitivity analysis, model S2: constructed control arms in the observational studies
The lack of control arms in the observational studies meant that we did not know what would have happened had patients not been given additional treatment. The NMA models of Section 2 placed Normally distributed random effects on the α_{ i }, the expected improvements in patients who had entered a study but only received a placebo.
An alternative was to construct a control arm for the observational studies by making an assumption about \( {\overline{Y}}_{iC} \), the mean change in 6MWD for patients who did not receive additional therapy. As the Jacobs et al. study [43] looked at patients who were deteriorating on oral therapy, we assumed that patients’ 6MWD would decrease during the trial if they were not given any new treatments. This study included patients whose 6MWD had decreased by 58 meters over a mean time of 20.6 months before entering the study. Our short-term follow-up was approximately 24 weeks, which is less than half of 20.6 months, so we assumed a mean change of \( {\overline{Y}}_{iC}\approx -25\mathrm{m} \) would be observed in missing control arms over this short-term follow-up. We further assumed that the standard error of the mean in this constructed control arm, SE_{ iC }, is the same as that observed in the treatment arm. We used these assumptions to construct control arms for all observational studies.
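A small sketch can make the construction concrete. It is an illustration under stated assumptions, not the authors' calculation: the linear pro-rata extrapolation in `prorata_change`, the `Arm` container and the helper `construct_control` are hypothetical, and the assumed control-arm change is entered as a decline of 25 m because the text describes a decrease. The treated-arm values (41 m, standard error 38) are those quoted for Jacobs et al.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    mean_change_m: float  # mean change in 6MWD from baseline, in meters
    se_m: float           # standard error of that mean, in meters


def prorata_change(total_change_m: float, months_observed: float,
                   followup_weeks: float) -> float:
    """Linear extrapolation of an observed change to a shorter follow-up."""
    weeks_per_month = 52.0 / 12.0
    followup_months = followup_weeks / weeks_per_month
    return total_change_m * followup_months / months_observed


def construct_control(treated: Arm, assumed_change_m: float) -> Arm:
    """Build a control arm that reuses the treated arm's standard error."""
    return Arm(mean_change_m=assumed_change_m, se_m=treated.se_m)


# Pre-study decline in Jacobs et al.: 58 m over 20.6 months; follow-up 24 weeks.
linear_guess = prorata_change(-58.0, 20.6, 24.0)

treated = Arm(mean_change_m=41.0, se_m=38.0)
control = construct_control(treated, assumed_change_m=-25.0)

print(round(linear_guess, 1), control)
```

A purely linear extrapolation gives a decline of roughly 16 m, so an assumed decline of 25 m attributes more deterioration to the missing control arm and therefore more benefit to the added treatment.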
We repeated the analysis using the covariate-adjusted model from Section 3.2 with the constructed control arms, giving the results, labeled model S2, presented in Table 5 and Figure 5. It was difficult to interpret the direction of the change in placebo and treatment effects, due to the effect of covariate adjustments. The direction of the comparison of imatinib against prostacyclins was shifted in favor of prostacyclins, which was expected due to the conservative assumption about the control arms. However, the wide credible intervals and overall direction of the comparisons remained so the analysis was judged to be robust to this alternative assumption about the control arms.
Discussion
In this paper we have considered the problem of how to perform a network meta-analysis when the RCT evidence does not form a complete network. Our proposal was to complete the network using single-arm before-and-after observational studies by building a covariate adjusted random effects model on the placebo improvements. We built on recent innovations to construct a model which combines IPD and aggregate data from RCTs and before-and-after studies and allows for the inclusion of covariate adjustments for heterogeneity at across- and within-study level on the placebo effect and within-study level on the treatment effect. Using this model, we performed a clinically novel comparison of the benefit of imatinib against prostacyclins as add-on therapy to PAH patients on a combination of ERA and PDE5i. This comparison was only possible through inclusion of observational studies as an evidence network restricted to RCTs would be disconnected.
As the credible intervals were very wide, the results of our application to PAH were considered to be inconclusive. This was due to the weakness of the evidence as only a few studies, with small sample sizes, were available for each edge of the network. This data limitation may also be the reason why we found that covariate adjustments had little effect on the NMA results and that no across-study adjustments were included on DIC grounds, although we can also interpret this as evidence that heterogeneity had little effect on the NMA. It is possible that important covariates were not reported by IMPRES or other studies, or that reported covariates were incorrectly considered to be of no importance due to the weakness of the data. It is also possible that the stepwise selection algorithm missed important covariates as it only investigates a small portion of the total 2^{18} possible models. A simulation/robustness study could address these concerns but would be computationally intensive as the model selection step, even using stepwise selection to reduce the set of models under consideration, was resource intensive. The best strategy to improve the practical utility of this application of NMA to PAH is to collect further evidence, ideally IPD from a new or existing RCT.
Apart from these data limitations, which are specific to the application, there are a number of limitations and untestable assumptions of the model itself. As in many meta-analysis and NMA models [15], we assumed the effects of particular treatments were the same across studies by placing a fixed effect on each β_{ l }. In cases where sufficient data are available, this could be relaxed to a random effects assumption where we assume the β_{ l } from different trials follow a common, possibly Normal, distribution. Our model also assumed, in Equation (2), that effects of additional treatments had the same variance \( {\sigma}_{\beta}^2 \) in all studies, no matter how many additional treatments were being administered. This is possibly implausible as the effect of a combination of three new treatments should have a higher variance than the effect of a single new treatment. As in the case of fixed effects, this assumption could be relaxed in cases where sufficient data are available. A further simplification that limits the generalizability of our model is that it is restricted to single- or two-arm trials. To extend the model to trials with three or more arms would require careful consideration of correlation in treatment effects across arms and within studies [5,50].
An assumption of our model that is common to most NMA models is the transitivity or consistency of treatment effects across studies. This is the assumption that studies informing the comparison of treatment A against treatment B and of treatment A against treatment C can be used to inform the comparison of B against C. Our evidence network was sparse and contained only one loop, making it impractical to test for consistency of direct and indirect evidence using node-splitting [58] or other measures of inconsistency [59,60]. If more studies became available, it would be recommended to test that comparing ERA + Pr to ERA using the direct evidence [44,46] gave similar results, within some range of acceptability, to performing the comparison with only indirect evidence.
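To illustrate what comparing direct with indirect evidence involves, the sketch below uses the simple adjusted indirect comparison (often attributed to Bucher) rather than the node-splitting approach cited above. All numbers are hypothetical, the function names are assumptions made for this example, and the estimates are treated as independent and approximately Normal.

```python
import math
from typing import Tuple


def indirect_estimate(d_ab: float, se_ab: float,
                      d_ac: float, se_ac: float) -> Tuple[float, float]:
    """Indirect effect of C relative to B via the common comparator A.

    d_ab and d_ac are the effects of B and C relative to A.
    """
    d_bc = d_ac - d_ab
    se_bc = math.sqrt(se_ab ** 2 + se_ac ** 2)
    return d_bc, se_bc


def inconsistency_z(d_direct: float, se_direct: float,
                    d_indirect: float, se_indirect: float) -> float:
    """z statistic for the difference between direct and indirect estimates."""
    diff = d_direct - d_indirect
    return diff / math.sqrt(se_direct ** 2 + se_indirect ** 2)


# Hypothetical mean differences in 6MWD (meters) and their standard errors.
d_bc_ind, se_bc_ind = indirect_estimate(d_ab=30.0, se_ab=10.0,
                                        d_ac=50.0, se_ac=12.0)
z = inconsistency_z(d_direct=35.0, se_direct=15.0,
                    d_indirect=d_bc_ind, se_indirect=se_bc_ind)

print(round(d_bc_ind, 2), round(se_bc_ind, 2), round(z, 2))
```

A small absolute z value would give no evidence of inconsistency, although with sparse data such a check has little power, which matches the concern raised above.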
The principal assumption that allowed the inclusion of single-arm before-and-after observational studies was that the placebo effects, the α_{ i } s, were exchangeable or that there was no a priori reason that there would be systematic differences between these effects. This assumption allowed us to model the α_{ i } s using a random effects distribution. We recommend the use of this assumption and our model in cases where networks are not densely populated or fully connected when restricted to RCT evidence, such as the PAH example. Decision makers would still need to give a recommendation on which treatment to use in such situations [13] and, indeed, in Australia the PBAC already considers non-randomized observational evidence, particularly in the absence of RCTs [1]. However, this type of evidence is considered to be weak and subject to bias by decision making bodies such as NICE [14]. Additionally, the GRADE scale, which is followed by PBAC, rates the quality of such evidence as low [61]. In cases where networks can be densely populated and fully connected by RCT evidence, this assumption may shrink placebo effects towards the mean and thus interfere with randomization [24,25,51]. In those cases, we recommend treating the α_{ i } as independent fixed effects, or nuisance parameters, and not including observational evidence.
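The shrinkage mentioned above can be seen in the simplest Normal-Normal setting. The sketch below is a simplification made for illustration: it treats the mean `mu` and standard deviation `tau` of the random effects distribution as known, whereas in a full Bayesian model they would be estimated jointly with everything else. The function name and all numbers are hypothetical.

```python
import math
from typing import Tuple


def shrunken_effect(y: float, se: float, mu: float, tau: float) -> Tuple[float, float]:
    """Posterior mean and SD of a study effect under a Normal-Normal model.

    y, se : observed effect in one study and its standard error
    mu, tau : mean and SD of the random effects distribution (assumed known)
    """
    shrink = se ** 2 / (se ** 2 + tau ** 2)  # weight given to mu
    post_mean = shrink * mu + (1.0 - shrink) * y
    post_sd = math.sqrt((1.0 - shrink) * se ** 2)
    return post_mean, post_sd


# Hypothetical placebo improvements in 6MWD (meters).
precise = shrunken_effect(y=30.0, se=10.0, mu=10.0, tau=15.0)
imprecise = shrunken_effect(y=30.0, se=30.0, mu=10.0, tau=15.0)

print([round(v, 2) for v in precise], [round(v, 2) for v in imprecise])
```

The imprecise study is pulled much further towards the overall mean than the precise one, which is the mechanism by which a random effects assumption on placebo effects can alter within-trial contrasts.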
In the PAH application, clinical opinion supported the assumption of exchangeable placebo effects, although this assumption is not testable statistically. That no across-study adjustments were included in the model selection step gave an indication that there were no systematic differences in these expected improvements and that our exchangeability assumption was warranted. The simple alternative of constructing a control arm for observational studies was investigated but was found to have little effect on the results. We also explored downweighting the observational evidence and found, as expected, a reduction in the accuracy of our findings, but no change in the overall direction of the indirect comparison results.
Our model can be criticized on the grounds that the single-arm studies contribute to the estimation of the distribution for the placebo effects α_{ i }. We considered an alternative formulation of our model where only the RCTs would contribute to this estimation and the α_{ i } s for the observational studies would be sampled separately from this distribution. This is the method proposed for baseline natural history models by Dias et al. [24]. However, our model is designed to be applied to cases where data would already be limited, such as the PAH example, so a further reduction of the evidence base would be undesirable, although in practice the contribution of the observational studies to the α_{ i } estimation will be limited.
A very simple alternative to a random effects assumption for the placebo effects is to use a single fixed effect α for the α_{ i } s. This is an assumption that all patient populations started on placebo have the same short term expected improvement in 6MWD and that any differences are due to the treatment or covariate effects. We repeated the NMA with this assumption and found that the results were similar in magnitude and direction to those of the random effects model and that the DIC was considerably higher, with 1889 for fixed effects versus 1870 for random effects. This DIC gives evidence in favor of our random effects model. The single fixed effect model was also not clinically plausible as there were many inherent differences in the studies so a common placebo effect would be difficult to justify.
Several additional sensitivity analyses were conducted. Firstly, as no across-study covariates were included, we applied our final NMA model to an evidence network which included the studies which were excluded due to non-reporting of covariates. This included one extra RCT [62] and three observational studies [63–65]. The results of this sensitivity analysis, not reported in this paper, were almost identical to those of the base case. Prior sensitivity analyses, where we tried prior distributions with greater variances, led us to conclude that the results were not dependent on our choice of prior parameters. Although non-normal priors could be easily implemented if the application required them, normal priors were judged to be appropriate for the continuous outcome of change in 6MWD through expert clinical opinion and exploratory analysis of the IMPRES data.
Aside from the extensions to multi-arm trials, separation of the placebo estimation between RCT and observational studies, and other possibilities so far discussed, there are a variety of directions for future extension of our methodology. One such direction would be to apply the model to non-continuous outcomes such as binary outcomes. NMA models combining IPD and aggregate data for binary outcomes have been discussed in the literature [6,7] and the use of a random effects model for placebo effects to include single-arm studies would be a straightforward extension. Our methods are also readily applicable to pairwise meta-analysis, as it was in this setting that the use of random effects modelling of placebo effects to include single-arm studies was first proposed [23]. Although in pairwise meta-analysis the model would no longer be justified on the grounds of completing evidence networks, it may be useful in cases where there are only a limited number of small RCTs and large, high-quality single-arm studies are available. An additional direction for research is the joint network meta-analysis of multivariate outcomes, such as PVR and change in 6MWD in PAH [66]. This approach would treat all covariates as responses and would account for missing values, a reason for exclusion of several studies, through a form of multiple imputation. This would have the advantage of using the evidence more consistently, rather than our approach of singly imputing missing covariates, such as PVR in Jacobs et al. [43]. However, this extension would require a greater evidence base than was available for the PAH example.
All of the limitations we have discussed should be kept in mind if applying our model in order to avoid being misled by the results of an analysis in which observational evidence is included. We would recommend conducting the sensitivity analyses we have described to ensure the model and the implications of its various assumptions are fully understood.
Conclusions
We have developed an extension of existing NMA methodology to allow the completion of disconnected networks of RCT evidence through the inclusion of single-arm before-and-after observational studies. This model also brings together many recent developments in network meta-analysis of IPD and aggregate data. Our application to PAH demonstrated the utility of our methodology as comparisons impossible to conduct on the basis of RCTs alone could be conducted through the inclusion of observational studies. Although IPD and covariate adjustments were found to make little difference to the results, we believe this model could be easily applied to many other disease areas and settings which require the inclusion of observational evidence. Our work therefore furthers the range of evidence synthesis problems that can be approached through NMA.
Abbreviations
DIC: Deviance information criterion
ERA: Endothelin receptor antagonist
NMA: Network meta-analysis
PAH: Pulmonary arterial hypertension
PDE5i: Phosphodiesterase-5 inhibitors
Pr: Prostacyclin analogues
PVR: Pulmonary vascular resistance
IPD: Individual patient data
RCT: Randomized controlled trial
SE: Standard error
6MWD: 6 minute walk distance
STATUS: World Health Organization New York Health Assessment status
References
1. PBAC: Guidelines for preparing submissions to the Pharmaceutical Benefits Advisory Committee (Version 4.4); available from http://www.pbac.pbs.gov.au/. Pharmaceutical Benefits Advisory Committee 2014.
2. Caldwell DM, Ades AE, Higgins JPT. Simultaneous comparison of multiple treatments: combining direct and indirect evidence. BMJ. 2005;331:897–900.
3. Sutton A, Ades AE, Cooper N, Abrams K. Use of indirect and mixed treatment comparisons for technology assessment. Pharmacoeconomics. 2008;26:753–67.
4. Dias S, Welton N, Sutton A, Ades A: NICE DSU Technical Support Document 1: Introduction to evidence synthesis for decision making; 2011; last updated April 2012; available from http://www.nicedsu.org.uk. National Institute for Health and Care Excellence 2012.
5. Dias S, Welton N, Sutton A, Ades A: NICE DSU Technical Support Document 2: A Generalised Linear Modelling Framework for Pairwise and Network Meta-Analysis of Randomised Controlled Trials. 2011; last updated April 2014; available from http://www.nicedsu.org.uk. National Institute for Health and Care Excellence 2014.
6. Donegan S, Williamson P, D’Alessandro U, Garner P, Smith CT. Combining individual patient data and aggregate data in mixed treatment comparison meta-analysis: Individual patient data may be beneficial if only for a subset of trials. Stat Med. 2013;32:914–30.
7. Saramago P, Sutton AJ, Cooper NJ, Manca A. Mixed treatment comparison using aggregate and individual participant level data. Stat Med. 2012;31:3516–36.
8. Sutton AJ, Kendrick D, Coupland CAC. Meta-analysis of individual- and aggregate-level data. Stat Med. 2008;27:651–69.
9. Riley RD, Steyerberg EW. Meta-analysis of a binary outcome using individual participant data and aggregate data. Research Synthesis Methods. 2010;1:2–19.
10. Riley RD, Lambert PC, Staessen JA, Wang J, Gueyffier F, Thijs L, et al. Stat Med. 2008;27:1870–93.
11. Papaioannou D, Rafia R, Rathbone J, Stevenson M, Buckley Woods H: Rituximab for the first-line treatment of stage III-IV follicular lymphoma (Review of TA 110); available from https://www.nice.org.uk/. Health Technology Assessment 2011.
12. Aronson JK. Rare diseases and orphan drugs. Br J Clin Pharmacol. 2006;61:243–5.
13. Reeves BC, Higgins JPT, Ramsay C, Shea B, Tugwell P, Wells GA. An introduction to methodological issues when including non-randomised studies in systematic reviews on the effects of interventions. Research Synthesis Methods. 2013;4:1–11.
14. NICE: Process and methods guides: Methods for the development of NICE public health guidance (third edition); available from https://www.nice.org.uk/. National Institute for Health and Care Excellence 2012.
15. Welton NJ, Sutton AJ, Cooper NJ, Abrams KR, Ades AE. Evidence synthesis for decision making in healthcare. Chichester: John Wiley and Sons; 2012.
16. Prevost T, Abrams KR, Jones D. Hierarchical models in generalised synthesis of evidence: an example based on studies of breast cancer screening. Stat Med. 2000;19:3359–76.
17. Schmitz S, Adams R, Walsh C. Incorporating data from various trial designs into a mixed treatment comparison model. Stat Med. 2013;32:2935–49.
18. Rosenbaum P, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70:41–55.
19. d’Agostino RJ. Propensity score methods for bias reduction in the comparison of a treatment to a non-randomized control group. Statistics in Medicine. 1998;17:2265–81.
20. d’Agostino RJ, d’Agostino RS. Estimating treatment effects using observational data. JAMA. 2007;297:314–6.
21. Ibrahim JG, Chen MH. Power prior distributions for regression models. Stat Sci. 2000;15:46–60.
22. D’Agostino RS, Kwan H. Measuring effectiveness: what to expect without a randomized control group. Med Care. 1995;33:AS95–AS105.
23. Li Z, Begg CB. Random effects models for combining results from controlled and uncontrolled studies in a meta-analysis. J Am Stat Assoc. 1994;89:1523–7.
24. Dias S, Welton N, Sutton A, Ades A. Evidence synthesis for decision making 5: the baseline natural history model. Med Decis Making. 2013;33:657–70.
25. Senn S. Hans van Houwelingen and the Art of Summing up. Biom J. 2010;52:85–94.
26. Farber HW, Loscalzo J. Pulmonary arterial hypertension. N Engl J Med. 2004;351:1655–65.
27. Liu C, Liu K, Ji Z, Liu G. Treatments for pulmonary arterial hypertension. Respir Med. 2006;100:765.
28. Simonneau G, Rubin LJ, Galie N, Barst RJ, Fleming TR, Frost AE, et al. Addition of sildenafil to long-term intravenous epoprostenol therapy in patients with pulmonary arterial hypertension. Ann Intern Med. 2008;149:521–30.
29. Fox BD, Shimony A, Langleben D. Meta-analysis of monotherapy versus combination therapy for pulmonary arterial hypertension. Am J Cardiol. 2011;108:1177–82.
30. Hoeper MM, Barst RJ, Bourge RC, Feldman J, Frost AE, Galie N, et al. Imatinib mesylate as add-on therapy for pulmonary arterial hypertension: results of the randomized IMPRES study. Circulation. 2013;127:1128–38.
31. McLaughlin V, Badesch D, Delcroix M, Fleming TR, Gaine SP, Galie N, et al. End points and clinical trial design in pulmonary arterial hypertension. J Am Coll Cardiol. 2009;54:S97–S107.
32. Senn S. Change from baseline and analysis of covariance revisited. Stat Med. 2006;25:4334–44.
33. Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Med. 2009;6(7):e1000100.
34. NICE: The Guidelines Manual: Appendix C: Methodology checklist: randomised controlled trials; available from https://www.nice.org.uk/. National Institute for Health and Care Excellence 2012, PMG6B.
35. Badesch DB, Bodin F, Channick RN, Frost A, Rainisio M, Robbins IM, et al. Complete results of the first randomized, placebo-controlled study of bosentan, a dual endothelin receptor antagonist, in pulmonary arterial hypertension. Curr Ther Res. 2002;63:227–47.
36. Rubin LJ, Badesch DB, Barst RJ, Galie N, Black CM, Keogh A, et al. Bosentan therapy for pulmonary arterial hypertension. N Engl J Med. 2002;346:896–903.
37. Barst RJ, Langleben D, Badesch D, Frost A, Lawrence EC, Shapiro S, et al. Treatment of pulmonary arterial hypertension with the selective endothelin-A receptor antagonist sitaxsentan. J Am Coll Cardiol. 2006;47:2049–56.
38. Barst RJ, Rubin LJ, Long WA, McGoon MD, Rich S, Badesch DB, et al. A comparison of continuous intravenous epoprostenol (prostacyclin) with conventional therapy for primary pulmonary hypertension. N Engl J Med. 1996;334:296–301.
39. Galie N, Beghetti M, Gatzoulis M, Granton J, Berger R, Lauer A, et al. BREATHE-5: Bosentan improves hemodynamics and exercise capacity in the first randomized placebo-controlled trial in Eisenmenger physiology. CHEST - Late-Breaking Science. 2005;128(issue 4):496S.
40. Humbert M, Barst RJ, Robbins IM, Channick RN, Galie N, Boonstra A, et al. Combination of bosentan with epoprostenol in pulmonary arterial hypertension: BREATHE-2. Eur Respir J. 2004;24:353–9.
41. Barst RJ, Oudiz RJ, Beardsworth A, Brundage BH, Simonneau G, Ghofrani HA, et al. Tadalafil monotherapy and as add-on to background bosentan in patients with pulmonary arterial hypertension. J Heart Lung Transplant. 2011;30:632–42.
42. McLaughlin VV, Oudiz RJ, Frost A, Tapson VF, Srinivas M, Channick RN, et al. Randomized study of adding inhaled iloprost to existing bosentan in pulmonary arterial hypertension. Am J Respir Crit Care Med. 2006;174:1257–63.
43. Jacobs W, Boonstra A, Marcus JT, Postmus PE, Vonk-Noordegraaf A. Addition of prostanoids in pulmonary hypertension deteriorating on oral therapy. J Heart Lung Transplant. 2009;28:280–4.
44. Akagi S, Matsubara H, Miyaji K, Ikeda E, Dan K, Tokunaga N, et al. Additional effects of bosentan in patients with idiopathic pulmonary arterial hypertension already treated with high-dose epoprostenol. Circ J. 2008;72:1142–6.
45. Channick RN, Olschewski H, Seeger W, Staub T, Voswinckel R, Rubin LJ. Safety and efficacy of inhaled treprostinil as add-on therapy to bosentan in pulmonary arterial hypertension. J Am Coll Cardiol. 2006;48:1433–7.
46. Hoeper MM, Taha N, Bekjarova A, Gatzke R, Spiekerkoetter E. Bosentan treatment in patients with primary pulmonary hypertension receiving nonparenteral prostanoids. Eur Respir J. 2003;22:330–4.
47. Mathai SC, Girgis RE, Fisher MR, Champion HC, Housten-Harris T, Zaiman A, et al. Addition of sildenafil to bosentan monotherapy in pulmonary arterial hypertension. Eur Respir J. 2007;29:469–75.
48. Hoeper MM, Faulenbach C, Golpon H, Winkler J, Welte T, Niedermeyer J. Combination therapy with bosentan and sildenafil in idiopathic pulmonary arterial hypertension. Eur Respir J. 2004;24:1007–10.
49. Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. BMJ. 2009;339:b2535.
50. Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Stat Med. 2004;23:3105–24.
51. Senn S, Gavini F, Magrez D, Scheen A. Issues in performing a network meta-analysis. Stat Methods Med Res. 2011;22:169–89.
52. Lambert PC, Sutton AJ, Burton PR, Abrams KR, Jones DR. How vague is vague? A simulation study of the impact of the use of vague prior distributions in MCMC using WinBUGS. Stat Med. 2005;24:2401–28.
53. Spiegelhalter DJ, Best NG, Carlin BP, Linde A. Bayesian measures of model complexity and fit. J R Statist Soc B. 2002;64:583–639.
54. Hocking RR. The analysis and selection of variables in linear regression. Biometrics. 1976;32:1–49.
55. Miller AJ. Selection of subsets of regression variables. J R Stat Soc Ser A. 1984;147:389–425.
56. Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput. 2000;10:325–37.
57. Lunn DJ, Jackson CH, Best N, Thomas A, Spiegelhalter D: The BUGS Book. New York: CRC Press; 201
58. Dias S, Welton NJ, Caldwell DM, Ades AE. Checking consistency in mixed treatment comparison meta-analysis. Stat Med. 2010;29:932–44.
59. Lu G, Ades A. Assessing evidence inconsistency in mixed treatment comparisons. J Am Stat Assoc. 2006;101:447–59.
60. Dias S, Welton N, Sutton A, Caldwell D, Lu G, Ades A. Evidence synthesis for decision making 4: inconsistency in networks of evidence based on randomized controlled trials. Med Decis Making. 2013;33:641–56.
61. Guyatt GH, Oxman AD, Sultan S, Glasziou P, Akl EA, Alonso-Coello P, et al. GRADE guidelines: 9. Rating up the quality of evidence. J Clin Epidemiol. 2011;64:1311–6.
62. Wilkins MR, Paul GA, Strange JW, Tunariu N, Gin-Sing W, Banya WA, et al. Sildenafil versus endothelin receptor antagonist for pulmonary hypertension (SERAPH) study. Am J Respir Crit Care Med. 2005;171:1292–7.
63. Porhownik NR, Al-Sharif H, Bshouty Z. Addition of sildenafil in patients with pulmonary arterial hypertension with inadequate response to bosentan monotherapy. Can Respir J. 2008;15:427–30.
64. Ghofrani HA, Rose F, Schermuly RT, Olschewski H, Wiedeman R, Kreckel A, et al. Oral sildenafil as long-term adjunct therapy to inhaled iloprost in severe pulmonary arterial hypertension. Pulmonary Hypertension. 2003;42:158–64.
65. Ruiz MJ, Escribano P, Delgado JF, Jimenez C, Tello R, Gomez A, et al. Efficacy of sildenafil as a rescue therapy for patients with severe pulmonary arterial hypertension and given long-term treatment with prostanoids: 2-year experience. Pulmonary Hypertension. 2006;25:1353–7.
66. Ades AE, Welton NJ, Caldwell D, Price M, Goubar A, Lu G. Multiparameter evidence synthesis in epidemiology and medical decision-making. J Health Serv Res Policy. 2008;Suppl 3:12–22.
Acknowledgements
Novartis Pharma provided funding for HT and LH to complete this work while GC, RN, and AC were full-time employees of Novartis. Novartis Pharma permitted the publication of this manuscript. MAPI values, under contract to Novartis Pharma, provided the systematic review that informed the application. The authors are very grateful for the helpful comments received from our reviewers and the associate editor, in particular for pointing out the example of a disconnected network in the evaluation of treatments for follicular lymphoma.
Additional information
Competing interests
HT and LH were paid external consultants of Novartis while completing this work. GC, RN and AC are full-time employees of Novartis.
Authors’ contributions
AC and GC led the project and conceived the PAH indirect comparison application. HT developed the methodology in consultation and under the supervision of RN and GC. LH provided clinical support throughout the project. All authors read and approved the final manuscript.
Additional files
Additional file 1:
PICOS and Search Terms for systematic literature review.
Additional file 2:
PRISMA checklist for the systematic literature review.
Additional file 3:
WinBUGS code for Model M4: Covariate adjusted NMA of IPD and aggregate data.
Additional file 4:
Further parameter estimates from NMA Model M3: Covariate adjusted NMA of IPD and aggregate data.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Thom, H.H., Capkun, G., Cerulli, A. et al. Network meta-analysis combining individual patient and aggregate data from a mixture of study designs with an application to pulmonary arterial hypertension. BMC Med Res Methodol 15, 34 (2015). https://doi.org/10.1186/s12874-015-0007-0
Keywords
 Network meta-analysis
 Individual patient data
 Covariate adjustments
 Observational evidence
 Mixed treatment comparison
 Pulmonary arterial hypertension