
Comparing the use of aggregate data and various methods of integrating individual patient data to network meta-analysis and its application to first-line ART



The 2018 World Health Organization HIV guidelines were based on the results of a network meta-analysis (NMA) of published trials. This study employed individual patient-level data (IPD) and aggregate data (AgD) and meta-regression methods to assess the evidence supporting the WHO recommendations and whether they needed any refinements.


Access to IPD from three trials was granted through (CSDR). Seven modelling approaches were applied and compared: 1) Unadjusted AgD network meta-analysis (NMA) – the original analysis; 2) AgD-NMA with meta-regression; 3) Two-stage IPD-AgD NMA; 4) Unadjusted one-stage IPD-AgD NMA; 5) One-stage IPD-AgD NMA with meta-regression (one-stage approach); 6) Two-stage IPD-AgD NMA with empirical-priors (empirical-priors approach); 7) Hierarchical meta-regression IPD-AgD NMA (HMR approach). The first two were the models used previously. Models were compared with respect to effect estimates, changes in the effect estimates, coefficient estimates, DIC and model fit, rankings and between-study heterogeneity.


IPD were available for 2160 patients, representing 6.5% of the evidence base and 3 of 24 edges. The aspect of the model affected by the choice of modeling appeared to differ across outcomes. HMR consistently generated larger intervals, often with credible intervals (CrI) containing the null value. Discontinuations due to adverse events and viral suppression at 96 weeks were the only two outcomes for which the unadjusted AgD NMA would not be selected. For the first, the selected model shifted the principal comparison of interest from an odds ratio of 0.28 (95% CrI: 0.17, 0.44) to 0.37 (95% CrI: 0.23, 0.58). Throughout all outcomes, the regression estimates differed substantially between AgD and IPD methods, with the latter more often being larger in magnitude and statistically significant.


Overall, the use of IPD often impacted the coefficient estimates, but not sufficiently as to necessitate altering the final recommendations of the 2018 WHO Guidelines. Future work should examine the features of a network where adjustments will have an impact, such as how much IPD is required in a given size of network.



With an ever-growing number of scientific publications, the need for meta-analysis to help make sense of the evidence continues to escalate [1]. Meta-analyses require that the included studies be sufficiently similar; otherwise resulting estimates may be biased due to imbalances between studies in the distribution of trial or patient characteristics that affect the relative effectiveness of the interventions being compared, named effect-modifiers [2]. Meta-regression has long been used to overcome such biases, as well as improve precision [3].

Meta-analyses typically consist of combining aggregate data (AgD) results from publications. As such, meta-regression most commonly consists of conducting linear regression of the study results as a function of an effect modifier, both in the aggregate. Two potential limitations to this form of meta-regression are a limited number of data points to reliably estimate trends and the risk of ecological fallacy (when trends at the trial level do not match trends at the individual level) [4]. A less common form of meta-regression involves using individual patient data (IPD), with or without AgD [5]. The use of IPD is less common, primarily due to the additional complications in obtaining such data [6]. Nonetheless, IPD meta-analysis can help overcome the two aforementioned limitations of AgD meta-regression [7]. Conducting meta-regression on patient-level values provides more data points, which also lends itself better to simultaneously adjusting for multiple variables [8].

Network meta-analysis (NMA) is an expansion of traditional meta-analysis that allows for the simultaneous analysis of multiple comparisons within a connected network of evidence [9]. Meta-regression is also an important technique to improve validity and precision of estimates in NMA [2, 10]. Given that NMA lends itself to larger evidence bases, the most common manner in which IPD is used in NMA is in analyses that include both IPD and AgD [11]. There are various ways by which to use IPD and AgD to conduct meta-regression, including two-stage approaches (whereby adjusted AgD are created using the IPD) [6], and one-stage approaches that integrate IPD and AgD together using hierarchical models [12,13,14].

In 2005, Simmonds et al. reported that 28/44 (63%) published IPD meta-analyses used the two-stage approach to IPD-AgD NMA. In a more recent 2015 review, the same researchers report roughly even use of one- and two-stage approaches, though outside of survival outcomes, the use of one-stage IPD-AgD NMA has become more popular [6]. There have also been further developments of one-stage methods, with Jackson et al. developing an expanded hierarchical method that may improve IPD-AgD meta-analysis by further reducing the risk of ecological fallacy [15, 16]. However, few studies have examined how the different types of meta-regression compare in their ability to improve analyses and conclusions. We sought to use a case study to examine whether, and which type of, meta-regression (AgD or IPD) makes such improvements.

The case study we used was a systematic literature review (SLR) and NMA that helped inform the 2016 World Health Organization (WHO) HIV clinical guidelines. The 2016 SLR found evidence of improved efficacy and tolerability of dolutegravir (DTG) relative to standard-dose efavirenz (EFV), the preferred first-line anchor treatment [17]. Following its completion, we sought IPD, independent of updating guidelines, for the comparison of AgD and IPD meta-regression methods and to see if more precise estimates might lead to stronger conclusions. In the 2016 analyses, DTG was nominally better than other treatments in its class, the integrase strand transfer inhibitors (INSTIs); however, these differences were seldom statistically significant. In the same year, IAS-USA released its own clinical guidelines that suggested that all INSTIs were equivalent [18, 19]. We sought to further investigate this point.

The primary objective of this study, which was part of a doctoral thesis [20], was to compare the impact of using different established AgD- and IPD-based methods for meta-regression adjustments. A secondary objective was also to examine the change in outputs in the evidence synthesis of antiretroviral therapy (ART) among first-line HIV patients when including IPD – with a particular focus on the relative efficacy, safety and tolerability of DTG relative to other anchor treatments.


Systematic literature review

Study eligibility aligned with the review for the WHO Guideline update [21]. Briefly, eligible studies were randomized controlled trials (RCTs) comparing first-line ART regimens among adults and adolescents living with HIV. Eligible treatments were DTG, standard-dose EFV, low-dose (400 mg) efavirenz (EFV400), raltegravir (RAL), cobicistat-boosted elvitegravir (EVG/c), bictegravir (BIC), doravirine (DOR), rilpivirine (RPV), nevirapine (NVP), and ritonavir-boosted darunavir (DRV/r), atazanavir (ATV/r), and lopinavir (LPV/r); each in combination with a backbone of two nucleoside reverse transcriptase inhibitors (NRTIs). The full PICOS (population, intervention, comparator, outcomes, study design) criteria are provided in Additional file 1: Web Appendix.

A comprehensive systematic search of the literature was conducted on 12 February 2018 using the following databases: MEDLINE, EMBASE, and CENTRAL (see Additional file 1: Web Appendix for search strategy). Further manual searches of the 2016–2018 Conference on Retroviruses and Opportunistic Infections (CROI), the 2016 AIDS and Glasgow HIV conferences, and the 2017 International AIDS Society (IAS) conference were conducted. Additional studies were identified through a review of clinical trial registries and the reference lists of identified publications. Two investigators, working independently, scanned all titles and abstracts identified in the literature search and reviewed subsequent full-texts. A third investigator provided arbitration as needed for discrepancies. The same approach was used for data extraction.

On 15 August 2016, IPD from three RCTs available through (CSDR) were formally requested. These were FLAMINGO (DRV/r + 2 NRTIs vs DTG + 2 NRTIs) [22, 23], SINGLE (DTG + ABC + XTC vs EFV + TDF + XTC) [24,25,26,27], and SPRING-2 (DTG + 2 NRTIs vs RAL + 2 NRTIs) [28, 29]. Access to the data was granted on 06 June 2017. In hindsight, there was one more eligible trial that was available at the time through this service, namely the phase 2 SPRING-1 [30, 31]; however, it was still included in the analysis through AgD.

The validity of individual RCTs was assessed using the Risk of Bias instrument, endorsed by the Cochrane Collaboration [32]. This instrument is used to evaluate 7 key domains: sequence generation; allocation concealment; blinding of participants and personnel; blinding of outcome assessors; incomplete outcome data; selective outcome reporting; and other sources of bias.

Reporting is in accordance with the preferred reporting items for systematic review and meta-analysis of individual participant data (PRISMA-IPD) guidelines [33].

Preparation of the individual patient data

IPD were provided in a series of lengthwise tables following the Clinical Data Interchange Standards Consortium (CDISC) standards. Using these tables, an amalgamated IPD set combining all three studies was prepared. The patients were restricted to the full analysis sets, as in each of the respective trials [22, 29, 34]. The following outcomes were obtained: Viral suppression and change from baseline in CD4 cell counts at 24, 48 and 96 weeks; discontinuations, discontinuations due to adverse events, serious adverse events. There were no missing values except for CD4, for which analyses were only conducted on the observed data. Data were further verified to ensure that published results for each trial could be obtained from the IPD.

Statistical models

Only select outcomes were used for the purpose of comparing the various statistical models of interest for conducting meta-regression adjustments with IPD and AgD. Assessing the impact on the HIV related results involved applying the preferred adjustment method to the remaining outcomes. The statistical models are presented below. Only the more complex random-effects models are presented, but both fixed- and random-effects were considered throughout.


The unadjusted AgD NMA served as the “baseline” analysis from which to draw comparisons. The model is as follows:

$$ {\displaystyle \begin{array}{c}{\theta}_{jk}=\left\{\begin{array}{c}{\mu}_{jb}\kern5.8em \mathrm{if}\ k=b\\ {}{\mu}_{jb}+{\delta}_{jb k}\kern3em \mathrm{if}\ k\succ b\end{array}\right.\\ {}{\delta}_{jb k}\sim Normal\left({d}_{bk},{\sigma}^2\right)=\kern0.5em Normal\left({d}_{Ak}-{d}_{Ab},{\sigma}^2\right)\\ {}{d}_{AA}=0\end{array}} $$

In this equation, \( {\theta}_{jk} \) reflects the ‘underlying’ outcome for treatment k in study j that has been link-function-transformed to a normally distributed scale (e.g., logit link for dichotomous outcomes). \( {\delta}_{jbk} \) is the trial-specific treatment effect of treatment k relative to treatment b. These trial-specific effects are drawn from a random-effects distribution: \( {\delta}_{jbk}\sim Normal\left({d}_{bk},{\sigma}^2\right) \). The pooled effects, \( {d}_{bk} \), are identified by expressing them in terms of the reference treatment A. The heterogeneity \( {\sigma}^2 \) is assumed constant for all treatment comparisons.
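As an illustration of the consistency relation in the model above, the following minimal Python sketch derives a pooled effect between two non-reference treatments from basic parameters expressed relative to the reference treatment A, and draws one trial-specific effect from the random-effects distribution. The treatment labels, numerical values, and function names are hypothetical and are not taken from the analysis.

```python
import random

def pooled_effect(d_A, b, k):
    """Consistency relation: d_bk = d_Ak - d_Ab."""
    return d_A[k] - d_A[b]

def draw_trial_effect(d_bk, sigma, rng):
    """Draw delta_jbk ~ Normal(d_bk, sigma^2); sigma is the between-study SD."""
    return rng.gauss(d_bk, sigma)

# Hypothetical basic parameters on the link scale; d_AA = 0 by definition.
d_A = {"A": 0.0, "B": 0.40, "C": 0.65}

d_BC = pooled_effect(d_A, "B", "C")  # pooled effect of C relative to B
delta = draw_trial_effect(d_BC, sigma=0.1, rng=random.Random(1))
print(f"d_BC = {d_BC:.2f}; one simulated trial-specific effect = {delta:.3f}")
```

Because every contrast is built from the basic parameters, only the effects relative to A need to be estimated.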

AgD NMA with meta-regression

Traditional meta-regression for NMA was applied as described in the NICE Technical Support Document 3 [2] and the statistical analysis plan (SAP) [35].

Two-stage IPD-AgD NMA

For these analyses, aggregate values for the DTG trials were calculated using the IPD. Specifically, mixed linear regression among the IPD was used to model each outcome adjusted for candidate covariates and provide predicted estimates of the aggregate value within the target population. The adjusted values were then simply applied to the above methods.
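The first stage of this approach can be sketched as follows. This is a deliberately simplified illustration: it uses ordinary least squares with a single covariate for a continuous outcome rather than the mixed linear regression used in the analysis, and the data, covariate, and function names are hypothetical.

```python
from statistics import mean

def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def adjusted_arm_mean(x, y, target_x):
    """Predicted aggregate outcome for one arm at the target covariate value."""
    intercept, slope = ols_fit(x, y)
    return intercept + slope * target_x

# Hypothetical IPD for one arm: baseline log10 HIV RNA and change in CD4.
log_rna = [4.0, 4.5, 5.0, 5.5]
cd4_change = [200.0, 210.0, 220.0, 230.0]

# Hypothetical mean baseline log10 HIV RNA in the target population.
adjusted = adjusted_arm_mean(log_rna, cd4_change, target_x=4.8)
print(f"Adjusted arm-level mean change in CD4: {adjusted:.1f}")
```

In the second stage, adjusted arm-level values of this kind would replace the published aggregate values for the IPD trials in the AgD models.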

One-stage IPD-AgD NMA with and without adjustments

IPD and AgD were combined, along with meta-regression, in a single model, which has the advantage of using all data at once. The model is shown in eq. (2), where \( {\theta}_{ijk} \) is the link-function-transformed parameter from the likelihood function of interest for the ith individual, in the jth trial, treated with treatment k. Similarly, \( {\eta}_{jk} \) is the link-function-transformed parameter from the likelihood function for the AgD. \( {\mu}_{jb} \) and \( {\lambda}_{jb} \) are the study effects for the IPD and AgD, respectively. When meta-regression adjustment is included, \( {\beta}_{0j} \) is a study-specific effect of the subject-level covariate \( {x}_{ij} \) in the IPD, and \( {\beta}_{1Ak}-{\beta}_{1Ab} \) reflects the interaction effect of covariate \( {x}_{ij} \) for treatment k relative to control treatment b. The model estimates k-1 different regression coefficients \( {\beta}_{1Ak} \). Parameters of primary interest from analyses are the pooled estimates of \( {d}_{Ak} \), the estimates for the heterogeneity, and the treatment-by-covariate interaction effects \( {\beta}_{1Ak} \).

$$ {\displaystyle \begin{array}{c}\mathrm{IPD}\\ {}{\theta}_{ijk}=\left\{\begin{array}{c}{\mu}_{jb}+\sum \limits_l{\beta}_{0 lj}{x}_{lij}\kern16em \mathrm{if}\ k=b\\ {}{\mu}_{jb}+{\delta}_{jb k}+\sum \limits_l{\beta}_{0 lj}{x}_{lij}\kern0.5em +\kern0.5em \sum \limits_l\left({\beta}_{1 lAk}-{\beta}_{1 lAb}\right){x}_{lij}\kern2.75em \mathrm{if}\ k\succ b\end{array}\right.\\ {}\begin{array}{c}\mathrm{AgD}\\ {}{\eta}_{jk}=\left\{\begin{array}{c}{\lambda}_{jb}\kern16em \mathrm{if}\ k=b\\ {}{\lambda}_{jb}+{\delta}_{jb k}+\sum \limits_l\left({\beta}_{1 lAk}-{\beta}_{1 lAb}\right)x.{agg}_{lj}\kern3em \mathrm{if}\ k\succ b\end{array}\right.\\ {}\begin{array}{c}{\delta}_{jb k}\sim Normal\left({d}_{bk},{\sigma}^2\right)=\kern0.5em Normal\left({d}_{Ak}-{d}_{Ab},{\sigma}^2\right)\\ {}{d}_{AA}=0,{\beta}_{1 AA}=0\kern2em {d}_{Ak}\sim Normal\left(0,1000\right),{\beta}_{lk}={b}_l,{b}_l\sim Normal\left(0,1000\right)\end{array}\end{array}\end{array}} $$
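To make the structure of eq. (2) concrete, the following sketch evaluates the two linear predictors for a single covariate. It is only an illustration of how the IPD and AgD parts share the trial-specific effect and the interaction coefficients; the function names and all numerical values are hypothetical.

```python
def theta_ipd(mu_jb, delta_jbk, beta0_j, beta1_Ak, beta1_Ab, x_ij, is_baseline_arm):
    """Linear predictor for individual i in IPD trial j (single covariate)."""
    lp = mu_jb + beta0_j * x_ij
    if not is_baseline_arm:
        lp += delta_jbk + (beta1_Ak - beta1_Ab) * x_ij
    return lp

def eta_agd(lambda_jb, delta_jbk, beta1_Ak, beta1_Ab, x_agg_j, is_baseline_arm):
    """Linear predictor for an arm of AgD trial j, using the trial-level mean covariate."""
    if is_baseline_arm:
        return lambda_jb
    return lambda_jb + delta_jbk + (beta1_Ak - beta1_Ab) * x_agg_j

# Hypothetical values on the logit scale.
ipd_lp = theta_ipd(mu_jb=1.0, delta_jbk=0.5, beta0_j=0.2,
                   beta1_Ak=0.3, beta1_Ab=0.1, x_ij=2.0, is_baseline_arm=False)
agd_lp = eta_agd(lambda_jb=1.0, delta_jbk=0.5,
                 beta1_Ak=0.3, beta1_Ab=0.1, x_agg_j=2.0, is_baseline_arm=False)
print(ipd_lp, agd_lp)
```

The prognostic term with \( {\beta}_{0j} \) appears only in the IPD part, whereas the interaction term is informed by both data sources.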

Two-stage IPD-AgD NMA with empirical-priors

These models were the same as described in (2), except that the regression coefficients were provided with an empirical prior that was informed by the IPD. Rather than start with the non-informative prior for β1Ak, the IPD were first used to estimate meta-regression coefficients using mixed-effects linear regression. The estimates and standard errors of the meta-regression were used to construct an empirical prior: \( {\beta}_{1 Ak}\sim Normal\left(\hat{\beta},{prec}_{\hat{\beta}}\right) \). The idea here is to ensure that the IPD principally inform the meta-regression (potentially avoiding some ecological fallacy bias).
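The construction of the empirical prior can be sketched as below, assuming the second argument of the Normal prior is a precision (the inverse of the squared standard error), as the notation above suggests and as JAGS parameterizes its normal distribution. The function name and the numerical values are hypothetical.

```python
def empirical_prior(beta_hat, standard_error):
    """Return (mean, precision) of a Normal prior built from an IPD regression estimate."""
    if standard_error <= 0:
        raise ValueError("standard_error must be positive")
    precision = 1.0 / standard_error ** 2
    return beta_hat, precision

# Hypothetical first-stage estimate of a treatment-by-covariate interaction.
prior_mean, prior_precision = empirical_prior(beta_hat=0.5, standard_error=0.2)
print(f"beta_1Ak ~ Normal(mean={prior_mean}, precision={prior_precision:.1f})")
```

A smaller standard error from the IPD regression yields a higher precision, so the prior constrains the coefficient more strongly.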

One-stage IPD-AgD NMA with hierarchical meta-regression

The final model that was considered was an expansion of one-stage IPD-AgD NMA that applies the hierarchical meta-regression adjustments first described by Jackson et al. and developed for NMA by Jansen et al. [15, 16]. Unfortunately, these methods have only been developed for binomial outcomes. The model is shown in (3). It shares the same notations as (2).

$$ {\displaystyle \begin{array}{c}\mathrm{IPD}\\ {}{m}_{ij k}\sim Bernoulli\left({p}_{ij k}\right)\\ {}\begin{array}{c} logit\left({p}_{ij k}\right)=\left\{\begin{array}{c}{\mu}_{jb}+{\beta}_0{x}_{ij}\kern16em \mathrm{if}\ k=b\\ {}{\mu}_{jb}+{\delta}_{jb k}+{\beta}_0{x}_{ij}+\left({\beta}_{1 Ak}-{\beta}_{1 Ab}\right){x}_{ij}\kern3em \mathrm{if}\ k\succ b\end{array}\right.\\ {}\mathrm{AgD}\\ {}\begin{array}{c}{r}_{jk}\sim Binomial\left({q}_{jk},{n}_{jk}\right)\\ {}{q}_{jk}={q}_{jk}^0\left(1-x.{agg}_j\right)+{q}_{jk}^1x.{agg}_j\\ {}\begin{array}{c} logit\left({q}_{jk}^0\right)=\left\{\begin{array}{c}{\lambda}_{jb}\kern20em \mathrm{if}\ k=b\\ {}{\lambda}_{jb}+{\delta}_{jb k}\kern16.5em \mathrm{if}\ k\succ b\end{array}\right.\\ {} logit\left({q}_{jk}^1\right)=\left\{\begin{array}{c}{\lambda}_{jb}+{\beta}_0\kern17.25em \mathrm{if}\ k=b\\ {}{\lambda}_{jb}+{\delta}_{jb k}+{\beta}_0+\left({\beta}_{1 Ak}-{\beta}_{1 Ab}\right)\kern5.5em \mathrm{if}\ k\succ b\end{array}\right.\\ {}\begin{array}{c}{\delta}_{jb k}\sim Normal\left({d}_{bk},{\sigma}^2\right)=\kern0.5em Normal\left({d}_{Ak}-{d}_{Ab},{\sigma}^2\right)\\ {}{d}_{AA}=0,{\beta}_{1 AA}=0\kern1.5em {d}_{Ak}\sim Normal\left(0,1000\right),{\beta}_{lk}={b}_l,{b}_l\sim Normal\left(0,1000\right)\end{array}\end{array}\end{array}\end{array}\end{array}} $$

The IPD part of this model is the same as that of the one-stage IPD-AgD NMA with adjustments, with the exception that \( {\beta}_0 \) is not study specific but fixed across studies because it is now also used in the AgD part of the model (which reflects different studies). For the AgD part of the model, the number of events \( {r}_{jk} \) in study j for treatment k is assumed to be binomially distributed with probability \( {q}_{jk} \) and sample size \( {n}_{jk} \). \( {q}_{jk} \) can be considered as the average probability of the response of interest for an individual in study j treated with intervention k.

The covariate adjustment values \( {\beta}_{1Ak} \) are distinct from those used in previous equations in that they are patient-level effects rather than trial-level effects. Even in the other IPD models, the effects are trial-level because they are estimated by both IPD and AgD. In (3) the values \( {q}_{jk}^0 \) and \( {q}_{jk}^1 \) are latent probabilities; therefore, it is not possible to point identify \( {\beta}_0 \) and \( {\beta}_{1Ak} \) from AgD only. As such, these are solely estimated through IPD, which removes the possibility of the ecological fallacy bias entirely.
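The AgD part of (3) can be illustrated with a short sketch for a binary covariate (for example, sex), where the trial-level value is the proportion of patients with the covariate equal to 1. The arm-level probability is a mixture of the two latent probabilities weighted by that proportion. The function names and numerical values are hypothetical.

```python
import math

def inv_logit(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))

def agd_arm_probability(lambda_jb, delta_jbk, beta0, beta1_Ak, beta1_Ab,
                        x_agg_j, is_baseline_arm):
    """q_jk = q0_jk * (1 - x.agg_j) + q1_jk * x.agg_j for a binary covariate."""
    lp0 = lambda_jb           # linear predictor when the covariate equals 0
    lp1 = lambda_jb + beta0   # linear predictor when the covariate equals 1
    if not is_baseline_arm:
        lp0 += delta_jbk
        lp1 += delta_jbk + (beta1_Ak - beta1_Ab)
    q0, q1 = inv_logit(lp0), inv_logit(lp1)
    return q0 * (1.0 - x_agg_j) + q1 * x_agg_j

# Hypothetical trial in which 80% of patients have the covariate equal to 1.
q = agd_arm_probability(lambda_jb=1.5, delta_jbk=0.4, beta0=-0.2,
                        beta1_Ak=0.3, beta1_Ab=0.0, x_agg_j=0.8,
                        is_baseline_arm=False)
print(f"Average response probability in the arm: {q:.3f}")
```

Averaging on the probability scale rather than the logit scale is what distinguishes this formulation from the AgD part of (2).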

Statistical analyses

The following outcomes were used for the comparison of meta-regression methods: viral suppression and change from baseline in CD4 cell counts at 48 weeks (+/− 4 weeks), discontinuations, and discontinuations due to adverse events. We selected these because DTG and EFV400 are viewed to have as good or better efficacy and improved tolerability relative to EFV [36]. The target population was set to be the average population amongst EFV patients, the recommended preferred first-line regimen at the time. The following baseline variables were considered for covariate adjustments: CD4 cell counts, viral RNA (log-transformed), and proportion of males.

The three trials for which IPD were available tended to include healthier patients (higher baseline CD4 and lower baseline HIV RNA) and more males than the average EFV trial. In addition to being imbalanced, these factors were both plausible effect-modifiers and well-reported. Analyses consisted of comparing the modeling approaches described in the previous section. Identity link functions with Normal likelihoods were used for continuous outcomes. For dichotomous outcomes, logit link functions were used.

To assess the different models, the following measures were compared:

  • Treatment-effect estimates and posterior distributions of key comparisons.

  • Coefficient estimates and posterior distributions

  • Deviance information criterion (DIC) value comparisons across models, as well as pD and deviance

  • Between-study heterogeneity (between-study variance of the modelled outcome, e.g., log odds ratio [OR]; as calculated in the random-effects model)

  • The proportion of points falling outside the lines c = 3 and c = 4 within leverage plots (the curves are of the form \( x^2 + y = c \)). Points outside of the lines with c = 3 can generally be identified as contributing to the model’s poor fit (see TSD2) [37].

  • Change in SUCRA (surface under the cumulative ranking curve) scores
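For reference, a SUCRA score can be computed from a treatment's rank probabilities as the average of its cumulative rank probabilities over the first a − 1 ranks, where a is the number of treatments. The sketch below assumes rank 1 is best; the rank probabilities shown are hypothetical.

```python
def sucra(rank_probs):
    """SUCRA from rank probabilities ordered from rank 1 (best) to rank a (worst)."""
    n_treatments = len(rank_probs)
    if n_treatments < 2:
        raise ValueError("at least two treatments are required")
    cumulative = 0.0
    total = 0.0
    for p in rank_probs[:-1]:   # cumulative probabilities for ranks 1..a-1
        cumulative += p
        total += cumulative
    return total / (n_treatments - 1)

# Hypothetical rank probabilities for one treatment in a four-treatment network.
score = sucra([0.6, 0.3, 0.1, 0.0])
print(f"SUCRA = {score:.3f}")
```

A treatment that is certain to rank first has a SUCRA of 1, and one that is certain to rank last has a SUCRA of 0.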

The posterior distributions for treatment-effect estimates are the outputs that are subsequently used to draw inference and for decision-making in Bayesian modeling. Therefore, this was a primary measure of modeling impact. There were no specific hypotheses regarding how these would be affected beforehand. For comparisons in treatment-effect, the absolute effect was used because it is the most interpretable. For example, a difference of 5% in the proportion of viral suppression is more interpretable than a difference of 1.5 in the logarithm of the odds ratio. For the dichotomous variables, a difference of 1% was chosen as the threshold of minimal clinically important difference. For a change in CD4, a difference of 10 cells/mm3 was chosen to align with the values that were used in the WHO reviews. The SAP for this study was publicly available prior to conducting the analyses and provides further details regarding methods [35].
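The conversion from a relative effect to an absolute effect can be sketched as follows for a dichotomous outcome. The odds ratio and the reference-arm probability are hypothetical; only the 1% threshold comes from the text above.

```python
def absolute_difference(odds_ratio, p_ref):
    """Difference in proportions implied by an odds ratio and a reference probability."""
    odds_ref = p_ref / (1.0 - p_ref)
    odds_trt = odds_ratio * odds_ref
    p_trt = odds_trt / (1.0 + odds_trt)
    return p_trt - p_ref

MCID = 0.01  # 1% threshold for dichotomous outcomes, as stated in the text

# Hypothetical inputs: odds ratio of 2.0 and 80% response in the reference arm.
diff = absolute_difference(odds_ratio=2.0, p_ref=0.80)
exceeds_mcid = abs(diff) >= MCID
print(f"Absolute difference = {diff:.3f}; exceeds MCID: {exceeds_mcid}")
```

The same odds ratio implies different absolute differences at different reference probabilities, which is why the target population matters for interpretation.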


The parameters of the different models were estimated using a Markov Chain Monte Carlo method implemented in the JAGS software package. The first 30,000 iterations from the JAGS sampler were discarded as ‘burn-in’, and inferences were based on an additional 50,000 iterations using two chains. For all analyses, model convergence was assessed through trace plots, density plots and Gelman-Rubin-Brooks (shrink factor) plots [38]. All analyses were performed using R version 3.4.4 and JAGS version 4.3. Code used to conduct the analyses is presented in the Additional file 1: Web Appendix.
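For readers unfamiliar with the shrink factor, the sketch below computes the basic Gelman-Rubin potential scale reduction factor from post-burn-in draws of a single parameter across chains. It is not the code used in the analysis, which relied on R and JAGS, and it omits the refinements found in standard software; the chains shown are hypothetical and far shorter than those described above.

```python
import math
from statistics import mean, variance

def shrink_factor(chains):
    """Basic Gelman-Rubin potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    if len(chains) < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length")
    within = mean(variance(c) for c in chains)          # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return math.sqrt(pooled / within)                   # undefined if all chains are constant

# Hypothetical chains: the first pair agrees, the second pair is stuck apart.
converged = shrink_factor([[0.1, 0.3, 0.2, 0.4], [0.2, 0.4, 0.1, 0.3]])
not_converged = shrink_factor([[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]])
print(f"{converged:.2f} vs {not_converged:.2f}")
```

Values close to 1 suggest that the chains are sampling from the same distribution, whereas values well above 1 indicate a lack of convergence.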


Evidence base

Study and patient selection are presented in the PRISMA-IPD [33] flow diagram in Fig. 1. The search was conducted in three phases: the first search of AgD was conducted in May 2015 (the original SLR), a search for IPD was conducted on 15 August 2016, and then an updated search of AgD was conducted on 12 February 2018. The IPD search in 2016 involved both YODA and CSDR; however, data were only obtained through CSDR. These included 2160 patients from FLAMINGO (DRV/r vs. DTG) [22, 23], SINGLE (DTG vs EFV) [24,25,26,27], and SPRING-2 (DTG vs RAL) [28, 29]. As shown in Fig. 1, the 2160 patients for which individual patient-level data were available represent 6.5% of the total evidence base (2160/33,148), and as shown in Fig. 2, the three trials cover a total of 3 of 24 edges (12.5%; shown in red) with trials providing head-to-head evidence.

Fig. 1 PRISMA-IPD flow diagram for identification and selection of randomized clinical trials in the evidence base

Fig. 2 Network of evidence showing all treatments and the trial comparisons available in the evidence base. Legend: Circles (nodes) in the diagrams represent individual treatments, lines between circles represent availability of head-to-head evidence between two treatments, and the numbers on the lines are the number of RCTs informing each head-to-head comparison. Blue: NNRTIs; Green: Protease inhibitors; Orange: Integrase inhibitors. ATV/r: ritonavir-boosted atazanavir; DRV/r: ritonavir-boosted darunavir; DTG: dolutegravir; EFV: efavirenz; EFV400: efavirenz 400; EVG/c: elvitegravir/cobicistat; LPV/r: ritonavir-boosted lopinavir; NVP: nevirapine; RAL: raltegravir; RPV: rilpivirine; BIC: bictegravir; DOR: doravirine

Overall study quality was generally high (i.e., low risk of bias). Exceptions were restricted to open-label trials having a high risk of bias due to blinding and some of the more recent trials that were only reported upon in posters having insufficient information to determine with certainty that the risk of bias was either low or high (Additional file 1: Web Appendix).

The patient characteristics have been described previously [21]. As shown in Figures 1–4 of the Additional file 1: Web Appendix, in addition to being the variables that were best reported in the evidence base, the covariates selected for adjustments in this study had a high degree of variability. This was especially apparent in the baseline CD4. For completeness, the reported results by study are provided in Tables 4–5 of the Additional file 1: Web Appendix.

Comparing meta-regression adjustments

Overall, the use of IPD appeared to have a negligible impact on the results. In each outcome, the use of IPD impacted an aspect of the results – say DIC, rankings or covariate estimates – but the aspect affected changed from one outcome to the next and tended to not be meaningful. The full set of results is shown for viral suppression at 48 weeks. For the remaining primary outcomes, tables and figures are presented in the Additional file 1: Web Appendix and only key highlights are discussed here.

Table 1 presents the model fit for the various models of interest for viral suppression at 48 weeks. The lowest DIC was for the unadjusted one-stage IPD-AgD NMA; however, the difference between it and the base model was not meaningful (requires a difference ≥ 3, as per SAP). The fit using the one-stage IPD-AgD NMA was considerably better than that using informative priors based on external analyses (two-stage empirical-priors approach). The use of IPD appeared to have minimal impact on the heterogeneity parameter estimate for this outcome (as calculated by the random-effects model). The proportion of observations above the third and fourth parabola in the leverage vs deviance plot tended to be stable. Nonetheless, the trend was towards having more outliers among the two-stage AgD NMA.
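The two model-fit checks described here can be sketched as follows. The sketch assumes the usual definition of DIC as the posterior mean deviance plus pD, and the leverage-plot convention in which x is the signed square root of a data point's deviance contribution and y is its leverage. The threshold of 3 is taken from the SAP as stated above, and the two DIC values are those reported for change in CD4 at 48 weeks; the leverage-plot points and function names are hypothetical.

```python
def dic(mean_deviance, p_d):
    """DIC as posterior mean deviance plus effective number of parameters."""
    return mean_deviance + p_d

def meaningfully_better(dic_candidate, dic_base, threshold=3.0):
    """True if the candidate model lowers DIC by at least the threshold."""
    return (dic_base - dic_candidate) >= threshold

def proportion_outside(points, c):
    """Share of (x, leverage) points lying outside the parabola x^2 + y = c."""
    outside = sum(1 for x, leverage in points if x * x + leverage > c)
    return outside / len(points)

# Reported DICs for change in CD4 at 48 weeks: best two-stage model vs base model.
print(meaningfully_better(182.02, 183.63))

# Hypothetical leverage-plot points.
points = [(0.5, 0.5), (2.0, 0.5), (1.0, 1.0), (1.8, 0.2)]
print(proportion_outside(points, c=3), proportion_outside(points, c=4))
```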

Table 1 Comparison of model selection and fit of all models considered in the analysis for viral suppression at 48 weeks as an outcome

Rankings remained generally unchanged by the model choice. Change in rankings tended to happen in the models with the highest DICs and hence those were not at risk of being favoured. Changes in the top three rankings tended to be limited to a re-ordering of the same treatments, with DTG usually remaining on top (Additional file 1: Web Appendix).

Table 2 presents the estimated effects for the comparisons of primary interest (DTG, EFV400 and EFV). Meta-regression adjustments based on IPD tended to lower the estimated efficacy of DTG, but almost never rendered it non-significant. The exception was the use of hierarchical meta-regression, which was limited to single variable adjustments. Importantly, these analyses included much wider credible intervals than other analyses and this was consistently observed throughout the outcomes. This aligns with results previously presented by Jansen [16]. The analyses also led to the largest shifts in estimates and these were in either direction depending on the variable of adjustment. While these methods are noted for increasing validity, we cannot conclude bias in the previous analyses on the basis of these results. Mean and maximum changes in the log-odds were large across all analyses. These changes are more easily interpretable through the change in proportions, where the maximum change was often close to 4%. The difference between 86 and 90% of patients being virally suppressed would have important implications.

Table 2 Comparison of comparative treatment estimates of all models considered in the analysis for viral suppression at 48 weeks as an outcome

The estimated coefficients across the analyses are presented in Table 3. When comparing the meta-regression coefficients, the coefficient for CD4 was statistically significant in each of the IPD analyses that included it as a covariate. Moreover, its estimated effect size was consistent across the models using IPD. The coefficient estimates were notably different across AgD and IPD models, most markedly for HIV RNA.

Table 3 Coefficient estimates across the IPD-AgD NMA of all models considered in the analysis for viral suppression at 48 weeks as an outcome

For change from baseline in CD4 at 48 weeks, no models led to a meaningfully lower DIC than the unadjusted AgD NMA; however, contrary to viral suppression, here it was the two-stage models that appeared to have the best fit (DIC ranging from 182.02 to 184.23, relative to 183.63 for the base model) among the IPD adjusted models (DIC up to 191.41 for the rest). Moreover, the two-stage analyses also reduced the number of points outside the fourth parabola in the leverage plots (0 vs. 1–3), suggesting an overall better fit to the data. The rankings were the measure most affected by choice of model for CD4. DTG was ranked first in the base case and in the IPD-AgD NMA, but EFV400 was ranked first when using AgD meta-regression and two-stage IPD-AgD NMA. DTG remained the favoured treatment in the one-stage and two-stage empirical-priors approaches. With respect to the research question at hand, using a two-stage approach would impact how data were interpreted, given the change in rankings, particularly with DTG becoming a mid-ranked treatment and EFV400 becoming the top-ranked treatment.

Finally, with respect to CD4, most regression coefficients were not statistically significant, but similarly to the viral suppression analysis, the estimated coefficients using IPD were substantially different from those obtained through AgD meta-regression. For example, the effect of baseline HIV RNA went from 2.5 (95% CrI: − 21.2, 26.7) to 45.5 (95% CrI: 31.3, 59.9). In other words, the AgD meta-regression estimated that on average a trial initiating at a baseline HIV RNA that was one log unit higher led to a relative change in CD4 that was 2.5 cells/mm3 higher, whereas the one-stage IPD-AgD NMA estimated an average increase that was 45.5 cells/mm3 higher (keep in mind that trials did not differ by a full log unit of baseline HIV RNA).

For discontinuations, none of the models were meaningfully different from the base AgD NMA with respect to DIC. Change in estimates tended to be minimal across models. Interestingly, the exception to this was the HMR IPD-AgD NMA with adjustments for the proportion of males, which was also the model with the lowest DIC. In this model, both DTG (OR: 0.36; 95% CrI: 0.22–0.57) and EFV400 (OR: 0.61; 95% CrI: 0.30–1.23) were considerably more tolerable relative to EFV than in the unadjusted model, with an OR of 0.52 and 0.91, respectively.

Out of all the primary outcomes, only discontinuations due to adverse events had a model other than the unadjusted AgD NMA selected through a meaningfully lower DIC. In this case, it was the two-stage empirical priors approach with adjustments for the proportion of males that was selected with a DIC of 202.79 vs. 205.79. The one-stage analyses and two-stage empirical-priors analyses also led to a lower estimate of the between-study heterogeneity, suggesting that the adjustments helped account for between-study differences as well. The selected model shifted the principal comparison of interest from an OR of 0.28 (95% CrI: 0.17–0.44) to 0.37 (95% CrI: 0.23–0.58), but this would have little impact on decision making. With respect to absolute effects, most model adjustments led to minimal differences. This aligns well with the fact that none of the covariates were found to be statistically significant. The rankings were stable across models; however, with the selected model, DTG changed from being ranked 1st to being ranked 2nd.

Comparative efficacy and safety

Largely, results of the analyses for the secondary outcomes led to similar impacts to those observed in the selected four outcomes above. Only in the case of viral suppression at 96 weeks was the model adjusted for baseline HIV RNA selected (instead of the unadjusted model). As shown in Table 4, the DIC for the selected model was more than 12 units smaller than that of the AgD NMA. The table also shows that there are other adjustments that lead to similar DICs, but in this case, we selected the model with the smallest DIC. There was no meaningful impact with respect to rankings across outcomes.

Table 4 Comparison of model selection and fit for viral suppression at 96 weeks

The impact of adjustments with IPD on the actual estimates was noticeable, particularly in the case of viral suppression and change in CD4 cell counts at 96 weeks. In the case of viral suppression, the relative efficacy of DTG was reduced relative to both EFV and EFV400. In the selected model, the OR decreased from 1.94 (95% CrI: 1.52, 2.48) to 1.58 (95% CrI: 1.23, 2.03) relative to EFV, with a similar change relative to EFV400. While none of the effects changed with respect to statistical significance, the average change in modeled proportions was rather large at a mean shift of 4.1% in the selected model.


This study examined how the outputs of the evidence synthesis of first-line ART for HIV changed when IPD were included, and compared the extent of this impact across different established IPD-based methods for meta-regression adjustment that use a mixture of IPD and AgD. The four methods of adjusting for covariate imbalances using IPD that were compared are: a two-stage approach, a two-stage approach with empirical priors, a one-stage approach, and hierarchical meta-regression. In this case study, none of the four methods stood out as a clearly superior approach solely on the basis of the numerical results. Nonetheless, this study does provide insights into these methods of adjustment. First, while the four strategies were in general agreement in most analyses, there were situations where the results differed notably between the two-stage approach and the other approaches, and thus the choice of method matters. Second, the hierarchical meta-regression tended to lead to the most considerable changes in effect estimates, but did so at the steep cost of reduced precision. Third, there was a remarkable difference between the coefficient estimates obtained through IPD methods and those obtained through more traditional meta-regression using AgD only, suggesting that when adjustments are needed, IPD is more appropriate to use.

This study also aimed to understand the potential impact of including individual patient data for the particular application of comparing the therapeutic landscape of anchor treatments in first-line ART for the treatment of HIV. To this end, it was reassuring to find that the conclusions reached through the evidence synthesis supplemented by the individual patient data did not lead to changes that would have impacted the WHO change in guidelines that took place in December 2018 and subsequently in 2020 [39, 40].

The possibility that the limited impact of IPD on study results is due in part to the relatively small number of patients in the network providing IPD was investigated through a separate simulation study [41], which was born from this work. The aim of the simulation was to investigate various network factors that could be associated with the degree of benefit from including IPD, rather than to compare the various methods of adjustment, as was the goal here. The simulation study did find that the benefits of IPD are greater in small and/or sparse networks and that having too few IPD leads to negligible benefits. Another possible reason for the lack of differences between methods is the absence of ecological fallacy (whereby trends in AgD do not reflect the trends in IPD), which is the setting in which differences between IPD and AgD adjustments are most important. Nonetheless, it is important to note that while there were minimal differences in the results between the multiple modeling methods, this does not imply that there are no differences between the methods. Several differences are still distinguishable within this case study, as further explained below.
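
The ecological fallacy mentioned above can be made concrete with a small simulation. The Python sketch below is not taken from the study: it uses a continuous outcome, ten hypothetical trials, and invented parameter values, chosen so that the covariate trend across trial means (the AgD view) has the opposite sign to the trend within trials (the IPD view).

```python
import random


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(2021)   # fixed seed so the sketch is reproducible
WITHIN_TRIAL_SLOPE = 1.0    # hypothetical patient-level covariate effect

trials = []
for j in range(10):
    trial_mean_covariate = float(j)
    # Trial-level intercepts fall as the mean covariate rises (confounding by trial).
    intercept = -3.0 * trial_mean_covariate
    xs = [rng.gauss(trial_mean_covariate, 1.0) for _ in range(200)]
    ys = [intercept + WITHIN_TRIAL_SLOPE * x + rng.gauss(0.0, 1.0) for x in xs]
    trials.append((xs, ys))

# AgD view: regress trial-level mean outcomes on trial-level mean covariates.
mean_xs = [sum(xs) / len(xs) for xs, _ in trials]
mean_ys = [sum(ys) / len(ys) for _, ys in trials]
agd_slope = ols_slope(mean_xs, mean_ys)

# IPD view: centre each patient on their own trial's means, then pool.
centred_x, centred_y = [], []
for (xs, ys), mx, my in zip(trials, mean_xs, mean_ys):
    centred_x.extend(x - mx for x in xs)
    centred_y.extend(y - my for y in ys)
ipd_slope = ols_slope(centred_x, centred_y)

print(f"Across-trial (AgD) slope: {agd_slope:.2f}")
print(f"Within-trial (IPD) slope: {ipd_slope:.2f}")
```

When the two slopes agree, as the text suggests may have been the case in this evidence base, AgD and IPD adjustments would be expected to give similar answers.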

Despite the limited impact of IPD on the interpretation of the therapeutic landscape, there are a number of advantages to the use of IPD that were observed and that have been discussed previously [6]. First, IPD more easily allows for the simultaneous adjustment of multiple covariates because it provides many more degrees of freedom. In an AgD setting, only edges with multiple trials and differences in covariate values along those edges allow for the estimation of the covariate effect of interest. Second, the results of this study suggested that where traditional AgD meta-regression was feasible, it was underpowered, as demonstrated by the estimated coefficients. Under the assumption that the IPD estimates based on 2160 data points are more accurate than the meta-regression adjustments based on trends among a small number of aggregate data points, the large differences seen in estimates suggest inaccuracy in the AgD meta-regression.

There is a clear trend towards improved access to IPD and its increased use [11, 42, 43]. The most popular IPD methods have the distinct advantage of being able to adjust for unanchored networks, but require strong assumptions (no unobserved prognostic factors and effect-modifiers) and are usually limited to indirect comparisons [8, 44]. As the use of IPD increases, we can expect increased use of IPD-AgD NMA, such as the methods compared in this study. In terms of meta-analyses and network meta-analyses, there has been a shift from the predominant use of a two-stage approach to a one-stage approach [6]. As Simmonds et al. explain in their review, this is likely due to a growing familiarity with methods, improvements in computing and the recognition that regression models offer the greatest flexibility for IPD analysis [6]. The two-stage analyses in this study included the use of regression in the first stage, which was not always used in published two-stage analyses [6]. To the best of our knowledge, no study has compared the results of one-stage and two-stage IPD-AgD NMA directly. In most analyses, there were no meaningful differences in the results using either approach. Nonetheless, there were instances where one-stage and two-stage adjustments went in opposite directions. This may be a result of having the regression adjustments for the IPD done independently for each trial in the two-stage approach, rather than collectively. In the absence of differences, the two-stage approach had the advantage of being computationally less intensive and easier to code. Conversely, the one-stage approaches had the benefit of having more easily interpretable regression coefficients and having all the analytical steps combined. Given these advantages and the fact that the choice appeared to matter for some analyses, we recommend against using the traditional two-stage approach.
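
The general shape of a two-stage analysis can be sketched as follows. This is a deliberately simplified, frequentist illustration for a single pairwise comparison, not the Bayesian NMA models used in this study: the first stage is assumed to have already produced a covariate-adjusted log odds ratio and standard error from each trial's IPD (all values and trial labels below are hypothetical), and the second stage pools them with inverse-variance weights.

```python
import math

# Hypothetical stage-one output: covariate-adjusted log odds ratios and
# standard errors, each estimated independently within one IPD trial.
stage_one = [
    {"trial": "trial_A", "log_or": math.log(0.40), "se": 0.30},
    {"trial": "trial_B", "log_or": math.log(0.55), "se": 0.25},
    {"trial": "trial_C", "log_or": math.log(0.35), "se": 0.40},
]


def pool_fixed_effect(estimates):
    """Stage two: inverse-variance (fixed-effect) pooling of trial-level estimates."""
    weights = [1.0 / est["se"] ** 2 for est in estimates]
    total_weight = sum(weights)
    pooled = sum(w * est["log_or"] for w, est in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


pooled_log_or, pooled_se = pool_fixed_effect(stage_one)
lower = math.exp(pooled_log_or - 1.96 * pooled_se)
upper = math.exp(pooled_log_or + 1.96 * pooled_se)
print(f"Pooled OR: {math.exp(pooled_log_or):.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

Because each first-stage regression sees only its own trial, the covariate coefficients are estimated separately per trial, which is the feature the text identifies as a possible source of disagreement with one-stage models that estimate them collectively.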

The choice between one-stage IPD-AgD NMA and two-stage IPD-AgD NMA with empirical-priors is less straightforward, and is ultimately dependent on the evidence base at hand. The difference between these two approaches was much more subtle. The empirical-priors method does not appear to have been used previously. As described in the methods, the motivation for its use was to isolate the coefficient estimation to the IPD (i.e., reduce the influence of the AgD on the estimation of the regression adjustments). As such, the greater difference is seen in comparisons for which there is no IPD, so that this method becomes more important when there are numerous comparisons with AgD only. Inspection of the DTG vs. EFV estimates, for which there was an IPD trial, reveals that there was general agreement between the two modeling approaches (when keeping the same covariates). On the other hand, for the EFV400 vs. EFV comparisons, for which there were no IPD available, the difference was notable, with the empirical-priors approach leading to a larger shift in estimates. In situations where there is an abundant number of trials and treatment comparisons that have IPD, such as in the Donegan et al. example [45], the one-stage approach, which is already well adopted, would be recommended. For networks of evidence that have few treatment comparisons with IPD trials, the empirical-priors approach is likely to make the most of the available IPD.
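
The intuition behind isolating the coefficient estimation to the IPD can be illustrated with a conjugate normal update. This Python sketch is an assumption-laden simplification and not the authors' implementation: it treats the regression coefficient as a single normally distributed quantity, and every number in it is hypothetical. It contrasts an informative prior derived from IPD with a vague prior when the AgD carry only a weak, noisy signal about the coefficient.

```python
def normal_update(prior_mean, prior_sd, data_mean, data_sd):
    """Posterior mean and SD for a normal mean with known variances (conjugate update)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / data_sd ** 2
    posterior_precision = prior_precision + data_precision
    posterior_mean = (
        prior_precision * prior_mean + data_precision * data_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision ** -0.5


# Hypothetical coefficient estimated from IPD (used as the empirical prior).
IPD_COEFFICIENT, IPD_SD = 0.50, 0.10
# Hypothetical, noisy coefficient signal coming from across-trial AgD trends.
AGD_COEFFICIENT, AGD_SD = -0.20, 0.60

informative = normal_update(IPD_COEFFICIENT, IPD_SD, AGD_COEFFICIENT, AGD_SD)
vague = normal_update(0.0, 10.0, AGD_COEFFICIENT, AGD_SD)

print(f"Posterior with IPD-based prior: mean {informative[0]:.3f}, SD {informative[1]:.3f}")
print(f"Posterior with vague prior:     mean {vague[0]:.3f}, SD {vague[1]:.3f}")
```

With the IPD-based prior the posterior stays close to the IPD estimate, whereas with a vague prior it follows the AgD trend. This is consistent with the observation that the two approaches differ most for comparisons informed by AgD only.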

Although hierarchical meta-regression has shown some promising results, it appears that more research is still needed for these methods. Simulation work has suggested that these methods reduce bias [16], which is usually favoured over precision; however, the loss of precision observed in our work was not negligible. Moreover, it was difficult to use these methods with multiple variables at a time, and the methods for use on continuous outcomes have not yet been published. Once further advancements have been made to this method, it will be worthwhile to revisit the comparison with traditional one-stage analyses.

As discussed above, the implications for first-line ART regimens (i.e., our secondary objective) are minimal. The evidence continues to support DTG as the more efficacious and tolerable choice of treatment. In instances where adjusted models were selected, the differences between treatments tended to be less pronounced, although DTG continued to perform best with respect to viral suppression, change in CD4 and tolerability.

There are several limitations to this study. First and foremost, there were very few trials for which IPD were obtained, which is a problem commonly encountered by researchers. These represented a small fraction of the trials and patients and may explain why the impact on model estimates appeared to be somewhat muted (i.e., too few IPD may get washed out in a large network). The limitation of too few data was exacerbated by the missed opportunity to get IPD for the SPRING-1 trial. The oversight was identified too far along in the process and thus could not be corrected in time. Given that this was a small Phase 2 trial that would have added a small fraction of patients to an already small sample of IPD, the impact of including or excluding its IPD is very likely to be negligible. Moreover, the SPRING-1 trial was still included in the analyses. Second, use of a single case study, particularly one with few IPD relative to the size of the network, limits the generalizability of the comparisons between the different methods of adjustment to other settings. To this end, while some conclusions have been reached, further research will be needed. Third, it is unclear whether the multiple forms of meta-regression interfered with one another. To account for differences in backbone regimens, an arm-based meta-regression was used in addition to the more traditional trial/patient-based regression adjustments, and this may have been a nuisance to the modeling process. Fourth, the trials for which IPD were available were principally conducted in high-income countries, which may limit the ability to make the adjustments needed for studies conducted in LMICs. Nonetheless, there tended to be a wide range of values for the covariates of interest, so this is unlikely to have been an issue [22, 23, 25]. Fifth, specific to this evidence base, there were numerous other potential effect-modifiers that were too poorly reported to allow for meta-regression adjustments to be made. These principally included ethnicity and acquisition risk groups. Finally, due to low event counts and data unavailability, not all outcomes were available for re-analysis using IPD.


There are many ways in which IPD can be integrated with AgD for the purpose of NMA. Choosing the method by which to integrate these data will impact results. In most cases, the one-stage approach is recommended; however, in situations with fewer treatment comparisons that have IPD, the empirical-priors approach is a viable alternative. Further research is needed to understand whether having too few IPD can mitigate their beneficial impact. Finally, even with the revised analyses, DTG continues to demonstrate improved efficacy and tolerability over other anchor treatments.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request (applicable only to data extracted from published manuscripts). The individual patient data that support the findings of this study are available from GlaxoSmithKline, but restrictions apply to the availability of these data, which were used under license for the current study, and so they are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of GlaxoSmithKline.



Abbreviations

Adverse event
Aggregate data
Aggregate data network meta-analysis without adjustments
Aggregate data network meta-analysis with meta-regression adjustments
Antiretroviral therapy
Ritonavir-boosted atazanavir
Clinical Data Interchange Standards Consortium
Conference on Retroviruses and Opportunistic Infections
Deviance information criterion
Ritonavir-boosted darunavir
Standard-dose efavirenz
EFV400: 400 mg efavirenz
Cobicistat-boosted elvitegravir
Genomics Evidence Neoplasia Information Exchange
Hierarchical meta-regression
International AIDS Society
Individual patient data
Network meta-analysis with meta-regression adjustments, using both individual patient data and aggregate data
Ritonavir-boosted lopinavir
Matched indirect comparisons
Markov Chain Monte Carlo
Mean squared error
National Institute for Health and Care Excellence
Network meta-analysis
Nucleoside reverse transcriptase inhibitors
Odds ratio
Population adjusted indirect comparisons
Preferred reporting items for systematic review and meta-analysis of individual participant data
Potential scale reduction factor
Randomized controlled trials
Serious adverse events
Statistical analysis plan
Systematic literature review
Supporting Open Access for Researchers
Surface under the cumulative ranking curve
Tenofovir disoproxil fumarate
Lamivudine or emtricitabine
Yale University Open Data Access


  1. van Wely M. The good, the bad and the ugly: meta-analyses. Hum Reprod. 2014;29(8):1622–6.
  2. Dias S, Sutton A, Welton N, Ades A. Evidence synthesis for decision making 3: heterogeneity--subgroups, meta-regression, bias, and bias-adjustment. Med Decis Mak. 2013;33(5):618–40.
  3. Higgins J, Green S. Cochrane handbook for systematic reviews of interventions version 5.0.0. Collaboration TC, editor. Chichester: Wiley; 2008.
  4. Sedgwick P. Understanding the ecological fallacy. BMJ. 2015;351:h4773.
  5. Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221.
  6. Simmonds M, Stewart G, Stewart L. A decade of individual participant data meta-analyses: A review of current practice. Contemp Clin Trials. 2015;45(Pt A):76–83.
  7. Tierney J, Vale C, Riley R, Smith CT, Stewart L, Clarke M, et al. Individual participant data (IPD) meta-analyses of randomised controlled trials: guidance on their use. PLoS Med. 2015;12(7):e1001855.
  8. Saramago P, Sutton AJ, Cooper NJ, Manca A. Mixed treatment comparisons using aggregate and individual participant level data. Stat Med. 2012;31(28):3516–36. Epub 2012 Jul 5.
  9. Jansen J, Cappelleri J, Fleurence R, Devine B, Itzler R, Barrett A, et al. Interpreting indirect treatment comparisons and network meta-analysis for health-care decision making: report of the ISPOR task force on indirect treatment comparisons good research practices: part 1. Value Health. 2011;14(4):417–28.
  10. Nixon RM, Bansback N, Brennan A. Using mixed treatment comparisons and meta-regression to perform indirect comparisons to estimate the efficacy of biologic treatments in rheumatoid arthritis. Stat Med. 2007;26(6):1237–54.
  11. Veroniki A, Straus S, Soobiah C, Elliott M, Tricco A. A scoping review of indirect comparison methods and applications using individual patient data. BMC Med Res Methodol. 2016;16:47.
  12. Higgins JP, Whitehead A, Turner RM, Omar RZ, Thompson SG. Meta-analysis of continuous outcome data from individual patients. Stat Med. 2001;20(15):2219–41.
  13. Turner RM, Omar RZ, Yang M, Goldstein H, Thompson SG. A multilevel model framework for meta-analysis of clinical trials with binary outcomes. Stat Med. 2000;19(24):3417–32.
  14. Whitehead A, Omar RZ, Higgins JP, Savaluny E, Turner RM, Thompson SG. Meta-analysis of ordinal outcomes using individual patient data. Stat Med. 2001;20(15):2243–60.
  15. Jackson C, Best N, Richardson S. Improving ecological inference using individual-level data. Stat Med. 2006;25(12):2136–59.
  16. Jansen J. Network meta-analysis of individual and aggregate level data. Res Synth Methods. 2012;3(2):177–90.
  17. Kanters S, Vitoria M, Doherty M, Socias ME, Ford N, Forrest JI, et al. Comparative efficacy and safety of first-line antiretroviral therapy for the treatment of HIV infection: a systematic review and network meta-analysis. Lancet HIV. 2016;3(11):e510–e20.
  18. Günthard H, Saag M, Benson C, del Rio C, Eron J, Gallant JE, et al. Antiretroviral drugs for treatment and prevention of HIV infection in adults: 2016 recommendations of the International Antiviral Society-USA panel. JAMA. 2016;316(2):191–210.
  19. Saag MS, Benson CA, Gandhi RT, Hoy JF, Landovitz RJ, Mugavero MJ, et al. Antiretroviral drugs for treatment and prevention of HIV infection in adults: 2018 recommendations of the International Antiviral Society-USA panel. JAMA. 2018;320(4):379–96.
  20. Kanters S. Comparative efficacy and safety of first-line treatments for HIV patients for clinical guideline development and the impact of individual patient data. Vancouver BC: University of British Columbia; 2019.
  21. Kanters S, Jansen J, Zoratti M, Forrest J, Humphries B, Campbell J. WEB ANNEX B. Systematic literature review and network meta-analysis assessing first-line ART treatments; In: updated recommendations on first-line and second-line antiretroviral regimens and post-exposure prophylaxis and recommendations on early infant diagnosis of HIV: interim guidelines. Geneva: World Health Organization. 2018.
  22. Clotet B, Feinberg J, van Lunzen J, Khuong-Josses MA, Antinori A, Dumitru I, et al. Once-daily dolutegravir versus darunavir plus ritonavir in antiretroviral-naive adults with HIV-1 infection (FLAMINGO): 48 week results from the randomised open-label phase 3b study. Lancet. 2014;383(9936):2222–31.
  23. Molina J, Clotet B, van Lunzen J, Lazzarin A, Cavassini J, Henry K et al. Once-daily dolutegravir versus darunavir plus ritonavir for treatment-naive adults with HIV-1 infection (FLAMINGO): 96 week results from a randomised, open-label, phase 3b study. 2015;
  24. Walmsley S, Baumgarten A, Berenguer J, Felizarta F, Florence E, Khuong-Josses MA, et al. Brief report: Dolutegravir plus Abacavir/lamivudine for the treatment of HIV-1 infection in antiretroviral therapy-naive patients: week 96 and week 144 results from the SINGLE randomized clinical trial. J Acquir Immune Defic Syndr. 2015;70(5):515–9.
  25. Walmsley S, Berenguer J, Khuong-Josses MA, Kilby M, Lutz T, Podzamczer D et al. Dolutegravir Regimen Statistically Superior To Tenofovir/Emtricitabine/Efavirenz: 96-Wk Data. Topics in Antiviral Medicine. 2014;22(e-1):261–2. 21st Conference on Retroviruses and Opportunistic Infections (CROI 2014), United States.
  26. Walmsley S, Berenguer J, Khuong-Josses MA, Kilby JM, Lutz T, Podzamczer D et al. Dolutegravir Regimen Statistically Superior to Efavirenz/Tenofovir/Emtricitabine: 96-Week Results From the SINGLE Study (ING114467). Conference on Retroviruses and Opportunistic Infections; Boston, USA. 2014.
  27. Walmsley S, Antela A, Clumeck N, Duiculescu D, Eberhard A, Gutierrez F, et al. Dolutegravir plus abacavir-lamivudine for the treatment of HIV-1 infection. N Engl J Med. 2013;369(19):1807–18.
  28. Raffi F, Rachlis A, Stellbrink HJ, Hardy WD, Torti C, Orkin C, et al. Once-daily dolutegravir versus raltegravir in antiretroviral-naive adults with HIV-1 infection: 48 week results from the randomised, double-blind, non-inferiority SPRING-2 study. Lancet. 2013;381(9868):735–43.
  29. Raffi F, Jaeger H, Quiros-Roldan E, Albrecht H, Belonosova E, Gatell JM, et al. Once-daily dolutegravir versus twice-daily raltegravir in antiretroviral-naive adults with HIV-1 infection (SPRING-2 study): 96 week results from a randomised, double-blind, non-inferiority trial. Lancet Infect Dis. 2013;13(11):927–35.
  30. Stellbrink HJ, Reynes J, Lazzarin A, Voronin E, Pulido F, Felizarta F, et al. Dolutegravir in antiretroviral-naive adults with HIV-1: 96-week results from a randomized dose-ranging study. AIDS. 2013;27(11):1771–8.
  31. van Lunzen J, Maggiolo F, Arribas JR, Rakhmanova A, Yeni P, Young B, et al. Once daily dolutegravir (S/GSK1349572) in combination therapy in antiretroviral-naive adults with HIV: planned interim 48 week results from SPRING-1, a dose-ranging, randomised, phase 2b trial. Lancet Infect Dis. 2012;12(2):111–8.
  32. Higgins J, Altman D, Gotzsche P, Juni P, Moher D, Oxman AD, et al. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928.
  33. Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, et al. Preferred reporting items for systematic review and meta-analyses of individual participant data: the PRISMA-IPD statement. JAMA. 2015;313(16):1657–65.
  34. Wohl D, Cohen C, Gallant J, Mills A, Sax PE, Dejesus E, et al. A randomized, double-blind comparison of single-tablet regimen Elvitegravir/Cobicistat/Emtricitabine/Tenofovir DF versus single-tablet regimen Efavirenz/Emtricitabine/Tenofovir DF for initial treatment of HIV-1 infection: analysis of week 144 results. J Acquir Immune Defic Syndr. 2014;65(3):e118–e21.
  35. Kanters S. Comparative effectiveness and safety of first-line antiretroviral therapy for HIV: an individual patient-level and aggregate data network meta-analysis: statistical analysis plan. Research Gate: University of British Columbia; 2018.
  36. Vitoria M, Ford N, Clayden P, Pozniak AL, Hill AM. When could new antiretrovirals be recommended for national treatment programmes in low-income and middle-income countries: results of a WHO think tank. Curr Opin HIV AIDS. 2017;12(4):414–22.
  37. Dias S, Welton N, Sutton A, Ades A. Technical support document 2: a generalized linear modelling framework for pairwise and network meta-analysis of randomized controlled trials; 2011.
  38. Brooks SP, Gelman A. General methods for monitoring convergence of iterative simulations. J Comput Graph Stat. 1998;7(4):434–55.
  39. Kanters S, Vitoria M, Zoratti M, Doherty M, Penazzato M, Rangaraj A, et al. Comparative efficacy, tolerability and safety of dolutegravir and efavirenz 400mg among antiretroviral therapies for first-line HIV treatment: a systematic literature review and network meta-analysis. EClinicalMedicine. 2020;28:100573.
  40. World Health Organization. Updated recommendations on first-line and second-line antiretroviral regimens and post-exposure prophylaxis and recommendations on early infant diagnosis of HIV: Interim guidelines. Supplement to the 2016 consolidated guidelines on the use of antiretroviral drugs for treating and preventing HIV infection. Geneva: World Health Organization. 2018.
  41. Kanters S, Karim ME, Thorlund K, Anis A, Bansback N. When does the use of individual patient data in network meta-analysis make a difference? A simulation study. BMC Med Res Methodol. 2021;21(1):21.
  42. Cahan A, Cimino J. Improving precision medicine using individual patient data from trials. CMAJ. 2017;189(5):E204–7.
  43. Ohmann C, Banzi R, Canham S, Battaglia S, Matei M, Ariyo C, et al. Sharing and reuse of individual participant data from clinical trials: principles and recommendations. BMJ Open. 2017;7(12):e018647.
  44. Phillippo D, Ades A, Dias S, Palmer S, Abrams K, Welton N. NICE DSU Technical support document 18: methods for population-adjusted indirect comparisons in submission to NICE. 2016.
  45. Donegan S, Williamson P, D'Alessandro U, Garner P, Smith CT. Combining individual patient data and aggregate data in mixed treatment comparison meta-analysis: individual patient data may be beneficial if only for a subset of trials. Stat Med. 2013;32(6):914–30. Epub 2012 Sep 17.


The authors would like to thank Hubert Wong, Michael John Milloy and Tom Trikalinos for their critical feedback, as well as GlaxoSmithKline and the programme for providing access to the individual patient data that made these analyses possible.


This work was supported by a CIHR (Canadian Institutes of Health Research) Doctoral Research Award. The IPD were provided by GlaxoSmithKline through the programme. Neither agency played any role in the development and execution of the SLR and the analyses.

Author information


Steve Kanters had full access to all of the data in the study. Steve Kanters takes responsibility for the integrity of the data, the accuracy of the data analysis, and the final decision to submit for publication. All authors have read and approved the manuscript. Study concept and design: SK, KT and NB. Acquisition, analysis, or interpretation of data: SK, MEK, MZ and KT. Drafting of the manuscript: SK and NB. Critical revision of the manuscript for important intellectual content: All authors. Statistical analysis: SK. Study supervision: NB and AA.

Corresponding author

Correspondence to Steve Kanters.

Ethics declarations

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Web Appendix.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Kanters, S., Karim, M.E., Thorlund, K. et al. Comparing the use of aggregate data and various methods of integrating individual patient data to network meta-analysis and its application to first-line ART. BMC Med Res Methodol 21, 60 (2021).


  • Individual patient data
  • IPD
  • Network meta-analyses
  • One-stage NMA
  • Two-stage NMA
  • Ecological fallacy
  • HIV
  • Guideline development