Multivariate meta-analysis of critical care meta-analyses: a meta-epidemiological study

Background Meta-analyses typically consider multiple outcomes and report univariate effect sizes considered as independent. Multivariate meta-analysis (MVMA) incorporates outcome correlation and synthesises direct evidence and related outcome estimates within a single analysis. In a series of meta-analyses from the critically ill literature, the current study contrasts multiple univariate effect estimates and their precision with those derived from MVMA. Methods A previous meta-epidemiological study was used to identify meta-analyses with either one or two secondary outcomes providing sufficient detail to structure bivariate or tri-variate MVMA, with mortality as primary outcome. Analysis was performed using a random effects model for both odds ratio (OR) and risk ratio (RR); borrowing of strength (BoS) between multivariate outcome estimates was reported. Estimate comparisons, β coefficients, standard errors (SE) and confidence interval (CI) width, univariate versus multivariate, were performed using Lin’s concordance correlation coefficient (CCC). Results In bivariate meta-analyses, for OR (n = 49) and RR (n = 48), there was substantial concordance (≥ 0.69) between estimates; but this was less so for tri-variate meta-analyses for both OR (n = 25; ≥ 0.38) and RR (≥ -0.10; n = 22). A variable change in the multivariate precision of primary mortality outcome estimates compared with univariate was present for both bivariate and tri-variate meta-analyses and for metrics. For second outcomes, precision tended to decrease and CI width increase for bivariate meta-analyses, but was variable in the tri-variate. For third outcomes, precision increased and CI width decreased. In bivariate meta-analyses, OR coefficient significance reversal, univariate versus MVMA, occurred once for mortality and 6 cases for second outcomes. RR coefficient significance reversal occurred in 4 cases; 2 were discordant with OR. For tri-variate OR meta-analyses reversal of coefficient estimate significance occurred in two cases for mortality, nine cases for second and 7 cases for third outcomes. In RR meta-analyses significance reversals occurred for mortality in 2 cases, 6 cases for second and 3 cases for third; there were 7 discordances with OR. BoS was greater in trivariate MVMAs compared with bivariate and for OR versus RR. Conclusions MVMA would appear to be the preferred solution to multiple univariate analyses; parameter significance changes may occur. Analytic metric appears to be a determinant.

(MVMA), whereby direct evidence and results from related outcomes are synthesised to yield a summary outcome result [5][6][7], is an elegant solution to the above problems.
In meta-analyses of interventions in the critically ill, where mortality is a common primary outcome, it would be expected that secondary outcomes such as intensive care unit (ICU) and hospital length of stay, infections and the requirement for mechanical ventilation would demonstrate substantial correlation [6], and with the primary mortality event. MVMA in such meta-analyses would allow joint inference upon multiple outcomes and be of relevance from a methodological and clinical viewpoint. Price et al. suggested that where multiple outcomes routinely occur, MVMA would be "…more likely to have an impact" [8]. From a previous study which reported mortality outcome of a series of meta-analyses in the critically ill [9] utilising only randomised controlled trials, a metaanalytic cohort was identified where secondary outcomes were reported in such detail as to yield bivariate or tri-variate data structures. Tri-variate data structures have been rarely subjected to MVMA; in the Price et al. analysis [8], only one such MVMA was reported. Univariate and multivariate analyses were undertaken and compared with respect to differences between estimated outcome variable coefficients, their standard errors (SE) and 95% confidence interval (CI) width and statistical significance, with no selection of meta-analyses based upon the number of RCTs per meta-analysis. As a by-product of MVMA coefficient estimation, variable correlations, direct information and borrowing of strength (BoS) were determined. Whereas direct information describes the contribution of data from the same outcome, BoS represents the contribution of data from all other outcomes [10,11]. One problematic requirement of MVMA is the provision of with-study correlations which are rarely reported, although methods based upon individual patient [12] or aggregated data [13] and within the Bayesian framework [14] have been undertaken. Any recommendation for the practical application of MVMA must be accompanied by appropriate software. As such, the "alternative" MVMA model of Riley [15] was employed, whereby an overall correlation, the total marginal correlation between outcomes, was modelled, enabling seamless application to all meta-analyses considered. As results based upon indices of risk, odds ratio (OR) and risk ratio (RR), are not generally inter-translatable [16], both OR and RR estimates were compared.

Ethics
The data for this study was abstracted from published studies and an Ethics clearance was not appropriate.

Data management
A previous study [9] was used to identify meta-analyses with either one or two secondary outcomes that provided sufficient detail to generate a bivariate or tri-variate MVMA data structure, with mortality as the primary outcome; all meta-analyses were of randomised controlled trials (RCT). Usable second and third outcomes were identified as presented in the original meta-analysis.

Statistical analysis
1. To facilitate rapid data processing over a large number of models, initial univariate meta-analytic point estimates and standard errors (SE) were computed within Stata ™ V17 [17] using the "meta" suite of commands [18]; default estimation used restricted maximum likelihood (REML [19]). 2. Subsequently, both univariate and multivariate outcomes were estimated using the user written Stata command "mvmeta" ( [20], Version 3.2.0 6apr2018) in a random effects (RE) formulation. Estimation employed REML with an unstructured covariance and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm for likelihood maximisation or the Davidon-Fletcher-Powell (DFP) algorithm if there were convergence difficulties. Maximisation employed the "difficult" option (use a different stepping algorithm in nonconcave regions) provided by Stata ™ .
(i) The two sets of univariate estimates were subsequently compared. (ii) Under persistent convergence difficulties of "mvmeta", the model was refit assuming the overall correlation matrix was fixed and known, with values set equal to the estimates from either the BGFS or DFP algorithm using the "bscovariance" option of "mvmeta" ( [8], Appendix 4). (iii) In the MVMA note was taken of very small β coefficient standard errors (SE) with consequent large z values for coefficient significance and very small p-values and CI width, such that the estimates were implausible. (iv) To avoid the requirement for specific within study correlations [1], the "alternative" model of Riley was used [15], whereby an overall correlation, the total marginal correlation between outcomes [21], was modelled; that is an amalgam of the within and between-study correlations [6,8]. The reported correlation(s) in this paper were these overall correlation(s) [22]. (v) Direct information and BoS between estimates were also reported [8,10,11], using the default ("sd") method of "mvmeta". BoS may be conceptualised as a comparison of variances of the estimated rth component of β under the uni-and multivariate models BoS RV r = 1 − var β mv,r var β uv,r , where RV refers to relative variance [11]. This ratio has also been described as the efficiency, "E" [10,23]. An equivalent but alternative method, decomposition of the score function for β, has advantage in that it defines appropriate study weights within an MVMA [11]. In a univariate RE meta-analysis study weights are inversely proportional to the sum of the within-and between-study variances. In an MVMA analysis, as undertaken by "mvmeta", weights were derived using the score decompensation method, where the score function S(θ ) is the first derivative of the log-likelihood function l(θ ); S(θ ) = dl(θ ) dθ and l(θ) is the likelihood. The weights were broken down into direct information, the contribution of data from the same outcome, and BoS, the contribution of data for all other outcomes. For a univariate analysis, the weights sum to 1, or when expressed as a percentage, 100, as in the "mvmeta" output. In a MVMA, a simple tabulation of direct information and BoS will sum to 100 for each outcome. In particular, the methodology takes the variance components as fixed and the precisions of the point estimates from a MVMA have an expectation of being greater than or equal to those from separate univariate meta-analyses [24], albeit the latter study employed the methods of Van Houwelingen et al. [25] using "Proc Mixed'' with SAS statistical software, not the "Riley" method [15].
a. The use of the "Riley" method [15] excluded the computation of multivariate I 2 for each outcome.
3. The reported confounding effect of small study effects upon changes of statistical significance between univariate and MVMA [5,8] was explored by inspection of contour-enhanced funnel plots [26] and formal regression based tests, in particular, the Harbord (for binary outcomes) and Egger (for continuous outcomes) tests [27]. Small study effects were reported for all meta-analyses, as a matter of complete reporting, but suffer from the problem of multiple testing. The power and interpretation of the tests are problematic for small RCT number (< 10) meta-analyses and in the presence of moderate (see 4., below) heterogeneity [28,29]. More importantly, univariate tests may be underpowered compared with the recently described multivariate small study effect test (MSSET), a multivariate extension of Egger's regression test [30]. 4. Meta-analytic heterogeneity was reported as the I 2 index and adjudged as medium and high if I 2 ≥ 50 and 75% respectively. The I 2 index was preferred, compared with τ 2 , as it is comparable across different metrics and number of RCTs [31]. 5. The analyses, using frequentist methods, were performed for bivariate and tri-variate models with both OR and RR metrics. 6. Agreement or otherwise between univariate and multivariate estimation results was undertaken using Lin's concordance correlation coefficient (CCC) via the user written Stata command "concord" [32]. The CCC combines measures of both precision and accuracy to determine how far the observed data deviate from the line of perfect concordance (that is, the line at 45 degrees on a square scatterplot). Other measure to characterise the comparison were: (i) Estimate differences average and standard deviation (SD); univariate versus MVMA. (ii) 95% (Bland and Altman) limits of agreement (LOA) (iii) An F test (Bradley-Blackwood) of equality of means and variances; non-significance implies concordance.
7. Boxplots [33] were used to visualise the density distribution of BOS and total marginal correlations of both OR and RR for bivariate and tri-variate models.
Statistical significance was ascribed at p < 0.05.

Results
The cohort was composed of forty-nine meta-analyses, 18% nutritional therapeutic, 18% non-pharmaceutical therapeutic and 64% pharmaceutical therapeutic, published between 2002 and 2018. The primary outcome in all was mortality; forty-nine were bivariate in outcome data composition and 30 were tri-variate. Details of the mortality, second and third outcome metaanalyses are shown in Tables 1, 2 and 3, respectively. Heterogeneity, as the I 2 index, of ≥ 50 and ≥ 75% was found in mortality, second and third outcomes in 12, 31 and 50%, and 0, 16 and 23%, respectively. Of the 49 mortality meta-analyses, five [38,43,53,60,66] demonstrated evidence of small study effects on formal testing (p < 0.05); for the second outcome, five [37,53,[59][60][61]; and for the third, five [37,49,67,68,78]. The disparity between the formal test of small study effects (p < 0.05) and the increased frequency of "query" for contour-enhanced funnel plot assessment in second    and third outcomes versus mortality outcome (37,40 and 4%, respectively) was noted and may be a function of the power of the test (see Methods, 3.). There was uniform agreement (to the second or third decimal point) between univariate estimates of "mvmeta" and "meta" in Stata ™ .

Bivariate model: OR
For the 49 meta-analyses, median (minimum, p25, p75, maximum) number of RCT per meta-analysis for the primary mortality outcome was 13(4, 10, 17, 31); for the second outcomes, 8(4, 6, 10, 31); see Tables 1 and 2. In only 11 meta-analyses was there equality between the reported primary and secondary outcome study numbers. In the MVMA the "bscovariance" option was used once only and there were no instances of "large" Z values. Second outcomes were binary in 39 and continuous in 10 (Tables 1 and 2). Estimate analysis is given in Table 4. Across all outcomes and estimates, the concordance, univariate versus multivariate, was substantial, with a general relative increment, albeit uneven, in the magnitude of multivariate estimates. Means and variances demonstrated little concordance. Reversal of coefficient estimate significance, univariate versus MVMA, occurred no cases for mortality and 6 cases for second outcomes (significant to non-significant in five [36,43,70,71,79], one meta-analysis exhibiting small study effects [43]; non-significant to significant in in one [59]).

Bivariate model: RR
In the MVMA, 49 meta-analyses were considered and there were no instances of "large" Z values. Concordance estimate analysis is given in Table 5. Substantial concordance was seen between uni-and multivariate estimates, with a variable relative increment of multivariate estimates (SE and CI width) across outcomes. Multivariate β estimates were variable with respect to univariate and means and variances lacked concordance. Reversal of coefficient estimate significance, univariate versus MVMA, occurred in one case for mortality outcome (significant to non-significant, [52]) and 3 cases for second outcomes (significant to non-significant [36,61,70]; one instance [61] was discordant with the OR metric and one instance exhibiting small study effects [61]). The bivariate distributions of BoS are displayed in Fig. 1, where an increment of BoS for RR compared with OR, for both mortality and the second outcome is evident.
The bivariate total marginal correlations, mortality vs second outcome, are shown in Fig. 2; both metrics displayed similar distribution.
Concordance estimate analysis is given in Table 6. Variable concordance between uni-and multivariate estimates was observed. Multivariate estimate precision (SE) increased, and confidence interval width tended to decrease compared with univariate, across and within outcomes. A tendency for concordance between means and variances was apparent. Reversal of coefficient estimate significance, univariate versus MVMA, occurred in two cases for mortality ( [38,73] non-significant to significant, one meta-analysis exhibiting small study effects [38]); nine cases for second outcomes (significant to non-significant in 3 [67,69,79], one meta-analysis exhibiting small study effects [67]; non-significant to significant in 6 [37,57,59,62,66,73]) and 7 cases for third outcomes (significant to non-significant in 3 [40,46,67], one meta-analysis exhibiting small study effects [67]; non-significant to significant in 4 [37,38,69,73] with one demonstrating small study effects [37]).

Tri-variate model: RR
Of the 30 tri-variate meta-analyses, there was one instance of complete convergence failure [46] and seven instances of "large" Z values [37,38,56,66,72,73,78] which were sufficient to render estimates implausible (median number of RCT per meta-analysis for primary, second and third outcomes 10, 6 and 5 respectively); the outcome data set was thus 22 meta-analyses [37, 40, 47, 49, 50, 53-55, 57, 59, 60, 62, 67-69, 71, 76, 77, 79, 80]. The median (minimum, maximum) number of studies per meta-analysis for the primary mortality outcome was 13 (4,24); for the second outcome, 8(4, 20); and the third 7 (4,15). Second outcomes were binary in 15 and continuous in 7; third outcomes were binary in 7 and continuous in 15. In the MVMA the "bscovariance" option was used on 3 occasions [57,62,67]. Concordance estimate analysis is given in Table 7. Concordance between uniand multivariate estimates was uneven, with no consistent relative change in multivariate estimates, compared with univariate, across or within outcomes. A tendency for concordance between means and variances in second and third outcomes was apparent. Reversal of coefficient estimate significance, univariate versus MVMA, occurred in two case for mortality ( [37,62] with no small study effects, non-significant to significant, not concordant with the OR cases); five cases for second outcomes with no small study effects (significant to non-significant in 2 [69,79], concordant with OR cases; non-significant to significant in 3 [54,57,77]; concordant with one OR cases only [57]); and 3 cases ( [54,60,67] non-significant to significant, one exhibiting small study effects [67]) for third outcomes with no concordance with OR cases. The tri-variate distributions of BoS are displayed using boxplots in Fig. 3. The increment of BoS for OR compared with RR for mortality and the third outcome is evident. In the panel (right top) showing BoS mortality RR there were points of large BoS for two MVMA metaanalyses, 99.3 and 93.6 [57,62]. Both these MVMA utilized the "bscovariance" option of "mvmeta" as there was initial unresolved convergence. The estimated between study mortality variance was minimal for both (5.24e-06 and 0.005, respectively) and the status of these estimates may be circumspect.
The tri-variate total marginal correlations for both OR (left) and RR (right)are shown via boxplots in Fig. 4; with progressive movement to positive correlations from mortality-second outcome through second-third outcome. Positive correlations appeared more frequent with the RR metric.

Discussion
It is easy to forget that the MVMA approach has a long history dating back to at least 1993 [81] and has subsequently been formally implemented in popular statistical software packages [82][83][84][85]. This being said, MVMA still appears rarely used by practitioners, a decade after a 2009 review by Riley [1]. From within the social science paradigm Becker, in 2000, pointed out that ignoring outcome dependence in meta-analysis will affect Type I error rates and precision and bias of estimates: "No reviewer should ever ignore dependence among study outcomes" [82]. In the current study the total marginal correlations for both bi-and tri-variate analyses was sizeable overall and, depending upon the composition of the non-primary outcomes, more positive than negative and more so for the tri-variate case. One of the principal attractions of MVMA is estimation of the BoS between parameters, well demonstrated in Fig. 3. Most of the BoS would appear to derive from studies which are more "atypical" in design. In particular, the BoS of secondary outcomes of the ith study is a function of the within-study variance matrix (V i ) and the harmonic average V of all the V i s . BoS can only arise if there are differences between the V i s ; which would entail studies of "..substantive difference[s] in background and research methods…", not simply different sample sizes [10]. The magnitude of outcome BoS would appear to be bounded by percentage of missing data for that outcome [6,24], which in the current study was substantial (see Results). A percentage missingness of 30-35% of studies informing an outcome was found to result in a "more pronounced" BoS in one empirical study [14]. Any nexus between BoS and missingness requires a missing at random (MAR) assumption, as opposed to missing completely at random (MCAR) for univariate meta-analysis [21]. The notions of MAR and MCAR are well recognised in the bio-medical literature [86], albeit inconsistency of usage has been documented [87]; in particular, the conflation of (non)"ignorable" and MAR [22]. Perhaps not surprisingly, within the domain of outcome reporting bias (ORB) [4], MVMA has been a method of choice to investigate the impact of ORB upon meta-analytic conclusions [22,88].
Computationally, MVMA requires both withinand between-study correlations and the former are typically not known and are likely not to be available, especially in higher order (trivariate) models [24]. Riley provided four alternate methods to overcome these problems [1]; the most straightforward, yet laborious, being a sensitivity analysis by correlation imputation over the entire range (-1 to + 1).
Riley's alternate model [15] has been found to have good asymptotic statistical properties compared with a fully hierarchical REML model, with known within-study correlations, and with separate univariate meta-analyses. The performance may be problematic when the overall correlation ρ is very close to 1 or -1. In the current study, only two instances were found; in the bivariate RR MVMA, ρ = 0.999 [75], and the trivariate OR MVMA, ρ = -0.986 ( [57], second versus third outcome); both MVMA utilised the "bscov" option. As the Riley model is a "working" model when the true data generating mechanism is a RE model, the standard variance estimates may not provide confidence interval coverage at the nominal level [21]. Complete failure of convergence in the current study was rare, occurring in one instance [46], but problematic SE estimation was exhibited in the trivariate series, 5 instances in OR metric and 7 in RR. This may relate to the small number of RCT in second and third outcomes (see Table 8), but these numbers were not substantially different compared with meta-analyses not demonstrating this feature, as shown in Table 8.
Frequentist and Bayesian empirical comparisons between univariate meta-analyses and MVMA have appeared in the literature [8,14,[89][90][91][92][93] with results demonstrating similar (pooled) parameter estimates between the two analytic forms. However, papers by Riley and co-workers [1,15,24,94], which included formal simulation studies, found advantage; a smaller standard error and mean-square error of pooled estimates, predicated upon the presence of missing data; again, assuming missing at random. That is, in the presence of complete data a bivariate analysis would not be expected to produce a gain in statistical efficiency. The extension to trivariate and higher order outcome data and the inability to provide within study correlations was thus identified as a "pressing research issue"; to wit, the "alternative" model of Riley [15]. Price et al. suggested that estimates of clinical and /or statistical conclusions from MVMA may occasionally differ from those from univariate analyses and observed, somewhat wryly, that any claimed discrepancy "…says more about the dangers of using concepts of statistical significance than it does the use of MVMA" [8].
The results from the current analysis were somewhat at odds with these sentiments and with the general results of bivariate studies, both empirical and simulation (see below), albeit the caution about the variance estimates of the Riley model, above, are noted. A variable change in the multivariate precision of primary mortality outcome estimates compared with univariate analysis was present for both bivariate and tri-variate meta-analyses and for metric. For second outcomes, precision tended to decrease, and CI width increase for bivariate metaanalyses; for third outcomes, precision increased, and CI width decreased. The latter finding appears not to have been previously reported although analytic reports of the tri-variate structure are rare; one case only reported by Price et al. study [8] and two by Trikalinos et al. [14]. With respect to the observed relative changes (univariate versus multivariate) across four concordance analyses, the magnitude of the difference was rather small and accompanied by a more substantial SD, suggesting a heterogeneity of the MVMA effect, grounded in the individual meta-analyses and dependent upon the nature of the outcome, binary or continuous. As MVMA allows for correlation between outcomes, CIs may be wider on the basis of increased between-study variance [8], but this was observed only in the bivariate case in the current analysis. The experience of Price et al. that "MVMA methods can be applied only in a minority of reviews of interventions in pregnancy and childbirth" [8] was not consistent with the current study. A reviewer pointed to the wide LOA of the β estimates for the second continuous outcome (days) in Table 6 (trivariate OR MVMA), this being -10.894, 8.122. Of the seven meta-analyses considered, two had stand-out differences between univariate and MVMA estimates; the study of Wan et al. ( [57], intra-meta-analytic study number = 4), -11.31, -18.17 and Chen et al. ( [69],intra-metaanalytic study number = 10), -11.27 and -4.26. The former study used the "bscov" option, recording a BOS for the second outcome of 54% and correlation between second and third outcomes of 0.986; the latter had normal convergence but record a BOS for the second outcome of 92.4% with a correlation between second and third outcomes of 0.995. This may be indicative of problematic estimation, which has been mentioned above and further addressed in "Limitations", below. Trikalinos et al. ( [14], point 4.1), using Bayesian methods, observed that "Generally, point estimates are comparable"; Price et al. ([8], Table 2) using "mvmeta" recorded differences in β between univariate and MVMA, but did not focus attention on such; and in the bivariate simulation study of Riley et al. ([94], Table 4), bias of the mean for β 1 and β 2 was comparable with coverage for both between 93-98%; similar results were also observed when considering the "alternate" model of Riley ([15], Table 1).
These changes of statistical significance are shown in forest plots as couplets, univariate versus MVMA, for binary (null line unity, Fig. 5) and continuous (null line zero, Fig. 6) outcomes. A majority of the CI width changes that achieved a change of significance about the null appear substantial; the clinical import of such changes would require case by case determination [96].
Disparities between the OR and RR occurred over a range of indices and may be a function of the current cohort. However, OR and RR are not merely interchangeable metrics and there is no monotone relationship between them [16]. Recent papers have drawn attention to potential estimation problems with the RR. First, the RR effect magnitude is dependent upon the underlying baseline prevalence, shifting toward 1 as prevalence increases, and is a ratio of two conditional probabilities, whereas the OR is a likelihood ratio whose magnitude reflects the fold increase in odds, baseline to intervention, independent of prevalence [97]. Second, under both the DerSimonian-Laird [98] and REML formulations, the requirements of log(RR) estimation to be compatible with study level event rates in the [0,1] interval (π j treat < 1 and 0 < π j control < 1) demand restriction on the parameter space with ensuant bias in estimates of both τ 2 and log(RR). Thus risk relativism may be an "illusion " [97] and the OR "appears to be a safer option" [99]. This being said, Xiao and colleagues argued that interpretability issues of the OR, lack of collapsibility and a dependence on the baseline risk, negates any in-principle recommendation for the OR [100].

Limitations
The current study utilised a single meta-analytic cohort from the critical care domain and had a modest number of bivariate meta-analyses, but less so in the trivariate series. The preference for the alternate model of Riley was a potential limitation, but when reviewing a number of bivariate and tri-variate studies in two metrics the use of sensitivity analysis by specifying within study correlations (via the "wscor" option of "mvmeta") would be unwieldy and potentially uninterpretable. This being said, the recommendation of Riley et al.

Conclusions
MVMA elucidates the structure and correlation between multiple reported outcomes in univariate meta-analyses and resolves outcome reporting bias. Change in estimate precision and CI width with MVMA appeared context dependent. The BoS entailed in this technique may be quantified and change of parameter significance may be a consequence. MVMA is a feasible solution to the meta-analytic estimation of multiple univariate effects.