Classifying information-sharing methods

Background: Sparse relative effectiveness evidence is a frequent problem in Health Technology Assessment (HTA). Where evidence directly pertaining to the decision problem is sparse, it may be feasible to expand the evidence-base to include studies that relate to the decision problem only indirectly: for instance, when there is no evidence on a comparator, evidence on other treatments of the same molecular class could be used; similarly, a decision on children may borrow strength from evidence on adults. Usually, in HTA, such indirect evidence is either included by ignoring any differences (‘lumping’) or not included at all (‘splitting’). However, a range of more sophisticated methods exists, primarily in the biostatistics literature. The objective of this study is to identify and classify the breadth of the available information-sharing methods.
Methods: Forwards and backwards citation-mining techniques were used on a set of seminal papers on the topic of information-sharing. Papers were included if they specified (network) meta-analytic methods for combining information from distinct populations, interventions, outcomes or study-designs.
Results: Overall, 89 papers were included. A plethora of evidence synthesis methods have been used for information-sharing. Most papers (n=79) described methods that shared information on relative treatment effects. Amongst these, there was a strong emphasis on methods for information-sharing across multiple outcomes (n=42) and treatments (n=25), with fewer papers focusing on study-designs (n=23) or populations (n=8). We categorise and discuss the methods under four ‘core’ relationships of information-sharing: functional, exchangeability-based, prior-based and multivariate relationships, and explain the assumptions made within each of these core approaches.
Conclusions: This study highlights the range of information-sharing methods available. These methods often impose more moderate assumptions than lumping or splitting. Hence, the degree of information-sharing that they impose could potentially be considered more appropriate. Our identification of four ‘core’ methods of information-sharing allows for an improved understanding of the assumptions underpinning the different methods. Further research is required to understand how the methods differ in terms of the strength of sharing they impose and the implications of this for health care decisions.
Supplementary Information: The online version contains supplementary material available at (10.1186/s12874-021-01292-z).


Background
Health Technology Assessment (HTA) is the systematic evaluation of the properties, effects and impact of health technologies with a view to inform decision-making in health care [1]. Regardless of whether or not a system functions under explicit budget constraints, resources spent could have always been used for alternative purposes. Therefore, policy-makers are always faced with difficult decisions about whether interventions should be funded. This requires an assessment of whether the benefits of an intervention are sufficient to justify the health opportunity costs of funding it [2]. It follows that a set of tools ought to be used so that policy-makers can rationally and transparently decide about the adoption of a given health technology [3].
Decision analysis provides a quantitative framework that brings together all relevant evidence on the impact of an intervention on health outcomes and costs, whilst making explicit judgements about how different types and sources of evidence are linked together (model structure) and which elements are relevant to decision-making (reflecting social values). The outputs of a Decision Analytic Model (DAM) include incremental costs and benefits and can be useful for decision-makers [4].
Each input within a DAM is a parameter and constitutes a potential research question that can be informed by evidence, which is typically identified using literature reviews. To assist study selection when identifying evidence for reviews, research questions are defined using the PICOS framework, where P stands for Population, I for Intervention, C for Comparator, O for Outcome, and S for Study-design [5]. Typically, reviewers exclude studies deviating from the inclusion criteria on any PICOS dimension; that is, they usually only include studies providing direct evidence. Hence, direct evidence on relative effectiveness comprises one or more randomised studies, evaluating the intervention(s) under assessment, recruiting patients from the population of interest, and measuring effects on all relevant outcomes.
Where multiple studies exist to inform the same parameter, these can be synthesised to generate a single estimate that represents the evidence-base. To synthesise the evidence base and provide DAMs with relative effectiveness inputs, standard Meta-Analysis (MA) and Network Meta-Analysis (NMA) methods [6,7] are commonly used. Although synthesis is more common for Relative Treatment Effects (RTEs), evidence synthesis methods can also be applied for other DAM inputs such as costs and Quality of Life (QoL).
However, in HTA, direct evidence may be sparse, heterogeneous, or limited in other ways and synthesis may become problematic. Where evidence is sparse, it may not be possible to obtain the required Relative Treatment Effect (RTE) estimates, and even when they can be obtained, they may be highly uncertain and may not be robust due to assumptions imposed in the analysis [8,9]. Evidence sparsity may also prevent appropriate exploration of heterogeneity because small studies are at higher risk of enrolling unrepresentative populations [10] and provide less evidence to enable robust subgroup analyses.
A policy-relevant alternative to relying on limited or sparse data may be to extend the evidence base beyond the direct evidence. A topical example concerns paediatric indications for which the evidence-base is typically sparse due to the regulatory restrictions on trials. To support decision-making for this population, the Food and Drug Administration (FDA) [11] and the European Medicines Agency (EMA) now propose that "The evidence needed to address the research questions that are important for marketing authorisation of a given product in the target population might be modified based on what is known for other populations" [12]. Whilst in the aforementioned example the evidence is extended to consider another population, in principle, indirect evidence may relate to any other dimension of PICOS (Fig. 1): it may include studies assessing a different, but related, treatment or pertaining to a different study-design than what is specified in the research question. Note that, in this context, NMA also considers indirect evidence, pertaining to other treatment comparisons, i.e. indirect evidence on the 'Intervention' PICOS dimension, to inform the treatment effect(s) of primary interest [13].
Within a decision-making context, the use of indirect evidence, as long as it is judged relevant, contributes to accountability by allowing for all relevant evidence to be considered. Combining all relevant sources of evidence may yield more precise estimates than the direct evidence alone and allow better characterisation of heterogeneity and uncertainty. However, when indirect evidence is not sufficiently relevant or of high quality, using it may also introduce bias and inflate heterogeneity estimates.
The use of indirect evidence to support decision-making is not exclusive to the aforementioned regulatory context and has permeated HTA processes. Examples can be found in Technology Appraisals (TAs) conducted by the National Institute for Health and Care Excellence (NICE) to inform routine use of technologies in the National Health Service (NHS) in England and Wales. For instance, TA445 [14] considered adult studies to complement a sparse paediatric evidence base. Also, relative effectiveness has been generalised across subgroups of different Hepatitis C genotypes [15]. These two examples use indirect evidence by considering both sources perfectly generalisable ('lumping'), as an alternative to considering them completely independent ('splitting'). There are, however, examples of appraisals which use indirect evidence in more sophisticated ways. For instance, TA383 [16] used indirect evidence across interventions by assuming a 'class-effect' between treatments that function through the same molecular pathway. TA139 [17] and TA168 [18] simultaneously modelled two outcomes leveraging their correlation structure, and TA244 [19] modelled a network of interventions with multiple treatment components assuming that the relative effect of an intervention is the sum of the relative effects of its constituent components.
Inevitably, a judgement on whether the indirect evidence is relevant is always required. However, what is often not made explicit is that, where both direct and indirect evidence are considered, there should be appropriate consideration for the extent of information-sharing permitted by different synthesis methods (i.e. the extent to which the indirect evidence is allowed to affect the estimates obtained by using only the direct evidence).
The objective of this review is to identify information-sharing evidence synthesis methods that have been used in the literature and improve understanding of these methods by making explicit the fundamental assumptions underpinning them. We do so by identifying the 'core' relationships used to share information. This review increases awareness around the breadth of available information-sharing methods and aids transparency in the choice of information-sharing methods. To our knowledge, this topic has not been explored in the past with a clear policy focus.

Methods
We conducted a literature search that was systematic and transparent in its methods and conduct, but was not comprehensive due to the challenges described below [20]. We will, therefore, refer to our methodology as a literature review and not as a systematic review.
To inform the design and conduct of the literature review, we initially conducted a scoping review (details provided in Additional file 1). Its aims were to clarify working definitions, determine inclusion and exclusion criteria, understand whether keyword-based methods [20] would generate a sensitive and specific search strategy, obtain a comprehensive list of representative seminal papers on information-sharing, and conceive how the breadth of information-sharing methods could be categorised in a useful manner. We found that consistent terminology was not used when referring to methods that combined direct and indirect evidence. Therefore, for the main literature review, we used citation-mining methods [21] which are efficient [22] and have been used for similar reviews [23]. The methods of the main review were protocolled in advance.
Citation-mining involves two steps. The first encompassed the compilation of a list of seminal/influential papers. All relevant papers identified in the scoping review were considered and seminal papers were selected to reflect breadth [22] by including different fields of research: MA, NMA, multi-parameter evidence synthesis, synthesis of multiple outcomes and the incorporation of evidence on historical controls in trial-design. Two external evidence synthesis experts were consulted to validate the list and provide additional references. The second, and main, step of the citation-mining review was then conducted in the Web of Science (WoS) on 20/Feb/2019 by identifying all the references cited by the seven seminal papers [8,24-29], and then all articles that cited the seminal papers, i.e. forwards and backwards citation-mining.
Inclusion and exclusion criteria were pre-specified (see Additional file 1). Articles were included if they formally specified MA or NMA models (in mathematical notation or computer code) that combined information pertaining to multiple populations, interventions, outcomes or study-designs, or if they utilised evidence from an external source (such as a previous meta-analysis). Given the aim of identifying a range of methods for the sharing of information, papers that used only standard NMA methods originally described by Lu et al. [7] (i.e. by pooling evidence sets assuming perfect exchangeability) were excluded.
Data extraction was pre-specified and included year of publication, the synthesis challenge addressed, the specific method (relationship) imposed between the 'direct' and 'indirect' evidence to facilitate information-sharing, the PICOS dimension(s) of indirectness, the 'cores' used, the parameter over which information-sharing was imposed, and whether the paper fell into the field of MA or NMA. When papers tackled multiple challenges simultaneously (e.g. [30-32]), the challenges they dealt with were isolated and extracted separately. Further information on the search strategy and inclusion and exclusion criteria is provided in Additional file 1.
The search was conducted in Zotero version 5.0.69 and a link to the Zotero database of included papers, where the papers have been grouped according to various tags, is provided at the end of the manuscript. The PRISMA checklist for systematic reviews is provided in Additional file 2. The results of the search were reported descriptively, by grouping the methods by the policy problem and the PICOS dimension of indirectness. Methods were then categorised according to the 'core' relationship they used to enable information-sharing and the statistical methods falling within each core were described.

Characteristics of the included studies
The review identified 89 papers (Fig. 2) which are available in our online database (link provided at the end of the manuscript). The majority (n = 79) described methods that shared information on relative treatment effects. Other studies used methods to share information on comparison-specific meta-regression slopes (n = 4), comparison-specific between-studies heterogeneities (n = 6), or study-specific baselines (n = 2). Overall, there was a balance amongst papers that developed methods within MA (n = 45) and NMA (n = 44). There was a strong emphasis on methods for information-sharing across multiple outcomes (42 papers) and treatments (25 papers), with fewer papers focusing on study-designs (23 papers) or populations (8 papers) (Table 1). Note that some papers described methods sharing information on several types of parameters and across more than one PICOS dimension (e.g. [30-32]). A full list of the included papers along with a description of how information was shared within each paper can be found in Additional file 3.

'Core' relationships for information-sharing
The methods identified were classified according to the 'core' relationship facilitating information-sharing. Four 'core' methods were identified: 1) functional relationships, which include deterministic functions among model parameters resulting in a reduced number of parameters that need to be estimated; 2) exchangeability-based relationships, which assume that a set of parameters are drawn from a common distribution that allows them to be shrunk towards its mean; 3) prior-based relationships, which employ a Bayesian framework to 'load' the indirect evidence in prior distributions; and 4) multivariate methods, which assume that model parameters are correlated and enable information-sharing through the covariance structure. Figure 3 provides a description of the main assumption and mathematical relationship imposed by each 'core' method. Table 1 classifies papers according to the 'core' method used and the PICOS dimension of indirectness. It shows that some 'core' relationships are preferred when information is shared across specific PICOS dimensions. For instance, most of the identified papers sharing information across interventions use either functional or exchangeability-based relationships, and no example using priors was found. Also, papers that use multivariate relationships do so to share information across related outcomes, not across populations or study-designs. This may be partly because the information required to implement multivariate methods for multiple populations or study-designs is usually unavailable in the literature. For instance, to synthesise evidence on multiple populations using multivariate methods, we would need studies that enrol all relevant populations and report separately for each, and such information is rarely provided.
Another type of functional relationship is a constraint where a strict inequality is imposed among parameters.
In a Bayesian framework, information-sharing is facilitated by preventing simulation samples that do not conform to the specified constraint. Such methods have been used to relate RTEs across dosages, expressing that higher dosages are expected to exhibit larger RTE [35,44], describe structurally-related outcomes [57], and specify second-order consistency equations that impose a triangle inequality on the comparison-specific between-trial variances [39,40]. Meta-regression-type methods have also been suggested. In the examples found, the relationships were usually linear (on the modelling scale), with one RTE component independent and another RTE component dependent on a particular study characteristic. The most common example in this category is bias-adjustment methods, primarily used to synthesise studies of different designs. Bias-adjustment methods broadly fall into two categories: general frameworks that adjust the RTE for biases affecting internal and external validity provided that the extent of bias can be either estimated from empirical evidence or elicited from experts [91,92,98,99], and approaches that adjust for bias due to particular study-level characteristics considered proxies for study quality, such as their size [42,50,94-96], publication year [97], or risk-of-bias [90,93]. Meta-regression-type relationships have also been used for complex interventions. In their simplest form, they model the RTE of a complex intervention as the sum of RTEs of its treatment components [30,36,47,48]. More sophisticated approaches allow for synergistic or antagonistic relationships by suggesting functions that also contain treatment interaction RTE components [37].
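The additive relationship for complex interventions described above can be sketched in a few lines. This is only an illustration of the functional form, not code from any of the included papers: the component names, the effect values (imagined as log odds ratios) and the function `intervention_effect` are all hypothetical.

```python
# Hypothetical component-level RTEs on the modelling scale (e.g. log odds ratios).
component_effects = {"education": -0.10, "exercise": -0.25, "counselling": -0.15}


def intervention_effect(components, effects, interactions=None):
    """RTE of a complex intervention as the sum of its component RTEs.

    `interactions` optionally maps a tuple of component names to an extra
    term that applies only when all of those components are present,
    allowing synergistic or antagonistic departures from additivity.
    """
    total = sum(effects[c] for c in components)
    for combo, value in (interactions or {}).items():
        if set(combo) <= set(components):
            total += value
    return total


# Purely additive model: only the three component effects are estimated,
# however many combinations of components appear in the network.
additive = intervention_effect(["education", "exercise"], component_effects)

# Same intervention with an assumed antagonistic interaction term.
with_interaction = intervention_effect(
    ["education", "exercise"],
    component_effects,
    interactions={("education", "exercise"): 0.05},
)
```

The point of the sketch is the reduction in parameters: evidence on any intervention containing a component informs that component's effect, and hence every other intervention that shares it.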
Other applications include approaches that model the RTEs for two survival outcomes (e.g. time-to-mortality and time-to-progression) by assuming that they only differ by a constant component which is invariant across treatment comparisons [31], models that assume a linear relationship between dosage and RTE [35,46], methods for baseline-risk adjustment [34], and models that relate the relative effects of populations subgroups of differing disease severity [33].
Finally, more complex, non-linear, relationships have also been presented in the literature, namely those enabling the synthesis of RTEs across a range of dosages using the Emax model [51-53] commonly employed in pharmacokinetics or other non-linear models [35] and those enabling information-sharing across follow-up periods [61,79].

Exchangeability-based relationships
The simplest exchangeability-based relationship uses a random effect to relate a set of parameters, in this way accounting for heterogeneity without explicitly modelling its source(s). The random effect assumes that all parameters are drawn from a distribution, implying that individual parameters are shrunken towards the random effect mean; this can happen to a greater or lesser extent, depending on the precision and discrepancy of each individual estimate in relation to the random effect mean. Examples of parameters to which random-effects have been applied include: comparison-specific meta-regression slopes [34,38,41,42,50], comparison-specific between-trial variances [39,40], and study-specific baseline-risks [34,54].
Fig. 3 'Core' categories of information-sharing
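The dependence of shrinkage on precision can be illustrated with the standard normal-normal result. The sketch below assumes, purely for illustration, that the random effect mean `mu` and between-study standard deviation `tau` are known (in practice both are estimated), and all numbers are made up.

```python
def shrink(y, se, mu, tau):
    """Shrunken estimate of one parameter under a normal random effect.

    y, se : observed estimate and its standard error
    mu    : random effect mean (assumed known here)
    tau   : between-study standard deviation (assumed known here)
    """
    b = se**2 / (se**2 + tau**2)  # shrinkage factor: 0 = none, 1 = complete
    return b * mu + (1 - b) * y


mu, tau = -0.30, 0.20

# Two hypothetical studies with the same point estimate but different precision.
estimates = [(-0.80, 0.10), (-0.80, 0.40)]
shrunk = [shrink(y, se, mu, tau) for y, se in estimates]
```

With these assumed values the precise study (standard error 0.10) moves only a fifth of the way towards the mean, whereas the imprecise one (standard error 0.40) moves four fifths of the way.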
Random-walks are another form of exchangeability relationship. They assume that data points which are more similar with respect to a particular characteristic are expected to exhibit more similar RTEs. Examples include approaches assuming that the RTE of a particular dosage or follow-up period is drawn from a distribution centred around the RTE of its adjacently lower or higher dosage [35] or follow-up period [45,61].
Multi-level models also use exchangeability, but apply it to the hierarchical/clustered structure of the available data. As such, exchangeability is applied at a first level within specific groups of parameters (i.e. multiple random effects are applied, each within groups of RTEs from studies showing a particular characteristic) and at a second level across the group-specific hyper-parameters. This is shown in Fig. 4, where in the bottom level, studies are categorised according to a characteristic and a different random effect is imposed within every category, producing group-specific basic parameters and heterogeneities. Subsequently, in the top-level, exchangeability is also assumed across the group-specific basic parameters which are shrunk towards an overall, global, group-independent, hyper-mean. Examples include 'class-effect' models where, on top of the classical Random-Effects (RE) NMA models, the basic parameters of treatments that function through the same mechanism are assumed to be drawn from a common distribution with an overall 'class' mean and an across-treatments, within-class, heterogeneity [32,33,35,44,46,49]. Class-effect approaches have also been imposed across comparison-specific meta-regression slopes [38,50]. Multi-level models have been suggested to combine adult and paediatric evidence [55], RTEs measured at different time-points [30], and studies of different designs [85,86,88,100].
Fig. 4 An illustration of a multi-level model

Prior-based relationships
Direct and indirect evidence can also be combined through the use of prior distributions. The process usually consists of two steps where initially the indirect evidence is analysed and subsequently the resulting distribution is used as a prior in the analysis of the direct evidence. Of note is that this approach is mathematically equivalent to lumping, which was described under functional relationships. Examples include the combination of adult and paediatric evidence [55] or randomised and non-randomised evidence [85-89]. The prior can additionally be adjusted for bias or its precision decreased [85]. Alternative ways to define the prior include the use of meta-epidemiological evidence or expert elicitation. The former has been used primarily for bias-adjustment [90], whilst both the former [24,107,108] and the latter [110] have been used to define a prior distribution for the between-trials heterogeneity.
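The equivalence between the two-step approach and lumping can be checked numerically in the simplest setting. The sketch below assumes a fixed-effect model with normal likelihoods and known standard errors, which is a simplification chosen for illustration; the study estimates and the helper names `pool` and `update` are hypothetical.

```python
def pool(estimates):
    """Inverse-variance (fixed-effect) pooling of (estimate, standard error) pairs.

    Returns the pooled mean and its variance.
    """
    weights = [1.0 / se**2 for _, se in estimates]
    mean = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    return mean, 1.0 / sum(weights)


def update(prior, estimates):
    """Conjugate normal update: treat the prior (mean, variance) as one more data point."""
    prior_mean, prior_var = prior
    return pool([(prior_mean, prior_var**0.5)] + list(estimates))


indirect = [(-0.50, 0.10), (-0.40, 0.20)]  # e.g. adult studies (made-up values)
direct = [(-0.10, 0.30)]                   # e.g. a single paediatric study

# Step 1: analyse the indirect evidence. Step 2: use the result as the prior.
two_step = update(pool(indirect), direct)

# Lumping: pool everything in a single analysis.
lumped = pool(indirect + direct)
```

Under these assumptions `two_step` and `lumped` agree up to floating-point error, which is why the prior is usually down-weighted or adjusted when full pooling is not considered plausible.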
More nuanced prior-based approaches such as mixtures of priors have also been used. Here, the informative prior (distribution representing the indirect evidence) is not used at face value, but instead mixed with a vague prior according to weights that may be specified by the analyst or estimated within the synthesis model. The resulting informative prior is typically heavy-tailed, and allows for 'adaptive' information-sharing whereby information-sharing is stronger when the direct and indirect evidence are in agreement and weaker when they conflict [56]. Mixtures of priors have been used to combine evidence on RTE and between-studies heterogeneity across adults and children [56] and to analyse the study-specific baseline parameters from studies that enrol populations with different baseline risks [34]. The use of mixtures of priors has also been discussed for the synthesis of randomised and non-randomised evidence [85].
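The 'adaptive' behaviour of a mixture prior can be seen from how the weight on the informative component is updated by the direct evidence. The following is a minimal sketch assuming a two-component normal mixture, a single direct estimate with known standard error, and an analyst-specified prior weight; all values and the function names are illustrative assumptions rather than any cited implementation.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def posterior_weight(y, se, w, informative, vague):
    """Posterior weight on the informative component of a mixture prior.

    y, se       : direct estimate and its standard error
    w           : prior weight on the informative component
    informative : (mean, variance) summarising the indirect evidence
    vague       : (mean, variance) of the vague component
    """
    (m1, v1), (m2, v2) = informative, vague
    # Marginal likelihood of the direct estimate under each component.
    like1 = normal_pdf(y, m1, v1 + se**2)
    like2 = normal_pdf(y, m2, v2 + se**2)
    return w * like1 / (w * like1 + (1 - w) * like2)


informative = (-0.50, 0.10**2)  # assumed summary of the indirect evidence
vague = (0.0, 10.0**2)          # assumed vague component

agree = posterior_weight(-0.45, 0.20, 0.5, informative, vague)
conflict = posterior_weight(1.50, 0.20, 0.5, informative, vague)
```

When the direct estimate is close to the indirect evidence, the informative component keeps almost all of the weight; when it conflicts, the weight collapses towards zero and the analysis behaves much like one based on the vague prior alone.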
Finally, a flexible method that has been proposed is the power-prior [111]. In this method, the likelihood of the indirect evidence is raised to a power scalar 0 ≤ α ≤ 1 which reflects the perceived similarity between the two sources of evidence. When α = 1 the results are equivalent to 'lumping' and when α = 0 results are identical to 'splitting'. The power parameter, α, needs to be specified, and it has been proposed to be elicited [112] or varied in sensitivity analysis [113]. Power priors have been used to combine observational and randomised evidence [101] and for the synthesis of adult and paediatric evidence [55].
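For a normal likelihood, raising the indirect evidence to a power simply divides its precision by that power, which makes the two limiting cases easy to verify. The sketch below assumes a single direct and a single indirect estimate with known standard errors and a flat initial prior; the numbers and the function name are hypothetical, and the power parameter is written `alpha` in the code.

```python
def power_prior_posterior(direct, indirect, alpha):
    """Posterior mean and variance under a normal power prior.

    direct, indirect : (estimate, standard error) pairs
    alpha            : power parameter in [0, 1] discounting the indirect evidence
    """
    (y, se), (y0, se0) = direct, indirect
    w_direct = 1.0 / se**2
    w_indirect = alpha / se0**2  # likelihood^alpha scales the precision by alpha
    total = w_direct + w_indirect
    return (w_direct * y + w_indirect * y0) / total, 1.0 / total


direct = (-0.20, 0.30)
indirect = (-0.60, 0.10)

splitting = power_prior_posterior(direct, indirect, 0.0)  # direct evidence only
lumping = power_prior_posterior(direct, indirect, 1.0)    # full pooling
partial = power_prior_posterior(direct, indirect, 0.5)    # in between
```

Varying `alpha` over a grid between 0 and 1 is one simple way to run the kind of sensitivity analysis mentioned above.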

Multivariate relationships
Multivariate relationships have primarily been used to share information across multiple outcomes. Multivariate meta-analysis correlates the various outcomes and may separate within- and between-studies correlations [73]. At the within-study level, the study-specific correlations arise due to differences among the included patients and indicate how the outcomes co-vary across individuals within the study. For example, patients who, due to a baseline characteristic that makes their disease more severe, show high values for outcome A, are also more likely to yield high values for outcome B. At the between-studies level, correlations arise mainly due to study-level differences such as the distribution of the patient-level characteristics across studies. For instance, studies that enrol more severe cases and therefore may show high values for the mean of outcome A, are also more likely to result in high values for the mean of outcome B, whilst studies enrolling less severe cases may show lower mean values for both outcomes. These models can potentially produce more precise estimates [75] and mitigate outcome reporting bias [103,104].
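How a correlation lets one outcome inform another can be illustrated with the conditional distribution of a bivariate normal. This is a deliberately stripped-down sketch: it considers only a between-studies distribution with assumed known means, standard deviations and correlation, and ignores within-study sampling error; all values are made up.

```python
def conditional_normal(mu_a, mu_b, tau_a, tau_b, rho, observed_a):
    """Mean and variance of the effect on outcome B given the effect on outcome A.

    (mu_a, mu_b)   : between-studies means of the two outcomes
    (tau_a, tau_b) : between-studies standard deviations
    rho            : between-studies correlation
    observed_a     : effect on outcome A in the study of interest
    """
    mean = mu_b + rho * (tau_b / tau_a) * (observed_a - mu_a)
    var = tau_b**2 * (1 - rho**2)
    return mean, var


# A study reporting only outcome A, with a larger-than-average effect.
uncorrelated = conditional_normal(-0.40, -0.20, 0.20, 0.10, 0.0, -0.60)
correlated = conditional_normal(-0.40, -0.20, 0.20, 0.10, 0.8, -0.60)
```

With zero correlation, observing outcome A tells us nothing about outcome B. With a correlation of 0.8, the expected effect on B shifts in the same direction as A and its variance falls, which is the borrowing of strength that multivariate methods rely on.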
Multivariate methods have been developed to consider two [74,83,84], three or more correlated outcomes [26,78], accommodate the simultaneous analyses of multiple treatments [63,68,80], and assess the relationship between surrogate and final outcomes [65,67]. Given that within-trial correlations are commonly not reported, authors have suggested the use of external data to inform these parameters [64] or, when external data is not available, methods that approximate the within-study co-variances [77]. Further extensions have been developed to handle missing data [70], assist the estimation of the between-studies covariance matrix when only a few studies are available [71], model the within-studies covariance structure using copulas [72], and allow modelling of heterogeneity and inconsistency using two separate variance components [69].
To accommodate cases where the within-trials correlations are unavailable and cannot be otherwise obtained, alternative methods, which require the same data as a univariate approach and do not separate within- and between-trials correlations, have been suggested for MA [81,82] and NMA [80]. Assuming that the overall correlation is not very strong, these methods perform very similarly to their counterpart, which separates the two correlations, whilst preserving their benefits against the univariate approach.
Finally, some methods only account for either the within- or the between-studies correlations. For example, to model mutually exclusive outcomes, it has been suggested to only account for the within-trials negative correlations which are induced by the competing risks structure of the data (i.e. the more patients that reach an outcome, the fewer the patients that reach another outcome) [62]. Also, other approaches have only modelled the between-studies covariance matrix to allow simultaneous synthesis of multiple outcomes [30,31,57,60], accommodate outcomes reported at several follow-up periods [58,59] and enable information-sharing across different treatment components of complex interventions [36].

Discussion
The aim of this review was to identify and classify evidence synthesis methods that have been used to combine evidence from sources that relate directly and indirectly to a particular research question. A wide range of methods have been developed to share information between populations, treatments, outcomes and study-designs. We found that across the breadth of methods identified, four 'core' relationships are used to facilitate information-sharing. These are functional, exchangeability-based, prior-based, and multivariate relationships and are illustrated in Fig. 3.
This review highlights the breadth of methodological options that can facilitate information-sharing. Although, typically, particular relationships are used preferentially to share information in specific information-sharing contexts, it is likely that several methods are applicable and analysts would need to choose which method is more appropriate. This paper highlights that appropriate considerations need to be made when choosing 'core' relationships and methods because choices are likely to influence the degree of information-sharing. Specifically, method selection may be informed by the following considerations. The first is the plausibility of the assumptions imposed by the methods in the context of interest. By classifying methods according to the 'core' relationship that enables information-sharing, we hope to facilitate a clearer discussion about the plausibility of these assumptions in the decision context of interest.
The second is the degree of information-sharing that is imposed between direct and indirect evidence. Within the literature, there is limited exploration of how much different methods borrow strength from indirect evidence, though for multivariate methods, it has been noted that information-sharing is 'usually modest' [26,66] and, sometimes, instead of 'borrowing-strength', multivariate methods may end up 'borrowing-weakness' [114]. The few studies that have assessed the degree of information-sharing typically consider only the degree of precision gains [115] rather than also examining how the point estimate, which is also important for decision making, changes. Further research to understand the extent to which different methods share information is warranted.
Finally, decision-makers may be interested in exploring different levels of information-sharing. One way to do that is by using prior-based methods that allow some control on the degree of information-sharing. For instance, an informative prior may use either the posterior distribution of the mean, or the predictive distribution of the indirect evidence. The former is equivalent to lumping, whilst the latter imposes less information-sharing. Similarly, mixture priors can regulate the weight that is placed on the informative component, and power-priors allow a range of values to be used for α which determines the extent of information-sharing.
Whilst our literature search was systematic in its methods and conduct, it is unlikely to have been fully comprehensive. The use of citation-mining techniques, while efficient and necessary for this search, may have missed relevant methods. This is because the sensitivity of citation-mining methods depends on the existence and identification of seminal papers [22] and on papers citing the most impactful references [116]. Due to time lags in citations, this technique may not capture recent developments within a field [117]. We have also excluded methods developed outside of health research and did not specifically target the grey literature. Since the search was conducted, we found in the grey literature a relevant method using multivariate methods to simultaneously synthesise the relative effects of patients treated at different lines of treatment [118]. However, we do not believe that the conclusions of a comprehensive search (had it been possible) would differ from those in this paper, namely regarding the core relationships identified, and the focus of sharing being on sharing across outcomes and treatments. We would also like to highlight that it would be important that further research considers methods developed for different purposes that could be applied for information sharing. One example is commensurate priors which have been used to combine individual-patient data and aggregate-level evidence [119].
This paper is the first to summarise and categorise the existing literature by classifying methods according to the 'core' assumption that they use to facilitate information-sharing. Further research could explore the following questions: first, how can we determine whether indirect evidence is relevant? Second, how can the appropriateness of each information-sharing method be assessed for the synthesis problem at hand? Finally, can the extent of information-sharing be quantified to assist transparent decision-making?

Conclusions
We conclude that a plethora of methods has been used to facilitate information-sharing. These can be categorised according to the main assumption they impose into functional, exchangeability-based, prior-based, and multivariate relationships. Despite the wide range of available methods, these are often used preferentially without ensuring that all options have been explored. Given that methods may differ in the degree of information-sharing they impose, the implication is that the chosen method may impose stronger or weaker information-sharing than what is considered appropriate by policy-makers. Further research should investigate ways of judging the appropriateness of the degree of information-sharing imposed by each method, and assess the impact of using different methods on decisions.