
Methods for information-sharing in network meta-analysis: Implications for inference and policy

Published online by Cambridge University Press:  10 March 2025

Georgios F. Nikolaidis*
Affiliation:
IQVIA, Paddington, London, UK Centre for Health Economics, University of York, York, UK
Beth Woods
Affiliation:
Centre for Health Economics, University of York, York, UK
Stephen Palmer
Affiliation:
Centre for Health Economics, University of York, York, UK
Sylwia Bujkiewicz
Affiliation:
Biostatistics Research Group, Department of Population Health Sciences, University of Leicester, Leicester, UK
Marta O. Soares
Affiliation:
Centre for Health Economics, University of York, York, UK
*
Corresponding author: Georgios F. Nikolaidis; Email: [email protected]

Abstract

Limited evidence on relative effectiveness is common in Health Technology Assessment (HTA), often due to sparse evidence on the population of interest or study-design constraints. When evidence directly relating to the policy decision is limited, the evidence base could be extended to incorporate indirectly related evidence. For instance, a sparse evidence base in children could borrow strength from evidence in adults to improve estimation and reduce uncertainty. In HTA, indirect evidence has typically been either disregarded (‘splitting’; no information-sharing) or included without considering any differences (‘lumping’; full information-sharing). However, sophisticated methods that impose moderate degrees of information-sharing have been proposed. We describe and implement multiple information-sharing methods in a case-study evaluating the effectiveness, cost-effectiveness and value of further research of intravenous immunoglobulin for severe sepsis and septic shock. We also provide metrics to determine the degree of information-sharing. Results indicate that method choice can have a significant impact. Across information-sharing models, odds ratio estimates ranged between 0.55 and 0.90, and incremental cost-effectiveness ratios ranged from £16,000 to £52,000 per quality-adjusted life year gained. The need for a future trial also differed by information-sharing model. Heterogeneity in the indirect evidence should also be carefully considered, as it may significantly impact estimates. We conclude that when indirect evidence is relevant to an assessment of effectiveness, the full range of information-sharing methods should be considered. The final selection should be based on a deliberative process that considers not only the plausibility of the methods’ assumptions but also the imposed degree of information-sharing.

Type
Research Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology

Highlights

What is already known?

  • Various evidence synthesis methods exist for sharing information across different evidence sets.

  • Typically, only one information-sharing method is applied, and the impact of this selection is unknown.

What is new?

  • We describe methods to share information across studies conducted in different populations.

  • We apply a range of information-sharing methods to a case-study that borrows strength from paediatric evidence to inform a decision in adults.

  • Our findings reveal significant variability in the extent of information-sharing imposed by different methods, highlighting critical implications for inference and policy.

Potential impact for RSM readers outside the authors' field?

  • Information-sharing is pertinent across several areas beyond medical research. The conclusions of this article are generalisable across scientific fields.

1 Introduction

Evidence synthesis methods, such as (network) meta-analysis, (N)MA, combine evidence from multiple studies, typically randomised controlled clinical trials (RCTs), on the relative effectiveness of two or more health care technologies. Evidence from (N)MA is the cornerstone of health technology assessment (HTA), grounding the clinical and cost-effectiveness assessments supporting clinical decisions and health care policy.

The evidence base for synthesis is typically systematically identified using the PICOS framework, which defines the scope of the literature review (P, population; I, intervention; C, comparator; O, outcome; and S, study-design). 1 Usually, only evidence that meets the PICOS criteria—that is, direct evidence—is retained and synthesised. However, direct evidence may be sparse, biased, or of low internal validity, hindering robust analyses and resulting in highly uncertain estimates.Reference Ades and Sutton 2 , Reference Sweeting, Sutton and Lambert 3

To address these challenges, an alternative approach is to expand the evidence base by including indirectly related evidence that retains some relevance to the decision problem. This could entail, for example, using evidence obtained in adults to inform paediatric assessments, as considered by the FDA/EMA. 4 , 5 The term ‘indirect evidence’ refers to evidence whose scope differs in at least one, but not all, of the PICOS domains,Reference Nikolaidis 6 , Reference Nikolaidis, Beth, John and Soares 7 and generalises the use of this term from the NMA context, where it refers only to the Intervention and Comparator domains of PICOS.

In synthesising the two sources of evidence, indirect evidence does not need to be assumed as either perfectly generalisable with the direct evidence (i.e., ‘lumping’) or completely independent from it (i.e., ‘splitting’). Instead, information can be shared to varying degrees between the two sources. Recent work identified and categorised information-sharing methods (ISMs).Reference Nikolaidis 6 , Reference Nikolaidis, Beth, John and Soares 7 Information-sharing is facilitated by the relationship that the different methods impose between parameter(s) of interest (informed by the direct evidence) and parameter(s) informed by the indirect evidence. Four ‘core’ categories were usedReference Nikolaidis 6 , Reference Nikolaidis, Beth, John and Soares 7 : 1) functional relationships, describing deterministic functions amongst the parameters, 2) exchangeability-based relationships, assuming that the multiple parameters are independent draws from a common underlying distribution, 3) prior-based relationships, where the indirect evidence is incorporated as prior beliefs in a Bayesian framework, and 4) multivariate relationships where the parameters are modelled simultaneously, imposing assumptions on their correlation structure.

The aforementioned study also highlighted that existing HTA literature has preferentially used certain core relationships for specific policy problems (e.g., exchangeability-based relationships to facilitate information-sharing between treatments of the same class), without justified reasons for such preferences. Additionally, the study emphasised that different ISMs impose varying degrees of information-sharing, necessitating careful scrutiny. Existing research has not yet explored a broad range of models based on alternative core relationships.

This article aims to compare different ISMs and their impact on inference and policy. We apply various ISMs to a case study on Intravenous Immunoglobulin (IVIG),Reference Soares, Welton, Harrison, Peura and Shankar 8 quantify the strength of information-sharing using a range of alternative metrics, and assess how these models could have enhanced evaluations of clinical effectiveness, cost-effectiveness, and the value of further clinical trials. Additionally, we clarify implementation aspects of the alternative ISMs.

2 Description of the case study

Sepsis is an inflammatory response caused by a serious bloodstream infection that can rapidly progress to a life-threatening condition.Reference Hall, Williams, DeFrances and Golosinskiy 9 Typical standard of care (SoC) treatment includes antibiotics to target the infection, fluids to manage septic shock symptoms, and occasionally albumin (ALB) serum to boost the immune system. 10

The research question of the original HTA related to the evaluation of the feasibility, cost, and value of information of a multi-center randomized controlled trial of IVIG as an add-on to SoC for adult patients with severe sepsis and septic shock (hereafter referred to as ‘sepsis’).Reference Soares, Welton, Harrison, Peura and Shankar 8 In this work, we adopt the same research question and further explore how information-sharing might have impacted conclusions relating to relative efficacy, policy making, and further research prioritisation.

2.1 Direct evidence of treatment effectiveness in adult patients, and motivation for information-sharing

The evaluation of the case study was based on direct evidence on the relative effectiveness of adjuvant IVIG or IgM-enriched IVIG (IVIGAM) as an add-on to SoC compared to SoC alone in adults, derived from 17 RCTs reporting all-cause mortality.Reference Soares, Welton, Harrison, Peura and Shankar 8

All 17 RCTs used SoC in their control arm, supplemented with either inactive placebo or albumin placebo. However, both placebo and ALB have disadvantages as control treatments. ALB is similar in appearance to IVIG (colour, transparency, opalescence) but may exert physiological effects that confound inference. Placebo eliminates physiological effects but differs in appearance from IVIG, compromising blinding.

These, and similar, issues were explored in detail by Welton et al.,Reference Welton, Soares and Palmer 11 who identified high levels of statistical heterogeneity, impacting effectiveness and cost-effectiveness analyses. Meta-regression analyses explored the potential for effect modification associated with control type (placebo or ALB), treatment characteristics (e.g., IVIG preparation or treatment duration), and potential sources of bias (e.g., industry sponsorship, sample size, or study quality). Despite extensive analyses, heterogeneity was only partly explained, and the relevance of the underlying sources of heterogeneity remained unclear.

Consequently, multiple evidence synthesis models were proposed for the cost-effectiveness and value of information analysis (Table 1). Conclusions were highly sensitive to the choice of clinical effectiveness model, with the predicted odds ratio (OR) ranging from 0.68 to 1.27, and the value of a new RCT (expected maximum net benefit of sample) ranging from £137 million to £1,011 million.Reference Soares, Dumville, Ades and Welton 12 The authors recommended conducting a high-quality multicenter RCT. However, such a study would be costly, take several years to complete, and, to the best of our knowledge, has not yet been funded. Here, we explore an alternative approach to support decision-making in the adult population by sharing information from other sources of evidence.

Table 1 Evidence synthesis models used in the original HTA alongside their predicted odds ratios for all-cause mortality and key cost-effectiveness results

Note: T2: two treatments considered in the network (IVIG/IVIGAM vs. albumin or no treatment, see Supplementary Material for further details); T3b: three treatments considered in the network (IVIG/IVIGAM vs. albumin vs. no treatment, see Supplementary Material for further details). Abbreviations: EVPI, expected value of perfect information; FE, fixed effects; IVIG, intravenous immunoglobulin; IVIGAM, IgM-enriched IVIG; OR, odds ratio; RE, random effects.

2.2 Broadening the evidence base to include evidence in paediatric patients

In this article, we explore how the body of RCT evidence on IVIG for paediatric sepsis patients could support the previous appraisal in adult patients conducted by Soares et al.Reference Soares, Welton, Harrison, Peura and Shankar 8 Thus, adults remain the primary population of interest, with pediatric evidence used to strengthen relative effect estimates. Notably, a recent studyReference Capasso, Borrelli, Ferrara, Albachiara, Coppola and Raimondi 13 utilized effectiveness evidence of IVIG in adults to support its potential value in pediatric patients, suggesting that this evidence may be partially transferable between populations. While the aim of this article is primarily methodological, it would be crucial to seek clinical expert support for sharing evidence across these sets for decision-making purposes.

We i) updated the previous systematic review of RCTs on the adult population,Reference Soares, Welton, Harrison, Peura and Shankar 8 and ii) expanded the population criteria to include studies enrolling pediatric patients (see Supplementary Material for full details). We identified 28 studies: 17 enrolling adults (N = 2,300 patients, with no new studies since the previous review) and 11 enrolling children (N = 4,071 patients). No single study included both children and adults. The largest trial was pediatric, enrolling nearly 3,500 patients.Reference Brocklehurst, Farrell and King 14

Figure 1 illustrates the direct and indirect evidence base (full data available in Supplementary Material). The pediatric evidence base comprises fewer studies than the adult evidence base, shows significantly less heterogeneity, indicates a treatment effect of lower magnitude, and, unlike the adult evidence, does not produce a statistically significant result (see the random-effects meta-analyses estimates in each evidence set).

Figure 1 Fixed and random-effects pairwise meta-analyses of all-cause mortality in sepsis, separately within each population and pooled across populations. The evidence base comprises 17 studies in adults (direct evidence) and 11 studies in paediatric patients (indirect evidence). All studies report all-cause mortality. The data are available in the Supplementary Material. Points to the left of the line of no difference favour IVIG/IVIGAM over Albumin/Placebo. The plot was created using the R package ‘meta’.

3 Sharing information on relative effectiveness

In this section, we describe the range of synthesis methods that facilitate information-sharing on relative treatment effect parameters (Section 3.1.1), and the results of their application to our case study (Section 3.1.2).

3.1 Evidence synthesis

The different ISMs applied to the case study are summarised in Table 2. Methods were selected by identifying those in Nikolaidis et al.Reference Nikolaidis, Beth, John and Soares 7 which could be applied with just one indirect evidence set, and in the absence of studies that incorporate both direct and indirect evidence.

Table 2 Summary of ISMs applied in the case-study

Abbreviation: RTE, relative treatment effect.

3.1.1 Information-sharing methods

We begin by extending the notation of the standard NMA modelReference Lu and Ades 15 to describe a splitting model, which does not impose any information-sharing over the basic parameters, d. We then describe alternative ISMs in terms of the relationship they impose between the basic parameters pertaining to the population of direct relevance and those pertaining to the population of indirect relevance (i.e., $d_{1k}^{Dir}, d_{1k}^\textit{Indir}$ ).

$\underline {\text {Model 0: Splitting:}}$ Consider a set of studies comparing treatment k with a reference treatment b. Each study, i, reports only for one population, $j = \{dir, indir\}$ . Therefore, studies can be arranged as $i = 1 , ... , N_{dir} , N_{dir} + 1 , ... , N_{dir} + N_{indir}$ , with $N_{dir}$ and $N_{indir}$ being the total number of studies contributing direct and indirect information, respectively. The synthesis model for a dichotomous outcome (such as all-cause mortality considered here) takes the following form:

(1) $$ \begin{align} r_{i,k} \sim Binomial (p_{i,k}, n_{i,k}) \end{align} $$
(2) $$ \begin{align} logit(p_{i,k}) = \theta_{i,k} = \mu_{i_{b}} + \delta_{i,bk} \cdot I_{\{ k \neq b \}} \end{align} $$
(3) $$ \begin{align} \delta_{i, bk} = d_{bk}^{j} ~~~\text{(FE)} \end{align} $$
(4) $$ \begin{align} \delta_{i, bk} \sim N ( d_{bk}^{j}, {\tau^{j}}^2) ~~~\text{(RE)} \end{align} $$
(5) $$ \begin{align} d_{bk}^{j} = d_{1k}^{j} - d_{1b}^{j} \end{align} $$
(6) $$ \begin{align} d_{11}^{j} = 0 \end{align} $$

where $r_{i,k}$ , $n_{i,k}$ , and $p_{i,k}$ are the number of events, the total number of patients, and the probability of an event in study i and arm k. $\theta _{i,k}$ is the linear predictor, $\mu _{i_{b}}$ is the study-specific baseline log-odds of the outcome in the reference treatment b in trial i, and $\delta _{i,bk}$ is the study-specific Relative Treatment Effect (RTE) (log odds ratio scale) between the baseline treatment in study i and the treatment in arm k. Under a FE model, $d_{bk}^{j}$ is the population-specific RTE between treatments b and k (Equation 3). Under a RE model, the population specific parameters are the mean $d_{bk}^{j}$ and the variance ${\tau _{bk}^{j}}^2$ of the Normal distribution describing heterogeneity across the study-specific $\delta _{i, bk}^{j}$ (Equation 4). Between-trial variances are typically assumed common across comparisons (i.e., ${\tau _{bk}^{j}}^2 = {\tau ^{j}}^2$ ). Under this splitting model, the basic parameters $d_{1k}^{Dir}$ and $d_{1k}^\textit{Indir}$ are independent (no information is shared across populations), and are assigned vague prior distributions.

Note that splitting is effectively equivalent to subgroup meta-analysis and is also equivalent to a power-prior with $\alpha =0$ , a mixture prior with a weight of $0$ placed on the informative component and approximately equal to a commensurate prior with forced low precision.
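To make the arm-level structure of Equations (1)–(3) concrete, the following is a minimal Python sketch (not the authors' OpenBUGS code) of the log-likelihood contribution of a single two-arm study under the fixed-effect splitting model; the function name and all numerical inputs are illustrative only.

```python
# Minimal sketch of the arm-level binomial log-likelihood in Eqs. (1)-(3)
# for one two-arm study under the FE splitting model. Illustrative only.
import numpy as np
from scipy.special import expit
from scipy.stats import binom

def study_loglik(r, n, mu_i, d_pop):
    """r, n: events and sample sizes for (control, treatment) arms.
    mu_i: study-specific baseline log-odds; d_pop: population-specific
    log odds ratio (d^Dir or d^Indir, depending on the study's population)."""
    theta = np.array([mu_i, mu_i + d_pop])   # linear predictor, Eqs. (2)-(3)
    p = expit(theta)                         # inverse-logit
    return binom.logpmf(r, n, p).sum()       # Binomial likelihood, Eq. (1)

# Under splitting, adult and paediatric studies contribute to separate,
# independent parameters (d^Dir, d^Indir), e.g.:
print(study_loglik(r=np.array([30, 22]), n=np.array([100, 100]),
                   mu_i=-0.85, d_pop=-0.40))
```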

$\underline{\text {Model 1 - Lumping:}}$ Lumping is implemented by extending the splitting model to assume $d_{1k}^{Dir} = d_{1k}^\textit{Indir}$ . Under random-effects, we also lump the heterogeneity parameters ( $\tau _{1k}^{Dir} = \tau _{1k}^\textit{Indir}$ ) because it is the model most commonly referred to as ‘lumping’ in policy (see NikolaidisReference Nikolaidis 6 for more extensive explorations).

Lumping is equivalent to a meta-analysis that does not distinguish between direct and indirect evidence. It is also equivalent to a power-prior with $\alpha = 1$ , a mixture prior with a weight of $1$ placed on the informative component and approximately equal to a commensurate prior with forced high precision.

$\underline{\text {Model 2 - Multi-level model:}}$ When the RTEs of the direct and indirect evidence sets are not expected to systematically differ, multi-level models can be applied by extending the splitting-model so that:

(7) $$ \begin{align} d_{1k}^{j} \sim N ( D_{1k}, \phi_{1k}^2 ) \end{align} $$

where comparison-specific basic parameters from the different populations are assumed to be normally distributed with a common mean $D_{1k}$ , and a between-populations standard deviation $\phi _{1k}$ . In the case of two populations, a common heterogeneity parameter across treatment comparisons (i.e., $\phi _{1k} = \phi $ ) is here used to ensure identifiability.
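To illustrate the behaviour implied by Equation (7), the sketch below shows an empirical-Bayes style shrinkage calculation under a normal approximation with an assumed value of $\phi$; it is not the fitted Bayesian model, and all inputs are hypothetical.

```python
# Hedged illustration of the shrinkage implied by Eq. (7): each population's
# log-OR estimate is pulled towards a common mean D, governed by phi.
import numpy as np

def shrink(y, se, phi):
    """y, se: per-population log-OR estimates and standard errors;
    phi: assumed between-population SD (estimated within the full model)."""
    w = 1.0 / (se**2 + phi**2)
    D = np.sum(w * y) / np.sum(w)                  # common mean, Eq. (7)
    post_prec = 1.0 / se**2 + 1.0 / phi**2
    return (y / se**2 + D / phi**2) / post_prec    # shrunken d_1k^j

y = np.array([-0.45, -0.15])    # hypothetical adult / paediatric log-ORs
se = np.array([0.15, 0.10])
for phi in (0.05, 0.2, 1.0):    # small phi -> near-lumping; large -> near-splitting
    print(phi, shrink(y, se, phi).round(3))
```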

$\underline{\text {Model 3 - Standard informative prior distribution:}}$ This comprises a two-step approach whereby the indirect evidence is first separately analysed using the standard NMA with vague prior distributions, and subsequently the posterior estimates of the first step are used as prior information in a standard NMA model including only the direct evidence:

(8) $$ \begin{align} d_{1k}^{Dir} \sim N (d_{1k}^\textit{Indir}, V_{1k}^\textit{Indir} ) \end{align} $$

where $d_{1k}^\textit{Indir}$ is the posterior mean obtained in step 1, and $V_{1k}^\textit{Indir}$ its corresponding variance. Under FE, this model is equivalent to lumping. Under RE, we can use either the posterior mean and its associated uncertainty for $d_{1k}^\textit{Indir}$ , that is, $\sim N (d_{1k}^\textit{Indir}, {se^2}_{1k}^\textit{Indir})$ , or its predictive distribution, that is, $\sim N (d_{1k}^\textit{Indir}, {se^2}_{1k}^\textit{Indir} + {\tau ^2}^\textit{Indir})$ . Here, we choose the latter because we want to ensure that the uncertainty that is due to heterogeneity is appropriately reflected.
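The sketch below illustrates the two-step logic of Equation (8) with a normal-normal conjugate update, using the predictive distribution of the indirect evidence as the prior; the full analysis is fitted in OpenBUGS, and all numbers here are illustrative rather than case-study estimates.

```python
# Minimal normal-approximation sketch of the two-step informative-prior
# approach (Eq. 8). Values are hypothetical.

# Step 1: posterior summaries from analysing the indirect (paediatric)
# evidence alone: mean log-OR, its SE, and heterogeneity SD tau.
d_indir, se_indir, tau_indir = -0.10, 0.12, 0.20

# Predictive distribution used as the prior (reflects heterogeneity):
prior_mean, prior_var = d_indir, se_indir**2 + tau_indir**2

# Step 2: combine with a normal approximation to the direct (adult)
# likelihood via the conjugate normal-normal update.
d_dir_lik, var_dir_lik = -0.45, 0.15**2
post_prec = 1.0 / prior_var + 1.0 / var_dir_lik
post_mean = (prior_mean / prior_var + d_dir_lik / var_dir_lik) / post_prec
print(post_mean, (1.0 / post_prec) ** 0.5)   # posterior mean and SD of d^Dir
```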

$\underline {\text {Model 4 - Mixture prior distribution:}}$ The mixture prior model is an extension of the informative prior model whereby the posterior estimates of the initial analysis of the indirect evidence are combined with a non-informative prior to form comparison-specific mixture prior distributions for the basic parameters of the direct evidence:

(9) $$ \begin{align} d_{1k}^{Dir} \sim \nu \cdot N(d_{1k}^\textit{Indir}, V_{1k}^\textit{Indir}) + (1-\nu) \cdot N(0, 10^{4}), \end{align} $$

where $\nu $ is the weight placed on the informative component, reflecting the plausibility of sharing information between the direct and the indirect evidence sets. Note that, for $\nu =1$ , this is equivalent to the standard informative prior model (model 3), whilst for $\nu =0$ , this is equivalent to splitting (model 0). The parameter $\nu $ can be assumed, elicited from experts, or estimated within the model by assigning a Beta prior, when the mixture prior comprises only two components, or a Dirichlet prior when it comprises several components. Here, we choose the Beta prior approach, as it has been shown that it offers adaptive information-sharing,Reference Roever, Wandel and Friede 16 that is, encouraging information-sharing when the direct and indirect sources of evidence are similar, and discouraging it when they are ‘in disagreement’.
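The adaptive behaviour of the mixture prior can be illustrated with a simple normal-approximation sketch using a fixed weight $\nu$ (in the analyses reported here $\nu$ is instead assigned a Beta prior and the model is fitted in OpenBUGS); all values below are hypothetical.

```python
# Hedged sketch of the mixture prior (Eq. 9) under normal approximations
# with a fixed weight nu. Values are illustrative.
import numpy as np
from scipy.stats import norm

def mixture_posterior(y, se, d_ind, v_ind, nu, vague_var=1e4):
    """y, se: direct-evidence log-OR and SE (normal likelihood approx.);
    d_ind, v_ind: mean/variance of the informative (indirect) component."""
    comps = [(d_ind, v_ind), (0.0, vague_var)]
    weights = np.array([nu, 1.0 - nu])
    # Marginal likelihood of the direct data under each prior component:
    marg = np.array([norm.pdf(y, m, np.sqrt(v + se**2)) for m, v in comps])
    w_post = weights * marg / np.sum(weights * marg)   # updated component weights
    # Conjugate posterior (mean, var) within each component:
    post = [((m / v + y / se**2) / (1 / v + 1 / se**2), 1 / (1 / v + 1 / se**2))
            for m, v in comps]
    return w_post, post

# Agreement keeps most weight on the informative component; strong conflict
# shifts the posterior weight towards the vague component.
print(mixture_posterior(y=-0.15, se=0.15, d_ind=-0.10, v_ind=0.05, nu=0.5)[0])
print(mixture_posterior(y=-1.20, se=0.15, d_ind=-0.10, v_ind=0.05, nu=0.5)[0])
```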

$\underline {\text {Model 5 - Commensurate prior:}}$ Commensurate prior models, recently used to synthesise individual- and aggregate-level evidence,Reference Hong, Fu and Carlin 17 were here adapted to the case of multiple population groups. In this approach, the prior distributions for the basic parameters of the direct evidence are centred around the basic parameters of the indirect evidence and the variance of the prior distributions controls the extent of information-sharing.

(10) $$ \begin{align} d_{1k}^{Dir} \sim N (d_{1k}^\textit{Indir}, \eta_{1k}) \end{align} $$
(11) $$ \begin{align} \frac{1}{\eta_{1k}} \sim \begin{cases} N(20, 1), &\text{if } c_{1k}=0 \\ Gamma(0.1, 0.1) I (0.1, 5), &\text{if } c_{1k}=1\end{cases} \end{align} $$
(12) $$ \begin{align} \text{and} ~~ c_{1k} \sim Bernoulli(p_{1k}) \end{align} $$

where $\eta _{1k}$ are the comparison-specific variances. Comparison-specific variances imply a different extent of information-sharing (between direct and indirect evidence) across treatment comparisons which may not be plausible, so it is expected that a common variance across treatment comparisons (i.e., $\eta _{1k} = \eta $ ) would likely be preferred for both simplicity and identifiability purposes. The inverse of the variance parameter (i.e., the precision, $\frac {1}{\eta _{1k}}$ ) is assigned a ‘spike-and-slab’ hyperprior. Such a hyperprior defines two possibilities: one where precision assumes a high value—‘spike’ (defined as a normal distribution centered at 20 and with a low standard deviation of 1)—and hence strong information-sharing is forced, and another where precision assumes a very low value—‘slab’ (defined as a truncated Gamma distribution with shape and rate parameters of 0.1)—imposing minimal information-sharing.Reference Hong, Fu and Carlin 17 The occurrence of the scenarios is modelled using independent Bernoulli trials (i.e., $c_{1k} \sim Bernoulli(p_{1k})$ ), with the Bernoulli probability parameter controlling the extent of commensurability, and the strength of information-sharing between direct and indirect evidence. The probability parameter can be fixed to an arbitrary value (e.g., $p_{1k} = 0.5$ ), or assigned a vague hyper-prior, such as $p_{1k} \sim Beta (1,1)$ , in order to be estimated within the model. However, in both cases (i.e., $p_{1k}$ fixed or uncertain), adaptive information-sharing is facilitated, because the choice between the spike and the slab is regulated by $c_{1k}$ which is estimated in the model. In other words, $p_{1k}$ only specifies the value of the Bernoulli prior for $c_{1k}$ which controls the extent of information-sharing. Therefore, as also highlighted by Hong et al.,Reference Hong, Fu and Carlin 17 the approach where $p_{1k}$ is uncertain is likely to lead to excessive uncertainty. Here, we follow the approach used by Hong et al. (2018) and assume a fixed $p_{1k} = 0.5$ and a spike and slab with hyperparameters as shown in Equation (11).
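A small Monte Carlo sketch of the spike-and-slab hyperprior in Equations (11)–(12) is shown below; it only illustrates the bimodal prior standard deviation implied for $d_{1k}^{Dir}$ around $d_{1k}^\textit{Indir}$, not a fit of the full commensurate-prior model, and all seeds and sample sizes are arbitrary.

```python
# Monte Carlo sketch of the spike-and-slab hyperprior, Eqs. (11)-(12) only.
import numpy as np

rng = np.random.default_rng(1)

def draw_precision(p=0.5, size=10_000):
    c = rng.random(size) < p                 # c_1k = 1 with probability p, Eq. (12)
    spike = rng.normal(20.0, 1.0, size)      # c_1k = 0: high precision ('spike')
    # c_1k = 1: Gamma(0.1, rate 0.1) truncated to (0.1, 5), via rejection sampling.
    slab = rng.gamma(0.1, scale=1 / 0.1, size=size)
    bad = (slab <= 0.1) | (slab >= 5.0)
    while bad.any():
        slab[bad] = rng.gamma(0.1, scale=1 / 0.1, size=bad.sum())
        bad = (slab <= 0.1) | (slab >= 5.0)
    return np.where(c, slab, spike)          # precision 1/eta_1k, Eq. (11)

prec = draw_precision()
sd = 1.0 / np.sqrt(prec)                     # implied prior SD of d^Dir around d^Indir
print(np.quantile(sd, [0.1, 0.5, 0.9]).round(2))   # bimodal: ~0.22 vs. much wider
```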

$\underline {\text {Model 6 - Power-prior:}}$ The power-prior, introduced by Ibrahim et al.Reference Ibrahim and Chen 18 and recently used in the NMA context by Jenkins et al.,Reference Jenkins, Martina, Dequen-O’Byrne, Abrams and Bujkiewicz 19 down-weights indirect evidence by raising its likelihood to a power $\alpha $ . Under this approach the posterior distribution of the basic parameters of the direct evidence becomes:

(13) $$ \begin{align} \pi(d_{1k}^{Dir} | D^{Dir}, D^\textit{Indir}, \alpha ) \propto \prod_{i=1}^{N_{Dir}} L(d_{1k}^{Dir} | D^{Dir} ) \cdot \underbrace{\prod_{i=1 + N_{Dir}}^{N_{Dir} + N_\textit{Indir}} L(d_{1k}^\textit{Indir} | D^\textit{Indir} )^\alpha \cdot \pi_{0}(d_{1k}^\textit{Indir})}_{\text{Power-prior}} \end{align} $$

where $D^{Dir}$ and $D^\textit{Indir}$ denote the direct and indirect data provided by the corresponding studies. $L(d_{1k}^{Dir} | D^{Dir})$ and $L(d_{1k}^\textit{Indir} | D^\textit{Indir} )^\alpha $ indicate the likelihood of the direct and the indirect evidence, and $\pi _{0}(d_{1k}^\textit{Indir})$ is a vague prior for the basic parameters of the indirect evidence. The $\prod _{i=k}^{\lambda } f(i)$ denotes the product of the succeeding expression from study $i = k$ to study $i = \lambda $ . The parameter $\alpha $ regulates the influence of the indirect evidence: when $\alpha =1$ the power-prior is equivalent to lumping, while when $\alpha =0$ the approach is equivalent to splitting. The value of $\alpha $ can be arbitrarily definedReference Spiegelhalter, Abrams and Myles 20 or elicited from expert opinion, but it cannot be estimated within the model without model modifications. Here, we used values of $\alpha $ from $0$ to $1$ in $0.1$ increments.
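Under a normal approximation to each likelihood, raising the indirect likelihood to the power $\alpha$ simply multiplies its precision by $\alpha$; the sketch below illustrates this with hypothetical log-OR summaries (the article fits the full arm-level model in OpenBUGS).

```python
# Normal-approximation sketch of the power-prior (Eq. 13). Illustrative values.
import numpy as np

d_dir, se_dir = -0.45, 0.15      # direct (adult) log-OR likelihood summary
d_ind, se_ind = -0.10, 0.12      # indirect (paediatric) summary

for alpha in np.arange(0.0, 1.01, 0.1):
    prec = 1.0 / se_dir**2 + alpha / se_ind**2            # tempered indirect precision
    mean = (d_dir / se_dir**2 + alpha * d_ind / se_ind**2) / prec
    print(f"alpha={alpha:.1f}  post mean={mean:.3f}  post SD={prec**-0.5:.3f}")
# alpha=0 reproduces splitting, alpha=1 reproduces lumping (under this approximation).
```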

3.1.1.1 Information-sharing metrics

The level of sharing imposed by each ISM is determined by comparing their posterior estimates with those obtained using the direct evidence only. In what follows, the definition of each metric is described in absolute terms. The three metrics used were:

$\underline {\text {Point Estimate Divergence:}}$ The point estimate divergence (PED) evaluates the absolute difference in the adult relative effectiveness posterior mean between splitting ( $d_{ad}^{ISM_{0}}$ ) and each of the $\nu $ alternative ISMs ( $d_{ad}^{ISM_{\nu }}$ ):

(14) $$ \begin{align} PED_{\nu} = |d_{ad}^{ISM_{\nu}} - d_{ad}^{ISM_{0}}| \end{align} $$

where a larger $PED_{\nu }$ implies a larger difference in the adult point estimate between splitting and ISM $\nu $ .

$\underline {\text {Precision Increase:}}$ To capture changes in the standard error of the relative effect estimate obtained by splitting ( $sd_{d_{ad}}^{ISM_{0}}$ ) and each of the $\nu $ ISMs ( $sd_{d_{ad}}^{ISM_{\nu }}$ ), we used the precision increase (PrI) introduced by Jackson et al.Reference Jackson, White, Price, Copas and Riley 21 (their measure is termed borrowing of strength (BoS); because multiple measures are used here, it is renamed to better reflect the underlying quantity), defined as:

(15) $$ \begin{align} PrI_{\nu} = 1 - \frac{sd_{d_{ad}}^{ISM_{\nu}}}{sd_{d_{ad}}^{ISM_{0}}} \in (-\infty, 1] \end{align} $$

Negative $PrI$ values imply that information-sharing led to increased uncertainty compared to splitting, whilst higher positive $PrI$ values indicate increasing precision gains.

$\underline {\text {Kullback--Leibler divergence:}}$ Finally, we used Kullback–Leibler (KL) divergenceReference Kullback and Leibler 22 which simultaneously considers changes in the posterior mean and standard deviation. This metric is interpreted as the information conveyed by the probability distribution of $d_{ad}^{ISM_{\nu }}$ when used to describe another probability distribution $d_{ad}^{ISM_{0}}$ , and is defined as:

(16) $$ \begin{align} D_{KL}(p,q) = \int_{}{} [ p(x) \times log(p(x)) - p(x) \times log(q(x)) ] ~d(x) \end{align} $$

where p is the target distribution (here $d_{ad}^{ISM_{0}}$ ), and q the distribution we use to describe it (here $d_{ad}^{ISM_{\nu }}$ ).

Higher KL values indicate that one distribution conveys less information about the other, and therefore reflect stronger information-sharing. However, KL divergence is not symmetrical and does not trade off differences in point estimates and uncertainty equally (see illustration in Supplementary Material). Hence, the KL divergence can only be used comparatively across ISMs.
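A minimal sketch of how the three metrics could be computed from posterior means and standard deviations is given below, with the KL divergence evaluated by adaptive quadrature (as described in Section 3.1.1.2); the inputs are illustrative, not case-study posteriors.

```python
# Hedged sketch of PED, PrI and KL for normal posterior summaries.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

def metrics(mean0, sd0, mean_v, sd_v):
    """(mean0, sd0): splitting posterior (ISM_0); (mean_v, sd_v): ISM_v."""
    ped = abs(mean_v - mean0)                        # Eq. (14)
    pri = 1.0 - sd_v / sd0                           # Eq. (15)
    p = lambda x: norm.pdf(x, mean0, sd0)            # target: splitting posterior
    q = lambda x: norm.pdf(x, mean_v, sd_v)          # describing distribution: ISM_v
    integrand = lambda x: p(x) * (np.log(p(x)) - np.log(q(x)))
    kl, _ = quad(integrand, mean0 - 10 * sd0, mean0 + 10 * sd0)   # Eq. (16)
    return ped, pri, kl

print(metrics(mean0=-0.45, sd0=0.15, mean_v=-0.25, sd_v=0.10))
```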

3.1.1.2 Implementation

Synthesis models were implemented in OpenBUGS. 23 Estimates were obtained from 50,000 iterations (with an additional 20,000 run as burn-in) across three MCMC chains with different starting values. Convergence was checked using the Gelman–Rubin diagnostic—specifically with the multivariate potential scale reduction factorReference Gelman and Rubin 24 —and visually by assessing the history, chains and autocorrelation. KL divergence was calculated using adaptive quadrature.Reference Nikolaidis 6 Vague prior distributions were used unless otherwise specified.

Selection of base-models: Although the original work used five alternative synthesis models in the economic analysis, to facilitate exposition we selected two exemplar models: a fixed effects and a random effects model. The selection of these models was informed by an extended version of the model selection strategy proposed by Welton et al.,Reference Welton, Soares and Palmer 11 and evaluated the original set of covariates explored to maintain relevance to policy—full details in Supplementary Material. The selected fixed effects model adjusts for the effect of treatment duration in the adult population only—Base-Model 1. The estimate deemed relevant for information-sharing was the relative effect of IVIG vs. ALB for treatment duration = 3 days because this duration reflects best practice in adults. To obtain this estimate, the meta-regression model was centered on that covariate value so that the treatment effect coefficient of interest (in this case of IVIG vs. ALB —network T3b) reflects the single (adjusted) estimate of interest for sharing (see Supplementary Material). The random effects model chosen adjusts for Jadad score in both adults and children with a common effect modification coefficient—Base-Model 2. The estimate deemed relevant for sharing was the relative effect of IVIG vs. ALB or Placebo (network T2) for Jadad = 5, which reflects the best possible study quality. All ISMs were separately applied to each of the selected estimates from base-models 1 and 2, and the obtained predictions were carried forward to the cost-effectiveness model. Whilst this manuscript explores how a relative effectiveness estimate can be directly informed by extended evidence from a different population, it is important to acknowledge that such an extended evidence-base can also inform heterogeneity explorations and model selection (further examined in the discussion).

3.1.2 Results

Figure 2 shows a forest plot of the log odds ratio posterior estimates derived from the application of each of the ISMs to both base-models. For both base-models, increasing information-sharing is associated with a shift of the point estimate towards the null effect which is expected since the pooled relative effect in paediatric patients is lower than in adults. The extent of the shift in the point estimate between splitting and lumping is similar across base-models. All FE ISMs yield point estimates that fall within the range defined by the lumping and splitting models (the shaded area in the plot). The standard informative prior model retrieves results equivalent to the lumping model. The multi-level and commensurate prior models yield estimates very close to those of the splitting model. As the value of $\alpha $ increases from $0$ to $1$ , the power-prior model posterior estimates transition monotonically (though non-linearly) from those of splitting to those of the lumping model. The mixture prior model estimates fall closer to those of lumping.

Figure 2 Posterior mean (log odds ratio) estimates for Base-Model 1 and Base-Model 2 across ISMs. Shaded area is defined by the point estimates of the lumping and splitting models. FE, fixed-effects; RE, random-effects; MR, meta-regression; ISM, information-sharing method.

For Base-Model 2, the point estimates from the power-prior models fall outside the range defined by the lumping and splitting models for a wide range of $\alpha $ values, showing more extreme values than lumping (towards no effect). This is due to heterogeneity in the indirect evidence, where the large multicenter trialReference Brocklehurst, Farrell and King 14 suggests no effect, contradicting the remaining small studies which suggest a relatively strong effect. Further explorations showed that for $\alpha $ values lower than 0.2, only the likelihood of the large Brocklehurst study gathers weight, with the small studies contributing to the overall effect only at $\alpha $ values above 0.2 (see Supplementary Material). All other models lie within the spectrum. For the RE analyses, the standard informative prior and the mixture prior models use the predictive distribution to define the prior, resulting in considerably smaller changes in the posterior mean compared to when these approaches are applied to Base-Model 1. The multi-level model results are similar to the informative and mixture prior models. Unlike Base-Model 1, under RE, the commensurate prior deviates considerably from splitting, and aligns more closely with lumping.

Figure 3 shows the information-sharing metrics of all ISMs for Base-Model 1 and Base-Model 2 in panels A and panel B, respectively. The y-axis has been standardised to ensure that the metrics range between 0 (no information-sharing) and 1 (full information-sharing). The x-axis represents the $\alpha $ value used in the power-prior model, and hence the lines represent the strength of sharing imposed by the various power-prior models for each information-sharing metric. The remaining ISMs are shown on the right side of the plot, positioned to indicate their imposed strength of sharing for each metric. The graph demonstrates that different ISMs impose varying degrees of information-sharing within each metric, and the strength of sharing imposed by a given ISM can vary across metrics.

Figure 3 Standardised information-sharing metrics (PED, PrI, KL) of all ISMs for the FE (A) and RE (B) base-models. PED, point estimate divergence; PrI, precision increase; KL, Kullback-Leibler divergence.

3.2 Cost-effectiveness

3.2.1 Background and methods

We utilised the decision model developed in the original study, using quality-adjusted life years (QALYs) as the primary outcome and 2009 UK costs from the perspective of the National Health Service (NHS). An annual discount rate of 3.5% was applied to both costs and outcomes. Incremental cost-effectiveness ratios (ICERs) were calculated as the ratio of incremental costs to QALYs and compared to common thresholds used for determining the cost-effectiveness of NHS resources. 25

The model reflects the lifetime prognosis of sepsis to capture the costs and health consequences of sepsis in the absence of IVIG/IVIGAM. There were two distinct model components: first, a short-term decision-tree evaluated the consequences of the initial hospitalisation period following the first sepsis episode. The relative treatment effects obtained from the evidence synthesis were applied to the initial decision tree. Subsequently, conditional on survival in the short term, patients entered a long-term Markov model with two disease states (alive & dead) and annual cycles that captured the long-term consequences for sepsis survivors following the initial hospitalisation.
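For orientation, the long-term component can be sketched as a two-state cohort Markov model with annual cycles and 3.5% discounting, as below; the transition probability, utility, and horizon are hypothetical placeholders, not the model's actual inputs.

```python
# Schematic sketch of the long-term two-state (alive/dead) Markov component.
def discounted_qalys(p_die_annual, utility, horizon_years, discount=0.035):
    alive = 1.0                                           # proportion alive at entry
    total = 0.0
    for t in range(horizon_years):
        total += alive * utility / (1 + discount) ** t    # QALYs accrued this cycle
        alive *= (1 - p_die_annual)                       # transition to 'dead'
    return total

# Lifetime QALYs for a sepsis survivor under hypothetical inputs:
print(round(discounted_qalys(p_die_annual=0.08, utility=0.75, horizon_years=40), 2))
```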

To understand the potential implications of information-sharing for decision-making, we applied the relative effects estimated by the various ISMs (Section 3.1) to the original cost-effectiveness model. Uncertainty was propagated to the model using probabilistic sensitivity analysis.Reference Briggs, Claxton and Sculpher 26 We used a cost-effectiveness threshold of £30,000 per QALY gained to determine the probability of a treatment being cost-effective and to evaluate the value of further research.
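The sketch below shows how PSA output feeds the decision metrics used here (the ICER and the probability of cost-effectiveness at £30,000 per QALY); the incremental cost and QALY draws are simulated for illustration, not taken from the model.

```python
# Sketch: decision metrics from probabilistic sensitivity analysis draws.
import numpy as np

rng = np.random.default_rng(7)
n_sim = 10_000
inc_cost = rng.normal(4_000, 1_500, n_sim)     # hypothetical incremental cost (£)
inc_qaly = rng.normal(0.15, 0.10, n_sim)       # hypothetical incremental QALYs

icer = inc_cost.mean() / inc_qaly.mean()       # ratio of mean increments
nb = 30_000 * inc_qaly - inc_cost              # incremental net monetary benefit
p_ce = np.mean(nb > 0)                         # probability cost-effective at £30k
print(f"ICER = £{icer:,.0f}/QALY, P(cost-effective) = {p_ce:.2f}")
```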

The value of further research was quantified using value of information methods.Reference Drummond, Sculpher, Claxton, Stoddart and Torrance 27 , Reference Ades, Lu and Claxton 28 Four estimates of the value of further research were calculated. The expected value of perfect information (EVPI) estimates the value of resolving all parameter uncertainty relating to the decision problem and represents an upper bound on the value of further research. The expected value of perfect parameter information (EVPPI) estimates the value of resolving uncertainty for individual parameters or groups of parameters. The expected value of sample information (EVSI) quantifies the value of particular research designs using a defined sample size, which are therefore likely to reduce, but not eliminate, uncertainty over particular parameters. Finally, the expected net benefit of sample information (ENBS) was estimated—this subtracts the cost of sampling from the EVSI. These estimates were scaled to the total population expected to benefit from the research over the intervention’s expected lifetime and presented in monetary terms to facilitate comparison with research costs. Further details are available in Soares et al. (Chapter 6, Appendix 5).Reference Soares, Welton, Harrison, Peura and Shankar 8
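A minimal sketch of the per-person and population EVPI calculation from PSA net-benefit draws follows; the distributions, population size, and time horizon are hypothetical and do not reproduce the published value-of-information results.

```python
# Sketch: EVPI = E[max over options of NB] - max over options of E[NB].
import numpy as np

rng = np.random.default_rng(7)
n_sim = 10_000
nb_soc = rng.normal(150_000, 20_000, n_sim)                          # net benefit, SoC alone
nb_ivig = nb_soc + 30_000 * rng.normal(0.15, 0.10, n_sim) - 4_000    # SoC + IVIG

nb = np.column_stack([nb_soc, nb_ivig])
evpi_pp = np.mean(nb.max(axis=1)) - nb.mean(axis=0).max()            # per-person EVPI
population = 50_000 * 10        # hypothetical eligible patients per year x years
print(f"per-person EVPI = £{evpi_pp:,.0f}; population EVPI = £{evpi_pp * population:,.0f}")
```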

3.2.2 Results

The cost effectiveness and value of information results derived using alternative ISMs under both Base-Model 1 and Base-Model 2 are shown in Table 3. The relative KL value from the evidence synthesis models is included as an indication of the degree of information-sharing in relative treatment effect estimates. The results indicate significant variability in adoption and research prioritization decisions under Base-Model 1; under Base-Model 2, the impact is smaller but still noteworthy.

Table 3 Model outputs across all applicable ISMs under Base-Model 1 and Base-Model 2 in ascending ICER order. p.EVPI, p.EVPPI, and m.ENBS in millions of pounds sterling (£)

Note: All estimates, except for ICERs, are calculated using a cost-effectiveness threshold of £30,000 per QALY. Informative and mixture prior models under the random-effects base-model use the predictive distribution of the indirect evidence. Abbreviations: ISM, information-sharing method; ICER, incremental cost-effectiveness ratio (in £/QALY); p.CE, probability that IVIG is cost-effective; p.EVPI, population expected value of perfect information; p.EVPPI, population expected value of perfect parameter information for the relative treatment effect; m.ENBS, maximum expected net benefit of sample; O.Sample, the sample size that achieves the maximum ENBS.

Under Base-Model 1, ICERs vary considerably from £20,542 to £55,316 per QALY gained. IVIG could be considered cost-effective if an ISM imposing low information-sharing were used but would not have been considered cost-effective if an ISM imposing strong information-sharing was deemed more appropriate. Under Base-Model 2, ICERs varied less (from £16,430 to £25,530 per QALY gained) as relative effect estimates are larger under Base-Model 2, leading to higher predicted QALY gains. Reflecting the variation in relative effectiveness estimates (Section 3.1.2), ICERs from the power-prior models follow a non-monotonic relationship with the value of $\alpha $.

The choice of ISM is a key determinant of the value of further research. Under Base-Model 1, ISMs imposing low information-sharing suggest IVIG is cost-effective (ICER < £30,000) with low decision uncertainty (pCE = 0.8). ISMs imposing high information-sharing suggest IVIG is not cost-effective (ICER > £30,000) with similarly low decision uncertainty (pCE = 0.1). Consequently, the resources required for a new RCT are not justified by the benefits of further uncertainty resolution. ISMs with intermediate levels of information-sharing show higher decision uncertainty and value of further research, peaking with the power-prior model at $\alpha $ values between 0.2 and 0.3.

Under Base-Model 2, increasing information-sharing leads to higher decision uncertainty and greater population EVPI and EVPPI. All ISMs indicate substantial value in prioritizing a new trial in adults, with EVPPI estimates exceeding £1 billion, substantially greater than the cost of a trial (indicatively, a two-arm trial enrolling 1000 patients per arm is assumed to cost around £15 million). The optimal sample sizes are more homogeneous, ranging from 1,040 to 1,520 patients per arm.

4 Discussion

This article is the first to compare information-sharing methods and examine their implications for inference and policy. We illustrate these in the context of a case-study where evidence from a related population (paediatrics) is used to strengthen inferences in the target population (adults) for a decision problem. In our case study, under the fixed effect base-model, increasing information-sharing between populations reduced cost-effectiveness and uncertainty, to the extent that IVIG was no longer deemed cost-effective, and further research was not considered worthwhile. Conversely, under the random-effects base-model, the cost-effectiveness of IVIG in adults remained uncertain regardless of the degree of information-sharing, and a large RCT was valuable.

Our findings highlight the wide-ranging impact of different ISMs, particularly under a fixed effect model, and underscore the role of heterogeneity in determining the level of information-sharing. In our case study, fixed effects models with informative priors imposed stronger information-sharing than mixture priors and commensurate priors, while multi-level models imposed only weak information-sharing, especially with only two evidence sets. The power-prior models did not show a linear or monotonic association with any strength-of-sharing metric. Importantly, applying power prior models within random effects models may result in stronger information-sharing than lumping, complicating the interpretation of the power prior $\alpha $ parameter and potentially limiting the ability to elicit such values through structured expert elicitation.

We showed that relative efficacy and cost-effectiveness estimates can be sensitive to the choice of ISMs used in the analyses. Such a choice should not rely solely on statistical measures like goodness of fit. This is because lack of fit is likely to arise from a conflict between direct and indirect evidence. But, crucially, tolerating some conflict may be desirable. For instance, if the indirect evidence is of higher quality than the direct evidence, it may be beneficial to maintain strong information-sharing despite a poorer statistical fit in order to correct for bias. In such scenarios, the indirect evidence provides valuable insights that the direct evidence lacks. Therefore, instead of favouring ISMs that produce better statistical fit, policy and practice should carefully reflect on the possible reasons behind any conflicts between direct and indirect evidence, supported by clinical judgement to determine whether increased information-sharing could lead to better-informed decisions.

The extent of heterogeneity is a crucial determinant of the level of information-sharing. Our study primarily considered the extended evidence-base for information sharing over treatment effect estimates. We re-examined heterogeneity to identify the best fitting fixed- and random-effects base-models in the preparatory stage preceding ISM application. Future research should further explore the value of extended evidence in understanding and explaining heterogeneity, potentially examining methodologies for heterogeneity exploration. In our work, we extended the original approach, grounded on model fitting and selection,Reference Welton, Soares and Palmer 11 but further research could consider alternative approaches used in practice. We focused on implications of information-sharing for a target estimate of interest obtained from the fixed or random effects base model, without exploration of heterogeneity. Future research could investigate how to integrate heterogeneity exploration and information-sharing in a single stage, assessing the added value of such a joint process and its implications for the chosen analytical approaches and structural assumptions.

In our case study, direct and indirect evidence differed by a study-level population variable (adults/children), with each RCT enrolling patients from only one population. Therefore, methods that are typically used in conventional meta-analysis to explain heterogeneity, such as meta-regression, would not share any information because the effect modification coefficient would ‘absorb’ the effect of the second population. Other conventional methods such as subgroup analyses are also not applicable because, by definition, they do not facilitate any information sharing and separately analyse each population. Hence, the subgroup analysis results are effectively equivalent to splitting.

Other significant areas of uncertainty remain. Further applications and simulation studies could help identify the characteristics of the evidence base that influence the extent of information-sharing imposed by each ISM. Developing more sophisticated metrics to quantify information-sharing and structured expert elicitation methods to transparently inform model choice and specification are also required. Also, it is crucial to develop approaches to determine when information-sharing is relevant, feasible, and worthwhile before extending the evidence base, and across which types of parameters sharing is most appropriate. Finally, to facilitate methodological explorations, we shared information on a single parameter, even though our decision problem involved multiple treatments. Future research should investigate information-sharing across multiple parameters.

Overall, this article shows the range of ISMs that can be applied for sharing information between direct and indirect evidence. ISMs can significantly strengthen inference, and it is therefore important to consider these methods in support of policy-making.

Acknowledgments

We acknowledge feedback received on related material during the first author’s Thesis Advisory Panel meetings from Dr. Mona Kanaan. We are also grateful to Ifigeneia Barouma for assistance with the production of graphics.

Author contributions

Conceptualization: G.F.N., B.W., S.P., S.B., M.O.S.; Formal analysis: G.F.N.; Investigation: G.F.N., B.W., S.P., S.B., M.O.S.; Methodology: G.F.N.; Supervision: M.O.S.; Writing—original draft preparation: G.F.N., M.O.S.; Writing—review and editing: G.F.N., B.W., S.P., S.B., M.O.S.

Competing interest statement

The authors declare that no competing interests exist.

Data availability statement

All data and statistical models/programs are provided on the first author’s github https://github.com/NikolaidisGFZ/RSM_Manuscript_Methods_for_information_sharing_in_NMA.git. The data are also provided in the Supplementary Material.

Funding statement

This work was funded by a doctoral studentship awarded to GFN by the Centre for Health Economics. The Centre for Health Economics did not have a role in the design of the study, the collection, analysis, and interpretation of data, or in writing the manuscript.

Ethics approval statement

No ethics approval was required as only publicly available data were used.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/rsm.2024.17.

Footnotes

This article was awarded Open Data and Open Materials badges for transparent practices. See the Data availability statement for details.

References

Centre for Reviews and Dissemination. CRD's Guidance for Undertaking Reviews in Health Care. The University of York; 2008.
Ades, AE, Sutton, AJ. Multiparameter evidence synthesis in epidemiology and medical decision-making: current approaches. J. Royal Stat. Soc. Ser. A (Stat. Soc.) 2006;169:5–35.
Sweeting, MJ, Sutton, AJ, Lambert, PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Stat. Med. 2004;23:1351–1375.
European Medicines Agency. Reflection Paper on the Use of Extrapolation in the Development of Medicines for Paediatrics—EMA/189724/2018. European Medicines Agency; 2018.
Food and Drug Administration. Leveraging Existing Clinical Data for Extrapolation to Pediatric Uses of Medical Devices. Food and Drug Administration; 2016.
Nikolaidis, GF. Borrowing Strength from ‘Indirect’ Evidence: Methods and Policy Implications for Health Technology Assessment. Ph.D. thesis. University of York; 2020.
Nikolaidis, GF, Beth, W, John, PS, Soares, MO. Classifying information-sharing methods. BMC Med. Res. Methodol. 2021;21:113.
Soares, MO, Welton, NJ, Harrison, D, Peura, P, Shankar, HM. An evaluation of the feasibility, cost and value of information of a multicentre randomised controlled trial of intravenous immunoglobulin for sepsis (severe sepsis and septic shock): incorporating a systematic review, meta-analysis and value of information analysis. Health Technol. Assess. 2012;16(7).
Hall, MJ, Williams, SN, DeFrances, CJ, Golosinskiy, A. Inpatient care for septicemia or sepsis: A challenge for patients and hospitals. NCHS Data Brief. 2011;(62):1–8.
National Institute for Health and Care Excellence. Sepsis: Recognition, Assessment and Early Management (NICE Guideline 51). National Institute for Health and Care Excellence; 2016.
Welton, NJ, Soares, MO, Palmer, SJ, et al. Accounting for heterogeneity in relative treatment effects for use in cost-effectiveness models and value-of-information analyses. Med. Decis. Mak. 2015;35(5):608–621.
Soares, MO, Dumville, JC, Ades, AE, Welton, NJ. Treatment comparisons for decision making: facing the problems of sparse and few data. J. Royal Stat. Soc. Ser. A (Stat. Soc.) 2014;177:259–279.
Capasso, L, Borrelli, AC, Ferrara, T, Albachiara, R, Coppola, C, Raimondi, F. Adjuvant therapy in septic neonates with immunoglobulin preparations containing Ig isotypes in addition to IgG: A critical review of current literature. Curr. Pediatr. Res. 2017;21(4):535–540.
Brocklehurst, P, Farrell, B, King, A, et al. Treatment of neonatal sepsis with intravenous immune globulin. N. Engl. J. Med. 2011;365:1201–1211.
Lu, G, Ades, AE. Combination of direct and indirect evidence in mixed treatment comparisons. Stat. Med. 2004;23:3105–3124.
Roever, C, Wandel, S, Friede, T. Model averaging for robust extrapolation in evidence synthesis. Stat. Med. 2019;38:674–694.
Hong, H, Fu, H, Carlin, BP. Power and commensurate priors for synthesizing aggregate and individual patient level data in network meta-analysis. J. Royal Stat. Soc. Ser. C (Appl. Stat.) 2018;67(4):1047–1069.
Ibrahim, JG, Chen, M. Power prior distributions for regression models. Stat. Sci. 2000;15(1):46–60.
Jenkins, D, Martina, R, Dequen-O’Byrne, P, Abrams, K, Bujkiewicz, S. Methods for the inclusion of real-world evidence in network meta-analysis. BMC Med. Res. Methodol. 2021;21:207.
Spiegelhalter, DJ, Abrams, KR, Myles, JP. Bayesian Approaches to Clinical Trials and Health-Care Evaluation. Wiley; 2004.
Jackson, D, White, IR, Price, M, Copas, J, Riley, RD. Borrowing of strength and study weights in multivariate and network meta-analysis. Stat. Methods Med. Res. 2017;26:2853–2868.
Kullback, S, Leibler, RA. On information and sufficiency. Ann. Math. Stat. 1951;22:79–86.
MRC Biostatistics Unit, University of Cambridge. The BUGS Project; 2010.
Gelman, A, Rubin, DB. Inference from iterative simulation using multiple sequences. Stat. Sci. 1992;7:457–472.
National Institute for Health and Care Excellence. Guide to the Methods of Technology Appraisal 2013. NICE Process and Methods Guides. National Institute for Health and Care Excellence; 2013.
Briggs, A, Claxton, K, Sculpher, MJ. Decision Modelling for Health Economic Evaluation. Oxford University Press; 2006.
Drummond, MF, Sculpher, MJ, Claxton, K, Stoddart, GL, Torrance, GW. Methods for the Economic Evaluation of Health Care Programmes. Oxford University Press; 2015.
Ades, AE, Lu, G, Claxton, K. Expected value of sample information calculations in medical decision modeling. Med. Decis. Mak. 2004;24:207–227.