
# Joint distribution approaches to simultaneously quantifying benefit and risk

- Michele L Shaffer^{1}
- Kristi L Watterberg^{2}

**6**:48

https://doi.org/10.1186/1471-2288-6-48

© Shaffer and Watterberg; licensee BioMed Central Ltd. 2006

**Received:** 15 August 2006

**Accepted:** 12 October 2006

**Published:** 12 October 2006

## Abstract

### Background

The benefit-risk ratio has been proposed to measure the tradeoff between benefits and risks of two therapies for a single binary measure of efficacy and a single adverse event. The ratio is calculated from the difference in risk and difference in benefit between therapies. Small sample sizes or expected differences in benefit or risk can lead to no solution or problematic solutions for confidence intervals.

### Methods

Alternatively, using the joint distribution of benefit and risk, confidence regions for the differences in risk and benefit can be constructed in the benefit-risk plane. The information in the joint distribution can be summarized by choosing regions of interest in this plane. Using Bayesian methodology provides a very flexible framework for summarizing information in the joint distribution.

### Results

Data from a National Institute of Child Health & Human Development trial of hydrocortisone illustrate the construction of confidence regions and regions of interest in the benefit-risk plane, where benefit is survival without supplemental oxygen at 36 weeks postmenstrual age, and risk is gastrointestinal perforation. For the subgroup of infants exposed to chorioamnionitis the confidence interval based on the benefit-risk ratio is wide (Benefit-risk ratio: 1.52; 90% confidence interval: 0.23 to 5.25). Choosing regions of appreciable risk and acceptable risk in the benefit-risk plane confirms the uncertainty seen in the wide confidence interval for the benefit-risk ratio – there is a greater than 50% chance of falling into the region of acceptable risk – while visually allowing the uncertainty in risk and benefit to be shown separately. Applying Bayesian methodology, an incremental net health benefit analysis shows there is a 72% chance of having a positive incremental net benefit if hydrocortisone is used in place of placebo if one is willing to incur at most one gastrointestinal perforation for each additional infant that survives without supplemental oxygen.

### Conclusion

If the benefit-risk ratio is presented, the joint distribution of benefit and risk also should be shown. Confidence regions and regions of interest in the benefit-risk plane avoid the ambiguity associated with collapsing benefit and risk to a single dimension. Bayesian methods allow even greater flexibility in simultaneously quantifying benefit and risk.

## Background

When comparing the effects of a new therapy with an existing therapy, it is not uncommon for the new therapy to show increased risks along with increased benefits. We consider the case of a single binary measure of efficacy and a single binary measure of risk or adverse event (absent/present, ever/never) and address the questions:

1. How do you appropriately measure the tradeoff between the benefit and risk of two therapies?

2. When should you conclude the increased benefit of a new therapy outweighs the potential increased risk?

Rather than focusing on hypothesis testing and controlling the type I error rate, our interest is in jointly quantifying benefit and risk.

### The benefit-risk ratio

One method that has been suggested for measuring the tradeoff between a binary measure of benefit and a binary measure of risk is the benefit-risk ratio [1]. The benefit-risk ratio is the ratio of the difference in benefit to difference in risk, or equivalently, the ratio of Number Needed to Harm (NNH) to Number Needed to Treat (NNT):

$R=\frac{{p}_{E}-{p}_{C}}{{q}_{E}-{q}_{C}}=\frac{NNH}{NNT}$

where $p_E$ and $p_C$ are the probabilities of benefit in the experimental treatment and control arms, respectively, and $q_E$ and $q_C$ are the probabilities of risk in the experimental treatment and control arms, respectively.

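The benefit-risk ratio is analogous to the incremental cost-effectiveness ratio (ICER) used in cost-effectiveness analyses:

$ICER=\frac{{\gamma}_{E}-{\gamma}_{C}}{{\epsilon}_{E}-{\epsilon}_{C}}$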

where $\gamma_E$ and $\gamma_C$ are average costs of the experimental and control conditions, respectively, and $\epsilon_E$ and $\epsilon_C$ are average effectiveness measures of the experimental and control conditions, respectively. One can similarly view the ICER in the cost-effectiveness plane. Distributional assumptions may differ for the benefit-risk ratio and cost-effectiveness ratio, with cost generally considered a continuous measure. And while effectiveness appears in the denominator of the ICER, benefit is in the numerator of the benefit-risk ratio. Furthermore, although the current discussion focuses on a single binary measure of risk, consolidating multiple risks into a single measure may be more problematic than combining costs.

There is some ambiguity in reducing the difference in benefit and difference in risk to a single measure. Because differences of opposite sign can produce the same ratio, a control therapy that shows more benefit and more risk than the new therapy yields the same ratio as a new therapy that shows more benefit and more risk than control. Note in Figure 1 that any observed difference in benefit and observed difference in risk that falls on the line shown through the origin will produce the same benefit-risk ratio. For example, suppose the difference in benefit favors the new therapy over control and is 0.30, but the new therapy also increases the adverse event rate by 0.20; the resulting benefit-risk ratio is 1.5. However, if the difference in benefit favors control over the new therapy and is -0.30, but the new therapy reduces the adverse event rate by 0.20, then the resulting benefit-risk ratio also is 1.5. When deciding whether the new therapy is acceptable, it is unlikely that these two scenarios would be considered equivalent. In the first scenario we are weighing increased benefit against increased risk, while in the latter we are weighing decreased benefit against decreased risk. Heitjan et al. highlighted similar complications for estimation of the ICER [2].
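The two scenarios in the preceding paragraph can be checked with a few lines of Python. This is only an illustrative sketch; the function name `benefit_risk_ratio` is hypothetical and not part of any package used in the article.

```python
def benefit_risk_ratio(benefit_diff, risk_diff):
    """Ratio of the difference in benefit to the difference in risk."""
    if risk_diff == 0:
        raise ValueError("risk difference is zero; the ratio is undefined")
    return benefit_diff / risk_diff

# Scenario 1: the new therapy has more benefit and more risk than control.
r1 = benefit_risk_ratio(0.30, 0.20)
# Scenario 2: the new therapy has less benefit and less risk than control.
r2 = benefit_risk_ratio(-0.30, -0.20)

# Both scenarios collapse to the same single number.
print(round(r1, 2), round(r2, 2))
```

The ratio alone cannot tell the two quadrants apart, which is why the sign of each difference has to be reported alongside it.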

Confidence intervals can be constructed for the benefit-risk ratio using methods similar to those used to compute confidence intervals for cost-effectiveness ratios [3–5]. Assuming bivariate normality, Willan et al. showed that Fieller's theorem can be used to compute confidence intervals where the variance of the bivariate normal distribution is given by

$V([{\widehat{q}}_{E}-{\widehat{q}}_{C},{\widehat{p}}_{E}-{\widehat{p}}_{C}{]}^{\prime})=\left[\begin{array}{cc}\frac{{q}_{E}(1-{q}_{E})}{{n}_{E}}+\frac{{q}_{C}(1-{q}_{C})}{{n}_{C}} & \frac{{b}_{E}-{p}_{E}{q}_{E}}{{n}_{E}}+\frac{{b}_{C}-{p}_{C}{q}_{C}}{{n}_{C}}\\ \frac{{b}_{E}-{p}_{E}{q}_{E}}{{n}_{E}}+\frac{{b}_{C}-{p}_{C}{q}_{C}}{{n}_{C}} & \frac{{p}_{E}(1-{p}_{E})}{{n}_{E}}+\frac{{p}_{C}(1-{p}_{C})}{{n}_{C}}\end{array}\right]$

where "hats" indicate the observed values of population parameters and $b_E$ and $b_C$ are the probabilities of simultaneous benefit and risk in the same subject for the experimental treatment and control arms, respectively [1]. The variance is estimated ($\widehat{V}$) by replacing the population parameters with the observed values. Calculation of the confidence limits by Fieller's theorem involves matrix manipulation which can be done in several packages including PROC IML in SAS (SAS Institute, Inc., Cary, NC), Mathematica (Wolfram Research, Inc., Champaign, IL), S-PLUS (Insightful Corporation, Seattle, WA), or the free software R [6]. Alternatively, the bootstrap can be used to construct confidence intervals using the percentile method [7].
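As a rough illustration of the Fieller calculation, the sketch below writes the confidence limits as the roots of a quadratic in the ratio, using the variance matrix given above. It is a minimal Python sketch rather than the authors' SAS, Mathematica, or S-PLUS code; the function names are hypothetical, and the counts are taken from the results table later in this article, assuming its third row counts infants with both outcomes.

```python
from math import sqrt
from statistics import NormalDist

def diff_and_variance(n_e, ben_e, risk_e, both_e, n_c, ben_c, risk_c, both_c):
    """Return (risk_diff, benefit_diff, v_rr, v_rb, v_bb) from arm-level counts."""
    p_e, p_c = ben_e / n_e, ben_c / n_c      # probabilities of benefit
    q_e, q_c = risk_e / n_e, risk_c / n_c    # probabilities of risk
    b_e, b_c = both_e / n_e, both_c / n_c    # probabilities of benefit and risk together
    v_rr = q_e * (1 - q_e) / n_e + q_c * (1 - q_c) / n_c
    v_bb = p_e * (1 - p_e) / n_e + p_c * (1 - p_c) / n_c
    v_rb = (b_e - p_e * q_e) / n_e + (b_c - p_c * q_c) / n_c
    return q_e - q_c, p_e - p_c, v_rr, v_rb, v_bb

def fieller_interval(risk_diff, benefit_diff, v_rr, v_rb, v_bb, level=0.90):
    """Fieller limits for benefit_diff / risk_diff; None if no bounded interval exists."""
    z2 = NormalDist().inv_cdf(0.5 + level / 2) ** 2
    # Limits solve (benefit - R * risk)^2 = z^2 * Var(benefit - R * risk) for R.
    a = risk_diff ** 2 - z2 * v_rr
    b = risk_diff * benefit_diff - z2 * v_rb
    c = benefit_diff ** 2 - z2 * v_bb
    disc = b * b - a * c
    if a <= 0 or disc < 0:
        return None
    root = sqrt(disc)
    return (b - root) / a, (b + root) / a

# Counts from the article's results table (hydrocortisone vs placebo).
# Assumption: both_e and both_c are the counts with benefit and risk together.
risk_diff, benefit_diff, v_rr, v_rb, v_bb = diff_and_variance(
    n_e=73, ben_e=28, risk_e=8, both_e=3,
    n_c=76, ben_c=18, risk_c=1, both_c=0,
)
ratio = benefit_diff / risk_diff
limits = fieller_interval(risk_diff, benefit_diff, v_rr, v_rb, v_bb, level=0.90)
print(round(ratio, 2), limits)
```

With these inputs the sketch should come out close to the ratio and 90% interval quoted in the abstract. The `None` return covers the cases mentioned there in which Fieller's theorem gives no solution or an unbounded one.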

### Other simultaneous measures of benefit and risk

Other measures have been suggested to summarize differences in benefit and risk. An early example is the work by Tallarida et al. on a severity scale developed through physician interviews which synthesizes information on disease severity and adverse drug reactions so that these considerations can be quantitatively incorporated into a benefit-risk analysis [10]. Chuang-Stein et al. presented three ratio measures that require assigning weights to categories of the form: (1) benefit without adverse event, (2) benefit with adverse event, (3) no benefit and no adverse event, (4) no benefit with adverse event, and (5) unacceptable adverse event leading to withdrawal [11]. While these ratios are more general than the benefit-risk ratio, specifying weights that reflect the relative importance of the categories may be difficult. Later work by Chuang-Stein discounts benefit by risk using consolidated safety data [12, 13]. As noted by Holden, these approaches do not clearly delineate benefit and risk which makes their interpretation more complicated than the traditional benefit-risk ratio [14].

## Methods

### Confidence regions

Rather than collapsing the difference in benefit and difference in risk into a single dimension, the joint density of benefit and risk can be represented in the benefit-risk plane. Similar methods have been proposed for cost-effectiveness analyses [15, 16]. Confidence regions can be constructed either under the bivariate normal assumption or using the bootstrap and nonparametric density estimation. Assuming bivariate normality, the confidence region is an ellipse. To construct a nonparametric confidence region, we draw repeated (bootstrap) samples with replacement and compute a benefit difference and risk difference for each of the samples. Next we obtain a two-dimensional kernel density estimate using the set of bootstrap estimates and find a contour of the kernel density estimate that includes (1 - *α*) × 100% of the bootstrap estimates [17]. Two-dimensional kernel density estimation methods are available for S-PLUS or R.
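Under the bivariate normal assumption, the boundary of the confidence ellipse can be traced directly from the estimated differences and their variance matrix. The sketch below is one way to do this in plain Python; the function names and the numeric inputs are illustrative assumptions, not values or code from the article. It relies on the fact that the chi-square quantile with 2 degrees of freedom has the closed form $-2\ln\alpha$.

```python
from math import cos, log, pi, sin, sqrt

def confidence_ellipse(center, cov, alpha=0.10, n_points=100):
    """Boundary points of the (1 - alpha) ellipse for a bivariate normal estimate."""
    (v11, v12), (_, v22) = cov
    # Cholesky factor of the 2x2 covariance matrix.
    l11 = sqrt(v11)
    l21 = v12 / l11
    l22 = sqrt(v22 - l21 ** 2)
    # Chi-square quantile with 2 degrees of freedom: -2 ln(alpha).
    radius = sqrt(-2.0 * log(alpha))
    points = []
    for k in range(n_points):
        t = 2 * pi * k / n_points
        u, w = radius * cos(t), radius * sin(t)
        points.append((center[0] + l11 * u, center[1] + l21 * u + l22 * w))
    return points

def mahalanobis_sq(point, center, cov):
    """Squared Mahalanobis distance of a point from the center."""
    (v11, v12), (_, v22) = cov
    dx, dy = point[0] - center[0], point[1] - center[1]
    det = v11 * v22 - v12 ** 2
    return (v22 * dx * dx - 2 * v12 * dx * dy + v11 * dy * dy) / det

# Illustrative (risk difference, benefit difference) estimate and covariance.
center = (0.10, 0.15)
cov = ((0.0015, -0.0001), (-0.0001, 0.0056))
ellipse = confidence_ellipse(center, cov, alpha=0.10)
print(ellipse[0])
```

Every returned point lies at the same Mahalanobis distance from the center, so plotting them in order draws the 90% ellipse in the benefit-risk plane.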

In addition to plotting the confidence region in the benefit-risk plane, we also can partition the benefit-risk plane into chosen regions of interest, e.g.,

1. Appreciable risk

2. No appreciable benefit

3. No conclusion ("gray region")

4. Experimental therapy superior

and look at the proportion of bootstrap estimates that fall into each region. These regions may be easier to specify for the clinician than the weights needed for the weighted benefit-risk ratios proposed by Chuang-Stein et al. [11].
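The bootstrap tabulation described above might be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names are hypothetical. The per-subject counts are derived from the results table later in this article (assuming its third row counts infants with both outcomes), and the cut-offs are the ones used in that example.

```python
import random
from collections import Counter

def make_arm(n, benefit_only, risk_only, both):
    """Expand arm-level counts into one (benefit, risk) pair per subject."""
    neither = n - benefit_only - risk_only - both
    return ([(1, 0)] * benefit_only + [(0, 1)] * risk_only
            + [(1, 1)] * both + [(0, 0)] * neither)

def rates(arm):
    """Observed probabilities of benefit and of risk in one arm."""
    n = len(arm)
    return sum(b for b, _ in arm) / n, sum(r for _, r in arm) / n

def bootstrap_differences(experimental, control, n_boot=2000, seed=1):
    """Resample subjects within each arm; return (risk_diff, benefit_diff) pairs."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_boot):
        p_e, q_e = rates(rng.choices(experimental, k=len(experimental)))
        p_c, q_c = rates(rng.choices(control, k=len(control)))
        out.append((q_e - q_c, p_e - p_c))
    return out

def region(risk_diff, benefit_diff, risk_cut=0.10, low=0.10, high=0.20):
    """Assign a point to a region; cut-offs should be chosen on clinical grounds."""
    if risk_diff > risk_cut:
        return "appreciable risk"
    if benefit_diff > high:
        return "experimental therapy superior"
    if benefit_diff < low:
        return "no appreciable benefit"
    return "no conclusion"

# Derived counts: 28 benefit and 8 risk of 73, 3 with both; 18 and 1 of 76, 0 with both.
experimental = make_arm(73, benefit_only=25, risk_only=5, both=3)
control = make_arm(76, benefit_only=18, risk_only=1, both=0)
draws = bootstrap_differences(experimental, control)
counts = Counter(region(r, b) for r, b in draws)
proportions = {name: c / len(draws) for name, c in counts.items()}
print(proportions)
```

Resampling whole subjects, rather than benefit and risk separately, preserves the within-subject association between the two outcomes.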

### Bayesian methods

As an alternative to the confidence region approach, using asymptotic theory, Bayesian inference can be based on the posterior distribution of the difference in benefit and difference in risk, assuming that the prior distribution is locally uniform (or continuous and nonzero) near the true difference in risk and difference in benefit [18]. Using the posterior distribution,

$p([{q}_{E}-{q}_{C},{p}_{E}-{p}_{C}{]}^{\prime}|[{\widehat{q}}_{E}-{\widehat{q}}_{C},{\widehat{p}}_{E}-{\widehat{p}}_{C}{]}^{\prime})\approx N([{\widehat{q}}_{E}-{\widehat{q}}_{C},{\widehat{p}}_{E}-{\widehat{p}}_{C}{]}^{\prime},\widehat{V})$

the posterior probability of falling into the chosen regions can be computed [19]. The integration required can be carried out using the numerical integration function *NIntegrate* in Mathematica or similar software. The probability interpretation of the Bayesian analysis is more straightforward than the confidence interpretation associated with the bootstrapping approach.

Decision analysis also can be conducted under the Bayesian framework using linear combinations of the form

$f(A,B)=A({p}_{E}-{p}_{C})-B({q}_{E}-{q}_{C})$

Point estimates and probability intervals for these linear combinations can be computed by taking a large number of draws from the posterior distribution and computing f(A, B) for each draw. The median of the draws can be used as a point estimate of f(A, B), and the 100*α*/2 and 100(1 - *α*/2) centiles of these draws form a 100(1 - *α*)% interval estimate.
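A minimal sketch of this simulation step, assuming the bivariate normal posterior given earlier, is shown below. The function names are hypothetical, and the posterior mean and covariance are illustrative inputs rather than values reported in the article.

```python
import random
from math import sqrt
from statistics import median

def draw_posterior(mean, cov, n_draws=20000, seed=1):
    """Draw (risk_diff, benefit_diff) pairs from a bivariate normal posterior."""
    rng = random.Random(seed)
    (v11, v12), (_, v22) = cov
    # Cholesky factor of the 2x2 covariance matrix.
    l11 = sqrt(v11)
    l21 = v12 / l11
    l22 = sqrt(v22 - l21 ** 2)
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        draws.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return draws

def summarize_f(draws, a, b, alpha=0.10):
    """Median and equal-tailed 100(1 - alpha)% interval for f(A, B)."""
    values = sorted(a * benefit - b * risk for risk, benefit in draws)
    lo_idx = int(len(values) * alpha / 2)
    hi_idx = int(len(values) * (1 - alpha / 2)) - 1
    return median(values), values[lo_idx], values[hi_idx]

# Illustrative posterior for (risk difference, benefit difference).
mean = (0.096, 0.147)
cov = ((0.0015, -0.00005), (-0.00005, 0.0056))
draws = draw_posterior(mean, cov)
point, lower, upper = summarize_f(draws, a=1.0, b=1.0, alpha=0.10)
print(round(point, 3), round(lower, 3), round(upper, 3))
```

Because only the draws are used in `summarize_f`, the same function applies unchanged if the draws come from a different posterior.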

These linear combinations also can be used to conduct benefit-risk analyses analogous to the incremental net health benefit (*INHB*) approach used in cost-effectiveness analyses [20, 21]. In the cost-effectiveness setting, the *INHB* of an experimental treatment compared to a control is defined as

$INHB(\lambda)=({\epsilon}_{E}-{\epsilon}_{C})-({\gamma}_{E}-{\gamma}_{C})/\lambda$

where *λ* can be thought of as the maximum society is willing to pay for an incremental gain in health [20]. One obvious advantage of this approach is that INHB is measured in units of effectiveness so the quadrant ambiguity of the cost-effectiveness approach is no longer an issue.

Analogously, in the benefit-risk setting, we define the incremental net health benefit of the experimental therapy compared to the control as

$INHB_{BR}(\delta)=({p}_{E}-{p}_{C})-({q}_{E}-{q}_{C})/\delta$

where $\delta$ can be thought of as the maximum number of adverse events one is willing to incur for each subject that benefits. Alternatively, and perhaps more meaningfully, one can interpret $1/\delta$ as the minimum number of subjects who should benefit for each additional adverse event. Integration over the posterior distribution of the risk difference and benefit difference can be used to compute $Pr[INHB_{BR}(\delta) > 0]$ for a particular $\delta$ value, or one can look at a plot of $Pr[INHB_{BR}(\delta) > 0]$ over a range of $\delta$ values.
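When the posterior is taken to be bivariate normal, the net benefit is a linear combination of normal variables, so the probability that it exceeds zero has a closed form and no numerical integration is strictly needed. The sketch below illustrates this under that assumption; the function name is hypothetical, and the inputs are illustrative values roughly in line with the hydrocortisone example rather than figures reported in the article.

```python
from math import sqrt
from statistics import NormalDist

def prob_inhb_positive(delta, mean, cov):
    """Pr[INHB_BR(delta) > 0] when (risk_diff, benefit_diff) ~ N(mean, cov)."""
    (v_rr, v_rb), (_, v_bb) = cov
    w = 1.0 / delta
    # INHB_BR = benefit_diff - w * risk_diff is normal with this mean and variance.
    m = mean[1] - w * mean[0]
    v = v_bb + w * w * v_rr - 2.0 * w * v_rb
    return NormalDist().cdf(m / sqrt(v))

# Illustrative posterior for (risk difference, benefit difference).
mean = (0.096, 0.147)
cov = ((0.0015, -0.00005), (-0.00005, 0.0056))

p_at_one = prob_inhb_positive(1.0, mean, cov)
# Probability curve over a range of 1/delta (minimum beneficiaries per adverse event).
curve = [(w, prob_inhb_positive(1.0 / w, mean, cov)) for w in (0.5, 1.0, 1.5, 2.0, 3.0)]
for w, p in curve:
    print(w, round(p, 3))
```

The curve decreases as $1/\delta$ grows, since a stricter threshold demands more subjects who benefit per additional adverse event.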

Although we have used large sample theory to assume the posterior distribution of the difference in risk and difference in benefit is bivariate normal, this assumption is not necessary for these Bayesian methods. As long as it is possible to simulate draws from the posterior distribution, these point estimates and probability intervals can be calculated under other distributional assumptions. Simulation approximations to the integration required to compute the posterior probabilities, $Pr[INHB_{BR}(\delta) > 0]$, are obtained by computing the percentage of simulation draws for which $INHB_{BR}(\delta)$ exceeds 0. Similar simulation approximations to integration can be used to compute posterior probabilities of falling into chosen regions of interest in the benefit-risk plane.
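The simulation approximation amounts to counting draws. The sketch below assumes draws from a bivariate normal posterior purely for convenience; any sampler that returns (risk difference, benefit difference) pairs could be substituted. Function names and numeric inputs are illustrative assumptions.

```python
import random
from math import sqrt

def simulate(mean, cov, n_draws=50000, seed=7):
    """Draw (risk_diff, benefit_diff) pairs; swap in any other posterior sampler."""
    rng = random.Random(seed)
    (v11, v12), (_, v22) = cov
    l11 = sqrt(v11)
    l21 = v12 / l11
    l22 = sqrt(v22 - l21 ** 2)
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        draws.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return draws

def fraction(draws, predicate):
    """Share of draws satisfying the predicate: a Monte Carlo posterior probability."""
    return sum(1 for risk, benefit in draws if predicate(risk, benefit)) / len(draws)

# Illustrative posterior for (risk difference, benefit difference).
mean = (0.096, 0.147)
cov = ((0.0015, -0.00005), (-0.00005, 0.0056))
draws = simulate(mean, cov)

delta = 1.0
p_inhb = fraction(draws, lambda risk, benefit: benefit - risk / delta > 0)
p_appreciable_risk = fraction(draws, lambda risk, benefit: risk > 0.10)
print(round(p_inhb, 3), round(p_appreciable_risk, 3))
```

Passing the region as a predicate keeps the counting step independent of how the regions of interest are defined.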

## Results and discussion

Survival without supplemental oxygen and GI perforation rates in the PROPHET study by treatment

| | Placebo (n = 76) | Hydrocortisone (n = 73) |
|---|---|---|
| Survival without O₂ | 18/76 (24%) | 28/73 (38%) |
| GI perforation | 1/76 (1%) | 8/73 (11%) |
| Survival without O₂ and GI perforation | 0/76 | 3/73 (4%) |

1. Appreciable risk: risk difference > 0.10

2. Acceptable risk: risk difference ≤ 0.10
   - a. Hydrocortisone superior: benefit difference > 0.20
   - b. No conclusion: 0.10 ≤ benefit difference ≤ 0.20
   - c. No appreciable benefit: benefit difference < 0.10

Estimated probabilities of falling into selected regions of interest

| Region | Proportion of bootstrap estimates | Posterior probability |
|---|---|---|
| Appreciable risk | 0.44 | 0.46 |
| Hydrocortisone superior | 0.14 | 0.13 |
| No conclusion | 0.27 | 0.27 |
| No appreciable benefit | 0.15 | 0.14 |
| Total | 1 | 1 |

The probability that the incremental net health benefit ($INHB_{BR}$) of hydrocortisone compared to placebo exceeds zero can be examined over a range of $1/\delta$, which can be interpreted here as the minimum number of babies who should survive without supplemental oxygen for each additional GI perforation incurred. If the threshold is one additional survivor without supplemental oxygen for each additional GI perforation, the probability that $INHB_{BR}(1)$ exceeds zero is approximately 0.72. This probability quickly drops off and falls below 50% when the threshold is approximately 1.5 additional survivors without supplemental oxygen for each additional GI perforation.

These findings are not conclusive and demonstrate the need for additional study to determine how hydrocortisone therapy might be used to provide benefit in these extremely low birth weight infants without increasing risk of GI perforation. One area of potential investigation is related to indomethacin therapy's role in the development of GI perforation. There is evidence in the PROPHET study of an interaction between hydrocortisone and early indomethacin therapy, although indomethacin was not randomized in this trial. In the absence of early indomethacin, low-dose hydrocortisone therapy administered as described for this study has not previously been associated with increased incidence of GI perforation [23]. For this analysis S-PLUS was used to construct the confidence ellipse and nonparametric region. The two-dimensional kernel density estimation function *kde* and the ellipse-drawing function *ellipse* for S-PLUS or R are available from StatLib [24]. Mathematica was used to compute the benefit-risk ratio and associated confidence interval and all posterior probabilities, but these computations also can be done using S-PLUS or R.

## Conclusion

It is less ambiguous to jointly look at the difference in risk and difference in benefit in the benefit-risk plane than to collapse information by computing a benefit-risk ratio. If the benefit-risk ratio is reported, the joint distribution of benefit and risk also should be presented. When looking at the joint distribution, uncertainty in benefits and risks can be represented by confidence ellipses based on the assumption of bivariate normality or plots of estimates from bootstrap samples with or without a nonparametric confidence region. To quantify the probability of falling into regions of interest, the proportion of bootstrap estimates or posterior probabilities can be computed for particular regions. Bayesian methods provide a flexible framework in which to summarize the joint distribution of benefit and risk. Using the Bayesian framework allows one to easily conduct benefit-risk analyses similar to the incremental net health benefit analyses used for cost-effectiveness research. As this approach is based on linear combinations of benefit and risk, many of the inferential problems associated with ratios are avoided.

We have chosen to focus on the comparison of two therapies for a binary measure of benefit and a binary measure of risk, as the motivating PROPHET study had a binary primary benefit outcome and an increased rate of a single adverse event, spontaneous GI perforation, which resulted in an early stop of the trial. However, the Bayesian methods easily generalize to allow for other distributions of benefit and risk, provided one can simulate samples from the posterior distribution of interest. The Bayesian methods also allow prior information to be incorporated into the inference if such information is available. When it is of interest to compare more than two therapies, the benefit-risk approaches shown can be conducted in a pairwise fashion.

## Declarations

### Acknowledgements

Partial support for this research was provided under Grant No. R01 HD038540 from the National Institute of Child Health & Human Development. The authors would like to thank three reviewers for helpful comments that improved the manuscript.

## Authors’ Affiliations

## References

1. Willan AR, O'Brien BJ, Cook DJ: Benefit-risk ratios in the assessment of the clinical evidence of a new therapy. Control Clin Trials. 1997, 18: 121-130. 10.1016/S0197-2456(96)00092-X.
2. Heitjan DF, Moskowitz AJ, Whang W: Bayesian estimation of cost-effectiveness ratios from clinical trials. Health Econ. 1999, 8: 191-201. 10.1002/(SICI)1099-1050(199905)8:3<191::AID-HEC409>3.0.CO;2-R.
3. Willan AR, O'Brien BJ: Confidence intervals for cost-effectiveness ratios: an application of Fieller's theorem. Health Econ. 1996, 5: 297-305. 10.1002/(SICI)1099-1050(199607)5:4<297::AID-HEC216>3.0.CO;2-T.
4. Heitjan DF, Moskowitz AJ, Whang W: Problems with interval estimates of the incremental cost-effectiveness ratio. Med Decis Making. 1999, 19: 9-15.
5. Glick H, Polsky D: Evaluating stochastic uncertainty in cost-effectiveness analysis. 2003, [Society for Medical Decision Making short course], [http://www.uphs.upenn.edu/dgimhsr]
6. The R Project for Statistical Computing. [http://www.r-project.org]
7. Efron B: The jackknife, the bootstrap and other resampling plans. 1982, SIAM [Society for Industrial and Applied Mathematics]
8. Briggs AH, Wonderling DE, Mooney CZ: Pulling cost-effectiveness analysis up by its bootstraps: a non-parametric approach to confidence interval estimation. Health Econ. 1997, 6: 327-340. 10.1002/(SICI)1099-1050(199707)6:4<327::AID-HEC282>3.0.CO;2-W.
9. Heitjan DF, Kim CY, Li H: Bayesian estimation of cost-effectiveness from censored data. Stat Med. 2004, 23: 1297-1309. 10.1002/sim.1740.
10. Tallarida RJ, Murray RB, Eiben C: A scale for assessing the severity of diseases and adverse drug reactions. Clin Pharmacol Ther. 1979, 25: 381-390.
11. Chuang-Stein C, Mohberg NR, Sinkula MS: Three measures for simultaneously evaluating benefits and risks using categorical data from clinical trials. Stat Med. 1991, 10: 1349-1359.
12. Chuang-Stein C, Mohberg NR, Musselman DM: Organization and analysis of safety data using a multivariate approach. Stat Med. 1992, 11: 1075-1089.
13. Chuang-Stein C: A new proposal for benefit-less-risk analysis in clinical trials. Control Clin Trials. 1994, 15: 30-43. 10.1016/0197-2456(94)90026-4.
14. Holden WL: Benefit-risk analysis: a brief review and proposed quantitative approaches. Drug Saf. 2003, 26: 853-862. 10.2165/00002018-200326120-00002.
15. van Hout BA, Al MJ, Gordon GS, Rutten FF: Costs, effects and C/E-ratios alongside a clinical trial. Health Econ. 1994, 3: 309-319.
16. Noyes K, Holloway RG: Evidence from cost-effectiveness research. NeuroRx. 2004, 1: 348-355. 10.1602/neurorx.1.3.348.
17. Hall P: On the bootstrap and likelihood-based confidence regions. Biometrika. 1987, 74: 481-493. 10.2307/2336687.
18. Gelman A, Carlin JB, Stern HS, Rubin DB: Bayesian Data Analysis. 1995, London: Chapman & Hall
19. Berger JO: Statistical Decision Theory and Bayesian Analysis. 1985, New York: Springer, 2nd edition
20. Stinnett AA, Mullahy J: Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decis Making. 1998, 18: S68-S80.
21. Willan AR: Analysis, sample size, and power for estimating incremental net health benefit from clinical trial data. Control Clin Trials. 2001, 22: 228-237. 10.1016/S0197-2456(01)00110-6.
22. Watterberg KL, Gerdes JS, Cole CH, Aucott SW, Thilo EH, Mammel MC, Couser RJ, Garland JS, Rozycki HJ, Leach CL, Backstrom C, Shaffer ML: Prophylaxis of early adrenal insufficiency to prevent bronchopulmonary dysplasia: a multicenter trial. Pediatrics. 2004, 114: 1649-1657. 10.1542/peds.2004-1159.
23. Watterberg KL, Gerdes JS, Gifford KL, Lin HM: Prophylaxis against early adrenal insufficiency to prevent chronic lung disease in premature infants. Pediatrics. 1999, 104: 1258-1263. 10.1542/peds.104.6.1258.
24. StatLib – Software and extensions for the S (Splus) language. [http://lib.stat.cmu.edu/S]
### Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/6/48/prepub

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.