This article has Open Peer Review reports available.

# Bayesian methods in clinical trials: a Bayesian analysis of ECOG trials E1684 and E1690

- Joseph G Ibrahim
^{1}Email author, - Ming-Hui Chen
^{2}and - Haitao Chu
^{3}

**12**:183

https://doi.org/10.1186/1471-2288-12-183

© Ibrahim et al.; licensee BioMed Central Ltd. 2012

**Received: **10 August 2012

**Accepted: **6 November 2012

**Published: **29 November 2012

## Abstract

### Background

E1684 was the pivotal adjuvant melanoma trial for establishment of high-dose interferon (IFN) as effective therapy of high-risk melanoma patients. E1690 was an intriguing effort to corroborate E1684, and the differences between the outcomes of these trials have embroiled the field in controversy over the past several years. The analyses of E1684 and E1690 were carried out separately when the results were published, and there were no further analyses trying to perform a single analysis of the combined trials.

### Method

In this paper, we consider such a joint analysis by carrying out a Bayesian analysis of these two trials, thus providing us with a consistent and coherent methodology for combining the results from these two trials.

### Results

The Bayesian analysis using power priors provided a more coherent flexible and potentially more accurate analysis than a separate analysis of these data or a frequentist analysis of these data. The methodology provides a consistent framework for carrying out a single unified analysis by combining data from two or more studies.

### Conclusions

Such Bayesian analyses can be crucial in situations where the results from two theoretically identical trials yield somewhat conflicting or inconsistent results.

## Keywords

## Background

### Brief Introduction to Bayesian Methods

Bayesian methods in clinical trials and biomedical research, in general, have become quite prominent in the last decade due to their flexibility in use, good operating characteristics, interpretation, and in their ability to handle design and analysis issues in complex models, such as survival models, models for longitudinal data, and models for discrete data. Bayesian methods are becoming more and more standard in the design and analysis of clinical trials [1, 2]. One main reason for this is in their flexibility and operating characteristics, for example, in adaptive designs and interim monitoring [2]. There are many good introductory as well as advanced level books on Bayesian methods [3–5].

The Bayesian paradigm differs from the frequentist paradigm in that our uncertainty about unknown parameters in a model is expressed through an entire distribution, called the *prior distribution*. Thus, the prior distribution for a model parameter θ, expresses our prior uncertainty about the value of the parameter, and “prior” in the sense that we express our uncertainty before collecting the data in the study. We denote the prior distribution for θ by *π*(*θ*). The main inferential tool in the Bayesian paradigm is called the posterior distribution, that is, the distribution of *θ* after the data is collected. By Bayes’ theorem, the posterior distribution of θ is proportional to the likelihood function of the data times the prior. In terms of a formula, it is given by *π*(*θ*∣Data) ∝ *L*(Data∣*θ*)*π*(*θ*), where *π*(*θ*∣Data) denotes the posterior distribution of *θ* that is, the distribution of θ *given* the data, and *L*(Data∣*θ*) denotes the likelihood function of the data given the parameter *θ*.

A major consideration in the Bayesian paradigm is the choice of the prior. Priors that have a minimal impact on the overall Bayesian analysis are called *noninformative* priors. Other names for noninformative priors include flat, reference, or vague priors. Noninformative priors yield Bayesian inferences that are very similar to frequentist inference. A noninformative prior is “flat” relative to the likelihood function, that is, it is flat relative to the distribution of the data. On the other hand, informative priors generally do not lead to results that are similar to those of the frequentist paradigm and such priors are not flat relative to the likelihood function and do have an impact on the likelihood in a Bayesian analysis. Examples of potentially informative priors used in biomedical research include priors incorporating historical data [6, 7].

Bayesian models can be fit in a wide variety of statistical packages including WinBUGS and SAS. These software packages have become very powerful in providing the data analyst with a wide array of flexibility and capability for fitting complex Bayesian models. Both of these software packages use Markov chain Monte Carlo (MCMC) methods to carry out the Bayesian computation. MCMC methods are simulation-based methods that draw samples from the posterior distribution of *θ* and have proven to be quite powerful for fitting even the most complex models that cannot be entertained in the frequentist paradigm.

### Brief Introduction to Melanoma

Melanoma incidence is increasing at a rate that exceeds all solid tumors. Although education efforts have resulted in earlier detection of melanoma, patients who have deep primary melanoma (*>*4*mm*) or melanoma metastatic to regional draining lymph nodes, classified as *high-risk melanoma* patients, continue to have high relapse and mortality rates of 50% or higher [8]. Recently, several post-operative (adjuvant) chemotherapies have been proposed for this class of melanoma patients, and the one which seems to provide the most significant impact on relapse-free survival and survival is Interferon Alpha-2b (IFN). This immunotherapy was evaluated in two observation-controlled Eastern Cooperative Oncology Group (ECOG) phase III clinical trials, E1684 and E1690. The first trial, E1684, was a two arm clinical trial comparing high-dose interferon (IFN) to Observation (OBS). There were a total of 286 patients enrolled in the study, accrued from 1984 to 1990, and the study was unblinded in 1993 and published in 1996 [8]. The results of this study suggested that IFN has a significant impact on relapse-free survival and survival, which led to the U.S. Food and Drug Administration (FDA) approval of this regimen as an adjuvant therapy for high-risk melanoma patients. Here, relapse-free survival is defined as the time from randomization until progression of tumor or death, whichever comes first, and survival is defined as time from randomization until death. This regimen is widely used for adjuvant therapy of high-risk melanoma patients and the reference standard for evaluation of alternative modalities such as vaccines in current U.S. Cooperative group trials.

The significant treatment effect favoring IFN seen in E1684 with respect to both relapse- free survival (RFS) and overall survival (OS) was expected and was accompanied by substantial side effects due to the high-dose regimen. As a result, ECOG began a second trial (E1690) in 1991 to attempt to confirm the results of E1684 and to study the potential benefit of IFN given at a lower and less toxic dosage. The ECOG trial E1690 was a three arm phase III clinical trial, and had treatment arms of high dose interferon, low dose interferon, and observation. This study had 427 patients on the high dose interferon arm and observation arm combined. Throughout our analyses in this paper, we will use only the data from these two arms of E1690. E1690 was initiated right after the completion of E1684. The E1690 trial accrued patients from 1991 until 1995, was unblinded in 1998, and published in 2000 [9]. The E1690 trial was designed for exactly the same patient population as E1684, and the high dose interferon arm in E1690 was identical to that of E1684.

E1690 was a critical trial in the assessment of the value of high-dose IFN as adjuvant therapy for melanoma. When the results of E1690 were unblinded, separate results for E1684 and E1690 were reported [9], and analyses of the combined results were problematic and unable to be resolved into one coherent analysis. In this paper, we propose to do a combined analysis of the E1684 and E1690 trials using Bayesian methods. The Bayesian methodology lends itself well into this type of analysis since the E1684 data can effectively be used as prior information for the E1690 analysis. Using the E1684 data as the historical data and the E1690 data as the current data is natural here within the Bayesian paradigm and this will serve as the basis for our analysis. We thus examine the problem of developing suitable statistical models for high-risk melanoma patients as well as the opportunity of conducting Bayesian inference in the presence of historical data. In this article, we will discuss a Bayesian analysis for the endpoints of RFS as well as OS. It was in the OS endpoint where the results were most inconsistent between the two trials and it was this endpoint which has led to most of the controversy. The RFS results were more or less consistent between E1684 and E1690. The results of this trial also raised the issue of whether RFS can be used at all as a suitable surrogate or predictor of OS, since the RFS and OS results were not consistent in E1690.

In the present context, the incorporation of historical data, i.e., the E1684 data, into the analysis of E1690 is a natural thing to do. Towards this goal, we use the power priors of Ibrahim and Chen [6] to construct the prior distribution. Since the FDA has in the past often required a second confirmatory trial before approving a new drug for cancer therapy, historical data often exists for constructing prior distributions in a clinical trial from a previous trial or trials comparing identical or very similar treatment regimens. Such is the case for the E1684 and E1690 trials. Thus, it appears natural to use the E1684 data somehow for the analysis of E1690.

The construction of prior distributions from historical data has been examined under various contexts by Ibrahim and Chen [6]. Where the focus is on observable quantities in the prior elicitation scheme. Specifically, prior elicitation is based on the availability of historical data *D*
_{0} and a scalar *precision* parameter *a*
_{0} (0 *≤ a*
_{0} *≤* 1) quantifying the uncertainty in *D*
_{0}. Then, *D*
_{0} and *a*
_{0} are used to specify a prior distribution for the parameters in a “semi-automatic” fashion. Strictly speaking, *D*
_{0} can consist of prior predictions from past data, summary statistics from previous studies, or subjective elicitation based on case-specific information. However, the most natural specification of *D*
_{0} arises when the raw data from a similar previous study is available, and this is what we focus on in this paper. There are many advantages to this type of elicitation scheme. First, the prior is constructed in a semi-automatic fashion from the historical data in the sense that the prior itself is just a weighted likelihood and thus there is minimal subjective prior elicitation. Second, the precision parameter *a*
_{0} allows the investigator a great deal of control on the influence of the historical data for the current analysis. This is important in situations when one suspects heterogeneity between the patient populations, or when the sample sizes between the two studies are quite different.

- (a)
Patient heterogeneity between the two studies.

- (b)
Methods for incorporation of historical data.

- (c)
Consistency of the results from the two studies.

- (d)
A pooled analysis.

- (e)
Assessing Observation and high-dose IFN time trends.

- (1)
Should the analysis of E1690 be done independently of E1684?

- (2)
If we are interested in the treatment of melanoma, how can E1864 be used as historical data for E1690?

- (3)
How does E1684 impact the results?

- (4)
Can we assess and control the impact of E1684 by weighting it somehow?

- (5)
How do we weight the historical data to account for heterogeneity between the two studies?

## Methods

### Cure Rate Model

*π*= proportion of patients in the population who are cured, and 1

*− π*= proportion of patients in the population who are not cured. The proportion not cured experience events according to a survival function S

_{0}(t) = P(T > t). Then the probability of surviving beyond time

*t*for the entire population (i.e., the survival function for the entire population), denoted by S

_{p}(t), is given by

For example, *π* = 0*.*25 means that 25% of the population are “cured” and 75% are “not cured”. If *π* = 0, then we obtain the survival model with survival function S_{0}(t). The cure rate model fits the data better than the usual Cox model when a plateau occurs in the right tail of the survival curve, and thus, for E1684, and most other melanoma adjuvant trials in ECOG, the cure rate model fits the data better than the Cox model. The cure rate model also has other attractive properties. For example, the log-rank test has nice properties (i.e., high power) when a cure rate model is used in the statistical design [9, 10]. Also, the cure rate model is quite easy to fit and computationally very straightforward to program in SAS or R.

### Prior Elicitation

The specification of the prior distribution is called *prior elicitation*. Prior elicitation plays the most crucial role in Bayesian inference. Prior distributions based on historical data are useful in applied research settings where the investigator has access to previous studies measuring the same response and covariates as the current study.

*a*

_{0}, it is desirable to have historical and current datasets with minimum heterogeneity between the patient populations. Although the E1684 and E1690 trials theoretically involved identical patient populations, Figure 1 suggests that one should formally check if the historical data, E1684, is compatible with the current data, E1690, in some sense. In the context of these two trials, there are several practical formal checks one can make given a statistical model. These are outlined as follows.

- 1.
We compare important demographic and prognostic factor data from both studies, including distributions of gender, nodal status, age, Breslow depth, site of primary, ulceration, stage of disease, as well as other prognostic factors deemed important. Although not shown here, these comparisons were conducted for E1684 and E1690, and the distributions of these prognostic factors matched remarkably well. That is, the distributions of each of the demographic variables were nearly identical for both studies.

- 2.
A more formal procedure involves a comparison of the posterior distributions for each study. We can compare posterior summaries, such as posterior hazard ratios, posterior standard deviations, and Highest Posterior Density (HPD) intervals.

**Maximum likelihood estimates of hazard ratios and cure rates for E1684 and E1690**

Endpoint | Study | Hazard ratio | Cure rate | |
---|---|---|---|---|

(95% CI) | (95% CI) | |||

OBS | IFN | |||

RFS | E1684 | 1.43 | 0.17 | 0.35 |

(1.08, 1.89) | (0.06, 0.33) | (0.28, 0.43) | ||

E1690 | 1.28 | 0.32 | 0.44 | |

(1.00, 1.65) | (0.24, 0.40) | (0.37, 0.50) | ||

OS | E1684 | 1.32 | 0.25 | 0.40 |

(0.98, 1.78) | (0.12, 0.41) | (0.32, 0.49) | ||

E1690 | 1.00 | 0.49 | 0.47 | |

(0.75, 1.33) | (0.36, 0.60) | (0.37, 0.56) |

*power prior*as discussed in [6] to formally construct an informative prior distribution from the historical data E1684. The general idea of the power prior is that it is defined to be the likelihood function based on the historical data

*D*

_{0}, raised to a power

*a*

_{0}, where 0

*≤ a*

_{0}

*≤*1 is the scalar precision parameter that controls the influence of the historical data on the current data. Letting

*θ*denote a generic label for the parameters of a given model, the general power prior for a model is defined as

The quantity *π*
_{
0
}(*θ*) is the *initial prior* for *θ*. It represents the prior for *θ before* observing the historical data D_{0}. It is reasonable to restrict the range of *a*
_{0} to be between 0 and 1, since in general, it does not make sense to weight the historical data more than the current data. The parameter *a*
_{0} controls the heaviness of the tails of the prior for *θ*. As *a*
_{0} becomes smaller, the tails of equation (2) become heavier. Setting *a*
_{0} = 1, equation (2) corresponds to the update of *π*
_{0}(*θ*) using Bayes theorem. That is, with *a*
_{0} = 1, equation (2) corresponds to the posterior distribution of *θ* based on the historical data alone. When *a*
_{0} = 0, equation (2) does not depend on the historical data, and in this case, *π*(*θ*|*D*
_{0}, *a*
_{0} = 0) ≡ *π*
_{0}(*θ*). Thus, *a*
_{0} = 0 is equivalent to a prior specification with no incorporation of historical data. Equation (2) can be viewed as a generalization of the usual Bayesian update of *π*
_{0}(*θ*), and therefore serves as a coherent update of *π*
_{0}(*θ*). Details of the power prior, likelihood, and posterior for the models used in this paper are given in Additional file 1: Appendix A. We mention here that there are two ways of specifying the power prior: i) using *a*
_{0} as a fixed parameter and ii) treating *a*
_{0} as random and specifying a beta prior for it as discussed in [6]. In our experience, we have observed that essentially similar results are obtained whether we take *a*
_{0} as fixed or random. The *a*
_{0} random case is more computationally intensive than the *a*
_{0} fixed case and probably not worth the extra modeling and computation, and thus we advocate using *a*
_{0} fixed and using a goodness of fit criterion (see Additional file 1: Appendix B) to select the best *a*
_{0}. Such a procedure is much less computationally demanding than the *a*
_{0} random case.

## Results

### Combined Bayesian analysis of E1690 and E1684

In this section, we carry out a Bayesian analysis of E1690 using E1684 as historical data. For ease of exposition, all analyses are carried out with the treatment covariate alone, not adjusting for other prognostic factors. The two datasets were quite similar with respect to the distributions of several prognostic factors, including age, Breslow depth, number of nodes, performance status, gender, site of primary, and stage of disease. Prognostic factor analyses were conducted to examine the significance of time trend covariates and institutional effects for each study alone, as well as for the combined studies, for explaining the phenomenon in Figure 1. These factors were highly non-significant. We refer the reader to Kirkwood et al. [9] for detailed summaries of the prognostic factor distributions for both studies, as well as detailed prognostic factor analyses and various subset analyses conducted using the Cox model. The various prognostic factor and subset analyses in [9] yielded very similar results to analyses using the treatment covariate alone. In addition, Bayesian analyses with the proposed models using the prognostic factors mentioned above gave very similar results to the Bayesian analyses using the treatment covariate alone. Thus we conduct all analyses here with the treatment covariate alone for brevity.

*a*

_{0}= 0, 0.4, 1 using the model in equation (A.1) in Additional file 1: Appendix A along with the prior in equation (2). We see from Table 2 how the estimates of the posterior hazard ratio and posterior cure rates change as more weight is given to the historical data for both the RFS and OS endpoints. Table 2 shows that when more weight is given to the historical data, the posterior hazard ratios increase, resulting in a greater percent reduction in relapse for the IFN arm compared to the observation arm. In addition, although not shown here for brevity, the HPD intervals for the hazard ratio also become narrower and do not include 1. This is reasonable since the posterior hazard ratios based on E1684 alone were much larger than E1690 alone, and therefore as more weight is given to E1684, the greater the posterior hazard ratios and the narrower the HPD intervals. Similarly, the posterior estimates of the cure rates in Table 2 become smaller as

*a*

_{0}is increased. This again can be explained by the fact that the posterior cure rates for each arm in E1684 alone were much smaller than those based on E1690 alone. The value

*a*

_{0}= 1 implies that the E1684 and E1690 data are equally weighted in the Bayesian analysis, and this value achieves the greatest percent reduction in relapse (34%) and overall survival (14%), and the smallest cure rates for both treatment arms for both RFS and OS.

**Bayesian analysis of E1690 using E1684 as historical data**

Endpoint | Weight (a
| Posterior hazard ratio | 95% HPD Interval | % Reduction | Posterior cure rate estimate (95% HPD Interval) | |
---|---|---|---|---|---|---|

OBS | IFN | |||||

RFS | 0 | 1.29 | (0.98, 1.63) | 29 | 0.32 | 0.41 |

(0.24, 0.39) | (0.32, 0.49) | |||||

0.4 | 1.32 | (1.04, 1.62) | 32 | 0.30 | 0.40 | |

(0.23, 0.36) | (0.33, 0.46) | |||||

1 | 1.34 | (1.10, 1.60) | 34 | 0.28 | 0.39 | |

(0.23, 0.33) | (0.33, 0.44) | |||||

OS | 0 | 1.01 | (0.74, 1.30) | 1 | 0.50 | 0.50 |

(0.42, 0.58) | (0.42, 0.58) | |||||

0.4 | 1.08 | (0.82, 1.34) | 8 | 0.44 | 0.46 | |

(0.36, 0.51) | (0.38, 0.54) | |||||

1 | 1.14 | (0.91, 1.38) | 14 | 0.39 | 0.43 | |

(0.32, 0.46) | (0.36, 0.50) |

*a*

_{0}for the RFS (Figure 3a) and OS (Figure 3c) endpoints. From these plots, we can see how the posterior distributions shift to the right and become more peaked as more weight (i.e., increasing

*a*

_{0}) is given to the historical data. Specifically, as

*a*

_{0}is increased, we see a sharper peak in the posterior distribution of the hazard ratio in Figure 3a and c, thus giving a greater strength of evidence that the hazard ratio is larger than 1 (1.34 for RFS, 1.14 for OS). The same phenomenon occurs n Figure 3b and d which shows the posterior distributions of the cure rates for RFS and OS. In Figure 3b, we see that as

*a*

_{0}is increased, there is a shift to the left and sharper peaks in the posterior distributions of the cure rates, yielding a greater strength of evidence that the cure rates are 0.28 and 0.39 for the observation and IFN arms for RFS, respectively (Figure 3b, and 0.39 and 0.43 for the observation and IFN arms for OS, respectively (Figure 3d).

*a*

_{0}in the power prior, that is, what value of

*a*

_{0}should the investigator use in the analysis? To address this issue, we use the value of

*a*

_{0}that yields the best model fit according to some Bayesian criterion, such as the Deviance Information Criterion (DIC) (see Additional file 1: Appendix B). Figure 4a and b shows that the DIC is optimized when

*a*

_{0}= 0.4 for both the RFS and OS endpoints suggesting that the Bayesian model fits best when

*a*

_{0}= 0.4 is used for both of these endpoints. Interestingly enough, the same optimal value of

*a*

_{0}= 0.4 is obtained under a different Bayesian criterion (Logarithm of the Pseudo Marginal Likelihood (LPML) (see Additional file 1: Appendix B and Figure 4c and d). Thus, the value of

*a*

_{0}= 0.4 appears to be a very reasonable value to use for the Bayesian analysis, along with other

*a*

_{0}values used as a sensitivity analysis, such as

*a*

_{0}= 0 and

*a*

_{0}= 1.

*a*

_{0}= 1 in Table 2 for both RFS and OS showing that our Bayesian analysis using

*a*

_{0}= 1 essentially yields the same results as these three frequentist methods as well as being a more flexible methodology for incorporating historical data. However, we note that

*a*

_{0}= 1 is not an optimal value for the Bayesian analysis as shown by Figure 4. Our Bayesian methodology proposed here is much more general and more flexible than a frequentist analysis and chooses an optimal value of

*a*

_{0}based on the best model fit to the data.

**Maximum likelihood analysis of E1690 and E1684**

Endpoint | Weight (a
| Hazard ratio estimate | 95% CI | % Reduction | P-value |
---|---|---|---|---|---|

RFS | Study covariate | 1.35 | (1.12, 1.63) | 35 | 0.0018 |

Combined | 1.34 | (1.11, 1.62) | 34 | 0.0021 | |

Frailty | 1.35 | (1.12, 1.62) | 35 | 0.0019 | |

OS | Study covariate | 1.15 | (0.93, 1.41) | 14 | 0.196 |

Combined | 1.13 | (0.92, 1.39) | 13 | 0.232 | |

Frailty | 1.14 | (0.93, 1.40) | 14 | 0.200 |

## Discussion

The upward shifts in the OBS and IFN arm in E1690 relative to E1684 do not have a clear cut explanation. It was conjectured that the standard of care improved with time, therefore resulting in improved RFS and OS in E1690. We carried out several analyses to assess this conjecture by fitting a time trend in the Cox model. As noted in Section 5, this time trend effect was highly non-significant in all models fit. Another analysis examining institutional effects was also conducted and the analysis yielded non-significant institutional effects. Another issue still very difficult to explain is why OS was much better for the OBS patients in E1690 than in E1684.

As noted earlier, the KM plot for OS showed almost total overlap between the IFN and OBS arms. One potential explanation for this is that patients on the OBS arm received salvage therapies after relapse that may have improved their OS. The inclusion of these salvage therapies was not accounted for in the OS analysis. It will take further confirmatory trials to get better answers to these two unresolved issues.

*a*

_{0}in the presence of outliers or influential points in the data. To address this issue, we carried out a case influence analysis using the methods of Cho et al. [12]. Figure 5 shows the RFS and OS plots of the KL divergence between the posterior for the full data and the posterior without the

*i*

^{ th }case for E1684 and E1690. From Figure 5, we see that the values of the KL divergence were small for all observations. For the E1864 data, the maximum values of the KL divergence were 0.203 for RFS and 0.324 for OS, respectively, and there was only one additional case in which the KL divergence value was greater than 0.06 for each of RFS and OS. For the E1690 data, there were three cases for RFS and five cases for OS, in which the KL divergence values were greater than 0.06. We deleted these relatively most influential cases (two for E1684 and three for E1690 for RFS and two for E1684 and five for E1690 for OS) and then re-ran the analysis to obtain the optimal values of

*a*

_{0}under DIC and LPML for RFS and OS. Figure 6 shows the DIC and LPML plots for E1684 and E1690 after excluding these influential cases. We see from this figure that the optimal choice of

*a*

_{0}remains to be 0.4 for both RFS and OS. Thus, the optimal choice of

*a*

_{0}under DIC and LPML is relatively robust to influential observations.

Moreover, we have carried out a sensitivity analysis with respect to the choice of the initial prior, using several possible choices of initial priors. To this end, we consider *π*
_{0}(*β*, *λ*) = *π*
_{0}(*β*)*π*
_{0}(*λ*) where *β* ~ *N*(0, *τ*
_{0}
^{2}
*I*
_{2}), *I*
_{2} is the 2 × 2 identity matrix, and λ_{j} ~ Gamma (*b*
_{01}, *b*
_{02}) (j = 1, 2, …, J), where J=5 for RFS and J=10 for OS, with density proportional to ${\lambda}_{j}^{\left({b}_{01}-1\right)}\text{exp}\left(-{b}_{02}{\lambda}_{j}\right)$ and *b*
_{01} ≥ 0 and *b*
_{02} ≥ 0. For RFS, the posterior hazard ratios and 95% HPD intervals for the treatment effect were 1.294 and (0.977, 1.626) for *a*
_{0} = 0, 1.320 and (1.033, 1.611) for *a*
_{0} = 0.4, and 1.346 and (1.109, 1.616) for *a*
_{0} = 1 when *τ*
_{0}
^{2} = 10 and *b*
_{01} = *b*
_{02} = 1; and 1.293 and (0.968, 1.628) for *a*
_{0} = 0, 1.320 and (1.037, 1.616) for *a*
_{0} = 0.4, and 1.342 and (1.098, 1.597) for *a*
_{0} = 1 when *τ*
_{0}
^{2} = 2 and *b*
_{01} = *b*
_{02} = 10. For OS, the posterior hazard ratios and 95% HPD intervals for the treatment effect were 1.012 and (0.726, 1.303) for *a*
_{0} = 0, 1.081 and (0.832, 1.352) for *a*
_{0} = 0:4, and 1.138 and (0.910, 1.371) for *a*
_{0} = 1 when *τ*
_{0}
^{2} = 10 and *b*
_{01} = *b*
_{02} = 1; and 1.017 and (0.742, 1.317) for *a*
_{0} = 0, 1.081 and (0.826, 1.350) for *a*
_{0} = 0.4, and 1.135 and (0.909, 1.371) for *a*
_{0} = 1 when *τ*
_{0}
^{2} = 2 and *b*
_{01} = *b*
_{02} = 10. These results were quite similar to those given in Table 2. Other choices of (*τ*
_{0}
^{2}, *b*
_{01}, *b*
_{02}) were also tried and the results were similar. Thus, the analysis of E1684 and E1690 is quite robust to the choice of the initial prior.

## Conclusions

We have presented a Bayesian analysis of E1684 and E1690 using the ideas of the power prior. This prior incorporates historical data in a natural way and gives the investigator a great deal of control over the weight given to the historical data through the parameter *a*
_{0}. In Section 5, we carried out a detailed Bayesian analysis using this prior and observed that the analysis provided a more coherent, flexible, and potentially more accurate analysis than a separate analysis of these data or a frequentist analysis of these data. The methodology provides a consistent framework for carrying out a single unified analysis by combining data from two studies. Our analysis showed that using *a*
_{0} = 0.4 yielded the best fitting model based on DIC and LPML, therefore suggesting that this value is the optimal value in discounting the E1684 data in the Bayesian analysis of E1690. The Bayesian analysis using *a*
_{0} = 0.4 yielded markedly different results than those of *a*
_{0} = 0 and *a*
_{0} = 1 in terms of estimated hazard ratios and reductions in relapses and/or deaths in using IFN as compared to OBS. A philosophical issue that arises here is that when such an analysis is to be carried out. The Bayesian analysis of E1690 was conducted after seeing the inconsistent results between the studies. A more appropriate approach would be to specify these Bayesian analyses in the protocol itself before the E1690 data is even collected. Such a decision would also impact the design of E1690.

## Declarations

### Acknowledgement

We would like to thank the Associate Editor and two referees for their helpful comments and suggestions, which have led to an improved version of the article.

**Research support**

This work was carried out with partially support from the U.S. National Institutes of Health (NIH) grants #GM 70335 and #CA 74015.

## Authors’ Affiliations

## References

- Berry D, Wathen JK, Newell M: Bayesian model averaging in meta-analysis: vitamin E supplementation and mortality. Clinical Trials. 2009, 6 (1): 28-41. 10.1177/1740774508101279.View ArticlePubMedGoogle Scholar
- Berry SM, Carlin BP, Lee J, Muller P: Bayesian adaptive methods for clinical trials. 2010, Boca Raton, FL: Chapman & Hall/CRCView ArticleGoogle Scholar
- Gelman A, Carlin JB, Stern HS, Rubin DB: Bayesian data analysis. 2004, Boca Raton, FL: Chapman & Hall/CRC, 2Google Scholar
- Carlin BP, Louis TA: Bayesian methods for data analysis. 2008, Boca Raton, FL: Chapman & Hall/CRC, 3Google Scholar
- Ibrahim JG, Chen MH, Sinha D: Bayesian survival analysis. 2001, New York: SpringerView ArticleGoogle Scholar
- Ibrahim JG, Chen MH: Power prior distributions for regression models. Stat Sci. 2000, 15 (1): 46-60. 10.1214/ss/1009212673.View ArticleGoogle Scholar
- Chen MH, Harrington DP, Ibrahim JG: Bayesian cure rate models for malignant melanoma: a case-study of Eastern Cooperative Oncology Group trial E1690. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2002, 51 (2): 135-150. 10.1111/1467-9876.00259.View ArticleGoogle Scholar
- Kirkwood JM, Strawderman MH, Ernstoff MS, Smith TJ, Borden EC, Blum RH: Interferon alfa-2b adjuvant therapy of high-risk resected cutaneous melanoma: the Eastern Cooperative Oncology Group Trial EST 1684. J Clin Oncol. 1996, 14 (1): 7-17.PubMedGoogle Scholar
- Kirkwood JM, Ibrahim JG, Sondak VK, Richards J, Flaherty LE, Ernstoff MS, et al: High- and Low-dose interferon alfa-2b in high-risk melanoma: first analysis of intergroup trial E1690/S9111/C9190. J Clin Oncol. 2000, 18 (12): 2444-2458.PubMedGoogle Scholar
- Ewell M, Ibrahim JG: The large sample distribution of the weighted log rank statistic under general local alternatives. Lifetime Data Anal. 1997, 3: 5-12. 10.1023/A:1009690200504.View ArticlePubMedGoogle Scholar
- Chen MH, Ibrahim JG, Sinha D: A new Bayesian model for survival data with a surviving fraction. J Am Statist Assoc. 1999, 94: 909-919. 10.1080/01621459.1999.10474196.View ArticleGoogle Scholar
- Cho H, Ibrahim JG, Sinha D, Shu H: Bayesian case influence diagnostics for survival models. Biometrics. 2009, 65: 116-124. 10.1111/j.1541-0420.2008.01037.x.View ArticlePubMedGoogle Scholar
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1471-2288/12/183/prepub

### Pre-publication history

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.