This article has Open Peer Review reports available.
Evaluating the performance of copula models in phase I-II clinical trials under model misspecification
- Kristen Cunanan^{1} and
- Joseph S Koopmeiners^{1}
https://doi.org/10.1186/1471-2288-14-51
© Cunanan and Koopmeiners; licensee BioMed Central Ltd. 2014
Received: 14 November 2013
Accepted: 8 April 2014
Published: 14 April 2014
Abstract
Background
Traditionally, phase I oncology trials are designed to determine the maximum tolerated dose (MTD) of a new treatment, defined as the highest dose with an acceptable probability of dose-limiting toxicity (DLT), via a dose escalation study. An alternate approach is to jointly model toxicity and efficacy and allow dose escalation to depend on a pre-specified efficacy/toxicity tradeoff in a phase I-II design. Several phase I-II trial designs have been discussed in the literature; while these model-based designs are attractive in their performance, they are potentially vulnerable to model misspecification.
Methods
Phase I-II designs often rely on copula models to specify the joint distribution of toxicity and efficacy, which include an additional correlation parameter that can be difficult to estimate. We compare and contrast three models for the joint probability of toxicity and efficacy, including two copula models that have been proposed for use in phase I-II clinical trials and a simple model that assumes the two outcomes are independent. We evaluate the performance of the various models through simulation both when the models are correct and under model misspecification.
Results
Both copula models exhibited similar performance, as measured by the probability of correctly identifying the optimal dose and the number of subjects treated at the optimal dose, regardless of whether the data were generated from the correct or incorrect copula, even when there was substantial correlation between the two outcomes. Similar results were observed for a simple model that assumes independence, even in the presence of strong correlation. Further simulation results indicate that estimating the correlation parameter in copula models is difficult with the sample sizes used in phase I-II clinical trials.
Conclusions
Our simulation results indicate that the operating characteristics of phase I-II clinical trials are robust to misspecification of the copula model but that a simple model that assumes independence performs just as well due to difficulty in estimating the copula model correlation parameters from binary data.
Background
Phase I oncology trials are primarily concerned with establishing the safety profile of a new treatment by determining the maximum tolerated dose (MTD), defined as the highest dose with a probability of toxicity less than a pre-specified target toxicity rate. Dose escalation studies can be categorized as either rule-based, such as the traditional 3 + 3 [1], or model-based, such as the continual reassessment method (CRM) [2]. Standard rule-based designs are attractive for their simplicity of implementation; however, they are less desirable in their performance and efficiency, since the selection probability for the true MTD can be poor and dose assignment is based on information from the current dose level only. The standard CRM uses a simple parametric model, such as a one-parameter power model or two-parameter logistic regression model, to characterize the relationship between dose level and the probability of experiencing a dose-limiting toxicity (DLT). This method assumes a monotonic dose-response relationship between dose level and toxicity and has a variety of proposed modifications to better ensure patient safety [3].
Standard phase I designs assume that both the probabilities of toxicity and efficacy of a new drug increase as dose level increases. Nevertheless, in some instances an increase in dose level may result in a substantial increase in toxicity but only a small increase in efficacy. Thus, an alternate approach is to consider the tradeoff between toxicity and efficacy during dose escalation. To this end, several phase I-II study designs that jointly model toxicity and efficacy have been discussed in the literature [4–6]. As with the CRM, these methods assume a parametric model for both the dose-toxicity and dose-efficacy relationship, which may also include a quadratic term for efficacy to allow for model flexibility. In addition, these methods require that we specify a joint probability model for efficacy and toxicity, which is often accomplished using a copula model.
Copula models provide a flexible framework for specifying the joint distribution of two random variables [7]. In a copula model, a joint distribution is specified on the unit square and a joint distribution for any two random variables can be derived using an inverse transformation. In the context of a phase I-II clinical trial, we specify parametric models for the dose-response relationship for toxicity and efficacy and a copula model is used to specify a joint model for efficacy and toxicity.
Model-based designs tend to surpass rule-based designs in their ability to correctly identify the MTD and in the number of patients treated at the MTD [8], but are potentially vulnerable to model misspecification. This problem is exacerbated by the presence of a copula model in efficacy/toxicity tradeoff designs. Copula models impose a rigid structure on the relationship between toxicity and efficacy, which may not accurately reflect the underlying data generating process. An incorrectly specified model could potentially lead to a decreased probability of correctly identifying the MTD and decreased number of patients treated at the MTD.
In this manuscript, we use simulation to investigate the impact of model misspecification in phase I-II clinical trials. We consider five scenarios for the true probabilities of toxicity and efficacy. Data are simulated assuming one of two copula models and fit using the correct copula, an incorrect copula, and a model that assumes independence. Data are also simulated assuming differing degrees of positive correlation between toxicity and efficacy. Our results show that the two models are relatively robust to model misspecification but that the independence model actually performs better in many cases.
The remaining sections of this manuscript are organized as follows. In Section ‘Methods’, we introduce several joint probability models used in phase I-II clinical trials and describe the dose-finding algorithm used in our simulation study. In Section ‘Results and discussion’, we present simulation results evaluating the performance of the two copula models when correctly specified and under model misspecification. Finally, we conclude with a brief discussion in Section ‘Conclusions’.
Methods
In this section, we introduce two joint probability models used in phase I-II clinical trials. In both cases, we specify marginal models for the probabilities of toxicity and efficacy and develop a joint model using a copula model. First, we specify models for the marginal probability of toxicity and the marginal probability of efficacy.
We include a quadratic term for efficacy to allow model flexibility should the probability of efficacy level off or diminish after a certain dose level. We note that the intercept terms in (1) and (2) correspond to the log-odds of toxicity and efficacy, respectively, at the first dose level. This is useful for interpretation and prior specification purposes. We next describe two copula models used in phase I-II clinical trials for specifying a joint distribution for Y _{ T } and Y _{ E }.
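Equations (1) and (2) are not reproduced in this text. As a minimal sketch, assuming logistic marginal models in the dose index z, centered at the first dose level so that the intercepts are the log-odds at z = 1 as stated above, the marginal probabilities could be computed as follows. The function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import math


def expit(x: float) -> float:
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def pi_tox(z: int, b0: float, b1: float) -> float:
    """Assumed form of (1): logit P(Y_T = 1 | z) = b0 + b1 * (z - 1)."""
    return expit(b0 + b1 * (z - 1))


def pi_eff(z: int, b0: float, b1: float, b2: float) -> float:
    """Assumed form of (2): as for toxicity, plus a quadratic term in (z - 1)."""
    d = z - 1
    return expit(b0 + b1 * d + b2 * d * d)


# Hypothetical coefficients: a negative quadratic term lets the
# probability of efficacy level off and then decline at higher doses.
doses = [1, 2, 3, 4]
tox = [pi_tox(z, -3.0, 1.0) for z in doses]
eff = [pi_eff(z, -1.0, 1.5, -0.4) for z in doses]
```

With a positive slope the toxicity curve is increasing in z, while the quadratic term allows the efficacy curve to be non-monotone, which is the flexibility described above.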
Braun copula
An analogous conditional probability of Y _{ T }|Y _{ E } can also be derived.
This is a key point that must be considered during dose finding.
Gumbel copula
The conditional probability of Y _{ T } given Y _{ E } can be expressed in an analogous fashion.
An advantage of this model is that both π _{ E } and π _{ T } retain their original interpretations as the marginal probabilities of efficacy and toxicity, respectively. This can be easily seen by summing P(Y _{ E }=1,Y _{ T }=1) and P(Y _{ E }=1,Y _{ T }=0) from (4). Unlike the Braun copula, the correlation parameter for the Gumbel copula, ψ _{2}, does not have a straightforward interpretation.
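Equation (4) is not reproduced in this text. The sketch below assumes a Farlie-Gumbel-Morgenstern form in which ψ _{2} enters linearly and lies in [−1, 1]; this is an assumption that is consistent with the range of values used for ψ _{2} elsewhere in the paper, and the exact parameterization used by the authors may differ. It illustrates the summation argument above: the two cells with Y _{ E }=1 add up to π _{ E }.

```python
def gumbel_joint(pi_e: float, pi_t: float, psi2: float) -> dict:
    """Cell probabilities P(Y_E = a, Y_T = b) under the assumed FGM form."""
    cells = {}
    for a in (0, 1):
        for b in (0, 1):
            # Product of the two marginal Bernoulli probabilities
            indep = (pi_e if a else 1 - pi_e) * (pi_t if b else 1 - pi_t)
            # Association term: added on concordant cells, subtracted on discordant ones
            adjust = (-1) ** (a + b) * pi_e * (1 - pi_e) * pi_t * (1 - pi_t) * psi2
            cells[(a, b)] = indep + adjust
    return cells


cells = gumbel_joint(pi_e=0.71, pi_t=0.27, psi2=0.8)
marginal_e = cells[(1, 1)] + cells[(1, 0)]  # the association terms cancel
marginal_t = cells[(1, 1)] + cells[(0, 1)]
total = sum(cells.values())
```

Setting `psi2=0` in this sketch reduces every cell to the product of the marginals.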
Independent model
which is of course what we get by setting ψ _{1}=0.5 and ψ _{2}=0 in the Braun and Gumbel copulas, respectively. While it is unlikely that this model accurately reflects the true association between Y _{ T } and Y _{ E }, this model may still be useful because the sample size in phase I-II oncology trials is limited and we may lack the sample size to precisely estimate ψ _{1} and ψ _{2}. If the likelihood contains very little information about these parameters, it may be that we do not lose much with respect to our ability to identify the optimal dose by assuming independence instead of a more complicated model.
Likelihood and priors
where π(Y _{ T },Y _{ E }|z) is defined using either (3), (4) or (5) and $\overrightarrow{\beta}=({\beta}_{0,T},{\beta}_{1,T},{\beta}_{0,E},{\beta}_{1,E},{\beta}_{2,E},{\psi}_{k})$ with k=1,2 for the two copula models and $\overrightarrow{\beta}=({\beta}_{0,T},{\beta}_{1,T},{\beta}_{0,E},{\beta}_{1,E},{\beta}_{2,E})$ for the independence model.
We must specify a prior distribution for each regression and association parameter to complete a Bayesian analysis. We specify the following normal priors for the two intercept terms and the quadratic term for efficacy: β _{0,T }∼N(−3, sd=3), β _{0,E }∼N(−1,3), and ${\beta}_{2,E}\sim N\left(0,\frac{1}{4}\right)$. The priors for β _{0,T } and β _{0,E } correspond to a prior belief of P(Y _{ T }=1|z=1)=0.05 and P(Y _{ E }=1|z=1)=0.27 but provide sufficient support over all plausible values for β _{0,T } and β _{0,E } and represent only mildly informative priors. The prior for β _{2,E } is chosen to reflect a strong belief against a quadratic relationship but allows the model flexibility should there be drastic departures from a linear relationship. Gamma priors were set for β _{1,T } and β _{1,E } with mean 1 and standard deviation 2, corresponding to a $\text{Gamma}\left(\frac{1}{4},\frac{1}{4}\right)$ distribution. Assuming Gamma priors for β _{1,T } and β _{1,E } implies that the marginal probability of toxicity will be monotonically increasing but the same is not true for the marginal probability of efficacy due to the inclusion of a quadratic term for the marginal probability of efficacy. Finally, we specify non-informative uniform priors for the association parameters: Uniform(0,1) for ψ _{1} and Uniform(−1,1) for ψ _{2}.
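The prior values quoted above can be checked with a few lines of arithmetic. The sketch below evaluates the inverse logit at the two intercept prior means and converts a Gamma mean and standard deviation into shape and rate parameters; the shape/rate parameterization and the helper names are assumptions of this sketch.

```python
import math


def expit(x: float) -> float:
    """Inverse logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def gamma_shape_rate(mean: float, sd: float) -> tuple:
    """Shape and rate of a Gamma distribution with the given mean and sd.

    Uses mean = shape / rate and variance = shape / rate**2.
    """
    rate = mean / sd ** 2
    shape = mean * rate
    return shape, rate


prior_tox_dose1 = expit(-3.0)  # roughly 0.05
prior_eff_dose1 = expit(-1.0)  # roughly 0.27
shape, rate = gamma_shape_rate(mean=1.0, sd=2.0)  # both equal 1/4
```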
Dose-finding algorithm
For our simulation study, we follow the dose-finding algorithm proposed by Thall and Cook [4]. These authors identify a set of acceptable doses by defining a maximum acceptable probability of toxicity assuming 100% efficacy and a minimum acceptable probability of efficacy assuming no toxicity, and they define a desirability index to identify the optimal dose from the set of acceptable doses.
where q is defined by identifying a probability of toxicity and probability of efficacy pair, $\left({\pi}_{T}^{\ast},{\pi}_{E}^{\ast}\right)$, that is equally desirable to $\left({\overline{\pi}}_{T},1.0\right)$ and $\left(0,{\underline{\pi}}_{E}\right)$, plugging $\left({\pi}_{T}^{\ast},{\pi}_{E}^{\ast}\right)$ into (7) and solving for q when D(z) equals 0. Larger values of D(z) are considered more desirable and the optimal combination, (0.0,1.0), has D(z) equal to 1 regardless of q.
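Equation (7) is not reproduced in this text. As a minimal sketch, assume the desirability index takes the form D = 1 − [((1 − π _{ E })/(1 − π _{ E } minimum))^q + (π _{ T }/π _{ T } maximum)^q]^{1/q}, which satisfies the properties stated above: the pair (0, 1) has D = 1 for any q, and the two boundary pairs have D = 0. With the thresholds 0.5 and 0.55 and q = 2 reported for the simulation study, this assumed form reproduces the Scenario 1 values of D(z) in the table of true scenario probabilities to two decimals. The function and argument names are illustrative.

```python
def desirability(pi_t: float, pi_e: float,
                 pi_t_max: float = 0.5, pi_e_min: float = 0.55,
                 q: float = 2.0) -> float:
    """Assumed form of D(z): 1 minus a scaled L^q distance from the ideal point.

    Scaled so that (pi_t_max, 1) and (0, pi_e_min) both have desirability 0.
    """
    eff_term = ((1.0 - pi_e) / (1.0 - pi_e_min)) ** q
    tox_term = (pi_t / pi_t_max) ** q
    return 1.0 - (eff_term + tox_term) ** (1.0 / q)


# Scenario 1 probabilities from the table of true scenarios
scenario1_tox = [0.05, 0.12, 0.27, 0.50]
scenario1_eff = [0.38, 0.55, 0.71, 0.83]
scenario1_d = [round(desirability(t, e), 2)
               for t, e in zip(scenario1_tox, scenario1_eff)]
# scenario1_d is [-0.38, -0.03, 0.16, -0.07]

ideal = desirability(0.0, 1.0)           # no toxicity, certain efficacy
indifference = desirability(0.25, 0.60)  # the elicited pair; close to 0 when q = 2
```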
1. Treat the first cohort of m patients at the lowest dose level.
2. Update the posterior distributions of the probabilities of toxicity and efficacy for each dose level using data from all previous cohorts.
3. Identify the set of acceptable doses using criterion (6). If no dose is found acceptable, terminate for futility.
4. Treat the next cohort at the dose that maximizes D(z) under the restriction that dose levels may not be skipped when escalating. Return to step 2.
5. Repeat steps 2–4 until the maximum sample size is reached. The dose that maximizes D(z) at study completion is considered the optimal dose.
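Steps 3 and 4 can be sketched as a single selection function. Criterion (6) is not reproduced in this text, so the sketch assumes that a dose is acceptable when the posterior probability that its toxicity is below the maximum and the posterior probability that its efficacy is above the minimum both exceed the threshold p. The inputs (per-dose posterior probabilities and desirability values) are assumed to come from the fitted model, and the handling of the case where only unreachable doses are acceptable is a choice of this sketch, not something specified above.

```python
from typing import Optional, Sequence


def next_dose(pr_tox_ok: Sequence[float], pr_eff_ok: Sequence[float],
              desirability: Sequence[float], highest_tried: int,
              p: float = 0.05) -> Optional[int]:
    """Return the 0-based index of the next dose, or None to stop for futility.

    pr_tox_ok[d]: posterior probability that toxicity at dose d is acceptable.
    pr_eff_ok[d]: posterior probability that efficacy at dose d is acceptable.
    desirability[d]: D(z) for dose d, e.g. evaluated at posterior estimates.
    highest_tried: highest dose index given so far (no skipping when escalating).
    """
    acceptable = [d for d in range(len(desirability))
                  if pr_tox_ok[d] > p and pr_eff_ok[d] > p]
    if not acceptable:
        return None  # step 3: no acceptable dose, terminate for futility
    reachable = [d for d in acceptable if d <= highest_tried + 1]
    if not reachable:
        # Assumption of this sketch: escalate one level toward the acceptable doses
        return highest_tried + 1
    return max(reachable, key=lambda d: desirability[d])  # step 4


# Illustration with the Scenario 1 desirability values and all doses acceptable
d_values = [-0.38, -0.03, 0.16, -0.07]
all_ok = [1.0, 1.0, 1.0, 1.0]
after_first_cohort = next_dose(all_ok, all_ok, d_values, highest_tried=0)
after_dose_three = next_dose(all_ok, all_ok, d_values, highest_tried=2)
futility = next_dose([0.0] * 4, all_ok, d_values, highest_tried=3)
```

After the first cohort only dose levels 1 and 2 are reachable, so the function returns index 1 even though index 2 has the larger D(z); once dose level 3 has been tried, it is selected.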
Results and discussion
We completed a small simulation study to evaluate the performance of phase I-II clinical trials when the copula model is misspecified. Trial parameters were set as follows. We assume a cohort size of 3 patients with a maximum of 15 cohorts, for a maximum sample size of 45 patients. The maximum acceptable probability of toxicity assuming 100% efficacy and minimum acceptable probability of efficacy assuming no toxicity were set at ${\overline{\pi}}_{T}=0.5$ and ${\underline{\pi}}_{E}=0.55$, respectively. We set $\left({\pi}_{T}^{\ast},{\pi}_{E}^{\ast}\right)$ equal to (0.25,0.60), which corresponds to q=2. The pre-specified threshold, p, to determine the set of acceptable doses using the posterior probability of toxicity and efficacy is assumed to be 0.05. We consider four dose levels with dose index, z={1,2,3,4}. All simulations were completed in R version 2.15.1 [11]. Gibbs sampling was completed in JAGS as called from R using rjags [12]. We simulate 1000 trials; within each trial, 1000 iterations were kept for inference following a period of 5000 iterations for burn-in.
Scenarios
Table 1. True marginal probability of toxicity, marginal probability of efficacy and D(z) for each dose level

| Scenario | Quantity | Dose 1 | Dose 2 | Dose 3 | Dose 4 |
|---|---|---|---|---|---|
| 1 | P(Toxicity) | 0.05 | 0.12 | 0.27 | 0.50 |
| | P(Efficacy) | 0.38 | 0.55 | 0.71 | 0.83 |
| | D(z) | -0.38 | -0.03 | 0.16 | -0.07 |
| 2 | P(Toxicity) | 0.38 | 0.52 | 0.67 | 0.79 |
| | P(Efficacy) | 0.77 | 0.82 | 0.86 | 0.89 |
| | D(z) | 0.08 | -0.11 | -0.38 | -0.6 |
| 3 | P(Toxicity) | 0.02 | 0.07 | 0.15 | 0.31 |
| | P(Efficacy) | 0.12 | 0.25 | 0.45 | 0.67 |
| | D(z) | -0.96 | -0.67 | -0.26 | 0.04 |
| 4 | P(Toxicity) | 0.05 | 0.11 | 0.25 | 0.46 |
| | P(Efficacy) | 0.18 | 0.55 | 0.79 | 0.86 |
| | D(z) | -0.82 | -0.02 | 0.32 | 0.03 |
| 5 | P(Toxicity) | 0.03 | 0.08 | 0.18 | 0.38 |
| | P(Efficacy) | 0.18 | 0.25 | 0.33 | 0.43 |
| | D(z) | -0.82 | -0.67 | -0.53 | -0.48 |
For each scenario, we simulated data from both copula models and fit the data using either the correct copula model, the incorrect copula model or the independence model. The association parameters, ψ _{1} and ψ _{2}, were varied to determine the impact of the correlation of Y _{ T } and Y _{ E } on model performance under model misspecification. The Braun association parameter, ψ _{1}, in (3) ranges from 0.5 to 0.9 by increments of 0.2; this corresponds to an odds ratio between toxicity and efficacy ranging from 1 to 9. The Gumbel association parameter, ψ _{2} in (4) ranges from 0 to 0.8 by increments of 0.4. Recall that efficacy and toxicity are independent if ψ _{1} equals 0.5 for the Braun model and ψ _{2} equals 0 for the Gumbel model. In these cases, the independence model is the correct model and the Braun and Gumbel models are unnecessarily trying to estimate a correlation parameter when the two endpoints are actually independent.
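The stated correspondence between ψ _{1} and the odds ratio can be illustrated numerically. Equation (3) is not reproduced in this text, so the sketch below assumes a Braun-type joint model in which the product of the two Bernoulli terms is multiplied by ψ _{1} when both outcomes occur and by 1 − ψ _{1} otherwise, and is then normalized. Under this assumed form the odds ratio equals ψ _{1}/(1 − ψ _{1}), which gives 1 at ψ _{1}=0.5 and 9 at ψ _{1}=0.9, matching the range quoted above. The function names and the values of π _{ E } and π _{ T } are illustrative.

```python
def braun_joint(pi_e: float, pi_t: float, psi1: float) -> dict:
    """Normalized cell probabilities P(Y_E = a, Y_T = b) under the assumed form."""
    weights = {}
    for a in (0, 1):
        for b in (0, 1):
            base = (pi_e if a else 1 - pi_e) * (pi_t if b else 1 - pi_t)
            assoc = psi1 if a * b == 1 else 1 - psi1
            weights[(a, b)] = base * assoc
    k = sum(weights.values())  # normalizing constant
    return {cell: w / k for cell, w in weights.items()}


def odds_ratio(cells: dict) -> float:
    """Odds ratio between efficacy and toxicity from the four cell probabilities."""
    return cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])


odds = {psi: odds_ratio(braun_joint(0.71, 0.27, psi)) for psi in (0.5, 0.7, 0.9)}

# Under this form the marginal P(Y_E = 1) equals pi_e only when psi1 = 0.5
joint_09 = braun_joint(0.71, 0.27, 0.9)
marginal_e_09 = joint_09[(1, 1)] + joint_09[(1, 0)]
```

In this sketch the marginal probability of efficacy at ψ _{1}=0.9 is noticeably larger than the input value of 0.71, which illustrates why π _{ E } and π _{ T } cannot be read as marginal probabilities in the Braun model when the outcomes are correlated.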
Results
Table 2. Results using the Braun and Gumbel copulas for data simulation under Scenario 1. For each fitted model, the first row gives the proportion of simulated trials terminating for futility or selecting each dose, and the second row gives the average number of patients treated at each dose.

| Data | ψ _{ k } | Model | Futility | Dose 1 | Dose 2 | Dose 3 | Dose 4 |
|---|---|---|---|---|---|---|---|
| Braun | 0.5 | Braun | 0.039 | 0.045 | 0.212 | 0.475 | 0.229 |
| | | | | 5.91 | 12.792 | 17.13 | 8.298 |
| | | Gumbel | 0.038 | 0.039 | 0.225 | 0.482 | 0.216 |
| | | | | 6.075 | 12.738 | 16.596 | 8.775 |
| | | Indep | 0.037 | 0.046 | 0.223 | 0.504 | 0.19 |
| | | | | 6 | 13.146 | 16.923 | 8.115 |
| | 0.7 | Braun | 0.037 | 0.034 | 0.197 | 0.514 | 0.218 |
| | | | | 5.667 | 12.966 | 17.496 | 7.86 |
| | | Gumbel | 0.023 | 0.032 | 0.235 | 0.524 | 0.186 |
| | | | | 5.613 | 13.32 | 17.562 | 7.941 |
| | | Indep | 0.033 | 0.035 | 0.219 | 0.528 | 0.185 |
| | | | | 5.796 | 13.356 | 17.334 | 7.77 |
| | 0.9 | Braun | 0.024 | 0.018 | 0.208 | 0.514 | 0.236 |
| | | | | 5.19 | 12.888 | 17.58 | 8.814 |
| | | Gumbel | 0.011 | 0.037 | 0.225 | 0.538 | 0.189 |
| | | | | 5.901 | 13.428 | 17.895 | 7.542 |
| | | Indep | 0.006 | 0.041 | 0.226 | 0.526 | 0.201 |
| | | | | 5.994 | 13.605 | 17.349 | 7.95 |
| Gumbel | 0 | Braun | 0.042 | 0.033 | 0.217 | 0.503 | 0.205 |
| | | | | 6.054 | 12.786 | 17.007 | 8.373 |
| | | Gumbel | 0.042 | 0.041 | 0.21 | 0.499 | 0.208 |
| | | | | 6.033 | 12.651 | 17.094 | 8.28 |
| | | Indep | 0.034 | 0.054 | 0.208 | 0.478 | 0.226 |
| | | | | 6.591 | 12.447 | 16.671 | 8.622 |
| | 0.4 | Braun | 0.036 | 0.034 | 0.216 | 0.506 | 0.208 |
| | | | | 5.871 | 12.828 | 17.157 | 8.322 |
| | | Gumbel | 0.029 | 0.039 | 0.222 | 0.461 | 0.249 |
| | | | | 5.775 | 13.287 | 16.029 | 9.081 |
| | | Indep | 0.018 | 0.044 | 0.213 | 0.501 | 0.224 |
| | | | | 5.946 | 12.81 | 17.205 | 8.67 |
| | 0.8 | Braun | 0.036 | 0.028 | 0.237 | 0.475 | 0.224 |
| | | | | 5.886 | 13.875 | 16.119 | 8.307 |
| | | Gumbel | 0.016 | 0.036 | 0.231 | 0.506 | 0.211 |
| | | | | 5.877 | 13.182 | 17.34 | 8.22 |
| | | Indep | 0.032 | 0.032 | 0.216 | 0.524 | 0.196 |
| | | | | 5.388 | 13.242 | 17.616 | 8.079 |
Table 3. Results using the Braun and Gumbel copulas for data simulation under Scenario 2. For each fitted model, the first row gives the proportion of simulated trials terminating for futility or selecting each dose, and the second row gives the average number of patients treated at each dose.

| Data | ψ _{ k } | Model | Futility | Dose 1 | Dose 2 | Dose 3 | Dose 4 |
|---|---|---|---|---|---|---|---|
| Braun | 0.5 | Braun | 0.147 | 0.748 | 0.094 | 0.011 | 0 |
| | | | | 33.159 | 5.922 | 0.798 | 0.042 |
| | | Gumbel | 0.13 | 0.732 | 0.121 | 0.016 | 0.001 |
| | | | | 32.652 | 6.984 | 1.161 | 0.084 |
| | | Indep | 0.129 | 0.770 | 0.093 | 0.008 | 0 |
| | | | | 34.542 | 5.826 | 0.825 | 0.063 |
| | 0.7 | Braun | 0.16 | 0.744 | 0.091 | 0.005 | 0 |
| | | | | 32.919 | 6.315 | 0.645 | 0.039 |
| | | Gumbel | 0.106 | 0.761 | 0.117 | 0.016 | 0 |
| | | | | 34.059 | 6.762 | 0.81 | 0.09 |
| | | Indep | 0.098 | 0.781 | 0.109 | 0.012 | 0 |
| | | | | 34.671 | 6.516 | 0.705 | 0.09 |
| | 0.9 | Braun | 0.204 | 0.707 | 0.086 | 0.003 | 0 |
| | | | | 31.878 | 6.18 | 0.396 | 0.018 |
| | | Gumbel | 0.071 | 0.799 | 0.114 | 0.015 | 0.001 |
| | | | | 34.977 | 7.2 | 0.63 | 0.063 |
| | | Indep | 0.095 | 0.773 | 0.122 | 0.009 | 0.001 |
| | | | | 34.329 | 6.789 | 0.795 | 0.078 |
| Gumbel | 0 | Braun | 0.14 | 0.722 | 0.126 | 0.012 | 0 |
| | | | | 32.889 | 6.69 | 0.756 | 0.051 |
| | | Gumbel | 0.124 | 0.745 | 0.122 | 0.008 | 0.001 |
| | | | | 33.714 | 6.564 | 0.786 | 0.069 |
| | | Indep | 0.13 | 0.751 | 0.111 | 0.007 | 0.001 |
| | | | | 33.585 | 6.465 | 0.738 | 0.099 |
| | 0.4 | Braun | 0.159 | 0.741 | 0.09 | 0.009 | 0.001 |
| | | | | 33.39 | 5.886 | 0.96 | 0.066 |
| | | Gumbel | 0.127 | 0.755 | 0.1 | 0.016 | 0.002 |
| | | | | 33.882 | 6.234 | 1.02 | 0.09 |
| | | Indep | 0.118 | 0.758 | 0.113 | 0.011 | 0 |
| | | | | 33.36 | 7.059 | 0.927 | 0.036 |
| | 0.8 | Braun | 0.16 | 0.742 | 0.094 | 0.004 | 0 |
| | | | | 32.991 | 6.048 | 0.534 | 0.06 |
| | | Gumbel | 0.081 | 0.799 | 0.104 | 0.014 | 0.002 |
| | | | | 35.247 | 6.657 | 0.84 | 0.048 |
| | | Indep | 0.113 | 0.76 | 0.118 | 0.009 | 0 |
| | | | | 34.068 | 6.528 | 0.801 | 0.081 |
Table 4. Results using the Braun and Gumbel copulas for data simulation under Scenario 3. For each fitted model, the first row gives the proportion of simulated trials terminating for futility or selecting each dose, and the second row gives the average number of patients treated at each dose.

| Data | ψ _{ k } | Model | Futility | Dose 1 | Dose 2 | Dose 3 | Dose 4 |
|---|---|---|---|---|---|---|---|
| Braun | 0 | Braun | 0.195 | 0.001 | 0.004 | 0.045 | 0.755 |
| | | | | 3.426 | 3.636 | 5.358 | 27.504 |
| | | Gumbel | 0.174 | 0.001 | 0.005 | 0.042 | 0.778 |
| | | | | 3.459 | 3.744 | 5.418 | 27.855 |
| | | Indep | 0.149 | 0.001 | 0.003 | 0.035 | 0.812 |
| | | | | 3.384 | 3.69 | 4.938 | 29.169 |
| | 0.4 | Braun | 0.162 | 0.001 | 0.005 | 0.025 | 0.807 |
| | | | | 3.297 | 3.762 | 4.884 | 28.719 |
| | | Gumbel | 0.139 | 0 | 0.003 | 0.042 | 0.816 |
| | | | | 3.417 | 3.693 | 5.487 | 29.025 |
| | | Indep | 0.13 | 0 | 0.003 | 0.049 | 0.818 |
| | | | | 3.423 | 3.705 | 5.547 | 28.857 |
| | 0.8 | Braun | 0.143 | 0 | 0.002 | 0.033 | 0.822 |
| | | | | 3.351 | 3.552 | 5.502 | 29.19 |
| | | Gumbel | 0.101 | 0 | 0.002 | 0.037 | 0.86 |
| | | | | 3.387 | 3.636 | 5.568 | 29.775 |
| | | Indep | 0.093 | 0 | 0.002 | 0.04 | 0.865 |
| | | | | 3.381 | 3.804 | 5.628 | 29.592 |
| Gumbel | 0 | Braun | 0.189 | 0.001 | 0.004 | 0.047 | 0.759 |
| | | | | 3.327 | 3.585 | 5.31 | 27.942 |
| | | Gumbel | 0.173 | 0 | 0.004 | 0.044 | 0.779 |
| | | | | 3.336 | 3.675 | 5.313 | 28.224 |
| | | Indep | 0.16 | 0.001 | 0 | 0.034 | 0.805 |
| | | | | 3.435 | 3.789 | 5.172 | 28.434 |
| | 0.4 | Braun | 0.16 | 0 | 0.001 | 0.048 | 0.791 |
| | | | | 3.45 | 3.63 | 5.61 | 28.101 |
| | | Gumbel | 0.163 | 0 | 0.004 | 0.049 | 0.784 |
| | | | | 3.363 | 3.666 | 5.49 | 28.164 |
| | | Indep | 0.156 | 0 | 0.001 | 0.034 | 0.809 |
| | | | | 3.405 | 3.591 | 4.977 | 28.875 |
| | 0.8 | Braun | 0.171 | 0.001 | 0.001 | 0.029 | 0.798 |
| | | | | 3.342 | 3.645 | 5.25 | 28.116 |
| | | Gumbel | 0.156 | 0.001 | 0.004 | 0.029 | 0.81 |
| | | | | 3.384 | 3.702 | 5.262 | 28.776 |
| | | Indep | 0.144 | 0 | 0.002 | 0.042 | 0.812 |
| | | | | 3.342 | 3.804 | 5.484 | 28.824 |
Table 5. Results using the Braun and Gumbel copulas for data simulation under Scenario 4. For each fitted model, the first row gives the proportion of simulated trials terminating for futility or selecting each dose, and the second row gives the average number of patients treated at each dose.

| Data | ψ _{ k } | Model | Futility | Dose 1 | Dose 2 | Dose 3 | Dose 4 |
|---|---|---|---|---|---|---|---|
| Braun | 0.5 | Braun | 0.023 | 0.003 | 0.057 | 0.665 | 0.252 |
| | | | | 3.42 | 6.39 | 23.19 | 11.268 |
| | | Gumbel | 0.015 | 0.001 | 0.055 | 0.683 | 0.246 |
| | | | | 3.402 | 7.041 | 23.358 | 10.809 |
| | | Indep | 0.017 | 0.003 | 0.078 | 0.621 | 0.281 |
| | | | | 3.351 | 7.071 | 22.74 | 11.304 |
| | 0.7 | Braun | 0.02 | 0.001 | 0.057 | 0.684 | 0.238 |
| | | | | 3.333 | 6.849 | 24.066 | 10.146 |
| | | Gumbel | 0.019 | 0.003 | 0.069 | 0.671 | 0.238 |
| | | | | 3.39 | 7.107 | 23.286 | 10.68 |
| | | Indep | 0.01 | 0.001 | 0.085 | 0.655 | 0.249 |
| | | | | 3.39 | 7.593 | 22.854 | 10.875 |
| | 0.9 | Braun | 0.017 | 0.001 | 0.062 | 0.658 | 0.262 |
| | | | | 3.303 | 6.996 | 23.376 | 10.839 |
| | | Gumbel | 0.002 | 0 | 0.078 | 0.698 | 0.222 |
| | | | | 3.315 | 7.476 | 23.46 | 10.695 |
| | | Indep | 0.007 | 0.001 | 0.081 | 0.647 | 0.264 |
| | | | | 3.543 | 7.743 | 22.329 | 11.172 |
| Gumbel | 0 | Braun | 0.027 | 0.002 | 0.064 | 0.664 | 0.243 |
| | | | | 3.282 | 6.885 | 23.109 | 11.034 |
| | | Gumbel | 0.027 | 0.001 | 0.065 | 0.657 | 0.25 |
| | | | | 3.351 | 6.711 | 22.95 | 11.136 |
| | | Indep | 0.025 | 0.002 | 0.084 | 0.633 | 0.256 |
| | | | | 3.375 | 7.374 | 22.767 | 10.788 |
| | 0.4 | Braun | 0.025 | 0.002 | 0.066 | 0.631 | 0.276 |
| | | | | 3.504 | 7.338 | 22.515 | 11.013 |
| | | Gumbel | 0.021 | 0.002 | 0.074 | 0.655 | 0.248 |
| | | | | 3.45 | 7.296 | 22.599 | 11.013 |
| | | Indep | 0.013 | 0.001 | 0.069 | 0.649 | 0.268 |
| | | | | 3.315 | 7.032 | 22.542 | 11.787 |
| | 0.8 | Braun | 0.012 | 0.002 | 0.063 | 0.65 | 0.273 |
| | | | | 3.318 | 7.152 | 22.647 | 11.502 |
| | | Gumbel | 0.019 | 0.001 | 0.075 | 0.669 | 0.236 |
| | | | | 3.297 | 7.038 | 23.154 | 10.923 |
| | | Indep | 0.013 | 0.004 | 0.075 | 0.673 | 0.235 |
| | | | | 3.33 | 7.389 | 23.172 | 10.755 |
Table 6. Results using the Braun and Gumbel copulas for data simulation under Scenario 5. For each fitted model, the first row gives the proportion of simulated trials terminating for futility or selecting each dose, and the second row gives the average number of patients treated at each dose.

| Data | ψ _{ k } | Model | Futility | Dose 1 | Dose 2 | Dose 3 | Dose 4 |
|---|---|---|---|---|---|---|---|
| Braun | 0.5 | Braun | 0.87 | 0.001 | 0.005 | 0.023 | 0.101 |
| | | | | 4.086 | 4.59 | 4.62 | 11.946 |
| | | Gumbel | 0.872 | 0 | 0.006 | 0.022 | 0.1 |
| | | | | 4.05 | 4.44 | 4.896 | 11.316 |
| | | Indep | 0.872 | 0.002 | 0.009 | 0.016 | 0.101 |
| | | | | 4.176 | 4.656 | 4.803 | 11.319 |
| | 0.7 | Braun | 0.928 | 0 | 0.004 | 0.009 | 0.059 |
| | | | | 4.065 | 4.209 | 4.56 | 10.095 |
| | | Gumbel | 0.887 | 0.002 | 0.014 | 0.014 | 0.083 |
| | | | | 3.936 | 4.485 | 4.869 | 11.748 |
| | | Indep | 0.895 | 0.002 | 0.01 | 0.016 | 0.077 |
| | | | | 4.254 | 4.467 | 4.596 | 11.256 |
| | 0.9 | Braun | 0.945 | 0.001 | 0.004 | 0.009 | 0.041 |
| | | | | 4.053 | 4.29 | 4.566 | 9.531 |
| | | Gumbel | 0.904 | 0.003 | 0.006 | 0.016 | 0.071 |
| | | | | 4.248 | 4.701 | 5.148 | 11.091 |
| | | Indep | 0.905 | 0 | 0.006 | 0.016 | 0.073 |
| | | | | 4.227 | 4.608 | 5.196 | 11.22 |
| Gumbel | 0 | Braun | 0.895 | 0.001 | 0.011 | 0.01 | 0.083 |
| | | | | 4.035 | 4.41 | 4.674 | 11.196 |
| | | Gumbel | 0.878 | 0.003 | 0.008 | 0.015 | 0.096 |
| | | | | 3.999 | 4.293 | 4.692 | 11.391 |
| | | Indep | 0.897 | 0.001 | 0.007 | 0.014 | 0.081 |
| | | | | 4.197 | 4.494 | 4.659 | 11.886 |
| | 0.4 | Braun | 0.898 | 0.002 | 0.01 | 0.013 | 0.077 |
| | | | | 3.966 | 4.254 | 4.368 | 11.061 |
| | | Gumbel | 0.901 | 0.001 | 0.004 | 0.021 | 0.073 |
| | | | | 4.179 | 4.452 | 4.89 | 11.112 |
| | | Indep | 0.892 | 0.004 | 0.006 | 0.018 | 0.08 |
| | | | | 4.065 | 4.365 | 4.644 | 11.559 |
| | 0.8 | Braun | 0.913 | 0.003 | 0.005 | 0.013 | 0.066 |
| | | | | 4.002 | 4.536 | 4.623 | 10.794 |
| | | Gumbel | 0.925 | 0.002 | 0.005 | 0.013 | 0.055 |
| | | | | 4.152 | 4.488 | 4.632 | 11.028 |
| | | Indep | 0.897 | 0.003 | 0.006 | 0.012 | 0.082 |
| | | | | 4.179 | 4.611 | 4.452 | 11.286 |
Table 2 displays results for Scenario 1. Concentrating first on the results when data are generated from the Braun model, we see that both copula models perform similarly well in their ability to select the correct dose and in the number of patients treated at the optimal dose, regardless of the true correlation. The probability of correctly identifying the optimal dose differs by less than 0.024 and the number of patients treated at the optimal dose differs by less than 1 across the three correlation conditions. Surprisingly, the independence model also performs very well regardless of the true correlation and has the highest probability of identifying the optimal dose in two of the three correlation conditions. However, the differences are small, and the independence model has essentially the same performance as the two copula models.
Similar results are observed when data are generated from the Gumbel model. Both copula models performed similarly with respect to the probability of accurately identifying the optimal dose and the number of patients treated at the optimal dose. The independence model again exhibits good performance across the three correlation scenarios, which is surprising because the independence model lacks the flexibility to model the correlation between the two outcomes. Across the three correlation scenarios, the probability of correctly identifying the optimal dose differed by less than 0.05 and the average number of patients treated at the optimal dose differed by less than 1.5 among the three models.
Results for Scenarios 2 and 3 are found in Tables 3 and 4, respectively. The results for Scenarios 2 and 3 are similar to the results for Scenario 1. The probability of correctly identifying the optimal dose and the average number of patients treated at the optimal dose are similar for both copula models regardless of how the data are generated and the independence model provides similar performance even though the independence model is unable to appropriately model the correlation between Y _{ T } and Y _{ E }.
Scenario 4 (Table 5) represents the scenario where multiple dose levels are acceptable (dose levels 3 and 4) but dose level 3 is optimal. This scenario represents one of the primary motivations for phase I-II designs, as there is a dose level beyond which further escalation results in a greater probability of toxicity but relatively little efficacy benefit. The results for Scenario 4 are consistent with our previous results: there is little difference between the three models in the probability of correctly identifying the optimal dose and the average number of patients treated at the optimal dose, regardless of the correlation between endpoints and how the data are generated. Finally, the results for Scenario 5 can be found in Table 6. In this scenario, all dose levels are safe but have unacceptable efficacy and the correct decision is to terminate for futility. The Gumbel and independence models exhibit similar performance across all scenarios, but we do observe that the Braun model is more likely to terminate for futility when Y _{ T } and Y _{ E } are correlated and the data are generated from the Braun model. However, the differences are small in both cases.
Table 7. Average posterior mean (standard deviation) for the correlation parameters, ψ _{ k }, k = 1,2, when 11 subjects are treated at each dose level

| Model | True value | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
|---|---|---|---|---|---|---|
| Braun | ψ _{1}=0.5 | 0.535 (0.023) | 0.546 (0.021) | 0.531 (0.024) | 0.546 (0.024) | 0.495 (0.025) |
| | ψ _{1}=0.7 | 0.659 (0.020) | 0.668 (0.018) | 0.656 (0.021) | 0.660 (0.019) | 0.643 (0.021) |
| | ψ _{1}=0.9 | 0.784 (0.010) | 0.824 (0.008) | 0.792 (0.012) | 0.777 (0.010) | 0.811 (0.010) |
| Gumbel | ψ _{2}=0.0 | -0.001 (0.075) | -0.003 (0.081) | 0.005 (0.067) | -0.005 (0.067) | -0.002 (0.072) |
| | ψ _{2}=0.4 | 0.115 (0.073) | 0.105 (0.076) | 0.119 (0.063) | 0.091 (0.06) | 0.118 (0.072) |
| | ψ _{2}=0.8 | 0.225 (0.062) | 0.244 (0.065) | 0.194 (0.06) | 0.212 (0.057) | 0.219 (0.061) |
Conclusions
We completed a simulation study to evaluate the performance of copula models in phase I-II clinical trials under model misspecification. Our results suggest that the operating characteristics of our study are relatively robust to misspecifying the copula model. Both copula models exhibited similar performance, as measured by the probability of correctly identifying the optimal dose and the number of subjects treated at the optimal dose, regardless of whether the data were generated from the correct or incorrect copula, even when there was substantial correlation between Y _{ T } and Y _{ E }. These results were robust to changes in the maximum sample size and the prior distributions for the parameters of the logistic regression models for toxicity and efficacy. In comparing the two models, there was little difference in the operating characteristics, although the straightforward interpretation of the model parameters in the Gumbel model may make the Gumbel model more desirable.
Surprisingly, the naive model that ignores the correlation between Y _{ E } and Y _{ T } performed as well as, and in some cases better than, even the correct model with respect to correctly identifying the optimal dose and the number of subjects treated at the optimal dose. This was true regardless of the scenario and true correlation between Y _{ E } and Y _{ T }. This result is not intuitive, as we would expect that correctly specifying the copula model would result in more efficient parameter estimates and improved operating characteristics of our study.
There are several possible explanations for the lack of benefit when utilizing the correct copula model in phase I-II clinical trials. First, it is possible that the likelihood contains very little information about the correlation parameter and any benefit of modeling the correlation is negated by the need to estimate an additional correlation parameter. In this case, fitting a copula model may result in more variable estimates, in general, which would result in performance that is no better, and potentially worse, than simply assuming that the two endpoints are independent. A second explanation is that phase I-II clinical trials do not provide sufficient information for selecting the correct copula model. Phase I-II clinical trials utilize small sample sizes, which makes it difficult to properly evaluate the fit of a model. Furthermore, regulatory bodies typically require that a model is specified in advance when utilizing an adaptive trial design. These challenges make it difficult to identify the correct copula from the data, which may negate any benefit from modeling the correlation between the toxicity and efficacy endpoints. Finally, properly modeling the correlation between two endpoints is necessary to complete proper inference (hypothesis tests, credible intervals, etc.) but it may be that modeling this correlation is not necessary in a phase I-II clinical trial where the goal is to select a dose at study completion regardless of the error associated with estimates of the probability of efficacy and toxicity. In this case, we would not expect any benefit from modeling the correlation, which is consistent with our simulation results.
We completed a second simulation study to investigate the model’s ability to estimate the correlation parameters with the sample sizes used in phase I-II clinical trials in order to fully understand the behavior of copula models in phase I-II clinical trials. Estimates of the correlation parameters were biased towards the prior mean of no correlation in both cases but the average posterior mean of the correlation parameter in the Braun model was much closer to the true value of the correlation parameter than in the Gumbel model. This suggests that, while the likelihood for the Braun model contains a fair amount of information for the correlation parameter, the likelihood for the Gumbel model contains very little information about the correlation parameter and provides a potential explanation for the apparent lack of benefit due to properly modeling the correlation between efficacy and toxicity in phase I-II clinical trials.
The results of this manuscript depend on the decision rule proposed by Thall and Cook [4]. Other decision rules have been proposed for phase I-II clinical trials [5, 13], and it is possible that the results of our simulation study would change with a different decision rule. We think that this is unlikely, given that we consistently found no benefit of appropriately modeling the correlation between toxicity and efficacy in all scenarios, and additional simulation results illustrated that the likelihood contains little information for estimating the correlation parameters in the two copulas we considered at sample sizes typical of phase I-II clinical trials. Nevertheless, this issue should be considered when evaluating the results of our simulation study.
Our results do not indicate a preference for one copula model over the other. Both models performed similarly, regardless of how the data were generated, although the performance of the Braun model suffered more than that of the Gumbel model when vague priors were placed on the slope parameters in the logistic regression models for efficacy and toxicity. The other primary difference between the two models is the interpretation of the model parameters. In the Gumbel model, π_{E} and π_{T} represent the marginal probabilities of efficacy and toxicity, respectively, whereas in the Braun model they are conditional probabilities that depend on the correlation parameter. This property could make the Gumbel model preferable, given the similar performance of the two models. That said, our results indicate that it would be acceptable for a practitioner to simply fit the model that assumes independence, even though the two outcomes are likely correlated. The performance of the two copula models could possibly be improved by using informative priors for the correlation parameters, but strongly informative priors would be required to overcome the apparent lack of information in the likelihood, and it is unlikely that such prior information exists in early phase clinical trials. In this case, fitting a model that assumes independence is preferable.
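The difference in parameter interpretation can be checked numerically. The following sketch rests on two assumptions about model form that are not restated in this section: the Gumbel model is taken in the form used by Thall and Cook [4], and the Braun model is assumed to have joint probability proportional to π_E^a (1 − π_E)^(1−a) π_T^b (1 − π_T)^(1−b) ψ^(ab) (1 − ψ)^(1−ab), with ψ in (0, 1) and ψ = 0.5 corresponding to independence, following Braun [5]. All function names and numerical values are hypothetical.

```python
import math

def gumbel_joint(pi_e, pi_t, psi):
    """Assumed Gumbel form: P(E=a, T=b) with association parameter psi (real-valued)."""
    assoc = (pi_e * (1 - pi_e) * pi_t * (1 - pi_t)
             * (math.exp(psi) - 1) / (math.exp(psi) + 1))
    return {
        (a, b): (pi_e if a else 1 - pi_e) * (pi_t if b else 1 - pi_t)
        + (-1) ** (a + b) * assoc
        for a in (0, 1) for b in (0, 1)
    }

def braun_joint(pi_e, pi_t, psi):
    """Assumed Braun form: unnormalized weights rescaled to sum to one, psi in (0, 1)."""
    weights = {
        (a, b): (pi_e if a else 1 - pi_e) * (pi_t if b else 1 - pi_t)
        * (psi if a * b else 1 - psi)
        for a in (0, 1) for b in (0, 1)
    }
    total = sum(weights.values())
    return {cell: w / total for cell, w in weights.items()}

def marginal_efficacy(probs):
    """P(E=1), summing over both toxicity outcomes."""
    return probs[(1, 0)] + probs[(1, 1)]

def efficacy_given_no_toxicity(probs):
    """P(E=1 | T=0)."""
    return probs[(1, 0)] / (probs[(0, 0)] + probs[(1, 0)])

PI_E, PI_T = 0.5, 0.3  # illustrative values only
gumbel = gumbel_joint(PI_E, PI_T, 2.0)
braun = braun_joint(PI_E, PI_T, 0.8)

print("Gumbel marginal P(E=1):", marginal_efficacy(gumbel))
print("Braun marginal P(E=1):", marginal_efficacy(braun))
print("Braun P(E=1 | T=0):", efficacy_given_no_toxicity(braun))
```

Under these assumed forms, the Gumbel marginal probability of efficacy stays at π_E whatever the value of ψ, because the association term cancels when summing over toxicity. In the Braun form the marginal moves away from π_E once ψ differs from 0.5, while the probability of efficacy given no toxicity remains equal to π_E.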
Declarations
Acknowledgements
This research was partially supported by a research grant from Medtronic, Inc. and a grant-in-aid of research, artistry and scholarship from the University of Minnesota. The authors would also like to thank the Associate Editor and two Referees for their helpful comments, which improved the manuscript.
References
- Storer BE: Design and analysis of phase I clinical trials. Biometrics. 1989, 45 (3): 925-937. 10.2307/2531693.
- O’Quigley J, Pepe M, Fisher L: Continual reassessment method: a practical design for phase 1 clinical trials in cancer. Biometrics. 1990, 46: 33-48. 10.2307/2531628.
- Goodman SN, Zahurak ML, Piantadosi S: Some practical improvements in the continual reassessment method for phase I studies. Stat Med. 1995, 14 (11): 1149-1161. 10.1002/sim.4780141102.
- Thall PF, Cook JD: Dose-finding based on efficacy/toxicity trade-offs. Biometrics. 2004, 60 (3): 684-693. 10.1111/j.0006-341X.2004.00218.x.
- Braun TM: The bivariate continual reassessment method: extending the CRM to phase I trials of two competing outcomes. Control Clin Trials. 2002, 23 (3): 240-256. 10.1016/S0197-2456(01)00205-7.
- Zhang W, Sargent DJ, Mandrekar S: An adaptive dose-finding design incorporating both toxicity and efficacy. Stat Med. 2006, 25 (14): 2365-2383. 10.1002/sim.2325.
- Nelsen R: An Introduction to Copulas. 1999, New York: Springer.
- Iasonos A, Wilton AS, Riedel ER, Seshan VE, Spriggs DR: A comprehensive comparison of the continual reassessment method to the standard 3 + 3 dose escalation scheme in phase I dose-finding studies. Clin Trials. 2008, 5 (5): 465-477. 10.1177/1740774508096474.
- Arnold BC, Strauss DJ: Bivariate distributions with conditionals in prescribed exponential families. J Roy Stat Soc B. 1991, 53 (2): 365-375. [http://www.jstor.org/stable/2345747]
- Murtaugh PA, Fisher LD: Bivariate binary models of efficacy and toxicity in dose-ranging trials. Comm Stat Theor Meth. 1990, 19 (6): 2003-2020. 10.1080/03610929008830305.
- R Core Team: R: A Language and Environment for Statistical Computing. 2013, Vienna: R Foundation for Statistical Computing, [http://www.R-project.org/]
- Plummer M: rjags: Bayesian Graphical Models using MCMC. 2013, [http://CRAN.R-project.org/package=rjags] [R package version 3-10]
- Yin G, Li Y, Ji Y: Bayesian dose-finding in phase I/II clinical trials using toxicity and efficacy odds ratios. Biometrics. 2006, 62 (3): 777-787. 10.1111/j.1541-0420.2006.00534.x.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/14/51/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.