In this section, we introduce two joint probability models used in phase I-II clinical trials. In both cases, we specify marginal models for the probabilities of toxicity and efficacy and combine them into a joint model using a copula. First, we specify models for the marginal probability of toxicity and the marginal probability of efficacy.
Let $Y_T$ and $Y_E$ be the binary indicators of toxicity and efficacy, respectively. Denote $\pi(y_T, y_E \mid z) = \Pr(Y_T = y_T, Y_E = y_E \mid z)$ as the joint probability of toxicity and efficacy given dose level $z$, with marginal probabilities of toxicity and efficacy, $\pi_T$ and $\pi_E$, respectively, also functions of $z$. We can model the dose-toxicity and dose-efficacy relationships with any monotonic function. For simplicity, we assume logistic regression models for efficacy and toxicity as follows:
$$\log\left(\frac{\pi_T}{1-\pi_T}\right) = \beta_{0,T} + \beta_{1,T}(z-1), \tag{1}$$
and
$$\log\left(\frac{\pi_E}{1-\pi_E}\right) = \beta_{0,E} + \beta_{1,E}(z-1) + \beta_{2,E}(z-1)^2. \tag{2}$$
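As a minimal sketch of the marginal models in (1) and (2), the following Python functions evaluate the two probabilities at a given dose level. The function names and the parameter values are hypothetical and chosen only for illustration; they are not estimates from any trial.

```python
import math


def expit(x: float) -> float:
    """Inverse logit, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def marginal_toxicity(z: float, b0_t: float, b1_t: float) -> float:
    """Equation (1): logit(pi_T) = b0_t + b1_t * (z - 1)."""
    return expit(b0_t + b1_t * (z - 1))


def marginal_efficacy(z: float, b0_e: float, b1_e: float, b2_e: float) -> float:
    """Equation (2): logit(pi_E) includes a quadratic term in (z - 1)."""
    d = z - 1
    return expit(b0_e + b1_e * d + b2_e * d * d)


# Hypothetical parameter values, for illustration only.
for z in range(1, 6):
    pi_t = marginal_toxicity(z, -3.0, 0.5)
    pi_e = marginal_efficacy(z, -1.0, 1.0, -0.15)
    print(z, round(pi_t, 3), round(pi_e, 3))
```

With a negative quadratic coefficient, as in this illustration, the efficacy curve rises and then turns down at the higher dose levels, while toxicity increases monotonically whenever the slope is positive.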
We include a quadratic term for efficacy to allow model flexibility should the probability of efficacy level off or diminish after a certain dose level. We note that the intercept terms in (1) and (2) correspond to the log-odds of toxicity and efficacy, respectively, at the first dose level. This is useful for interpretation and prior specification. We next describe two copula models used in phase I-II clinical trials for specifying a joint distribution for $Y_T$ and $Y_E$.
Braun copula
We first consider the copula model discussed by Arnold and Strauss [9] and applied to joint modeling of efficacy and toxicity in the setting of a phase I-II clinical trial by Braun [5]. These authors specify the joint distribution of $Y_T$ and $Y_E$ as:
$$\pi(y_T, y_E \mid z) = k(\pi_T, \pi_E, \psi_1)\, \pi_E^{y_E}(1-\pi_E)^{1-y_E}\, \pi_T^{y_T}(1-\pi_T)^{1-y_T}\, \psi_1^{y_T y_E}(1-\psi_1)^{1-y_T y_E}. \tag{3}$$
Here, $\psi_1$ represents the correlation between $Y_T$ and $Y_E$ and takes on values between 0 and 1. $\psi_1$ greater than 0.5 reflects positive correlation, $\psi_1$ less than 0.5 reflects negative correlation and $\psi_1 = 0.5$ represents independence. We note also that $k(\pi_T, \pi_E, \psi_1)$ is a constant that is included to ensure that the four probabilities sum to 1 and depends on $\pi_T$, $\pi_E$, and $\psi_1$. The conditional probability of $Y_E \mid Y_T$ can be derived from the joint probability in (3) and is equal to:
$$\pi_{E \mid T} = \frac{\pi_E\, \psi_1^{y_T}(1-\psi_1)^{1-y_T}}{\pi_E\, \psi_1^{y_T}(1-\psi_1)^{1-y_T} + (1-\psi_1)(1-\pi_E)}.$$
An analogous conditional probability of $Y_T \mid Y_E$ can also be derived.
There are two key properties of the above model that are worthy of discussion. First, the correlation parameter, $\psi_1$, has the useful interpretation that $\psi_1/(1-\psi_1)$ is the odds ratio between $Y_E$ and $Y_T$. A second, less desirable property is that $\pi_T$ and $\pi_E$ are no longer the marginal probabilities that $Y_T$ and $Y_E$ equal 1, respectively, if $\psi_1 \neq 0.5$. Instead, the marginal probability of $Y_E$ equal to 1 is:
$$\Pr(Y_E = 1) = k(\pi_T, \pi_E, \psi_1)\, \pi_E \left( (1-\pi_T)(1-\psi_1) + \pi_T \psi_1 \right)$$
and the marginal probability of $Y_T$ equal to 1 is:
$$\Pr(Y_T = 1) = k(\pi_T, \pi_E, \psi_1)\, \pi_T \left( (1-\pi_E)(1-\psi_1) + \pi_E \psi_1 \right).$$
This is a key point that must be considered during dose finding.
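The two properties above can be checked numerically. The sketch below builds the four cell probabilities of (3) by normalizing the unnormalized weights, so that the constant k is simply the reciprocal of their sum. The function name `braun_joint` and the input values are hypothetical and serve only as an illustration.

```python
def braun_joint(pi_t: float, pi_e: float, psi1: float) -> dict:
    """Return {(y_t, y_e): probability} for the four cells of equation (3)."""
    unnorm = {}
    for y_t in (0, 1):
        for y_e in (0, 1):
            unnorm[(y_t, y_e)] = (
                pi_e ** y_e * (1 - pi_e) ** (1 - y_e)
                * pi_t ** y_t * (1 - pi_t) ** (1 - y_t)
                * psi1 ** (y_t * y_e) * (1 - psi1) ** (1 - y_t * y_e)
            )
    k = 1.0 / sum(unnorm.values())  # normalizing constant k(pi_T, pi_E, psi_1)
    return {cell: k * weight for cell, weight in unnorm.items()}


# Hypothetical inputs, for illustration only.
p = braun_joint(pi_t=0.2, pi_e=0.5, psi1=0.8)

odds_ratio = p[(1, 1)] * p[(0, 0)] / (p[(1, 0)] * p[(0, 1)])
marginal_e = p[(0, 1)] + p[(1, 1)]
marginal_t = p[(1, 0)] + p[(1, 1)]

print("odds ratio:", odds_ratio, "vs psi1/(1-psi1):", 0.8 / 0.2)
print("Pr(Y_E=1):", marginal_e, "vs pi_E:", 0.5)
print("Pr(Y_T=1):", marginal_t, "vs pi_T:", 0.2)
```

In this illustration the odds ratio matches $\psi_1/(1-\psi_1)$, while both marginal probabilities move away from the values of $\pi_E$ and $\pi_T$ supplied to the function.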
Gumbel copula
Thall and Cook [4] instead model the joint probability of efficacy and toxicity using the Gumbel copula discussed by Murtaugh and Fisher [10], which implies the following joint probability model for $Y_T$ and $Y_E$:
$$\pi(y_T, y_E \mid z) = \pi_E^{y_E}(1-\pi_E)^{1-y_E}\, \pi_T^{y_T}(1-\pi_T)^{1-y_T} + (-1)^{y_E + y_T}\, \pi_E(1-\pi_E)\, \pi_T(1-\pi_T)\, \psi_2. \tag{4}$$
Here, $\psi_2 \in (-1, 1)$ captures the correlation between $Y_T$ and $Y_E$, with $\psi_2 = 0$ implying independence and $\psi_2 \in (0, 1)$ implying positive correlation. We can again derive the conditional probability of $Y_E$ given $Y_T$,
$$\pi_{E \mid T} = \pi_E + (-1)^{1+y_T}\, \pi_E(1-\pi_E)\, \pi_T^{1-y_T}(1-\pi_T)^{y_T}\, \psi_2.$$
The conditional probability of $Y_T$ given $Y_E$ can be expressed in an analogous fashion.
An advantage of this model is that both $\pi_E$ and $\pi_T$ retain their original interpretations as the marginal probabilities of efficacy and toxicity, respectively. This can be easily seen by summing $P(Y_E = 1, Y_T = 1)$ and $P(Y_E = 1, Y_T = 0)$ from (4). Unlike the Braun copula, the correlation parameter for the Gumbel copula, $\psi_2$, does not have a straightforward interpretation.
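The retention of the marginal probabilities can be verified with a short numerical sketch of (4). The function name `gumbel_joint` and the input values are hypothetical and used only for illustration.

```python
def gumbel_joint(pi_t: float, pi_e: float, psi2: float) -> dict:
    """Return {(y_t, y_e): probability} for the four cells of equation (4)."""
    joint = {}
    for y_t in (0, 1):
        for y_e in (0, 1):
            independent_part = (
                pi_e ** y_e * (1 - pi_e) ** (1 - y_e)
                * pi_t ** y_t * (1 - pi_t) ** (1 - y_t)
            )
            association_part = (
                (-1) ** (y_e + y_t)
                * pi_e * (1 - pi_e) * pi_t * (1 - pi_t) * psi2
            )
            joint[(y_t, y_e)] = independent_part + association_part
    return joint


# Hypothetical inputs, for illustration only.
g = gumbel_joint(pi_t=0.2, pi_e=0.5, psi2=0.6)

print("sum of cells:", sum(g.values()))
print("Pr(Y_E=1):", g[(1, 1)] + g[(0, 1)])  # compare with pi_E
print("Pr(Y_T=1):", g[(1, 1)] + g[(1, 0)])  # compare with pi_T
print("Pr(Y_E=1 | Y_T=1):", g[(1, 1)] / 0.2)
```

The association term enters the four cells with alternating signs, so it cancels whenever one outcome is summed out; this is why no normalizing constant is needed and the marginals are unchanged.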
Independent model
An alternative approach would be to assume independence between $Y_T$ and $Y_E$, in which case the joint probability of toxicity and efficacy is simply the product of the marginal probabilities,
$$\pi(y_T, y_E \mid z) = \pi_T^{y_T}(1-\pi_T)^{1-y_T}\, \pi_E^{y_E}(1-\pi_E)^{1-y_E}, \tag{5}$$
which is of course what we get by setting $\psi_1 = 0.5$ and $\psi_2 = 0$ in the Braun and Gumbel copulas, respectively. While it is unlikely that this model accurately reflects the true association between $Y_T$ and $Y_E$, it may still be useful because the sample size in phase I-II oncology trials is limited and may be too small to estimate $\psi_1$ and $\psi_2$ precisely. If the likelihood contains very little information about these parameters, we may not lose much in our ability to identify the optimal dose by assuming independence instead of a more complicated model.
Likelihood and priors
Let $(y_{T,1}, y_{E,1}), (y_{T,2}, y_{E,2}), \ldots, (y_{T,n}, y_{E,n})$ be pairs of binary toxicity and efficacy outcomes at dose levels $(z_1, z_2, \ldots, z_n)$. The full likelihood for the models described above is:
$$L(\vec{\beta} \mid \vec{y}_T, \vec{y}_E, \vec{z}) = \prod_{i=1}^{n} \pi(1,1 \mid z_i)^{y_{T,i}\, y_{E,i}}\; \pi(0,1 \mid z_i)^{(1-y_{T,i})\, y_{E,i}}\; \pi(1,0 \mid z_i)^{y_{T,i}(1-y_{E,i})}\; \pi(0,0 \mid z_i)^{(1-y_{T,i})(1-y_{E,i})}$$
where $\pi(Y_T, Y_E \mid z)$ is defined using either (3), (4) or (5) and $\vec{\beta} = (\beta_{0,T}, \beta_{1,T}, \beta_{0,E}, \beta_{1,E}, \beta_{2,E}, \psi_k)$ with $k = 1, 2$ for the two copula models and $\vec{\beta} = (\beta_{0,T}, \beta_{1,T}, \beta_{0,E}, \beta_{1,E}, \beta_{2,E})$ for the independence model.
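Because the exponents in the likelihood are indicators, each patient contributes exactly one cell probability, and the log-likelihood is a sum of the logs of the observed cells. The following is a minimal sketch under the Gumbel model (4); the function names and the data layout (a list of `(z, y_t, y_e)` tuples) are assumptions made for illustration, and the values are hypothetical.

```python
import math


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def gumbel_cell(y_t: int, y_e: int, pi_t: float, pi_e: float, psi2: float) -> float:
    """Probability of one (y_t, y_e) cell under equation (4)."""
    base = (
        pi_e ** y_e * (1 - pi_e) ** (1 - y_e)
        * pi_t ** y_t * (1 - pi_t) ** (1 - y_t)
    )
    return base + (-1) ** (y_e + y_t) * pi_e * (1 - pi_e) * pi_t * (1 - pi_t) * psi2


def log_likelihood(params, data) -> float:
    """Log-likelihood for params = (b0_t, b1_t, b0_e, b1_e, b2_e, psi2).

    data is a list of (z, y_t, y_e) tuples, one per patient.
    Setting psi2 = 0 gives the independence model (5).
    """
    b0_t, b1_t, b0_e, b1_e, b2_e, psi2 = params
    total = 0.0
    for z, y_t, y_e in data:
        d = z - 1
        pi_t = expit(b0_t + b1_t * d)               # equation (1)
        pi_e = expit(b0_e + b1_e * d + b2_e * d * d)  # equation (2)
        total += math.log(gumbel_cell(y_t, y_e, pi_t, pi_e, psi2))
    return total


# Hypothetical data and parameter values, for illustration only.
data = [(1, 0, 0), (1, 0, 1), (2, 0, 1), (2, 1, 1), (3, 1, 0)]
print(log_likelihood((-3.0, 0.5, -1.0, 1.0, -0.15, 0.3), data))
```

Working on the log scale avoids numerical underflow from multiplying many probabilities; swapping `gumbel_cell` for a normalized version of (3) would give the corresponding sketch for the Braun model.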
To complete a Bayesian analysis, we must specify a prior distribution for each regression and association parameter. We specify the following normal priors for the two intercept terms and the quadratic term for efficacy: $\beta_{0,T} \sim N(-3, sd = 3)$, $\beta_{0,E} \sim N(-1, 3)$, and $\beta_{2,E} \sim N\left(0, \frac{1}{4}\right)$. The priors for $\beta_{0,T}$ and $\beta_{0,E}$ correspond to a prior belief of $P(Y_T = 1 \mid z = 1) = 0.05$ and $P(Y_E = 1 \mid z = 1) = 0.27$ but provide sufficient support over all plausible values for $\beta_{0,T}$ and $\beta_{0,E}$ and represent only mildly informative priors. The prior for $\beta_{2,E}$ is chosen to reflect a strong belief against a quadratic relationship but allows the model flexibility should there be drastic departures from a linear relationship. Gamma priors were set for $\beta_{1,T}$ and $\beta_{1,E}$ with mean 1 and standard deviation 2, corresponding to a $\text{Gamma}\left(\frac{1}{4}, \frac{1}{4}\right)$ distribution. Assuming Gamma priors for $\beta_{1,T}$ and $\beta_{1,E}$ implies that the marginal probability of toxicity will be monotonically increasing, but the same is not true for the marginal probability of efficacy because of the quadratic term in (2). Finally, we specify noninformative uniform priors for the association parameters: $\text{Uniform}(0, 1)$ for $\psi_1$ and $\text{Uniform}(-1, 1)$ for $\psi_2$.
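The prior statements above can be checked with a few lines of arithmetic: a Gamma distribution with mean 1 and standard deviation 2 has rate mean/sd² = 1/4 and shape mean × rate = 1/4, and the inverse logit of the intercept prior means gives the stated prior beliefs at the first dose. The sketch below also draws one parameter vector from the priors. It assumes that the second argument of each normal prior is a standard deviation, which the text states explicitly only for $\beta_{0,T}$; all function names are hypothetical.

```python
import math
import random


def expit(x: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def gamma_shape_rate(mean: float, sd: float):
    """Shape and rate of a Gamma distribution with the given mean and sd."""
    rate = mean / sd ** 2
    shape = mean * rate
    return shape, rate


def draw_prior(rng: random.Random) -> dict:
    """Draw one parameter vector from the priors (a sketch, not the authors' code)."""
    shape, rate = gamma_shape_rate(1.0, 2.0)
    scale = 1.0 / rate  # random.gammavariate takes shape and SCALE, not rate
    return {
        "b0_t": rng.gauss(-3.0, 3.0),
        "b0_e": rng.gauss(-1.0, 3.0),   # ASSUMPTION: 3 is a standard deviation
        "b2_e": rng.gauss(0.0, 0.25),   # ASSUMPTION: 1/4 is a standard deviation
        "b1_t": rng.gammavariate(shape, scale),
        "b1_e": rng.gammavariate(shape, scale),
        "psi1": rng.uniform(0.0, 1.0),
        "psi2": rng.uniform(-1.0, 1.0),
    }


print(gamma_shape_rate(1.0, 2.0))
print(round(expit(-3.0), 2), round(expit(-1.0), 2))
print(draw_prior(random.Random(2024)))
```

Note that the Gamma(1/4, 1/4) statement only reproduces mean 1 and standard deviation 2 under the shape-rate parameterization, so software that expects a scale parameter needs the reciprocal of the rate.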
Dose-finding algorithm
For our simulation study, we follow the dose-finding algorithm proposed by Thall and Cook [4]. These authors identify a set of acceptable doses by defining a maximum acceptable probability of toxicity assuming 100% efficacy and a minimum acceptable probability of efficacy assuming no toxicity. They then define a desirability index to identify the optimal dose from the set of acceptable doses.
Let $\overline{\pi}_T$ be the maximum acceptable probability of toxicity assuming 100% efficacy and $\underline{\pi}_E$ be the minimum acceptable probability of efficacy assuming no toxicity, as specified by the physician. A dose, $z$, is acceptable if the posterior probability of the two events $\pi_T(z) < \overline{\pi}_T$ and $\pi_E(z) > \underline{\pi}_E$ exceeds a prespecified threshold, $p$, i.e.
$$\Pr\left(\pi_T(z) < \overline{\pi}_T,\ \pi_E(z) > \underline{\pi}_E \mid \text{Data}, z\right) > p. \tag{6}$$
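In practice, the probability in (6) would typically be approximated from posterior draws. The sketch below assumes that posterior samples of the pair $(\pi_T(z), \pi_E(z))$ are already available for a dose, for example from an MCMC sampler that is not specified here; the function name, the draws and the limits are all hypothetical.

```python
def acceptability(draws, pi_t_max: float, pi_e_min: float) -> float:
    """Monte Carlo estimate of the posterior probability in criterion (6).

    draws is a list of (pi_t, pi_e) posterior samples for a single dose.
    """
    hits = sum(1 for pi_t, pi_e in draws if pi_t < pi_t_max and pi_e > pi_e_min)
    return hits / len(draws)


# Hypothetical posterior draws and limits, for illustration only.
draws = [(0.10, 0.50), (0.40, 0.50), (0.10, 0.10), (0.20, 0.60)]
prob = acceptability(draws, pi_t_max=0.30, pi_e_min=0.20)
print(prob, prob > 0.10)  # compare against a hypothetical threshold p = 0.10
```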
The trial terminates for futility if, at any point during the trial, all doses are unacceptable according to Equation (6). The optimal dose is selected from the set of acceptable doses using a desirability index. The desirability index for a $(\pi_T(z), \pi_E(z))$ pair is defined by Thall and Cook as follows:
$$D(z) = 1 - \left( \left(\frac{\pi_T(z)}{\overline{\pi}_T}\right)^q + \left(\frac{1-\pi_E(z)}{1-\underline{\pi}_E}\right)^q \right)^{1/q}, \tag{7}$$
where $q$ is defined by identifying a probability of toxicity and probability of efficacy pair, $(\pi_T^{\ast}, \pi_E^{\ast})$, that is equally desirable to $(\overline{\pi}_T, 1.0)$ and $(0, \underline{\pi}_E)$, plugging $(\pi_T^{\ast}, \pi_E^{\ast})$ into (7) and solving for $q$ when $D(z)$ equals 0. Larger values of $D(z)$ are considered more desirable and the optimal combination, $(0.0, 1.0)$, has $D(z)$ equal to 1 regardless of $q$.
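Solving (7) for q has no closed form in general, but setting D to 0 at the elicited pair reduces to a one-dimensional root-finding problem: with a and b denoting the two ratios in (7), q solves a^q + b^q = 1, and the left-hand side is strictly decreasing in q when both ratios lie between 0 and 1. The following sketch uses bisection; the function names, the search bracket and the elicited values are hypothetical.

```python
def desirability(pi_t, pi_e, pi_t_max, pi_e_min, q):
    """Desirability index D of equation (7)."""
    tox_term = (pi_t / pi_t_max) ** q
    eff_term = ((1.0 - pi_e) / (1.0 - pi_e_min)) ** q
    return 1.0 - (tox_term + eff_term) ** (1.0 / q)


def solve_q(pi_t_star, pi_e_star, pi_t_max, pi_e_min, lo=1e-3, hi=100.0, tol=1e-10):
    """Find q such that D = 0 at the elicited pair (pi_t_star, pi_e_star)."""
    a = pi_t_star / pi_t_max
    b = (1.0 - pi_e_star) / (1.0 - pi_e_min)
    if not (0.0 < a < 1.0 and 0.0 < b < 1.0):
        raise ValueError("elicited pair must lie strictly between the two anchor points")

    def f(q):
        return a ** q + b ** q - 1.0  # strictly decreasing in q

    if f(lo) <= 0.0 or f(hi) >= 0.0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical elicited values, for illustration only.
q = solve_q(pi_t_star=0.25, pi_e_star=0.65, pi_t_max=0.5, pi_e_min=0.3)
print(q)
print(desirability(0.25, 0.65, 0.5, 0.3, q))  # close to 0 by construction
print(desirability(0.0, 1.0, 0.5, 0.3, q))    # the ideal pair (0, 1)
```

Bisection is used here only because it is simple and robust for a monotone function; any standard root finder would serve the same purpose.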
The dose-finding algorithm proceeds as follows:

1. Treat the first cohort of m patients at the lowest dose level.
2. Update the posterior distributions of the probabilities of toxicity and efficacy for each dose level using data from all previous cohorts.
3. Identify the set of acceptable doses using criterion (6). If no dose is found acceptable, terminate for futility.
4. Treat the next cohort at the dose that maximizes D(z) under the restriction that dose levels may not be skipped when escalating. Return to step 2.
5. Repeat steps 2–4 until the maximum sample size is reached. The dose that maximizes D(z) at study completion is considered the optimal dose.
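Steps 3 and 4 can be sketched as a single selection function. This is an illustrative sketch rather than the authors' implementation: it assumes the acceptable set and the desirability of each dose have already been computed from the posterior, it reads the no-skipping rule as allowing at most one level above the highest dose tried so far, and it raises an error when every acceptable dose is out of reach, a case the description above does not address. All names are hypothetical.

```python
def select_next_dose(acceptable, desirability_by_dose, highest_tried):
    """Pick the dose for the next cohort (steps 3 and 4 of the algorithm).

    acceptable: set of 1-based dose levels satisfying criterion (6)
    desirability_by_dose: dict mapping dose level -> D(z)
    highest_tried: highest dose level given to any previous cohort
    Returns None when no dose is acceptable (terminate for futility).
    """
    if not acceptable:
        return None
    # ASSUMPTION: "no skipping" means at most one level above the highest dose tried.
    reachable = [z for z in acceptable if z <= highest_tried + 1]
    if not reachable:
        raise ValueError("acceptable doses exist but none is reachable without skipping")
    return max(reachable, key=lambda z: desirability_by_dose[z])


# Hypothetical values, for illustration only.
d_values = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.90}
print(select_next_dose({1, 2, 3, 4}, d_values, highest_tried=2))
print(select_next_dose(set(), d_values, highest_tried=2))
```

In the first call, dose 4 has the largest desirability but cannot be chosen because dose 3 has not yet been tried, so the restriction binds; de-escalation is unrestricted under this reading.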