The compartment representation for transplant data
By focussing on cumulative treatment effects and prediction of survival probabilities as primary aim, the most interesting outcome of such studies is survival at a pre-specified long-term time point \( {t}^{\ast } \), e.g., in order to investigate 5-years survival probabilities, \( {t}^{\ast } \) would be set equal to 5 years. In line with the approach of genetic randomisation, two separate populations have to be distinguished conditional on their donor availability status, and survival probabilities in patients with and without a donor available, \( {S}_1\left({t}^{\ast}\right) \) and \( {S}_0\left({t}^{\ast}\right) \), have to be compared, respectively. Donor availability is defined as the identification of a donor in the time interval from study entry to maximum donor search time \( {t}^{search} \), with \( {t}^{search}\le {t}^{\ast } \).
Survival in patients with donor available
At first, survival \( {S}_1\left({t}^{\ast}\right) \) in patients with an available donor is investigated. Let \( W \) denote a random variable representing the (waiting) time from a given origin (time 0) to the time when a donor is identified, \( 0\le W\le {t}^{search} \). The distribution of \( W \) is characterised by the density \( {f}_{01}(t) \) with corresponding hazard function \( {\lambda}_{01}(t) dt=P\left(W\in \left[t,t+ dt\right)|W\ge t\right) \). In theory, \( W \) can always be observed for this population independently of the survival or censoring status of the patients by prolonging donor search until \( {t}^{search} \). In practice, as outlined above, donor availability remains unobserved in patients that die or become censored before donor identification.
Survival in the population with available donor corresponds to a stochastic process with three states (Fig. 1a), where all patients start at the initial state 0.
Let \( T \) denote the failure time to reach state 2. The absorbing state 2 can be reached either directly (0→2) if \( T<W \) or through the intermediate state 1 (0→1→2) if \( T\ge W \). Let \( {\lambda}_{02}(t) \) denote the hazard function for a transition at time \( t \) from the initial state 0 direct to the absorbing state 2. Accordingly
$$ {\lambda}_{02}(t) dt=P\left(T\in \left[t,t+ dt\right)|T\ge t,W>t\right) $$
Note that in patients with a donor (Fig. 1a), no patients remain in state 0 at \( {t}^{search} \) as all patients have either moved to state 1 or to state 2 until \( {t}^{search} \).
Let \( {\lambda}_{12}\left(t,t-w\right) \) denote the hazard function for a transition from the transient state 1 to the absorbing state 2 which may depend on both, time since start, \( t \), and time since transition to state 1, \( t-w \).
$$ {\lambda}_{12}\left(t,t-w\right) dt=P\left(T\in \left[t,t+ dt\right)|T\ge t,W=w,t\ge w\right) $$
Note that donor availability can be considered an external (exogenous) stochastic process with
$$ P\left(T\in \left[u,u+ du\right)|X(u),T\ge u\right)=P\left(T\in \left[u,u+ du\right)|X(t),T\ge u\right) $$
for all \( u,t \) such that \( 0<u\le t \) and \( X(t)=\left\{x(u);0\le u<t\right\} \) denotes the covariate history of the external time-dependent covariate \( x(t) \); for details see p.196 of Kalbfleisch and Prentice [31].
Denote
$$ {S}_1\left({t}^{\ast }|w\right)=\exp \left[-\left\{\underset{0}{\overset{w}{\int }}{\lambda}_{02}(v) dv+\underset{w}{\overset{t^{\ast }}{\int }}{\lambda}_{12}\left(v,v-w\right) dv\right\}\right] $$
(1)
as survival in a population with a fixed waiting time \( W=w \). That is, before transition to state 1 the hazard function \( {\lambda}_{02}(t) \) applies and \( {\lambda}_{12}\left(t,t-w\right) \) thereafter. Survival in the population with a donor is then
$$ {\displaystyle \begin{array}{c}{S}_1\left({t}^{\ast}\right)=\underset{0}{\overset{t^{\ast }}{\int }}{\lambda}_{01}(w)\exp \left[-\underset{0}{\overset{w}{\int }}{\lambda}_{01}(v) dv\right]\left[\exp \left\{-\underset{0}{\overset{w}{\int }}{\lambda}_{02}(v) dv-\underset{w}{\overset{t^{\ast }}{\int }}{\lambda}_{12}\left(w,v\right) dv\right\}\right] dw\\ {}=\underset{0}{\overset{t^{\ast }}{\int }}{f}_{01}(w)\;{S}_1\left({t}^{\ast }|w\right) dw\end{array}} $$
(2)
Survival in patients without donor available
In a genetic randomisation study, patients without a donor available until \( {t}^{search} \) form the control group. Let \( {T}^{\hbox{'}} \) denote the failure time in a classical two-state survival process (Fig. 1b) with
$$ {\lambda}_{02}^{\hbox{'}}(t) dt=P\left({T}^{\hbox{'}}\in \left[t,t+ dt\right)|{T}^{\hbox{'}}\ge t\right) $$
and
$$ {S}_0\left({t}^{\ast}\right)=\exp \left[-\left\{\underset{0}{\overset{t^{\ast }}{\int }}\lambda {\hbox{'}}_{02}(v) dv\right\}\right]. $$
(3)
Since donor availability is stochastically independent of patients’ prognosis under standard treatment, \( {\lambda}_{02}(v)={\lambda}_{02}^{\hbox{'}}(v) \) until \( {t}^{search} \). This independence has two important consequences. Firstly, \( {S}_0\left({t}^{\ast}\right) \) is identical to the counterfactual survival probabilities in a population where there is no change in therapeutic strategies after a donor is identified, i.e. \( {\lambda}_{12}(v)={\lambda}_{02}^{\hbox{'}}(v),v\ge w \); secondly, in order to estimate \( {S}_0\left({t}^{\ast}\right) \) the survival information of the donor available group can be exploited up to time \( w \).
Pseudo-values
In the following the original pseudo-value approach is briefly outlined. Subsequently, generalised pseudo-values for the estimation of \( {S}_0\left({t}^{\ast}\right) \) and \( {S}_1\left({t}^{\ast }|w\right) \) are introduced. Furthermore, appropriate weights for \( {S}_1\left({t}^{\ast }|w\right) \) are defined to compensate for partly unobserved waiting times when estimating \( {S}_1\left({t}^{\ast}\right) \).
Common pseudo-values
The pseudo-value regression technique [25,26,27,28,29,30] for censored survival data provides an attractive alternative to methods commonly applied in survival analysis. The approach specifically allows the modelling of survival probabilities at pre-specified interesting time points. Without relying on proportional hazards, pseudo-values can be flexibly regressed on common baseline covariates within the framework of a generalised linear model [15].
Let \( {T}_i \) denote the time to failure and \( {C}_i \) the time to censoring for the \( i \)-th patient, \( i=1,\dots, n \). Usually \( {\tilde{T}}_i=\min \left({T}_i,{C}_i\right) \) and \( {D}_i=I\left({T}_i\le {C}_i\right) \) can be observed. Let \( \widehat{S}(t) \) denote the ordinary Kaplan-Meier estimate and let \( {\widehat{S}}^{-i}(t) \) denote the Kaplan-Meier estimate where the \( i \)-th observation is excluded (jacknife statistic). The pseudo-value at time \( {t}^{\ast } \) [25] for the \( i \)-th patient is now defined as
$$ {\widehat{V}}_i\left({t}^{\ast}\right)=n\widehat{S}\left({t}^{\ast}\right)-\left(n-1\right){\widehat{S}}^{-i}\left({t}^{\ast}\right),i=1,\dots, n. $$
Individual pseudo-values can attain values below zero and above one. The approach relies on the fact that the Kaplan-Meier estimate is (approximately) unbiased for the marginal survival function [25]. Accordingly, i.i.d. observations, independent right censoring and a sufficiently large risk set at time \( {t}^{\ast } \) have to be assumed. Asymptotic conditional unbiasedness of the pseudo-values given covariates can be shown [25, 32,33,34], which allows the use of regression models with pseudo-values as outcome (further details can be found in [25, 29]).
Generalised 0➔2 pseudo-values for the estimation of \( {S}_0\left({t}^{\ast}\right) \)
For notational convenience let the first \( m \) out of \( n \) patients be those with an observed transition to state 1 at times \( {w}_1,\dots, {w}_m \).
Let \( {\widehat{S}}_0(t) \) denote the Kaplan-Meier estimate for the risk of a direct transition from state 0 to state 2 with censoring at \( {w}_1,\dots, {w}_m \) of those patients with observed transitions to state 1. Hence, only direct transitions from state 0 to state 2 are considered as event. Given the assumed independence between the two stochastic processes (transitions 0→1 and 0→2), \( {\widehat{S}}_0(t) \) is an asymptotically unbiased estimate of \( {S}_0(t) \).
Now, for all \( n \) patients 0→2 pseudo-values for \( {t}^{\ast } \)-year survival based on \( {\widehat{S}}_0\left({t}^{\ast}\right) \) are generated according to the standard approach [25],
$$ {\widehat{V}}_{i,0}\left({t}^{\ast}\right)=n{\widehat{S}}_0\left({t}^{\ast}\right)-\left(n-1\right){\widehat{S}}_0^{-i}\left({t}^{\ast}\right). $$
(4)
As all patients start in state 0 and are at risk for a direct transition to state 2, it is of importance that a 0→2 pseudo-value has to be calculated for each patient. That is, patients with a transition to state 1 are censored at \( {w}_1,\dots, {w}_m \) but not ignored, as the latter would lead to selection bias due to over-representing direct transitions from state 0 to state 2. Since \( {\widehat{V}}_{i,0}\left({t}^{\ast}\right) \) is an asymptotically unbiasedly estimate for \( {S}_0\left({t}^{\ast}\right) \), its mean \( {\overline{V}}_0=\frac{1}{n}{\sum}_{i=1}^n{\widehat{V}}_{i,0}\left({t}^{\ast}\right) \) is asymptotically unbiased as well [25, 32].
Generalised 0➔1➔2 pseudo-values for the estimation of \( {S}_1\left(\left.{t}^{\ast}\right|w\right) \)
To estimate \( {S}_1\left(\left.{t}^{\ast}\right|{w}_i\right),i=1,\dots, m \), 0→1→2 pseudo-values have to be defined. Firstly, covering only the time after transition to state 1, 1→2 pseudo-values are calculated. For \( i=1,\dots, m, \) let \( \widehat{S}\left({t}^{\ast}\left|T\ge {w}_i\right.\right) \) denote the survival propability at \( {t}^{\ast } \) estimated by Kaplan-Meier based on all \( {n}_i \) patients still at risk at \( {w}_i \). Note that \( {t}^{\ast } \) is the long-term time-point of interest since time 0, and Kaplan-Meier estimates at \( \left({t}^{\ast }-{w}_i\right) \) after \( {w}_i \) are used. The 1→2 pseudo-value for the \( i \)-th patient is now defined as
$$ {\widehat{U}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right)={n}_i\widehat{S}\left({t}^{\ast}\left|T\ge {w}_i\right.\right)-\left({n}_i-1\right){\widehat{S}}^{-\mathrm{i}}\left({t}^{\ast}\left|T\ge {w}_i\right.\right). $$
(5)
Here, the interval of the Kaplan-Meier estimate starts at time \( {w}_i \), and accordingly the waiting time history up to \( {w}_i \) can be considered as standard baseline information (see Additional file 1, Section A). Conditional on the observed waiting time \( {w}_i \) for patient \( i \), this pseudo-value \( {\widehat{U}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \) shares the properties of the original approach [25, 32,33,34] and
$$ E\left[{\widehat{U}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right)\right]=S\left({t}^{\ast}\left|T\ge {w}_i,W={w}_i\right.\right)+{o}_p(1) $$
holds with \( S\left({t}^{\ast}\left|T\ge {w}_i,W={w}_i\right.\right)=\underset{w_i}{\overset{t^{\ast }}{\int }}{\lambda}_{12}\left(v,v-{w}_i\right) dv \) (for details see Additional file 1, Section A).
The definition of 0→1→2 pseudo-values for \( {S}_1\left({t}^{\ast }|{w}_i\right) \) requires to consider the risk of an event in state 0 until transition time \( {w}_i \) as well. The Kaplan-Meier estimate \( {\widehat{S}}_0\left({w}_i\right) \) can be used for this adjustment, so that the 0→1→2 generalised pseudo-value is defined by
$$ {\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right)={\widehat{S}}_0\left({w}_i\right)\kern0.5em {\widehat{U}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right). $$
(6)
Utilising the independence between \( {\widehat{S}}_0\left({w}_i\right) \) and \( {\widehat{U}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \), \( {\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \) is asymptotically unbiased too,
\( E\left[{\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right)\right]={S}_0\left({w}_i\right)S\left({t}^{\ast}\left|T\ge {w}_i,W={w}_i\right.\right)+{o}_p(1)={S}_1\left({t}^{\ast}\left|{w}_i\right.\right)+{o}_p(1). \)
Estimation of \( {S}_1\left({t}^{\ast}\right) \)
While the individual \( {\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \) provide an estimate of \( {S}_1\left({t}^{\ast}\left|{w}_i\right.\right),i=1,\dots, m \), their mean \( \frac{1}{m}\sum \limits_{i=1}^m{\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \) estimates
$$ {\overline{S}}_1\left({t}^{\ast}\right)=\underset{0}{\overset{t^{\ast }}{\int }}q(w){S}_1\left({t}^{\ast}\left|w\right.\right) dw, $$
(7)
where \( q(w) \) is the density of the distribution of observable waiting times before \( {t}^{search} \) (Additional file 1: Section B). This density describes waiting times up to \( {t}^{search} \) not prevented by the competing risks death and early censoring with the hazard functions \( {\lambda}_{02}(t) \) and \( {\lambda}_C(t) \), respectively. According to eq. 2, our primary aim is to estimate survival in the population with a donor, that is
\( {S}_1\left({t}^{\ast}\right)=\underset{0}{\overset{t^{\ast }}{\int }}{f}_{01}(w)\;{S}_1\left(t|w\right) dw. \)
Note that, unless \( {\lambda}_{02}(v)=0 \) and \( {\lambda}_C(v)=0 \) for \( v<{t}^{search},q(w)\ne {f}_{01}(w) \) and consequently \( {\overline{S}}_1\left({t}^{\ast}\right)\ne {S}_1\left({t}^{\ast}\right) \). Since for some patients donor availability is unknown due to censoring or 0→2 transitions before time \( w \), long waiting times are under-represented in \( q(w) \) compared to \( {f}_{01}(w) \).
In order to account for unobserved 0→1 transitions due to early 0→2 transitions and censoring, inverse probability of censoring [35] can be used to estimate the weights
$$ {\gamma}_{i,1}=\frac{f_{01}\left({w}_i\right)}{q\left({w}_i\right)} $$
(8)
for every observed 0→1 transition at \( {w}_i,i=1,\dots, m \). As donor availability is independent of patients behaviour before \( {w}_i \), all \( n \) patients with and without donor available can be used. Let \( \widehat{G}(w) \) denote the Kaplan-Meier estimate on the whole sample, where censoring and 0→2 transitions are considered as events and 0→1 transitions are considered as censored observations. Then
$$ E\left[\widehat{G}(w)\right]=\exp \left[-\underset{0}{\overset{w}{\int }}{\lambda}_{02}(v)+{\lambda}_C(v) dv\right]+{o}_P(1) $$
which is the probability of being uncensored and free of a 0→2 transition at time \( w \). In patients with a donor available at waiting time \( {w}_i \) this is equivalent to the probability \( \frac{q\left({w}_i\right){p}_m}{f_{01}\left({w}_i\right)} \) of acutally observing the 0→1 transition.
The weights are then \( {\widehat{\gamma}}_{i,1}=\frac{1}{\widehat{G}\left({w}_i\right)}{\widehat{p}}_m \), \( i=1,\dots, m \), and \( {\widehat{p}}_m=\frac{m}{\sum \limits_{j=1}^m\frac{1}{\widehat{G}\left({w}_j\right)}} \) so that \( \sum \limits_{i=1}^m{\widehat{\gamma}}_{i,1}=m \).
Now, the weighted mean \( {\overline{V}}_1=\frac{1}{m}\sum \limits_{i=1}^m{\widehat{\gamma}}_{i,1}{\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \) is a consistent estimator of \( {S}_1\left({t}^{\ast}\right) \) [35].
Weighted generalised linear model for comparing \( {S}_0\left({t}^{\ast}\right) \) with \( {S}_1\left({t}^{\ast}\right) \)
Analogous to the original pseudo-value approach, a weighted generalised linear model is utilized for inferential purposes with log-log link function, normal response probability distribution, and the parameters \( {\beta}_0 \) and \( {\beta}_1 \). Whereas \( {\beta}_0 \) corresponds to the intercept, \( {\beta}_1 \) corresponds to the binary indicator \( {x}_{1i} \), with \( {x}_{1i}=0 \) for \( i=1,\dots, n \) and \( {x}_{1i}=1 \) for \( i=n+1,\dots, n+m \). The \( \left(n+m\right) \)-dimensional response vector is \( \widehat{\mathbf{V}}={\left({\widehat{V}}_{1,0}\left({t}^{\ast}\right),\dots, {\widehat{V}}_{n,0}\left({t}^{\ast}\right),{\widehat{V}}_{1,1}\left({t}^{\ast}\left|{w}_1\right.\right),\dots, {\widehat{V}}_{m,1}\left({t}^{\ast}\left|{w}_m\right.\right)\right)}^{\hbox{'}} \). The weights \( {\widehat{\gamma}}_{i,1} \) for \( {\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right) \) are defined in the previous chapter and the weights \( {\widehat{\gamma}}_{i,0} \) for \( {\widehat{V}}_{i,0}\left({t}^{\ast}\right) \) are set to one.
The parameter estimates are functions of the weighted means of the two types of pseudo-values; \( {\widehat{\beta}}_0=g\left({\overline{V}}_0\right) \) and \( {\widehat{\beta}}_1=g\left({\overline{V}}_1\right)-g\left({\overline{V}}_0\right) \). When the log-log link function is used, \( g(V)=\log \left(-\log (V)\right) \) and \( \exp \left({\widehat{\beta}}_1\right) \) is an asymptotically unbiased estimate of the cumulative hazard ratio at time \( {t}^{\ast } \) comparing patients with and without donor available.
Estimation of standard errors
When using the Kaplan-Meier estimate \( {\widehat{S}}_0\left({w}_i\right) \) in the estimation of \( {\widehat{V}}_{i,1}\left({t}^{\ast}\left|{w}_i\right.\right),i=1,\dots, m \) (eq. 6), the individual patients’ variation is not properly represented. Consequently, the weighted generalised linear model with a sandwich estimator underestimates the standard errors of \( \widehat{\beta^{\hbox{'}}}=\left({\widehat{\beta}}_0,{\widehat{\beta}}_1\right) \) on average. More appropriate standard errors can be attained by replacing \( {\widehat{S}}_0\left({w}_i\right) \) in eq. 6 by random draws from Bernoulli variables \( {\mathrm{B}}_i \) with
\( {\mathrm{B}}_i\sim \mathrm{Bernoulli}\left[\exp \left\{-\exp \left({p}_i\right)\right\}\right] \) where \( {p}_i\sim N\left(\log \left[-\log \left\{{\widehat{S}}_0\left({w}_i\right)\right\}\right],\frac{{\widehat{\sigma}}^2\left({\widehat{S}}_0\left({w}_i\right)\right)}{{\left[{\widehat{S}}_0\left({w}_i\right)\kern0.24em \log \left\{{\widehat{S}}_0\left({w}_i\right)\right\}\right]}^2}\right) \)
and \( {\widehat{\sigma}}^2\left({\widehat{S}}_0\left({w}_i\right)\right) \) denotes the variance estimator of the Kaplan-Meier estimator (Greenwood’s formula). This ad-hoc approach exploits the asymptotic normality of the log-log transformation of the Kaplan-Meier estimator [31]. In practical applications, the imputations should be repeated several times and the obtained standard errors should be averaged for stability reasons.
Software implementation
The proposed method can be straightforwardly implemented using standard routines available in the majority of statistical software packages. Details on the implementation in SAS and R are described in the Additional file 1 (Section D).