We identified 12 relevant full-text papers that best illustrated the joint modelling framework and summarised its applications in localised prostate cancer, these were selected to be included within this review. Table 2 summarises these twelve papers including details of the modelling framework used, sample sizes, parameterisations, the prediction windows of interest, whether validation was undertaken, and the code/software used.
Where available, the corresponding software and code with packages can also be found [65, 66]. Nine papers (9 of 12, 75%) applied the shared-parameter joint modelling framework, with three of these presenting the standard joint model for a time-to-failure endpoint, while 6 of 9 papers presented extensions to the time-to-event submodel incorporating cure, competing risks, and multi-state models for localised prostate cancer (e.g., local- and distant recurrence, salvage therapy, and death). Three papers (3 of 12, 25%) described the joint latent class approach. In the following, we review and summarise these papers in detail around their model specification, estimation of dynamic predictions and model validations conducted.
Shared-parameter joint models to predict recurrence in localised prostate cancer
In this section, we focus our review on three relevant papers that investigated PSA dynamics to predict recurrence in localised prostate cancer using the SPJM framework: Taylor et al. [44], Sène et al. [46], and Pauler & Finkelstein [37]. All three articles develop models in localised prostate cancer patients treated with EBRT in the absence of neoadjuvant HT. Taylor focused on developing a model to creating a clinical prediction tool online; Sène explored the effect on initiating salvage treatments at different time points and its effect on the predicted dynamic probabilities of recurrence. Pauler & Finkelstein use a change-point model to capture any jump in PSA.
Model specification
In Taylor et al. [44], the functional form over time of the longitudinal PSA mixed model assumes three phases: baseline/presenting PSA (Β0), and the short-term (decrease, Β1), and long-term (increase, Β2) evolutions of PSA, Yi(t) = log[PSAi(t) + 0.1] = Β0 + Β1f1 + Β2f2, with \({f}_1=\left\{{\left(1+\mathrm{time}\right)}^{-\frac{3}{2}}-1\right\}\ \mathrm{and}\ {f}_2=\mathrm{time}\). For each of the three phases, Bk = {0, 1, 2}, are matrices containing linear combinations of the fixed baseline covariates T-stage, Gleason grade and presenting pre-treatment PSA, along with subject-specific random effects parameters. A t-distribution with five degrees-of-freedom for the error term is assumed. Time to prostate cancer clinical recurrence is measured from the end of RT. In the survival submodel, the functional form f(Mi(t), bi, α) is a linear combination of the value of PSA concentration and its slope at time t, \(f\left({\boldsymbol{M}}_i(t)\ {\boldsymbol{b}}_i,\boldsymbol{\alpha} \right)={\alpha}_1\mathrm{PSA}(t)+{\alpha}_2\frac{d\ \mathrm{PSA}(t)}{dt}\). Additionally, the survival submodel included a time-dependent indicator variable for when salvage hormonal treatment (ST) is initiated to account for the subsequent drop in hazard of clinical failure. PSA values after ST were excluded due to the sudden decrease in PSA trajectory and did not feature in the mixed-effect model; however, clinical recurrences after ST were considered. A piecewise constant function is assumed for the baseline hazard.
Sène et al. [46] made similar modelling assumptions as Taylor et al. [44] for the functional forms in the mixed and survival submodels. The model adjusted for presenting PSA, Gleason score, T-stage, and total corrected dose of EBRT (using the linear-quadratic model given in [67]). Initiation of ST was included as a time-dependent indicator variable to reflect the potential decrease in risk of progression; five functional forms of ST were considered. Three different association structures of f were fitted: a linear combination of PSA value and gradient (with and without a logistic transformation for PSA), and the random effect structure, which evaluated the individual deviations from the overall population’s PSA trajectories. A combination of those different parametrisations yielded 12 models with varied complexity.
Pauler & Finkelstein [37] proposed a change-point parameterisations in the longitudinal model for PSA, by incorporating an unknown change-point indicator variable for whether change in PSA has occurred. If a shift is indicated, a likely change-point time-range is estimated with a uniform distribution for PSA. A narrower posterior change-point range with larger differences in the slopes before- and after the change-point indicate prostate cancer recurrence is likely (before the formal clinical failure endpoint). Trivariate normal and uniform priors are used for four random effects, they included: intercept; change-point time (uniform); the slope before and after the change-point. For the survival submodel, a piecewise exponential hazard function was used. Baseline covariates included age, presenting PSA, and disease stage. For the joint model, non-informative priors were chosen.
Estimation, prediction and validation
In Taylor et al. [44], estimation was undertaken under a Bayesian framework using C software. The joint model was developed on three pooled cohorts (totalling N = 2,386 patients) and externally tested using a separate fourth dataset (N = 846 patients). Dynamic predictions for an individual patient’s PSA trajectory and risk of recurrence for the next 3 years were shown: no formal validation measures were presented. The authors opted for simpler graphical inspections to study the model, owing to the complicated nature of the time-dependent ST events within the validation cohort. An online prognostic calculator was developed, enabling individual dynamic predictions of disease recurrence given PSA trajectories for future patients (http://psacalc.sph.umich.eduFootnote 1).
In Sène et al. [46], estimation was undertaken under a frequentist framework, and R software used for model development, again using the same three cohorts as in Taylor et al. [44]. Internal approximated leave-one-out cross-validation was used to assess six of the 12 models’ predictability, using BS and EPOCE accuracy measures [64]. The two best fitting models were the logistic-transformed PSA value and slope that separated the effect of PSA before- and after ST, whilst the model with the random effect association structure performed best when assumed that the patient would not start ST within 3 years. Exemplar individualised dynamic predictions used a prediction window of 3 years on an intermediate risk patient. Different scenarios when ST would be initiated were used to illustrate the impact of delays in ST initiation on risk of recurrence. External validation was not performed.
We cannot make direct comparisons between the predictive performances of the two papers as they used different assessment methods (graphical approaches in Taylor, EPOCE & BS presented in Sène). In Sène et al., patients who did not receive HT nor ST within 3 years were mainly used in order to assess predictive performance. Sène noted that this may not be a representative situation for all patients, so they performed a sensitivity analysis using Taylor’s approach to widen the sample on HT-free patients at the landmark prediction time only, then with subsequent ST initiation within the three-year prediction window, as either a recurrence event or dependent censoring. The relative predictive performance was largely unchanged in both papers under this approach and therefore can be considered robust.
In Pauler & Finkelstein [37], estimation was done in a Bayesian framework, using C and S-plus software. The joint model was developed on a cohort of N = 676 patients. As the majority of patients do not exhibit clinical failure, the slope after the change-point was non-significantly negative, indicating PSA trajectories generally remain constant over the follow-up period. The regression coefficients from the relative risk component are not straightforward to interpret due to the number of pairwise and three-way interactions, the authors noted that coefficients are in the expected directions. Sensitivity analysis was done on three differing definitions of recurrence based on PSA rises. They showed that regardless of rule followed, there was little difference to their optimal joint model. The AIC rose when considering only a relative risk model using indicator covariates for each rule, this provided justification on using the joint change-point model, as the longitudinal PSAs substantively improve the goodness-of-fit. The posterior distributions of four individual patient change-points were shown. For two patients who do relapse, sharp change-points are given between 2 and 4 years, who then go on to recur at six and 4 years of follow-up. For stable PSA patients, the change-point is imprecise with very wide uniform posteriors. Individualised predictions are performed on two hypothetical patients showing each’s posterior predictive distributions of time to relapse. Although discussed, the model was not validated.
Latent class joint models to predict recurrence in localised prostate cancer
In this section, we focus our review on relevant papers that investigated PSA dynamics using the JLCM framework. There are three papers of interest reviewed in this section, by Proust-Lima & Taylor [42], Jacqmin-Gadda et al. [43], and a third paper by Proust-Lima et al. [45], which is appraised separately as this compares the SPJM and JLCM.
Proust-Lima & Taylor [42] modelled the functional longitudinal PSA similarly to Taylor et al. [44] (described previously). Baseline covariates T-stage, Gleason score, and pre-treatment PSA were included into both submodels. The survival submodel also includes an exogenous time-dependent indicator variable for initiation of ST, and a class-specific Weibull baseline hazard function.
Model development was performed on a single cohort of patients (N = 1,268), and external validation was performed on two additional smaller cohorts (with N = 503 and N = 615 patients respectively). Several JLCMs were fitted with ranging classes (2—6), with the five-class model (5-JLCM) producing the lowest Bayesian Information Criterion (BIC); the optimal model included estimation of 75 parameters. Predicted PSA evolutions and survival curves for each of the five classes illustrate how PSA trajectories with long-term rise of PSA correspond to greater risk of failure. Dynamic predictions were made within a prediction window of 3 years for two patients with contrasting baseline risk factors: a lower-risk patient who recurs and a higher-risk patient with no observed recurrence.
Within each external validation cohort, measures of predictive accuracy (absolute prediction errors EP and WAEP) for the five-class JLCM were computed, and compared to a relative risk model with baseline information only, and a two-stage landmark model. The JLCM was shown to be the best fitting at various landmark times, and accounting for the longitudinal biomarker reduced both the EP and WAEP, particularly at earlier landmarks.
For Jacqmin-Gadda et al. [43], the score test methodology introduced previously is applied to develop a prognostic joint model for prostate cancer recurrence (with the same dataset used as in Taylor et al., [40]). They develop the JLCM similarly to Proust-Lima et al., [35, 42]. They show that the more flexible 4-class JLCM did not reject conditional independence, whereas the less powerful alternative Wald test for dependence failed to reject the null for a 3-class JLCM.
Comparison between latent-class and shared-parameter joint models
A direct comparison is made between the two types of joint models applied to prostate cancer by Proust-Lima and colleagues [45]. Three prognostic baseline factors were adjusted for, logged initial-PSA, T-stage, and Gleason score using the same Michigan hospital cohort dataset. The three-component parameterisation of PSA in the mixed-effect model was done in the same manner to Proust-Lima & Taylor, and Taylor et al. [42, 44] for both joint models for direct comparison. The developed 4-JLCM adjusting for PSA value and slope was chosen from information criteria and conditional independence being met. The BIC favoured the 4-JLCM compared to the shared-parameter JM.
For direct comparisons between the JLCM and SPJM, evaluation of dynamic predictions (for the entire follow-up) are made using the cross-validated EPOCE framework in the first 6 years. The 4-JLCM is superior to the SPJM in the first 4 years on internal validation, and also slightly better in the first 3 years on external validation.
Extensions to the shared-parameter joint model
We present some further extensions to the joint model in the following subsections. In particular we comment and review four papers with a cured fraction [38,39,40,41]; a competing risk joint model [48], where clinical recurrence is competing with a non-related cancer death; and a multi-state joint model [47], whereby patients can go through a pathway of disease states throughout follow-up.
Joint-cure models
A natural extension to the SPJM is to incorporate a cure component to the time-to-event submodel, whereby patients are considered to be susceptible to experience the event under study (e.g. recurrence), or, on the contrary, to be cured after initial treatment, and thus never susceptible of recurrence. Allocation into the two groups is typically modelled using a logistic classifier submodel:
$$\Pr \left(D=1|\ {X}_i\right)=\frac{\exp \left({\boldsymbol{\beta}}^T{X}_i\right)}{1+\exp \left({\boldsymbol{\beta}}^T{X}_i\right)},$$
where D = 1 refers to the susceptible group (observed only when the event of interest occurs), Xi is the fixed baseline design matrix with their corresponding vector of coefficients, β. Patients that have been allocated to the ‘cured’ group are coded D = 0.
There can be a high proportion of patients that are recurrence-free after long follow-up, resulting in heavy censoring. This may compromise the predictions of a joint model given the lack of events observed. It therefore may be appropriate to model these patients that appear to have prolonged event-free survival as ‘cured’, using a cure joint model.
There are four articles that consider a joint cure model for the risk of clinical recurrence [38,39,40,41]. The four papers have a similar model specification: a nonlinear parametric exponential decay-growth (U-shaped) model is used to capture the logPSA trajectory mi(t, ri) = ri1 exp(−tri2) + ri3 exp(tri4), where ri1, …, 4 are the random effects to be estimated. Those that have been allocated to the cure group (from the logistic incidence submodel) have ri4 ∣ (D = 0) = 0, as this assumption reduces the PSA trajectory, mi(t, ri), to an exponential decay cure SPJM. The conditional failure time model is given by h(t| Di, Xi, ri, β, α, g(mi)) = h0(t) exp( βTXi + αg(mi ∣ D, t)), where g(mi) can be given by including the trajectory function and its slope given in [40, 41].
Baseline covariates included pre-treatment PSA, T-stage, and Gleason score. Additionally Taylor et al. [40] considered PSA value & slope as time-dependent covariates, age, EBRT total delivered dose (in Gy) and treatment duration as baseline covariates. Yu and colleagues [41] included an exogenous time-dependent variable to indicate start of salvage HT, similarly to [44], and used a generalised Weibull model for the baseline hazard function. Both frequentist and Bayesian approaches are directly compared by Yu et al. [39].
In Law et al. [38], the joint cure model is compared to the standard cure model without longitudinal time-dependent information, and to the shared parameter joint model without the cure component. They showed better predictions and discrimination, together with reducing biases from informative censoring. Taylor et al. and Yu et al. [40, 41] compared the predictions of the model with updated information on the same patients who were initially used to develop the model, that is whereby more longitudinal PSAs and events on the same patients are gathered.
The extended shared-parameter joint-cure model offers additional flexibility to model the inherent heterogeneity of patients that go on to have extended event-free survival. Yu et al. directly compared joint models with and without a cure component. They showed a standard JM tends to overestimate the number of clinical events. They compared the two models using the conditional predictive ordinate and BIC, both favouring the additional cure submodel component, despite an extra 30 parameters needed to be estimated [41]. This however may over-parameterise the model without adequate event sizes [68]. Also as the prostate cancer disease pathway is complicated, clinical input is recommended with regards to plausibility of the cure component and its definition.
Competing risks joint models
The event of interest may be precluded by the occurrence of a competing event, for instance, non-cancer related deaths before recurrence. It is well known that biases are elicited by censoring these competing event deaths [69, 70]; joint models can be extended considering the presence of a competing event.
Ferrer et al. [48] perform individual dynamic predictions and validate the robustness of the estimators in the presence of competing risk of death (from a non-related cancer cause), within a frequentist framework. A cause-specific proportional hazards submodel is proposed for each competing event, and thus the relationship of the longitudinal biomarker with each competing event can be assessed. Individual dynamic predictions were estimated and compared to landmarking estimators. Two simulation studies were performed using simulated data that was alike to the applied prostate cancer dataset. Each approach validated the estimators, then compared and assess their robustness to misspecification of the joint model. Both the AUC and mean-squared prediction error were employed to characterise the predictive accuracy. An extension of the AUC was adapted to the competing risk setting, proposed by Blanche et al. [61]. It was shown that in almost all cases, the joint models were superior to the landmark models. The landmark models were only superior to the joint models when the longitudinal biomarker was heavily misspecified. Ferrer’s competing risk paper is the only study to present validation metrics, using simulated studies. Code is available at https://github.com/LoicFerrer/Individual-dynamic-predictions.Footnote 2
Multi-state joint models
The evolution of localised prostate cancer over time can be characterised by the occurrence of different events of interest, such as biochemical failure, local recurrence, distant recurrence and death. One way to jointly model all these events is via multi-state models, in which the event progressions of interest define the transition between different disease states [71]. As longitudinal process such as PSA trajectories can have an impact on several of these event transitions, multi-state models can be generalised to the joint modelling framework.
Ferrer et al. [47] proposed modelling the longitudinal PSA process using a mixed-effect submodel, similarly to Proust-Lima and Taylor [42], Sène et al. [46], and Ferrer at al [48]. They used a non-homogeneous Markov multi-state model for the intensity of the transitions between five states: 0) end of EBRT treatment, 1) local recurrence, 2) salvage HT, 3) distant recurrence, and 4) death (the absorbing state). Intermediate states could be skipped (e.g. ending EBRT0 ➔ death4), and backward transitions were not allowed. Two properties were considered: 1) the Markov property whereby the future process is only dependent on the present state and not the preceding transitions / states; 2) the non-homogeneous property ensures the time since entering the study influences the evolution of the process.
Each transition intensity is modelled assuming proportional hazards and incorporated the biomarker trajectory. For each transition from state i to j, only patients visiting the state i are included in the analysis. The baseline intensity function was modelled parametrically. The maximum likelihood framework was used to estimate the corresponding parameters.
The multi-state joint model was fitted with the same two study datasets as in Ferrer et al. [48]. Four covariates (presenting PSA, Gleason score, T-stage, and study cohort) were adjusted for in the models. Worse baseline risk factors were associated with higher values in their long-term PSA trajectories, and reaching clinical failure states earlier. Higher presenting PSA was associated with a higher instantaneous risk to salvage hormone therapy or death. A linear combination of increases in PSA value & slope were associated with significant deterioration for all clinical progression transitions (local, salvage HT, or distant failures) from the initial state.
After adjusting for all other covariates and PSA slope, a unit increase in the log PSA gave rise to a 43% increase in risk to local recurrence. Patients with continually high PSAs or increasingly steep PSA gradients after EBRT treatment, led to earlier and higher hazards to clinical failure states. Conversely, higher PSA levels had a protective effect on the transition to direct death after EBRT but were more likely to progress though the prostate cancer progression states.
Predictions were compared with the observed data. The observed values were averaged at each decile with corresponding predicted values computed, they show the observed values lay within the 95% CIs, with very similar predicted values. The predicted transition probabilities over time, in a given state to another other feasible state are presented, comparing similar parametric estimated probabilities to the observed. The only exception was between transitions 1➔2 (from local recurrence to receiving HT) where the spike after EBRT was not adequately captured with the splines, it shows there is a very near-immediate initiation of HT after localised recurrence to control the disease. It is worth noting that PSA dynamics were only collected until the patient’s first clinical event and not thereafter and were extrapolated according to their posterior trajectories.
Diagnostics of the joint multi-state model were evaluated visually. Residuals vs fitted values, observed and predicted PSA trajectories, and predicted vs non-parametric transition probabilities between states were presented. In general, they showed the model fits particularly well to the longitudinal, and multi-state submodels. The models themselves were not externally validated nor stated any predictive performance measures, only the estimation process via simulation studies. Although equations for obtaining individual dynamic predictions for patients were presented in the paper, these were not demonstrated with specific examples.
The code to apply these multistate models to a simulated dataset and adapt for use is freely available at https://github.com/LoicFerrer/JMstateModelFootnote 3 and could be used to derive patient predictions and be adapted for the reader’s need.