Table 1 Possible methods to account for the effects of treatment in a validation set

From: Accounting for treatment use when validating a prognostic model: a simulation study

Approach

Implementation

Key considerations

1. Exclude treated individuals

1. Exclude any individual who received treatment between the point of prediction and the assessment of the outcome from the analysis.

2. Estimate model performance in only the untreated subset.

- Provides correct estimates of performance in the (untreated) target population if treatment use is not associated with other prognostic factors.†

- Decreases the effective sample size.
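As an illustration of approach 1, the sketch below drops treated individuals and then computes two common performance measures in the untreated subset: the c-statistic and the observed/expected event ratio. The function names, the toy data, and the choice of these two measures are assumptions made for this example; the table itself does not prescribe specific performance measures.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def c_statistic(y, lp):
    """Share of (event, non-event) pairs where the event has the higher LP; ties count 0.5."""
    events = [s for s, o in zip(lp, y) if o == 1]
    nonevents = [s for s, o in zip(lp, y) if o == 0]
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

def validate_untreated(y, lp, treated):
    """Step 1: exclude treated individuals. Step 2: estimate performance in the rest."""
    keep = [i for i, t in enumerate(treated) if t == 0]
    y_u = [y[i] for i in keep]
    lp_u = [lp[i] for i in keep]
    expected = sum(expit(v) for v in lp_u)  # expected number of events under the model
    return {
        "n": len(keep),
        "c_statistic": c_statistic(y_u, lp_u),
        "observed_expected": sum(y_u) / expected,
    }

# Hypothetical toy validation set: outcome, linear predictor, treatment indicator.
y = [1, 0, 1, 0, 1, 0]
lp = [1.0, -1.0, 0.5, 0.0, 2.0, -0.5]
treated = [0, 0, 0, 0, 1, 1]

result = validate_untreated(y, lp, treated)
```

Note how the analysis set shrinks from six to four individuals, which is the loss of effective sample size listed under the key considerations.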

2. Inverse probability weighting

1. Fit a propensity score (PS) model for treatment in the validation set using logistic regression:

\( \mathrm{logit}\left({\mathrm{Tr}}_i\right)={\upalpha}_0+{\sum}_{i=1}^{\mathrm{n}}\left({\upalpha}_i{\mathrm{X}}_i\right) \)

2. Calculate PS for individuals using the estimates from the fitted PS model:

\( {\mathrm{PS}}_i=1/\left(1+\exp \left(-\left({\widehat{\upalpha}}_0+{\sum}_{i=1}^{\mathrm{n}}\left({\widehat{\upalpha}}_i{\mathrm{X}}_i\right)\right)\right)\right) \)

3. Calculate inverse probability weights (\( {w}_i \)) for each untreated individual based on their individual PS:

\( {w}_i=1/\left(1-{\mathrm{PS}}_i\right) \) [17]

4. Exclude treated individuals from the analysis set.

5. (optional) Truncate weights [21].

6. Estimate weighted measures of model performance in only the untreated subset.

- Provides correct estimates of performance in the (untreated) target population whether or not treatment use is associated with other prognostic factors, provided the key assumptions of IPW are met.†

- Does not provide correct estimates in the presence of non-positivity, or when there are unobserved predictors that are strongly associated with both the outcome and use of treatment [15, 18].

- Exclusion of treated individuals decreases the effective sample size.

- Extreme weights can further reduce precision and introduce bias.
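The six IPW steps can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the gradient-ascent fitting routine, the fixed truncation cap, the choice of a weighted c-statistic as the performance measure, and all names and toy data are assumptions. The propensity score is taken as the predicted probability of treatment (the inverse logit of the fitted linear predictor), which the weight 1 / (1 - PS) requires.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, n_iter=5000, lr=0.1):
    """Step 1: fit the PS model by plain gradient ascent; returns [alpha_0, alpha_1, ...]."""
    n, p = len(X), len(X[0])
    alpha = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            resid = ti - expit(alpha[0] + sum(a * x for a, x in zip(alpha[1:], xi)))
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        alpha = [a + lr * g / n for a, g in zip(alpha, grad)]
    return alpha

def propensity(alpha, xi):
    """Step 2: predicted probability of treatment for one individual."""
    return expit(alpha[0] + sum(a * x for a, x in zip(alpha[1:], xi)))

def weighted_c_statistic(y, lp, w):
    """Concordance over (event, non-event) pairs, each pair weighted by w_i * w_j."""
    num = den = 0.0
    for yi, li, wi in zip(y, lp, w):
        if yi != 1:
            continue
        for yj, lj, wj in zip(y, lp, w):
            if yj != 0:
                continue
            den += wi * wj
            if li > lj:
                num += wi * wj
            elif li == lj:
                num += 0.5 * wi * wj
    return num / den

def ipw_validation(X, treated, y, lp, truncate_at=None):
    alpha = fit_logistic(X, treated)
    rows = []
    for xi, ti, yi, li in zip(X, treated, y, lp):
        if ti == 1:
            continue                               # step 4: exclude treated individuals
        w = 1.0 / (1.0 - propensity(alpha, xi))    # step 3: weight for an untreated individual
        if truncate_at is not None:
            w = min(w, truncate_at)                # step 5 (optional): truncate at a fixed cap
        rows.append((yi, li, w))
    ys, lps, ws = zip(*rows)
    return list(ws), weighted_c_statistic(ys, lps, ws)  # step 6

# Hypothetical toy validation set with one predictor.
X = [[-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0]]
treated = [0, 0, 1, 0, 0, 1, 0, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
lp = [-1.2, -0.8, -0.3, 0.1, 0.4, 0.9, 1.3, 1.8]

weights, c_weighted = ipw_validation(X, treated, y, lp)
weights_trunc, c_trunc = ipw_validation(X, treated, y, lp, truncate_at=1.5)
```

In practice weights are often truncated at percentiles of their distribution rather than at a fixed value; the fixed cap here only keeps the example short.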

3. Recalibration

1. Calculate the linear predictor of the prognostic model:

\( {\mathrm{LP0}}_i={\sum}_{i=1}^{\mathrm{n}}\left({\widehat{\upbeta}}_i{\mathrm{X}}_i\right) \)

2. Re-estimate the model intercept in the full validation data [22, 23]:

\( \mathrm{logit}\left({\mathrm{Y}}_i\right)={\gamma}_0+\mathrm{offset}\left({\mathrm{LP0}}_i\right) \)

3. Calculate the updated linear predictor:

\( {\mathrm{LP1}}_i={\widehat{\gamma}}_0+{\mathrm{LP0}}_i \)

4. Estimate model performance using LP1.

- Does not affect discrimination.

- Not sufficient to correct calibration if relative treatment effects are heterogeneous or if treatment use is associated with an individual’s risk.

- Also adjusts for other differences in case-mix, which leads to misleading estimates of the calibration of the original model.
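The recalibration steps can be sketched in a few lines. With the original linear predictor held fixed as an offset, the only free parameter is the intercept, so a one-dimensional Newton-Raphson search is enough. The function name, tolerance, and toy data below are assumptions for illustration; a statistics package with offset support would normally be used instead.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def recalibrate_intercept(y, lp0, tol=1e-10, max_iter=50):
    """Step 2: maximum-likelihood gamma_0 in logit(Y_i) = gamma_0 + offset(LP0_i).

    Requires at least one event and one non-event in y.
    """
    gamma0 = 0.0
    for _ in range(max_iter):
        p = [expit(gamma0 + v) for v in lp0]
        score = sum(yi - pi for yi, pi in zip(y, p))   # first derivative of the log-likelihood
        info = sum(pi * (1.0 - pi) for pi in p)        # minus the second derivative
        step = score / info
        gamma0 += step
        if abs(step) < tol:
            break
    return gamma0

# Hypothetical toy validation set (treated and untreated individuals together).
y = [0, 0, 1, 0, 1, 0, 0, 1]
lp0 = [-0.5, 0.2, 1.0, -1.0, 1.5, 0.0, 0.8, 2.0]   # step 1: original linear predictor

gamma0_hat = recalibrate_intercept(y, lp0)
lp1 = [gamma0_hat + v for v in lp0]                # step 3: updated linear predictor
```

Because every individual receives the same shift, the ranking of individuals is unchanged, which is why this approach does not affect discrimination. After the update, the expected number of events under LP1 equals the observed number.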

4. Model treatment

1. Refit the original prognostic model using the full validation data, including an indicator term for treatment use and treatment interaction terms.

i) with recalibration of the intercept:

\( \mathrm{logit}\left({\mathrm{Y}}_i\right)={\gamma}_0+\mathrm{offset}\left({\mathrm{LP0}}_i\right)+{\gamma}_{Tr}{\mathrm{Tr}}_i \)*

ii) with a full refit of the original model:

\( \mathrm{logit}\left({\mathrm{Y}}_i\right)={\gamma}_0+{\sum}_{i=1}^{\mathrm{n}}\left({\upgamma}_i{\mathrm{X}}_i\right)+{\gamma}_{Tr}{\mathrm{Tr}}_i \)*

2. Calculate the updated linear predictor:

i) \( {\mathrm{LP2}}_i={\widehat{\gamma}}_0+{\sum}_{i=1}^{\mathrm{n}}\left({\widehat{\upbeta}}_i{\mathrm{X}}_i\right)+{\widehat{\gamma}}_{Tr}{\mathrm{Tr}}_i \)*

ii) \( {\mathrm{LP3}}_i={\widehat{\gamma}}_0+{\sum}_{i=1}^{\mathrm{n}}\left({\widehat{\upgamma}}_i{\mathrm{X}}_i\right)+{\widehat{\gamma}}_{Tr}{\mathrm{Tr}}_i \)*

3. Estimate model performance using LP2 or LP3.

- Can lead to an over-estimation of model discrimination.

- Also adjusts for other differences in case-mix, which leads to misleading estimates of the calibration of the original model.
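Option (i) of this approach, intercept recalibration plus a treatment indicator, can be sketched as a two-parameter Newton-Raphson fit with the original linear predictor as an offset. This is a simplified illustration under stated assumptions: treatment-by-predictor interaction terms are omitted, and the function name and toy data are hypothetical. Option (ii), the full refit, would need a general logistic regression routine and is not shown.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_offset_with_treatment(y, lp0, tr, tol=1e-10, max_iter=50):
    """Fit logit(Y_i) = gamma_0 + offset(LP0_i) + gamma_Tr * Tr_i by Newton-Raphson.

    Requires events and non-events in both the treated and the untreated group.
    """
    g0 = g_tr = 0.0
    for _ in range(max_iter):
        s0 = s1 = h00 = h01 = h11 = 0.0
        for yi, li, ti in zip(y, lp0, tr):
            p = expit(g0 + li + g_tr * ti)
            r = yi - p
            v = p * (1.0 - p)
            s0 += r            # score for gamma_0
            s1 += r * ti       # score for gamma_Tr
            h00 += v           # information matrix entries
            h01 += v * ti
            h11 += v * ti * ti
        det = h00 * h11 - h01 * h01
        d0 = (h11 * s0 - h01 * s1) / det   # solve the 2x2 Newton system
        d1 = (h00 * s1 - h01 * s0) / det
        g0 += d0
        g_tr += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return g0, g_tr

# Hypothetical toy validation set.
y = [0, 1, 1, 0, 0, 0, 1, 0]
lp0 = [-0.5, 0.3, 1.0, -1.0, 0.5, -0.2, 1.2, 0.0]
tr = [0, 0, 0, 0, 1, 1, 1, 1]

g0_hat, g_tr_hat = fit_offset_with_treatment(y, lp0, tr)
lp2 = [g0_hat + li + g_tr_hat * ti for li, ti in zip(lp0, tr)]   # updated linear predictor
```

Unlike intercept-only recalibration, the treatment term shifts treated and untreated individuals by different amounts, so rankings can change. This is one way to see why the approach can alter, and potentially over-estimate, discrimination.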

  1. Abbreviations: \( {\mathrm{X}}_i \), design matrix (predictor values) for individual i; \( {\mathrm{Y}}_i \), outcome for individual i; LP, linear predictor; PS, propensity score; Tr, treatment
  2. \( {\widehat{\upalpha}}_i \) represent coefficients of the treatment propensity model for individual i
  3. \( {\widehat{\upbeta}}_i \) represent coefficients of the original prognostic model for individual i
  4. \( {\widehat{\upgamma}}_i \) represent coefficients of the updated prognostic model for individual i
  5. *Interaction terms between treatment use and predictors should be included where necessary
  6. †Estimates will be correct providing all other modelling assumptions are met