From: Accounting for treatment use when validating a prognostic model: a simulation study
Approach | Implementation | Key considerations |
---|---|---|
1. Exclude treated individuals | 1. Exclude from the analysis any individual who received treatment between the point of prediction and the assessment of the outcome. 2. Estimate model performance in the untreated subset only. | - Provides correct estimates of performance in the (untreated) target population if treatment use is not associated with other prognostic factors.† - Decreases the effective sample size. |
2. Inverse probability weighting (IPW) | 1. Fit a propensity score (PS) model for treatment in the validation set using logistic regression: \( \operatorname{logit}(\mathrm{Tr}_i) = \alpha_0 + \sum_{i=1}^{n} \alpha_i X_i \). 2. Calculate the PS for each individual as the predicted probability of treatment from the fitted PS model: \( \mathrm{PS}_i = \operatorname{expit}\left( \widehat{\alpha}_0 + \sum_{i=1}^{n} \widehat{\alpha}_i X_i \right) \), where \( \operatorname{expit}(z) = 1/(1 + e^{-z}) \). 3. Calculate the inverse probability weight \( w_i \) for each untreated individual from their PS: \( w_i = 1/(1 - \mathrm{PS}_i) \) [17]. 4. Exclude treated individuals from the analysis set. 5. (Optional) Truncate weights [21]. 6. Estimate weighted measures of model performance in the untreated subset only. | - Provides correct estimates of performance in the (untreated) target population whether or not treatment use is associated with other prognostic factors, provided the key assumptions of IPW are met.† - Does not provide correct estimates in the presence of non-positivity, or when there are unobserved predictors that are strongly associated with both the outcome and the use of treatment [15, 18]. - Exclusion of treated individuals decreases the effective sample size. - Extreme weights can further reduce precision and introduce bias. |
3. Recalibration | 1. Calculate the linear predictor of the prognostic model: \( \mathrm{LP0}_i = \sum_{i=1}^{n} \widehat{\beta}_i X_i \). 2. Re-estimate the model intercept in the full validation data [22, 23]: \( \operatorname{logit}(Y_i) = \gamma_0 + \mathrm{offset}(\mathrm{LP0}_i) \). 3. Calculate the updated linear predictor: \( \mathrm{LP1}_i = \widehat{\gamma}_0 + \mathrm{LP0}_i \). 4. Estimate model performance using LP1. | - Does not affect discrimination. - Not sufficient to correct calibration if relative treatment effects are heterogeneous or if treatment use is associated with an individual’s risk. - Also adjusts for other differences in case-mix, which leads to misleading estimates of the calibration of the original model. |
4. Model treatment | 1. Refit the original prognostic model using the full validation data, including an indicator term for treatment use and treatment interaction terms, either i) with recalibration of the intercept: \( \operatorname{logit}(Y_i) = \gamma_0 + \mathrm{offset}(\mathrm{LP0}_i) + \gamma_{Tr} \mathrm{Tr}_i \) * or ii) with a full refit of the original model: \( \operatorname{logit}(Y_i) = \gamma_0 + \sum_{i=1}^{n} \gamma_i X_i + \gamma_{Tr} \mathrm{Tr}_i \) * 2. Calculate the updated linear predictor: i) \( \mathrm{LP2}_i = \widehat{\gamma}_0 + \sum_{i=1}^{n} \widehat{\beta}_i X_i + \widehat{\gamma}_{Tr} \mathrm{Tr}_i \) * ii) \( \mathrm{LP3}_i = \widehat{\gamma}_0 + \sum_{i=1}^{n} \widehat{\gamma}_i X_i + \widehat{\gamma}_{Tr} \mathrm{Tr}_i \) * 3. Estimate model performance using LP2 or LP3. | - Can lead to an over-estimation of model discrimination. - Also adjusts for other differences in case-mix, which leads to misleading estimates of the calibration of the original model. |
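
The steps listed for approaches 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data are made up, all function names (`fit_logistic`, `propensity_scores`, `ipw_weights`, `weighted_auc`, `weighted_oe_ratio`) are hypothetical, and a plain gradient-ascent fit stands in for the logistic regression routine of a statistics package. Performance is summarised here by a weighted c-statistic and a weighted observed/expected ratio, which are only two of the possible measures.

```python
import math


def expit(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=1.0, n_iter=2000):
    """Logistic regression by plain gradient ascent (illustrative stand-in).

    Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - expit(coef[0] + sum(c * v for c, v in zip(coef[1:], xi)))
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        coef = [c + lr * g / n for c, g in zip(coef, grad)]
    return coef


def propensity_scores(X, treated):
    """Steps 1-2: fit the PS model, return the probability of treatment."""
    a = fit_logistic(X, treated)
    return [expit(a[0] + sum(c * v for c, v in zip(a[1:], xi))) for xi in X]


def ipw_weights(ps, max_weight=None):
    """Step 3 (and optional step 5): w = 1 / (1 - PS), optionally truncated."""
    w = [1.0 / (1.0 - p) for p in ps]
    return w if max_weight is None else [min(wi, max_weight) for wi in w]


def weighted_auc(risk, y, w):
    """Weighted c-statistic over all (event, non-event) pairs; ties count 0.5."""
    num = den = 0.0
    for r1, y1, w1 in zip(risk, y, w):
        if y1 != 1:
            continue
        for r0, y0, w0 in zip(risk, y, w):
            if y0 != 0:
                continue
            pair_w = w1 * w0
            den += pair_w
            if r1 > r0:
                num += pair_w
            elif r1 == r0:
                num += 0.5 * pair_w
    return num / den


def weighted_oe_ratio(risk, y, w):
    """Weighted observed event proportion divided by weighted mean predicted risk."""
    observed = sum(wi * yi for wi, yi in zip(w, y))
    expected = sum(wi * ri for wi, ri in zip(w, risk))
    return observed / expected


# Toy validation set (entirely made up for illustration).
X = [[0], [0], [0], [0], [1], [1], [1], [1]]     # one binary predictor
treated = [0, 0, 0, 1, 0, 1, 1, 1]
risk = [0.1, 0.2, 0.3, 0.2, 0.6, 0.5, 0.7, 0.8]  # predictions of the model under validation
outcome = [0, 1, 0, 0, 1, 0, 1, 1]

ps = propensity_scores(X, treated)
keep = [i for i, t in enumerate(treated) if t == 0]  # step 4: untreated only
risk_u = [risk[i] for i in keep]
y_u = [outcome[i] for i in keep]
w_u = ipw_weights([ps[i] for i in keep])

auc_unweighted = weighted_auc(risk_u, y_u, [1.0] * len(keep))  # approach 1
auc_ipw = weighted_auc(risk_u, y_u, w_u)                       # approach 2
oe_ipw = weighted_oe_ratio(risk_u, y_u, w_u)
print(auc_unweighted, auc_ipw, oe_ipw)
```

Passing unit weights to `weighted_auc` reproduces approach 1 (simple exclusion of treated individuals), so the two approaches differ only in the weights applied to the untreated subset. The `max_weight` argument is one simple way to truncate weights; the choice of truncation threshold is left open here.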