
Table 1 Summary of the performance measures

From: Review and evaluation of performance measures for survival prediction models in external validation settings

Types of Measures

Measures

Characteristics

Range and Interpretation

Software

Overall Performance

R2 BS

Assesses the relative gain in predictive accuracy at a specific time point, quantified using a squared error loss function.

Range: 0 to 1

Interpretation: % gain in predictive accuracy at a single time point relative to the null model.

Available in SAS and R and easy to implement in other software

R2 IBS

Same approach as R2 BS but provides a summary over a range of time points.

Range: same as R2 BS

Interpretation: % gain in predictive accuracy over a range of time points relative to the null model.

Available in SAS and R and easy to implement in other software

R2 SH

Assesses the relative gain in predictive accuracy, quantified using an absolute error loss function. It is not robust to model mis-specification.

Same as R2 IBS

Available in SAS and R and easy to implement in other software

R2 S

Modified version of R2 SH which is robust to model mis-specification.

Same as R2 IBS

Available in SAS and R and easy to implement in other software

R2 PM

Measures the variation in the outcome explained by the covariates in the model. Assumes that the model is correctly specified. Requires re-calibration in the validation data.

Range: 0 to 1

Interpretation: % of explained variation by the model.

Easy to implement in any software

R2 D

Measures the relative gain in prognostic separation quantified by the D statistic. Assumes that the prognostic index (PI) is normally distributed.

Range: 0 to 1

Interpretation: % of prognostic separation explained by the model.

Available in Stata and easy to implement in other software

Discrimination

CH

Rank order statistic based on usable pairs, i.e. pairs in which the shorter time corresponds to an event.

Range: 0.5 to 1

Interpretation: probability of correct ordering for a randomly selected pair of subjects.

Available in R and Stata and easy to implement in other software

CU

Rank order statistic based on usable pairs. Inverse probability weighting is used to compensate for censoring.

Same as CH.

Available in R and easy to implement in other software

CGH

Rank order statistic based on all patient pairs. Assumes that the Cox PH model is correctly specified. Requires re-calibration in the validation data.

Same as CH.

Available in R and Stata and easy to implement in other software

D

Quantifies the observed separation between low- and high-risk groups. Assumes that the PI is normally distributed.

Range: 0 to ∞

Interpretation: log hazard ratio between two equal-sized prognostic groups formed by dichotomising the PI at its median.

Available in Stata and easy to implement in other software

Calibration

Cal Slope

Regression slope on the PI; assesses the agreement between observed and predicted survival.

Range: −∞ to ∞

Interpretation: a value of 1 suggests perfect calibration and a value much lower than 1 suggests overfitting.

Easy to implement in any software
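
As an illustration of the "usable pairs" idea in the CH row, the following is a minimal Python sketch of a Harrell-type concordance statistic. It is not taken from the article or from any of the software listed in the table. The function name `harrell_c`, its argument names, and the tie handling (pairs with tied times are skipped, tied predictions count as half) are assumptions made for this example.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Naive O(n^2) sketch of a Harrell-type concordance statistic.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher value = higher predicted risk (e.g. the PI)
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not usable
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed first: correct ordering
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # assumption: tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (made up for illustration): the second subject is censored.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
print(harrell_c(times, events, [0.9, 0.3, 0.1, 0.5]))
```

In this toy data four pairs are usable, because the two pairs in which the censored subject has the shorter time are dropped. Three of the four are ordered correctly, so the call returns 0.75.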