Review and evaluation of performance measures for survival prediction models in external validation settings

Rahman, M. Shafiqur; Ambler, Gareth; Choodari-Oskooei, Babak; Omar, Rumana Z.

doi:10.1186/s12874-017-0336-2

BMC Medical Research Methodology

Table 1 Summary of the performance measures

Types of Measures	Measures	Characteristics	Range and Interpretation	Software
Overall Performance	R² _BS	Assesses the relative gain in predictive accuracy at a specific time point, quantified using a squared error loss function.	Range: 0 to 1 Interpretation: % gain in predictive accuracy at a single time point relative to the null model.	Available in SAS and R and easy to implement in other software
	R² _IBS	Same approach as R² _BS but provides a summary over a time period.	Range: same as R² _BS Interpretation: % gain in predictive accuracy over a time period relative to the null model.	Available in SAS and R and easy to implement in other software
	R² _SH	Assesses the relative gain in predictive accuracy, quantified using an absolute error loss function. It is not robust to model mis-specification.	Same as R² _IBS	Available in SAS and R and easy to implement in other software
	R² _S	Modified version of R² _SH which is robust to model mis-specification.	Same as R² _IBS	Available in SAS and R and easy to implement in other software
	R² _PM	Measures the variation in the outcome explained by the covariates in the model. Assumes that the model is correctly specified. Requires re-calibration in the validation data.	Range: 0 to 1 Interpretation: % of explained variation by the model.	Easy to implement in any software
	R² _D	Measures the relative gain in prognostic separation quantified by the D statistic. Assumes that the PI is normally distributed.	Range: 0 to 1 Interpretation: % of prognostic separation explained by the model.	Available in Stata and easy to implement in other software
Discrimination	C_H	Rank order statistic based on usable pairs in which the shorter time corresponds to an event.	Range: 0.5 to 1 Interpretation: probability of correct ordering for a randomly selected pair of subjects.	Available in R and Stata and easy to implement in other software
	C_U	Rank order statistic based on usable pairs. Inverse probability weighting is used to compensate for censoring.	Same as C_H.	Available in R and easy to implement in other software
	C_GH	Rank order statistic based on all patient pairs. Assumes that the Cox PH model is correctly specified. Requires re-calibration in the validation data.	Same as C_H.	Available in R and Stata and easy to implement in other software
	D	Quantifies the observed separation between low and high risk groups. Assumes that the PI is normally distributed.	Range: 0 to ∞ Interpretation: log hazard ratio between two equal sized prognostic groups formed by dichotomising the PI at its median.	Available in Stata and easy to implement in other software
Calibration	Cal Slope	Regression slope of the PI; assesses the agreement between the observed and predicted survival.	Range: −∞ to ∞ Interpretation: a value of 1 suggests perfect calibration and a value much lower than 1 suggests overfitting.	Easy to implement in any software
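
As an illustration of the C_H row, the following is a minimal sketch of a rank order statistic computed over usable pairs, written in plain Python. It is not taken from the article or from any of the packages listed in the Software column. The function name `harrell_c`, its arguments, and the toy data are hypothetical, and two conventions are assumptions made here: pairs with tied observed times are skipped, and ties in the PI count as half-concordant.

```python
def harrell_c(times, events, risk_scores):
    """Concordance over usable pairs (illustrative sketch, not a library API).

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if censored
    risk_scores: prognostic index (PI); higher means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when the shorter time corresponds to an event.
            # Assumption: pairs with tied times are skipped.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # subject failing earlier has higher PI
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # assumption: tied PI counts as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (made up for illustration): four subjects, the third is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
pi = [1.5, 0.2, 0.9, -0.3]
c_h = harrell_c(times, events, pi)
print(c_h)
```

In this toy data there are five usable pairs, four of which are ordered correctly by the PI. The double loop is quadratic in the number of subjects, which is acceptable for a sketch but slow for large validation sets.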

ISSN: 1471-2288
