Table 1 Machine learning methods

From: Survival prediction models: an introduction to discrete-time modeling

Method

Description

Random forest

An ensemble of tree-based learners built on bootstrap samples of the training data, whose individual predictions are averaged. In constructing each tree, a random subset of features is evaluated for the split criterion at each node. This de-correlates the individual trees, which can improve predictive performance.
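A minimal sketch of this idea using scikit-learn; the synthetic dataset and all parameter values are illustrative choices, not part of the original table.

```python
# Random forest: trees fit on bootstrap samples, with a random feature
# subset considered at each split; predictions are averaged across trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

rf = RandomForestClassifier(
    n_estimators=100,      # number of trees in the ensemble
    max_features="sqrt",   # random subset of features per split
    bootstrap=True,        # each tree sees a bootstrap sample
    random_state=0,
)
rf.fit(X, y)
proba = rf.predict_proba(X)  # class probabilities averaged over trees
```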

Boosting

An ensemble of base learners constructed sequentially, with observations progressively reweighted to increase emphasis on those with wrong predictions and high errors. Subsequent learners are therefore more likely to correctly classify these misclassified observations.
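A short sketch of the reweighting scheme via AdaBoost in scikit-learn; the dataset and settings are illustrative assumptions, not from the source.

```python
# Boosting: base learners are fit one after another, and observations
# misclassified by earlier learners are upweighted for later ones.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

booster = AdaBoostClassifier(n_estimators=50, random_state=0)
booster.fit(X, y)
pred = booster.predict(X)  # weighted vote of the sequential learners
```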

Support vector machines

A method that uses a kernel function to map input features into a high-dimensional feature space where classification (survival) can be described by a hyperplane.
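A minimal illustration with scikit-learn's support vector classifier; the RBF kernel choice and the synthetic data are assumptions for the sketch.

```python
# SVM: the RBF kernel implicitly maps inputs to a high-dimensional
# space, where the classifier is a separating hyperplane.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

svm = SVC(kernel="rbf")  # kernel function defines the feature mapping
svm.fit(X, y)
pred = svm.predict(X)
```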

Penalized regression

Provides a mathematical solution for applying regression methods to correlated features by using an ℓ2 penalty term (ridge). An ℓ1 penalty (LASSO) can additionally encourage sparsity, avoiding overfitting and performing variable selection. A weighted combination of ℓ1 and ℓ2 penalties can do both (elastic net).
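A sketch of the three penalties in scikit-learn; the simulated data, the penalty strengths, and the mixing weight are illustrative assumptions.

```python
# Penalized regression: ridge (l2), LASSO (l1), and elastic net (both).
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
# Only the first two features carry signal; the rest are noise.
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)              # l2: shrinks coefficients
lasso = Lasso(alpha=0.1).fit(X, y)              # l1: sets some to zero
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)  # weighted l1 + l2

# The l1 penalty performs variable selection: noise coefficients are zeroed.
n_nonzero = int(np.sum(lasso.coef_ != 0))
```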

Artificial neural networks

Comprise layers of nodes, starting with an input layer representing the data features, which feeds into one or more hidden layers, and ending with an output layer that presents the final prediction.
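A minimal feed-forward network using scikit-learn's MLP classifier; the layer sizes, iteration limit, and synthetic data are assumptions for the sketch, not from the source.

```python
# Neural network: input layer (10 features) -> two hidden layers -> output.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

mlp = MLPClassifier(
    hidden_layer_sizes=(16, 8),  # two hidden layers between input and output
    max_iter=500,
    random_state=0,
)
mlp.fit(X, y)
pred = mlp.predict(X)  # final prediction from the output layer
```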