Table 2 Description of ML methods

From: Application of machine learning in predicting survival outcomes involving real-world data: a scoping review

| Method | Basic Concept | How It Works | Pros | Cons |
| --- | --- | --- | --- | --- |
| Random Survival Forest | An ensemble tree-based learning algorithm specialized for survival analysis | Trains multiple decision trees on different subsets of the data and averages their predictions; time-to-event data are used to split nodes and generate survival curves | Handles large, high-dimensional datasets; automatically captures feature interactions; robust to outliers | Training can be slow on very large datasets; may overfit without careful tuning |
| Boosted Tree | An ensemble tree-based method that combines weak predictors into a strong predictor | Trains simple models sequentially; each new tree tries to correct the mistakes of the previous ones | Handles different types of data; reduces bias and variance; highly accurate | Can overfit if too many trees are used; requires careful tuning; less interpretable |
| Artificial Neural Network | A model inspired by the human brain, with layers of interconnected nodes, or "neurons" | Each neuron receives input from the previous layer, applies a transformation, and passes its output to the next layer; learning consists of updating the transformation parameters | Can model complex nonlinear relationships; highly flexible and adaptable | Requires large amounts of data and computation; hard to interpret; prone to overfitting |
| Support Vector Machine | A binary classification method that finds the hyperplane maximizing the margin between classes | Finds the hyperplane that maximizes the distance to the closest points of each class; kernels allow nonlinear decision boundaries | Effective in high-dimensional spaces; relatively robust to overfitting, especially when features outnumber samples | Scales poorly to large datasets; requires careful choice of kernel; not directly applicable to multi-class problems |
| Regularization (LASSO, Ridge) | Linear models with penalty terms added to the loss function to prevent overfitting | LASSO (L1 regularization) and Ridge (L2 regularization) add penalty terms that shrink coefficients towards zero; LASSO can set some coefficients exactly to zero, performing feature selection | Prevents overfitting; reduces model complexity | May lead to underfitting if the regularization parameter is not tuned correctly |
| K-Nearest Neighbor | A simple algorithm that predicts from the k closest training examples | For a new instance, finds the k nearest instances in the training set and predicts from their outputs | Simple to understand and implement; makes no assumptions about the data distribution | Computationally expensive for large datasets; sensitive to irrelevant features; performance depends on the choice of k |
| Multi-Layer Perceptron | A type of artificial neural network with one or more hidden layers | A feed-forward neural network whose hidden layers apply successive nonlinear transformations to the input | Can model complex nonlinear relationships; flexible and adaptable | Requires large amounts of data and computation; hard to interpret; prone to overfitting |
| Naive Bayes | A probabilistic classifier based on Bayes' theorem with strong (naive) independence assumptions between features | Each feature contributes independently to the probability of each class; the class with the highest posterior probability is chosen | Fast and efficient; performs well in high dimensions; needs relatively little training data | Assumes feature independence, which often does not hold; can be biased if a class is under-represented in the training data |

Minimal, illustrative code sketches of each method follow the table.
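Random Survival Forest: a minimal sketch of fitting such a model, assuming the scikit-survival package is available. The synthetic data, the dependence of survival time on the first feature, and the ~70% event rate are all arbitrary choices for illustration, not part of the reviewed studies.

```python
import numpy as np
from sksurv.ensemble import RandomSurvivalForest
from sksurv.util import Surv

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 5))
# Hypothetical data-generating process: expected survival time
# grows with the first feature; remaining features are noise.
time = rng.exponential(scale=np.exp(X[:, 0]))
event = rng.random(n) < 0.7  # ~70% of subjects experience the event
y = Surv.from_arrays(event=event, time=time)

rsf = RandomSurvivalForest(n_estimators=100, min_samples_leaf=15, random_state=0)
rsf.fit(X, y)

print("risk scores:", rsf.predict(X[:3]))        # higher score = worse prognosis
surv_fns = rsf.predict_survival_function(X[:3])  # per-subject survival curves
```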
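Boosted Tree: a sketch using scikit-learn's GradientBoostingClassifier on synthetic data. The `staged_predict` loop illustrates the sequential nature of boosting, with accuracy typically improving as trees are added; the dataset and hyperparameters are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each new tree is fit to correct the errors of the current ensemble
gbm = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                 max_depth=3, random_state=0)
gbm.fit(X_tr, y_tr)

# staged_predict yields the ensemble's predictions after each added tree
for i, y_hat in enumerate(gbm.staged_predict(X_te)):
    if i % 50 == 0:
        print(f"trees={i + 1:3d}  accuracy={accuracy_score(y_te, y_hat):.3f}")
```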
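Artificial Neural Network: a from-scratch NumPy sketch of the neuron-level computation described in the table. The weights here are random and untrained; the point is only to show that each neuron computes a transformation of the previous layer's outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

# A toy two-layer network: each neuron computes activation(w . x + b)
x = rng.normal(size=3)                         # input features
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden layer: 4 neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # output layer: 1 neuron

h = relu(W1 @ x + b1)   # hidden neurons transform the input
y_hat = W2 @ h + b2     # output neuron combines the hidden activations
print(y_hat)

# "Learning" means adjusting W1, b1, W2, b2 to reduce a loss,
# typically via backpropagation and gradient descent.
```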
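Support Vector Machine: a sketch with scikit-learn's SVC and an RBF kernel on synthetic data. `decision_function` returns the signed distance to the separating hyperplane in the kernel-induced space; the kernel and `C` value are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# RBF kernel yields a nonlinear boundary; C trades margin width
# against margin violations on the training data
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm.fit(X_tr, y_tr)

print("test accuracy:", svm.score(X_te, y_te))
print("margins:", svm.decision_function(X_te[:5]))
```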
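Regularization (LASSO, Ridge): a sketch contrasting the two penalties with scikit-learn on synthetic regression data with only 5 informative features out of 20. The `alpha=1.0` penalty strength is an arbitrary example value; in practice it would be tuned, e.g. by cross-validation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# alpha controls penalty strength; too large a value causes underfitting
lasso = Lasso(alpha=1.0).fit(X, y)  # L1: can zero out coefficients entirely
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks coefficients but rarely zeros them

print("LASSO non-zero coefficients:", np.sum(lasso.coef_ != 0))
print("Ridge non-zero coefficients:", np.sum(ridge.coef_ != 0))
```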
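K-Nearest Neighbor: a sketch showing how the choice of k affects performance, using scikit-learn on synthetic data. The values of k compared are arbitrary examples.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=6, random_state=0)

# Small k gives noisy, overfit boundaries; large k gives overly smooth ones
for k in (1, 5, 25):
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(f"k={k:2d}  cross-validated accuracy={score:.3f}")
```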
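Multi-Layer Perceptron: a library-level sketch with scikit-learn's MLPClassifier, complementing the from-scratch neural network sketch above. The two hidden layers of 32 and 16 neurons are an arbitrary architecture for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Hidden layers of 32 and 16 neurons apply successive nonlinear transformations
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), activation="relu",
                  max_iter=1000, random_state=0),
)
mlp.fit(X_tr, y_tr)
print("test accuracy:", mlp.score(X_te, y_te))
```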
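Naive Bayes: a sketch with scikit-learn's GaussianNB, which models each feature's likelihood independently per class with a Gaussian; the synthetic dataset is illustrative. `predict_proba` exposes the posterior probabilities from which the highest-probability class is chosen.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each feature contributes independently to each class's likelihood;
# prediction picks the class with the highest posterior probability
nb = GaussianNB().fit(X_tr, y_tr)
print("test accuracy:", nb.score(X_te, y_te))
print("posteriors for first test case:", nb.predict_proba(X_te[:1]))
```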