Method | Basic Concept | How It Works | Pros | Cons |
---|---|---|---|---|
Random Survival Forest | An ensemble tree-based learning algorithm specialized for survival analysis | Trains many decision trees on bootstrap samples of the data, splitting nodes with a survival criterion (e.g., the log-rank test) and aggregating the trees' cumulative hazard estimates into survival curves (sketch below) | Handles large, high-dimensional datasets; automatically captures feature interactions; robust to outliers | Training can be slow on very large datasets; may overfit without careful tuning |
Boosted Tree | An ensemble tree-based method that combines weak predictors into a strong predictor | Trains shallow trees sequentially; each new tree is fit to the errors (loss gradients) of the ensemble built so far (sketch below) | Handles mixed data types; primarily reduces bias; highly accurate | Can overfit if too many trees are used; requires careful tuning; less interpretable |
Artificial Neural Network | A model inspired by the human brain, with layers of interconnected nodes or "neurons" | Each neuron computes a weighted sum of its inputs, applies a nonlinear activation, and passes the result forward; learning updates the weights, typically via backpropagation (sketch below) | Can model complex nonlinear relationships; highly flexible and adaptable | Requires lots of data and computational resources; hard to interpret; prone to overfitting |
Support Vector Machine | A binary classification method that finds the hyperplane maximizing the margin between classes | Finds the hyperplane that maximizes the distance to the closest points of each class; kernels allow nonlinear boundaries (sketch below) | Effective in high-dimensional spaces; relatively robust to overfitting when dimensions outnumber samples | Scales poorly to large datasets; requires careful kernel choice; multi-class problems need one-vs-rest or one-vs-one schemes |
Regularization (LASSO, Ridge) | Linear models with penalty terms added to the loss function to prevent overfitting | LASSO (L1) and Ridge (L2) add penalties that shrink coefficients toward zero; LASSO can set coefficients exactly to zero, performing feature selection, while Ridge only shrinks them (sketch below) | Prevents overfitting; reduces model complexity | May underfit if the regularization parameter is not tuned correctly |
K-Nearest Neighbor | A simple algorithm that predicts from the k closest training examples | For a new instance, finds the k nearest training instances and predicts by majority vote (classification) or averaging (regression) (sketch below) | Simple to understand and implement; no assumptions about data distribution | Computationally expensive at prediction time on large datasets; sensitive to irrelevant features; performance depends on the choice of k |
Multi-Layer Perceptron | A feedforward artificial neural network with one or more hidden layers | Passes inputs through successive fully connected layers with nonlinear activations, trained by backpropagation (sketch below) | Can model complex nonlinear relationships; flexible and adaptable | Requires lots of data and computational resources; hard to interpret; prone to overfitting |
Naive Bayes | Probabilistic classifier based on Bayes' theorem with strong (naive) independence assumptions between features | Each feature contributes independently to the class probability; the class with the highest posterior probability is chosen (sketch below) | Fast and efficient; performs well in high dimensions; needs relatively little training data | Assumes feature independence, which often does not hold; can be biased if a class is underrepresented in the training data |
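
The sketches referenced in the table follow, each on small synthetic data; they are illustrative, not production-ready. First, a minimal random survival forest, assuming the third-party scikit-survival package (`sksurv`) is installed; the right-censored outcomes here are simulated for illustration only.

```python
import numpy as np
from sksurv.ensemble import RandomSurvivalForest

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))

# Simulate right-censored survival data: an event time driven by the
# first feature, and an independent censoring time.
event_time = rng.exponential(scale=np.exp(X[:, 0]))
censor_time = rng.exponential(scale=2.0, size=n)
y = np.array(
    [(t <= c, min(t, c)) for t, c in zip(event_time, censor_time)],
    dtype=[("event", bool), ("time", float)],  # structured array sksurv expects
)

# Each tree splits on a survival criterion (log-rank by default); the
# forest aggregates the trees' cumulative hazard estimates.
rsf = RandomSurvivalForest(n_estimators=100, min_samples_leaf=10, random_state=0)
rsf.fit(X, y)
surv_fns = rsf.predict_survival_function(X[:2])  # one step function S(t) per sample
```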
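
A minimal gradient-boosted tree classifier with scikit-learn; the dataset, tree depth, and number of trees are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 200 shallow trees trained sequentially; each new tree fits the loss
# gradients (errors) of the ensemble built so far.
gbt = GradientBoostingClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
gbt.fit(X_tr, y_tr)
print("test accuracy:", gbt.score(X_te, y_te))
```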
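
A from-scratch NumPy sketch of a tiny neural network, showing the neuron-level picture from the table: a weighted sum, a nonlinear activation, and learning as gradient updates to the weights. The architecture and learning rate are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + X[:, 1] ** 2 > 0).astype(float).reshape(-1, 1)

# One hidden layer of 8 neurons; each computes tanh(w . x + b).
W1, b1 = 0.5 * rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = 0.5 * rng.normal(size=(8, 1)), np.zeros(1)
lr = 0.1

for _ in range(500):
    H = np.tanh(X @ W1 + b1)                  # hidden activations
    p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))  # sigmoid output probability
    # Backpropagation: gradients of mean binary cross-entropy.
    d_out = (p - y) / len(X)
    dW2, db2 = H.T @ d_out, d_out.sum(axis=0)
    d_hid = (d_out @ W2.T) * (1.0 - H**2)     # chain rule through tanh
    dW1, db1 = X.T @ d_hid, d_hid.sum(axis=0)
    # "Learning" is updating the transformation parameters.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print("train accuracy:", ((p > 0.5) == y).mean())
```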
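
A kernelized SVM on a toy two-class dataset with scikit-learn; the RBF kernel and C value are illustrative defaults, not recommendations.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# The RBF kernel makes the maximum-margin hyperplane (linear in the
# implicit kernel space) a nonlinear boundary in the original features.
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
print("train accuracy:", clf.score(X, y))
```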
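
A side-by-side look at LASSO and Ridge on synthetic data where only two of ten features matter; the penalty strengths (alpha) are arbitrary and would normally be tuned, e.g. by cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two features carry signal; the other eight are noise.
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1: drives noise coefficients to exactly 0
ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks all coefficients toward 0
print("lasso:", np.round(lasso.coef_, 2))
print("ridge:", np.round(ridge.coef_, 2))
```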
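
A k-nearest-neighbor classifier evaluated at several values of k, since the table notes that performance depends on that choice; the k values shown are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Small k -> low bias, high variance; large k -> the reverse.
for k in (1, 5, 15):
    knn = KNeighborsClassifier(n_neighbors=k)
    print(f"k={k}: {cross_val_score(knn, X, y, cv=5).mean():.3f}")
```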
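
A multi-layer perceptron using scikit-learn's MLPClassifier; the two hidden layers and their sizes are arbitrary, and inputs are standardized because MLP training is sensitive to feature scale.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Two fully connected hidden layers (32 and 16 units), trained by
# backpropagation under the hood.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0),
)
mlp.fit(X, y)
print("train accuracy:", mlp.score(X, y))
```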
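
Finally, a Gaussian naive Bayes classifier; GaussianNB models each feature as an independent per-class Gaussian, one concrete way to realize the independence assumption described in the table.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Per-class Gaussian likelihoods per feature; prediction picks the class
# with the highest posterior probability.
nb = GaussianNB().fit(X_tr, y_tr)
print("test accuracy:", nb.score(X_te, y_te))
```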