Methodological conduct of prognostic prediction models developed using machine learning in oncology: a systematic review

Table 2 Methods for predictor selection before and after modelling and hyperparameter tuning for 152 developed clinical prediction models, by modelling type

	All (n = 152)	Regression-based models (n = 42)	Non-regression-based models (n = 71)	Ensemble models (n = 39)
	n (%)	n (%)	n (%)	n (%)
Predictor selection (before modelling) reported	52 (34)	20 (48)	23 (32)	9 (23)
A priori	5	3	1	1
No selection before modelling	3	1	2	–
Univariable	24	12	8	4
Clinically relevant and available data	1	–	1	–
Dropout technique at input layer	1	–	1	–
Random forest with RPA	9	1	6	2
Other modelling approach^a	9	3	4	2
Predictor selection (during modelling) reported	63 (41)	25 (59)	27 (38)	11 (28)
Stepwise	6	4	2	–
Forward selection	6	5	–	1
Backward elimination	5	3	2	–
Full model approach (no selection)	11	4	5	2
Feed-forward/backpropagation	5	–	5	–
Recursive partitioning analysis	7	–	7	–
LASSO	5	5	–	–
Gini index (minimised)	7	1	4	2
Cross-validation	4	2	–	2
Other^b	7	1	2	4
Hyperparameter tuning methods reported	31 (21)	4 (10)	15 (23)	12 (31)
Cross-validation	19	4	7	8
Grid search (no further details provided)	6	–	4	2
Max tree depth	2	–	1	1
Adadelta method	2	–	2	–
Default software values	2	–	1	1

RPA: recursive partitioning analysis; LASSO: least absolute shrinkage and selection operator
^aModelling approaches include support vector machine, logistic regression, Cox regression, best subset linear regression, decision tree, and meta-transformer (base algorithm of extra trees)
^bOther includes change in unspecified performance measure, stochastic gradient descent, function, aggregation of bootstrapped decision trees and Waikato Environment for Knowledge Analysis for development-only studies, and hyperbolic tangent function, greedy algorithm for all models and using final chosen predictors from comparator model
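
As a reading aid, the percentages in the "reported" rows appear to be each count divided by the number of models in its column. The minimal sketch below assumes that denominator and rounding to the nearest whole percent, and reproduces the row "Predictor selection (before modelling) reported". The names `totals`, `reported` and `percent` are illustrative only and do not come from the review.

```python
# Column totals taken from the table header (number of models per modelling type).
totals = {
    "All": 152,
    "Regression-based": 42,
    "Non-regression-based": 71,
    "Ensemble": 39,
}

# Counts from the row "Predictor selection (before modelling) reported".
reported = {
    "All": 52,
    "Regression-based": 20,
    "Non-regression-based": 23,
    "Ensemble": 9,
}


def percent(count, total):
    """Return count as a whole-number percentage of total (assumed rounding rule)."""
    return round(100 * count / total)


# Format each cell in the table's "n (%)" style.
row = {
    name: f"{reported[name]} ({percent(reported[name], totals[name])})"
    for name in totals
}
print(row)
```

The counts in the indented method rows carry no percentages; within each of the three blocks, the "All" column counts sum to that block's reported total (52, 63 and 31).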

ISSN: 1471-2288