Table 3 Sample size and number of candidate predictors informing analyses for 152 developed models, by modelling type

From: Methodological conduct of prognostic prediction models developed using machine learning in oncology: a systematic review

| | Regression-based models (n = 42): reported, n (%) | Regression-based models (n = 42): median [IQR], range | Non-regression-based models (n = 71): reported, n (%) | Non-regression-based models (n = 71): median [IQR], range | Ensemble models (n = 39): reported, n (%) | Ensemble models (n = 39): median [IQR], range |
|---|---|---|---|---|---|---|
| Total sample size | | | | | | |
| Model development | 42 (100) | 561 [203 to 2822], 20 to 582,398 | 70 (99) | 447 [156 to 11,901], 20 to 582,398 | 39 (100) | 768 [203 to 1599], 20 to 582,398 |
| Internal validation^a | 20 (48) | 122 [82 to 228], 47 to 291,200 | 35 (49) | 145 [90 to 492], 47 to 291,200 | 24 (62) | 162 [97 to 1510], 67 to 291,200 |
| External validation | 12 (29) | 511 [67 to 2300], 11 to 836,659 | 14 (20) | 793 [59 to 1675], 11 to 836,659 | 11 (28) | 313 [229 to 836,659], 11 to 836,659 |
| Number of events | | | | | | |
| Model development | 20 (48) | 236 [34 to 1326], 7 to 35,019 | 37 (52) | 62 [22 to 1075], 7 to 45,797 | 10 (26) | 37 [22 to 241], 8 to 35,019 |
| Internal validation^a | 2 (5) | 41 [21 to 61], 21 to 61 | 3 (4) | 61 [21 to 62], 21 to 62 | 1 (3) | 61 |
| External validation | 8 (19) | 81 [18 to 327], 7 to 513 | 11 (15) | 19 [7 to 513], 7 to 1323 | 5 (13) | 81 [81 to 81], 7 to 513 |
| No. candidate predictors | 38 (90) | 21 [15 to 34], 6 to 33,788 | 64 (90) | 16 [12 to 25], 5 to 33,788 | 36 (92) | 25 [14 to 37], 4 to 33,788 |
| Events per predictor^b | 20 (48) | 8.0 [7.1 to 23.5], 0.2 to 5836.5 | 35 (49) | 3.4 [1.1 to 19.1], 0.2 to 5836.5 | 10 (26) | 1.7 [1.1 to 6.0], 0.7 to 5836.5 |
^a Combines all internal validation methods, e.g., split sample, cross-validation, bootstrapping
^b Events per predictor for model development
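
For readers unfamiliar with the last row, events per predictor is conventionally the number of outcome events in the development data divided by the number of candidate predictors. The sketch below assumes the summary was obtained by computing this ratio for each model and then taking the median, which would explain why the tabulated median is not simply the median number of events divided by the median number of predictors. The function name and the three example models are hypothetical and are not taken from the review.

```python
from statistics import median


def events_per_predictor(n_events: int, n_candidate_predictors: int) -> float:
    """Return events per predictor for one model (hypothetical helper)."""
    if n_candidate_predictors <= 0:
        raise ValueError("number of candidate predictors must be positive")
    return n_events / n_candidate_predictors


# Illustrative (events, candidate predictors) pairs; NOT data from the review.
models = [(40, 20), (120, 10), (300, 15)]

# Compute the ratio per model first, then summarise across models.
epp_values = [events_per_predictor(e, p) for e, p in models]
median_epp = median(epp_values)

# Dividing the two medians gives a different (and, under the assumption
# above, incorrect) summary of the same models.
ratio_of_medians = median(e for e, _ in models) / median(p for _, p in models)

print(epp_values, median_epp, ratio_of_medians)
```

With these illustrative numbers the per-model median and the ratio of medians differ, so the two summaries should not be used interchangeably.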