Table 1 Simulation Results: Prediction Tools’ Performance Metrics

From: Directed acyclic graphs and causal thinking in clinical risk prediction modeling

| Metric | Logistic, Markov Blanket set (Nsim=100,000) | Logistic, all 24 variables (Nsim=100,000) | Logistic, any variables with a path to the outcome (Nsim=100,000) | Logistic, node’s parent variables (Nsim=100,000) | Lasso, all 24 variables (Nsim=100,000) | Ridge, all 24 variables (Nsim=100,000) | Elastic net, all 24 variables (Nsim=100,000) | Random forest, all 24 variables (Nsim=100,000) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FULL RESULTS: Including all simulated datasets | | | | | | | | |
| ICI | | | | | | | | |
| N Missing | 8032 | 0 | 8032 | 37,272 | 8597 | 0 | 8612 | 1 |
| Mean (SD) | 0.01882 (0.00445) | 0.01964 (0.00495) | 0.01900 (0.00461) | 0.02215 (0.00421) | 0.01912 (0.00451) | 0.03807 (0.02058) | 0.01907 (0.00456) | 0.04133 (0.01779) |
| Median | 0.01857 | 0.01925 | 0.01867 | 0.02242 | 0.01888 | 0.02895 | 0.01881 | 0.03636 |
| Range | 0.00290–0.03834 | 0.00289–0.04330 | 0.00287–0.04330 | 0.00290–0.03826 | 0.00287–0.03919 | 0.00710–0.18537 | 0.00340–0.04283 | 0.00704–0.16493 |
| Number of input variables | | | | | | | | |
| N Missing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mean (SD) | 4.0 (2.8) | 24.0 (0.0) | 18.9 (7.0) | 1.2 (1.3) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) |
| Median | 3.0 | 24.0 | 22.0 | 1.0 | 24.0 | 24.0 | 24.0 | 24.0 |
| Range | 0.0–19.0 | 24.0–24.0 | 0.0–24.0 | 0.0–9.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 |
| Direct comparison: ICI of various methods compared to Markov Blanket-based logistic tool | | | | | | | | |
| N Missing | 8032 | 8032 | 8032 | 37,272 | 9140 | 8032 | 9147 | 8033 |
| < ICI logistic MB, N (%) | | 39,354 (42.79%) | 39,540 (42.99%) | 4864 (7.75%) | 26,514 (29.18%) | 8871 (9.65%) | 31,089 (34.22%) | 1650 (1.79%) |
| ≥ ICI logistic MB, N (%) | | 52,614 (57.21%) | 52,428 (57.01%) | 57,864 (92.25%) | 64,346 (70.82%) | 83,097 (90.35%) | 59,764 (65.78%) | 90,317 (98.21%) |
| COMPLETE CASE RESULTS: only including datasets for which ICI could be estimated for all tools | | | | | | | | |
| ICI | | | | | | | | |
| N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
| Mean (SD) | 0.01956 (0.00463) | 0.01975 (0.00477) | 0.01970 (0.00473) | 0.02211 (0.00421) | 0.01995 (0.00471) | 0.03886 (0.02177) | 0.01990 (0.00476) | 0.04049 (0.02011) |
| Median | 0.01953 | 0.01962 | 0.01960 | 0.02238 | 0.01993 | 0.02883 | 0.01987 | 0.03283 |
| Range | 0.00290–0.03834 | 0.00289–0.04330 | 0.00287–0.04330 | 0.00290–0.03826 | 0.00287–0.03919 | 0.00710–0.18537 | 0.00340–0.04283 | 0.00704–0.16493 |
| Number of input variables | | | | | | | | |
| N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
| Mean (SD) | 4.1 (2.7) | 24.0 (0.0) | 20.8 (3.9) | 1.9 (1.1) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) |
| Median | 4.0 | 24.0 | 22.0 | 2.0 | 24.0 | 24.0 | 24.0 | 24.0 |
| Range | 1.0–19.0 | 24.0–24.0 | 1.0–24.0 | 1.0–9.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 |
| Direct comparison: ICI of various methods compared to Markov Blanket-based logistic tool | | | | | | | | |
| N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
| < ICI logistic MB, N (%) | | 26,872 (43.23%) | 27,124 (43.64%) | 4850 (7.80%) | 16,887 (27.17%) | 6508 (10.47%) | 19,959 (32.11%) | 1636 (2.63%) |
| ≥ ICI logistic MB, N (%) | | 35,287 (56.77%) | 35,035 (56.36%) | 57,309 (92.20%) | 45,272 (72.83%) | 55,651 (89.53%) | 42,200 (67.89%) | 60,523 (97.37%) |
  1. In a series of 100,000 simulated datasets, we obtained these results for ICI and number of input variables for the eight investigated prediction tools. Both full results and complete case results are presented; the complete case results include only datasets for which ICI could be estimated for all tools
  2. Abbreviations: ICI, integrated calibration index; MB, Markov Blanket; Nsim, number of simulations; SD, standard deviation
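
The table does not state how the percentages in the direct-comparison rows were computed. They appear to use the number of datasets with a non-missing comparison (Nsim minus N Missing) as the denominator, which is an assumption inferred from the reported counts. The sketch below checks that reading against one column (Logistic, all 24 variables, full results); the function name `comparison_percentages` is hypothetical and used only for illustration.

```python
def comparison_percentages(n_sim, n_missing, n_lower):
    """Return (n_compared, n_not_lower, pct_lower, pct_not_lower).

    Assumption: the percentage denominator is the number of datasets
    for which the comparison could be made, i.e. n_sim - n_missing.
    """
    n_compared = n_sim - n_missing          # datasets with a usable comparison
    n_not_lower = n_compared - n_lower      # datasets with ICI >= ICI of logistic MB
    pct_lower = round(100 * n_lower / n_compared, 2)
    pct_not_lower = round(100 * n_not_lower / n_compared, 2)
    return n_compared, n_not_lower, pct_lower, pct_not_lower


# Logistic, all 24 variables (full results): Nsim = 100,000, N Missing = 8032,
# and 39,354 datasets with ICI below that of the Markov Blanket-based tool.
result = comparison_percentages(100_000, 8032, 39_354)
print(result)
```

Under this assumption the function reproduces the reported 52,614 datasets and the 42.79% and 57.21% shares for that column. The same denominator logic gives 100,000 − 37,841 = 62,159 compared datasets in the complete case results.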