Table 1 Simulation Results: Prediction Tools’ Performance Metrics

From: Directed acyclic graphs and causal thinking in clinical risk prediction modeling

|  | Logistic, Markov Blanket set (Nsim=100,000) | Logistic, all 24 variables (Nsim=100,000) | Logistic, any variables with a path to the outcome (Nsim=100,000) | Logistic, node’s parent variables (Nsim=100,000) | Lasso, all 24 variables (Nsim=100,000) | Ridge, all 24 variables (Nsim=100,000) | Elastic net, all 24 variables (Nsim=100,000) | Random forest, all 24 variables (Nsim=100,000) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FULL RESULTS: Including all simulated datasets |  |  |  |  |  |  |  |  |
| ICI |  |  |  |  |  |  |  |  |
| N Missing | 8032 | 0 | 8032 | 37,272 | 8597 | 0 | 8612 | 1 |
| Mean (SD) | 0.01882 (0.00445) | 0.01964 (0.00495) | 0.01900 (0.00461) | 0.02215 (0.00421) | 0.01912 (0.00451) | 0.03807 (0.02058) | 0.01907 (0.00456) | 0.04133 (0.01779) |
| Median | 0.01857 | 0.01925 | 0.01867 | 0.02242 | 0.01888 | 0.02895 | 0.01881 | 0.03636 |
| Range | 0.00290–0.03834 | 0.00289–0.04330 | 0.00287–0.04330 | 0.00290–0.03826 | 0.00287–0.03919 | 0.00710–0.18537 | 0.00340–0.04283 | 0.00704–0.16493 |
| Number of input variables |  |  |  |  |  |  |  |  |
| N Missing | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Mean (SD) | 4.0 (2.8) | 24.0 (0.0) | 18.9 (7.0) | 1.2 (1.3) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) |
| Median | 3.0 | 24.0 | 22.0 | 1.0 | 24.0 | 24.0 | 24.0 | 24.0 |
| Range | 0.0–19.0 | 24.0–24.0 | 0.0–24.0 | 0.0–9.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 |
| Direct comparison: ICI of various methods compared to Markov Blanket-based logistic tool |  |  |  |  |  |  |  |  |
| N Missing | 8032 | 8032 | 8032 | 37,272 | 9140 | 8032 | 9147 | 8033 |
| < ICI logistic MB, N (%) |  | 39,354 (42.79%) | 39,540 (42.99%) | 4864 (7.75%) | 26,514 (29.18%) | 8871 (9.65%) | 31,089 (34.22%) | 1650 (1.79%) |
| ≥ ICI logistic MB, N (%) |  | 52,614 (57.21%) | 52,428 (57.01%) | 57,864 (92.25%) | 64,346 (70.82%) | 83,097 (90.35%) | 59,764 (65.78%) | 90,317 (98.21%) |
| COMPLETE CASE RESULTS: only including datasets for which ICI could be estimated for all tools |  |  |  |  |  |  |  |  |
| ICI |  |  |  |  |  |  |  |  |
| N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
| Mean (SD) | 0.01956 (0.00463) | 0.01975 (0.00477) | 0.01970 (0.00473) | 0.02211 (0.00421) | 0.01995 (0.00471) | 0.03886 (0.02177) | 0.01990 (0.00476) | 0.04049 (0.02011) |
| Median | 0.01953 | 0.01962 | 0.01960 | 0.02238 | 0.01993 | 0.02883 | 0.01987 | 0.03283 |
| Range | 0.00290–0.03834 | 0.00289–0.04330 | 0.00287–0.04330 | 0.00290–0.03826 | 0.00287–0.03919 | 0.00710–0.18537 | 0.00340–0.04283 | 0.00704–0.16493 |
| Number of input variables |  |  |  |  |  |  |  |  |
| N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
| Mean (SD) | 4.1 (2.7) | 24.0 (0.0) | 20.8 (3.9) | 1.9 (1.1) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) | 24.0 (0.0) |
| Median | 4.0 | 24.0 | 22.0 | 2.0 | 24.0 | 24.0 | 24.0 | 24.0 |
| Range | 1.0–19.0 | 24.0–24.0 | 1.0–24.0 | 1.0–9.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 | 24.0–24.0 |
| Direct comparison: ICI of various methods compared to Markov Blanket-based logistic tool |  |  |  |  |  |  |  |  |
| N Missing | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 | 37,841 |
| < ICI logistic MB, N (%) |  | 26,872 (43.23%) | 27,124 (43.64%) | 4850 (7.80%) | 16,887 (27.17%) | 6508 (10.47%) | 19,959 (32.11%) | 1636 (2.63%) |
| ≥ ICI logistic MB, N (%) |  | 35,287 (56.77%) | 35,035 (56.36%) | 57,309 (92.20%) | 45,272 (72.83%) | 55,651 (89.53%) | 42,200 (67.89%) | 60,523 (97.37%) |

  1. Results for ICI and number of input variables for the eight investigated prediction tools, obtained in a series of 100,000 simulated datasets. Both full results and complete case results (the latter including only datasets for which ICI could be estimated for all tools) are presented.
  2. Abbreviations: ICI, integrated calibration index; MB, Markov Blanket; Nsim, number of simulations; SD, standard deviation
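
As a reading aid, the percentages in the direct comparison rows appear to use the number of datasets with a non-missing comparison (Nsim minus N Missing) as the denominator, rather than all 100,000 simulations. This is an inference from the reported counts, not something the table states. The short Python sketch below reproduces three of the full-results percentages under that assumption; the function name `share_below_mb` and the dictionary layout are hypothetical and chosen only for illustration.

```python
# Assumption (inferred from the table, not stated by the authors):
# percentage = count / (Nsim - N Missing), rounded to two decimals.

N_SIM = 100_000  # number of simulated datasets reported in the table


def share_below_mb(n_below: int, n_missing: int, n_sim: int = N_SIM) -> float:
    """Percentage of comparable datasets in which a tool's ICI was lower
    than that of the Markov Blanket-based logistic tool."""
    n_comparable = n_sim - n_missing  # datasets where both ICIs were estimable
    return round(100 * n_below / n_comparable, 2)


# (count with ICI < ICI logistic MB, N Missing) taken from the FULL RESULTS rows
full_results = {
    "Logistic, all 24 variables": (39_354, 8_032),
    "Logistic, node's parent variables": (4_864, 37_272),
    "Random forest, all 24 variables": (1_650, 8_033),
}

for tool, (n_below, n_missing) in full_results.items():
    print(f"{tool}: {share_below_mb(n_below, n_missing)}%")
```

Under this reading, a tool with many missing comparisons (such as the parent-variables logistic tool) is summarized over a noticeably smaller set of datasets than the others, which is worth keeping in mind when comparing percentages across columns in the full results.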