
Table 1 Simulation results for each variable selection approach applied to the fully observed data and to the incomplete data. For the bootstrap imputation-based methods on incomplete data, results are shown for different threshold values of π. The value of π giving the highest F1 score is π=0.1 for BI-BART and π=0.3 for BI-XGB. The sample size is n=1000. There are 10 useful predictors and 40 noise predictors. Two missingness scenarios were considered: 40% missingness in Y and 60% overall missingness; 20% missingness in Y and 40% overall missingness. The performance measures were computed across 250 data replications.

From: A flexible approach for variable selection in large-scale healthcare database studies with missing covariate and outcome data

Method AUC Precision Recall F1 Type I error
Fully observed data
BART 0.92 (0.88, 0.96) 1.00 0.87 0.93 0.00
XGBoost 0.88 (0.84, 0.92) 0.93 0.81 0.86 0.02
Incomplete data: 40% missingness in Y and 60% overall missingness
RR-BART 0.82 (0.78, 0.86) 0.87 0.80 0.83 0.01
BI-BART π=0.1 0.83 (0.78, 0.88) 0.87 0.82 0.85 0.03
BI-BART π=0.2 0.81 (0.76, 0.86) 0.97 0.71 0.82 0.01
BI-BART π=0.3 0.77 (0.72, 0.82) 0.99 0.63 0.77 0.00
BI-BART π=0.4 0.73 (0.68, 0.78) 1.00 0.55 0.71 0.00
BI-BART π=0.5 0.67 (0.62, 0.72) 1.00 0.48 0.65 0.00
BI-BART π=0.6 0.59 (0.54, 0.64) 1.00 0.40 0.57 0.00
BI-BART π=0.7 0.49 (0.44, 0.54) 1.00 0.31 0.47 0.00
BI-BART π=0.8 0.41 (0.36, 0.46) 1.00 0.23 0.38 0.00
BI-BART π=0.9 0.33 (0.28, 0.38) 1.00 0.14 0.25 0.00
BI-BART π=1.0 0.25 (0.20, 0.30) 1.00 0.07 0.13 0.00
BI-XGB π=0.1 0.58 (0.53, 0.63) 0.40 0.88 0.55 0.36
BI-XGB π=0.2 0.74 (0.69, 0.79) 0.66 0.85 0.75 0.15
BI-XGB π=0.3 0.82 (0.77, 0.87) 0.83 0.83 0.83 0.03
BI-XGB π=0.4 0.76 (0.71, 0.81) 0.96 0.72 0.82 0.01
BI-XGB π=0.5 0.70 (0.65, 0.75) 0.99 0.63 0.77 0.00
BI-XGB π=0.6 0.64 (0.59, 0.69) 1.00 0.54 0.70 0.00
BI-XGB π=0.7 0.59 (0.54, 0.64) 1.00 0.41 0.58 0.00
BI-XGB π=0.8 0.47 (0.42, 0.52) 1.00 0.29 0.44 0.00
BI-XGB π=0.9 0.35 (0.30, 0.40) 1.00 0.17 0.29 0.00
BI-XGB π=1.0 0.28 (0.23, 0.33) 1.00 0.04 0.09 0.00
MIA-BART (Impute missing Y) 0.75 (0.71, 0.79) 0.80 0.75 0.77 0.04
MIA-BART (Exclude missing Y) 0.72 (0.66, 0.78) 0.78 0.70 0.74 0.05
MIA-XGB (Impute missing Y) 0.74 (0.70, 0.78) 0.81 0.73 0.77 0.04
MIA-XGB (Exclude missing Y) 0.71 (0.65, 0.77) 0.75 0.71 0.73 0.08
BART Complete cases 0.70 (0.63, 0.77) 0.90 0.60 0.72 0.03
XGBoost Complete cases 0.73 (0.66, 0.80) 0.90 0.68 0.77 0.04
Incomplete data: 20% missingness in Y and 30% overall missingness
RR-BART 0.86 (0.82, 0.90) 0.91 0.84 0.87 0.02
BI-BART π=0.1 0.87 (0.83, 0.91) 0.91 0.87 0.89 0.01
BI-BART π=0.2 0.85 (0.80, 0.90) 0.99 0.76 0.85 0.02
BI-BART π=0.3 0.82 (0.77, 0.87) 1.00 0.68 0.82 0.01
BI-BART π=0.4 0.77 (0.72, 0.82) 1.00 0.59 0.76 0.01
BI-BART π=0.5 0.72 (0.67, 0.77) 1.00 0.54 0.70 0.00
BI-BART π=0.6 0.63 (0.58, 0.68) 1.00 0.46 0.61 0.01
BI-BART π=0.7 0.53 (0.48, 0.58) 1.00 0.36 0.51 0.00
BI-BART π=0.8 0.44 (0.39, 0.49) 1.00 0.28 0.42 0.00
BI-BART π=0.9 0.38 (0.33, 0.43) 1.00 0.18 0.30 0.00
BI-BART π=1.0 0.29 (0.24, 0.34) 1.00 0.12 0.18 0.00
BI-XGB π=0.1 0.60 (0.55, 0.65) 0.44 0.90 0.58 0.30
BI-XGB π=0.2 0.76 (0.71, 0.81) 0.69 0.87 0.77 0.11
BI-XGB π=0.3 0.84 (0.79, 0.89) 0.86 0.85 0.85 0.02
BI-XGB π=0.4 0.78 (0.73, 0.83) 0.99 0.75 0.84 0.01
BI-XGB π=0.5 0.73 (0.68, 0.78) 1.00 0.65 0.79 0.00
BI-XGB π=0.6 0.67 (0.62, 0.72) 1.00 0.57 0.73 0.00
BI-XGB π=0.7 0.62 (0.57, 0.67) 1.00 0.44 0.61 0.00
BI-XGB π=0.8 0.49 (0.44, 0.54) 1.00 0.32 0.47 0.00
BI-XGB π=0.9 0.38 (0.32, 0.42) 1.00 0.20 0.33 0.00
BI-XGB π=1.0 0.31 (0.25, 0.35) 1.00 0.08 0.14 0.00
MIA-BART (Impute missing Y) 0.78 (0.74, 0.82) 0.83 0.77 0.79 0.03
MIA-BART (Exclude missing Y) 0.76 (0.70, 0.82) 0.81 0.74 0.78 0.04
MIA-XGB (Impute missing Y) 0.77 (0.73, 0.82) 0.83 0.76 0.79 0.03
MIA-XGB (Exclude missing Y) 0.73 (0.67, 0.79) 0.78 0.73 0.75 0.06
BART Complete cases 0.73 (0.67, 0.79) 0.92 0.64 0.75 0.03
XGBoost Complete cases 0.76 (0.70, 0.82) 0.93 0.71 0.80 0.03
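
The precision, recall, F1 and type I error columns above summarize how well each method separates the 10 useful predictors from the 40 noise predictors. The sketch below shows one way such measures can be computed for a single replication. It is an illustration under stated assumptions, not the authors' code: the function names, the variable names (x1..x10, z1..z40), the convention of returning 0 when nothing is selected, and the reading of π as the minimum proportion of bootstrap imputed datasets in which a predictor must be selected are all assumptions introduced here.

```python
from typing import Dict, Iterable, Mapping, Set


def select_by_threshold(counts: Mapping[str, int], n_datasets: int, pi: float) -> Set[str]:
    """Keep predictors selected in at least a proportion pi of the datasets.

    Assumed reading of the threshold pi; the table itself only reports
    results by pi and does not define the rule.
    """
    return {name for name, c in counts.items() if c / n_datasets >= pi}


def selection_metrics(selected: Iterable[str], useful: Set[str], noise: Set[str]) -> Dict[str, float]:
    """Standard selection metrics for one replication (assumed definitions)."""
    chosen = set(selected)
    tp = len(chosen & useful)   # useful predictors correctly selected
    fp = len(chosen & noise)    # noise predictors wrongly selected
    # Convention assumed here: precision is 0 when nothing is selected.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / len(useful)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    type_i_error = fp / len(noise)  # share of noise predictors selected
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "type_i_error": type_i_error,
    }


# Hypothetical setup mirroring the table: 10 useful and 40 noise predictors.
useful = {f"x{i}" for i in range(1, 11)}
noise = {f"z{i}" for i in range(1, 41)}

# Hypothetical selection: 8 useful predictors and 2 noise predictors.
selected = {f"x{i}" for i in range(1, 9)} | {"z1", "z2"}
metrics = selection_metrics(selected, useful, noise)
print(metrics)
```

In a simulation study like the one summarized here, these per-replication values would then be averaged over the replications. An average F1 score therefore need not equal the harmonic mean of the average precision and average recall in the same row.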