Table 1 Candidate methods

From: Variable selection in social-environmental data: sparse regression and tree ensemble machine learning approaches

| Abbreviation | Description | Selection rule | R packages |
| --- | --- | --- | --- |
| UNIV-BFN | Univariable models with Bonferroni-adjusted p-values | P < 5 × 10⁻⁵ | base R [23] |
| LASSO-MIN | Lasso with λ chosen at the minimum cross-validated prediction error | β ≠ 0 | glmnet [24] |
| LASSO-1SE | Lasso with λ chosen at 1 SE above the minimum error | β ≠ 0 | glmnet |
| ELNET-MIN | Elastic net, grid search for α (0.05–0.95 by 0.05), λ at the minimum error | β ≠ 0 | glmnet |
| ELNET-1SE | Elastic net, grid search for α (0.05–0.95 by 0.05), λ at 1 SE above the minimum | β ≠ 0 | glmnet |
| HCLST-CORR-SGL | Hierarchical clustering, groups of variables with correlation > 0.8, sparse group lasso | β ≠ 0 | SGL [25] |
| HCLST-BOOT-SGL | Hierarchical clustering, groups from bootstrap, sparse group lasso | β ≠ 0 | SGL, pvclust [16] |
| RF | Random Forests algorithm with bootstrap-based confidence intervals for the variable importance scores | 99.995% CI > 0 | randomForestSRC [26] |
| BAGGING | Similar to Random Forests, but with all variables considered candidates for splitting at each node | 99.995% CI > 0 | randomForestSRC |
| BART-LOCAL | Bayesian Additive Regression Trees, local criterion for the inclusion proportion (IP) | IP > 0.95 quantile of local distribution | bartMachine [27] |
| BART-GLOBALSE | Bayesian Additive Regression Trees, global SE criterion for the IP | IP > threshold from local distribution with global multiplier | bartMachine |
| BART-GLOBALMAX | Bayesian Additive Regression Trees, global max criterion for the IP | IP > 0.95 quantile of global max distribution | bartMachine |
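
The selection rules in Table 1 can be illustrated with short R sketches. For UNIV-BFN, a minimal sketch on simulated data (the sample size, variable names, and 0.05 family-wise level are illustrative assumptions; the table's P < 5 × 10⁻⁵ equals 0.05/1000, i.e. a Bonferroni correction for roughly 1,000 candidate predictors):

```r
## Minimal UNIV-BFN sketch on simulated toy data (sizes are illustrative).
set.seed(1)
n <- 200; p <- 50
X <- data.frame(matrix(rnorm(n * p), n, p))   # columns X1..X50
y <- 2 * X$X1 - 1.5 * X$X2 + rnorm(n)

## One univariable model per predictor; keep the slope's p-value.
pvals <- sapply(X, function(x) summary(lm(y ~ x))$coefficients[2, 4])

## Bonferroni: family-wise 0.05 divided by the number of tests
## (with 1,000 candidates this gives the table's 5e-05 threshold).
selected <- names(pvals)[pvals < 0.05 / length(pvals)]
selected
```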
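
LASSO-MIN and LASSO-1SE differ only in which λ is read off the cross-validation curve; glmnet exposes both choices directly as `lambda.min` and `lambda.1se`. A sketch on the same kind of illustrative simulation:

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 50
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- 2 * X[, 1] - 1.5 * X[, 2] + rnorm(n)

cvfit <- cv.glmnet(X, y, alpha = 1)           # alpha = 1: lasso penalty

## Selection rule: beta != 0 at the chosen lambda (intercept excluded).
keep <- function(fit, s) {
  b <- as.matrix(coef(fit, s = s))
  setdiff(rownames(b)[b != 0], "(Intercept)")
}
keep(cvfit, "lambda.min")   # LASSO-MIN
keep(cvfit, "lambda.1se")   # LASSO-1SE
```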
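
ELNET-MIN and ELNET-1SE add an outer grid search over the mixing parameter α. glmnet does not tune α itself, so one common approach, assumed here, is a loop over `cv.glmnet` fits with shared folds so the α values are comparable:

```r
library(glmnet)

set.seed(1)
n <- 200; p <- 50
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- 2 * X[, 1] - 1.5 * X[, 2] + rnorm(n)

foldid <- sample(rep(1:10, length.out = n))   # same folds for every alpha
alphas <- seq(0.05, 0.95, by = 0.05)          # grid from the table

fits <- lapply(alphas, function(a) cv.glmnet(X, y, alpha = a, foldid = foldid))
best <- fits[[which.min(sapply(fits, function(f) min(f$cvm)))]]

## As for the lasso: nonzero coefficients at lambda.min (ELNET-MIN)
## or lambda.1se (ELNET-1SE).
b <- as.matrix(coef(best, s = "lambda.1se"))
setdiff(rownames(b)[b != 0], "(Intercept)")
```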
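
For HCLST-CORR-SGL, a sketch of the two stages: hierarchical clustering on a correlation distance (cutting the tree at height 0.2 so variables correlated above 0.8 share a group is an assumption consistent with the table), then a cross-validated sparse group lasso via the SGL package. HCLST-BOOT-SGL would instead derive the groups from pvclust's bootstrapped clustering, as noted in the final comment:

```r
library(SGL)

set.seed(1)
n <- 200; p <- 30
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
y <- 2 * X[, 1] - 1.5 * X[, 2] + rnorm(n)

## Stage 1: group variables so that |corr| > 0.8 implies a shared group.
groups <- cutree(hclust(as.dist(1 - abs(cor(X)))), h = 0.2)

## Stage 2: cross-validated sparse group lasso; select beta != 0 at the
## lambda with the smallest cross-validated error.
cvfit <- cvSGL(list(x = X, y = y), index = groups, type = "linear")
b <- cvfit$fit$beta[, which.min(cvfit$lldiff)]
colnames(X)[b != 0]

## HCLST-BOOT-SGL: groups from bootstrapped clustering instead, e.g.
## pvclust::pvclust(X, method.dist = "correlation") followed by pvpick().
```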
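
RF and BAGGING differ only in `mtry`: bagging makes every variable a split candidate at every node (`mtry = p`). The bootstrap confidence intervals for permutation importance are sketched below with a plain manual percentile bootstrap; the replicate count and refit-per-resample scheme are assumptions, not necessarily the paper's exact procedure (randomForestSRC also ships a built-in `subsample()` facility for importance intervals):

```r
library(randomForestSRC)

set.seed(1)
n <- 200; p <- 20
dat <- data.frame(matrix(rnorm(n * p), n, p))  # columns X1..X20
dat$y <- 2 * dat$X1 - 1.5 * dat$X2 + rnorm(n)

## RF uses the default mtry (~ p/3 for regression); BAGGING sets mtry = p.
B <- 200                                       # illustrative; a 99.995% CI
vimp <- replicate(B, {                         # needs far more replicates
  idx <- sample(n, replace = TRUE)
  rfsrc(y ~ ., data = dat[idx, ], mtry = p,    # mtry = p: bagging
        importance = "permute")$importance
})

## Selection rule: lower bound of the 99.995% percentile CI above zero.
lower <- apply(vimp, 1, quantile, probs = (1 - 0.99995) / 2)
names(lower)[lower > 0]
```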
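
The three BART rows map onto a single bartMachine call: `var_selection_by_permute()` builds a permutation null distribution for the inclusion proportions and reports the variables passing the local, global SE, and global max criteria at once. The data and the number of permutations here are illustrative; the default `alpha = 0.05` corresponds to the table's 0.95-quantile cutoffs:

```r
library(bartMachine)

set.seed(1)
n <- 200; p <- 20
X <- data.frame(matrix(rnorm(n * p), n, p))    # columns X1..X20
y <- 2 * X$X1 - 1.5 * X$X2 + rnorm(n)

bm <- bartMachine(X, y)

## Permutation null distribution for the inclusion proportions.
vs <- var_selection_by_permute(bm, num_permute_samples = 100, plot = FALSE)
vs$important_vars_local_names       # BART-LOCAL
vs$important_vars_global_se_names   # BART-GLOBALSE
vs$important_vars_global_max_names  # BART-GLOBALMAX
```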