From: Accommodating heterogeneous missing data patterns for prostate cancer risk prediction
Method | Definition |
---|---|
Available cases | Pool individual-level data that have \({\mathrm{X}}^{*}\) measured across all cohorts and fit a model including \({\mathrm{X}}^{*}\) as main effects |
Iterative BIC selection | Same as available cases, but with an iterative stepwise BIC-based model selection to determine the optimal subset of \({\mathrm{X}}^{*}\) and interactions |
Cohort ensemble | A separate model is fit to each cohort, using the variables available in both that cohort and the patient |
Categorization | All individuals in all cohorts are used. Predictors are categorized with missing as one of the categories so that the complete list of predictors \(\mathrm{X}\) is used |
Missing indicator | Include an indicator for a missing continuous predictor value, together with its interaction with the predictor, as additional variables in the analysis. Largely similar to Categorization |
Imputation | Impute missing covariates in the training set using the MICE (multiple imputation by chained equations) method. Use mean imputation for missing values at prediction time |
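
To make the Categorization and Missing indicator rows concrete, the following is a minimal Python sketch of how a single continuous predictor might be encoded under each approach. It is an illustration under stated assumptions, not the authors' implementation: the function names `categorize` and `missing_indicator`, the cutpoints, and the choice to set the predictor term to zero when the value is missing are all hypothetical.

```python
from bisect import bisect_right
from typing import Optional, Sequence, Tuple


def categorize(value: Optional[float], cutpoints: Sequence[float]) -> str:
    """Categorization: bin a continuous predictor, with 'missing' as its own category.

    `cutpoints` must be sorted in ascending order (hypothetical thresholds).
    """
    if value is None:
        return "missing"
    # bisect_right counts how many cutpoints are <= value, which is the bin index.
    return f"bin{bisect_right(cutpoints, value)}"


def missing_indicator(value: Optional[float]) -> Tuple[int, float]:
    """Missing indicator: return (indicator, predictor term) for one predictor.

    Assumed encoding: the indicator is 1 when the value is missing, and the
    predictor term is (1 - indicator) * value, so it contributes nothing to
    the linear predictor when the value is missing.
    """
    if value is None:
        return 1, 0.0
    return 0, float(value)


if __name__ == "__main__":
    cutpoints = [4.0, 10.0]  # hypothetical cutpoints for illustration only
    for v in (2.5, 12.0, None):
        print(v, categorize(v, cutpoints), missing_indicator(v))
```

Under both encodings every individual contributes a row to the design matrix regardless of which predictors were measured, which is why the table describes the two methods as similar.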