
Table 2 Method issues in the application and strategies for resolving or mitigating them

From: Estimation of place-based vulnerability scores for HIV viral non-suppression: an application leveraging data from a cohort of people with histories of using drugs

Method issues

Resolution/mitigation strategies

1. The place characteristics data is clustered in place units.

Use a random forest procedure that accounts for clustering, specifically when drawing subsamples to grow trees and when computing out-of-bag (OOB) predictions.

2. Places are connected by individuals.

Use a manual leave-one-out (LOO) procedure in which the prediction for each census tract comes from a model fit to data that exclude not only that census tract but also all persons ever seen in it.

3. Outcome data is clustered in individuals and unbalanced.

“Rough balancing”: sample a maximum of 3 time points (years) per individual and 1 place per individual-year.

4. Some place variables are not available for all years.

Try using the adjacent-year version of the variable as a proxy. If the mean squared error does not improve, revert.

5. There are time trends in the data.

Remove time trends by standardizing the census tract predictions within each year (to mean 0, variance 1), and use the standardized predictions as the V-score.
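Strategies 2, 3, and 5 above are data-handling steps, so they can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout (`person`, `year`, `tract`), the function names, the toy values, the random seed, and the choice of the population standard deviation are all assumptions made for illustration.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical toy data: one row per (person, year, tract) observation.
records = [
    {"person": "p1", "year": 2015, "tract": "A"},
    {"person": "p1", "year": 2015, "tract": "B"},
    {"person": "p1", "year": 2016, "tract": "A"},
    {"person": "p1", "year": 2017, "tract": "A"},
    {"person": "p1", "year": 2018, "tract": "C"},
    {"person": "p2", "year": 2015, "tract": "B"},
    {"person": "p2", "year": 2016, "tract": "B"},
    {"person": "p3", "year": 2015, "tract": "C"},
]


def loo_training_rows(rows, held_out_tract):
    """Strategy 2: training rows for predicting one held-out tract.

    Dropping every person ever seen in the tract also drops all rows of
    the tract itself, because each of those rows belongs to such a person.
    """
    linked = {r["person"] for r in rows if r["tract"] == held_out_tract}
    return [r for r in rows if r["person"] not in linked]


def rough_balance(rows, max_years=3, seed=0):
    """Strategy 3: at most `max_years` years per person, one tract per person-year."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    by_person_year = defaultdict(list)
    for r in rows:
        by_person_year[(r["person"], r["year"])].append(r)
    years_by_person = defaultdict(list)
    for person, year in by_person_year:
        years_by_person[person].append(year)
    sampled = []
    for person, years in years_by_person.items():
        kept_years = rng.sample(years, min(max_years, len(years)))
        for year in sorted(kept_years):
            sampled.append(rng.choice(by_person_year[(person, year)]))
    return sampled


def standardize_within_year(preds):
    """Strategy 5: z-score predictions within each year.

    `preds` maps (tract, year) to a predicted value. Assumes each year has
    at least two distinct values, so the standard deviation is nonzero.
    The population standard deviation is an assumption of this sketch.
    """
    by_year = defaultdict(list)
    for (_tract, year), value in preds.items():
        by_year[year].append(value)
    stats = {
        year: (statistics.fmean(vals), statistics.pstdev(vals))
        for year, vals in by_year.items()
    }
    return {
        (tract, year): (value - stats[year][0]) / stats[year][1]
        for (tract, year), value in preds.items()
    }
```

For example, `loo_training_rows(records, "A")` keeps only the rows for `p2` and `p3`, since `p1` was observed in tract A. In a full pipeline, the model would be refit on that reduced set for each held-out tract, and the resulting tract-year predictions would be passed to `standardize_within_year`.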