## Abstract

### Background

For environmentally influenced communicable diseases, proximity to environmental sources creates spatial heterogeneity of risk, which is often difficult to measure in the field. Most prevention trials rely on randomization to achieve comparability between groups and therefore do not explicitly account for this heterogeneity.

This study aimed to determine under what conditions spatial heterogeneity biases the results of randomized prevention trials, and to compare different approaches to modeling this heterogeneity.

### Methods

Using the example of a malaria prevention trial, simulations were performed to quantify the impact of spatial heterogeneity and to compare different models.

Simulated scenarios combined variation in baseline risk, a continuous protective factor (age), an unrelated factor (sex), and a binary protective factor (preventive treatment). Simulated spatial heterogeneity scenarios combined variation in breeding site density and effect, location, and population density.
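The kind of scenario described above can be sketched as follows. This is a minimal illustration, not the simulation design used in the study: the function `simulate_trial`, the exponential survival times, the exponential distance decay around breeding sites, the unit-square study area, and all parameter values are assumptions made for the example.

```python
import math
import random


def simulate_trial(n=500, n_sites=5, site_effect=1.0, decay=0.1,
                   baseline=0.01, beta_age=-0.02, beta_trt=-0.5,
                   follow_up=365.0, seed=0):
    """Simulate one trial with an unmeasured spatial risk (hypothetical design)."""
    rng = random.Random(seed)
    # Breeding sites placed uniformly at random in a unit square (assumption).
    sites = [(rng.random(), rng.random()) for _ in range(n_sites)]
    rows = []
    for _ in range(n):
        x, y = rng.random(), rng.random()   # participant location
        age = rng.uniform(0.0, 60.0)        # continuous protective factor
        sex = rng.randint(0, 1)             # factor unrelated to risk
        trt = rng.randint(0, 1)             # simple randomization to treatment
        # Spatial log-hazard: sum of site effects decaying with distance (assumption).
        spatial = sum(
            site_effect * math.exp(-math.hypot(x - sx, y - sy) / decay)
            for sx, sy in sites
        )
        hazard = baseline * math.exp(beta_age * age + beta_trt * trt + spatial)
        event_time = rng.expovariate(hazard)  # exponential survival time
        rows.append({
            "x": x, "y": y, "age": age, "sex": sex, "trt": trt,
            "spatial": spatial,
            "time": min(event_time, follow_up),          # administrative censoring
            "event": 1 if event_time <= follow_up else 0,
        })
    return rows
```

Raising `n_sites` or `site_effect` in this sketch mimics the high breeding site density and strong breeding site effect scenarios, while setting `site_effect=0.0` removes the spatial heterogeneity altogether.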

The performance of five statistical models was assessed: a non-spatial Cox Proportional Hazards (Cox-PH) model and four models accounting for spatial heterogeneity, namely a Data-Generating Model, a Generalized Additive Model (GAM), and two Stochastic Partial Differential Equation (SPDE) models, one modeling survival time and the other the number of events. The SPDE models were estimated within a Bayesian framework using the Integrated Nested Laplace Approximation algorithm.
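The contrast between the non-spatial and spatial models can be written generically as below. This is a textbook-style formulation given for orientation only; the notation ($h_0$, $\beta$, $u$, $s_i$) is assumed here and is not taken from the study's exact model specification.

$$
h_i(t) = h_0(t)\,\exp\!\left(\beta_{\text{age}}\,\text{age}_i + \beta_{\text{sex}}\,\text{sex}_i + \beta_{\text{trt}}\,\text{trt}_i\right)
\qquad \text{(non-spatial Cox-PH)}
$$

$$
h_i(t) = h_0(t)\,\exp\!\left(\beta_{\text{age}}\,\text{age}_i + \beta_{\text{sex}}\,\text{sex}_i + \beta_{\text{trt}}\,\text{trt}_i + u(s_i)\right)
\qquad \text{(spatial models)}
$$

Here $u(s_i)$ denotes a spatial effect at the location $s_i$ of participant $i$. In a GAM, $u$ is typically a smooth function of the coordinates; in the SPDE approach, $u$ is typically a Gaussian random field with Matérn covariance.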

For each factor (age, sex, treatment), model performance was assessed by quantifying parameter estimation bias, mean squared error, confidence interval coverage rate (CR), and significance rate. The four models were applied to data from a trial of a malaria transmission-blocking vaccine candidate.
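The four performance criteria listed above can be computed from repeated simulation runs roughly as follows. This is a sketch under stated assumptions: the helper `performance_metrics` is hypothetical, and it uses Wald-type 95% intervals (estimate ± 1.96 × standard error), whereas the Bayesian models in the study may rely on credible intervals instead.

```python
def performance_metrics(estimates, std_errors, true_value, z=1.96):
    """Summarize one parameter's estimates across simulated datasets.

    estimates   -- one point estimate per simulated dataset
    std_errors  -- matching standard errors
    true_value  -- the parameter value used to generate the data
    z           -- normal quantile for the interval (1.96 for 95%, assumption)
    """
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    covered = 0      # intervals that contain the true value
    significant = 0  # intervals that exclude zero
    for est, se in zip(estimates, std_errors):
        lower, upper = est - z * se, est + z * se
        if lower <= true_value <= upper:
            covered += 1
        if lower > 0 or upper < 0:
            significant += 1
    return {
        "bias": bias,
        "mse": mse,
        "coverage_rate": covered / n,
        "significance_rate": significant / n,
    }
```

For a factor with no true effect, such as sex in the simulations, the significance rate computed this way corresponds to the type I error rate; for age and treatment it corresponds to power.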

### Results

The level of baseline risk did not affect the estimates. However, with a high breeding site density and a strong breeding site effect, the Cox-PH and GAM models underestimated the age and treatment effects (but not the sex effect) and had low CRs.

When population density was low, the Cox-SPDE model slightly overestimated the effects of the risk-related factors (age, treatment). The two SPDE models corrected for the impact of spatial heterogeneity and thus provided the best estimates.

### Conclusion

Our results show that when spatial heterogeneity is substantial but not measured, randomization alone cannot achieve comparability between groups. In such cases, prevention trials should model spatial heterogeneity with an appropriate method.

### Trial registration

The dataset used for the application example was extracted from Vaccine Trial #NCT02334462 (ClinicalTrials.gov registry).