Bayesian structured additive regression modeling of epidemic data: application to cholera
- Frank B Osei^{1},
- Alfred A Duker^{2} and
- Alfred Stein^{3}
DOI: 10.1186/1471-2288-12-118
© Osei et al.; licensee BioMed Central Ltd. 2012
Received: 5 June 2011
Accepted: 12 July 2012
Published: 6 August 2012
Abstract
Background
A significant interest in spatial epidemiology lies in identifying associated risk factors which enhance the risk of infection. Most studies, however, make no or only limited use of the spatial structure of the data, and rarely consider possible nonlinear effects of the risk factors.
Methods
We develop a Bayesian Structured Additive Regression model for cholera epidemic data. Model estimation and inference are based on a fully Bayesian approach via Markov Chain Monte Carlo (MCMC) simulations. The model is applied to cholera epidemic data in the Kumasi Metropolis, Ghana. Proximity to refuse dumps, density of refuse dumps, and proximity to potential cholera reservoirs were modeled as continuous functions; presence of slum settlers and population density were modeled as fixed effects, whereas spatial references to the communities were modeled as structured and unstructured spatial effects.
Results
We observe that the risk of cholera is associated with slum settlements and high population density. The risk of cholera is equal and lower for communities with fewer refuse dumps, but variable and higher for communities with more refuse dumps. The risk is also lower for communities distant from refuse dumps and potential cholera reservoirs. The results also indicate distinct spatial variation in the risk of cholera infection.
Conclusion
The study highlights the usefulness of Bayesian semi-parametric regression models in analyzing public health data. These findings could serve as novel information to help health planners and policy makers in making effective decisions to control or prevent cholera epidemics.
Keywords
Bayesian, Cholera, Cholera reservoir, Refuse dumps, Slums
Background
A significant interest in understanding the epidemiology of diseases lies in identifying associated risk factors which enhance the risk of infection, the so-called ecological studies [1, 2]. Most of these ecological studies, however, make no or only limited use of the spatial structure of the data, nor do they consider possible nonlinear effects of the risk factors. Thus, most studies use standard statistical methods, such as classical and generalized linear models, that ignore methodological difficulties arising from the nature of the data. Ali et al. [3, 4] used logistic, simple and multiple linear regression models to study the spatial epidemiology of cholera in an endemic area of Bangladesh. Other ecological studies of cholera that have used standard statistical methods include Ackers et al. [5], Mugoya et al. [6] and Sasaki et al. [7]. When applied to spatially distributed data, these methods present severe problems in estimating small area spatial effects while simultaneously adjusting for other risk factors, in particular if such effects are nonlinear. If standard statistical methods are used to analyze spatially correlated data, the standard errors of the covariate parameters are underestimated and thus the statistical significance is overestimated [8].
Generalized additive models (GAM) provide a powerful class of models for modeling nonlinear effects of continuous covariates in regression models with non-Gaussian responses. Structured Additive Regression (STAR) models are extensions of GAM models that allow one to incorporate small area spatial effects, nonlinear effects of risk factors, and the usual linear or fixed effects in a joint model [9]. This study applies a STAR modeling approach to develop a multivariate explanatory model for cholera.
Once cholera is introduced into a population, an outbreak is enhanced by several environmental and socioeconomic risk factors. Ali et al. [3, 4] identified proximity to surface water, high population density, and low educational status as the important risk factors of cholera in an endemic area of Bangladesh. Borroto and Martinez-Piedra [10] identified poverty, low urbanization, and proximity to coastal areas as the important geographic risk factors of cholera in Mexico. Sanitation is an important environmental risk factor that predisposes inhabitants to cholera infection. Previous ecological studies have used spatial regression models to explore the dependency of cholera on some local measures of sanitation [11, 12]. No attempt, however, has been made to combine all the identified measures of sanitation, including spatial effects, into a single multivariate model to examine their joint effects on cholera. In this study, we examine the joint effects of three main spatial measures of sanitation identified from previous studies [11, 12]: density of refuse dumps, proximity to refuse dumps, and proximity to potential cholera reservoirs. Other risk factors used in this study include residence in slum and squatter environments [13] and population density [3, 4, 14, 15]. Living in slum and squatter environments increases the risk of cholera infection, whereas high population density stresses existing sanitation systems, thus putting people at increased risk of cholera.
This study incorporates the nonlinear effects of some risk factors and the usual fixed effects of others, while accounting for both structured and unstructured spatial effects. A STAR model of this type has been termed a geoadditive model [16, 17]. The increasing availability of disease and environmental data necessitates the development of such models to obtain valid and realistic statistical inferences that adequately describe the variation of the disease. Proximity to dumps, density of dumps, and proximity to potential cholera reservoirs are modeled as smooth continuous functions, whereas presence of slum settlers and population density are modeled as fixed effects, and spatial references to the communities are modeled as structured and unstructured spatial effects. We use fully Bayesian estimation based on Markov Chain Monte Carlo (MCMC) simulations with simple Gibbs sampling updates. Inference based on a fully Bayesian approach is preferred because functionals of the posterior can be computed without relying on large-sample Gaussian approximations, thereby quantifying the uncertainty in the parameters [18].
Methods
Study area and cholera data
A topographic map of the metropolis, including the n = 68 communities for which cholera records are available, was digitized. Cholera data for each community were extracted from disease records of the Kumasi Metropolitan Disease Control Unit (DCU). We accessed these data under special permission from the Kumasi DCU. The centroids of the communities were used as the spatial references of cholera cases since residential addresses were not recorded during the outbreak. The denominator (population data) for computing community-specific cholera rates was obtained from the 2000 Population and Housing Census of Ghana [19].
Model specification
Here, β is a p-dimensional vector of unknown regression coefficients for the continuous covariates $x_i$, and γ is an r-dimensional vector of unknown regression coefficients for the categorical covariates $w_i$.
Here, $f_1(x), \dots, f_p(x)$ are nonlinear smooth functions of the continuous covariates $x_{i,1}, \dots, x_{i,p}$, and $f_{\mathit{spat}}(s_i)$ is a function that accounts for spatial effects at each community $s_i \in \{1, \dots, S\}$. The spatial effect is usually a surrogate for unobserved influential factors, some of which may have a strong spatial structure while others may be present only locally (unstructured). To distinguish between the two kinds of influential factors, $f_{\mathit{spat}}(s)$ is split into a spatially correlated (smooth) part $f_{\mathit{str}}(s)$ and a spatially uncorrelated (unsmooth) part $f_{\mathit{unstr}}(s)$, i.e. $f_{\mathit{spat}}(s) = f_{\mathit{str}}(s) + f_{\mathit{unstr}}(s)$.
This model contains p + 2 functions and r fixed parameters to be estimated.
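A sketch of the predictors implied by the definitions above may help. It assumes that $\eta_i$ denotes the predictor for community $i$; that symbol and the superscript labels are introduced here for illustration and are not taken from the source.

```latex
\begin{align*}
\eta_i^{\text{lin}} &= x_i^{\prime}\beta + w_i^{\prime}\gamma, \\
\eta_i^{\text{add}} &= f_1(x_{i,1}) + \dots + f_p(x_{i,p}) + f_{\mathit{spat}}(s_i) + w_i^{\prime}\gamma, \\
\eta_i^{\text{geo}} &= f_1(x_{i,1}) + \dots + f_p(x_{i,p}) + f_{\mathit{str}}(s_i) + f_{\mathit{unstr}}(s_i) + w_i^{\prime}\gamma .
\end{align*}
```

The last line has the $p$ smooth functions plus the two spatial functions, i.e. $p + 2$ functions, together with the $r$ coefficients in $\gamma$.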
Prior distributions for covariates
A fully Bayesian approach for modeling and inference requires prior assumptions for the unknown functions $f_j(x)$, $f_{\mathit{unstr}}(s)$, $f_{\mathit{str}}(s)$ and the fixed effect regression parameter γ. For γ, we assume an independent diffuse prior $p(\gamma) \propto const$ due to the absence of any prior knowledge. A possible alternative choice is a weakly informative multivariate Gaussian distribution.
where the precision matrix $K_j$ acts as a penalty matrix that shrinks parameters towards zero, or penalizes too abrupt jumps between neighboring parameters. Since the penalty matrix $K_j$ is rank deficient, i.e. $k_j = \text{rank}(K_j) < \dim(\xi_j) = d_j$, it follows that the prior for $\xi_j | \tau_j^2$ is partially improper with Gaussian prior $\xi_j | \tau_j^2 \propto N(0; \tau_j^2 K_j^{-})$, where $K_j^{-}$ is a generalized inverse of $K_j$. The tradeoff between flexibility and smoothness is controlled by the variance parameter $\tau_j^2$. A large variance corresponds to a rough estimated function, and a small variance to a smooth one.
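To make the rank deficiency concrete, the following sketch builds a penalty matrix for a second-order random walk, the prior type named in the Discussion. It assumes $K = D^{\prime}D$ with $D$ the second-difference matrix and a hypothetical dimension of 20 coefficients; the function name and these choices are illustrative and do not describe BayesX internals.

```python
import numpy as np

def rw2_penalty(d):
    """Penalty matrix K = D'D of a second-order random walk on d coefficients."""
    # D has d - 2 rows; row t encodes xi[t] - 2*xi[t+1] + xi[t+2].
    D = np.diff(np.eye(d), n=2, axis=0)
    return D.T @ D

d = 20                                  # hypothetical number of coefficients
K = rw2_penalty(d)
rank_K = np.linalg.matrix_rank(K)       # equals d - 2, so K is rank deficient

# Constant and linear coefficient sequences lie in the null space of K,
# which is why the prior is only partially proper.
constant = np.ones(d)
linear = np.arange(d, dtype=float)
unpenalized = np.allclose(K @ constant, 0.0) and np.allclose(K @ linear, 0.0)
print(rank_K, d, unpenalized)
```

Under this construction the prior leaves the level and the linear trend of the coefficients unpenalized, and only deviations from a straight line are shrunk.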
Spatial components
where $N_s$ is the number of adjacent spatial units and $s^{\prime} \in \partial s$ denotes that spatial unit $s^{\prime}$ is a neighbor of spatial unit $s$. Thus, the conditional mean of $f_{\mathit{str}}(s)$ is an unweighted average of the function evaluations of neighboring spatial units. Since only the centroids of communities (point data) are available, we assume that the effect of spatial interaction depends on the distance between the centroids of pairs of communities. To ensure an equal number of neighbors for each community, we chose a neighborhood structure based on the k-nearest-neighbor method (where k is the number of neighbors). This approach results in an asymmetric neighborhood matrix; therefore, symmetry was imposed to obtain a symmetric neighborhood structure. As with the continuous functions $f_j$, the tradeoff between flexibility and smoothness is controlled by the variance parameter $\tau_{\mathit{str}}^{2}$.
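The neighborhood construction can be sketched as follows. This is a minimal illustration under stated assumptions: the centroids are randomly generated, k = 4 is a hypothetical choice (the value of k is not given here), and symmetry is imposed by treating two units as neighbors if either lists the other among its k nearest.

```python
import numpy as np

def knn_adjacency(coords, k):
    """Symmetric boolean neighborhood matrix from the k nearest centroids."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # a unit is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    A = np.zeros(dist.shape, dtype=bool)
    rows = np.repeat(np.arange(len(coords)), k)
    A[rows, nearest.ravel()] = True           # directed k-NN graph (asymmetric)
    return A | A.T                            # impose symmetry

rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(68, 2))     # made-up centroids, n = 68
A = knn_adjacency(coords, k=4)
N_s = A.sum(axis=1)                           # neighbors per unit after symmetrizing

f_str = rng.normal(size=68)                   # made-up function evaluations
cond_mean = (A.astype(float) @ f_str) / N_s   # unweighted neighbor average
```

Note that once symmetry is imposed in this way, some units have more than k neighbors, so $N_s$ can vary across units.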
In this study, we use the standard hyper-parameter option proposed by Fahrmeir et al. [18]: IG (a = b = 0.001).
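Under an IG(a, b) hyperprior and a Gaussian smoothness prior with penalty matrix $K_j$, the standard conjugate update for a variance parameter is IG($a + \text{rank}(K_j)/2$, $b + \xi_j^{\prime} K_j \xi_j / 2$). The sketch below draws from that distribution; the coefficient vector, the second-order random walk penalty, and the function name are assumptions made for illustration only.

```python
import numpy as np

def draw_variance(xi, K, a=0.001, b=0.001, rng=None, size=None):
    """Draw tau^2 from IG(a + rank(K)/2, b + xi'K xi/2)."""
    rng = np.random.default_rng() if rng is None else rng
    shape = a + 0.5 * np.linalg.matrix_rank(K)
    rate = b + 0.5 * float(xi @ K @ xi)
    # If X ~ Gamma(shape, scale=1/rate), then 1/X ~ IG(shape, rate).
    return 1.0 / rng.gamma(shape, 1.0 / rate, size=size)

rng = np.random.default_rng(2)
d = 20                                   # hypothetical number of coefficients
D = np.diff(np.eye(d), n=2, axis=0)      # second-difference matrix (assumed penalty)
K = D.T @ D
xi = np.cumsum(rng.normal(size=d))       # made-up coefficient vector
tau2 = draw_variance(xi, K, rng=rng, size=20000)
```

With a = b = 0.001 the update is dominated by the data term $\xi^{\prime} K \xi$, which is the sense in which this hyperprior is only weakly informative.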
Bayesian inference
where $L(\cdot)$ is the likelihood function. The full conditionals for the variance components $\tau_j^2$, $j = 1, \dots, p, \mathit{str}, \mathit{unstr}$, and $\sigma^2$ are inverse Gamma distributions. The full conditionals for the fixed parameters γ, the unknown parameter vectors $\xi_1, \dots, \xi_p$, as well as $f_{\mathit{str}}(s)$ and $f_{\mathit{unstr}}(s)$, are multivariate Gaussian. A Gibbs sampler was employed for the MCMC simulations, drawing successively from the full conditionals for the variance components and the unknown parameters. Cholesky decompositions for band matrices were used to efficiently draw random samples from the full conditionals [22, 23].
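The Cholesky step can be illustrated as follows. This is a minimal sketch that uses a dense factorization for readability rather than the band-matrix routines mentioned above; the precision matrix and the function name are made up for the example.

```python
import numpy as np

def sample_from_precision(mu, Q, rng, size=1):
    """Draw `size` samples from N(mu, Q^{-1}) given the precision matrix Q."""
    L = np.linalg.cholesky(Q)                  # Q = L L^T, L lower triangular
    z = rng.standard_normal((len(mu), size))   # independent N(0, 1) draws
    x = np.linalg.solve(L.T, z)                # solve L^T x = z, so Cov(x) = Q^{-1}
    return (np.asarray(mu)[:, None] + x).T

rng = np.random.default_rng(1)
Q = np.array([[2.0, -1.0], [-1.0, 2.0]])       # made-up tridiagonal precision
mu = np.zeros(2)
draws = sample_from_precision(mu, Q, rng, size=20000)
emp_cov = np.cov(draws.T)                      # should be close to inv(Q)
```

Working with the precision matrix avoids forming its inverse explicitly; when the precision is banded, as for random walk and Markov random field priors, a banded factorization keeps the cost roughly linear in the dimension.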
Model implementation
The continuous covariates used in this study are proximity to refuse dumps $d_{\mathit{dump}}$, density of refuse dumps $\rho_{\mathit{dump}}$, and proximity to potential cholera reservoirs $d_{\mathit{reser}}$. These variables were extracted on a per-community basis using a Geographic Information System (GIS). Details of the approaches for the calculation of these variables can be found in Osei and Duker [11] and Osei et al. [12]. The spatial locations of the communities are used to model the spatial effects. In the Kumasi area, no administrative boundaries separate the communities. For ease of visualization and interpretation, the centroids of the communities are converted to Thiessen polygons, whose boundaries define the area that is closest to each centroid relative to all other centroids.
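The defining property of Thiessen polygons, that every location belongs to the polygon of its nearest centroid, can be sketched directly. The coordinates below are made up, and the function name is hypothetical; an actual GIS workflow would construct the polygon boundaries rather than label individual points.

```python
import numpy as np

def thiessen_label(points, centroids):
    """Index of the nearest centroid (i.e. the Thiessen polygon) for each point."""
    diff = points[:, None, :] - centroids[None, :, :]
    return np.argmin((diff ** 2).sum(axis=2), axis=1)

centroids = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])   # made-up centroids
points = np.array([[1.0, 0.5], [3.5, 0.2], [0.3, 3.0]])      # made-up locations
labels = thiessen_label(points, centroids)
print(labels)   # [0 1 2]
```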
Model 1 is a strictly linear regression that assumes a linear effect of the categorical and continuous covariates. Model 2 is an additive model which assumes nonlinear functions for the continuous covariates and linear effects of the categorical covariates. Model 3 is a geoadditive model, which is an extension of Model 2 that incorporates both structured and unstructured spatial effects.
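Using the coefficient and function labels that appear in the tables of the Results ($\gamma_1$, $\gamma_2$, $\beta_1, \dots, \beta_3$, $f_1, \dots, f_3$), the three predictors can be sketched as below. The predictor symbol $\eta_i$ and the intercept symbol $\gamma_0$ are assumptions introduced here, so this is an illustrative reading of the model descriptions rather than the authors' exact notation.

```latex
\begin{align*}
\text{Model 1:}\quad \eta_i &= \gamma_0 + \gamma_1 \rho_{\mathit{pop},i} + \gamma_2 \varsigma_{\mathit{slum},i}
  + \beta_1 \rho_{\mathit{dump},i} + \beta_2 d_{\mathit{dump},i} + \beta_3 d_{\mathit{reser},i}, \\
\text{Model 2:}\quad \eta_i &= \gamma_0 + \gamma_1 \rho_{\mathit{pop},i} + \gamma_2 \varsigma_{\mathit{slum},i}
  + f_1(\rho_{\mathit{dump},i}) + f_2(d_{\mathit{dump},i}) + f_3(d_{\mathit{reser},i}), \\
\text{Model 3:}\quad \eta_i &= \gamma_0 + \gamma_1 \rho_{\mathit{pop},i} + \gamma_2 \varsigma_{\mathit{slum},i}
  + f_1(\rho_{\mathit{dump},i}) + f_2(d_{\mathit{dump},i}) + f_3(d_{\mathit{reser},i}) \\
  &\quad + f_{\mathit{str}}(s_i) + f_{\mathit{unstr}}(s_i).
\end{align*}
```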
The models were implemented in the public domain software BayesX version 2.0 [24, 25]. We used a total of 40,000 MCMC iterations and 10,000 burn-in samples. Since successive draws of the chain are, in general, correlated, only every 20^{th} sampled parameter of the Markov chain was stored. This yielded 2,000 samples for parameter estimation. Convergence of the MCMC algorithm was checked using autocorrelations and sampling paths.
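The stored sample count follows from simple arithmetic. The sketch below assumes that burn-in draws are discarded before thinning; under that assumption, 2,000 stored samples corresponds to 40,000 post-burn-in iterations (50,000 in total), whereas counting the burn-in inside the 40,000 would leave 1,500. The function name is hypothetical.

```python
def retained_samples(n_iter, burn_in, thin):
    """Number of stored draws when every `thin`-th post-burn-in draw is kept."""
    return (n_iter - burn_in) // thin

print(retained_samples(50_000, 10_000, 20))  # 2000
print(retained_samples(40_000, 10_000, 20))  # 1500
```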
We compared the strictly linear models with the additive models and the geoadditive models using the Deviance Information Criterion (DIC) values [26]. DIC is a Bayesian tool for model checking and comparison, where the model with the smallest DIC is preferred. The DIC is given by $DIC = \overline{D} + p_D$, where $\overline{D}$ is the posterior mean of the deviance, which is a measure of goodness of fit, and $p_D$ is the effective number of parameters, which is a measure of model complexity and penalizes over-fitting.
Results
Model selection
Table 1 Comparison of model fit using the Deviance Information Criterion (DIC)
Model fit | Model 1 | Model 2 | Model 3 |
---|---|---|---|
$\overline{D}$ | 37.40 | 32.35 | 10.64 |
$p_D$ | 5.85 | 8.95 | 9.43 |
DIC | 43.25 | 41.30 | 20.07 |
ΔDIC ^{§} | 23.18 | 21.23 | Reference |
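The DIC and ΔDIC rows can be reproduced from the reported $\overline{D}$ and $p_D$ values. The short check below uses only the numbers in the table; the variable names are illustrative.

```python
# (posterior mean deviance, effective number of parameters) as reported
table = {
    "Model 1": (37.40, 5.85),
    "Model 2": (32.35, 8.95),
    "Model 3": (10.64, 9.43),
}

# DIC = D-bar + pD; the model with the smallest DIC is the reference
dic = {m: round(dbar + pd, 2) for m, (dbar, pd) in table.items()}
best = min(dic, key=dic.get)
delta = {m: round(v - dic[best], 2) for m, v in dic.items()}

print(dic)
print(best)
print(delta)
```

The gain from Model 2 to Model 3 comes almost entirely from the drop in $\overline{D}$, while $p_D$ increases only slightly (8.95 to 9.43).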
Fixed and nonlinear effects of covariates
Table 2 Estimates of fixed effect parameters based on the linear Model 1
Variable | Mean | Std. error | 10% | 90% |
---|---|---|---|---|
Constant | 0.444* | 0.213 | 0.171 | 0.718 |
$\varsigma_{\mathit{slum}}$, $\gamma_2$ | 0.267* | 0.098 | 0.141 | 0.393 |
$\rho_{\mathit{pop}}$, $\gamma_1$ | 0.344* | 0.089 | 0.230 | 0.457 |
$\rho_{\mathit{dump}}$, $\beta_1$ | 0.156* | 0.039 | 0.107 | 0.206 |
$d_{\mathit{dump}}$, $\beta_2$ | 4.99E-05 | 7.19E-05 | −4.40E-05 | 1.40E-04 |
$d_{\mathit{reser}}$, $\beta_3$ | −6.54E-05 | 6.42E-05 | −1.44E-04 | 1.63E-05 |
Table 3 Estimates of posterior mean and 90% credible intervals for the fixed effects for Model 3
Variable | Mean | Std. error | 10% | 90% |
---|---|---|---|---|
Constant | 0.73* | 0.081 | 0.63 | 0.83 |
$\varsigma_{\mathit{slum}}$, $\gamma_2$ | 0.28* | 0.095 | 0.16 | 0.40 |
$\rho_{\mathit{pop}}$, $\gamma_1$ | 0.32* | 0.092 | 0.20 | 0.44 |
Spatial effects
Table 4 Summary of the sensitivity analysis of the choice of hyper-parameters for Model 3
Variance parameter | IG (a = 0.001, b = 0.001) | IG (a = 0.01, b = 0.01) | IG (a = 0.5, b = 0.0005) | IG (a = 1, b = 0.005) |
---|---|---|---|---|
Spatial effects ^{‡} | | | | |
$f_{\mathit{str}}(s)$, $\tau_{\mathit{str}}^{2}$ | 0.02 (0.0005 - 0.06) | 0.028 (0.003 - 0.07) | 0.004 (0.00009 - 0.01) | 0.004 (0.0006 - 0.0009) |
$f_{\mathit{unstr}}(s)$, $\tau_{\mathit{unstr}}^{2}$ | 0.02 (0.0009 - 0.057) | 0.031 (0.005 - 0.056) | 0.007 (0.0001 - 0.028) | 0.0071 (0.0006 - 0.019) |
Smooth functions ^{§} | | | | |
$f_1(\rho_{\mathit{dump}})$, $\tau_1^{2}$ | 0.003 (0.0005 - 0.006) | 0.006 (0.002 - 0.013) | 0.0014 (0.0002 - 0.003) | 0.002 (0.0006 - 0.004) |
$f_2(d_{\mathit{dump}})$, $\tau_2^{2}$ | 0.003 (0.0002 - 0.0058) | 0.0078 (0.002 - 0.017) | 0.0007 (0.00008 - 0.0015) | 0.002 (0.0004 - 0.004) |
$f_3(d_{\mathit{reser}})$, $\tau_3^{2}$ | 0.001 (0.0002 - 0.0024) | 0.004 (0.001 - 0.009) | 0.0004 (0.00006 - 0.0007) | 0.001 (0.0004 - 0.003) |
Sensitivity analyses
Since the regression parameters may depend on the choice of hyper-parameters, we reran the MCMC simulations, using Model 3 for simplicity, to investigate the sensitivity of our results to different choices of hyper-parameters. In particular, the following alternative priors were investigated: IG (a = 0.01, b = 0.01), IG (a = 0.5, b = 0.0005) and IG (a = 1, b = 0.005). The first alternative and the standard option IG (a = 0.001, b = 0.001) are commonly used choices for the variances of random effects. The second and third alternatives are suggested by Kelsall and Wakefield [28] and Besag and Kooperberg [27], respectively. Results of the sensitivity analysis on the choice of the hyper-parameters a and b are shown in Table 4. The four choices of hyper-parameters yielded similar inferences for the posterior means of the fixed parameters. Only minor differences occur between the variance parameters for the nonlinear functions and the spatial effects, which suggests that our model is not very sensitive to the choice of hyper-parameters.
Discussion
This study uses a geoadditive modeling approach to develop a multivariate explanatory model for the risk of cholera. We use a Bayesian semi-parametric regression model to elucidate the probability of cholera infection in relation to associated risk factors, some identified from previous studies [11, 12]. The geoadditive modeling approach is an extension of the GAM which allows the inclusion of both structured and unstructured spatial effects to account for possible unobserved factors and heterogeneity terms. To allow flexibility, the continuous covariates are modeled non-parametrically as nonlinear functions using P-splines with second-order random walk priors, based on contributions by Fahrmeir and Lang [29, 30] and Fahrmeir et al. [18], while the categorical covariates are modeled as fixed effects. The spatially structured and unstructured effects are modeled using Markov random field priors and zero mean Gaussian heterogeneity priors, respectively [31]. In this modeling approach, fully Bayesian inference based on MCMC simulations is preferred because functionals of the posterior can be easily computed, thereby easily quantifying the uncertainty in the estimated parameters [18].
The findings of the study show that the risk of cholera infection is high among inhabitants dwelling in slums. The risk of infection is also relatively high in densely populated communities. These relationships may exist because most communities with slum settlers are densely populated. Although cholera is transmitted mainly through contaminated water or food, poor sanitary conditions in the environment enhance its transmission. The cholera vibrios can survive and multiply outside the human body and can spread rapidly where living conditions are overcrowded and where there is no safe disposal of solid waste, liquid waste, and human feces [3, 4]. These conditions are mostly met in slum and densely populated communities in Kumasi. High population density may result in shorter disease transmission paths, thus increasing the risk of cholera infection. Also, inhabitants of slum areas are generally poor and face problems including limited access to potable water and sanitation. In many cases, public utility providers (e.g. water distribution) legally fail to serve these urban poor due to factors regarding the land tenure system, technical and service regulations, and city development plans. Most slum settlements are also located in low-lying areas susceptible to flooding. Unfavorable topography, soil, and hydro-geological conditions make it difficult to achieve and maintain high sanitation standards among such inhabitants [10].
The risk of cholera infection is observed to decrease with increasing distance from refuse dumps, with inhabitants within 500 m of refuse dumps being the most vulnerable. This is consistent with the finding of a previous study, in which a quantitative assessment of critical distance discrimination on experimental buffer zones around refuse dumps showed that the optimum spatial discrimination of cholera occurs at 500 m away from refuse dumps [11]. Therefore, we hypothesize that refuse dumps located within 500 m of inhabitants enhance the risk of cholera infection compared with those farther away. The decreasing trend of Chol _{(R)} for $d_{\mathit{dump}} \ge 500\ \text{m}$ strengthens the acceptance of this hypothesis. Collectively, the nonlinear effects of $d_{\mathit{dump}}$ and $\rho_{\mathit{dump}}$ on Chol _{(R)} suggest that cholera risk is relatively high among inhabitants who live in close proximity to refuse dumps, and where there are numerous refuse dumps. Due to the poor defecation practices of most inhabitants, the refuse dumps may contain substantial fecal matter. Surface drainage from such refuse dumps pollutes water sources with feces, and use of this water perpetuates the transmission of cholera vibrios. If the runoff from waste dumps during heavy rains serves as the major pathway for fecal and bacterial contamination of rivers and streams, then it is likely that inhabitants living closer to the water bodies into which this runoff flows will have higher cholera prevalence than those who live farther away. The observed decreasing cholera prevalence with increasing distance from potentially polluted surface water bodies (Figure 4), and the significant linear relationship between $d_{\mathit{dump}}$ and $d_{\mathit{reser}}$ (results from preliminary regression analysis: β = 0.67, $R^2$ = 0.34, p < 0.001), support this hypothesis.
Cholera is primarily driven by environmental and socio-economic factors [3, 4]; prior knowledge indicates that geographically close communities will tend to have similar relative risks, which indicates the existence of structured spatial variation in the relative risk. The structured spatial effects included in the model are surrogate measures of unobserved spatially correlated risk factors of cholera. The results show clear evidence of significant clustering of cholera, with higher cholera risk occurring in the central part (the Central Business District), and a lower risk occurring in the south-eastern part (the periphery) of Kumasi (Figure 5). These patterns clearly indicate possible unobserved risk factors of cholera, which may be global or local. For example, the increased risk in the central part of Kumasi may be due to the high daily influx of traders and civil workers from other communities to the Central Business District. Such a high daily influx strains existing sanitation systems, which consequently puts people at increased risk of cholera. The dominance of the unstructured spatial effects over the structured spatial effects indicates that the unobserved risk factors are more local than global. For instance, household socioeconomic characteristics may cause such local spatial variation. This provides leads for further epidemiological research using additional information at the household spatial scale within the study area.
Unlike classical modeling approaches, our methodological concept allows modeling flexibility which can reveal salient features of the continuous covariates. For instance, the use of only the linear model, Model 1, would have led to an invalid rejection of the significance of some important risk factors: density of refuse dumps, and proximity to potential cholera reservoirs. Such a modeling approach is useful for establishing a better understanding of the epidemiological relationship between the disease and the risk factors. Although the methodological concept is somewhat mathematically intensive, the availability of the public domain software BayesX provides opportunities for nonprogrammers to use these methods.
Limitations of study
Data limitations have forced this study to be undertaken within a single-scale framework; therefore, scale effects have not been accounted for. Consequently, possible biases induced by the modifiable areal unit problem (MAUP) have been ignored. If data at different spatial scales were available, the possible bias of MAUP could be evaluated within a multi-scale analysis framework, as exemplified in Odoi et al. [32]. Moreover, re-aggregating the data to another set of areal units could assess the possible bias of MAUP [33]. However, this is impossible due to the limited availability of higher resolution data and difficulties in assessing the associated ecological fallacy. In accordance with the general rule of practice, the study analyzed aggregated data using the smallest areal units for which data were available to ameliorate the effects of aggregation. Accordingly, statistical inferences in this study are made at the group level rather than the individual level.
Also, our choice of neighborhood structure induces an assumption that all the inhabitants reside at the centroid of the communities. In reality, the communities have boundaries whereby their adjacency reflects the true nature of the spatial structure. Also, the maps of the spatial effects should be interpreted with caution as the spatial boundaries used are artificial (Thiessen polygons). Perhaps different spatial patterns may be visually observed if the true boundaries of the spatial units existed.
Conclusion
This study applies a Bayesian semi-parametric modeling approach to develop an explanatory model of cholera. Such flexible modeling approaches allow joint analysis of nonlinear effects of continuous covariates, spatially structured variation, unstructured heterogeneity, and fixed effect covariates. Our model reveals that the risk of cholera infection is associated with slum settlements, high population density, proximity to and density of waste dumps, proximity to potentially polluted rivers and streams, as well as possible unobserved risk factors. The possible unobserved risk factors are shown by the distinct spatial patterns exhibited by the spatial covariates, suggesting the need for further epidemiological research. These findings should serve as novel information to help health planners and policy makers in making effective decisions about cholera control measures.
Declarations
Acknowledgements
We extend our sincere appreciation to the Kumasi Metropolitan Health Directorate for providing all the necessary data and background information for this research.
Authors’ Affiliations
References
- Lawson A, Biggeri A, Bohning, Lesaffre E, Viel J-F, Bertollini R: Introduction to spatial models in ecological analysis. Disease Mapping and Risk Assessment for Public Health. Edited by: Lawson A, Biggeri A, Bohning, Lesaffre E, Viel J-F, Bertollini R. 1999, Chichester: Wiley, 181-191.
- Lawson AB: Statistical Methods in Spatial Epidemiology. 2001, Chichester: Wiley.
- Ali M, Emch M, Donnay JP, Yunus M, Sack RB: Identifying environmental risk factors of endemic cholera: a raster GIS approach. Health Place. 2002, 8: 201-210. 10.1016/S1353-8292(01)00043-0.
- Ali M, Emch M, Donnay JP, Yunus M, Sack RB: The spatial epidemiology of cholera in an endemic area of Bangladesh. Soc Sci Med. 2002, 55: 1015-1024. 10.1016/S0277-9536(01)00230-1.
- Ackers M-L, Quick RE, Drasbek CJ, Hutwagner L, Tauxe RV: Are there national risk factors for epidemic cholera? The correlation between socioeconomic and demographic indices and cholera incidence in Latin America. Int J Epid. 1998, 27: 330-334. 10.1093/ije/27.2.330.
- Mugoya I, Kariuki S, Galgalo T, Njuguna C, Omollo J, Njoroge J, Kalani R, Nzioka C, Tetteh C, Bedno S, Breiman RF, Feikin DR: Rapid Spread of Vibrio cholerae O1 Throughout Kenya, 2005. Am J Trop Med Hyg. 2008, 78 (3): 527-533.
- Sasaki S, Suzuki H, Igarashi K, Tambatamba B, Mulenga P: Spatial Analysis of Risk Factor of Cholera Outbreak for 2003–2004 in a Peri-urban Area of Lusaka, Zambia. Am J Trop Med Hyg. 2008, 79 (3): 414-421.
- Cressie NAC: Statistics for Spatial Data. 1993, New York: Wiley.
- Kneib T: Mixed model based inference in structured additive regression. 2005, PhD thesis: Universitat Munchen.
- Borroto RJ, Martinez-Piedra R: Geographical patterns of cholera in Mexico, 1991–1996. Int J Epid. 2000, 29: 764-772. 10.1093/ije/29.4.764.
- Osei FB, Duker AA: Spatial dependency of V. cholerae prevalence on open space refuse dumps in Kumasi, Ghana: a spatial statistical modeling. Int J Health Geog. 2008, 7: 62. 10.1186/1476-072X-7-62.
- Osei FB, Duker AA, Augustijn E-W, Stein A: Spatial dependency of cholera prevalence on potential cholera reservoirs in an urban area, Kumasi, Ghana. Int J Appl Earth Obs Geoinf. 2010, 12 (5): 331-339. 10.1016/j.jag.2010.04.005.
- Sur D, Deen J, Manna B, Niyogi S, Deb A, Kanungo S, Sarkar B, Kim D, Danovaro-Holliday M, Holliday K, Gupta V, Ali M, von Seidlein L, Clemens J, Bhattacharya S: The burden of cholera in the slums of Kolkata, India: data from a prospective, community based study. Arch Dis Child. 2005, 90 (11): 1175-1181. 10.1136/adc.2004.071316.
- Siddique AK, Zaman K, Baqui AH, Akram KA, Mutsuddy P, Eusof A, Haider K, Islam S, Sack RB: Cholera epidemics in Bangladesh: 1985–1991. J Diar Dis Res. 1992, 10 (2): 79-86.
- Root G: Population density and spatial differentials in child mortality in Zimbabwe. Soc Sci Med. 1997, 44 (3): 413-421. 10.1016/S0277-9536(96)00162-1.
- Kamman EE, Wand MP: Geoadditive Models. J Royal Stat Soc Series C. 2003, 52: 1-18. 10.1111/1467-9876.00385.
- Ruppert D, Wand M, Carroll R: Semiparametric Regression. 2003, Cambridge: Cambridge University Press.
- Fahrmeir L, Kneib T, Lang S: Penalized structured additive regression for space-time data: a Bayesian perspective. Stat Sin. 2004, 14: 731-761.
- PHC: Population and Housing Census of Ghana. 2005, Ghana: Ghana Statistical Service.
- Eilers PHC, Marx BD: Flexible smoothing using B-splines and penalties (with comments and rejoinder). Stat Sci. 1996, 11: 89-121. 10.1214/ss/1038425655.
- Lang S, Brezger A: Bayesian P-splines. J Comp Graph Stat. 2004, 13: 183-212. 10.1198/1061860043010.
- Rue H: Fast sampling of Gaussian Markov random fields with applications. J Royal Stat Soc Series B. 2001, 63: 325-338. 10.1111/1467-9868.00288.
- Rue H, Held L: Gaussian Markov Random Fields: Theory and Applications. 2005, Boca Raton: Chapman and Hall.
- Brezger A, Kneib T, Lang S: BayesX: Analyzing Bayesian structured additive regression models. J Stat Soft. 2005, 14: 11.
- Belitz C, Brezger A, Kneib T, Lang S: BayesX-Software for Bayesian inference in structured additive regression models. 2009, Version 2.0. [http://www.stat.uni-muenchen.de/~bayesx]
- Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A: Bayesian measures of model complexity and fit (with discussion). J Royal Stat Soc Series B. 2002, 64: 583-640. 10.1111/1467-9868.00353.
- Besag J, Kooperberg C: On conditional and intrinsic autoregressions. Biometrika. 1995, 82: 733-746.
- Kelsall J, Wakefield J: Discussion of "Bayesian models for spatially correlated disease and exposure data". Bayesian Statistics 6. Edited by: Best NG, Arnold RA, Thomas A, Conlon E, Waller LA, Bernado JM, Berger JO, Dawid AP, Smith AFM. 1999, Oxford: Oxford University Press, 151-
- Fahrmeir L, Lang S: Bayesian inference for generalized additive mixed models based on Markov random field priors. Applied Statistics. 2001, 50: 201-220. 10.1111/1467-9876.00229.
- Fahrmeir L, Lang S: Bayesian semiparametric regression analysis of multicategorical time-space data. Ann Inst Stat Math. 2001, 53: 11-30. 10.1023/A:1017904118167.
- Besag J, York Y, Mollie A: Bayesian image-restoration, with two applications in spatial statistics (with discussion). Ann Inst Stat Math. 1991, 43: 1-59. 10.1007/BF00116466.
- Odoi A, Martin SW, Michel P, Middleton D, Holt J, Wilson J: Investigation of clusters of giardiasis using GIS and spatial scan statistics. Int J Health Geog. 2004, 3: 11. 10.1186/1476-072X-3-11.
- Atkinson P, Molesworth A: Geographical analysis of communicable disease data. Spatial Epidemiology; Methods and Applications. Edited by: Elliot P, Wakefield JC, Best NG, Briggs DJ. 2000, New York: Oxford University Press, 253-266.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/12/118/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.