Parasite threshold associated with clinical malaria in areas of different transmission intensities in north eastern Tanzania

Background In sub-Saharan Africa, malaria due to Plasmodium falciparum is the main cause of ill health. Evaluation of malaria interventions, such as drugs and vaccines, depends on a clinical definition of the disease, which remains a challenge because malaria lacks distinct, specific clinical features. A parasite threshold is used to define clinical malaria when interventions are evaluated. This threshold, however, is likely to be influenced by other factors such as transmission intensity and the individual's level of immunity against malaria. Methods This paper describes a step function model and a dose response model with a threshold parameter as tools for estimating the parasite threshold for onset of malaria fever in highland (low transmission intensity) and lowland (high transmission intensity) strata. The models were fitted using logistic regression stratified by stratum and age group (0-1, 2-3, 4-5, 6-9, and 10-19 years). The dose response model was further extended to fit all age groups combined in each stratum. Sub-sampling bootstrap was used to compute confidence intervals. Cross-sectional and passive case detection data from Korogwe district, north eastern Tanzania, were used. Results The dose response model was better at estimating parasite thresholds. Parasite thresholds (scale = log parasites/μL) were higher in the lowlands than in the highlands. In the lowlands, children in age group 4-5 years had the highest parasite threshold (8.73) while individuals aged 10-19 years had the lowest (6.81). In the highlands, children aged 0-1 years had the highest threshold (7.12) and those aged 10-19 years had the lowest (4.62). Regression analysis with all ages combined showed a similar pattern of thresholds in both strata: in the lowlands the threshold was highest in age group 2-5 years and lowest in older individuals, while in the highlands it was highest in age group 0-1 years and decreased with increasing age. The sensitivity of the parasite threshold by age group ranged from 64% to 74% in the lowlands and from 67% to 97% in the highlands, while specificity ranged from 67% to 90% in the lowlands and from 37% to 73% in the highlands. Conclusion The dose response model with a threshold parameter can be used to estimate the parasite threshold associated with onset of malaria fever. Parasite thresholds were lower in older individuals and in the low malaria transmission area.


Background
Malaria is a major cause of ill health in Africa, especially south of the Sahara, where it takes its greatest toll in young children and pregnant women. About 90% of all malaria deaths in the world today occur in sub-Saharan Africa, where the most virulent species of the parasite, Plasmodium falciparum flourishes. According to health facility statistics, in Tanzania malaria is the leading cause of morbidity and mortality and accounts for about 30% and 15% of hospital admissions and deaths, respectively [1]. The major cause of malaria related deaths is severe anaemia [2] and complicated malaria, which can be attributed to delay in seeking health care [3], inadequate functional referral system, poor quality of health care and emergence of resistance to commonly used antimalarial drugs [4].
Control of malaria currently relies on chemotherapy [5] and mosquito control [6,7]. Vaccination, which could target different stages of parasite development, would be a further control strategy. Several vaccines are at different phases of clinical development [8]; however, none is yet licensed.
Malaria control faces many challenges arising from parasite and host dynamics, which include resistance to both insecticides and antimalarial drugs, and the slow development of immunity [5]. Acquired malaria immunity varies with the level of exposure to parasites; hence individuals living in holo-endemic areas tend to acquire immunity sooner than those in less endemic areas. Thus, individuals of similar age living under different malaria transmission intensities can vary considerably in their vulnerability to infection [9].
Evaluating the efficacy of different interventions requires a precise definition of endpoints, but the clinical features of malaria are not specific: neither the presence of parasites alone nor parasites with fever is an adequate definition of clinical malaria [10,11]. High parasite density is, however, more likely to coincide with fever [10,11]. Criteria for defining clinical malaria should therefore be set with care, because cases may be misclassified as false positives or false negatives, which decreases specificity and sensitivity, respectively. Low sensitivity results when true positive cases are misclassified as negatives, which can arise from measurement errors or from a case definition that is too stringent.
The predominant variables in the definition of malaria disease are parasite density and measured fever. For example, Smith et al. [10] used a logistic model for the fever attributable to different levels of parasite density in endemic areas, and then used sensitivity-specificity analysis to estimate the parasite threshold associated with fever. Rogier et al. [12] modeled the parasite threshold as a step function related to the risk of fever. Other models that could be used include dose response models, which have been used in toxicology studies [13-15]. These models consider different dose levels in order to establish a dose below which no effect occurs in organs or systems or, in the extreme, the death of an organism. In toxicology, the threshold is the dose below which the effect is likely to be similar to the background (i.e. similar to dose zero) [15].
The main aim of this paper is to determine the Plasmodium falciparum parasite threshold associated with fever in areas of high and low malaria transmission intensity. In the study area, malaria transmission intensity is influenced by altitude: the lowlands are characterized by high transmission and the highlands by low transmission [16]. The motivation is that an individual develops malaria fever when parasite density exceeds a certain threshold, which is assumed to vary with transmission intensity and with the age of the individual.

Data description
This involved a case-control study in individuals aged below 20 years in six villages (Kwamasimba, Kwamhanya, Magundi, Kwashemshi, Mng'aza and Mkokola) of Korogwe District, in Tanzania. These villages were under passive case detection (PCD) of malaria fevers as detailed elsewhere [17,18]. The first three villages are located in the highland and last three in lowland areas. In the PCD system, community health workers were trained to manage uncomplicated malaria episodes on clinical grounds, whereby for individuals who presented with history of fever in the past two days or with axillary temperature ≥ 37.5°C, a morbidity questionnaire was completed. Blood smear for malaria parasite detection was taken and first line antimalarial drug was administered. Pregnant women and individuals with signs of severe malaria or other features suggesting other complications were referred to nearby health facility or Korogwe district hospital [17]. Sulphadoxine pyrimethamine was used as the first line up to January 2007, while Artemether/lumefantrine was introduced in the villages in February 2007 following the change of antimalarial drug policy in Tanzania.
Cross-sectional malariometric surveys were conducted in these villages during the low malaria transmission (Sep/Dec) and high transmission (March/June) seasons. Clinical features were assessed for all consenting individuals, and blood samples were then taken for malaria parasite detection by thick and thin blood smears.
Cases were detected from individuals under PCD and controls were individuals recruited during cross-sectional malaria surveys [19]. Individuals who reported with measured fever (axillary temperature ≥ 37.5°C) to the community health workers with intention to be treated for malaria in the six villages qualified as cases [20].
In the data set used, two villages (Kwamasimba from highland and Mkokola from lowland) started PCD in January 2003 while the rest were introduced in the system from January 2006 and all were followed to December 2007. A total of 7 cross-sectional surveys were done in villages where PCD started in 2003 and 4 in villages included in 2006. Malaria parasites were counted against 200 white blood cells (WBC), and a blood smear was declared negative after examination of 100 high power fields. Parasite counts/200 WBC were converted to parasites per microliter by multiplying by 40 assuming that 1 μL of blood of normal person contains 8000 WBC. Parasite counts were normalized by natural log transformation.
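The conversion described above is simple arithmetic and can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and because the text does not say how negative smears (count 0) were placed on the log scale, the sketch simply refuses them.

```python
import math

WBC_PER_UL = 8000    # assumed white blood cells per microlitre, as stated in the text
WBC_COUNTED = 200    # parasites were counted against 200 WBC


def parasites_per_ul(count_per_200_wbc):
    """Convert a parasite count per 200 WBC to parasites per microlitre (factor 40)."""
    return count_per_200_wbc * (WBC_PER_UL / WBC_COUNTED)


def log_density(count_per_200_wbc):
    """Natural log of parasites/uL; undefined for a negative smear (assumption)."""
    if count_per_200_wbc <= 0:
        raise ValueError("log density is undefined for a negative smear")
    return math.log(parasites_per_ul(count_per_200_wbc))


# Illustrative counts only, not study data.
for count in (1, 25, 150):
    print(count, parasites_per_ul(count), round(log_density(count), 2))
```

A threshold reported on the log scale can be mapped back to parasites/μL with `math.exp`.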
Plasmodium falciparum malaria parasite prevalence based on 6 cross sectional surveys conducted from 2005-2007 in the two strata, showed that in the lowlands the overall parasite prevalence was 37.4% while in the highlands it was 11.0%. Children below 5 years had a prevalence of 30.2% and 10.0%, 5-9 years had 46.9% and 11.8% while those in age group 10-19 years had 36.5% and 11.6% in lowlands and highlands, respectively. Distribution of Plasmodium falciparum parasite rate and fever according to age groups from the PCD data in the two strata, showed that children aged 4-5 years had the highest parasite rate, while among those infected with Plasmodium falciparum, individuals aged below 2 years had the highest rate of fever, see Figure 1.
The ethical clearance was granted by the Medical Research Coordinating Committee of the National Institute for Medical Research, Tanzania.

Parasite threshold models formulation
Notation
Let $y_i$, $i = 1, 2, \ldots, n$, be a random variable which takes the value 1 ($Y_i = 1$) if a fever episode is observed and 0 ($Y_i = 0$) otherwise, with probability $\Pr(Y_i = 1) = \gamma_i$ and $\Pr(Y_i = 0) = 1 - \gamma_i$.
Let $p_i$ denote the log parasite density observed in symptomatic and asymptomatic individuals, where $n$ is the number of observations. Let $\tau$ be the parasite threshold for an individual to develop malaria fever (clinical malaria).
Figure 1 Proportions of individuals with Plasmodium falciparum parasites (left) and fever among parasitaemic individuals (right) by age group from PCD data.

Threshold models
The $n$ response variables $y_i$, given the parameter $\theta$, can be fitted as binomial random variables with joint distribution

$$f(y \mid \theta) = \prod_{i=1}^{n} \gamma_i^{y_i} (1 - \gamma_i)^{1 - y_i}$$

and log-likelihood function

$$l(\theta) = \sum_{i=1}^{n} \left[ y_i \log \gamma_i + (1 - y_i) \log (1 - \gamma_i) \right].$$

The relation to the explanatory variable(s) is through a logit link function, such that

$$\mathrm{logit}(\gamma_i) = \log \frac{\gamma_i}{1 - \gamma_i} = \alpha + \beta p_i,$$

where $\alpha$ and $\beta$ are parameter estimates and $p_i$ is the log parasite density. However, our aim is the parasite threshold ($\tau$) which is most likely to coincide with malaria fever. In modeling the parasite threshold, two approaches are considered. In (1), we use a step function model, where individuals with parasite density below a threshold ($\tau$) are considered as the healthy group and the others as the diseased group. The step function model takes the following form:

$$\mathrm{logit}(\gamma_i) = \alpha + \beta \, I(p_i > \tau), \qquad (1)$$

where $p_i$ is the log parasite density, $\tau > 0$ is the threshold, $I$ is an indicator variable (0 if $p_i \le \tau$ and 1 if $p_i > \tau$), and $\alpha$ and $\beta$ are location and scale parameters.
In (2), we consider malaria fever to be a function of continuous parasite density. In this case, fever episodes are modeled as a continuous function of the difference between parasite density and the threshold parameter. The probability of fever in individuals with parasite density less than or equal to $\tau$ is assumed to be constant, determined by $\alpha$ (which corresponds to the background), and above $\tau$ the probability increases with the difference between log parasite density and the threshold parameter. The model is defined as:

$$\mathrm{logit}(\gamma_i) = \alpha + \beta \, (p_i - \tau) \, I(p_i > \tau), \qquad (2)$$

where $p_i$, $\tau > 0$, $\alpha$ and $\beta$ are as above. In both cases the probability of clinical malaria in individuals with parasite density below $\tau$ is $\mathrm{logit}^{-1}(\alpha)$, which corresponds to the background (i.e. the probability of fever is considered to be similar between parasitaemic and non-parasitaemic individuals). It should be noted that the assumption of an abrupt change in reaction to infection at a certain level of parasite density is used only for modeling purposes and does not necessarily represent what happens in a host.
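The difference between the two threshold models is easiest to see by evaluating the fever probability on either side of τ. The sketch below is illustrative only: the function names and the parameter values (τ = 7, α = -1.5, β = 1.2) are assumptions, not estimates from the study.

```python
import math


def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def step_prob(p, tau, alpha, beta):
    """Model (1): background risk up to tau, one fixed jump in the logit above it."""
    return inv_logit(alpha + beta * (1.0 if p > tau else 0.0))


def dose_response_prob(p, tau, alpha, beta):
    """Model (2): background risk up to tau, then rising with (p - tau)."""
    return inv_logit(alpha + beta * max(p - tau, 0.0))


# Hypothetical parameters chosen only to show the shape of the two curves.
tau, alpha, beta = 7.0, -1.5, 1.2
for p in (5.0, 7.0, 7.5, 9.0):
    print(p, round(step_prob(p, tau, alpha, beta), 3),
          round(dose_response_prob(p, tau, alpha, beta), 3))
```

Below τ both functions return the same background probability; above τ the step model is flat while the dose response model keeps increasing with log density.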

Estimation of threshold by regression
To investigate whether $\tau$ depends on age, a regression model can be fitted for different ages (age groups) combined, and a test can then be made for a significant difference in the slope parameter. If we let the mean age of group $j$ be $A_j$, $j = 1, \ldots, J$, assume $\tau$ to show a similar trend to the parasite rate shown in Figure 1, which was also shown elsewhere [12], and let $A_j^{*}$ be the mean age squared of group $j$, then $\tau$ can be modeled as a function of $A_j$ and $A_j^{*}$ with parameters $\theta = (\theta_0, \theta_1, \theta_2)$. If $\theta_2$ is not significant, the effect of $A^{*}$ is excluded from the estimation of the $\tau$ parameter. For simplicity, modeling is done separately for each stratum as follows:

$$\tau_j = \theta_0 + \theta_1 A_j + \theta_2 A_j^{*}.$$

So, we can extend model (2) by inserting the estimated value of $\tau$ and get:

$$\mathrm{logit}(\gamma_{ij}) = \alpha + \beta_1 A_j + \beta_2 \, (p_{ij} - \tau_j) \, I(p_{ij} > \tau_j), \qquad (3)$$

where $\beta_1$ is the parameter measuring the excess risk of fever when age (or mean age) increases by one year, and $\beta_2$ plays the same role as $\beta$ in model (2). If we let $Z_{ij} = (p_{ij} - \tau_j)$, the model simplifies to

$$\mathrm{logit}(\gamma_{ij}) = \alpha + \beta_1 A_j + \beta_2 \, Z_{ij} \, I(Z_{ij} > 0).$$

Modeling fevers using non-threshold model
We used a non-linear model to fit the effect of parasite density on fever in each age group. The model was used to determine how well the above threshold models (models 1 and 2) fit the data when compared to classical methods. The model considered is

$$\mathrm{logit}(\gamma_{ij}) = \alpha + f(p_{ij}), \qquad (4)$$

where $f(p_{ij}) = \beta \, (p_{ij})^{\nu}$ is a monotone increasing function of parasite density $p_{ij}$. This model has been shown to fit the relation between fever and parasite density better than the normal regression of fevers on parasites or log parasites, see [10]. We used the Box-Tidwell [21] method to find the values of $\beta$ and $\nu$ that maximize the likelihood. The performance of the models (1, 2 and 4) was compared using the Akaike information criterion (AIC).
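The AIC comparison can be sketched in a few lines. The log-likelihoods below are made-up numbers for illustration, and counting three parameters per model (α, β and τ for the threshold models; α, β and ν for the non-threshold model) is an assumption, since the text does not state how parameters were counted.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik


def best_by_aic(fits):
    """fits maps a model name to (loglik, n_params); returns the name with the lowest AIC."""
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits for one age group (not values from Table 2).
fits = {
    "model 1": (-520.0, 3),
    "model 2": (-505.0, 3),
    "model 4": (-507.5, 3),
}
for name, (loglik, k) in fits.items():
    print(name, aic(loglik, k))
print("lowest AIC:", best_by_aic(fits))
```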

Models fitting
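The parameter of interest is $\tau$, which is unknown and cannot be estimated explicitly, so we used the profile likelihood method to fit models (1) and (2). In profile likelihood, let $l(\theta : p_i) = l(\tau, \alpha, \beta : p_i)$ be the log-likelihood function; then, for every value of $\tau$, we can compute the maximum likelihood estimators of $\alpha$ and $\beta$ which maximize the log-likelihood function, i.e.

$$(\hat{\alpha}_\tau, \hat{\beta}_\tau) = \arg\max_{\alpha, \beta} \; l(\tau, \alpha, \beta : p_i).$$

The maximizing value of $\tau$ can then be obtained from a plot of $l(\tau, \hat{\alpha}_\tau, \hat{\beta}_\tau : p_i)$ against $\tau$.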
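The profile likelihood idea can be sketched as a grid search: for each candidate τ, fit α and β by ordinary logistic regression on the covariate max(p - τ, 0), record the maximized log-likelihood, and keep the τ with the largest value. The sketch below is an illustration under stated assumptions, not the authors' implementation: it covers the dose response model only, uses a hand-written Newton-Raphson fit, and runs on toy data with uniformly distributed log densities and hypothetical parameter values.

```python
import math
import random


def inv_logit(x):
    x = max(-30.0, min(30.0, x))          # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))


def fit_alpha_beta(z, y, iters=50):
    """Newton-Raphson fit of logit(gamma) = alpha + beta * z; returns (alpha, beta, loglik)."""
    alpha = beta = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, yi in zip(z, y):
            mu = inv_logit(alpha + beta * zi)
            w = mu * (1.0 - mu)
            g0 += yi - mu                  # score for alpha
            g1 += (yi - mu) * zi           # score for beta
            h00 += w                       # information matrix entries
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:                    # z carries no information (e.g. all zero)
            break
        d_alpha = (h11 * g0 - h01 * g1) / det
        d_beta = (h00 * g1 - h01 * g0) / det
        alpha += d_alpha
        beta += d_beta
        if abs(d_alpha) + abs(d_beta) < 1e-9:
            break
    loglik = 0.0
    for zi, yi in zip(z, y):
        mu = inv_logit(alpha + beta * zi)
        loglik += math.log(mu) if yi == 1 else math.log(1.0 - mu)
    return alpha, beta, loglik


def profile_loglik(p, y, tau):
    """Log-likelihood of the dose response model maximized over alpha, beta at fixed tau."""
    z = [max(pi - tau, 0.0) for pi in p]
    return fit_alpha_beta(z, y)[2]


def estimate_tau(p, y, grid):
    """Grid value of tau with the highest profile log-likelihood."""
    return max(grid, key=lambda tau: profile_loglik(p, y, tau))


# Toy data simulated from the dose response model; every value here is illustrative.
rng = random.Random(1)
true_tau, true_alpha, true_beta = 7.0, -1.5, 1.2
p = [rng.uniform(3.0, 11.0) for _ in range(2000)]
y = [1 if rng.random() < inv_logit(true_alpha + true_beta * max(pi - true_tau, 0.0)) else 0
     for pi in p]
grid = [5.0 + 0.2 * k for k in range(21)]  # candidate thresholds from 5.0 to 9.0
tau_hat = estimate_tau(p, y, grid)
print("estimated threshold:", round(tau_hat, 1))
```

The grid should stay inside the range of observed densities; a candidate τ above every observation makes the covariate identically zero, and the fit then stops at its starting values.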
Model (3) was fitted using a simulated annealing procedure, the optim function with method option SANN from the R statistical package. Confidence intervals for $\tau$ were obtained by the subsampling bootstrap method [22], because the models are not smooth in the threshold parameter. The convergence rate of the threshold parameter is $n$, that is, $n(\hat{\tau} - \tau)$ converges in distribution [23].
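A generic subsampling interval of this kind can be sketched as follows. The details are assumptions, since the text does not give them: subsamples of size m are drawn without replacement, the roots m(τ̂* - τ̂) are collected using the rate n stated above, and their quantiles are rescaled by n to form the interval. To keep the example short, a toy estimator (the sample maximum of uniform data, which also converges at rate n) stands in for the threshold fit; all names are hypothetical.

```python
import random


def subsample_ci(data, estimator, m, n_sub=500, level=0.95, seed=0):
    """Subsampling confidence interval for an estimator that converges at rate n.

    data:      list of observations
    estimator: function mapping a list of observations to a number
    m:         subsample size, much smaller than len(data)
    """
    rng = random.Random(seed)
    n = len(data)
    full = estimator(data)
    # Roots m * (subsample estimate - full estimate), drawn without replacement.
    roots = sorted(m * (estimator(rng.sample(data, m)) - full) for _ in range(n_sub))
    a = (1.0 - level) / 2.0
    q_lo = roots[int(a * (n_sub - 1))]
    q_hi = roots[int((1.0 - a) * (n_sub - 1))]
    # Invert the approximate distribution of n * (estimate - truth).
    return full - q_hi / n, full - q_lo / n


# Toy example: estimate the upper end of a uniform distribution by the sample maximum.
rng = random.Random(42)
sample = [rng.uniform(0.0, 10.0) for _ in range(1000)]
lower, upper = subsample_ci(sample, max, m=100)
print(round(max(sample), 3), round(lower, 3), round(upper, 3))
```

For the threshold models, `estimator` would be a function that refits the model on the subsample and returns the estimated τ; the choice of m is a tuning decision that this sketch does not address.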

Simulation studies
We simulated 300 data sets, each with n = 2000 observations, from a negative binomial distribution with mean μ = 100 and scale parameter (0.03 and 0.1), to give a prevalence of 20% and 50%, which mimics the distribution of parasites in a host [24] from lowlands and highlands, respectively. The simulated samples were multiplied by 40 and then log transformed to resemble the unit used in quantification of malaria parasites (parasites/μL) on the log scale. We generated a binomial random variable from model (2), with probability of occurrence $\gamma_i = \mathrm{logit}^{-1}(\alpha)$ if $p_i \le \tau$ and $\gamma_i = \mathrm{logit}^{-1}(\alpha + \beta (p_i - \tau))$ if $p_i > \tau$. The model with the minimum root mean squared error (RMSE) was chosen as the model with the best fit.

Sensitivity and Specificity of threshold parameter
We assessed the sensitivity and specificity of the threshold in predicting malaria fever. We defined a test as positive ($E_1$) when parasite density was above $\tau$, and as negative ($E_0$) otherwise; individuals with fever were defined as the disease group $D_1$ and those without fever as the non-disease group $D_0$. A case-control study can be used to estimate sensitivity and specificity but not predictive values, since the latter require the prevalence of disease to be known or estimated from the data, and this cannot be obtained from our design [25]. So, for group $j = 1, \ldots, 5$ the sensitivity and specificity are given by:

$$\mathrm{Sensitivity}_j = P(E_1 \mid D_1), \qquad \mathrm{Specificity}_j = P(E_0 \mid D_0),$$

where $P$ is the probability.
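The two quantities can be computed directly from the definitions above. In this sketch a negative smear is represented by `None` and always counts as a negative test, which is an assumption about how aparasitaemic individuals enter the calculation; the function name and the six records are hypothetical.

```python
def sensitivity_specificity(log_density, fever, tau):
    """Return (sensitivity, specificity) of the rule 'log density > tau'.

    log_density: log parasites/uL per individual, or None for a negative smear
    fever:       1 for the fever (disease) group, 0 for the non-disease group
    """
    tp = fn = tn = fp = 0
    for p, d in zip(log_density, fever):
        positive = p is not None and p > tau   # test result E1 (True) or E0 (False)
        if d == 1:
            tp += positive
            fn += not positive
        else:
            fp += positive
            tn += not positive
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical records: three fever cases followed by three controls.
densities = [8.0, 6.0, 9.0, None, 5.0, 7.5]
fevers = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(densities, fevers, tau=7.0)
print(round(sens, 3), round(spec, 3))
```

The function assumes each group is non-empty; with no cases or no controls the corresponding ratio is undefined.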

Results
Results show that model (2) had a higher log-likelihood than model (1) in all estimates (see, for example, Figure 2), and the simulation results (Table 1) show that model (2) had considerably lower bias. We also compared models (1) and (2) with model (4) to see how well they fit the data, and found that model (2) did well, having the lowest AIC in almost all age groups (Table 2). The estimated odds ratios in Table 2 cannot be compared across models because they describe different relations: in model (1) the odds ratio compares individuals with log parasite density above the threshold to those below it, while in model (2) it is the odds ratio of fever when log parasite density increases by 1 above the threshold.
In the lowlands, children aged 4-5 years had the highest parasite threshold, 8.73 (6215 parasites/μL), while the lowest was in age group 10-19 years (903 parasites/μL). In the highlands, children in age group 0-1 years had the highest parasite threshold, 7.12 (1233 parasites/μL), while the lowest was seen in the oldest age group, see Table 3 and Figure 4. The highland stratum had lower parasite thresholds than the lowlands, and in both strata thresholds decreased with increasing age. The lower 95%CI limit of the parasite threshold in individuals aged 10-19 years in the lowland villages was 6.68 (796 parasites/μL), while for the other age groups in the same stratum it was above 7.87 (2617 parasites/μL). In the highlands, children aged 0-1 and 4-5 years had a lower 95%CI limit of the parasite threshold above 300 parasites/μL, while for the older age groups it was below 100 parasites/μL, see Table 3. Figure 3 shows that in both strata, children in age group 0-1 years who had negative blood smears had a higher rate of fever than older age groups, and that this rate decreased as age increased. A slight increase in the rate was also seen in individuals in age group 10-19 years. This, however, cannot be interpreted as the prevalence of fever in these age groups within the population, because the sampling fraction between controls and cases is unknown. Figure 5 shows that the odds ratio (OR) of fever in individuals who had parasite density above the threshold value increased across age groups in the lowlands, reaching its maximum in age group 6-9 years. ORs were more or less similar in the highlands, except in age group 4-5 years, which had a higher OR than the rest of the age groups.
Figure 3 Observed (open circles) and fitted (lines) proportions of measured fever against log parasite/μL by age group. Upper and lower rows represent lowlands and highlands, respectively, while the turning point in each line is the threshold value. Parameters were estimated by the profile likelihood method using model (2).

Results from model (3) showed that the parasite threshold varied significantly with age in both strata. In the lowlands, the threshold parameter was log parasite density 8.164 (95%CI: 7.990 - 9.071), and it increased by 0.119 (95%CI: -0.092 - 0.174) when mean age increased by one year, and decreased by 0.015 (95%CI: 0.003 - 0.019) for any increase in mean age squared by one year squared, see Figure 4. Similarly, in the highlands the threshold parameter was log parasite density 7.480 (95%CI: 6.942 - 8.326), which decreased by 0.368 (95%CI: 0.221 - 0.604) when mean age increased by one year and increased by 0.011 (95%CI: 0.002 - 0.024) with the increase of mean age squared by one year squared. In the lowland stratum, the highest and lowest parasite thresholds were 4397 parasites/μL and 1071 parasites/μL, in age groups 4-5 and 10-19 years, respectively. This model shows that there was a significant age effect in both strata and that the threshold parameters estimated from this model were similar to the ones obtained by model (2).
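The age dependence reported above can be made concrete by evaluating the quadratic threshold curve with the reported point estimates. This is a sketch under assumptions: the age-squared term is taken to be the square of the group mean age, the mean ages in the loop are hypothetical, and the function names are not from the paper, so the printed values need not match the reported group thresholds exactly.

```python
import math


def tau_by_age(age, theta0, theta1, theta2):
    """Threshold on the log scale as a quadratic function of mean age."""
    return theta0 + theta1 * age + theta2 * age ** 2


def turning_age(theta1, theta2):
    """Age at which the quadratic threshold curve reaches its peak or trough."""
    return -theta1 / (2.0 * theta2)


# Point estimates reported in the text: intercept, age, age squared.
LOWLAND = (8.164, 0.119, -0.015)
HIGHLAND = (7.480, -0.368, 0.011)

for age in (1.0, 4.5, 8.0, 14.5):          # hypothetical group mean ages
    low = tau_by_age(age, *LOWLAND)
    high = tau_by_age(age, *HIGHLAND)
    print(age, round(low, 2), round(math.exp(low)), round(high, 2), round(math.exp(high)))

print("lowland curve peaks at age", round(turning_age(LOWLAND[1], LOWLAND[2]), 1))
```

Under these assumptions the lowland curve peaks at about 4 years of age, which falls inside the 2-5 year group, while the highland curve keeps decreasing over most of the 0-19 year range.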

Figure 2 Profile likelihood for log(parasite/μL) in individuals aged 0-4 and 5-19 years in lowlands and highlands.
Sensitivity and specificity of the threshold parameter for different age groups and strata are shown in Table 3. Sensitivity was higher in the highlands (range 67%-97%) than in the lowlands (range 65%-74%), while specificity was higher in the lowlands (range 67%-90%) than in the highlands (range 37%-73%). In the lowlands, children aged 0-1 years had the highest sensitivity and lowest specificity, while in the highlands the highest sensitivity and lowest specificity were in children aged 6-9 years. On average, specificity was higher than sensitivity in the lowlands, while in the highlands it was the opposite.

Discussion
A better understanding of the interaction between malaria parasites and clinical features might lead to better design and evaluation of malaria interventions, particularly drugs and vaccines. The main aim of our analysis was to determine the parasite threshold associated with fever in individuals aged 0-19 years. The lowland stratum is characterized as a high malaria transmission area and the highlands as a low transmission area [16,26]. In the high transmission area children are the most affected, while in the low transmission area all age groups are almost equally affected because of a lack of acquired immunity against the parasite and clinical disease [5]. Acquired malaria immunity varies with the level of exposure to parasites; hence individuals living in holoendemic areas tend to acquire immunity sooner than those in hyper endemic areas. Older individuals in holoendemic areas have higher immunity than children because of their prolonged exposure to malaria parasites [9].
Our modeling approach used stratum and age of individuals as important factors in determining the parasite threshold associated with fever. We fitted three models (1-3) to determine parasite thresholds. Model (3) is an extension of model (2) where, instead of fitting different models for each age group, one model was fitted for the whole data set taking age as an explanatory variable. A fourth model (4) was fitted to assess whether the first two models fit the data well when compared to the traditional ones. Models (1) and (2) had low information compared to model (4) because of the truncation (i.e. parasite densities below and/or above τ were each categorized in one group), and hence the first model had the lowest information.

Figure 4 Distribution of parasite thresholds against mean age groups for lowlands (left) and highlands (right). Points represent estimates with 95%CI (line segments) from model (2) and the full line is the fit by model (3). Horizontal axis: mean age (years).

Comparing the AICs, model (1) performed poorly relative to the other two models, which indicates that grouping parasite density into two groups is an oversimplification and might result in poor predictions. It was interesting to note that model (2) performed better than model (4) for most groups (the exceptions being lowlands 4-5 years and highlands 0-1 years), despite the fact that model (2) was based on a cut-off threshold and therefore on truncated data. This probably reflects that quantification of low density parasitaemia by microscopy is difficult and imprecise.
Simulation shows that model (2) was better in the estimation of malaria parasite threshold than model (1). The model provides the background level where the probability of fever is considered similar for parasite below the threshold and then probability of fever increases as parasite density increases above the threshold. Model (1) might be simple but this gives only the probabilities below threshold and that above the threshold, and it does not give a flexible way of choosing the threshold; for instance when a strict parasite threshold is desired. Furthermore, the model is weak especially when the slope parameter (rate of change of parasite density) is low, as it can overestimate the threshold parameter, see Table 1.
Model (3) gives a better way of explaining the relation of the thresholds to variables of interest, for example age and transmission intensity, which are known to be important determinants in malaria transmission. This model provides a way of adjusting for other covariates. However, the problem is the optimization process, where computing time grows as the number of parameters to be estimated increases. For example, on a computer with an Intel Pentium® Dual CPU T3200 2.0 GHz processor and 4 GB of RAM, without sub-sampling bootstrap, it takes a few seconds to get the parameter estimates for models (1) and (2), while model (3), where τ was fitted on age and age squared, requires about 5 minutes on a data set with 5050 records. Threshold parameters were well estimated in model (3), especially for the lowland stratum, as can be seen from Figure 4: a uniform confidence band would also include the fitted estimates of thresholds in the separate age groups. However, in the highlands there was much variability in the threshold parameters fitted by model (2), which might be due to the small number of individuals with positive smears in each age group. So, model (3) might be more appropriate since it contains more data. Variation in the model fits can be due to measurement errors in parasite density, which could arise from inaccurate parasite quantification as a result of poor sample preparation, errors in parasite counting, or technician performance [27].

Figure 5 Distribution of odds ratios (full lines) and 95%CI (dashed lines) for fever by age group among individuals with parasite density exceeding the threshold (τ) shown in Table 2. The left and right plots are for lowland and highland strata, respectively.

Even though the methods used to estimate the parasite threshold and our type of data set differ considerably from other studies [10,28], the findings from this study are still comparable. For example, in children below 5 years, Chandler et al. [28] found a parasite threshold of 4000 parasites/μL in those living in the lowlands and 1000 in the highlands, while Smith et al. [10] found a threshold of 5000 in children living in an area of high transmission. The parasite thresholds were higher in the high transmission area and lower in the low transmission area [19,28]. There was also a difference in the threshold across age groups, where children below five years had a higher parasite threshold than older individuals, a pattern similar to that found elsewhere [28]. This study shows that children in age group 2-5 years in the lowlands had the highest threshold while individuals in age group 10-19 years had the lowest. The pattern of the thresholds in the lowlands, where malaria is endemic, shows a similar trend to that of parasite density by age. This suggests that immunity plays a significant role in the threshold, as it has been shown to develop slowly in individuals who are constantly exposed to malaria, where a period of ten years was estimated to be the maximum for full development of immunity [29].
Results from sensitivity and specificity analysis for case definition using parasite threshold gave results similar to other studies done in lowland areas [10,28], suggesting that the model can be used in similar settings.

Conclusion
We conclude that the dose response model with a threshold parameter can be used to estimate parasite thresholds. The parasite threshold varies with transmission intensity, and in this area children below five years have the highest threshold.