In this analysis of data from an HIV self-test trial among FSWs in three Zambian border towns, we show that interviewers often substantially affected what respondents reported regarding their lives, in particular their psychological wellbeing and experiences of violence. With 16 interviewers each conducting at least 60 interviews, an average of one-sixth of all variance in question responses was observed at the interviewer level, even after accounting for study site. This interviewer-level variance rose to almost one-third for questions about psychological ill-health and violence, despite careful interviewer training and the very high prevalence of both [27]. In some cases these variations fed through to measures of association, i.e., failing to account for interviewer effects led to different coefficient estimates in regression models.
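To make the idea of "variance at the interviewer level" concrete, the share of response variance attributable to interviewers can be summarized as an intraclass correlation (ICC). The following is a minimal sketch only: it uses the classical one-way ANOVA estimator for balanced groups, not the models fitted in this study, it does not adjust for study site, and the function name `anova_icc` and the toy data are hypothetical.

```python
from statistics import mean


def anova_icc(groups):
    """One-way ANOVA estimate of the ICC for balanced groups.

    `groups` holds one list of numeric responses per interviewer
    (hypothetical layout; binary items can be coded 0/1).
    """
    k = len(groups)          # number of interviewers
    n = len(groups[0])       # interviews per interviewer
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need 2+ equal-sized groups of 2+ responses")

    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Between-interviewer and within-interviewer mean squares.
    msb = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    msw = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    denom = msb + (n - 1) * msw
    return 0.0 if denom == 0 else (msb - msw) / denom


if __name__ == "__main__":
    # Toy data: three hypothetical interviewers, four yes/no answers each.
    toy = [[1, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 1]]
    print(f"estimated ICC: {anova_icc(toy):.3f}")
```

An ICC near zero means respondents of different interviewers answer alike; an ICC near one means almost all variation lies between interviewers. The ANOVA estimator can be negative in small samples, which a variance-components model would typically truncate at zero.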
The importance of interviewer variation has long been recognized in the survey design and analysis literature [12, 19, 31] and our findings reinforce the importance of interviewers for measures of prevalence. Our findings point to particularly strong interviewer effects for sensitive topics, notably physical and sexual abuse, and subjective ones, such as depression, social support and self-efficacy. For example, for the question “In the past 12 months, has a sexual partner ever physically forced you to have sex when you did not want to?”, the proportion of each interviewer’s 60 respondents answering in the affirmative varied from 13% to 97%. This occurred even though the two interviewers with the most extreme proportions worked in the same town, and were thus theoretically interviewing fully exchangeable respondents.
The potential impact of interviewer variation can be minimized by careful training in question presentation, and by monitoring response patterns by interviewer identity during study conduct (with feedback of these findings to the field teams). Other potentially useful steps include matching interviewers and respondents by age and gender, and supporting interviewers in managing their own distress on hearing reports of violence or other hardship [23]. When interviewer-level variance is anticipated, it is also preferable to have a large number of interviewers each doing few interviews, rather than a few interviewers each doing many; this both reduces the burden on interviewers and prevents outlying interviewers from having oversized impacts [32].
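The advantage of many interviewers each doing few interviews can be illustrated with the standard design-effect approximation for clustered data, deff = 1 + (m - 1) × ICC, where m is the number of interviews per interviewer. The sketch below is illustrative only: the total of 960 interviews (16 × 60) and the ICC of one-sixth are round figures taken from the text above, the alternative workload of 10 interviews per interviewer is an assumption, and the function names are hypothetical.

```python
def design_effect(interviews_per_interviewer, icc):
    """Kish-style design effect for equal-sized interviewer workloads."""
    return 1.0 + (interviews_per_interviewer - 1) * icc


def effective_sample_size(n_total, interviews_per_interviewer, icc):
    """Number of independent interviews carrying the same information."""
    return n_total / design_effect(interviews_per_interviewer, icc)


if __name__ == "__main__":
    n_total = 960      # illustrative: 16 interviewers x 60 interviews
    icc = 1 / 6        # average interviewer-level variance share in the text

    # Compare the workload described above with an assumed lighter one.
    for m in (60, 10):
        n_interviewers = n_total // m
        deff = design_effect(m, icc)
        n_eff = effective_sample_size(n_total, m, icc)
        print(
            f"{n_interviewers:3d} interviewers x {m:2d} interviews: "
            f"deff = {deff:.2f}, effective n = {n_eff:.0f}"
        )
```

Under these assumptions, spreading the same number of interviews across more interviewers sharply reduces the design effect, so prevalence estimates behave as if they came from a considerably larger independent sample.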
Despite the substantial variance in responses at the interviewer level, interviewers’ gender was associated with relatively few variables. There were substantive (i.e., more than 10 percentage points), if non-significant, differences by gender-of-interviewer for several variables and significant differences for two question topics: SES and sex-work related violence. We were unable to determine in this analysis whether the gender-of-interviewer differences seen reflect social distance or social desirability, since there was no variation in respondent gender. However, our finding that the largest gender-of-interviewer effects exist for topics that have substantial gender components (i.e., SES and IPV) provides support for social role theory. Specifically, FSWs reported lower SES and more recent sex-work related IPV to female interviewers. This was in contrast to almost no reporting difference for questions such as age, marital status, pregnancy history and perceived risk of being HIV-positive. These findings highlight that, while matching interviewers and respondents on key characteristics may not be feasible, the influence of interviewer-respondent dyad characteristics should be evaluated in analyses of topics with strong social role expectations, such as gender-based violence and economic behaviour.
We also showed that the association between two self-reported variables can be confounded by interviewers. In our analysis, recent HIV testing behaviour was significantly negatively associated with both past physical and sexual abuse when we did not include interviewer identity in our models, but this association was attenuated and rendered non-significant by including interviewer-level random effects. For interviewers to have such an effect, both exposure and outcome must be susceptible to interviewer influence. This is clearly the case when both variables are self-reported, but can also arise when interviewers are asking individuals to take a test, a topic that has been substantively investigated in the context of HIV testing within population studies [33, 34]. Our results highlight the need to consider interviewer identity as a possible confounder in associational as well as prevalence analyses.
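The mechanism described above can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a re-analysis of the study data: the two variables are continuous rather than binary, each hypothetical interviewer shifts the exposure up and the outcome down by the same amount, there is no true association within interviewers, and interviewer identity is handled by centring responses within interviewer (a fixed-effects approach) instead of the random effects used in our models. All names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n_interviewers=16, n_each=60, seed=0):
    """Return (pooled slope, within-interviewer slope) for simulated data."""
    rng = random.Random(seed)
    x_all, y_all, x_within, y_within = [], [], [], []
    for _ in range(n_interviewers):
        u = rng.gauss(0, 1)  # interviewer effect shared by both variables
        xs = [u + rng.gauss(0, 1) for _ in range(n_each)]   # exposure
        ys = [-u + rng.gauss(0, 1) for _ in range(n_each)]  # outcome
        mx, my = sum(xs) / n_each, sum(ys) / n_each
        x_all += xs
        y_all += ys
        # Centre within interviewer to remove interviewer-level shifts.
        x_within += [a - mx for a in xs]
        y_within += [b - my for b in ys]
    return ols_slope(x_all, y_all), ols_slope(x_within, y_within)


if __name__ == "__main__":
    pooled, within = simulate()
    print(f"slope ignoring interviewer:      {pooled:+.3f}")
    print(f"slope adjusting for interviewer: {within:+.3f}")
```

In this setup the pooled slope is clearly negative even though the variables are unrelated among respondents of the same interviewer, while the interviewer-adjusted slope is close to zero, mirroring the attenuation pattern reported above.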
Given that much of the data in this study is self-reported, it is difficult to know which interviewers are receiving the “truer” responses and thus which results to act on. In this population, for example, even based on responses to male interviewers, respondents are poor and at substantial risk of IPV: median income is under $600 per annum, half the Zambian average, and over 40% reported each of: physical abuse; sexual abuse; and having had sex when they did not want to because they were afraid in the past 12 months. There is clearly a substantial public health concern whichever values are closer to reality. However, in some other settings, the level of impact interviewer gender had in this study may be sufficient to produce conflicting results, with male interviewers finding a substantial health risk but female interviewers only a limited one, or vice versa.
Strengths and limitations
Our results should be interpreted in the light of various strengths and limitations. The underlying ZEST study comprised almost 1000 FSWs who were part of a population with relatively little experience of engaging with researchers, which should minimize respondent learning effects in terms of intentional mis-reporting. However, this may also have led to respondents misunderstanding questions they had not previously considered in a systematic fashion.
Since all ZEST participants were women, we are unable to differentiate whether the gender effects we saw reflected gender-of-interviewer effects or gender-homophily of interviewer-respondent dyads. Our ability to generalize from the ZEST study population to others is also somewhat limited: it is hard to know whether FSWs in more cosmopolitan settings, or women more generally in Zambia or sub-Saharan Africa (including those engaging in informal sex work), would have been similarly affected by interviewer characteristics. Nevertheless, our key findings that interviewers can generate substantial, systematic differences in item response patterns, even when randomly assigned to respondents, are likely to be widely applicable.
Furthermore, we do not have sufficiently detailed information available on interviewer identities to determine whether interviewers varied systematically by gender on other characteristics, e.g. educational attainment, that might have affected their ability to elicit sensitive responses from respondents. Concern on this front is somewhat allayed by the very similar responses (and low ICC values) for less sensitive topics. Finally, the ZEST study did not include follow-up interviews on the topic of interviewer-respondent interaction, and thus we are not able to directly assess whether between-interviewer differences reflected true random difference or some combination of social distance, social desirability and social role.