Modelling attrition and nonparticipation in a longitudinal study of prostate cancer

Background: Attrition occurs when a participant fails to respond to one or more study waves. The accumulation of attrition over several waves can lower the sample size and power and create a final sample whose characteristics differ from those of the participants who dropped out. The main reason to conduct a longitudinal study is to analyze repeated measures; research subjects who drop out cannot be replaced easily. Our group recently investigated factors affecting nonparticipation (refusal) in the first wave of a population-based study of prostate cancer. In this study we assess factors affecting attrition in the second wave of the same study and compare the factors affecting nonparticipation in the second wave with those affecting nonparticipation in the first wave.

Methods: Information available on participants in the first wave was used to model attrition. Different sources of attrition were investigated separately. The overall and race-stratified factors affecting attrition were assessed. Kaplan-Meier survival curve estimates were calculated to assess the impact of follow-up time on participation.

Results: High cancer aggressiveness was the main predictor of attrition due to death or frailty. Higher Charlson Comorbidity Index increased the odds of attrition due to death or frailty only in African Americans (AAs). Young age at diagnosis for AAs and low income for European Americans (EAs) were predictors of attrition due to loss to follow-up. High cancer aggressiveness for AAs, low income for EAs, and lower patient provider communication scores for EAs were predictors of attrition due to refusal. These predictors of nonparticipation were not the same as those in wave 1. For short follow-up times, the participation probability of EAs was higher than that of AAs.

Conclusions: Predictors of attrition can vary depending on the attrition source. Examining overall attrition (combining all sources of attrition under one category) instead of distinguishing among its different sources should be avoided. The factors affecting attrition in one wave can be different in a later wave and should be studied separately.


Background
Nonresponse occurs when a sampled subject fails to respond to a survey either partially (item nonresponse) or entirely (unit nonresponse). Unit nonresponse reduces sample size and study power. Significant differences between respondents and non-respondents can cause nonresponse bias, which is a type of selection bias [1].
Attrition occurs in a longitudinal study when a participant fails to respond to one or more study follow-up waves. A participant may skip one wave but subsequently respond to a later wave (intermittent) or may quit the study completely (drop-out). Since the accumulation of drop-outs over several waves can reduce how well the sample represents the target population, nonresponse is even more of a concern in longitudinal studies. Problems from attrition in longitudinal studies are similar to those from nonresponse in cross-sectional surveys; they reduce study power and cause bias in estimates if certain subpopulations are over- or under-represented in the sample. Attrition can accumulate with each study wave, creating a final sample that could differ in characteristics from the original sample. Attrition due to death and decline in the health of study participants might cause particular problems in health-related studies of older people, because those who continue to participate would be healthier [2].
Nonresponse rates to epidemiologic studies have been observed to increase in recent years regardless of the disease studied, geographical region or age of the study population [3][4][5][6][7][8][9][10]. Morton et al. [11] abstracted information from 355 research articles and found that the declines in participation rates were particularly sharp in population-based case control studies, with an average decrease of 1.18% for cases and 1.49% for controls per year from 1970 to 2003. Such increases in nonresponse are paralleled in other disciplines. Curtin et al. [12] studied response to a telephone-based survey of consumer attitudes from 1979 through 2003 and observed a decrease in total response rate (from 72 to 48%) and an increase in refusal rate (from 19 to 27%) over time, with accelerating rates of decline in response for the period 1996-2003. In two longitudinal studies of recovery from coronary events, the drop-out rates ranged from 15 to 40%, causing a participant loss as high as 70% [13]. A longitudinal study of drug and alcohol use in adolescents lost almost 25% of the original cohort at 1 year follow-up [14].
Several theories exist for the increase in nonresponse. A decrease in social responsibility and an increase in privacy concerns have led individuals to be more reluctant to participate in surveys [15]. A rise in surveys has created research fatigue [8,15]. Distrust of science and researchers, especially in non-European American (EA) communities, has hindered study recruitment efforts [16,17]. Additional theories exist for increased attrition, such as loss of interest, moving and life changes [13]. Study fatigue also affects attrition, especially in older populations due to increased cognitive impairment and morbidity [13,18]. Consequently, numerous studies have been conducted to determine which persons are more likely to be non-respondents or drop-outs. Males, people who are less educated, unemployed, not-married or of lower economic status are less likely to participate in cross-sectional studies. Results regarding age are less consistent, with some studies showing those who are younger having lower participation propensity than those who are older and other studies showing the opposite [8]. Generally, those who are more disadvantaged and in poorer health tend to be non-respondents [19]. Characteristics of drop-outs are similar. Males, people with less education, not-married, of lower economic status, of small household size, of poor health status, ethnic minority, or living in urban areas are at greater risk of attrition [13,20,21]. The effect of age on attrition is unclear, with some studies reporting increased attrition with old age and others reporting the opposite [22].
Longitudinal studies that focused on older cohorts generally reported that older age increased attrition. Older age, low education, and longer distance between the study center and the participant's residence were associated with attrition in the Baltimore Longitudinal Study of Aging [23]. Increasing age and cognitive impairment were consistently related to increased attrition in a systematic review of 12 longitudinal studies in elderly populations [24].
Many strategies have been proposed to minimize nonresponse. Monetary incentives and advance letters have been shown to increase participation [15,25]. Increasing interview attempts during evening hours might help to establish contact [1,15]. Reminder letters can be sent to notify participants about when they will be contacted for the study. Interviewers' interaction with potential participants might impact refusals; thus training interviewers regarding how to tailor their behavior while recruiting might help minimize refusals [1]. More experienced, extroverted and conscientious interviewers can increase participation [26]. Recruiting minority populations might require additional strategies. Organizing community outreach events, using racially diverse interviewers, and providing a toll-free number might be beneficial in recruiting non-EA participants and for creating and maintaining trust [16,27]. Note that an approach with a specific subgroup in a specific study may not work with a similar group under different circumstances. A combination of strategies should be used while monitoring the survey process in real time and modifying them as needed to decrease nonresponse and minimize racial/ethnic differences between respondents and non-respondents [28].
Attrition also can be minimized by offering incentives, sending postcards and scheduling telephone reminders. Collecting detailed contact information from participants at each wave could especially be useful in longitudinal studies with long follow-up times. A longitudinal study on adolescent drinking that maintained contact with 97% of its sample after 18 months initially asked participants not only for their personal contact information, but also for the contact information of adults and friends who could locate them if future contact was lost [29]. Re-contacting and re-interviewing participants who miss a study wave and bringing them back at later waves can help reduce overall attrition [30]. Newsletters may be used to update study participants on study progress in between waves to prevent attrition in longitudinal studies with long follow-up times. Community engagement, tracing noncontacts, utilizing mixed survey modes and providing incentives also have been shown to reduce attrition [21].
Statistical techniques can be used to adjust for nonresponse and attrition after collecting the data, but these procedures are imperfect. When auxiliary data exist, weighting procedures can reduce nonresponse bias, but they may bias estimates of standard errors. Very small or very large weights might create instabilities in estimates. Multiple imputation techniques can be used to replace missing data due to attrition by modeling the missingness; however, specifying the correct model from available data is difficult [31]. Optimal recruitment strategies should be implemented before and during data collection because post-survey adjustments are only approximate remedies to the problems caused by nonresponse and attrition [28].
Current research on attrition usually does not differentiate between attrition through refusals and attrition through other reasons, such as death or loss to follow-up; however, different sources of attrition have been shown to have different determinants [32]. In this study, we distinguished between different sources of attrition. We evaluated the potential predictors of attrition in the second wave of a study of prostate cancer (PCa). Research on attrition commonly uses the characteristics of non-respondents at wave 1 to explain attrition in later waves, which ignores the role of events after wave 1, and attrition models fit to early waves may become less predictive of attrition in future waves [32]. In order to compare the differences between the two waves of the same study group, we defined "nonparticipation" as "attrition through refusals". We compared these findings with our previous nonparticipation analyses of the same Louisiana (LA) cohort [28] and evaluated whether the factors affecting refusals were the same in each wave. Specifically, we assessed whether racial differences regarding refusals were consistent over time.

Methods
The North Carolina-Louisiana prostate cancer project (Wave 1)
The North Carolina-Louisiana Prostate Cancer Project (PCaP) is a multidisciplinary population-based case-only study designed to identify racial and geographic influences on PCa aggressiveness. The study collected information on social, individual, and tumor level factors. Eligibility criteria included living in the North Carolina (NC) or LA study areas, having first diagnosis of histologically confirmed adenocarcinoma of the prostate, being 40-79 years old at PCa diagnosis, being able to complete the interview in English, living outside an institution, not being cognitively impaired or physically severely debilitated, and not being under the influence of severe medication or alcohol, or apparently psychotic at the time of the interview [33]. Demographic and socioeconomic characteristics of LA participants in PCaP (PCaP-LA) were described in Brennan et al. [34]. The PCaP-LA cohort started enrollment in September 2004 but suspended accrual in August 2005 due to Hurricane Katrina. This study phase is referred to as the pre-Katrina (Pre-K) sample, which enrolled 119 African-Americans (AAs) and 94 EAs. The post-Katrina (Post-K) enrollment resumed in September 2006 and was completed in August 2009 with 506 AA and 508 EA participants [28,34].

The quality of life in prostate cancer project (Wave 2)
The Quality of Life in Prostate Cancer Project (Q-PCaP) is a follow-up study of the LA PCaP participants who were re-contacted 3-6 years after their initial interview. The follow-up study sought to investigate racial disparities in quality of life in men with PCa. All study participants enrolled in PCaP who completed the baseline interview and consented for future contact were eligible for Q-PCaP [34].

Unit nonresponse in the PCaP-LA cohort
Our group previously collected auxiliary information on refusals of the PCaP-LA cohort by combining data from LA Tumor Registry (LTR), U.S. census tract and PCaP eligibility forms, and evaluated factors affecting nonparticipation in PCaP with a specific focus on race, PCa diagnosis age, and study phase (Pre-K vs Post-K) [28]. Results showed that older age for AAs (≥70 years), high neighborhood poverty for EAs, and study phase for both races were significant predictors of nonparticipation among eligible PCaP-LA research subjects [28]. In this study, we compared previous findings from wave 1 [28] with the current analyses from wave 2 to evaluate whether the factors affecting nonparticipation had changed with respect to the waves.

Measures
The same or equivalent characteristics to those modeled previously [28] were included in our analyses to allow for comparisons; however, in the current analyses, the aggregate and LTR-based data used previously were replaced by individual-level data collected at wave 1. Age at diagnosis, race, and study phase were categorized as before [28]. We replaced the Gleason score and tumor stage used previously [28] with cancer aggressiveness, which is a composite score of Gleason score, tumor stage, and PSA at PCa diagnosis [35]. Census tract poverty was replaced with income and categorized as in Song et al. [35]. Rural density and parish were excluded from the current analyses since these factors pertained to Hurricane Katrina and thus to wave 1. We included additional factors in our models. Education was dichotomized as ≤ high school and > high school as in Song et al. [35]. The Rapid Estimate of Adult Literacy in Medicine (REALM) short form [36], which measures health literacy, was categorized as ≤ sixth grade (scores 0 to 44), seventh to eighth grade (scores 45 to 60), or high school (scores 61 to 66). Patient provider communication (PPC) score was measured using a 5-item indicator and adapted subscales from the Primary Care Assessment Survey 1995 Safran/The Health Institute [37], where higher scores indicate more positive communication between the patient and provider during PCa treatment. The Charlson Comorbidity Index (CCI) [38] was constructed from a comorbidity questionnaire where higher scores indicate more comorbidities.
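The REALM and education groupings described above can be written as simple lookup rules. The following Python sketch is illustrative only; the function names and category labels are hypothetical and are not taken from the study's analysis code, which was written in SAS.

```python
def realm_category(score: int) -> str:
    """Map a REALM score (0 to 66) to the three categories given in the text."""
    if not 0 <= score <= 66:
        raise ValueError("REALM score must be between 0 and 66")
    if score <= 44:
        return "<= sixth grade"
    if score <= 60:
        return "seventh to eighth grade"
    return "high school"


def education_category(more_than_high_school: bool) -> str:
    """Dichotomize education as <= high school versus > high school."""
    return "> high school" if more_than_high_school else "<= high school"


# Boundary scores fall in the lower of the two adjacent categories.
for score in (44, 45, 60, 61):
    print(score, realm_category(score))
```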

Statistical analyses
Binomial exact tests were used to compare sources of attrition in wave 2 with respect to race. Pearson chi-square tests and two-sample t-tests were used to assess associations between characteristics of wave 1 respondents and participation status in wave 2. Attrition was modeled using multinomial logistic regression. Racial differences in the attrition sources were assessed using race-stratified logistic regression and Firth's penalized likelihood models; Firth's logistic regression models were used to solve the problem of separation and reduce the bias of the maximum likelihood estimates due to low event rates after stratification.

Survival analyses were performed to assess the impact of the follow-up time (time between the two waves) on participation. Kaplan-Meier estimates were calculated with participation in wave 2 as the event of interest. Time to event was defined as the time since wave 1 (in years) and calculated as follows. For respondents of wave 2, the time difference between the wave 1 and wave 2 interviews was calculated. Time to event was assigned a zero for the 46 participants who refused further contact at the wave 1 interview. All drop-outs were considered to be censored observations, and their time to event was calculated as the time difference between the wave 1 interview date and February 28, 2013, which was the last day of wave 2 data collection. To use the Kaplan-Meier estimation technique, we assumed non-informative censoring, that is, that time to participate in wave 2 (survival time) is independent of time to drop-out after wave 1 (censoring time). Additionally, we assumed that the participation probabilities were the same for research subjects recruited early and recruited late in PCaP. The Wilcoxon test was used to assess differences in participation probabilities between races.

All statistical analyses were performed using SAS 9.4 (SAS Institute, Inc., Cary, North Carolina).
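The time-to-event construction and the product-limit (Kaplan-Meier) estimate described above can be sketched in a few lines of Python. This is a minimal illustration and not the SAS code used in the study: the function names, the toy data, and the conversion of days to years by dividing by 365.25 are assumptions.

```python
from collections import Counter
from datetime import date

STUDY_END = date(2013, 2, 28)  # last day of wave 2 data collection (from the text)
DAYS_PER_YEAR = 365.25         # assumption: the conversion is not stated in the text


def time_to_event(wave1, wave2=None, refused_contact_at_wave1=False):
    """Return (years since wave 1, event observed?) for one participant."""
    if refused_contact_at_wave1:
        return 0.0, False                      # time zero, censored
    if wave2 is not None:
        return (wave2 - wave1).days / DAYS_PER_YEAR, True   # participated in wave 2
    return (STUDY_END - wave1).days / DAYS_PER_YEAR, False  # drop-out, censored


def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    at_risk = len(times)
    all_counts = Counter(times)
    event_counts = Counter(t for t, e in zip(times, events) if e)
    surv, curve = 1.0, []
    for t in sorted(all_counts):
        d = event_counts.get(t, 0)
        if d:
            surv *= 1 - d / at_risk            # multiply by conditional probability
            curve.append((t, surv))
        at_risk -= all_counts[t]               # events and censored leave the risk set
    return curve


# Toy data (hypothetical): years since wave 1 and whether wave 2 was completed.
toy_times = [3.0, 3.5, 4.0, 4.0, 5.0, 6.0]
toy_events = [True, False, True, True, False, True]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t = {t:.1f} years, S(t) = {s:.3f}")
```

Tied observations are handled with the usual convention that events at time t occur before censoring at time t, so censored participants still count in the risk set at their censoring time.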

Results
Of 1227 PCaP LA participants, 46 refused further contact at the time of wave 1. Contact was attempted for the remaining 1181 participants of whom 118 were deceased, 23 were too frail at the time of wave 2, 87 were lost to follow-up, and 189 refused to participate. The reasons for attrition stratified by race are shown in Table 1. The most common reason for attrition among AAs was active refusal, followed by lost to follow-up. In EAs, the most common reason for attrition also was active refusal, but followed by being deceased. More AAs than EAs dropped out overall (p = 0.001), were lost to follow-up (p < 0.001) and passively refused (interview was scheduled but never completed) to participate in wave 2 (p = 0.005).
All baseline characteristics were significantly associated with attrition in wave 2 (Table 2). A larger percentage of wave 2 drop-outs were diagnosed between 60 and 69 years old, AA, had an income of $30,000 or less, had an education of high school or less, had a REALM score of high school, and had low cancer aggressiveness. While the average PPC score was lower among drop-outs than respondents (p = 0.011), the average CCI was higher (p = 0.030). Attrition rates also were calculated for each characteristic from the formulas given by the American Association for Public Opinion Research (AAPOR) [39] and provided in Table 2. For example, the attrition rate of men diagnosed between 40 and 59 years old was the number of drop-outs for that category, 142, divided by the total number of men for that category, 395, or 36%. The overall attrition rate was 38%; the greatest attrition rate occurred for wave 1 participants with high cancer aggressiveness (55%).
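The attrition-rate arithmetic can be checked directly from the counts reported in this section. The short Python sketch below reproduces the 36% example and, under the assumption that the overall count of drop-outs is the sum of all attrition sources listed above (including the 46 men who refused further contact at wave 1), the overall rate of 38%. The function and variable names are hypothetical.

```python
def attrition_rate(drop_outs: int, total: int) -> float:
    """Attrition rate as the number of drop-outs divided by the category total."""
    return drop_outs / total


# Men diagnosed between 40 and 59 years old (counts from the text).
rate_40_59 = attrition_rate(142, 395)

# Attrition sources reported for the 1227 PCaP-LA participants.
# Assumption: overall drop-outs are the sum of these five categories.
sources = {
    "refused further contact at wave 1": 46,
    "deceased": 118,
    "too frail": 23,
    "lost to follow-up": 87,
    "refused to participate in wave 2": 189,
}
overall_rate = attrition_rate(sum(sources.values()), 1227)

print(f"{rate_40_59:.0%}")    # 36%
print(f"{overall_rate:.0%}")  # 38%
```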
Results of the multinomial logistic regression are shown in Table 3. Attrition because of death was less likely to occur among men who enrolled in wave 1 after Katrina than among those who enrolled before it (OR = 0.6, 95% CI: 0.32-0.95). Men with high cancer aggressiveness at wave 1 had 4.5 times the odds of being deceased in wave 2 compared with those with low aggressiveness (95% CI: 2.54-7.86). The odds of attrition from death increased by a factor of 1.3 for every unit increase in CCI (95% CI: 1.12-1.46). Men who enrolled in wave 1 after Katrina were [...] Table 4. The deceased and frail categories were combined due to the small numbers in the frail category. Both AAs and EAs who enrolled in wave 1 after Katrina were less likely to be deceased or frail than those who enrolled before Katrina (OR = 0.4, 95% CI: 0.20-0.88 and OR = 0.5, 95% CI: 0.22-0.94, respectively). Both AAs and EAs with high cancer aggressiveness were more likely to drop out because of death or frailty than those with low cancer aggressiveness (OR = 3.0, 95% CI: 1.34-6.62 and OR = 4.1, 95% CI: 1.94-8.63, respectively). Every one unit increase in CCI increased the odds of AAs being deceased or frail at wave 2 (OR = 1.3, 95% CI: 1.03-1.54), but the corresponding odds among EAs were not significant. AAs 60 and older at diagnosis were less likely to be lost to follow-up than younger AAs (OR = 0.4, 95% CI: 0.18-0.73). Although older age for EAs had similar associations with being lost to follow-up, these associations did not reach statistical significance. EAs with an income of $30,000 or less had 8.7 times the odds of being lost to follow-up compared with those with an income of more than $70,000 (95% CI: 1.88-39.77). Income was not a significant predictor of being lost to follow-up in AAs. Education was no longer a significant predictor of being lost to follow-up for either AAs or EAs.
AAs with high cancer aggressiveness had 2.5 times the odds of refusing to participate in wave 2 compared with those with low cancer aggressiveness (95% CI: 1.34-4.72). EAs with an income of $30,000 or less (OR = 2.8, 95% CI: 1.34-5.81) or with an income between $30,001 and $70,000 (OR = 2.0, 95% CI: 1.07-3.58) were more likely to be refusals than those with an income of more than $70,000. Income was not a significant predictor of refusal in AAs. The odds of refusal in EAs were multiplied by 0.7 (a decrease of about 30%) for every unit increase in PPC score (95% CI: 0.51-0.96). REALM score was not a significant predictor of refusal in either race.
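Because odds ratios act multiplicatively, the per-unit estimates reported above can be translated into the log-odds scale or extended to changes of several units. The Python sketch below illustrates this with values from the text. It assumes Wald-type confidence intervals that are symmetric on the log scale, which may not hold exactly for the Firth models, and the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale(odds_ratio, ci_low, ci_high):
    """Approximate regression coefficient and standard error from an OR and its CI.

    Assumes a Wald interval: exp(beta +/- Z_95 * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def odds_ratio_for(odds_ratio_per_unit, units):
    """Odds ratio implied by a change of several units in the predictor."""
    return odds_ratio_per_unit ** units


def percent_change_in_odds(odds_ratio):
    """Percent change in the odds associated with an odds ratio."""
    return (odds_ratio - 1) * 100


# High vs low cancer aggressiveness, attrition due to death (OR = 4.5, CI 2.54-7.86).
beta, se = log_scale(4.5, 2.54, 7.86)
print(f"beta = {beta:.2f}, se = {se:.2f}")

# CCI: OR = 1.3 per unit, so a 3-unit increase multiplies the odds by 1.3 ** 3.
print(f"{odds_ratio_for(1.3, 3):.2f}")          # 2.20

# PPC in EAs: OR = 0.7 per unit corresponds to a 30% decrease in the odds.
print(f"{percent_change_in_odds(0.7):.0f}%")    # -30%
```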
The overall and race-stratified product-limit estimates and 95% CIs, and their accompanying Kaplan-Meier survival curves are provided in Table 5 and Fig. 1, respectively. In the overall cohort, the probability of participating in wave 2 decreased to 50% (95% CI: 46.9-52.8), 4.64 years after the baseline interview. When stratified by race, the probability of participating in wave 2 decreased to 50% (95% CI: 45.6-54.1), 4.65 years after the baseline interview for AAs, and it decreased to 50% (95% CI: 45.7-53.8), 4.66 years after the baseline interview for EAs. The estimated curve for EAs was above the one for AAs until the curves crossed at 4.66 years after wave 1 (Fig. 1b). Although the probability of participating at wave 2 was higher for EAs initially, the probability of participating in wave 2 became slightly higher for AAs after 4.66 years. The probability of participating in wave 2 decreased to 20% both in the overall sample and in both races 8 years after the baseline interview. The Wilcoxon test for equality of participation probabilities was significant (p < 0.0001), which indicated a short-term difference in participation probabilities between races with respect to follow-up times.

Discussion
The results showed that enrollment in wave 1 before Katrina was a significant predictor of attrition due to death or frailty, both for the overall sample and for both races. High cancer aggressiveness significantly increased the odds of attrition due to death or frailty for both races, yet higher CCI increased the odds of attrition due to death or frailty only in AAs. The results for attrition due to lost to follow-up in the overall model were consistent with literature that has found minorities and those with lower education are more likely to be lost to follow-up [40]. Although lower education was significantly associated with being lost to follow-up in the overall model, the association did not reach significance when the model was stratified by race. For AAs, being younger than 60 at diagnosis was the only significant [...]

For the same cohort, our group previously showed that older diagnosis age for AAs (≥70 years), low neighborhood poverty for EAs (< 20% of the households within the tract living in poverty), and study phase with respect to Katrina for both races were significant predictors of nonparticipation (refusal) in wave 1 [28]. Age and study phase were no longer significant predictors of nonparticipation in wave 2 for either race. Instead, high cancer aggressiveness for AAs, low income for EAs (≤ $70,000), and lower PPC scores for EAs were the significant predictors of nonparticipation in wave 2. One possible explanation for the opposite effect of income in EA nonparticipation might be that the compensation for participation in wave 1 was a maximum of $75, while it was only $25 in wave 2. This incentive decrease might have caused EAs with low income to be less interested in participating in wave 2. Another explanation can be provided based on the leverage-salience theory [42]: the salience and/or leverage of the survey features, such as busyness, monetary incentive amount, or willingness to contribute to PCa research, might have changed for EAs after wave 1. There may have been an additional survey feature that was not present at wave 1, such as experience with the interviewer in the first wave, which added to their leverage and changed their participation. Income did not alter AAs' participation in either wave, which highlights the need for using different approaches to boost participation of AAs in PCa studies in addition to providing monetary incentives (see [28]). PPC score, which is associated with higher patient satisfaction [43,44], has not been studied to our knowledge in nonresponse/attrition research. These results indicated that a more positive perception of PPC significantly decreased the odds of refusals, both in the overall sample and in EAs. PCa researchers should encourage providers to promote their patients' ongoing study involvement throughout longitudinal data collection.
In longitudinal studies, longer follow-up periods are well known to be associated with higher attrition, but the degree of this association has not been studied well [45,46]. The results from survival analyses showed that the participation probability of EAs was higher than the participation probability of AAs when the length of follow-up was shorter than 4.66 years. However, these probabilities were reversed when the follow-up time increased. Thus, racial differences may need to be considered when planning follow-up times. PCa researchers should keep the times between waves short when conducting longitudinal studies. However, the survival analyses were limited by lack of information on exact time of attrition. Further research needs to be done on the effect of exact time between waves on attrition to confirm these findings.

Conclusions
We assessed factors affecting attrition in the second wave of a population-based study of PCa. Studies have been conducted in various populations, including the elderly, to determine the effects of health on nonresponse and attrition [2,13,30]. To our knowledge, however, no such study exists specifically for PCa populations, which are prone to both aging and frailty. The large and approximately equal sample sizes of AAs and EAs enabled us to assess the racial differences in attrition. Our results confirmed the need for studying sources of attrition separately when possible; examining attrition without distinguishing between its different sources can cause source-specific factors to be missed. Our results also demonstrated the danger of using one wave of a longitudinal study to evaluate nonresponse in later waves. Unless the interval between waves is very short, strategies used to decrease attrition at an earlier wave may not be useful at the time of a subsequent wave because the salience and/or leverage of the study features might change for participants over time. The factors affecting attrition and nonparticipation should be studied at each wave to tailor ongoing retention efforts in longitudinal studies.