Determinants of subject visit participation in a prospective cohort study of HTLV infection

Background: Understanding participation in a prospective study is crucial to maintaining and improving retention rates. In 1990–92, following attempted blood donation at five blood centers, we enrolled 155 HTLV-I, 387 HTLV-II and 799 HTLV seronegative persons in a long-term prospective cohort. Methods: Health questionnaires and physical exams were administered at enrollment and at 2-year intervals through 2004. To examine factors influencing attendance at study visits, we used generalized estimating equations (GEE) to calculate odds ratios (ORs) for fixed and time-varying predictors of study visit participation. Results: There were significant independent associations between better visit attendance and female gender (OR = 1.31), graduate education (OR = 1.86) and income > $75,000 (OR = 2.68). Participants at two centers (OR = 0.47, 0.67) and of Black race/ethnicity (OR = 0.61) were less likely to continue. Higher subject reimbursement for interview was associated with better visit attendance (OR = 1.84 for $25 vs. $10). None of the health-related variables (HTLV status, perceived health status and referral to specialty diagnostic exam for potential adverse health outcomes) significantly affected participation after controlling for demographic variables. Conclusion: Increasing and maintaining participation by minority and lower socioeconomic status participants is an ongoing challenge in the study of chronic disease outcomes. Future studies should include methods to evaluate attrition and retention, in addition to primary study outcomes, including qualitative analysis of reasons for participation or withdrawal.


Background
Understanding participation in a prospective study is crucial to maintaining and improving retention rates [1,2]. A high rate of participant loss limits the ability to draw valid conclusions. This is particularly relevant in longitudinal studies, where loss of data points within or between visits can distort the relationships between measurements [3,4]. Many strong predictors of attrition, such as health problems and socioeconomic factors [5], race/ethnicity [6], and substance abuse [7], have been explored in efforts to develop strategies to maintain participation in long-term cohort studies. Most studies evaluate retention with a survival approach, looking at the time to loss to follow-up; some have explored positive and negative predictors of retention in long-term prospective studies using mixed model statistical techniques [8]. However, few have examined participants who drop out and then return after missing at least one visit. Although newer statistical techniques allow analysis of data with missing data points, minimizing attrition remains important for drawing valid research conclusions [9].
The purpose of this paper was to examine factors influencing the continued or renewed participation of subjects in a prospective longitudinal cohort study of human T-cell lymphotropic virus (HTLV) outcomes. By analyzing the large HTLV Outcomes Study (HOST) data set, we had the opportunity to investigate some uncommon but possible reasons for long-term study participation. The HTLV positive donors, with matched controls, followed in HOST since the early 1990s, permitted the examination of three research questions: first, does diagnosis with a relatively obscure virus in healthy adults impact study retention; second, does poorer health status influence participants to continue in a study, and third, does referral for specialty physician diagnostic examination, an indication of possible development of HTLV-associated disease or other adverse health outcome, contribute to long-term study enrollment. Our hypothesis was that healthy persons diagnosed with HTLV would be more likely to continue in a longitudinal study that included regular health assessments compared to HTLV negative controls. Further, we hypothesized that changes toward poorer health status regardless of HTLV status would increase the likelihood of staying in or reengaging in the study.

Sample
From 1990 through 1992, 155 HTLV-I seropositive, 387 HTLV-II seropositive and 799 seronegative participants were enrolled from populations of blood donors at five sites across the United States. Participants were aged 18 and older and tested either positive or negative for HTLV at the time of attempted donation. HOST data have been the source of many publications on the transmission, natural history and health outcomes of HTLV infection [10][11][12][13][14][15]. HOST is an extension of the cohort enrolled previously under the Retrovirus Epidemiology Donor Study (REDS), and the details of the HOST study design have been described elsewhere [16].
To improve the comparability of the groups, seronegative subjects were matched to HTLV seropositive subjects by age (5 year groups), sex, race/ethnicity, type of blood donation (whole blood, autologous or platelet pheresis) and blood center. A ratio of 1.5 seronegatives to HTLV seropositives was attempted, anticipating lower follow-up success with seronegative subjects. All participants were HIV seronegative. The HOST cohort included some sexual partners of HTLV seropositive donors, but they are excluded from analysis in this paper.

Setting
HOST is a multi-center, longitudinal prospective cohort study of the health effects of infection with HTLV-I and HTLV-II occurring at five blood banks in United States cities. The five clinical and data collection sites include three American Red Cross (ARC) blood services centers: Chesapeake/Potomac (Washington/Baltimore), Southeastern Michigan (Detroit), and Southern California (Los Angeles), as well as two independent blood centers: Blood Centers of the Pacific in San Francisco, California and the Oklahoma Blood Institute in Oklahoma City, Oklahoma. Testing for HTLV was routinely done at the time of blood donation, and donors found to be seropositive were permanently deferred from blood donation prior to enrollment. Seropositive persons are not usually ill, as there is only a 1-2% risk of progressing to either of the two recognized HTLV related diseases: adult T-cell leukemia (ATL) and HTLV-associated myelopathy (HAM) [17].

Procedures
Following enrollment and baseline data collection in 1990-92, participants have been contacted every two years to complete the three activities that comprise each visit: a health questionnaire, a basic neurologic exam, and phlebotomy for complete blood count and storage of specimens in the HOST biorepository. All activities and procedures were identical for seropositive and seronegative participants. Nurse counselors at each site were trained and monitored to perform all study activities in a standardized manner, but there was staff turnover during follow-up. The structured interview questions were asked by the study nurse and entered into a questionnaire booklet as the participant answered each question. Attempts were made to see all participants in person, but telephone interviews were accepted from participants who had moved out of state or who refused an in-person visit. The exception was the fourth study visit, whose protocol was modified because of decreased resources. It consisted of an abbreviated health questionnaire completed by mail or telephone, no basic neurologic examination, and remote phlebotomy with the blood sample sent by courier to the central laboratory. The study reverted to the original protocol when resources were restored for visits 5 and 6.
During each visit, an effort was made to limit the number of participants lost to follow-up by updating subject information for possible changes of name, address or telephone number. Participants consented to allow study personnel to search telephone directory assistance, the U.S. Postal Service forwarding service, public use databases, and credit bureau records if their previous information had changed between visits. Additionally, at each visit, participants were asked to designate a relative or friend who could be contacted to provide updated contact information or knowledge of the subject's death. In cases where the participant and designated contact person were no longer valid sources of information, a professional tracing expert was assigned to the participant with the purpose of discovering new contact information.
The health questionnaire, neurologic exam and phlebotomy were developed to screen for medical conditions or disease outcomes which might be associated with HTLV-I or HTLV-II, including ATL or HAM. Study clinicians developed an algorithm to identify abnormal responses in the health questionnaire, neurologic exam or phlebotomy results. A computer program was written to use the algorithm to screen all participant data (health questionnaire, basic neurologic exam, complete blood count and medical records) and identify/flag participants whose data were suggestive of clinical outcomes. A panel of three medical physicians with expertise in HTLV clinical and hematologic presentation met at regular intervals during each visit. The panel was blinded to participant serostatus and made decisions for participant referral to the local study physician and/or specialty physician for further diagnostic examinations in a uniform fashion.

Dependent Variable
The outcome of interest for this analysis is the attendance of study participants at each of the visits 2, 3, 5 and 6, following enrollment in the baseline visit 1. Data for visit 4 were excluded because of different procedures for that visit (see above). Study visit participation was defined as active if a participant completed at least a study health questionnaire either in person or by telephone, whether or not he or she completed the screening physical exam and phlebotomy.
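The outcome definition above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's actual SAS code; the `VisitRecord` type, its field names and the `attendance_by_visit` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass

# Visits analyzed as outcomes; visit 4 is excluded, as described in the text.
ANALYZED_VISITS = (2, 3, 5, 6)


@dataclass
class VisitRecord:
    """One participant's activity at one visit (hypothetical layout)."""
    visit: int
    questionnaire_done: bool      # completed in person or by telephone
    exam_done: bool = False       # screening physical exam
    phlebotomy_done: bool = False


def is_active(record: VisitRecord) -> bool:
    """Participation is active if at least the health questionnaire was completed."""
    return record.questionnaire_done


def attendance_by_visit(records):
    """Return {visit: attended} for the analyzed visits; a missing record counts as not attended."""
    done = {r.visit: is_active(r) for r in records}
    return {v: done.get(v, False) for v in ANALYZED_VISITS}
```

Under this definition, a participant who completed only the exam or phlebotomy at a visit, without the questionnaire, would not count as attending that visit.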

Independent Variables
Our main independent variables were three health-related measures relevant to a natural history study. The first was HTLV status (HTLV-I, HTLV-II or seronegative), measured at baseline. The second was perceived health status, rated at each visit on a five-point Likert scale from excellent to poor. The third was referral to a specialty physician diagnostic examination, also determined at each visit. In addition to these main independent variables, reimbursement for interview was a time-varying variable recorded at each visit; it changed from $10 to $25 for visits 4, 5 and 6.

Covariates
Fixed covariates measured at baseline were gender, age, race/ethnicity, education, annual income, ever use of injection drugs and study site. Race/ethnicity was recorded in detail (16 specific origins corresponding to risk groups for HTLV infection) but was collapsed to five for the analysis (White, Black, Hispanic, Asian and other). Educational achievement was collapsed from six categories (8th grade or less; 9th-12th grade but no diploma; high school graduate or equivalent, such as GED; some college or technical school; bachelor's degree; master's or professional degree) to four (high school or less; some college; bachelor's degree; master's or professional degree). Income was collapsed from seven categories (<$10,000; $10,000 to 19,999; $20,000 to 29,000; $30,000 to 39,999; $40,000 to 49,999; $50,000 to 74,999 and $75,000 or more) to five (< $10,000; $10,000 to 29,000; $30,000 to 49,999; $50,000 to 74,999 and > $75,000).
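The collapsing of education and income categories described above amounts to a simple recoding. The following Python sketch illustrates it; the category label strings, the `EDUCATION_MAP` dictionary and the `collapse_income` function are hypothetical and do not reproduce the coding used in the study database.

```python
# Hypothetical labels for the six original education categories,
# mapped to the four analysis categories named in the text.
EDUCATION_MAP = {
    "8th grade or less": "high school or less",
    "9th-12th grade, no diploma": "high school or less",
    "high school graduate or equivalent": "high school or less",
    "some college or technical school": "some college",
    "bachelor's degree": "bachelor's degree",
    "master's or professional degree": "master's or professional degree",
}


def collapse_income(band_lower_bound: int) -> str:
    """Map the lower bound (in dollars) of an original income band to its analysis band.

    Assumes the original bands start at 0, 10,000, 20,000, 30,000, 40,000,
    50,000 and 75,000 dollars, as listed in the text.
    """
    if band_lower_bound < 10_000:
        return "under $10,000"
    if band_lower_bound < 30_000:
        return "$10,000 to under $30,000"
    if band_lower_bound < 50_000:
        return "$30,000 to under $50,000"
    if band_lower_bound < 75_000:
        return "$50,000 to under $75,000"
    return "$75,000 or more"
```

For example, `collapse_income(20_000)` and `collapse_income(10_000)` fall into the same analysis band, mirroring the merging of the second and third original income categories.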

Analysis
We first described the sample on baseline characteristics by HTLV status, using chi-square tests to compare the percentage in each category across HTLV status groups. In our initial analysis, we categorized participants as taking part in visit 1 only (baseline only), in visit 1 and at least one other visit (some follow-up) or in all visits (all follow-up), and used chi-square tests to compare proportions in each category.
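The three-way participation grouping can be illustrated with a short Python sketch. The function names are hypothetical, and the choice of which follow-up visits define "all visits" (here visits 2, 3, 5 and 6, matching the visits analyzed as outcomes) is an assumption; the text does not state whether visit 4 was counted for this grouping.

```python
from collections import Counter

# Assumption: "all follow-up" means attending every one of these visits.
FOLLOWUP_VISITS = (2, 3, 5, 6)


def participation_group(attended, followup_visits=FOLLOWUP_VISITS):
    """Classify a participant from the set of visit numbers they attended."""
    attended_followups = set(attended) & set(followup_visits)
    if not attended_followups:
        return "baseline only"
    if attended_followups == set(followup_visits):
        return "all follow-up"
    return "some follow-up"


def group_counts(cohort):
    """Count participants per group; cohort maps participant id -> attended visits."""
    return Counter(participation_group(visits) for visits in cohort.values())
```

The resulting counts per group, cross-tabulated against a baseline characteristic, form the contingency table to which a chi-square test would then be applied.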
We then used multivariate Generalized Estimating Equation (GEE) analysis to test the relationship over time between the independent variables and attendance at each study visit after baseline enrollment at visit 1. The initial model included the fixed variables (HTLV status, gender, age, race/ethnicity, education, income, ever use of injection drugs and site) and the time-varying variables (health status at the previous visit, referral to a specialty physician diagnostic examination at the previous visit and reimbursement at the previous visit). Variables were then sequentially removed, starting with the least statistically significant. We forced two variables (HTLV status and referral for further exam) into the final model for plausibility: our hypothesis was that they were associated with participation, although they were not statistically significant in our adjusted model. Time was entered in the model as visit, and attendance at each visit was used to predict attendance at the following visit. GEE analysis does not require a balanced design (i.e., observations at all measurements for each participant), and it accommodates correlated errors due to repeated measures. We used a binomial distribution with a logit link to estimate the likelihood of participation and present the results as adjusted odds ratios (OR) with 95% confidence intervals (CI). All analyses were done with SAS, version 9.0 (SAS Institute, Inc., Cary, NC).
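For readers less familiar with this type of model, its general form can be written as follows. The notation is introduced here for illustration and is not taken from the study's analysis code: $Y_{it}$ indicates attendance of participant $i$ at visit $t$, $\mathbf{x}_i$ holds the fixed baseline covariates, $\mathbf{z}_{i,t-1}$ holds the time-varying covariates measured at the previous visit, and visit is shown as a single linear term, which is an assumption. The working correlation structure is not stated in the text and is therefore left unspecified.

```latex
\operatorname{logit}\,\Pr(Y_{it} = 1)
  = \beta_0
  + \mathbf{x}_i^{\top}\boldsymbol{\beta}
  + \mathbf{z}_{i,t-1}^{\top}\boldsymbol{\gamma}
  + \delta\, t,
\qquad
\widehat{\mathrm{OR}}_k = \exp\bigl(\hat{\beta}_k\bigr),
\qquad
95\%\ \mathrm{CI}_k = \exp\bigl(\hat{\beta}_k \pm 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_k)\bigr)
```

An adjusted OR above 1 for a covariate corresponds to higher odds of attending a visit, holding the other covariates in the model constant.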

Results
The characteristics of the 1341 participants at baseline are shown, by HTLV status, in Table 1. HTLV-I blood donors were more likely to be Black and HTLV-II donors to be Hispanic, and both HTLV seropositive groups were observed to have lower education and lower annual income than HTLV seronegative donors.
After recruitment and baseline data collection in visit 1, 88 (7%) participants were lost to follow-up and completed no further interviews or examinations; 51 (14%) after visit 2, 113 (31%) after visit 3, 84 (23%) after visit 5. Most of the 366 (27%) participants who dropped out were lost at the second or third visit. As some participants rejoined the study, a total of 1020 (76%) participants completed one or more follow-ups from visit 2 through visit 6, and 233 (17%) participants completed all visits. All 1341 participants were seen in person at baseline. Telephone interviews rather than in-person visits were done for 3% at visit 2, for 8% at visit 3, for all participants at visit 4 as described earlier, for 56% at visit 5, and for 40% at visit 6. Of the 1341 participants enrolled at baseline, 985 participated in visit 6 (73%). Characteristics in bivariate analysis by sociodemographic and health-related variables and study site, by these groupings of participation, are shown in Table 2.
Overall study participation by site is shown in Figure 1. Visit 4 had a telephone rather than in-person interview, and demonstrated considerably lower study participation. Not all subjects answered every question; the number answering each question is listed with each characteristic. Percentages may not add to 100 because of rounding, and are based on those answering the question.
Further, the visit 4 health questionnaire did not include the perceived health status question. Because of the resulting loss of data and statistical power and our interest in perceived health as a predictor of participation, we examined GEE results and found no differences in effect sizes with and without visit 4 data. Among health-related variables in the bivariate analysis, HTLV-seronegatives and those with excellent or good health status were more likely to attend study visits (Table 3). However, in multiple regression analysis, neither health status nor HTLV status was associated with participation after adjusting for relevant covariates. Referral for specialty physician diagnostic exam was not a significant predictor of participation. We examined why health status was significant in bivariate analyses but not in the multivariate model, and found that education accounted for the apparent association between health status and visit participation seen in the bivariate model. Of the protocol-related variables, higher subject reimbursement and study site were statistically significant predictors of participation at a subsequent visit. When reimbursement was increased from $10 to $25, participants were nearly twice as likely to continue (adjusted OR 1.84). As suggested by the differences in proportions, differences in participation by study site remained significant in the GEE analysis. In particular, compared to participants at the San Francisco site, those at the Detroit and Los Angeles sites were significantly less likely to participate (adjusted OR 0.47 and 0.67, respectively).
Of the sociodemographic variables examined, there was a clear trend for socioeconomic status. Those in higher income categories were increasingly more likely to continue in the study as compared to those in the lowest income group, with those reporting $75,000 or more in annual income 2.68 times as likely to continue as those making less than $10,000. A similar trend was seen for increasing education, with those in the highest education category 1.86 times as likely as those with high school or less education to participate in study visits. Women were more likely to participate compared to men (adjusted OR 1.31), and Blacks and "other race" subjects were less likely to attend study visits as compared to Whites (adjusted OR 0.61 and 0.59, respectively).
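Because the results above are reported as odds ratios, it may help to see how an OR translates into a change in the probability of attendance. The sketch below is purely illustrative: the function name is hypothetical and the baseline probability of 0.70 used in the usage lines is an arbitrary assumption, not a figure from the study.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by multiplying the baseline odds by an odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)  # probability -> odds
    new_odds = baseline_odds * odds_ratio                  # an OR scales the odds
    return new_odds / (1.0 + new_odds)                     # odds -> probability


# Hypothetical baseline attendance probability of 0.70 (an assumption).
p_higher_reimbursement = apply_odds_ratio(0.70, 1.84)  # OR for $25 vs. $10
p_top_income = apply_odds_ratio(0.70, 2.68)            # OR for highest vs. lowest income
```

With the assumed baseline of 0.70, an OR of 1.84 corresponds to an attendance probability of about 0.81. An odds ratio multiplies the odds rather than the probability, so when attendance is already common the change in probability is smaller than the OR alone might suggest.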

Discussion
The main findings of this study were that persons with higher incomes and more education were more likely to participate in study visits and men and persons of Black and other race/ethnicity were less likely to participate. Contrary to our hypothesis, HTLV seropositivity, poorer perceived health status, and referral to specialty diagnostic exam for potential adverse health outcomes did not significantly affect participation after controlling for demographic variables. Specific protocol-related characteristics did matter: study site and an increase in reimbursement were positively associated with participation.
Retention rates overall have remained high in this 12-year study of blood donors, at 73% through visit 6. By virtue of selection criteria, blood donors are generally healthier than the general population. The diagnosis of a viral infection with serious, albeit rare, consequences is an unexpected result of blood donation. We hypothesized that being seropositive for HTLV, having poorer perceived health status, and referral for further physician examination because of possible HTLV-related disease would be associated with higher rates of overall participation and re-engagement in subsequent visits. Our data did not support these hypotheses: HTLV positive status, perceived health status, and referral for specialty physician diagnostic examination made no difference in retention or re-engagement of participants. This inability to reject the null hypothesis is reassuring for the scientific validity of HOST, because loss to follow-up related to HTLV seropositivity or to adverse health outcomes, whether perceived or reflected in changes in objective health measures, could otherwise be an important source of bias in this longitudinal study.
Instead, as previously reported in the literature, demographic factors were important predictors of retention in this cohort. Males, those with lower education and lower income, and persons of color were less likely to participate in study visits. There is controversy about the effect of gender on study participation. Some studies indicate that women are more likely to consent to study participation [18] and to continue in studies over time [19], while others report no difference in participation by gender [20].
What is novel in this research is that the health status of the participants did not appear to affect visit participation. These findings are difficult to compare with other studies because of the inherent difference between these essentially healthy participants, some of whom had tested positive for a virus yet were not ill, and participants in longitudinal studies of chronic illness. Poor health is usually predictive of dropping out of longitudinal studies [5,21,22].
Increasing and maintaining participation by underrepresented groups, who are likely to be in lower socioeconomic strata as well, is an ongoing challenge for researchers wanting to characterize health and disease for the general population [23][24][25][26]. While studies have shown that blood donors as a group have higher socioeconomic status [27], the persistent and independent influence of race/ethnicity, education and income demonstrates the continued and urgent need to develop and test strategies to encourage participation by under-represented groups.
In addition to well-known sociodemographic factors, differences in the protocol and its implementation were important in study participation. As others have shown [28], the increase in monetary reimbursement (from $10 in visit 3 to $25 in visit 4) was positively associated with study participation. The most dramatic change in participation was seen in visit 4, when interviews were done by telephone or mail and phlebotomy was done remotely, instead of in-person interviews and phlebotomy by the study nurse. The modified approach resulted in profound decreases in participation at all centers despite the increase in reimbursement, so the study resumed in-person methods for subsequent visits. Moreover, differences by site, despite consistency in training and protocol management, may have reflected subtle differences in personnel and in implementation of the protocol. For example, the Los Angeles site reported that subjects moved often and required intensive tracing efforts, and that many subjects cited urban sprawl and the large, traffic-congested metropolitan area as reasons to drop out. Anecdotally, frequent changes in study nurses at some centers probably disrupted the rapport essential to retention. These protocol and logistical observations, while consistent with common sense, remain crucial to the successful implementation of future prospective studies.
Strengths of this analysis are that the data came from five different blood centers and covered a long follow-up period. HOST follows a uniform, well-funded study protocol with a data coordinating center. The overall retention rate was high, allowing better measurement of differences among study groups. Limitations include the telephone follow-up in visit 4, which was addressed by excluding those data. In addition, few variables were collected specifically for the analysis of study participation. As is often the case, studies of retention are secondary analyses, peripheral to the primary research aim, and often do not have the depth or richness of data to examine the more subjective aspects of retention.

Conclusion
In future research, investigators may wish to study various strategies to minimize participant attrition. These have been categorized by others into three areas: competence, dedication and standardized training; communication and collaborative effort between participant and researcher; and expressions of appreciation to participants [29][30][31][32][33]. For future longitudinal natural history studies, researchers should consider collecting data specifically related to study participation, including characteristics of study personnel, protocol implementation process and outcomes, changes in the study environment that could affect collection efforts, and other factors directly related to retention. Such factors may include flexible staffing hours, recommended by some to ensure that the research interviews are convenient for the participant [34], and home visits, which, although time consuming and costly, may have a positive impact on retention [21]. Qualitative research to better understand the range of interactions between subject and researcher may also be useful in developing testable hypotheses.
In conclusion, poor longitudinal visit participation is one of the major challenges to study validity. Our data have confirmed previous findings and suggested new insights. We recommend that future longitudinal studies incorporate specific measures of participant attrition and retention into their design, including qualitative analysis of participant-researcher interactions. In this way, real progress may be made in understanding and improving participation in studies.