Determinants of patient recruitment in a multicenter clinical trials group: trends, seasonality and the effect of large studies

Background We examined whether quarterly patient enrollment in a large multicenter clinical trials group could be modeled in terms of predictors including time parameters (such as long-term trends and seasonality), the effect of large trials and the number of new studies launched each quarter. We used the database of all clinical studies launched by the AIDS Clinical Trials Group (ACTG) between October 1986 and November 1999. Analyses were performed in two datasets: one included all studies and substudies (n = 475, total enrollment 69,992 patients) and the other included only main studies (n = 352, total enrollment 57,563 patients). Results Enrollment differed across different months of the year with peaks in spring and late fall. Enrollment accelerated over time (+27 patients per quarter for all studies and +16 patients per quarter for the main studies, p < 0.001) and was affected by the performance of large studies with target sample size > 1,000 (p < 0.001). These relationships remained significant in multivariate autoregressive modeling. A time series based on enrollment during the first 32 quarters could forecast adequately the remaining 21 quarters. Conclusions The fate and popularity of large trials may determine the overall recruitment of multicenter groups. Modeling of enrollment rates may be used to comprehend long-term patterns and to perform future strategic planning.


Background
Adequate patient recruitment is an important prerequisite for the optimal function of multicenter clinical trials groups. Such groups are likely to perform several clinical trials concurrently across a number of participating clinical sites. The number of patients enrolled over time may thus depend on the capabilities of the participating centers as well as on the type and sample size of studies that are open at a given time. The enrollment performance of a clinical trials group may also change gradually over time. It would be useful if one could predict future enrollment based on past trends and on the new trials that are launched or being proposed. This knowledge may offer useful insight for future strategic planning. In particular, large trials may pose an especially high burden on the group in terms of patient recruitment. Additionally, there are anecdotal beliefs that patient enrollment may show seasonal variability, with fewer patients enrolled in the summer months or during the winter holiday season, when enrollment efforts may be diminished. However, there are no good empirical data addressing these issues.
In the present study we evaluated whether we could identify predictors of the overall rate of enrollment in a large multicenter clinical trials group that has been conducting clinical studies in the domain of human immunodeficiency virus (HIV) infection for over 13 years. Detailed data were available on enrollment (on-study dates) on nearly 70,000 patient entries from 475 studies. This offered the opportunity for examining the effect of long-term trends, new studies, large studies and seasonal parameters on the enrollment and for evaluating whether future enrollment can be forecast based on past performance.

Databases
We used data from the AIDS (acquired immunodeficiency syndrome) Clinical Trials Group (ACTG) on the accrual of patients in clinical studies between October 10, 1986 and November 12, 1999. All ACTG clinical studies were considered, including both randomized and non-randomized designs of all phases (I-IV). ACTG is sponsored by the Division of AIDS of the National Institute of Allergy and Infectious Diseases (NIAID) and it represents a large network for the conduct of clinical studies on HIV infection and its complications. ACTG performs studies both in adults (Adult ACTG) and in children (Pediatric ACTG). It uses the clinical resources of a network of 30 university sites and many other affiliated centers across the United States. The performance of sites is evaluated regularly and re-competition occurs approximately every 4 years; at some re-competition cycles a few sites have been replaced by new ones, but the total number of sites has remained relatively constant.
On-study dates were used for all the analyses. All analyses were performed in two datasets: one considered all ACTG studies and the other one excluded the recruitment in substudies of main studies. The analyses excluding substudies may be more robust, because by definition substudies typically included only subsets of the same patients as the main studies. For occasional studies, the ACTG had collaborated with other multicenter organizations, such as the Community Programs for Clinical Research on AIDS. In these cases, typically only data on the ACTG-site patients were available in the dataset.

Seasonal effect: patient enrollment and initiation of studies
First, we used the chi-square test to examine whether there was a seasonal effect on patient recruitment and on the initiation of new studies. Histograms summarized the number of patients enrolled every month and the number of studies starting every month of the year from the start of 1987 through the end of 1998.
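As an illustration of the monthly comparison, the following is a minimal sketch of a chi-square goodness-of-fit test. It assumes the expected count is the same for every month, which the text does not specify and which ignores unequal month lengths. The monthly totals are hypothetical and are not ACTG data; the function name and the constant are introduced here for illustration only.

```python
# Hypothetical monthly enrollment totals (January..December); NOT the ACTG data.
monthly_enrollment = [480, 500, 590, 600, 540, 520, 500, 490, 450, 610, 600, 470]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts per category."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Upper critical value of the chi-square distribution with 11 degrees of
# freedom at alpha = 0.001 (taken from standard tables).
CRITICAL_DF11_P001 = 31.264

statistic = chi_square_uniform(monthly_enrollment)
degrees_of_freedom = len(monthly_enrollment) - 1
significant = statistic > CRITICAL_DF11_P001
print(f"chi-square = {statistic:.1f}, df = {degrees_of_freedom}, p < 0.001: {significant}")
```

Weighting the expected counts by the number of days in each month would be a natural refinement if month length were thought to matter.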

Candidate predictors of enrollment
We evaluated whether the total ACTG enrollment may be influenced by the following parameters:
1. Trend over time: examination of the raw data suggested that a linear trend with increasing quarterly enrollment over time may be present. Alternative transformations (such as logarithmic) were also probed, but the fit was not improved.
2. Early "starting" effect: it is anticipated that the overall enrollment of a multicenter clinical trials group may need some time to reach a functional level when the group is first established, since sites may not register at the same time and some time is needed for a critical number of studies to be launched. Examination of the raw data suggested that this early effect might be modeled by the introduction of an indicator variable for the first 3 quarters (9 months).
3. Seasonal effect: dummy variables were employed, with winter as the reference season.
4. The effect of launching large studies: large studies may boost the overall enrollment of a clinical trials group due to their excessive demands on patient recruitment. The definition of what constitutes a large study is arbitrary. We considered a priori all ACTG studies with target enrollment exceeding 1,000 patients where a prevalent eligible patient pool is already available for enrollment when a study is launched. We generally excluded prospective studies of HIV perinatal transmission (the prevalent pool of HIV-infected pregnant women is small). The effect of large trials was modeled by giving weights for extra enrollment in the early quarters of their accrual. Specifically, for studies with target sample size over 1,000 patients, the weight was 1/2 for the quarter during which they were initiated, 1 for the subsequent quarter, and 1/2 for the next quarter. For parsimony, the "active" effects of large studies were considered to be negligible beyond the third quarter. These weights were used empirically, because the raw data suggested that, although exceptions may occur, on average large studies tend to have their peak accrual at the second quarter from their onset [1]. The overall effect of large studies was constructed by summing up the respective weights of the large studies that were "active" in each quarter. In a sensitivity analysis, studies with target sample size between 500 and 1,000 patients were also considered using half the weights described above, but the fit was not improved (not shown).
5. Number of studies starting each quarter: the vast majority of studies launched by the ACTG have fewer than 1,000 (or 500) patients. For such studies, the typical enrollment period is short (most often less than half a year or even just a few months). To model this effect, we considered as a predictor the number of studies starting in each quarter. A sensitivity analysis that also considered as a predictor the number of studies starting in the preceding quarter did not improve the model fit (not shown). For parsimony, we only considered the current quarter variable in the presented analyses.
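The weighting scheme for large studies can be sketched as follows. The weights (1/2, 1, 1/2) are those described in the text; the function name, the 0-based quarter indexing and the example launch quarters are assumptions made for illustration and do not correspond to actual ACTG studies.

```python
# Weights from the text: 1/2 in the launch quarter, 1 in the next quarter,
# 1/2 in the third quarter, and no "active" effect thereafter.
WEIGHTS = (0.5, 1.0, 0.5)


def large_study_effect(launch_quarters, n_quarters):
    """Sum the weights of all large studies that are 'active' in each quarter.

    launch_quarters: 0-based quarter index in which each large study started.
    n_quarters: length of the quarterly series.
    """
    effect = [0.0] * n_quarters
    for launch in launch_quarters:
        for offset, weight in enumerate(WEIGHTS):
            quarter = launch + offset
            if 0 <= quarter < n_quarters:
                effect[quarter] += weight
    return effect


# Hypothetical example: three large studies launched in quarters 2, 3 and 8.
example = large_study_effect([2, 3, 8], n_quarters=12)
print(example)
```

Overlapping studies simply add up, so a quarter in which one large study is in its second quarter and another is in its first receives a value of 1.5.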

Time series modeling
Evaluation of correlograms on the raw data and after the time trend had been removed suggested that a first-order autoregressive model may be appropriate, since there was a high first-lag autocorrelation coefficient and exponentially tapering, non-significant higher-lag autocorrelation coefficients, while the partial autocorrelation function showed only one prominent term at lag one. Therefore, we also examined the effect of the linear time trend in a first-order autoregressive model [2]. Furthermore, we also adjusted this model separately for seasonal effect, early "starting" effect, the effect of large studies and the number of new studies starting each quarter in order to see whether the significance and magnitude of the autoregressive effect was altered once these parameters were taken into account. Finally, a multivariate model was considered, employing all the predictors above as well as a first-order autoregressive term.
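The correlogram inspection described above can be approximated with a short sketch that removes a linear trend and computes sample autocorrelation coefficients. This is not the SPSS procedure used in the study; the function names and the quarterly series are hypothetical and serve only to illustrate the calculation.

```python
def detrend(series):
    """Residuals from an ordinary least-squares line fitted against the quarter index."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((y - mean) ** 2 for y in series)
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Hypothetical quarterly enrollment counts; NOT the ACTG data.
hypothetical_enrollment = [310, 420, 515, 560, 640, 700, 690, 760,
                           850, 900, 940, 1010, 1005, 1090, 1180, 1210]

residuals = detrend(hypothetical_enrollment)
for lag in (1, 2, 3):
    print(f"lag {lag}: r = {autocorrelation(residuals, lag):.3f}")
```

A large coefficient at lag one followed by rapidly shrinking coefficients at higher lags is the pattern that would point toward a first-order autoregressive term.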

Training and forecasting
Using a first-order autoregressive multivariate model with the same parameters, another training model based on the enrollment of the first 32 quarters was used to forecast the enrollment of the remaining 21 quarters. For all the above time series analyses sequence graphs with predicted and observed quarterly enrollment were visualized and both the absolute and proportional differences were calculated.
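The absolute and proportional differences between observed and predicted enrollment can be summarized as in the sketch below. Proportional differences are expressed relative to the observed accrual, and the thresholds of 400 patients and 30% mirror those reported in the Results. The function name, the dictionary keys and the example values are hypothetical, and observed counts are assumed to be positive.

```python
def forecast_errors(observed, predicted, abs_threshold=400, rel_threshold=0.30):
    """Summarize absolute and proportional forecast errors per quarter.

    Proportional errors are relative to the observed enrollment, which is
    assumed to be greater than zero in every quarter.
    """
    abs_diff = [abs(o - p) for o, p in zip(observed, predicted)]
    rel_diff = [d / o for d, o in zip(abs_diff, observed)]
    return {
        "mean_abs": sum(abs_diff) / len(abs_diff),
        "max_abs": max(abs_diff),
        "n_abs_over": sum(d > abs_threshold for d in abs_diff),
        "mean_rel": sum(rel_diff) / len(rel_diff),
        "max_rel": max(rel_diff),
        "n_rel_over": sum(r > rel_threshold for r in rel_diff),
    }


# Hypothetical observed and forecast enrollment for four quarters; NOT ACTG data.
observed = [1000, 1200, 800, 1500]
predicted = [1100, 1000, 1300, 1450]

summary = forecast_errors(observed, predicted)
for name, value in summary.items():
    print(f"{name}: {value}")
```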
Statistical analyses were conducted in SPSS version 9.0 (SPSS Inc, Chicago, IL). All reported p-values are two-tailed.

Results
Between October 1986 and November 1999, 475 ACTG studies were launched with a total enrollment of 69,992 patients. Excluding substudies, there were 352 main studies with a total enrollment of 57,563 patients. When limited to the period January 1, 1987 to December 31, 1998, there were a total of 441 studies (62,995 patients), including a total of 334 main studies (51,927 patients).
Enrollment differed significantly between different months (p < 0.001, figures 1 and 2). Patient recruitment peaked in spring (March, April) and late fall (October, November), while it was slower during the winter months and in September. Despite a trend for an increased number of studies starting in October and May, overall the number of studies starting each month did not differ significantly (p = 0.060 for all ACTG studies, p = 0.359 excluding substudies).
In univariate regressions (table 1), recruitment accelerated over time; on average the acceleration per quarter was approximately 27 patients when all studies were considered (p < 0.001) and 16 patients when substudies were excluded (p < 0.001). Also the effect of large randomized trials contributed significantly to patient enrollment (p < 0.001). Compared to the early enrollment during the first three quarters, more patients were recruited later on (p < 0.001 for all ACTG studies and p = 0.001 for main studies, respectively). The number of studies and substudies starting each quarter seemed to influence enrollment (p = 0.001) when all ACTG studies were considered, but this was not true when the substudies were excluded (p = 0.97). More patients seemed to be accrued during spring, but the seasonal effect was overall not very prominent.
We also evaluated models taking autocorrelation into account. In multivariate modeling with first-order autocorrelation being considered, the time trend and large trials seemed to be the most important determinants of the quarterly enrollment (table 1). The autoregressive term was not important. The fit for all ACTG studies and excluding substudies is presented in figures 3 and 4.
When the training model based on the first 32 quarters was assessed (table 2), the coefficients were largely similar to the final multivariate model described above. In figures 5 and 6, the fit of this training model is presented, including its ability to forecast the enrollment during the subsequent 21 quarters. Considering the sequence graph based on all ACTG studies, the difference between observed and predicted exceeded 400 patient entries in 6 forecast quarters and the mean absolute difference was 346. The maximal absolute difference between observed and predicted enrollment in forecast quarters was 856 patients. When all studies were considered, the prediction missed the actual observed accrual by over 30% in 5 quarters and the average relative deviation from the observed accrual was 24.7%. At the largest deviation, the predicted accrual was double the observed accrual. When the substudies were excluded, the forecast was even better. The maximal absolute difference between observed and predicted enrollment in forecast quarters was 723 patients, the absolute difference was more than 400 patients in only 5 forecast quarters and the mean absolute difference was 249. Excluding substudies, the prediction also missed the actual observed accrual by over 30% in 5 quarters and the average relative deviation from the observed accrual was 20.8%. At the largest deviation, the predicted accrual was 69% larger than the observed accrual. For both predictions, the largest divergence from the observed enrollment was seen in the period between summer 1996 and winter 1996-1997.

Discussion
The performance of large studies with target sample size above 1,000 patients is a key determinant of the overall number of patients recruited by a multicenter clinical trials group. Also, there was strong evidence in the ACTG of an increase in enrollment performance over time. Although the acceleration over time was significantly related to an increasing number of studies (particularly substudies) being launched over time, the dynamics of the acceleration of enrollment were more complex. The long-term time trend may reflect an increasing efficiency of the group and better networking for the referral of patients for participation in clinical trial initiatives. Other candidate predictors were less prominent. We identified a clear monthly variation with peaks in October-November and springtime and with troughs in summer and winter, more so for patient enrollment than for the initiation of new studies. However, this variation did not translate to a strong seasonal effect. For example, September had the lowest enrollment of all months of the year, thus attenuating whatever typical seasonal pattern might exist.

Figure 1
Overall enrollment per month of the year for all ACTG studies (1987-1998)

The data suggested the presence of a strong first-order autoregressive parameter. This is not surprising, since enrollment in a given quarter is likely to be influenced by what the level of enrollment had been in the previous quarter. Intuitively, this may reflect the ongoing trial activity in the multicenter clinical trials group, and large studies may be the most important component of this activity in this regard. Thus, the magnitude of the autoregressive term was markedly diminished when the effect of large trials was taken into account, while it remained unaltered when other candidate predictors were considered.
The forecasting ability of a training model based on the first 32 quarters was satisfactory. Enrollment was forecast adequately for a period exceeding 5 years, which represents a very long period of time, far longer than anything that might be needed for planning purposes. Interestingly, forecasting was most inadequate between the summer of 1996 and the winter of 1996-1997, when the model predicted good overall enrollment, while a relatively deep trough was observed in reality. This may be explained by the fact that in the summer of 1996, ACTG 320, a large trial with target enrollment of 1,750 patients, experienced a sharp, unexpected decline in its rate of enrollment despite an early rigorous accrual pattern. The study fell short of its target accrual (final sample size n = 1,178). Early interim analyses showed a large treatment difference and led to early termination (January 1997) [3]. Based on early surrogate marker trials that suggested the superiority of triple drug regimens, the study had come under attack in the summer of 1996 from various advocacy groups. The crisis escalated in the fall and winter. Perhaps this crisis also affected accrual in other ACTG trials at that time. Although ACTG 320 is a special case, this example further shows the importance that large trials may have for the materialization of the overall program of a group of clinical investigators. Some large trials may become reference points for a group, and their fate may influence the course of the group as a whole. A far more dramatic example is available from the breast cancer literature [4], where allegations of the falsification of some limited trial data led to a large trough in the enrollment of a very experienced group of trialists.
Although we deem that the overall predictive performance of the developed models was satisfactory, of course we should acknowledge that the adequacy of the forecasting ability depends also on a subjective interpretation of the results. Under different settings, it may be necessary to achieve even tighter predictions for the performance of a model to be satisfactory for operational use by a multicenter clinical trials group.
Previous research describing patient recruitment in various areas has mainly focused on the study protocols or the institutional sites [5,6,7,8,9,10,11,12]. There has been debate on whether simple enrollment should be considered as a key determinant of site performance [5,6,10]. Besides enrollment, data quality and patient retention obviously need to be considered as well [5,13]. Moreover, different studies entail different levels of effort per patient enrolled. While some studies require simple follow-up, others may be far more labor- and data-intensive and may demand specialized expertise. Simple enrollment rates cannot convey this complex information either for specific sites or for a clinical trials group as a whole. Nevertheless, overall enrollment rates may still be a useful piece of information in trying to assess group performance over time. Our analysis has focused on this perspective and has shown that there are identifiable factors affecting the overall patient enrollment and that predicted enrollment is fairly sensitive to these simple predictors. This information may be used in monitoring the progress of a group and in planning ahead the strategic development and the incorporation of future trials in its program. Coupled with data on anticipated cost [14,15] and intensity of data collection and analysis, the forecasts on enrollment may improve rational group steering.
Some of the limitations of our study include the relatively arbitrary definition of large trials [16,17] and the fact that our analysis pertains to a specific group of clinical investigators with activities restricted to the field of HIV infection. Regarding the large study definition, it was unavoidable that an a priori approach had to be decided upon. The cut-off of 1,000 patients has also been used in previous methodologic research [16,17]. Nevertheless, results were robust also when studies with a target sample size of more than 500 subjects were considered. It is conceivable that the definition of a large study may differ in different fields. No mega-trials with over 10,000 patients have yet been performed in the field of HIV infection and the largest ACTG trial has had slightly over 3,000 patients. For groups performing large studies with a wide range of sample sizes (including mega-trials), the weighting scheme for modeling the effect of large trials may need to show more gradation, based on the target sample size.
Finally, even though our analysis refers to a particular clinical trials group, the predictors and the modeling process may be generalized to other groups and settings. Comparative analyses in other groups may evaluate whether some of the predictors considered here, such as the seasonal effect, may differ between various medical fields. For example, seasonal effects are likely to be far more prominent for diseases that show seasonality in their incidence. On the other hand, other predictors, such as the effect of large trials, may be consistently important across diverse settings.

Figure 5
Observed and predicted number of patients enrolled every quarter in all ACTG studies during October 1986-November 1999. The model was trained on the first 32 quarters and forecasting was performed for the remaining 21 quarters.

Figure 6
Observed and predicted number of patients enrolled every quarter in all ACTG studies excluding substudies during October 1986-November 1999. The model was trained on the first 32 quarters and forecasting was performed for the remaining 21 quarters.

Competing Interests
None declared

ACTG: AIDS Clinical Trials Group; AR1: first-order autoregressive parameter
* Sum of weights for large studies in each quarter: 1/2 for a large study which is initiated in that quarter, 1 for a large study which is on its second quarter of enrollment, and 1/2 for a large study which is on its third quarter of enrollment.
† Dummy variable for first 3 quarters (Fall 1986-Spring 1987)
‡ p < 0.001
§ 0.001 ≤ p < 0.05