In this review, we quantified the appropriate use of self-controlled designs in pharmacoepidemiology studies involving electronic healthcare databases, in terms of the major and minor validity assumptions underlying these designs. We focused on studies involving medical databases, in which self-controlled designs are particularly useful for adjusting for time-invariant confounders that may not be recorded. Self-controlled designs were not appropriately used in 34% and 13% of the reviewed articles describing a case-crossover or self-controlled case series design, respectively. We encourage better use of these designs in situations for which the major validity assumptions are fulfilled (i.e., for which they are recommended), while accounting for situations for which the design can be adapted.
Our study updated the Nordmann et al. review and is the first systematic review to explore the appropriate use of self-controlled designs in pharmacoepidemiology involving electronic healthcare databases in terms of major and minor validity assumptions, in accordance with recent recommendations. Fulfilment of the major assumptions is the minimum requirement for self-controlled designs to be valid; in such situations, they can be superior to designs with comparison groups (better powered and less biased). Moreover, recent recommendations state that self-controlled designs should be preferred to designs with comparison groups in studies performed on healthcare databases when the key validity assumptions are fulfilled. For articles that did not fulfil all of the major validity assumptions, the failure was essentially due to the study of sustained exposures (e.g., antihypertensive drugs or prophylaxis for cardiovascular events) or of events with an insidious onset (e.g., depression or chronic fatigue). Intermittency of exposure is a requirement for both the case-crossover and the self-controlled case series designs, to ensure that the number of patients with varying exposure statuses is not too small. Acuteness of the event onset is a validity assumption that reduces the likelihood of misclassification bias. However, some authors have proposed an adaptation of the case-crossover design for studying prolonged exposures and insidious-onset outcomes, and an adaptation of the self-controlled case series (towards the end of our observation period) for studying cumulative exposure. The adaptation proposed by Wang et al. consists of lengthening the exposure assessment windows. Despite these adaptations, self-controlled designs are usually less powerful than between-person comparisons when studying sustained exposures, because discordant pairs seldom arise when exposure is truly sustained and the observation period is short compared with the durations of the risk and control periods.
Thus, both the case-crossover and the self-controlled case series designs would fail: the former because of a too-small number of discordant pairs, and the latter because cases would seldom be unexposed. Hence, self-controlled designs are not recommended for situations of sustained exposure because they can lead to a loss of power. In fact, few statistical power (or sample size) calculations were carried out in the included studies, so the impact of studying long-term exposures when there is a low probability of switching is uncertain. In addition, lengthening the exposure assessment windows increases the risk of bias due to within-person time-varying confounders (the absence of which is the main advantage of self-controlled designs). Even if lengthening the exposure assessment windows reduces misclassification bias, it does not address the issue of reverse causality, which can arise when an exposure occurs after the true time of outcome onset, thereby leading to a spurious association. More generally, failure to meet important assumptions of self-controlled designs has been associated with an increased risk of discrepant results between case-only and cohort-based approaches (which can occur even in the absence of unmeasured confounders).
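The scarcity of discordant pairs under sustained exposure can be illustrated with a short simulation. The sketch below is purely illustrative (the sample size, exposure prevalence, and switching probabilities are arbitrary assumptions, not values from the reviewed studies): each case's exposure status in the case period either persists from the control period or switches with probability `p_switch`, and only cases whose status differs between the two periods contribute information to a case-crossover analysis.

```python
import random

def count_discordant_pairs(n_cases, p_exposed, p_switch, seed=2024):
    """Count cases whose exposure status differs between the control and
    case periods (the only informative cases in a case-crossover design)."""
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_cases):
        exposed_control = rng.random() < p_exposed
        # With probability p_switch the status changes between the periods;
        # a sustained exposure corresponds to a small p_switch.
        exposed_case = (not exposed_control) if rng.random() < p_switch else exposed_control
        if exposed_case != exposed_control:
            discordant += 1
    return discordant

sustained = count_discordant_pairs(10_000, p_exposed=0.3, p_switch=0.02)
transient = count_discordant_pairs(10_000, p_exposed=0.3, p_switch=0.40)
print(sustained, transient)
```

Under these assumed probabilities, only about 2% of cases are discordant when the exposure is sustained, versus roughly 40% when it is transient, so the effective sample size of the self-controlled analysis collapses even though the database itself is large.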
For all these reasons, we considered studies involving a self-controlled design invalid in situations of sustained exposure when there was a small probability of switching exposure status within the observation window, or when the outcome had an insidious or delayed onset or resulted from a cumulative effect. Of note, we classified drugs that are usually used chronically as “intermittent exposures” when a high opportunity for a switch from exposed to unexposed status (or vice versa) could be assumed over the observation period. Several examples can be cited: methylphenidate in children with attention-deficit/hyperactivity disorder (the treatment is usually discontinued during holidays) and palivizumab, an anti-respiratory syncytial virus (RSV) monoclonal antibody used for prophylaxis of severe lower respiratory tract infection in children (usually administered during the high-risk season for RSV infection).
In a sensitivity analysis considering studies examining a sustained exposure as appropriate (i.e., without accounting for the probability of switching exposure status), 45 (85%) case-crossover and 53 (96%) self-controlled case series studies fulfilled all major assumptions. In this analysis, the most frequent validity threats were insidious-onset or common events.
In terms of event frequency, we considered rare and/or recurrent events appropriate for the self-controlled case series design and only rare events for the case-crossover design. However, in situations where the event is both non-recurrent and non-rare, a self-controlled design can still be used. Nevertheless, such use implies that the number of strata (here, the number of cases) increases but not their size (here, the number of periods within the same patient), thereby leading to poor estimation of the variance when stratified models are used. Therefore, we considered a rare or recurrent event to be a major assumption. Moreover, in the study by Pouwels et al., rareness of the outcome was a factor associated with fewer discrepancies.
We found that the minor assumptions were most often valid when the major ones were. As a reminder, these assumptions were considered minor because the design can be adapted when they are not fulfilled, which still allows the use of a self-controlled design. A small proportion of the self-controlled designs, 18 (16%), could have been improved by applying such adaptations. For instance, 3 studies with a case-crossover design did not adjust for a time trend in exposure even though the paper clearly stated that such a trend existed. Lack of adjustment for exposure time trends in case-crossover studies has been shown to lead to biased estimates [14, 25, 34], and hence several extensions of the case-crossover design have been developed to take a temporal trend in exposure into account [14, 15, 24, 25]. Of note, in 22 additional papers from our systematic review, the existence of an exposure time trend was neither discussed by the authors nor assessable from the reported information, but we still considered these papers appropriate. Thus, the proportion of case-crossover studies that could have used the design more adequately may be underestimated. Researchers must keep in mind that exploring and reporting such a trend in case-crossover studies is crucial for design validity. Concerning the event-independent exposure assumption, we found that it was fulfilled in 88% of articles involving a self-controlled case series. A simulation study reported that the relative incidence is almost always overestimated when the event-independent exposure assumption is violated in self-controlled case series studies (except in situations of extreme dependence), but the bias is corrected when the design is adapted. The corresponding methodological developments were published in the late 2000s [11, 16], perhaps too recently to be applied in the studies we reviewed.
Of note, a simulation study exploring the validity of the case-time-control design in situations of within-individual exposure dependency over several control periods showed that the method is robust to deviations from this assumption. We did not explore this assumption in the studies included in our review.
The previous systematic review by Nordmann et al. reported the validity assumptions of self-controlled designs in pharmacoepidemiology between 1995 and 2010 (before the development of the previously cited recommendations). The authors reported inappropriate use of self-controlled designs: validity assumptions were not fulfilled for 76% of the articles describing a case-crossover design and 60% of those describing a self-controlled case series. Concerning the major assumptions, our review, which covered recently published healthcare database studies, shows that this situation has improved. Moreover, major and minor validity assumptions were not distinguished in the Nordmann et al. review. Nevertheless, we found the same main reasons for inappropriate use of these designs (i.e., the study of sustained exposures and failure to consider exposure time trends).
Self-controlled designs can control for intra-individual time-invariant confounders. Many design extensions that weaken the validity assumptions have been developed, and these designs are still under development, for example for the study of multiple exposures or of recurrent events when recurrences are not independent. However, the designs remain subject to several biases (e.g., residual confounding due to unmeasured within-person time-varying factors, or misclassification of exposure). Moreover, self-controlled designs explore the triggers that precede abrupt-onset events and answer the questions “Why now?” or “What happened just before?”, which differ slightly from the question raised by between-person comparisons (“Why me?”). Nevertheless, they are complementary to cohort-based approaches, and both designs should be applied, especially when one or more assumptions are not fulfilled.
Regarding the quality of reporting, we found 9 studies examining sustained exposures (e.g., antihypertensive treatments or low-dose aspirin for secondary prevention of cardiovascular events) even though these papers described the design as appropriate for studying abrupt-onset outcomes and transient drug exposures. This high number underlines that authors and reviewers should be aware of the design’s validity assumptions and recommendations for use, and of the need to report whether the validity assumptions are fulfilled. Indeed, the minor assumptions were rarely reported in the papers we reviewed. In addition, the sample size or power calculation was rarely reported, with no improvement compared with a previous review. Studies involving electronic healthcare databases usually have very large sample sizes, and an a priori sample size calculation is perhaps not needed in this context because the sample size cannot be chosen. However, a post-hoc power calculation in the database study sample is important for deciding which healthcare database should be used and for interpreting the absence of a statistical association, especially in the context of a very rare event. A post-hoc power calculation indicates how easily an effect fixed a priori can be demonstrated, accounting for the observed number of patients/cases and the observed variability. Even if confidence intervals (which reflect the precision of the estimates) are reported, the power calculation relates more directly to the number of observed events and is easier for the reader to understand: in the case of a non-significant association, the available power can be quite difficult to judge from the confidence interval alone.
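As an illustration of such a post-hoc power calculation, the standard normal-approximation formula for McNemar’s test (the test underlying a matched-pair case-crossover analysis) can be applied to the observed number of discordant pairs. This is a generic textbook approximation, not a method drawn from the reviewed studies, and the counts and odds ratio below are hypothetical.

```python
from math import erf, sqrt

def mcnemar_power(n_discordant, odds_ratio, z_alpha=1.96):
    """Approximate power of a two-sided McNemar test, given the observed
    number of discordant pairs and an a priori odds ratio."""
    # Under the alternative, the probability that a discordant pair is
    # exposed in the case period (rather than the control period) is
    # psi / (1 + psi), where psi is the odds ratio.
    p = odds_ratio / (1.0 + odds_ratio)
    numerator = abs(p - 0.5) * sqrt(n_discordant) - z_alpha * 0.5
    denominator = sqrt(p * (1.0 - p))
    z = numerator / denominator
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF

# Hypothetical example: an odds ratio of 2 is far easier to demonstrate
# with 200 discordant pairs than with 50.
print(round(mcnemar_power(50, 2.0), 2), round(mcnemar_power(200, 2.0), 2))
```

Reporting such a calculation alongside a non-significant estimate tells the reader directly whether the study could plausibly have detected the effect of interest.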
With respect to other elements of reporting quality, a measure of variability of the estimate was always reported, which we considered adequate if the number and duration of the different periods were also reported. The effect estimator was not always appropriately reported, but the reported statistical models were all considered appropriate. Valid models other than conditional logistic regression or conditional Poisson regression can still be applied, such as the stratified Cox proportional-hazards model for case-crossover studies and the stratified Cox or conditional logistic regression models for self-controlled case series, even if their use is unusual. However, it has been shown that, in case-crossover studies for instance, the conditional logistic regression and conditional Poisson models give identical estimates. In case-crossover studies, times to event are the same for the case and control periods (because periods are defined similarly within strata), and even if a stratified Cox model can be used, it does not assess or compare a time to event. Therefore, the computed estimator (even if called a “hazard ratio”) still represents an odds ratio and should be interpreted as such. One third of the included papers reported both adjusted and unadjusted estimates, which is quite low given that comparing the two allows the importance of bias to be assessed. Nordmann et al. recommended reporting the number of discordant pairs in case-crossover designs (i.e., the number of patients who crossed from unexposed in the control period to exposed in the case period, or vice versa) or, for self-controlled case series, the count of events in the different time periods. We considered that these items need to be reported, unless a measure of variability of the estimate is reported along with a clear description of the number and duration of the different periods.
However, only a small number of self-controlled case series studies reported the duration of the control period along with the person-times in the risk and control periods, whereas this was appropriately reported in case-crossover studies. Consequently, the quality of reporting in self-controlled studies can still be improved, in accordance with the recommendations provided by Nordmann et al.
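To make concrete why reporting the discordant-pair counts (or a variability measure with a clear description of the periods) matters, note that in a 1:1 case-crossover analysis the matched-pair odds ratio and its Wald confidence interval can be recovered directly from the two discordant-pair counts; concordant pairs carry no information. The sketch below uses made-up counts for illustration only.

```python
from math import exp, log, sqrt

def case_crossover_or(n_exposed_case_only, n_exposed_control_only, z=1.96):
    """Matched-pair (McNemar) odds ratio and Wald 95% CI computed from
    the counts of the two kinds of discordant pairs."""
    or_hat = n_exposed_case_only / n_exposed_control_only
    # Standard error of the log odds ratio for matched pairs.
    se_log_or = sqrt(1.0 / n_exposed_case_only + 1.0 / n_exposed_control_only)
    lower = exp(log(or_hat) - z * se_log_or)
    upper = exp(log(or_hat) + z * se_log_or)
    return or_hat, lower, upper

# Hypothetical counts: 30 cases exposed in the case period only,
# 15 exposed in the control period only.
or_hat, lo, hi = case_crossover_or(30, 15)
print(f"OR = {or_hat:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With these counts the odds ratio is simply 30/15 = 2.0, which shows how a reader given the discordant-pair counts can verify the published estimate and its precision.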
Our study has some limitations. First, a potential paper selection bias could exist, which we tried to limit with a comprehensive literature search using keywords, title and abstract terms, and few restrictions, as already used in previous systematic reviews [6, 30, 39]. Moreover, the definition of abrupt versus insidious onset of the event is somewhat subjective, as is that of transient versus sustained exposure and of some minor assumptions for the self-controlled case series; we tried to limit this issue by reaching consensus. Some strengths of this study are worth noting. We focused on studies involving medical databases because of the growing interest in the use of “big data” in healthcare research. We updated the Nordmann et al. review up to 2014 and, to rule out any PubMed indexing issue, we updated our literature search in July 2016.