Comparison of statistical methods used to meta-analyse results from interrupted time series studies: an empirical study
BMC Medical Research Methodology volume 24, Article number: 31 (2024)
Abstract
Background
The interrupted time series (ITS) is a robust design for evaluating public health and policy interventions or exposures when randomisation may be infeasible. Several statistical methods are available for the analysis and meta-analysis of ITS studies. We sought to empirically compare available methods when applied to real-world ITS data.
Methods
We sourced ITS data from published meta-analyses to create an online data repository. Each dataset was reanalysed using two ITS estimation methods. The level- and slope-change effect estimates (and standard errors) were calculated and combined using fixed-effect and four random-effects meta-analysis methods. We examined differences in meta-analytic level- and slope-change estimates, their 95% confidence intervals, p-values, and estimates of heterogeneity across the statistical methods.
Results
Of 40 eligible meta-analyses, data from 17 meta-analyses including 282 ITS studies were obtained (predominantly investigating the effects of public health interruptions (88%)) and analysed. We found that, on average, the meta-analytic effect estimates, their standard errors and between-study variances were not sensitive to meta-analysis method choice, irrespective of the ITS analysis method. However, across ITS analysis methods, for any given meta-analysis, there could be small to moderate differences in meta-analytic effect estimates, and important differences in the meta-analytic standard errors. Furthermore, the confidence interval widths and p-values for the meta-analytic effect estimates varied depending on the choice of confidence interval method and ITS analysis method.
Conclusions
Our empirical study showed that meta-analysis effect estimates, their standard errors, confidence interval widths and p-values can be affected by statistical method choice. These differences may importantly impact the interpretations and conclusions of a meta-analysis and suggest that the statistical methods are not interchangeable in practice.
Introduction
Systematic reviews may be used to collate and synthesise evidence on the effects of interventions targeted at populations (e.g., effects of a countrywide ban on smoking rates [1]) or the impacts of exposures (e.g., impacts of flooding events [2]). These reviews may include evidence beyond randomised trials by necessity, because trials may not be possible (in the case of exposures) or feasible (in the case of interventions targeted at populations) [3]. The interrupted time series (ITS) design may be considered for inclusion in such reviews because it is often used to examine population-level interventions and exposures when randomisation is not possible (e.g., for ethical reasons, when a policy targets an entire population). Furthermore, this design is considered a robust alternative for evaluating the impact of population-level interventions / exposures [4,5,6,7]. The results across the included ITS studies may be statistically combined using meta-analysis, providing a combined estimate of the intervention / exposure’s impact [8, 9].
In a classical ITS study, data are collected over time both before and after an intervention or exposure (henceforth referred to as an ‘interruption’), and aggregated using summary statistics over regular time intervals [10]. For example, in Ejlerskov et al. [11], the interruptions examined were policies implemented in six supermarkets that aimed to reduce the purchasing of less-healthy foods commonly displayed at supermarket checkouts. The outcome examined was the number of checkout food purchases, aggregated into four-weekly periods (Fig. 1, Additional file 1: Figure S1) [11]. While the ITS design may also be used to examine the effects of an intervention on individuals (in which multiple measurements are taken before and after the intervention for each individual), we do not consider the use of the ITS design in this context further [12, 13].
In the analysis of data from this classical ITS design, a commonly fitted model structure is the segmented linear model [14, 15]. This model allows estimation of separate trends before and after the interruption (referred to as the pre- and post-interruption trends). Hence an advantage of the ITS design is that the series acts as its own control; the pre-interruption trend can be projected into the post-interruption period, which, when modelled correctly, provides a counterfactual for what would have occurred in the absence of the interruption [5, 14, 15]. The impact of the interruption can then be estimated by comparing the counterfactual with the observed post-interruption trend. A variety of effect metrics can be calculated, including level-change (e.g., immediately following the interruption) and slope-change [7, 16].
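To make this concrete, a segmented linear model with level- and slope-change parameters can be fitted in a few lines. This is an illustrative sketch with invented, noise-free data, not the code used in the study:

```python
import numpy as np

def fit_segmented(y, interruption_time):
    """OLS fit of Y_t = b0 + b1*t + b2*D_t + b3*(t - T_I)*D_t + e_t."""
    t = np.arange(1, len(y) + 1, dtype=float)
    d = (t >= interruption_time).astype(float)  # post-interruption indicator D_t
    X = np.column_stack([np.ones_like(t), t, d, (t - interruption_time) * d])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
    return beta  # [intercept, pre-slope, level-change, slope-change]

# Invented series: pre-slope 0.5, level jump of 5 and slope-change of 1 at t = 11
t = np.arange(1, 21)
y = 2 + 0.5 * t + 5 * (t >= 11) + 1.0 * (t - 11) * (t >= 11)
b0, b1, b2, b3 = fit_segmented(y, 11)
```

Because the example data contain no noise, OLS recovers the level-change (5) and slope-change (1) exactly; with real, autocorrelated data the estimation methods discussed below become relevant.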
When estimating the regression parameters of a segmented linear model, characteristics of time series data need to be accounted for [17]. One of these characteristics is autocorrelation, which allows for the fact that values of near-neighbouring datapoints may be more similar (or different) than distant datapoints [7, 18, 19]. If autocorrelation is unaccounted for [e.g., when using ordinary least squares (OLS) in the presence of (likely) positive autocorrelation], the regression parameter standard errors may be underestimated [17, 20, 21]. Several estimation methods are available to account for autocorrelation [e.g., restricted maximum likelihood (REML), Prais-Winsten (PW)] [20, 22, 23].
Two-stage meta-analysis may be used to combine effects across ITS studies. In the first stage, segmented linear models are fitted to each ITS study to obtain interruption effect estimates and their standard errors [24, 25]. These estimates may be reported in the primary publications, or the systematic reviewer may reanalyse the time series data to obtain the required estimates [26]. Then, in the second stage, the effect estimates are combined using a meta-analysis model; commonly either a fixed-effect (common-effect) or random-effects meta-analysis model [24]. Fixed-effect meta-analysis weights studies by the inverse of the variance of their estimated effect, and hence analysis requires only the effect estimates and their standard errors. The random-effects method weights, however, additionally involve the between-study variance, a parameter which must be estimated and for which many estimators are available [24, 27,28,29]. Furthermore, there exist many confidence interval methods for the summary (combined) meta-analytic effect [30].
We previously undertook a numerical simulation study examining the performance of different meta-analysis methods to combine results from ITS studies with continuous outcomes, and how characteristics of the meta-analysis, ITS design, and method of analysis of the individual ITS studies modified the performance [31]. We examined ITS analysis and meta-analysis methods that are commonly used, or have been shown through numerical simulation to be preferable [20, 29, 30]. We found that all random-effects methods yielded confidence interval coverage for the summary effect close to the nominal level, irrespective of the ITS analysis method used. However, the between-study variance was overestimated in some scenarios [31]. In this companion study, we aimed to demonstrate empirically how the same methods compare when applied to real-world data, and answer the question: does statistical method choice importantly impact the meta-analysis results? Together, the simulation and empirical studies allow for a more complete understanding of which methods should be used in different scenarios. Specifically, our objectives were to: i) compare the meta-analysis estimates of the immediate level-change and slope-change, their standard errors, confidence intervals and p-values, and the estimates of between-study variance obtained from different meta-analysis and ITS analysis methods; and ii) create a repository of data from ITS studies.
Methods
Overview of the methods
An overview of the steps and corresponding sections is depicted in Fig. 2. In brief, we sourced ITS data from published meta-analyses (sections Identification of reviews and meta-analyses and Methods to obtain time series data) and reanalysed them using two ITS estimation methods (section Interrupted time series (ITS) analysis methods). The level-change and slope-change effect estimates (and their associated standard errors) were meta-analysed using a fixed-effect and four random-effects meta-analysis methods (section Meta-analysis methods). We compared the meta-analysis effect estimates, their standard errors, confidence intervals and p-values, and estimates of the between-study variance, across the meta-analysis methods (sections Analysis and meta-analysis of the ITS datasets and Comparison of results from different ITS analysis and meta-analysis methods).
Identification of reviews and meta-analyses
We sourced data for the present study from our previous methodological review that examined the statistical approaches used in reviews that include meta-analysis of ITS studies [26]. In brief, we searched eight electronic databases and included reviews containing at least one meta-analysis that included at least two ITS studies (using the review authors’ definition of an ITS). From each review, meta-analysis methods were examined for a single comparison-outcome (see the methodological review protocol for selection details [32]). In addition, reviews were eligible for the present study if:

1)
The review’s meta-analysis included at least two ITS studies that had at least three datapoints before and after an interruption and a clearly defined interruption timepoint; and

2)
The raw time series data were available. Data were classified as unavailable if, for example, the review authors had directly extracted effect estimates from the primary studies, or if it was unclear whether the review authors had directly extracted effect estimates or reanalysed the raw time series data.
Methods to obtain time series data
We sought the raw time series data using the following hierarchy of approaches:

1.
Sourced the time series data from the review (e.g., where the data were available in supplementary files).

2.
Contacted (via email) the corresponding author of the review, and requested the time points (and time unit, e.g., week, month), aggregate summary statistic (e.g., mean, rate, proportion), and time point(s) at which the interruption(s) occurred for each ITS.

3.
Digitally extracted time series data from published figures in the review using WebPlotDigitizer [33]. This data extraction tool has been shown to yield data that can be used to obtain accurate estimates of the effect estimates and standard errors from published ITS graphs [34].
We only sought time series data from authors of the reviews, and not authors of the primary studies, for reasons of feasibility.
Interrupted time series (ITS) analysis methods
Statistical model for an ITS analysis
We fitted the following segmented linear regression model to each of the included ITS studies [5]:

$${Y}_{t}={\beta }_{0}+{\beta }_{1}t+{\beta }_{2}{D}_{t}+{\beta }_{3}\left(t-{T}_{I}\right){D}_{t}+{\varepsilon }_{t} \qquad (1)$$

The continuous outcome at time \(t (t=1, \dots , T)\) is represented by \({Y}_{t}\). The series is divided into two segments, before and after the interruption. The interruption (I) occurs at time \({T}_{I}\). The segments are identified by \({D}_{t}\) (\({D}_{t}={1}_{\left(t\ge {T}_{I}\right)}\), i.e., equal to 1 in the post-interruption period) (Additional file 1: Figure S1). \({\beta }_{0}\) represents the intercept in the pre-interruption period, \({\beta }_{1}\) the pre-interruption slope, and \({\beta }_{2}\) and \({\beta }_{3}\) represent the interruption effects: the immediate level-change and the slope-change, respectively. The error term accommodates lag-1 (AR(1)) autocorrelation (\(\rho\)) via \({\varepsilon }_{t}= \rho {\varepsilon }_{t-1}+{w}_{t}\) (\({w}_{t}\sim N\left(0,1\right)\)), where \(\rho {\varepsilon }_{t-1}\) allows for correlation between the current and the previous time point. Longer lags (i.e., higher-order autocorrelation) can be modelled; however, we did not consider these here since we did not investigate longer lags in our companion numerical simulation study [31].
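A minimal sketch of generating a series under this model may help fix ideas. All parameter values below are invented for illustration:

```python
import numpy as np

def simulate_its(T=60, TI=31, b=(10.0, 0.2, 2.0, -0.1), rho=0.4, seed=1):
    """Generate one series from the segmented model with AR(1) errors."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, T + 1, dtype=float)
    d = (t >= TI).astype(float)            # post-interruption indicator D_t
    w = rng.normal(0.0, 1.0, T)            # white noise, w_t ~ N(0, 1)
    eps = np.zeros(T)
    eps[0] = w[0]
    for i in range(1, T):
        eps[i] = rho * eps[i - 1] + w[i]   # lag-1 autocorrelated errors
    b0, b1, b2, b3 = b
    return b0 + b1 * t + b2 * d + b3 * (t - TI) * d + eps

y = simulate_its()
```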
Estimation methods for ITS analysis
We used three statistical estimation methods for the analysis of the included ITS studies. These methods were selected because they are commonly used in practice [35], or have been shown to have improved statistical performance (via numerical simulation) [20, 22]. Briefly, the methods were:

Ordinary least squares (OLS) [17], which assumes that the model errors are uncorrelated between observations. In the presence of positive autocorrelation, which has been shown to frequently occur in time series data [36], this assumption is violated, leading to potential underestimation of the variances of the regression parameters [15, 37];

Prais-Winsten (PW), which is a generalised least-squares extension of OLS. PW estimation involves fitting the model using OLS and estimating lag-1 autocorrelation from the residuals, then transforming the data using the estimated autocorrelation and re-estimating the regression parameters [23]. The aim is to remove the autocorrelation from the errors, which may require multiple iterations for the estimated autocorrelation to converge [23]. Accounting for autocorrelation in this way has been shown to improve estimation of the regression parameter standard errors compared with OLS estimation in the presence of autocorrelation; however, the standard errors are still underestimated using PW, particularly when there are few datapoints [20].

Restricted Maximum Likelihood (REML), which is a form of maximum likelihood (ML) estimation, attempts to avoid the underestimation of the variance (and covariance) parameter estimates that can arise with ML estimation. REML involves separate estimation of the (co)variance parameters to account for the loss in degrees of freedom due to estimation of the regression parameters [22]. In the context of ITS studies, while both ML and REML directly estimate and adjust standard errors for autocorrelation, ML has been shown to yield less biased standard errors of the regression parameters compared with REML when autocorrelation was small, but positively biased standard errors when autocorrelation was large [20, 22].
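The iterated Prais-Winsten procedure described above can be sketched as follows. This is a simplified illustration, not the implementation used in the study:

```python
import numpy as np

def prais_winsten(X, y, tol=1e-6, max_iter=100):
    """Iterate: OLS fit -> lag-1 rho from residuals -> transform -> refit."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    rho, beta = 0.0, None
    for _ in range(max_iter):
        Xt, yt = X.copy(), y.copy()
        Xt[1:] -= rho * X[:-1]                 # quasi-difference rows 2..T
        yt[1:] -= rho * y[:-1]
        Xt[0] *= np.sqrt(1 - rho ** 2)         # Prais-Winsten first-row transform
        yt[0] *= np.sqrt(1 - rho ** 2)
        beta, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
        resid = y - X @ beta
        den = resid @ resid
        if den == 0:                           # perfect fit: nothing left to estimate
            break
        new_rho = (resid[1:] @ resid[:-1]) / den
        if abs(new_rho - rho) < tol:           # estimated autocorrelation converged
            rho = new_rho
            break
        rho = new_rho
    return beta, rho
```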
Meta-analysis methods
We used meta-analysis to combine the interruption effect estimates calculated using the methods in section Interrupted time series (ITS) analysis methods for each ITS study. We examined five meta-analysis methods, selected because they are frequently used in practice, or are known to have more favourable statistical properties.
Statistical models for metaanalysis
We examined a fixed-effect (common-effect) model and four random-effects models. The fixed-effect model is specified by:

$${\widehat{\beta }}_{mk}={\beta }_{m}+{\varepsilon }_{mk}$$

where it is assumed that each of the \(K\) included ITS studies provides an estimate (\({\widehat{\beta }}_{mk}\)) of a single true interruption effect common to all studies, \({\beta }_{m}\) (where \(m\) indicates the regression parameter of interest from Eq. 1, such as \({\beta }_{2}\) for immediate level-change), and any within-study error in the estimation is due to sampling variability alone, \({\varepsilon }_{mk}\sim N(0,{\sigma }_{mk}^{2})\).
The random-effects meta-analysis model is specified by:

$${\widehat{\beta }}_{mk}={\beta }_{m}^{*}+{\delta }_{mk}+{\varepsilon }_{mk}^{*}$$

where it is assumed that each of the \(K\) ITS studies provides an estimate (\({\widehat{\beta }}_{mk}\)) of a true interruption effect specific to the \({k}^{th}\) study (i.e., \({\beta }_{m}^{*}+{\delta }_{mk}\)), where \({\beta }_{m}^{*}\) represents the mean of the distribution of true interruption effects (for the \({m}^{th}\) regression parameter) and \({\delta }_{mk}\) represents a random effect in the \({k}^{th}\) ITS study; the random effects are assumed to be normally distributed about the mean with a between-study variance \({\tau }_{m}^{2}\). The within-study error in estimating the \({k}^{th}\) ITS study’s interruption effect from a sample of participants is represented by \({\varepsilon }_{mk}^{*}\sim N(0,{\sigma }_{mk}^{2})\).
Estimation methods for metaanalysis
The meta-analytic effect of the \({m}^{th}\) regression parameter is calculated as a weighted average of the \(K\) ITS study effect estimates, \({\widehat{\beta }}_{m}=\frac{\sum {W}_{mk}{\widehat{\beta }}_{mk}}{\sum {W}_{mk}}\) (with a variance of \(\frac{1}{\sum {W}_{mk}}\)). The weight given to the \({k}^{th}\) ITS study is the reciprocal of the within-study variance, \({W}_{mk,FE}=\frac{1}{{\sigma }_{mk}^{2}}\), when using a fixed-effect model, or the reciprocal of the sum of the within-study and between-study variances, \({W}_{mk,RE}=\frac{1}{{\sigma }_{mk}^{2}+{\widehat{\tau }}_{m}^{2}}\), when using a random-effects model. Different between-study variance (\({\widehat{\tau }}_{m}^{2}\)) estimators are available [29], as well as methods to calculate the confidence interval for the meta-analytic effect [30]. We used two between-study variance estimators and two confidence interval methods.
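The inverse-variance pooling just described can be sketched as follows (an illustrative sketch, not the study code); setting tau2 to zero recovers the fixed-effect weights:

```python
import numpy as np

def pool(estimates, variances, tau2=0.0):
    """Inverse-variance pooled effect; tau2 = 0 gives the fixed-effect model."""
    w = 1.0 / (np.asarray(variances, float) + tau2)        # study weights W_mk
    est = np.sum(w * np.asarray(estimates, float)) / np.sum(w)
    return est, 1.0 / np.sum(w)                            # pooled estimate, variance
```

For example, pooling three equally precise estimates simply averages them, with the pooled variance shrinking by a factor of three.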
We examined the following betweenstudy variance estimators:

DerSimonian and Laird (DL) [38], which is a moment-based between-study variance estimator derived from Cochran’s Q-statistic, was selected for evaluation in this study because it is commonly used in practice [26, 29]. However, DL is well known to yield biased estimates of the between-study variance in particular scenarios (i.e., small underlying between-study variance and few studies; or many studies and large underlying heterogeneity) [31, 39, 40];

Restricted maximum likelihood (REML), which is an iterative between-study variance estimator that attempts to correct for the negative bias associated with the ML estimator [29]. REML has been recommended as an alternative estimator because of its slightly improved performance compared with DL, and for this reason was selected for evaluation in this study [29, 40, 41].
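The moment-based DL estimator described above can be written down in a few lines (an illustrative sketch, not the implementation used in the study):

```python
import numpy as np

def dl_tau2(estimates, variances):
    """DerSimonian-Laird between-study variance from Cochran's Q."""
    y = np.asarray(estimates, float)
    v = np.asarray(variances, float)
    w = 1.0 / v                               # fixed-effect weights
    ybar = np.sum(w * y) / np.sum(w)          # fixed-effect pooled estimate
    Q = np.sum(w * (y - ybar) ** 2)           # Cochran's Q statistic
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    return max(0.0, (Q - df) / c)             # truncated at zero
```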
We examined two confidence interval methods for the metaanalytic effect, which can be used with both the DL and REML betweenstudy variance estimators:

The Wald-type normal distribution (WT) confidence interval method [42], which uses the standard normal distribution to calculate the confidence limits. This method maintains the assumption of normality of \({\widehat{\beta }}_{m}^{*}\) despite the within-study and between-study variances not being known and instead estimated [28, 30]. The WT method relies on large-sample approximations, which are not generally met in the context of meta-analysis due to the small number of included studies [43, 44]. This can lead to lower than nominal 95% confidence interval coverage, particularly when there are few included studies or the between-study variance is large [30].

The Hartung-Knapp [45] / Sidik-Jonkman [46] (HKSJ) confidence interval method, which attempts to overcome the assumption that the within-study variance is known and the between-study variance is accurately estimated, in scenarios where these conditions are unlikely to be met (e.g., meta-analyses with few studies of small sample sizes). The method involves making a small-sample adjustment to the meta-analysis standard error and uses the t-distribution (with K−1 degrees of freedom) in the calculation of the confidence limits. This adjustment yields wider confidence intervals than the WT method, except when there are few studies and the estimated between-study variance is zero [29].
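The contrast between the two interval methods can be sketched as follows (illustrative only, not the study code; scipy supplies the normal and t quantiles):

```python
import numpy as np
from scipy import stats

def wt_and_hksj_ci(y, v, tau2, alpha=0.05):
    """Return (WT, HKSJ) CIs for a random-effects pooled effect."""
    y = np.asarray(y, float)
    w = 1.0 / (np.asarray(v, float) + tau2)
    k = len(y)
    mu = np.sum(w * y) / np.sum(w)
    # Wald-type: normal quantile, usual inverse-variance standard error
    se_wt = np.sqrt(1.0 / np.sum(w))
    z = stats.norm.ppf(1 - alpha / 2)
    wt = (mu - z * se_wt, mu + z * se_wt)
    # HKSJ: small-sample variance adjustment + t quantile with K-1 df
    q = np.sum(w * (y - mu) ** 2) / (k - 1)
    se_hksj = np.sqrt(q / np.sum(w))
    tq = stats.t.ppf(1 - alpha / 2, k - 1)
    hksj = (mu - tq * se_hksj, mu + tq * se_hksj)
    return wt, hksj
```

With the heterogeneous toy data below, the HKSJ interval is wider than the WT interval, consistent with the behaviour described above.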
Analysis and meta-analysis of the ITS datasets
Prior to fitting the models, we excluded ITS from the meta-analyses where the study i) did not meet our minimum required number of datapoints, or ii) had a large proportion of time series datapoints that were zero (i.e., greater than 40%), such that it was not reasonable to assume that the error term would be normally distributed. In addition, we removed any control series that were included in the original meta-analysis, because our interest was in the interrupted series only. Furthermore, we excluded segments of studies that had multiple interruptions; specifically, we only included the first interruption (and the adjacent segments). Additional file 1: Table S1 includes all modifications, with justifications. Modifications were discussed and agreed upon at team meetings (including authors EK, SLT, ABF, AK and JEM).
We fitted a segmented linear regression model (section Statistical model for an ITS analysis, Eq. 1) to each ITS study and estimated the regression parameters (immediate level-change (\({\beta }_{2}\)) and slope-change (\({\beta }_{3}\))) using both OLS and REML (section Estimation methods for ITS analysis) (Fig. 2). If REML failed to converge or to yield an estimate of autocorrelation between −1 and 1, we used PW, and where PW failed, we used OLS. Given the outcomes varied across the meta-analyses, we standardised the ITS study effect estimates (immediate level-change, slope-change) prior to meta-analysis, so that the resulting meta-analysis effect estimates were standardised and comparable across meta-analyses. The ITS effect estimates obtained via REML, PW and OLS were standardised by dividing them (and their standard errors) by the root mean square error estimated from the OLS analysis. Slope-change effect estimates were then rescaled, if required, to reflect the standardised slope-change per month by multiplying or dividing by an appropriate factor (e.g., a slope-change calculated from a series with yearly timepoints was divided by 12 to reflect the slope-change per month).
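The standardisation step can be sketched as follows. This is a simplified illustration; the function signature and the per-month factor handling are our own choices, not the study code:

```python
import numpy as np

def standardise(beta2, se2, beta3, se3, ols_residuals, n_params=4,
                months_per_interval=12):   # e.g. yearly data: divide by 12
    """Divide estimates and SEs by the OLS RMSE; rescale slope-change per month."""
    rmse = np.sqrt(np.sum(ols_residuals ** 2) / (len(ols_residuals) - n_params))
    f = months_per_interval
    return (beta2 / rmse, se2 / rmse,       # standardised level-change, SE
            beta3 / rmse / f, se3 / rmse / f)  # standardised slope-change per month, SE
```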
The standardised ITS study level-change and slope-change estimates were then meta-analysed (separately) using the five meta-analysis methods (section Estimation methods for meta-analysis; Fig. 2). We standardised the direction of these meta-analysis effects so that, for all, a positive estimate reflected a beneficial impact of the interruption. This was achieved by multiplying the meta-analysis estimates where a negative estimate was beneficial (e.g., a decrease in fatality rates) by −1, to reverse the direction of interpretation.
We undertook sensitivity analyses to investigate whether the results were robust to our choice of threshold for excluding ITS based on the proportion of datapoints that were zero. For the sensitivity analysis, we additionally excluded ITS from the meta-analyses where the study had greater than 30% but less than 40% of time series datapoints that were zero. We then repeated the above analyses and informally compared the results.
All analyses were performed using Stata version 16.1 [47] and results were visualised using R version 4.1.0 (dplyr [48], foreign [49], ggplot2 [50]). Code and the repository of data are available in the Monash University online repository, Bridges [51].
Comparison of results from different ITS analysis and meta-analysis methods
We compared meta-analysis effect estimates (i.e., immediate level-change and slope-change), and their standard errors, between each of the combinations of ITS analysis methods and meta-analysis methods. For each pairwise comparison between the combinations, we calculated (and tabulated) the average of the differences between the estimates (i.e., the mean difference: the sum of the differences between the estimates yielded by the two methods being compared, divided by the number of meta-analyses, 17) and the limits of agreement (calculated as the mean difference ± 1.96 × standard deviation of the differences) [52]. The limits of agreement provide a range within which most of the differences between estimates will lie [52]. For the standard errors, we first log-transformed these to remove the relationship between the variability of the differences and the magnitude of the standard errors [52]. We used Bland-Altman scatter plots to visualise the agreement, whereby, for each pairwise comparison between combinations, we plotted the difference between the estimates against their average [52].
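The Bland-Altman summaries can be computed as follows (an illustrative sketch, not the study code):

```python
import numpy as np

def limits_of_agreement(a, b):
    """Mean difference and 95% limits of agreement for paired estimates."""
    d = np.asarray(a, float) - np.asarray(b, float)
    md = d.mean()                         # mean of the paired differences
    sd = d.std(ddof=1)                    # sample SD of the differences
    return md, (md - 1.96 * sd, md + 1.96 * sd)
```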
We compared confidence interval widths between each of the combinations of ITS analysis and meta-analysis methods. For each pairwise comparison, we plotted the ratio of the confidence interval widths, scaled such that the reference confidence interval width spanned −0.5 to 0.5 (following the approach of Turner et al. [36]).
We compared the estimates of between-study variance (\({\widehat{\tau }}^{2}\)) between each combination of ITS analysis methods and between-study variance estimators. For each meta-analysis and pairwise comparison, we calculated (and tabulated) the median and interquartile range (IQR) of the differences between the estimates of the between-study variance.
We compared the p-values of the meta-analytic level-change and slope-change estimates between each of the combinations of ITS analysis and meta-analysis methods. We categorised the p-values using the conventionally used (though not recommended) statistical significance threshold of 0.05. The percentage of meta-analyses where there was agreement in the categories of statistical significance was calculated; namely, the percentage of meta-analyses where the p-value for the effect estimate from both methods was < 0.05 or \(\ge\) 0.05. Agreement between the statistical methods in the conclusion about statistical significance was further quantified using the kappa statistic, where we used the following adjectives to describe agreement: moderate agreement for a kappa value of 0.41–0.6, substantial agreement for a value of 0.61–0.8, and almost perfect agreement for a value of 0.81–1.0 [53].
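Cohen's kappa on the binary "p < 0.05" classification can be sketched as (illustrative, not the study code):

```python
import numpy as np

def kappa_binary(sig_a, sig_b):
    """Cohen's kappa for two binary significance classifications."""
    a = np.asarray(sig_a, bool)
    b = np.asarray(sig_b, bool)
    po = np.mean(a == b)                    # observed agreement
    pa, pb = a.mean(), b.mean()
    pe = pa * pb + (1 - pa) * (1 - pb)      # agreement expected by chance
    return (po - pe) / (1 - pe) if pe < 1 else 1.0
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance.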
Results
Of the 54 reviews included in the source methodological review [26], 40 met the additional eligibility criteria for the present study (Fig. 3). We extracted data from the supplementary material of two reviews, and emailed the remaining 38 review authors. Of these, 35 emails were successfully delivered, from which 13 authors provided data. For a further two reviews, it was possible to digitally extract data from the ITS graphs included in the reviews. This resulted in the inclusion of 17 meta-analyses with 390 ITS. We further excluded 108 ITS from these meta-analyses for a variety of reasons (Fig. 3), leaving 282 ITS (from 17 meta-analyses) for our primary analyses.
Characteristics of the included meta-analyses and ITS studies
The reviews were published between 2005 and 2019. Most reviews investigated the effects of public health interruptions (88%, 15/17) [e.g., examining the impact of insecticide space-spraying strategies on the incidence of malaria], while two examined the effects of crime interventions (12%, 2/17) (Table 1). The interruptions were predominantly targeted at the population level (59%, 10/17) [e.g., statewide legislation] or organisational level (29%, 5/17) [e.g., hospital-wide policy]. The 17 included meta-analyses had a median of 11 included ITS studies (IQR: 5.0–15.0, range: 3–62). The median series length of the ITS studies was 52 (IQR: 27–61, range: 7–195, n = 282), while the average series length at the meta-analysis level had a median of 40 (IQR: 22–59, range: 9.7–165.3). The time interval used for aggregation of the datapoints was most commonly months (11/17, 65%), followed by years (4/17, 24%). The outcome types were predominantly rates (6/17, 35%) and counts (5/17, 29%). The autocorrelation of the ITS studies estimated by REML ITS analysis had a median of 0.22 (IQR: 0.00, 0.48, n = 282), while the average estimate of autocorrelation at the meta-analysis level had a median of 0.17 (IQR: 0.13, 0.42).
Convergence of ITS analyses and meta-analyses using REML
Of the 282 ITS that were analysed using REML, 255 (90%) converged. For the 27 that did not converge, PW was used, of which 4/27 (15%) failed to converge. OLS was used for these four. All meta-analyses using REML converged.
Comparison of results from the different meta-analysis and ITS analysis method combinations
Level- and slope-change meta-analytic effect estimates
When fixed-effect meta-analysis was fitted, on average, REML ITS analysis yielded slightly larger estimated immediate level-changes compared with OLS (depicted by the horizontal solid orange line, representing the average of the differences, being greater than zero in Fig. 4, solid red box; Table 2), but with wide limits of agreement (depicted by the horizontal dashed orange lines being wide), largely due to the influence of one outlying estimated level-change using REML. The different between-study variance estimators (i.e., DL or REML) had no impact on the immediate level-change within ITS analysis method (i.e., OLS ITS analysis with the DL between-study variance estimator vs OLS ITS with the REML estimator; REML ITS analysis with the DL between-study variance estimator vs REML ITS with the REML estimator), as depicted by the horizontal solid orange line sitting on zero, and the limits of agreement being close to zero, in Fig. 4 (solid blue boxes). Furthermore, the estimated meta-analytic immediate level-changes were, on average, similar across the combinations of between-study variance estimators and ITS analysis methods (Fig. 4, solid black boxes); however, the limits of agreement (which were approximately ±0.33) showed that methods could yield small to moderate differences in estimates of level-change for a given meta-analysis. The patterns were similar for the effect estimates of the meta-analytic slope-change per month (see Fig. 4, dashed boxes).
Standard errors of the level- and slope-change meta-analytic effects
The standard errors of the meta-analytic level-change were most influenced by the meta-analysis model, with the standard errors being substantially larger when a random-effects model was fitted (as depicted by the horizontal solid orange line being greater or less than zero, depending on the order of the comparisons, in Fig. 5, yellow boxes, and Table 3). When random-effects meta-analysis methods were fitted, on average, there were no important differences in the standard errors of the meta-analytic level-change (depicted by the horizontal solid orange line sitting on zero in Fig. 5) across ITS analysis methods (black boxes), between-study variance estimators (blue boxes), or where a small-sample adjustment was made to the meta-analysis standard error (as occurs with the HKSJ method) (red boxes). However, the limits of agreement were wide across ITS analysis methods (black boxes) and where there was a small-sample adjustment (red boxes); for example, the limits of agreement for the comparison of REML ITS vs OLS ITS analysis (both with the REML between-study variance estimator and HKSJ confidence interval method) suggest that the meta-analysis estimate of the standard error is likely to be between 37% smaller and 63% larger when using REML ITS compared with OLS ITS analysis (Table 3). The patterns were similar for the standard errors of the meta-analytic slope-change per month (dashed boxes).
Confidence intervals of the level- and slope-change meta-analytic effects
The confidence interval widths of the random-effects meta-analytic level-change were similar irrespective of the ITS analysis method or between-study variance estimator (as depicted by the confidence intervals being the width of the reference rectangle in Fig. 6, black and blue boxes; see Additional file 1: Figure S3 for random-effects meta-analysis comparisons only). However, the confidence interval widths were mostly similar or wider when the HKSJ method was used compared with the WT confidence interval method (as depicted by confidence intervals being the width of the reference rectangle, or wider, in Fig. 6, red boxes). The confidence intervals of the random-effects meta-analytic slope-change per month were more variable than the level-change confidence interval widths; however, the patterns were the same (dashed boxes).
p-values
Pairwise comparisons of the meta-analytic level-change statistical significance between REML ITS analysis and OLS ITS analysis (keeping the meta-analysis method constant) showed substantial to almost perfect agreement, irrespective of the meta-analysis methods used (Table 4 and Additional file 1: Table S2). Similarly, the statistical significance agreement between comparisons of between-study variance estimators, and comparisons of confidence interval methods, ranged from substantial to almost perfect. However, agreement was systematically (slightly) lower when REML ITS analysis was used compared with OLS. In addition, the statistical significance agreement was lower when different confidence interval methods were used; this reduction in agreement was more pronounced when REML ITS analysis was used compared with OLS. The patterns were similar for the statistical significance agreement for the meta-analytic slope-change per month, which ranged from moderate to almost perfect agreement across most pairwise comparisons, irrespective of the statistical methods used.
Estimates of between-study variance
We compared the between-study variance estimates yielded by different combinations of ITS analysis methods (OLS and REML) and between-study variance estimators (DL and REML). The median and IQR of the pairwise differences in between-study variance estimates indicated no substantial differences (Fig. 7 and Table 5).
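Of the two estimators compared, DL has a closed form (a moment estimator truncated at zero), while REML requires iterative maximisation and is not shown. A minimal sketch of the DL estimator with hypothetical effect estimates and variances:

```python
def dersimonian_laird(effects, variances):
    """DerSimonian-Laird moment estimator of the between-study
    variance tau^2, truncated at zero."""
    w = [1 / v for v in variances]
    sw = sum(w)
    est = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q statistic around the fixed-effect estimate
    q = sum(wi * (yi - est) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)

# hypothetical level-change estimates and variances from 4 ITS studies
tau2 = dersimonian_laird([0.1, 0.3, 0.5, 0.2], [0.02, 0.03, 0.02, 0.04])
```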
Sensitivity analysis
In our sensitivity analysis, we excluded 16 ITS studies from 5 meta-analyses. The results of the sensitivity analysis did not differ substantively from the primary analyses. Details of the differences between the meta-analyses in the primary analysis and the sensitivity analysis are presented in Additional file 1: Table S3; summary results are provided in Additional file 1: Appendix 3.
Repository of ITS data
The ITS datasets analysed in this study for which the authors gave consent (16 of 17 meta-analyses) are provided in an online repository: https://doi.org/10.26180/21280791 [51]. For each dataset, we describe the intervention and outcome examined and any changes made to the original meta-analysis to suit our purposes, and for each ITS study we indicate the time, the time interval, the time of the interruption, the segment of the segmented linear regression model, the observed value and its outcome type, and whether the study was excluded from our sensitivity analysis.
Discussion
Summary and discussion of key findings
To our knowledge, no previous studies have empirically examined the implications of different statistical methods for ITS analysis and meta-analysis using real-world ITS data. We created a repository of 17 meta-analyses including 282 ITS studies. We re-analysed each ITS study using two ITS analysis methods, and then meta-analysed the level-change and slope-change effects using five meta-analysis methods. We compared the impact of using different statistical methods on the meta-analytic level- and slope-change effect estimates, standard errors, confidence intervals and p-values. The results of our empirical study provide insight into the behaviour of ITS analysis and meta-analysis methods when applied to real-world ITS data.
When fixed-effect meta-analysis was used, our results indicated that there may be differences in the estimated meta-analytic effect for a given meta-analysis. However, the immediate level-change effect estimates yielded by REML ITS analysis were only slightly larger, on average, compared with OLS, which was likely driven by a single meta-analysis result. In addition, while on average we found unimportant differences in the estimated standard errors of the meta-analytic effects between the ITS analysis methods, for a given meta-analysis there could be important differences. Estimated standard errors of the fixed-effect meta-analytic effects have been shown (via numerical simulation [31]) to differ importantly between the ITS methods for short series or where the underlying autocorrelation tends to be larger (i.e., at least 0.4). In the present dataset, some of the series were short and had autocorrelation greater than 0.4, potentially explaining the differences.
When random-effects meta-analysis was used, we found that on average the estimates of the random-effects meta-analytic effects of level- and slope-changes, and their standard errors, were not impacted by the choice of random-effects meta-analysis method, irrespective of the ITS analysis method used. As expected, however, the standard errors were substantially larger compared with a fixed-effect model, due to the between-ITS variance (which was commonly estimated as greater than zero) being accounted for in the random-effects model. Furthermore, we found that the between-study variance estimates did not systematically differ by ITS analysis method or between-study variance estimator, which has been observed in other studies [29, 31]. However, the confidence interval method was shown to impact the confidence interval widths and statistical significance of the meta-analytic level-changes. This was primarily driven by the use of the t-distribution in the calculation of the confidence interval limits when using the HKSJ confidence interval method, rather than by the small-sample adjustment to the meta-analytic standard error. The consequence of wider confidence intervals and more conservative p-values when using HKSJ compared with WT is that the conclusions drawn from the meta-analysis may differ.
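The widening of the standard errors under a random-effects model follows directly from the inverse-variance weights: adding tau-squared to each within-study variance shrinks every weight, so the pooled variance grows. A sketch with hypothetical within-study variances and an assumed tau-squared of 0.05:

```python
import math

def pooled_se(variances, tau2=0.0):
    """Standard error of the inverse-variance pooled effect:
    tau2 = 0 gives the fixed-effect SE, tau2 > 0 the random-effects SE."""
    return math.sqrt(1 / sum(1 / (v + tau2) for v in variances))

variances = [0.04, 0.05, 0.03, 0.06]         # hypothetical within-study variances
se_fixed = pooled_se(variances)              # fixed-effect pooled SE
se_random = pooled_se(variances, tau2=0.05)  # assumed between-study variance
```

Whenever the estimated tau-squared is greater than zero, se_random exceeds se_fixed, which is the behaviour observed across the re-analysed meta-analyses.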
Strengths and limitations
Our study has several strengths. We examined ten statistical analysis combinations, which we compared using the metrics typically important to researchers undertaking meta-analysis, i.e., the meta-analytic point estimates, between-study variance estimates, confidence intervals, and p-values. Furthermore, the included systematic reviews and meta-analyses varied by the types of interruptions examined, the outcomes, the number of included studies per meta-analysis, and the number of datapoints per ITS study. The ITS datasets have been made publicly available in an online repository, facilitating future methodological and statistical research.
Our study has several limitations. We were able to obtain raw ITS data from 17 of the 40 reviews included in our methodological review. While a small number of datasets is common in empirical methodological research [46, 54,55,56,57], this hinders examination of factors that may modify how the methods compare (e.g., the number of studies per meta-analysis). Furthermore, with a small number of datasets, outliers have more influence and parameters (such as the limits of agreement) are estimated with more uncertainty. In addition, we made several assumptions when analysing the ITS studies which may not hold (e.g., assuming count outcomes were continuous); we did not adjust for potential confounders (that may have been adjusted for in the original analysis); and we fitted a segmented linear regression model with lag-1 autocorrelation (which may have differed from the original analysis and may not have provided the best fit). However, given feasibility constraints, and because our interest was in comparing the statistical methods rather than addressing the research questions examined in the original meta-analyses, we did not assess the fit of, or modify, the models for the 282 included ITS studies.
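For readers unfamiliar with the estimands, the segmented linear regression quantities (level change and slope change at the interruption) can be sketched with plain OLS, which, unlike the REML approach used in this study, ignores the lag-1 autocorrelation. Fitting separate pre- and post-interruption lines gives the same point estimates as the fully interacted segmented model; the series below is a noiseless toy example, not study data:

```python
def linfit(t, y):
    """Simple least-squares line: returns (intercept, slope)."""
    n = len(t)
    tbar, ybar = sum(t) / n, sum(y) / n
    slope = (sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
             / sum((ti - tbar) ** 2 for ti in t))
    return ybar - slope * tbar, slope

def segmented_ols(y, interruption):
    """Level and slope change at the interruption via separate pre/post
    OLS lines. The level change is the gap between the post line and the
    projected pre line at the first post-interruption time point."""
    t = list(range(len(y)))
    a0, b0 = linfit(t[:interruption], y[:interruption])
    a1, b1 = linfit(t[interruption:], y[interruption:])
    level_change = (a1 + b1 * interruption) - (a0 + b0 * interruption)
    slope_change = b1 - b0
    return level_change, slope_change

# noiseless toy series: slope 0.5, then a +5 level and +1 slope change at t = 10
y = [2 + 0.5 * t for t in range(10)] + [12 + 1.5 * (t - 10) for t in range(10, 20)]
lc, sc = segmented_ols(y, interruption=10)
```

Accounting for autocorrelation (as the REML ITS method does) changes the standard errors of these estimates far more than the point estimates, which is consistent with the patterns reported above.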
Implications for practice
We have demonstrated that the statistical methods for ITS analysis and meta-analysis do not, on average, impact the meta-analytic level- and slope-change effect estimates, their standard errors or the between-study variance estimates. However, across ITS analysis methods, for any given meta-analysis, there could be small to moderate differences in meta-analytic effect estimates, and important differences in the meta-analytic standard errors. Furthermore, the confidence intervals and p-values may be impacted. This demonstrates that in practice the statistical method choices we have investigated may materially impact the results and conclusions, and the methods should therefore not be considered interchangeable. In this circumstance, numerical simulation studies provide the best evidence as to which methods are optimal under different scenarios (e.g., meta-analyses including short series), and we refer readers to our companion numerical simulation study for recommendations [31]. Furthermore, given that the choice of methods can impact the results, it is even more important that the specific ITS analysis and meta-analysis methods used are reported. A systematic review examining the statistical methods used in meta-analyses of ITS studies found that while the ITS estimation method could almost always be determined (in 95% of reviews), whether and how autocorrelation was accounted for could only be determined in 59% of reviews, and the between-study variance estimator and the confidence interval method for the combined effect could only be determined in 60% and 57% of the examined meta-analyses, respectively [26]. Hence, much needs to be improved in the reporting of meta-analyses of ITS studies.
Implications for future research
Our ITS data repository may be expanded, facilitating other methodological and statistical research. Our research could be extended to examine the impact of ITS methods for analysing other outcome types, particularly count outcomes, given their frequent use in ITS studies. Furthermore, our examinations could be expanded to accommodate higher-order autocorrelation and seasonal patterns. In addition, we have not examined the impacts of the statistical methods on meta-analytic effect prediction intervals, which provide a predicted range for the true interruption effect in an individual study and are a critical tool for decision-making [58]. Understanding the implications of statistical method choice on prediction intervals is an important next step, given the known impact of the ITS analysis methods on the estimation of between-study variance [31].
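The dependence of prediction intervals on the between-study variance is direct: in the common Higgins-Thompson-Spiegelhalter form, the interval half-width inflates the pooled standard error by tau-squared, so any method choice that shifts the tau-squared estimate shifts the interval. A sketch with hypothetical pooled results; the t quantile is hard-coded rather than computed:

```python
import math

def prediction_interval(est, se, tau2, t_crit):
    """Approximate 95% prediction interval for the effect in a new
    study: the half-width combines the between-study variance with the
    pooled SE. t_crit should be t_{0.975, k-2}; passed in to stay stdlib-only."""
    half = t_crit * math.sqrt(tau2 + se ** 2)
    return est - half, est + half

# hypothetical pooled results from k = 7 studies (df = 5, t ~= 2.571)
pi_lo, pi_hi = prediction_interval(est=0.30, se=0.10, tau2=0.04, t_crit=2.571)
```

In this toy example the pooled effect's confidence interval excludes zero, yet the prediction interval spans it, because tau-squared dominates the pooled standard error.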
Conclusions
We found, on average, minimal impact of statistical method choice on the meta-analytic effect estimates, their standard errors or the between-study variance estimates. However, across ITS analysis methods, for any given meta-analysis, there could be small to moderate differences in meta-analytic effect estimates, and important differences in the meta-analytic standard errors. Furthermore, we found that confidence intervals and p-values could vary according to the choice of statistical method. These differences may materially impact the results and conclusions and suggest that the statistical methods are not interchangeable in practice.
Availability of data and materials
The datasets and code used in this empirical study are available in the Monash Bridges repository: https://doi.org/10.26180/21280791 [51].
Abbreviations
AR(1): Lag-1 autocorrelation
ITS: Interrupted Time Series
OLS: Ordinary Least Squares
REML: REstricted Maximum Likelihood
PW: Prais-Winsten
DL: DerSimonian and Laird between-study variance estimator
WT: Wald-type confidence interval method
HKSJ: Hartung-Knapp/Sidik-Jonkman confidence interval method
References
Vicedo-Cabrera AM, Schindler C, Radovanovic D, et al. Benefits of smoking bans on preterm and early-term births: a natural experimental design in Switzerland. Tob Control. 2016;25:e135–41. https://doi.org/10.1136/tobaccocontrol2015052739.
Zhang N, Song D, Zhang J, et al. The impact of the 2016 flood event in Anhui Province, China on infectious diarrhea disease: An interrupted time-series study. Environ Int. 2019;127:801–9. https://doi.org/10.1016/j.envint.2019.03.063.
Reeves BC, Deeks JJ, Higgins JPT, et al. Cochrane Handbook for Systematic Reviews of Interventions version 6.3. Chapter 24: Including non-randomized studies on intervention effects. 6.3 ed.: Cochrane, 2022.
Shadish WR, Cook TD, Campbell DT. Experimental and quasi-experimental designs for generalized causal inference. 2002.
Kontopantelis E, Doran T, Springate DA, Buchan I, Reeves D. Regression based quasi-experimental approach when randomisation is not an option: interrupted time series analysis. BMJ. 2015;350:h2750. https://doi.org/10.1136/bmj.h2750.
Biglan A, Ary D, Wagenaar AC. The Value of Interrupted Time-Series Experiments for Community Intervention Research. Prev Sci. 2000;1:31–49. https://doi.org/10.1023/a:1010024016308.
Lopez Bernal J, Cummins S, Gasparrini A. Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol. 2017;46:348–55. https://doi.org/10.1093/ije/dyw098.
Velicer WF. Time series models of individual substance abusers. NIDA Res Monogr. 1994;142:264–301.
Gebski V, Ellingson K, Edwards J, et al. Modelling interrupted time series to evaluate prevention and control of infection in healthcare. Epidemiol Infect. 2012;140:2131–41. https://doi.org/10.1017/S0950268812000179.
Thyer BA. Interrupted Time Series Designs. In: Thyer BA, editor (online edition). Quasi-Experimental Research Designs. New York: Oxford University Press, Inc.; 2012. p. 107–26.
Ejlerskov KT, Sharp SJ, Stead M, et al. Supermarket policies on less-healthy food at checkouts: Natural experimental evaluation using interrupted time series analyses of purchases. PLOS Med. 2018;15:e1002712. https://doi.org/10.1371/journal.pmed.1002712.
Gast DL, Ledford JR. Single subject research methodology in behavioral sciences. New York: Routledge; 2009.
Kazdin AE. Single-case experimental designs: Evaluating interventions in research and clinical practice. Behav Res Ther. 2019;117:3–17. https://doi.org/10.1016/j.brat.2018.11.015.
Taljaard M, McKenzie JE, Ramsay CR, et al. The use of segmented regression in analysing interrupted time series studies: an example in pre-hospital ambulance care. Implement Sci. 2014;9:77. https://doi.org/10.1186/17485908977.
Wagner AK, Soumerai SB, Zhang F, et al. Segmented regression analysis of interrupted time series studies in medication use research. J Clin Pharm Ther. 2002;27:299–309. https://doi.org/10.1046/j.13652710.2002.00430.x.
Schaffer AL, Dobbins TA, Pearson SA. Interrupted time series analysis using autoregressive integrated moving average (ARIMA) models: a guide for evaluating large-scale health interventions. BMC Med Res Methodol. 2021;21:58. https://doi.org/10.1186/s12874021012358.
Kutner MH, Nachtsheim CJ, Neter J, et al. Applied linear statistical models. 1996.
Huitema BE, McKean JW. Identifying autocorrelation generated by various error processes in interrupted time-series regression designs: A comparison of AR1 and portmanteau tests. Educ Psychol Meas. 2007;67:447–59. https://doi.org/10.1177/0013164406294774.
Lopez Bernal J, Cummins S, Gasparrini A. Corrigendum to: Interrupted time series regression for the evaluation of public health interventions: a tutorial. Int J Epidemiol. 2020;49:1414. https://doi.org/10.1093/ije/dyaa118.
Turner SL, Forbes AB, Karahalios A, et al. Evaluation of statistical methods used in the analysis of interrupted time series studies: a simulation study. BMC Med Res Methodol. 2021;21:181. https://doi.org/10.1186/s12874021013640.
Chatterjee S, Simonoff JS. Time Series Data and Autocorrelation. In: Handbook of Regression Analysis. 2012. p. 81–109.
Cheang WK, Reinsel GC. Bias Reduction of Autoregressive Estimates in Time Series Regression Model through Restricted Maximum Likelihood. J Am Stat Assoc. 2000;95:1173–84. https://doi.org/10.2307/2669758.
Judge GG. The Theory and practice of econometrics. 2nd ed. New York: Wiley; 1985. p. xxix–1019.
McKenzie JE, Beller EM, Forbes AB. Introduction to systematic reviews and meta-analysis. Respirology. 2016;21:626–37. https://doi.org/10.1111/resp.12783.
Ramsay C, Grimshaw JM, Grilli R. Meta-analysis of interrupted time series designs: what is the effect size? In: 9th Annual Cochrane Colloquium, Lyon. 2001.
Korevaar E, Karahalios A, Turner SL, et al. Methodological systematic review recommends improvements to conduct and reporting when meta-analysing interrupted time series studies. J Clin Epidemiol. 2022. https://doi.org/10.1016/j.jclinepi.2022.01.010.
Deeks J, Higgins J, Altman D, et al. Chapter 10: Analysing data and undertaking meta-analyses. In: Higgins J, Thomas J, Chandler J, et al., editors. Cochrane Handbook for Systematic Reviews of Interventions. Cochrane. 2019.
Brockwell SE, Gordon IR. A comparison of statistical methods for meta-analysis. Stat Med. 2001;20:825–40. https://doi.org/10.1002/sim.650.
Veroniki AA, Jackson D, Viechtbauer W, et al. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Meth. 2016;7:55–79. https://doi.org/10.1002/jrsm.1164.
Veroniki AA, Jackson D, Bender R, et al. Methods to calculate uncertainty in the estimated overall effect size from a random-effects meta-analysis. Res Synth Meth. 2019;10:23–43. https://doi.org/10.1002/jrsm.1319.
Korevaar E, Turner SL, Forbes AB, et al. Evaluation of statistical methods used to meta-analyse results from interrupted time series studies: A simulation study. Res Synth Methods. 2023. https://doi.org/10.1002/jrsm.1669.
Korevaar E, Karahalios A, Forbes AB, et al. Methods used to meta-analyse results from interrupted time series studies: A methodological systematic review protocol. F1000Res. 2020;9:110. https://doi.org/10.12688/f1000research.22226.3.
Rohatgi A. Webplotdigitizer: Version 4.5. 4.5 ed. 2021.
Turner SL, Korevaar E, Cumpston M, et al. Effect estimates can be accurately calculated with data digitally extracted from interrupted time series graphs. Res Syn Meth. 2023;14(4):622–38. https://doi.org/10.1002/jrsm.1646.
Turner SL, Karahalios A, Forbes AB, et al. Design characteristics and statistical methods used in interrupted time series studies evaluating public health interventions: a review. J Clin Epidemiol. 2020;122:1–11. https://doi.org/10.1016/j.jclinepi.2020.02.006.
Turner SL, Karahalios A, Forbes AB, et al. Comparison of six statistical methods for interrupted time series studies: empirical evaluation of 190 published series. BMC Med Res Methodol. 2021;21:134. https://doi.org/10.1186/s1287402101306w.
Hudson J, Fielding S, Ramsay CR. Methodology and reporting characteristics of studies using interrupted time series design in healthcare. BMC Med Res Methodol. 2019;19:137. https://doi.org/10.1186/s128740190777x.
DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7:177–88. https://doi.org/10.1016/01972456(86)900462.
Novianti PW, Roes KC, van der Tweel I. Corrigendum to "Estimation of between-trial variance in sequential meta-analyses: A simulation study" [Contemp Clin Trials 37/1 (2014) 129–138]. Contemp Clin Trials. 2015;41:335. https://doi.org/10.1016/j.cct.2015.03.004.
Novianti PW, Roes KCB, van der Tweel I. Estimation of between-trial variance in sequential meta-analyses: A simulation study. Contemp Clin Trials. 2014;37:129–38. https://doi.org/10.1016/j.cct.2013.11.012.
Langan D, Higgins JPT, Jackson D, et al. A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Res Synth Meth. 2019;10:83–98. https://doi.org/10.1002/jrsm.1316.
Page MJ, Altman DG, McKenzie JE, et al. Flaws in the application and interpretation of statistical analyses in systematic reviews of therapeutic interventions were common: a cross-sectional analysis. J Clin Epidemiol. 2018;95:7–18. https://doi.org/10.1016/j.jclinepi.2017.11.022.
Davey J, Turner RM, Clarke MJ, et al. Characteristics of meta-analyses and their component studies in the Cochrane Database of Systematic Reviews: a cross-sectional, descriptive analysis. BMC Med Res Methodol. 2011;11:160. https://doi.org/10.1186/1471228811160.
Page MJ, Shamseer L, Altman DG, et al. Epidemiology and Reporting Characteristics of Systematic Reviews of Biomedical Research: A Cross-Sectional Study. PLoS Med. 2016;13:e1002028. https://doi.org/10.1371/journal.pmed.1002028.
Knapp G, Hartung J. Improved tests for a random effects meta-regression with a single covariate. Stat Med. 2003;22:2693–710. https://doi.org/10.1002/sim.1482.
Sidik K, Jonkman JN. A simple confidence interval for meta-analysis. Stat Med. 2002;21:3153–9. https://doi.org/10.1002/sim.1262.
StataCorp. Stata statistical software: release 16. Tx: College Station: StataCorp LLC; 2019.
Wickham H, François R, Henry L, et al. dplyr: A Grammar of Data Manipulation. 2022.
R Core Team. foreign: Read Data Stored by "Minitab", "S", "SAS", "SPSS", "Stata", "Systat", "Weka", "dBase", ... 2022.
Wickham H. ggplot2: Elegant Graphics for Data Analysis. New York: SpringerVerlag; 2016.
Korevaar E, Turner SL, Forbes AB, et al. Comparison of statistical methods used to meta-analyse results from interrupted time series studies: an empirical study: Code and data. Monash University. 2022.
Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–60. https://doi.org/10.1177/096228029900800204.
Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37–46.
Chung Y, Rabe-Hesketh S, Choi IH. Avoiding zero between-study variance estimates in random-effects meta-analysis. Stat Med. 2013;32:4071–89. https://doi.org/10.1002/sim.5821.
Sanchez-Meca J, Marin-Martinez F. Confidence intervals for the overall effect size in random-effects meta-analysis. Psychol Methods. 2008;13:31–48. https://doi.org/10.1037/1082989x.13.1.31.
Sidik K, Jonkman JN. Robust variance estimation for random effects meta-analysis. Comput Stat Data Anal. 2006;50:3681–701. https://doi.org/10.1016/j.csda.2005.07.019.
Biggerstaff BJ, Tweedie RL. Incorporating variability in estimates of heterogeneity in the random effects model in meta-analysis. Stat Med. 1997;16:753–68. https://doi.org/10.1002/(SICI)10970258(19970415)16:7%3c753::AIDSIM494%3e3.0.CO;2G.
IntHout J, Ioannidis JP, Rovers MM, et al. Plea for routinely presenting prediction intervals in meta-analysis. BMJ Open. 2016;6:e010247. https://doi.org/10.1136/bmjopen2015010247.
Acknowledgements
We thank all of the researchers who provided datasets for this study.
Funding
E.K. is supported through an Australian Government Research Training Program (RTP) Scholarship administered by Monash University, Australia.
J.E.M. and S.L.T. are supported by Joanne E McKenzie’s NHMRC Investigator Grant (GNT2009612).
The project is funded by the Australian National Health and Medical Research Council (NHMRC) project grant GNT1145273, "How should we analyse, synthesize, and interpret evidence from interrupted time series studies? Making the best use of available evidence", McKenzie JE, Forbes A, Taljaard M, Cheng A, Grimshaw J, Bero L, Karahalios A.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Author information
Authors and Affiliations
Contributions
J.E.M. conceived the study, and all authors contributed to its design. E.K. and J.E.M. completed the ethics application. E.K. collected the data, conducted the analysis and wrote the first draft of the manuscript, with contributions from J.E.M. S.L.T. contributed to digital data extraction where required. S.L.T., A.K., A.B.F., M.T. and J.E.M. contributed to revisions of the manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Ethics approval was obtained from the Monash University Human Research Ethics Committee (Project ID 30078). We sought consent from participants to i) use their provided time series data to compare results when using different ITS analysis and metaanalysis methods, and ii) share their time series data via the online repository, Monash University Bridges.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1: Appendix 1.
Example ITS study and descriptions of metaanalysis modifications. Appendix 2. Additional results tables and figures. Appendix 3. Sensitivity analysis results. Appendix 4. Reviews that contributed data.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Korevaar, E., Turner, S.L., Forbes, A.B. et al. Comparison of statistical methods used to metaanalyse results from interrupted time series studies: an empirical study. BMC Med Res Methodol 24, 31 (2024). https://doi.org/10.1186/s1287402402147z