Accurate confidence intervals for risk difference in meta-analysis with rare events

Jiang, Tao; Cao, Baixin; Shan, Guogen

doi:10.1186/s12874-020-00954-8

Research Article
Open access
Published: 30 April 2020


BMC Medical Research Methodology volume 20, Article number: 98 (2020)


Abstract

Background

Meta-analysis provides a useful statistical tool to effectively estimate treatment effect from multiple studies. When the outcome is binary and rare (e.g., safety data in clinical trials), the traditionally used methods may perform unsatisfactorily.

Methods

We propose using importance sampling to compute confidence intervals for risk difference in meta-analysis with rare events. The proposed intervals are not exact, but they often have the coverage probabilities close to the nominal level. We compare the proposed accurate intervals with the existing intervals from the fixed- or random-effects models and the interval by Tian et al. (2009).

Results

We conduct extensive simulation studies to compare the intervals with regard to coverage probability and average length, with data simulated under either the homogeneity or the heterogeneity assumption of study effects.

Conclusions

The proposed accurate interval based on the random-effects model for sample space ordering generally has satisfactory performance under the heterogeneity assumption, while the traditionally used interval based on the fixed-effects model works well when the studies are homogeneous.


Background

Meta-analysis is a useful statistical tool in medical research to evaluate treatment effect by analyzing outcomes from multiple clinical trials. The estimated treatment effect from meta-analysis is generally more reliable and accurate than the estimate from any single study among the available studies. In early phase clinical trials that study the safety of a new drug, rare events are very common [1]. In meta-analysis for such data, Vandermeer et al. [1] pointed out that the traditionally used asymptotic point estimates and confidence intervals could be substantially different from the results using exact methods under the exact conditional framework [2]. It is well known that asymptotic approaches often do not have satisfactory performance when the outcome is extreme or the sample size is small.

Multiple methods have been developed for meta-analysis with rare events over the decades [3, 4]. Fixed-effects models, such as the Mantel-Haenszel method [5], are convenient to use in practice. When one or both groups in a study have zero events, a continuity correction is often needed in order to estimate the risk ratio or odds ratio, but the traditional correction of adding 0.5 may have an undesirable influence on the analysis results, as pointed out by Sweeting et al. [6]. They developed a continuity correction that adds a value that varies with the size of each group to improve the coverage probability. Multiple follow-up articles discussed whether a small value should be added to studies with rare events in data analysis [7,8]. Kuss [8] suggested using a beta-binomial model to avoid adding arbitrary values to each cell in data analysis. Tian et al. [9] proposed a simple and effective method for confidence interval calculation without artificial continuity correction. The confidence intervals from each study were weighted to construct an overall interval from simulation studies under the fixed-effects model. Their confidence intervals were shown to have better coverage when the events are rare, but the length of their intervals could be much longer than others.

In contrast to fixed-effects models, the treatment effect in the random-effects model is assumed to follow a normal distribution. DerSimonian and Laird [10] proposed a random-effects model by including a random study effect to account for the variation of study population or study design. The statistical software R package meta can be used to compute confidence intervals for the fixed-effects model and random-effects model [11]. Recently, Bakbergenuly and Kulinskaya [12] suggested the generalized linear mixed models (GLMMs) in meta-analysis to include the correlation between point estimate and its variance estimate in data analysis.

The aforementioned exact conditional approach assumes that both marginal totals in each study are fixed [2]. It is reasonable to assume that the numbers of participants in each treatment group are fixed. It is unusual, however, for a repeated study to have the same total number of events as the observed study. The exact one-sided limit by Buehler [13] follows the study design with the sample size in each treatment group fixed [14–16]. However, it is too computationally intensive to generate all possible samples in meta-analysis with binary outcome [17].

In this article, we propose using importance sampling to construct confidence intervals for risk difference in meta-analysis with rare events. We apply the importance sampling method described by Lloyd and Li [18] to compute the profile confidence limit proposed by Kabaila and Lloyd [19]. Importance sampling methods have been studied by many researchers with regard to the coverage of confidence intervals [20,21]. Importance sampling does not require enumerating all possible samples [19]. Instead, this approach simulates samples from the distribution estimated from the observed data. Importance sampling has to be used in conjunction with a designated statistic to order the simulated samples. We consider the limits of the existing intervals from the fixed-effects and random-effects models as designated statistics in this article.

The rest of this article is organized as follows. In “Methods” section, we describe the fixed-effects and random-effects models to estimate confidence intervals for risk difference. We then introduce importance sampling for interval calculation. In “Results” section, we use an example from 18 schizophrenia clinical trials to illustrate the application of the proposed intervals, and then compare the proposed intervals with the existing intervals with regards to coverage probability and average length. In “Conclusions” section, we provide some remarks on data analysis for meta-analysis with rare events.

Methods

For meta-analysis with binary outcome, data can be organized in a K×4 table, where K is the number of studies (Table 1). Each row represents the results from a parallel study with the number of events and the number of non-events in the new treatment group and the control group, respectively. Let the two treatment groups be indexed by 0 and 1 for the control and the new treatment, respectively. Suppose X_ijr is the number of participants having r events from the treatment j in ith study, where i=1,2,⋯,K,j=0,1, and r=0,1. For studies with rare events, X_ij1 is often very small. Let n_ij=X_ij1+X_ij0 be the total number of participants from the treatment j in the ith study, and N₁=(n₁₁,n₂₁,⋯,n_K1) and N₀=(n₁₀,n₂₀,⋯,n_K0) be the sample sizes for the new treatment group and the control group, respectively. Suppose p_j is the event rate of the treatment j. Given the sample size n_ij, the number of responses among these participants, X_ij1, follows a binomial distribution, B(n_ij,p_j). We assume that each study is independent from each other, and the two groups within each study are independent from each other as well. The parameter of interest here is the risk difference between the treatment group and the control group,

$$\Delta=p_{1}-p_{0}.$$

Table 1 Data from K independent studies with binary outcome


We first review the existing methods to construct two-sided confidence intervals for Δ in “Intervals based on fixed or random-effects model” section, and then develop accurate intervals in “Accurate intervals” section.

Intervals based on fixed or random-effects model

We first consider the fixed-effects model to calculate confidence interval for Δ. Under the study homogeneity assumption, the treatment effect in each study is assumed to be the same,

$$\Delta_{i}=\mu,$$

where μ is the treatment effect. In the ith study, the risk difference Δ_i is estimated as

$$\widehat \Delta_{i}=\hat p_{i1}-\hat p_{i0},$$

where $\hat p_{ij}=X_{ij1}/n_{ij}$ is the estimated rate of the treatment j in the ith study. The variance is estimated as $s_{i}^{2}=\sum _{j=0}^{1} \frac {\hat p_{ij}(1-\hat p_{ij})}{n_{ij}}$ from two independent proportions. The weight for the ith study is

$$w_{i}=\frac{n_{i1}n_{i0}}{n_{i1}+n_{i0}}\frac{1}{\sum_{i=1}^{K} \frac{n_{i1}n_{i0}}{n_{i1}+n_{i0}}},$$

where $\sum _{i=1}^{K} \frac {n_{i1}n_{i0}}{n_{i1}+n_{i0}}$ is the factor to standardize the weight values, with $\sum _{i=1}^{K} w_{i}=1$. It is easy to show that w_i is an increasing function of n_i1 (n_i0) when n_i0 (n_i1) is fixed.

The overall weighted treatment effect using the fixed-effects model is calculated as

$$\widehat\Delta_{F}={\sum_{i=1}^{K} w_{i} \widehat\Delta_{i} },$$

and its variance is estimated as

$$\widehat {SE}_{F}^{2}={\sum_{i=1}^{K} w_{i}^{2} s_{i}^{2}}.$$

The standardized statistic $\widehat \Delta_{F} / \widehat {SE}_{F}$ follows the standard normal distribution asymptotically when Δ=0. Therefore, the asymptotic confidence interval for Δ based on the fixed-effects model (the F interval) at the nominal level of 100(1−α)% is

$$ CI_{F}=(\widehat\Delta_{F}-z_{1-\alpha/2} \widehat {SE}_{F},\widehat\Delta_{F}+z_{1-\alpha/2} \widehat {SE}_{F}), \tag{1} $$

where z_a is the ath quantile of the standard normal distribution.
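The fixed-effects calculation above can be sketched in a few lines of Python. This is only an illustration of the formulas in this section, not the implementation used by metabin; the function name `fixed_effects_interval` and its argument layout are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def fixed_effects_interval(x1, n1, x0, n0, alpha=0.05):
    """Return (estimate, lower, upper) for the pooled risk difference.

    x1, n1: events and sample sizes in the new-treatment group, one per study.
    x0, n0: events and sample sizes in the control group, one per study.
    """
    # Weights n_i1 * n_i0 / (n_i1 + n_i0), standardized to sum to 1.
    raw = [a * b / (a + b) for a, b in zip(n1, n0)]
    total = sum(raw)
    weights = [r / total for r in raw]

    delta_f = 0.0
    var_f = 0.0
    for w, e1, m1, e0, m0 in zip(weights, x1, n1, x0, n0):
        p1, p0 = e1 / m1, e0 / m0
        s2 = p1 * (1 - p1) / m1 + p0 * (1 - p0) / m0  # s_i^2
        delta_f += w * (p1 - p0)
        var_f += w ** 2 * s2

    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1 - alpha/2}
    half_width = z * sqrt(var_f)
    return delta_f, delta_f - half_width, delta_f + half_width
```

A study with zero events in both groups contributes a zero estimate and a zero variance term, but its sample sizes still enter the standardizing sum of the weights.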

When study heterogeneity is observed, which could be caused by the study population, the study design, or influential covariates, DerSimonian and Laird [10] proposed using the random-effects model to include the random study effect in the model as

$$\Delta_{i}=\mu + u_{i},$$

where u_i is the deviation of the ith study from the population mean μ, and it follows a normal distribution. Let v_i be the weight of the ith study from the fitted random-effects model. Then, the weighted treatment effect and its variance are $\widehat \Delta _{R}={\sum _{i=1}^{K} v_{i} \widehat \Delta _{i} }$, and $\widehat {SE}_{R}^{2}={\sum _{i=1}^{K} v_{i}^{2} s_{i}^{2}} $, respectively. It follows that the asymptotic confidence interval for Δ using the random-effects model (the R interval) is computed as

$$ CI_{R}=(\widehat\Delta_{R}-z_{1-\alpha/2} \widehat {SE}_{R},\widehat\Delta_{R}+z_{1-\alpha/2} \widehat {SE}_{R}). \tag{2} $$

It can be seen that the difference between CI_F and CI_R is the weights used in the treatment effect and its variance calculation. The F interval and the R interval can be computed by using the function metabin from the statistical software package meta [11,22]. In the metabin function, we use MH.exact=TRUE in the option with no continuity correction in the estimates.

Accurate intervals

The exact confidence limit by Buehler [13] for Δ is preferable, but it is computationally intensive to enumerate and store all the possible samples in meta-analysis with the sample sizes n_ij fixed. For this reason, we consider importance sampling (IS) to construct accurate intervals for Δ: samples are simulated from the distribution estimated from the observed data and used for statistical inference. Importance sampling has been applied to many important medical research areas that often have only one nuisance parameter (e.g., the proportion difference in a parallel study [21,23]). We extend the application of IS to confidence interval calculation in meta-analysis with multiple nuisance parameters. The intervals computed using importance sampling are accurate, with coverage close to the nominal level. In addition, importance sampling has a computational advantage over exact methods [19].

The IS intervals must be calculated in conjunction with a designated statistic that orders the sample space. Let T be the designated statistic under consideration. Suppose p₀=(p₁₀,p₂₀,⋯,p_K0) is the probability vector of the control group, where p_i0 is the event probability of the control group in the ith study. The accurate upper limit based on the designated statistic T is computed as the supremum of Δ such that

$$ G(\Delta)=P\Big(T(\mathbf{Y})\leq T(\mathbf{y}_{\mathrm{obs}})\ |\ \Delta,\hat{\mathbf{p}}_{0}(\Delta)\Big)>\frac{\alpha}{2}, \tag{3} $$

where y_obs is the observed data, Y is data from the simulated data set, and $\hat {\mathbf {p}}_{0}(\Delta)$ is the maximum likelihood estimate of p₀ given Δ.

Suppose we simulate B data sets from independent binomial distributions with the probabilities determined by $\widehat {\Delta }^{*}$ and $\widehat {\mathbf {p}}_{0}(\Delta ^{*})$, which are estimated from the observed data y_obs. For studies with double zeros, although their estimated risk differences are zero, the sample sizes from such studies are still valuable information in estimating the overall Δ and its confidence intervals [24]. Sample sizes from all studies, including the ones with double zeros, are used in the proposed method. The numbers of events are simulated from binomial distributions with the probabilities $\widehat {\mathbf {p}}_{0}(\Delta ^{*})$ in the control group and $\widehat {\mathbf {p}}_{0}(\Delta ^{*})+\widehat {\Delta }^{*}$ in the treatment group.

The designated statistic of each simulated data set is computed and compared with T(y_obs). The simulated data sets with T(Y)≤T(y_obs) form the set Ω_T(y_obs)={Y:T(Y)≤T(y_obs)}. Let the size of Ω_T(y_obs) be B₁, with data $\mathbf {Y}_{1}, \cdots, \mathbf {Y}_{B_{1}}$. Then, the upper limit in Eq. 3 can be estimated as the supremum of Δ such that

$$\widehat G(\Delta)=\frac{1}{B}\sum_{b=1}^{B_{1}} \frac{f(\mathbf{Y}_{\mathbf{b}}|\Delta,\widehat{\mathbf{p}}_{0}(\Delta))}{f(\mathbf{Y}_{\mathbf{b}}|\widehat{\Delta}^{*},\widehat{\mathbf{p}}_{0}(\Delta^{*}))}>\frac{\alpha}{2},$$

where f(Y_b|·) is the probability mass function of Y_b, which is a product of independent binomial distributions with parameters (n_ij,p_ij) for the treatment j in the ith study. For a given Δ, numerical algorithms can be used to find the maximum likelihood estimator of p₀(Δ) to calculate $\widehat {G}(\Delta)$.
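One possible numerical algorithm for the restricted maximum likelihood estimate of p_i0 given Δ is sketched below. It assumes that, with p_i1=p_i0+Δ, the binomial log-likelihood of a single study is concave in p_i0, so a ternary search over the admissible range is sufficient. The function names are hypothetical, and the source does not state which optimizer was actually used.

```python
from math import log


def _xlog(count, prob):
    """count * log(prob), with 0 * log(0) taken as 0 and -inf if impossible."""
    if count == 0:
        return 0.0
    return count * log(prob) if prob > 0.0 else float("-inf")


def restricted_loglik(p0, delta, x1, n1, x0, n0):
    """Log-likelihood of one study (up to a constant) with p1 = p0 + delta."""
    p1 = p0 + delta
    return (_xlog(x1, p1) + _xlog(n1 - x1, 1.0 - p1)
            + _xlog(x0, p0) + _xlog(n0 - x0, 1.0 - p0))


def restricted_mle_p0(delta, x1, n1, x0, n0, iterations=100):
    """Maximize the restricted log-likelihood over p0 by ternary search."""
    # Admissible range so that both p0 and p0 + delta stay in [0, 1].
    lo, hi = max(0.0, -delta), min(1.0, 1.0 - delta)
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        f1 = restricted_loglik(m1, delta, x1, n1, x0, n0)
        f2 = restricted_loglik(m2, delta, x1, n1, x0, n0)
        if f1 < f2:
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)
```

Because the studies are independent, the vector estimate of p₀(Δ) can be obtained by applying `restricted_mle_p0` to each study separately.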

Similarly, the IS lower limit can be computed. It should be noted that designated statistics from the same model are used for the IS upper limit and the IS lower limit. For example, the asymptotic upper limit from the fixed-effects model is used as the designated statistic for the accurate upper limit, and the lower limit from the same model is used for the accurate lower limit. We refer to this accurate interval as the IS-F interval. When the asymptotic limits from the random-effects model are used as the designated statistics, the computed accurate interval is referred to as the IS-R interval.
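The importance sampling estimate $\widehat G(\Delta)$ can be illustrated with the following sketch. It is not the authors' code: the data layout (one `(x1, x0)` pair per study), the function names, and the generic `stat` callable standing in for the designated statistic T are all assumptions for this example. The binomial coefficients cancel in the likelihood ratio, so only the kernels p^x(1−p)^(n−x) are evaluated.

```python
import random
from math import exp, log


def log_kernel(x, n, p):
    """log of p^x (1-p)^(n-x); returns -inf for an impossible outcome."""
    out = 0.0
    for count, prob in ((x, p), (n - x, 1.0 - p)):
        if count > 0:
            if prob <= 0.0:
                return float("-inf")
            out += count * log(prob)
    return out


def simulate(sizes, probs, rng):
    """Draw one data set: a list of (x1, x0) per study from independent binomials.

    sizes: list of (n1, n0) per study; probs: list of (p1, p0) per study.
    """
    return [
        tuple(sum(rng.random() < p for _ in range(n)) for n, p in zip(ns, ps))
        for ns, ps in zip(sizes, probs)
    ]


def g_hat(samples, sizes, stat, t_obs, probs_target, probs_sampling):
    """Estimate P(T(Y) <= T(y_obs)) under probs_target from samples that
    were drawn under probs_sampling, by re-weighting each kept sample."""
    total = 0.0
    for y in samples:
        if stat(y) > t_obs:
            continue  # sample is outside the set {Y: T(Y) <= T(y_obs)}
        log_w = 0.0
        for xs, ns, pt, ps in zip(y, sizes, probs_target, probs_sampling):
            for x, n, a, b in zip(xs, ns, pt, ps):
                log_w += log_kernel(x, n, a) - log_kernel(x, n, b)
        total += exp(log_w)
    return total / len(samples)
```

In this sketch the same B simulated data sets are reused for every candidate Δ; only `probs_target` changes, built from Δ and the restricted estimate of p₀(Δ). The upper limit would then be located by searching for the largest Δ at which `g_hat` still exceeds α/2.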

Results

We first use an example from 18 schizophrenia clinical trials to illustrate the application of the proposed accurate intervals. In addition to the F interval, the R interval, the IS-F interval, and the IS-R interval, we also include the confidence interval for Δ by Tian et al. [9] in the comparison (referred to as the Tian interval). The Tian interval can be computed by using their R function meta.exact from the exactmeta package, without the mid-p value approach. All data, including studies with zero events, are used in the confidence interval calculation.

These 18 schizophrenia clinical trials reported the number of all-cause deaths for patients treated with the long-acting injectable antipsychotics (LAI-AP) or the oral antipsychotics (OAP), which is the control treatment here. The data from these 18 trials, provided by Efthimiou [25], are presented in Table 2. Out of a total of 3774 participants treated with the LAI-AP, 7 events were observed. In the OAP control group, 6 events were recorded from a total of 2145 participants. The naive estimates for all-cause mortality rates are 0.185% and 0.279% in the LAI-AP group and the OAP group, respectively.

Table 2 Data from 18 clinical trials comparing all-cause mortality rate of patients treated with long-acting injectable antipsychotics (LAI-AP) or the oral antipsychotics (OAP) treatment as the control


Table 3 presents the estimated $\widehat {\Delta }$ and the 95% confidence interval for Δ using the five methods. The point estimate of $\widehat {\Delta }$ from the R method is similar to the Tian method, and they are larger than that from the F method. It can be seen that the Tian interval is much wider than others, and the asymptotic F or R intervals have shorter lengths than the proposed accurate intervals. The upper limits of the proposed accurate intervals are smaller than those of other intervals. All the intervals contain zero. Therefore, we fail to reject the null hypothesis that there is no difference between the LAI-AP treatment and the OAP treatment with regards to the all-cause mortality rate.

Table 3 Confidence intervals for risk difference between the LAI-AP group and the OAP group


Simulation studies

We conduct extensive simulation studies to compare coverage probability and average length of the five intervals: the F interval, the R interval, the IS-F interval, the IS-R interval, and the Tian interval. The nominal confidence level is set as 95%. The sample sizes, n_ij, are assumed to be the same as those in the aforementioned example, as N₁ and N₀ in Table 2. The number of responses X_ij1 follows a binomial distribution B(n_ij,p_ij). We simulate D=1,000 data sets for each configuration: Y₁,Y₂,⋯,Y_D. For the proposed IS intervals, we generate B=2,000 importance samples from the estimated distribution for each simulated data set.

Coverage probability is defined as the proportion of the pre-specified risk difference Δ being included in the confidence intervals:

$$CP=\frac{1}{D}\sum_{d=1}^{D} I\Big(\Delta \in CI(\mathbf{Y}_{\mathbf{d}})\Big).$$

A confidence interval whose simulated coverage probability is closer to the nominal level is preferable. Average length is defined as the average of the lengths of all D intervals,

$$AL=\sum_{d=1}^{D} \frac{CI_{upper}(\mathbf{Y}_{\mathbf{d}})-CI_{lower}(\mathbf{Y}_{\mathbf{d}})}{D},$$

where CI_lower and CI_upper are the lower limit and the upper limit of an interval. When two intervals are comparable with regards to coverage probability, the one with a shorter average length outperforms the other.
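The two performance measures can be computed from a list of simulated intervals as in the following sketch. The function names are illustrative, and treating the interval endpoints as included is an assumption that the text does not spell out.

```python
def coverage_probability(intervals, delta):
    """Proportion of (lower, upper) intervals that contain the true delta."""
    # Assumption: endpoints count as inside the interval.
    hits = sum(1 for lower, upper in intervals if lower <= delta <= upper)
    return hits / len(intervals)


def average_length(intervals):
    """Mean of upper - lower over all simulated intervals."""
    return sum(upper - lower for lower, upper in intervals) / len(intervals)
```

Here `intervals` holds one `(lower, upper)` pair per simulated data set Y_d, so its length plays the role of D.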

Homogeneity of study effects

We first compare the coverage probabilities of the five methods with fixed probabilities, p₁ and p₀. For simplicity, we assume a common rate in the control group, p_i0=p, with p from 0.01% to 10%. The treatment probability is p_i1=p+Δ. For each configuration of (p,Δ), the coverage probabilities of these methods are computed, see Fig. 1 when Δ=0.005 and 0.05. It can be seen that the F method has the coverage closer to the nominal level when Δ=0.005, except the case in which p is very low. As Δ is increased to 0.05, the F interval, the IS-R interval, and the IS-F interval have similar coverages when p is small. The IS-F interval and the IS-R interval are conservative when p is large. In this plot with Δ=0.05, the Tian interval and the R interval have the coverage probabilities below the nominal level. Overall, the F interval has good performance with regards to coverage when studies are homogeneous and have common rates.

Given the number of nuisance probabilities, it is difficult to compare the performance of the five methods under each configuration. With 18 studies, two treatment groups per study, and 5 considered probabilities for each of the resulting 36 group-level rates, the number of possible configurations is 5³⁶, which is over 10²⁵. For this reason, we follow the approach by Tian et al. [9] to compare the performance of these methods by simulating the probabilities of the control group (p₀) from uniform distributions: U(0,b), where b=0.0001, 0.001, 0.01, and 0.1. We consider the following five Δ values: 0.001, 0.005, 0.01, 0.05, and 0.1. Under the study homogeneity assumption, the probabilities of the treatment group p₁ are then obtained as p_i1=p_i0+Δ.

Table 4 presents coverage and average length comparison between the five intervals when p₀∼U(0,0.01%). Coverage probabilities of the F interval range from 89% to 96%. The R interval is very conservative when Δ is small, and its coverage is below 95% when Δ is larger. The Tian interval is conservative when Δ≤1%, but it could be as low as 76% when Δ is 10%. The proposed accurate intervals always have the coverage probabilities close to the nominal level as compared to the existing intervals. Average length is always an increasing function of Δ for each confidence interval method. The Tian intervals are wider than others when they all guarantee the coverage probability. The IS-R interval generally has a shorter length as compared to the R interval and the IS-F intervals.

Table 4 Coverage probability and average length comparison between the five intervals when p₀∼ U(0,0.01%)


When the event rates of the control group are higher with p₀∼U(0,0.1%) in Fig. 2, the F interval generally performs better than others with regards to coverage probability and average length. When Δ is large (e.g., 10%), coverage probabilities of these intervals are all slightly below 95%. In this case with a small p₀ and a relatively large Δ, the proposed intervals (IS-R or IS-F intervals) have better coverage probabilities than the F interval, and the length difference between the accurate intervals and the F interval is small. When Δ=10%, the coverage probability of the Tian interval is below 80%. When the rates are even higher with p₀∼U(0,1%), the rates are not rare in these configurations, and the F interval outperforms others as seen in Fig. 2.

Heterogeneity of study effects

Under the study heterogeneity assumption, the probability in the treatment group is p_i1= p_i0+u_i, where u_i is the random study effect that follows a normal distribution with mean of Δ and standard deviation of Δ/2. Figure 3 presents the coverage probability and average length comparison between the five intervals when p₀∼U(0,0.01%),U(0,0.1%), and U(0,1%). As Δ increases, the standard deviation of the probabilities in the treatment group goes up. When Δ is small, the F interval, the IS-R interval, and the IS-F interval have the coverage probabilities closer to the nominal level as compared to the R interval and the Tian interval. Coverage probabilities of the F interval and the IS-F interval drop to almost 50% when Δ is 10%. The R interval generally has good coverage when Δ is large. However, the R interval’s coverage probabilities are very low when Δ=1% in meta-analysis with rare events (e.g., p₀∼U(0,0.01%) or U(0,0.1%)). The IS-R interval has consistent good performance with regards to coverage and length as compared to others in meta-analysis with rare events. Figure 3 also presents the results when the event rates are not rare (e.g., p₀∼U(0,1%)). When Δ is large, the R interval and the IS-R interval have better coverage probabilities than others. When variance of study effects is small (for the configurations with small Δ values), the F interval performs better where the configurations are similar to the ones under the study homogeneity assumption.
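The data-generating step under the heterogeneity assumption can be sketched as follows. The function name and arguments are hypothetical. Clipping p_i1 to [0, 1] is an assumption added here so that the binomial draw is always defined; the text does not say how values outside that range were handled.

```python
import random


def simulate_heterogeneous(n1, n0, b, delta, rng):
    """Simulate one meta-analysis data set as a list of (x1, x0) per study.

    n1, n0: sample sizes of the treatment and control groups, one per study.
    b: upper bound of the uniform distribution for the control rate p_i0.
    delta: mean of the random study effect u_i (standard deviation delta / 2).
    """
    data = []
    for m1, m0 in zip(n1, n0):
        p0 = rng.uniform(0.0, b)
        u = rng.gauss(delta, delta / 2.0)
        # Assumption: keep the treatment probability inside [0, 1].
        p1 = min(max(p0 + u, 0.0), 1.0)
        x1 = sum(rng.random() < p1 for _ in range(m1))
        x0 = sum(rng.random() < p0 for _ in range(m0))
        data.append((x1, x0))
    return data
```

Replacing the normal draw `u` with the constant `delta` gives the corresponding generator under the homogeneity assumption.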

Conclusions

We propose using importance sampling to construct confidence intervals for risk difference in meta-analysis with rare events. The traditionally used F interval has satisfactory performance with regard to coverage probability and interval length when the events are not rare under the study homogeneity assumption, but this interval could have a very low coverage probability under the study heterogeneity assumption. The IS-R interval based on the asymptotic limits from the random-effects model outperforms the existing intervals under the heterogeneity assumption. The IS intervals use the existing asymptotic limits to order the sample space. Although the asymptotic limits are computed from asymptotic approaches whose performance depends on how well the limiting distribution approximates the distribution of the test statistic, the order of these limits provides useful information to produce better IS limits.

The Tian interval often guarantees the coverage probability when the rates of both groups are rare, but that interval could have a coverage probability below the nominal level when Δ is large. Theoretically, the Tian interval can be used as a designated statistic to order the sample space. However, simulations are involved in the Tian interval calculation, which would significantly increase the computational intensity of the proposed IS intervals. In addition, the ordering of the sample space based on the Tian interval may change with the number of simulations used. For these reasons, we do not include the IS intervals based on the ordering by the Tian interval.

Discussion

The method by Buehler [13] to construct an exact one-sided confidence interval is ideal for binary outcome when the sample size is small enough to allow a full enumeration of the sample space [16,26–29]. However, it is not feasible in meta-analysis, as it is extremely difficult to store the sample space under the unconditional framework with the sample size in each treatment group fixed. If the upper bound of the possible number of events can be determined and the sample size is not too large, the exact Buehler interval may be computed. Otherwise, an efficient search algorithm should be developed to order the sample space.

Exact confidence intervals are preferable for statistical inference. However, they are often computationally intensive, as is the aforementioned exact interval by Buehler [28,30–32]. For these reasons, simulation-based intervals are proposed for use in practice, including the interval proposed here, the Tian interval, and the interval based on confidence distribution [24,33–35]. Exact meta-analysis by enumerating all possible data remains a big challenge: it becomes a big data problem that requires huge memory and computational power.

In addition to risk difference, the odds ratio and the risk ratio are also used to measure the treatment effect. For studies with zero events in one or both treatment groups, the estimated risk difference is still well defined (it is zero when both groups have zero events). However, the estimated ratios could be infinite or undefined [17,36–39]. In order to avoid this issue, an arbitrary small number (e.g., ε=0.5, 1) is often added to each cell in the data. The performance of the test statistics is affected by the chosen small value [6,40–42]. The added value ε also raises the question of whether the number of participants in a study should be n_ij or n_ij+2ε. We leave the study of IS intervals for ratios as future work.

Availability of data and materials

Not applicable. This manuscript develops novel statistical approaches; therefore, no real data are involved.

Abbreviations

GLMMs: Generalized linear mixed models
IS: Importance sampling
LAI-AP: Long-acting injectable antipsychotics
OAP: Oral antipsychotics

References

Vandermeer B, Bialy L, Hooton N, Hartling L, Klassen TP, Johnston BC, Wiebe N. Meta-analyses of safety data: a comparison of exact versus asymptotic methods. Stat Methods Med Res. 2009; 18(4):421–32. https://doi.org/10.1177/0962280208092559.
Article PubMed Google Scholar
Mehta CR, Patel NR, Gray R. Computing an Exact Confidence Interval for the Common Odds Ratio in Several 2 * 2 Contingency Tables. J Am Stat Assoc. 1985; 80(392):969–73. https://doi.org/10.1080/01621459.1985.10478212.
Google Scholar
Cai T, Parast L, Ryan L. Meta-analysis for rare events. Stat Med. 2010; 29(20):2078–89. https://doi.org/10.1002/sim.3964.
Article PubMed PubMed Central Google Scholar
Wan X, Wang W, Liu J, Tong T. Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Med Res Methodol. 2014; 14(1):135. https://doi.org/10.1186/1471-2288-14-135.
Article PubMed PubMed Central Google Scholar
Mantel N, Haenszel W. Statistical Aspects of the Analysis of Data From Retrospective Studies of Disease. JNCI J Natl Cancer Inst. 1959; 22(4):719–48. https://doi.org/10.1093/jnci/22.4.719.
CAS PubMed Google Scholar
Sweeting MJ, Sutton AJ, Lambert PC. What to add to nothing? Use and avoidance of continuity corrections in meta-analysis of sparse data. Stat Med. 2004; 23(9):1351–75. https://doi.org/10.1002/sim.1761.
Article PubMed Google Scholar
Rücker G, Schwarzer G, Carpenter J, Olkin I. Why add anything to nothing? The arcsine difference as a measure of treatment effect in meta-analysis with zero cells. Stat Med. 2009; 28(5):721–38. https://doi.org/10.1002/sim.3511.
Article PubMed Google Scholar
Kuss O. Statistical methods for meta-analyses including information from studies without any events-add nothing to nothing and succeed nevertheless. Stat Med. 2015; 34(7):1097–116. https://doi.org/10.1002/sim.6383.
Article CAS PubMed Google Scholar
Tian L, Cai T, Pfeffer MA, Piankov N, Cremieux P-Y, Wei LJ. Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction. Biostat (Oxford Engl). 2009; 10(2):275–81. https://doi.org/10.1093/biostatistics/kxn034.
Article Google Scholar
DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986; 7(3):177–88.
Article CAS PubMed Google Scholar
Schwarzer G, Carpenter JR, Rücker G. Meta-Analysis with R, Use R!Cham: Springer; 2015. https://doi.org/10.1007/978-3-319-21416-0. http://link.springer.com/10.1007/978-3-319-21416-0.
Book Google Scholar
Bakbergenuly I, Kulinskaya E. Meta-analysis of binary outcomes via generalized linear mixed models: A simulation study. BMC Med Res Methodol. 2018; 18(1):70. https://doi.org/10.1186/s12874-018-0531-9.
Article PubMed PubMed Central Google Scholar
Buehler RJ. Confidence intervals for the product of two binomial parameters. J Am Stat Assoc. 1957; 52(280):482–93.
Article Google Scholar
Kabaila P, Lloyd CJ. The efficiency of Buehler confidence limits. Stat Probab Lett. 2003; 65(1):21–8. https://doi.org/10.1016/s0167-7152(03)00215-3.
Article Google Scholar
Kabaila P, Lloyd CJ. Buehler confidence limits and nesting. Aust N Z J Stat. 2004; 46(3):463–9. https://doi.org/10.1111/j.1467-842x.2004.00343.x.
Article Google Scholar
Kabaila P. Computation of exact confidence limits from discrete data. Comput Stat. 2005; 20(3):401–14. https://doi.org/10.1007/bf02741305.
Article Google Scholar
Shan G. Exact Statistical Inference for Categorical Data, 1st edn.San Diego: Academic Press; 2015. http://www.worldcat.org/isbn/0081006810.
Google Scholar
Lloyd CJ, Li D. Computing highly accurate confidence limits from discrete data using importance sampling. Stat Comput. 2014; 24(4):663–73. https://doi.org/10.1007/s11222-013-9409-1.
Article Google Scholar
Kabaila P, Lloyd CJ. Profile upper Confidence Limits from Discrete Data. Aust N Z J Stat. 2000; 42(1):67–79. https://doi.org/10.1111/1467-842X.00108.
Article Google Scholar
Garthwaite PH, Buckland ST. Generating Monte Carlo confidence intervals by the Robbins– Monro process. J Comput Graph Stat. 1992; 41(1):159–71.
Google Scholar
Garthwaite PH, Jones MC. A stochastic approximation method and its application to confidence intervals. Journal of Computational and Graphical Statistics. 2009; 18(1):184–200.
Article Google Scholar
Viechtbauer W. Conducting Meta-Analyses in <i>R</i> with the <b>metafor</b> Package. J Stat Softw. 2010; 36(3):1–48. https://doi.org/10.18637/jss.v036.i03.
Article Google Scholar
Lloyd CJ. Accurate confidence limits for stratified clinical trials. Stat Med. 2013; 32(20):3415–23. https://doi.org/10.1002/sim.5809.
Article PubMed Google Scholar
Yang G, Liu D, Wang J, Xie MG. Meta-analysis framework for exact inferences with application to the analysis of rare events. Biometrics. 2016; 72(4):1378–86. https://doi.org/10.1111/biom.12497.
Article PubMed Google Scholar
Efthimiou O. Practical guide to the meta-analysis of rare events. Evid Based Ment Health. 2018; 21(2):72–6. https://doi.org/10.1136/eb-2018-102911.
Kabaila P, Lloyd CJ. Tight upper confidence limits from discrete data. Aust J Stat. 1997; 39(2):193–204. https://doi.org/10.1111/j.1467-842X.1997.tb00535.x.
Kabaila P. Better Buehler confidence limits. Stat Probab Lett. 2001; 52(2):145–54.
Shan G, Banks S, Miller JB, Ritter A, Bernick C, Lombardo J, Cummings JL. Statistical advances in clinical trials and clinical research. Alzheimers Dement Transl Res Clin Interv. 2018; 4:366–71.
Shan G. Exact confidence limits for the probability of response in two-stage designs. Statistics. 2018; 52(5):1086–95. https://doi.org/10.1080/02331888.2018.1469023.
Shan G. Exact Tests for Disease Prevalence Studies With Partially Validated Data. Stat Biopharm Res. 2019:1–14. https://doi.org/10.1080/19466315.2018.1555099.
Shan G. Exact confidence limits for the response rate in two-stage designs with over or under enrollment in the second stage. Stat Methods Med Res. 2018; 27(4):1045–55.
Zhang H, Shan G. Letter to Editor: A novel confidence interval for a single proportion in the presence of clustered binary outcome data. Stat Methods Med Res. 2019:096228021984005. https://doi.org/10.1177/0962280219840056.
Liu D, Liu RY, Xie MG. Exact Meta-Analysis Approach for Discrete Data and its Application to 2 × 2 Tables With Rare Events. J Am Stat Assoc. 2014; 109(508):1450–65. https://doi.org/10.1080/01621459.2014.946318.
Shan G, Ma C, Hutson AD, Wilding GE. Randomized Two-Stage Phase II Clinical Trial Designs Based on Barnard’s Exact Test. J Biopharm Stat. 2013; 23(5):1081–90. https://doi.org/10.1080/10543406.2013.813525.
Shan G, Zhang H, Jiang T. Minimax and admissible adaptive two-stage designs in phase II clinical trials. BMC Med Res Methodol. 2016; 16(1):90. https://doi.org/10.1186/s12874-016-0194-3.
Shan G, Hutson AD, Wilding GE. Two-stage k-sample designs for the ordered alternative problem. Pharm Stat. 2012; 11(4):287–94. https://doi.org/10.1002/pst.1499.
Shan G, Ma C, Hutson AD, Wilding GE. Some tests for detecting trends based on the modified Baumgartner-Weiß-Schindler statistics. Comput Stat Data Anal. 2013; 57(1):246–61. https://doi.org/10.1016/j.csda.2012.04.021.
Shan G, Wilding GE. Powerful Exact Unconditional Tests for Agreement between Two Raters with Binary Endpoints. PLoS ONE. 2014; 9(5):e97386. https://doi.org/10.1371/journal.pone.0097386.
Shan G, Wilding GE, Hutson AD, Gerstenberger S. Optimal adaptive two-stage designs for early phase II clinical trials. Stat Med. 2016; 35(8):1257–66. https://doi.org/10.1002/sim.6794.
Shan G, Kang L, Xiao M, Zhang H, Jiang T. Accurate unconditional p-values for a two-arm study with binary endpoints. J Stat Comput Simul. 2018; 88(6):1200–10.
Shan G. Comments on ‘Two-sample binary phase 2 trials with low type I error and low sample size’. Stat Med. 2017; 36(21):3437–8. https://doi.org/10.1002/sim.7359.
Shan G, Gerstenberger S. Fisher’s exact approach for post hoc analysis of a chi-squared test. PLoS ONE. 2017; 12(12):e0188709. https://doi.org/10.1371/journal.pone.0188709.


Acknowledgements

We thank the supercomputing center at UNLV for its support.

Funding

Shan’s research is partially supported by the National Institute of General Medical Sciences of the National Institutes of Health (P20GM109025). Jiang’s work is supported by the National Natural Science Foundation of China under grant 11971433, and by the First Class Discipline of Zhejiang-A (Zhejiang Gongshang University-Statistics).

Author information

Tao Jiang and Guogen Shan contributed equally to this work.

Authors and Affiliations

School of Statistics and Mathematics, and School of Business, Zhejiang Gongshang University, Hangzhou, Zhejiang, China
Tao Jiang
School of Mathematical Sciences, Nankai University, Tianjin, China
Baixin Cao
Epidemiology and Biostatistics Program, School of Public Health, University of Nevada Las Vegas, Las Vegas, USA
Guogen Shan

Authors

Tao Jiang
Baixin Cao
Guogen Shan

Contributions

The idea for the paper was originally developed by GS. GS computed the new confidence interval for meta-analysis with rare binary outcomes. GS, CB, and TJ drafted the manuscript and approved the final version.

Corresponding authors

Correspondence to Tao Jiang or Guogen Shan.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article

Jiang, T., Cao, B. & Shan, G. Accurate confidence intervals for risk difference in meta-analysis with rare events. BMC Med Res Methodol 20, 98 (2020). https://doi.org/10.1186/s12874-020-00954-8


Received: 26 September 2019
Accepted: 17 March 2020
Published: 30 April 2020
DOI: https://doi.org/10.1186/s12874-020-00954-8
