Bmc Medical Research Methodology Open Access Pooling Overdispersed Binomial Data to Estimate Event Rate

Background: The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research.


Background
In clinical research and public health, it is frequently necessary to combine findings from multiple interventional or observational studies in order to address important safety and efficacy questions. A single study rarely provides a definitive answer because of limited sample size and the specific attributes of particular study populations. The challenges of combining data from heterogeneous studies are well described in the meta-analysis literature. In the majority of meta-analysis reports, the outcome of interest is a comparative risk estimate such as the odds ratio, relative risk, or risk difference [1]. Absolute risks, however, such as the proportion of clinical events among a cohort of patients or the response rate among patients receiving a certain treatment regimen, are important measures for helping to guide clinical and public health decisions. In the correct epidemiology and statistical terminology, these so-called rates are really proportions, but we will treat rates and proportions as equivalent in this paper as this term is commonly used in medical product safety research. Relevant methods to pool the absolute risks are especially important in safety evaluation of medical products as the risks for serious adverse outcomes are often rare, and precise estimates of the probability of these outcomes are crucial in the risk-benefit evaluation.
In this report we describe the implementation of the betabinomial method to pool the absolute risks from overdispersed data. This method estimates a summary probability of adverse events and is applicable in medical product safety evaluation as it takes into account the heterogeneity of studies. The application of the beta-binomial method in drug safety settings was previously described by Chuang-Stein in 1993 [2]. Here we aim to provide a detailed description of the method and to update and broaden its applications.
The general setting is that of a clinical trial or cohort study of a specific exposure, such as: drug A with a sample size of n resulted in x number of adverse events (e.g. liver injury). Within each individual study the probability of encountering x number of adverse events out of a sample size of n is characterized by the binomial distribution. To summarize multiple studies of the same exposure, we need to account for their heterogeneity of the studies, for they could differ in their sample sizes, clinical settings, investigators, protocols, and prevalence of comorbidity among study subjects. The assumption of one binomial distribution that can describe the proportions of adverse event from all the studies is not always valid. Numerous factors, including ethnic difference, disease severity, comorbid conditions, and concomitant medications can contribute to the variation of the probability of interest, thus requiring additional assumptions beyond the binomial model. This phenomenon is often referred to as overdispersion [3,4]. Ignoring overdispersion when pooling overdispersed data that are binomial in nature could result in erroneous estimates of the probability of interest and its confidence interval.
In the clinical trial literature, Chuang-Stein [2] proposed using the beta-binomial model to combine binomial event rates across multiple studies in an article titled "An application of the beta-binomial model to combine and monitor medical event rates in clinical trials." Despite its sound statistical basis, this method has not been widely used in clinical and public health research articles during the years since its publication. Meanwhile, the application of the beta-binomial model in other fields is becoming more prevalent as it has been applied in fields as distant as sensory analysis [5] and computational linguistics [6]. We utilized this method to estimate the risk of liver toxicity among users of oral antifungal treatments [7] and believe that it can be used more widely to help address similar questions. In the rest of this article we describe the statistical assumptions for the beta-binomial model, the process of estimating the probability of interest, methods to test for over-dispersion, and an example of its application.

The Beta-Binomial distribution
Both Chuang-Stein [2] and Ennis [5] provide excellent references for those who are interested in the history of the beta-binomial model. Recall the definition of the binomial distribution: where x is the number of successes in a sequence of n independent success/failure experiments, each of which has probability p for success.
Let probability p follow a beta distribution (p|α, β), then where Γ is the gamma function over the domain [0, 1]; α and β are two positive parameters. The beta distribution was selected in the past because of its flexibility (capable of a wide range of shapes, see Figure 1) and its ability to provide good approximations. As Skellam [8] stated as early as 1948, "in practice we could, at least in most cases, take this form of distribution as a convenient approximation." As a result, we arrive at a combination of the binomial distribution with a beta density function: where x takes on the values 0, 1, 2... n, and α and β are positive. Note in equation (3) that n is the total number of study subjects, and x is the total number of subjects Variety of shapes for beta distributions Figure 1 Variety of shapes for beta distributions.
with a certain adverse event, although what most investigators are interested in is the proportion p that varies between 0 and 1 and has the appearance of a continuous distribution.
So let p i = x i /n i , i = 1,2, ... k, where i indexes the different studies, x i is the number of events in the ith study and n i is the sample size of the study. To reiterate, within the context of multiple studies where each study with sample size n i and binomial probability p i (e.g. for adverse events), one binomial distribution cannot adequately describe the additional variation when p i varies and thus the data are fitted with a beta distribution with parameters (α, β), with α > 0 and β > 0. Let μ = α/(α+β), θ = 1/(α+β), where μ is the mean event rate (i.e., the expected value of a variable binomial parameter p) and θ is a measure of the variation in p. In short, we have constructed a two-stage model: The mean and variance of X are nμ and nμ(1-μ){θ/(1+θ)} [9]. One can view the term {θ/(1+θ)} as a multiplier of the binomial variance. In other words, it models the overdispersion. Some authors (e.g. Kleinman [10]) prefer the term γ where γ = θ/(1+θ) = 1/(α+β+1). Then the variance is nμ(1 -μ) γ. In essence, one can derive the same information from θ and γ about the beta-binomial distribution, so it is beneficial to know both and employ whichever is more convenient for computation.

Estimation of Parameters
Two main methods, one involving moments and the other involving maximum likelihood, are often used to estimate the parameters μ and θ.

The Moment Estimates Method
In terms of actual data observed from different studies, let x i is the number of events in the ith study and n i is the sample size of the study. The n i 's here are almost always unequal in clinical studies.
Let where {w i } represents a set of weights and w is the sum of all the weights [10].
Let also then the moment estimates of μ and γ are: and where . To derive θ, we can simply perform the following conversion: Providing the proper set of weights is challenging because {w i } is a function of the unknown parameter γ. Kleinman [10] first offered an empirical weighting procedure and suggested to set w i = n i or w i = 1 to obtain an initial approximation of estimates of μ and γ using equation (4). Using this estimation of γ to compute {w i }, one then can use these "empirical" weights to arrive at a new estimate of μ. In cases where γ estimates are negative, they are to be set to zero. Chuang-Stein [2] proposed an improvement on Kleinman's procedure by suggesting that the iteration be carried further until the differences between two consecutive sets of estimates for μ and γ are both smaller than some predetermined value. The example that was given in the paper [2] was 10 -6 .
Notations are simpler in cases where all n i 's are equal, then The moment estimates of μ and γ are and

The Maximum Likelihood Estimates Method
As is written above, let where is the beta function of α and β and is used here to simplify equation (3). The log likelihood function is then where c is a constant. Next we will need to take the partial derivative of the log likelihood function with respect to α and β. The ML equations involving α and β are where The second derivatives of lnL are: where These second derivatives of the log likelihood function can be used to form the Hessian matrix which, in turn, can be used to derive the standard errors for the parameters. An example will be given in a following section. Most often μ is the main parameter of interest, and therefore we present a direct estimation of it rather than proceeding through α and β.
Define f x (x) (x = 0,1,2, ..., k) as the observed frequencies of events from k trials. Then the likelihood of beta-binomial can be also written as Where P(x) has already been stated in (3). Let so that is the total sample size of all the individual trials combined.
The log likelihood function in terms of μ and θ is where c is a constant and the ML estimators of and are solutions of These equations can be solved iteratively using the Newton-Raphson method [11].
Again, the second partial derivatives of the log likelihood function can be used to form the Hessian matrix (H) at the ML solution which, after being inverted, can be used to derive the covariance matrix and the standard errors for the parameters: And the confidence intervals for and can be obtained by where Z 1-α/2 is the 1-α/2 percentile of a standard normal distribution function.
Once and are estimated, one can also derive and from the relationships that μ = α/(α+β), θ = 1/(α+β). It can easily be shown that the estimate of is / and the estimate of is (1 -)/ . If we substitute these estimates for α and β in the beta-binomial model (3), then the cumulative distribution can be calculated.
As we have shown above, either method can be used to estimate the parameters of the beta-binomial distribution.
Readers who are interested in more details should consult Griffiths [9] and Kleinman [10]. Researchers have implemented the maximum likelihood estimation (MLE) method in two popular commercial statistical software packages. In addition, free statistical software, such as R and WinBUGS, have methods for fitting the beta-binomial model, but they require some programming.
One of those two popular commercial statistical software packages is SAS (SAS Institute Inc., Cary, NC, USA). The macro BETABIN written by Ian Wakeling [12] is freely available. It borrows the existing SAS procedure NLMIXED to provide a maximum likelihood estimation of μ and θ. It provides not only the standard beta-binomial model, but also Brockhoff's [13] corrected beta-binomial model. Interested readers can also experiment directly with Proc NLMIXED to fit the beta-binomial model as others have done [14].
The other software is Stata (College Station, Texas). Guimarães provided the necessary computer commands for beta-binomial estimations using the Stata command xtnbreg with conditional maximum likelihood [15]. In addition, Guimarães emphasized the common knowledge that the beta-binomial distribution was a special case of the more general Dirichlet-multinomial (DM) distribution -with two parameters in this case. In the general Dirichlet-multinomial distribution there are m parameters, allowing far more than two (α and β) in the betabinomial distribution. In situations where one is indeed concerned with multiple types of adverse events associated with the same exposure, expanding to the Dirichlet-multinomial distribution is a logical solution. Technical details of the multinomial model have been given by others [15][16][17].

Test of overdispersion
Using the binomial model when the variability in the data exceeds what the binomial model can accommodate could result in an underestimation of the standard error of the pooled event rate and thus increase the chance of a Type I error. Ennis and Bi [5] described an experiment with 10,000 sets of simulated overdispersed binomial data where they found that the Type I error was 0.44 and not the false assumption of 0.05. It is precisely because the binomial model is unable to fit overdispersed binomial data that the application of the beta-binomial is necessary. So before one adopts the beta-binomial for the analysis of certain datasets, one must first examine whether the data are overdispersed to the extent that the beta-binomial model would be a better fit than the simple binomial model. There are several ways to examine overdispersion. We know that where γ = 1/(1 + α + β). If we are able to estimate γ, we can test whether γ is zero. If it is close to zero, then there is no significant overdispersion, and the binomial model will adequately describe the data. This test, however, has been found to be less sensitive in detecting departure from the binomial model because boundary problems arise as we test whether a positive-valued parameter is greater than 0 (recall that α and β are positive parameters, and consequently so are θ and γ) [5].
As one would expect, a likelihood ratio test can also be used to test for overdispersion, but the same boundary problem applies [18,19]. The null hypothesis is that the underlying distribution is binomial while the alternative hypothesis is that the distribution is beta-binomial. The log-likelihood for the binomial model (interpreted to be pooling the data from all studies without weighting) is The likelihood ratio test is where L BB is the log-likelihood value for the beta-binomial model (9) and L B is log-likelihood value for the binomial model (15).
Although a solution for the boundary problem has been offered [20], there is no consensus on the optimal solution [21]. To avoid the boundary problem, we can use the alternative -Tarone's Z statistic [22] -to test for overdispersion. This has been shown to be more sensitive than the parameter test (e.g. test for γ being zero) and the loglikelihood ratio test [5]: where This statistic Z has an asymptotic standard normal distribution under the null hypothesis of a binomial distribution. In short, we recommend caution in using the likelihood ratio test. It is better to combine it with Tarone's Z statistics. The Z statistics can also be used as a goodness-of-fit test. It has been shown to be superior to other goodness-of-fit measures [21]. We will be calculating Tarone's Z in our application example.

The Bayesian Approach
In the preceding sections we describe the beta-binomial model within the frequentist framework of statistics. Interestingly, in the Bayesian statistics field, the beta-binomial model is commonly described in Bayesian statistics textbooks as an example [23,24]. Since Bayesian statistical methods are now increasingly used in clinical and public health research, we hereby briefly describe the derivation of the beta-binomial model in the Bayesian framework. Some have noted that the Bayesian approach can provide more accurate estimates for small samples [25,26].
Recall that the binomial distribution (in equation 1) is the following: Let the conjugate prior π(p|α, β) be a beta distribution (i.e., if p in equation 1 follows the beta distribution) where Γ is the gamma function. The beta priors are selected because they are very flexible on (0, 1) and can represent a wide range of prior beliefs. These are similar to the reasons for selecting the beta distribution in the frequentist framework. In addition, by starting with the beta distribution as the conjugate prior, we ensure that the posterior distribution is always a beta distribution, and thus mathematically tractable for estimating the parameters.

M = 1/θ), so that
In short, we again have a two-stage model: In the Bayesian terminology, the beta prior distribution, when updated with binomial data, gives a beta posterior distribution. The Bayesian estimator can then be chosen as the mean, median, or the mode of this marginal posterior. In many situations, as long as the sample sizes are reasonably large (n = 50 or more), our previous methods of moment estimation and maximum likelihood are still preferred in the Bayesian framework for the estimations of mean and variance. There are other detailed mathematical equations involved in Bayesian estimation of the betabinomial model for specific cases. Interested readers could consult Lee and Sabavala [25] as well as Lee and Lio [26].

Results
We will illustrate the application of the beta-binomial method using an analysis that examined the adverse effects of oral anti-fungal agents. Oral anti-fungal agents, including terbinafine, itraconazole, and fluconazole, have become the treatment of choice for onychomycosis and dermatophytosis not responding to topical therapy. In order to study the safety profiles of these agents, we reviewed data from randomized and non-randomized controlled trials, case series, and cohort studies that enrolled patients having superficial dermatophytosis (tinea pedis, tinea mannus, tinea copora, and tinea cruris) or onychomycosis, aged 18 or above, receiving oral antifungal therapy for two or more weeks. One outcome of interest was the cumulative incidence of patients who withdrew from the study because of adverse reactions [7]. Data for 41 treatment arms of terbinafine from 37 studies (Table 1 and Appendix) are used as an example.
Event rates from different studies varied from 0 % to 13.89%. We apply the beta-binomial model with the maximum likelihood method to estimate the pooled event rates using SAS and SAS macro BETABIN. From all the eligible studies, we combine the data and obtain the summary estimate of risks and its 95% confidence intervals (CI).
The ML estimates for parameters μ and θ are = 0.0344 and = 0.0278. The estimate of the covariance matrix for and is In Table 2, we present different estimations of a pooled proportion (event rates) using the binomial model and the beta-binomial model. Using the binomial model, we compute a binomial probability and variance as if all the data were from a single study with a sample size of over 3,000. The pooled estimate is 3.70%, 8% higher than the beta-binomial estimate of 3.44%. The standard error from the collapsed data is 0.34%, misleadingly smaller than that of the beta-binomial estimation of 0.59%.
The important issue naturally is the test of overdispersion since that is the basis for preferring the beta-binomial  We also conduct a likelihood-ratio test between the betabinomial and the binomial, and again the test shows that there is significant overdispersion (p < 0.001). Finally, we calculate Tarone's Z statistic, and the result is consistent with other tests. It shows that the beta-binomial has better goodness-of-fit than the binomial (p < 0.001). The fit that the beta-binomial model gives for our example is also graphically presented in Figure 2.
As we have shown above, under the beta-binomial model the summary event rate is 3.44% with an estimated standard error of 0.59%. The θ is estimated to be 2.78% (Table  2), which gives an α estimate of 1.24 and a β estimate of 34.72. Once these parameters are estimated, we can use the estimated beta-binomial model to examine the probability of observing, for example, 105 or more adverse events in a new study of 1,000 subjects. Using equation 3, that probability is 5% under our estimated beta-binomial model.

Discussion
Along with the development of drugs, vaccines, and medical products for unmet medical needs, more robust analytic methods are needed to quantify the risks associated with the use of these agents, so that regulators and clinicians can rigorously assess the risk-benefit profiles of medical products. While randomized controlled trials have been established as the gold standard for efficacy evaluation, comprehensive safety assessment requires a collection of different methods. As any single trial is rarely large enough to estimate precisely the probability of serious adverse events, large observational datasets or aggregations of clinical trial results are necessary. A recent high profile example [27] illustrated the need to combine results from multiple studies to unearth safety signals that may not be apparent in individual studies. Developing on prior work by Chuang-Stein [2], we provide a more comprehensive background of the beta-binomial model, a model that could have wider application in clinical and public health research. In order to show new developments in the beta-binomial field over the past decade, we explain and demonstrate that the beta-binomial method can be used for the combination of heterogeneous studies to estimate event rates.
Estimating the correct summary event rate based on heterogeneous binomial data is so far the main reason for Beta distribution for the binomial proportions based on example Figure 2 Beta distribution for the binomial proportions based on example. adopting the beta-binomial distribution. Once this is accomplished, one might wish to examine whether specific attributes of the studies will have any meaningful impact. The beta-binomial model can incorporate these attributes into a regression model as covariates. For example, the main purpose of the study might be to evaluate the proportion of adverse events from all clinical trials involving drug A. Different studies might have different proportions of female subjects, and one may link the covariate, the proportion of female subjects, to the α parameter. In addition, different studies might include or exclude certain comorbid conditions. The comorbidity, defined as a binary variable, could also be included as a covariate. One can then evaluate the likelihood of the comorbidity increasing a specific side effect. As current meta-regression methods are mainly applied to comparative measures like relative risks, the advantage of the betabinomial model is that it can assess the correlation between study attributes and absolute risks of events.
Traditional meta-analysis can also combine event rates from heterogeneous sources by using the DerSimonian and Laird method [28]. We applied this method to the same dataset and placed the summary rate in Table 2, with an estimate of 3.90% with a standard error of 0.51%. This is in good agreement with our estimation using the betabinomial model. In medical product safety assessment, however, being able to derive a clear probability distribution offers advantages that traditional meta-analysis cannot, because the distributions allow the computation of absolute risks or probabilities involved in decision analysis. In the Bayesian framework, the beta-binomial model also enables better incorporation of prior knowledge and its associated uncertainty. In other words, even though traditional meta-analysis can also combine event rates, the adoption of the beta-binomial model can serve multiple purposes.

Conclusion
In the process of pooling event rates from multiple studies, one must consider the existence of overdispersion and the adequacy of the binomial model. In the example that we have presented, we estimated the pooled proportion of adverse events using the beta-binomial model. While we mainly discussed the application in safety assessment, the same method can be applied to assessment of efficacy of treatment response [29].