Skip to main content

Dose-response meta-analysis of differences in means



Meta-analytical methods are frequently used to combine dose-response findings expressed in terms of relative risks. However, no methodology has been established when results are summarized in terms of differences in means of quantitative outcomes.


We proposed a two-stage approach. A flexible dose-response model is estimated within each study (first stage) taking into account the covariance of the data points (mean differences, standardized mean differences). Parameters describing the study-specific curves are then combined using a multivariate random-effects model (second stage) to address heterogeneity across studies.


The method is fairly general and can accommodate a variety of parametric functions. Compared to traditional non-linear models (e.g. E max, logistic), spline models do not assume any pre-specified dose-response curve. Spline models allow inclusion of studies with a small number of dose levels, and almost any shape, even non monotonic ones, can be estimated using only two parameters. We illustrated the method using dose-response data arising from five clinical trials on an antipsychotic drug, aripiprazole, and improvement in symptoms in shizoaffective patients. Using the Positive and Negative Syndrome Scale (PANSS), pooled results indicated a non-linear association with the maximum change in mean PANSS score equal to 10.40 (95 % confidence interval 7.48, 13.30) observed for 19.32 mg/day of aripiprazole. No substantial change in PANSS score was observed above this value. An estimated dose of 10.43 mg/day was found to produce 80 % of the maximum predicted response.


The described approach should be adopted to combine correlated differences in means of quantitative outcomes arising from multiple studies. Sensitivity analysis can be a useful tool to assess the robustness of the overall dose-response curve to different modelling strategies. A user-friendly R package has been developed to facilitate applications by practitioners.

Peer Review reports


The identification and characterization of dose-response relationships is an essential part of the analysis in many scientific disciplines such as toxicology, pharmacology, and epidemiology. This is particularly important in the development and testing of new compounds (e.g. a new drug, pharmaceutical treatment) where trials at different stages aim to evaluate the efficacy of increasing levels of dosage (Phase II-III trials) or to derive a dose-response curve for selection of optimal doses (Phase IV trials) [1, 2].

Randomized clinical trials often investigate a continuous outcome variable, such as the efficacy or safety of a drug, reporting the change from baseline of a medical score, or the final value of a clinical measurement. The dose-response results are typically summarized by dose-specific means and standard deviations [3]. Measures of effect are expressed in terms of mean or standardized mean differences using a dose level, usually the placebo group, as referent [1]. Over the last few years methodological research focused on developing and improving methods for performing dose-response analysis in a single study [4, 5]. A conclusive result is hardly obtained by a single investigation and there is often the need to synthesize information collected from multiple studies. In such a case meta-analytic methods can be used to define an overall relation or to investigate heterogeneity across study findings.

A method for pooling aggregated dose-response data where the outcome is a log relative risk was originally presented by Greenland and Longnecker in 1992 [6]. Since then, several papers have refined and covered specific aspects of the methodology such as model specification [7, 8], modeling strategies [9, 10], and software implementation [11, 12]. Other methodological articles extended the approach for continuous outcome but in the case where individual patient data are available, mainly in the context of time-series environmental studies [1315].

Only a few alternatives have been proposed to pool aggregated dose-response data where the findings are summarized by differences in means. Davis and Chen [16] in 2004 described a methodology for summarizing dose-response curves of first and second generation antipsychotics in schizoaffective patients. The authors reconstructed drug-specific dose-response curves and conducted a meta-analysis to compare the effectiveness of medium vs high dosages. A common alternative to analyze the drug effect consists of fitting a random-intercept E max model, where the random component accounts for heterogeneity in placebo effect across trials [17]. Heterogeneity, however, may be related to other study characteristics rather than differences in placebo response such as implementation, participants, intervention, and outcome definition. Thomas et al. [18] adopted hierarchical Bayesian models to summarize and describe, independently, the distribution of study-specific model parameters derived from an E max model.

The mentioned strategies assumed pre-specified models that do not allow for non-monotonic curves which may occur in practice [19], as in case of dose-response data of antipsychotics. In addition, fitting study-specific sigmoidal curves such as the E max model requires that the single studies have assessed at least three dose levels in order to estimate model parameters. Discarding studies not providing enough data points represents a loss of information and may introduce bias.

The aim of this paper is to formalize and propose a general and flexible methodology to pool dose-response relations from aggregated data where the changes in the distribution of the quantitative outcome are expressed in terms of differences in means. We first present the data necessary for a dose-response meta-analysis and derive formulas for obtaining effect sizes and their variance/covariance structure. We describe flexible dose-response models with particular emphasis on regression splines. The method is then applied to dose-response data from clinical trials on use of aripiprazole and symptoms improvement in schizoaffective patients.


Dose-response data

The notation and data required for a dose-response meta-analysis for a generic study are displayed in Table 1. We consider I studies indexed by i=1,…,I reporting the results of a common treatment at different dose levels x ij ,j=1,…,J i , where x 0i =0 indicates the control or placebo group in the i-th study. The study-specific results typically consist of dose-specific means of an outcome variable, Y ij , that measures the efficacy of the j-th dose in the i-th study [3]. Additional information about the number of patients allocated in each treatment, n ij , and the sample standard deviations of Y ij , s d ij , is generally reported or obtained from the study-specific results.

Table 1 Notation for aggregated data in the i-th study used in dose-response meta-analysis of differences in meas

Effect sizes and their variance/covariance

A common way to reduce heterogeneity in placebo response is to compute the effect size (or treatment effect) as difference between dose-specific means and placebo mean. In case all studies measure the outcome on a common and interpretable scale, the difference can be based on the absolute scale

$$ d_{ij} = \bar{Y}_{ij} - \bar{Y}_{i0}, \quad j = 1,\dots, J_{i}, \; i = 1, \dots, I $$

Assuming common study-specific population standard deviations, the variance of d ij is defined as

$$ \text{Var} \left(d_{ij} \right) = \frac{n_{ij}+ n_{i0}}{n_{ij} n_{i0}} s_{p_{i}}^{2}, \quad j = 1,\dots, J_{i}, \; i = 1, \dots, I $$

where \(s_{p_{i}}^{2} = \sum _{j = 0}^{J_{i}} \left (n_{ij} - 1 \right) sd_{ij}^{2} / \sum _{j = 0}^{J_{i}} \left (n_{ij} - 1 \right)\) is the square of the pooled standard deviation for the i-th study. Since the study-specific mean differences d ij use the same referent values, \(\bar {Y}_{i0}\), they cannot be regarded as independent. The covariance term is defined as

$$ \text{Cov} \left(d_{ij}, d_{ij^{'}} \right) = \text{Var} \left(\bar{Y}_{j0} \right) = \frac{s_{i0}^{2}}{n_{i0}}, \quad j \ne j^{'}, \; i = 1,\dots, I $$

In case the outcome is measured on different scales the effect sizes can be based on standardized mean differences

$$ d_{ij}^{*} = \frac{\bar{Y}_{ij} - \bar{Y}_{i0}}{s_{p_{i}}}, \quad j = 1,\dots, J_{i}, \; i = 1, \dots, I $$


$$\begin{array}{*{20}l} \text{Var} \left(d_{ij}^{*} \right) &= \frac{1}{n_{ij}} + \frac{1}{n_{i0}} + \frac{{d_{ij}^{*}}^{2}}{2 \sum_{j=0}^{J_{i}} n_{ij}}, \quad j = 1,\dots, J_{i},\\ i &= 1, \dots, I \\ \text{Cov} \left(d_{ij}^{*}, d_{ij^{'}}^{*} \right) &= \frac{1}{n_{i0}} + \frac{d_{ij}^{*}d_{ij^{'}}^{*}}{2 \sum_{j=0}^{J_{i}} n_{ij}}, \quad j \ne j^{'}, \; i = 1,\dots, I \end{array} $$

Dose-response analysis

The chosen effect sizes and the corresponding (co)variances are used to estimate the study-specific dose-response curves. The dose-response curves characterize the relative efficacy of the dose under investigation using the placebo effect as referent (i.e. the relative efficacy for the placebo is zero by definition). The dose-response models are expressed through the parametric model f, which specifies how the effect size varies according to the dose values. The functional relationship f is parametrized in terms of θ i , the p×1 vector of dose-response coefficients. We consider the case of mean differences, d ij , but the same principles apply for standardized mean differences, \(d^{*}_{ij}\). The study-specific curves can be written as

$$ \boldsymbol{d}_{i} = f\left(\boldsymbol{x}_{i}, \boldsymbol{\theta}_{i} \right) + \boldsymbol{\varepsilon}_{i}, \;\;\; \boldsymbol{\varepsilon}_{i} \sim N \left(\boldsymbol{0}, \boldsymbol{\hat \Sigma}_{i} \right), \quad i = 1, \dots, I $$

\(\boldsymbol {\hat \Sigma }_{i} \) is the covariance matrix of the residual error term, with Var(d ij ) along the diagonal and \(\text {Cov} \left (d_{ij}, d_{ij^{'}} \right)\) off-diagonal.

Several alternatives are available to model the dose-response curve (i.e. for the choice of f). Table 2 shows the definition of 4 types of models ordered according to the number of parameters, ranging from 1 for a linear model up to 3 for a logistic model. See Bretz et al. for a comprehensive description [2].

Table 2 Frequently used dose-response models

The most common choice in dose findings [2] is the use of the E max model which is expressed in terms of three parameters: the maximum effect (θ 1i ), the dose to produce half of the maximum effect (θ 2i ) and the steepness of the curve (θ 3i ) [20]. As other non-linear models, the E max model assumes a specific shape that does not allow for non-monotonic curve and its estimation requires at least three non reference dose levels. Quadratic models are defined by only p=2 coefficients but may poorly fit at extreme dose values [9]. Other non-linear models such as logistic and sigmoidal models, are commonly defined by p≥3 coefficients so that study-specific aggregated data may not be sufficient to estimate the parameters.

We propose the use of regression splines to flexibly model the dose of interest. Splines represent a family of smooth functions that can describe a wide range of curves (i.e. U-shaped, J-shaped, S-shaped, threshold) [21]. The curves consist of piecewise polynomials over consecutive intervals defined by k knots. Their use may facilitate curve fitting since many non-linear curves can be examined by estimating only a small number of coefficients. For instance, a restricted cubic spline model with three knots k=(k 1,k 2,k 3) is defined only in terms of p=2 coefficients [22]

$$ \mathrm{E} \left[ \boldsymbol{d}_{i} | \boldsymbol{x}_{i} \right] = \theta_{1i} \boldsymbol{x}_{1i} + \theta_{2i} \boldsymbol{x}_{2i} $$

with two transformations [23] defined as

$$\begin{array}{*{20}l} x_{1} &= x \\ x_{2} &= \frac{\left(x - k_{1} \right)_{+}^{3} - \frac{k_{3} - k_{1}}{k_{3} - k_{2}} \left(x - k_{2} \right)_{+}^{3} + \frac{k_{2} - k_{1}}{k_{3} - k_{2}} \left(x - k_{3} \right)_{+}^{3}}{ (k_{3} - k_{1})^{2}} \end{array} $$

where the ‘+’ notation, with u +=u if u≥0 and u +=0 otherwise, has been used.

An alternative flexible approach to model the dose-response association is represented by fractional polynomials. In particular, a dose-response model based on fractional polynomial of order two can be written as in Eq. 6 with the two transformations defined as

$$\begin{array}{*{20}l} x_{1} &= x^{p_{1}} \text{ and } x_{2} = x^{p_{2}} \text{ if \(p_{1} \neq p_{2}\)} \\ x_{1} &= x^{p_{1}} \text{ and } x_{2} = {x}^{p_{1}} \log({x}) \text{ if \(p_{1} = p_{2}\) } \end{array} $$

for each combination of p 1 and p 2 in the predefined set of values {−2,−1,−0.5,0,0.5,1,2,3}; for p=0, x p becomes log(x). The best fitting fractional polynomial is typically chosen based on the Akaike’s Information Criterion [24].

Once the functional relation f has been selected, generalized least square estimation can be performed to efficiently estimates the dose-response coefficients \(\boldsymbol {\hat \theta }_{j}\) and the corresponding (co)variance matrix \(\boldsymbol {\hat V}_{j}\), by minimizing

$$ \left(\boldsymbol{d}_{i} - f\left(\boldsymbol{x}_{i}, \boldsymbol{\theta}_{i} \right) \right)^{T} \boldsymbol{\hat \Sigma}_{i}^{-1} \left(\boldsymbol{d}_{i} - f\left(\boldsymbol{x}_{i}, \boldsymbol{\theta}_{i} \right) \right) $$

that generally requires numerical optimization algorithms. If f is a linear combination of the parameters θ i , as in the case of regression splines and fractional polynomials, the close solution can be written as

$$\begin{array}{*{20}l} \boldsymbol{\hat \theta}_{i} &= \left(\boldsymbol{X}_{i}^{T} \boldsymbol{\hat \Sigma}_{i}^{-1} \boldsymbol{X}_{i} \right)^{-1} \boldsymbol{X}_{i}^{T} \boldsymbol{\hat \Sigma}_{i}^{-1} \boldsymbol{d}_{i} \\ \boldsymbol{\hat V}_{i} &= \text{Var} \left(\boldsymbol{\hat \theta}_{i} \right) = \left(\boldsymbol{X}_{i}^{T} \boldsymbol{\hat \Sigma}_{i}^{-1} \boldsymbol{X}_{i} \right)^{-1} \end{array} $$

where X i indicates the J i ×p design matrix in the i-th study.


The estimated study-specific dose-response coefficients \(\boldsymbol {\hat \theta }_{i}\) and the accompanying (co)variance matrices \(\boldsymbol {\hat V}_{i}\) are combined by means of multivariate meta-analysis

$$ \boldsymbol{\hat \theta}_{i} \sim N \left(\boldsymbol{\theta}, \boldsymbol{\hat V}_{i} + \boldsymbol{\Psi} \right) $$

A fixed-effects model assumes no statistical heterogeneity among study results, i.e. differences in the dose-response coefficients are only related to sampling error. The assumption of homogeneity may not hold in practice, unless it is known that the studies are performed in a similar way and are sampled from the same population [25]. The Cochran’s Q test [26] is typically used to test statistical heterogeneity across studies (H 0:Ψ=0) [27]. Selected studies, however, will typically differ with respect to study design and implementation, selection of participants, and type of analyses. A certain degree of heterogeneity is expected and should be taken into account in the analysis. A random-effects model allows the dose-response coefficients, θ i , to vary across studies. Statistical heterogeneity is captured by the between-studies variance Ψ while θ represents the mean of the distribution of dose-response coefficients and an estimate, \(\boldsymbol {\hat \theta }\), can be obtained using (restricted) maximum likelihood estimation [15].

As a final result, the pooled dose-response curve can be presented in either a graphical or tabular form by predicting the mean differences of the outcome for a set of x dose values

$$ \mathrm{E} [ \boldsymbol{\hat{d}} | \boldsymbol{x} ] = f\left(\boldsymbol{X}, \boldsymbol{\hat \theta} \right) $$

with an approximate (1−α/2)% confidence interval (CI), that in case f is a linear combination of θ can be expressed as

$$ \mathrm{E} [ \boldsymbol{\hat{d}} | \boldsymbol{x} ] \mp z_{\frac{\alpha}{2}} \sqrt{\text{diag} \left(f\left(\boldsymbol{X}, \boldsymbol{\hat \theta} \right)^{T} \text{Cov}\left(\boldsymbol{\hat \theta} \right) f\left(\boldsymbol{X}, \boldsymbol{\hat \theta} \right) \right)} $$

where z α/2 is the α/2-th quantile of a standard normal distribution.

Dose findings

Once the pooled dose-response curve has been estimated, it may be of interest to determine a set of target doses, i.e. doses associated with prespecified outcome effects. In development of new compounds it is often important to select an optimal dose which is almost as effective as the maximum effective dose but has less undesired side effects, which often occur at high dosages. Suppose one wants to determine which is the lowest dose (ED γ ) to produce an almost complete effect, e.g. γ % of the observed maximum predicted response.

The ED γ can be determined as

$$ \mathrm{\widehat{ED}}_{\gamma} = \underset{x \in \left(0, x_{\text{max}} \right]}{\operatorname{argmax}} \left\{ \frac{\mathrm{E} [ {\hat{d}} | {x} ] }{\mathrm{E} [ {\hat{d}} | {{x_{\text{max}}}} ]} \geq \gamma \right\} $$

where x max is the dose corresponding to the maximum predicted outcome.

An important step when presenting results from dose findings analysis is to accompany the previous estimates with a measure of precision, typically confidence intervals. Pinheiro et al. [3] proposed the use a parametric bootstrap approach based on the asymptotic normal distribution of \(\boldsymbol {\hat \theta }\), the pooled estimate of the dose-response coefficients. The approach consists in re-sampling the dose-response coefficients θ from its approximate normal distribution and derive the distribution of \(\mathrm {\widehat {ED}}_{\gamma }\) based on the samples. Approximated confidence intervals for \(\mathrm {\widehat {ED}}_{\gamma }\) can be constructed using percentiles of the sampling distribution.


To illustrate the methodology we examined the dose-response relation between aripiprazole, a second-generation antipsychotic, and symptoms improvement in schizoaffective patients. We updated the search strategy presented in a previous review by Davis and Chen [16] by searching the Medline, International Pharmaceutical Abstracts, CINAHL, and the Cochrane Database of Systematic Reviews. To reduce the exclusion of unpublished papers, additional sources including Food and Drug administration website, data from Cochrane reviews, poster presentations and conference abstracts were also searched. All random-assignment, double-blind, controlled clinical trials of schizoaffective patients providing dose-response results for at least two non-zero dosages of aripiprazole were eligible.

Five studies [2832] met the inclusion criteria and were included in the analysis. All the studies reported mean changes from baseline as main outcome variable, using the Positive and Negative Syndrome Scale (PANSS). The PANSS scale is an ordinal score derived from 30 items ranging from 1 to 7. Computations of ratios such as percentage changes are not directly applicable and may lead to erroneous results [33, 34]. To address this issue, the theoretical minimum (i.e. 30) needs to be subtracted from the original score [35]. Information about the means, the number of patients assigned to each treatment, and the standard deviation was available from the published data. Because all the studies measured the outcome variable on the same scale, we computed PANSS mean differences as effect sizes. Data are reported in Table 3.

Table 3 Aggregated dose-response data of five clinical trials investigating effectiveness of different dosages of aripiprazole in schizoaffective patients. The continuous outcome is measured on the Positive and Negative Syndrome Scale and summarized by mean values (mean(Y)) and standard deviations (sd(Y))

We used the trial by Cutler et al. [28] to illustrate the steps required for estimating the dose-response curve for a single study. For example, the difference in mean PANSS comparing the dose of 2 mg/day relative to 0 mg/day is d 11=8.23−5.3=2.93 mg/day. Its variance is \(\text {Var}(d_{11}) = (85+92)/(85\times 92)\times s_{p_{1}}^{2}\) = 7.59, where \(s_{p_{1}}^{2}=119,419/356=335.4\). The covariance of this difference in means PANSS is 18.312/85=3.94. The variance/covariance structure associated with the vector of differences in means for this trial d 1 can be presented in a matrix form

$$\begin{array}{@{}rcl@{}} \boldsymbol{\hat \Sigma_{1}} =\left[ \begin{array}{lll} 7.59 & & \\ 3.94 & 7.72 & \\ 3.94 & 3.94 & 7.52 \\ \end{array}\right] \end{array} $$

To estimate the dose-response curve we need first to specify the model f. We characterized the dose-response relation using a restricted cubic spline model with three knots located at the 10th, 50th, and 90th percentiles (0, 10, and 30 mg/day) of the overall dose distribution (p = 2). The restricted cubic spline dose-response model is defined as in Eq. 7. Efficient estimates of the dose-response coefficients and (co)variance matrix were obtained by generalized least square estimation

$$\begin{array}{*{20}l} \boldsymbol{\hat \theta}_{1} &= (1.215, -5.738)^{T} \\ \boldsymbol{\hat V}_{1} &= \text{Var} \left(\boldsymbol{\hat \theta}_{1} \right) =\left[ \begin{array}{ll} 0.49 & \\ -3.65 & 31.64 \\ \end{array}\right] \end{array} $$

We applied the same procedure to the other studies included in the meta-analysis in order to obtain the study-specific \(\boldsymbol {\hat \theta }_{i}\) and \(\boldsymbol {\hat {V}}_{i}\), i=1,…,5 (Table 4). The study-specific predicted curves are presented in Fig. 1. Under a random-effects model, restricted maximum likelihood estimates were

$$ \begin{aligned} \boldsymbol{\hat \theta} &= (0.937, -1.156)^{T} \\ \widehat{\text{Cov}} \left(\boldsymbol{\hat \theta} \right) &= \left[\begin{array}{cc} 0.03 & \\ -0.05 & 0.10 \\ \end{array}\right] \end{aligned} $$
Fig. 1

Study-specific mean differences in Positive and Negative Syndrome Scale score for increasing dosages of aripiprazole. The first author and year of publication of the subjects included in the original analyses are reported. Black squares indicate the mean differences and whiskers their 95 % confidence interval. Ariprazole dosage was modeled with restricted cubic splines. Solid lines represent the estimated dose-response curves, dashed lines the corresponding 95 % confidence intervals. The placebo group (dose = 0) served as the referent group

Table 4 Study-specific dose-response coefficients and corresponding covariances for different dose-response models considered in the analysis

A p-value < 0.001 for the multivariate Wald-type test H 0:θ=0 provided strong evidence against the null hypothesis of no relation between different doses of aripiprazole and mean change PANSS score. The Q test (Q=3.5, p-value = 0.899) did not detect substantial statistical heterogeneity across studies.

To communicate results of the pooled dose-response analysis, we can estimate the pooled mean differences in PANSS scores using 0 mg/day as referent as 0.937x 1−1.156x 2, together with the corresponding 95 % confidence interval for a generic dose x of interest as following

$$\left(0.937 x_{1} -1.156 x_{2} \right) \mp 1.96 \sqrt{0.03 {x_{1}^{2}} + 0.1 {x_{2}^{2}} - 0.1x_{1}x_{2}} $$

where x 1 and x 2 are defined as in Eq. 8. For instance, the model-based predicted mean changes in PANSS score compared to placebo were 4.52 (95 % CI: 2.96, 6.08) for 5 mg/day, 8.08 (95 % CI: 5.43, 10.73) for 10 mg/day, 9.95 (95 % CI: 6.97, 12.94) for 15 mg/day, 10.38 (95 % CI: 7.49, 13.27) for 20 mg/day, 9.84 (95 % CI: 6.86, 12.83) for 25 mg/day, and 8.83 (95 % CI: 5.11, 12.54) for 30 mg/day. The pooled predicted dose-response curve together with the confidence intervals and the model mean differences is provided in Fig. 2.

Fig. 2

Pooled dose-response association between aripiprazole and mean change in Positive and Negative Syndrome Scale score (solid line). Aripiprazole dosage was modeled with restricted cubic splines in a random-effects model. Dash lines represent the 95 % confidence intervals for the spline model. The placebo group (dose = 0) served as the referent group. Circles indicate observed mean differences in individual studies; size of bubbles is proportional to precision (inverse of variance) of the mean differences. Right axis represents percentage of the maximum predicted effect

The results indicated a statistically significant positive association between increasing doses of aripiprazole and the mean change in PANSS score with the maximum value of 10.39 (95 % CI: 7.48, 13.30) observed at x max = 19.32 mg/day. The model suggested a slight decrease in the predicted mean PANSS score for dosages greater than 20 mg/day. The estimated dose to produce 50 % and 80 % of the predicted maximum effect were \(\mathrm {\widehat {ED}}_{50} =\) 5.82 mg/day (95 % CI: 5.10, 8.58) and \(\mathrm {\widehat {ED}}_{80} =\) 10.43 mg/day (95 % CI: 9.02, 16.73).

Sensitivity analysis

A sensitivity analysis is often required to evaluate the robustness of the pooled dose-response curve. In the spline model, for example, the location of the knots may affect the shape of the dose-response curve. Therefore we considered alternative knots locations including different combinations of the 10th, 25th, 50th, 75th and 90th percentiles of the overall dose distribution (0, 0.5, 10, 18.75, and 30 mg/day). A graphical comparison is presented in the left panel of Fig. 3. The alternative curves roughly described the same dose-response shape with no substantial variation, all indicating an increase in the mean change PANSS score up to 20 mg/day of aripiprazole. We can assess whether there is an increasing trend above 20 mg/day by simply re-defining x 2 equal to (x−20)+ in Eq. 8; this approach is known as piecewise linear model. The rate of change in the PANSS mean differences was negative and not statistically significant (θ 1+θ 2=-0.284, p = 0.18) after 20 mg/day of aripiprazole.

Fig. 3

Graphical sensitivity analysis for the pooled dose-response curves between aripiprazole and mean change in Positive and Negative Syndrome Scale score. The placebo group (dose = 0) served as the referent group. Right axis represents percentage of the maximum predicted effect. Left panel: different location of the three knots in a restricted cubic spline model. Right panel: different models, restricted cubic splines (solid line), fractional polynomials (dashed line), quadratic polynomial (dotted line), and E max model (dot-dashed line). Circles indicate observed mean differences in individual studies; size of bubbles is proportional to precision (inverse of variance) of the mean differences

To evaluate the sensitivity of the dose-response curve to the choice of the parametric model f adopted, instead, we considered three alternatives: fractional polynomials; quadratic; and E max. Since two studies only had two non-referent doses, the study-specific (sigmoidal) E max models as described in Table 2 cannot be estimated. A common solution is to fix the steepness of the curve θ 3 to be 1, also referred to as hyperbolic E max [20].

The “best” fractional polynomials (p 1=0.5, p 2=1) provided overall a similar dose-response curve when compared to the spline model, with slightly higher value for the maximum predicted response (right panel of Fig. 3). The hyperbolic E max had substantially higher predicted mean differences for low values of the dose. The non-linear model assumes a specific hyperbolic dose-response curve that did not seem to fit the data and may be dependent from the choice of fixing θ 3 to be 1. The dose-response curve described by the quadratic model fall in between the spline and the hyperbolic E max curves.


In this paper we proposed a statistical method to combine differences in means of quantitative outcomes. The method consists of dose-response models estimated within each study (first stage) and an overall curve obtained by pooling study-specific dose-response coefficients (second stage). The covariance among study-specific mean differences is taken into account in the first stage analysis using generalized least square estimators, while statistical heterogeneity across studies is allowed by multivariate random-effects model in the second stage.

One major strength of the proposed method is that it is fairly general and can accommodate different modeling strategies, including non-linear ones described by Pinheiro et al. [3]. Non-linear models, however, are defined by at least three or four parameters, and hence require an equal number of dose levels for each single study included in the analysis. Given that some studies may have investigated a lower number of dose levels, exclusion of these studies may result in substantial loss of information. In addition, many non-linear models assume a specific behaviour (e.g. monotonicity) requiring a strong a priori information about the dose-response curve. The choice of the parametric model is critical, since it highly influences the final results [3]. Indeed, the selection of the dose-response model should be informed by subject-matter knowledge as well as understanding of the research questions at hand. We presented the use of regression splines as a flexible tool for modeling any quantitative exposure. The major advantage is that a variety of curves, even non monotonic ones, can be estimated using only two parameters. It is considered to be closed to non-parametric regression, since no major assumptions about the shape of the curve are needed [9]. A possible alternative is the use of fractional polynomials. In comparing the two strategies, we did not find important differences between the two strategies and concluded that both are useful tools to characterize a (non-linear) dose-response curve. Nonetheless a sensitivity analysis is generally required to evaluate the robustness of the combined results.

A possible limitation of the proposed methodology is that it requires information about dose-specific means and standard deviations. Studies providing other summary measures, such as dose-specific medians, would not be included the analysis. The dose-response analysis presented in Eq. 6 is based on the asymptotic normal distribution of the conditional mean effect size. Extension of the introduced methodology to percentiles is not straightforward and may represent an interesting topic of future research.

An additional limitation of aggregated dose-response data is that supplementary information for approximating the covariance terms may not be available. Articles may report directly mean or standardized mean differences and standard errors for non-referent dose groups. Whenever the standard deviation for the outcome variable in the control group (\({s_{i0}^{2}}\)) cannot be obtained, it may be approximated using the pooled standard deviation based on the non-referent dose levels (\(s_{p_{ij}}^{2}\)). Alternatively a specific value may be imputed and a sensitivity analysis can be performed to evaluate how the results of the meta-analysis vary for different values of \({s_{0j}^{2}}\). Further limitations relate to the general application of meta-analysis based on aggregated data. These include restrictions in subgroup analyses, the impossibility of assessing the appropriateness of individual analyses, and to harmonize variable definitions and analyses for reducing the extent of heterogeneity, as well as specific biases such as aggregation (or ecological) bias in meta-regression models. Meta-analysis of individual patient data, however, are often difficult to undertake especially for the availability of individual data, so that usage of aggregated data may represent the only alternative [36]. Specific to aggregated dose-response data, different dose references and exposure range may complicate the analysis. The presented methodology assume that all the selected studies share a common dose-response model. Important departure from this assumption may limit and/or impact the pooling of individual dose-response coefficients. An alternative methods has been proposed based on a series of univariate meta-analyses of effect sizes for a pre-specified grid of dose-levels [37]. Further work is needed to analyze this possibility and potential advantages. Depending on the extent of heterogeneity of the dose-response curves, however, it may not be opportune to pool study-specific results, and meta-regression or stratified analyses should be performed [38].

In our application, we considered the effectiveness of increasing dosages of aripiprazole in shizoaffective patients. We described the steps needed to obtain the overall dose-response curve and to present it in a graphical form. We observed a non-linear association with the maximum efficacy corresponding to aripiprazole 19.32 mg/day. An estimated dose of 10.46 mg/day, however, may be sufficient to obtain 80 % of the maximum effect, which may be relevant for avoiding possible undesired side effects. Sensitivity analysis showed similar results as compared to fractional polynomials. The E max model presented higher drug efficacy for low dosages. Compared to the previous models, the E max model did not seem to fit properly the data at low dosages.


We described an approach to combine differences in means of a quantitative outcome contrasting different dose levels relative to a placebo in randomized trials. The general framework of the proposed methodology can include a variety of flexible models. Sensitivity analysis can be a useful tool to assess the stability of the overall dose-response curve to different modelling strategies. Although the method was presented for the analysis of randomized trials, it may be extended to observational studies where mean differences are further adjusted for potential confounders. Future work is needed to evaluate the properties of the statistical model and validity of the underlying assumptions. A user friendly procedure is implemented in the dosresmeta R package [39] with worked examples available on GitHub.


PANSS, positive and negative syndrome scale


  1. 1

    Bretz F, Pinheiro JC, Branson M. Combining multiple comparisons and modeling techniques in dose-response studies. Biometrics. 2005; 61(3):738–48. doi:10.1111/j.1541-0420.2005.00344.x. Accessed 24 Feb 2015

    CAS  Article  PubMed  Google Scholar 

  2. 2

    Bretz F, Hsu J, Pinheiro J, Liu Y. Dose finding - a challenge in statistics. Biometrical Journal. Biometrische Zeitschrift. 2008; 50(4):480–504. doi:10.1002/bimj.200810438.

    Article  PubMed  Google Scholar 

  3. 3

    Pinheiro JC, Bretz F, Branson M. Analysis of dose–response studies. modeling approaches. In: Dose Finding in Drug Development. New York City: Springer: 2006. p. 146–71.

    Google Scholar 

  4. 4

    Ting N. Dose finding in drug development. New York City: Springer; 2006.

    Google Scholar 

  5. 5

    Chevret S. Statistical methods for dose-finding experiments. New York City: Wiley; 2006.

    Google Scholar 

  6. 6

    Greenland S, Longnecker MP. Methods for trend estimation from summarized dose-response data, with applications to meta-analysis. Am J Epidemiol. 1992; 135(11):1301–1309.

    CAS  PubMed  Google Scholar 

  7. 7

    Berlin JA, Longnecker MP, Greenland S. Meta-analysis of epidemiologic dose-response data. Epidemiology. 1993; 4(3):218–28.

    CAS  Article  PubMed  Google Scholar 

  8. 8

    Dumouchel W. Meta-analysis for dose–response models. Stat Med. 1995; 14(5-7):679–85.

    CAS  Article  PubMed  Google Scholar 

  9. 9

    Bagnardi V, Zambon A, Quatto P, Corrao G. Flexible meta-regression functions for modeling aggregate dose-response data, with an application to alcohol and mortality. Am J Epidemiol. 2004; 159(11):1077–1086. doi:10.1093/aje/kwh142. Accessed 25 Nov 2013

    Article  PubMed  Google Scholar 

  10. 10

    Liu Q, Cook NR, Bergström A, Hsieh CC. A two-stage hierarchical regression model for meta-analysis of epidemiologic nonlinear dose–response data. Comput Stat Data Anal. 2009; 53(12):4157–167.

    Article  Google Scholar 

  11. 11

    Orsini N, Bellocco R, Greenland S, et al. Generalized least squares for trend estimation of summarized dose-response data. Stata J. 2006; 6(1):40.

    Google Scholar 

  12. 12

    Orsini N, Li R, Wolk A, Khudyakov P, Spiegelman D. Meta-analysis for linear and nonlinear dose-response relations: examples, an evaluation of approximations, and software. Am J Epidemiol. 2012; 175(1):66–73.

    Article  PubMed  Google Scholar 

  13. 13

    Dominici F, Samet JM, Zeger SL. Combining evidence on air pollution and daily mortality from the 20 largest us cities: a hierarchical modelling strategy. J Royal Stat Soc Ser A (Stat Soc). 2000; 163(3):263–302.

    Article  Google Scholar 

  14. 14

    Simmonds MC, Higginsa JP, Stewartb LA, Tierneyb JF, Clarke MJ, Thompson SG. Meta-analysis of individual patient data from randomized trials: a review of methods used in practice. Clin Trials. 2005; 2(3):209–17.

    Article  PubMed  Google Scholar 

  15. 15

    Gasparrini A, Armstrong B, Kenward MG. Multivariate meta-analysis for non-linear and other multi-parameter associations. Stat Med. 2012; 31(29):3821–839. doi:10.1002/sim.5471. Accessed 11 Dec 2013

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  16. 16

    Davis JM, Chen N. Dose response and dose equivalence of antipsychotics. J Clin Psychopharmacol. 2004; 24(2):192–208.

    CAS  Article  PubMed  Google Scholar 

  17. 17

    Holford NH, Sheiner LB. Understanding the dose-effect relationship: clinical application of pharmacokinetic-pharmacodynamic models. Clin Pharmacokinet. 1981; 6(6):429–53.

    CAS  Article  PubMed  Google Scholar 

  18. 18

    Thomas N, Sweeney K, Somayaji V. Meta-analysis of clinical dose-response in a large drug development portfolio. Stat Biopharm Res. 2014; 6(4):302–17. doi:10.1080/19466315.2014.924876. Accessed 11 Feb 2015

    Article  Google Scholar 

  19. 19

    Geddes J, Freemantle N, Harrison P, Bebbington P. Atypical antipsychotics in the treatment of schizophrenia: systematic overview and meta-regression analysis. BMJ : British Medical Journal. 2000; 321(7273):1371–1376. Accessed 24 Mar 2015.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  20. 20

    Macdougall J. Analysis of dose–response studies. emax model. In: Dose Finding in Drug Development. New York City: Springer: 2006. p. 127–45.

    Google Scholar 

  21. 21

    Durrleman S, Simon R. Flexible regression models with cubic splines. Stat Med. 1989; 8(5):551–61. doi:10.1002/sim.4780080504. Accessed 30 Mar 2015

    CAS  Article  PubMed  Google Scholar 

  22. 22

    Desquilbet L, Mariotti F. Dose-response analyses using restricted cubic spline functions in public health research. Stat Med. 2010; 29(9):1037–1057. doi:10.1002/sim.3841. Accessed 25 Nov 2013

    PubMed  Google Scholar 

  23. 23

    Harrell F.E. Jr, Lee KL, Pollock BG. Regression models in clinical studies: determining relationships between predictors and response. J Natl Cancer Inst. 1988; 80(15):1198–1202.

    Article  PubMed  Google Scholar 

  24. 24

    Royston P. A strategy for modelling the effect of a continuous covariate in medicine and epidemiology. Stat Med. 2000; 19(14):1831–1847.

    CAS  Article  PubMed  Google Scholar 

  25. 25

    Jackson D, White IR, Riley RD. Quantifying the impact of between-study heterogeneity in multivariate meta-analyses. Stat Med. 2012; 31(29):3805–820.

    Article  PubMed  PubMed Central  Google Scholar 

  26. 26

    Cochran WG. The combination of estimates from different experiments. Biometrics. 1954; 10(1):101–29. doi:10.2307/3001666. Accessed 09 Jan 2015

    Article  Google Scholar 

  27. 27

    Ritz J, Demidenko E, Spiegelman D. Multivariate meta-analysis for data consortia, individual patient meta-analysis, and pooling projects. J Stat Plann Infer. 2008; 138(7):1919–1933.

    Article  Google Scholar 

  28. 28

    Cutler AJ, Marcus RN, Hardy SA, O’Donnell A, Carson WH, McQuade RD. The efficacy and safety of lower soses of aripiprazole for the treatment of patients with acute exacerbation of schizophrenia. CNS Spectrums. 2006; 11(09):691–702. doi:10.1017/S1092852900014784. Accessed 06 Feb 2015

    Article  PubMed  Google Scholar 

  29. 29

    McEvoy JP, Daniel DG, Carson WH, McQuade RD, Marcus RN. A randomized, double-blind, placebo-controlled, study of the efficacy and safety of aripiprazole 10, 15 or 20 mg/day for the treatment of patients with acute exacerbations of schizophrenia. J Psychiatr Res. 2007; 41(11):895–905. doi:10.1016/j.jpsychires.2007.05.002.

    Article  PubMed  Google Scholar 

  30. 30

    Kane JM, Carson WH, Saha AR, McQuade RD, Ingenito GG, Zimbroff DL, Ali MW. Efficacy and safety of aripiprazole and haloperidol versus placebo in patients with schizophrenia and schizoaffective disorder. J Clin Psychiatry. 2002; 63(9):763–71.

    CAS  Article  PubMed  Google Scholar 

  31. 31

    Potkin SG, Saha AR, Kujawa MJ, Carson WH, Ali M, Stock E, Stringfellow J, Ingenito G, Marder SR. Aripiprazole, an antipsychotic with a novel mechanism of action, and risperidone vs placebo in patients with schizophrenia and schizoaffective disorder. Arch Gen Psychiatr. 2003; 60(7):681–90. doi:10.1001/archpsyc.60.7.681.

    CAS  Article  PubMed  Google Scholar 

  32. 32

    Turner EH, Knoepflmacher D, Shapley L. Publication bias in antipsychotic trials: an analysis of efficacy comparing the published literature to the US Food and Drug Administration Database. PLoS Med. 2012;9(3). doi:10.1371/journal.pmed.1001189. Accessed 30 Mar 2015.

  33. 33

    Obermeier M, Schennach-Wolff R, Meyer S, Möller HJ, Riedel M, Krause D, Seemüller F. Is the panss used correctly? a systematic review. BMC psychiatry. 2011; 11(1):1.

    Article  Google Scholar 

  34. 34

    Leucht S, Kissling W, Davis JM. The PANSS should be rescaled. Schizophrenia bulletin. 2010; 36:461–462.

  35. 35

    Leucht S, Davis JM, Engel RR, Kane JM, Wagenpfeil S. Defining response in antipsychotic drug trials: recommendations for the use of scale-derived cutoffs. Neuropsychopharmacology. 2007; 32(9):1903–1910.

    CAS  Article  PubMed  Google Scholar 

  36. 36

    Lyman GH, Kuderer NM. The strengths and limitations of meta-analyses based on aggregate data. BMC Med Res Methodol. 2005; 5(1):14.

    Article  PubMed  PubMed Central  Google Scholar 

  37. 37

    Sauerbrei W, Royston P. A new strategy for meta-analysis of continuous covariates in observational studies. Stat Med. 2011; 30(28):3341–360.

    Article  PubMed  Google Scholar 

  38. 38

    Greenland S. Invited commentary: a critical look at some popular meta-analytic methods. Am J Epidemiol. 1994; 140(3):290–6.

    CAS  PubMed  Google Scholar 

  39. 39

    Crippa A, Orsini N. Multivariate dose-response meta-analysis: the dosresmeta r package. J Stat Softw. 2016. In Press.

Download references


We are grateful to Dr. Stefan Leucht and John M. Davis for providing the data and raising the methodological question under study.


This work was supported by Karolinska Institutet’s funding for doctoral students (KID-funding) (AC) and by a Young Scholar Award from the Karolinska Institutet’s Strategic Program in Epidemiology (SfoEpi) (NO).

Availability of data and materials

The data on the effectiveness of aripiprazole are publicly available and also contained in dosresmeta R package on github [39] (

Authors’ contributions

AC developed the methods and prepared a draft. NO provided critical reviews, corrections and revisions. Both authors read and approved the final version of the manuscript.

Authors’ information

AC is a PhD student in Epidemiology and Biostatistics. Dr. NO is Associate Professor of Medical Statistics.

Competing interests

The authors declare that they have no competing interest.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Author information



Corresponding author

Correspondence to Alessio Crippa.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Crippa, A., Orsini, N. Dose-response meta-analysis of differences in means. BMC Med Res Methodol 16, 91 (2016).

Download citation


  • Meta-analysis
  • Dose-response
  • Mean differences
  • Random-effects