Skip to main content
  • Research article
  • Open access
  • Published:

Efficient two-step multivariate random effects meta-analysis of individual participant data for longitudinal clinical trials using mixed effects models

Abstract

Background

Mixed effects models have been widely applied in clinical trials that involve longitudinal repeated measurements, which possibly contain missing outcome data. In meta-analysis of individual participant data (IPD) based on these longitudinal studies, joint synthesis of the regression coefficient parameters can improve efficiency, especially for explorations of effect modifiers that are useful to predict the response or lack of response to particular treatments.

Methods

In this article, we provide a valid and efficient two-step method for IPD meta-analyses using the mixed effects models that adequately addresses the between-studies heterogeneity using random effects models. The two-step method overcomes the practical difficulties of computations and modellings of the heterogeneity in the one-step method, and enables valid inference without loss of efficiency. We also show the two-step method can effectively circumvent the modellings of the between-studies heterogeneity of the variance-covariance parameters and provide valid and efficient estimators for the regression coefficient parameters, which are the primary objects of interests in the longitudinal studies. In addition, these methods can be easily implemented using standard statistical packages, and enable synthesis of IPD from different sources (e.g., from different platforms of clinical trial data sharing systems).

Results

To assess the proposed method, we conducted simulation studies and also applied the method to an IPD meta-analysis of clinical trials for new generation antidepressants. Through the numerical studies, the validity and efficiency of the proposed method were demonstrated.

Conclusions

The two-step approach is an effective method for IPD meta-analyses of longitudinal clinical trials using mixed effects models. It can also effectively circumvent the modellings of the between-studies heterogeneity of the variance-covariance parameters, and enable efficient inferences for the regression coefficient parameters.

Peer Review reports

Background

In clinical trials for drug developments, the outcome variables often involve longitudinal repeated measurements. In these trials, the primary statistical analyses usually compare the responses from experimental treatments with a control treatment at the end of a follow-up period. However, during the follow-up period, dropouts or missing data often arise, and may seriously influence the validity and precision of the statistical inference. Conventionally, simple imputation methods such as LOCF (last observation carried forward) had been adopted to handle these problems, but these ad-hoc approaches have been known to provide biased results and possibly produce erroneous conclusions [1, 2]. Recently, regulatory guidelines concerning missing data problems in clinical trials have been issued [1, 3, 4], and these invalid practices are rapidly substituted by more sophisticated methods. The mixed effects model [5, 6], especially known as the MMRM (mixed effects model for repeated measures) [7], is one of these methods and has been widely adopted for primary analyses of recent clinical trials in drug developments, which allows for valid statistical inference based on the direct likelihood approach under the missing at random (MAR) assumptions [5, 6].

The mixed effects model are increasingly used in the context of evidence synthesis as well. Recent advances of clinical trials data sharing systems (e.g., ClinicalStudyDataRequest.com, the YODA project) have enabled evidence synthesis of large individual participant level datasets and explorations of effect modifiers that are useful to predict the response or lack of response to particular treatments, because the powers of treatment-by-covariate interaction tests increasingly gain using rich statistical information from multiple clinical trials [8, 9]. However, there are additional complexities in the individual participant data (IPD) meta-analysis to synthesize longitudinal trial datasets based on the mixed effects models, due to the longitudinal nature of the datasets. The most important issue is about addressing correlations among different time points in the repeated measurements and the between-studies heterogeneity. In the IPD meta-analyses of ordinary univariate regression models such as analysis of covariance (ANCOVA) models, the synthesizing analyses can be simply implemented by multivariate meta-analysis [10, 11], but for the mixed effects models, the between-studies heterogeneity should be adequately considered not only for the fixed effects coefficients, but also for the variance-covariance parameters, simultaneously. In addition, to use the statistical information among multiple trials maximally, we should consider the correlations among the estimators of model parameters to gain efficiency. In the IPD meta-analyses using mixed effects models, Ishak et al. [12] considered to aggregate only summarized data of each time point, but their methods do not use the correlation information. Also, Jones et al. [13] proposed to implement IPD meta-analyses of longitudinal data using mixed effects models, but their methods only assumed fixed effects models for the regression coefficient parameters, and the heterogeneity among the studies was not considered. In existing IPD meta-analyses, these problems have also not been addressed and fixed effects models that assume no heterogeneity have been usually used.

In this article, we provide a valid and efficient method for IPD meta-analyses using the mixed effects models that adequately address the between-studies heterogeneity using random effects models. We especially show the ordinary one-step methods have practical difficulties on computations and modellings of the heterogeneity, and we provide a two-step approach as an alternative effective procedure to overcome these problems. We will show that the two-step method can effectively circumvent the modellings of the between-studies heterogeneity of the variance-covariance parameters and provides valid and efficient estimators for the regression coefficient parameters, which are the primary objects of interests. In addition, these methods can be easily implemented using standard statistical software, and have advantages to synthesize IPD from different sources of clinical trial datasets (e.g., from different platforms of clinical trial data sharing systems). We also demonstrate the effectiveness of these methods via simulation studies and a real data application to an IPD meta-analysis of clinical trials for new generation antidepressants [14].

Methods

Mixed effects models for longitudinal data

We consider a meta-analysis of N longitudinal clinical trials with ni individual participants within the ith trial (i = 1, … , N). Let Yijt be continuous repeated measurement outcome of tth time point of participant j in trial i, (j = 1, … , ni; t = 1, … , T), possibly to be missing at some points. Note that the time points planned to be measured the outcome variables are uncommon across the N trials, we can formally regard these unmeasured time point data as ‘missing’ without loss of generality.

At first, we discuss linear mixed effect models [5, 6] for the analysis of individual trials separately,

$$ {\boldsymbol{Y}}_{ij}={\boldsymbol{X}}_{ij}{\boldsymbol{\beta}}_i+{\boldsymbol{Z}}_{ij}{\boldsymbol{b}}_{ij}+{\boldsymbol{\varepsilon}}_{ij} $$
(1)

where Yij = (Yij1,  … , YijT)T is a T-vector of repeated measurements for participant j in trial i, Xij = (xij1,  … , xijT)T is a T × p matrix of fixed effect explanatory covariates, βi is a p-vector of fixed regression coefficients, Zij = (zij1,  … , zijT)T is a T × q matrix of random-effect covariates of participant j in trial i (usually a subset of Xij), and bij is a q-vector of random effects, assumed to follow a multivariate normal distribution bij~MVN(0, Di), where Di is the q × q covariance matrix of the random effects distribution. εij is a T-vector of random error, assumed to follow a multivariate normal distribution εij~MVN(0, Σi), where Σi is the T × T covariance matrix of the random error distribution. The random effects bij and the error εij are assumed to be independent, and thus the outcome variables among different participants are also independent. Yij marginally follows MVN(Xijβi, Vij), where \( {\boldsymbol{V}}_{ij}={\boldsymbol{Z}}_{ij}\boldsymbol{D}{\boldsymbol{Z}}_{ij}^T+{\boldsymbol{\Sigma}}_i \). The model parameters can be estimated by maximum likelihood (ML) or restricted maximum likelihood (REML) methods [6]; we suppose to use the REML estimator here owing to its theoretical property [6, 15]. If missing outcome data are involved, these likelihood-based inference methods provide valid statistical inference results under MAR assumptions [5, 6], and thus the linear mixed models have been widely adopted in longitudinal data analyses. In addition, recently, owing to the developments of regulatory guidelines concerning missing data treatments in clinical trials [3, 4], the mixed effects models have been increasingly popular. Especially, an alternative parameterized model named as MMRM [7] has been adopted for primary analyses of many clinical trials, which directly models the marginal distribution of Yij’s and parameterize the covariance matrix Vij not specifying D and Σi. The primary objects of interests are usually the fixed effects parameters, and we have rather little interests in the variance-covariance parameters in these clinical trials [16, 17]. Thus, through the re-parameterization to the marginal covariance matrix Vij, possible model misspecifications concerning the random effects structures might be effectively circumvented in the statistical inference. The procedures of statistical analyses and their theoretical properties are identical between the two parameterized models, so we conduct the following discussions using the former model (1) in this article.

IPD meta-analyses based on the mixed effects models

Based on the mixed effects model (1), IPD meta-analysis can be implemented in a straightforward manner. Jones et al. [13] proposed a fixed effects model with an assumption the regression coefficients βi’s are common across the N trials using both of one-step and two-step approaches. However, the between-studies heterogeneity is an essential issue to be addressed in modern evidence synthesis researches [9], so the random effects models are preferred in general [9, 18]. While, there are several crucial problems to be addressed in applying them to the mixed effects models for IPD meta-analysis:

1. When we consider the heterogeneity across studies, we should adequately address the heterogeneities for both of the fixed effects and variance-covariance parameters in the synthesizing analyses. A straightforward approach would be to model these parameters as random effects, the joint random effects model for bij and the components of Vij. This strategy would also require complicated computations that cannot be implemented in standard statistical packages, and we should completely specify the statistical model without any misspecifications in order to assure the consistency of parameter estimation for fixed effects. Besides, we have little interests in the heterogeneities of variance-covariance parameters, i.e., the components of Vij in the longitudinal clinical trials [16, 17].

2. In most of meta-analyses of longitudinal trials, the time-points to be planned to collect outcome measurements are not common across studies. In the mixed effects model analyses, the time variables are often treated as factor variables (especially for the MMRM analyses [19]), and then all of the outcome variables at several time points are possibly missing (unmeasured) in some studies. In these cases, when we conduct a one-step random meta-analysis using standard statistical packages (e.g., PROC MIXED in SAS, lme4 [20] for R), the statistical model can be partially not identifiable, i.e., we cannot obtain the model estimates.

Given these problems, there are practical difficulties in implementing ordinary one-step IPD meta-analyses. Note that for the first problem, Jones et al. [13] proposed another approach to modelling the variance-covariance parameters as different parameters across studies in their fixed effects model framework, but these approaches are not implementable using standard packages. In addition, this problem has not been usually considered in many existing IPD meta-analyses, possibly due to its practical difficulties, and common variance-covariance structures have usually been assumed. Besides, the second issue can be resolved by making an original program that maximises the (restricted) likelihood function that assumes different regression functions across the studies (and possibly different random effects structures), but it requires usually large efforts in making the complicated original programs in case-by-case practices.

Two-step marginal multivariate random effects analyses

To circumvent the complicated modelling problems and computational difficulties, we provide an alternative two-step approach for IPD meta-analyses based on the mixed effects models. Our proposition is to divide the model parameters into two components, fixed effects parameters and variance-covariance parameters, and to conduct synthesis analyses after obtaining study-specific aggregated statistics separately. Because we have little interests in the random effects in these clinical trials and the mixed effects models are typically designed to assess fixed effects [16, 17], here we specifically focus on the synthesis analyses for the fixed effects parameters. As noted later, the same method can be adapted to the synthesis analyses of the variance-covariance parameters. The computations are easily implementable, but we can show the resultant inference is valid and efficient, and that its theoretical properties do not depend on any heterogeneous structures of within-studies variance-covariance parameters.

We describe the procedure as follows.

  • Step-1. Analyse the IPD from N longitudinal trials separately, using the mixed effects models (1), and obtain the REML estimates of the model parameters (\( {\widehat{\boldsymbol{\beta}}}_i,{\widehat{\boldsymbol{D}}}_i,{\widehat{\boldsymbol{\Sigma}}}_i \)), i = 1, 2, … , N. In addition, obtain the variance-covariance matrix estimates of \( {\widehat{\boldsymbol{\beta}}}_i \), \( {\widehat{\boldsymbol{S}}}_i\left({\widehat{\boldsymbol{\beta}}}_i\right) \) for i = 1, 2, … , N.

  • Step-2. Conduct synthesizing analyses for the regression coefficient parameters \( \Big({\widehat{\boldsymbol{\beta}}}_i \), \( {\widehat{\boldsymbol{S}}}_i\left({\widehat{\boldsymbol{\beta}}}_i\right)\Big) \) that are the primary objects of interests in these meta-analyses, dropping statistical information about the within-studies covariance parameters (\( {\widehat{\boldsymbol{D}}}_i,{\widehat{\boldsymbol{\Sigma}}}_i \)),

$$ {\widehat{\boldsymbol{\beta}}}_i\sim MVN\left({\boldsymbol{\beta}}_i,{\widehat{\boldsymbol{S}}}_i\left({\widehat{\boldsymbol{\beta}}}_i\right)\right) $$
(2)
$$ {\boldsymbol{\beta}}_i\sim MVN\left(\boldsymbol{\mu}, \boldsymbol{\Omega} \right) $$

Inference concerning the model parameters (μ, Ω) can be conducted using the log likelihood or the restricted log likelihood functions [21, 22],

$$ \ell \left(\boldsymbol{\mu}, \boldsymbol{\Omega} \right)=-\frac{1}{2}\sum \limits_{i=1}^N\left\{\log \left|\boldsymbol{\Omega} +{\widehat{\boldsymbol{S}}}_i\left({\widehat{\boldsymbol{\beta}}}_i\right)\right|+{\left({\widehat{\boldsymbol{\beta}}}_i-\boldsymbol{\mu} \right)}^T{\boldsymbol{W}}_i\left({\widehat{\boldsymbol{\beta}}}_i-\boldsymbol{\mu} \right)+p\log 2\pi \right\}, $$
$$ {\ell}_{RL}\left(\boldsymbol{\mu}, \boldsymbol{\Omega} \right)=\ell \left(\boldsymbol{\mu}, \boldsymbol{\Omega} \right)-\frac{1}{2}\log \left|\sum \limits_{i=1}^N{\boldsymbol{W}}_i\right|+\frac{1}{2}p\log 2\pi $$

where \( {\boldsymbol{W}}_i={\left(\boldsymbol{\Omega} +{\widehat{\boldsymbol{S}}}_i\left({\widehat{\boldsymbol{\beta}}}_i\right)\right)}^{-1} \).

The computation of Step-1 is straightforwardly conducted using standard packages for linear mixed models, e.g., PROC MIXED in SAS, lme4 [20] for R. Also, Step-2 can be also simply implemented using existing packages for multivariate meta-analyses, e.g., mvmeta for Stata [23, 24] and R [25].

The resultant estimators of (μ, Ω) have the following properties.

Remark A

There are no efficiency losses regardless of dropping the within-studies variance-covariance information of (\( {\widehat{\boldsymbol{D}}}_i,{\widehat{\boldsymbol{\Sigma}}}_i \)) in the multivariate meta-analysis models (2) for inferences of (μ, Ω), i.e., the ML and REML estimators of (μ, Ω) on the multivariate meta-analysis models (2) are the most efficient estimators, and their efficiency bound is equivalent to that of the one-step estimators for the completely specified correct model.

This property is assured because the ML and REML estimators of βi and (Di, Σi) are orthogonal [6] in the sense of Cox and Reid [26], i.e., the non-diagonal submatrices of the information matrix of mixed model (1) are zero-matrices. Therefore, the statistical information of (\( {\widehat{\boldsymbol{D}}}_i,{\widehat{\boldsymbol{\Sigma}}}_i\Big) \) does not contribute to synthesizing βi’s when we conduct one-step IPD meta-analyses under the completely specified correct model. Thus, fully efficient estimator of (μ, Ω) can be obtained via the two-step method above, even if the statistical information of (Di, Σi) are dropped. This property is confirmed in the simulation studies in Section 3.

Remark B

When the heterogeneity for the variance-covariance parameters of (Di, Σi) is modelled by a random effects model G(D, Σ), we can obtain the consistent and fully efficient estimator of (μ, Ω) using the two-step method under any distribution form of G(D, Σ).

Note this property is important that it assures the validity and optimality of the two-step method without modelling the random effects model G(D, Σ) explicitly, even if there exists substantial heterogeneity for the variance-covariance parameters. This property is assured because the ML and REML estimators of βi and (Di, Σi) are orthogonal as the same arguments with Remark A. Besides, if we adopt the one-step approach, we should completely specify the joint random effects distributions of βi and (Di, Σi) correctly to assure the consistency. These parametric assumptions are also not substantial objects of interests in the evaluations of treatment effects. Through the two-step approach, we can effectively circumvent the unnecessary additional complicated assumptions as well as the computational efforts in practices.

In addition, in IPD meta-analyses of longitudinal clinical trials, the time-points of outcome measurements are possibly uncommon across studies. As mentioned regarding the second issue in the previous section, when the time variables are often treated as factor variables (e.g., for the typical MMRM analyses [19]), the one-step approach has practical and computational difficulties.

Remark C

For synthesis of longitudinal clinical trials in which time-points of outcome measurements are possibly uncommon, we can apply the two-step approach formally via dropping the corresponding time-points variables in Step-1. Then, in Step-2, the corresponding components of \( {\widehat{\boldsymbol{\beta}}}_i \) should be treated as missing outcomes variables.

The missing problems are common not only in two-step IPD meta-analysis, but in various multivariate meta-analysis, and have been extensively discussed in previous papers [21, 22]. Accordingly, in many software packages for multivariate meta-analysis [23,24,25], the standard modules can treat these partially missing outcomes adequately. Usually, these packages conduct available data analyses based on direct likelihood methods [23, 24].

In addition, in some cases, we can obtain the IPD for multiple clinical trials from different platforms of clinical trial databases. Recently, the clinical trial data sharing has been increasingly advanced [27, 28], and researchers can access the IPD of clinical trials for scientific research purposes, e.g., via ClinicalStudyDataRequest.com. However, in many platforms, the users cannot download the datasets and can only analyse on the remote accesses on the platform, in which case the IPD can only be analysed separately. In such cases, the one-step IPD meta-analyses cannot be conducted, but the two-step approaches can still be applied as valid and efficient procedure. In analysing longitudinal trial datasets, the two-step marginal synthesis method would be an effective strategy for evaluating treatment effects and interactions of the candidates of effect modifiers.

Results

Simulation studies

We conducted simulation studies to evaluate the validity and efficiency of the proposed method under practical IPD meta-analysis scenarios. In this context, validity corresponds to the accuracy of point estimation, standard error (SE) evaluation, and the coverage rate of the confidence interval (CI). In addition, the efficiency corresponds to the precision of estimation and the power of statistical test.

Simulation 1

First, we conducted simulations to evaluate the efficiency of two-step marginal multivariate meta-analysis approach. In particular, we would evaluate the equivalence of the two-step marginal method and the fully efficient one-step IPD meta-analysis. Simulation data were generated as three time-points longitudinal data, following the mixed effects model involving two-way interactions,

$$ {Y}_{ij t}={b}_{ij}+{\beta}_{i1}+{\beta}_{i2} trea{t}_{ij}+{\beta}_{i3} tim{e}_{2, ij}+{\beta}_{i4}{time}_{3, ij}+{\beta}_{i5}{z}_{ij}+{\beta}_{i6} trea{t}_{ij}\times tim{e}_{2, ij}+{\beta}_{i7} trea{t}_{ij}\times tim{e}_{3, ij}+{\beta}_{i8} tim{e}_{2, ij}\times {z}_{ij}+{\beta}_{i9} tim{e}_{3, ij}\times {z}_{ij}+{\beta}_{i10}{treat}_{ij}\times {z}_{ij}+{\varepsilon}_{ij t} $$
(3)

where bij is a random intercept for the jth participant of ith trial following N(0,4.52) and εijt is the random error for the tth visit of jth participant of ith trial following N(0,3.22), treatij is a binary variable indicating treatment group. time2, ijt and time3, ijt are dummy variables indicating time-points of 2nd and 3rd visits, respectively (reference is the 1st visit). The parameter settings were mimicked to the Furukawa et al. [14]‘s example in Section 4. zij is a covariate variable that is a candidate to be an effect modifier, which is a measurement that is associated with response or lack of response to a particular therapy. The two-way interaction terms represent the potential usefulness of zij as an effect modifier. Following the mixed effects model (3), we generated N longitudinal clinical trial datasets, for N = 5, 10, 15, and 20. In addition, the number of participants of each trial was generated from a discrete uniform distribution on [50, 500].

In these first simulation studies, the primary purpose is to assess the validity and efficiency of the two-step approach provided in Section 2. Also, we would evaluate the equivalence of the two-step marginal method and the fully efficient one-step IPD meta-analysis. At first, to circumvent the discussions to be complicated, we assumed there was no heterogeneity for βi and (Di, Σi) among the N trials, which is a special case of the fixed effects models considered by Jones et al. [13]. In other words, all βi (i = 1, … , N) are equal to the grand mean μ. The parameter settings of μ were also mimicked to the Furukawa et al. [14], μ1 =  − 0.04, μ2 =  − 0.14, μ3 = 1.56, μ4 = 0.83, μ5 =  − 0.46, μ6 = 0.57, μ7 = 0.14, μ8 = 0.15,  μ9 = 0.05, μ10 =  − 0.06. Under these settings, we conducted 10,000 simulations, and we analyzed the simulated datasets using the one-step method by the correctly specified mixed effects model and the two-step multivariate meta-analysis approach in Section 2.

The results are presented in Table 1. We presented empirical means and SE of the regression coefficient estimates, means of the SE estimates (\( \widehat{\mathrm{SE}} \)), empirical coverage rates of the 95% CIs, and empirical powers for the tests of individual regression coefficients. The contexts of the results are too rich, so we present the ones for selected coefficients, μ2, μ4, μ5, μ7, μ9, μ10. The results for the remaining parameters were similar. These results consistently showed that the fully efficient one-step approach was equivalent to the marginal two-step approach under all of the scenarios. Both the mean and SE of 10,000 estimates were mostly the same, and the SE estimate could accurately estimate the actual SE of the estimates. In addition, coverage rates of 95% CIs were accurate and the power of the two approaches were equivalent. As a whole, the two-step approach was shown to provide accurate results without loss of efficiency, regardless of dropping the variance-covariance information. The efficiency was also retained for the interaction tests.

Table 1 Results for the simulation 1: fully efficient one-step approach and the two-step marginal multivariate meta-analysis approach were compared

Simulation 2

Second, we conducted other simulations to evaluate the validity of the two-step random effects multivariate meta-analysis approaches under substantial heterogeneity of βi as well as (Di, Σi), among N longitudinal trials. We considered the mixed effects models (3) for individual trials, and additionally considered heterogeneity of βi and (Di, Σi). We considered a compound symmetry structure for Ω, which assumes a common variance among all of the components of βi, βik~N(μk, τ2), k = 1, … , 10, where τ2 = 0.10, 0.20. We assumed no correlations among βiks, and also assumed heterogeneity among the random effects distributions of bij~N(0, ζ2), where ζ is generated from a continuous uniform distribution on [2.5, 9.0]. We considered two settings: (a) outcome variables are measured on all three time-points, and (b) outcome variables are measured on two time-points for 40% of trials and on three time-points for the remaining 60% trials.

Under these conditions, the two-step approach can be easily implemented using standard statistical packages. Again, note that the two-step method does not require modelling of the heterogeneity structures of (Di, Σi), and the validity and efficiency of the inference are assured under any heterogeneous structures of (Di, Σi). Besides, to implement valid inference using one-step approach, we should adopt the correct statistical model involving the random effects distributions for both of βi and (Di, Σi). One ad-hoc approach uses a Bayesian hierarchical modelling involving random effects models, but it requires complicated computations using special software (e.g., OpenBUGS [29]).

The simulation results are presented in Tables 2 and 3, which correspond to the settings (a) and (b) respectively. We conducted 10,000 simulations and provided the same outputs as Table 1. We considered 6 scenarios, setting N = 5, 10, or 15 and τ2 = 0.10 or 0.20. The results also showed unbiasedness of the two-step estimates under all of the scenarios. In addition, the validities of SE estimation and CIs (inversely, the sizes of the corresponding statistical tests) could be confirmed. Again, note that we only took parametric assumptions for the random effects distribution of βi and no restrictive assumptions concerning the heterogeneity of (Di, Σi). The validity of the two-step method was demonstrated under the flexible and weak assumptions. The efficiency is not compared with those of the one-step methods here, but the equivalence is theoretically assured (Remark B).

Table 2 Results for the simulation 2(a): outcome variables were measured on all three time-points
Table 3 Results for the simulation 2(b): outcome variables were measured on two time-points for 40% trials and on three time-points for 60% trials

Applications to IPD meta-analysis for new generation antidepressants

To illustrate our method, we analyzed the dataset from an IPD meta-analysis for new generation antidepressants by Furukawa et al. [14] The authors conducted IPD meta-analysis from four placebo-controlled, double-blinded randomized clinical trials (1482 participants) for patients undergoing acute phase treatment for major depression. The outcome variable was the change score on the 17-item Hamilton Rating Scale for Depression (HRSD) or the Montgomery-Asberg Depression Rating Scale (MADRS). In their analyses, the latter was converted into the former using the conversion algorithm based on the item response theory [30]. In particular, the authors investigated whether the baseline depression severity modifies the efficacy of the antidepressants using IPD meta-analysis. The outcome measurements were repeatedly measured at several time-points in the four trials, but the time-points at which the outcome variables were planned to be measured were not common across the four trials.

We considered the following mixed effects model on 1st, 2nd, 4th and 6th weeks involving two-way interaction terms based on the model (1),

$$ {Y}_{ij t}={b}_{ij}+{\beta}_{i1}+{\beta}_{i2} trea{t}_{ij}+{\beta}_{i3}{week}_{1, ij}+{\beta}_{i4}{week}_{2, ij}+{\beta}_{i5}{week}_{4, ij}+{\beta}_{i6}{z}_{ij}+{\beta}_{i7} trea{t}_{ij}\times {week}_{1, ij}+{\beta}_{i8} trea{t}_{ij}\times {week}_{2, ij}+{\beta}_{i9} trea{t}_{ij}\times {week}_{4, ij}+{\beta}_{i10}{week}_{1, ij}\times {z}_{ij}+{\beta}_{i11}{week}_{2, ij}\times {z}_{ij}+{\beta}_{i12}{week}_{4, ij}\times {z}_{ij}+{\beta}_{i13}{treat}_{ij}\times {z}_{ij}+{\varepsilon}_{ij t} $$

where week1, ij, week2, ij and week4, ij are dummy variables for the time points (reference: 6th week). We first attempted a one-step random effects meta-analysis that modelled all of the regression coefficients as random effects among the 4 trials, but the REML estimate was not computable using standard statistical packages. There might also be between-studies heterogeneity for the variance-covariance parameters, which was not modelled in this analysis. Thus, we provided the results of a one-step fixed effects meta-analysis only consider correlations of repeated measured outcomes among different time points within individual participants. The REML estimates, 95% CI, and P-values are presented in Table 4.

Table 4 Results of the IPD meta-analysis of 4 clinical trials for new generation antidepressants

Next, we conducted IPD meta-analyses using the two-step approach based on the REML estimates of the fixed effects coefficients of the 4 trials. In the multivariate random effects meta-analysis models, we considered several assumptions for between-studies variance-covariance structures Ω, and the fitting of the candidate models was evaluated using the Bayesian information criterion (BIC) [31]. The best fitting model was the compound symmetry model (BIC = 125.73), whereas for the fixed effects model, BIC = 316.89.

In Table 3, we also provided the results of two-step multivariate meta-analysis of the fixed effects model and the random effects model using the compound symmetry model. In the random effects model, the between-studies SD estimate was 0.047 and the between-studies correlation coefficient was 0.064. The results for the fixed effects model analyses using one-step and two-step approaches were consistent, which was expected as described in Remark A. However, note that even if there was between-studies heterogeneity for the variance-covariance parameters, the two-step approach provides valid inference results. Besides, comparing the fixed and random effects two-step analyses, the coefficient estimates were somewhat different. In addition, the SE estimates of the fixed effects coefficients were larger for the random effects model, indicating that uncertainty might be larger when using this model. Considering the simulation results, even when there is substantial heterogeneity in βis, the two-step random effects approach provides valid inference results. The interaction tests of βi13 were not significant by significance level of 5% for any of the three models, consistent with the original analyses of Furukawa et al. [14], but these tests should be the most powerful under the assumption that the statistical models were correct.

Discussion

IPD meta-analysis is becoming increasingly popular in recent medical studies, especially as an effective approach for exploring effect modifiers for precision medicine [8]. Given recent trends in clinical trial data sharing [27, 28], the practical values of these methodologies are bound to increase in future studies. The efficiencies of the one-step and two-step approaches have also been discussed for IPD meta-analyses based on the conventional ANCOVA, logistic and Cox regression models [32]. Besides, as noted above, IPD meta-analyses using the mixed effects models involve more complicated problems, which should address the heterogeneity of variance-covariance structures of random effects distributions. In this paper, we presented a flexible two-step method that can effectively circumvent these difficulties, and demonstrated its validity and efficiency.

In the simulation studies, the coverage rates of 95% CIs for the random effects model retained nearly at the nominal level, but previous studies showed that SE are possibly underestimated under small N settings in random effects meta-analyses, and the standard CIs can have serious undercoverage properties [33,34,35,36]. In the simulation studies, the proper coverage properties are possibly caused by the adoption of a compound symmetry structure for the heterogeneity covariance matrix Ω. In multivariate meta-analyses, when the heterogeneity variances are assumed to be common across different outcomes (this assumption is often adopted in network meta-analysis [37, 38]), the statistical information across the different outcomes can be shared for the common parameter estimation, so the accuracy is generally gained. When we adopted different variance assumptions for Ω, the CIs had undercoverage property under small N setting (data not shown). Note that several improved methods for addressing the undercoverage property have been developed, e.g., the Kenward-Roger method [39] and the Bartlett-type corrections [40].

In this paper, we discussed the linear mixed effects models for continuous Gaussian outcome variables, but the two-step approach can be similarly adapted to the generalized linear mixed effects models for non-Gaussian outcome variables [41] and the frailty models for survival outcome variables [42]. For these outcome variables, it should be noted that there are possibly correlations between the estimators of fixed effects coefficients and variance-covariance parameters. Thus, the two-step analyses may cause losses of efficiency on the inferences. However, the two-step approach does not require parametric assumptions for heterogeneities for the within-studies variance-covariance parameters, and validity of the inference is generally assured. In addition, this method can be easily implemented using standard statistical packages. Therefore, the two-step marginal approach would also be an effective solution for these settings. Further research is warranted to evaluate the expansion of the two-step model to these settings.

Conclusions

The two-step approach is an effective method for synthesizing the regression coefficient parameters efficiently for IPD meta-analyses of longitudinal clinical trials. Using this method, modelling of heterogeneities of the within-studies variance-covariance parameters can be effectively circumvented without additional parametric assumptions, which are not of primary interests in the analyses of these clinical trials [16, 17, 19]. If we adopt the one-step approach to modelling the heterogeneity, complicated modelling is required and the validity is possibly violated if the adopted model is misspecified. Based on the relaxed assumptions of the two-step approach, we can implement efficient inferences for the treatment effects and interactions. In addition, the two-step approach has advantages for implementations because it can be easily performed using standard statistical packages [23,24,25]. It can also be effectively applied to IPD meta-analyses using clinical trials data sharing systems in which individual trials can only be analyzed separately.

Abbreviations

ANCOVA:

analysis of covariance

BIC:

Bayesian information criterion

CI:

confidence interval

HRSD:

Hamilton Rating Scale for Depression

IPD:

individual participant data

MADRS:

Montgomery-Asberg Depression Rating Scale

MAR:

missing at random

ML:

maximum likelihood

MMRM:

mixed effects model for repeated measures

REML:

restricted maximum likelihood

SE:

standard error

References

  1. Little RJ, D'Agostino R, Cohen ML, Dickersin K, Emerson SS, Farrar JT, Frangakis C, Hogan JW, Molenberghs G, Murphy SA, et al. The prevention and treatment of missing data in clinical trials. N Engl J Med. 2012;367(3):1355–60.

    Article  CAS  Google Scholar 

  2. O'Neill RT, Temple R. The prevention and treatment of missing data in clinical trials: an FDA perspective on the importance of dealing with it. Clin Pharm Ther. 2012;91(3):550–4.

    Article  CAS  Google Scholar 

  3. National Research Council. The prevention and treatment of missing data in clinical trials. Washington, DC: National Academies Press; 2010.

    Google Scholar 

  4. European Medicines Agency. Guideline on missing data in confirmatory clinical trials. London: European Medicine Agency; 2010.

    Google Scholar 

  5. Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38(4):963–74.

    Article  CAS  Google Scholar 

  6. Verbeke G, Molenberghs G. Linear mixed models for longitudinal data. New York: Springer; 2000.

    Google Scholar 

  7. Mallinckrodt CH, Clark WS, David SR. Accounting for dropout bias using mixed-effects models. J Biopharm Stat. 2001;11(1–2):9–21.

    Article  CAS  Google Scholar 

  8. Riley RD, Lambert PC, Abo-Zaid G. Meta-analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221.

    Article  Google Scholar 

  9. Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions. Chichester: Wiley-Blackwell; 2008.

    Book  Google Scholar 

  10. Riley RD, Kauser I, Bland M, Thijs L, Staessen JA, Wang J, Gueyffier F, Deeks JJ. Meta-analysis of randomised trials with a continuous outcome according to baseline imbalance and availability of individual participant data. Stat Med. 2013;32(16):2747–66.

    Article  Google Scholar 

  11. Riley RD, Price MJ, Jackson D, Wardle M, Gueyffier F, Wang J, Staessen JA, White IR. Multivariate meta-analysis using individual participant data. Res Synth Methods. 2015;6(2):157–74.

    Article  CAS  Google Scholar 

  12. Ishak KJ, Platt RW, Joseph L, Hanley JA, Caro JJ. Meta-analysis of longitudinal studies. Clin Trials. 2007;4(5):525–39.

    Article  Google Scholar 

  13. Jones AP, Riley RD, Williamson PR, Whitehead A. Meta-analysis of individual patient data versus aggregate data from longitudinal clinical trials. Clin Trials. 2009;6(1):16–27.

    Article  Google Scholar 

  14. Furukawa TA, Maruo K, Noma H, Tanaka S, Imai H, Shinohara K, Ikeda K, Yamawaki S, Levine SZ, Goldberg Y, et al. Initial severity of major depression and efficacy of antidepressants: individual-participant data meta-analysis. Acta Psychiatr Scand. 2018;137(6):450–8.

    Article  CAS  Google Scholar 

  15. McCulloch CE, Searle SR, Generalized NJM. Linear, and mixed models. 2nd ed. New York: Wiley; 2008.

    Google Scholar 

  16. Mallinckrodt CH, Kaiser CJ, Watkin JG, Molenberghs G, Carroll RJ. The effect of correlation structure on treatment contrasts estimated from incomplete clinical trial data with likelihood-based repeated measures compared with last observation carried forward ANOVA. Clin Trials. 2004;1(6):477–89.

    Article  Google Scholar 

  17. Mallinckrodt CH, Lane PW, Schnell D, Peng Y, Mancuso JP. Recommendations for the primary analysis of continuous endpoints in longitudinal clinical trials. Drug Info J. 2008;42(4):303–19.

    Article  Google Scholar 

  18. Higgins JPT, Thompson SG, Spiegelhalter DJ. A re-evaluation of random-effects meta-analysis. J R Stat Soc A. 2009;172(1):137–59.

    Article  Google Scholar 

  19. Gosho M, Hirakawa A, Noma H, Maruo K, Sato Y. Comparison of bias-corrected covariance estimators for MMRM analysis in longitudinal data with dropouts. Stat Methods Med Res. 2017;26(5):2389–406.

    Article  Google Scholar 

  20. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Soft. 2015;67(1).

  21. Jackson D, Riley R, White IR. Multivariate meta-analysis: potential and promise. Stat Med. 2011;30(20):2481–98.

    Article  Google Scholar 

  22. Mavridis D, Salanti G. A practical introduction to multivariate meta-analysis. Stat Methods Med Res. 2013;22(2):133–58.

    Article  Google Scholar 

  23. White IR. Multivariate Radom-effects meta-analysis. Stata J. 2009;9:40–56.

    Article  Google Scholar 

  24. White IR. Multivariate random-effects meta-regression: updates to mvmeta. Stata J. 2011;11:255–70.

    Article  Google Scholar 

  25. Gasparrini A, Armstrong B, Kenward MG. Multivariate meta-analysis for non-linear and other multi-parameter associations. Stat Med. 2012;31(29):3821–39.

    Article  CAS  Google Scholar 

  26. Cox DR, Reid N. Parameter orthogonality and approximate conditional inference. J R Stat Soc B. 1986;49(1):1–39.

    Google Scholar 

  27. Taichman DB, Sahni P, Pinborg A, Peiperl L, Laine C, James A, Hong ST, Haileamlak A, Gollogly L, Godlee F, et al. Data sharing statements for clinical trials: a requirement of the International Committee of Medical Journal. Ann Intern Med. 2017;167(1):63–5.

    Article  Google Scholar 

  28. Taichman DB, Sahni P, Pinborg A, Peiperl L, Laine C, James A, Hong ST, Haileamlak A, Gollogly L, Godlee F, et al. Data sharing statements for clinical trials. BMJ. 2017;357:j2372.

    Article  Google Scholar 

  29. Lunn D, Spiegelhalter D, Thomas A, Best N. The BUGS project: Evolution, critique and future directions. Stat Med. 2009;28(25):3049–67.

    Article  Google Scholar 

  30. Carmody TJ, Rush AJ, Bernstein I, Warden D, Brannan S, Burnham D, Woo A, Trivedi MH. The Montgomery Asberg and the Hamilton ratings of depression: a comparison of measures. Eur Neuropsychopharmacol. 2006;16(8):601–11.

    Article  CAS  Google Scholar 

  31. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–4.

    Article  Google Scholar 

  32. Burke DL, Ensor J, Riley RD. Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ. Stat Med. 2017;36(5):855–75.

    Article  Google Scholar 

  33. Brockwell SE, Gordon IR. A comparison of statistical methods for meta-analysis. Stat Med. 2001;20(6):825–40.

    Article  CAS  Google Scholar 

  34. Jackson D, Riley RD. A refined method for multivariate meta-analysis and meta-regression. Stat Med. 2014;33(4):541–54.

    Article  Google Scholar 

  35. Noma H. Confidence intervals for a random-effects meta-analysis based on Bartlett-type corrections. Stat Med. 2011;30(28):3304–12.

    Article  Google Scholar 

  36. Bender R, Friede T, Koch A, Kuss O, Schlattmann P, Schwarzer G, Skipka G. Methods for evidence synthesis in the case of very few studies. Res Synth Methods. 2018. https://doi.org/10.1002/jrsm.1297.

  37. Salanti G, Higgins JP, Ades AE, Ioannidis JP. Evaluation of networks of randomized trials. Stat Methods Me Res. 2008;17(3):279–301.

    Article  Google Scholar 

  38. White IR. Network meta-analysis. Stata J. 2015;15:951–85.

    Article  Google Scholar 

  39. Morris TP, Fisher DJ, Kenward MG, Carpenter JR. Meta-analysis of Gaussian individual patient data: two-stage or not two-stage? Stat Med. 2018;37(9):1419–38.

    Article  Google Scholar 

  40. Noma H, Nagashima K, Maruo K, Gosho M, Furukawa TA. Bartlett-type corrections and bootstrap adjustments of likelihood-based inference methods for network meta-analysis. Stat Med. 2018;37(7):1178–90.

    Article  Google Scholar 

  41. Fitzmaurice GM, Laird NM, Ware J. Applied longitudinal analysis. Hoboken, NJ: John Wiley & Sons; 2011.

    Book  Google Scholar 

  42. Duchateau L, Janssen P. The frailty model. New York: Springer; 2008.

    Google Scholar 

Download references

Acknowledgements

Not applicable.

Funding

This study was supported by Grant-in-Aid from the Japan Agency for Medical Research and Development (Grant number: JP17dk0307072) and Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (Grant numbers: JP15K15954, JP17K19808) in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Availability of data and materials

The original datasets were available upon agreements between this research group and the sponsor companies, so are not publicly available.

Author information

Authors and Affiliations

Authors

Contributions

HN, KM, MG contributed to the developments of methods. HN, KM, TAF contributed to the acquisition of data. HN, KM, MG, SZL, YG, SL, TAF contributed to the analysis and interpretation of data. HN drafted the manuscript. All authors reviewed, read and approved the final version to be submitted.

Corresponding author

Correspondence to Hisashi Noma.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

HN has received lecture fees from Boehringer Ingelheim and Kyowa Hakko Kirin, and research support from Kyowa Hakko Kirin and GSK. He is an associate editor of this journal; MG received lecture and/or consultant fees from Daiichi Sankyo, Ferring Pharmaceuticals, Novartis, and Tiho, received travel fees from Takeda, and received manuscript fees from Kowa; SZL received honoraria or research support or consultancy fees or travel support from Shire Pharmaceuticals, F. Hoffmann-La Roche and Eli Lilly; SL has received honoraria for consulting from LB Pharma, Lundbeck, Otsuka, TEVA, Geodon Richter, Recordati, LTS Lohmann, Boehringer Ingelheim, and for lectures from Janssen, Lilly, Lundbeck, Otsuka, SanofiAventis and Servier; TAF has received lecture fees from Janssen, Meiji, Mitsubishi-Tanabe, MSD and Pfizer. He has received research support from Mitsubishi-Tanabe.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Noma, H., Maruo, K., Gosho, M. et al. Efficient two-step multivariate random effects meta-analysis of individual participant data for longitudinal clinical trials using mixed effects models. BMC Med Res Methodol 19, 33 (2019). https://doi.org/10.1186/s12874-019-0676-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s12874-019-0676-1

Keywords