Research Article  Open  Open Peer Review  Published:
A double SIMEX approach for bivariate randomeffects metaanalysis of diagnostic accuracy studies
BMC Medical Research Methodologyvolume 17, Article number: 6 (2017)
Abstract
Background
Bivariate randomeffects models represent a widely accepted and recommended approach for metaanalysis of test accuracy studies. Standard likelihood methods routinely used for inference are prone to several drawbacks. Small sample size can give rise to unreliable inferential conclusions and convergence issues make the approach unappealing. This paper suggests a different methodology to address such difficulties.
Methods
A SIMEX methodology is proposed. The method is a simulationbased technique originally developed as a correction strategy within the measurement error literature. It suits the metaanalysis framework as the diagnostic accuracy measures provided by each study are prone to measurement error. SIMEX can be straightforwardly adapted to cover different measurement error structures and to deal with covariates. The effortless implementation with standard software is an interesting feature of the method.
Results
Extensive simulation studies highlight the improvement provided by SIMEX over likelihood approach in terms of empirical coverage probabilities of confidence intervals under different scenarios, independently of the sample size and the values of the correlation between sensitivity and specificity. A remarkable amelioration is obtained in case of deviations from the normality assumption for the randomeffects distribution. From a computational point of view, the application of SIMEX is shown to be neither involved nor subject to the convergence issues affecting likelihoodbased alternatives. Application of the method to a diagnostic review of the performance of transesophageal echocardiography for assessing ascending aorta atherosclerosis enables overcoming limitations of the likelihood procedure.
Conclusions
The SIMEX methodology represents an interesting alternative to likelihoodbased procedures for inference in metaanalysis of diagnostic accuracy studies. The approach can provide more accurate inferential conclusions, while avoiding convergence failure and numerical instabilities. The application of the method in the R programming language is possible through the code which is made available and illustrated using the real data example.
Background
Metaanalysis of diagnostic studies is a widely accepted approach for the assessment of the accuracy of a diagnostic test in distinguishing between diseased and nondiseased patients. A diagnostic study is commonly evaluated in terms of sensitivity, i.e., the conditional probability of testing positive in diseased subjects, and specificity, i.e., the conditional probability of testing negative in nondiseased subjects. Alternatively, the information about a diagnostic test is available as a twobytwo table of agreement between the test results and the reference standard test results [1].
The interest in metaanalysis of diagnostic accuracy studies has increased over recent years. Preliminary approaches based on separate univariate metaanalyses for sensitivity and specificity of diagnostic tests, although still diffuse in medical investigations, have been successfully improved by more sophisticated solutions accounting for the correlation between the diagnostic test measures [2–4]. The literature, initially based on least squares regressions [5, 6], now spans hierarchical models [4, 7–9], bivariate copula distributions [10–12], bivariate mixture models [13, 14], nonparametric solutions [15]. In this paper we focus on the bivariate randomeffects model [7, 8], as it is currently a wellestablished and recommended method for metaanalysis of diagnostic accuracy studies. The bivariate randomeffects approach has a hierarchical structure accounting for the withinstudy sampling variability and for the betweenstudy variability arising from differences derived, for example, from patients’ characteristics. Moreover, it considers the presence of measurement error affecting the sample estimation of sensitivity and specificity. These characteristics represent a substantial step ahead with respect to the original approach of Littenberg and Moses [5, 6] to construct a summary receiver operating characteristic (SROC) curve based on the regression of the difference between sensitivity and specificity on their sum, a solution criticised has a source of unreliable inferential conclusions [8]. Likelihood inference has to deal with issues of considerable interest [16–19]: small sample size is known to affect the accuracy of the inferential results; nonconvergence of the optimisation algorithms can occur, with nonpositive definite variance/covariance matrix or unreliable parameter estimates typically on the boundary of the parameter space; computational issues, such as numerical integration, may represent further complications to deal with.
This paper investigates the applicability of SIMEX (simulation extrapolation) as an alternative way for metaanalysis of diagnostic accuracy studies. SIMEX is a simulationbased technique developed within the measurement error literature [20, 21] that found a wide applicability in many areas of research, given the simplicity of the idea underlying the approach and the straightforward implementation with standard software. The performance of SIMEX for inference on the bivariate randomeffects model components as well as on the diagnostic accuracy measures is compared to the likelihood approach through an extensive simulation study covering different scenarios, with varying sample size and betweenstudy correlation. Attention is paid to the robustness of the competing methods against model misspecification, in particular deviations from the typical assumption of joint normal distribution for the random effects [9], as well as to nonconvergence problems and numerical instabilities. In addition, SIMEX is applied to the metaanalysis of transesophageal echocardiography recently used in literature [15] to highlight the limitations of the likelihoodbased inference.
Methods
Bivariate randomeffects formulation for metaanalysis
Consider a metaanalysis of n diagnostic accuracy studies, each of them providing information as a twobytwo table reporting the number of true positives, true negatives, false positives and false negatives, denoted by n _{11i },n _{00i },n _{10i } and n _{01i }, respectively. Let n _{1i } be the number of total positives and n _{0i } the number of total negatives. Consider the sensitivity (SE _{ i }) and the specificity (SP _{ i }) as diagnostic accuracy measures of study i,i=1,…,n. Keeping with much of the literature, the accuracy can be expressed using the logit transformation, η _{ i }=logit(SE _{ i }) and ξ _{ i }=logit(1−SP _{ i }). Given the twobytwo table information, the estimates of SE _{ i } and SP _{ i } in study i are n _{11i }/n _{1i } and n _{00i }/n _{0i }, respectively. Hereafter, the estimates of η _{ i } and ξ _{ i } will be denoted by $\hat {\eta }_{i}$ and $\hat {\xi }_{i}$, respectively.
Models
In this paper, we will focus on the bivariate randomeffects model for metaanalysis of diagnostic accuracy studies, following Reitsma et al. [7] and Arends et al. [8], among others. The model has a hierarchical structure, including a withinstudy level and a betweenstudy level accounting for the correlation between sensitivity and specificity [2–4]. The betweenstudy model considers the joint distribution of the random effects η _{ i } and ξ _{ i },
where $\overline {\eta }$ and $\overline {\xi }$ are the means over the studies, $\sigma ^{2}_{\eta }$ and $\sigma ^{2}_{\xi }$ denote the betweenstudy variances and ρ is the correlation coefficient. As sensitivity SE _{ i } and specificity SP _{ i } tend to be negatively correlated, then η _{ i } and ξ _{ i } tend to be positively correlated, so that ρ>0.
The withinstudy variability is accounted for at the second stage to describe the relationship between $(\hat {\eta }_{i}, \hat {\xi }_{i})^{\top }$ and (η _{ i },ξ _{ i })^{⊤}. The literature distinguishes between the approximate and the exact withinstudy model specification. The approximate model considers $(\hat {\eta }_{i},\hat {\xi }_{i})^{\top }$ following a bivariate normal specification,
where the withinstudy variance/covariance matrix is diagonal with nonzero entries estimated in each study. We refer to (1) and (2) as the NormalNormal approach [16]. Since, marginally,
the likelihood function for $ {\theta }=(\overline {\eta }, \overline {\xi }, \sigma ^{2}_{\eta }, \sigma ^{2}_{\xi }, \rho)^{\top }$ has a closedform expression and a straightforward implementation using standard softwares. The model has an interesting interpretation in terms of the model suggested by Rutter and Gatsonis [22] within a Bayesian framework, with a different parameterization [23]. From a practical point of view, the implementation of the bivariate NormalNormal model is, however, more convenient [8].
The exact withinstudy model specification considers the observed true positives and false positives as realisations of binomial variables,
We refer to (1) and (3) as the BinomialNormal approach [16]. The resulting model is a generalised linear model, with no closedform expression for the associated likelihood function. More computational effort is required with respect to the approximate model, as numerical integration is needed. Convergence problems represent a further drawback of the approach, with the risk of nonpositive definite variance/covariance matrix and unreliable estimates of the parameters of the variance/covariance matrix truncated on the boundary of the parameter space [9, 16, 19]. Both the practical issues are more severe as the number of studies decreases. The NormalNormal approach is prone to some criticism as well, despite its feasible application. Inferential conclusions can be biased as a consequence of small sample size or values of sensitivity and specificity close to 1 [4, 24]. When the sample size is large, instead, there are no substantial differences between the two approaches.
Parameter estimation is typically performed via maximum likelihood or restricted maximum likelihood [8]. The estimates of sensitivity and specificity are obtained by backtransforming the estimates of $\overline {\eta }$ and $\overline {\xi }$, with standard errors derived using the delta method. Alternative measures of test accuracy are the positive likelihood ratio LR+=SE/(1−SP), the negative likelihood ratio LR−=(1−SE)/SP and the diagnostic odds ratio dOR={SE/(1−SE)}×{SP/(1−SP)}. A description of the diagnostic test can be also provided by the SROC curve, through i) the characterisation of the bivariate normal model via an appropriate line and ii) the transformation of the line to the SROC space. See Arends et al. [8] for alternative specifications for the SROC curves. Discussion about the interpretation of the resulting SROC curve can be found in Hamza et al. [4] and references therein.
The measurement error problem
The hierarchical model defined for metaanalysis of diagnostic accuracy studies is an instance of the more general bivariate metaanalysis investigated by van Houwelingen et al. [25], among others. Control rate regression [26, 27], defined as the relationship between the treatment effect and the baseline risk in metaanalysis of clinical trials, perfectly fits the scenario we focus on in this paper. In control rate regression, attention is paid to the risk of inaccurate inferential conclusions due to the presence of measurement error [28, 29] affecting the treatment effect and the baseline risk measures. Different proposals have been suggested to face the measurement error problem [30–32]. Similarly, in metaanalysis of diagnostic accuracy studies the observed $\hat {\eta }_{i}$ and $\hat {\xi }_{i}$ are estimates of the true unknown η _{ i } and ξ _{ i } and thus they are prone to some kind of mismeasure. Not accounting for measurement error can result in misleading inference, the most frequent being a biased estimate of the slope of the regression line used to define the SROC curve, an effect known as attenuation. See, for example, the discussion in Arends et al. [8]. The likelihood approach based of the hierarchical model given by (1)–(2) or (1)–(3) properly accounts for measurement errors [8, 26]. The withinstudy model (2) or (3), in fact, defines a relationship between the observed errorprone $\hat {\eta }_{i}$ and $\hat {\xi }_{i}$ and the unobserved corresponding η _{ i } and ξ _{ i }, in this way including the uncertainty related to the measurement process.
Despite the above mentioned analogies, control rate regression and diagnostic accuracy studies differ with respect to the role played by the variables in the regression model. In control rate regression, the baseline risk information is the covariate for the regression model using the treatment effect as response variable. In diagnostic accuracy studies, the roles of η _{ i } and ξ _{ i } in terms of response variable and covariate are not undoubtedly defined, as specificity and sensitivity can act as response or covariate according to the regression model chosen to define a particular SROC curve. Only when a specific regression line used for drawing the SROC curve is defined [8], then the role of response or covariate is clearly stated.
Double SIMEX approach
SIMEX is a simulationbased technique for measurement error correction [20, 21, 33]. The method, originally developed to deal with classical additive errors affecting continuous variables, can be easily extended to all the scenarios where measurement error structures can be simulated. This requires the measurement error variance to be known or at least accurately approximated. SIMEX consists of a simulation step followed by an extrapolation step. In the first step, a resamplinglike strategy simulates B datasets of additional errors with increasing variance, each of them used to estimate the parameters of interest. In the second step, the relationship between the estimates and the amount of the added measurement error is determined and used to extrapolate the corrected estimate back to the no measurement error case. The simulation step typically requires the generation of independent random variables, while the estimation can be carried out using standard simple procedures, as the least squares estimation or the method of moments. The extrapolation step is a straightforward procedure. The feasibility of SIMEX application with standard software is the most attractive feature explaining its wide diffusion in many areas of research. Although the approach typically considers one or more mismeasured covariates, SIMEX can easily handle situations with measurement errors on both the response and covariates, see Holcomb [34], who termed the resulting method double SIMEX. This case perfectly fits the bivariate metaanalysis problem we focus on in this paper, as $\hat {\eta }_{i}$ and $\hat {\xi }_{i}$ are both affected by measurement error. In this way, SIMEX for diagnostic test accuracy can be thought of as an extension of Guolo [32], who investigated the methodology in control rate regression.
Let $ {W}_{i}=(\hat {\eta }_{i}, \hat {\xi }_{i})^{\top }$ denote the vector of the mismeasured quantities in study i and let X _{ i }=(η _{ i },ξ _{ i })^{⊤} denote the corresponding errorfree vector. In the rest of the paper, we will focus on the approximate withinstudy model (2) to relate W _{ i } and X _{ i }. The implications of using the exact model (3) in SIMEX are illustrated in the Discussion section. The simulation step generates B datasets with additional errors starting from W _{ i },
for fixed λ≥0 and Γ _{ i } denoting the variance/covariance matrix of W _{ i }. The additional pseudoerrors U _{ b,i } are mutually independent and normally distributed, with zero mean and identity variance/covariance matrix. These properties are not guaranteed in finite sample as the values are computer generated. A possible solution is to simulate the errors from the GramSchmid process [34]: the resulting pseudoerrors have the required independence properties and they are normalized to guarantee zero mean and unit variance. Moreover, their use reduces the Monte Carlo variance of the SIMEX estimates, see Section 5.3.4.1 in Carroll et al. [28]. Vector W _{ b,i }(λ) is a remeasurement of W _{ i } with increasing error, such that W _{ b,i }(λ)=W _{ i } for λ=0. The remeasurement is such that E(W _{ b,i })=X _{ i } and Var(W _{ b,i })=(1+λ)Γ _{ i } and the mean squared error E[{W _{ b,i }(λ)−X _{ i }}^{2}X _{ i }] equals zero for λ=−1, the key property of the simulated data. For each simulated bth data set, let $ {\hat {\theta }}_{b}(\lambda)$ be the estimate of θ obtained using a standard approach, as if the measurement error was absent. The simulation step concludes with the average of the B estimates for fixed $\lambda, {\hat {\theta }}(\lambda)=B^{1}\sum ^{B}_{b=1} {\hat {\theta }}_{b}(\lambda)$. Usually λ assumes values on a grid Λ, for example Λ={0.0,0.5,1.0,1.5,2.0}. The value of B is commonly fixed up to 100 [20, 33]. In the extrapolation step, a relationship between $ {\hat {\theta }}(\lambda)$ and λ is established and extrapolated back to the case of no measurement error corresponding to λ=−1. The resulting estimate is the SIMEX estimate ${\hat {\theta }}_{SIMEX}$. The most diffuse extrapolation function is the quadratic function, given its numerical stability, see Section 5.3.2 in Carroll et al. [28].
Let $ {s}^{2}_{b}(\lambda)$ be the estimated variance/covariance matrix of ${\hat {\theta }}(\lambda)$ and let $ {s}^{2}(\lambda)=B^{1}\sum ^{B}_{b=1} {s}^{2}_{b}(\lambda)$. Given the sample variance/covariance matrix $ {s}^{2}_{\Delta }(\lambda)$ of $ {\hat {\theta }}_{b}(\lambda)$, the variance/covariance matrix of the SIMEX estimator is obtained by extrapolating back the relationship between $ {s}^{2}(\lambda)  {s}^{2}_{\Delta }(\lambda)$ and λ to the case λ=−1, see Stefanski and Cook [21] and Appendix B.4 in Carroll et al. [28].
Simulation studies
Several simulation studies have been conducted to investigate the performance of SIMEX and compare it to the NormalNormal and the BinomialNormal likelihood approaches. Data simulation follows a twostage procedure. In the first stage, values for η _{ i } and ξ _{ i } are generated according relationship (1) or substituting the normal distribution with a t distribution with four degrees of freedom, e.g. [9], or a skewnormal [35] distribution. In the last two cases, the robustness of the results is investigated with respect to departures from the common normality assumption for the random effects, which may sometimes not be appropriate [9]. The chosen skewnormal distribution is such that the mean and the variance correspond to those for the normal case, but the skewness parameter for (η _{ i },ξ _{ i })^{⊤} is increased from (0,0)^{⊤} (the normal case) to (−1.0,0.5)^{⊤} and to (−2,2)^{⊤}. In the second stage, the setup is inspired by the studies in Hamza et al. [16] and Diaz [18]. The withinstudy numbers of true positives and false positives are simulated using relationship (3). The numbers of diseased subjects n _{1i } and nondiseased subjects n _{0i } are generated from a uniform variable on [ 40,200]. The number of studies n varies in {10;25} in order to evaluate the methods in case of small to moderate sample sizes. Scenarios with decreasing accuracy are considered, namely, high accuracy $(\overline {\eta },\overline {\xi })^{\top }=(2.94, 2.20)^{\top }$, medium accuracy $(\overline {\eta },\overline {\xi })^{\top }=(1.39, 1.50)^{\top }$ and low accuracy $(\overline {\eta },\overline {\xi })^{\top }=(0.62, 0.85)^{\top }$. Accordingly, (SE _{ i },SP _{ i })^{⊤}=(0.95,0.90)^{⊤},(SE _{ i },SP _{ i })^{⊤}=(0.80,0.82)^{⊤} and (SE _{ i },SP _{ i })^{⊤}=(0.65,0.70)^{⊤},i=1,…,n. Increasing correlation between η _{ i } and ξ _{ i } is considered, ρ∈{0.2;0.6;0.8}. Betweenstudy variances $\sigma ^{2}_{\eta }$ and $\sigma ^{2}_{\xi }$ are fixed equal to 1.2 and 0.5, respectively. One thousand datasets are generated for each combination of sample size, correlation and values of $(\overline {\eta },\overline {\xi })^{\top }$.
The integrals in the BinomialNormal approach are approximated via a GaussHermite procedure with 100 quadrature points. Inference in the NormalNormal model is carried out using the restricted maximum likelihood, while inference in the BinomialNormal model uses the maximum likelihood estimation. Likelihood maximisation, based on the Nelder and Mead algorithm [36], employs the method of moments estimates as starting values. SIMEX considers B=100 remeasured data generated using the GramSchmid process, λ assuming values in Λ={0.0,0.5,1.0,1.5,2.0} and the quadratic extrapolation function. Parameter estimation within the simulation step is based on model (2). All the methods are implemented in the R programming language [37].
Methods are compared with respect to bias and estimate of standard error of the estimators of the parameters $\overline {\eta }, \overline {\xi }, \sigma ^{2}_{\eta }, \sigma ^{2}_{\xi }, \rho $ and in terms of the 95% confidence interval for the estimators of the measures of diagnostic accuracy given by the diagnostic odds ratio dOR, the positive likelihood ratio LR+ and the negative likelihood ratio LR−. The performance of the methods in terms of convergence problems is investigated as well. Successful convergence is intended as meeting the criterion convergence (e.g., difference between current and updated estimates less than 0.0001) and positive definite variance/covariance matrix. The results under nonconvergence are excluded when summarising the simulation results.
Results
Simulation results
Tables 1 and 2 report the simulation results for n=10 under the normal, the t and the skewnormal specification for (η _{ i },ξ _{ i })^{⊤}, by distinguishing the high accuracy scenario and the low accuracy scenario. Results for n=25 and results for the medium accuracy scenario are included in the Additional file 1. In all the tables, the nonconvergence rate is reported in bold when larger than 5%.
Results for the high accuracy scenario (Table 1) show that, for metaanalysis with small sample size and under the randomeffects normal specification, the BinomialNormal approach appears to be preferable in terms of bias of the estimators with respect to alternative solutions, although at the price of a sligthly larger standard error. Such a behaviour, however, deteriorates when moving to t and skewed distributions, with the bias tending to increase as the value of the correlation ρ becomes smaller. Under a t randomeffects distribution, the BinomialNormal approach and SIMEX show an increased bias of the estimators of the variance components $\sigma ^{2}_{\eta }$ and $\sigma ^{2}_{\xi }$, together with an increased standard error. The effects for the NormalNormal approach are less marked. When considering a skewnormal distribution for (η _{ i },ξ _{ i })^{⊤} with a high value of skewness, SIMEX appears to be the preferable solution in terms of bias, in particular when referring to the estimates of the variance components.
The impact of misspecification of the randomeffects distribution on inferential conclusions becomes more evident when computing the empirical coverage of confidence intervals at 95% nominal level for dOR (Fig. 1) and LR+ and LR− (Fig. 2). For dOR (Fig. 1), the NormalNormal approach and the BinomialNormal approach provide the less satisfactory results, with empirical coverage probabilities far from the target 95% level, in particular when ρ is small and under a skewnormal specification for the randomeffects distribution. See, for example, the extreme case ρ=0.2 for the highskewness scenario, with a coverage equal to 10% only for the NormalNormal approach. Under the normal and t scenarios, the performance of the methods is comparable to that of SIMEX, with a slight underestimation of the coverage level for the NormalNormal method. For skewed scenarios, the advantages of SIMEX over both the likelihoodbased solutions are more marked. The empirical coverages of the confidence intervals are substantially closer to the nominal level, especially for large ρ. Figure 2 substantiates the results for LR+ and LR−. Under a normal or t specification for (η _{ i },ξ _{ i })^{⊤}, methods are still comparable, although using the NormalNormal likelihood method produces coverages for LR− lower than alternatives. For skewed distributions, SIMEX exhibits a more satisfactory behaviour than likelihood solutions in most of the cases when considering LR+: for high values of ρ and a low skewness, the behaviour is comparable to the other methods. For LR−, instead, SIMEX outperforms alternatives, with empirical coverages of confidence intervals closer to the target level as ρ increases.
Substantial differences between the competing methods occur in terms of failure rate of the estimation process, see the last column of Table 1. Convergence problems affect the likelihood approach, under the BinomialNormal formulation in particular, in this way confirming previous findings in the literature [9, 16, 19]. The failure rate is notable when n=10, increases with ρ and deteriorates under a skewnormal randomeffects specification with high values of skewness, thus making the use of the likelihood solution questionable. For the high skewness case and ρ=0.8, for example, the BinomialNormal approach reaches 25.6% of failures. More extreme experiments with ρ=0.9, not reported here, substantiate the results, with a further growth of nonconvergence rate higher than 31%. Conversely from the likelihood approach, the application of SIMEX does not fail in any of the examined situations, irrespectively of the sample size n, the correlation ρ and the randomeffects distribution, with a nonconvergence rate constantly equal to zero.
When moving to the low accuracy scenario, results are similar to those observed for the high accuracy case. Consider, for example, results reported in Table 2, where bias and estimated standard error of the estimators are only slightly reduced with respect to Table 1. The most interesting result is the reduction of the failure rate with respect to the high accuracy scenario. With regards to the likelihood analysis, the most substantial reduction of nonconvergences is observed for the BinomialNormal formulation, whose failure rate does not exceed the 5% level under the normal or t randomeffects formulation and reaches 8.5% under the skewnormal distribution. Similarly to the high accuracy context, the failure rate tends to increase with ρ. SIMEX maintains a failure rate equal to zero.
Results for the medium accuracy scenario are reported in Additional file 1. Conclusions are coherent wth those from the low and high accuracy scenario. From a computational point of view, nonconvergence problems mainly affect the BinomialNormal approach, with failure rates reaching 22.3% under the skewnormal specification when the sample size is small.
Results for n=25 are reported in Additional file 1. Inferential conclusions about the bias of the estimators under all the accuracy scenarios remain globally similar to those for n=10, with the advantage of a reduced estimate of the standard error of the estimators. The most interesting result related to the increased sample size is the reduction of the failure rate, under all the examined situations. The NormalNormal approach is almost convergent in all the simulation settings. The BinomialNormal approach substantially reduces the number of failures, with just two cases exceeding the 5% threshold, corresponding to the skewnormal case with high skewness and ρ=0.8 in the high accuracy and medium accuracy scenarios. SIMEX maintains a failure rate equal to zero.
Data example
Van Zaane et al. [38] perform a metaanalysis of six diagnostic accuracy studies for the assessment of atherosclerosis in the ascending aorta in patients undergoing cardiac surgery through transesophageal echocardiography in place of the referencestandard method given by epiaortic ultrasound scanning. The available information is reported in Table 3. Data have been recently reanalysed in Zapf et al. [15] through a nonparametric approach.
The likelihood analysis with the NormalNormal specification results in estimated variances $\sigma ^{2}_{\eta }$ and $\sigma ^{2}_{\xi }$ almost zero. This affects the evaluation of the variance/covariance matrix for the whole parameter vector. Under the constraint of variances equal to zero, the estimation of the sensitivity and the specificity are concordant with the results obtained in van Zaane et al. [38], using the bivariate randomeffects model of Reitsma et al. [7]. The corresponding results are reported in Table 4, together with the associated 95% confidence interval. As shown in Zapf et al. [15], the likelihood analysis using the BinomialNormal specification does not converge and the likelihood estimation of the correlation ρ is on the boundary of the parameter space, equal to 1. These results coincide with those available from our implementation of the likelihood approach. Changing the optimization algorithm and the starting values does not solve the nonconvergence problem. The application of SIMEX, conversely, does not pose any convergence issue. The estimates of $\overline {\eta }$ and $\overline {\xi }$ resulting in −1.525 (standard error 0.292) and −4.266 (standard error 0.325), respectively, give rise to the estimates of sensitivity and specificity as reported in Table 4. SIMEXbased confidence intervals for sensitivity and specificity are narrower than those available from the NormalNormal approach. The results from the nonparametric approach of Zapf et al. [15], close to those from SIMEX, are reported for completeness.
Discussion
Results from the simulation studies indicate that SIMEX leads to satisfactory inferential results in a wide range of scenarios. When the normality assumption for the randomeffects distribution holds, the method is comparable to the likelihood solutions in terms of bias and estimated standard error of the estimators of the parameters of interest and slightly superior to the BinomialNormal formulation in terms of empirical coverages of confidence intervals for different diagnostic accuracy measures. When departures from the normality assumptions hold in terms of low or high skewness, then advantages of using SIMEX are much more evident. In particular, empirical coverages of confidence intervals for the diagnostic accuracy measures are closer to the 95% target level than alternatives. The likelihood approach under the BinomialNormal formulation shows a less satisfactory performance.
A substantial difference between SIMEX and the likelihood approach is in terms of failure rate of convergence. SIMEX has not convergence problems whichever the examined scenario. Conversely, likelihood solutions suffer for convergence difficulties, especially in case of skewed randomeffects distribution. The highest levels of failure rate are reached using the BinomialNormal formulation and they are much more frequent as the sample size is small and the value of the correlation ρ increases. Simulation results are in accordance with previous findings in the literature about convergence issues and numerical instabilities of the likelihood approach. Possible solutions evaluated in the simulation studies, including the change of the optimisation algorithm and the change of the starting values [16, 19], only slightly reduce the number of failures. When adopting the SIMEX strategy, several solutions are available in case of convergence failure, although we did not experience such a problem in our study. Possible solutions include the choice of a different estimation method within each bth replication of the simulation step or varying the number of simulated datasets B. An additional practical strategy is the visual inspection of the SIMEX components and the direct extrapolation of the points of interest. This strategy is suggested in Section B.4.1 of Carroll et al. [28] when the SIMEX estimated variance/covariance matrix is nonpositive definite. Although possible, this is an infrequent case and we did not encounter it in our simulations.
From a strictly practical point of view, the implementation of SIMEX, despite its simulationbased nature, is not involved neither timeconsuming and can proceed by taking advantage of simple estimation methods, such as the method of moments. The R [37] code for SIMEX implementation is made available in the Additional file 2 and illustrated in the Additional file 3.
Although the scenarios investigated in the paper do not consider the presence of additional level covariates in model (1), SIMEX can be extended to account for them. In this case, the number of remeasured datasets B is recommended to be increased in order to guarantee the results having an acceptable precision, see Section 5.3 in Carroll et al. [28].
In this paper, the model structure used for the simulation step in SIMEX is given by the approximate model (2) in place of the exact model (3). The choice implies that, when necessary, the correction that adds 0.5 to the twobytwo table cells equal to zero is applied. Additional empirical investigations with different correction values show that the 0.5 correction does not impact the results. Such a behaviour is related to the SIMEX procedure, as the correction can affect only the original data, while the remeasured data of the simulation step are not influenced. Simulating the discrete components of the twobytwo table in place of their continuous logit transformations $\hat {\eta }_{i}$ and $\hat {\xi }_{i}$ is theoretically possible. In this case, the measurement error problem affects the classification of the positive/negative results in the twobytwo table, thus being called misclassification problem, see Küchenhoff et al. [39]. However, moving from the new generated discrete data to the logit transformations would still be an obligatory step, as the data are necessary for inference in the main model (1). In this case, the 0.5 correction would still apply, not only on the original data but in every case the problem arises within the simulation step. How to circumvent these limitations when simulating from the exact model (3) represents a topic of future research.
Conclusions
This paper focused on bivariate randomeffects models for metaanalysis of diagnostic test accuracy. Attention is paid to the presence of errors affecting the measures of diagnostic accuracy. Standard likelihoodbased procedures are shown to be prone to several drawbacks, despite their wide diffusion. The inaccuracy of inferential conclusions for small sample size and in case of misspecification of the randomeffects distribution is accompanied by computational issues which seriously affect the applicability of the approach. The SIMEX methodology represents an interesting and promising alternative. Reliable inference properly accounting for the presence of measurement errors is obtained with neither computational effort not numerical instabilities. The satisfactory performance of SIMEX illustrated through extensive simulation experiments is not affected by study characteristics, such as sample size or measurement error correlation. Robustness to departures from normal randomeffects distributions is a substantial improvement over standard likelihood solutions. The availability of the R code for a userfriendly implementation of SIMEX is aimed at encouraging its use.
Abbreviations
 dOR:

Diagnostic odds ratio
 FN:

False negatives
 FP:

False positives
 LR:

Negative likelihood ratio
 LR+:

Positive likelihood ratio
 s.e.:

Standard error
 SE:

Sensitivity
 SIMEX:

Simulation extrapolation
 SP:

Specificity
 SROC:

Summary operating receiver characteristic
 TN:

True negatives
 TP:

True positives
References
 1
Honest H, Khan K. Reporting of measures of accuracy in systematic reviews of diagnostic literature. BMC Health Serv Res. 2002; 2:4:266–7.
 2
Riley RD, Abrams KR, Sutton AJ, Lambert PC, Thompson JR. Bivariate randomeffects metaanalysis and the estimation of betweenstudy correlation. BMC Med Res Methodol. 2007; 7:3:266–7.
 3
Riley RD, Thompson JR, Abrams KR. An alternative model for bivariate randomeffects metaanalysis when the withinstudy correlations are unknown. Biostatistics. 2008; 9:172–86.
 4
Hamza TH, van Houwelingen HC, Stijnen T. The binomial distribution of metaanalysis was preferred to model withinstudy variability. J Clin Epidemiol. 2008; 61:41–51.
 5
Littenberg B, Moses LE. Estimating diagnosticaccuracy from multiple conflicting reports: a new metaanalytic method. Med Dec Making. 1993; 13:313–21.
 6
Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: dataanalytic approaches and some additional considerations. Stat Med. 1993; 12:1293–316.
 7
Reitsma JB, Glas AS, Rutjes AWS, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol. 2005; 58:982–90.
 8
Arends LR, Hamza TH, van Houwelingen HC, HeijenbrokKal MH, Hunink MGM, Stijnen T. Bivariate random effects metaanalysis of ROC curves. Med Dec Making. 2008; 28:621–38.
 9
Chen Y, Liu Y, Ning J, Nie L, Zhu H, Chu H. A composite likelihood method for bivariate metaanalysis in diagnostic systematic reviews. Stat Methods Med Res. 2014. http://journals.sagepub.com/doi/full/10.1177/0962280214562146.
 10
Chu H, Nie L, Chen Y, Huang Y, Sun W. Bivariate random effects models for metaanalysis of comparative studies with binary outcomes: methods for the absolute risk difference and relative risk. Stat Methods Med Res. 2012; 21:621–33.
 11
Kuss O, Hoyer A, Solms A. Metaanalysis for diagnostic accuracy studies: a new statistical model using betabinomial distributions and bivariate copulas. Stat Med. 2014; 33:17–30.
 12
Nikoloulopoulos AK. A mixed effect model for bivariate metaanalysis of diagnostic test accuracy studies using a copula representation of the random effects distribution. Stat Med. 2015; 34:3842–65.
 13
Eusebi P, Reitsma JB, Vermunt JK. Latent class bivariate model for the metaanalysis of diagnostic test accuracy studies. BMC Med Res Methodol. 2014; 14:88.
 14
Schlattmann P, Verba M, Dewey M, Walther M. Mixture models in diagnostic metaanalyses – clustering summary receiver operating characteristic curves accounted for heterogeneity and correlation. J Clin Epidemiol. 2015; 68:61–72.
 15
Zapf A, Hoyer A, Kramer K, Kuss O. Nonparametric metaanalysis for diagnostic accuracy studies. Stat Med. 2015; 34:3831–41.
 16
Hamza TH, Reitsma JB, Stijnen T. Metaanalysis of diagnostic studies: A comparison of random intercept, normalnormal, and binomialnormal bivariate summary ROC approaches. Med Dec Making. 2008; 28:639–49.
 17
Paul M, Riebler A, Bachmann LM, Rue H, Held L. Bayesian bivariate metaanalysis of diagnostic test studies using integrated nested Laplace approximations. Stat Med. 2010; 29:1325–9.
 18
Diaz M. Performance measures of the bivariate random effects model for metaanalyses of diagnostic accuracy. Comput Stat Data Anal. 2015; 83:82–90.
 19
Takwoingi Y, Guo B, Riley RD, Deeks JJ. Performance of methods for metaanalysis of diagnostic test accuracy with few studies or sparse data. Stat Methods Med Res. 2015. http://journals.sagepub.com/doi/abs/10.1177/0962280215592269?url_ver=Z39.882003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%3dpubmed.
 20
Cook JR, Stefanski LA. Simulation extrapolation estimation in parametric measurement error models. J Am Stat Assoc. 1994; 89:1314–28.
 21
Stefanski LA, Cook JR. Simulationextrapolation: the measurement error jackknife. J Am Stat Assoc. 1995; 90:1247–56.
 22
Rutter CM, Gatsonis CA. A hierarchical regression approach to metaanalysis of diagnostic test accuracy evaluations. Stat Med. 2001; 20:2865–84.
 23
Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA. A unification of models for metaanalysis of diagnostic accuracy studies. Biostatistics. 2007; 8:239–51.
 24
Chu H, Cole SR. Bivariate metaanalysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. J Clin Epidemiol. 2006; 59:1331–3.
 25
Van Houwelingen HC, Arends LR, Stijnen T. Advanced methods in metaanalysis: multivariate approach and metaregression. Stat Med. 2002; 21:589–624.
 26
McIntosh MW. The population risk as an explanatory variable in research synthesis of clinical trials. Stat Med. 1996; 15:1713–28.
 27
Schmid CH, Lau J, McIntosh MW, Cappelleri JC. An empirical study of the effect of the control rate as a predictor of treatment efficacy in metaanalysis of clinical trials. Stat Med. 1998; 17:1923–42.
 28
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu C. Measurement Error in Nonlinear Models: A Modern Perspective. Boca Raton: Chapman & Hall, CRC Press; 2006.
 29
Buonaccorsi JP. Measurement Error: Models, Methods and Applications. Boca Raton: Chapman & Hall, CRC Press; 2010.
 30
Arends LR, Hoes AW, Lubsen J, Grobbee DE, Stijnen T. Baseline risk as predictor of treatment benefit: three clinical metareanalyses. Stat Med. 2000; 19:3497–518.
 31
Ghidey W, Stijnen T, van Houwelingen HC. Modelling the effect of baseline risk in metaanalysis: A review from the perspective of errorsinvariables regression. Stat Methods Med Res. 2013; 22:307–23.
 32
Guolo A. The SIMEX approach to measurement error correction in metaanalysis with baseline risk as covariate. Stat Med. 2014; 33:2062–76.
 33
Carroll RJ, Küchenhoff H, Lombard F, Stefanski LA. Asymptotics for the SIMEX estimator in nonlinear measurement error models. J Am Stat Assoc. 1996; 91:242–50.
 34
Holcomb J. Regression with covariates and outcome calculated from a common set of variables measured with error: Estimation using the SIMEX method. Stat Med. 1999; 18:2847–62.
 35
Azzalini A. A class of distributions which includes the normal ones. Scand J Stat. 1985; 12:171–8.
 36
Nelder JA, Mead R. A simplex algorithm for function minimization. Scand J Stat. 1965; 7:308–13.
 37
R Core Team: R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2015. http://www.Rproject.org/.
 38
Van Zaane B, Zuithoff NPA, Reitsma JB, Bax L, Nierich AP, Moons KG. Metaanalysis of the diagnostic accuracy of transesophageal echocardiography for assessment of atherosclerosis in the ascending aorta in patients undergoing cardiac surgery. Acta Anaesthesiol Scand. 2008; 52:1179–87.
 39
Küchenhoff H, Mwalili SM, Lesaffre E. A general method for dealing with misclassification in regression: The misclassification SIMEX. Biometrics. 2006; 62:85–96.
Acknowledgments
Not applicable.
Funding
This work was supported by a grant from the University of Padova (Progetti di Ricerca di Ateneo 2015, CPDA153257).
Availability of data and materials
The real data used for the illustration of SIMEX and likelihood procedures are reported in Table 3. The R code to implement SIMEX is available in the Additional file 2 and the commands needed to apply the software for data analysis are illustrated in the Additional file 3.
Competing interests
The author declares that she has no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Author information
Additional files
Additional file 1
The file includes a portion of the simulation results comparing the performance of SIMEX and likelihoodbased approaches in the medium accuracy scenario and in the low and high accuracy scenarios for sample size n=25. (PDF 91 kb)
Additional file 2
R code for applying the SIMEX approach. (R 10.2 kb)
Additional file 3
The file illustrates how to apply SIMEX for data analysis using the R code available in Additional file 2. (PDF 64.4 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Received
Accepted
Published
DOI
Keywords
 Bivariate metaanalysis
 Diagnostic test
 Likelihood inference
 Measurement error
 SIMEX