On the censored cost-effectiveness analysis using copula information

Background Information and theory beyond copula concepts are essential to understand the dependence relationship between several marginal covariates distributions. In a therapeutic trial data scheme, most of the time, censoring occurs. That could lead to a biased interpretation of the dependence relationship between marginal distributions. Furthermore, it could result in a biased inference of the joint probability distribution function. A particular case is the cost-effectiveness analysis (CEA), which has shown its utility in many medico-economic studies and where censoring often occurs. Methods This paper discusses a copula-based modeling of the joint density and an estimation method of the costs, and quality adjusted life years (QALY) in a cost-effectiveness analysis in case of censoring. This method is not based on any linearity assumption on the inferred variables, but on a punctual estimation obtained from the marginal distributions together with their dependence link. Results Our results show that the proposed methodology keeps only the bias resulting statistical inference and don’t have anymore a bias based on a unverified linearity assumption. An acupuncture study for chronic headache in primary care was used to show the applicability of the method and the obtained ICER keeps in the confidence interval of the standard regression methodology. Conclusion For the cost-effectiveness literature, such a technique without any linearity assumption is a progress since it does not need the specification of a global linear regression model. Hence, the estimation of the a marginal distributions for each therapeutic arm, the concordance measures between these populations and the right copulas families is now sufficient to process to the whole CEA.


Background
Due to the variety of treatments for a specific health problem and in conjunction with their increasing costs, cost-effectiveness studies of new therapies is challenging. These studies could achieve to a statistical analysis since that the common practice in laboratories is to collect individual patient cost data in randomized studies. Furthermore, it is now possible to compute the incremental net benefit from the use of a new therapy in comparison with the common-in-use therapy.
In the last decades, the cost-effectiveness analysis (CEA) of new treatments became an actual subject of work for *Correspondence: charles.fontaine@usherbrooke.ca UPRES EA2415-Institut Universitaire de Recherche Clinique, Université de Montpellier, 641, Av. du doyen G.-Giraud, Montpellier, France statisticians. It is used in two particular designs: decision modeling-based CEA and trial-based CEA. The major difference between both approaches is that in the case of trial-based CEA, data are gathered at the patients level in particular studies and it may lead to overlearning from the study, which may lead to mistakes in interpretation when the results are inferred for populations. In contrast, decision modeling-based CEA are based on easily generalizable data.
The cost-effectiveness analysis is used to measure the incremental cost-effectiveness ratio (ICER) and the incremental net benefit (INB). The ICER is defined as the ratio: where C 1 is the cost of the tested therapeutic, C 0 is the cost of the control group which is usually measured in term of a given monetary unit, T 1 is the effectiveness of the tested therapeutic which is usually measured in term of survival life years, T 0 is the effectiveness of the control group and, E(•) is the expectation function. Therefore, it is an indicator of the monetary cost of using a new therapy in terms of survival time. On the other hand, the INB is defined as the following difference: where λ is the willingness-to-pay for a unit of effectiveness.
In the literature, many articles propose ways to estimate these quantities. At first, Willan and Lin [1] proposed an approach which is based on the sample mean. It was applied directly to survival time years. In case of censored data, they proposed to estimate the survival function for each arm using the product-limit estimator [2] and then to estimate the survival time by integrating the survival functions until a time τ (i.e. μ j is the life expectancy until τ ), the maximal time of follow-up. Concerning the estimation of costs in case of censoring, many estimators may be used. Zhao et al. [3] have shown the equivalence among them.
The quality adjusted life years (QALY) concept was introduced in 1977 by Weinstein and Stason [4], and is still actually one of the most important notions in the cost-effectiveness theoretical framework. In the paper of Willan et al. [5], the concept of quality of life adjusted to the survival time is defined as follows. Let q ji be the quality adjusted survival for the period of interest for the patient i which follows the treatment j and let q jki be the observed quality of life for the patient i receiving treatment j, j = 0, 1 during the interval of time a k . In fact, this is a representation of the standard survival times scaled down by the quality of life experienced by patients. One note that the duration of interest of a study (0, τ ] is divided in K (arbitrary, relative to the data) sub-intervals [a k , a k+1 ) where k = 1, 2, . . . , K and where 0 = a 1 < a 2 < . . . < a K+1 = τ . Thus, one can determine the value of q jki in the following path: let a patient i be on treatment j with B ji quality of life measured at times t ji1 , t ji2 , . . . , t jim ji with respective scores Q ji1 , Q ji2 , . . . , Q jim ij . These scores are nothing more than the utility values. Therefore, q jki = a k+1 a k Q(t)dt is a weighted sum of times spent in the different quality of life states where: if 0 ≤ t < t ji1 ; and where X ji = min(T ji , η ji ), δ ji = 1 {T ji <η ji } where η represents the censoring random variable. Furthermore, let Y jki = 1(X ji ≥ a k and [ X ji ≥ a k+1 or δ ji = 1] ), which indicates if a patient i on the treatment arm j is alive at time a k and is not censored on [ a k , a k+1 ) and let then using a known expression of variance [5], one obtains the estimation of μ j , the expected value of effectiveness adjusted to QALY, witĥ More recently, Willan et al. [6] proposed to realize the whole cost-effectiveness analysis using linear regression methods. Let C i be the observed cost for patient i, then E(C ji ) = β T C i Z C ji , i = 1, 2, . . . , n j , Z C ji is a vector of covariates affecting costs and β C j is a vector of unknown regression parameters. Then, using the inverse probability of censoring weighting (IPCW) methods, they proposed a way to estimate the second component of β C j which is the mean difference in costs between randomization groups adjusted to other covariates,ˆ c , and its associate variance. The same methodology is done there to estimate the mean difference in mean survival between randomization groups adjusted for the other covariates, e . Thus, they proposed to estimate the ICER adjusted for the quality of life byˆ c /ˆ e and used this time again the Fieller's theorem to find a 100(1 − α) confidence interval. For the adjusted INB, they proposed to estimate it byb λ = λˆ e −ˆ c with varianceσ 2 λ = λ 2σ 2 e + σ 2 c − 2λσ c e . Thus, if the INB is positive, the therapy is cost-effective and if it is negative, there is no costeffectiveness. One remarks that it is possible to determine the cost-effectiveness under a significance level α using a statistic of test: ifb λ /σ λ is greater than the test level z 1−α , at a level α, the therapy is cost-effective compared to the standard.
From that linear regression approach, many variants exist either parametric or semi-parametric [7]. The main problem of the estimator for cost-effectiveness analysis based on the linear regression is that even if the inverseprobability censored weighting estimator is consistent, it is not efficient because if an individual is censored before or at time a K+1 , the patient does not contribute to the sum that constitutes this estimator and the practitioner may lose some statistical significant information. Furthermore, when there is no censoring, it is equivalent to the solve ordinary least squares equations; for which the linearity assumption may lead to serious bias in case of non-linearity. In the case of a semi-parametric estimator, the main problem is that the estimation of the conditional moments may be hard to find accurately or may lead to some bias in function of the used estimator. For these reasons, we introduce a new cost-effectiveness analysis methodology and a modeling of the joint density function between costs and QALY using parametric copulas. It is therefore based only on the dependence between covariates and the prior information on the variables distributions.

Copula function
To introduce the theory beyond the new proposed estimator for cost-effectiveness analysis, it is crucial to present the concept of modeling the dependence among two or more variables, namely the copula function concept. The idea of the copula started with Sklar [8] who formulated his famous theorem as follows: can be written as: where C is the cumulative distribution functionof the copula. In fact, it is a cumulative distribution function from [ 0, 1] d to [ 0, 1] with uniform margins over the unit interval [ 0, 1] such that: . Therefore, the copula density can be written as follows: Thus, for a bivariate model, the joint density function of stochastic variables X 1 and X 2 is: Note that parametric and nonparametric copulas estimation exists. Also, one has to remark that given two marginal distributions, the copula that joins them is unique if and only if these margins are continuous. In the literature, few papers deal with copula and costs data. At first, Embrechts et al. [9] present an original article about the correlation measurement with those data. However, the copula concepts presented there are more about generalizations of the copula theory than about a modeling methodology. Secondly, Frees and Valdez [10] use copulas for costs data in the insurance area. Their work explains how to fit copulas and measure the mortality risks. Thirdly, there is Hougaard [11] who uses copula with costs data in a multivariate survival analysis context. Also, there is Zhao and Zhou who work with copula models using medical costs, but in a stochastic process context, which is not adapted to QALY and costs data when the information is limited.
In this paper, we assume the copula to be parametric, which means that the copula can be written in a particular way as a function of the chosen family, with one parameter who summarizes the dependence between the variables, and the marginal distributions to be continuous. For more about copulas, see Nelsen [12].

Determination of QALY in terms of time and quality of life
If the survival time adjusted for the quality of life is already measured, one should directly estimate the parameters of its distribution. However, most of the time, practitioners only have two variables: time and quality of life. As shown in the previous section, the classical adjustment method for time on quality of life is given by is the adjustment of quality of life scores on the interval of time of interest. Since that the function H(t) = t 0 Q(v(y))dy is monotonically increasing, it is possible to rewrite the cumulative distribution function of T adj as a composition of functions such as: where H −1 () is the generalized inverse function, and the probability density function such that: where f adj is the density function of T adj and f is the density function of T. Therefore, for an individual i on treatment j, the practicioners have the following measures, E adj ji , which is such that: where η adj ji represents the censoring adjusted on quality of life and T adj ji the survival time having the same adjustment. Also, let C ji the cumulative cost for individual i on arm j. Thus, we get the following dependency evidences: 1. T adj ji and η ji are dependent, 2. T adj ji and η adj ji are independent, 3. C ji and η ji are dependent.
Proofs are provided in the "Appendix" section.

Estimation of the parameters of the distributions
Even if, to begin, the right distributions for costs and QALY are unknown, it is possible to infer their two main parameters: mean and variance. In fact, we will consider here each arm of the trial as a distinct random variable with distinct mean and variance but with the same probability distribution. Furthermore, we will assume that non-administrative censoring exists as the main consideration of the estimation. Let Z C ji be the dvector of covariates that affect costs for arm j, j = 0, 1, for the grouped population, and Z E ji be the one for QALY. Then, as proposed by Thompson and Nixon [13] and Stamey et al. [14], the costs mean function, on an arm j, is defined as: and, the QALY variable mean function given costs is defined as: As these models are, in fact, linear regression models with censoring on covariates, using the method of Lin [15], one can estimate the regression coefficients vector α C by the sum over the k periods of time of interest α C = K k=1α C k using an inverse probability weighting method (IPCW) such that, for an individual i belonging to arm j, ) and, as described in "Background" section , X ji is the minimum between time from randomization to death and time from randomization to censoring, and δ ji = 1(T ji ≤ η ji ). The same approach is used to find β, the vector of regression coefficients for QALY. Thus, from the inference on coefficients, it is possible to determine the adjusted mean on survival.
In terms of variance, we propose the use of the result of Buckley and James [16] (see also Miller and Halpern [17]), which is a generalization of the IPCW techniques.
Thus, the approximate variance of the cost distribution on a given arm is:

Determination of the parametric distributions
To model costs, three common distributions are frequently used: Gamma, Normal and Lognormal distributions. Their parametrization is easily done given the mean and the variance of the distribution. Let μ C be the mean and σ 2 C be the variance of costs for any clinical arm j. Then, the parametric distribution choice will be one of the following: where ν C and τ 2 C are mean and variance of the logcosts, i.e.
Furthermore, ρ C is the shape parameter of the Gamma distribution, which is such that ρ C = μ 2 C /σ 2 C . Thus, once each modeling is achieved, a selection of the better parametric distribution has to be done using the deviance criteria. The best fit to the data corresponds to the distribution that has the lower deviance, which is minus two times the log-likelihood.
In the case of QALY, the choice may be any symmetric or skew-symmetric distribution. We propose two options here, but the classical way here is to consider only a gaussian distribution following the work of Thompson and Nixon [13]. One notes that, even if it may look strange to use a real defined function for a real positive distribution, the distribution of T adj usually has a high mean with a low standard error such that the negative part of the fitted distribution is in fact negligible. Then, the proposed options are:

Inference on Kendall's tau
To further compute the copula parameter for each tested copula in the selection process, one has to get the global dependence parameter: Kendall's tau. The idea is that performing inference on Kendall's tau instead of directly on the copula parameter for each tested copula leads to only one estimation process instead of as many inferences as the quantity of tested models. Surely, the maximum likelihood estimator can be used in order to get the copula parameter. However, in this paper, we suggest the use of Kendall's tau for non-randomized data in reason of the easiness of the method. Furthermore, as shown by Genest et al. [18,19], the difference between both inference methods is slightly significant compared to the gain in computational efficiency. Note that in case of randomized data, the exchangeability phenomenon among individuals occurs and concordance measure may be biased. Hence, in this case, we suggest the use of a standard maximum likelihood estimation for the copula parameter.
The Kendall's tau inversion method to infer a copula parameter has been shown to be consistent under specific assumptions, which holds here [20]. In fact, for every parametric copula family, there exists a direct relationship between the copula parameter and the Kendall's tau. For example, with Clayton copula, one hasθ = 2τ/(1 − τ ) and for Gumbel copula,θ = 1/(1 − τ ). These relationships are clearly given in almost all the literature about parametric copulas [12]. Let consider the couple of random variables (C, T adj ) on a fixed therapeutic arm j, j = 0, 1. Furthermore, let consider C {1} , T and discordant elsewhere. The Kendall's tau, which is in fact a concordance measure, is defined in Kendall [21] by where E is the expectation, 1 is the indicator function, In a more general frame, one has the couples C {1} , T In absence of censoring, the estimation of τ is given by its sample version: where n is the sample size. In fact, it is simply the n(n − 1)/2 pairs of bivariate observations that could be constructed, multiplied by the subtraction of the number of discordant pairs to the number of concordant pairs. Under censoring, the approach of Oakes [22] propose to add an uncensoring indicator to that equation such that: where U {r} and U {s} represent the censoring variables under each independent replication. The problem with that estimator is the lack of consistency on a highdependent scheme. Therefore, one advocates the use of the renormalized Oakes' estimator [23] for which consistency has been shown. Therefore, the estimator is simply the ratio of the subtraction of the number of uncensored discordant pairs to the number of uncensored concordant pairs over the total of uncensored pairs.

Bayesian selection of the copula
Inference here may be done on any consistent parametric copula. To select the right copula, few alternatives are proposed in the literature. For complete data distributions, Genest and Rivest [18] proposed a procedure based on a probability integral transformation. Furthermore, based on the goodness-of-fit, Lakhal-Chaieb [24] proposed a procedure for censored data when the distributions are estimated by the survival functions. However, when available, a prior knowledge of the distribution of the margins in a copula is not a negligible information. It should be taken into account for the inference of the copula model when the latter is unknown, to minimize the risk of errors.
In their paper, Dos Santos Silva et al. [25] proposed a method of selection based on the information criterion. Let note F T adj (y) and F C (χ) the cumulative distribution functions for QALY and costs for a given randomization arm. Thus, one gets: where C stands for a parameter vector comprising parameters of the distribution of costs, T adj is the parameter vector for QALY distribution, θ is the dependence parameter which is functional of the Kendall's tau and = C ∪ T adj ∪ θ is the union of all these vectors of parameters. Also, f and F respectively stand for the density probability function and the cumulative probability function. Note that a similar writing could be done for a multivariate modeling. As shown in Genest et al. [26], one can find the copula parameter of any parametric copula while just having the Kendall's tau [21] measure, even in multivariate models [27].
Let x be a bivariate sample of size n of that density function. Also, let M k a copula model for k = 1..m where m is the quantity of models one wants to test. Therefore, the likelihood function is given by: where δ j indicates if an individual is censored or not. Then, using the deviance function which is D( k ) = −2ll(x| , M k ) where ll stands for the log-likelihood function, Dos Santos Silva et al. [25] proposed to use the deviance information criterion (DIC) which is: One remarks that such a selection process requests parametric copulas with a limited quantity of parameters. Hence, one suggests the use of archimedean and elliptic copulas families for these modelings. In fact, the range of dependance structure models provided by archimedean copulas is enough large to cover the dependance parametrisation in a cost-effectiveness analysis. For example, a Clayton copula may represent a study where a small QALY is directly linked to small costs, but a huge QALY is independent of costs. Also, one remarks that if costs and QALY are independents, then the independence copula is selected and one gets directly the values that constitute the ICER.

Incremental cost-effectiveness ratio
From the estimated joint densities f (y j , χ j ), j ∈ {0, 1}, one writes, for costs: where D C j and D T adj j are respectively the domain of definition for the random variables C and T adj for arm j. For QALY, one has the following: Thus, given the expected costs and survival time adjusted to quality of life, the adjusted ICER is estimated by and using Fieller's theorem (Fieller [28], Willan and O'Brien [29], Chaudhary and Stearns [30]), one gets the 100(1 − α)% confidence interval such that In this formula, z 1−α/2 represents the 100(1 − α/2) percentile of the standard normal distribution. Furthermore, σ 2 T adj represents the variance of the distribution for effectiveness where T adj is the difference between T adj j=1 and T adj j=0 . The same scheme arises forσ 2 C . Forσ C T adj , it is nothing more than the estimated covariance of the differences, between j = 0 and j = 1, in costs and in quality adjusted life years.
The reason to use Fieller's theorem is to avoid standard ways of using bootstrap. As shown by Siani and Moatti [31], Fieller's method is often as robust as bootstrap methods are (both parametric and non-parametric ways), even in problematic cases. However, for non-common situations, an approach based on the ICER graphical plan (as proposed by Bang and Zhao [32]) is recommended. Indeed, in such a case, a graphical approach minimize the use of biased or non-sufficient statistics in a parametric confidence interval model.

Incremental net benefit
The adjusted INB(λ) is estimated by where λ is the willingnessto-pay for a unit of effectiveness.

Subgroups analysis
It is possible to accomplish a cohort analysis using that procedure. The main idea here is to perform a costeffectiveness analysis while achieving a discrimination between two or more subgroups. The principle is that there exists a baseline variable Z k , k ∈ {1, 2, . . . , d}, even for costs than for QALY, which is in fact a categorical variable (dichotomous or multichotomous) and for which one should determine a marginal INB. Such subgroups have to be based on clinically important covariates. Since that these subgroups are in the therapeutic arms, it is not possible to assume that they are balanced. As shown in Nixon and Thompson [33], and Tsai and Peace [34], a naive use of these subgroups information without any adjustment may lead to serious bias.
Let Z C jki , Z T adj jki be some population attribute indicator covariates (e.g. sex, smoking status, etc.) for costs and QALY on an individual i belonging to the clinical arm j. For the sake of illustrating the concept here, let say that one tests the therapeutic effect on smokers. Therefore, there will be four subgroups: smokers in the treated group, non-smokers in the treated group, and the same for the control groups.
Let T adj j=1,k=1,i be the effectiveness for smokers individuals i on the treated arm, T adj j=1,k=0,i be the effectiveness for non-smokers individuals i on the treated arm; and the same for the control arm. One does a similar writing for costs. Let E(T adj j ) = E(T adj j,k=1 ) − E(T adj j,k=0 ) and the same for costs. Then, the interest in this discrimination is on the incremental net benefit marginalized to the smokers cohort, which is Since that subgroups are inside clinical arms, the main issue is to establish the expression of variance. Adjusting Fieller's method to the subgroups context, one has where variances are computed in the standard path. For the covariance term, σ T adj C , there are two possible scenarios. Firstly, when the assumption that subgroups in treated and control arms are randomized is possible, one has cov T adj j=1 − T adj j=0 , C j=1 − C j=0 = cov T adj j=1 , C j=1 + cov T adj j=0 , C j=0 which can be found easily using standard techniques. Secondly, when the randomization assumption between subgroup is not possible because the cohorts are unbalanced in the clinical arms, then the covariance is: Here, the terms cov(T adj j=1 , C j=1 ) and cov(T adj j=0 , C j=0 ) are computed in a standard way similarly to the randomized case. However, for the crossed-arms covariance terms, cov(T adj j=1 , C j=0 ) and cov(T adj j=0 , C j=1 ), the approach that we suggest is to estimate the cumulative joint distributions F(E j=1 , C j=0 ) by Cθ (FĒ(y j=1 ),FC(χ j=0 )) and F(T adj j=1 , C j=0 ) by Cθ (FT adj (y j=1 ),FC(χ j=0 )) and F (T adj j=1 , C j=0 ) by Cθ (FT adj (y j=1 ),FC(χ j=0 )) according to the methodology shown in this paper, and then using the covariance definition cov(C, , compute the desired covariance from the estimated joint density function and the estimated marginal density functions.
In the spirit of the test of Willan et al. [6], one can test the equality of the INB between cohorts, which can be rejected at a level α if

Results and discussion
This section gives an illustration of the performance of the copula which provides the best estimate of the true copula and the cumulative joint distribution of costs and QALY respectively, according to the method presented above. Let the exact copula be C ( ) θ (F T adj (y| T adj ), F C (χ| C )) and its estimate be C (i) θ (F T adj (y|ˆ T adj ),F C (χ|ˆ C )) where (i) is the copula model selected among all the tested models and ( ) the exact copula model. Furthermore,F is the chosen distribution for F. Then, the objective of these simulations is to show that the bias generated by the approximation of θ byθ, = C ∪ T adj byˆ , the selection of C (i) as the copula model andF C ,F T adj as the marginal parametric models, is relatively weak.
In order to evaluate the performance of the proposed method in non-trivial cases, we performed Monte-Carlo simulations on 27 different simulation schemes. The methodology was to simulate bivariate data (representing costs and QALY) from three specific copulas. For each copula, we simulated the three possible configurations for the marginal distributions (costs are either normally distribute or lognormally distributed or follow a gamma distribution, while QALY is normally distributed). Then, we applied three different levels of randomly censoring on the marginal distribution of QALY (15, 30 and 70%). Hence, there were nine possibles copulas configurations; for these nine, we also challenged our methodology without any censoring to get a point of comparison for the inference of the Kendall's tau part of the model. For all the simulations, we assumed that T adj follows a normal distribution. Alternatively, we could have set T adj following a Gamma distribution. However, in the present core, the goal relatively to margins was to check that the determination of the marginal distributions criterion was adequate, which may be tested on only one margin (to simplify the text). The censoring followed an exponential distribution such that λ s=15 = 0.041, λ s=30 = 0.090 and λ s=70 = 0.308 where s represents the censoring percentage simulated. For all data generating processes (DGP), the Kendall's tau was identical and represented an intermediate level of dependence between marginal distributions to be fair with the reality: τ K = 0.60. Then, we computed the right copula parameter, for each copula, based on this Kendall's tau. We also used a relatively standard mean and variance in cost-effectiveness analysis, following parameters used by Thompson and Nixon [13], such that μ C = 1500, σ C = 400; μ T adj = 4, σ T adj = 0.75, and we parametrized each marginal distribution to keep close to these values. For the choice of generating copulas, we selected the three mostknown ones with the biggest difference: Gaussian, Clayton and Gumbel copulas. The DGPs schemes is shown on Table 1.
The simulation of linearly dependent and uncensored covariates for costs leads to a bias in our advantage for the computation of the mean and the variance, compared to the estimation performed according to a standard clinical scheme. We decided therefore to challenge our method using the Kaplan-Meier mean estimate of the survival function and its associated variance (which is the appropriate approach in absence of covariates of interest) instead of the presented procedure based on the covariates. Then, we applied the following steps: selecting a parametric distribution for the margins, selecting a parametric copula using the information criterion and finally, looking for the copula parameter. We replicated this procedure 500 times for n = 1000 data, then we collected the provided information on the frequency of successful procedures for the inference on the margins, the estimated Kendall's tau and the choice of copula, respectively.

Inference on Kendall's tau
The proposed way to infer Kendall's tau under censoring gives results close to the real τ K measured on data just before applying the censoring. On Fig. 1 and Table 2, one can see the dispersion of the computed Kendall's tau. We remind the reader that the theoretical value of tau used for simulations is 0.6, and that we performed simulations without censoring to give an idea of the Kendall's tau measure on complete data. Thus, firstly, all simulations were performed without any censoring. Secondly, a 15% rate of censoring were applied on data. Thirdly, for all the simulations, we used a 30% rate of censoring and, for the last case, we applied a 70% rate of censoring. Therefore, the vector that contains all the inferred Kendall's tau for a (un)censoring level has a length of 4500. One observes that the estimated values ofτ K are relatively closes to the theoretical value of 0.6, but one remarks that the dispersion of values for someθ is not close to the expected value  The outputs of the 9 simulations with a censoring of 0% are joint together in the information on the first line and the same for simulations censored at 15% on the second line, 30% at the third line and 70% at the last line as is the dispersion onτ K , since τ K ranges in [ −1, 1] while, for example, for Clayton copula, θ ranges in [ −1, ∞)\ {0}.

Inference on the marginal distribution for the costs
To select the right marginal distribution for costs on each of the 500 simulations and for each DGP, we used the proposed criteria based on the deviance. Thus, on Fig. 2, one can see the performance of this criterion. One may remark that, even with a 70% rate of censoring, the chosen parametric distribution is almost always correctly estimated.

Inference on the copula models
There exists a bunch of parametric copula families, but for the purpose of these simulations, we limited our selection to the most well-known ones: Gaussian, Student, Clayton, Gumbel, Frank and Joe copulas. However, with a real dataset, the reader may test any consistent parametric copula using the proposed way. During simulations, for each iteration, we collected the information about the selected family using the information criterion. At Tables 3 and 4, one can see these results for each DGP. We indicated, in bold, the copula that was selected the most in the 500 iterations and claimed that it was the copula to be retained for the mentioned DGP. When the selection of the copula was only in-between those used to simulate the data (as on Table 3), it was obvious that the chosen copula family is the one used for the generation process. Otherwise, when intermediate copulas (for which dependence adequation is close to the one provoked by the generating copula) are introduced in the selection process, the results may differ as shown on Table 4. For the DGPs where the costs were simulated following a normal distribution (1,2,3,10,11,12,19,20 and 21), the choice of the copula was influenced by the dependence caused by the type of copula retained to generate the data. In facts, when the costs margin follows N (μ C = 1500, σ C = 400) and the QALY margin follows N (μ T adj = 4, σ T adj = 0.75), using τ K = 0.60, if the dependency distribution is mostly normal (i.e. comes from a Gaussian or a Gumbel copula), there will be a moderate tail on both tails of the distribution with an uniform cloud along the correlation path. For these reasons which are attributes of the Frank copula, it was the chosen copula for DGPs 1,2,3,19,20 and 21. However, when, from the generating Clayton copula, a strict dependence was imposed to the left-tail while a mostly total independence was imposed to the right-tail, only a Clayton copula appeared appropriate to modeling  The chosen copula is in bold font these data. That's the reason why it was the chosen copula for DGPs 10,11 and 12. In the situation where QALY follows N (μ E = 4, σ E = 0.75) while costs follows a skewed marginal distribution (gamma or lognormal), in any case, there was not a right-tailed distribution, but mostly a leftoriented data cloud with a fat left-tail. That is the reason why, even when the data were generated from a Clayton copula, the Student (t) copula was the chosen one. Therefore, the simulations showed that the Bayesian criterion of selection of the copula was in accordance with the theoretical properties of the parametric copulas.
According to the structure of the marginal distributions, the choice of the copula could only be done in a limited spectrum of families. Therefore, when one seeks to find the best structural copula family, it appears essential to include the most known copula families covering as many The chosen copula is in bold font dependence states as possible. Thus, in harmony with Table 4, a selected copula which is not the generating one is only performing a better adequacy to the dependence between margins structures than the original one. One remarks that an high censoring level does not really impact the issue of the copula estimation. Indeed, for all DGPs with 70% censoring level, there is only one case where the result changes: DGP 24. In fact, the Gumbel copula is left-skewed, as Student copula may be; which can explains the wrong estimation of the copula in this case.

Example: acupuncture for chronic headache in primary care data
We used the acupuncture for chronic headache in primary care data from Vickers et al. [35][36][37] containing The data collection focuses on the measure of effectiveness in terms of QALY gained and the cumulative cost associated in UK pounds (£). Patients themselves reported unit costs associated with non-prescription drugs and private healthcare visits. The cost of the study intervention was estimated from the standard cost for a NHS professional multiplied by the contact time with the patient. Thus, patients in the treated arm had a mean time of 4.2 hours with study acupuncturist. No imputation for missing data was done if the three questionnaires were not complete and consequently for which QALY cannot be measured. Therefore, in the acupuncture arm, there was 136 participants and in the control arm, 119. The modeling process of both joint distributions function for QALY and costs in the two clinical arms is presented on Table 5. We remark that in both cases, dependence is weak since that Kendall's tau ranges between -0.10 and -0.15. Here, for the distribution of T adj , we compared two possibles choices: a Gamma distribution and a gaussian one. The normal distribution has the smallest deviance. For the choice of distributions for costs, in the two arms, the lognormal distribution was considered as the one with the smallest deviance. For the copula family selection using deviance information criteria, we compared, for each arm, Gaussian, Clayton, Student, Frank, Joe and Gumbel copulas. In the treated arm, Student copula was the considered one while in the control arm, it was the Gaussian copula. Therefore, the joint distribution function of the acupuncture arm was estimated by:F while, for the control arm, the estimation is: Using the approach given from the copula densities, the ICER is estimated such thatÎCER = 10082.68£/ unit of effectiveness, with a confidence interval which is:   The vertical intercept shows the negative value of the variability of costs and its confidence interval while the horizontal intercept shows the estimated ICER. Since the number of covariates was limited, it has not been possible to determine the existence of subgroups. Whether available, the search for subgroups is possible using our approach.

Impact of an artificially created censoring on the acupuncture example
To challenge our methodology via a censored dataset, we decided to create an artificial censoring on QALY variable (since that costs are assumed to always be observed). This censoring variable followed an exponential distribution where λ = 0.30. Hence, the QALY variable has been censored on approximately 30% of data. Using the same methodology than for the acupuncture original dataset, we obtained the information shown on Table 6. One notes that we did not report the costs information on this table since that it does not change from the information on Table 5 (costs is not a censored variable).Using these information, one obtains an estimated ICER such thatÎCER = 10879.89£/ unit of effectiveness, with a confidence interval which is: Fig. 4 Schema of the procedure to perform cost-effectiveness analysis using copulas as shown in this paper Hence, with a certain level of censoring, the estimation of ICER loses accuracy, and the confidence interval gets larger. However, the estimated ICER in case of censoring stay relatively close to the estimated ICER with original data, and stays in his confidence interval.

Synthesis on the acupuncture example
Firstly, let take a look at the conclusions of Wonderling et al. [36], who were the first to work on these original data. Without totally detailing their methodology, they used a linear regression adjusted on covariates of interest (age, sex, diagnosis, severity of headache at baseline, number of years of headache disorder, baseline SF-36 results and geographical site) to evaluate the mean difference for costs and effectiveness (in terms of QALY). Hence, from a linear model, they have got anÎCER equals to 9180 £with a mean health gain for acupuncture treatment of 0.021 QALY.
Using the copula-based methodology presented in this paper, we get, with these original data, a mean health gain for acupuncture treatment of 0.026 QALY and when we apply an exponentially distributed censoring around 30% on QALY variable, we get a mean health gain for acupuncture treatment of 0.021 QALY. The major difference between both approaches is on the estimated ICER value. However, we remark that the value of 9180 £keeps in the confidence interval of the copula-based ICER.

Conclusion
One motivation for this work was generated by the limitations of the at standard regression models applied to the cost-effectiveness analysis where the dependence structure between costs and utility along with time was not taken into account. We provided a simple step-bystep procedure to find INB and ICER and their confidence intervals, even in case of censoring.
On Fig. 4, one sees the schematized method from the observational data produced by both clinical arms to the complete cost-effectiveness analysis. In a parallel way, one accomplishes steps 1 and 2 which are the measure of the dependence between QALY and cumulative costs in each arm via Kendall's tau and the determination of the marginal distributions of both random variables. Then, at step 3, one generates copulas from information gained in steps 1 and 2 and, using the information criterion, one selects the closest copula to the right joint distribution function at step 4. Finally, at step 5, one determines the INB and the ICER using joint cdf. In case of subgroups cohorts analysis, one reiterate the procedure from step 1 to 5 to get two supplementary copulas and then, the joint cdf of costs and QALY for the crossed-arms covariance terms.
The methodology presented here can easily be implemented on computational software. Indeed, on R, using the packages copula [38] and CDVine [39], one can easily apply the whole process to a dataset, either censored or not. On SAS, the use of a PROC COPULA will be enough to fit a certain copula on a whole dataset.