### Statistical methods

#### The proportional odds model

The statistical approach is based on the proportional odds model [12]. Consider the *K* = 3 centers. Each patient had a response into one of the *m* = 4 categories, (*C*
_{1}, .., *C*
_{
m
}), with *n*
_{
i
}patients in center *i*. Let be the total number of patients overall.

Denote *Z* the response and *π*
_{
ijc
}, the probability that the *jth* subject in study *i* had a response in category *c*; *π*
_{
ijc
}= Pr(*Z*
_{
ij
}= *c*). Let also *Q*
_{
ijc
}= Pr(*Z*
_{
ij
}≤ *c*) be the cumulative probability of a response in category *c* or better, so that *Q*
_{
ijc
}= *π*
_{
ij
}
_{1} + ... + *π*
_{
ijc
}and *Q*
_{
ijm
}= 1; *c* = 1, ..., *m*-1. Hence *c* = *m* corresponding to the worst outcome (here an ETF) is the baseline level. The proportional odds model is defined by

where *α*
_{
c
}represents the *cth* intercept and *η*
_{
ij
}= *γ*
_{1} *x*
_{1}
_{
ij
}+ ...+ *γ*
_{
p
}
*x*
_{
pij
}, is a linear combination of explanatory variables. The model assumes 1) an increasing order for the *α*
_{
c
}: *α*
_{1} < *α*
_{2} < ... < *α*
_{
m-1}; 2) *α*
_{0} = -∞ and *α*
_{
m
}= ∞.

The assumption of proportional odds signifies that the ratio of the odds of the event *Z*
_{
ij
}≤ *c* for any pair of sets of explanatory variables is independent of the category *c*. The cumulative logit model (1) results from the existence of a latent random variable with a logistic distribution, the (*α*
_{
c
}, c = 0, ..., m) corresponding to the cut-off thresholds for categorizing the latent variable into *m* classes. Alternatives to the proportional odds model can be developed based on different assumptions regarding the latent variable, like a normal- or an extreme value distribution associated with the probit or complementary log- log links, respectively.

##### Fixed effect model

The basic model extends the proportional odds model (PO) proposed in [7] to a situation with L treatments (L > 2). For the sake of simplicity, the presentation will be restricted to the situation of L = 3 treatments corresponding to the real data set but the extension to more than 3 treatments is straightforward. Assuming that one particular treatment (here, AQ) was selected as the reference treatment, the treatment covariate is modeled using L-1 = 2 dummy variables *T*
_{1} and *T*
_{2}, where *T*
_{
l
}(l = 1, L-1) yields 1 if subject received the treatment l, and 0 otherwise (here, l = 1 corresponds to SP, l = 2 corresponds to AQ + SP)

where {*j* = 1, ..., *n*
_{
i
},} corresponds to the individuals within each arm, {*i* = 1, .., *K*}, corresponds to the centers and {*c* = 1, ..., *m*-1} to the categories. The parameter *γ*
_{1} represents the log-odds ratio of efficacy associated to treatment 1 *versus* the reference treatment, and *γ*
_{2} the log-odds ratio of efficacy associated to treatment 2 *versus* the reference treatment.

The center effects are modeled by *γ*
_{0i
}where *γ*
_{01} = *0* for identifiability constraint.

##### Testing the proportional odds assumptions

It is worth noting that model (2) assumes that, within each center and for each outcome category *c*, (*c* = 1, .., *m*-1), there is a common log-odds ratio, *γ*
_{1} for treatment *l* *versus* the reference treatment. Similarly, model (2) assumes also that, within each treatment group, and for each outcome category *c* there is a common log-odds ratio, *γ*
_{0i
}- *γ*
_{0i
}, between two centers *i* and *i'*, independent of the outcome category.

Additional models were introduced to test the PO assumption. Model (3) allows for testing a global PO between treatments, without a center effect:

However, each treatment can have a different effect according to the outcome category.

Assuming a global PO for *T*
_{1}, the model (4) allows for testing PO between *T*
_{2} and the reference treatment

where *δ*
_{
hc
}= 1 if *h* = *c* and 0, otherwise.

Similarly, the model (5) assumes a global PO for *T*
_{2}, and allows for testing PO between T1 and the reference treatment (here AQ)

##### Mixed effects model

Discarding a center effect, and still assuming proportional odds between treatments, 2 random effects were added to the treatment variables.

where: *β*
_{1i
}= *γ*
_{1} + *δ*
_{1i
}and *β*
_{2i
}= *γ*
_{2} + *δ*
_{2i
}. Both *δ*
_{1i
}and *δ*
_{2i
}are normally distributed random effects with expected value 0 and variance , respectively.

By grouping separately the fixed and the random-effects, the model can be rewritten as follows:

where , where *G'*, the transposed matrix of *G*, needs to be estimated from the data.

#### Estimation methods

The parameters of the fixed effects models were estimated using three different methods, *i.e.* maximum likelihood (ML), iterative generalized least squares (IGLS) and generalized estimating equations (GEE) [7]. In the IGLS and GEE methods, the ordinal response was considered as a set of correlated binary responses. Thus the proportional odds model is viewed as a member of the multivariate generalized linear models [13].

All these methods were implemented under the free Statistical Software R http://cran.r-project.org/. The various codes for the ML, GEE and IGLS methods are in the additional file 1. Models (3), (4) and (5) were implemented for the test of the proportional odds assumption between centers and treatments. The parameters of the random effects model were estimated using a Bayesian approach for hierarchical models. In the Bayesian paradigm, prior distributions for all unknown parameters *θ* need to be specified. The intercept *α*
_{
c
}were given non-informative *N*(0,10^{4}) priors. Parameters *β*
_{1i
}and *β*
_{2i
}were assumed to be from *N*(*γ*
_{1}, *σ*
_{1}
^{2}) and *N*(*γ*
_{2}, *σ*
_{2}
^{2}) respectively. We gave *γ*
_{1}, *γ*
_{2} non-informative *N*(0,10^{-6}). The difficulty in the Bayesian paradigm is the choice of the *a priori* distribution for the inverse of the variances *σ*
_{1}
^{2} and *σ*
_{2}
^{2}. Several forms of the *a priori* are available. We chose an *a priori* of the form *Gamma*(0.5*η*, 0.5*ηζ*
^{2}), where *η* and *ζ*
^{2} are either arbitrary chosen or calibrated from the data. Here we chose *η* = 2. The initial estimated values of *ζ*
^{2} (0,007, 0,001) were obtained from our data, by adjusting a generalized linear mixed model. So 1/*σ*
_{1}
^{2} and 1/*σ*
_{2}
^{2} were given *Gamma*(1,0.007) and *Gamma*(1,0.001), respectively. All the Bayesian analyses were performed using 3 chains of 50000 samples, out of which the first 25000 were removed to allow for convergence. Thus, each posterior statistic was based on 75000 samples. Finally the 95% credibility interval was obtained for each statistic.

#### Simulations

To assess the performance of the fixed effect methods, we carried out analyses on simulated data based on model (6). The aim was to compare the methods when the treatment covariate was taken into consideration and random effects were introduced to account for center variability. We generated a data set of 3 centers, each containing 100 patients, comparing 3 treatments A, B, C, with equal size. We supposed that the treatment effects were the same in the 3 centers, and were set to be *γ*
_{1} = -1 for B comparing to A and, *γ*
_{2} = 1 for C comparing to A, respectively. Initial values for *α*
_{
c
}were *α*
_{01} = -2, *α*
_{02} = -1, *α*
_{03} = 0. The initial value of the matrix *G*' was a diagonal matrix with diagonal elements *σ*
_{1} = 0.01 and *σ*
_{1} = 0.02, corresponding to the between center deviation.

The choice of the treatment initial values was guided by preliminary results from real data. The choice for the cut-point values was derived from [14].

Cumulative probabilities *Q*
_{
ijc
}were then obtained using model (6). For a subject *j* in study *i*, with vector of cumulative probabilities (0, *Q*
_{
ij1}, *Q*
_{
ij2}, *Q*
_{
ij3}, 1), the simulated outcome was generated using the quartile range category, to which an uniformly (0,1) distributed random value belonged. To check that the conclusions were not sensitive to the choice of the starting values, we repeated the simulations, for each method, with different initial values. An example of the simulated data set can be found in additional file 2.

For each method, bias and confidence levels were estimated through *N* 1000 = replications. For each estimation method, the treatment effect was assessed using the calculated coverage probability and the type I error.

In the Bayesian analysis, model (6) was implemented using the Bayesian Software WinBUGS. A program is provided in the additional file 3, showing how model (6) can be written in BUGS language.