 Research article
 Open Access
 Published:
Correction of the significance level when attempting multiple transformations of an explanatory variable in generalized linear models
BMC Medical Research Methodology volume 13, Article number: 75 (2013)
Abstract
Background
In statistical modeling, finding the most favorable coding for an exploratory quantitative variable involves many tests. This process involves multiple testing problems and requires the correction of the significance level.
Methods
For each coding, a test on the nullity of the coefficient associated with the new coded variable is computed. The selected coding corresponds to that associated with the largest statistical test (or equivalently the smallest p _{ v a l u e }). In the context of the Generalized Linear Model, Liquet and Commenges (Stat Probability Lett,71:33–38,2005) proposed an asymptotic correction of the significance level. This procedure, based on the score test, has been developed for dichotomous and BoxCox transformations. In this paper, we suggest the use of resampling methods to estimate the significance level for categorical transformations with more than two levels and, by definition those that involve more than one parameter in the model. The categorical transformation is a more flexible way to explore the unknown shape of the effect between an explanatory and a dependent variable.
Results
The simulations we ran in this study showed good performances of the proposed methods. These methods were illustrated using the data from a study of the relationship between cholesterol and dementia.
Conclusion
The algorithms were implemented using R, and the associated CPMCGLM R package is available on the CRAN.
Background
In applied studies, the relationship between an explanatory and a dependent variable is routinely measured using a statistical model. For instance, in epidemiology it is quite common that a study focuses on one particular risk factor. The scientific problem is to analyze whether this risk factor has an influence on the risk of occurrence of a disease, a biological trait, or another outcome. To answer to this question, a regression model is often used in which the risk factor will be represented by a continuous X, allowing adjustment on p−1 known risk factors of the studied trait. However, the form of the effect (or the doseeffect relationship) is not known in advance, and as such, the continuous variable X is often transformed, typically into categorical variables, by grouping values into two or more categories. An example of this is seen in an The American Journal of Epidemiology (October 2009, volume 170, number 8), where four of six papers with continuous exposure used categorization, and only two kept the variable as continuous [1].
Binary coding is often used in epidemiology, either to make interpretation easier, or because a threshold effect is suspected. In a regression model with multiple explanatory variables, the interpretation of the regression coefficient for a binary variable may be easier to understand than a change in one unit of the continuous variable. Dichotomous transformations of a variable X are defined as:
Other transformations are also used, in particular BoxCox transformations which have been defined as:
but the choice of the transformation is often subjective. The arbitrariness of the choice of cutpoints may lead to the idea of trying more than one set of values. Hence to analyze data, the statistician may have to use several transformations, and for each the statistician applies a test for “ β=0” (where β is the coefficient representing the effect of the risk factor of interest). The most favorable transformation is then chosen. The cutpoint giving the minimum p _{ v a l u e } is often termed “optimal” [2, 3]. When testing several codings of a variable, there is a problem with the multiplicity of tests performed, leading to an incorrect p _{ v a l u e } and possible overestimation of effects [4]. Generally, researchers fail to consider this problem and do not correct the significance level in relation to the number of tests performed [3], which can lead to an increase in the TypeI error [5]. The p _{ v a l u e } should thus be corrected to take into account the multiplicity of tests.
In many cases, it is now widely recognized that categorization of a continuous variable could introduce major problems to an analysis and interpretation of the associated model [1, 3]. It is important to note that the aim of this paper is not to defend this practice, but to improve a practice commonly used by epidemiologists in terms of multiple testing. Furthermore, despite known loss of power following dichotomization in the univariate case, Westfall [6] shown that dichotomizing continuous data can greatly improve the power when multiple comparisons are performed.
Many methods of correction exist, the most simple and well known being the Bonferroni rule. Several authors have improved this method to make it more powerful, however most do not take into account the correlation between the tests [7–11]. If the tests are independent, or moderately dependent, then they provide an upper bound which may be satisfactory. Efron [12] proposed a correction that account for the correlation between two consecutive tests if there is a natural order between the tests, with high correlation between adjacent tests. Liquet and Commenges [13, 14] and Hashemi and Commenges [15] proposed a more exact correction, accounting for the whole correlation matrix, for score tests obtained in logistic regression, generalized linear model and proportional hazards models.
Here, we proprose extending these studies to a categorical transformation (with m>2 categories) of the continuous variable by involving more than one parameter in the model; m−1 dummy variables are introduced in the model. The categorical transformation is a more flexible way to explore the unknown shape of the effect. In this context, we propose a method and an R program based on resampling approaches to determine the significance level for a series of several transformations (including dichotomous, BoxCox and categorical transformations) of an explanatory variable in a Generalized Linear Model. The problem of correcting the estimation of the effect will not be examined here.
First, we revisit the example proposed by Liquet and Commenges [14] on the relationship between cholesterol and dementia [16] to provide a framework for our discussion. In section ‘Methods: Statistical context’, we present the statistical contexts relating to multiple testing; the model, the maximum test and the minimum p _{ v a l u e } procedure and finally the score tests are exposed. Section ‘Methods: Significance level correction’ presents the different methods of correction of the TypeI error. A simulation study for the different strategies of coding, and application of the model to the initial example are presented in the section ‘Results’. Concluding remarks are given in the two last sections.
Example: revisiting the PAQUID cohort example
We revisited the example presented in the article of Liquet and Commenges [13] for a coding of a binary variable in a logistic regression. This example is based on the work of Bonarek et al. [16], who studied the relationship between serum cholesterol levels and dementia. The data came from a nested casecontrol study of 334 elderly French subjects aged 73 and over who participated in the PAQUID cohort (37 subjects with dementia and 297 controls). The variables age, sex, level of education and wine consumption were considered as adjustment variables. The analysis focused on the influence of HDLcholesterol(highdensity lipoprotein) on the risk of dementia. Bonarek et al. [16] first considered HDLcholesterol as a continuous variable; then, to ease clinical interpretation, they chose to transform the HDLcholesterol into a categorical variable with four classes. Finally, as there was no significant difference between the first three quartiles, HDLcholesterol was split into two categories with a cutpoint at the last quartile. The best p _{ v a l u e }, 0.007, was obtained in the latter analysis and was selected for interpretation. However, this p _{ v a l u e } did not take into account the numerous transformations performed to determine the best representation of the variable of interest. Legitimate questions arising from this include the following: What is the real association between dementia and HDLcholesterol, with a correction of the TypeI error? Is it really significant? Liquet and Commenges [14] proposed correcting the p _{ v a l u e } associated with multiple transformation including dichotomous and BoxCox transformation, however, their method cannot be used with categorical transformation.
Methods
Statistical context
Model
Let us consider a Generalized Linear Model with p explanatory variables [17], where Y _{ i } (1≤i≤n) are independently distributed with probability density function in the exponential family defined as follows:
with \mathbb{E}\left[{Y}_{i}\right]={\mu}_{i}={b}^{\prime}\left({\theta}_{i}\right),\mathbb{V}\mathit{\text{ar}}\left[{Y}_{i}\right]={b}^{\u2033}\left({\theta}_{i}\right)a\left(\varphi \right) and where a(·), b(·), and c(·) are known and differentiable functions. b(·) is three times differentiable, and its first derivative b ^{′}(·) is invertible. Parameters (θ _{ i },ϕ) belong to \Omega \subset {\mathbb{R}}^{2}, where θ _{ i } is the canonical parameter and ϕ is the dispersion parameter.
In this context, we wished to test the association between the outcome Y _{ i } and explanatory variable of interest X _{ i }, adjusted on a vector of explanatory variables Z _{ i }. The form of the effect of X _{ i } is unknown, so we may consider K transformations of this variable X _{ i } ( k )=g _{ k }(X _{ i }) with k=1,…,K.For example, if we transform the continuous variable in m _{ k } classes, m _{ k }−1 dummy variables are defined from the function g _{ k }(·): {X}_{i}\left(k\right)={g}_{k}\left({X}_{i}\right)=\left({X}_{i}^{1}\right(k),\dots ,{X}_{i}^{{m}_{k}1}(k\left)\right). Different numbers of level m _{ k } of the categorical transformation are possible.The model for a transformation k can then be obtained by modeling the canonical parameter θ _{ i } as:
where {Z}_{i}=(1,{Z}_{i}^{1},\dots ,{Z}_{i}^{p1}) and γ=(γ _{0},…,γ _{ p−1})^{T} is a p−1 vector of coefficients, and β _{ k } is the m _{ k }−1 vector of coefficients associated with a categorical transformation k of the variable X _{ i }. For dichotomous or BoxCox transformations β _{ k } reduce to a scalar ({\beta}_{k}\in \mathbb{R}).The hypothesis of the test for the transformation k is defined as follows:
where {\mathbf{0}}_{{m}_{k}1} is a null vector of dimension m _{ k }−1. Under the null hypothesis {\mathcal{\mathscr{H}}}_{0}\left(k\right) we have θ _{ i }(k)=γ Z _{ i }, which do not depend on k. Thus all the null hypotheses are the same, and denote it by {\mathcal{\mathscr{H}}}_{0}.
Maximum test and minimum Pvalue procedures
For each coding, k, of the variable X _{ i }, a test statistic T _{ k } is performed on the nullity of the vector β _{ k }. We then have a vector of test statistics T=(T _{1},…,T _{ K }) for the same null hypothesis (no effect of the risk factor of interest). In the context of dichotomous and BoxCox transformations, each test statisitic, T _{ k }, has asymptotically, a standard normal distribution. Thus rejecting the null hypothesis if one of the absolute values of the test T _{ k } is larger than a critical value c _{ α }, is equivalent to rejecting the null hypothesis if T _{ m a x }>c _{ α } where T _{ m a x }=m a x(T _{1},…,T _{ K }). To cope with the multiplicity problem, Liquet and Commenges [13, 14] proposed that the probablity of TypeI error for the statistic T _{ m a x } under the null hypothesis be computed as:
where T _{ m a x } is the realization of T _{ m a x }.An equivalent approach is to use a procedure based on the individual p _{ v a l u e } of each test T _{ k } noted P _{ k }=P(T _{ k }>t _{ k }) (where T _{ k } is the realization of T _{ k }). The minimum of the K realized p _{ v a l u e } corresponds to the test k which obtains the highest realization (in absolute values; k/ t _{ m a x }=t _{ k }). Then, we have:
where P _{ m i n }=m i n(P _{1},…,P _{ K }) and p _{ m i n } is the realization of p _{ m i n }. The interest of using a procedure based on the p _{ v a l u e } is the possibility of combining statistical tests which do not follow the same distribution. In the current context, we will combine dichotomous, BoxCox and categorical transformations with more than two levels.
Score test
We briefly present the score test used for all of the K transformations where the same null hypothesis is tested (i.e. {\mathcal{\mathscr{H}}}_{0}: “{\beta}_{k}={0}_{{m}_{k}1}” given by {\theta}_{i}\left(k\right)=\mathit{\gamma}{Z}_{i} (with different alternatives)). We present the main results obtained by Liquet and Commenges [14] for the Generalized Linear Model in the context of dichotomous and BoxCox transformations, and then consider the score test for categorical transformations.
Dichotomous and Boxcox Transformations
In the context of dichotomous and BoxCox transformations, the score test used for testing the effect of the transformed variable (β _{ k }=0 with {\beta}_{k}\in \mathbb{R}) follows asymptotically a standard normal distribution:
where \widehat{R} is the vector of residuals \widehat{{R}_{i}}={Y}_{i}\widehat{{\mu}_{i}} computed under the null hypothesis, V is a diagonal matrix such that {v}_{\mathit{\text{ii}}}=\widehat{\mathit{\text{Var}}}\left({Y}_{i}\right),H=\mathit{\text{VZ}}\left({Z}^{T}\mathit{\text{VZ}}\right){Z}^{T}, and Z the n×p matrix with rows Z _{ i }, i=1,…,n.
The correlation between the different tests has been defined by Liquet and Commenges [14]. Asymptotically, the joint distribution of T _{1},…,T _{ K } is a multivariate normal distribution with zero mean and a certain covariance matrix. Thus Liquet and Commenges [14] propose that the p _{ v a l u e } (associated with the test T _{ m a x }) defined in (3) using numerical integration [18] be calculated. They called their method the “exact method”.
Categorical transformations
In the context of a categorical transformation in m _{ k } classes, the score test testing {\mathcal{\mathscr{H}}}_{0}: “{\beta}_{k}={0}_{{m}_{k}1}” (with {\beta}_{k}\in {\mathbb{R}}^{{m}_{k}1}) follows asymptotically a χ ^{2} distribution with m _{ k }−1 degrees of freedom and is defined as:
where U _{ k } and I _{ k } are respectively the score function and the Fisher information matrix under the null hypothesis [19]. To compute the p _{ v a l u e } defined in (4), it is necessary to know the joint distribution of T=(T _{1},…,T _{ K }). Some studies have defined the distribution of the multivariate χ ^{2}[20, 21]. However, even though the correlation between the different tests could be easily estimated, it has not been possible, as far as we know, to obtain the joint distribution of T=(T _{1},…,T _{ K }). To overcome this problem, we propose approximating the p _{ v a l u e } (defined in (4) by the minimum p _{ v a l u e } procedure) using a resampling method (defined in the next section) which also accounts for the correlation between the test statistics.
Significance level correction
Bonferroni method
One of the most common corrections in multiple testing is the Bonferroni method. It has been described by several authors in various applications [7, 11, 22]. It allows an upper bound of the significance level of the minimum p _{ v a l u e } procedure to be computed as:
where K is the number of tests. This method is very simple and does not require any assumption about the correlation between the different tests. It can therefore be applied directly to the different possible codings of an explanatory variable. However, this only provides an upper bound of the p _{ v a l u e }, which may be very conservative if the correlation between tests are high and the number of transformation are large.
Resampling based methods
We propose the use of resampling based methods [23, 24] with the aim of building a reference distribution for the test statistics. These procedures have the advantage of taking into account the dependence of the test statistics for evaluating the correct significance level of the minimum p _{ v a l u e } procedure (or the maximum test procedure). The principle of resampling procedures is to define new samples from the probability measure defined under {\mathcal{\mathscr{H}}}_{0}: “{\beta}_{k}={0}_{{m}_{k}1}”.
Permutation test procedure
Permutation methods can be used to construct tests which control the TypeI error rate [25]. In our context, the algorithm of the permutation procedure is defined as follows:

1.
Apply the minimum p _{ v a l u e } procedure to the original data for the K transformations considered. We note p _{ m i n } the realization of the minimum of the p _{ v a l u e };

2.
As under {\mathcal{\mathscr{H}}}_{0}, the X _{ i } variable has no effect on the response variable, a new dataset is generated by permuting the X _{ i } variable in the initial dataset;

3.
Generate B new datasets {s}_{b}^{\ast},b=1,\dots ,B by repeating B times the step 2;

4.
For each new dataset, apply the minimum p _{ v a l u e } procedure for the transfomation under consideration. We note {p}_{\mathit{\text{min}}}^{\ast b} the smallest p _{ v a l u e } for each new dataset.

5.
The p _{ v a l u e } defined in (4) is then approximated by:
\hat{{p}_{\mathit{\text{value}}}}=\frac{1}{B}\sum _{b=1}^{B}{I}_{\left\{{p}_{\mathit{\text{min}}}^{\ast b}<{p}_{\mathit{\text{min}}}\right\}},
where I _{{·}} is an indicator function.
However, it is important to note that exchangeability need to be satisfied [25–30]. This condition is much more restrictive than it appears at first sight. In fact, Commenges [29] and Commenges and Liquet [25] showed that the permutation test approach for the score test is robust if the model has only one intercept under the null hypothesis, or if X _{ i } are independent of Z _{ i } for all i in the context of a linear model and the proportional hazards model. This issue applies in our context. Thus we investigated, the robustness of the permutation method when the exchangeability assumptions is violated.
Parametric bootstrap procedure
In 2000, Good [31] explained: “Permutations test hypotheses concerning distributions; bootstraps test hypotheses concerning parameters. As a result, the bootstrap implies less stringent assumptions”. Therefore, an alternative way may be to use resampling method based on bootstrap [32], which give us an asymptotic reference distribution. This procedure could be defined by the following algorithm:

1.
Apply the minimum p _{ v a l u e } procedure to the original data for the K transformations being considered. We note p _{ m i n } the realization of the minimum of the p _{ v a l u e };

2.
Fit the model under the null hypothesis, using the observed data, and obtain \widehat{\gamma}, the maximum likelihood estimate (MLE) of γ;

3.
Generate a new outcome {Y}_{i}^{\ast} for each subject from the probability measure defined under {\mathcal{\mathscr{H}}}_{0}. For example, for a logistic model (where a(ϕ)=1, b\left({\theta}_{i}\right)=\mathit{\text{log}}(1+{e}^{{\theta}_{i}}), and {\mu}_{i}=\mathbb{E}\left({Y}_{i}\right)={e}^{{\theta}_{i}}/(1+{e}^{{\theta}_{i}})), we generate {Y}_{i}^{\ast} according to:
P({Y}_{i}^{\ast}=1{Z}_{i})=\frac{{e}^{\widehat{\gamma}{Z}_{i}}}{1+{e}^{\widehat{\gamma}{Z}_{i}}}.
Repeat this for all the subjects to obtain a sample noted {s}^{\ast}=\{{Y}_{i}^{\ast},{Z}_{i},{X}_{i}\}

4.
Generate B new datasets {s}_{b}^{\ast},b=1,\dots ,B by repeating B times the step 3;

5.
Apply for each new dataset, the minimum p _{ v a l u e } procedure for the transformation considered. We note {p}_{\mathit{\text{min}}}^{\ast b} the smallest p _{ v a l u e } for each new dataset.

6.
Then, the p _{ v a l u e } defined in (4) is then approximated by:
\hat{{p}_{\mathit{\text{value}}}}=\frac{1}{B}\sum _{b=1}^{B}{I}_{\left\{{p}_{\mathit{\text{min}}}^{\ast b}<{p}_{\mathit{\text{min}}}\right\}}.
Results
Simulation study
The aim of this simulation study was to assess the performance of the two resampling methods to correct the significance level. Three different scenarios of transformations were investigated: dichotomous transformations, categorical transformations with three classes, and categorical transformations with different numbers of classes. To shorten the simulation study section we have not presented the results for the BoxCox transformations. For each simulation case, the control of the TypeI error and the power of the developed methods were evaluated. For all simulations, the data come from a logistic model (where a(ϕ)=1, b\left({\theta}_{i}\right)=\mathit{\text{log}}(1+{e}^{{\theta}_{i}}), and {\mu}_{i}=\mathbb{E}\left({Y}_{i}\right)={e}^{{\theta}_{i}}/(1+{e}^{{\theta}_{i}})) consisting of two explanatory variables: Z, an adjustment variable, and X, the variable of interest. We considered the following models:
where Z _{ i } and X _{ i } are independent and were generated according to a standard normal distribution and the vector X _{ i }(k) was a transformation of a continuous variable X _{ i }. The sample size was set to be 100. We used 1000 replications for each simulation and 1000 samples for the resampling methods.
Dichotomous transformations
We only considered dichotomous transformations to explore a shape effect of the variable of interest. To obtain the best transformation, several cutpoints c _{ k } may be tested. When epidemiological references are not available, a strategy based on the quantile of the continuous variable is most commonly applied. In this simulation we used the median for one dichotomous transformation. For two dichotomous transformations we used the first tercile as the first cutpoint, and the second tercile as the second cutpoint, and so on. This strategy is summarized in Table 1.
Firstly, we investigated the TypeI error rate. For a replication, the rejection criterion of the null hypothesis (β _{ k }=0) was a p _{ v a l u e } less than 0.05. Thus, for a simulation of 1 000 replications, the empirical TypeI error rate was the proportion of tests where the p _{ v a l u e } was less than 0.05. Figure 1(a) shows the evolution of the TypeI error rate for dichotomous transformations. The naive method, without correction of the multiple testing, increases the TypeI error rate with the number of codings tried. For ten codings this error rate reached 0.27. The error rate calculated by the Bonferroni method decreased with the number of cutpoints. This correction was therefore too conservative whereas the exact method and resampling methods gave a TypeI error rate close to the nominal 0.05 value.
When information on the shape of the effect of the explanatory variable was unknown we investigated the power of the methods applied above. We studied the power for a threshold effect model with a cutpoint value at the first tercile. Figure 1(b) gives the power as a function of the number of cutpoints tried. The power of the exact and resampling methods are quite similar to one another, and higher than the Bonferroni method. The difference between these methods and Bonferroni method increases with the number of cutpoints. We also observed that the power was highest at two cutpoints (two transformations). This result, was in fact, expected since we used the first and second terciles respectively as cutpoints for each dichotomous transformation. Power increased again when trying five and eight codings due to the fact that one of these codings corresponded to the first tercile. To conclude, the simulation study with dichotomous transformations showed that the resampling methods provide similar results for the TypeI error rate control and the power as those seen with the exact method.
Categorical transformations with same number of classes
We considered here only categorical transformations with three classes. In this situation, the choices of the two cutpoints (noted {c}_{k}^{1} and {c}_{k}^{2}) defining the categorical variables into three classes are also subjective. For this simulation study, our strategy was to attempt to find the most favorable transformation into three classes. This consisted of using the tercile of the variable for one transformation with two cutpoints (c11=q _{1/3} and c12=q _{2/3}); for two transformations we add to the previous choice a transformation with the first quartile and the third quartile for the two cutpoints (c21=q _{1/4} and c22=q _{3/4}). The global strategy until we obtain 10 transformations in three classes is presented in Table 2.
We investigated the TypeI error rate. Figure 2(a) shows the evolution of the TypeI error rate for categorical transformations in three classes. The results are similar to those we observed for dichotomous transformations. The Bonferroni correction was still too conservative, while resampling methods gave a TypeI error rate close to the nominal 0.05 value.
Next we considered the power of the different methods when the simulated model was specified with a categorical transformation of the continuous variable in three classes defined by cutpoints at the first and third quartile. The two resampling methods gave similar results with a higher power than the Bonferroni method (see Figure 2(b)). The power was highest for two transformations. This result was also expected because, with the strategy presented in Table 2, the transformation into three classes with cutpoints at the first and third quartile is used.
Various categorical transformations
In this last simulation, we presented a more realistic situation where different kinds of transformations were used to investigate the effect of the variable of interest. We proposed trying different categorical transformations and varying the number of classes. The most natural method is to use a dichotomous transformation at the median for one transformation. For two transformations, we added the previous coding and a categorical transformation in three classes based on the tercile. For three transformations, we added the two previous codings and a categorical transformation in four classes based on the quartile, and so on. The strategy proposed in this simulation is presented in Table 3.
The results for the TypeI error rate were similar to the previous simulation case (not shown here). We then studied the power of the different methods when the simulated model is specified with a categorical transformation of the continuous variable in five classes defined by cutpoints at the quintile. We can see in Figure 3, that, in this situation, the parametric bootstrap method seems slightly more powerful than the permutation method. The resampling methods were also more powerful than the Bonferroni method. Finally, as expected, we can see that the power was highest for four transformations, where one of the transformations used corresponded to a categorical transformation with quintiles as cutpoints.
Robustness of resampling methods
We investigated the robustness of the resampling methods when the exchangeability assumption is violated. The data came from the model defined in (5) with two dependent variables X _{ i }(k) and Z _{ i }. The dependency between X _{ i }(k) and Z _{ i } (formalized by the correlation ratio(η ^{2})) was specified by the following model:
where X _{ i }(k) is the binary coding of the X _{ i } variable with a cutpoint at the median. The coefficient β ^{∗} was computed according to η ^{2} and the variance of X _{ i }(k) variable.We tested three different binary codings with cutpoints at the first, the second and the third tercile. The strategy is used for various values of the correlation ratio (η ^{2}) from 0 to 0.6.
The robustness of the permutation method when the exchangeability assumption is violated was evaluated with respect to the results of the exact method. For different correlation ratios (η ^{2}) we evaluated the control of the TypeI error, the power, the Mean Square Error (MSE) of the estimated p _{ v a l u e } (p _{ v a l u e } from the exact method was used as a reference), and the rate of good decision (same decision as for the exact method). These results are presented in Figure 4 and show the good behavior of the permutation method since the TypeI error is controlled at the level 0.05, the power is the same for all the methods, the rate of good decision is always greater than 0.97, and the MSE is very low. Moreover, the distributions of the estimated p _{ v a l u e } are quite similar for different methods (not shown).
Example: revisiting the PAQUID cohort example
In order to find the real association between the two variables of interest in the example described at the end of Background section, we applied our newly developed approach which combined different kinds of transformations. Liquet and Commenges [14] have proposed seven dichotomous and five BoxCox transformations. However, their method did not allow for categorical transformations. We proposed to add, to the seven dichotomous and five BoxCox transformations for this application, four codings in three classes and four codings in four classes. The best transformation appeared to be the dichotomous transformation of HDLcholesterol with a cutpoint at the third quartile, as already found by Bonarek et al. [16]. The Bonferroni correction gave a p _{ v a l u e } equal to 0.140, thus not significant for an α level at 0.05. The p _{ v a l u e }, which is given by both resampling based methods is 0.038. To conclude, it is important to chose a powerful method of correction, because in this context the p _{ v a l u e } with no correction given by Bonarek et al. [16] was very optimistic (0.007), and the Bonferroni correction was very conservative, yielding an incorrect conclusion. The proposed approach based on the resampling procedure gave a result which was still significant and more realistic than the uncorrected p _{ v a l u e }.
Discussion
In this paper, we have considered the problem of correction of significance level for a series of several codings of an explanatory variable in a Generalized Linear Model with several adjusting variables. The methods developed, based on resampling methods, enable us to consider categorical transformations as more flexible in order to explore the unknown shape of the effect between an explanatory and a dependent variable. The simulation studies presented above show, firstly, that the resampling method provides similar results for the TypeI error rate control and the power as those found with the exact method proposed by Liquet and Commenges [14] for dichotomous and BoxCox transformations. Secondly, in the situation of categorical transformations, these simulations demonstrate the good performance of our proposed approaches. Finally we observed the robustness estimation of the p _{ v a l u e } by the resampling methods. These methods can be easily generalized to other models, such as the proportional hazards model, and to potentially extend the work of Hashemi and Commenges [15] in the same context.
Conclusion
To conclude, the methods developed, based on resampling, demonstrate good performances, and we have implemented different methods and different strategies of coding in an R package called CPMCGLM M (for Correction of the Pvalue after Multiple Coding in a Generalized Linear Model).
Appendix
The package CPMCGLM has been developed in R, an open source statistical software available at http://www.rproject.org. The methods presented in this paper are available in the main function CPMCGLM() for Probit, Logit, Linear, and Poisson models. Briefly, the user can specify the transformations tested: BoxCox, dichotomous or categorical transformations. Two options are possible for defining the cutpoints of the dichotomous and the categorical transformations: the user can either specify them, or the program will automatically use the strategy based on the quantile presented in the simulation study.
The main function provides the best codings according to the maximum test and minimum p _{ v a l u e } procedures. For this coding, the different methods of correction of the TypeI error rate presented in this paper are provided. We present an illustration of the CPMCGLM function on a simulated dataset:
References
Bennette C, Vickers A: Against Quantiles: categorization of continuous variables in epidemiologic research, and its discontents. BMC Med Res Methodol. 2012, 12: 2125. 10.1186/147122881221.
Altman D, Lausen B, Sauerbrei W, Schumacher M: Dangers of using optimal cutpoints in the evaluation of prognostic factors. J Natl Cancer Inst. 1994, 86 (11): 829835. 10.1093/jnci/86.11.829.
Royston P, Altman D, Sauerbrei W: Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006, 25: 127141. 10.1002/sim.2331.
Harrell FE, Lee KL, Mark DB: Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996, 15 (4): 361387. 10.1002/(SICI)10970258(19960229)15:4<361::AIDSIM168>3.0.CO;24.
Miller RG: Simultaneous statistical inference. 2nd ed. 1981, New York  Heidelberg: Berlin: Springer Verlag. XVI 299, figs. DM 44.00
Westfall PH: Improving power by dichotomizing (even under normality). Stat Biopharm Res. 2011, 3 (2): 353362. 10.1198/sbr.2010.09055.
Simes R: An improved bonferroni procedure for multiple tests of significance. Biometrika. 1986, 73 (3): 751754. 10.1093/biomet/73.3.751.
Sidak Z: Rectangular confidence regions for the means of multivariate normal distributions. J Am Stat Assoc. 1967, 62: 626633.
Holm S: A simple sequentially rejective multiple test procedure. Scand J Stat. 1979, 6: 6570.
Hommel G: A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika. 1988, 75: 383386. 10.1093/biomet/75.2.383.
Hochberg Y: A sharper Bonferroni procedure for multiple test procedure. Biometrika. 1988, 75: 800802. 10.1093/biomet/75.4.800.
Efron B: The length heuristic for simultaneous hypothesis tests. Biometrika. 1997, 84: 143157. 10.1093/biomet/84.1.143.
Liquet B, Commenges D: Correction of the Pvalue after multiple coding of an explanatory variable in logistic regression. Stat Med. 2001, 20: 28152826. 10.1002/sim.916.
Liquet B, Commenges D: Computation of the pvalue of the minimum of score tests in the generalized linear model, application to multiple coding. Stat Probability Lett. 2005, 71: 3338. 10.1016/j.spl.2004.10.019.
Hashemi R, Commenges D: Correction of the pvalue after multiple tests in a Cox proportional hazard model. Lifetime Data Anal. 2002, 8: 335348. 10.1023/A:1020514804325.
Bonarek M, BarbergerGateau P, Letenneur L, Deschamps V, Iron A, Dubroca B, Dartigues J: between cholesterol, apolipoprotein E polymorphism and dementia: a crosssectional analysis from the PAQUID study. Neuroepidemiology. 2000, 19: 14148. 10.1159/000026249.
McCullagh P, Nelder J: Genaralized Linear Models. 1989, New York: Chapman & Hall
Genz A: Numerical computation of multivariate normal probabilities. J Comput Graphical Stat. 1992, 1: 141149.
Cox D, Hinkley D: Theoretical Statistics. 1994, London: Chapman & Hall
Royen T: Expansions for the multivariate chiSquare distribution. J Multivariate Anal. 1991, 38: 213232. 10.1016/0047259X(91)90041Y.
Dagupsta N, Spurrier J: A class of multivariate χ2 distributions with applications to comparison with a control. Commun Stat Theory Methods. 1997, 26: 15591573. 10.1080/03610929708832000.
Worsley K: An improved Bonferroni inequality and applications. Biometrika. 1982, 69: 297302. 10.1093/biomet/69.2.297.
Westfall PH, Young S: Wiley Series in Probability and Mathematical Statistics. Applied Probability and Statistics. 1992, New York: NY Wiley, xvii, 340 p
Yu K, Liang F, Ciampa J, Chatterjee N: Efficient pvalue evaluation for resamplingbased tests. Biostatistics. 2011, 12 (3): 582593. 10.1093/biostatistics/kxq078.
Commenges D, Liquet B: Asymptotic distribution of score statistics for spatial cluster detection with censored data. Biometrics. 2008, 64 (4): 12871289. 10.1111/j.15410420.2008.01132_1.x.
Romano J: On the behavior of randomization tests without a group invariance assumption. J Am Stat Assoc. 1990, 85 (411–412): 686
Xu H, Hsu J: Applying the generalized partitioning principle to control the generalized familywise error rate. Biom J. 2007, 49: 5267. 10.1002/bimj.200610307.
Kaizar E, Li Y, Hsu J: Permutation multiple tests of binary features do not uniformly control error rates. J Am Stat Assoc. 2011, 106 (495): 10671074. 10.1198/jasa.2011.tm10067.
Commenges D: Transformations which preserve exchangeability and application to permutation tests. J Nonparametric Stat. 2003, 15 (2): 171185. 10.1080/1048525031000089310.
Westfall PH, Troendle JF: Multiple testing with minimal assumptions. Biom J. 2008, 50 (5): 745755. 10.1002/bimj.200710456.
Good P: Permutation Tests: Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses. 2000, NewYork: SpringerVerlag
Efron B, Tibshirani R: An Introduction to the Bootstrap (Chapman & Hall/CRC Monographs on Statistics & Applied Probability). 1994, London: Chapman and Hall/CRC
Prepublication history
The prepublication history for this paper can be accessed here:http://www.biomedcentral.com/14712288/13/75/prepub
Acknowledgements
We would like to thank Luc Letenneur (ISPED, University of Bordeaux) for making the data available and the Danone Research Clinical Study Platform.
Author information
Affiliations
Corresponding author
Additional information
Competing interests
Both authors declare that they have no competing interests.
Authors’ contributions
BL and JR developed the methodology, the R code, performed the simulation and the analysis on the dataset as well as wrote the manuscript. Both authors read and approved the final manuscript.
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Liquet, B., Riou, J. Correction of the significance level when attempting multiple transformations of an explanatory variable in generalized linear models. BMC Med Res Methodol 13, 75 (2013). https://doi.org/10.1186/147122881375
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/147122881375
Keywords
 Bonferroni procedure
 Generalized linear model
 Multiple coding
 Parametric bootstrap
 Permutation
 p _{ v a l u e }
 Resampling procedure