The Box-Cox power transformation on nursing sensitive indicators: Does it matter if structural effects are omitted during the estimation of the transformation parameter?
 Qingjiang Hou^{1},
 Jonathan D Mahnken^{1},
 Byron J Gajewski^{1, 2} and
 Nancy Dunton^{3}
DOI: 10.1186/1471-2288-11-118
© Hou et al; licensee BioMed Central Ltd. 2011
Received: 18 April 2011
Accepted: 19 August 2011
Published: 19 August 2011
Abstract
Background
Many nursing and health related research studies have continuous outcome measures that are inherently non-normal in distribution. The Box-Cox transformation provides a powerful tool for developing a parsimonious model for data representation and interpretation when the distribution of the dependent variable, or outcome measure, of interest deviates from the normal distribution. The objective of this study was to contrast the effect of obtaining the Box-Cox power transformation parameter and subsequent analysis of variance with or without a priori knowledge of predictor variables under the classic linear or linear mixed model settings.
Methods
Simulation data from a 3 × 4 factorial treatments design, along with the Patient Falls and Patient Injury Falls from the National Database of Nursing Quality Indicators (NDNQI^{®}) for the 3^{rd} quarter of 2007 from a convenience sample of over one thousand US hospitals were analyzed. The effect of the nonlinear monotonic transformation was contrasted in two ways: a) estimating the transformation parameter along with factors with potential structural effects, and b) estimating the transformation parameter first and then conducting analysis of variance for the structural effect.
Results
Linear model ANOVA with Monte Carlo simulation and mixed models with correlated error terms with NDNQI examples showed no substantial differences on statistical tests for structural effects if the factors with structural effects were omitted during the estimation of the transformation parameter.
Conclusions
The Box-Cox power transformation can still be an effective tool for validating statistical inferences with large observational, cross-sectional, and hierarchical or repeated measure studies under the linear or the mixed model settings without prior knowledge of all the factors with potential structural effects.
Keywords
Data transformation, NDNQI, Nursing quality indicator, ANOVA, Mixed model
Background
Many health and nursing related studies focus on outcome measures that can be used to identify superior treatments and/or to reveal deficiencies in practices [1]. While substantial effort has been made on research design and data collection, researchers are more concerned with the validity of statistical conclusions should the reliability of the measurement be compromised [2] or the basic statistical assumptions be violated, because non-normal distributions are common with these outcomes [3]. In the latter case, data transformation is a powerful tool for developing parsimonious models for detecting structural effects or predictive factors and for better data representation and interpretation [4–6]. Ever since the pioneering work on the formal estimation of a suitable transformation [3], the nonlinear monotonic power transformation family in the form of ${y}^{\left(\lambda \right)}=\frac{{y}^{\lambda}-1}{\lambda}$ if λ ≠ 0 and y^{(λ)} = log (y) if λ = 0 has been the focus of extensive research and has found widespread application in linear model analysis. With the advance in statistical research and computational technology, the Box-Cox transformation has recently found application in linear mixed model settings [7–9], an active field of research as hierarchical experimental designs and longitudinal studies become more desirable. Under the linear model framework, the parameter estimate for λ with the power transformation family is, by definition, obtained along with the structural effects such that the error term is normally distributed, ε ~ N (0, σ^{2}), in the model y^{(λ)} = Xθ + ε, where y^{(λ)}, X, and θ represent the transformed response, the design matrix of structural effects, and the vector of parameter estimates, respectively.
This implies that one should know a priori what the structure is before actually estimating the transformation parameter (λ). In reality, the factors with potential structural effects on the outcome can be numerous, unknown, and often of primary interest for research, especially in large observational or cross-sectional studies such as the National Database of Nursing Quality Indicators (NDNQI^{®}). This study contrasted the effect of obtaining the Box-Cox power transformation parameter and subsequent analysis with or without a priori knowledge of predictor variables under the classic ANOVA model with simulation, and then illustrated such effects by extending the Box-Cox transformation into hierarchical analysis with the mixed model on two NDNQI nursing sensitive indicators.
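To make the power transformation family defined above concrete, the following Python sketch applies the transformation to a single positive value and maps it back. It is only an illustration: the function names `boxcox` and `inverse_boxcox` are ours, and the original analyses were carried out in SAS rather than Python.

```python
import math


def boxcox(y, lam):
    """Box-Cox power transformation of a single positive value y."""
    if y <= 0:
        raise ValueError("the Box-Cox transformation requires y > 0")
    if lam == 0:
        return math.log(y)            # limiting case as lambda -> 0
    return (y ** lam - 1.0) / lam


def inverse_boxcox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("lam * z + 1 must be positive to invert")
    return base ** (1.0 / lam)


if __name__ == "__main__":
    # Round trip with the lambda value used later in the simulations.
    z = boxcox(7.5, 0.4)
    print(z, inverse_boxcox(z, 0.4))
```

The λ = 0 branch is needed because the fraction is undefined there, while its limit as λ approaches 0 is log (y), so the family is continuous in λ.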
Basic assumption for linear model methodology
where Y_{ijk}, X_{ij}, β and ε_{ijk} are all defined as in equation (1). This transformation may allow the response variable to achieve simplicity and additivity in mean structure for the expected value of (y^{λ}) and make the variance more nearly constant among points in the factor space [14].
Substantial research has been conducted on the theoretical aspects of the Box-Cox modification [15], and a wide variety of applications have used the Box-Cox transformation [16–18]. It has been reported that maximum likelihood-based variance components analysis applied to non-normal data had inflated type I errors, which were controlled best by the Box-Cox transformation [19]. The Box-Cox transformation can be used to improve the signal/noise ratio, map families of distributions, and give more efficient and robust results [20]. Analysis of diagnostic accuracy using the receiver operating characteristic curve methodology required a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions [21]. Recently, median regression after applying the Box-Cox transformation was reported as notably more efficient and robust than the standard least absolute deviations estimator [22]. Due to its highly structured nature, however, the Box-Cox power transformation model is controversial, as some theoretical and Monte Carlo studies indicated that the data-based estimate of λ is unstable and that, much like the case of multivariate collinearity, λ and β are highly correlated [7–9, 16, 17]. Other studies, however, downplayed the cost of the data-based Box-Cox transformation, arguing that the cost should be moderate on the whole and seldom large [23]. It has been suggested that we need to understand better the joint effects of variable selection and data transformation [7, 8, 23]. Under the Box-Cox transformation (2), one can put the data on the correct scale for an ANOVA model when the predictor variables (X) are identified and included during the transformation process. Unfortunately, for many non-randomized studies it is not clear what predictor variables should be included when the dependent variable deviates significantly from the normal distribution.
Under the linear mixed model setting, the error term ε_{ijk} in model (3) is no longer independent and identically distributed (iid) normal, but rather correlated, because sampling and experiment units may be hierarchical or each sampling unit may be repeatedly measured.
NDNQI database overview
In 1998, NDNQI^{®} was established by the American Nurses Association (ANA) to monitor nursing-sensitive indicators that measure nursing quality and patient safety across all 50 states in the US [24]. Over the last decade, NDNQI has seen its participating hospitals grow from 35 in 1998 up to 1,450 by the end of 2009 [25]. With nursing data collected at the unit level within member institutions, NDNQI provides hospitals unit-level performance reports with 8-quarter trend data, along with national comparison data grouped by hospital staffed bed size, teaching status, Magnet status, various other hospital characteristics, and unit type [25].
Both Patient Falls and Patient Injury Falls have a common denominator of Total Number of Patient Days. Conceptually, a patient day is 24 hours, beginning with the hour of admission. The operational definition of patient days is the total number of inpatients present at the midnight census plus the total number of hours of short stay patients divided by 24. Short stay patients are patients on a unit for less than 24 hours either for observation or same day surgery.
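The operational definition of patient days given above amounts to a one-line calculation, sketched below in Python. The function names and the example counts are hypothetical, and expressing the indicator per 1,000 patient days is an assumption made here for illustration; the text states only that patient days form the common denominator.

```python
def patient_days(midnight_census_counts, short_stay_hours):
    """Total patient days: midnight census inpatients plus short stay hours / 24."""
    return sum(midnight_census_counts) + sum(short_stay_hours) / 24.0


def falls_rate(n_falls, days, per=1000.0):
    """Falls per `per` patient days (the scaling constant is an assumption)."""
    if days <= 0:
        raise ValueError("patient days must be positive")
    return per * n_falls / days


if __name__ == "__main__":
    # Hypothetical unit: three nights of census counts and three short stays.
    days = patient_days([20, 22, 19], [6, 12, 6])
    print(days, falls_rate(2, days))
```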
Both Patient Falls and Patient Injury Falls are critical nursing quality indicators that may be associated with nursing workforce characteristics, as well as with unit type and some hospital characteristics such as teaching status and Magnet status. Other unknown factors might also affect the rates of Patient Falls and Patient Injury Falls in NDNQI hospitals across a wide spectrum of settings over the entire United States. Further, if such factors do exist, it would be of great interest to examine what administrative or nursing process adjustments a hospital might take to reduce these rates and thus improve the overall quality of service.
Methods
The Box-Cox power transformation requires all predictor variables to be included in the model for estimating the transformation parameter in order to put a skewed response onto the correct scale for the classic ANOVA model [27]. In this paper, a Monte Carlo simulation with a 3 × 4 factorial treatment design was used to contrast the properties of power-transformed response variables with and without the presence of the 3 × 4 factorial structural effects when the transformation parameter was estimated. The residuals and the treatment main effects from the simulation were examined with a two-way ANOVA model. NDNQI Patient Falls and Patient Injury Falls, collected at the unit level, are correlated within hospitals and right-skewed in distribution. Statistical analysis without data transformation may violate the underlying assumptions because of non-normal error distributions, potentially compounded by a correlated covariance structure. For illustration purposes, we first ignored the within-hospital intra class correlation (ICC) and then extended the Box-Cox power transformation into the linear mixed model framework [26], analyzing NDNQI Patient Falls and Patient Injury Falls with mixed models assuming a compound symmetric covariance structure [28], to contrast the effect of Box-Cox transformations when the predictor variables (hospital teaching and Magnet status) were included in the transformation model with the effect when they were ignored. Note that in NDNQI quarterly reports, the ICC for all indicators was properly adjusted [29].
Patient Falls and Patient Injury Falls data from 6726 nursing units in 926 hospitals for the 3^{rd} quarter in 2007 were extracted from the NDNQI database maintained by the NDNQI project at The Kansas University School of Nursing. The number of nursing units per hospital ranged from 1 to 36 with a median of 6 ± 5 (interquartile range). Along with the two indicators, hospital teaching status (Academic Medical Center; Other Teaching; Non-Teaching) and Magnet status (Magnet vs. Non-Magnet) were chosen from a variety of stratification variables for illustrative purposes. Box-Cox transformations of Patient Falls and Patient Injury Falls were then applied both with and without inclusion of these predictors in the model with which the power transformation parameters were estimated.
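To illustrate what a compound symmetric covariance structure implies for units within one hospital, the Python sketch below builds the covariance matrix and the corresponding ICC. It assumes the usual random-intercept reading of compound symmetry, in which every pair of units in a hospital shares a common between-hospital variance; the function names and the variance values are hypothetical and are not estimates from the NDNQI data.

```python
def compound_symmetry(n_units, var_hospital, var_residual):
    """Covariance matrix for n_units nursing units within a single hospital.

    Diagonal entries: var_hospital + var_residual (total variance of a unit).
    Off-diagonal entries: var_hospital (covariance shared by any two units).
    """
    total = var_hospital + var_residual
    return [
        [total if row == col else var_hospital for col in range(n_units)]
        for row in range(n_units)
    ]


def intraclass_correlation(var_hospital, var_residual):
    """Correlation between any two units in the same hospital."""
    return var_hospital / (var_hospital + var_residual)


if __name__ == "__main__":
    for row in compound_symmetry(3, 1.0, 3.0):
        print(row)
    print(intraclass_correlation(1.0, 3.0))
```

Ignoring the ICC corresponds to setting the off-diagonal entries to zero, which is the simplification made in the first, purely illustrative, stage of the analysis.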
Monte Carlo Simulations
where ${\Upsilon}_{ijk}^{\left(\lambda \right)}$ represented the transformed response from the k^{th} block with the i^{th} treatment for factor A and the j^{th} treatment for factor B; μ was the overall mean; α_{i} was the i^{th} treatment effect for factor A; β_{j} was the j^{th} treatment effect for factor B; γ_{ij} represented the interaction between factors A and B; and ε_{ijk} ~ N (0, σ^{2}) represented error terms that followed the normal distribution. The transformed response vector ${\Upsilon}_{ijk}^{\left(\lambda \right)}$ in (4) was generated as the sum of the two factor main effects plus their interaction, with α_{1} = 3.6; α_{2} = 4.5; α_{3} = 5.4; β_{1} = 2.0; β_{2} = 2.4; β_{3} = 2.8; β_{4} = 3.2; and γ_{ij} = α_{i} × β_{j} for i = 1, 2, 3 and j = 1, 2, 3, 4. The random error ε_{ijk} was generated as N (0, 26). The nontransformed response vector was then obtained through the inverse of the power transformation function (2), Y_{ijk} = (${\Upsilon}_{ijk}^{\left(\lambda \right)}$ × λ + 1)^{(1/λ)}, with the power transformation parameter (λ) fixed at 0.4. Parameters α_{i}, β_{j} and ε_{ijk} in model (4) were set such that the main effects and their interaction were all important. To check for large sample properties, we let the replication for each combination of factors vary from 4 to 24 by 2, corresponding to sample sizes ranging from 48 to 288 by 24. Two estimated power transformation parameters were obtained for each simulated data set: the first with the 3 × 4 factorial effects included as predictor variables in the transformation model (λ_{1}), representing the Box-Cox transformation by definition; and the other from a power transformation of the response variable alone (λ_{0}), representing an approximation one might see in practice.
Both power transformed response variables ${\left({\Upsilon}_{ijk}^{\left(\lambda \right)}\right)}^{\left({\lambda}_{1}\right)}$ and ${\left({\Upsilon}_{ijk}^{\left(\lambda \right)}\right)}^{\left({\lambda}_{0}\right)}$ were then used as the dependent variables for separate ANOVAs with 3 × 4 factorial treatment effects. The F statistics and P-values for the two factor main effects along with their interaction effects from the ANOVA tables were compared under the different power transformations. Residuals after the main effects and their interaction for both models were examined for normality with the Shapiro-Wilk statistic. The power transformation parameter was obtained following the maximum likelihood method [3]. A total of 1000 simulated data sets were generated for each number of replicates, ranging from 4 to 24, for a completely randomized block design with 3 × 4 factorial treatments. SAS, version 9.2 was used for data generation and statistical analyses [30].
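The simulation steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the SAS code used for the study: the overall mean μ is taken as 0, N (0, 26) is read as a variance of 26, an error draw is repeated whenever the inverse transformation would need a non-positive base, and the full factorial model with interaction is fitted through cell means. All function names and the search grid are ours.

```python
import math
import random

ALPHA = [3.6, 4.5, 5.4]        # factor A effects (from the text)
BETA = [2.0, 2.4, 2.8, 3.2]    # factor B effects (from the text)
TRUE_LAMBDA = 0.4              # transformation parameter used to generate data
ERROR_VAR = 26.0               # assumption: N(0, 26) is read as variance 26


def simulate(reps, rng):
    """Return (i, j, y) triples on the original, untransformed scale."""
    sd = math.sqrt(ERROR_VAR)
    data = []
    for i, a in enumerate(ALPHA):
        for j, b in enumerate(BETA):
            for _ in range(reps):
                while True:
                    ups = a + b + a * b + rng.gauss(0.0, sd)  # mu assumed 0
                    base = ups * TRUE_LAMBDA + 1.0
                    if base > 0:       # assumption: redraw non-invertible values
                        break
                data.append((i, j, base ** (1.0 / TRUE_LAMBDA)))
    return data


def boxcox(y, lam):
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def profile_loglik(data, lam, use_cells):
    """Box-Cox profile log-likelihood for lam, up to an additive constant."""
    z = [boxcox(y, lam) for _, _, y in data]
    n = len(z)
    if use_cells:
        # Full 3 x 4 factorial with interaction: fitted values are cell means.
        sums, counts = {}, {}
        for (i, j, _), zi in zip(data, z):
            sums[(i, j)] = sums.get((i, j), 0.0) + zi
            counts[(i, j)] = counts.get((i, j), 0) + 1
        fitted = [sums[(i, j)] / counts[(i, j)] for i, j, _ in data]
    else:
        # Structural effects omitted: only the overall mean is removed.
        fitted = [sum(z) / n] * n
    rss = sum((zi - fi) ** 2 for zi, fi in zip(z, fitted))
    log_jacobian = (lam - 1.0) * sum(math.log(y) for _, _, y in data)
    return -0.5 * n * math.log(rss / n) + log_jacobian


def estimate_lambda(data, use_cells, grid):
    """Grid value that maximizes the profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglik(data, lam, use_cells))


if __name__ == "__main__":
    rng = random.Random(2011)
    data = simulate(6, rng)                        # 6 replicates per cell
    grid = [k / 100 for k in range(-100, 201)]     # -1.00 to 2.00 by 0.01
    print("lambda_1:", estimate_lambda(data, True, grid))
    print("lambda_0:", estimate_lambda(data, False, grid))
```

Repeating the two estimates over many simulated data sets, and then running the factorial ANOVA on each transformed response, would mirror the comparison described in this section.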
1. Will the goal of simplicity in structure and homogeneity in error for transformation still be achievable if predictor variables are omitted from the power transformation model?
2. What are the consequences of conducting the analysis of variance on the transformed response variable without including the predictor variables in estimating the transformation parameter (λ)?
Application to NDNQI Indicators
Suppose one is interested in investigating Patient Falls or Patient Injury Falls as a function of hospital teaching and/or Magnet status; then X_{ij} in (1) has 6 columns, with the first being a column of 1's, the 2^{nd} and 3^{rd} representing the teaching status, the 4^{th} an indicator for Magnet status, and the 5^{th} and 6^{th} the teaching by Magnet status interaction. After exploratory data analysis using the ANOVA model with hospital teaching and Magnet status as structural effects, Patient Falls and Patient Injury Falls were analyzed with the mixed model a) without transformation, b) power transformed without teaching and Magnet effects during the parameter estimation (λ_{0}), and c) power transformed with teaching and Magnet effects during the parameter estimation (λ_{1}). The power transformation parameter λ_{0} was obtained through a grid search by maximizing the log likelihood of the residual for the transformed response variable after removing the overall mean. As Gurka et al. [7] proposed, we obtained λ_{1} by maximizing the residual maximum likelihood (REML) with existing computational procedures (SAS PROC MIXED). Specifically, for each indicator, a scaled Box-Cox transformation [3] was first applied for a wide range of power parameter values, λ_{i} (i = 1 to 8 by 0.01). Then, each transformed response was analyzed with the compound symmetry covariance structure to model the correlation among units within a hospital. The λ_{i} that corresponded to the maximum REML was selected as λ_{1}.
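The six-column design matrix described above can be written out explicitly. The Python sketch below builds one row of X for a given hospital; the choice of Non-Teaching and Non-Magnet as reference levels is an assumption made for illustration, since the text does not state which levels were used as the reference, and the function name `design_row` is hypothetical.

```python
TEACHING_LEVELS = ("Academic Medical Center", "Other Teaching", "Non-Teaching")


def design_row(teaching, magnet):
    """One row of the 6-column design matrix for teaching and Magnet status.

    Columns: intercept, two teaching indicators, one Magnet indicator,
    and two teaching-by-Magnet interaction terms.
    """
    if teaching not in TEACHING_LEVELS:
        raise ValueError("unknown teaching status: %r" % (teaching,))
    t1 = 1.0 if teaching == "Academic Medical Center" else 0.0
    t2 = 1.0 if teaching == "Other Teaching" else 0.0
    m = 1.0 if magnet else 0.0
    # Assumption: Non-Teaching and Non-Magnet serve as the reference levels.
    return [1.0, t1, t2, m, t1 * m, t2 * m]


if __name__ == "__main__":
    print(design_row("Other Teaching", True))
    print(design_row("Non-Teaching", False))
```

Estimating λ_{0} corresponds to keeping only the first column of this matrix, while estimating λ_{1} uses all six.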
Results
Table 1. Statistics for power transformation parameter and statistical test for structural effects based on Monte Carlo simulations
Simulations with non-negative estimate (λ)  Sample size  Transformation parameter and its empirical estimate ± standard deviation  Test for residual normality (proportion with P > 0.05)  
N  n  λ  Mean (λ_{1}) ± Std  Mean (λ_{0}) ± Std  Y^{λ}  ${\left({Y}^{\lambda}\right)}^{\left({\lambda}_{0}\right)}$  ${\left({Y}^{\lambda}\right)}^{\left({\lambda}_{1}\right)}$  
973  36  0.4  0.397 ± 0.160  0.296 ± 0.123  0.698  0.962  0.968 
996  54  0.4  0.386 ± 0.125  0.289 ± 0.107  0.522  0.950  0.971 
999  72  0.4  0.393 ± 0.104  0.295 ± 0.010  0.387  0.963  0.978 
1000  90  0.4  0.395 ± 0.089  0.295 ± 0.087  0.251  0.941  0.977 
1000  108  0.4  0.393 ± 0.085  0.295 ± 0.083  0.179  0.929  0.975 
1000  126  0.4  0.393 ± 0.075  0.296 ± 0.074  0.120  0.920  0.976 
1000  144  0.4  0.395 ± 0.068  0.299 ± 0.067  0.082  0.912  0.979 
1000  162  0.4  0.395 ± 0.067  0.398 ± 0.068  0.056  0.891  0.972 
1000  180  0.4  0.395 ± 0.063  0.030 ± 0.063  0.037  0.889  0.968 
1000  198  0.4  0.396 ± 0.059  0.301 ± 0.059  0.018  0.880  0.971 
1000  216  0.4  0.396 ± 0.057  0.302 ± 0.057  0.010  0.867  0.977 
Table 2. Statistics for tests of structural effect with different transformation models based on Monte Carlo simulations
Simulations with non-negative estimate (λ)  Sample size  F-value for interaction effects ± Std with different models for power transformation  F-test for interaction effects (proportion with P > 0.05)  
N  n  Mean F_{λ} ± Std  Mean F_{λ0} ± Std  Mean F_{λ1} ± Std  Y^{λ}  ${\left({Y}^{\lambda}\right)}^{\left({\lambda}_{0}\right)}$  ${\left({Y}^{\lambda}\right)}^{\left({\lambda}_{1}\right)}$  
973  36  2.072 ± 1.372  1.083 ± 0.699  1.328 ± 0.922  0.687  0.885  0.938 
996  54  2.460 ± 1.407  1.117 ± 0.673  1.338 ± 0.868  0.523  0.874  0.937 
999  72  2.866 ± 1.475  1.126 ± 0.660  1.378 ± 0.836  0.375  0.854  0.935 
1000  90  3.264 ± 1.656  1.178 ± 0.649  1.471 ± 0.865  0.343  0.823  0.927 
1000  108  3.796 ± 1.686  1.219 ± 0.700  1.537 ± 0.908  0.158  0.802  0.903 
1000  126  4.276 ± 1.900  1.292 ± 0.696  1.656 ± 0.951  0.100  0.748  0.899 
1000  144  4.707 ± 1.984  1.336 ± 0.738  1.721 ± 0.993  0.066  0.722  0.866 
1000  162  5.211 ± 2.156  1.411 ± 0.778  1.841 ± 1.073  0.043  0.697  0.852 
1000  180  5.653 ± 2.080  1.444 ± 0.783  1.903 ± 1.053  0.016  0.663  0.834 
1000  198  6.117 ± 2.282  1.497 ± 0.776  2.001 ± 1.114  0.009  0.624  0.798 
1000  216  6.616 ± 2.324  1.570 ± 0.797  2.107 ± 1.114  0.003  0.589  0.788 
Table 3. Repeated measure analysis with the linear mixed model for Patient Falls and Patient Injury Falls for 2007 NDNQI 3^{rd} quarter
Source of Variation  Degrees of Freedom  F-Value  Prob > F  Residual goodness-of-fit test for normal distribution  
Indicator: Total Falls (transformation with additive effect)  
Teaching Status  2  2.36  0.0945  
Magnet Status  1  3.83  0.0505  
Teaching × Magnet  2  5.15  0.0058  0.071 
Indicator: Total Falls (transformation without additive effect)  
Teaching Status  2  2.73  0.065  
Magnet Status  1  5.91  0.0151  
Teaching × Magnet  2  7.14  0.0008  0.023 
Indicator: Total Falls (no transformation)  
Teaching Status  2  2.14  0.1178  
Magnet Status  1  5.90  0.0151  
Teaching × Magnet  2  6.94  0.001  0.097 
Indicator: Total Injury Falls (transformation with additive effect)  
Teaching Status  2  1.83  0.1603  
Magnet Status  1  9.37  0.0022  
Teaching × Magnet  2  4.14  0.016  0.117 
Indicator: Total Injury Falls (transformation without additive effect)  
Teaching Status  2  1.37  0.2536  
Magnet Status  1  9.78  0.0018  
Teaching × Magnet  2  4.45  0.0118  0.029 
Indicator: Total Injury Falls (no transformation)  
Teaching Status  2  6.55  0.0014  
Magnet Status  1  3.63  0.0569  
Teaching × Magnet  2  3.29  0.0373  0.146 
Discussion
The Box-Cox power transformation provides an effective tool to justify the use of the linear model when the response variable is not normally distributed. It was originally defined as highly structured and required all predictor variables to be included in the power transformation model [3]. There is always a cost resulting from selection of the transformation, expressed as an inflated variance [7, 16]. However, predictor variables may not always be clearly defined in practice. This is especially true for exploratory data analysis, observational studies, or classification and regression tree (CART) analysis aimed at finding potential relationships when the distribution of the response variable deviates significantly from normality. In such cases, applying the Box-Cox power transformation to the response variable alone and then searching for potential predictor variables was demonstrated to be effective in terms of achieving constant error and simplicity of main effects in the simulations and examples we examined. In our simulated data, the statistical tests for main effects were slightly more conservative for ${\left({\Upsilon}_{ijk}^{\left(\lambda \right)}\right)}^{\left({\lambda}_{0}\right)}$ as compared to ${\left({\Upsilon}_{ijk}^{\left(\lambda \right)}\right)}^{\left({\lambda}_{1}\right)}$, while the residuals after removing the structural treatment effects were unlikely to deviate from normality in either case (Table 1). On the other hand, interaction effects with ${\left({\Upsilon}_{ijk}^{\left(\lambda \right)}\right)}^{\left({\lambda}_{0}\right)}$ were generally less likely to be detected compared to ${\left({\Upsilon}_{ijk}^{\left(\lambda \right)}\right)}^{\left({\lambda}_{1}\right)}$ as the response variable.
The real case examples with Patient Falls and Patient Injury Falls from the NDNQI database showed that Box-Cox power transformations both with and without structural effects for teaching and Magnet status included in the models for estimating the transformation parameters were equally effective in normalizing the residual distributions (Figures 5a, b, 7a, b). Table 3 shows the test statistics from the hierarchical analysis, allowing for correlation between error terms, for structural effects by stratification variables (Teaching, Magnet, and their interaction).
With over 1800 hospitals (one in every three general hospitals in the U.S.) contributing nursing indicator data to the NDNQI database today, it is critical to provide users with valid national comparative data on nursing-sensitive quality indicators. As hospitals strive to improve the quality of their nursing service, they can turn to the NDNQI quarterly reports to identify potential problems. While most of the nursing quality indicators are skewed in distribution, the structural effects of hospital characteristics are not always clear. In such cases, the classic Box-Cox power transformation can be applied to the nursing quality indicators, for a specific category of unit (such as pediatric or post surgical) with linear model analysis, or for all units within a hospital under the mixed model framework, prior to identifying the structural effects from a potentially large pool of variables.
Both the simulation study and the real case analysis with NDNQI quarterly report data demonstrated that the consequence of omitting a structural effect from the Box-Cox power transformation was limited. This is important given that for many large health-related observational studies the number of potential structural effects may be quite large. As of 2008, NDNQI had over 20 potential structural effects for 34 nursing indicators. Participating hospitals benefit from meaningful, valid comparative information based on a number of demographic, social, administrative, and service related factors. Estimating Box-Cox power transformation parameters on indicators without including the unknown, or sometimes unmeasured, structural effects can still provide participating hospitals with statistically valid comparisons.
A few limitations need to be noted. First, the Box-Cox transformation works well only if the measure of interest is relatively smoothly spread out. In other words, the method may fail if the data cluster on a few values. Secondly, it is necessary to conduct a grid search of the transformation in order to find the optimum parameter that maximizes the residual likelihood under both the linear and the mixed model settings. Otherwise, the subsequent analysis may differ depending on whether or not the structural effects were included in the estimating process for the transformation parameters. Our results suggested that a fine grid search for the transformation parameter should be used regardless of the inclusion of factors with potential structural effects and regardless of whether the analysis uses the linear or mixed model settings, because the agreement on tests for the structural effects occurs only if both transformations are optimized. Lastly, potential interactions between parameter estimates for the transformation and for linear and/or random effects remain unclear, and interpretation of the transformed data analysis, as always, remains a challenge that warrants further research.
Conclusions
The validity of linear mixed modeling via maximum likelihood relies on the underlying assumption that the random effects and residuals of the dependent variable are normally distributed. Many health and nursing related outcome measures deviate from this assumption. At the same time, factors with potential structural effects are of major interest and are yet to be identified. Therefore, the Box-Cox power transformation provides a powerful tool for developing parsimonious models (i.e. applying linear mixed modeling) for data representation and interpretation. By extending the power transformation into the linear mixed model setting with NDNQI examples, we found limited difference in subsequent tests of structural effects regardless of whether such structure is included or omitted during the parameter estimation for the transformation. This allows analysts to transform variables earlier in model building, making the process of applying the Box-Cox transformation much easier in practice.
Future work would be to employ some sort of a latent class analysis [30] on the NDNQI data and look for structural relationships within each class.
Abbreviations
 NDNQI: National Database of Nursing Quality Indicators
 ANOVA: Analysis of Variance
 ICC: Intra Class Correlation
 iid: independent and identically distributed
 ANA: American Nurses Association.
Declarations
Acknowledgements
This research was conducted under contract from the American Nurses Association (ANA). Dr. Nancy Dunton is the principal investigator.
Authors’ Affiliations
References
 Bonneterre V, Liaudy S, Chatellier G, Lang T, de Gaudemaris R: Reliability, validity, and health issues arising from questionnaires used to measure psychosocial and organizational work factors (POWFs) among hospital nurses: A critical review. Journal of Nursing Measurement. 2008, 16 (3): 207-230. 10.1891/1061-3749.16.3.207.
 Strickland OL: Impact of Unreliability of Measurements on Statistical Conclusion Validity. Journal of Nursing Measurement. 2005, 13 (2): 83-85. 10.1891/jnum.2005.13.2.83.
 Box GEP, Cox DR: An analysis of transformations. Journal of Royal Statistical Society. 1964, B 26: 211-252.
 Ferketich S, Verran J: An overview of data transformation. Research in Nursing & Health. 1994, 17 (5): 393-396. 10.1002/nur.4770170510.
 Leydesdorff L, Bensman S: Classification and powerlaws: the logarithmic transformation. Journal of the American Society for Information Science & Technology. 2006, 57 (11): 1470-1486. 10.1002/asi.20467.
 Jaeger T: Categorical data analysis: away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory & Language [serial online]. 2008, 59 (4): 434-446.
 Gurka MJ, Edwards LJ, Muller KE, Kupper LL: Extending the Box-Cox transformation to the linear mixed model. J R Statist Soc A. 2006, 169 (Part 2): 273-288.
 Lee JC, Lin TI, Lee KJ, Hsu YL: Bayesian analysis of Box-Cox transformed linear mixed models with ARMA(p,q) dependence. J Statist Plan Infer. 2005, 133: 435-451. 10.1016/j.jspi.2004.03.015.
 Spitzer JJ: A Monte Carlo investigation of the Box-Cox transformation in small samples. Journal of the American Statistical Association. 1978, 73: 488-495. 10.2307/2286587.
 Searle SR: Linear Models. 1971, John Wiley & Sons, Inc
 Draper NR, Smith H: Applied Regression Analysis. 1998, Wiley Series in Probability and Statistics
 Johnson RA, Wichern DW: Applied Multivariate Statistical Analysis. 1998, Prentice-Hall, Inc
 Tukey JW: The comparative anatomy of transformations. Annals of Mathematical Statistics. 1957, 28: 602-632. 10.1214/aoms/1177706875.
 Lindsey JK: The roles of transformation to normality. Biometrics. 1975, 31: 247-249. 10.2307/2529728.
 Sakia RM: The Box-Cox transformation technique: a review. The Statistician. 1992, 41: 167-178.
 Carroll RJ, Ruppert D: On prediction and the power transformation family. Biometrika. 1981, 68 (3): 609-615. 10.1093/biomet/68.3.609.
 Bickel PJ, Doksum KA: An analysis of transformation revisited. Journal of the American Statistical Association. 1981, 76: 296-311. 10.2307/2287831.
 Oberg A, Davidian M: Estimating data transformations in nonlinear mixed effects models. Biometrics. 2000, 56: 65-72. 10.1111/j.0006-341X.2000.00065.x.
 Etzel CJ, Shete S, Beasley TM, Fernandez JR, Allison DB, Amos CI: Effect of Box-Cox transformation on power of Haseman-Elson and maximum-likelihood variance components tests to detect quantitative trait loci. Human Heredity. 2003, 55 (2-3): 108-116. 10.1159/000072315.
 Helene HT, Zwinderman AH: Comparing transformation methods for DNA microarray data. BMC Bioinformatics. 2004, 5: 77. 10.1186/1471-2105-5-77.
 O'Malley AJ, Zou KH: Bayesian multivariate hierarchical transformation models for ROC analysis. Statistics in Medicine. 2005, 25 (3): 459-479.
 Fitzmaurice GM, Lipsitz SR, Parzen M: Approximate median regression via the Box-Cox transformation. The American Statistician. 2007, 61 (3): 223-238.
 Carroll RJ, Ruppert D: The analysis of transformed data: comment. Journal of the American Statistical Association. 1984, 79: 312-313. 10.2307/2288266.
 Dunton N, Gajewski BJ, Klaus S, Pierson B: The relationship of nursing workforce characteristics to patient outcomes. Online Journal of Nursing Issues. 2007
 Lake TL, Shang J, Klaus S, Dunton N: Patient falls: association with hospital Magnet status and nursing unit staffing. Research in Nursing & Health. 2010, 33: 413-425. 10.1002/nur.20399.
 [http://www.nursingquality.org/FAQPage.aspx#1]
 Draper NR, Cox DR: On distributions and their transformation to normality. Journal of Royal Statistical Society. 1969, B 31: 472-476.
 Littell RC, Milliken GA, Stroup WW, Wolfinger RD: SAS^{®} System for Mixed Models. 1996, Cary, NC: SAS Institute Inc
 Gajewski BJ, Mahnken JD, Dunton N: Improving quality indicator report cards through Bayesian modeling. BMC Medical Research Methodology. 2008, 8: 77. 10.1186/1471-2288-8-77.
 Di CZ, Bandeen-Roche K: Multilevel latent class models with Dirichlet mixing distribution. Biometrics. 2010, 67 (1): 86-96.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/11/118/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.