 Research article
 Open Access
 Published:
Robust estimation of the effect of an exposure on the change in a continuous outcome
BMC Medical Research Methodology volume 20, Article number: 145 (2020)
Abstract
Background
The change in two measurements of a continuous outcome can be modelled directly with a linear regression model, or indirectly with a random effects model (REM) of the individual measurements. These methods are susceptible to model misspecifications, which are commonly addressed by applying monotonic transformations (e.g., BoxCox transformation) to the outcomes. However, transforming the outcomes complicates the data analysis, especially when variable selection is involved. We propose a robust alternative through a novel application of the conditional probit (cprobit) model.
Methods
The cprobit model analyzes the ordered outcomes within each subject, making the estimate invariant to monotonic transformation on the outcome. By scaling the estimate from the cprobit model, we obtain the exposure effect on the change in the observed or BoxCox transformed outcome, pending the adequacy of the normality assumption on the raw or transformed scale.
Results
Using simulated data, we demonstrated a similar good performance of the cprobit model and REM with and without transformation, except for some bias from both methods when the BoxCox transformation was applied to scenarios with small sample size and strong effects. Only the cprobit model was robust to skewed subjectspecific intercept terms when a BoxCox transformation was used. Using two real datasets from the breast cancer and inpatient glycemic variability studies which utilize electronic medical records, we illustrated the application of our proposed robust approach as a seamless threestep workflow that facilitates the use of BoxCox transformation to address nonnormality with a common underlying model.
Conclusions
The cprobit model provides a seamless and robust inference on the change in continuous outcomes, and its threestep workflow is implemented in an R package for easy accessibility.
Background
In studies with repeated measurements of a continuous outcome, it may be of interest to quantify the change in the outcome over time. For example, the effect of a treatment can be assessed by comparing the outcome measured for each subject before and after receiving the treatment in a prepost study [1, 2]. In experimental studies under wellcontrolled settings, the simple paired ttest may be sufficient for assessing the change in a continuous outcome at two time points in each subject. To control for confounding when assessing these changes in observational studies, we could generalize the paired ttest to a linear regression model. However, the validity of the inference from the regression model relies on the adequacy of model assumptions, i.e., normality or linearity.
Transformation is commonly used when residual diagnostics reveal that the model assumptions are inadequate. The commonly considered BoxCox power family [3] is defined for positive values, and so cannot be applied to the change in continuous measurements with negative differences. In these situations, the shifted power family [3] can be considered, but the likelihood function of the transformation parameter may behave poorly, resulting in the standard asymptotic properties of the maximum likelihood estimator being invalid [4, 5]. Although the YeoJohnson method of transformation [6] is welldefined for both positive and negative values, it produces estimates that are difficult to interpret, due to the different powers for the transformation of positive and negative values [7]. Given these difficulties, it may be preferable to apply the BoxCox transformation on the individual measurements.
An alternative approach is to analyze the repeated measurements by using the random effects model (REM) with a random intercept for each subject [8], and address the inadequacy of the normality assumption using the BoxCox transformation [9]. However, different transformations selected for the same outcome variable in different studies may result in conflicting findings [10] and could complicate the selection of variables in the final model: the independent variable(s) selected at each step of the modelbuilding procedure may differ from those that would have been selected from modelling the untransformed data, due to the additional (re)estimation of the transformation parameter at each step [11]. Hence, it is appealing to devise a robust analytical approach that allows inference on the effect of the predictor that is indifferent to any monotonic transformation on the outcome.
To address the issues faced with transformation in studies of independent samples, Liu and team [10] and Tan and team [12] proposed to analyze the ordering of outcomes where the resulting exposure effect is invariant to monotonic transformations. In particular, Tan and team introduced a stratified analysis of continuous outcomes for confounder adjustment, where the strata consisted of subjects with the same or similar confounding profile. This approach can be adapted in the context of two repeated outcome measurements by considering each subject as a stratum: assuming normally distributed error terms, the two ordered outcome measurements can be analyzed using the conditional probit (cprobit) model, allowing robust inference on the presence and direction of an exposure effect.
In this paper, we propose a robust approach to analyze two repeated measurements of a continuous outcome, by first applying the cprobit model to detect an exposure effect and subsequently quantifying the exposure effect on the observed outcome. We compare the performance of our proposed approach to the REM using simulated data, and provide a seamless threestep workflow for assessing exposure effect that facilitates the use of BoxCox transformation to address nonnormality with a common underlying model. We illustrate the application of our proposed approach with two real datasets which utilize electronic medical records (EMRs): a study of the association between the changes in white blood cell count and neutrophil percentage in female breast cancer patients after initiating chemotherapy; a study of glycemic variability over a threeday period among hospitalized patients, to determine the association between the baseline values and the change in the subsequent two days.
Methods
Underlying random effects model
We consider two repeated measurements of a continuous exposure (x_{ij}) and a continuous outcome (y_{ij}) from the ith subject, i = 1, …, n and j = 1, 2, and assume the observed outcome is generated from the following REM:
where ε_{ij}~N(0, σ^{2}) denotes the independent measurement error within each subject, and the random intercept \( {\alpha}_i\sim N\left({\mu}_{\alpha },{\sigma}_{\alpha}^2\right) \) represents the subjectspecific timeinvariant effects that is assumed to be independent from ε_{ij} and x_{ij} [8]. Departure from the normality assumptions could be assessed using the simple residuals [8], and if the normality assumption is inadequate, the BoxCox transformation can be applied to the outcome to achieve a normal distribution for the ‘total’ error term [9].
Difference model
The parametric assumption on the intercept terms makes the REM susceptible to model misspecification. An alternative approach to make inference on β is to apply the linear regression model to the change in the outcome within each subject from equation (1):
where ∆y_{i.} = y_{i2} − y_{i1}, ∆x_{i.} = x_{i2} − x_{i1}, and \( \Delta {\varepsilon}_{i.}={\varepsilon}_{i2}{\varepsilon}_{i1}\sim N\left(0,{\sigma}_{\Delta }^2=2{\sigma}^2\right) \) are scalar quantities. This differencing approach eliminates the subjectspecific intercept and thereby avoids the assumptions of normality on it [8].
Conditional probit model for continuous outcomes
Similar to nonparametric statistical tests, the cprobit model uses the ordering of the outcomes within each subject to perform hypothesis testing on β. It is derived from the scaled difference from equation (2) where the new error term has unit variance [13]:
where β_{c} = β/σ_{∆}. Hence the probability of observing y_{i2} > y_{i1} (or equivalently ∆y_{i.}/σ_{∆} > 0) for the ith subject is:
where I_{i} = I{y_{i2} > y_{i1}} and Φ(·) is the cumulative density function of a standard normal distribution. The estimate for β_{c} (\( {\hat{\beta}}_c \)) can be used to assess the presence and direction of an association between the exposure and the outcome.
Estimation of linear effect and residuals for conditional probit model
As the parameter of interest is the linear exposure effect on the continuous outcome (β) and β_{c} = β/σ_{∆} from equation (3), we adapt the approach proposed by Tan and team [12] to estimate β given \( {\hat{\beta}}_c \) by rewriting equation (2) as:
and we estimate σ_{∆} by maximizing the estimated likelihood [14] based on equation (5), where β_{c} is replaced by \( {\hat{\beta}}_c \). With estimates available for both \( {\hat{\beta}}_c \) and \( {\hat{\sigma}}_{\Delta } \), the residuals in equation (5) correspond to \( {\hat{\Delta \varepsilon}}_{i.}=\Delta {y}_{i.}{\hat{\sigma}}_{\Delta }{\hat{\beta}}_c\Delta {x}_{i.} \) and are subsequently used to assess the adequacy of the model assumptions. If the normality assumption is adequate, β can be estimated using the plugin estimator: \( \hat{\beta}={\hat{\sigma}}_{\Delta }{\hat{\beta}}_c \), with standard error: \( SE\left(\hat{\beta}\right)={\hat{\sigma}}_{\Delta } SE\left({\hat{\beta}}_c\right) \).
Addressing model inadequacy with BoxCox transformation
Modelling the transformed outcome
When the normality assumption on ∆ε_{i.} is inadequate, we consider a BoxCox transformation (indexed with a parameter λ) of the observed outcome y_{ij}:
We assume \( {y}_{ij}^{\left(\lambda \right)} \) can be described by the following REM:
where \( {\varepsilon}_{\lambda ij}\sim N\left(0,{\sigma}_{\lambda}^2\right) \) and \( {\alpha}_{\lambda i}\sim N\left({\mu}_{\lambda \alpha},{\sigma}_{\lambda \alpha}^2\right) \), and the subscript λ indicates the dependency of the parameters on λ. When λ = 1, equation (6) is simply the linear model on the untransformed outcome y_{ij} (i.e., equation (1) where α_{λi} = α_{i} − 1, β_{λ} = β and ε_{λij} = ε_{ij}). Similar to the analysis on the untransformed outcome, we eliminate the subjectspecific intercept by working on the difference in the transformed outcomes within each subject:
where \( \Delta {y}_{i.}^{\left(\lambda \right)}={y}_{i2}^{\left(\lambda \right)}{y}_{i1}^{\left(\lambda \right)} \), ∆x_{i.} = x_{i2} − x_{i1}, and \( \Delta {\varepsilon}_{\lambda i.}={\varepsilon}_{\lambda i2}{\varepsilon}_{\lambda i1}\sim N\left(0,{\sigma}_{\lambda \Delta }^2=2{\sigma}_{\lambda}^2\right) \) are scalar quantities.
Conditional probit model for transformed outcome
The cprobit model for the transformed outcome can be derived from the scaled difference of the transformed outcome in equation (7):
where β_{cλ} = β_{λ}/σ_{λ∆} and ∆ε_{λi.}/σ_{λ∆}~N(0, 1). Since the BoxCox transformation does not change the ordering of the outcome, equation (4) can be rewritten as:
where the estimate of β_{cλ} is again \( {\hat{\beta}}_c \) and Φ(·) is the cumulative density function of a standard normal distribution. To estimate β_{λ} given \( {\hat{\beta}}_c \) requires the estimation of λ and σ_{λ∆}. Following from equation (5), we use the relationship β_{λ} = σ_{λ∆}β_{cλ} to rewrite equation (7) as:
We follow the common practice [3, 5] to estimate λ and σ_{λ∆} by maximizing the profile likelihood. If residual diagnostics using \( {\hat{\Delta \varepsilon}}_{\lambda i.}=\Delta {y}_{i.}^{\left(\hat{\lambda}\right)}{\hat{\sigma}}_{\lambda \Delta }{\hat{\beta}}_c\Delta {x}_{i.} \) support the normality assumption, we estimate β_{λ} using the plugin estimator: \( {\hat{\beta}}_{\lambda }={\hat{\sigma}}_{\lambda \Delta }{\hat{\beta}}_c \), with standard error: \( SE\left({\hat{\beta}}_{\lambda}\right)={\hat{\sigma}}_{\lambda \Delta } SE\left({\hat{\beta}}_c\right) \).
Threestep workflow
Our proposed approach seamlessly integrates the use of BoxCox transformation to alleviate nonnormality, which we summarize as a threestep workflow (see Fig. 1). Assuming the (transformed) outcome is generated from a REM, our approach estimates the exposure effect by working on the differences in the (transformed) outcomes from each subject. In Step 1 of the workflow, the presence and direction of this effect is assessed by applying the cprobit model to model the indicator I_{i} in equation (4) of the positive difference in the observed outcomes within each subject (see Supplementary Figure S1 in Additional file 1 for a detailed summary of the threestep workflow). The estimated effect from the cprobit model, \( {\hat{\beta}}_c \), is unchanged when a BoxCox transformation is applied to the outcome (see equation (9)), and it is scaled in the subsequent steps to quantify the linear exposure effect on the (transformed) outcome.
In Step 2, we first consider a simple scenario where the observed outcome is generated from a REM (see equation (1)). Although our approach works on the differences in the observed outcomes within each subject (see equation (2)), \( {\hat{\beta}}_c \) provides an estimate for the scaled exposure effect on the observed outcome, where the scaling factor is the reciprocal of the standard deviation of the error distribution of the differences (see equation (3)). Thus, if the residual diagnostics suggest the error terms ∆ε_{i.} in equation (2) are normal, the estimated exposure effect on the observed outcome is: \( \hat{\beta}={\hat{\sigma}}_{\Delta }{\hat{\beta}}_c \).
When the normality assumption on the error term is inadequate for the observed outcome in Scenario 1, scaling \( {\hat{\beta}}_c \) from Step 1 may not be appropriate. We address this issue with the BoxCox transformed outcome where a REM is assumed for the transformed outcome (see equation (6)), and our approach estimates the exposure effect using the difference in the transformed outcomes (see equation (7)). The transformation parameter (λ) is estimated and the \( {\hat{\beta}}_c \) from Step 1 is now the estimate of the scaled effect on the transformed outcome (see equation (8)). The estimated effect on the transformed outcome is: \( {\hat{\beta}}_{\lambda }={\hat{\sigma}}_{\lambda \Delta }{\hat{\beta}}_c \). When the BoxCox transformation in Step 3 is insufficient to alleviate nonnormality, it may be useful to consider extensions of the functional relationship between the exposure and the outcome, e.g., by including interaction terms or splines.
Simulation study
Simulation study 1
This study assessed and compared the performance of the REM and the cprobit model when assessing the association between a predictor at baseline and the change in the outcome without transformation (see Additional file 2 for detailed simulation setup). To investigate the performance of the two methods under different conditions, we simulated data with different effect sizes of the predictor (β = 0 and −0.06), sample sizes (n = 300 and 1200) and random intercept distributions (normal and skewed). We assessed the performance of the two methods from 2000 simulation cycles using the type I error, power, bias, empirical standard error (empirical SE), average of the modelbased standard error (mean SE) and the coverage of the 95% confidence intervals (CIs) extracted for the estimated linear effect (\( \hat{\beta} \)).
Simulation study 2
This study assessed and compared the performance of the two methods when the BoxCox transformation on the outcome provides a linear model with normal errors. To generate such an observed continuous outcome, we used the outcome generated in Simulation study 1, and applied the inverse BoxCox transformation with λ = 1, 1/3, 0 to obtain the observed continuous outcome prior to the (supposedly unknown) transformation. Thus the created data required no transformation, a cubic root transformation and a log transformation respectively to satisfy the normality assumption. We assessed the performance of the estimated effect on the transformed outcome (\( {\hat{\beta}}_{\lambda } \)) and the estimated transformation parameter (\( \hat{\lambda} \)) from the two methods, and assessed the adequacy of the normality assumption after the BoxCox transformation with the Lilliefors test [15] on the residuals.
Data illustrations
We used two real datasets which utilize EMRs to illustrate our proposed workflow and compared it with: (i) the conventional linear regression model when the normality assumption is adequate without BoxCox transformation, and (ii) the REM with or without transformation as appropriate.
Neutrophil study
We investigated the association between the change in white blood cell (WBC) counts and the change in the neutrophil percentage before and during chemotherapy among breast cancer patients, where a low neutrophil level is a risk indicator for developing severe side effects of chemotherapy [16]. This study used retrospective data from breast cancer patients diagnosed between 2005 and 2014, who underwent chemotherapy at the National University of Singapore (NUH). The two periods considered in this study were the 60 days before (period 1) and after (period 2) the start date of chemotherapy. We focused our investigation on 384 patients who did not require medical intervention to increase neutrophil counts during these periods [17], with diagnostic information retrieved from the NUH breast cancer registry and laboratory test results from the NUH EMRs. For each patient, we extracted the minimum WBC count and minimum neutrophil percentage for each period, and subsequently dichotomized the minimum WBC count in each period using the sample median in period 1 (6.94 × 10^{9}/L).
We assumed the following REM:
where y_{ij} represents the minimum neutrophil percentage for the ith subject in the jth period, x_{ij} indicates whether the minimum WBC count for the ith subject in the jth period is higher than the cutoff defined earlier, t_{ij} is the indicator for period 2, Age_{i} represents age at diagnosis, Stage_{i} indicates stage 3 or above and Ethnicity_{i} indicates Chinese, for i = 1, …, 384, and j = 1, 2. As the linear regression and the cprobit models are applied to the difference in outcomes, these timeinvariant covariates are implicitly controlled for.
Blood glucose study
High glycemic variability is a known risk factor for diabetes complications [18, 19]. We examined whether the baseline glycemic variability was associated with the change in subsequent daily glycemic variability by using the pointofcare capillary blood glucose (BG) readings collected from the first BG monitoring episode among hospitalized noncritical care adult patients in NUH, using data retrieved from the NUH EMRs. We defined a BG monitoring episode as a contiguous sequence of BG readings where consecutive readings were no more than two days apart. For each patient, the daily glycemic variability was quantified by the standard deviation (SD) of the BG readings. The baseline variability is represented by the SD on the first day of the monitoring episode, and the change in the SD between the second and third day of the same episode (referred to as the first and second followup) is the outcome of primary interest. We included patients warded in the medical wards between September and December in 2012 if their first BG monitoring episode was at least three days in duration and at least three BG readings were collected per day during the first three days of the episode. The final dataset included 1200 patients.
We assumed the following REM:
where y_{ij} represents the SD of BG for the ith subject in the jth followup, t_{ij} is an indicator for the second followup, SD_{0i} represents the baseline SD of BG, Age_{i} represents age at baseline and Female_{i} indicates female for i = 1, …, 1200, and j = 1, 2. By taking the difference between y_{i1} and y_{i2} in equation (12), we obtain the following linear model:
Equation (13) suggests that β_{3} can be interpreted as the linear effect of the baseline variability of BG on the change in variability of BG between the first and second followup of the monitoring episode, and the timeinvariant covariates are implicitly controlled for when the linear regression and the cprobit models are applied to the difference in outcomes.
Implementation
The analyses were performed using R version 3.2.3 [20]. The lmer function in the lme4 package [21] was used to implement the REM, and the powerTransform function in the car package [22] was used to estimate λ when the BoxCox transformation was used. The cprobit model in Step 1 of our workflow was implemented using the glm function by specifying the binomial family for I_{i} with probit link. To obtain \( \hat{\lambda} \) in Step 3, the profile likelihood was maximized using the optimize function. Since it is common practice to restrict λ to take values from a closed interval [7, 23], we follow Hawkins and Weisberg [23] by considering the interval [−3, 3]. We implemented the proposed threestep workflow as an R package named cprobit [24] (see Additional file 3 for a reproducible example). Residual diagnostics was performed in Step 2 and 3 of our workflow using the lillie.test function in the nortest package [25].
Results
Simulation study
Simulation study 1: without BoxCox transformation
Both models provided unbiased estimates (\( \hat{\beta} \)) with comparable mean SE and empirical SE, type I error close to 5% and coverage close to 95% for the various settings considered (see the panels labelled “None” in Fig. 2 and Supplementary Table S1 in Additional file 1). The \( \hat{\beta} \) from the cprobit model had larger SE and lower power than that from the REM.
Simulation study 2: with BoxCox transformation
We first summarize the performance of both methods when the intercepts were normally distributed. Simulation results of the estimate (\( {\hat{\beta}}_{\lambda } \)) are summarized in Supplementary Table S2 in Additional file 1 and illustrated in the panels labelled λ = 0, 1/3 and 1 in Fig. 2. For zero effect with both sample sizes, both the cprobit model and the REM with BoxCox transformation produced unbiased \( {\hat{\beta}}_{\lambda } \) with comparable mean SE and empirical SE, coverage close to 95% and therefore type I error close to 5%. For nonzero effect (β_{λ} = − 0.06), the estimates from both methods were unbiased when sample size was large (n = 1200) but became somewhat biased with small sample size (n = 300), especially for larger λ values. Although the bias was slightly larger from the cprobit model than the REM, it was generally within 10% of the true value of β_{λ}. The coverage was slightly lower than 95% for nonzero effect when λ = 0 in both sample sizes and methods due to some underestimation of the standard error, which was more pronounced for larger λ. As observed in Simulation study 1, the REM had higher power than the cprobit model. The estimated transformation parameter (\( \hat{\lambda} \)) from both methods had good inferential properties except for the slightly conservative type I error and coverage from the cprobit model (see Fig. 3 and Supplementary Table S3 in Additional file 1). The rate of rejecting the normality assumption after the BoxCox transformation was close to the expected level of 5% for both methods (see Supplementary Table S4 in Additional file 1).
The performance of the cprobit model with BoxCox transformation was not affected by the skewed distribution of the intercept terms, but the REM with BoxCox transformation was adversely affected: it failed to address nonnormality in many simulation cycles especially when n = 1200 (see the high rejection rate in Supplementary Table S4 in Additional file 1), resulting in biased estimate and low coverage when β_{λ} = − 0.06, and conservative type I error when n = 300 and lower power than for normal intercepts.
Real data analyses
Neutrophil study
The residual qqplot and the Lilliefors test suggested the adequacy of the normality assumption for the linear regression model applied to the changes in the outcome (see Supplementary Figure S2(A) in Additional file 1), and hence we analyzed the observed outcome without transformation. When the WBC status changed from low in period 1 to high in period 2, the linear regression model estimated a significant increase in the minimum neutrophil percentage (i.e., expected increase is 10.20, 95% CI: 8.44, 11.95; see Table 1). Residual diagnostics for the REM and the cprobit model also supported the reporting of estimated effect for the untransformed data (see Supplementary Figure S2(B) and S2(C) in Additional file 1). Estimates of linear effect from both methods were consistent with the linear regression model: 9.51 (95% CI: 8.15, 10.88) from the REM, and 11.29 (95% CI: 8.66, 13.92) from the cprobit model.
Blood glucose study
Both the residual qqplot and the Lilliefors test suggested the inadequacy of the normality assumption when applying the linear regression model to the changes in the untransformed outcome (see Supplementary Figure S3 in Additional file 1), and a similar conclusion was drawn from residual diagnostics of the REM and the cprobit model (see Supplementary Figure S4(A) and S4(B) in Additional file 1). Therefore, the BoxCox transformation was used in the REM and the cprobit model in the subsequent analysis to address nonnormality. Both models identified a need to transform (\( \hat{\lambda}=0.33 \) for the REM and \( \hat{\lambda}=0.34 \) for the cprobit, see Table 2), and the residual qqplots and the Lilliefors test suggested the adequacy of the normality assumption after transformation (see Supplementary Figure S4(C) and S4(D) in Additional file 1). Residual diagnostics suggested that the REM had a better fit than the cprobit model for analyzing the transformed data, although both models generated similar estimates for the linear effect of the baseline measurement on the subsequent change in the transformed outcome: − 0.054 (95% CI: − 0.083, − 0.025) from the REM and − 0.042 (95% CI: − 0.079, − 0.005) from the cprobit model. A simulated dataset based on this study is provided in the cprobit package. The R commands to analyze this simulated dataset using the threestep workflow and their output are documented in Additional file 3.
Discussion
In this paper, we have proposed a robust alternative to simple linear regression or REM for the analysis of change in two repeated measurements of a continuous outcome. Our method involves a novel application of the cprobit model that incorporates the BoxCox transformation. By modeling the change, the cprobit model eliminates the subjectspecific intercept after taking the difference, and is therefore less susceptible to model misspecification than the REM. Simulation studies highlighted the advantage of the cprobit over the REM when the BoxCox transformation was required. Based on the framework of our proposed approach, we described a threestep workflow, and applied it to two real datasets that utilize EMRs to illustrate the different situations that can arise in practical data analysis.
Findings from our simulation study demonstrated a good performance for both the REM and the cprobit model when no transformation was required, although the estimates from the cprobit model had higher variability (and hence lower power) than REM. With the BoxCox transformation and normally distributed intercept terms, both methods provided good estimates of the linear exposure effect on the transformed outcome (\( {\hat{\beta}}_{\lambda } \)) for zero effect, and the power of both methods was not affected by the need to estimate the transformation parameter (λ) for both effect sizes. However, for nonzero effects, an underestimation of the SE of \( {\hat{\beta}}_{\lambda } \) from both methods resulted in reduced coverage, and for small sample size (n = 300) a bias, that was generally smaller for REM, despite the estimated transformation parameter (\( \hat{\lambda} \)) being unbiased. The bias and underestimated variability of \( {\hat{\beta}}_{\lambda } \) are attributable to the uncertainty in \( \hat{\lambda} \) that is not accounted for in estimating β_{λ} [9, 26]. Consistent with the literature [27] the REM without transformation was robust against a skewed distribution of the intercept terms. However, the REM with BoxCox transformation may fail to overcome the nonnormality in the ‘total’ error term [9] when the intercept term was skewed, resulting in a biased \( {\hat{\beta}}_{\lambda } \) with low type I error and poor coverage. Since the cprobit model eliminates the intercept terms from the likelihood, it is robust when used with the BoxCox transformation because it avoids making any distributional assumption on the intercepts.
We summarized the application of our method as a threestep workflow, where Step 1 models the ordered outcomes within each subject to estimate a measure of association between an exposure and a outcome (\( {\hat{\beta}}_c \)) that is invariant to a BoxCox transformation on the outcome. Leveraging on this invariance property in the subsequent two steps, we scale \( {\hat{\beta}}_c \) to provide an appropriate estimate for the exposure effect on the change in the observed (or BoxCox transformed) outcomes pending the adequacy of the normality assumption. Since \( {\hat{\beta}}_c \) is the scaled exposure effect estimate of the linear model that satisfies the normality assumption, different scaling factors are used for the observed and transformed outcomes. The two real datasets that utilize EMRs illustrated the stepbystep application of the method, where the data was first analyzed without transformation in Step 2 after completing Step 1. When the normality assumption is adequate (e.g., the neutrophil study), the estimated exposure effect from Step 2 is reported, otherwise the BoxCox transformation is used to address the nonnormality in Step 3 (e.g., the blood glucose study). Since the cprobit model makes inference by modelling the change within each subject, it estimates the same exposure effect (but with higher variability) as the linear regression model applied to the change in outcomes when no transformation is required. This is illustrated in the neutrophil study, where our proposed approach had similar estimate to the linear regression model but had a wider 95% CI.
In common with the linear regression model for the change in outcomes, our proposed approach has advantages over the REM in handling timeinvariant confounding effects. Although the subjectspecific intercept in the REM may implicitly account for timeinvariant covariates, a correlation between the intercept and the exposure may result in biased estimates when some of the covariates are also confounders. By modeling the difference, our proposed approach can be viewed as a fixed effects approach for analyzing longitudinal data that differs from the REM by allowing for correlation between the subjectspecific intercept and exposure [8, 28], hence implicitly controlling for timeinvariant confounding effects too. Hence, our proposed approach estimates the withinsubject effect of a timevarying exposure, while the betweensubject effect can be estimated from a REM that explicitly models these two effects of the timevarying exposure [8]. By modeling the ordered outcomes within each subject with the cprobit model, our proposed approach is robust to misspecified model assumptions but have lower power and higher variability than REM. A similar observation is made by Liu and team [10] when they compared their approach that models ordered outcomes of independent samples with approaches that model the observed outcomes.
Since the outcome of the cprobit model is a binary variable indicating an increase in the outcome for each subject, it is susceptible to common issues faced when analyzing binary responses, e.g., biased estimate due to small sample size, rare events or (quasi)complete separation [29,30,31]. The bias resulting from these issues could be alleviated using the Firth’s method [30,31,32], which is recently implemented for the probit regression model in the R package brglm [33]. Another limitation shared by the cprobit model and the REM is the high level of uncertainty in estimating λ when the sample size is small (n = 300), which could result in biased estimates with underestimated SE. However, Box and Cox [3] argued that one can still obtain a reasonable estimate of the effect in these situations by identifying a transformation that overcomes nonnormality (e.g., using prior knowledge) and reporting the estimated effect on the transformed outcome, and this is also commonly practiced in real data analyses.
In this paper we considered a common transformation for the two repeated measurements, which may not be applicable for a study where, for example, there is a suspected profound batch effect between the two measurements suggesting the transformation applied to each measurement is different. Wu and Tian [34] proposed a nonparametric transformation that allows each measurement to have a different transformation function from the others. We have considered the use of the BoxCox transformation in Step 3, and recently, Hothorn and team [35] have proposed a flexible family of nonparametric transformation functions for studies of independent samples of continuous outcomes that is applicable to both positive and negative values. Future work may explore extending their approach for Step 3 of our workflow. Nevertheless, the inference based on \( {\hat{\beta}}_c \) from Step 1 of our proposed workflow is valid as long as there exists some (unknown) monotonic transformation where the normality assumption is adequate, hence providing a useful tool for biomarker discovery by making less restrictive assumptions for hypothesis testing [12, 36]. Moreover, our proposed approach alleviates the complications of model building procedures involving the use of transformation on the outcomes to correct model misspecification, which are prevalent in both the traditional BoxCox transformation [11, 37] and the recent nonparametric approaches [34, 35].
Although we have presented our method as an approach to analyze change in two repeated measurements of a continuous outcome, it also applies to stratified analysis of continuous outcomes for confounder adjustment, with each stratum consisting of a pair of subjects with the same confounding profile. In scenarios with more than two repeated measurements from each subject, or more than two subjects in each stratum, the rankordered probit model [38] that generalizes the cprobit model can be considered in the same vein as the stratified analysis of continuous outcomes [12]. Furthermore, our proposed workflow can be extended to other error distributions (e.g., the skewed error distribution considered by Tan and team [12]), where the conventional normality assumption is not expected in some reallife applications [39, 40].
Conclusions
In this paper, we present a novel application of the cprobit model that provides a robust method for the study of change in a continuous outcome. The method is invariant to any monotonic transformation on the outcome when testing for the presence and direction of the association between the exposure and the outcome, and generally has estimates with good inferential properties for the exposure effect on the (transformed) outcome. Hence, a statistical analysis plan that preempts the use of BoxCox transformation to alleviate nonnormality can be easily integrated into the data analysis steps with the method, resulting in a practical and seamless workflow for data analysts.
Availability of data and materials
Data is available upon reasonable request and approval from ethic boards, and can only be shared in the context of an agreed collaboration and subject to a datasharing agreement to ensure security of the personal data of the study participants. The code for implementing our approach is available as an R package (available from https://github.com/nyilin/cprobit and see Additional file 3 for a reproducible example).
Abbreviations
 BG:

Blood glucose
 CI:

Confidence interval
 cprobit:

Conditional probit
 EMR:

Electronic medical records
 NUH:

National university hospital
 REM:

Random effects model
 SD:

Standard deviation
 SE:

Standard error
 WBC:

White blood cell
References
 1.
Bell BA. Pretest–posttest design. In: encyclopedia of research design. Thousand Oaks California: SAGE Publications, Inc.; 2010.
 2.
Thiese M. Observational and interventional study design types; an overview. Biochem Med. 2014;24:199–210.
 3.
Box GEP, Cox DR. An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological). 1964;26:211–52.
 4.
Atkinson AC, Pericchi LR, Smith RL. Grouped likelihood for the shifted power transformation. J R Stat Soc Ser B. 1991;53:473–82.
 5.
Atkinson A. Riani M. Springer New York: Robust Diagnostic Regression Analysis; 2000.
 6.
Yeo I, Johnson RA. A new family of power transformations to improve normality or symmetry. Biometrika. 2000;87:954–9.
 7.
Weisberg S. Applied linear regression. NJ, USA: John Wiley & Sons, Inc.; 2005.
 8.
Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis. Hoboken, New Jersey: John Wiley & Sons, Inc.; 2011.
 9.
Gurka MJ, Edwards LJ, Muller KE, Kupper LL. Extending the BoxCox transformation to the linear mixed model. J R Stat Soc Ser A (Statistics Soc). 2006;169:273–88.
 10.
Liu Q, Shepherd BE, Li C, Harrell FE. Modeling continuous response variables using ordinal regression. Stat Med. 2017;36:4316–35.
 11.
Yeo I. Variable selection and transformation in linear regression models. Stat Probab Lett. 2005;72:219–26.
 12.
Tan CS, Støer NC, Chen Y, Andersson M, Ning Y, Wee HL, et al. A stratification approach using logitbased models for confounder adjustment in the study of continuous outcomes. Stat Methods Med Res. 2019;28:1105–25.
 13.
BenAkiva ME, Lerman SR. Discrete choice analysis: theory and application to travel demand. Cambridge: MIT Press; 1985.
 14.
Pawitan Y. In all likelihood: statistical Modelling and inference using likelihood. Oxford: OUP Oxford; 2001.
 15.
Lilliefors HW. On the KolmogorovSmirnov test for normality with mean and variance unknown. J Am Stat Assoc. 1967;62:399.
 16.
MacDonald V. Chemotherapy: managing side effects and safe handling. Can Vet J. 2009;50:665–8.
 17.
US Department of Health and Human Services. Common Terminology Criteria for Adverse Events (CTCAE) Version 5.0. 2017.
 18.
Rodbard D. Optimizing display, analysis, interpretation and utility of selfmonitoring of blood glucose (SMBG) data for Management of Patients with diabetes. J Diabetes Sci Technol. 2007;1:62–71.
 19.
Suh S, Kim JH. Glycemic variability: how do we measure it and why is it important? Diabetes Metab J. 2015;39:273–82.
 20.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2015.
 21.
Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixedeffects models using lme4. J Stat Softw. 2015;67:1–48.
 22.
Fox J, Weisberg S. An R companion to applied regression. Third edit. Thousand Oaks CA: Sage; 2019.
 23.
Hawkins DM, Weisberg S. Combining the boxcox power and generalised log transformations to accommodate nonpositive responses in linear and mixedeffects linear models. South African Stat J. 2017;51:317–28.
 24.
Ning Y, Tan CS. R Package cprobit: Conditional Probit Model for Analysing Continuous Outcomes; 2020. https://doi.org/10.5281/ZENODO.3819766.
 25.
Gross J, Ligges U. nortest: Tests for Normality. 2015. https://cran.rproject.org/package=nortest.
 26.
Bickel PJ, Doksum KA. An analysis of transformations revisited. J Am Stat Assoc. 1981;76:296–311.
 27.
Verbeke G, Lesaffre E. The effect of misspecifying the randomeffects distribution in linear mixed models for longitudinal data. Comput Stat Data Anal. 1997;23:541–56.
 28.
Allison P. Fixed Effects Regression Models. Thousand Oaks: SAGE Publications, Inc.; 2009.
 29.
King G, Zeng L. Logistic regression in rare events data. Polit Anal. 2001;9:137–63.
 30.
Alison P. Logistic regression for rare events  statistical horizons. 2012. https://statisticalhorizons.com/logisticregressionforrareevents. Accessed 6 Mar 2019.
 31.
Heinze G, Schemper M. A solution to the problem of separation in logistic regression. Stat Med. 2002;21:2409–19.
 32.
Firth D. Bias reduction of maximum likelihood estimates. Biometrika. 1993;80:27–38.
 33.
Kosmidis I. brglm: Bias Reduction in BinaryResponse Generalized Linear Models. 2017. doi:R package version 0.6.1.
 34.
Wu CO, Tian X. Nonparametric estimation of conditional distributions and ranktracking probabilities with timevarying transformation models in longitudinal studies. J Am Stat Assoc. 2013;108:971–82.
 35.
Hothorn T, Möst L, Bühlmann P. Most likely transformations. Scand J Stat. 2018;45:110–34.
 36.
Ning Y, Tan CS, Maraki A, Ho PJ, Hodgins S, Comasco E, et al. Handling ties in continuous outcomes for confounder adjustment with rankordered logit and its application to ordinal outcomes. Stat Methods Med Res. 2020;29:437–54.
 37.
John JA, Draper NR. An alternative family of transformations. Appl Stat. 1980;29:190.
 38.
Hajivassiliou VA, Ruud PA. Chapter 40 classical estimation methods for LDV models using simulation. Handb Econom. 1994;4:2383–441.
 39.
Thomas M, Lemaitre M, Wilson ML, Viboud C, Yordanov Y, Wackernagel H, et al. Applications of extreme value theory in public health. PLoS One. 2016;11:e0159312.
 40.
Katz RW, Parlange MB, Naveau P. Statistics of extremes in hydrology. Adv Water Resour. 2002;25:1287–304.
Acknowledgements
Not applicable.
Funding
This work was supported by Clinician Scientist AwardSenior Investigator Category, National Medical Research Council, Grant Number: NMRC/CSASI/0015–2017; NUS SSHSPH Research Programme – Breast Cancer Prevention Programme, Grant Number: R608000124733; NUS Asian Breast Cancer Research Fund, Grant Number: N176000023091; National Medical Research Council Centre Grant (CG) Programme, Grant Number: CGAug16M005; Centre for Health Services and Policy Research, the National University Health Systems Pte Ltd., Grant Number: SBRO14/NS01G; and Swedish Cancer Society (Cancerfonden), Grant Number: CAN 2015/493. EShyong Tai is funded under the clinician scientist award scheme from the National Medical Research Council of Singapore.
Author information
Affiliations
Contributions
CST conceived and supervised the project. MH cosupervised the project. YN derived the theoretical results, processed and analyzed the data, interpreted the findings and drafted the manuscript. NS, MR and CST participated in the derivation of theoretical results. PJH and CST participated in the data analysis. KYN, SCL and MH conceived, designed and collected data for the neutrophil study. SLK, EYHK and EST conceived, designed and collected data for the blood glucose study. NS, PJH, SLK, MH, MR and CST participated in the interpretation of the findings. All authors edited the manuscript and approved the final version.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
The neutrophil study was approved by the National Health Group Domain Specific Review Board (Study reference number: 2014/01220), which gave permission to link the NUH Breast Cancer Registry with the laboratory test data in the NUH Electronic Medical Records. The approval for this study allows research on deidentified restrictedaccess data without consent from individual patients. The principal investigator of this study is A/Prof Mikael Hartman. The blood glucose study was approved by the National Health Group Domain Specific Review Board (Study reference number: 2014/00034), which gave permission to use the NUH Electronic Medical Records for research purposes. Waiver of consent was granted for the study. The principal investigator of this study is Dr. Shih Ling Kao.
Consent for publication
Not applicable.
Competing interests
All authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1.
Supplementary tables and figures for “Robust estimation of the effect of an exposure on the change in a continuous outcome”. Includes a detailed visual illustration of the threestep workflow, additional tables for simulation studies and additional figures for real data analysis.
Additional file 2.
Detailed simulation setup in “Robust estimation of the effect of an exposure on the change in a continuous outcome”. Provides a detailed description of the simulation setup.
Additional file 3.
Example Usage of Package cprobit. Description of a reproducible example using the cprobit package.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Ning, Y., Støer, N.C., Ho, P.J. et al. Robust estimation of the effect of an exposure on the change in a continuous outcome. BMC Med Res Methodol 20, 145 (2020). https://doi.org/10.1186/s12874020010276
Received:
Accepted:
Published:
Keywords
 BoxCox transformation
 Conditional probit model
 Normal errors
 Random effects model