
Using marginal standardisation to estimate relative risk without dichotomising continuous outcomes

Abstract

Background

Although the criticisms of dichotomising continuous variables are well known, applying a logit model to dichotomised outcomes remains the convention because odds ratios are easily obtained and approximate the relative risks (RRs) for rare events.

Methods

To avoid dichotomisation when estimating the RR, we extend the marginal standardisation method, which transforms estimates from a logit or probit model into an RR estimate, to also accept estimates from a linear model. We conducted a simulation study to compare the statistical properties of the estimates from: (i) the marginal standardisation method applied to models for continuous (i.e., linear model) and dichotomised outcomes (i.e., logit or probit model), and (ii) the marginal standardisation method and the distributional approach (i.e., marginal mean method) applied to the linear model. We also compared the diagnostic tests for the probit, logit and linear models. For the real dataset analysis, we applied these analytical approaches to assess the management of inpatient hyperglycaemia in a pilot intervention study.

Results

Although the RR estimates from the marginal standardisation method were generally unbiased for all models in the simulation study, the marginal standardisation method applied to the linear model provided estimates with higher precision and power than the logit or probit model, especially when the baseline risks were at the extremes. Among approaches that avoid dichotomisation, the RR estimates had comparable performance. Assessing the error distribution assumption via the link test for the logit or probit model was less powerful than the diagnostic test for the linear model. After accounting for multiple thresholds representing varying levels of severity of hyperglycaemia, the marginal standardisation method applied to the linear model provided stronger evidence of reduced hyperglycaemia risk after the intervention in the real dataset analysis, although the RR estimates were similar across the various approaches.

Conclusions

When compared with approaches that do not avoid dichotomisation, the RR estimated from the linear model is more precise and powerful, and the diagnostic test for the linear model is more powerful in detecting a mis-specified error distribution than the diagnostic test for the logit or probit model. Our work describes and assesses the methods available for analysing studies of continuous outcomes with binary representations.


Background

Dichotomisation of continuous outcomes is common in epidemiology. Certain conditions of interest are defined by a continuous variable exceeding or falling below some threshold; for example, hyperglycaemia is defined by either a pre-meal blood glucose (BG) exceeding 7.78 mmol/L or a random BG exceeding 10 mmol/L [1, 2]. The nature of the outcome determines the statistical approach taken to analyse the data: the linear model and the logit model (or logistic regression model) are commonly applied to continuous and binary outcomes respectively. Hence, for a continuous outcome whose binary representation is also widely used, studies have reported findings from both linear and logit models [3,4,5].

However, estimates from dichotomised outcomes have large variances [6, 7] and low power [6,7,8,9,10]. Despite these disadvantages, there are practical reasons used to justify dichotomisation, such as: (a) following practices used in previous research, (b) simplifying analyses or the presentation of results, (c) addressing skewed variables, and (d) using clinically significant thresholds [9, 10]. To address the problems associated with dichotomising continuous outcomes, approaches have been proposed that use the analytical results from continuous outcomes to infer the association between an exposure and the dichotomised outcome.

The proposed approaches transform the estimates obtained from continuous outcomes into familiar measures of association for binary outcomes, such as risk differences, odds ratios (ORs), and relative risks (RRs). The estimates are obtained by applying least squares [11], method of moments [12,13,14], maximum likelihood [15, 16], or Bayesian [17, 18] estimation to the continuous outcomes. The simplest transformation multiplies the estimates from the linear model by a scaling factor to obtain the log-OR [11], but it assumes the errors have a logistic distribution. The Bayesian method allows the distribution of the error to be unspecified [18]. An alternative approach uses the dichotomised marginal means from the linear model for continuous outcomes [12, 13, 15] to obtain the measures of association for binary outcomes, and the skew normal distribution [14] has been considered to address potential skewness in the continuous outcomes.

The marginal mean approach estimates probabilities at different exposure levels by assuming an individual with a confounder profile that corresponds to the mean values of the confounders [19]. When one of the confounders is binary, an individual whose binary confounder equals its mean value does not exist in the real world. However, when the outcome is continuous, the marginal mean is equivalent to the overall mean of the population in the linear regression model because the model has an identity link. Given that the computation of probabilities for binary outcomes involves a non-linear link function (e.g., the logit link), marginal standardisation is commonly used to generate probabilities and RRs from the logit model when making inference on the overall study population [19]. Interestingly, when the study of continuous outcomes does not require adjustment for confounders with a regression model, for example in randomised controlled trials, the marginal mean method for estimating the one-sample risk and the two-sample RR [12, 15] can be equivalent to the marginal standardisation method under certain assumptions (see Additional file 1 Section 1 for the details).

In this paper, we leverage the marginal standardisation method for binary outcomes to estimate the RR of dichotomised outcomes using the linear regression model with adjustment for confounders. Extending the marginal standardisation method from binary to dichotomised outcomes becomes natural once we note that the logit (or probit) model assumes an underlying latent variable that corresponds to a linear model with standard logistic (or normal) errors [20,21,22], so that the event of the latent variable exceeding some threshold (i.e., the dichotomised latent variable) can be modelled using the logit (or probit) model. As both the logit and probit models are commonly used to model binary outcomes, we extend the marginal standardisation method to the linear model to make inference on dichotomised outcomes in two scenarios: logistic and normal error distributions. We compare the marginal standardisation method between regression models for continuous (i.e., linear model) and dichotomised (i.e., logit or probit model) outcomes by comparing the estimates generated from these regression models. Among approaches that avoid dichotomisation, we compare RR estimates obtained from the marginal standardisation approach applied to the linear model with those from the distributional approach proposed by Sauzet et al. [16], which estimates the RR from the marginal mean of the same linear model. We assess the statistical properties of the estimates and diagnostic tests using simulated data. We also apply the various approaches to evaluate the effect of an intervention that aims to improve inpatient management of hyperglycaemia in a pilot study where multiple thresholds were used to represent varying levels of severity of hyperglycaemia.

Methods

Statistical models

The commonly used linear model for a continuous outcome has the following structure:

$$ {Y}_i={\alpha}_0+{\alpha}_1{E}_i+\sum \limits_{k=1}^p{\alpha}_{k+1}{Z}_{ki}+{\varepsilon}_i $$
(1)

where Yi, Ei and Zki, k = 1, …, p, are the observed outcome, exposure and confounders for the i-th individual respectively; α0 is the intercept and α1, …, αp + 1 are the slopes; and the error terms, εi, are assumed to be independent and identically distributed from a normal distribution with mean 0 and standard deviation (SD) λ. We call this the normal linear model henceforth.

Suppose the dichotomised outcome is defined by the continuous outcome exceeding a known threshold, i.e., \( \overset{\sim }{Y_i^1}=\mathrm{I}\left({Y}_i>\tau \right) \), where τ denotes the threshold and I(∙) is the indicator function. The dichotomised outcome then corresponds to a probit model whose linear predictor (LP) is:

$$ {LP}_i={\beta}_0+{\beta}_1{E}_i+\sum \limits_{k=1}^p{\beta}_{k+1}{Z}_{ki} $$
(2)

where the link function is the inverse cumulative distribution function of a standard normal distribution (i.e., the probit link) and \( \overset{\sim }{Y_i^1} \) has a Bernoulli distribution. The parameters of the normal linear model for the continuous outcome in equation 1 and the parameters of the probit model in equation 2 have the following relationship (see Additional file 1 Section 2 for the details):

  • $$ {\beta}_0=\frac{\alpha_0-\tau }{\lambda } $$
  • $$ {\beta}_j=\frac{\alpha_j}{\lambda}\quad \text{for}\ j=1,\cdots, p+1. $$

When the dichotomised outcome is defined by the continuous outcome being below a known threshold, i.e., \( \overset{\sim }{Y_i^2}=\mathrm{I}\left({Y}_i<\tau \right) \), the relationship of the parameters between the normal linear and probit models is:

  • $$ {\beta}_0=-\left(\frac{\alpha_0-\tau }{\lambda}\right) $$
  • $$ {\beta}_j=-\frac{\alpha_j}{\lambda}\quad \text{for}\ j=1,\cdots, p+1. $$

The parameters of the two dichotomised outcomes (i.e., \( \overset{\sim }{Y_i^1} \) and \( \overset{\sim }{Y_i^2} \)) differ only in the sign. Hence, for example, we can use the estimates of the normal linear model for random BG to obtain estimates of the probit model for hyperglycaemia that corresponds to random BG exceeding some threshold.

When the error terms in equation 1 are assumed to be independent and identically distributed from a logistic distribution with location parameter 0 and scale parameter λ, we call this the logistic linear model henceforth. The dichotomised outcomes (i.e., \( \overset{\sim }{Y_i^1} \) and \( \overset{\sim }{Y_i^2} \)) from the logistic linear model correspond to the logit model, whose linear predictor is represented by equation 2 and whose link function is the inverse cumulative distribution function of a standard logistic distribution (i.e., the logit link). The relationship between the parameters of the logistic linear and logit models is the same as in the scenario where the error terms are normal. In summary, the scaled parameters from the normal (or logistic) linear model correspond to the parameters from the probit (or logit) model. In particular, the scaled slope parameter of the logistic linear model also has a log-OR interpretation. However, this interpretation does not apply to the scenario where the errors are normally distributed. Therefore, we propose to use the RR to quantify the measure of association for the dichotomised outcome.
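This parameter relationship is easy to check numerically. The R sketch below uses hypothetical simulated data (parameter values chosen purely for illustration, not taken from this study): it fits a normal linear model to the continuous outcome and compares its scaled coefficients with those of a probit model fitted directly to the dichotomised outcome.

```r
# Minimal sketch (hypothetical simulated data) of the parameter relationship:
# scaled coefficients from a normal linear model on Y should approximate the
# coefficients of a probit model on I(Y > tau).
set.seed(2)
n   <- 10000
e   <- rbinom(n, 1, 0.5)
z   <- rbinom(n, 1, 0.4)
tau <- 1
y   <- 0.4 - 0.3 * e + 0.5 * z + rnorm(n)      # normal errors, lambda = 1

lin    <- lm(y ~ e + z)
lambda <- summary(lin)$sigma                    # residual SD estimates lambda
alpha  <- coef(lin)

probit <- glm(I(y > tau) ~ e + z, family = binomial(link = "probit"))

# Compare (alpha_0 - tau)/lambda and alpha_j/lambda with the probit coefficients
cbind(scaled_linear = c((alpha[1] - tau) / lambda, alpha[-1] / lambda),
      probit        = coef(probit))
```

With a large simulated sample the two columns should agree closely; the logistic-error case is analogous with a logit link.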

Estimation of RR

In epidemiology, effect measures for binary outcomes are usually quantified by the risk difference or the RR, as these measures are more intuitive and understandable [23]. In the presence of confounders, we propose to use the marginal standardisation method, which contrasts the risks obtained when all individuals are assumed to be exposed and when all are assumed to be unexposed [19]. Specifically, the risk corresponding to the marginal probability is:

$$ \Pr \left(\overset{\sim }{Y}=1\mid E=j\right)=\sum \limits_{i=1}^n\Pr \left(\overset{\sim }{Y}=1\mid {E}_i=j,{Z}_{1i}={z}_{1i},\dots, {Z}_{pi}={z}_{pi}\right){w}_i $$
(3)

where j takes the value zero or one, corresponding to unexposed or exposed, and the subscript i refers to the i-th individual in the study population. The marginal probability in equation 3 is a weighted average of probabilities over a target population, taking the confounders into consideration [19, 24]. Hence, using the marginal probabilities, the estimated RR is:

$$ \hat{\mathrm{RR}}=\frac{\hat{\Pr}\left(\overset{\sim }{Y}=1|E=1\right)}{\hat{\Pr}\left(\overset{\sim }{Y}=1|E=0\right)}=\frac{\sum_{i=1}^n\hat{\Pr}\left(\overset{\sim }{Y_i}=1|{E}_i=1,{Z}_{1i}={z}_{1i},\cdots, {Z}_{pi}={z}_{pi}\right){w}_i}{\sum_{i=1}^n\hat{\Pr}\left(\overset{\sim }{Y_i}=1|{E}_i=0,{Z}_{1i}={z}_{1i},\cdots, {Z}_{pi}={z}_{pi}\right){w}_i} $$
(4)

and \( {w}_i=\frac{1}{n} \) when the target population of the standardisation is the study population [19, 24]. The quantity, \( \hat{\Pr}\left(\overset{\sim }{Y_i}=1|{E}_i=j,{Z}_{1i}={z}_{1i},\cdots, {Z}_{pi}={z}_{pi}\right) \), can be estimated as follows:

$$ \Phi \left(\hat{\beta_0}+\hat{\beta_1}{E}_i+\sum \limits_{k=1}^p\hat{\beta_{k+1}}{Z}_{ki}\right) $$
(5)

where Φ(·) is the cumulative distribution function of the standard normal (or logistic) distribution and \( \hat{\beta_{\mathrm{j}}} \), for j = 0, …, p + 1, are estimates from the probit (or logit) model. We can replace the estimates from the probit (or logit) model with those from the normal (or logistic) linear model (i.e., \( \hat{\alpha_0},\dots, \hat{\alpha_{\mathrm{p}+1}},\hat{\lambda} \)). Hence, when \( \overset{\sim }{Y_i^1}=\mathrm{I}\left({Y}_i>\tau \right) \), \( \hat{\beta_0}=\frac{\hat{\alpha_0}-\tau }{\hat{\lambda}} \) and \( \hat{\beta_{\mathrm{j}}}=\frac{\hat{\alpha_{\mathrm{j}}}}{\hat{\lambda}} \) for j = 1, …, p + 1, and when \( \overset{\sim }{Y_i^2}=\mathrm{I}\left({Y}_i<\tau \right) \), \( \hat{\beta_0}=-\left(\frac{\hat{\alpha_0}-\tau }{\hat{\lambda}}\right) \) and \( \hat{\beta_{\mathrm{j}}}=-\left(\frac{\hat{\alpha_{\mathrm{j}}}}{\hat{\lambda}}\right) \) for j = 1, …, p + 1. Applying the delta method [19, 24,25,26,27] to \( \hat{\mathrm{RR}} \) in equation 4, the estimated standard error (SE) of \( \hat{\mathrm{RR}} \) is:

$$ \mathrm{se}\left(\hat{\mathrm{RR}}\right)=\sqrt{{\left(\frac{\partial \hat{\mathrm{RR}}}{\partial \hat{\theta}}\right)}^T\widehat{\operatorname{Var}}\left(\hat{\theta}\right)\left(\frac{\partial \hat{\mathrm{RR}}}{\partial \hat{\theta}}\right)} $$
(6)

where \( \hat{\theta}={\left(\hat{\alpha_0},\dots, \hat{\alpha_{\mathrm{p}+1}},\hat{\lambda}\right)}^T \) if \( \hat{\mathrm{RR}} \) is estimated from the linear model, \( \widehat{\operatorname{Var}}\left(\hat{\theta}\right) \) is the estimated variance-covariance matrix of \( \hat{\theta} \), and \( \hat{\theta}={\left(\hat{\beta_0},\dots, \hat{\beta_{\mathrm{p}+1}}\right)}^T \) if \( \hat{\mathrm{RR}} \) is estimated from the probit or logit model (see Additional file 1 Section 3 for the details). The variance estimation procedure can be easily implemented in R [28].
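A minimal R sketch of this computation is given below. It assumes a data frame dat with columns y, e, z1 and z2 and a numeric threshold tau (all hypothetical names), and uses the normal-error case via pnorm; for the logistic case, pnorm would be replaced by plogis. The commented lines indicate a simple nonparametric bootstrap as an alternative to the delta-method SE in equation 6.

```r
# Sketch (assumed data frame `dat` with outcome y, exposure e and confounders
# z1, z2; threshold `tau`): RR for I(y > tau) via marginal standardisation,
# using estimates from the normal linear model in place of probit estimates.
rr_marg_std <- function(dat, tau) {
  lin    <- lm(y ~ e + z1 + z2, data = dat)
  alpha  <- coef(lin)
  lambda <- summary(lin)$sigma
  beta   <- c(alpha[1] - tau, alpha[-1]) / lambda    # implied probit coefficients

  X  <- model.matrix(~ e + z1 + z2, data = dat)
  X1 <- X; X1[, "e"] <- 1                            # everyone exposed
  X0 <- X; X0[, "e"] <- 0                            # everyone unexposed

  p1 <- mean(pnorm(X1 %*% beta))                     # standardised risk, exposed
  p0 <- mean(pnorm(X0 %*% beta))                     # standardised risk, unexposed
  p1 / p0
}

# A nonparametric bootstrap is one simple alternative to the delta method
# for the confidence interval of the RR:
# boot_rr <- replicate(1000, rr_marg_std(dat[sample(nrow(dat), replace = TRUE), ], tau))
# quantile(boot_rr, c(0.025, 0.975))
```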

Simulation study

Given that the logit and probit models are the two models commonly used to analyse binary outcomes, we simulated continuous outcome data under two scenarios, logistic and normal error distributions respectively, and generated dichotomised outcomes from these data. To investigate the performance of estimates from models that are correctly specified for the continuous and dichotomised outcomes, we compared the performance of estimates from the logit model applied to dichotomised outcomes with those from the logistic linear model applied to continuous outcomes when the data had a logistic error distribution. To compare estimates from the probit and normal linear models when the models are correctly specified, these models were applied to data with a normal error distribution. We also compared \( \hat{\mathrm{RR}} \)s from these correctly specified models, where the \( \hat{\mathrm{RR}} \)s were computed using the marginal standardisation method. To investigate the performance of approaches that avoid dichotomisation, we compared \( \hat{\mathrm{RR}}\mathrm{s} \) from the marginal standardisation method applied to the linear model with those from the distributional approach proposed by Sauzet et al. [16]. Given that the distributional approach estimates the RR by evaluating the risks at the marginal means of the linear model, it can be seen as a marginal mean method. We only considered the scenario where the data had a normal error distribution in this comparison because both methods have been developed for this scenario. We refer to the approach proposed by Sauzet et al. [16] as distdicho henceforth, after the name of the Stata module in which the authors implemented their approach. To compute the bias and coverage probability for distdicho, we used the true RR value based on its own definition of the RR (i.e., the ratio of the risk of an exposed individual to that of an unexposed individual, where both individuals have a confounder profile corresponding to the mean values of the confounders).

To investigate the robustness of the models when they are applied to data with the wrong error distribution (i.e., model misspecification), we compared the performance of estimates from the logit model applied to dichotomised outcomes with those from the logistic linear model applied to continuous outcomes when the data had a normal error distribution. Likewise, we compared estimates from the probit and normal linear models applied to data with a logistic error distribution. We also compared \( \hat{\mathrm{RR}}\mathrm{s} \) from these mis-specified models, where the \( \hat{\mathrm{RR}}\mathrm{s} \) were computed using the marginal standardisation method. To identify potential model misspecification in real-world data analyses, we can perform model diagnostics by testing the distribution of residuals against the assumed error distribution of the linear model, and by testing the appropriateness of the link function of the logit or probit model with the Pregibon link test [29]. To assess the distribution of residuals from the logistic linear model, the two-sided Kolmogorov-Smirnov (KS) test is used. As the KS test tends to be extremely conservative when distribution parameters are estimated from the sample, we used the Lilliefors corrected KS test [30], which is only available for the normal linear model. To visualise the distribution of the residuals, we plotted the quantile-quantile plot (QQ-plot). A sketch of these diagnostics in R follows; details of the simulation set-up are provided in the subsequent paragraph.
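These diagnostics can be assembled from standard R functions, as sketched below. The object names (lin for the fitted normal linear model, res_logis and scale_hat for the logistic linear model residuals and scale estimate, bin for a fitted probit or logit model) are assumptions, and the last step is one common way to code a Pregibon-style link test, refitting on the linear predictor and its square; the authors' exact implementation may differ.

```r
# Sketch of the diagnostics described above, for a fitted normal linear model
# `lin`, residuals `res_logis` and scale estimate `scale_hat` from a logistic
# linear model, and a fitted probit/logit model `bin` (all assumed objects).
library(nortest)                                  # provides lillie.test()

# Normal linear model: Lilliefors corrected KS test and QQ-plot of residuals
lillie.test(residuals(lin))
qqnorm(residuals(lin)); qqline(residuals(lin))

# Logistic linear model: two-sided KS test against the assumed logistic errors
ks.test(res_logis, "plogis", location = 0, scale = scale_hat)

# Pregibon-style link test for a binary model: refit on the linear predictor
# and its square; a significant squared term suggests link misspecification
lp <- predict(bin, type = "link")
summary(glm(bin$y ~ lp + I(lp^2), family = bin$family))
```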

We simulated a continuous outcome as a function of a binary exposure variable, E, and two binary confounders, Z1 and Z2. We first simulated Z1 and Z2 from Bernoulli trials with success probabilities of 0.4 and 0.6 respectively, and then simulated E from a Bernoulli trial with success probability \( \Pr \left(E=1|{Z}_1,{Z}_2\right)=0.34\times \exp \left\{\ln \left(\sqrt{2}\right){Z}_1+\ln \left(\sqrt{2}\right){Z}_2\right\}. \) We simulated the continuous outcome based on the following:

$$ {Y}_i={\alpha}_0+{\alpha}_1{E}_i+{\alpha}_2\left({Z}_{1i}-0.4\right)+{\alpha}_3\left({Z}_{2i}-0.6\right)+{\varepsilon}_{\mathrm{i}} $$
(7)

where εi had a normal (or logistic) distribution, and the sample size was set to 1000. We set α0 = 0.4 and α2 = α3 = 0.5, and considered the following α1 values reflecting an increasingly negative exposure effect on the outcome: 0, −0.15 and −0.3. We also considered the following λ values: 1, 2 and 0.5. We considered the dichotomised outcome corresponding to Yi exceeding a pre-specified threshold, \( \overset{\sim }{Y_i}=\mathrm{I}\left({Y}_i>\tau \right) \). To vary the baseline risk, we used threshold values corresponding to the 7.5-th, 15-th, 30-th, 50-th, 70-th, 85-th, and 92.5-th percentiles of Yi = α0 + εi, which represents the outcome of an unexposed individual whose confounders are at their mean values (i.e., the marginal mean outcome when all subjects in the population are assumed to be unexposed). We conducted 1000 simulation iterations to assess the bias, SE, coverage probability (i.e., the probability that the constructed 95% confidence interval contains the true value of the parameter), type 1 error, and power of both \( \hat{\beta_1} \) and \( \hat{\mathrm{RR}} \) from the normal linear, logistic linear, probit and logit models. We also assessed the performance of the KS test, the Lilliefors corrected KS test and the Pregibon link test for model misspecification via the type 1 error and power of these tests.
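For concreteness, the R sketch below generates a single dataset under this set-up for the normal-error scenario with α1 = −0.15 and λ = 1; the seed and the particular percentile used for the threshold are arbitrary illustrative choices.

```r
# One simulated dataset under the set-up above (normal-error scenario);
# a minimal sketch of the data-generating steps.
set.seed(123)
n  <- 1000
z1 <- rbinom(n, 1, 0.4)
z2 <- rbinom(n, 1, 0.6)
e  <- rbinom(n, 1, 0.34 * exp(log(sqrt(2)) * z1 + log(sqrt(2)) * z2))

alpha1 <- -0.15; lambda <- 1
y <- 0.4 + alpha1 * e + 0.5 * (z1 - 0.4) + 0.5 * (z2 - 0.6) + rnorm(n, 0, lambda)
# For the logistic-error scenario, replace rnorm() with rlogis(n, 0, lambda).

tau  <- qnorm(0.85, mean = 0.4, sd = lambda)  # threshold at the 85-th percentile of alpha0 + error
ybin <- as.integer(y > tau)                   # dichotomised outcome
```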

Empirical example

The real dataset consists of the BG readings from index inpatient admissions of patients with type 2 diabetes mellitus admitted within the 60 days before and the 60 days after a pilot intervention program designed to improve inpatient management of hyperglycaemia in selected wards of a tertiary hospital in Singapore in 2013. Each inpatient admission has BG readings at multiple time-points. To assess whether the intervention program could reduce the proportion of admissions with at least one hyperglycaemia event, we fitted the normal and logistic linear models to the log-transformed maximum BG level within each admission, with adjustment for age, gender, ethnicity, and emergency admission status. We used the same plots and tests as in the simulation study to assess the distributional assumption of the error term in the linear models. Four common thresholds in inpatient management of hyperglycaemia with increasing severity, 10, 14, 16, and 20 mmol/L [31,32,33], were used in the analyses. We also fitted the probit and logit models to the dichotomised outcomes at each threshold and assessed the appropriateness of the links using the Pregibon link test. We reported two-sided P values for the association between the intervention and hyperglycaemia risk.

We used R version 3.3.2 (Vienna, Austria) to perform the simulation and analyse the real dataset [28]. To fit the normal and logistic linear models, we used the R package gamlss (Generalised Additive Models for Location Scale and Shape) [34]. To obtain \( \hat{\mathrm{RR}} \) from the distributional approach [16] (distdicho), we implemented the method in R.
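A sketch of how these model fits could be coded is given below; the data frame adm and its variable names are placeholders rather than the actual study variables, and the gamlss families NO and LO correspond to normal and logistic errors respectively.

```r
# Sketch of the empirical model fits, assuming a data frame `adm` with the log
# maximum BG (log_max_bg), an intervention indicator (post) and the confounders;
# all variable names are placeholders.
library(gamlss)

fit_norm  <- gamlss(log_max_bg ~ post + age + gender + ethnicity + emergency,
                    family = NO, data = adm)   # normal linear model
fit_logis <- gamlss(log_max_bg ~ post + age + gender + ethnicity + emergency,
                    family = LO, data = adm)   # logistic linear model

# Dichotomised analysis at one threshold (e.g., 14 mmol/L) for comparison
fit_probit <- glm(I(log_max_bg > log(14)) ~ post + age + gender + ethnicity + emergency,
                  family = binomial(link = "probit"), data = adm)
```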

Results

Simulation results

β1 estimate (\( \hat{\beta_1} \)) from simulation

β1 is the slope parameter associated with the exposure in the probit and logit models, and it corresponds to the scaled slope parameter associated with the exposure in the linear models. Regardless of the threshold values and exposure effect sizes, we found that, in general, \( \hat{\beta_1} \)s from the normal linear and probit models had biases close to 0 and coverage probabilities close to 95%, suggesting comparable performance between these two models when the outcomes had standard normal errors (see Fig. 1a). Similar findings were observed for the logistic linear and logit models when the outcomes had standard logistic errors (see Fig. 1b). Model misspecification from applying the logistic (or normal) linear model to continuous outcomes with standard normal (or logistic) errors still gave unbiased estimates with coverage probabilities close to 95% across different threshold values. However, the mis-specified binary models produced biased estimates, especially when the effect size was large (β1 = −0.3).

Fig. 1

Simulation results for bias, coverage probability and standard error for both \( \hat{\beta_1} \) and \( \hat{\mathrm{RR}} \). Panels (a) and (b) plot the simulation results for \( \hat{\beta_1} \) and panels (c) and (d) plot the simulation results for \( \hat{\mathrm{RR}} \), when errors are normally (or logistically) distributed with mean (or location) 0 and standard deviation (or scale) 1, and α1 = 0, −0.15, and −0.3. Dashed lines are at 0, 0.95 and 1 for bias, coverage probability and the ratio of mean to empirical standard error (SE) respectively, corresponding to no bias, 95% coverage probability, and equal mean and empirical SEs. Normal and logistic linear models refer to linear models with error terms assumed to have normal and logistic distributions respectively

The SEs of \( \hat{\beta_1} \) from the normal and logistic linear models were comparable with each other but consistently smaller than those from the probit and logit models, with the differences being more pronounced when threshold values deviated from the 50-th percentile, regardless of exposure effect sizes and error distributions. The empirical and mean SEs of \( \hat{\beta_1} \) were comparable for each model, regardless of exposure effect sizes, threshold values and error distributions. Although the type 1 errors of \( \hat{\beta_1} \) were close to 0.05 for all models, the normal and logistic linear models had higher power than the probit and logit models respectively, with the differences being more pronounced when threshold values were at the 7.5-th and 92.5-th percentiles (see Fig. 2a and c). Similar findings were observed when λ = 0.5, 2 (see Additional file 1 Section 4, Additional file 1: Figures S1 to S4).

Fig. 2

Simulation results for type 1 error and power when α1 = −0.15 and α1 = −0.3. Panels (a) and (c) plot the simulation results for \( \hat{\beta_1} \) and panels (b) and (d) plot the simulation results for \( \hat{\mathrm{RR}} \). Dashed lines are plotted at 0.05 and 1 for type 1 error and power, corresponding to a 0.05 type 1 error and 100% power respectively. Normal and logistic linear models refer to linear models with error terms assumed to have normal and logistic distributions respectively

Relative risk estimate (\( \hat{RR} \)) from simulation

Similar to the \( \hat{\beta_1} \)s, we found that, in general, the \( \hat{\mathrm{RR}} \)s from all four models were unbiased with coverage probabilities close to 95% across exposure effect sizes and threshold values when the outcomes had a standard normal (or logistic) error distribution (see Fig. 1c and d). The exception was when threshold values were at the 92.5-th percentile, where the estimates from the mis-specified linear models had somewhat pronounced biases, although still within 10% of the true value. Similar to the \( \hat{\beta_1} \)s, the SEs of \( \hat{\mathrm{RR}} \) from the normal and logistic linear models were smaller than those from the probit and logit models, and the differences increased as threshold values increased from the 7.5-th to the 92.5-th percentile. Findings for \( \hat{\mathrm{RR}} \) were similar to those for \( \hat{\beta_1} \) in terms of the ratio between empirical and mean SEs, type 1 error and power. Similar findings for \( \hat{\mathrm{RR}} \) were observed when λ = 0.5, 2 (see Additional file 1 Section 4, Additional file 1: Figures S1 to S4). When comparing approaches that avoid dichotomisation, we found that \( \hat{\mathrm{RR}} \) from the marginal standardisation method, in general, had comparable performance to the distributional approach (distdicho) in terms of bias, coverage, the ratio of empirical to mean SE, type 1 error and power (see Additional file 1 Section 4, Additional file 1: Figures S5 to S7), where the bias and coverage quantities were based on the true RR value as defined by the marginal mean method. The mean SEs of distdicho were only slightly larger than the SEs of the marginal standardisation method when the threshold values and exposure effect sizes were large.

Model diagnostics

Model diagnostics could alleviate the bias due to misspecification of the error distribution observed in the previous simulation findings. When the effect size and scale parameter were zero and one respectively, the QQ plots of residuals from the linear models followed the 45-degree line when the models were applied appropriately (see Fig. 3a and d). Pregibon link tests applied to the binary models had type 1 errors close to 0.05 but power lower than 5%. In contrast, the KS test for the logistic linear model had a type 1 error much lower than 0.05 with power close to 5%, whereas the Lilliefors corrected KS test for the normal linear model had a type 1 error close to 0.05 with power close to 80% (see Table 1). Hence, the diagnostic test for the linear model is more powerful than the one for the probit and logit models. Similar findings were observed across different effect sizes and scale values (results not presented).

Fig. 3

Quantile-quantile plot of residuals from linear models applied to simulated datasets when α1 = 0. Panels (a) and (b) plot the residuals from linear models with normal errors. Panels (c) and (d) plot the residuals from linear models with logistic errors. Gray lines plot the quantile-quantile lines for 1000 simulations and the black line is the 45-degree line. Normal and logistic linear models refer to linear models with error terms assumed to have normal and logistic distributions respectively

Table 1 Simulation results for model diagnostics when α1 = 0

Empirical results

We obtained the index inpatient admissions of 317 patients with type 2 diabetes mellitus, of which 175 and 142 admissions occurred before and after the intervention respectively. The patients were generally elderly, with a mean age of 70 (SD = 14) years before the intervention and 67 (SD = 16) years after the intervention. The majority of the patients were female (before: 55% vs after: 61%), and the majority were Chinese (before: 60% vs after: 56%), followed by Malay (before: 21% vs after: 21%), Indian (before: 13% vs after: 19%), and Others (before: 6% vs after: 4%). The mean of the maximum BG within an admission was 15.9 (SD = 6) mmol/L after the intervention, which was significantly lower than the mean of the maximum BG before the intervention (17.3; SD = 6 mmol/L) without adjustment for confounders. To assess whether the proportion of admissions with at least one hyperglycaemia event was reduced after the intervention while avoiding dichotomisation, we fitted the normal and logistic linear models to the log-transformed maximum BG within an admission with adjustment for confounders. For an admission to have at least one hyperglycaemia event, its maximum BG has to exceed a threshold value, such as: 10 mmol/L (before: 91% vs after: 92%), 14 mmol/L (before: 67% vs after: 57%), 16 mmol/L (before: 51% vs after: 41%), and 20 mmol/L (before: 29% vs after: 20%). Figure 4a compares the adjusted \( \hat{\beta_1} \)s associated with the intervention indicator in the linear predictors of the linear, probit and logit models. For all thresholds except 10 mmol/L, we found the estimates from the normal linear and probit models to be similar, and the estimates from the logistic linear and logit models to be similar. The 95% confidence intervals of \( \hat{\beta_1} \)s from the linear models were narrower than those from the probit and logit models, and the confidence intervals from the probit and logit models were wider when threshold values corresponded to baseline risks further from the 50-th percentile (i.e., 29 and 91%). The linear models indicated that \( \hat{\beta_1} \) was significantly different from zero, but the probit and logit models reached the same conclusion only when the threshold value was 20 mmol/L.

Fig. 4

Effect of intervention on risk of hyperglycaemia with respect to different degrees of severity. Panels (a) and (b) plot \( \hat{\beta_1} \)s and \( \hat{\mathrm{RR}} \)s with their corresponding 95% confidence intervals (i.e., vertical lines) for the linear, probit and logit models respectively. * indicates a P-value less than 0.05 without adjustment for multiple testing. + indicates a P-value less than 0.05 after adjustment for multiple testing

Although the \( \hat{\mathrm{RR}} \)s were similar across all approaches within each threshold except at 10 mmol/L, the confidence intervals of the \( \hat{\mathrm{RR}} \)s computed from the normal and logistic linear models using the marginal standardisation method, and from the normal linear model using the marginal mean method (distdicho), were narrower than those computed from the probit and logit models using the marginal standardisation method (see Fig. 4b). The confidence intervals of the \( \hat{\mathrm{RR}} \)s from all approaches widened as the baseline risks decreased. The probit and logit models indicated that the \( \hat{\mathrm{RR}} \)s were significantly different from one for all thresholds except 10 mmol/L, whereas the marginal standardisation and mean methods applied to the linear models also indicated a significant difference from one when the threshold value was 10 mmol/L. To account for multiple testing, because four thresholds were used to define hyperglycaemia, we adjusted the P values for \( \hat{\mathrm{RR}} \) using the Bonferroni correction [35]. After correction, the marginal standardisation method suggested a significant difference in risks for all four thresholds with the logistic linear model, and for 16 and 20 mmol/L with the normal linear model, with the remaining two thresholds showing borderline significance (i.e., adjusted P values between 0.05 and 0.1). All thresholds except 10 mmol/L had borderline significance for the marginal mean method applied to the normal linear model (distdicho), with adjusted P values ranging from 0.088 to 0.108. A significant difference in risk was observed for only one threshold (i.e., 20 mmol/L), with no borderline significance for the remaining three thresholds, when the marginal standardisation method was applied to the probit and logit models. Similarly, after multiple testing adjustment for \( \hat{\beta_1} \) from the probit and logit models, the results were non-significant with no borderline significance for all thresholds.

The QQ plots of residuals from the normal and logistic linear models suggested that the distributional assumptions were reasonable (see Fig. 5), with Lilliefors corrected KS test and KS test P values of 0.91 and 0.80 for the normal and logistic linear models respectively. Similar findings were observed for the binary models, where the Pregibon link test results were non-significant for all thresholds except 20 mmol/L for the probit model only (P value = 0.04), which became non-significant after Bonferroni correction.

Fig. 5

Quantile-quantile plot of residuals from linear models applied to real dataset. Black line is the 45-degree line. Panel (a) plots the residuals from linear model with normal errors. Panel (b) plots the residuals from linear model with logistic errors

Discussion

Dichotomisation of a continuous outcome is a common and appealing analytical approach, especially when the dichotomisation is also practiced in the clinical setting. However, dichotomisation has been widely criticised for a non-negligible loss of power and increased variability in the estimate [6,7,8,9,10]. In particular, the magnitude of the loss in power is greater when the threshold value is distant from the mean or median [7, 9], which corresponds to the scenario where the OR approximates the RR. To avoid the drawbacks of dichotomisation and facilitate the interpretation of binary representations of continuous outcomes from linear models, we proposed to transform estimates from linear models to the RR through marginal standardisation. We evaluated the performance of our proposed approach, and compared it with the dichotomised and distributional approaches using both simulated and real datasets.

When comparing the marginal standardisation method applied to models that avoided and did not avoid dichotomisation, our simulation results suggested that the β1 and RR estimates from the linear, probit and logit models were generally unbiased when applied appropriately, but the probit and logit models had larger SEs and lower power than the linear models. The improvement in precision and power of estimates from continuous outcomes has also been reported in other studies [11, 12, 16, 36]. Although β1 estimates from mis-specified binary models had somewhat pronounced biases when the effect size was large, past studies have found that small sample sizes or extremely common (or rare) binary outcomes can lead to biased estimates from the logit model, which were negligible when compared with the magnitude of the SEs [37]. The OR approximates the RR when the baseline risk is low (i.e., threshold at the 92.5-th percentile); however, the reduction in precision and power of OR estimates from the logistic linear model to the logit model was also more pronounced when the baseline risk was low.

Often, model diagnostics are performed to assess model misspecification [38]. Our simulation results suggested that the Lilliefors corrected KS test and KS test had better power in assessing the distributional assumptions of the error terms for linear models when compared to Pregibon link tests for binary models. Our findings were consistent with a previous study that reported low power when differentiating probit and logit links in binary models [39]. These findings were expected as probit and logit links are known to be similar [40].

From the real data analyses, in general, we found the estimates from the normal linear and probit models to be similar, and the same was observed for the estimates from the logistic linear and logit models. However, estimates from the probit and logit models had wider confidence intervals and fewer significant findings than those from the linear models. These observations were consistent with findings from our simulation and the literature [11, 12, 15]. Although we used the log-transformed outcome in the real dataset analyses, the \( \hat{\mathrm{RR}}\mathrm{s} \) obtained from this transformation are equivalent to those obtained from the original outcome without transformation, which corresponds to a multiplicative regression model with a log-normal [41] or log-logistic [42] error distribution. When comparing approaches that avoid dichotomisation within the normal linear model in the real data analysis, we found that the marginal standardisation and marginal mean methods had similar \( \hat{\mathrm{RR}}\mathrm{s} \). By Jensen's inequality [43], the estimated probability obtained via the marginal mean method can differ from that obtained via the marginal standardisation method because the link function is not linear. However, if the range of the linear predictor values lies in a neighbourhood where a linear function can approximate the link function, the \( \hat{\mathrm{RR}}\mathrm{s} \) from the two methods can be similar, as illustrated in the sketch below. Although these two methods gave similar results in our real data analysis, the marginal standardisation method is more commonly used in epidemiology studies [19, 25,26,27, 44] than the marginal mean method; it can be viewed as a special case of the G-computation method in the causal inference literature [19], and it can be generalised to binomial models with other link functions [26].
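The short R sketch below (simulated data and arbitrary effect sizes, not values from this study) illustrates the point: under a logit model with a binary confounder, the risk evaluated at the mean confounder value (marginal mean) and the risk averaged over the observed confounder values (marginal standardisation) differ, and the gap shrinks when the link is approximately linear over the range of the linear predictor.

```r
# Illustrative sketch (simulated data, not the study dataset): contrast the
# marginal-mean probability with the marginally standardised probability
# from a logit model with one binary confounder.
set.seed(1)
n   <- 5000
z   <- rbinom(n, 1, 0.5)                      # binary confounder
e   <- rbinom(n, 1, plogis(-0.5 + z))         # exposure depends on z
y   <- rbinom(n, 1, plogis(-2 + 0.5 * e + 2 * z))
fit <- glm(y ~ e + z, family = binomial)

# Marginal mean method: evaluate the risk at the mean of the confounder
p_marg_mean <- predict(fit,
                       newdata = data.frame(e = 1, z = mean(z)),
                       type = "response")

# Marginal standardisation: set everyone to exposed, average the predicted risks
p_marg_std <- mean(predict(fit,
                           newdata = transform(data.frame(e = e, z = z), e = 1),
                           type = "response"))

c(marginal_mean = p_marg_mean, marginal_standardisation = p_marg_std)
```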

When faced with multiple threshold values to dichotomise the continuous outcome, the conventional approach would apply a probit or logit model to the dichotomised outcome at each threshold value, resulting in a multiple testing problem when assessing whether β1 (or RR) equals 0 (or 1). In the real dataset analyses, after applying the Bonferroni correction to account for the multiple testing problem with the binary models, only the RR defined at the threshold value corresponding to 20 mmol/L was significant, while the β1s for all threshold values were not significant. However, with the linear models, we avoided the multiple testing problem for β1. We first assessed whether β1 returned a significant finding before proceeding to identify the threshold values where RR ≠ 1. The linear models in the real dataset analyses returned significant findings for β1, and the follow-up analyses for the RR, in general, returned significant results at various threshold values after accounting for multiple testing. In particular, applying the marginal standardisation method to the normal linear model to obtain \( \hat{\mathrm{RR}}\mathrm{s} \) provided stronger evidence of a reduction in hyperglycaemia risk after the intervention, although the \( \hat{\mathrm{RR}}\mathrm{s} \) were similar across methods. For future studies involving multiple thresholds, one could mimic the ANOVA testing procedure by starting with a test for β1 from the linear model before proceeding to perform post-hoc tests for the RR at each threshold value with the marginal standardisation or marginal mean method, as sketched below.
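A rough outline of this two-step workflow in R is given below; it reuses the hypothetical data frame dat and the helper rr_marg_std() from the sketch in the Methods section, with placeholder cut-offs, and uses Bonferroni-adjusted bootstrap percentile intervals for the post-hoc RRs rather than the delta method.

```r
# Two-step sketch (assumes the hypothetical `dat` and rr_marg_std() defined earlier):
# step 1 tests the exposure slope once from the linear model; step 2 reports
# marginally standardised RRs at each cut-off with Bonferroni-adjusted intervals.
lin <- lm(y ~ e + z1 + z2, data = dat)
summary(lin)$coefficients["e", ]                  # single test of beta_1

thresholds <- c(0.5, 1.0, 1.5, 2.0)               # placeholder cut-offs on the scale of y
conf_level <- 1 - 0.05 / length(thresholds)       # Bonferroni-adjusted confidence level

rr_by_threshold <- sapply(thresholds, function(tau) {
  boot_rr <- replicate(500, rr_marg_std(dat[sample(nrow(dat), replace = TRUE), ], tau))
  c(rr    = rr_marg_std(dat, tau),
    lower = unname(quantile(boot_rr, (1 - conf_level) / 2)),
    upper = unname(quantile(boot_rr, 1 - (1 - conf_level) / 2)))
})
colnames(rr_by_threshold) <- paste0("tau=", thresholds)
rr_by_threshold
```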

Our proposed method has some limitations. We did not consider dichotomisation based on two thresholds, e.g., I(τ1 < Yi < τ2) or I(Yi < τ1 or Yi > τ2); our proposed approach can be extended to this scenario, but such an extension is beyond the scope of this paper. Although we presented an application to a binary exposure, our approach can also be applied to categorical or continuous exposures [11, 16, 36, 45]. The marginal standardisation and mean methods in this paper were used to estimate the RR only, but both methods can also be used to estimate the absolute risk reduction, the RR reduction and the number needed to treat, depending on the research question [44, 46].

Conclusions

In conclusion, we have extended the marginal standardisation method to estimate the RR using estimates from the linear model when the continuous outcome has a binary representation corresponding to the outcome exceeding or falling below some pre-specified threshold value. By avoiding the application of the probit or logit model to the dichotomised outcome, we obtained an unbiased RR estimate that was more precise and powerful with the marginal standardisation and mean methods, and a more powerful model diagnostic test that could alleviate issues associated with misspecification of the error distribution. We provide guidance for future analyses involving dichotomised outcomes to facilitate interpretation, including settings involving multiple thresholds for the same continuous outcome.

Availability of data and materials

The computer code supporting the conclusions of this article is available at: http://blog.nus.edu.sg/dasa/transoutcome/

Abbreviations

ANOVA:

Analysis of Variance

BG:

Blood Glucose

KS:

Kolmogorov-Smirnov

OR:

Odds Ratio

QQ-plot:

Quantile-Quantile plot

RR:

Relative Risk

SD:

Standard Deviation

SE:

Standard Error

References

  1. Christian MJ, Andersen B. Inpatient glycemic management. J South Calif Clin. 2014;8(1):41–4.
  2. Umpierrez GE, Hellman R, Korytkowski MT, Kosiborod M, Maynard GA, Montori VM, Seley JJ, Van den Berghe G. Management of hyperglycemia in hospitalized patients in non-critical care setting: an Endocrine Society clinical practice guideline. J Clin Endocrinol Metab. 2012;97(1):16–38.
  3. Choi JWJ, Ford ES, Gao X, Choi HK. Sugar-sweetened soft drinks, diet soft drinks, and serum uric acid level: the third national health and nutrition examination survey. Arthritis Care Res. 2008;59(1):109–16.
  4. Choi HK, Curhan G. Coffee, tea, and caffeine consumption and serum uric acid level: the third national health and nutrition examination survey. Arthritis Care Res. 2007;57(5):816–21.
  5. Teng GG, Tan CS, Santosa A, Saag KG, Yuan JM, Koh WP. Serum urate levels and consumption of common beverages and alcohol among Chinese in Singapore. Arthritis Care Res. 2013;65(9):1432–40.
  6. Cohen J. The cost of dichotomization. Appl Psychol Meas. 1983;7(3):249–53.
  7. Fedorov V, Mannino F, Zhang R. Consequences of dichotomization. Pharm Stat. 2009;8(1):50–61.
  8. Dawson NV, Weiss R. Dichotomizing continuous variables in statistical analysis: a practice to avoid. Med Decis Mak. 2012;32(2):225–6.
  9. Ragland DR. Dichotomizing continuous outcome variables: dependence of the magnitude of association and statistical power on the cutpoint. Epidemiology. 1992:434–40.
  10. MacCallum RC, Zhang S, Preacher KJ, Rucker DD. On the practice of dichotomization of quantitative variables. Psychol Methods. 2002;7(1):19.
  11. Moser BK, Coombs LP. Odds ratios for a continuous outcome variable without dichotomizing. Stat Med. 2004;23(12):1843–60.
  12. Peacock J, Sauzet O, Ewings S, Kerry S. Dichotomising continuous data while retaining statistical power using a distributional approach. Stat Med. 2012;31(26):3089–103.
  13. Sauzet O, Peacock J. Estimating dichotomised outcomes in two groups with unequal variances: a distributional approach. Stat Med. 2014;33(26):4547–59.
  14. Sauzet O, Ofuya M, Peacock JL. Dichotomisation using a distributional approach when the outcome is skewed. BMC Med Res Methodol. 2015;15(1):40.
  15. Suissa S. Binary methods for continuous outcomes: a parametric alternative. J Clin Epidemiol. 1991;44(3):241–8.
  16. Sauzet O, Breckenkamp J, Borde T, Brenne S, David M, Razum O, Peacock JL. A distributional approach to obtain adjusted comparisons of proportions of a population at risk. Emerg Themes Epidemiol. 2016;13(1):8.
  17. Jiang W, Mayo MS, Gajewski BJ. Dichotomizing continuous data which retains statistical precision using a Bayesian distributional approach that reflects the true uncertainty. Stat Biopharm Res. 2016;8(1):77–87.
  18. Schörgendorfer A, Branscum AJ, Hanson TE. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data. Biometrics. 2013;69(2):508–19.
  19. Muller CJ, MacLehose RF. Estimating predicted probabilities from logistic regression: different methods correspond to different target populations. Int J Epidemiol. 2014;43(3):962–70.
  20. McFadden D. Econometric models for probabilistic choice among products. J Bus. 1980;53(3):S13–29.
  21. Cramer JS. Logit models from economics and other fields. Cambridge: Cambridge University Press; 2003.
  22. Ben-Akiva ME, Lerman SR. Discrete choice analysis: theory and application to travel demand, vol. 9. Cambridge: MIT Press; 1985.
  23. Lee J, Tan CS, Chia KS. A practical guide for multivariate analysis of dichotomous outcomes. Ann Acad Med Singap. 2009;38(8):714–9.
  24. Flanders WD, Rhodes PH. Large sample confidence intervals for regression standardized risks, risk ratios, and risk differences. J Chronic Dis. 1987;40(7):697–704.
  25. Localio AR, Margolis DJ, Berlin JA. Relative risks and confidence intervals were easily computed indirectly from multivariable logistic regression. J Clin Epidemiol. 2007;60(9):874–82.
  26. Greenland S. Model-based estimation of relative risks and other epidemiologic measures in studies of common outcomes and in case-control studies. Am J Epidemiol. 2004;160(4):301–5.
  27. Knol MJ, Le Cessie S, Algra A, Vandenbroucke JP, Groenwold RH. Overestimation of risk ratios by odds ratios in trials and cohort studies: alternatives to logistic regression. Can Med Assoc J. 2012;184(8):895–9.
  28. R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2017.
  29. Pregibon D. Goodness of link tests for generalized linear models. Appl Stat. 1980;29(1):15–24.
  30. Lilliefors HW. On the Kolmogorov-Smirnov test for normality with mean and variance unknown. J Am Stat Assoc. 1967;62(318):399–402.
  31. Munoz M, Pronovost P, Dintzis J, Kemmerer T, Wang N-Y, Chang Y-T, Efird L, Berenholtz SM, Golden SH. Implementing and evaluating a multicomponent inpatient diabetes management program: putting research into practice. Jt Comm J Qual Patient Saf. 2012;38(5):195–AP194.
  32. Gschwend S, Ryan C, Atchison J, Arslanian S, Becker D. Effects of acute hyperglycemia on mental efficiency and counterregulatory hormones in adolescents with insulin-dependent diabetes mellitus. J Pediatr. 1995;126(2):178–84.
  33. Maynard G, Kulasa K, Ramos P, Childers D, Clay B, Sebasky M, Fink E, Field A, Renvall M, Juang PS. Impact of a hypoglycemia reduction bundle and a systems approach to inpatient glycemic management. Endocr Pract. 2014;21(4):355–67.
  34. Stasinopoulos DM, Rigby RA. Generalized additive models for location scale and shape (GAMLSS) in R. J Stat Softw. 2007;23(7):1–46.
  35. Miller RG. Simultaneous statistical inference. New York: Springer-Verlag; 1981.
  36. Jiang Y, Scott A, Wild CJ. Case–control analysis with a continuous outcome variable. Stat Med. 2009;28(2):194–204.
  37. Cordeiro GM, McCullagh P. Bias correction in generalized linear models. J R Stat Soc Ser B Methodol. 1991;53(3):629–43.
  38. Belsley DA, Kuh E, Welsch RE. Regression diagnostics: identifying influential data and sources of collinearity, vol. 571. New York: Wiley; 2005.
  39. Hosmer DW, Hjort NL. Goodness-of-fit processes for logistic regression: simulation results. Stat Med. 2002;21(18):2723–38.
  40. Aldrich JH, Nelson FD. Linear probability, logit, and probit models, vol. 45. Newbury Park: Sage; 1984.
  41. Firth D. Multiplicative errors: log-normal or gamma? J R Stat Soc Ser B Methodol. 1988;50(2):266–8.
  42. Shoukri M, Mian I, Tracy D. Sampling properties of estimators of the log-logistic distribution with application to Canadian precipitation data. Can J Stat. 1988;16(3):223–36.
  43. Needham T. A visual explanation of Jensen's inequality. Am Math Mon. 1993;100(8):768–71.
  44. Austin PC. Absolute risk reductions, relative risks, relative risk reductions, and numbers needed to treat can be obtained from a logistic regression model. J Clin Epidemiol. 2010;63(1):2–6.
  45. Sciandra M, Muggeo VM, Lovison G. Subject-specific odds ratios in binomial GLMMs with continuous response. Stat Methods Appl. 2008;17(3):309–20.
  46. Austin PC. Absolute risk reductions and numbers needed to treat can be obtained from adjusted survival models for time-to-event outcomes. J Clin Epidemiol. 2010;63(1):46–55.


Acknowledgements

Not applicable.

Funding

The work was supported by Centre for Health Services and Policy Research (grant number: SBRO14/NS01G) from the National University Health Systems Pte Ltd. The funding body played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.

Author information


Contributions

YC derived the theoretical results, processed the data, performed the simulation and data analyses, interpreted the findings, and drafted the manuscript. YN participated in the data processing and analyses, and edited the manuscript. SLK designed the study, participated in the data analyses and interpretation of the findings, edited the manuscript, and co-supervised the project. NS, FMR, KV, EYHK, and EST interpreted the findings and edited the manuscript. CST conceived the project, participated in the data analyses and interpretation of findings, edited the manuscript, and supervised the project. All authors reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Chuen Seng Tan.

Ethics declarations

Ethics approval and consent to participate

The investigations were carried out in accordance with the ethical codes and guidelines of the Nuremberg Code (1946), the Declaration of Helsinki (1964), the Belmont Report (1979) and the Singapore Guideline for Good Clinical Practice (1998), and were approved by the National Healthcare Group Domain Specific Review Board (Reference Number: NHG DSRB 2014/00034), including waiver of informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional file

Additional file 1:

Additional theoretical proofs and additional simulation results. The file consists of four sections. Section 1: Comparison of the marginal standardisation approach to the marginal means approach. Section 2: Derivation details for the relationship between parameters from linear models and probit or logit models. Section 3: Variance of estimates. Section 4: Additional simulation results. (DOCX 2100 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Chen, Y., Ning, Y., Kao, S.L. et al. Using marginal standardisation to estimate relative risk without dichotomising continuous outcomes. BMC Med Res Methodol 19, 165 (2019). https://doi.org/10.1186/s12874-019-0778-9

