  • Research article
  • Open access

G-computation of average treatment effects on the treated and the untreated



Average treatment effects on the treated (ATT) and the untreated (ATU) are useful when there is interest in: the evaluation of the effects of treatments or interventions on those who received them, the presence of treatment heterogeneity, or the projection of potential outcomes in a target (sub-) population. In this paper we illustrate the steps for estimating ATT and ATU using g-computation implemented via Monte Carlo simulation.


To obtain marginal effect estimates for the ATT and ATU, we used a three-step approach: fitting a model for the outcome, generating potential outcome variables for the ATT and ATU separately, and regressing each potential outcome variable on the treatment intervention variable.


The estimates for ATT, ATU and average treatment effect (ATE) were of similar magnitude, with ATE lying between ATT and ATU as expected. In our illustrative example, the effect (risk difference [RD]) of higher education on angina among the participants who actually had at least a high school education (ATT) was −0.019 (95% CI: −0.040, −0.007), and that among those who had less than a high school education in India (ATU) was −0.012 (95% CI: −0.036, 0.010).


The g-computation algorithm is a powerful way of estimating standardized estimates like the ATT and ATU. Its use should be encouraged in modern epidemiologic teaching and practice.



In epidemiology, (bio)statistics and related fields, researchers are often interested in the average treatment effect in the total population (ATE). This quantity gives the average difference in outcome between units assigned to the treatment and units assigned to the placebo (control) [1]. However, in economics and evaluation studies, it has been noted that the average treatment effect among units who actually receive the treatment or intervention (average treatment effect on the treated, ATT) may be the implicit quantity sought and the most relevant to policymakers [2]. For instance, consider a scenario in which a government has implemented a smoking cessation campaign to decrease smoking prevalence in a city and now wishes to evaluate its impact. Although the overarching goal of such an evaluation may be to assess the impact of the intervention on smoking prevalence in the general population (i.e. the ATE), researchers and policymakers might be interested in explicitly evaluating the effect of the intervention on those who actually received it (i.e. the ATT), but not on those for whom the intervention was never intended.

Alternatively, researchers may be interested in estimating the potential impact of an existing program in a new target (sub-) population. For instance, one might wish to project the effect of the smoking cessation intervention in a city that did not receive the intervention in order to gauge its potential impact when such an intervention is actually implemented. This latter quantity is referred to as the average treatment effect on the untreated (ATU). Interestingly, the ATE can be seen as a weighted average of the ATT and the ATU. All three quantities will be equal when the covariate distribution is the same among the treated and the untreated (e.g. under perfect randomization with perfect compliance, or when there are no unmeasured confounders) and there is no effect measure modification by the covariates.

Robins introduced the “g-methods” to estimate such quantities using observational data [3]. Among these, marginal structural models (MSMs) were designed to estimate marginal quantities (i.e., not conditional on other covariates). The parameters of an MSM can be consistently estimated using two classes of estimators: the g-computation algorithm [4] and inverse-probability of treatment weighting (IPTW) [5]. G-computation is often seen as a viable alternative to IPTW because g-computation produces more efficient (i.e. smaller standard errors) and more stable estimates in parametric settings and can better handle heterogeneity involving time-varying exposure and confounding [6]. To date, there are several didactic demonstrations of g-computation [7, 8] and applied examples projecting the impact of hypothetical interventions aimed at reducing risk factors for coronary heart disease [9] or diabetes in adult populations [10], or at reducing early childhood adiposity [11]. However, these studies focused on the ATE, and there are still no accessible demonstrations of g-computation [4] applied to the ATT and ATU. This manuscript aims to present an easy-to-use g-computation technique using Monte Carlo simulation for consistently estimating ATT and ATU. We also present alternative ways to obtain ATT and ATU via the ATE with sample restriction, or via a g-computation technique without simulation.


Notation and g-computation steps

In what follows, we use capital letters to refer to random variables and lowercase letters to represent specific realizations of the corresponding random variables. Let A denote the treatment, with a and a* as its index and reference values, Y the outcome, C a set of covariates sufficient for confounding control, and \(Y_a\) the potential outcome that would have occurred had treatment A, perhaps contrary to fact, been set to a. Each subject in the population has a pair of potential outcomes, one being observed and the other being counterfactual. For subjects who receive the treatment (A = a), \(Y_a\) is the observed outcome and \(Y_{a^*}\) is the counterfactual outcome; conversely, for subjects who receive the placebo (control), \(Y_{a^*}\) is the observed outcome while \(Y_a\) is the counterfactual outcome. The ATE, defined as \( E\left({Y}_a-{Y}_{a^{*}}\right) \), is the average marginal treatment effect in the total population. The ATT, defined as \( E\left({Y}_a-{Y}_{a^{*}}\Big|A=a\right) \), and the ATU, defined as \( E\left({Y}_a-{Y}_{a^{*}}\Big|A={a}^{*}\right) \), measure the marginal treatment effect in the subpopulation that received the treatment and the subpopulation that did not, respectively. When the assumptions of consistency [12], conditional exchangeability given C [13], and positivity [14] are met, the target causal parameters ATE, ATT and ATU on the risk difference scale can be estimated from observational data using the following estimators:

$$ \begin{aligned} ATE &= \sum_{c}\left[E\left(Y \mid A=a,\boldsymbol{C}=\boldsymbol{c}\right)-E\left(Y \mid A={a}^{*},\boldsymbol{C}=\boldsymbol{c}\right)\right]P\left(\boldsymbol{C}=\boldsymbol{c}\right),\\ ATT &= E\left(Y \mid A=a\right)-\sum_{c}E\left(Y \mid A={a}^{*},\boldsymbol{C}=\boldsymbol{c}\right)P\left(\boldsymbol{C}=\boldsymbol{c} \mid A=a\right),\ \mathrm{and}\\ ATU &= \sum_{c}E\left(Y \mid A=a,\boldsymbol{C}=\boldsymbol{c}\right)P\left(\boldsymbol{C}=\boldsymbol{c} \mid A={a}^{*}\right)-E\left(Y \mid A={a}^{*}\right). \end{aligned} $$
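These three standardization formulas can be evaluated directly on a small dataset. The following sketch uses an invented toy dataset (binary treatment A, binary confounder C, binary outcome Y; the numbers are hypothetical and chosen only for illustration, not taken from the paper's data):

```python
import pandas as pd

# Toy hypothetical dataset: A = treatment, C = binary confounder, Y = outcome.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 0, 0, 0, 0],
    "C": [1, 1, 0, 0, 1, 1, 0, 0],
    "Y": [1, 0, 0, 0, 1, 1, 1, 0],
})

def standardized_mean(df, a, weights):
    """Sum over c of E(Y | A=a, C=c) * weights(c): a standardized outcome mean."""
    strat = df[df["A"] == a].groupby("C")["Y"].mean()
    return sum(strat[c] * w for c, w in weights.items())

p_c       = df["C"].value_counts(normalize=True)                    # P(C=c)
p_c_trt   = df.loc[df["A"] == 1, "C"].value_counts(normalize=True)  # P(C=c | A=a)
p_c_untrt = df.loc[df["A"] == 0, "C"].value_counts(normalize=True)  # P(C=c | A=a*)

ate = standardized_mean(df, 1, p_c) - standardized_mean(df, 0, p_c)
att = df.loc[df["A"] == 1, "Y"].mean() - standardized_mean(df, 0, p_c_trt)
atu = standardized_mean(df, 1, p_c_untrt) - df.loc[df["A"] == 0, "Y"].mean()
```

In this toy dataset the distribution of C happens to be identical in the treated and untreated groups, so ATE, ATT and ATU coincide, consistent with the remark above that the three quantities are equal when the covariate distributions match.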

Steps to implement g-computation using Monte Carlo simulation are as follows:

Step 1: Fit a flexible model for Y on A and covariates C (i.e. with all possible and relevant interaction terms) and save the regression coefficients.

Step 2: Re-sample the original data with replacement K times (e.g. 200 or as many as computationally feasible). Create two copies of this pooled dataset and stack them. Assign a new treatment intervention variable A = a for every observation in the first copy and A = a * in the second copy. Then, generate potential outcomes for ATT and ATU separately using the regression coefficients obtained from step 1. For ATT, assign the potential outcome Y a for treated (i.e. A = a) individuals as their observed outcome Y in the “intervention A = a” dataset copy (by consistency), but impute their counterfactual outcome Y a* in the “intervention A = a *” copy (by conditional exchangeability). This latter counterfactual outcome is simulated under non-treatment, based on the outcome model and regression coefficients from step 1. For ATU, by consistency assumption, assign the potential outcome Y a*  = Y among the untreated (i.e. A = a *) individuals in the “intervention A = a *” dataset copy, and impute their counterfactual outcome Y a in the “intervention A = a” copy. This counterfactual outcome Y a is simulated under treatment, based on the outcome model and regression coefficients from step 1. Note that the g-computation of the ATT or ATU involves imputing or simulating only half of the potential outcomes under the counterfactual treatment since by consistency under factual treatment the potential outcome is observed.

Step 3: For ATT and ATU respectively, regress the corresponding potential outcome variable on the intervention variable A for the entire pooled simulated sample to obtain the point estimate. Repeat steps 1 to 3 on J (e.g. 500) bootstrapped samples taken at random with replacement from the original data. We obtain the standard errors (SEs) and 95% confidence intervals (CIs) based on the J resultant point estimates from the final regression in step 3. The standard deviation of these J point estimates is taken as the standard error and the corresponding 2.5th and 97.5th percentiles are taken as the confidence limits of the 95% CI. Nonparametric bootstrapping [15] can also be used to obtain bias-corrected and accelerated CIs.
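The three steps above can be sketched compactly. The paper's own implementation is in SAS (Additional file 1); the following is a hypothetical Python sketch on simulated data, using a saturated linear-probability outcome model for simplicity, and taking the mean contrast within each group directly in step 3 (equivalent to the final regression for a single binary intervention variable):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated observational data (hypothetical): C confounds the A-Y relation.
n = 5000
C = rng.binomial(1, 0.5, n)                   # confounder
A = rng.binomial(1, 0.3 + 0.4 * C)            # treatment depends on C
Y = rng.binomial(1, 0.2 + 0.1 * C - 0.1 * A)  # true risk difference = -0.1

def design(a, c):
    # Outcome model design matrix: intercept, A, C, and the A*C interaction.
    return np.column_stack([np.ones_like(a), a, c, a * c])

# Step 1: fit a flexible outcome model and save the coefficients.
beta, *_ = np.linalg.lstsq(design(A, C), Y, rcond=None)

# Step 2: re-sample the data K times; keep the observed outcome under the
# factual treatment (consistency) and simulate the counterfactual outcome
# from the fitted model (conditional exchangeability).
K = 20
idx = rng.integers(0, n, size=K * n)
a, c, y = A[idx], C[idx], Y[idx]
ones, zeros = np.ones_like(a), np.zeros_like(a)
y_treated   = np.where(a == 1, y, rng.binomial(1, np.clip(design(ones,  c) @ beta, 0, 1)))
y_untreated = np.where(a == 0, y, rng.binomial(1, np.clip(design(zeros, c) @ beta, 0, 1)))

# Step 3: contrast the potential outcomes within the treated (ATT) and the
# untreated (ATU); bootstrapping steps 1-3 would supply SEs and 95% CIs.
att = y_treated[a == 1].mean() - y_untreated[a == 1].mean()
atu = y_treated[a == 0].mean() - y_untreated[a == 0].mean()
```

Because the simulated data have no effect modification by C, both estimates should recover a risk difference near −0.1 up to sampling and Monte Carlo noise.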

One could also obtain the ATT and ATU from the ATE by simply restricting the analysis for ATE estimation to the treated (for ATT) or to the untreated (for ATU) (Additional file 1: Section 1). An alternative g-computation technique without simulation is included in Additional file 1: Section 2.
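The no-simulation alternative can be sketched as follows (again a hypothetical Python illustration on simulated data, not the paper's SAS code): predict each subject's outcome under both treatment assignments from the fitted outcome model, then average the predicted differences over the covariate distribution of the treated (ATT), the untreated (ATU), or everyone (ATE):

```python
import numpy as np

# Hypothetical simulated data with a known true risk difference of -0.1.
rng = np.random.default_rng(1)
n = 5000
C = rng.binomial(1, 0.5, n)
A = rng.binomial(1, 0.3 + 0.4 * C)
Y = rng.binomial(1, 0.2 + 0.1 * C - 0.1 * A)

# Fit the outcome model Y ~ A + C + A*C (linear-probability, for simplicity).
X = np.column_stack([np.ones(n), A, C, A * C])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Predicted outcomes for every subject under each treatment assignment.
mu1 = np.column_stack([np.ones(n), np.ones(n),  C, C])           @ beta  # E(Y | A=a,  C)
mu0 = np.column_stack([np.ones(n), np.zeros(n), C, np.zeros(n)]) @ beta  # E(Y | A=a*, C)

att = (mu1 - mu0)[A == 1].mean()  # average over the treated only
atu = (mu1 - mu0)[A == 0].mean()  # average over the untreated only
ate = (mu1 - mu0).mean()          # average over the whole sample
```

Note that the sample-restriction logic appears only in the final averaging step: the same fitted model and predictions serve all three estimands.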


We applied the above simulation method to the India sample data from the cross-sectional World Health Survey (WHS) conducted by the WHO from 2002 to 2004 [16]. Samples were probabilistically selected, with every individual assigned a known non-zero selection probability. All participants were interviewed face-to-face with the standardized WHS survey, which included questions regarding demographic, socioeconomic and behavioral factors. Details of the dataset description and variable creation can be found elsewhere [17].

Table 1 displays the estimates for ATT, ATU and ATE on the risk difference and odds ratio scales for binary education (treatment) and a binary angina indicator (outcome), accounting for age and gender (covariates). We were interested in estimating the impact of a hypothetical intervention (aimed at ensuring that the target study participants have at least a high school education) on angina diagnosis. The intervention could be implemented (i) universally in the whole population of India (ATE), (ii) among individuals of a sub-population of India who actually completed high school or had higher educational attainment (ATT), or (iii) among individuals of a sub-population of India who had less than a high school education (ATU) when the survey was conducted. Detailed steps and the accompanying SAS code for this illustrative example are included in Additional file 1: Section 3 and Additional file 1: Section 5.

Table 1 Effect estimates obtained from g-computation using the illustrative example dataseta (N = 7706)


In the illustration, participants with at least a high school education were less likely to report having an angina diagnosis compared to those with less than a high school education, based on both risk difference (RD) and odds ratio (OR) measures (Table 1). The estimates for ATT, ATU and ATE were of similar magnitude, with ATE being in between ATT and ATU as expected. The ATT estimates were of slightly greater magnitude (RD: −0.019, 95% CI: −0.040, −0.007; OR: 0.773, 95% CI: 0.607, 0.944) than the ATU estimates (RD: −0.012, 95% CI: −0.036, 0.010; OR: 0.910, 95% CI: 0.678, 1.177), suggesting that the protective effect of a higher education on angina may be stronger among the participants who indeed have at least a high school education than among those who have less than a high school education in India.

Similar results obtained via g-computation without simulation are presented in the Additional file 1: Table S1.


In this article, we presented a three-step approach to estimating ATT and ATU via Monte Carlo simulation. Since the ATE on the risk difference scale is a weighted average of the ATT and ATU, weighted by the relative sample sizes of the treated and the untreated, the ATT and ATU can also be estimated from the ATE via sample restriction.
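This weighting follows directly from the law of total expectation applied to the notation introduced earlier:

$$ E\left({Y}_a-{Y}_{a^{*}}\right)=E\left({Y}_a-{Y}_{a^{*}}\Big|A=a\right)P\left(A=a\right)+E\left({Y}_a-{Y}_{a^{*}}\Big|A={a}^{*}\right)P\left(A={a}^{*}\right), $$

that is, ATE = ATT × P(A = a) + ATU × P(A = a*).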

When generating the potential outcomes in step 2, the potential outcome will be the same as the observed outcome if the intervention assignment (e.g. treatment) is indeed what the subject originally received and the consistency assumption is satisfied. Accordingly, the counterfactual outcome for the same subjects will be imputed (simulated) based on the outcome from those who received the alternative to treatment (e.g. placebo) and are comparable (i.e., exchangeable) conditional on measured covariates or confounders, if the assumption of conditional exchangeability is met. In step 2 of the alternative g-computation approach that does not require simulation, the predicted outcomes [i.e., E(Y|A = a, C = c)] are generated for both treated and untreated individuals. While the approach via simulation clearly demonstrates the importance of the two core assumptions—consistency and conditional exchangeability—for estimating causal parameters from observational data, the approach without simulation is less computationally intensive.

We also need the positivity assumption which requires that there exist participants who experienced all levels of the treatment (such as being treated or untreated) for every combination of the values of the observed confounders in the population under study [14]. This latter assumption needs to be supported by the data at hand. Steps for implementing g-computation for ATT and ATU allow us to better understand the importance of assumptions that are often listed but seldom discussed.
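An informal, data-driven way to probe positivity is to check that every observed confounder stratum contains both treated and untreated participants. A minimal sketch with made-up data (the variable names and numbers are hypothetical):

```python
import pandas as pd

# Made-up data: "stratum" stands for a cross-classification of the confounders.
df = pd.DataFrame({
    "A":       [1, 0, 1, 0, 1, 1],
    "stratum": [0, 0, 1, 1, 2, 2],
})

# Count distinct treatment levels within each stratum; strata with only one
# level contain no empirical information about the other treatment arm.
levels = df.groupby("stratum")["A"].nunique()
violations = levels[levels < 2].index.tolist()
```

Here stratum 2 contains only treated subjects, so E(Y | A = a*, C) cannot be estimated there without model extrapolation, which is exactly the kind of positivity problem this check flags.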

Besides the consistency, conditional exchangeability and positivity assumptions, other implicit assumptions such as the absence of other biases (selection bias and measurement error) and correct model specification need to be satisfied in order to estimate ATE, ATT and ATU consistently. G-computation relies heavily on outcome model specification, as shown in the above steps, in which we used the regression coefficients obtained from the outcome regression model in step 1 to predict potential outcomes. In contrast, the IPTW method relies on correct exposure model specification. Therefore, these two g-methods can sometimes yield different results. Their strengths and limitations, and performance under violation of the positivity assumption, have also been discussed in the literature [6, 18]. When possible, researchers could use both methods, or use doubly robust methods [19–21], where consistent estimates for the target effects can be obtained as long as either the outcome or the exposure model is correctly specified.


The g-computation algorithm is a powerful way of estimating standardized estimates like the ATT and ATU, beyond routine age- and sex-standardization and as an alternative to IPTW fitting of MSM [22]. It should be used in modern epidemiologic teaching and practice.



ATE: Average treatment effect

ATT: Average treatment effects on the treated

ATU: Average treatment effects on the untreated

MSM: Marginal structural models


  1. Imbens GW. Nonparametric estimation of average treatment effects under exogeneity: a review. Rev Econ Stat. 2004;86:4–29.

  2. Heckman JJ, Vytlacil E. Policy-relevant treatment effects. Am Econ Rev. 2001;91:107–11.

  3. Robins JM. Marginal structural models versus structural nested models as tools for causal inference. In: Halloran ME, Berry D, editors. Stat Model Epidemiol Environ Clin Trials. New York: Springer; 2000. p. 95–133.

  4. Robins J. A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Math Model. 1986;7:1393–512.

  5. Robins JM, Hernán MA, Brumback B. Marginal structural models and causal inference in epidemiology. Epidemiology. 2000;11:550–60.

  6. Daniel RM, Cousens SN, De Stavola BL, Kenward MG, Sterne JAC. Methods for dealing with time-dependent confounding. Stat Med. 2013;32:1584–618.

  7. Snowden JM, Rose S, Mortimer KM. Implementation of G-computation on a simulated data set: demonstration of a causal inference technique. Am J Epidemiol. 2011;173:731–8.

  8. Keil AP, Edwards JK, Richardson DB, Naimi AI, Cole SR. The parametric g-formula for time-to-event data: intuition and a worked example. Epidemiology. 2014;25:889–97.

  9. Taubman SL, Robins JM, Mittleman MA, Hernán MA. Intervening on risk factors for coronary heart disease: an application of the parametric g-formula. Int J Epidemiol. 2009;38:1599–611.

  10. Danaei G, Pan A, Hu FB, Hernán MA. Hypothetical midlife interventions in women and risk of type 2 diabetes. Epidemiology. 2013;24:122–8.

  11. Nianogo RA, Wang MC, Wang A, Nobari TZ, Crespi CM, Whaley SE, et al. Projecting the impact of hypothetical early life interventions on adiposity in children living in low-income households. Pediatr Obes. 2016. doi:10.1111/ijpo.12157.

  12. Cole SR, Frangakis CE. The consistency statement in causal inference: a definition or an assumption? Epidemiology. 2009;20:3–5.

  13. Hernán MA, Robins JM. Estimating causal effects from epidemiological data. J Epidemiol Community Health. 2006;60:578–86.

  14. Westreich D, Cole SR. Invited commentary: positivity in practice. Am J Epidemiol. 2010;171:674–7. discussion 678–81.

  15. Efron B, Tibshirani R. Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Stat Sci. 1986;1:54–75.

  16. World Health Organization. WHO World Health Survey. World Health Organization; 2016. Accessed 1 June 2016.

  17. Wang A, Stronks K, Arah OA. Global educational disparities in the associations between body mass index and diabetes mellitus in 49 low-income and middle-income countries. J Epidemiol Community Health. 2014;68:705–11.

  18. Moore KL, Neugebauer R, van der Laan MJ, Tager IB. Causal inference in epidemiological studies with strong confounding. Stat Med. 2012;31:1380–404.

  19. Hernán MA, Robins JM. Standardization and the parametric g-formula. In: Causal Inference. Chapman & Hall/CRC; 2015. Accessed 1 June 2016.

  20. Bang H, Robins JM. Doubly robust estimation in missing data and causal inference models. Biometrics. 2005;61:962–72.

  21. Vansteelandt S, Keiding N. Invited commentary: G-computation–lost in translation? Am J Epidemiol. 2011;173:739–42.

  22. Sato T, Matsuyama Y. Marginal structural models as a tool for standardization. Epidemiology. 2003;14:680–6.



The authors benefited from facilities and resources provided by the California Center for Population Research at UCLA (CCPR), which receives core support (R24-HD041022) from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD).


AW was supported by the Dissertation Year Fellowship from the University of California, Los Angeles. RAN was supported by a Burroughs Wellcome Fellowship and the Dissertation Year Fellowship from UCLA. OAA was partly supported by grant R01-HD072296-01A1 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development.

Availability of data and materials

The World Health Survey (WHS) is a large cross-sectional study implemented by the WHO and conducted in 2002–2004 in 70 countries. The survey collected data on the health of adult populations and health systems using probabilistic sampling techniques. The study samples were nationally representative and included in total over 300,000 individuals aged 18 and older. WHS is publicly available (registration required). The SAS code is included in the supplementary file, and the dataset for the illustrative example is available upon request.

Authors’ contributions

AW and RAN participated in the study conception, design, analysis and interpretation of results and drafting of the article. OAA supervised the study and participated in study conception, design and interpretation of results, and made critical revisions to the paper for important intellectual content. All authors read and approved the final manuscript.

Competing interest

The authors declare that they have no competing interests.

Consent for publication

Not applicable as this is a secondary data analysis project.

Ethics approval and consent to participate

Not applicable.

Author information

Corresponding author

Correspondence to Aolin Wang.

Additional file

Additional file 1:

The supplementary file presents two alternative ways of estimating ATT and ATU, the detailed g-computation steps, and the corresponding SAS code for the illustration. (DOCX 81 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Wang, A., Nianogo, R.A. & Arah, O.A. G-computation of average treatment effects on the treated and the untreated. BMC Med Res Methodol 17, 3 (2017).
