A regression model for risk difference estimation in population-based case–control studies clarifies gender differences in lung cancer risk of smokers and never smokers
 Stephanie A Kovalchik^{1},
 Sara De Matteis^{2},
 Maria Teresa Landi^{3},
 Neil E Caporaso^{3},
 Ravi Varadhan^{4},
 Dario Consonni^{2},
 Andrew W Bergen^{5},
 Hormuzd A Katki^{3} and
 Sholom Wacholder^{3}
DOI: 10.1186/1471-2288-13-143
© Kovalchik et al.; licensee BioMed Central Ltd. 2013
Received: 25 July 2013
Accepted: 7 November 2013
Published: 19 November 2013
Abstract
Background
Additive risk models are necessary for understanding the joint effects of exposures on individual and population disease risk. Yet technical challenges have limited the consideration of additive risk models in case–control studies.
Methods
Using a flexible risk regression model that allows additive and multiplicative components to estimate absolute risks and risk differences, we report a new analysis of data from the population-based case–control Environment And Genetics in Lung cancer Etiology study, conducted in Northern Italy between 2002 and 2005. The analysis provides estimates of the gender-specific absolute risk (cumulative risk) for non-smoking and smoking-associated lung cancer, adjusted for demographic, occupational, and smoking history variables.
Results
In the multiple-variable lexpit regression, the adjusted 3-year absolute risk of lung cancer in never smokers was 4.6 per 100,000 persons higher in women than men. However, the absolute increase in 3-year risk of lung cancer for every 10 additional pack-years smoked was less for women than men, 13.6 versus 52.9 per 100,000 persons.
Conclusions
In a Northern Italian population, the absolute risk of lung cancer among never smokers is higher in women than men but among smokers is lower in women than men. Lexpit regression is a novel approach to additive-multiplicative risk modeling that can contribute to clearer interpretation of population-based case–control studies.
Keywords
Additive risk; Absolute risk; Case–control study; EAGLE; Lung cancer; Risk assessment; Sex factors; Smoking
Background
The multiplicative model quantifies the joint effects of exposures on the relative risk of disease and is the mainstay of case–control analysis [1]. The contribution of the multiplicative model to studies of disease etiology is undeniable. However, there are several epidemiological questions that are more easily addressed with an additive risk model, where exposure effects are modeled on the absolute risk (probability) scale. In particular, additive risk models can clarify the public health significance of exposure effects [2, 3] and the interpretation of statistical interactions [4–6]. Despite these advantages, the technical difficulties of properly constraining risk estimates to the 0–1 range and a lack of software for constrained additive risk regression have hindered the use of additive risk models in case–control studies [7–9].
We recently encountered the challenge of additive risk modeling with case–control data in an investigation of gender differences in smoking-associated lung cancer in the Environment and Genetics in Lung cancer Etiology (EAGLE) Study, a population-based case–control study conducted in Northern Italy between 2002 and 2005 [10]. In a logistic regression analysis of never and ever smokers of the EAGLE Study, De Matteis and colleagues found evidence of an interaction between gender and pack-years smoked that suggested a higher susceptibility to lung cancer in men [11, 12]. The authors sought to quantify the public health implications of the gender differences they found by estimating absolute risk differences of lung cancer in men and women, adjusted for other confounders. The risk difference estimates could theoretically be obtained with an additive risk model; yet, unlike methods for multiplicative modeling, reliable methods for additive risk regression with case–control data were not available.
To address the challenge of absolute risk estimation in case–control studies, we present a novel regression approach to quantify risk difference associations with population-based case–control data using linear-expit (lexpit) regression. Lexpit regression is an additive-multiplicative risk model for a dichotomous outcome that can incorporate additive and multiplicative effects of risk factors and properly constrains risk estimates to a feasible range. We previously showed that lexpit regression addresses the main technical challenges to additive risk analysis of binary data in cohort studies [13]. Building on this earlier work, we extend lexpit regression to population-based case–control studies by incorporating sampling information into the estimation procedure. After describing the interpretation of lexpit regression and its methodology, we return to the question that motivated the development of these new methods and use the lexpit model to quantify confounder-adjusted risk difference effects of gender for smoking- and non-smoking-associated lung cancer in the EAGLE Study.
Methods
Study participants
Projected control population by gender, age, and regional sampling strata in the EAGLE Study. Each cell gives the population size N, with the number of sampled controls in parentheses.

| Gender | Age (years) | Milan | Monza | Brescia | Pavia | Varese |
|---|---|---|---|---|---|---|
| Male | 35–39 | 70,630 (3) | 12,352 (1) | 22,479 (1) | 9,202 (2) | 9,866 (1) |
| Male | 40–44 | 58,166 (9) | 10,045 (4) | 18,816 (2) | 7,958 (1) | 8,156 (3) |
| Male | 45–49 | 50,727 (25) | 9,143 (3) | 16,644 (3) | 7,046 (5) | 7,698 (2) |
| Male | 50–54 | 55,952 (56) | 9,677 (3) | 17,760 (18) | 7,508 (3) | 8,155 (7) |
| Male | 55–59 | 51,407 (145) | 8,281 (8) | 14,665 (33) | 5,951 (8) | 6,783 (25) |
| Male | 60–64 | 55,106 (209) | 9,083 (17) | 14,682 (47) | 6,765 (15) | 7,127 (20) |
| Male | 65–69 | 45,477 (251) | 7,043 (26) | 11,334 (42) | 5,855 (30) | 5,765 (41) |
| Male | 70–74 | 35,965 (242) | 5,423 (22) | 8,995 (30) | 4,518 (20) | 4,640 (27) |
| Male | 75–80 | 24,960 (149) | 3,430 (10) | 6,291 (18) | 3,417 (8) | 3,278 (22) |
| Male | Total | 448,390 (1,089) | 74,477 (94) | 131,666 (194) | 58,220 (92) | 61,468 (148) |
| Female | 35–39 | 68,084 (5) | 11,717 (1) | 20,391 (1) | 0 (0) | 9,509 (2) |
| Female | 40–44 | 57,734 (2) | 10,122 (1) | 17,455 (1) | 7,410 (4) | 8,061 (1) |
| Female | 45–49 | 53,942 (13) | 9,387 (4) | 16,248 (7) | 6,979 (3) | 7,754 (6) |
| Female | 50–54 | 63,060 (27) | 10,307 (1) | 17,739 (3) | 7,424 (7) | 8,545 (2) |
| Female | 55–59 | 58,781 (61) | 9,022 (2) | 15,140 (7) | 6,085 (5) | 7,099 (6) |
| Female | 60–64 | 63,452 (43) | 9,607 (5) | 15,885 (8) | 7,352 (4) | 7,675 (3) |
| Female | 65–69 | 56,296 (73) | 8,152 (5) | 12,439 (9) | 7,199 (7) | 6,822 (4) |
| Female | 70–74 | 50,119 (63) | 7,090 (3) | 13,220 (7) | 6,763 (4) | 6,083 (4) |
| Female | 75–80 | 43,166 (62) | 5,610 (1) | 11,221 (10) | 5,732 (3) | 5,404 (9) |
| Female | Total | 514,634 (349) | 81,014 (23) | 139,738 (53) | 49,212 (37) | 66,952 (37) |
Descriptive characteristics by gender for the EAGLE study. Lung cancer cases: n=1,943; controls: n=2,116. Values are counts with percentages in parentheses. Each P-value row gives one value for cases and one for controls.

| Characteristic | Cases: Male (n=1,537) | Cases: Female (n=406) | Controls: Male (n=1,617) | Controls: Female (n=499) |
|---|---|---|---|---|
| Age | | | | |
| 35–59 | 301 (20) | 123 (30) | 371 (23) | 172 (34) |
| 60–66 | 428 (28) | 91 (23) | 469 (29) | 104 (21) |
| 67–71 | 362 (24) | 77 (19) | 366 (23) | 94 (19) |
| 72+ | 446 (28) | 115 (28) | 411 (25) | 129 (26) |
| P-value | <0.001 | | <0.001 | |
| Education | | | | |
| None | 91 (6) | 21 (5) | 66 (4) | 24 (5) |
| Elementary | 625 (40) | 128 (32) | 431 (27) | 143 (29) |
| Middle school | 424 (28) | 134 (33) | 456 (28) | 158 (31) |
| High school or more | 397 (26) | 123 (30) | 664 (41) | 174 (35) |
| P-value | 0.005 | | 0.0975 | |
| Ever had list A/B job | | | | |
| Yes | 522 (34) | 27 (7) | 447 (28) | 28 (6) |
| No | 1,015 (66) | 379 (93) | 1,170 (72) | 471 (94) |
| P-value | <0.001 | | <0.001 | |
| ETS at the workplace | | | | |
| Yes | 1,180 (70) | 215 (54) | 1,127 (70) | 270 (54) |
| No | 357 (30) | 191 (46) | 490 (30) | 229 (46) |
| P-value | <0.001 | | <0.001 | |
| Smoking variables | | | | |
| Ever smoked cigars, pipes, or cigarillos | | | | |
| Yes | 267 (17) | 5 (1) | 309 (19) | 2 (0) |
| No | 1,270 (83) | 401 (99) | 1,308 (81) | 497 (100) |
| P-value | <0.001 | | <0.001 | |
| Smoking status | | | | |
| Never smoker | 29 (2) | 103 (25) | 397 (25) | 282 (57) |
| Former | 723 (47) | 116 (29) | 800 (49) | 110 (22) |
| Current | 785 (51) | 187 (46) | 420 (26) | 107 (21) |
| P-value | <0.001 | | <0.001 | |
| Pack-years (smokers only) | | | | |
| <5 | 38 (3) | 31 (10) | 265 (22) | 97 (45) |
| 5–19 | 82 (5) | 56 (18) | 199 (16) | 50 (23) |
| 20–39 | 421 (28) | 124 (41) | 413 (34) | 49 (22) |
| 40+ | 967 (64) | 92 (31) | 343 (28) | 21 (10) |
| P-value | <0.001 | | <0.001 | |
| Years since quitting (quitters only) | | | | |
| <10 | 337 (52) | 60 (47) | 158 (20) | 28 (25) |
| 10+ | 386 (48) | 56 (53) | 642 (80) | 82 (75) |
| P-value | 0.3557 | | 0.2059 | |
| Avg. percent inhaled (smokers only) | | | | |
| 25 | 1 (0) | 5 (2) | 3 (0) | 1 (0) |
| 50 | 62 (4) | 17 (6) | 36 (3) | 11 (5) |
| 75 | 449 (30) | 113 (37) | 295 (24) | 53 (24) |
| 100 | 995 (66) | 168 (55) | 886 (73) | 152 (70) |
| P-value | <0.001 | | 0.3905 | |
Lexpit model
Description
The lexpit model expresses the cumulative risk of disease R(x, z) as

$$R(x, z) = \beta' x + \operatorname{expit}(\gamma_0 + \gamma' z), \qquad (1)$$

a sum of additive $\beta' x$ and multiplicative $\operatorname{expit}(\gamma_0 + \gamma' z)$ components, where $\operatorname{expit}(u) = \exp(u)/(1 + \exp(u))$ is the inverse-logit (expit) function, which converts the log-odds u to the risk scale. An incidence rate can also be derived by dividing the cumulative risk R(x, z) by the length of the risk period τ, giving R(x, z)/τ, under the assumption of constant risk over the risk period. For case–control study designs, the risk period length τ is typically equal to the duration of case ascertainment. When the additive terms of lexpit are set to zero, the model reduces to a strictly multiplicative logistic model; when the multiplicative terms are set to zero, the model reduces to a strictly additive binomial linear model.
The “additive” and “multiplicative” descriptions of the lexpit model coefficients refer to the effects of the x and z variables on the baseline risk, that is, the cumulative risk of disease in unexposed individuals, denoted $R_0 = \operatorname{expit}(\gamma_0)$. According to (1), increasing one x factor by one unit, from $x_1 - 1$ to $x_1$, while holding all other factors fixed raises the risk by the corresponding β coefficient; thus, each β coefficient is a risk difference associated with a unit increase in the corresponding x factor, after adjusting for all remaining x and z factors.
In Equation (1), the risk in a person with $z = z_1$ is a multiplicative factor of the baseline risk that, when risks are low, is approximately equal to $\exp(\gamma' z_1)$. The exponentiated value of each γ coefficient estimates the residual odds ratio associated with a unit increase in the corresponding z variable, after adjustment for the risk due to x exposures and the effects of the remaining z variables. In logistic regression, coefficients are the adjusted log-odds ratios of odds having the form $R(x, z)/(1 - R(x, z))$. In lexpit regression, the log-odds ratios represented by the coefficients γ involve odds of the form $(R(x, z) - \beta' x)/(1 - (R(x, z) - \beta' x))$, where $R(x, z) - \beta' x$ is the risk that remains after subtracting the risk due to x exposures. Hence, we refer to the exponentiated coefficients γ of the lexpit model as “residual odds ratios”. These residual odds ratios are directly comparable to the odds ratio associations in a logistic regression model of z exposures fit to the subgroup of individuals without exposure to the x variables, i.e., with x = 0.
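The residual odds ratio interpretation follows directly from Equation (1). The short derivation below is a sketch written in the notation of this section; $z_a$ and $z_b$ are two hypothetical exposure profiles introduced only for illustration.

```latex
\begin{align*}
R(x, z) - \beta' x &= \operatorname{expit}(\gamma_0 + \gamma' z) \\
\operatorname{logit}\{R(x, z) - \beta' x\} &= \gamma_0 + \gamma' z \\
\operatorname{logit}\{R(x, z_a) - \beta' x\} - \operatorname{logit}\{R(x, z_b) - \beta' x\}
  &= \gamma' (z_a - z_b)
\end{align*}
```

When $z_a$ and $z_b$ differ by one unit in a single z variable, the last line reduces to the corresponding γ coefficient, so its exponential is the residual odds ratio for that variable.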
The baseline risk parameter $\gamma_0$ is included in the expit for mathematical convenience, as no constraints are required to ensure that $R_0 = \operatorname{expit}(\gamma_0)$ is within the feasible 0–1 probability range.
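The lexpit risk function described in this section can be sketched in a few lines of Python. This is an illustration only, not the authors' blm implementation: the function names and all coefficient values are hypothetical, chosen so that the baseline risk is 0.001.

```python
import math


def expit(u):
    """Inverse-logit: converts a log-odds u to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


def lexpit_risk(x, z, beta, gamma0, gamma):
    """Cumulative risk R(x, z) = beta'x + expit(gamma0 + gamma'z)."""
    additive = sum(b * xi for b, xi in zip(beta, x))
    log_odds = gamma0 + sum(g * zi for g, zi in zip(gamma, z))
    return additive + expit(log_odds)


# Hypothetical coefficients, chosen only for illustration.
beta = [0.0005]                     # additive effect: risk difference per unit of x
gamma0 = math.log(0.001 / 0.999)    # baseline log-odds, so that R0 = 0.001
gamma = [0.05]                      # multiplicative effect: residual log-odds ratio

baseline_risk = lexpit_risk([0], [0], beta, gamma0, gamma)  # R0
exposed_risk = lexpit_risk([1], [0], beta, gamma0, gamma)   # R0 + beta
incidence_rate = exposed_risk / 3.0  # per year, assuming a risk period tau = 3 years

# With the additive coefficient set to zero, the model is purely logistic.
logistic_only = lexpit_risk([1], [10], [0.0], gamma0, gamma)
```

Raising x by one unit shifts the risk by exactly the β coefficient, while the z variables act only through the expit term.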
Example 1: Interpretation of lexpit model coefficients
Consider a lexpit model for the 3-year risk of lung cancer,

$$R(x_1, z_1) = \beta x_1 + \operatorname{expit}(\gamma_0 + \gamma z_1), \qquad (2)$$

with univariate additive term $x_1$ for gender (1 = female, 0 = male) and univariate multiplicative term $z_1$ = Years Smoked, a continuous variable. Under model (2), the 3-year risk difference between a woman and a man with equal years smoked is $R(1, z_1) - R(0, z_1) = \beta$. Thus, β is the difference in lung cancer risk between women and men, adjusted for smoking.
Next we consider the independent effect of a 30-year smoking duration. Under model (2), the residual log-odds for a person who has smoked for 30 years is $\operatorname{logit}(R(0, 30)) = \gamma_0 + 30\gamma$ for a man and $\operatorname{logit}(R(1, 30) - \beta) = \gamma_0 + 30\gamma$ for a woman. For each, the difference in the residual log-odds compared to a never smoker (the log-odds ratio) is 30γ. Thus, γ represents the increase in the residual log-odds of lung cancer associated with each additional year of smoking, adjusted for gender.
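The two identities in Example 1 can be checked numerically. The sketch below uses hypothetical values for β, γ_0, and γ (they are not estimates from the EAGLE Study) and confirms that the gender risk difference equals β and that the residual log-odds ratio for 30 years of smoking equals 30γ for both men and women.

```python
import math


def expit(u):
    """Inverse-logit function."""
    return 1.0 / (1.0 + math.exp(-u))


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients for model (2); illustration only.
beta, gamma0, gamma = 0.0002, -7.0, 0.04


def risk(x1, z1):
    """Model (2): x1 is gender (1 = female, 0 = male), z1 is years smoked."""
    return beta * x1 + expit(gamma0 + gamma * z1)


# Risk difference between a woman and a man with equal years smoked.
risk_difference = risk(1, 30) - risk(0, 30)

# Residual log-odds ratio for 30 years smoked versus never smoked.
log_or_man = logit(risk(0, 30)) - logit(risk(0, 0))
log_or_woman = logit(risk(1, 30) - beta) - logit(risk(1, 0) - beta)
```

For women the additive term β is subtracted before taking the logit, which is what makes the odds ratio a residual one.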
Estimation
Application of lexpit regression to population-based case–control data can generate absolute risk and risk difference estimates when an unbiased representation of the underlying population is available. As with expansion estimators in survey estimation [15], weighting each observation by its inverse sampling fraction yields, roughly, an estimate of the number of individuals it represents in the “study base” [16]. To accommodate stratified sampling, we suppose the study base consists of J strata. Let ij index the ith individual within the jth stratum. The data vector for this individual is $\{y_{ij}, x_{ij}, z_{ij}, w_{ij}\}$, where $y_{ij}$ indicates case status, $x_{ij}$ are additive risk factors, $z_{ij}$ are multiplicative risk factors, and $w_{ij}$ is the sampling weight. For a population-based case–control study with complete case ascertainment and random sampling of controls within strata, the sample weights equal 1 for all cases and, for controls in stratum j, the ratio $N_j/n_j$ of the population size $N_j$ to the number of sampled controls $n_j$. The use of inverse probabilities as sampling weights allows our methodology to accommodate more complex case–control designs (frequency matching, individual matching, etc.).
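The weighting rule above can be sketched as follows. The two strata are taken from the table of projected control populations (Milan, ages 55–59); the dictionary layout and function name are hypothetical conveniences, not part of the authors' software.

```python
# Population size N_j and number of sampled controls n_j for two strata
# from the projected control population table (Milan, ages 55-59).
strata = {
    ("Milan", "Male", "55-59"): (51407, 145),
    ("Milan", "Female", "55-59"): (58781, 61),
}


def sampling_weight(is_case, stratum):
    """Weight is 1 for a case and N_j / n_j for a control in stratum j."""
    if is_case:
        return 1.0
    population_size, n_controls = strata[stratum]
    return population_size / n_controls


w_case = sampling_weight(True, ("Milan", "Male", "55-59"))
w_control_male = sampling_weight(False, ("Milan", "Male", "55-59"))
w_control_female = sampling_weight(False, ("Milan", "Female", "55-59"))

# Summing the control weights within a stratum recovers its population size.
recovered_population = 145 * w_control_male
```

Each sampled male control in this stratum stands in for roughly 355 people, and each sampled female control for roughly 964, which reflects the lower sampling fraction among women.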
The quantities X and Z refer to the complete set of risk factors in the study sample. The feasible region is constructed from the joint distribution of (X, Z), creating a separate constraint for each unique combination of observed x and z factors. The feasible region guarantees that the risk estimate for each observed exposure type is a population probability.
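One way to picture the feasible region is as a check applied to every unique observed exposure pattern. The sketch below assumes the constraint is 0 ≤ R(x, z) ≤ 1 for each pattern; the function name, coefficients, and patterns are hypothetical.

```python
import math


def expit(u):
    """Inverse-logit function."""
    return 1.0 / (1.0 + math.exp(-u))


def is_feasible(beta, gamma0, gamma, patterns):
    """Return True if every observed (x, z) pattern has a risk in [0, 1]."""
    for x, z in patterns:
        additive = sum(b * xi for b, xi in zip(beta, x))
        risk = additive + expit(gamma0 + sum(g * zi for g, zi in zip(gamma, z)))
        if not 0.0 <= risk <= 1.0:
            return False
    return True


# Hypothetical unique exposure patterns: x is a 0/1 factor, z is years exposed.
patterns = [((0,), (0,)), ((1,), (0,)), ((0,), (40,)), ((1,), (40,))]

# A small negative additive effect keeps all risks positive ...
feasible = is_feasible([-0.0005], -7.0, [0.05], patterns)
# ... but a larger one drives the risk for x = 1, z = 0 below zero.
infeasible = is_feasible([-0.002], -7.0, [0.05], patterns)
```

Only the additive coefficients can push a risk outside the 0–1 range, because the expit term always lies strictly between 0 and 1.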
To impose the conditions of the feasible region, we have adapted a constrained optimization algorithm previously developed for cohort analyses [13]. Lexpit methods for case–control data are similar to regression methods for survey data: both use sample weights to make the risk estimates representative of the underlying population, and both require the same design considerations for accurate estimation of the standard errors of estimates. We therefore use influence-based methods, a common approach for linearized variance estimation of survey statistics [18], to derive variances for the lexpit model’s risk estimates. In Additional file 1 we summarize the optimization algorithm and the influence approach for obtaining variance estimates for the lexpit model parameters.
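The exact objective function and algorithm are given in Additional file 1 and are not reproduced here. As a rough sketch of the survey-style idea, the code below assumes a weighted binomial log-likelihood in which each record contributes its log-likelihood multiplied by its sampling weight; this form, the function names, and the data are assumptions made for illustration.

```python
import math


def expit(u):
    """Inverse-logit function."""
    return 1.0 / (1.0 + math.exp(-u))


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1.0 - p))


def weighted_loglik(beta, gamma0, gamma, records):
    """Assumed objective: sum of w * [y log R + (1 - y) log(1 - R)].

    Each record is (y, x, z, w). Returns -inf outside the feasible region.
    """
    total = 0.0
    for y, x, z, w in records:
        additive = sum(b * xi for b, xi in zip(beta, x))
        risk = additive + expit(gamma0 + sum(g * zi for g, zi in zip(gamma, z)))
        if not 0.0 < risk < 1.0:
            return float("-inf")
        total += w * (y * math.log(risk) + (1 - y) * math.log(1.0 - risk))
    return total


# Hypothetical intercept-only data: 10 cases (weight 1) and 2 sampled controls
# that each represent 495 people, for a study base of 1,000 and a risk of 0.01.
records = [(1, [], [], 1.0)] * 10 + [(0, [], [], 495.0)] * 2

ll_at_weighted_risk = weighted_loglik([], logit(0.01), [], records)
ll_elsewhere = weighted_loglik([], logit(0.05), [], records)
ll_infeasible = weighted_loglik([-0.5], logit(0.01), [], [(1, [1], [], 1.0)])
```

In this toy example the objective is largest when the baseline risk equals the weighted proportion of cases, 10/1,000, which is the population risk the weights are meant to recover.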
Example 2: Lexpit model estimation for case–control data
A risk difference estimate of $\widehat{\beta}=-1.5/1,000$ represents a 0.15% lower risk for women than men. This example illustrates how the sampling probabilities of a case–control study, when available, can be used to obtain population risk estimates.
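To make the idea concrete, the sketch below computes a crude weighted risk difference from case–control counts. All counts and weights are hypothetical; they were chosen only so that the result matches the value of −1.5 per 1,000 quoted above.

```python
def weighted_risk(n_cases, control_weights):
    """Crude population risk: cases / (cases + population represented by controls)."""
    population = n_cases + sum(control_weights)
    return n_cases / population


# Hypothetical data: 100 sampled controls per gender, each with its sampling weight.
risk_women = weighted_risk(30, [99.70] * 100)   # 30 cases in about 10,000 women
risk_men = weighted_risk(45, [99.55] * 100)     # 45 cases in about 10,000 men

# Risk difference, women minus men.
risk_difference = risk_women - risk_men
```

Without the weights, the same counts would give risks of 30/130 and 45/145, which describe the sample rather than the population.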
Choice of additive and multiplicative effects
The flexibility of lexpit regression in allowing estimation of the effect of an exposure as additive, multiplicative, or (in some cases) both, can create uncertainty about an exposure’s true mode of effect. Although the true mode of effect can never be known, we provide three practical strategies to explore the functional form of a given risk-exposure relationship: a risk-exposure scatter plot that gives a graphical depiction of the relationship between crude risk and a continuous exposure, a testing method based on the comparison of effects in a lexpit model with both additive and multiplicative effects of an exposure, and a measure of goodness-of-fit. Details of each approach are provided in Additional file 1.
Results
Representation of variables included in regression analyses of the EAGLE study

| Factor | Representation | Values |
|---|---|---|
| Pack-years^{a} | Continuous | |
| Female | Categorical | Male = 0; Female = 1 |
| Age | Continuous | Years |
| Education | Trend | None = 0; Elementary = 1; Middle school = 2; High school or more = 3 |
| Smoked cigars, pipes, cigarillos | Categorical | Never Smoked = 0; Smoked = 1 |
| ETS in the workplace | Categorical | No ETS = 0; ETS = 1 |
| High-risk occupation^{b} | Categorical | No occupation = 0; Occupation = 1 |
| Average percent inhaled | Trend | Never smoker = 0; <25% = 1; 25–49% = 2; 50–74% = 3; 75–100% = 4 |
| Years since quitting | Continuous | Years |
Lexpit regression analysis of the EAGLE Study

| Factor | 3-year risk difference (per 100,000) | 95% confidence interval | Residual odds ratio | 95% confidence interval |
|---|---|---|---|---|
| Female | 4.6 | (−1.8, 11.0) | | |
| Pack-years (per 10 pack-years) | 52.9 | (31.9, 73.8) | | |
| Female x Pack-years | −39.3 | (−70.1, −8.6) | | |
| Age – 60^{a} | | | 1.12 | (1.10, 1.13) |
| Education – 1^{b} | | | 0.69 | (0.60, 0.80) |
| High-risk occupation^{c} | | | 1.01 | (0.72, 1.41) |
| Occupational ETS | | | 1.54 | (0.72, 1.41) |
| Cigars, pipes, cigarillos | | | 1.15 | (0.86, 1.53) |
| Average percent inhaled | | | 2.19 | (1.99, 2.41) |
| Years since quitting | | | 0.94 | (0.93, 0.95) |
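The additive coefficients in the table combine by simple arithmetic. The sketch below reproduces the gender-specific pack-year slopes and compares the additive component of risk for a man and a woman with equal smoking histories; the function and constant names are hypothetical, and the point estimates are used without their uncertainty.

```python
# Additive coefficients from the lexpit analysis (3-year risk per 100,000).
RD_FEMALE = 4.6
RD_PACK_YEARS = 52.9               # per 10 pack-years
RD_FEMALE_X_PACK_YEARS = -39.3     # per 10 pack-years


def additive_component(female, pack_years):
    """Additive part of the 3-year risk (per 100,000) for a given profile."""
    units = pack_years / 10.0
    return (RD_FEMALE * female
            + RD_PACK_YEARS * units
            + RD_FEMALE_X_PACK_YEARS * female * units)


slope_men = RD_PACK_YEARS                             # per 10 pack-years
slope_women = RD_PACK_YEARS + RD_FEMALE_X_PACK_YEARS  # per 10 pack-years

# Excess in never smokers: women minus men.
never_smoker_gap = additive_component(1, 0) - additive_component(0, 0)

# Excess at 30 pack-years: men minus women.
gap_30_pack_years = additive_component(0, 30) - additive_component(1, 30)

# Male-female difference in the effect of a single pack-year.
per_pack_year_gap = -RD_FEMALE_X_PACK_YEARS / 10.0
```

With these point estimates the slope for women is 13.6 per 100,000 per 10 pack-years, the male excess at 30 pack-years is about 113 per 100,000, and the never-smoker excess in women (4.6) is close to the male-female difference for one pack-year (about 3.9). Because the multiplicative terms are the same for two otherwise similar people, they cancel in these differences.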
As one assessment of the improvement in model fit gained from the multiplicative effects, we compared the weighted Hosmer-Lemeshow goodness-of-fit statistic (Additional file 1: Section S3) among the lexpit model, a strictly additive blm model, and a strictly multiplicative logistic model of the same variables. The chi-squared statistic was 20.8 for the blm model, 18.2 for the weighted logistic model, and 15.9 for the lexpit model, indicating an improvement in fit with the additive-multiplicative form we used.
Discussion
We have presented lexpit regression methods to estimate adjusted absolute risk differences with population-based case–control data. By shifting the focus from estimates of relative risk to absolute risk, lexpit regression gives epidemiologists a direct and reliable way to assess the public health significance of an exposure’s effect. Moreover, lexpit regression provides a flexible framework for handling potential confounders, as variables with additive or multiplicative effects can be accommodated. For cases where there is uncertainty about a variable’s mode of effect, we outlined approaches to assess the reasonableness of each effect type. Our open-source R package blm allows the new methods to be implemented with the ease of standard logistic regression.
Lexpit regression is the absolute risk analog to additive-multiplicative models for hazard rates, such as the Cox-Aalen model [21], which have become increasingly popular in the survival literature [22]. Both classes of models share the strength of greater flexibility in studying and representing the joint effects of risk factors: on the hazard rate in the case of the Cox-Aalen model, and on the absolute risk of disease in the case of the lexpit model. The extension of additive-multiplicative models to absolute risk estimation from a variety of study designs is significant because of the importance of individualized risk assessment to public health. To our knowledge, the lexpit model is the first additive-multiplicative regression model of risk that appropriately ensures risk estimates are within the probability scale. Although alternative additive-multiplicative models of risk could be developed by considering other functions for the multiplicative component (e.g. exp), we have focused on the expit function because of its mathematical advantages. Because of the expit function, the lexpit model will require fewer constraints than alternative additive-multiplicative models to produce feasible estimates in the 0–1 probability range.
None of more than 20 published observational studies that have examined male–female differences in lung cancer etiology has quantified the independent effect of gender on the absolute risk of smoking- and non-smoking-associated lung cancer [23–26]. Using lexpit regression, we were able to address this important public health question. Our findings add to the De Matteis et al. logistic regression of the EAGLE case–control study [11] in two important ways. First, we confirmed that gender differences in the confounder-adjusted effect of pack-years are found on the additive risk scale. Second, we found suggestive evidence that women’s risk of lung cancer is higher than men’s in never smokers but is lower than men’s in smokers. Conventional unconditional logistic regression, which does not provide estimates of absolute risk, would not identify these findings, especially given that gender was used as a matching variable in selecting controls. Thus, our novel methods provide further insight into male–female differences in lung cancer risk from previously analyzed data, with direct public health implications.
In their commentary on the De Matteis et al. study, Alberg and colleagues pointed to a need to further delineate the clinical significance of gender differences in lung cancer etiology [12]. Our reanalysis of the EAGLE Study clarifies the clinical relevance of gender effects for lung cancer risk in an Italian population by providing estimates of the excess lung cancer risk associated with gender. The small excess risk in women never smokers suggests that some gender-related etiological factor(s) for non-smoking-related lung cancer remains to be identified. A public health implication of the gender differences we found among smokers concerns selection criteria for computed tomographic lung cancer screening. Current guidelines recommend screening for individuals between ages 55 and 75 years with a minimum of 30 pack-years smoked [27]. However, in an Italian population, we estimate that the excess lung cancer risk for a male 30 pack-year smoker is more than 100 per 100,000 greater than that of an otherwise similar female 30 pack-year smoker. Thus, in keeping with the “equal management for equal risk” principle [28], gender-based risk criteria for lung cancer screening selection may be warranted in some populations.
The implications of the EAGLE lexpit analysis for computed tomographic screening guidelines exemplify how much the choice of measure of association in an etiological analysis matters for understanding the public health significance of a risk factor’s effect. Risk differences measure a risk factor’s effect in terms of the number of excess attributable cases in a well-defined population, an explicit measure of the public health significance of an effect, which can be compared across exposures and across diseases. Our study provides an important example of this comparative use of risk differences with respect to gender effects in smoking- and non-smoking-associated lung cancer. Some research has suggested a higher risk of lung cancer among women never smokers [29]. We further elucidated this difference through lexpit analysis by showing that the excess risk in women never smokers was approximately equal to the excess risk with 1 additional pack-year smoked in men as compared to women. As the development of public health interventions and clinical recommendations becomes increasingly guided by individual risk assessment, there will be a growing need for methods like lexpit regression that can facilitate the estimation of absolute risk differences from observational data.
Lexpit regression resolves several limitations of alternative strategies for estimating risk differences from case–control studies. Using non-additive models of risk, such as the logistic model, to estimate a marginal risk difference [30, 31] gives an average over the study population, which is not equivalent to the risk difference effects estimated here. The application of the lexpit model to case–control data extends previously proposed methods for absolute risk estimation that require prospective cohorts or disease registries [32]. Further, lexpit regression advances current methods for assessing additive interactions in case–control studies. It is well known that multiplicative interactions sometimes disappear when modeled on the additive scale [4–6, 33] and vice versa, highlighting the dependence of statistical interactions on the choice of a model’s scale. The removal of interactions leads to more parsimonious models whose risk associations have a clearer interpretation. The flexible additive-multiplicative form of the lexpit can help epidemiologists reduce multiplicative and additive statistical interactions, making it easier to interpret risk effects. While departure from additivity can be detected on the relative risk scale using the relative excess risk due to interaction, this metric is limited because it can only detect the direction of departure from additivity but not the magnitude of the effect [34, 35].
While lexpit regression makes the important advance of allowing case–control studies to make inferences about absolute risk and risk differences of exposures, there are several challenges to its application to case–control data. First, the period of risk for the cumulative risk estimates of the lexpit model is determined by the period of case ascertainment, which may generally prohibit long-term risk estimates. As with other common probability models of case–control data, the lexpit model assumes the population risk of disease is fixed during the period cases and controls are sampled. The population validity of lexpit regression also requires accurate sampling weights, which may be difficult to obtain for studies using a so-called “secondary base” [36], as with hospital or registry controls, for the selection of controls. Further investigation of the availability and accuracy of sampling information in case–control studies is needed to clarify the practical limitations of using sampling data for absolute risk estimation.
Conclusions
Additive and multiplicative models concern “two quite different aspects of the association between risk factor and disease” [1], p. 58. Epidemiologists have been urged to consider both perspectives in risk association studies, especially in the assessment of effect modification [6], yet technical challenges have long made multiplicative models more convenient to use. In this paper, we have presented methods and software [20] to allow analyses of population-based case–control studies to incorporate these complementary perspectives into a single model via lexpit regression. Further applications and extensions of additive risk modeling with case–control data will help to improve our understanding of the joint effects of exposures on disease risk.
Abbreviations
EAGLE: Environment and Genetics in Lung cancer Etiology
Lexpit: Linear-expit
Declarations
Acknowledgement
This work was supported by the Intramural Research Program of the National Institutes of Health, National Cancer Institute, Division of Cancer Epidemiology and Genetics. Dr. Varadhan is a Brookdale Leadership in Aging Fellow at the Johns Hopkins University School of Medicine.
References
1. Breslow NE, Day NE: Statistical Methods in Cancer Research, Vol. I. The Design and Analysis of Case–control Studies, IARC Scientific Publication No. 32. 1980, New York, NY: Oxford University Press
2. Greenland S: Interpretation and choice of effect measures in epidemiologic analyses. Am J Epidemiol. 1987, 125 (5): 761–768.
3. Sackett DL, Deeks JJ, Altman DG: Down with odds ratios!. Evid Based Med. 1996, 1: 164–166.
4. Skrondal A: Interaction as departure from additivity in case–control studies: a cautionary note. Am J Epidemiol. 2003, 158: 251–258. 10.1093/aje/kwg113.
5. Rothman KJ: Causes. Am J Epidemiol. 1976, 104: 587–593.
6. Knol MJ, VanderWeele TJ: Recommendations for presenting analyses of effect modification and interaction. Int J Epidemiol. 2012, 41: 514–520. 10.1093/ije/dyr218.
7. Wacholder S: The case–control study as data missing by design: estimating risk differences. Epidemiol. 1996, 7 (2): 144–150. 10.1097/00001648-199603000-00007.
8. Wacholder S: Binomial regression in GLIM: estimating risk ratios and risk differences. Am J Epidemiol. 1986, 123: 174–184.
9. Spiegelman D, Hertzmark E: Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol. 2005, 162 (3): 199–200. 10.1093/aje/kwi188.
10. Landi MT, Consonni D, Rotunno M, Bergen AW, Goldstein AM, Lubin JH, Goldin L, Alavanja M, Morgan G, Subar AF, Linnoila I, Previdi F, Corno M, Rubagotti M, Marinelli B, Albetti B, Colombi A, Tucker M, Wacholder S, Pesatori AC, Caporaso NE, Bertazzi PA: Environment and genetics in lung cancer etiology (EAGLE) study: an integrative population-based case–control study of lung cancer. BMC Public Health. 2008, 8: 203. 10.1186/1471-2458-8-203.
11. De Matteis S, Consonni D, Pesatori AC, Bergen AW, Bertazzi PA, Caporaso NE, Lubin JH, Wacholder SW, Landi MT: Are women who smoke at higher risk for lung cancer than men who smoke?. Am J Epidemiol. 2013, 177 (7): 601–612. 10.1093/aje/kws445.
12. Alberg AJ, Wallace K, Silvestri GA, Brock MV: Invited commentary: the etiology of lung cancer in men compared with women. Am J Epidemiol. 2013, 177 (7): 613–616. 10.1093/aje/kws444.
13. Kovalchik SA, Varadhan R, Fetterman B, Poitras NE, Wacholder S, Katki HA: A general binomial regression model to estimate standardized risk differences from binary response data. Stat Med. 2013, 32: 808–821. 10.1002/sim.5553.
14. Consonni D, De Matteis S, Lubin JH, Wacholder S, Tucker M, Pesatori AC, Caporaso NE, Bertazzi PA, Landi MT: Lung cancer and occupation in a population-based case–control study. Am J Epidemiol. 2010, 171 (3): 323–333. 10.1093/aje/kwp391.
15. Horvitz DG, Thompson DJ: A generalization of sampling without replacement from a finite universe. J Am Stat Assoc. 1952, 47: 663–685. 10.1080/01621459.1952.10483446.
16. Wacholder S, Silverman DT, McLaughlin JK, Mandel JS: Selection of controls in case–control studies: 2. Types of controls. Am J Epidemiol. 1992, 135 (9): 1029–1041.
17. Benichou J, Wacholder S: A comparison of 3 approaches to estimate exposure-specific incidence rates from population-based case–control data. Stat Med. 1994, 13: 651–661. 10.1002/sim.4780130526.
18. Graubard BI, Fears TR: Standard errors for attributable risk for simple and complex sample designs. Biometrics. 2005, 61 (3): 847–855. 10.1111/j.1541-0420.2005.00355.x.
19. R Development Core Team: R: A Language and Environment for Statistical Computing. 2012, Vienna, Austria: R Foundation for Statistical Computing
20. Kovalchik SA, Varadhan R: Fitting additive binomial regression models with the R package blm. J Stat Softw. 2013, 54 (1): 1–18.
21. Martinussen T, Scheike TH: A flexible additive-multiplicative hazard model. Biometrika. 2002, 89 (2): 283–298. 10.1093/biomet/89.2.283.
22. Cortese G, Scheike TH, Martinussen T: Flexible survival regression modelling. Stat Methods Med Res. 2010, 19 (1): 5–28. 10.1177/0962280209105022.
23. Blot WJ, McLaughlin JK: Are women more susceptible to lung cancer?. J Natl Cancer Inst. 2004, 96 (11): 812–813. 10.1093/jnci/djh180.
24. Khuder SA: Effect of cigarette smoking on major histological types of lung cancer: a meta-analysis. Lung Cancer. 2001, 31 (2–3): 139–148.
25. Bain C, Feskanich D, Speizer FE, Thun M, Hertzmark E, Rosner BA, Colditz GA: Lung cancer rates in men and women with comparable histories of smoking. J Natl Cancer Inst. 2004, 96 (11): 826–834. 10.1093/jnci/djh143.
26. Gandini S, Botteri E, Iodice S, Boniol M, Lowenfels AB, Maisonneuve P, Boyle P: Tobacco smoking and cancer: a meta-analysis. Int J Cancer. 2008, 122 (1): 155–164. 10.1002/ijc.23033.
27. Boiselle PM: Computed tomography screening for lung cancer. JAMA. 2013, 309: 1163–1170. 10.1001/jama.2012.216988.
28. Katki HA, Schiffman M, Castle PE, Fetterman B, Poitras NE, Lorey T, Cheung LC, Raine-Bennett T, Gage JC, Kinney WK: Five-year risks of CIN 2+ and CIN 3+ among women with HPV-positive and HPV-negative LSIL pap results. J Low Genit Tract Dis. 2013, 17: S43–S49.
29. Wakelee H, Chang E, Gomez S, Keegan T, Feskanich D, Clarke C, Holmberg L, Yong L, Kolonel L, Gould M, et al: Lung cancer incidence in never smokers. J Clin Oncol. 2007, 25 (5): 472–478. 10.1200/JCO.2006.07.2983.
30. Greenland S, Holland P: Estimating standardized risk differences from odds ratios. Biometrics. 1991, 47 (1): 319–322. 10.2307/2532517.
31. Greenland S: Model-based estimation of relative risks and other epidemiologic measures in studies of common outcomes and in case–control studies. Am J Epidemiol. 2004, 160 (4): 301–305. 10.1093/aje/kwh221.
32. Benichou J, Gail MH: Methods of inference for estimates of absolute risk derived from population-based case–control studies. Biometrics. 1995, 51 (1): 182–194. 10.2307/2533324.
33. Marschner IC, Gillett AC, O’Connell RL: Stratified additive Poisson models: computational methods and applications in clinical epidemiology. Comput Stat Data Anal. 2012, 56 (5): 1115–1130. 10.1016/j.csda.2011.08.002.
34. Knol MJ, van der Tweel I, Grobbee DE, Numans ME, Geerlings MI: Estimating interaction on an additive scale between continuous determinants in a logistic regression model. Int J Epidemiol. 2007, 36: 1111–1118. 10.1093/ije/dym157.
35. Richardson DB, Kaufman JS: Estimation of the relative excess risk due to interaction and associated confidence bounds. Am J Epidemiol. 2009, 169 (6): 756–760. 10.1093/aje/kwn411.
36. Wacholder S, McLaughlin JK, Silverman DT: Selection of controls in case–control studies: 1. Principles. Am J Epidemiol. 1992, 135 (9): 1019–1028.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/13/143/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.