
# Case-only approach applied in environmental epidemiology: 2 examples of interaction effect using the US National Health and Nutrition Examination Survey (NHANES) datasets

*BMC Medical Research Methodology*
**volume 22**, Article number: 254 (2022)

## Abstract

### Introduction

By substituting the general ‘susceptibility factor’ concept for the conventional ‘gene’ concept in the case-only approach for gene-environment interaction, the case-only approach can also be used in environmental epidemiology. Under the independence between the susceptibility factor and environmental exposure, the case-only approach can provide a more precise estimate of an interaction effect.

### Methods

Two analysis examples of the case-only approach in environmental epidemiology are provided using the 2015–2016 and 2017–2018 US National Health and Nutrition Examination Survey (NHANES): (i) the negative interaction effect between blood chromium level and glycohemoglobin level on albuminuria and (ii) the positive interaction effect between blood cobalt level and old age on albuminuria. The second part of the methods (theoretical backgrounds) summarizes the logic and equations provided in previous studies on the case-only approach.

### Results

(i) For a simultaneous 1 μg/L difference in blood chromium level and 1% difference in blood glycohemoglobin level, the multiplicative interaction contrast ratio estimated from cases and non-cases (ICR_{c/nc}) was 0.72 (95% CI 0.35–1.60), which was not statistically significant. However, when only the cases were analyzed, the case-only ICR (ICR_{CO}) was 0.59 (95% CI 0.28–0.95), which was statistically significant (a negative interaction effect). (ii) For a simultaneous 1 μg/L difference in blood cobalt level and 1-year difference in age, the ICR_{c/nc} was 1.13 (95% CI 0.99–1.37), which was not statistically significant. However, when only the cases were analyzed, the case-only ICR (ICR_{CO}) was 1.21 (95% CI 1.06–1.51), which was statistically significant (a positive interaction effect).

### Discussion

The discussion presents the theoretical background and previous literature on the possible protective interaction effect between blood chromium levels and blood glycohemoglobin levels on the incidence of albuminuria and on the possible aggravating interaction effect between blood cobalt levels and increasing age on the incidence of albuminuria. If the independence assumption between a susceptibility factor and environmental exposure holds in a study with cases and non-cases, the case-only approach can provide a more precise interaction effect estimate than conventional approaches with both cases and non-cases.

## Introduction

The estimation of an interaction effect has often been conducted in cohort or case-control studies using information from both cases and controls [1,2,3,4]. However, a case-only approach can be a valid alternative and, under certain circumstances, may even have advantages over conventional approaches that use information from both cases and controls.

The case-only approach is used to calculate the interaction effect estimate. This unique approach is mainly used in gene-environmental and gene-gene interaction studies in genetic epidemiology [5,6,7,8]. However, if the ‘gene’ concept in the gene-environmental interaction could indicate a type of ‘susceptibility factor,’ the term ‘gene-environment interaction’ in genetic epidemiology can be replaced with the ‘susceptibility factor-environmental exposure interaction’ in environmental epidemiology.

The case-only approach can provide 2 benefits over a study with cases and non-cases (conventional cohort/case-control studies) when estimating the interaction effect between a susceptibility factor and an environmental exposure [5, 7,8,9,10,11,12,13]. The first is that a more precise interaction effect estimate can be calculated. The second is that this approach can estimate the interaction effect when appropriate controls are unavailable. However, the case-only approach requires an important condition between the susceptibility factor and the environmental exposure studied: independence [5, 14]. If this independence assumption between a susceptibility factor and an environmental exposure is not fulfilled, the case-only interaction estimate might be severely biased relative to the interaction effect estimate acquired from a study with cases and non-cases.

This study will summarize the logic, definitions, and equations of the case-only approach across study types: case-only studies and studies with cases and non-cases (case-control and cohort studies). In addition, this study will deal with the important assumptions, and the relationships among these assumptions, that are required for the reliable estimation of the interaction effect in the case-only approach. Possible corrective strategies for violation of the independence assumption will also be dealt with. Finally, 2 analysis examples of the case-only approach will be illustrated using the US NHANES dataset. This study can clarify the logic and equations of the case-only approach and contribute to applying the case-only approach of genetic epidemiology to environmental epidemiology.

## Methods: application for real data – 2 examples

In this study, 2 analysis examples using the US National Health and Nutrition Examination Survey (NHANES) data will be provided (https://www.cdc.gov/nchs/nhanes/index.htm). The case-only approach applied in environmental epidemiology will be explained using this dataset.

### The preventive (negative) interaction effect between blood chromium level and glycohemoglobin level on albuminuria (micro and macro)

The laboratory data of NHANES 2015–2016 and NHANES 2017–2018 datasets were used. The blood chromium levels (mcg/L) were used as the environmental exposure variable, and the glycohemoglobin levels (%) were used as the susceptibility factor variable. The albumin creatinine ratio (mg/g) was the outcome (disease) variable.

A chromium level of 1.4 mcg/L was set as the cut-off point between normal and abnormal chromium levels. An albumin creatinine ratio of 300 mg/g was set as the cut-off point between normal and albuminuria (micro and macro). Both micro-albuminuria and macro-albuminuria were categorized in the single ‘albuminuria’ category. Glycohemoglobin level was used as a continuous variable without conversion to a categorical variable. Because of possible confounding due to diabetes treatment (glucose-lowering medications), all respondents who answered ‘yes’ to the question ‘take diabetic pills to lower blood sugar’ were excluded from the analysis.

### The aggravating (positive) interaction effect between blood cobalt level and old age on albuminuria (micro and macro)

The laboratory data and demographics data of NHANES 2015–2016 and NHANES 2017–2018 datasets were used. The blood cobalt level (mcg/L) in laboratory data was used as the environmental exposure variable, and age in years in demographics data was used as the susceptibility factor variable. Albumin creatinine ratio (mg/g) in laboratory data was used as the outcome variable.

A cobalt level of 1.8 mcg/L was set as the cut-off point between normal and abnormal cobalt levels. An albumin creatinine ratio of 300 mg/g was set as the cut-off point between normal and albuminuria. Both micro-albuminuria and macro-albuminuria were categorized as a single ‘albuminuria’ category. Age in years was applied as a continuous variable without conversion to a categorical variable.

### Calculation of estimates

All abbreviations used in this article are provided in Table 1. The same five steps were followed in both examples, and every estimate was calculated with a confidence interval. First, the fold-difference in the odds of albuminuria associated with a unit difference in the environmental exposure (blood chromium level in the first example, blood cobalt level in the second) was estimated. Second, the fold-difference in the odds of albuminuria associated with a unit difference in the susceptibility factor (blood glycohemoglobin level in the first example, age in years in the second) was estimated. Third, the multiplicative ICR associated with a one-unit difference in both the environmental exposure and the susceptibility factor was estimated. Fourth, the independence between the environmental exposure and the susceptibility factor was assessed in the whole sample, including cases and non-cases. Fifth, the multiplicative ICR was calculated using only the cases. If the independence assessed in the fourth step was plausible, this estimate was used as is; if it was not plausible, the estimate was adjusted based on the theoretical equations (multiplied by the S-E OR_{c/nc}). After these steps, the authors assessed whether the estimate derived from only cases was more precise than the estimate obtained from both cases and non-cases.

### Statistical method and software

A logistic regression model was used to calculate the odds ratios. R version 4.0.3 was used. The packages ‘dplyr’ and ‘data.table’ were used to pre-process the datasets. The R code used is provided in Supplementary material A.

## Methods: theoretical backgrounds

### Basic assumption: the joint and ICR on the multiplicative scale

Statistical interactions between the effects of susceptibility factors and those of environmental factors can be assessed as departures from multiplicativity of effects or as departures from additivity of effects. Table 2 shows an example of a study with cases and non-cases. With the unexposed, non-susceptible (S-E-) group set as the reference group, we can calculate the relative risk (RR) and odds ratio (OR) for each of the other 3 groups.

The joint RR for the susceptibility factor and environmental exposure (RR_{se}) can be compared with the RR for environmental exposure alone (RR_{e}) or with the RR for susceptibility factor alone (RR_{s}). The joint OR for the susceptibility factor and environmental exposure (OR_{se}) can be compared with the OR for environmental exposure alone (OR_{e}) or with the OR for susceptibility factor alone (OR_{s}). In the joint RR model with the additive scale, the ICR (ICR_{c/nc}) indicates the departures from the sum of individual RRs minus one (ICR_{c/nc} = RR_{se}-(RR_{s} + RR_{e}-1)). This equation is called ‘relative excess risk due to interaction (RERI)’ in epidemiologic literature [15]. In the joint OR model with the additive scale, the ICR (ICR_{c/nc}) indicates the departures from the sum of individual ORs minus one (ICR_{c/nc} = OR_{se}-(OR_{s} + OR_{e}-1)). In the joint RR model with the multiplicative scale, the ICR (ICR_{c/nc}) indicates the departures from the product of individual RRs (ICR_{c/nc} = RR_{se}/(RR_{s} × RR_{e})). In the joint OR model with the multiplicative scale, the ICR (ICR_{c/nc}) indicates the departures from the product of individual ORs (ICR_{c/nc} = OR_{se}/(OR_{s} × OR_{e})). In this article, we used only the joint RR or the joint OR model with the multiplicative scale to estimate the ICR_{c/nc}.
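The two scales defined above can be compared with a short sketch. The relative risks below are hypothetical values chosen for illustration, not estimates from this study, and the function names are introduced here only for the example.

```python
def multiplicative_icr(joint, susceptibility_only, exposure_only):
    """Departure from the product of the individual effects (no interaction: 1)."""
    return joint / (susceptibility_only * exposure_only)


def additive_icr(joint, susceptibility_only, exposure_only):
    """Departure from the sum of the individual effects minus one (no interaction: 0)."""
    return joint - (susceptibility_only + exposure_only - 1.0)


# Hypothetical relative risks, not taken from the article.
rr_s, rr_e, rr_se = 2.0, 3.0, 6.0

icr_mult = multiplicative_icr(rr_se, rr_s, rr_e)  # 6 / (2 * 3)
icr_add = additive_icr(rr_se, rr_s, rr_e)         # 6 - (2 + 3 - 1)
print(icr_mult, icr_add)
```

With these values the joint effect is exactly multiplicative (ICR of 1) yet exceeds additivity by 2, which illustrates that the presence or absence of interaction depends on the scale chosen.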

### The ICR in a case-only study and the ICR in a study with cases and non-cases

Table 2 illustrates the composition of a study with cases and non-cases. To generate case-only data from the above source population, we extracted only the ‘case’ column in Table 3.

The ICR in a case-only study will be as follows:

The ICR in a study with cases and non-cases will be as follows:

In Eq. (2), (ag/ce) is converted into ICR_{co} obtained in the case-only study. ICR_{c/nc} is the ICR calculated in a study with cases and non-cases. From Eq. (2), the requirement for the equality between the ICR acquired from a study with cases and non-cases and the ICR acquired from the case-only study is as follows:

Equation (3) means that the environmental exposure and the susceptibility factor must be independent in a study with cases and non-cases for the ICR acquired from a study with cases and non-cases to equal the ICR acquired from the case-only study. In Eqs. (2) and (3), we should note that the equality between the ICR from a study with cases and non-cases and the ICR from the case-only study does not necessarily require a rare disease assumption (a low prevalence of the disease).
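The relationships described in this subsection can be sketched from the cell ratios quoted in the text: (ag/ce) here, and the population ratio later attributed to Eq. (3). The cell labels are an assumption, since the tables are not reproduced: a, c, e, and g are taken to be the numbers of cases in the S+E+, S+E-, S-E+, and S-E- groups, and B, D, F, and H the corresponding numbers of non-cases, with ICR_{c/nc} computed from relative risks.

\[
{\mathrm{ICR}}_{\mathrm{co}}=\frac{\mathrm{ag}}{\mathrm{ce}} \tag{1}
\]

\[
{\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}=\frac{\left(\frac{\mathrm{a}}{\mathrm{a}+\mathrm{B}}\right)\left(\frac{\mathrm{g}}{\mathrm{g}+\mathrm{H}}\right)}{\left(\frac{\mathrm{c}}{\mathrm{c}+\mathrm{D}}\right)\left(\frac{\mathrm{e}}{\mathrm{e}+\mathrm{F}}\right)}=\left(\frac{\mathrm{ag}}{\mathrm{ce}}\right)\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right) \tag{2}
\]

\[
\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}=1 \tag{3}
\]

Under these assumptions the second factor in Eq. (2) involves only the group totals, which is why the equality depends on S-E independence in the whole population and not on how rare the disease is.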

The above equations in this subsection can be understood from the context of a logistic model, with other covariates adjusted. The following equations indicate a conventional logistic regression model for a case-only study:

When E is a categorical or continuous variable for environmental exposure status, a case-only estimate for the interaction effect can be obtained using Eq. (5).
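For a binary exposure, one common way to write such a case-only model is sketched below. Only the coefficient name \(\gamma_1\) is taken from this article (it reappears in the efficiency subsection); the exact form, the covariate vector Z, and its coefficients \(\boldsymbol{\delta}\) are assumptions made for illustration.

\[
\mathrm{logit}\ \mathrm{P}\left(\mathrm{E}=1\mid \mathrm{S},\mathrm{Z},\mathrm{case}\right)={\gamma}_0+{\gamma}_1\mathrm{S}+{\boldsymbol{\delta}}^{\prime}\mathrm{Z},\qquad {\mathrm{ICR}}_{\mathrm{co}}=\exp \left({\gamma}_1\right)
\]

In this sketch the model is fitted to cases only, so \(\exp(\gamma_1)\) is the S-E odds ratio among cases, which is the quantity the case-only approach reports as the multiplicative interaction estimate.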

We can also assess the independence between an environmental factor and a susceptibility factor in a study with cases and non-cases from the context of a logistic model using the following equations:

According to the independence assumption provided in Eq. (3), the environmental exposure and the susceptibility factor must be independent in the population with cases and non-cases for the ICR obtained in the population with cases and non-cases to equal the ICR obtained in the case-only study. From the context of a logistic model, this means that the confidence interval for Eq. (7) must include 1 and that the point estimate for Eq. (7) must be close to 1.
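The link between the case-only ICR, the S-E ratio in the whole population, and the ICR from cases and non-cases can be checked numerically. The sketch below uses hypothetical counts and assumes the cell labelling implied by the ratios quoted in this article (a, c, e, g for cases and B, D, F, H for non-cases in the S+E+, S+E-, S-E+, and S-E- groups); the function names are introduced here only for illustration.

```python
def icr_case_only(a, c, e, g):
    """Case-only ICR: the S-E odds ratio among cases, (ag)/(ce)."""
    return (a * g) / (c * e)


def se_ratio_population(a, c, e, g, B, D, F, H):
    """Population S-E ratio as written in the article: (c+D)(e+F) / ((a+B)(g+H))."""
    return ((c + D) * (e + F)) / ((a + B) * (g + H))


def icr_from_risks(a, c, e, g, B, D, F, H):
    """Multiplicative ICR from relative risks, with the S-E- group as reference."""
    risk_se, risk_s = a / (a + B), c / (c + D)
    risk_e, risk_0 = e / (e + F), g / (g + H)
    rr_se, rr_s, rr_e = risk_se / risk_0, risk_s / risk_0, risk_e / risk_0
    return rr_se / (rr_s * rr_e)


# Hypothetical counts; group totals 200, 300, 400, 600 satisfy 200 * 600 == 300 * 400.
cells = dict(a=40, c=20, e=20, g=20, B=160, D=280, F=380, H=580)

icr_co = icr_case_only(cells["a"], cells["c"], cells["e"], cells["g"])
ratio = se_ratio_population(**cells)
icr_c_nc = icr_from_risks(**cells)
print(icr_co, ratio, icr_c_nc)
```

Because the group totals were chosen so that the population S-E ratio is 1, the case-only ICR and the ICR computed from all subjects coincide; changing the totals so that the ratio departs from 1 makes the two estimates differ by exactly that ratio.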

We can also calculate the ICR obtained in the population with cases and non-cases from the context of a logistic model, using the following equation:

### The ICR in a case-control study

We can define the susceptibility-environment ICR acquired from a case-control study in the model with the multiplicative scale as follows:

ICR_{cc}: the ICR calculated in a case-control study.

ICR_{cc} > 1: The joint OR is larger than the product of each individual OR.

ICR_{cc} < 1: The joint OR is smaller than the product of each individual OR.

ICR_{cc} = 1: The joint OR is the same as the product of each individual OR.

If the joint OR is larger than the product of each individual OR, the ICR_{cc} will be larger than 1. If the joint OR is smaller than the product of each individual OR, the ICR_{cc} will be smaller than 1. If the joint OR is the same as the product of each individual OR, the ICR_{cc} will be 1.

### The ICR in a case-only study and the ICR in a case-control study

For the generation of the case-control study data, a fraction (p) of controls in each group was selected from the population with cases and non-cases in Table 4.

The ICR in a case-control study can be calculated as follows:

In Eq. (11), the requirement for equality between ICR_{cc} and ICR_{co} is as follows:

Equation (12) means that for the equality between ICR_{cc} and ICR_{co}, the susceptibility factor and environmental exposure must be independent in the control population. A rare disease assumption is also not required for this equality.
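The case-control relationship described above can be sketched in the same notation. The labels are assumptions: b, d, f, and h are taken to be the sampled controls in the S+E+, S+E-, S-E+, and S-E- groups, each a fraction p of the non-cases B, D, F, and H, and the odds ratios use the S-E- group as reference.

\[
{\mathrm{ICR}}_{\mathrm{cc}}=\frac{{\mathrm{OR}}_{\mathrm{se}}}{{\mathrm{OR}}_{\mathrm{s}}\times {\mathrm{OR}}_{\mathrm{e}}}=\frac{\left(\frac{\mathrm{ah}}{\mathrm{bg}}\right)}{\left(\frac{\mathrm{ch}}{\mathrm{dg}}\right)\left(\frac{\mathrm{eh}}{\mathrm{fg}}\right)}=\left(\frac{\mathrm{ag}}{\mathrm{ce}}\right)\left(\frac{\mathrm{df}}{\mathrm{bh}}\right)=\left({\mathrm{ICR}}_{\mathrm{co}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right) \tag{11}
\]

\[
\frac{\mathrm{df}}{\mathrm{bh}}=\frac{\mathrm{DF}}{\mathrm{BH}}=1 \tag{12}
\]

The sampling fraction p cancels in df/bh, so under these assumptions the condition concerns only the S-E relationship among the non-cases.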

We can also calculate the ICR in a case-control study from the context of a logistic model, using the following equation:

### The ICR in a study with cases and non-cases and the ICR in a case-control study

The equality between ICR_{cc} and ICR_{co} does not mean that these 2 estimates are not biased away from the ICR acquired from the population with cases and non-cases (ICR_{c/nc}). Based on Eqs. (2) and (11), we can get the following equation:

In Eq. (15), for the equality between ICR_{cc} and ICR_{c/nc}, the following equation or at least 1 of 2 conditions suggested below should be met:

Equation (16) means that for the equality between ICR_{cc} and ICR_{c/nc}, the susceptibility factor and the environmental exposure must be independent both in the population with cases and non-cases and in the controls. Alternatively, if the disease is rare, Eq. (16) will be satisfied. In this case, the rare disease assumption must be examined in the population with cases and non-cases.

### S-E independence in the population with cases and non-cases and S-E independence in the controls: one cannot replace the other

If we evaluate Eq. (16) in detail, we can find an important relationship. The S-E independence in the controls is a totally different concept from the S-E independence in the population with cases and non-cases: one cannot replace the other.

For the first equal sign, S-E OR_{control} = 1 is required according to Eq. (11).

For the second equal sign, S-E OR_{c/nc} = 1 is required according to Eq. (2).

If the disease is rare, \({\mathrm{ICR}}_{\mathrm{cc}}=\left({\mathrm{ICR}}_{\mathrm{co}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)\) according to Eq. (11), and \({\mathrm{ICR}}_{\mathrm{c}/\mathrm{nc}}=\left({\mathrm{ICR}}_{\mathrm{c}\mathrm{o}}\right)\left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)\) according to Eq. (2).

If a researcher uses whether or not S-E OR_{controls} equals 1, instead of whether or not S-E OR_{c/nc} equals 1, for the assessment of the validity of using ICR_{co} instead of using ICR_{c/nc}, this misuse can lead to either the rejection of the valid ICR_{co} or the acceptance of the invalid ICR_{co} mistakenly.

In Supplementary material B, an example from Gatto et al. [8] is provided for this problem. In the first example, S and E are independent in the population, including cases and non-cases (S-E OR_{c/nc} = 1). The interaction estimate in the population, including cases and non-cases (i.e., ICR_{c/nc}) is 2.5. The ICR_{co} is also 2.5. In this situation, the S-E OR_{control} of 0.7 does not provide a reliable estimation for S-E OR_{c/nc} of 1.0. In the second example, the S-E OR_{c/nc} is 2.0, showing a non-independent relationship. The ICR_{c/nc} is 1.0, but ICR_{co} is 2.0. In this situation, the S-E OR_{control} of 1.0 does not provide a reliable estimation for S-E OR_{c/nc} of 2.0.

### The rare disease assumption: for ICR_{cc} = ICR_{c/nc} and S-E OR_{control} = S-E OR_{c/nc}

The rare disease assumption provides 2 implications in this discussion of the case-only approach. The first implication is provided in Eq. (18). The second implication is the following:

\(\left(\frac{\left(\mathrm{c}+\mathrm{D}\right)\left(\mathrm{e}+\mathrm{F}\right)}{\left(\mathrm{a}+\mathrm{B}\right)\left(\mathrm{g}+\mathrm{H}\right)}\right)=\)S-E OR_{c/nc} from Eq. (3) \(\mathrm{and}\ \left(\frac{\mathrm{DF}}{\mathrm{BH}}\right)=\frac{\mathrm{df}}{\mathrm{bh}}=\) S-E OR_{control} from Eq. (12)

In this subsection, we will deal with the second implication. Equation (20) indicates the relationship between S-E OR_{control} and S-E OR_{c/nc} [8].

In Gatto et al. [8], the authors used Eq. (20) to conduct a sensitivity analysis (Supplementary material C). The article assessed the impact of the baseline risk of disease in the population (p(D|S-E-)) and the independent effect of S (RR_{S}) on the S-E OR_{control} when the S-E OR_{c/nc} is 1.0. In Supplementary material C, the baseline risk of disease ranges from 0.1 to 6%. As illustrated in Supplementary material C, the S-E OR_{control} is similar to the S-E OR_{c/nc} of 1.0 when the baseline risk of disease (p(D|S-E-)) is under 1% and the independent effect of S is relatively low (RR_{S} < 2.5). However, as the baseline risk of disease approaches 3%, the S-E OR_{control} begins to diverge from the S-E OR_{c/nc} of 1.0. This divergence worsens as the independent effect of the susceptibility factor increases.
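The pattern described above can be reproduced with a small calculation. This is a sketch under stated assumptions, not the sensitivity analysis of Gatto et al.: S and E are taken to be independent in the whole population, the risks follow a multiplicative model, and the parameter values (RR_E, the true ICR, and the grid of baseline risks) are hypothetical. Under population independence the group sizes cancel, so the ratio DF/BH among non-cases reduces to a ratio of non-case probabilities.

```python
def se_ratio_in_non_cases(baseline_risk, rr_s, rr_e, icr):
    """DF/BH among non-cases when S and E are independent in the whole population.

    Assumes risks of baseline_risk (S-E-), baseline_risk*rr_s (S+E-),
    baseline_risk*rr_e (S-E+), and baseline_risk*rr_s*rr_e*icr (S+E+).
    """
    risk_0 = baseline_risk
    risk_s = baseline_risk * rr_s
    risk_e = baseline_risk * rr_e
    risk_se = baseline_risk * rr_s * rr_e * icr
    return ((1 - risk_s) * (1 - risk_e)) / ((1 - risk_se) * (1 - risk_0))


# Hypothetical settings: exposure effect RR_E = 2 and no true interaction (ICR = 1).
for rr_s in (2.5, 5.0):
    for baseline_risk in (0.001, 0.01, 0.03, 0.06):
        ratio = se_ratio_in_non_cases(baseline_risk, rr_s, rr_e=2.0, icr=1.0)
        print(f"RR_S={rr_s}, baseline risk={baseline_risk:.3f}: DF/BH={ratio:.3f}")
```

In this sketch the ratio among non-cases stays close to 1 when the baseline risk is very low and moves away from 1 as either the baseline risk or RR_S grows, which is the qualitative behaviour the text attributes to the S-E OR_{control}.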

### Violation of independence: confounder and subpopulation dependence

The violation of independence between S and E occurs when an individual alters his or her environmental exposure according to his or her susceptibility factor. This violation is due to 2 factors mainly: (i) a confounder and (ii) subpopulation dependence.

Gatto et al. [8] provide 2 examples of a covariate (C) that induces S-E dependence. In the first example of Supplementary material D, the family history functions as a confounder, and in the second example of Supplementary material D, the adverse reaction to alcohol functions as a mediator between the susceptibility factor and the environmental exposure. In these 2 examples, a positive multiplicative interaction (ICR_{CO} > 1) will be biased towards the null (ICR_{CO} ≈ 1) because of the overall negative association between S and E due to C.

If these covariates can be adjusted, the independence between S and E can be restored.

However, a cautious approach is required because the adjustment of unrelated covariates with S-E dependence would cost some degrees of freedom and would reduce the precision of ICR_{CO} [8].

Another source of the violation of independence is a hidden dependence on a subpopulation. Wang et al. [9] provide a unique solution for this problem, providing the following Eq. (9):

CIR: Confounding Interaction Ratio. r_{SE}: the correlation coefficient between S and E. CV_{S}: variation in susceptibility factor prevalence odds. CV_{E}: variation in environmental exposure prevalence odds.

CIR_{U}: the upper bound of CIR, CIR_{L}: the lower bound of CIR, *υ*_{S}(*υ*_{S} ≥ 1): the ratio of the largest and the smallest susceptibility frequency odds across all strata. *υ*_{E}(*υ*_{E} ≥ 1): the ratio of the largest and the smallest exposure frequency odds across all strata.

In Eq. (23), CIR is the ratio of the crude ICR_{c/nc} without stratification over the ICR_{c/nc} with stratification. According to the above equation, there would be no population stratification bias (CIR = 1) if (i) the exposure prevalence odds and the susceptibility frequency odds are uncorrelated across all strata (r_{SE} = 0), (ii) no variation exists in the exposure prevalence odds (CV_{E} = 0), or (iii) no variation exists in the susceptibility frequency odds (CV_{S} = 0).

In Eq. (24), *υ*_{S}(*υ*_{S} ≥ 1) denotes the ratio of the largest over the smallest susceptibility frequency odds, and *υ*_{E}(*υ*_{E} ≥ 1) denotes the ratio of the largest over the smallest exposure prevalence odds across all the strata in the population. If there is no variation either in the susceptibility frequency odds (*υ*_{S} = 1) or in the exposure prevalence odds (*υ*_{E} = 1), there would be no bias (CIR_{U} = CIR_{L} = 1) according to Eq. (24). If we can calculate the CIR for a population, we can obtain the ICR_{c/nc} with stratification by dividing the crude ICR_{c/nc} by the CIR.
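The three no-bias conditions listed for the CIR are consistent with an expression of the form 1 + r_{SE} × CV_{S} × CV_{E}. Treating that as the form of Eq. (23) is an assumption here, as are the stratum weights; the sketch below only verifies the underlying algebraic identity, namely that the weighted mean of the product of two stratum-level odds divided by the product of their weighted means equals 1 + r × CV × CV. All strata, odds, and weights are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted average of values with non-negative weights."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def cir_two_ways(s_odds, e_odds, weights):
    """Return (direct ratio, 1 + r * CV_S * CV_E) for stratum-level odds."""
    mean_s = weighted_mean(s_odds, weights)
    mean_e = weighted_mean(e_odds, weights)
    mean_se = weighted_mean([s * e for s, e in zip(s_odds, e_odds)], weights)
    direct = mean_se / (mean_s * mean_e)

    var_s = weighted_mean([(s - mean_s) ** 2 for s in s_odds], weights)
    var_e = weighted_mean([(e - mean_e) ** 2 for e in e_odds], weights)
    if var_s == 0.0 or var_e == 0.0:
        # No variation in one factor: the correlation is undefined and the bias term vanishes.
        return direct, 1.0
    r_se = (mean_se - mean_s * mean_e) / math.sqrt(var_s * var_e)
    cv_s = math.sqrt(var_s) / mean_s
    cv_e = math.sqrt(var_e) / mean_e
    return direct, 1.0 + r_se * cv_s * cv_e


# Three hypothetical strata in which both odds rise together.
direct, via_moments = cir_two_ways([0.1, 0.5, 1.0], [0.2, 0.4, 0.8], [5, 3, 2])
print(direct, via_moments)

# Same strata, but with no variation in the exposure odds.
flat_direct, flat_moments = cir_two_ways([0.1, 0.5, 1.0], [0.5, 0.5, 0.5], [5, 3, 2])
print(flat_direct, flat_moments)
```

In the first call the two odds are positively correlated across strata and the ratio exceeds 1; in the second, the exposure odds do not vary and the ratio returns to 1, matching condition (ii) in the text.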

For the violation of S-E independence, researchers usually would try to evaluate a potential confounder based on their subject-matter knowledge. However, for subpopulation dependence, attention should be paid to the whole study population and the strata rather than finding a confounder. This important difference should be in the mind of researchers using a case-only approach.

### The efficiency gained from the case-only approach

The case-only approach can yield a more precise interaction effect estimate (i.e., one with a narrower confidence interval) than a study design with cases and non-cases, such as a cohort or case-control study [16].

In Eqs. (8) and (9), and Table 2, the asymptotic variance of \(\hat{\upbeta}\)_{3} in a population with cases and non-cases is as follows:

In Eqs. (13) and (14), and Table 4, the asymptotic variance of \({\overline{\overline{\upbeta}}}_3\) in a case-control study is as follows:

In Eqs. (4) and (5), and Table 3, the asymptotic variance of \(\hat{\gamma}\)_{1} in a case-only study is as follows:

Comparing Eq. (27) with Eqs. (25) and (26), the case-only design can provide an estimate with a narrower confidence interval than either the case-control or the cohort design (study designs with cases and non-cases). This efficiency gain comes from the independence assumption between the susceptibility factor and the environmental exposure (S-E OR_{c/nc} = 1).
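The efficiency argument can be illustrated with the standard large-sample variance of a log odds ratio from a saturated logistic model with binary S and E, which is the sum of the reciprocals of the cell counts involved. That Eqs. (25) to (27) take exactly this form is an assumption, and the counts below are hypothetical.

```python
import math


def log_icr_variance(*cell_counts):
    """Large-sample variance of a log odds-ratio-type estimate: sum of 1/count."""
    return sum(1.0 / n for n in cell_counts)


def confidence_interval(icr, variance, z=1.96):
    """Wald interval on the log scale, back-transformed to the ratio scale."""
    half_width = z * math.sqrt(variance)
    return icr * math.exp(-half_width), icr * math.exp(half_width)


# Hypothetical counts for the four S-E groups.
cases = (40, 20, 20, 20)          # a, c, e, g
non_cases = (160, 280, 380, 580)  # B, D, F, H

var_case_only = log_icr_variance(*cases)
var_cases_and_non_cases = log_icr_variance(*cases, *non_cases)

icr = 2.0  # same point estimate under S-E independence in the population
print(confidence_interval(icr, var_case_only))
print(confidence_interval(icr, var_cases_and_non_cases))
```

The case-only variance uses only the four case cells, so it can never exceed the variance that also includes the non-case cells. The gain is small when non-cases are plentiful, as here, and larger when controls are few; in either case it is bought by assuming S-E independence instead of estimating the S-E association from the non-cases.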

### Methodological issues to be considered

Several issues must be considered when applying the case-only approach to estimating the interaction effect between a susceptibility factor and an environmental exposure. Firstly, the case selection process must follow the usual rules of case selection in a case-control study. Secondly, researchers must verify independence between the susceptibility trait and the environmental exposure in the population with cases and non-cases before substituting the ICR_{CO} calculated in a case-only design for the ICR_{c/nc} calculated in a population with cases and non-cases (according to Eqs. (2) and (3)). If evidence of an association between susceptibility factor and environmental exposure exists, the calculated S-E OR_{c/nc} must be used to correct the ICR_{CO} by multiplying it as provided in Eq. (2). Thirdly, the independence assumption might seem reasonable for various susceptibility factors and environmental exposures. However, some susceptibility factors can modify the likelihood of environmental exposure. This hidden association must be discovered before a case-only approach is applied. Finally, the interaction effect estimate (ICR_{CO}) obtained from the case-only approach can only be interpreted as a departure from the multiplicative effect and not from the additive effect. According to previous epidemiologic literature, however, additive interaction corresponds more closely to mechanistic biologic interaction than to merely statistical interaction [17, 18]. Nevertheless, researchers often use the multiplicative scale to estimate interaction effects for several practical reasons [18]. This limitation should be considered when the results of this study are applied.

### Summary

In summary, the case-only approach can be applied to environmental epidemiology successfully when a susceptibility factor and an environmental exposure are independent in a population with cases and non-cases. Through this approach, a more precise interaction effect estimate can be calculated.

## Results

### Basic information of datasets and descriptive analysis for each variable

By combining the ‘Albumin & Creatinine – Urine,’ ‘Chromium & Cobalt,’ ‘Glycohemoglobin,’ and ‘Demographic Variables and Sample Weights’ data files, a dataset with 7286 subjects was created. For the first analysis example, the respondents who answered ‘yes’ to the question ‘take diabetic pills to lower blood sugar’ were excluded (5890 subjects), leaving 1396 subjects for analysis. For the second analysis example, all 7286 subjects were included. The descriptive analysis results for the main variables are provided in Table 5.

### The negative interaction effect between blood chromium level and glycohemoglobin level on albuminuria (micro and macro)

As the first example, Table 6 provides the sequential steps of applying the case-only approach (which will be explained in the first discussion section) to estimate the interaction effect between blood chromium level and glycohemoglobin level on albuminuria. These steps follow those described in the ‘Calculation of estimates’ subsection: (i) Firstly, a 1 μg/L difference in blood chromium level was associated with a 2.20-fold difference (95% CI 1.48–3.32) in the odds of albuminuria. (ii) Secondly, a 1% difference in blood glycohemoglobin level was associated with a 1.57-fold difference (95% CI 1.44–1.73) in the odds of albuminuria. (iii) Thirdly, for a simultaneous 1 μg/L difference in blood chromium level and 1% difference in blood glycohemoglobin level, the multiplicative interaction contrast ratio (ICR) is 0.72 (95% CI 0.35–1.60), which is not statistically significant. (iv) Fourthly, in the population with cases and non-cases, blood chromium levels and blood glycohemoglobin levels are judged to be independent of each other (S-E OR_{c/nc}: 0.76 (95% CI 0.47–1.06)). Therefore, the case-only ICR can be a good substitute for the ICR acquired from the population with cases and non-cases. (v) Finally, when only the cases are analyzed (case-only approach), the case-only ICR is 0.59 (95% CI 0.28–0.95), which is statistically significant (a negative interaction effect).

In this example, the environmental exposure (blood chromium level) and the susceptibility factor (blood glycohemoglobin level) are independent in the population with cases and non-cases. Therefore, the case-only ICR itself can be used as the ICR acquired from the population with cases and non-cases without a conversion. (This will be explained in the first discussion section in detail.) However, the ICR acquired from the population with cases and non-cases was not statistically significant because of a relatively wide confidence interval. Applying the case-only approach produced a slightly lower ICR with a narrower confidence interval, which was statistically significant. A possible protective (negative) interaction effect between blood chromium levels and blood glycohemoglobin levels can be inferred from this example.

### The positive interaction effect between blood cobalt level and old age on albuminuria (micro and macro)

As the second example, Table 6 provides the sequential steps of applying the case-only approach to estimate the interaction effect between blood cobalt level and age in years on albuminuria. These steps follow those described in the ‘Calculation of estimates’ subsection: (i) Firstly, a 1 μg/L difference in blood cobalt level was associated with a 1.09-fold difference (95% CI 0.98–1.20) in the odds of albuminuria, which is not statistically significant. (ii) Secondly, a 1-year difference in age was associated with a 1.05-fold difference (95% CI 1.04–1.05) in the odds of albuminuria. (iii) Thirdly, for a simultaneous 1 μg/L difference in blood cobalt level and 1-year difference in age, the multiplicative ICR is 1.13 (95% CI 0.99–1.37), which is not statistically significant. (iv) Fourthly, in the population with cases and non-cases, blood cobalt level and age in years show a slight association and are not completely independent (S-E OR_{c/nc}: 1.06 (95% CI 1.03–1.10)). Therefore, the case-only ICR must be multiplied by the S-E OR_{c/nc} to obtain the ICR_{c/nc} according to Eq. (2). (v) Fifthly, when only the cases are analyzed (case-only approach), the case-only ICR is 1.14 (95% CI 1.03–1.37), which is statistically significant (a positive interaction effect). (vi) Finally, multiplying the calculated ICR_{CO} by the S-E OR_{c/nc} produced the ICR_{CO}-adjusted of 1.21 (95% CI 1.06–1.51).

In this example, the environmental exposure (blood cobalt level) and the susceptibility factor (age in years) are not independent in the population with cases and non-cases. Therefore, the case-only ICR must be multiplied by the S-E OR_{c/nc} to produce the ICR_{c/nc} according to Eq. (2). The ICR acquired from the population with cases and non-cases was statistically equivocal (1.13 (95% CI 0.99–1.37)). However, by applying the case-only approach, the ICR_{CO}-adjusted showed a slightly higher ICR that was statistically significant (1.21 (95% CI 1.06–1.51)). Therefore, a possible aggravating (positive) interaction effect between blood cobalt levels and age in years can be inferred from this example.
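The adjustment in this example can be traced with a few lines of arithmetic using the estimates reported above. The point estimate follows directly from the multiplication described in the text. Multiplying the interval bounds pairwise is not a general way to propagate uncertainty; it is included only because it reproduces the reported interval, and whether the authors obtained their interval this way is an assumption.

```python
# Reported estimates as (point estimate, lower 95% bound, upper 95% bound).
icr_case_only = (1.14, 1.03, 1.37)
se_or_population = (1.06, 1.03, 1.10)

# Multiply the case-only ICR by the S-E OR, element by element.
icr_adjusted = tuple(
    round(icr * se_or, 2) for icr, se_or in zip(icr_case_only, se_or_population)
)
print(icr_adjusted)
```

The three products round to 1.21, 1.06, and 1.51, which match the reported ICR_{CO}-adjusted and its interval.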

## Discussion

Many previous studies dealt with various aspects of the case-only approach, usually in the context of gene-environment interaction studies or gene-gene interaction studies [5, 7, 9, 11, 14]. Some studies compared the case-only ICR with the ICR from the case-control design, whereas others compared the case-only ICR with the ICR from the population with cases and non-cases. This study incorporated all previous literature and systematically organized the provided logic and equations. From this effort, various definitions and equations for the ICR in the case-only design can be established compared to the ICR in the population with cases and non-cases (cohort/case-control studies). This systematic organization of concepts from 3 study designs is the original contribution of this study.

Furthermore, this study extended the case-only approach, which has usually been used in gene-environment or gene-gene interaction studies, to the more general problem of estimating the interaction effect between susceptibility factors and environmental exposures. If the independence assumption between a susceptibility factor and an environmental exposure is fulfilled, the same equations can be applied even though the ‘gene’ is replaced with the ‘susceptibility factor.’ Therefore, the case-only approach can also be applied to environmental epidemiology.
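The role of the independence assumption can be illustrated with a small deterministic calculation. The sketch below is not the authors' R code and does not use NHANES data. It assumes a hypothetical population with a binary susceptibility factor S and a binary exposure E (the examples in this study use continuous variables), a rare outcome, and a logistic risk model with a known multiplicative interaction. All prevalences, odds ratios, and function names are illustrative assumptions.

```python
import math

# Hypothetical population parameters (assumptions, not values from the study).
P_S, P_E = 0.30, 0.40               # prevalence of S and E, independent by construction
B0 = -7.0                           # baseline log-odds, chosen so the outcome is rare
OR_S, OR_E, OR_INT = 1.5, 1.2, 2.0  # main-effect ORs and multiplicative interaction


def expit(x):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def odds_ratio(n11, n10, n01, n00):
    """Cross-product ratio of a 2x2 table of S by E."""
    return (n11 * n00) / (n10 * n01)


def cell_fractions():
    """Expected population fractions of cases and non-cases in each (S, E) cell."""
    cases, noncases = {}, {}
    for s in (0, 1):
        for e in (0, 1):
            p_cell = (P_S if s else 1 - P_S) * (P_E if e else 1 - P_E)
            logit = (B0 + s * math.log(OR_S) + e * math.log(OR_E)
                     + s * e * math.log(OR_INT))
            risk = expit(logit)
            cases[(s, e)] = p_cell * risk
            noncases[(s, e)] = p_cell * (1 - risk)
    return cases, noncases


cases, noncases = cell_fractions()
cells = [(1, 1), (1, 0), (0, 1), (0, 0)]

# S-E odds ratio among cases only (the case-only estimate).
case_only_or = odds_ratio(*(cases[c] for c in cells))

# S-E odds ratio in the whole population (cases plus non-cases).
population_se_or = odds_ratio(*(cases[c] + noncases[c] for c in cells))

print(f"S-E OR in cases and non-cases: {population_se_or:.3f}")
print(f"Case-only S-E OR:              {case_only_or:.3f}")
print(f"True interaction OR:           {OR_INT:.3f}")
```

Under these assumptions, the S-E odds ratio in the whole population equals 1 by construction, and the S-E odds ratio among the cases lies close to the interaction odds ratio built into the risk model. The small remaining gap reflects the rare-disease approximation.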

### The preventive (negative) interaction effect between blood chromium levels and glycohemoglobin levels on albuminuria (micro and macro)

The adverse effect of chromium on kidney function was reported in some previous literature [19, 20]. Glycohemoglobin level ≥ 6.5% is a diagnostic criterion for diabetes mellitus and is naturally associated with diabetic nephropathy [21]. Albuminuria, including micro-albuminuria and macro-albuminuria, has been used both as a useful initial marker for kidney damage and as a marker associated with an increased risk of progressive renal diseases [22, 23]. However, a possible protective interaction between chromium exposure and diabetic chronic kidney disease is increasingly reported, based on improved glucose tolerance and insulin sensitivity [24,25,26,27,28].

The result of this study illustrates a protective interaction effect between blood chromium level (environmental exposure) and blood glycohemoglobin level (susceptibility factor) on albuminuria status (outcome). This protective interaction effect of chromium in diabetic patients with nephropathy could inform a future treatment strategy for diabetic nephropathy. For example, one study reports an ameliorative effect of a nano chromium metal-organic framework on experimental diabetic chronic kidney disease [24].

### The aggravating (positive) interaction effect between blood cobalt levels and old age on albuminuria (micro and macro)

The effect of blood cobalt levels on kidney function is not yet established, with only a few studies reporting possible adverse effects, mainly in experimental animals [29]. However, the effect of aging on decreasing kidney function is relatively well established [30, 31]. Furthermore, numerous studies have shown that the aging kidney is susceptible to various toxic substances [32,33,34,35]. From these pieces of evidence, we can infer that the aging kidney could be more susceptible to the possible toxic effect of cobalt, even if cobalt is almost non-toxic to the young kidney.

The result of this study illustrates this toxin-susceptible feature of the aging kidney (susceptibility factor) with respect to cobalt exposure (environmental exposure). The proportion of albuminuria, a marker of kidney damage, was greater in the older subjects. The result of this study can be used to devise a protective environmental health strategy for aging people with an increased possibility of exposure to heavy metals, such as cobalt.

## Conclusion

This study systematically summarized the previously reported logic and equations of the case-only approach. In particular, the associated definitions and equations are summarized together, from cohort and case-control studies (study designs with cases and non-cases) to case-only studies. By substituting the ‘susceptibility factor’ concept from environmental epidemiology for the conventional ‘gene’ concept from genetic epidemiology, this study broadened the applicability of the case-only approach to a wide range of environmental health topics. If the independence assumption between a susceptibility factor and an environmental exposure in the population with cases and non-cases holds, the case-only approach can provide a more precise interaction effect estimate than study designs with cases and non-cases (cohort/case-control studies). Finally, 2 analysis examples of the case-only approach using the US NHANES datasets were presented. The protective interaction effect between blood chromium level and blood glycohemoglobin level and the aggravating interaction effect between blood cobalt level and increasing age on albuminuria should be investigated carefully in future studies. In summary, the case-only approach can be useful not only in genetic epidemiology but also in environmental epidemiology.

## Availability of data and materials

All data used in this article are available on the National Health and Nutrition Examination Survey homepage (https://wwwn.cdc.gov/nchs/nhanes/).

## References

Clayton D, McKeigue PM. Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet. 2001;358(9290):1356–60.

Hogan MD, Kupper LL, Most BM, Haseman JK. Alternatives to Rothman's approach for assessing synergism (or antagonism) in cohort studies. Am J Epidemiol. 1978;108(1):60–7.

Knol MJ, Egger M, Scott P, Geerlings MI, Vandenbroucke JP. When one depends on the other: reporting of interaction in case-control and cohort studies. Epidemiology. 2009;20:161–6.

Skrondal A. Interaction as departure from additivity in case-control studies: a cautionary note. Am J Epidemiol. 2003;158(3):251–8.

Dennis J, Hawken S, Krewski D, Birkett N, Gheorghe M, Frei J, et al. Bias in the case-only design applied to studies of gene-environment and gene-gene interaction: a systematic review and meta-analysis. Int J Epidemiol. 2011;40(5):1329–41.

VanderWeele TJ, Hernández-Díaz S, Hernán MA. Case-only gene-environment interaction studies: when does association imply mechanistic interaction? Genet Epidemiol. 2010;34(4):327–34.

Li D, Conti DV. Detecting gene-environment interactions using a combined case-only and case-control approach. Am J Epidemiol. 2009;169(4):497–504.

Gatto NM, Campbell UB, Rundle AG, Ahsan H. Further development of the case-only design for assessing gene-environment interaction: evaluation of and adjustment for bias. Int J Epidemiol. 2004;33(5):1014–24.

Wang L-Y, Lee W-C. Population stratification bias in the case-only study for gene-environment interactions. Am J Epidemiol. 2008;168(2):197–201.

Albert PS, Ratnasinghe D, Tangrea J, Wacholder S. Limitations of the case-only design for identifying gene-environment interactions. Am J Epidemiol. 2001;154(8):687–93.

Yang Q, Khoury MJ, Sun F, Flanders WD. Case-only design to measure gene-gene interaction. Epidemiology. 1999;10(2):167–70.

Schmidt S, Schaid DJ. Potential misinterpretation of the case-only study to assess gene-environment interaction. Am J Epidemiol. 1999;150(8):878–85.

Khoury MJ, Flanders WD. Nontraditional epidemiologic approaches in the analysis of gene environment interaction: case-control studies with no controls! Am J Epidemiol. 1996;144(3):207–13.

Dai JY, Liang CJ, LeBlanc M, Prentice RL, Janes H. Case-only approach to identifying markers predicting treatment effects on the relative risk scale. Biometrics. 2018;74(2):753–63.

Richardson DB, Kaufman JS. Estimation of the Relative Excess Risk Due to Interaction and Associated Confidence Bounds. Am J Epidemiol. 2009;169(6):756–60.

Piegorsch WW, Weinberg CR, Taylor JA. Non-hierarchical logistic models and case-only designs for assessing susceptibility in population-based case-control studies. Stat Med. 1994;13(2):153–62.

Rothman KJ, Greenland S, Lash TL. Modern epidemiology. Third Edition. Philadelphia: Lippincott Williams & Wilkins; 2008.

VanderWeele TJ, Knol MJ. A tutorial on interaction. Epidemiol Methods. 2014;3(1):33–72.

Tsai T-L, Kuo C-C, Pan W-H, Chung Y-T, Chen C-Y, Wu T-N, et al. The decline in kidney function with chromium exposure is exacerbated with co-exposure to lead and cadmium. Kidney Int. 2017;92(3):710–20.

Wedeen RP, Qian LF. Chromium-induced kidney disease. Environ Health Perspect. 1991;92:71–4.

American Diabetes Association. 2. Classification and diagnosis of diabetes: standards of medical care in diabetes—2019. Diabetes Care. 2019;42(Supplement 1):S13–28.

Levey AS, Becker C, Inker LA. Glomerular filtration rate and albuminuria for detection and staging of acute and chronic kidney disease in adults: a systematic review. JAMA. 2015;313(8):837–46.

Heerspink HJL, Gansevoort RT. Albuminuria is an appropriate therapeutic target in patients with CKD: the pro view. Clin J Am Soc Nephrol. 2015;10(6):1079–88.

Fakharzadeh S, Kalanaky S, Argani H, Dadashzadeh S, Torbati PM, Nazaran MH, et al. Ameliorative effect of a nano chromium metal–organic framework on experimental diabetic chronic kidney disease. Drug Dev Res. 2021;82(3):393–403.

Huang H, Chen G, Dong Y, Zhu Y, Chen H. Chromium supplementation for adjuvant treatment of type 2 diabetes mellitus: Results from a pooled analysis. Mol Nutr Food Res. 2018;62(1):1700438.

Yin RV, Phung OJ. Effect of chromium supplementation on glycated hemoglobin and fasting plasma glucose in patients with diabetes mellitus. Nutr J. 2015;14(1):1–9.

Lewicki S, Zdanowski R, Krzyzowska M, Lewicka A, Debski B, Niemcewicz M, et al. The role of Chromium III in the organism and its possible use in diabetes and obesity treatment. Ann Agric Environ Med. 2014;21(2):331–5.

Sahin K, Onderci M, Tuzcu M, Ustundag B, Cikim G, Ozercan İH, et al. Effect of chromium on carbohydrate and lipid metabolism in a rat model of type 2 diabetes mellitus: the fat-fed, streptozotocin-treated rat. Metabolism. 2007;56(9):1233–40.

Naura AS, Sharma R. Toxic effects of hexaammine cobalt(III) chloride on liver and kidney in mice: Implication of oxidative stress. Drug Chem Toxicol. 2009;32(3):293–9.

Wetzels JFM, Kiemeney LALM, Swinkels DW, Willems HL, Heijer MD. Age- and gender-specific reference values of estimated GFR in Caucasians: The Nijmegen Biomedical Study. Kidney Int. 2007;72(5):632–7.

Coresh J, Astor BC, Greene T, Eknoyan G, Levey AS. Prevalence of chronic kidney disease and decreased kidney function in the adult US population: Third national health and nutrition examination survey. Am J Kidney Dis. 2003;41(1):1–12.

Wang X, Bonventre J, Parrish A. The Aging Kidney: Increased Susceptibility to Nephrotoxicity. Int J Mol Sci. 2014;15(9):15358–76.

Rosner MH. The pathogenesis of susceptibility to acute kidney injury in the elderly. Curr Aging Sci. 2009;2(2):158–64.

Schmitt R, Cantley LG. The impact of aging on kidney repair. Am J Physiol Ren Physiol. 2008;294(6):F1265–72.

Jerkić M, Vojvodić S, López-Novoa JM. The mechanism of increased renal susceptibility to toxic substances in the elderly. Int Urol Nephrol. 2001;32(4):539–47.

## Acknowledgments

The author appreciates the reviewers’ comments on this study (Reviewer #1 and David M. Thompson). In particular, the comments from David M. Thompson were of great help in improving the quality and logic of the primary manuscript. This work was supported by INHA UNIVERSITY Research Grant.

## Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

## Author information

### Authors and Affiliations

### Contributions

Jinyoung Moon: Conceptualization, Methodology, Investigation, Resources, Data Curation, Software, Validation, Formal analysis, Writing – Original Draft, Visualization. Hwan-Cheol Kim: Writing – Review & Editing, Supervision, Project administration. The author(s) read and approved the final manuscript.

### Corresponding author

## Ethics declarations

### Ethics approval and consent to participate

This study used only the publicly available National Health and Nutrition Examination Survey (NHANES) datasets. These datasets can be accessed on the NHANES homepage (https://www.cdc.gov/nchs/nhanes/index.htm). For the datasets, the information about Ethics Review Board (ERB) approval can be found on https://www.cdc.gov/nchs/nhanes/irba98.htm.

The authors confirm that all experiments were performed in accordance with the Declaration of Helsinki.

### Consent for publication

Not applicable.

### Competing interests

The authors have no potential competing interests to disclose.

## Additional information

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Supplementary Information

**Additional file 1: Supplementary material A.** The R code used for the statistical analyses. **Supplementary material B.** The S-E independence in the controls cannot replace the S-E independence in the population with cases and non-cases [1]. **Supplementary material C.** How strong a rare disease assumption is required for the equality between S-E OR_{c/nc} and S-E OR_{control} [1]. **Supplementary material D.** Violation of independence: confounder [1].

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

## About this article

### Cite this article

Moon, J., Kim, HC. Case-only approach applied in environmental epidemiology: 2 examples of interaction effect using the US National Health and Nutrition Examination Survey (NHANES) datasets.
*BMC Med Res Methodol* **22**, 254 (2022). https://doi.org/10.1186/s12874-022-01706-6

