We have presented an approach to selecting adjustment variables which combines prior knowledge expressed in a DAG with results from analysis of the data. The approach is pragmatic in that it focuses only on the effect of interest (also emphasized by others
[5]); uses regression models and the change-in-estimate procedure familiar to epidemiologists; and can incorporate real-data problems such as measurement error and residual bias. It aims at producing a plausible, best working DAG or set of DAGs for a given research question, given the data at hand, and at communicating the assumptions underlying variable selection in the initial and final models using a standardized, graphical form
[3]. The approach also communicates the uncertainties in the assumptions in the final models by presenting all the DAGs identified by the researcher which are consistent with the observed change-in-estimate patterns. This aims to help other research teams to focus on the areas of uncertainty and corroborate or refute the DAGs, based on the analysis of different datasets in an iterative way.

The approach depends on recent theoretical work on c- (confounding-) equivalence
[16] and collapsibility of estimates over different DAG structures
[17]. Pearl and Paz
[16] have developed conditions for c-equivalence which apply to any subsets of the variables in a DAG. Our approach uses two of their results: that all sufficient adjustment sets are c-equivalent and that failure to find c-equivalence of putative sufficient adjustment sets rules out a DAG implying such c-equivalence
[3]. The approach also draws on Pearl and Paz’s insights into bias amplification, which they note will lead to changes in associations conditional on different variables even when those variables block the same path. In a recent, detailed review of collapsibility (i.e. equivalence) of different estimators over different DAGs
[17], Greenland and Pearl noted that regression coefficients may be used to check collapsibility over different covariable sets, an approach which we develop here for applied work.
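As a minimal illustration of this coefficient-based check (a sketch on simulated data, not the procedure itself; the DAG, effect sizes, and linear models are all assumptions chosen for illustration), consider a DAG in which {Z1} and {Z2} are each sufficient adjustment sets and so should yield approximately equal, i.e. c-equivalent, estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical DAG: Z1 -> X, Z1 -> Z2 -> Y, X -> Y (true effect = 1.0).
# Both {Z1} and {Z2} block the single backdoor path X <- Z1 -> Z2 -> Y,
# so they are sufficient adjustment sets and should be c-equivalent.
z1 = rng.standard_normal(n)
z2 = 0.8 * z1 + rng.standard_normal(n)
x = 0.5 * z1 + rng.standard_normal(n)
y = 1.0 * x + 0.7 * z2 + rng.standard_normal(n)

def ols_coef(y, *covs):
    """Coefficient on the first covariate from an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + list(covs))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

b_crude = ols_coef(y, x)        # biased: backdoor path left open
b_adj_z1 = ols_coef(y, x, z1)   # sufficient adjustment set {Z1}
b_adj_z2 = ols_coef(y, x, z2)   # sufficient adjustment set {Z2}

print(f"crude: {b_crude:.3f}, adj Z1: {b_adj_z1:.3f}, adj Z2: {b_adj_z2:.3f}")
```

The two adjusted coefficients agree closely while the crude coefficient differs from both; whether such agreement holds within a pre-defined meaningful threshold is the pattern a researcher would check in practice.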

To our knowledge, only one other article in the epidemiology literature to date has examined adjustment-variable selection by explicitly combining DAGs with a statistical selection procedure
[6]. This article addressed deletion of variables from an adjustment set defined from a prior DAG using the change-in-estimate procedure, but considered only odds ratios from simulations of case–control studies and explicitly excluded colliders. Our approach is therefore broader: it addresses whether the data support the initial DAG which defines the starting adjustment set, applies to any collapsible estimator, and covers the range of possible relationships between variables. Interestingly, this article found the largest bias (using simulated data) when covariables associated only with the outcome were included in the adjustment set, and suggested that non-collapsibility of the odds ratio may have been involved
[6]. This reinforces our insistence on collapsible estimators.

The proposed approach has some potential advantages over other variable-selection methods. It can reduce the “black-box” nature of using the p-value or the change-in-estimate alone to select variables, as it lays out the rationale for adjustment-variable choice graphically. It will also frequently lead to a more parsimonious model than selection based on p-values, since it chooses variables by relevance to the exposure-outcome association rather than by association with the outcome alone. The approach also extends background-knowledge methods by checking starting assumptions against the data and requiring researchers to justify mismatches or adapt assumptions appropriately. The approach complements the recently proposed method of adjusting for all assumed parents of exposure and outcome
[21] as it can incorporate adjustment decisions when parent variables are measured with error and can achieve a more parsimonious model by excluding parent variables which do not lie on biasing pathways. Of course, sensitivity analyses to explore the impact of possible unmeasured confounding
[53] remain important.

An important point concerns the possibility of incidental cancellations and small effects. Finding a meaningful difference in the add-one pattern for a variable *when no difference is implied by the DAG* indicates the need to review the variable’s relationships. However, finding no meaningful difference in the add-one or minus-one patterns *when a difference is implied* is not, strictly speaking, inconsistent with the DAG. This is because of the possibilities of incidental cancellations across pathways and of changes which simply do not exceed the pre-defined meaningful threshold. For this reason, we suggest that the researcher maintain such arrows (thereby assuming “weak faithfulness” rather than faithfulness; see
[32], p. 190), but label these arrows for other research teams to examine with different datasets.
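As a minimal sketch of an add-one check of this kind (simulated data; the linear models and the 10% threshold are illustrative assumptions, not values prescribed here), one can compute the change in the exposure coefficient as each candidate covariable is added to the base model:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Illustrative structure: C confounds X and Y; P affects Y only.
# A DAG with C as a confounder implies a meaningful add-one change for C,
# but (in a linear model) essentially no change for P.
c = rng.standard_normal(n)
p = rng.standard_normal(n)
x = 0.6 * c + rng.standard_normal(n)
y = 1.0 * x + 0.8 * c + 0.8 * p + rng.standard_normal(n)

def exposure_coef(y, *covs):
    """Coefficient on the first covariate from an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + list(covs))
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_base = exposure_coef(y, x)
THRESHOLD = 0.10  # illustrative 10% "meaningful change" threshold

add_one = {}
for name, z in [("C", c), ("P", p)]:
    b_new = exposure_coef(y, x, z)
    change = abs(b_new - b_base) / abs(b_base)
    add_one[name] = (change, change > THRESHOLD)

print(add_one)  # C exceeds the threshold; P does not
```

Comparing the observed flags with those implied by the prior DAG is the step discussed above; a flag where none is implied calls for review, while an absent flag where one is implied is tolerated under weak faithfulness.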

A potential criticism of the approach is that it does not eliminate background knowledge from adjustment-variable selection. Indeed, the examples include instances of needing background knowledge to distinguish between DAGs giving the same add-one and minus-one patterns (e.g. confounding- vs. mediating-pathway examples, measurement-error vs. bias-amplification examples). It is well known that different DAGs can imply the same statistical relationships
[3, 7, 54], making an appeal to background knowledge unavoidable when using DAGs in applied work. We do not consider this a limitation, however, seeing background knowledge as valid information which should rarely be over-ruled by any single dataset but, rather, reviewed in light of the patterns in the data. This is particularly appropriate in clinical epidemiology, where we frequently know quite a lot about likely relationships between variables. In contrast, the approach is unlikely to be well adapted to datasets for which researchers have very little background knowledge, when alternative approaches such as DAG-discovery algorithms (below) may be used.

Another potential criticism is that the approach only addresses variable relationships relevant to the effect of interest, remaining agnostic about other regions of the DAG. This aims to focus on the research question at hand and to minimize the risk of “getting lost” in trying to explore all possible associations in the DAG, many of which do not directly impact on the selected exposure-outcome estimate. A researcher wishing to explore the full DAG could apply a DAG-discovery algorithm (e.g. the PC, GES, or FCI algorithms; see the TETRAD project’s website and
[7]). Such algorithmic approaches use statistical tests or scoring rules to identify edges between variables and can incorporate background knowledge such as the temporal ordering of variables or the forced inclusion or exclusion of arrows. However, they have proven controversial
[8] and have not yet crossed over into applied epidemiologic research. Nonetheless, recent applications of these algorithms in the biomedical literature for data with many variables and little background knowledge have been interesting
[55]. In the approach proposed in this article, a researcher could use these algorithms to explore additional prior starting DAGs. In our experience, however, these algorithms currently face practical challenges, including handling datasets with mixed continuous and categorical variables and dealing with issues such as measurement error and bias amplification.
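For orientation, constraint-based discovery algorithms such as PC rest on conditional-independence tests. A minimal sketch of their core primitive, a partial-correlation test on simulated data (the structure and coefficients are illustrative assumptions), is:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Hypothetical structure X <- C -> Y, with no direct X -> Y edge.
c = rng.standard_normal(n)
x = 0.7 * c + rng.standard_normal(n)
y = 0.7 * c + rng.standard_normal(n)

def partial_corr(a, b, given):
    """Correlation of a and b after regressing out 'given' (with intercept)."""
    Z = np.column_stack([np.ones(len(a)), given])
    ra = a - Z @ np.linalg.lstsq(Z, a, rcond=None)[0]
    rb = b - Z @ np.linalg.lstsq(Z, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

r_marginal = np.corrcoef(x, y)[0, 1]  # clearly non-zero: X-Y edge candidate
r_given_c = partial_corr(x, y, c)     # near zero: X independent of Y given C

print(f"marginal r: {r_marginal:.3f}, partial r given C: {r_given_c:.3f}")
```

The PC algorithm repeats such tests over growing conditioning sets to decide which edges to remove, then orients the remaining edges; implementations available via the TETRAD project wrap this kind of primitive.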

We wish to highlight several additional limitations of the proposed approach. Like the change-in-estimate procedure, the approach is *ad hoc* and informal as it depends on arbitrary thresholds and is not founded on well-defined statistical tests with appropriate theoretical properties. In addition, as discussed above, different DAG structures can give the same implied add-one and minus-one patterns and so more than one DAG will be consistent with the observed patterns. For this reason, the researcher should present all identified DAGs with implied patterns consistent with those observed; further, researchers should always remember that other DAGs (not identified) will also be consistent with the patterns.

Several extensions to the approach are possible, should it appeal to epidemiologists working on applied questions. These include how best to address sampling variability in the patterns, for example by comparing the performance of different decision rules based on the proportion of bootstrap samples which fall outside the meaningful threshold. Another potential extension concerns precision in choosing the adjustment set. We note that a researcher may wish to adjust for additional variables to improve precision
[56] and may wish to delete variables from the final adjustment set based on precision of estimates, as concluded in
[6]. Researchers should of course bear in mind that, as with any *a posteriori* variable selection, estimates from a revised DAG will tend to be over-precise. Finally, it may be possible to extend the approach to include recent advances in DAG theory, including selection variables to encode differences between populations (and so uncertainty about arrows)
[57], signed DAGs which specify assumptions about the positive or negative direction of paths
[58], and interactions using sufficient causation DAGs
[59].
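One way the bootstrap rule mentioned above might look (a sketch under assumed linear models, simulated data, and an illustrative 10% threshold, none of which are prescribed here):

```python
import numpy as np

rng = np.random.default_rng(3)
n, B = 500, 200
THRESHOLD = 0.10  # illustrative "meaningful change" threshold

# Simulated data with a genuine confounder C of the X -> Y effect.
c = rng.standard_normal(n)
x = 0.6 * c + rng.standard_normal(n)
y = 1.0 * x + 0.8 * c + rng.standard_normal(n)

def exposure_coef(y, *covs):
    """Coefficient on the first covariate from an OLS fit with intercept."""
    X = np.column_stack([np.ones(len(y))] + list(covs))
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def rel_change(y, x, z):
    """Relative change in the exposure coefficient when z is added."""
    b0 = exposure_coef(y, x)
    b1 = exposure_coef(y, x, z)
    return abs(b1 - b0) / abs(b0)

# Proportion of bootstrap resamples in which the change exceeds the threshold.
exceed = 0
for _ in range(B):
    idx = rng.integers(0, n, size=n)
    if rel_change(y[idx], x[idx], c[idx]) > THRESHOLD:
        exceed += 1
prop = exceed / B

print(f"proportion of bootstrap samples exceeding threshold: {prop:.2f}")
```

A decision rule might then retain the variable in the adjustment set only when this proportion is high; comparing such rules is exactly the kind of performance study the extension would require.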