Assessing subgroup effects with binary data: can the use of different effect measures lead to different conclusions?

White, Ian R; Elbourne, Diana

doi:10.1186/1471-2288-5-15

Standard measures of differences between outcome rates are problematic for identifying subgroup effects

James Scanlan, James P. Scanlan, Attorney at Law

8 June 2011

White and Elbourne[1] address the way that interaction tests are affected by whether one compares relative changes in risk of an outcome, relative changes in risk of the opposite outcome, absolute changes in outcome rates, or odds ratios. They recommend a conservative approach to identifying interaction that involves examining the measure that is least likely to show a statistically significant subgroup effect.

Consider an intervention that reduces one adverse outcome rate from 12.7% to 5.0% and another from 21.7% to 10.0%. Whether an interaction test finds a statistically significant difference between the two changes may depend on which measure of change is employed, which suggests that something may be amiss with interaction tests generally.
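As a rough illustration of how the four measures diverge for the figures just cited, the following Python sketch computes each of them for the two changes. The function name `effect_measures` and the dictionary keys are hypothetical labels chosen for this example; they do not come from White and Elbourne.

```python
def effect_measures(control_rate, treated_rate):
    """Return four common measures of change between two adverse outcome rates."""
    odds_control = control_rate / (1 - control_rate)
    odds_treated = treated_rate / (1 - treated_rate)
    return {
        # Ratio of adverse outcome rates (treated / control).
        "risk_ratio_adverse": treated_rate / control_rate,
        # Ratio of favorable outcome rates (treated / control).
        "risk_ratio_favorable": (1 - treated_rate) / (1 - control_rate),
        # Absolute reduction in the adverse outcome rate.
        "absolute_reduction": control_rate - treated_rate,
        # Odds ratio for the adverse outcome (treated / control).
        "odds_ratio": odds_treated / odds_control,
    }

# The two changes cited in the text: 12.7% -> 5.0% and 21.7% -> 10.0%.
low_base = effect_measures(0.127, 0.050)
high_base = effect_measures(0.217, 0.100)

for name in low_base:
    print(f"{name:22s} {low_base[name]:.3f}  {high_base[name]:.3f}")
```

On these figures the group with the lower base rate shows the larger proportionate reduction in the adverse outcome (about 61% against 54%), while the group with the higher base rate shows the larger proportionate increase in the favorable outcome (about 15% against 9%) and the larger absolute reduction.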

More importantly, the authors overlook the way that, owing to features inherent in the shape of normal distributions of risks of experiencing an outcome, standard measures of changes in outcome rates tend to be affected by the overall prevalence of an outcome. As a result, there is reason to expect that an intervention that similarly affects groups with different base rates of experiencing an outcome will commonly cause a larger proportionate change in the outcome for the group with the lower base rate while causing a larger proportionate change in the opposite outcome for the other group. Thus, statistical significance issues aside, a determination of which group benefited more in relative terms will often turn on whether one examines changes in the adverse outcome or the favorable outcome. Absolute differences and odds ratios also tend to be systematically affected by the overall prevalence of an outcome. While typical patterns of absolute differences and odds ratios are more difficult to describe, the two measures commonly yield contrasting interpretations as to which group benefited more from an intervention.[2-6]
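The pattern described above can be sketched numerically. The example below assumes, purely for illustration, that risk in each group follows a normal distribution with unit variance and that the intervention shifts every group's mean by the same 0.5 standard deviations; the function names and the chosen base rates are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def shifted_rate(base_rate, shift_sd):
    """Adverse outcome rate after the risk distribution shifts by shift_sd."""
    threshold = STD_NORMAL.inv_cdf(base_rate)
    return STD_NORMAL.cdf(threshold - shift_sd)

def relative_changes(base_rate, shift_sd=0.5):
    """Return (relative reduction in adverse, relative increase in favorable)."""
    new_rate = shifted_rate(base_rate, shift_sd)
    adverse_reduction = 1 - new_rate / base_rate
    favorable_increase = (1 - new_rate) / (1 - base_rate) - 1
    return adverse_reduction, favorable_increase

# Same shift applied to groups with different (illustrative) base rates.
for base in (0.05, 0.10, 0.20, 0.40):
    adverse, favorable = relative_changes(base)
    print(f"base {base:.2f}: adverse -{adverse:.1%}, favorable +{favorable:.1%}")
```

Under these assumptions the relative reduction in the adverse outcome shrinks as the base rate rises, while the relative increase in the favorable outcome grows, so the two relative measures point to different groups as having benefited more.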

The hip trial data examined by White and Elbourne may have provided an unfortunate example. Among the strong suspicion group, ultrasound appeared to reduce the risk of surgery and the risk of any hip treatment. But among the moderate suspicion group, those treated with ultrasound had a slightly higher surgery rate and only a negligibly lower rate of any hip treatment than those not treated. So the above-described patterns as to the comparative size of relative changes in one outcome and relative changes in the opposite outcome are not present. But, given that the shapes of normal distributions provide a statistical basis for expecting such patterns absent countervailing forces, the patterns must be taken into account.

In order to identify a meaningful subgroup effect, one must employ a measure that is unaffected by the overall prevalence of an outcome. The only apparent such measure is an estimate, derived from the outcome rates of subjects receiving and not receiving an intervention, of the difference between the means of the hypothesized underlying risk distributions.[4,6] For the figures cited above, such a measure would show the two changes to be essentially the same: each reflects a situation where the intervention shifted the mean of the underlying distribution by approximately half a standard deviation.
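One way to obtain such an estimate, offered here as a minimal sketch rather than as the exact procedure of references 4 and 6, is to convert each outcome rate to its standard normal quantile and take the difference. This assumes normal underlying risk distributions with equal variance in the treated and untreated groups; `probit_shift` is a hypothetical name.

```python
from statistics import NormalDist

def probit_shift(control_rate, treated_rate):
    """Estimated shift, in standard deviations, between the means of the
    underlying risk distributions implied by two adverse outcome rates."""
    inv = NormalDist().inv_cdf  # standard normal quantile function
    return inv(control_rate) - inv(treated_rate)

# The two changes cited in the text.
shift_low_base = probit_shift(0.127, 0.050)
shift_high_base = probit_shift(0.217, 0.100)

print(f"12.7% -> 5.0%:  {shift_low_base:.2f} SD")
print(f"21.7% -> 10.0%: {shift_high_base:.2f} SD")
```

Both changes come out at roughly 0.50 standard deviations, which is the sense in which the two effects are the same on this measure.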

These considerations must be borne in mind both in appraising subgroup effects and in interpreting meta-analyses when base rates differ from study to study or within studies.[7] They must also be borne in mind in the crucial task of estimating the absolute risk reduction for a particular subgroup in the absence of reliable information for that subgroup.

References:

1. White IR, Elbourne D. Assessing subgroup effects with binary data: can the use of different effect measures lead to different conclusions? BMC Medical Research Methodology 2005;5:15: http://www.biomedcentral.com/1471-2288/5/15 (Accessed June 7, 2010.)

2. Scanlan JP. Race and mortality. Society 2000;37(2):19-35: http://www.jpscanlan.com/images/Race_and_Mortality.pdf (Accessed June 7, 2011.)

3. Scanlan JP. Divining difference. Chance 1994;7(4):38-9,48: http://jpscanlan.com/images/Divining_Difference.pdf (Accessed June 7, 2011.)

4. Scanlan JP. Interpreting Differential Effects in Light of Fundamental Statistical Tendencies, presented at 2009 Joint Statistical Meetings of the American Statistical Association, International Biometric Society, Institute for Mathematical Statistics, and Canadian Statistical Society, Washington, DC, Aug. 1-6, 2009: http://www.jpscanlan.com/images/JSM_2009_ORAL.pdf; http://www.jpscanlan.com/images/Scanlan_JSM_2009.ppt (Accessed June 7, 2011.)

5. Scanlan’s Rule page of jpscanlan.com: http://jpscanlan.com/scanlansrule.html (Accessed June 7, 2011.)

6. Subgroup Effects sub-page of Scanlan’s Rule page of jpscanlan.com: http://www.jpscanlan.com/scanlansrule/subgroupeffects.html (Accessed June 7, 2011.)

7. Meta-analysis sub-page of Scanlan’s Rule page of jpscanlan.com: http://jpscanlan.com/scanlansrule/metaanalysis.html (Accessed June 7, 2011.)

Competing interests

None.

Archived Comments for: Assessing subgroup effects with binary data: can the use of different effect measures lead to different conclusions?

BMC Medical Research Methodology
