The role of causal reasoning in understanding Simpson's paradox, Lord's paradox, and the suppression effect: covariate selection in the analysis of observational studies

Abstract

Tu et al present an analysis of the equivalence of three paradoxes, namely Simpson's paradox, Lord's paradox, and the suppression effect. They conclude that all three simply reiterate the occurrence of a change in the association of any two variables when a third variable is statistically controlled for. This is not surprising because reversal or change in magnitude is common in conditional analysis. At the heart of the phenomenon of change in magnitude, with or without reversal of the effect estimate, is the question of which estimate to use: the unadjusted (combined table) or the adjusted (sub-table) one. Hence, Simpson's paradox and related phenomena are a problem of covariate selection and adjustment (when to adjust or not) in the causal analysis of non-experimental data. It cannot be overemphasized that although these paradoxes reveal the perils of using statistical criteria to guide causal analysis, they hold neither the explanations of the phenomena they depict nor the pointers on how to avoid them. The explanations and solutions lie in causal reasoning, which relies on background knowledge, not statistical criteria.

Commentary

Simpson's paradox, Lord's paradox, and the suppression effect are examples of the perils of the statistical interpretation of a real but complex world. By rearing their heads intermittently in the literature, they remind us of the inadequacy of statistical criteria for causal analysis. Those who believe in letting the data speak for themselves are in for a disappointment.

Now, suppose there are no other unmeasured covariates given the DAGs in Figures 1 to 7. If Figure 1 is the true state of affairs, the unadjusted analysis suffices to estimate the total effect of BW on BP. If, however, Figure 2 or 3 applies, then the adjusted analysis (that is, conditional on CW) is needed to estimate the total effect of BW on BP, because conditioning on CW blocks the back-door path from BW to BP: BW←CW→BP in Figure 2 or BW←U→CW→BP in Figure 3. The reader may by now doubt the correctness of Figure 2, where the later observation CW is a confounder of the effect of BW on BP, since one could argue that, by occurring after BW, CW cannot be a common cause of both BW and BP. (See Hernán et al [8] for an accessible defence of the structural approach to confounding and selection bias using DAGs.) Nonetheless, while temporality seemingly excludes CW as a confounder in Figure 2, it does not exclude CW from being part of a confounding path, as in Figure 3: BW and CW are more likely to be the result of a common cause (U), possibly genetic. On background knowledge and common sense, Figure 3 is more plausible than Figure 2. Temporality alone, therefore, cannot be used to judge whether a variable is a confounder, or part of a sufficient subset of covariates needed to block a back-door path [5].
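The back-door logic of Figure 3 can be checked numerically. The sketch below is a minimal linear simulation with invented coefficients (none of the numbers come from Tu et al): an unobserved U drives both BW and CW, CW affects BP, and the true effect of BW on BP is set to -0.5. The unadjusted regression is biased by the open back-door path, while adjusting for CW recovers the true effect.

```python
import numpy as np

# Hypothetical simulation of Figure 3: an unobserved common cause U drives
# both birth weight (BW) and current weight (CW), and CW affects blood
# pressure (BP). The true causal effect of BW on BP is set to -0.5.
# All variable names and coefficients are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200_000
u = rng.normal(size=n)                           # unobserved common cause
bw = 0.8 * u + rng.normal(size=n)                # birth weight
cw = 0.9 * u + rng.normal(size=n)                # current weight
bp = -0.5 * bw + 0.7 * cw + rng.normal(size=n)   # blood pressure

# Unadjusted analysis: the back-door path BW <- U -> CW -> BP stays open,
# so the slope of BP on BW is biased.
unadjusted = np.polyfit(bw, bp, 1)[0]

# Adjusted analysis: conditioning on CW blocks the back-door path.
X = np.column_stack([bw, cw, np.ones(n)])
adjusted = np.linalg.lstsq(X, bp, rcond=None)[0][0]

print(f"unadjusted slope: {unadjusted:.2f}")  # biased away from -0.5
print(f"adjusted slope:   {adjusted:.2f}")    # close to the true -0.5
```

With this structure the adjusted coefficient converges to the true -0.5, while the unadjusted one is pulled toward zero by the confounding path through U and CW.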

Figure 4 presents a scenario where the unadjusted effect of BW on BP is the correct estimate: CW is a collider (that is, without conditioning, it already blocks the path) in the DAG depicting two unobserved common causes, one of BW and CW and one of CW and BP. This scenario is closely related to that in Figure 5, where BW has no effect on, but shares an unobserved common cause (U3) with, BP. In all scenarios, our choice between the unadjusted and the adjusted estimate rests not on the magnitude or direction of the estimate but on the governing causal relations. Put this way, Simpson's paradox becomes a problem of covariate adjustment (when to adjust or not) in the causal analysis of non-experimental or observational data. The paradox arises from giving a causal interpretation to a merely predictive observation: when the status of the third covariate CW is unknown, the proportion at a given level of BW is evidence only for an educated guess about the proportion at a given BP level in the observed sample [5]. What we really want to answer is "Does BW cause BP?", not "Does observing BW allow us to predict BP?".
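Figure 4 makes the opposite point: there, adjustment itself creates bias. The illustrative simulation below (coefficients invented, not from Tu et al) places CW as a collider on BW←U1→CW←U2→BP, with a true BW→BP effect of -0.5; the unadjusted estimate is valid, while conditioning on CW opens the colliding path and distorts it.

```python
import numpy as np

# Hypothetical simulation of Figure 4: CW is a collider on the path
# BW <- U1 -> CW <- U2 -> BP, and the true effect of BW on BP is -0.5.
# Names and coefficients are illustrative assumptions only.
rng = np.random.default_rng(1)
n = 200_000
u1 = rng.normal(size=n)                          # unobserved cause of BW, CW
u2 = rng.normal(size=n)                          # unobserved cause of CW, BP
bw = 0.8 * u1 + rng.normal(size=n)
cw = 0.7 * u1 + 0.7 * u2 + rng.normal(size=n)    # collider
bp = -0.5 * bw + 0.8 * u2 + rng.normal(size=n)

# Unadjusted analysis is valid here: the collider CW already blocks
# the non-causal path between BW and BP.
unadjusted = np.polyfit(bw, bp, 1)[0]

# Adjusting for the collider opens the path and biases the estimate.
X = np.column_stack([bw, cw, np.ones(n)])
adjusted = np.linalg.lstsq(X, bp, rcond=None)[0][0]

print(f"unadjusted slope: {unadjusted:.2f}")  # close to the true -0.5
print(f"adjusted slope:   {adjusted:.2f}")    # biased by collider stratification
```

The same statistical operation, conditioning on CW, removes bias in the Figure 3 structure and introduces it here, which is precisely why the choice must be made on causal, not statistical, grounds.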

As Pearl has noted [5], people think "causes", not proportions (and it is proportions that drive Simpson's paradox); "reversal" is possible in the calculus of proportions but impossible in the calculus of causes. Put in Pearl's causal language, the invariant causal reading that is wrongly imposed on the reversing proportions in Simpson's paradox is as follows:

$\mathrm{Pr}\left(\text{BP}=\text{high}|do\left\{\text{BW}=\text{high}\right\},\text{CW}=\text{high}\right)<\mathrm{Pr}\left(\text{BP}=\text{high}|do\left\{\text{BW}=\text{low}\right\},\text{CW}=\text{high}\right)$
(1)
$\mathrm{Pr}\left(\text{BP}=\text{high}|do\left\{\text{BW}=\text{high}\right\},\text{CW}=\text{low}\right)<\mathrm{Pr}\left(\text{BP}=\text{high}|do\left\{\text{BW}=\text{low}\right\},\text{CW}=\text{low}\right)$
(2)

where, according to our causal intuition, the combined or unadjusted analysis should be:

$\mathrm{Pr}\left(\text{BP}=\text{high}|do\left\{\text{BW}=\text{high}\right\}\right)<\mathrm{Pr}\left(\text{BP}=\text{high}|do\left\{\text{BW}=\text{low}\right\}\right)$
(3)

The inequalities in (1), (2) and (3) reflect the "sure thing principle", which, applied to Tu et al's paper, goes as follows: an action do{BW} that decreases the probability of the event BP in each CW subpopulation must also decrease the probability of BP in the whole population, provided that the action do{BW} does not change the distribution of the CW subpopulations. See Pearl [5] for a formal proof, although the sure thing principle follows naturally from the semantics of actions as modifiers of mechanisms, as embodied by the do(·) operator. What is numerically observed in Simpson's paradox, however, is

$\mathrm{Pr}\left(\text{BP}=\text{high}|\text{BW}=\text{high}\right)<\mathrm{Pr}\left(\text{BP}=\text{high}|\text{BW}=\text{low}\right)$
(4)

which goes against our causal intuition or inclination to think "causes". If the DAG in Figure 3 – or, for the sake of argument, Figure 2 – applies, then we must consult the conditional analysis represented by inequalities 1 and 2, not the observed unconditional analysis in inequality 4. In this context, inequality 4 can only be seen as the evidence about BP that BW provides in the absence of information on CW, not as a statement of the causal effect of BW on BP, which is what inequality 3 captures [5]. That is, Simpson's paradox arises because, when CW is unknown to us, observing a given level of BW gives us evidence for predicting (as in inequality 4) the proportion of {BP=high} in the non-experimental data, but we cannot take this observed association to stand for our causal knowledge of what doing {BW=high} or {BW=low} would do to {BP=high}, which is what inequality 3 depicts. Hence, prediction does not imply aetiology. The former deals with usually transitory proportions whereas the latter deals with invariant causal relations.
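A small fabricated two-by-two-by-two table makes the reversal of proportions concrete: within each CW stratum the proportion of {BP=high} is lower under {BW=high}, yet pooling over CW reverses the ordering. The counts below are invented for illustration only, not taken from Tu et al.

```python
# Fabricated counts: (events, total) for {BP = high} by BW group within
# each CW stratum. In BOTH strata, {BW = high} has the lower proportion.
strata = {
    "CW = high": {"BW = high": (18, 30), "BW = low": (7, 10)},
    "CW = low":  {"BW = high": (2, 10),  "BW = low": (9, 30)},
}

pooled = {"BW = high": [0, 0], "BW = low": [0, 0]}
for stratum, table in strata.items():
    for bw_level, (events, total) in table.items():
        pooled[bw_level][0] += events
        pooled[bw_level][1] += total
        print(f"{stratum}, {bw_level}: Pr(BP = high) = {events / total:.2f}")

# Pooling over CW reverses the ordering: {BW = high} now shows the
# HIGHER proportion of {BP = high} in the combined table.
for bw_level, (events, total) in pooled.items():
    print(f"pooled, {bw_level}: Pr(BP = high) = {events / total:.2f}")
```

Both the stratum-specific and the pooled proportions are arithmetically correct; only causal knowledge of CW's role can say which comparison answers the causal question.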

A further illustration of the futility of continued statistical discussion of the paradoxes is the treatment of the suppression effect: how an unrelated covariate (CW) "increases the overall model fit ... assessed by R2 ..." [1]. Tu et al should not be surprised that suppression is little known in epidemiology, because epidemiologists do not and should not use the squared multiple correlation coefficient R2 as a measure of goodness-of-fit. As Tu et al algebraically acknowledge, R2 only indicates the proportion of the variance in BP, the outcome, attributable to variation in the fitted mean of BP [9]. The expected value of R2 can increase as more variables, even unrelated ones, are added to the model, making it a useless criterion for guiding covariate selection [10].
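The behaviour of R2 under irrelevant covariates is easy to demonstrate. The sketch below uses simulated data (not Tu et al's): it adds ten pure-noise predictors, one at a time, to a linear model of an outcome unrelated to all of them, and the in-sample R2 never decreases.

```python
import numpy as np

# Illustrative simulation: the outcome y is pure noise, unrelated to any
# predictor, yet in-sample R2 only grows as noise covariates are added.
rng = np.random.default_rng(2)
n = 200
y = rng.normal(size=n)

def r_squared(X, y):
    """In-sample R2 of an OLS fit of y on X (intercept included)."""
    X = np.column_stack([X, np.ones(len(y))])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - resid.var() / y.var()

r2_values = []
X = np.empty((n, 0))
for _ in range(10):                      # add ten unrelated covariates
    X = np.column_stack([X, rng.normal(size=n)])
    r2_values.append(r_squared(X, y))

# R2 is monotone non-decreasing in nested OLS models, even though every
# added covariate is irrelevant to y.
assert all(b >= a for a, b in zip(r2_values, r2_values[1:]))
print([round(v, 3) for v in r2_values])
```

This is the algebraic point in the commentary: maximizing R2 rewards any added covariate, relevant or not, so it cannot adjudicate which covariates belong in a causal model.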

Furthermore, Tu et al make passing mention of direct versus indirect effects (as might arise in considering adjustment in Figure 1). This is beyond the scope of their paper and, therefore, of my commentary; I refer the curious reader to the important work on the complex issues involved in estimating direct effects [3,5,11-14]. Suffice it to say that, in common situations where total effect estimation is possible, the direct effect may be unidentifiable. For instance, although the total effect of BW on BP can still be consistently estimated in a scenario with an additional unobserved common cause (U) of CW and BP, as in Figure 6 (modified from Figure 2), the direct effect of BW on BP cannot be identified without measuring U in Figure 7, a similar modification of Figure 1. Like Pearl [5] and Holland and Rubin [15], I take these paradoxes to be related to causal concepts and, thus, best understood in the context of causal analysis.

In conclusion, it cannot be overemphasized that although Simpson's and related paradoxes reveal the perils of using statistical criteria to guide causal analysis, they hold neither the explanations of the phenomenon they purport to depict nor the pointers on how to avoid them. The explanations and solutions lie in causal reasoning which relies on background knowledge, not statistical criteria. It is high time we stopped treating misinterpreted signs and symptoms ('paradoxes'), and got on with the business of handling the disease ('causality'). We should rightly turn our attention to the perennial problem of covariate selection for causal analysis using non-experimental data.

References

1. Tu Y-K, Gunnell DJ, Gilthorpe MS: Simpson's paradox, Lord's paradox, and suppression effects are the same phenomenon – the reversal paradox. Emerg Themes Epidemiol. 2008, 5: 2. 10.1186/1742-7622-5-2

2. Oxford University: Oxford Dictionary, Thesaurus, and Wordpower Guide. Oxford: Oxford University Press; 2001.

3. Pearl J: Causal diagrams for empirical research. Biometrika. 1995, 82: 669-710. 10.1093/biomet/82.4.669

4. Greenland S, Pearl J, Robins JM: Causal diagrams for epidemiologic research. Epidemiol. 1999, 10 (1): 37-48. 10.1097/00001648-199901000-00008

5. Pearl J: Causality. Models, Reasoning and Inference. Cambridge: Cambridge University Press; 2000.

6. Pearl J: Causal inference in health sciences. Health Serv Outcomes Res Methodol. 2001, 2: 189-220. 10.1023/A:1020315127304

7. Robins JM: Data, design, and background knowledge in etiologic inference. Epidemiol. 2001, 12 (3): 313-320. 10.1097/00001648-200105000-00011

8. Hernan MA, Hernandez-Diaz S, Robins JM: A structural approach to selection bias. Epidemiol. 2004, 15 (5): 615-625. 10.1097/01.ede.0000135174.63482.43

9. Rothman KJ, Greenland S, Lash TL: Modern Epidemiology. 3rd edition. Philadelphia: Lippincott Williams & Wilkins; 2008.

10. Altman DG: Practical Statistics for Medical Research. Boca Raton, FL: Chapman & Hall; 1991.

11. Robins JM, Greenland S: Identifiability and exchangeability for direct and indirect effects. Epidemiol. 1992, 3 (2): 143-155. 10.1097/00001648-199203000-00013

12. Pearl J: Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann, 2001, 411-420.

13. Cole SR, Hernan MA: Fallibility in estimating direct effects. Int J Epidemiol. 2002, 31: 163-165. 10.1093/ije/31.1.163

14. Petersen ML, Sinisi SE, van der Laan MJ: Estimation of direct causal effects. Epidemiol. 2006, 17: 276-284. 10.1097/01.ede.0000208475.99429.2d

15. Holland PW, Rubin DB: On Lord's paradox. Principles of Modern Psychological Measurement. Edited by: Wainer H, Messick S. Hillsdale, NJ: Lawrence Erlbaum Associates, 1982, 3-25.

Acknowledgements

This work was supported by a Rubicon fellowship (grant number 825.06.026) awarded by the Board of the Council for Earth and Life Sciences (ALW) at the Netherlands Organisation for Scientific Research (NWO). The author thanks Timothy Hallett and ETE's editorial board and associate editors for their insightful comments. This paper represents the author's own opinions, not those of ETE or other relevant affiliations.

Author information

Corresponding author

Correspondence to Onyebuchi A Arah.

Competing interests

OAA is an associate faculty editor of the journal Emerging Themes in Epidemiology (ETE).

Arah, O.A. The role of causal reasoning in understanding Simpson's paradox, Lord's paradox, and the suppression effect: covariate selection in the analysis of observational studies. Emerg Themes Epidemiol 5, 5 (2008). https://doi.org/10.1186/1742-7622-5-5

Keywords

• Blood Pressure
• Birth Weight
• Causal Analysis
• Causal Interpretation
• Current Weight