class: center middle main-title section-title-1

# Introduction to causal inference

.class-info[

**Session 11**

.light[MATH 80667A: Experimental Design and Statistical Methods <br>
HEC Montréal
]

]

---
name: outline
class: title title-inv-1

# Outline

--

.box-1.large.sp-after-half[Basics of causal inference]

--

.box-2.large.sp-after-half[Directed acyclic graphs]

--

.box-3.large.sp-after-half[Causal mediation]

---
layout: false
name: basicscausal
class: center middle section-title section-title-1

# Causal inference

---
layout: true
class: title title-1

---

# Correlation is not causation

<div class="figure" style="text-align: center">
<img src="img/11/xkcd552_correlation.png" alt="xkcd comic 552 by Randall Munroe, CC BY-NC 2.5 license. Alt text: Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'." width="55%" />
<p class="caption">xkcd comic 552 by Randall Munroe, CC BY-NC 2.5 license. Alt text: Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.</p>
</div>

---

# Spurious correlation

<div class="figure" style="text-align: center">
<img src="img/11/5920_per-capita-consumption-of-margarine_correlates-with_the-divorce-rate-in-maine.png" alt="Spurious correlation by Tyler Vigen, licensed under CC BY 4.0" width="60%" />
<p class="caption">Spurious correlation by Tyler Vigen, licensed under CC BY 4.0</p>
</div>

---

# Correlation vs causation

<div class="figure" style="text-align: center">
<img src="img/11/correlation_causation.jpg" alt="Illustration by Andrew Heiss, licensed under CC BY 4.0" width="60%" />
<p class="caption">Illustration by Andrew Heiss, licensed under CC BY 4.0</p>
</div>

---

# Potential outcomes

For individual `\(i\)`, we postulate the existence of potential outcomes

- `\(Y_i(1)\)` (response for treatment `\(X=1\)`) and
- `\(Y_i(0)\)` (response for control `\(X=0\)`).

Both are possible, but only one will be realized.

.box-1.medium[Observe outcome for a single treatment]

- Result `\(Y(X)\)` of your test given that you either party `\((X=1)\)` or study `\((X=0)\)` the night before your exam.

---

# Fundamental problem of causal inference

With binary treatment `\(X_i\)`, I observe either `\(Y_i \mid \text{do}(X_i=1)\)` or `\(Y_i \mid \text{do}(X_i=0)\)`.

<table class="table" style="margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:center;"> \(i\) </th>
   <th style="text-align:center;"> \(X_i\) </th>
   <th style="text-align:center;"> \(Y_i(0)\) </th>
   <th style="text-align:center;"> \(Y_i(1)\) </th>
   <th style="text-align:center;"> \(Y_i(1)-Y_i(0)\) </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> ? </td>
   <td style="text-align:center;"> 4 </td>
   <td style="text-align:center;"> ? </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 2 </td>
   <td style="text-align:center;"> 0 </td>
   <td style="text-align:center;"> 3 </td>
   <td style="text-align:center;"> ? </td>
   <td style="text-align:center;"> ? </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 3 </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> ? </td>
   <td style="text-align:center;"> 6 </td>
   <td style="text-align:center;"> ? </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 4 </td>
   <td style="text-align:center;"> 0 </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> ? </td>
   <td style="text-align:center;"> ? </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 5 </td>
   <td style="text-align:center;"> 0 </td>
   <td style="text-align:center;"> 5 </td>
   <td style="text-align:center;"> ? </td>
   <td style="text-align:center;"> ? </td>
  </tr>
  <tr>
   <td style="text-align:center;"> 6 </td>
   <td style="text-align:center;"> 1 </td>
   <td style="text-align:center;"> ? </td>
   <td style="text-align:center;"> 7 </td>
   <td style="text-align:center;"> ?
</td>
  </tr>
</tbody>
</table>

---

# Causal assumptions?

Since we can't estimate individual treatment effects, we consider the **average** treatment effect (average over population) `\(\mathsf{E}\{Y(1) - Y(0)\}\)`.

The latter can be estimated as

`\begin{align*} \textsf{ATE} = \underset{\substack{\text{expected response among}\\\text{treatment group}}}{\mathsf{E}(Y \mid X=1)} - \underset{\substack{\text{expected response among}\\\text{control group}}}{\mathsf{E}(Y \mid X=0)} \end{align*}`

When is this a valid causal effect?

---

# (Untestable) assumptions

For the ATE to be equivalent to `\(\mathsf{E}\{Y(1) - Y(0)\}\)`, we need:

1. conditional *ignorability*, which states that potential outcomes are independent (denoted with the `\({\perp\mkern-10mu\perp}\)` symbol) of assignment to treatment given a set of explanatories `\(\boldsymbol{Z}\)`. In notation `\(\{Y(0), Y(1)\} {\perp\mkern-10mu\perp} X \mid \boldsymbol{Z}\)`
2. lack of interference: the outcome of any participant is unaffected by the treatment assignment of other participants.
3. consistency: given a treatment `\(X\)` taking level `\(j\)`, the observed value for the response `\(Y \mid X=j\)` is equal to the corresponding potential outcome `\(Y(j)\)`.

---
layout: false
name: dag
class: center middle section-title section-title-2

# Directed acyclic graphs

## .color-light-1[Slides by Dr. Andrew Heiss, CC BY-NC 4.0 License.]

---
layout: true
class: title title-2

---

# Types of data

.pull-left[
.box-inv-2.medium.sp-after-half[Experimental]

.box-2.sp-after-half[You have control over which units get treatment]
]

--

.pull-right[
.box-inv-2.medium.sp-after-half[Observational]

.box-2.sp-after-half[You don't have control over which units get treatment]
]

---

# Causal diagrams

.box-inv-2.medium.sp-after-half[Directed acyclic graphs (DAGs)]

.pull-left[
.box-2.SMALL[**Directed**: Each node has an arrow that points to another node]

.box-2.SMALL[**Acyclic**: You can't cycle back to a node (and arrows only have one direction)]

.box-2.SMALL[**Graph**: A set of nodes (variables) and edges (arrows indicating interdependence)]
]

.pull-right[
<img src="11-slides_files/figure-html/simple-dag-1.png" width="100%" style="display: block; margin: auto;" />
]

---

# Causal diagrams

.box-inv-2.medium.sp-after-half[Directed acyclic graphs (DAGs)]

.pull-left[
.box-2.SMALL[Graphical model of the process that generates the data]

.box-2.SMALL[Maps your logical model]
]

.pull-right[
![](11-slides_files/figure-html/simple-dag-1.png)
]

---

# Three types of associations

.pull-left-3[
.box-2.medium[Confounding]

<img src="11-slides_files/figure-html/confounding-dag-1.png" width="100%" style="display: block; margin: auto;" />

.box-inv-2.small[Common cause]
]

.pull-middle-3.center[
.box-2.medium[Causation]

<img src="11-slides_files/figure-html/mediation-dag-1.png" width="100%" style="display: block; margin: auto;" />

.box-inv-2.small[Mediation]
]

.pull-right-3[
.box-2.medium[Collision]

<img src="11-slides_files/figure-html/collision-dag-1.png" width="100%" style="display: block; margin: auto;" />

.box-inv-2.small[Selection /<br>endogeneity]
]

---

# Confounding

.pull-left-wide[
<img src="11-slides_files/figure-html/confounding-dag-big-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right-narrow[
.box-inv-2.medium.sp-after-half[**X** causes **Y**]

.box-inv-2.medium.sp-after-half[But **Z** causes both
**X** and **Y**]

.box-inv-2.medium.sp-after-half[**Z** *confounds* the **X** → **Y** association]
]

---

# Confounder: effect of money on elections

.box-inv-2.medium.sp-after-half[What are the paths<br>between **money** and **win margin**?]

.pull-left[
<img src="11-slides_files/figure-html/money-elections-1.png" width="100%" style="display: block; margin: auto;" />
]

--

.pull-right[
.box-2.sp-after-half[Money → Margin]

.box-2.sp-after-half[Money ← Quality → Margin]

.box-inv-2.sp-after-half[Quality is a *confounder*]
]

---

# Experimental data

Since we randomize assignment to treatment `\(X\)`, all arrows **pointing into** `\(X\)` are removed.

With observational data, we need to explicitly model the relationship to isolate the effect of `\(X\)` on `\(Y\)`.

---

# How to adjust with observational data

.box-inv-2.medium.sp-after-half[Include covariate in regression]

.center.float-left[
.box-inv-2.medium[Matching]

.box-inv-2.medium[Stratifying]

.box-inv-2.medium.sp-before-half[Inverse probability weighting]
]

---

# Causation

.pull-left-wide[
<img src="11-slides_files/figure-html/causation-dag-big-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right-narrow[
.box-inv-2.medium.sp-after-half[**X** causes **Y**]

.box-inv-2.medium.sp-after-half[**X** causes<br>**Z** which causes **Y**]

.box-2.medium.sp-after-half[**Z** is a mediator]
]

---

# Colliders

.pull-left-wide[
<img src="11-slides_files/figure-html/collider-dag-big-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right-narrow[
.box-inv-2.medium.sp-after-half[**X** causes **Z**]

.box-inv-2.medium.sp-after-half[**Y** causes **Z**]

.box-2.medium.sp-after-half[Should you control for **Z**?]
]

---
layout: false

.pull-left[
.box-2.medium[Colliders can create<br>fake causal effects]
]

.pull-right[
.box-2.medium[Colliders can hide<br>real causal effects]
]

<img src="11-slides_files/figure-html/bulls-scores-1.png" width="50%" style="display: block; margin: auto;" />

.center[
.box-inv-2[Height is unrelated to basketball skill… among NBA players]
]

---
layout: true
class: title title-2

---

# Colliders and selection bias

<img src="11-slides_files/figure-html/nba-dag-1.png" width="65%" style="display: block; margin: auto;" />

---

# Conditioning on colliders

- [Omnipresent in the literature](https://doi.org/10.1146/annurev-soc-071913-043455)
- [Example: When and how does the number of children affect marital satisfaction? An international survey](https://doi.org/10.1371/journal.pone.0249516)
- [Example: The Predictive Validity of the GRE Across Graduate Outcomes](https://doi.org/10.1080/00221546.2023.2187177)

???

A new collider bias teaching example. Sample selects on marriage (not divorced) so: satisfaction ––> [not divorced] <–– children

(Richard McElreath, Apr 26, 2021 on Twitter)

Example of confounder: https://doi.org/10.1177/109467051454314

---

# Three types of associations

.pull-left-3[
.box-2.medium[Confounding]

![](11-slides_files/figure-html/confounding-dag-1.png)

.box-inv-2.small.sp-after-half[Common cause]

.box-inv-2.small[Causal fork **X** ← **Z** → **Y**]
]

.pull-middle-3[
.box-2.medium[Causation]

![](11-slides_files/figure-html/mediation-dag-1.png)

.box-inv-2.small.sp-after-half[Mediation]

.box-inv-2.small[Causal chain **X** → **Z** → **Y**]
]

.pull-right-3[
.box-2.medium[Collision]

![](11-slides_files/figure-html/collision-dag-1.png)

.box-inv-2.small.sp-after-half[Selection /<br>endogeneity]

.box-inv-2.small[Inverted fork **X** → **Z** ← **Y**]
]

---

# Life is inherently complex

<img src="img/12/dagitty-model.png" width="50%" style="display: block; margin: auto;" />

.small[
Postulated DAG for the effect of smoking on fetal alcohol spectrum
disorders (FASD)
]

???

Source: Andrew Heiss (?), likely from McQuire, C., Daniel, R., Hurt, L. et al. The causal web of foetal alcohol spectrum disorders: a review and causal diagram. Eur Child Adolesc Psychiatry 29, 575–594 (2020). https://doi.org/10.1007/s00787-018-1264-3

---
layout: false
name: causal-mediation
class: center middle section-title section-title-3

# Causal mediation

---
layout: true
class: title title-3

---

# Key references

- Imai, Keele and Tingley (2010), [A General Approach to Causal Mediation Analysis](https://doi.org/10.1037/a0020761), *Psychological Methods*.
- Pearl (2014), [Interpretation and Identification of Causal Mediation](http://dx.doi.org/10.1037/a0036434), *Psychological Methods*.
- Baron and Kenny (1986), [The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations](https://doi.org/10.1037/0022-3514.51.6.1173), *Journal of Personality and Social Psychology*

Limitations:

- Bullock, Green, and Ha (2010), [Yes, but what’s the mechanism? (don’t expect an easy answer)](https://doi.org/10.1037/a0018933)
- Uri Simonsohn (2022) [Mediation Analysis is Counterintuitively Invalid](http://datacolada.org/103)

<!--
- Zhao, Lynch and Chen (2010), [Reconsidering Baron and Kenny: Myths and Truths about Mediation Analysis](https://doi.org/10.1086/651257), *Journal of Consumer Research*
- [David Kenny's website](https://davidakenny.net/cm/mediate.htm)
- Imai, Tingley and Yamamoto (2013), [Experimental designs for identifying causal mechanisms (with Discussion)](https://doi.org/10.1111/j.1467-985X.2012.01032.x), Journal of the Royal Statistical Society: Series A.
-->

---

# Sequential ignorability assumption

Define

- the potential mediator given treatment `\(x\)` as `\(M_i(x)\)` and
- the potential outcome for treatment `\(x\)` and mediator `\(m\)` as `\(Y_i(x, m)\)`.
- Given pre-treatment covariates `\(\boldsymbol{Z}\)`, the potential outcomes and potential mediators are conditionally independent of treatment assignment.
$$ Y_i(x', m), M_i(x) \perp\mkern-10mu\perp X_i \mid \boldsymbol{Z}_i = \boldsymbol{z}$$
- Given pre-treatment covariates and observed treatment, the potential outcomes are independent of the potential mediator.
$$ Y_i(x', m) \perp\mkern-10mu\perp M_i(x) \mid X_i =x, \boldsymbol{Z}_i = \boldsymbol{z}$$

---

# Total effect

**Total effect**: overall impact of `\(X\)` (both through `\(M\)` and directly)

`$$\begin{align*}\mathsf{TE}(x, x^*) = \mathsf{E}[ Y \mid \text{do}(X=x)] - \mathsf{E}[ Y \mid \text{do}(X=x^*)]\end{align*}$$`

.pull-left[
.box-inv-3[
**X** → **M** → **Y** <br>plus <br>**X** → **Y**
]
]

.pull-right[
<img src="11-slides_files/figure-html/moderation-1.png" width="80%" style="display: block; margin: auto;" />
]

---

# Average controlled direct effect

`$$\begin{align*}\textsf{CDE}(m, x, x^*) &= \mathsf{E}[Y \mid \text{do}(X=x, M=m)] - \mathsf{E}[Y \mid \text{do}(X=x^*, M=m)] \\&= \mathsf{E}\{Y(x, m) - Y(x^*, m)\} \end{align*}$$`

Expected population change in response when the experimental factor changes from `\(x\)` to `\(x^*\)` and the mediator is set to a fixed value `\(m\)`.

Problem: this requires manipulating the mediator, and only gives the effect at a fixed value `\(m\)`.

---

# Direct and indirect effects

The **natural direct effect** is the expected change in `\(Y\)` under treatment `\(x\)` if `\(M\)` is set to whatever value it would take under control `\(x^*\)`:

`$$\textsf{NDE}(x, x^*) = \mathsf{E}[Y\{x, M(x^*)\} - Y\{x^*, M({x^*})\}].$$`

The **natural indirect effect** is the expected change in `\(Y\)` if we set `\(X\)` to its control value `\(x^*\)` and change the mediator to the value it would attain under `\(x\)`:

`$$\textsf{NIE}(x, x^*) = \mathsf{E}[Y\{x^*, M(x)\} - Y\{x^*, M(x^*)\}].$$`

---

# Total effect

Counterfactual conditioning reflects a physical intervention, not mere (probabilistic) conditioning.

We define the **total effect** as

`$$\mathsf{TE}(x, x^*) = \textsf{NDE}(x, x^*) - \textsf{NIE}(x^*, x).$$`

---

# Necessity and sufficiency of mediation

From Pearl (2014):

> The difference `\(\textsf{TE}-\textsf{NDE}\)` quantifies the extent to which the response of `\(Y\)` is owed to mediation, while `\(\textsf{NIE}\)` quantifies the extent to which it is explained by mediation. These two components of mediation, the necessary and the sufficient, coincide into one in models void of interactions (e.g., linear) but differ substantially under moderation.

- In linear systems, changing the order of arguments amounts to flipping signs, so that `\(\textsf{TE} = \textsf{NDE} + \textsf{NIE}\)`.
- This definition works under temporal reversal and gives the correct answer (the regression-slope approach of the linear structural equation model does not).
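
---

# Checking the decomposition numerically

A minimal sketch, not part of the original slides: the Python snippet below evaluates the total, natural direct and natural indirect effects by averaging potential outcomes over a small made-up population. The structural equations, the coefficients `A`, `B`, `C`, `D` and the disturbances in `UNITS` are illustrative assumptions, chosen only so that the treatment-by-mediator interaction `D` makes the two indirect effects differ.

```python
from statistics import fmean

# Hypothetical coefficients (assumptions for illustration only)
A = 2.0  # effect of X on M
B = 1.5  # effect of M on Y
C = 1.0  # direct effect of X on Y
D = 0.5  # X-by-M interaction (moderation); set to 0.0 for a model without interaction

# Small fixed population of unit-level disturbances (u_m, u_y), also made up
UNITS = [(-1.0, 0.5), (0.0, -0.5), (1.0, 0.0), (0.5, 1.0)]


def mediator(x, u_m):
    """Potential mediator M_i(x)."""
    return A * x + u_m


def outcome(x, m, u_y):
    """Potential outcome Y_i(x, m)."""
    return C * x + B * m + D * x * m + u_y


def mean_contrast(x_a, xm_a, x_b, xm_b):
    """Average of Y{x_a, M(xm_a)} - Y{x_b, M(xm_b)} over all units."""
    return fmean([
        outcome(x_a, mediator(xm_a, u_m), u_y)
        - outcome(x_b, mediator(xm_b, u_m), u_y)
        for u_m, u_y in UNITS
    ])


def te(x, x_star):
    """Total effect: E[Y{x, M(x)} - Y{x*, M(x*)}]."""
    return mean_contrast(x, x, x_star, x_star)


def nde(x, x_star):
    """Natural direct effect: E[Y{x, M(x*)} - Y{x*, M(x*)}]."""
    return mean_contrast(x, x_star, x_star, x_star)


def nie(x, x_star):
    """Natural indirect effect: E[Y{x*, M(x)} - Y{x*, M(x*)}]."""
    return mean_contrast(x_star, x, x_star, x_star)


x, x_star = 1, 0
print("TE(x, x*)               :", te(x, x_star))
print("NDE(x, x*) - NIE(x*, x) :", nde(x, x_star) - nie(x_star, x))
print("NDE(x, x*) + NIE(x, x*) :", nde(x, x_star) + nie(x, x_star))
```

With these hypothetical values, `te(1, 0)` matches `nde(1, 0) - nie(0, 1)` but not `nde(1, 0) + nie(1, 0)`; setting `D = 0.0` removes the interaction and the additive form agrees as well.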