
Introduction to causal inference

Session 11

MATH 80667A: Experimental Design and Statistical Methods
HEC Montréal

1 / 26

Outline

  • Basics of causal inference
  • Directed acyclic graphs
  • Causal mediation

2 / 26

Causal inference

3 / 26

Correlation is not causation

xkcd comic 552 by Randall Munroe, CC BY-NC 2.5 license. Alt text: Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.

4 / 26

Spurious correlation

Spurious correlation by Tyler Vigen, licensed under CC BY 4.0

5 / 26

Correlation vs causation

Illustration by Andrew Heiss, licensed under CC BY 4.0

6 / 26

Potential outcomes

For individual $i$, we postulate the existence of two potential outcomes:

  • $Y_i(1)$ (response for treatment $X=1$) and
  • $Y_i(0)$ (response for control $X=0$).

Both are possible, but only one will be realized.

We observe the outcome for a single treatment only.

  • Example: the result $Y(X)$ of your test, given that you either party ($X=1$) or study ($X=0$) the night before your exam.
7 / 26

Fundamental problem of causal inference

With binary treatment $X_i$, I observe either $Y_i \mid \mathrm{do}(X_i=1)$ or $Y_i \mid \mathrm{do}(X_i=0)$.

 i    X_i    Y_i(0)    Y_i(1)    Y_i(1) - Y_i(0)
 1     1       ?         4              ?
 2     0       3         ?              ?
 3     1       ?         6              ?
 4     0       1         ?              ?
 5     0       5         ?              ?
 6     1       ?         7              ?
8 / 26

Causal assumptions?

Since we cannot estimate individual treatment effects, we consider the average treatment effect (averaged over the population), $\mathsf{E}\{Y(1) - Y(0)\}$.

The latter can be estimated as

$$\mathsf{ATE} = \underbrace{\mathsf{E}(Y \mid X=1)}_{\text{expected response among treatment group}} - \underbrace{\mathsf{E}(Y \mid X=0)}_{\text{expected response among control group}}$$

When is this a valid causal effect?
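
As a rough illustration, the difference in group means can be computed from the six observed responses in the table of the previous slide. The Python sketch below is not part of the original slides and the variable names are ours. The unknown potential outcomes (the "?" entries) never enter the calculation, which is why the number is a causal estimate only under additional assumptions.

```python
from statistics import mean

# Observed data from the table: (treatment indicator X_i, observed response Y_i).
# The unobserved potential outcomes (marked "?") are simply absent.
observed = [(1, 4), (0, 3), (1, 6), (0, 1), (0, 5), (1, 7)]

treated = [y for x, y in observed if x == 1]
control = [y for x, y in observed if x == 0]

# Sample analogue of E(Y | X=1) - E(Y | X=0)
diff_in_means = mean(treated) - mean(control)

print(f"mean response, treatment group: {mean(treated):.3f}")
print(f"mean response, control group:   {mean(control):.3f}")
print(f"difference in means:            {diff_in_means:.3f}")
```

With only three observations per group, the estimate is of course very noisy; the point is only to show which quantities are available from the data.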

9 / 26

(Untestable) assumptions

For the ATE to be equivalent to $\mathsf{E}\{Y(1) - Y(0)\}$, the following are sufficient:

  1. ignorability: potential outcomes are independent of assignment to treatment.
  2. lack of interference: the outcome of any participant is unaffected by the treatment assignment of other participants.
  3. consistency: given a treatment $X$ taking level $j$, the observed value for the response $Y \mid X=j$ is equal to the corresponding potential outcome $Y(j)$.
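
The role of ignorability can be illustrated with a small simulation, in which (unlike in real data) both potential outcomes are known. This is a minimal sketch under assumptions of our own: the data-generating process, the true effect of 2 and the logistic self-selection mechanism are all hypothetical choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(80667)
n = 100_000

# Hypothetical data-generating process: every individual has both
# potential outcomes, and the treatment effect is 2 for everyone.
y0 = rng.normal(loc=3.0, scale=1.0, size=n)
y1 = y0 + 2.0
true_ate = np.mean(y1 - y0)


def diff_in_means(x, y0, y1):
    """Difference in mean observed response between treated and control."""
    y_obs = np.where(x == 1, y1, y0)  # consistency: we observe Y(x)
    return y_obs[x == 1].mean() - y_obs[x == 0].mean()


# (a) Randomized assignment: X is independent of (Y(0), Y(1)).
x_random = rng.binomial(1, 0.5, size=n)

# (b) Self-selection: individuals with a larger Y(0) are more likely
# to take the treatment, so ignorability fails.
p_select = 1.0 / (1.0 + np.exp(-(y0 - 3.0)))
x_select = rng.binomial(1, p_select)

est_random = diff_in_means(x_random, y0, y1)
est_select = diff_in_means(x_select, y0, y1)

print(f"true ATE:                  {true_ate:.3f}")
print(f"estimate, randomized:      {est_random:.3f}")
print(f"estimate, self-selection:  {est_select:.3f}")
```

Under randomization the difference in means should be close to the true effect, whereas under self-selection it overstates the effect because the treated group would have had larger responses even without treatment.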
10 / 26

Directed acyclic graphs

Slides by Dr. Andrew Heiss, CC BY-NC 4.0 License.

11 / 26

Types of data

Experimental

You have control over which units get treatment

Observational

You don't have control over which units get treatment

12 / 26

Causal diagrams

Directed acyclic graphs (DAGs)

Directed: Each node has an arrow that points to another node

Acyclic: You can't cycle back to a node (and arrows only have one direction)

Graph: A set of nodes (variables) and edges (arrows indicating dependence)
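
One possible way to encode such a graph is to store, for each node, the set of nodes with an arrow pointing into it. The sketch below is ours, not from the slides: it uses the standard library module `graphlib` (Python 3.9 or later) to check acyclicity, and the three-node graph with variables X, Y and Z is a hypothetical example.

```python
from graphlib import CycleError, TopologicalSorter


def is_acyclic(parents):
    """Return True if the directed graph contains no cycle.

    `parents` maps each node to the set of nodes with an arrow into it.
    """
    try:
        # A topological order exists if and only if the graph is acyclic.
        tuple(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


# Hypothetical DAG with arrows Z -> X, Z -> Y and X -> Y
dag = {"Z": set(), "X": {"Z"}, "Y": {"X", "Z"}}

# Adding an arrow Y -> Z creates a cycle (for example Z -> Y -> Z)
cyclic = {"Z": {"Y"}, "X": {"Z"}, "Y": {"X", "Z"}}

print(is_acyclic(dag))
print(is_acyclic(cyclic))
```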

13 / 26

Causal diagrams

Directed acyclic graphs (DAGs)

Graphical model of the process that generates the data

Maps your logical model

14 / 26

Three types of associations

  • Confounding: common cause
  • Causation: mediation
  • Collision: selection / endogeneity

15 / 26

Confounding

X causes Y

But Z causes both X and Y

Z confounds the X → Y association

16 / 26

Confounder: effect of money on elections

What are the paths between money and win margin?

Money → Margin

Money ← Quality → Margin

Quality is a confounder
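
The two paths listed above can be recovered mechanically by walking the graph while ignoring arrow direction and recording the direction of each arrow crossed. The following Python sketch assumes the DAG contains exactly the three arrows shown on this slide; the function name `all_paths` is ours.

```python
# Directed edges of the (assumed) DAG: (cause, effect)
edges = [("Quality", "Money"), ("Quality", "Margin"), ("Money", "Margin")]


def all_paths(edges, start, end):
    """List every path between start and end, ignoring arrow direction
    when walking but printing the direction of each arrow crossed."""
    # neighbours[a] holds pairs (b, arrow), with the arrow drawn as seen from a
    neighbours = {}
    for cause, effect in edges:
        neighbours.setdefault(cause, []).append((effect, "→"))
        neighbours.setdefault(effect, []).append((cause, "←"))

    paths = []

    def walk(node, visited, text):
        if node == end:
            paths.append(text)
            return
        for nxt, arrow in neighbours.get(node, []):
            if nxt not in visited:  # never visit a node twice on one path
                walk(nxt, visited | {nxt}, f"{text} {arrow} {nxt}")

    walk(start, {start}, start)
    return paths


for path in all_paths(edges, "Money", "Margin"):
    print(path)
```

Any path that starts with an arrow pointing into Money, such as the one through Quality, is a so-called backdoor path and a potential source of confounding.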

17 / 26

Experimental data

Since we randomize assignment to treatment X, all arrows pointing into X are removed.

With observational data, we need to explicitly model the relationships with confounders in order to isolate the effect of X on Y.

18 / 26

How to adjust with observational data

  • Regression: include the confounder as a covariate in the model
  • Matching: pair observations from the treatment and control groups that are most alike, and compute the difference in response within pairs
  • Stratification: estimate effects separately for each subpopulation (e.g., young and old, if age is a confounder)
  • Inverse probability weighting: estimate the probability of self-selection into the treatment group, and reweight outcomes by the inverse of that probability.
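
Three of these adjustments can be compared on simulated data with a single binary confounder. This is only a sketch under assumptions of our own (hypothetical data-generating process, true effect of X on Y set to 1, linear model without interaction); matching is not shown.

```python
import numpy as np

rng = np.random.default_rng(2024)
n = 200_000

# Hypothetical DAG: Z -> X, Z -> Y and X -> Y, with true effect of X equal to 1.
z = rng.binomial(1, 0.5, size=n)                 # binary confounder (e.g., old vs young)
x = rng.binomial(1, np.where(z == 1, 0.8, 0.2))  # treatment more likely when z = 1
y = 1.0 * x + 2.0 * z + rng.normal(size=n)

# Naive comparison that ignores the confounder
naive = y[x == 1].mean() - y[x == 0].mean()

# 1. Regression: include Z as a covariate in a linear model for Y
design = np.column_stack([np.ones(n), x, z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
regression = coef[1]

# 2. Stratification: effect within each level of Z, weighted by stratum size
stratified = 0.0
for level in (0, 1):
    s = z == level
    effect = y[s & (x == 1)].mean() - y[s & (x == 0)].mean()
    stratified += effect * s.mean()

# 3. Inverse probability weighting with estimated probabilities P(X=1 | Z)
propensity = np.where(z == 1, x[z == 1].mean(), x[z == 0].mean())
ipw = np.mean(x * y / propensity) - np.mean((1 - x) * y / (1 - propensity))

print(f"naive:          {naive:.3f}")
print(f"regression:     {regression:.3f}")
print(f"stratification: {stratified:.3f}")
print(f"IPW:            {ipw:.3f}")
```

In this setup the naive difference is inflated by the confounder, while the three adjusted estimates should all be close to the true effect of 1. With real observational data, this only holds if all confounders are measured and included.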
19 / 26

Causation

X causes Y

X causes Z which causes Y

Z is a mediator
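
In a linear model, the effect of X on Y can be split into a part transmitted through the mediator Z and a part that is not. The sketch below relies on strong assumptions of our own: a randomized X, linear relationships without interaction, no unmeasured confounder of Z and Y, and hypothetical coefficients `a`, `b` and `direct`.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 100_000

# Hypothetical linear causal chain X -> Z -> Y plus a direct arrow X -> Y.
a, b, direct = 0.5, 2.0, 1.0
x = rng.binomial(1, 0.5, size=n).astype(float)  # randomized treatment
z = a * x + rng.normal(size=n)                  # mediator
y = direct * x + b * z + rng.normal(size=n)     # response


def ols(response, *covariates):
    """Least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(response)), *covariates])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


total_effect = ols(y, x)[1]       # Y ~ X: direct plus indirect effect
direct_effect = ols(y, x, z)[1]   # Y ~ X + Z: effect not transmitted through Z
indirect_effect = total_effect - direct_effect

print(f"total effect:    {total_effect:.3f}  (direct + a * b = {direct + a * b})")
print(f"direct effect:   {direct_effect:.3f}")
print(f"indirect effect: {indirect_effect:.3f}")
```

Controlling for the mediator removes the part of the effect that flows through Z, so it should not be done when the total effect of X is the quantity of interest.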

20 / 26

Colliders

X causes Z

Y causes Z

Should you control for Z?

21 / 26

Colliders can create fake causal effects

Colliders can hide real causal effects

Height is unrelated to basketball skill… among NBA players
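
The first statement can be illustrated by simulation: two variables generated independently become associated once we select on a common effect. This is a hypothetical sketch; the standardized scores and the 5% selection rule are arbitrary assumptions of ours, not a model of actual NBA data.

```python
import numpy as np

rng = np.random.default_rng(23)
n = 100_000

# Hypothetical standardized scores, generated independently of each other.
height = rng.normal(size=n)
skill = rng.normal(size=n)

# Collider: selection depends on both height and skill.
# Keep roughly the top 5% of the combined score (arbitrary rule).
score = height + skill
selected = score > np.quantile(score, 0.95)

corr_all = np.corrcoef(height, skill)[0, 1]
corr_selected = np.corrcoef(height[selected], skill[selected])[0, 1]

print(f"correlation, whole population:  {corr_all:.3f}")
print(f"correlation, selected subgroup: {corr_selected:.3f}")
```

The correlation is near zero in the whole population but clearly negative in the selected subgroup: among those selected, a shorter individual must be more skilled to make the cut. Conditioning on a collider, whether through sample selection or by adding it as a covariate, opens a path between its causes.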

22 / 26

Colliders and selection bias

23 / 26

A new collider bias teaching example. Sample selects on marriage (not divorced) so: satisfaction ––> [not divorced] <–– children (Richard McElreath, Apr 26, 2021 on Twitter)

Example of confounder: https://doi.org/10.1177/109467051454314

Three types of associations

  • Confounding: common cause. Causal fork X ← Z → Y
  • Causation: mediation. Causal chain X → Z → Y
  • Collision: selection / endogeneity. Inverted fork X → Z ← Y

25 / 26

Life is inherently complex

Postulated DAG for the effect of smoking on fetal alcohol spectrum disorders (FASD)

26 / 26

Source: Andrew Heiss (?), likely from

McQuire, C., Daniel, R., Hurt, L. et al. The causal web of foetal alcohol spectrum disorders: a review and causal diagram. Eur Child Adolesc Psychiatry 29, 575–594 (2020). https://doi.org/10.1007/s00787-018-1264-3
