
Introduction to experimental design

Session 1

MATH 80667A: Experimental Design and Statistical Methods
HEC Montréal

1 / 30

Outline

Class details

Motivation

Review

Key concepts in experimental designs

2 / 30

Class details

3 / 30

Course content

Content

  • Basics of experimental design
  • Statistical inference
  • Completely randomized designs
  • Analysis of variance
  • Blocked designs
  • Analysis of covariance
  • Intro to mixed models
  • Intro to causal inference
  • Linear mediation analysis

Cross-disciplinary skills

  • Scientific workflow
  • Peer-review
  • Reporting
  • Statistical fallacies
  • Reproducibility
4 / 30

We will focus on simple designs as they lead to simple analyses. For more complicated schemes, consult an expert or find a collaborator.

A single introductory course in statistics does not make one an expert. The purpose is numerical literacy rather than expert knowledge.

However, the assessment, activities and interdisciplinary skills are targeted at PhD students.

We will spend a lot of time on ANOVA (one-way, two-way, multivariate, repeated measures, etc.)

The reproducibility crisis has changed the publishing landscape, so the requirements for publication are more stringent. I intend to cover these in detail.

Statistical fallacies are common mistakes that undermine the validity of your work. Learning to recognize them in the wild is crucial, including when you act as a reviewer.

Prerequisites

Math skills

Basic algebra

Computer science

None

Statistics

At the level of OpenIntro Statistics (Chapter 1)

5 / 30

More specifically for statistics:

  • variable types (continuous, discrete, etc.)
  • basic graphs (histograms, scatterplots)
  • hypothesis testing
  • differences in mean (e.g., t-test)
  • simple linear regression

The OpenIntro Statistics book can be freely downloaded from https://www.openintro.org/book/os/

Motivation

6 / 30

History

Agricultural field trials at Rothamsted, ongoing since 1843

Description of the Rothamsted station
Broadbalk photo
7 / 30

https://www.rothamsted.ac.uk/environmental-change-network

http://www.era.rothamsted.ac.uk/images/metadata/rbk1/2012-AJ-12-10.jpg

R.A. Fisher worked at Rothamsted for 14 years from 1919 and developed much of the theory underlying experimental design. See this recollection by Yates on his contribution: https://doi.org/10.2307/2528399

Fisher was a eugenicist and his views are largely decried nowadays.

Modern experiments: A/B testing

Screenshot of Twitter page following changes made to interface in 2021 (the follow button is black for higher contrast).
8 / 30

The screenshot shows a change to the interface design: the Follow button is now black for higher contrast. This is an example of A/B testing, which is common in web design.

Evidence-based policy


RAND health insurance study


Student Teacher Achievement Ratio (STAR)

9 / 30

RAND: In a large-scale, multiyear experiment, participants who paid for a share of their health care used fewer health services than a comparison group given free care. It concluded that cost sharing reduced "inappropriate or unnecessary" medical care (overutilization), but also reduced "appropriate or needed" medical care. https://www.rand.org/health-care/projects/hie.html

Tennessee's STAR project: smaller class sizes lead to better outcomes. "Over 7,000 students in 79 schools were randomly assigned into one of 3 interventions: small class (13 to 17 students per teacher), regular class (22 to 25 students per teacher), and regular-with-aide class (22 to 25 students with a full-time teacher's aide). Classroom teachers were also randomly assigned to the classes they would teach. The interventions were initiated as the students entered school in kindergarten and continued through third grade." https://dss.princeton.edu/catalog/resource1589

Nobel memorial prize

Composite photo by Andrew Heiss: excerpt from the Washington Post following the announcement of the Nobel prize in economics for an experimental approach to solving poverty, Abdul Latif Jameel Poverty Action Lab (J-PAL) video and a Twitter feed with a photo of Abhijit Banerjee and Esther Duflo upon the announcement of the Nobel memorial prize.
10 / 30

Review

11 / 30

Population and sampling

Defining a target population

Sampling frame

Where to draw sample from

Sampling procedure

Randomness

12 / 30

Convenience samples and non-response bias

Illustration of flawed survey due to differential nonresponse.
13 / 30

Sampling scheme

Simple random sampling

Stratified sampling

Cluster sampling

Multi-stage sampling

14 / 30

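Stratified sampling: sample separately within each stratum (gender, ethnicity, etc.), for instance by selecting the same fraction from each. Useful for oversampling rare categories.

Cluster sampling: sample whole groups such as villages, housing blocks or classrooms. Lower quality than stratified sampling, but cheaper.

Multi-stage sampling: a combination of different sampling schemes.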
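
The sampling schemes on this slide can be illustrated with a minimal Python sketch. The population, the field names (`village`, `group`) and the helper functions are all hypothetical and serve only to show how the schemes differ; this is not a survey-grade implementation.

```python
import random
from collections import defaultdict


def simple_random_sample(units, n, rng):
    """Draw n units without replacement; every subset of size n is equally likely."""
    return rng.sample(units, n)


def stratified_sample(units, stratum_of, fraction, rng):
    """Sample the same fraction independently within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(units, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    clusters = defaultdict(list)
    for unit in units:
        clusters[cluster_of(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


# Hypothetical population: 200 people spread over 10 villages,
# with a common group "A" (150 people) and a rarer group "B" (50 people).
population = [
    {"id": i, "village": i % 10, "group": "A" if i % 4 else "B"}
    for i in range(200)
]

rng = random.Random(2021)  # fixed seed so the draw is reproducible
srs = simple_random_sample(population, 20, rng)
strat = stratified_sample(population, lambda u: u["group"], 0.1, rng)
clus = cluster_sample(population, lambda u: u["village"], 2, rng)
```

With a 10% fraction, the stratified sample always contains 15 people from group A and 5 from group B, whereas the simple random sample only matches these proportions on average. The cluster sample contains everyone from two villages, which is cheaper to collect but typically less informative per person.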

Judging the quality of a sample

Summary statistics

Raw data

Pre-testing

15 / 30

Summary statistics: Reported to check representativeness of the sample relative to population.

Raw data: Used for reproducibility and to assess whether data is fraudulent.

Pre-testing: Check whether sampling allocation is sufficiently random.

U. Simonsohn, L. Nelson and J. Simmons. Evidence of Fraud in an Influential Field Experiment About Dishonesty, 2021, https://datacolada.org/98.

Experiments as gold-standard

Screenshot of a paper in BJOG by Hariton and Locascio

Randomised controlled trials (RCTs) are the reference standard for studying causal relationships between interventions and outcomes as randomisation eliminates much of the bias inherent with other study designs.

16 / 30

Paper available at https://doi.org/10.1111/1471-0528.15199

From the abstract: "Randomised controlled trials (RCTs) are the reference standard for studying causal relationships between interventions and outcomes as randomisation eliminates much of the bias inherent with other study designs."

Study type versus sampling

Random versus non-random assignment and sampling
17 / 30

Experimental versus observational

Figure from a New York Times article on an RCT studying the effectiveness of wellness programs.
18 / 30

Figure from the Upshot https://www.nytimes.com/2018/08/06/upshot/employer-wellness-programs-randomized-trials.html

based on the paper

Damon Jones, David Molitor, Julian Reif, What do Workplace Wellness Programs do? Evidence from the Illinois Workplace Wellness Study, The Quarterly Journal of Economics, Volume 134, Issue 4, November 2019, Pages 1747–1791, https://doi.org/10.1093/qje/qjz023

Key concepts in experimental design

19 / 30

Technical vocabulary

Experimental unit

Observational unit

Factor

20 / 30

Impact of encouragement on teaching

From Davison (2008), Example 9.2

In an investigation on the teaching of arithmetic, 45 pupils were divided at random into five groups of nine. Groups A and B were taught in separate classes by the usual method. Groups C, D, and E were taught together for a number of days. On each day C were praised publicly for their work, D were publicly reproved and E were ignored. At the end of the period all pupils took a standard test.

21 / 30

Exercise

In pairs, identify

  • the experimental and observational units
  • the factor levels
  • the response variable
03:00
22 / 30

Comparing groups (factor levels)

  • Even without any intervention, the output varies from one observation to the next.
  • Differences between groups are comparatively stable.
23 / 30

Choices in experimental designs

  • factor levels being compared
  • observations to be made (number of repetitions, etc.)
  • experimental units
24 / 30

Requirements for good experiments

  1. Absence of systematic error
  2. Precision
  3. Range of validity
  4. Simplicity of the design
25 / 30

Absence of systematic error

  • Achieved via randomization
  • Controlling the environment
26 / 30

If we allocate observations non-randomly (for example, when the timing of a course is tied to its instructor), we cannot necessarily distinguish between the effects: they are confounded.

Controlled environment: reduce variability as much as possible (room temperature, experimental apparatus, instructions, etc.).
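
As an illustration of randomization, here is a minimal Python sketch of a completely randomized allocation. The group sizes mirror the Davison example (45 pupils split at random into five groups of nine); the function name, the pupil identifiers and the seed are assumptions made for illustration.

```python
import random


def completely_randomized_allocation(unit_ids, groups, rng):
    """Shuffle the units, then deal them into equal-sized groups."""
    shuffled = list(unit_ids)
    if len(shuffled) % len(groups) != 0:
        raise ValueError("the number of units must be a multiple of the number of groups")
    rng.shuffle(shuffled)
    size = len(shuffled) // len(groups)
    return {
        group: shuffled[i * size:(i + 1) * size]
        for i, group in enumerate(groups)
    }


rng = random.Random(80667)  # arbitrary seed, kept so the allocation can be reproduced
allocation = completely_randomized_allocation(
    range(1, 46), ["A", "B", "C", "D", "E"], rng
)
for group, pupils in allocation.items():
    print(group, sorted(pupils))
```

Because the shuffle ignores every characteristic of the pupils, no attribute (ability, age, seating) can be systematically tied to a group. Recording the seed makes the allocation auditable after the fact.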

Precision

Cartoon
27 / 30

Precision

  • depends on the intrinsic variability
  • function of
    1. accuracy of experimental work
    2. number of experimental units / repetitions per unit
    3. design and methods of analysis
28 / 30
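
One way to see how intrinsic variability and the number of experimental units drive precision is the following formula. It rests on assumptions not stated on the slide: two groups of independent observations with a common standard deviation $\sigma$, and group sizes $n_1$ and $n_2$. The standard error of the difference in sample means is then

```latex
\mathrm{se}\left(\overline{Y}_1 - \overline{Y}_2\right)
  = \sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
```

Under these assumptions, halving the standard error requires either halving $\sigma$ (more accurate experimental work, a better design) or quadrupling both group sizes.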

Range of validity

  • What is the population?
  • Identify restrictions
  • Extrapolation
    • if proper random sampling scheme
    • range of validity
Cartoon
29 / 30

Simplicity of the design

  • Simple designs lead to simple statistical analyses
Cartoon
30 / 30

Important: understand the reasons for the differences

  • do not limit yourself to experimental knowledge that the differences exist
