
Complete factorial designs

Session 5

MATH 80667A: Experimental Design and Statistical Methods
HEC Montréal

1 / 38

Outline

Factorial designs and interactions

Tests for two-way ANOVA

2 / 38

Factorial designs and interactions

3 / 38

Complete factorial designs?

Factorial design
study with multiple factors (subgroups)

Complete
Gather observations for every subgroup

4 / 38

Motivating example

Response:
retention of information
two hours after reading a story

Population:
children aged four

experimental factor 1:
ending (happy or sad)

experimental factor 2:
complexity (easy, average or hard).

5 / 38

Setup of design

complexity happy sad
complicated
average
easy
6 / 38

These factors are crossed, meaning that we can gather participants in every combination of levels (each subcondition).

Efficiency of factorial design

Cast problem
as a series of one-way ANOVA
vs simultaneous estimation

Factorial designs require
fewer overall observations

Can study interactions

7 / 38

To study each factor separately with one-way ANOVAs, we would need one two-group comparison (story ending) within each complexity level and one three-group comparison (complexity) within each ending: three one-way ANOVAs with 2 groups each, plus two one-way ANOVAs with 3 groups each. The two-way ANOVA requires only 6 groups instead of 12.

Interaction

Definition: when the effect of one factor
depends on the levels of another factor.

Effect together ≠ sum of individual effects

8 / 38
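As a minimal numerical sketch (hypothetical cell means, not from the study), an interaction in a 2 by 2 design is a nonzero difference of simple effects:

```python
# Hypothetical cell means for a 2x2 design (made-up numbers for illustration).
# Rows: factor A (a1, a2); columns: factor B (b1, b2).
mu = [[4.0, 6.0],   # a1: simple effect of B is 6 - 4 = 2
      [5.0, 9.0]]   # a2: simple effect of B is 9 - 5 = 4

# Simple effects of B within each level of A
simple_b_at_a1 = mu[0][1] - mu[0][0]   # 2.0
simple_b_at_a2 = mu[1][1] - mu[1][0]   # 4.0

# Interaction contrast: difference of simple effects.
# Nonzero => the effect of B depends on the level of A.
interaction = simple_b_at_a2 - simple_b_at_a1
print(interaction)   # 2.0
```

If the interaction contrast were zero, the two profile lines in the interaction plot would be parallel.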

Interaction or profile plot

Graphical display:
plot sample mean per category

with an uncertainty measure
(e.g., ±1 standard error of the mean,
a confidence interval, etc.)

9 / 38

Interaction plots and parallel lines

10 / 38

Interaction plots for 2 by 2 designs

11 / 38

Cell means for 2 by 2 designs

12 / 38

Line graph for example patterns for means for each of the possible kinds of general outcomes in a 2 by 2 design. Illustration adapted from Figure 10.2 of Crump, Navarro and Suzuki (2019) by Matthew Crump (CC BY-SA 4.0 license)

Example 1 : loans versus credit

Sharma, Tully, and Cryder (2021) Supplementary study 5 consists of a 2×2 between-subject ANOVA with factors

  • debt type (debttype), either "loan" or "credit"
  • purchase type, either discretionary or not (need)

No evidence of interaction

13 / 38

Example 2 - psychological distance

Maglio and Polman (2014) Study 1 uses a 4×2 between-subject ANOVA with factors

  • subway station, one of Spadina, St. George, Bloor-Yonge and Sherbourne
  • direction of travel, either east or west

Clear evidence of interaction (symmetry?)

14 / 38

Tests for two-way ANOVA

15 / 38

Analysis of variance = regression

An analysis of variance model is simply a linear regression with categorical covariate(s).

  • Typically, the parametrization is chosen so that parameters reflect differences to the global mean (sum-to-zero parametrization).
  • The full model includes interactions between all combinations of factors
    • one average for each subcategory
    • one-way ANOVA!
16 / 38
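This equivalence can be checked numerically. The sketch below (made-up balanced data, cell-means parametrization rather than the sum-to-zero one mentioned above) fits the full model as a regression with one indicator column per subgroup and verifies that the least squares coefficients are exactly the subgroup sample means:

```python
import numpy as np

# Made-up balanced 3 x 2 layout with 4 replicates per cell.
rng = np.random.default_rng(1)
levels_a, levels_b, r = 3, 2, 4
cells = [(i, j) for i in range(levels_a) for j in range(levels_b)]
y = np.concatenate([10 + 2 * i - j + rng.normal(0, 1, r) for i, j in cells])

# Design matrix: one dummy per (A, B) cell, no intercept
# (the "cell means" parametrization of the full model with interaction).
X = np.zeros((len(y), len(cells)))
for k in range(len(cells)):
    X[k * r:(k + 1) * r, k] = 1.0

beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# The regression coefficients equal the subgroup sample means,
# i.e., the two-way ANOVA is a one-way ANOVA on the crossed groups.
group_means = y.reshape(len(cells), r).mean(axis=1)
print(np.allclose(beta, group_means))   # True
```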

Formulation of the two-way ANOVA

Two factors: A (complexity) and B (ending) with na=3 and nb=2 levels, and their interaction.

Write Yijr for the rth measurement in group (ai, bj). The model for the average response is E(Yijr) = μij (the subgroup mean), where the Yijr are independent observations with a common standard deviation σ.

  • We estimate μij by the sample mean of subgroup (i, j), denoted μ̂ij.
  • The fitted values are ŷijr = μ̂ij.
17 / 38

One average for each subgroup

B ending
A complexity
b1 (happy) b2 (sad) row mean
a1 (complicated) μ11 μ12 μ1.
a2 (average) μ21 μ22 μ2.
a3 (easy) μ31 μ32 μ3.
column mean μ.1 μ.2 μ
18 / 38

Row, column and overall average

  • Mean of Ai (average of row i): μi. = (μi1 + ⋯ + μi,nb)/nb

  • Mean of Bj (average of column j): μ.j = (μ1j + ⋯ + μna,j)/na

  • Overall average: μ = Σi=1..na Σj=1..nb μij / (na·nb)
  • Row, column and overall averages are equiweighted combinations of the cell means μij.
  • Estimates are obtained by replacing μij in the formulas by the subgroup sample means.
19 / 38
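The averaging above is straightforward to verify on a matrix of cell means. The values below are hypothetical, chosen only to illustrate the equiweighted marginalization:

```python
import numpy as np

# Hypothetical cell means mu[i, j] for the 3 x 2 example:
# rows = complexity (complicated, average, easy), cols = ending (happy, sad)
mu = np.array([[4.0, 3.0],
               [6.0, 5.0],
               [8.0, 7.0]])

row_means = mu.mean(axis=1)   # mu_i. = (mu_i1 + mu_i2) / nb
col_means = mu.mean(axis=0)   # mu_.j = (mu_1j + mu_2j + mu_3j) / na
overall   = mu.mean()         # mu    = sum of all cells / (na * nb)

print(row_means)   # [3.5 5.5 7.5]
print(col_means)   # [6. 5.]
print(overall)     # 5.5
```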

Vocabulary of effects

  • simple effects: difference between levels of one factor at a fixed combination of the others (e.g., change due to difficulty for the happy ending)
  • main effects: differences relative to average for each condition of a factor (happy vs sad ending)
  • interaction effects: when simple effects differ depending on levels of another factor
20 / 38

Main effects

Main effects are comparisons between row or column averages

Obtained by marginalization, i.e., averaging over the other dimension.

Main effects are not of interest if there is an interaction.

ending happy sad
column means μ.1 μ.2

complexity row means
complicated μ1.
average μ2.
easy μ3.
21 / 38

Simple effects

Simple effects are comparisons between cell averages within a given row or column

ending happy sad
means (easy) μ31 μ32

complexity mean (happy)
complicated μ11
average μ21
easy μ31
22 / 38

Contrasts

We collapse the factors into a single categorical variable with the na × nb = 6 combinations of A (complexity) and B (ending), giving a one-way ANOVA.

Q: How would you write the weights for contrasts for testing the

  • main effect of A: complicated vs average, or complicated vs easy.
  • main effect of B: happy vs sad.
  • interaction A and B: difference between complicated and average, for happy versus sad?

The order of the categories is (a1,b1), (a1,b2), …, (a3,b2).

23 / 38

Contrasts

Suppose the order of the coefficients is factor A (complexity, 3 levels, complicated/average/easy) and factor B (ending, 2 levels, happy/sad).

test                                          μ11  μ12  μ21  μ22  μ31  μ32
main effect A (complicated vs average)          1    1   -1   -1    0    0
main effect A (complicated vs easy)             1    1    0    0   -1   -1
main effect B (happy vs sad)                    1   -1    1   -1    1   -1
interaction AB (comp. vs av., happy vs sad)     1   -1   -1    1    0    0
interaction AB (comp. vs easy, happy vs sad)    1   -1    0    0   -1    1
24 / 38
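Each contrast is simply a weighted sum of the six cell means. The sketch below applies a few of these weight vectors to hypothetical cell means (chosen to be additive, so the interaction contrast comes out to zero):

```python
import numpy as np

# Hypothetical cell means in the order (a1,b1), (a1,b2), ..., (a3,b2).
# These are additive (sad = happy - 1 in every row), so there is no interaction.
mu = np.array([4.0, 3.0, 6.0, 5.0, 8.0, 7.0])

contrasts = {
    "main A (comp. vs avg)": np.array([1, 1, -1, -1, 0, 0]),
    "main B (happy vs sad)": np.array([1, -1, 1, -1, 1, -1]),
    "interaction (comp. vs avg, happy vs sad)": np.array([1, -1, -1, 1, 0, 0]),
}

# Contrast value = weights dotted with the cell means
results = {name: float(w @ mu) for name, w in contrasts.items()}
print(results)
```

With these additive means, the interaction contrast is exactly 0, while both main-effect contrasts are nonzero.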

Global hypothesis tests

Main effect of factor A

H0: μ1. = ⋯ = μna. vs Ha: at least two marginal means of A are different

Main effect of factor B

H0: μ.1 = ⋯ = μ.nb vs Ha: at least two marginal means of B are different

Interaction

H0: μij = μi. + μ.j − μ for every cell (each cell mean is the sum of row and column effects) vs Ha: the effect is not a combination of row and column effects.

25 / 38

Comparing nested models

Rather than present the specifics of ANOVA, we consider a general hypothesis testing framework which is more widely applicable.

We compare two competing models, Ma and M0.

  • the alternative or full model Ma under the alternative Ha with p parameters for the mean
  • the simpler null model M0 under the null H0, which imposes ν restrictions on the full model
26 / 38

The same holds for analysis of deviance in generalized linear models. The latter use likelihood ratio tests (which are equivalent to F-tests for linear models), with a χ² null distribution with ν degrees of freedom.

Intuition behind F-test for ANOVA

The residual sum of squares measures how much variability is left over, RSSa = Σi=1..n (yi − ŷi)², where ŷi is the estimated mean under model Ma for observation yi.

The more complex model necessarily fits better (it is more flexible), but requires estimation of more parameters.

  • We wish to assess the improvement that would occur by chance, if the null model was correct.
27 / 38

Testing linear restrictions in linear models

If the alternative model has p parameters for the mean, and we impose ν linear restrictions under the null hypothesis to the model estimated based on n independent observations, the test statistic is

F = {(RSS0 − RSSa)/ν} / {RSSa/(n − p)}

  • The numerator is the difference in residual sums of squares, denoted RSS, from the models fitted under H0 and Ha, divided by the number of restrictions ν.
  • The denominator is an estimator of the variance, obtained under Ha (termed mean squared error of residuals)
  • The benchmark for tests in linear models is Fisher's F(ν,np).
28 / 38
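As a minimal numerical sketch of this nested-model comparison (made-up data, and a one-way layout for simplicity: the null model fits a single common mean, the alternative fits one mean per group), the statistic can be computed directly:

```python
import numpy as np

# Made-up data: three groups of 5 observations with different true means.
rng = np.random.default_rng(2)
groups = [rng.normal(loc, 1.0, 5) for loc in (0.0, 1.0, 3.0)]
y = np.concatenate(groups)
n, p = len(y), len(groups)   # p parameters for the mean under Ha
nu = p - 1                   # restrictions imposed by H0 (equal means)

# Residual sums of squares under each model
rss_a = sum(((g - g.mean()) ** 2).sum() for g in groups)  # full model Ma
rss_0 = ((y - y.mean()) ** 2).sum()                       # null model M0

# The F statistic: improvement per restriction over the variance estimate
F = ((rss_0 - rss_a) / nu) / (rss_a / (n - p))
print(round(F, 2))   # compare against an F(nu, n - p) distribution
```

The null model can never fit better than the full model, so RSS0 ≥ RSSa and F is nonnegative.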

Analysis of variance table

term       degrees of freedom   mean square                        F
A          na − 1               MSA = SSA/(na − 1)                 MSA/MSres
B          nb − 1               MSB = SSB/(nb − 1)                 MSB/MSres
AB         (na − 1)(nb − 1)     MSAB = SSAB/{(na − 1)(nb − 1)}     MSAB/MSres
residuals  n − na·nb            MSres = RSSa/(n − na·nb)
total      n − 1

Read the table backward (starting with the interaction).

  • If there is a significant interaction, the main effects are not of interest and potentially misleading.
29 / 38

Intuition behind degrees of freedom

The model always includes an overall average μ. There are

  • na − 1 free row means, since na·μ = μ1. + ⋯ + μna.
  • nb − 1 free column means, since nb·μ = μ.1 + ⋯ + μ.nb
  • (na − 1)(nb − 1) interaction terms, since na·nb − (na − 1) − (nb − 1) − 1 = (na − 1)(nb − 1)
B ending
A complexity
b1 (happy) b2 (sad) row mean
a1 (complicated) μ11 X μ1.
a2 (average) μ21 X μ2.
a3 (easy) X X X
column mean μ.1 X μ

Terms with X are fully determined by row/column/total averages

30 / 38

Example 1

The interaction plot suggested that the two-way interaction wasn't significant. The F test confirms this.

There is a significant main effect of both purchase and debttype.

term SS df F stat p-value
purchase 752.3 1 98.21 < .001
debttype 92.2 1 12.04 < .001
purchase:debttype 13.7 1 1.79 .182
Residuals 11467.4 1497
31 / 38

Example 2

There is a significant interaction between station and direction, so follow-up by looking at simple effects or contrasts.

The tests for the main effects are not of interest! Disregard the other entries of the ANOVA table.

term SS df F stat p-value
station 75.2 3 23.35 < .001
direction 0.4 1 0.38 .541
station:direction 52.4 3 16.28 < .001
Residuals 208.2 194
32 / 38

Main effects for Example 1

We consider differences between debt type labels.

Participants are more likely to consider the offer if it is branded as credit than loan. The difference is roughly 0.5 (on a Likert scale from 1 to 9).

## $emmeans
## debttype emmean SE df lower.CL upper.CL
## credit 5.12 0.101 1497 4.93 5.32
## loan 4.63 0.101 1497 4.43 4.83
##
## Results are averaged over the levels of: purchase
## Confidence level used: 0.95
##
## $contrasts
## contrast estimate SE df t.ratio p.value
## credit - loan 0.496 0.143 1497 3.469 0.0005
##
## Results are averaged over the levels of: purchase
33 / 38

Toronto subway station

Simplified depiction of the Toronto metro stations used in the experiment, based on work by Craftwerker on Wikipedia, distributed under CC-BY-SA 4.0.


34 / 38

Reparametrization for Example 2

Set stdist as −2, −1, +1, +2 to indicate station distance, with negative signs indicating stations in the opposite direction of travel

The ANOVA table for the reparametrized model shows no evidence against the null hypothesis of symmetry (the stdist by direction interaction).

term SS df F stat p-value
stdist 121.9 3 37.86 < .001
direction 0.4 1 0.35 .554
stdist:direction 5.7 3 1.77 .154
Residuals 208.2 194
35 / 38

Interaction plot for reformatted data

36 / 38

Custom contrasts for Example 2

We are interested in testing the perception of distance by looking at H0: μ−1 = μ+1 and μ−2 = μ+2.

mod3 <- lm(distance ~ stdist * direction, data = MP14_S1)
(emm <- emmeans(mod3, specs = "stdist"))
# order of the levels is -2, -1, +1, +2
contrasts <- emm |> contrast(
  list("two dist" = c(-1, 0, 0, 1),
       "one dist" = c(0, -1, 1, 0)))
contrasts # print pairwise contrasts
test(contrasts, joint = TRUE)
37 / 38

Estimated marginal means and contrasts

Strong evidence of differences in perceived distance depending on orientation.

## stdist emmean SE df lower.CL upper.CL
## -2 3.83 0.145 194 3.54 4.11
## -1 2.48 0.144 194 2.20 2.76
## +1 1.62 0.150 194 1.33 1.92
## +2 2.70 0.145 194 2.42 2.99
##
## Results are averaged over the levels of: direction
## Confidence level used: 0.95
## contrast estimate SE df t.ratio p.value
## two dist -1.122 0.205 194 -5.470 <.0001
## one dist -0.856 0.207 194 -4.129 0.0001
##
## Results are averaged over the levels of: direction
## df1 df2 F.ratio p.value
## 2 194 23.485 <.0001
38 / 38
