Multiway ANOVA and MANOVA

Session 7

MATH 80667A: Experimental Design and Statistical Methods
HEC Montréal

1 / 40

Outline

Multiway ANOVA

MANOVA

2 / 40

Multifactorial designs3 / 40

Beyond two factors

We can consider multiple factors $A$ , $B$ , $C$ , $\dots$ with respectively $n_{a}$ , $n_{b}$ , $n_{c}$ , $\dots$ levels and with $n_{r}$ replications for each.

The total number of treatment combinations is

$n_{a} \times n_{b} \times n_{c} \times \dots$

4 / 40

Beyond two factors

We can consider multiple factors $A$ , $B$ , $C$ , $\dots$ with respectively $n_{a}$ , $n_{b}$ , $n_{c}$ , $\dots$ levels and with $n_{r}$ replications for each.

The total number of treatment combinations is

$n_{a} \times n_{b} \times n_{c} \times \dots$

Curse of dimensionality

4 / 40

Full three-way ANOVA model

Each cell of the cube is allowed to have a different mean

$\begin{aligned} \underset{response}{Y_{i j k r}} = \underset{cell mean}{μ_{i j k}} + \underset{error}{ε_{i j k r}} \end{aligned}$ with $ε_{i j k t}$ are independent error term for

row $i$
column $j$
depth $k$
replication $r$

5 / 40

Parametrization of a three-way ANOVA model

With the sum-to-zero parametrization with factors $A$ , $B$ and $C$ , write the response as

$\begin{aligned} \underset{theoretical average}{E (Y_{i j k r})} & = \underset{global mean}{μ} \\ + \underset{main effects}{α_{i} + β_{j} + γ_{k}} \\ + \underset{two-way interactions}{(α β)_{i j} + (α γ)_{i k} + (β γ)_{j k}} \\ + \underset{three-way interaction}{(α β γ)_{i j k}} \end{aligned}$

6 / 40

global mean, row, column and depth main effects

global mean, row, column and depth main effects

row/col, row/depth and col/depth interactions and three-way interaction.

row/col, row/depth and col/depth interactions and three-way interaction.

7 / 40

Example of three-way design

Petty, Cacioppo and Heesacker (1981). Effects of rhetorical questions on persuasion: A cognitive response analysis. Journal of Personality and Social Psychology.

A $2 \times 2 \times 2$ factorial design with 8 treatments groups and $n = 160$ undergraduates.

Setup: should a comprehensive exam be administered to bachelor students in their final year?

Response Likert scale on $- 5$ (do not agree at all) to $5$ (completely agree)
Factors
$A$ : strength of the argument (strong or weak)
$B$ : involvement of students low (far away, in a long time) or high (next year, at their university)
$C$ : style of argument, either regular form or rhetorical (Don't you think?, ...)

8 / 40

Interaction plot

Interaction plot for a $2 \times 2 \times 2$ factorial design from Petty, Cacioppo and Heesacker (1981)

9 / 40

p.472 of Keppel and Wickens

The microwave popcorn experiment

What is the best brand of microwave popcorn?

Factors
brand (two national, one local)
power: 500W and 600W
time: 4, 4.5 and 5 minutes
Response: ~~weight~~, ~~volume~~, ~~number~~, percentage of popped kernels.
Pilot study showed average of 70% overall popped kernels (10% standard dev), timing values reasonable
Power calculation suggested at least $r = 4$ replicates, but researchers proceeded with $r = 2$ ...

10 / 40

ANOVA table

data(popcorn, package = 'hecedsm')
# Fit model with three-way interaction
model <- aov(percentage ~ brand*power*time,
             data = popcorn)
# ANOVA table - 'anova' is ONLY for balanced designs
anova_table <- anova(model) 
# Quantile-quantile plot
car::qqPlot(model)

Model assumptions: plots and tests are meaningless with $n_{r} = 2$ replications per group...

11 / 40

Quantile-quantile plot

All points fall roughly on a straight line.

12 / 40

Code for interaction plot

popcorn |> 
   group_by(brand, time, power) |>
   summarize(meanp = mean(percentage)) |>
ggplot(mapping = aes(x = power, 
                     y = meanp, 
                     col = time, 
                     group = time)) + 
  geom_line() + 
  facet_wrap(~brand)

13 / 40

Interaction plot

No evidence of three-way interaction (hard to tell with $r = 2$ replications).

14 / 40

Variance decomposition for balanced designs

terms
degrees of freedom


AA
na−1na−1

BB
nb−1nb−1

CC
nc−1nc−1

ABAB
(na−1)(nb−1)(na−1)(nb−1)

ACAC
(na−1)(nc−1)(na−1)(nc−1)

BCBC
(nb−1)(nc−1)(nb−1)(nc−1)

ABCABC
(na−1)(nb−1)(nc−1)(na−1)(nb−1)(nc−1)

residualresidual
nanbnc(R−1)nanbnc(R−1)

totaltotal
nanbncnr−1nanbncnr−1


15 / 40

terms	degrees of freedom
$A$	$n_{a} - 1$
$B$	$n_{b} - 1$
$C$	$n_{c} - 1$
$A B$	$(n_{a} - 1) (n_{b} - 1)$
$A C$	$(n_{a} - 1) (n_{c} - 1)$
$B C$	$(n_{b} - 1) (n_{c} - 1)$
$A B C$	$(n_{a} - 1) (n_{b} - 1) (n_{c} - 1)$
$residual$	$n_{a} n_{b} n_{c} (R - 1)$
$total$	$n_{a} n_{b} n_{c} n_{r} - 1$

Analysis of variance table for the 3-way model
 
      
    Degrees of freedom 
    Sum of squares 
    Mean square 
    F statistic 
    p-value 
  


    brand 

10 
55 
89 
180 
  

    power 

11 
11 
19 
035 
  

    time 

58 
29 
87 
002 
  

    brand:power 

04 
02 
12 
349 
  

    brand:time 

86 
46 
09 
016 
  

    power:time 

71 
85 
27 
765 
  

    brand:power:time 

33 
83 
13 
967 
  

    Residuals 

87 
66 
     
     
  

/ 40

	Degrees of freedom	Sum of squares	Mean square	F statistic	p-value
brand	2	331.10	165.55	1.89	0.180
power	1	455.11	455.11	5.19	0.035
time	2	1554.58	777.29	8.87	0.002
brand:power	2	196.04	98.02	1.12	0.349
brand:time	4	1433.86	358.46	4.09	0.016
power:time	2	47.71	23.85	0.27	0.765
brand:power:time	4	47.33	11.83	0.13	0.967
Residuals	18	1577.87	87.66

Omitting terms in a factorial design

The more levels and factors, the more parameters to estimate (and replications needed)

Costly to get enough observations / power
The assumption of normality becomes more critical when $r = 2$ !

It may be useful not to consider some interactions if they are known or (strongly) suspected not to be present

If important interactions are omitted from the model, biased estimates/output!

17 / 40

Guidelines for the interpretation of effects

Start with the most complicated term (bottom up)

If the three-way interaction $A B C$ is significative:
- don't interpret main effects or two-way interactions!
- comparison is done cell by cell within each level
If the $A B C$ term isn't significative:
- can marginalize and interpret lower order terms
- back to a series of two-way ANOVAs

18 / 40

Marginalization vs main effects

Marginalization means that we reduce the dimension of the problem, e.g., we transform a three-way ANOVA into a two-way ANOVA by collapsing over a dimension.

Main effects are the effects of factors $A$ , $B$ , $C$ (i.e., row, column and depth effects).

19 / 40

What contrasts are of interest?

Can view a three-way ANOVA as a series of one-way ANOVA or two-way ANOVA...

Depending on the goal and if the interactions are significative or not, could compare for variable $A$

marginal contrast $ψ_{A}$ (averaging over $B$ and $C$ )
marginal conditional contrast for particular subgroup: $ψ_{A}$ within $c_{1}$
contrast involving two variables: $ψ_{A B}$
contrast differences between treatment at $ψ_{A} \times B$ , averaging over $C$ .
etc.

See helper code and chapter 22 of Keppel & Wickens (2004) for a detailed example.

20 / 40

Effects and contrasts for microwave-popcorn

Following preplanned comparisons

Which combo (brand, power, time) gives highest popping rate? (pairwise comparisons of all combos)
Best brand overall (marginal means marginalizing over power and time, assuming no interaction)
Effect of time and power on percentage of popped kernels
pairwise comparison of time $\times$ power
main effect of power
main effect of time

21 / 40

Preplanned comparisons using `emmeans`

Let $A$ =brand, $B$ =power, $C$ =time

Compare difference between percentage of popped kernels for 4.5 versus 5 minutes, for brands 1 and 2

$H_{0} : (μ_{1.2} - μ_{1.3}) - (μ_{2.2} - μ_{2.3}) = 0$

library(emmeans)
# marginal means
emm_popcorn_AC <- emmeans(model, 
                          specs = c("brand","time"))
contrast_list <- 
  list(
    brand12with4.5vs5min = c(0, 0, 0, 1, -1, 0, -1, 1,0))
contrast(emm_popcorn_AC,  # marginal mean (no time)
         method = contrast_list) # list of contrasts

22 / 40

Preplanned comparisons

Compare all three times (4, 4.5 and 5 minutes)

At level 99% with Tukey's HSD method

Careful! Potentially misleading because there is a brand * time interaction present.

# List of variables to keep go in `specs`: keep only time
emm_popcorn_C <- emmeans(model, specs = "time")
pairs(emm_popcorn_C, 
      adjust = "tukey", 
      level = 0.99, 
      infer = TRUE)

23 / 40

Multivariate analysis of variance24 / 40

Motivational example

From Anandarajan et al. (2002), Canadian Accounting Perspective

This study questions whether the current or proposed Canadian standard of disclosing a going-concern contingency is viewed as equivalent to the standard adopted in the United States by financial statement users. We examined loan officers’ perceptions across three different formats

25 / 40

Alternative going-concern reporting formats

Bank loan officers were selected as the appropriate financial statement users for this study.

Experiment was conducted on the user’s interpretation of a going-concern contingency when it is provided in one of three disclosure formats:

Integrated note (Canadian standard)
Stand-alone note (Proposed standard)
Stand-alone note plus modified report with explanatory paragraph (standard adopted in US and other countries)

26 / 40

Multivariate response

27 / 40

Why use MANOVA?Control experimentwise errordo a single test instead of JJ univariate ANOVAs, thereby reducing the type I error

Detect differences in combination that would not be found with univariate tests
Increase power (context dependent)
28 / 40

Multivariate model

Postulate the following model: $Y_{i j} \sim {N o r m a l}_{p} (μ_{j}, Σ), j = 1, \dots J$

Each response $Y_{i j}$ is $p$ -dimensional.

We assume multivariate measurements are independent of one another, with

the same multivariate normal distribution
same covariance matrix $Σ$ (each measurement can have different variance)
same mean vector $μ_{j}$ within each $j = 1, \dots, J$ experimental groups.

The model is fitted using multivariate linear regression.

29 / 40

Model fitting with multivariate response

In R, we fit a model binding the different vectors of response in a matrix with $p$ columns

data(AVC02, package = "hecedsm")
# Fit the model binding variables with cbind
# on left of tilde (~) symbol
modMANOVA <- manova(
  cbind(prime, debt, profitability) ~ format, 
  data = AVC02)

30 / 40

Bivariate MANOVA

Confidence ellipses for bivariate MANOVA with discriminant analysis.

We use the correlation between the $p$ measurements to find better discriminant (the diagonal line is the best separating plane between the two variables).

31 / 40

Confidence intervals and confidence regions

Simultaneous confidence region (ellipse), marginal confidence intervals (blue) and Bonferroni-adjusted intervals (green).

The dashed lines show the univariate projections of the confidence ellipse.

32 / 40

Model assumptions

The more complex the model, the more assumptions...

Same as ANOVA, with in addition

The response follow a multivariate normal distribution
- Shapiro–Wilk test, univariate Q-Q plots
The covariance matrix is the same for all subjects
- Box's $M$ test is often used, but highly sensitive to departures from the null (other assumptions impact the test)

Normality matters more in small samples (but tests will often lead to rejection, notably because of rounded measurements or Likert scales)

33 / 40

When to use MANOVA?

In addition, for this model to make sense, you need just enough correlation (Goldilock principle)

if correlation is weak, use univariate analyses
- (no gain from multivariate approach relative to one-way ANOVAs)
- less power due to additional covariance parameter estimation
if correlation is too strong, redundancy
- don't use Likert scales that measure a similar dimension (rather, consider PLS or factor analysis)

Only combine elements that theoretically or conceptually make sense together.

34 / 40

Testing equality of mean vectors

The null hypothesis is that the $J$ groups have the same mean

$H_{0} : μ_{1} = \dots = μ_{J}$ against the alternative that at least one vector is different from the rest.
The null imposes $(J - 1) \times p$ restrictions on the parameters.

The test statistic is Hotelling's $T^{2}$ (with associated null distribution), but we can compute using an $F$ distribution.

35 / 40

Choice of test statistic

In higher dimensions, with $J \geq 3$ , there are many statistics that can be used to test equality of mean.

The statistics are constructed from within/between sum covariance matrices.

These are

Roy's largest root (most powerful provided all assumptions hold)
Wilk's $Λ$ : most powerful, most commonly used
Pillai's trace: most robust choice for departures from normality or equality of covariance matrices

Most give similar conclusion, and they are all equivalent with $J = 2$ .

36 / 40

Results for MANOVAsummary(modMANOVA) # Pilai is default

##            Df  Pillai approx F num Df den Df Pr(>F)
## format      2 0.02581  0.55782      6    256 0.7637
## Residuals 129
summary(modMANOVA, test = "Wilks")

##            Df   Wilks approx F num Df den Df Pr(>F)
## format      2 0.97424   0.5561      6    254  0.765
## Residuals 129
summary(modMANOVA, test = "Hotelling-Lawley")

##            Df Hotelling-Lawley approx F num Df den Df Pr(>F)
## format      2         0.026397  0.55434      6    252 0.7664
## Residuals 129
summary(modMANOVA, test = "Roy") # not reliable here?

##            Df      Roy approx F num Df den Df Pr(>F)
## format      2 0.024436   1.0426      3    128 0.3761
## Residuals 129
37 / 40

MANOVA for repeated measures

We can also use MANOVA for repeated measures to get away from the hypothesis of equal variance per group or equal correlation.

model <- afex::aov_ez(
  id = "id",           # subject id
  dv = "latency",      # response
  within = "stimulus", # within-subject
  data = hecedsm::AA21,
  fun_aggregate = mean)
model$Anova # for models fitted via 'afex'

38 / 40

Output

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##             Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)  1   0.95592   238.56      1     11 8.373e-09 ***
## stimulus     1   0.09419     0.52      2     10    0.6098    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Less powerful than repeated measures ANOVA because we have to estimate more parameters. Still assumes that the covariance structure is the same for each experimental group.

39 / 40

Follow-up analyses

Researchers often conduct post hoc univariate tests using univariate ANOVA. In R, Holm-Bonferonni's method is applied for marginal tests (you need to correct for multiple testing!)

# Results for univariate analysis of variance (as follow-up)
summary.aov(modMANOVA)
# Note the "rep.meas" as default name 
# to get means of each variable separately
emmeans::emmeans(modMANOVA, specs = c("format", "rep.meas"))

A better option is to proceed with descriptive discriminant analysis, a method that tries to find the linear combinations of the vector means to discriminate between groups. Beyond the scope of the course.

40 / 40

Multiway ANOVA and MANOVA

Session 7

MATH 80667A: Experimental Design and Statistical Methods
HEC Montréal

1 / 40

Outline

Multiway ANOVA

MANOVA

2 / 40

Multifactorial designs3 / 40

Beyond two factors

We can consider multiple factors $A$ , $B$ , $C$ , $\dots$ with respectively $n_{a}$ , $n_{b}$ , $n_{c}$ , $\dots$ levels and with $n_{r}$ replications for each.

The total number of treatment combinations is

$n_{a} \times n_{b} \times n_{c} \times \dots$

4 / 40

Beyond two factors

We can consider multiple factors $A$ , $B$ , $C$ , $\dots$ with respectively $n_{a}$ , $n_{b}$ , $n_{c}$ , $\dots$ levels and with $n_{r}$ replications for each.

The total number of treatment combinations is

$n_{a} \times n_{b} \times n_{c} \times \dots$

Curse of dimensionality

4 / 40

Full three-way ANOVA model

Each cell of the cube is allowed to have a different mean

$\begin{aligned} \underset{response}{Y_{i j k r}} = \underset{cell mean}{μ_{i j k}} + \underset{error}{ε_{i j k r}} \end{aligned}$ with $ε_{i j k t}$ are independent error term for

row $i$
column $j$
depth $k$
replication $r$

5 / 40

Parametrization of a three-way ANOVA model

With the sum-to-zero parametrization with factors $A$ , $B$ and $C$ , write the response as

6 / 40

global mean, row, column and depth main effects

row/col, row/depth and col/depth interactions and three-way interaction.

7 / 40

Example of three-way design

Petty, Cacioppo and Heesacker (1981). Effects of rhetorical questions on persuasion: A cognitive response analysis. Journal of Personality and Social Psychology.

A $2 \times 2 \times 2$ factorial design with 8 treatments groups and $n = 160$ undergraduates.

Setup: should a comprehensive exam be administered to bachelor students in their final year?

Response Likert scale on $- 5$ (do not agree at all) to $5$ (completely agree)
Factors
$A$ : strength of the argument (strong or weak)
$B$ : involvement of students low (far away, in a long time) or high (next year, at their university)
$C$ : style of argument, either regular form or rhetorical (Don't you think?, ...)

8 / 40

Interaction plot

Interaction plot for a $2 \times 2 \times 2$ factorial design from Petty, Cacioppo and Heesacker (1981)

9 / 40

p.472 of Keppel and Wickens

The microwave popcorn experiment

What is the best brand of microwave popcorn?

Factors
brand (two national, one local)
power: 500W and 600W
time: 4, 4.5 and 5 minutes
Response: ~~weight~~, ~~volume~~, ~~number~~, percentage of popped kernels.
Pilot study showed average of 70% overall popped kernels (10% standard dev), timing values reasonable
Power calculation suggested at least $r = 4$ replicates, but researchers proceeded with $r = 2$ ...

10 / 40

ANOVA table

data(popcorn, package = 'hecedsm')
# Fit model with three-way interaction
model <- aov(percentage ~ brand*power*time,
             data = popcorn)
# ANOVA table - 'anova' is ONLY for balanced designs
anova_table <- anova(model) 
# Quantile-quantile plot
car::qqPlot(model)

Model assumptions: plots and tests are meaningless with $n_{r} = 2$ replications per group...

11 / 40

Quantile-quantile plot

All points fall roughly on a straight line.

12 / 40

Code for interaction plot

popcorn |> 
   group_by(brand, time, power) |>
   summarize(meanp = mean(percentage)) |>
ggplot(mapping = aes(x = power, 
                     y = meanp, 
                     col = time, 
                     group = time)) + 
  geom_line() + 
  facet_wrap(~brand)

13 / 40

Interaction plot

No evidence of three-way interaction (hard to tell with $r = 2$ replications).

14 / 40

Variance decomposition for balanced designs

terms
degrees of freedom


AA
na−1na−1

BB
nb−1nb−1

CC
nc−1nc−1

ABAB
(na−1)(nb−1)(na−1)(nb−1)

ACAC
(na−1)(nc−1)(na−1)(nc−1)

BCBC
(nb−1)(nc−1)(nb−1)(nc−1)

ABCABC
(na−1)(nb−1)(nc−1)(na−1)(nb−1)(nc−1)

residualresidual
nanbnc(R−1)nanbnc(R−1)

totaltotal
nanbncnr−1nanbncnr−1


15 / 40

terms	degrees of freedom
$A$	$n_{a} - 1$
$B$	$n_{b} - 1$
$C$	$n_{c} - 1$
$A B$	$(n_{a} - 1) (n_{b} - 1)$
$A C$	$(n_{a} - 1) (n_{c} - 1)$
$B C$	$(n_{b} - 1) (n_{c} - 1)$
$A B C$	$(n_{a} - 1) (n_{b} - 1) (n_{c} - 1)$
$residual$	$n_{a} n_{b} n_{c} (R - 1)$
$total$	$n_{a} n_{b} n_{c} n_{r} - 1$

Analysis of variance table for the 3-way model
 
      
    Degrees of freedom 
    Sum of squares 
    Mean square 
    F statistic 
    p-value 
  


    brand 

10 
55 
89 
180 
  

    power 

11 
11 
19 
035 
  

    time 

58 
29 
87 
002 
  

    brand:power 

04 
02 
12 
349 
  

    brand:time 

86 
46 
09 
016 
  

    power:time 

71 
85 
27 
765 
  

    brand:power:time 

33 
83 
13 
967 
  

    Residuals 

87 
66 
     
     
  

/ 40

	Degrees of freedom	Sum of squares	Mean square	F statistic	p-value
brand	2	331.10	165.55	1.89	0.180
power	1	455.11	455.11	5.19	0.035
time	2	1554.58	777.29	8.87	0.002
brand:power	2	196.04	98.02	1.12	0.349
brand:time	4	1433.86	358.46	4.09	0.016
power:time	2	47.71	23.85	0.27	0.765
brand:power:time	4	47.33	11.83	0.13	0.967
Residuals	18	1577.87	87.66

Omitting terms in a factorial design

The more levels and factors, the more parameters to estimate (and replications needed)

Costly to get enough observations / power
The assumption of normality becomes more critical when $r = 2$ !

It may be useful not to consider some interactions if they are known or (strongly) suspected not to be present

If important interactions are omitted from the model, biased estimates/output!

17 / 40

Guidelines for the interpretation of effects

Start with the most complicated term (bottom up)

If the three-way interaction $A B C$ is significative:
- don't interpret main effects or two-way interactions!
- comparison is done cell by cell within each level
If the $A B C$ term isn't significative:
- can marginalize and interpret lower order terms
- back to a series of two-way ANOVAs

18 / 40

Marginalization vs main effects

Marginalization means that we reduce the dimension of the problem, e.g., we transform a three-way ANOVA into a two-way ANOVA by collapsing over a dimension.

Main effects are the effects of factors $A$ , $B$ , $C$ (i.e., row, column and depth effects).

19 / 40

What contrasts are of interest?

Can view a three-way ANOVA as a series of one-way ANOVA or two-way ANOVA...

Depending on the goal and if the interactions are significative or not, could compare for variable $A$

marginal contrast $ψ_{A}$ (averaging over $B$ and $C$ )
marginal conditional contrast for particular subgroup: $ψ_{A}$ within $c_{1}$
contrast involving two variables: $ψ_{A B}$
contrast differences between treatment at $ψ_{A} \times B$ , averaging over $C$ .
etc.

See helper code and chapter 22 of Keppel & Wickens (2004) for a detailed example.

20 / 40

Effects and contrasts for microwave-popcorn

Following preplanned comparisons

Which combo (brand, power, time) gives highest popping rate? (pairwise comparisons of all combos)
Best brand overall (marginal means marginalizing over power and time, assuming no interaction)
Effect of time and power on percentage of popped kernels
pairwise comparison of time $\times$ power
main effect of power
main effect of time

21 / 40

Preplanned comparisons using `emmeans`

Let $A$ =brand, $B$ =power, $C$ =time

Compare difference between percentage of popped kernels for 4.5 versus 5 minutes, for brands 1 and 2

$H_{0} : (μ_{1.2} - μ_{1.3}) - (μ_{2.2} - μ_{2.3}) = 0$

library(emmeans)
# marginal means
emm_popcorn_AC <- emmeans(model, 
                          specs = c("brand","time"))
contrast_list <- 
  list(
    brand12with4.5vs5min = c(0, 0, 0, 1, -1, 0, -1, 1,0))
contrast(emm_popcorn_AC,  # marginal mean (no time)
         method = contrast_list) # list of contrasts

22 / 40

Preplanned comparisons

Compare all three times (4, 4.5 and 5 minutes)

At level 99% with Tukey's HSD method

Careful! Potentially misleading because there is a brand * time interaction present.

# List of variables to keep go in `specs`: keep only time
emm_popcorn_C <- emmeans(model, specs = "time")
pairs(emm_popcorn_C, 
      adjust = "tukey", 
      level = 0.99, 
      infer = TRUE)

23 / 40

Multivariate analysis of variance24 / 40

Motivational example

From Anandarajan et al. (2002), Canadian Accounting Perspective

This study questions whether the current or proposed Canadian standard of disclosing a going-concern contingency is viewed as equivalent to the standard adopted in the United States by financial statement users. We examined loan officers’ perceptions across three different formats

25 / 40

Alternative going-concern reporting formats

Bank loan officers were selected as the appropriate financial statement users for this study.

Experiment was conducted on the user’s interpretation of a going-concern contingency when it is provided in one of three disclosure formats:

Integrated note (Canadian standard)
Stand-alone note (Proposed standard)
Stand-alone note plus modified report with explanatory paragraph (standard adopted in US and other countries)

26 / 40

Multivariate response

27 / 40

Why use MANOVA?Control experimentwise errordo a single test instead of JJ univariate ANOVAs, thereby reducing the type I error

Detect differences in combination that would not be found with univariate tests
Increase power (context dependent)
28 / 40

Multivariate model

Postulate the following model: $Y_{i j} \sim {N o r m a l}_{p} (μ_{j}, Σ), j = 1, \dots J$

Each response $Y_{i j}$ is $p$ -dimensional.

We assume multivariate measurements are independent of one another, with

the same multivariate normal distribution
same covariance matrix $Σ$ (each measurement can have different variance)
same mean vector $μ_{j}$ within each $j = 1, \dots, J$ experimental groups.

The model is fitted using multivariate linear regression.

29 / 40

Model fitting with multivariate response

In R, we fit a model binding the different vectors of response in a matrix with $p$ columns

data(AVC02, package = "hecedsm")
# Fit the model binding variables with cbind
# on left of tilde (~) symbol
modMANOVA <- manova(
  cbind(prime, debt, profitability) ~ format, 
  data = AVC02)

30 / 40

Bivariate MANOVA

Confidence ellipses for bivariate MANOVA with discriminant analysis.

We use the correlation between the $p$ measurements to find better discriminant (the diagonal line is the best separating plane between the two variables).

31 / 40

Confidence intervals and confidence regions

Simultaneous confidence region (ellipse), marginal confidence intervals (blue) and Bonferroni-adjusted intervals (green).

The dashed lines show the univariate projections of the confidence ellipse.

32 / 40

Model assumptions

The more complex the model, the more assumptions...

Same as ANOVA, with in addition

The response follow a multivariate normal distribution
- Shapiro–Wilk test, univariate Q-Q plots
The covariance matrix is the same for all subjects
- Box's $M$ test is often used, but highly sensitive to departures from the null (other assumptions impact the test)

Normality matters more in small samples (but tests will often lead to rejection, notably because of rounded measurements or Likert scales)

33 / 40

When to use MANOVA?

In addition, for this model to make sense, you need just enough correlation (Goldilock principle)

if correlation is weak, use univariate analyses
- (no gain from multivariate approach relative to one-way ANOVAs)
- less power due to additional covariance parameter estimation
if correlation is too strong, redundancy
- don't use Likert scales that measure a similar dimension (rather, consider PLS or factor analysis)

Only combine elements that theoretically or conceptually make sense together.

34 / 40

Testing equality of mean vectors

The null hypothesis is that the $J$ groups have the same mean

$H_{0} : μ_{1} = \dots = μ_{J}$ against the alternative that at least one vector is different from the rest.
The null imposes $(J - 1) \times p$ restrictions on the parameters.

The test statistic is Hotelling's $T^{2}$ (with associated null distribution), but we can compute using an $F$ distribution.

35 / 40

Choice of test statistic

In higher dimensions, with $J \geq 3$ , there are many statistics that can be used to test equality of mean.

The statistics are constructed from within/between sum covariance matrices.

These are

Roy's largest root (most powerful provided all assumptions hold)
Wilk's $Λ$ : most powerful, most commonly used
Pillai's trace: most robust choice for departures from normality or equality of covariance matrices

Most give similar conclusion, and they are all equivalent with $J = 2$ .

36 / 40

Results for MANOVAsummary(modMANOVA) # Pilai is default

##            Df  Pillai approx F num Df den Df Pr(>F)
## format      2 0.02581  0.55782      6    256 0.7637
## Residuals 129
summary(modMANOVA, test = "Wilks")

##            Df   Wilks approx F num Df den Df Pr(>F)
## format      2 0.97424   0.5561      6    254  0.765
## Residuals 129
summary(modMANOVA, test = "Hotelling-Lawley")

##            Df Hotelling-Lawley approx F num Df den Df Pr(>F)
## format      2         0.026397  0.55434      6    252 0.7664
## Residuals 129
summary(modMANOVA, test = "Roy") # not reliable here?

##            Df      Roy approx F num Df den Df Pr(>F)
## format      2 0.024436   1.0426      3    128 0.3761
## Residuals 129
37 / 40

MANOVA for repeated measures

We can also use MANOVA for repeated measures to get away from the hypothesis of equal variance per group or equal correlation.

model <- afex::aov_ez(
  id = "id",           # subject id
  dv = "latency",      # response
  within = "stimulus", # within-subject
  data = hecedsm::AA21,
  fun_aggregate = mean)
model$Anova # for models fitted via 'afex'

38 / 40

Output

## 
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
##             Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)  1   0.95592   238.56      1     11 8.373e-09 ***
## stimulus     1   0.09419     0.52      2     10    0.6098    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Less powerful than repeated measures ANOVA because we have to estimate more parameters. Still assumes that the covariance structure is the same for each experimental group.

39 / 40

Follow-up analyses

Researchers often conduct post hoc univariate tests using univariate ANOVA. In R, Holm-Bonferonni's method is applied for marginal tests (you need to correct for multiple testing!)

# Results for univariate analysis of variance (as follow-up)
summary.aov(modMANOVA)
# Note the "rep.meas" as default name 
# to get means of each variable separately
emmeans::emmeans(modMANOVA, specs = c("format", "rep.meas"))

40 / 40

↑, ←, Pg Up, k	Go to previous slide
↓, →, Pg Dn, Space, j	Go to next slide
Home	Go to first slide
End	Go to last slide
Number + Return	Go to specific slide
b / m / f	Toggle blackout / mirrored / fullscreen mode
c	Clone slideshow
p	Toggle presenter mode
t	Restart the presentation timer
?, h	Toggle this help
o	Tile View: Overview of Slides

Multiway ANOVA and MANOVA

Outline

Multifactorial designs

Beyond two factors

Beyond two factors

Full three-way ANOVA model

Parametrization of a three-way ANOVA model

Example of three-way design

Interaction plot

The microwave popcorn experiment

ANOVA table

Quantile-quantile plot

Code for interaction plot

Interaction plot

Variance decomposition for balanced designs

Analysis of variance table for the 3-way model

Omitting terms in a factorial design

Guidelines for the interpretation of effects

Marginalization vs main effects

What contrasts are of interest?

Effects and contrasts for microwave-popcorn

Preplanned comparisons using emmeans

Preplanned comparisons

Multivariate analysis of variance

Motivational example

Alternative going-concern reporting formats

Multivariate response

Why use MANOVA?

Multivariate model

Model fitting with multivariate response

Bivariate MANOVA

Confidence intervals and confidence regions

Model assumptions

When to use MANOVA?

Testing equality of mean vectors

Choice of test statistic

Results for MANOVA

MANOVA for repeated measures

Output

Follow-up analyses

Outline

Help

Multiway ANOVA and MANOVA

Multiway ANOVA and MANOVA

Outline

Multifactorial designs

Beyond two factors

Beyond two factors

Full three-way ANOVA model

Parametrization of a three-way ANOVA model

Example of three-way design

Interaction plot

The microwave popcorn experiment

ANOVA table

Quantile-quantile plot

Code for interaction plot

Interaction plot

Variance decomposition for balanced designs

Analysis of variance table for the 3-way model

Omitting terms in a factorial design

Guidelines for the interpretation of effects

Marginalization vs main effects

What contrasts are of interest?

Effects and contrasts for microwave-popcorn

Preplanned comparisons using emmeans

Preplanned comparisons

Multivariate analysis of variance

Motivational example

Alternative going-concern reporting formats

Multivariate response

Why use MANOVA?

Multivariate model

Model fitting with multivariate response

Bivariate MANOVA

Confidence intervals and confidence regions

Model assumptions

When to use MANOVA?

Testing equality of mean vectors

Choice of test statistic

Results for MANOVA

Preplanned comparisons using `emmeans`

Preplanned comparisons using `emmeans`