
Assumptions of one-way ANOVA

  • Independence: Observations within and between groups are independent.

  • Normality: Residual scores are normally distributed.

    • Residuals are the differences between each individual score and the group mean.

    • This assumption means that the distribution of the raw data within each group is normal, centered at that group’s mean.

  • Homogeneity of variance: Population variances are equal across groups.

  • Scale of measurement: The dependent variable is continuous (i.e., interval or ratio scale); the independent variable is categorical.

Independence assumption

  • There is no direct test of independence; it must be addressed via study design:

    • E.g., different subjects in each group

    • E.g., no repeated measures unless using repeated-measures ANOVA (we will cover this in another slide deck)

Why Does Normality of Residuals Matter?

  • The F-statistic is a ratio of two \(\chi^2\) variables, each scaled by its degrees of freedom:

\[ F = \frac{MS_{\text{between}}}{MS_{\text{within}}} \]

What is a \(\chi^2\) distribution?

  • If \(Z_1, Z_2, \dots, Z_k\) are independent and each \(Z_i \sim \mathscr{N}(0, 1)\), then:

\[ \sum_{i=1}^k Z_i^2 \sim \chi^2_k \]

  • This is the definition of a \(\chi^2\) distribution with \(k\) degrees of freedom.

  • The \(\chi^2\) distribution is completely defined by its degrees of freedom.
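
  • A quick simulation check of this definition (a sketch; the number of draws and the choice k = 5 are arbitrary):

set.seed(1)

k <- 5                                    # degrees of freedom
z <- matrix(rnorm(10000 * k), ncol = k)   # 10,000 draws of k independent standard normals

sums <- rowSums(z^2)                      # sum of k squared standard normals per draw

mean(sums)  # should be close to k  (the mean of a chi-square with k df)
var(sums)   # should be close to 2k (its variance)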

Why Does Normality of Residuals Matter?

  • The MS terms in the F-statistic are built from sums of squares (SS):

\[ SS_{\text{between}} = \sum_{i=1}^k n_i (\bar{Y}_i - \bar{Y})^2 \]

\[ SS_{\text{within}} = \sum_{i=1}^k \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2 \]
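
  • Each MS is the corresponding SS divided by its degrees of freedom:

\[ MS_{\text{between}} = \frac{SS_{\text{between}}}{k - 1}, \qquad MS_{\text{within}} = \frac{SS_{\text{within}}}{N - k} \]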

Why Does Normality of Residuals Matter?

  • For these to follow \(\chi^2\) distributions:

    • The deviations inside each sum must be normally distributed

How Normal Residuals Lead to \(\chi^2\) Terms

  • A residual is just the difference between an individual score and its group mean:

\[ \begin{align*} \varepsilon_{ij} &= \text{observation}_{ij} - \text{group mean}_i \\ \varepsilon_{ij} &= Y_{ij} - \mu_i \end{align*} \]

How Normal Residuals Lead to \(\chi^2\) Terms

  • Assume residuals are independent and normal:

\[ \varepsilon_{ij} = Y_{ij} - \mu_i \sim \mathscr{N}(0, \sigma^2) \]

How Normal Residuals Lead to \(\chi^2\) Terms

  • Then sample means are also normal:

\[ \bar{Y}_i = \mu_i + \frac{1}{n_i} \sum \varepsilon_{ij} \sim \mathscr{N}\left(\mu_i, \frac{\sigma^2}{n_i}\right) \]

How Normal Residuals Lead to \(\chi^2\) Terms

  • So:

  • \(\bar{Y}_i - \bar{Y} \sim \text{Normal}\)

  • \(Y_{ij} - \bar{Y}_i \sim \text{Normal}\)

How Normal Residuals Lead to \(\chi^2\) Terms

  • Squaring and summing these normal deviations gives (scaled) \(\chi^2\) terms:

  • \(SS_{\text{between}} / \sigma^2 \sim \chi^2_{k-1}\) (under the null hypothesis)

  • \(SS_{\text{within}} / \sigma^2 \sim \chi^2_{N - k}\)

Summary: What Normality Ensures

  • Normal residuals \(\rightarrow\) normal sample means and deviations

  • Squared normal deviations \(\rightarrow\) \(\chi^2\) distributed sums

  • \(\chi^2\) numerator & denominator \(\rightarrow\) F-distributed ratio

Checking Normality of Residuals

  • Residual-based check (model-level)
    • Fit the ANOVA model and extract its residuals (see the sketch below)
    • Tests whether the model residuals are normally distributed
    • This is the standard approach to check the ANOVA assumption
    • Use when verifying the validity of the F-test
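
  • A minimal sketch of this model-level check, assuming the data are in a data frame d with a numeric outcome rt and a grouping factor group (placeholder names, chosen for illustration):

# Fit a one-way ANOVA and pull out its residuals
# ('d', 'rt', and 'group' are placeholder names for illustration)
fit <- aov(rt ~ group, data = d)
res <- residuals(fit)

# Check the residuals' distribution
hist(res)          # visual check
shapiro.test(res)  # formal test of normality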

Checking Normality in the data

  • Recall that normality of residuals is equivalent to normality of the data within each group
  • Test normality within each group (see the sketch below)
    • Helps identify which group(s) may be problematic
    • Useful when residuals look non-normal and you want to localize the issue
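
  • A minimal data.table sketch of the group-wise check, using the same placeholder names (d, rt, group):

library(data.table)

# Shapiro-Wilk p-value computed separately within each group
# ('d', 'rt', and 'group' are placeholder names for illustration)
d[, .(shapiro_p = shapiro.test(rt)$p.value), by = group]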

Checking for normality of residuals

  • If you had the residuals from your ANOVA, you could check for normality using a histogram or a statistical test.

  • ezANOVA() does not expose the residuals by default, but we can tell it to return the underlying aov object (which contains them) via the return_aov argument:

library(ez)  # ezANOVA() comes from the ez package

ezANOVA(
  data = d,
  dv = .(rt),
  wid = .(subject),
  between = .(stimulation, task),
  type = 3,          # Type III sums of squares
  return_aov = TRUE  # also return the underlying aov object
)

Example: Extracting residuals from ezANOVA()

  • Suppose we’re testing how brain stimulation (Sham vs. TMS) and task type (Working Memory vs. Attention) affect response time in a cognitive task. Our data.table looks like this:
##    subject stimulation task       rt
## 1        1        Sham   WM 527.5810
## 2        2        Sham   WM 540.7929
## 3        3        Sham   WM 612.3483
## 4        4        Sham   WM 552.8203
## 5        5        Sham   WM 555.1715
## 6        6        Sham   WM 618.6026
## 7        7        Sham   WM 568.4366
## 8        8        Sham   WM 499.3976
## 9        9        Sham   WM 522.5259
## 10      10        Sham   WM 532.1735
## 11      11        Sham   WM 598.9633
## 12      12        Sham   WM 564.3926
## 13      13        Sham   WM 566.0309
## 14      14        Sham   WM 554.4273
## 15      15        Sham   WM 527.7664
## 16      16        Sham   WM 621.4765
## 17      17        Sham   WM 569.9140
## 18      18        Sham   WM 471.3353
## 19      19        Sham   WM 578.0542
## 20      20        Sham   WM 531.0883
## 21      21        Sham ATTN 557.2871
## 22      22        Sham ATTN 591.2810
## 23      23        Sham ATTN 558.9598
## 24      24        Sham ATTN 570.8444
## 25      25        Sham ATTN 574.9984
## 26      26        Sham ATTN 532.5323
## 27      27        Sham ATTN 633.5115
## 28      28        Sham ATTN 606.1349
## 29      29        Sham ATTN 554.4745
## 30      30        Sham ATTN 650.1526
## 31      31        Sham ATTN 617.0586
## 32      32        Sham ATTN 588.1971
## 33      33        Sham ATTN 635.8050
## 34      34        Sham ATTN 635.1253
## 35      35        Sham ATTN 632.8632
## 36      36        Sham ATTN 627.5456
## 37      37        Sham ATTN 622.1567
## 38      38        Sham ATTN 597.5235
## 39      39        Sham ATTN 587.7615
## 40      40        Sham ATTN 584.7812
## 41      41         TMS   WM 492.2117
## 42      42         TMS   WM 511.6833
## 43      43         TMS   WM 469.3841
## 44      44         TMS   WM 606.7582
## 45      45         TMS   WM 568.3185
## 46      46         TMS   WM 475.0757
## 47      47         TMS   WM 503.8846
## 48      48         TMS   WM 501.3338
## 49      49         TMS   WM 551.1986
## 50      50         TMS   WM 516.6652
## 51      51         TMS   WM 530.1327
## 52      52         TMS   WM 518.8581
## 53      53         TMS   WM 518.2852
## 54      54         TMS   WM 574.7441
## 55      55         TMS   WM 510.9692
## 56      56         TMS   WM 580.6588
## 57      57         TMS   WM 458.0499
## 58      58         TMS   WM 543.3845
## 59      59         TMS   WM 524.9542
## 60      60         TMS   WM 528.6377
## 61      61         TMS ATTN 605.1856
## 62      62         TMS ATTN 569.9071
## 63      63         TMS ATTN 576.6717
## 64      64         TMS ATTN 549.2570
## 65      65         TMS ATTN 547.1284
## 66      66         TMS ATTN 602.1411
## 67      67         TMS ATTN 607.9284
## 68      68         TMS ATTN 592.1202
## 69      69         TMS ATTN 626.8907
## 70      70         TMS ATTN 672.0034
## 71      71         TMS ATTN 570.3588
## 72      72         TMS ATTN 497.6332
## 73      73         TMS ATTN 630.2295
## 74      74         TMS ATTN 561.6320
## 75      75         TMS ATTN 562.4797
## 76      76         TMS ATTN 631.0229
## 77      77         TMS ATTN 578.6091
## 78      78         TMS ATTN 541.1713
## 79      79         TMS ATTN 597.2521
## 80      80         TMS ATTN 584.4443

Example: Extracting residuals from ezANOVA()

  • Run the ANOVA and extract the residuals:
# Run ezANOVA
ez_result <- ezANOVA(
  data = d,
  dv = .(rt),
  wid = .(subject),
  between = .(stimulation, task),
  type = 3,
  return_aov = TRUE
)

# Extract and check residuals
res <- residuals(ez_result$aov)

Example: Extracting residuals from ezANOVA()

  • Now that we have the residuals, we can check their distribution using the approaches we’ve already covered (e.g., histogram, Shapiro-Wilk test)

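  • For example, applying the Shapiro-Wilk test to the extracted residuals:

shapiro.test(res)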
## 
##  Shapiro-Wilk normality test
## 
## data:  res
## W = 0.99368, p-value = 0.9663

Homogeneity of variance

  • Also called homoscedasticity

  • Assumes population variances are equal across all groups.

  • Especially important when sample sizes are unequal.

  • Levene’s test is a common test for homogeneity of variance and is provided in the ez output.
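
  • For reference, a standalone call can be sketched with the car package's leveneTest(), again using the placeholder names d, rt, and group:

library(car)

# Levene's test for equality of variances across groups
# ('d', 'rt', and 'group' are placeholder names for illustration)
leveneTest(rt ~ group, data = d)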

Robustness of ANOVA Assumptions

  • Normality of residuals:

    • ANOVA is robust to violations of normality when:
      • Sample sizes are moderate to large (e.g., \(\geq 30\))
      • Group sizes are equal or similar
    • ANOVA is not robust to violations of normality when:
      • Sample sizes are small
      • Group sizes are highly unequal

Robustness of ANOVA Assumptions

  • Homogeneity of variance:

    • ANOVA is more robust when:
      • Group sizes are equal
      • Variance differences are small to moderate
    • ANOVA is not robust, even with equal group sizes, when:
      • Variance differences are large

Violations of homogeneity of variance

Question

You run a one-way ANOVA and extract residuals. The histogram below shows the residuals. What should you conclude?

The residuals are clearly skewed, violating the normality assumption of ANOVA. If sample sizes are small or group sizes are unequal, this could invalidate the F-test.

Question

Consider the boxplot below. All group means are equal. What ANOVA assumption might be violated?

This violates the assumption of homogeneity of variance. Although group means are equal, Group C has much higher variability. If group sizes were also unequal, the F-test would be especially unreliable.

Question

You simulate ANOVA results under the null (equal group means) while increasing the variance in one group. What does this tell you?

As the variance in Group C increases, the Type I error rate rises above 0.05 — even though group sizes are equal. This demonstrates that ANOVA is not robust to extreme variance inequality, especially when variance ratios are large.
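
A minimal simulation in this spirit (not the exact code behind the question; the group labels, sample size, and inflated standard deviation are illustrative):

set.seed(1)

n      <- 20    # equal group sizes
sd_c   <- 4     # Group C's standard deviation (Groups A and B have SD = 1)
n_sims <- 2000

p_vals <- replicate(n_sims, {
  d_sim <- data.frame(
    y = c(rnorm(n, 0, 1), rnorm(n, 0, 1), rnorm(n, 0, sd_c)),  # equal means: the null is true
    g = factor(rep(c("A", "B", "C"), each = n))
  )
  summary(aov(y ~ g, data = d_sim))[[1]][["Pr(>F)"]][1]
})

mean(p_vals < 0.05)  # empirical Type I error rate; climbs above 0.05 as sd_c grows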

Question

You simulate ANOVA results under the null hypothesis by explicitly generating non-normal residuals, especially skewed in the smallest group. How does increasing imbalance affect Type I error?

This simulation imposes non-normal residuals specifically in the smallest group, creating asymmetry in the data that violates ANOVA’s assumptions. As imbalance increases, so does the Type I error rate. This confirms that ANOVA is not robust when residuals are non-normal and group sizes are unequal.
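
A sketch of this kind of simulation (again illustrative, not the exact code behind the question): the smallest group gets skewed, mean-zero errors while the larger groups stay normal, and all group means are equal.

set.seed(2)

n_small <- 10    # smallest group, given skewed residuals
n_large <- 50    # larger groups; widen the gap to increase imbalance
n_sims  <- 2000

p_vals <- replicate(n_sims, {
  d_sim <- data.frame(
    y = c(rexp(n_small) - 1,                  # skewed, mean-zero errors in the small group
          rnorm(n_large), rnorm(n_large)),    # normal errors in the larger groups
    g = factor(rep(c("A", "B", "C"), times = c(n_small, n_large, n_large)))
  )
  summary(aov(y ~ g, data = d_sim))[[1]][["Pr(>F)"]][1]
})

mean(p_vals < 0.05)  # empirical Type I error rate; drifts above 0.05 as imbalance grows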