# load libraries

library(data.table)
library(ggplot2)
library(ez)

# clean work space

rm(list = ls())

# init colorscheme

COL <- c("#2271B2", "#E69F00", "#D55E00")
names(COL) <- c("blue", "orange", "red")
theme_set(
  theme_minimal(base_size = 13) +
    theme(
      panel.grid.minor = element_blank(),
      strip.text = element_text(face = "bold"),
      legend.position = "bottom"
    )
)
update_geom_defaults("point", list(size = 2))
update_geom_defaults("line", list(linewidth = 0.8))

Overview

This tutorial introduces three closely related ANOVA designs that arise frequently in cognitive science research:

(a) Repeated-measures (fully within-subjects) ANOVA — every participant contributes data to every level of every factor. This removes between-subject variability from the error term, increasing statistical power.

(b) Two-way factorial ANOVA — two factors are crossed so that every combination of levels is observed. Both factors can be between-subjects, or both can be within-subjects (fully repeated measures).

(c) Mixed ANOVA — one factor is between-subjects (different participants in each group) and one factor is within-subjects (every participant experiences every level). This is extremely common in experimental psychology and neuroscience.

All three designs will be fitted using ezANOVA() from the ez package. ezANOVA() handles the bookkeeping of within- versus between-subjects factors, computes Mauchly’s test of sphericity where needed, and provides epsilon-corrected p-values automatically.


Part 1 — Repeated measures ANOVA (pigeon dataset, within-subjects)

Dataset description

Twenty pigeons participated in a reversal-learning experiment. Each bird completed three successive phases: an initial learning phase (learn), a reversal phase (reverse), and a transfer test phase (test). Ten blocks of trials were recorded within each phase. Birds also belonged to one of two groups — experimental or control — but we begin by ignoring the group factor so we can focus on understanding the within-subjects structure.

Load and aggregate

We first collapse across blocks to obtain one proportion-correct value per bird per phase. This is the level of aggregation required for a repeated-measures ANOVA: one observation per participant per cell.

d <- fread("data/pigeon_reversal_summary.csv")
d_bird_phase <- d[, .(prop_correct = mean(correct)), .(bird_id, group, phase)]
d_bird_phase[, phase := factor(phase, levels = c("learn", "reverse", "test"))]
head(d_bird_phase)
##    bird_id        group   phase prop_correct
##      <int>       <char>  <fctr>        <num>
## 1:       1 experimental   learn         0.67
## 2:       1 experimental reverse         0.58
## 3:       1 experimental    test         0.79
## 4:       2 experimental   learn         0.71
## 5:       2 experimental reverse         0.49
## 6:       2 experimental    test         0.87
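
Before fitting, it can be worth confirming that the aggregated table really has one row per bird per phase. The sketch below is only an illustration: the data frame `sim` and its values are simulated stand-ins (not the pigeon file), and it uses base R so that it runs without the CSV.

```r
# Simulated stand-in for block-level data: 4 birds x 3 phases x 10 blocks.
# All values are hypothetical; only the column layout mirrors the tutorial.
set.seed(1)
sim <- expand.grid(
  bird_id = 1:4,
  phase   = c("learn", "reverse", "test"),
  block   = 1:10
)
sim$correct <- rbinom(nrow(sim), size = 1, prob = 0.7)

# Collapse across blocks: one mean per bird per phase.
sim_agg <- aggregate(correct ~ bird_id + phase, data = sim, FUN = mean)
names(sim_agg)[names(sim_agg) == "correct"] <- "prop_correct"

# Every bird x phase cell should contain exactly one row.
cell_counts <- table(sim_agg$bird_id, sim_agg$phase)
all(cell_counts == 1)
```

If the final line returns FALSE, some cell is missing or duplicated, and the aggregation step needs another look before the data are passed to ezANOVA().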

Fit the repeated-measures ANOVA

With ezANOVA() we specify:

  • dv — the dependent variable
  • wid — the participant identifier
  • within — the within-subjects factor(s)
  • type = 3 — Type III sums of squares (standard for unbalanced or interaction-containing designs)
  • detailed = TRUE — also returns the numerator and denominator sums of squares (SSn, SSd) and the (Intercept) row, in addition to F and p

fit_rm <- ezANOVA(
  data    = d_bird_phase,
  dv      = prop_correct,
  wid     = bird_id,
  within  = phase,
  type    = 3,
  detailed = TRUE
)
print(fit_rm)
## $ANOVA
##        Effect DFn DFd      SSn      SSd           F            p p<.05       ges
## 1 (Intercept)   1  19 28.44193 0.053365 10126.42678 2.217231e-27     * 0.9925804
## 2       phase   2  38  0.81676 0.159240    97.45315 1.095077e-15     * 0.7934600
## 
## $`Mauchly's Test for Sphericity`
##   Effect         W         p p<.05
## 2  phase 0.7771319 0.1033838      
## 
## $`Sphericity Corrections`
##   Effect       GGe       p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 2  phase 0.8177497 3.17128e-13         * 0.884275 3.999088e-14         *
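
To see where the F and p columns come from, the phase row can be recomputed by hand from the sums of squares and degrees of freedom printed above. This is a small arithmetic sketch using the rounded values from the table, so the result should agree with the printed F only up to rounding.

```r
# Values copied from the `phase` row of the printed $ANOVA table.
SSn <- 0.81676;  DFn <- 2
SSd <- 0.159240; DFd <- 38

# F is the ratio of the effect mean square to the error mean square.
F_phase <- (SSn / DFn) / (SSd / DFd)

# The upper tail of the F distribution gives the uncorrected p-value.
p_phase <- pf(F_phase, DFn, DFd, lower.tail = FALSE)

round(F_phase, 2)
signif(p_phase, 3)
```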

Sphericity

When a within-subjects factor has three or more levels, the ANOVA assumes sphericity — that the variances of the differences between all pairs of levels are equal. ezANOVA() automatically runs Mauchly’s test to check this assumption. If Mauchly’s test is significant (p < .05), the sphericity assumption is violated and the uncorrected F-test p-value is too liberal.
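
The sphericity assumption can be made concrete by computing the variances of the pairwise difference scores directly. The sketch below uses a simulated wide-format matrix `Y` (hypothetical values, not the pigeon data), computes the Greenhouse-Geisser epsilon from its textbook definition, and runs base R's mauchly.test() on the same contrasts. It is meant as an illustration of the idea, not as a description of how ezANOVA() computes these quantities internally.

```r
# Hypothetical wide-format scores: 20 subjects (rows) x 3 levels (columns).
set.seed(42)
subject_effect <- rnorm(20, mean = 0, sd = 0.05)
Y <- cbind(
  learn   = 0.70 + subject_effect + rnorm(20, sd = 0.03),
  reverse = 0.55 + subject_effect + rnorm(20, sd = 0.03),
  test    = 0.80 + subject_effect + rnorm(20, sd = 0.03)
)

# Sphericity says these three variances are equal in the population.
lvl_pairs <- combn(colnames(Y), 2)
diff_vars <- apply(lvl_pairs, 2, function(p) var(Y[, p[1]] - Y[, p[2]]))
names(diff_vars) <- apply(lvl_pairs, 2, paste, collapse = " - ")
diff_vars

# Greenhouse-Geisser epsilon from the covariance of orthonormal contrasts.
k <- ncol(Y)
C <- contr.helmert(k)
C <- sweep(C, 2, sqrt(colSums(C^2)), "/")   # scale columns to unit length
S <- cov(Y %*% C)
eps_gg <- sum(diag(S))^2 / ((k - 1) * sum(S^2))
eps_gg

# Base R's Mauchly test on the same within-subject contrasts.
mauchly.test(lm(Y ~ 1), X = ~1)
```

Epsilon is bounded between 1 / (k - 1) and 1; values near 1 indicate that the sample covariance structure is close to spherical.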

Two corrections are available in the output:

  • GGe (Greenhouse–Geisser epsilon) — a conservative correction that reduces the degrees of freedom. Use p[GG] when GGe < 0.75.
  • HFe (Huynh–Feldt epsilon) — a less conservative correction. Use p[HF] when GGe ≥ 0.75 but sphericity is still violated.

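When Mauchly’s test is not significant, use the uncorrected p-value in the ANOVA table. In the output above, W = 0.777 with p = .103, so sphericity is not rejected and the uncorrected p-value for phase applies.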
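
The rule of thumb above can be written as a small helper. The function `choose_p()` is hypothetical (it is not part of the ez package) and simply encodes the three cases; the inputs are copied from the printed output for phase. The last lines sketch how an epsilon correction operates, on the assumption that the corrected p-value comes from the same F statistic with both degrees of freedom multiplied by epsilon.

```r
# Hypothetical helper encoding the rule of thumb described above.
choose_p <- function(mauchly_p, gge, p_raw, p_gg, p_hf, alpha = 0.05) {
  if (mauchly_p >= alpha) {
    list(which = "uncorrected", p = p_raw)
  } else if (gge < 0.75) {
    list(which = "Greenhouse-Geisser", p = p_gg)
  } else {
    list(which = "Huynh-Feldt", p = p_hf)
  }
}

# Values copied from the printed output for `phase`.
res <- choose_p(
  mauchly_p = 0.1033838, gge = 0.8177497,
  p_raw = 1.095077e-15, p_gg = 3.17128e-13, p_hf = 3.999088e-14
)
res$which

# Assumed mechanism of the correction: scale both df by epsilon.
F_phase <- 97.45315
p_gg_manual <- pf(F_phase, 2 * 0.8177497, 38 * 0.8177497, lower.tail = FALSE)
signif(p_gg_manual, 3)
```

Because the corrected degrees of freedom are smaller, the corrected p-value is never smaller than the uncorrected one.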


Part 2 — Two-way mixed ANOVA (pigeon: group × phase)

Adding a between-subjects factor

Now we include group as a between-subjects factor alongside phase as the within-subjects factor. This gives us a 2 × 3 mixed ANOVA. We can test:

  1. Main effect of group — do experimental and control birds differ overall in accuracy?
  2. Main effect of phase — does accuracy differ across learn, reverse, and test phases, averaging across groups?
  3. Group × phase interaction — does the effect of phase differ between the two groups?

fit_mixed_pigeon <- ezANOVA(
  data    = d_bird_phase,
  dv      = prop_correct,
  wid     = bird_id,
  between = group,
  within  = phase,
  type    = 3,
  detailed = TRUE
)
print(fit_mixed_pigeon)
## $ANOVA
##        Effect DFn DFd         SSn        SSd           F            p p<.05       ges
## 1 (Intercept)   1  18 28.44193500 0.03568333 14347.16945 1.413110e-27     * 0.9958771
## 2       group   1  18  0.01768167 0.03568333     8.91929 7.913966e-03     * 0.1305578
## 3       phase   2  36  0.81676000 0.08206667   179.14314 1.944622e-19     * 0.8739981
## 4 group:phase   2  36  0.07717333 0.08206667    16.92673 6.577536e-06     * 0.3959163
## 
## $`Mauchly's Test for Sphericity`
##        Effect        W         p p<.05
## 3       phase 0.873384 0.3164057      
## 4 group:phase 0.873384 0.3164057      
## 
## $`Sphericity Corrections`
##        Effect       GGe        p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 3       phase 0.8876139 1.684781e-17         * 0.977805 4.691907e-19         *
## 4 group:phase 0.8876139 1.833819e-05         * 0.977805 8.051624e-06         *

Interpreting the output

The ANOVA component of the output lists a row for each effect: group, phase, and group:phase, plus an (Intercept) row that tests whether the grand mean differs from zero and is rarely of interest. Each row gives the numerator and denominator degrees of freedom (DFn, DFd), the corresponding sums of squares (SSn, SSd), the F statistic, the p-value (p), and generalised eta-squared (ges), a measure of effect size. The Mauchly's Test and Sphericity Corrections tables include only effects that involve the within-subjects factor.

An interaction (group:phase) is significant when the relationship between phase and accuracy is not the same for both groups. For example, experimental birds might show a large drop in accuracy during reversal while control birds show no such drop. When an interaction is present, be cautious about interpreting the main effects in isolation — a significant main effect of group could simply reflect the different trajectories across phases rather than a consistent group difference.
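
One way to make "the relationship is not the same for both groups" concrete is a difference of differences on the cell means. The cell means below are invented for illustration (they are not taken from the pigeon data) and mirror the example of a reversal drop that appears in one group only.

```r
# Hypothetical cell means (not taken from the pigeon data).
cell_means <- matrix(
  c(0.70, 0.50, 0.80,    # experimental
    0.70, 0.68, 0.78),   # control
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("experimental", "control"),
                  phase = c("learn", "reverse", "test"))
)

# Change from learn to reverse within each group.
drop_by_group <- cell_means[, "reverse"] - cell_means[, "learn"]
drop_by_group

# Difference of differences: zero would mean parallel lines for this contrast.
interaction_contrast <- drop_by_group["experimental"] - drop_by_group["control"]
unname(interaction_contrast)
```

A contrast far from zero describes the pattern only; whether it is reliable is what the group:phase row of the ANOVA table tests.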

Interaction plot

A line plot with phase on the x-axis and one line per group is the standard way to visualise a two-way interaction.

d_sum <- d_bird_phase[,
  .(mean_pc = mean(prop_correct),
    se_pc   = sd(prop_correct) / .N^0.5),
  .(group, phase)
]

ggplot(d_sum, aes(x = phase, y = mean_pc, colour = group, group = group)) +
  geom_line() +
  geom_point(size = 2) +
  geom_errorbar(aes(ymin = mean_pc - se_pc, ymax = mean_pc + se_pc), width = 0.1) +
  labs(
    x      = "Phase",
    y      = "Mean proportion correct",
    colour = "Group",
    title  = "Group × Phase interaction"
  ) +
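  # COL is named by colour ("blue", ...), not by group, so drop the names;
  # otherwise the scale tries to match names to group labels and finds none.
  scale_colour_manual(values = unname(COL))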

Mean proportion correct by phase and group. Error bars are ±1 SE.

Reading the interaction plot: If the two lines are roughly parallel, the effect of phase is similar in both groups and there is no interaction. If the lines converge, diverge, or cross, the effect of phase differs between groups — consistent with a significant interaction. Use the plot alongside the ANOVA table: a non-significant interaction with parallel lines gives you confidence that the main effects can be interpreted straightforwardly.


Part 3 — Mixed ANOVA for switch cost (monkey dataset)

Dataset description

Sixteen rhesus macaques (Macaca mulatta) participated in a single test session. Based on baseline task-switching performance assessed during preliminary testing, animals were assigned to a fast-switcher group (n = 8) or a slow-switcher group (n = 8). All animals were fluid-restricted and received juice rewards for correct responses.

During testing, animals were seated in a custom primate chair facing a touchscreen monitor. Each trial began with a briefly presented coloured stimulus (either red or blue) that served as a cue indicating which response mapping was in effect. A target stimulus then appeared and the animal was required to move a joystick left or right to categorise the target.

A switch trial was any trial on which the cue colour differed from the immediately preceding trial, requiring a change in the active response mapping. A repeat trial used the same cue colour and mapping as the preceding trial. Each animal completed 400 trials. Response accuracy (correct) and response latency (rt, in seconds) were recorded on each trial.

The key question is whether the switch cost — the RT penalty on switch trials relative to repeat trials — differs between fast-switchers and slow-switchers. This is a 2 (group: between) × 2 (trial type: within) mixed ANOVA.

Load and aggregate

d_monkey <- fread("data/monkey_switching_summary.csv")
d_monkey_agg <- d_monkey[, .(mean_rt = mean(rt)), .(monkey_id, group, trial_type)]
head(d_monkey_agg)
##    monkey_id         group trial_type   mean_rt
##        <int>        <char>     <char>     <num>
## 1:         1 fast-switcher     repeat 0.3821956
## 2:         1 fast-switcher     switch 0.4421073
## 3:         2 fast-switcher     repeat 0.3783784
## 4:         2 fast-switcher     switch 0.4409861
## 5:         3 fast-switcher     repeat 0.3859532
## 6:         3 fast-switcher     switch 0.4491574

Fit the mixed ANOVA

fit_monkey <- ezANOVA(
  data    = d_monkey_agg,
  dv      = mean_rt,
  wid     = monkey_id,
  between = group,
  within  = trial_type,
  type    = 3,
  detailed = TRUE
)
print(fit_monkey)
## $ANOVA
##             Effect DFn DFd        SSn          SSd           F            p p<.05       ges
## 1      (Intercept)   1  14 6.86069232 0.0481742183  1993.79867 1.683881e-16     * 0.9930100
## 2            group   1  14 0.04300998 0.0481742183    12.49921 3.294959e-03     * 0.4710642
## 3       trial_type   1  14 0.09013279 0.0001196522 10546.05884 1.508729e-21     * 0.6511231
## 4 group:trial_type   1  14 0.01671859 0.0001196522  1956.17240 1.922421e-16     * 0.2571598

Note that with only two levels of the within-subjects factor (switch and repeat) there is only one set of difference scores, so sphericity holds trivially. ezANOVA() therefore omits Mauchly’s test and the sphericity corrections from the output for two-level factors.

Visualise the switch cost by group

d_monkey_sum <- d_monkey_agg[,
  .(mean_rt = mean(mean_rt),
    se_rt   = sd(mean_rt) / .N^0.5),
  .(group, trial_type)
]

ggplot(d_monkey_sum, aes(x = trial_type, y = mean_rt, colour = group, group = group)) +
  geom_line() +
  geom_point(size = 2) +
  geom_errorbar(aes(ymin = mean_rt - se_rt, ymax = mean_rt + se_rt), width = 0.1) +
  labs(
    x      = "Trial type",
    y      = "Mean RT (s)",
    colour = "Group",
    title  = "Switch cost by group"
  ) +
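  # COL is named by colour ("blue", ...), not by group, so drop the names;
  # otherwise the scale tries to match names to group labels and finds none.
  scale_colour_manual(values = unname(COL))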

Mean RT by trial type and switcher group. Error bars are ±1 SE.

Interpretation

A significant main effect of trial_type would indicate that, overall, animals are slower on switch trials than repeat trials — a classic switch cost. A significant main effect of group would indicate that one group is generally faster than the other, regardless of trial type. A significant group × trial_type interaction would indicate that the magnitude of the switch cost differs between fast-switchers and slow-switchers — perhaps fast-switchers show a smaller RT penalty when switching than slow-switchers do. As always, inspect the interaction plot alongside the ANOVA table to understand the pattern of means.
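
In a balanced 2 × 2 mixed design, the interaction asks the same question as a pooled-variance t-test comparing per-animal switch costs between groups, and the interaction F equals the square of that t. The sketch below checks this on simulated data: the data frame `sim` is a hypothetical stand-in for d_monkey_agg, and base R's aov() is used in place of ezANOVA() so the example runs without the ez package.

```r
# Simulated stand-in for d_monkey_agg; all numbers are hypothetical.
set.seed(7)
n_per_group <- 8
sim <- data.frame(
  monkey_id  = factor(rep(1:(2 * n_per_group), each = 2)),
  group      = factor(rep(c("fast-switcher", "slow-switcher"),
                          each = 2 * n_per_group)),
  trial_type = factor(rep(c("repeat", "switch"), times = 2 * n_per_group))
)
base_rt     <- rep(rnorm(2 * n_per_group, mean = 0.40, sd = 0.03), each = 2)
switch_cost <- ifelse(sim$group == "fast-switcher", 0.06, 0.10)
sim$mean_rt <- base_rt +
  (sim$trial_type == "switch") * switch_cost +
  rnorm(nrow(sim), sd = 0.005)

# One switch cost per animal: switch RT minus repeat RT.
# Rows alternate repeat, switch within each animal, so the vectors align.
is_switch <- sim$trial_type == "switch"
costs <- data.frame(
  group = sim$group[is_switch],
  cost  = sim$mean_rt[is_switch] - sim$mean_rt[!is_switch]
)

# Pooled-variance t-test on switch costs between groups.
t_cost <- t.test(cost ~ group, data = costs, var.equal = TRUE)

# Mixed ANOVA in base R; the interaction sits in the within-subject stratum.
fit <- aov(mean_rt ~ group * trial_type + Error(monkey_id / trial_type),
           data = sim)
strata <- summary(fit)
within_tab <- strata[[length(strata)]][[1]]
F_interaction <- within_tab[["F value"]][2]   # row 2 = group:trial_type

c(t_squared = unname(t_cost$statistic)^2, F_interaction = F_interaction)
```

Thinking of the interaction as a group difference in switch cost often makes the result easier to report: the difference in mean cost between groups is the effect being tested.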


Part 4 — Interpreting interactions

When a two-way ANOVA yields a significant interaction, keep the following principles in mind:

Main effects can mislead. A significant main effect of factor A simply reflects the average across all levels of factor B. If the interaction is significant, this average may not represent the effect of A at any particular level of B. Always report and interpret the interaction before the main effects.

Visualise first. A line plot of cell means (one line per level of one factor, levels of the other factor on the x-axis) is the clearest way to see what an interaction looks like. Parallel lines → no interaction; non-parallel lines → potential interaction.

Follow up with simple effects. If the interaction is significant and you want to understand it further, run the ANOVA separately within each level of one factor. For example, if group × phase is significant, run a one-way repeated-measures ANOVA on phase separately for the experimental group and separately for the control group. This is called a simple effects analysis. Keep in mind that running multiple tests inflates the familywise error rate, so apply a correction (e.g., Bonferroni) if appropriate.
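
A simple effects analysis of this kind might be sketched as follows. The data frame `sim` is simulated (hypothetical values standing in for d_bird_phase), the helper `phase_p()` is made up for this example, and base R's aov() is used instead of ezANOVA() so that the block is self-contained.

```r
# Simulated stand-in for d_bird_phase; all values are hypothetical.
set.seed(11)
sim <- expand.grid(
  phase   = factor(c("learn", "reverse", "test")),
  bird_id = 1:20
)
sim$group <- ifelse(sim$bird_id <= 10, "experimental", "control")
sim$prop_correct <- 0.70 +
  ifelse(sim$phase == "reverse" & sim$group == "experimental", -0.15, 0) +
  rnorm(nrow(sim), sd = 0.05)
sim$bird_id <- factor(sim$bird_id)

# One-way repeated-measures ANOVA on phase within a single group.
phase_p <- function(d) {
  d <- droplevels(d)
  fit <- aov(prop_correct ~ phase + Error(bird_id / phase), data = d)
  strata <- summary(fit)
  within_tab <- strata[[length(strata)]][[1]]
  within_tab[["Pr(>F)"]][1]   # row 1 = phase
}

p_raw <- sapply(split(sim, sim$group), phase_p)

# Bonferroni: multiply each p by the number of tests, capped at 1.
p_adj <- p.adjust(p_raw, method = "bonferroni")
rbind(p_raw, p_adj)
```

With two groups there are two tests, so each Bonferroni-adjusted p-value is twice the raw value (up to a maximum of 1).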

Effect size. Generalised eta-squared (ges) is reported by ezANOVA() for every effect. Values around .01 are considered small, .06 medium, and .14 large by convention, though these benchmarks should be interpreted in the context of the specific research area.
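
The ges column can be reproduced from the sums of squares in the printed mixed-ANOVA table for the pigeon data. The sketch assumes the usual definition for designs in which no factor is declared as observed: the effect sum of squares divided by itself plus all error sums of squares in the design. The inputs are the rounded values from the output, so agreement is up to rounding.

```r
# Sums of squares copied from the printed fit_mixed_pigeon$ANOVA table.
SSn <- c(group = 0.01768167, phase = 0.81676000, `group:phase` = 0.07717333)

# The two distinct error terms: between-subjects and within-subjects.
SSd_between <- 0.03568333
SSd_within  <- 0.08206667

# Assumed definition: effect SS over effect SS plus every error SS.
ges <- SSn / (SSn + SSd_between + SSd_within)
round(ges, 4)
```

Because every error term enters the denominator, ges for a within-subjects effect is smaller than the corresponding partial eta-squared, which uses only that effect's own error term.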


Apply this to your assigned dataset

Your assigned dataset is determined by your student ID number. Take the last digit and compute last_digit %% 3 in R.
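
As a quick illustration of the arithmetic, with a made-up ID number (`student_id` is hypothetical; which dataset corresponds to each result is not listed here):

```r
# Hypothetical student ID, used only to illustrate the arithmetic.
student_id <- 20231847

last_digit    <- student_id %% 10   # final digit of the ID
dataset_index <- last_digit %% 3    # one of 0, 1, or 2
dataset_index
```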

Load your assigned dataset and run a factorial ANOVA using ezANOVA(). Specifically:

  1. Identify at least two factors in your dataset — at least one of which is within-subjects (e.g., condition, phase, block, or session) and optionally one between-subjects factor (e.g., group or assigned condition).
  2. Aggregate your data to one value per participant per combination of factor levels.
  3. Run the appropriate ANOVA using ezANOVA(), setting the between and within arguments correctly.
  4. Report all main effects and the interaction: F statistic, degrees of freedom, p-value, and generalised eta-squared (ges).
  5. Create an interaction plot using ggplot2 (lines connecting means, one line per level of the between-subjects factor if applicable).
  6. Write a paragraph (4–6 sentences) interpreting the results — including whether there is an interaction and what it means substantively.

This ANOVA and interaction plot are required components of your final project.