# load libraries
library(data.table)
library(ggplot2)
library(ez)
# clean work space
rm(list = ls())
# init colorscheme
COL <- c("#2271B2", "#E69F00", "#D55E00")
names(COL) <- c("blue", "orange", "red")
theme_set(
theme_minimal(base_size = 13) +
theme(
panel.grid.minor = element_blank(),
strip.text = element_text(face = "bold"),
legend.position = "bottom"
)
)
update_geom_defaults("point", list(size = 2))
update_geom_defaults("line", list(linewidth = 0.8))
This tutorial introduces three closely related ANOVA designs that arise frequently in cognitive science research:
(a) Repeated-measures (fully within-subjects) ANOVA — every participant contributes data to every level of every factor. This removes between-subject variability from the error term, increasing statistical power.
(b) Two-way factorial ANOVA — two factors are crossed so that every combination of levels is observed. Both factors can be between-subjects, or both can be within-subjects (fully repeated measures).
(c) Mixed ANOVA — one factor is between-subjects (different participants in each group) and one factor is within-subjects (every participant experiences every level). This is extremely common in experimental psychology and neuroscience.
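To see why removing between-subject variability increases power, here is a minimal sketch. It uses made-up, fully deterministic scores for ten hypothetical participants (not data from this tutorial) measured in two conditions. The same numbers are analysed once as if they came from two independent groups and once as paired observations.

```r
# Hypothetical scores: 10 participants with very different baselines,
# each measured in two conditions (a within-subjects layout).
baseline <- seq(0.30, 0.70, length.out = 10)
effect   <- 0.05 + rep(c(-0.01, 0.01), times = 5)  # small, consistent gain
cond_a   <- baseline
cond_b   <- baseline + effect

# Treated as independent groups: between-subject variability
# stays in the error term.
p_between <- t.test(cond_b, cond_a, var.equal = TRUE)$p.value

# Treated as paired: each participant acts as their own control, so only
# the variability of the within-participant differences is error.
p_within <- t.test(cond_b, cond_a, paired = TRUE)$p.value

c(between = p_between, within = p_within)
```

The effect is identical in both analyses. Only the error term changes, which is the same logic that a repeated-measures ANOVA applies to factors with more than two levels.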
All three designs will be fitted using ezANOVA() from
the ez package. ezANOVA() handles the
bookkeeping of within- versus between-subjects factors, computes
Mauchly’s test of sphericity where needed, and provides
epsilon-corrected p-values automatically.
Twenty pigeons participated in a reversal-learning experiment. Each
bird completed three successive phases: an initial learning phase
(learn), a reversal phase (reverse), and a
transfer test phase (test). Ten blocks of trials were
recorded within each phase. Birds also belonged to one of two groups —
experimental or control — but we begin by
ignoring the group factor so we can focus on understanding the
within-subjects structure.
We first collapse across blocks to obtain one proportion-correct value per bird per phase. This is the level of aggregation required for a repeated-measures ANOVA: one observation per participant per cell.
d <- fread("data/pigeon_reversal_summary.csv")
d_bird_phase <- d[, .(prop_correct = mean(correct)), .(bird_id, group, phase)]
d_bird_phase[, phase := factor(phase, levels = c("learn", "reverse", "test"))]
head(d_bird_phase)
## bird_id group phase prop_correct
## <int> <char> <fctr> <num>
## 1: 1 experimental learn 0.67
## 2: 1 experimental reverse 0.58
## 3: 1 experimental test 0.79
## 4: 2 experimental learn 0.71
## 5: 2 experimental reverse 0.49
## 6: 2 experimental test 0.87
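Before fitting, it can help to confirm that the aggregated table really has one row per participant per cell. The sketch below uses base R on a small simulated data set. The data frame `trials` and its values are hypothetical stand-ins for the pigeon file, and `aggregate()` stands in for the data.table call above.

```r
# Hypothetical block-level data: 4 birds x 3 phases x 10 blocks.
trials <- expand.grid(
  block   = 1:10,
  phase   = c("learn", "reverse", "test"),
  bird_id = 1:4,
  stringsAsFactors = FALSE
)
trials$group <- ifelse(trials$bird_id <= 2, "experimental", "control")
set.seed(1)
trials$correct <- runif(nrow(trials), min = 0.4, max = 0.9)  # made-up accuracy

# Collapse across blocks: one mean per bird per phase.
agg <- aggregate(correct ~ bird_id + group + phase, data = trials, FUN = mean)

# Every bird x phase cell should hold exactly one row.
cell_counts <- table(agg$bird_id, agg$phase)
all(cell_counts == 1)
```

If any cell count differs from 1, the aggregation step has either left extra rows in place or dropped a cell, and the repeated-measures model should not be fitted until that is resolved.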
With ezANOVA() we specify:
dv: the dependent variable
wid: the participant identifier
within: the within-subjects factor(s)
type = 3: Type III sums of squares (standard for unbalanced or interaction-containing designs)
detailed = TRUE: returns sums of squares and mean squares in addition to F and p
fit_rm <- ezANOVA(
data = d_bird_phase,
dv = prop_correct,
wid = bird_id,
within = phase,
type = 3,
detailed = TRUE
)
print(fit_rm)
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 (Intercept) 1 19 28.44193 0.053365 10126.42678 2.217231e-27 * 0.9925804
## 2 phase 2 38 0.81676 0.159240 97.45315 1.095077e-15 * 0.7934600
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 phase 0.7771319 0.1033838
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 phase 0.8177497 3.17128e-13 * 0.884275 3.999088e-14 *
When a within-subjects factor has three or more levels, the ANOVA
assumes sphericity — that the variances of the differences
between all pairs of levels are equal. ezANOVA()
automatically runs Mauchly’s test to check this
assumption. If Mauchly’s test is significant (p < .05), the
sphericity assumption is violated and the uncorrected F-test p-value is
too liberal.
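Two corrections are available in the output:
Greenhouse-Geisser: use p[GG] when GGe < 0.75.
Huynh-Feldt: use p[HF] when GGe ≥ 0.75 but sphericity is still violated.
When Mauchly’s test is not significant, use the uncorrected p-value
in the ANOVA table. In the output above, Mauchly’s test for phase is not
significant (p = .10), so the uncorrected p-value applies.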
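The decision rule for choosing a p-value can be written as a small helper. This is only a sketch: `choose_p()` is a hypothetical function, not part of ez, and the numbers are copied by hand from the repeated-measures output printed earlier rather than extracted from `fit_rm`.

```r
# Values copied from the repeated-measures output for the phase effect.
p_uncorrected <- 1.095077e-15
p_mauchly     <- 0.1033838
gge           <- 0.8177497
p_gg          <- 3.17128e-13
p_hf          <- 3.999088e-14

# Hypothetical helper implementing the rule described in the text.
choose_p <- function(p_raw, p_mauchly, gge, p_gg, p_hf, alpha = 0.05) {
  if (p_mauchly >= alpha) {
    return(c(uncorrected = p_raw))  # sphericity not rejected
  }
  # Sphericity violated: pick the correction based on the GG epsilon.
  if (gge < 0.75) c(GG = p_gg) else c(HF = p_hf)
}

chosen <- choose_p(p_uncorrected, p_mauchly, gge, p_gg, p_hf)
chosen
```

For the pigeon data the helper returns the uncorrected p-value, because Mauchly's test is not significant. Whichever p-value is chosen here, the conclusion about phase is the same.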
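Now we include group as a between-subjects factor
alongside phase as the within-subjects factor. This gives
us a 2 × 3 mixed ANOVA. We can test three effects: the main effect of
group, the main effect of phase, and the group × phase interaction.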
fit_mixed_pigeon <- ezANOVA(
data = d_bird_phase,
dv = prop_correct,
wid = bird_id,
between = group,
within = phase,
type = 3,
detailed = TRUE
)
print(fit_mixed_pigeon)
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 (Intercept) 1 18 28.44193500 0.03568333 14347.16945 1.413110e-27 * 0.9958771
## 2 group 1 18 0.01768167 0.03568333 8.91929 7.913966e-03 * 0.1305578
## 3 phase 2 36 0.81676000 0.08206667 179.14314 1.944622e-19 * 0.8739981
## 4 group:phase 2 36 0.07717333 0.08206667 16.92673 6.577536e-06 * 0.3959163
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 3 phase 0.873384 0.3164057
## 4 group:phase 0.873384 0.3164057
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 3 phase 0.8876139 1.684781e-17 * 0.977805 4.691907e-19 *
## 4 group:phase 0.8876139 1.833819e-05 * 0.977805 8.051624e-06 *
The ANOVA component of the output lists a row for each
effect: group, phase, and
group:phase. Each row gives the F statistic, numerator and
denominator degrees of freedom (DFn, DFd),
p-value (p), and generalised eta-squared
(ges), which is a measure of effect size. The
Mauchly's Test and epsilon rows apply only to effects that
involve the within-subjects factor.
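The columns of the table are linked by simple arithmetic, which the following sketch reproduces from the sums of squares printed above. The expression used for generalised eta-squared assumes that no factor was declared as `observed` in the call. It matches the printed `ges` values for this table, but it is an illustration, not a general-purpose implementation.

```r
# Sums of squares and degrees of freedom copied from the mixed ANOVA table.
tab <- data.frame(
  Effect = c("group", "phase", "group:phase"),
  DFn = c(1, 2, 2),
  DFd = c(18, 36, 36),
  SSn = c(0.01768167, 0.81676000, 0.07717333),
  SSd = c(0.03568333, 0.08206667, 0.08206667)
)

# F is the effect mean square divided by its error mean square.
tab$F_value <- (tab$SSn / tab$DFn) / (tab$SSd / tab$DFd)

# Count each error term once: between-subjects error + within-subjects error.
ss_error_total <- sum(unique(tab$SSd))

# Generalised eta-squared for a design with manipulated factors only.
tab$ges <- tab$SSn / (tab$SSn + ss_error_total)

print(tab[, c("Effect", "F_value", "ges")], digits = 6)
```

Note that the between-subjects effect (group) is tested against the between-subjects error (DFd = 18), while phase and group:phase share the within-subjects error (DFd = 36).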
A significant interaction (group:phase) indicates that
the effect of phase on accuracy is not the same in both groups. For
example, experimental birds might show a large drop in accuracy during
reversal while control birds show no such drop. When an interaction is
present, be cautious about interpreting the main effects in isolation: a
significant main effect of group could simply reflect the different
trajectories across phases rather than a consistent group difference.
A line plot with phase on the x-axis and one line per group is the standard way to visualise a two-way interaction.
d_sum <- d_bird_phase[,
.(mean_pc = mean(prop_correct),
se_pc = sd(prop_correct) / .N^0.5),
.(group, phase)
]
ggplot(d_sum, aes(x = phase, y = mean_pc, colour = group, group = group)) +
geom_line() +
geom_point(size = 2) +
geom_errorbar(aes(ymin = mean_pc - se_pc, ymax = mean_pc + se_pc), width = 0.1) +
labs(
x = "Phase",
y = "Mean proportion correct",
colour = "Group",
title = "Group × Phase interaction"
) +
scale_colour_manual(values = unname(COL)) # assign colours by position; the names of COL are not group labels
Mean proportion correct by phase and group. Error bars are ±1 SE.
Reading the interaction plot: If the two lines are roughly parallel, the effect of phase is similar in both groups and there is no interaction. If the lines converge, diverge, or cross, the effect of phase differs between groups — consistent with a significant interaction. Use the plot alongside the ANOVA table: a non-significant interaction with parallel lines gives you confidence that the main effects can be interpreted straightforwardly.
Sixteen rhesus macaques (Macaca mulatta) participated in a single test session. Based on baseline task-switching performance assessed during preliminary testing, animals were assigned to a fast-switcher group (n = 8) or a slow-switcher group (n = 8). All animals were fluid-restricted and received juice rewards for correct responses. During testing, animals were seated in a custom primate chair facing a touchscreen monitor. Each trial began with a briefly presented coloured stimulus — either red or blue — that served as a cue indicating which response mapping was in effect. A target stimulus then appeared and the animal was required to move a joystick left or right to categorise the target. A switch trial was any trial on which the cue colour differed from the immediately preceding trial, requiring a change in the active response mapping. A repeat trial used the same cue colour and mapping as the preceding trial. Each animal completed 400 trials. Response accuracy (correct) and response latency (rt, in seconds) were recorded on each trial.
The key question is whether the switch cost — the RT penalty on switch trials relative to repeat trials — differs between fast-switchers and slow-switchers. This is a 2 (group: between) × 2 (trial type: within) mixed ANOVA.
d_monkey <- fread("data/monkey_switching_summary.csv")
d_monkey_agg <- d_monkey[, .(mean_rt = mean(rt)), .(monkey_id, group, trial_type)]
head(d_monkey_agg)
## monkey_id group trial_type mean_rt
## <int> <char> <char> <num>
## 1: 1 fast-switcher repeat 0.3821956
## 2: 1 fast-switcher switch 0.4421073
## 3: 2 fast-switcher repeat 0.3783784
## 4: 2 fast-switcher switch 0.4409861
## 5: 3 fast-switcher repeat 0.3859532
## 6: 3 fast-switcher switch 0.4491574
fit_monkey <- ezANOVA(
data = d_monkey_agg,
dv = mean_rt,
wid = monkey_id,
between = group,
within = trial_type,
type = 3,
detailed = TRUE
)
print(fit_monkey)
## $ANOVA
## Effect DFn DFd SSn SSd F p p<.05 ges
## 1 (Intercept) 1 14 6.86069232 0.0481742183 1993.79867 1.683881e-16 * 0.9930100
## 2 group 1 14 0.04300998 0.0481742183 12.49921 3.294959e-03 * 0.4710642
## 3 trial_type 1 14 0.09013279 0.0001196522 10546.05884 1.508729e-21 * 0.6511231
## 4 group:trial_type 1 14 0.01671859 0.0001196522 1956.17240 1.922421e-16 * 0.2571598
Note that with only two levels of the within-subjects factor
(switch and repeat), there is only one set of
pairwise differences, so sphericity is automatically satisfied. This is
why the output contains no Mauchly’s test or sphericity corrections.
d_monkey_sum <- d_monkey_agg[,
.(mean_rt = mean(mean_rt),
se_rt = sd(mean_rt) / .N^0.5),
.(group, trial_type)
]
ggplot(d_monkey_sum, aes(x = trial_type, y = mean_rt, colour = group, group = group)) +
geom_line() +
geom_point(size = 2) +
geom_errorbar(aes(ymin = mean_rt - se_rt, ymax = mean_rt + se_rt), width = 0.1) +
labs(
x = "Trial type",
y = "Mean RT (s)",
colour = "Group",
title = "Switch cost by group"
) +
scale_colour_manual(values = unname(COL)) # assign colours by position; the names of COL are not group labels
Mean RT by trial type and switcher group. Error bars are ±1 SE.
A significant main effect of trial_type would indicate that, overall, animals are slower on switch trials than repeat trials — a classic switch cost. A significant main effect of group would indicate that one group is generally faster than the other, regardless of trial type. A significant group × trial_type interaction would indicate that the magnitude of the switch cost differs between fast-switchers and slow-switchers — perhaps fast-switchers show a smaller RT penalty when switching than slow-switchers do. As always, inspect the interaction plot alongside the ANOVA table to understand the pattern of means.
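In a 2 × 2 mixed design the interaction has a concrete reading: it asks whether the per-animal switch cost (switch RT minus repeat RT) differs between groups. The sketch below illustrates this on simulated data. All numbers are made up, and base R's `aov()` stands in for `ezANOVA()` so that the example has no package dependencies. With equal group sizes, the interaction F equals the squared t statistic from an equal-variance t-test on the switch costs.

```r
# Simulated data (hypothetical values): 8 animals per group.
set.seed(1)
n_per_group <- 8
id    <- factor(seq_len(2 * n_per_group))
group <- factor(rep(c("fast-switcher", "slow-switcher"), each = n_per_group))
repeat_rt <- rnorm(2 * n_per_group, mean = 0.40, sd = 0.03)
cost      <- ifelse(group == "fast-switcher", 0.06, 0.10) +
  rnorm(2 * n_per_group, sd = 0.01)
wide <- data.frame(id, group, repeat_rt, switch_rt = repeat_rt + cost)

# Route 1: independent-samples t-test on the per-animal switch cost.
wide$switch_cost <- wide$switch_rt - wide$repeat_rt
tt <- t.test(switch_cost ~ group, data = wide, var.equal = TRUE)

# Route 2: mixed ANOVA on the long-format data.
long <- data.frame(
  id         = rep(wide$id, 2),
  group      = rep(wide$group, 2),
  trial_type = factor(rep(c("repeat", "switch"), each = nrow(wide))),
  mean_rt    = c(wide$repeat_rt, wide$switch_rt)
)
fit <- aov(mean_rt ~ group * trial_type + Error(id / trial_type), data = long)
within_tab <- summary(fit)[["Error: id:trial_type"]][[1]]
f_interaction <- within_tab[
  trimws(rownames(within_tab)) == "group:trial_type", "F value"
]

# The two routes agree: t squared equals the interaction F.
t_squared <- unname(tt$statistic)^2
c(t_squared = t_squared, F_interaction = f_interaction)
```

Computing the switch cost per animal is also a convenient way to describe the size of the interaction in the original units (seconds).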
When a two-way ANOVA yields a significant interaction, keep the following principles in mind:
Main effects can mislead. A significant main effect of factor A simply reflects the average across all levels of factor B. If the interaction is significant, this average may not represent the effect of A at any particular level of B. Always report and interpret the interaction before the main effects.
Visualise first. A line plot of cell means (one line per level of one factor, levels of the other factor on the x-axis) is the clearest way to see what an interaction looks like. Parallel lines → no interaction; non-parallel lines → potential interaction.
Follow up with simple effects. If the interaction is significant and you want to understand it further, run the ANOVA separately within each level of one factor. For example, if group × phase is significant, run a one-way repeated-measures ANOVA on phase separately for the experimental group and separately for the control group. This is called a simple effects analysis. Keep in mind that running multiple tests inflates the familywise error rate, so apply a correction (e.g., Bonferroni) if appropriate.
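A simple effects analysis with a Bonferroni correction might be sketched as follows. The data are simulated, `simple_effect_p()` is a hypothetical helper, and base R's `aov()` is used as a stand-in for running `ezANOVA()` on each subset.

```r
# Simulated data (made-up numbers): 6 birds per group, 3 phases.
set.seed(2)
sim <- expand.grid(
  phase   = c("learn", "reverse", "test"),
  bird_id = 1:12
)
sim$group <- ifelse(sim$bird_id <= 6, "experimental", "control")
sim$prop_correct <- 0.7 + rnorm(nrow(sim), sd = 0.05)
sim$bird_id <- factor(sim$bird_id)

# Hypothetical helper: one-way repeated-measures ANOVA on phase
# within a single group, returning the p-value for phase.
simple_effect_p <- function(dat) {
  dat <- droplevels(dat)  # drop birds that belong to the other group
  fit <- aov(prop_correct ~ phase + Error(bird_id / phase), data = dat)
  tab <- summary(fit)[["Error: bird_id:phase"]][[1]]
  tab[trimws(rownames(tab)) == "phase", "Pr(>F)"]
}

# One test per group, then correct for the two tests.
p_raw  <- sapply(split(sim, sim$group), simple_effect_p)
p_bonf <- p.adjust(p_raw, method = "bonferroni")
rbind(p_raw, p_bonf)
```

With two tests, the Bonferroni adjustment doubles each p-value (capped at 1), which is equivalent to testing each simple effect at α = .025.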
Effect size. Generalised eta-squared
(ges) is reported by ezANOVA() for every
effect. By convention, values around .01 are considered small, .06
medium, and .14 large. These are Cohen's benchmarks for eta-squared, and
they should be interpreted in the context of the specific research area.
Your assigned dataset is determined by your student ID number. Take
the last digit and compute last_digit %% 3 in R.
0: https://github.com/crossley/cogs2020/tree/main/final_project_data/cat_learn_switch
1: https://github.com/crossley/cogs2020/tree/main/final_project_data/cat_learn_auto
2: https://github.com/crossley/cogs2020/tree/main/final_project_data/cat_learn_unlearn
Load your assigned dataset and run a factorial ANOVA using
ezANOVA(). Specifically:
Run ezANOVA(), setting the
between and within arguments correctly.
This ANOVA and interaction plot are required components of your final project.
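The dataset assignment rule can be checked in a couple of lines. The student ID below is a hypothetical placeholder, so substitute your own.

```r
# Hypothetical student ID (replace with your own).
student_id <- 45123457

last_digit    <- student_id %% 10   # last digit of the ID: 7
dataset_index <- last_digit %% 3    # 7 %% 3 gives 1
dataset_index
```

For this example ID the result is 1, which corresponds to the cat_learn_auto dataset in the list above.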