26 Repeated measures ANOVA
A repeated measures design is one in which at least one of the factors consists of repeated measurements on the same experimental unit; this usually corresponds to multiple measurements from the same subjects.
It is fair to view this as an extension of the paired-samples t-test, just as it is fair to view factorial ANOVA as an extension of the independent samples t-test.
- Advantage: individual differences are reduced as a source of between-condition differences, because the same subjects contribute to every condition.
- Advantage: the sample is not divided between conditions, so a repeated measures design can require fewer subjects.
- Disadvantage: fewer subjects means fewer degrees of freedom (we will see below that the relevant \(df\) term shrinks from \(n_{observations} - k\) to \((k - 1)(n_{subjects} - 1)\)). In general, the more degrees of freedom we have, the less extreme an observed outcome needs to be to reject the null, because of the effect of \(df\) on the shape of the sampling distribution of our test statistic (see the quick check below).
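As a quick check of the degrees-of-freedom point above (the particular \(df\) values here are arbitrary and chosen only for illustration), the critical \(F\) value at \(\alpha = 0.05\) grows as the denominator degrees of freedom shrink:
## critical F at alpha = 0.05 for 2 numerator df and two different denominator df
qf(0.95, df1=2, df2=30)  ## about 3.32
qf(0.95, df1=2, df2=8)   ## about 4.46 -- fewer df, so a more extreme F is needed to reject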
26.1 Intuition
The intuition for a repeated measures ANOVA is the same as that for a factorial ANOVA.
E.g., if the population means in the levels of some factor (e.g., the mean effect of different doses of a medicine) are different, then between-level variability should be greater than within-level variability.
However, the repeated measures aspect introduces one important difference.
Between-level variability will inherently be smaller in a repeated measures design than in an independent samples design (e.g., because the same subjects give measurements for each level, and subjects tend to be similar to themselves).
This means that, to conclude that there are true differences, we should require less between-level variability (relative to within-level variability) in a repeated measures design than in an independent samples design.
Recall that for a factorial ANOVA, the \(F\)-test that we use is a ratio of between-level variability to within-level variability.
\[F = \frac{MS_{between-levels}}{MS_{within-levels}}\]
In a repeated measures ANOVA, the \(F\)-test that we use instead compares between-level variability to an error term from which between-subject variability has been removed:
\[F = \frac{MS_{between-levels}}{MS_{error}}, \qquad \text{where } SS_{error} = SS_{within-levels} - SS_{between-subject}\]
26.2 Formal treatment
- \(k\) is the number of factor levels
- \(n\) is the number of subjects
- \(x_{ij}\) is the observation from factor level \(i\) and subject \(j\)
\[\begin{align} SS_{between-levels} &= n \sum_{i=1}^k (\bar{x_{i \bullet}} - \bar{x_{\bullet \bullet}})^2 \\ SS_{within-levels} &= \sum_{i=1}^k \sum_{j=1}^n (x_{ij} - \bar{x_{i \bullet}})^2 \\ SS_{between-subject} &= k \sum_{j=1}^n (\bar{x_{\bullet j}} - \bar{x_{\bullet \bullet}})^2 \\ SS_{error} &= SS_{within-levels} - SS_{between-subject} \\ \end{align}\]
The nomenclature \(SS_{error}\) will make more sense in the coming lectures.
This leads to the ANOVA table:
\(Df\) | \(SS\) | \(MS\) | \(F\) | \(P(>F)\) |
---|---|---|---|---|
\(k-1\) | \(SS_{between-levels}\) (see above) | \(\frac{SS_{between-levels}}{k-1}\) | \(\frac{MS_{between-levels}}{MS_{error}}\) | \(P(F_{k-1,(k-1)(n-1)} > F_{obs})\) |
\((k-1)(n-1)\) | \(SS_{error}\) (see above) | \(\frac{SS_{error}}{(k-1)(n-1)}\) | | |
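For example, with \(k = 3\) levels and \(n = 5\) subjects (as in the toy example below), \(df_{between-levels} = 3 - 1 = 2\) and \(df_{error} = (3 - 1)(5 - 1) = 8\); these match the DFn and DFd columns that ezANOVA() reports below.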
26.3 Repeated measures ANOVA in R
26.3.1 Toy example
## level subject score
## 1: 1 1 11.262954
## 2: 1 2 9.673767
## 3: 1 3 11.329799
## 4: 1 4 11.272429
## 5: 1 5 10.414641
## 6: 2 1 18.460050
## 7: 2 2 19.071433
## 8: 2 3 19.705280
## 9: 2 4 19.994233
## 10: 2 5 22.404653
## 11: 3 1 30.763593
## 12: 3 2 29.200991
## 13: 3 3 28.852343
## 14: 3 4 29.710538
## 15: 3 5 29.700785
- Notice in the above data that each subject gives multiple measurements (one per factor level).
library(data.table)  ## provides data.table(), fread(), and the d[...] syntax used below
library(ez)          ## provides ezANOVA()
level <- rep(1:3, each=5)
subject <- rep(1:5, 3)
score <- c(11.262954,  9.673767, 11.329799, 11.272429, 10.414641,
           18.460050, 19.071433, 19.705280, 19.994233, 22.404653,
           30.763593, 29.200991, 28.852343, 29.710538, 29.700785)
d <- data.table(level, subject, score)
k <- d[, length(unique(level))] # n factor levels
n <- d[, length(unique(subject))] # n subs
## do it by hand
ss_between_levels <- 0
## SS between levels: sum of squared deviations of level means from the grand mean, scaled by n
for(i in 1:k) {
  ss_between_levels <- ss_between_levels +
    (d[level==i, mean(score)] - d[, mean(score)])^2
}
ss_between_levels <- n * ss_between_levels
ss_between_subject <- 0
## SS between subjects: sum of squared deviations of subject means from the grand mean, scaled by k
for(j in 1:n) {
  ss_between_subject <- ss_between_subject +
    (d[subject==j, mean(score)] - d[, mean(score)])^2
}
ss_between_subject <- k * ss_between_subject
ss_within_levels <- 0
## SS within levels: sum of squared deviations of each observation from its level mean
for(i in 1:k) {
  for(j in 1:n) {
    ss_within_levels <- ss_within_levels +
      (d[level==i & subject==j, score] - d[level==i, mean(score)])^2
  }
}
ss_error <- ss_within_levels - ss_between_subject
df_between_levels <- k - 1
df_error <- (k-1)*(n-1)
ms_between_levels <- ss_between_levels / df_between_levels
ms_error <- ss_error / df_error
fobs <- ms_between_levels / ms_error
p_val <- pf(fobs, df_between_levels, df_error, lower.tail=F)
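## sanity check: these by-hand values should match the ezANOVA() output further below
## (F of roughly 370.8 and p of roughly 1.3e-08)
fobs
p_val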
## Use the function `ezANOVA()` from the `ez` package to
## perform a repeated measures ANOVA
d[, subject := factor(subject)]
d[, level := factor(level)]
ezANOVA(
  data=d,           ## where the data is located
  dv=score,         ## the dependent variable
  wid=subject,      ## the repeated measure indicator column
  within=.(level),  ## a list of repeated measures factors
  type=3            ## type of sums of squares desired
)
## $ANOVA
## Effect DFn DFd F p p<.05 ges
## 2 level 2 8 370.7887 1.29746e-08 * 0.9852661
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 level 0.5279246 0.3835817
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 level 0.679313 2.30526e-06 * 0.9073176 5.770776e-08 *
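As an optional cross-check (not part of the original analysis), the same repeated measures ANOVA can be run with base R's aov() by treating subject as an error stratum; with a balanced design like this one, the level effect in the subject:level stratum should reproduce the F and degrees of freedom reported by ezANOVA() above.
## cross-check with base R: level is tested against the subject-by-level error stratum
summary(aov(score ~ level + Error(subject/level), data=d))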
26.3.2 Real data
## Consider the MIS data
d <- fread('https://crossley.github.io/book_stats/data/mis/mis_data.csv')
## We will answer this question:
## Are there significant differences in the mean error per
## subject across phases? Note that this question ignores
## differences between conditions
## First, fix the annoying bug that different subjects in
## different groups have the same number.
d[group==1, subject := subject+10]
## compute mean error per subject
dd <- d[order(subject, phase), mean(error, na.rm=TRUE), .(subject, phase)]
## It's important to code factors as factors
dd[, subject := factor(subject)]
dd[, phase := factor(phase)]
## do it by hand
n <- d[, length(unique(subject))]
k <- d[, length(unique(phase))]
ss_between_phases <- 0
for(i in d[, unique(phase)]) {
  ss_between_phases <- ss_between_phases +
    (dd[phase==i, mean(V1)] - dd[, mean(V1)])^2
}
ss_between_phases <- n * ss_between_phases
ss_between_subject <- 0
for(j in d[, unique(subject)]) {
  ss_between_subject <- ss_between_subject +
    (dd[subject==j, mean(V1)] - dd[, mean(V1)])^2
}
ss_between_subject <- k * ss_between_subject
ss_within_phases <- 0
for(i in d[, unique(phase)]) {
  for(j in d[, unique(subject)]) {
    ss_within_phases <- ss_within_phases +
      (dd[phase==i & subject==j, V1] - dd[phase==i, mean(V1)])^2
  }
}
ss_error <- ss_within_phases - ss_between_subject
df_between_phases <- k - 1
df_error <- (k-1)*(n-1)
ms_between_phases <- ss_between_phases / df_between_phases
ms_error <- ss_error / df_error
fobs <- ms_between_phases / ms_error
p_val <- pf(fobs, df_between_phases, df_error, lower.tail=F)
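## sanity check: these by-hand values should match the ezANOVA() output below
## (F of roughly 123.1 and p of roughly 2.5e-17)
fobs
p_val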
## Do it with ezANOVA()
ezANOVA(
  data=dd,
  dv=V1,
  wid=subject,
  within=.(phase),
  type=3
)
## $ANOVA
## Effect DFn DFd F p p<.05 ges
## 2 phase 2 38 123.1449 2.479838e-17 * 0.7786623
##
## $`Mauchly's Test for Sphericity`
## Effect W p p<.05
## 2 phase 0.4135505 0.0003537993 *
##
## $`Sphericity Corrections`
## Effect GGe p[GG] p[GG]<.05 HFe p[HF] p[HF]<.05
## 2 phase 0.6303384 9.950264e-12 * 0.654296 4.302482e-12 *
26.3.3 Making sense of ezANOVA output
What are Mauchly's Test for Sphericity and the Sphericity Corrections? Both have to do with the underlying assumptions being made by a repeated measures ANOVA. Time permitting, we will return to this as we review the course material in preparation for the final exam.
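As a brief preview, here is a minimal sketch of how the Greenhouse-Geisser correction reported above relates to the uncorrected test, assuming the standard adjustment of multiplying both degrees of freedom by the estimated epsilon (GGe); the numbers are taken from the real-data ezANOVA() output above.
## hedged illustration: re-derive the GG-corrected p-value from the reported F, df, and GGe
gge <- 0.6303384  ## GGe from the ezANOVA() output above
pf(123.1449, df1 = gge * 2, df2 = gge * 38, lower.tail = FALSE)  ## should be close to p[GG] above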
26.3.4 Quick note on balanced versus unbalanced data
The formulas I wrote in previous sections for computing the various sums of squares all assumed that we had a perfectly balanced design. Just as with factorial ANOVA, everything gets a little wonky with an unbalanced design. The details of this aren’t really suitable for this class. The important thing to know is that ezANOVA() will handle it all for you.