Setting up

library(data.table)
library(ez)
rm(list=ls())
set.seed(1)

This tutorial introduces some core statistical analyses commonly used in psychology and behavioral research. Each scenario gives you:

You’ll use:

Bonus activities:

Q1

A social researcher is interested in students' heights across different universities and whether students are significantly taller than the state average (165cm). For their first study, they sampled a total of 100 students who study at Western Sydney University (WSU).

data_q1 <- rnorm(100, 170, 5)
a). What is the independent variable(s) and how many levels?

No IV per se because this is a one sample test

b). What is the dependent variable?

Height (in cm). It is the only variable measured, and it is compared against a fixed value rather than across levels of an IV.

c). Calculate the mean for this sample
mean_wsu <- mean(data_q1)
d). What type of statistical test would you run? Why?

One sample t test. The study is comparing a single sample to a known population value. Population standard deviation is unknown.

e). Perform the statistical test and write a sentence explaining the results.
t.test(x = data_q1,
       y = NULL,
       alternative = "greater",
       mu = 165, ## the state average given in the scenario
       paired = FALSE, ## doesn't matter with one-sample
       var.equal = FALSE, ## doesn't matter with one-sample
       conf.level = 0.95)

# WSU student heights are significantly greater than the state average of 165cm, t(99) = 12.35, p < 0.001.
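To see where the t statistic comes from, the one-sample test can be reproduced by hand as (sample mean - mu) / (sd / sqrt(n)). The sketch below simulates its own sample so that it runs on its own; the names heights, mu0, t_hand and p_hand are illustrative and not part of the tutorial.

```r
# Hypothetical sample, simulated here only so the block is self-contained
set.seed(1)
heights <- rnorm(100, mean = 170, sd = 5)
mu0 <- 165                                  # state average from the scenario

n      <- length(heights)
se     <- sd(heights) / sqrt(n)             # standard error of the mean
t_hand <- (mean(heights) - mu0) / se        # one-sample t statistic
p_hand <- pt(t_hand, df = n - 1, lower.tail = FALSE)  # one-sided, "greater"

# Compare against the built-in test
res <- t.test(heights, mu = mu0, alternative = "greater")
all.equal(t_hand, unname(res$statistic))    # TRUE
all.equal(p_hand, res$p.value)
```

The degrees of freedom are n - 1 = 99, which is the value reported in t(99).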

Q2

A researcher is interested in whether caffeine affects recall memory. 60 participants were recruited in this study and were required to complete a memory recall task. 30 participants were given a cup of coffee before the task, and the other 30 participants were given water.

coffee <- rnorm(30, 8, 2)
water <- rnorm(30, 7, 2)
id <- 1:60

## one row per participant: 30 coffee drinkers followed by 30 water drinkers
data_q2 <- data.table(id = id,
                      condition = factor(rep(c("coffee", "water"), each = 30)),
                      score = c(coffee, water))
a). What is the independent variable(s) and how many levels?

Caffeine, with 2 levels (coffee, water)

b). What is the dependent variable?

Memory recall score

c). Calculate the mean for the dependent variable within each level of the independent variable(s).
mean_coffee <- mean(data_q2[condition == "coffee", score])
mean_water <- mean(data_q2[condition == "water", score])
d). What type of statistical test would you run? Why?

Independent samples t test. Two independent samples (different participants in the 2 conditions).

e). Perform the statistical test and write a sentence explaining the results.
x_obs <- data_q2[condition == "coffee", score]
y_obs <- data_q2[condition == "water", score]

t.test(x = x_obs,
       y = y_obs,
       alternative = "two.sided",
       mu = 0,
       paired = FALSE,
       var.equal = TRUE,
       conf.level = 0.95)

# There was a significant effect of caffeine on memory recall, t(58) = 4.24, p < 0.001. Participants who were given coffee before the memory recall test performed significantly better than participants who were given water.
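With var.equal set to TRUE, t.test() uses the pooled-variance (Student) form of the test. A minimal sketch of that calculation follows, using freshly simulated scores so the block runs on its own; coffee_scores, water_scores and the other names are illustrative only. Cohen's d is included as one common way to report the size of the difference.

```r
# Hypothetical scores, simulated only for illustration
set.seed(2)
coffee_scores <- rnorm(30, mean = 8, sd = 2)
water_scores  <- rnorm(30, mean = 7, sd = 2)

n1 <- length(coffee_scores)
n2 <- length(water_scores)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(coffee_scores) + (n2 - 1) * var(water_scores)) / (n1 + n2 - 2)

mean_diff <- mean(coffee_scores) - mean(water_scores)
t_hand    <- mean_diff / sqrt(sp2 * (1 / n1 + 1 / n2))
df_hand   <- n1 + n2 - 2                    # 58 for two groups of 30
cohens_d  <- mean_diff / sqrt(sp2)          # standardised mean difference

res <- t.test(coffee_scores, water_scores, var.equal = TRUE)
all.equal(t_hand, unname(res$statistic))    # TRUE
```

With two groups of 30 participants, the degrees of freedom are 30 + 30 - 2 = 58.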

Q3

A social researcher wants to expand their study on student height to different universities. They have sampled 100 students from each of three universities (WSU, MQU and UNSW), with each student studying exclusively at one of them. They are interested in comparing heights across these 3 universities and whether there are any significant height differences.

x_raw <- rnorm(100, 170, 5)
y_raw <- rnorm(100, 166, 5)
z_raw <- rnorm(100, 171, 5)
x <- (x_raw - mean(x_raw)) / sd(x_raw) * 5 + 170  # mean = 170, sd = 5
y <- (y_raw - mean(y_raw)) / sd(y_raw) * 5 + 166  # mean = 166, sd = 5
z <- (z_raw - mean(z_raw)) / sd(z_raw) * 5 + 171  # mean = 171, sd = 5

data <- data.table(MQU = x, UNSW = y, WSU = z)
data_q3 <- melt(data = data, measure.vars = c("MQU", "UNSW", "WSU"), variable.name = "University", value.name = "Height")
data_q3[, Subject := 1:300]
data_q3[, University := factor(University)]
data_q3[, Subject := factor(Subject)]
a). What is the independent variable(s) and how many levels?

University, with 3 levels (MQU, WSU, UNSW)

b). What is the dependent variable?

Height (in cm)

c). Calculate the mean for the dependent variable within each level of the independent variable(s).
mean_wsu <- mean(data_q3[University == "WSU", Height])
mean_mqu <- mean(data_q3[University == "MQU", Height])
mean_unsw <- mean(data_q3[University == "UNSW", Height])
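The three means can also be computed in one step by grouping with data.table's by argument. This is a small sketch on a made-up table (heights_long and its values are hypothetical), not the tutorial's data.

```r
library(data.table)

# Hypothetical long-format table: 4 made-up heights per university
heights_long <- data.table(
  University = rep(c("MQU", "UNSW", "WSU"), each = 4),
  Height     = c(168, 170, 172, 170,
                 164, 166, 168, 166,
                 169, 171, 173, 171)
)

# One row per university with its mean height and group size
group_means <- heights_long[, .(mean_height = mean(Height), n = .N), by = University]
group_means
```

Grouping this way scales to any number of levels without writing one line per group.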
d). What type of statistical test would you run? Why?

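One-way between-subjects ANOVA. There is one independent variable with 3 levels, and a different group of students was sampled at each university.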

e). Perform the statistical test and write a sentence explaining the results
ezANOVA(data=data_q3, wid =Subject, dv=Height, within=NULL, between=.(University), type=3)

# There is a significant effect of University on student height, F(2, 297) = 28, p < 0.001. This shows that mean student height differs between at least two of WSU, MQU and UNSW.
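Because the simulated groups are rescaled to have exact means of 170, 166 and 171 and an exact standard deviation of 5, the F ratio can be worked out from sums of squares. The sketch below rebuilds comparable data with a hypothetical helper, make_group(), and checks the hand calculation against base R's anova(); it uses lm() rather than ezANOVA() only to stay free of package dependencies.

```r
# Hypothetical helper: n values rescaled to an exact mean m and sd s
make_group <- function(n, m, s) {
  raw <- rnorm(n)
  (raw - mean(raw)) / sd(raw) * s + m
}

set.seed(3)
height     <- c(make_group(100, 170, 5), make_group(100, 166, 5), make_group(100, 171, 5))
university <- factor(rep(c("MQU", "UNSW", "WSU"), each = 100))

grand_mean <- mean(height)                        # 169
group_mean <- tapply(height, university, mean)
group_n    <- tapply(height, university, length)

ss_between <- sum(group_n * (group_mean - grand_mean)^2)   # 100 * (1 + 9 + 4) = 1400
ss_within  <- sum((height - ave(height, university))^2)    # 3 * 99 * 25 = 7425

df_between <- nlevels(university) - 1             # 2
df_within  <- length(height) - nlevels(university)  # 297

f_hand <- (ss_between / df_between) / (ss_within / df_within)  # 700 / 25 = 28

aov_tab <- anova(lm(height ~ university))
all.equal(f_hand, aov_tab$`F value`[1])           # TRUE
```

The mean square between groups is 1400 / 2 = 700 and the mean square within groups is 7425 / 297 = 25, which gives F(2, 297) = 28.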

Q4

A researcher is interested in the effect of an essential oil, specifically lavender oil, on sleep duration. 80 people were recruited in this study. For the first week, all participants followed their usual sleep routines, and at the end of that week, researchers recorded each person’s average number of hours slept per night. In the 2nd week, participants used lavender oil before bed each night. Sleep hours were recorded again at the end of that week and then compared to the first week.

before_lav <- rnorm(80, 7, 2)
after_lav <- rnorm(80, 8, 2)
ID <- 1:80 ## one id per participant; each participant has a before and an after score
data_q4 <- data.table(id = ID, before = before_lav, after = after_lav)
data_q4 <- melt(data_q4, id.vars = "id", variable.name = "time", value.name = "hours")
a). What is the independent variable(s) and how many levels?

Lavender oil, with 2 levels (before, after)

b). What is the dependent variable?

Hours of sleep

c). Calculate the mean for the dependent variable within each level of the independent variable(s).
mean_before <- mean(data_q4[time == "before", hours])
mean_after <- mean(data_q4[time == "after", hours])
d). What type of statistical test would you run? Why?

Paired t test. 2 samples of data from the same group of participants.

e). Perform the statistical test and write a sentence explaining the results.
x_obs <- data_q4[time == "before", hours]
y_obs <- data_q4[time == "after", hours]

t.test(x = x_obs,
       y = y_obs,
       alternative = "two.sided",
       mu = 0,
       paired = TRUE,
       var.equal = TRUE, ## doesn't matter with a paired test
       conf.level = 0.95)

# There was a significant effect of lavender oil on hours of sleep, t(79) = -4.24, p < 0.001. Participants slept significantly more hours on average per night after using lavender oil.
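A paired t test is equivalent to a one-sample t test on the within-person differences, which is why its degrees of freedom are the number of participants minus 1. The sketch below checks this on freshly simulated data; sleep_before, sleep_after and diffs are illustrative names only.

```r
# Hypothetical sleep hours for 80 participants, simulated for illustration
set.seed(4)
sleep_before <- rnorm(80, mean = 7, sd = 2)
sleep_after  <- rnorm(80, mean = 8, sd = 2)

# Within-person difference, in the same direction as t.test(x = before, y = after)
diffs <- sleep_before - sleep_after

paired_res    <- t.test(sleep_before, sleep_after, paired = TRUE)
onesample_res <- t.test(diffs, mu = 0)

all.equal(unname(paired_res$statistic), unname(onesample_res$statistic))  # TRUE
unname(paired_res$parameter)   # 79 = 80 participants - 1
```

A negative t here means the second argument (after) has the larger mean, since the differences are taken as before minus after.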