CARRY OVER FROM WEEK 3

library(data.table)
library(ggplot2)

#Load in built-in data as a data.table
iris <- as.data.table(iris)

Additional Questions - combining data.table and ggplot knowledge

A). Create a histogram of petal length for the versicolor species only. Adjust binwidth and add labels accordingly.

ggplot(data = iris[Species == "versicolor"], aes(x = Petal.Length)) +
  geom_histogram(binwidth = 0.05) +
  labs(title = "Distribution of Petal Length for Versicolor",
       x = "Petal Length",
       y = "Frequency")

B). Create a boxplot of sepal length for the setosa and virginica species only. Add labels accordingly.

ggplot(data = iris[Species %in% c("setosa", "virginica")], aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  labs(title = "Distribution of Sepal Length for Setosa and Virginica",
       x = "Species",
       y = "Sepal Length")

Get some experiment data

  • Complete the experiment located at the following address: https://run.pavlovia.org/demos/simplertt/

  • Be sure to enter your name or some other ID that you will remember and can be easily searched for.

  • Download your data in a .csv file by taking the following steps:

  • Go here: https://gitlab.pavlovia.org/demos/simplertt

  • Read about the experiment you just participated in by scrolling to the bottom and reading the README.md file.

  • Click the data folder to land at the following address: https://gitlab.pavlovia.org/demos/simplertt/tree/master/data

  • Near the top right of the page, click the Find file button, and search for a file containing the unique ID that you entered at the beginning of the experiment.

  • Click on the .csv file that pops up.

  • Finally, click the download button to download your .csv file to your local machine.

Analyse your data using data.table

  • Load the data.table and ggplot2 libraries, and use rm to be sure you are starting with a clean workspace.
library(data.table)
library(ggplot2)

rm(list = ls())

Q1. Load the data into a data.table using the fread function from the data.table library.

# You need to replace the empty string below with a path that
# points to wherever you have your data stored.
d <- fread('')
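
If you would like to see what fread returns before pointing it at your own file, the sketch below writes a small made-up table to a temporary .csv file and reads it back. The column names mirror ones used later in this worksheet, but the values, the participant ID, and the temporary file are all invented for illustration and are not your experiment data.

```r
library(data.table)

# Hypothetical stand-in for a downloaded results file (values are made up)
toy <- data.table(participant = "demo_id",
                  response.rt = c(0.31, 0.28, 0.35),
                  isi = c(1, 2, 1))

# Write it to a temporary .csv file, then read it back with fread
tmp_path <- tempfile(fileext = ".csv")
fwrite(toy, tmp_path)
d_toy <- fread(tmp_path)

# fread returns a data.table, which is also a data.frame
class(d_toy)
```

With your real data, the only change is that the path passed to fread points at the .csv file you downloaded.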

When data.table objects have lots of columns, str can be a good summary function to use for basic inspection.

str(d)
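
As a rough illustration of what str reports, here is the same call on a small made-up table, alongside two other base R inspection functions (names and dim). The table and its values are hypothetical; only the functions carry over to your own d.

```r
library(data.table)

# Made-up table with three columns, for illustration only
toy <- data.table(participant = "demo_id",
                  response.rt = c(0.31, 0.28),
                  isi = c(1, 2))

str(toy)    # one line per column: name, type, and the first few values
names(toy)  # just the column names
dim(toy)    # number of rows, then number of columns
```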

It’s certainly difficult to know what all of these columns encode. This is something you will get used to as you build and run your own experiments (e.g., as you will in later COGS units). For now, I’ll just tell you. The data contains a row for every trial completed, including practice trials.

Q2a. Extract the columns named practiceTrials.thisN and mainTrials.thisN.

d[, .(practiceTrials.thisN, mainTrials.thisN)]

Q2b. What are the NAs telling us?

Ans = We can see that mainTrials.thisN is NA during practice and practiceTrials.thisN is NA during the main experiment.
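
One way to check this pattern is to count how many rows fall into each phase. The sketch below uses made-up trial counters (two practice trials followed by three main trials) and assumes every row belongs to exactly one of the two phases; the phase column is a hypothetical helper, not something in the downloaded file.

```r
library(data.table)

# Made-up trial counters: 2 practice trials followed by 3 main trials
toy <- data.table(practiceTrials.thisN = c(0, 1, NA, NA, NA),
                  mainTrials.thisN     = c(NA, NA, 0, 1, 2))

# Label each row by which counter is missing, then count rows per phase
toy[, phase := ifelse(is.na(mainTrials.thisN), "practice", "main")]
phase_counts <- toy[, .N, by = phase]
phase_counts
```

If your real file contains rows where both counters are NA, this simple labelling would count them as practice, so it is worth checking for that case first.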

Q2c. What’s the column name that stores RT?

Ans = We can also see that our reaction time per trial is stored in a column named response.rt.

Q2d. Which column is our independent variable? What is it telling us?

Ans = isi is an independent variable.

Okay, we are now equipped to pull out just the rows and columns that we need for a simple exploration of our performance.

# We begin by just looking at the main trials
d_main <- d[!is.na(mainTrials.thisN), .(response.rt, isi)]

Q3a. Explain in words what the above code is doing
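
If it helps to see the two parts of the square brackets separately, the sketch below applies the same pattern to a small made-up table. The values and the other_column name are hypothetical; the point is only that the part before the comma picks rows and the part after the comma picks columns.

```r
library(data.table)

# Made-up data: the first two rows are practice trials (mainTrials.thisN is NA)
toy <- data.table(mainTrials.thisN = c(NA, NA, 0, 1, 2),
                  response.rt      = c(0.50, 0.45, 0.31, 0.28, 0.35),
                  isi              = c(1, 2, 1, 2, 1),
                  other_column     = letters[1:5])

# Before the comma: keep rows where mainTrials.thisN is not NA
# After the comma:  keep only the response.rt and isi columns
toy_main <- toy[!is.na(mainTrials.thisN), .(response.rt, isi)]
toy_main
```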

Q3b. Create a simple histogram plotting response.rt. What do you notice?

ggplot(d_main, aes(x = response.rt)) +
  geom_histogram(bins = 30)
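
Since isi is the independent variable, a possible next step is to look at response.rt separately for each isi value. The sketch below is one way to do that with a boxplot, using made-up numbers in place of your d_main; wrapping isi in factor() is a choice made here so that each value gets its own box.

```r
library(data.table)
library(ggplot2)

# Made-up main-trial data for illustration only
toy_main <- data.table(response.rt = c(0.30, 0.34, 0.32, 0.26, 0.28, 0.30),
                       isi         = c(1, 1, 1, 2, 2, 2))

# Treat isi as a category so each value gets its own box
p <- ggplot(toy_main, aes(x = factor(isi), y = response.rt)) +
  geom_boxplot() +
  labs(x = "isi", y = "Response time")
p
```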

Q3c. How can we compute and report basic descriptive statistics (mean, median, and standard deviation) for response times in the main experiment using data.table? Specifically, how do we apply functions like mean(), median(), and sd() to the response.rt column within d_main to summarize the data efficiently?

# Finally, we report basic descriptive statistics as actual
# numbers for the main experiment
d_main[, .(rt_mean=mean(response.rt), 
           rt_median=median(response.rt), 
           rt_sd=sd(response.rt))]
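
The same summary can also be computed separately for each isi value by adding by = isi. The sketch below uses made-up numbers rather than your d_main, and it adds na.rm = TRUE on the assumption that some trials might have a missing response time; if your data has no missing values, that argument changes nothing.

```r
library(data.table)

# Made-up main-trial data; one trial has a missing response time
toy_main <- data.table(response.rt = c(0.30, 0.34, NA, 0.26, 0.28, 0.30),
                       isi         = c(1, 1, 1, 2, 2, 2))

# Same three statistics, computed once per isi value.
# na.rm = TRUE drops missing response times before each calculation.
rt_by_isi <- toy_main[, .(rt_mean   = mean(response.rt, na.rm = TRUE),
                          rt_median = median(response.rt, na.rm = TRUE),
                          rt_sd     = sd(response.rt, na.rm = TRUE)),
                      by = isi]
rt_by_isi
```

The result has one row per isi value, which makes it easy to compare response times across conditions.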