2 Introduction to R
This chapter introduces R, covering basic concepts such as defining variables, performing mathematical operations, and using built-in functions. It then introduces the main data types (numeric, character, logical, and factor) and containers (vector, list, and data frame), along with how to create them and access their elements. Next come control flow with for and while loops, conditional flow with if statements, and the creation of custom functions. Lastly, it discusses inspecting and managing objects in the R environment with functions like ls(), rm(), class(), str(), head(), tail(), and summary().
2.2 Functions
Functions are objects that take arguments as inputs, perform some operation on those inputs, and return the results.
ls() is a function that reports what objects (e.g., variables you have defined) are in the current environment and therefore available for you to interact with.
ls()
## [1] "x" "y"
rm() is a function that can remove objects from the current environment. It takes several arguments. Type ?rm() to see help information on how to use it.
We will commonly combine ls() and rm() to remove all objects from the current environment. This is a good thing to do at the beginning of every new script you write.
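One common way to combine the two functions is sketched below. It relies on the list argument of rm(), which accepts a character vector of object names. The variables x and y are defined here only so that there is something to remove.

```r
# Define two objects so that the environment is not empty.
x <- 1
y <- 2
ls()

# ls() returns the names of all objects as a character vector;
# passing that vector to the `list` argument of rm() removes them all.
rm(list = ls())

# The environment should now be empty.
ls()
```

Run at the top of a script, the final call to ls() should report character(0), meaning no objects remain.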
2.4 Containers
2.4.1 vector
- Create a vector with the function c().
- The elements of a vector must be of the same type.
- Access element i of vector x with square brackets (e.g., x[i]).
## [1] 1.00000 2.00000 3.14159
## [1] 3.14159
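A minimal sketch of these steps, assuming a three-element numeric vector named x. The values are an assumption, chosen to be consistent with the output shown above.

```r
# Create a three-element numeric vector.
x <- c(1, 2, 3.14159)
x

# Access the third element with square brackets.
x[3]
```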
2.4.2 list
- Create a list with the function list().
- The elements of a list can be of different types.
- Access element i of list x with double square brackets (e.g., x[[i]]).
# Create a three element list containing one numeric
# item, one `character` item, and one logical item.
x <- list(3.14159, 'pi', TRUE)
x
## [[1]]
## [1] 3.14159
##
## [[2]]
## [1] "pi"
##
## [[3]]
## [1] TRUE
# Access the third element of the list
x[[3]]
## [1] TRUE
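The reason for using double rather than single square brackets with a list can be seen in a short sketch. It reuses the same three-element list and checks results with class().

```r
x <- list(3.14159, 'pi', TRUE)

# Double square brackets return the element itself.
class(x[[2]])   # "character"

# Single square brackets return a list that contains the element.
class(x[2])     # "list"
```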
2.4.3 data.frame
- Create a data.frame with the function data.frame().
- A data.frame is pretty close to what you might think of as an Excel spreadsheet.
- Access column x in data.frame df with the $ operator (e.g., df$x).
# Create vectors to later store in a data frame
x <- c('I', 'I', 'I', 'II', 'II', 'II', 'III', 'III', 'III', 'IV', 'IV', 'IV')
y <- c('a', 'a', 'b', 'b', 'c', 'c', 'd', 'd', 'e', 'e', 'f', 'f')
z <- rnorm(12)
# Create the data frame
df <- data.frame(x, y, z)
df
##      x y          z
## 1    I a -0.9330328
## 2    I a -0.3860412
## 3    I b -0.1272712
## 4   II b -0.0750369
## 5   II c  0.1739119
## 6   II c  2.3674721
## 7  III d -0.1730269
## 8  III d -1.3843449
## 9  III e -0.2846021
## 10  IV e -0.5652392
## 11  IV f  1.4165630
## 12  IV f -0.2475928
# Access column x of the data frame
df$x
## [1] "I"   "I"   "I"   "II"  "II"  "II"  "III" "III" "III" "IV"  "IV"  "IV"
2.5 Loops
2.5.1 for loops
for loops will run a chunk of code repeatedly for a fixed number of iterations.
The general syntax of a for loop is as follows:
for(x in y) {
  # On the first iteration, `x` will take the value `y[1]`
  # On the second iteration, `x` will take the value `y[2]`
  # On the third iteration, `x` will take the value `y[3]`
  # The loop will end after `x` has taken the value `y[length(y)]`
  # That is, the loop will end when we have iterated through
  # all elements in `y`
}
- As an example, suppose we want to print the numbers 1, 2, 3.
## [1] 1
## [1] 2
## [1] 3
## [1] 1
## [1] 2
## [1] 3
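A minimal sketch of this example, written first one statement at a time and then as a for loop. The loop variable x and the vector c(1, 2, 3) are assumptions, and each version should print the numbers 1, 2, and 3 in order.

```r
# Print the numbers one statement at a time.
print(1)
print(2)
print(3)

# Print the same numbers with a for loop: on each iteration
# `x` takes the next value in the vector.
for (x in c(1, 2, 3)) {
  print(x)
}
```

The loop version scales better: printing the numbers 1 to 100 only requires changing the vector, not adding 97 more lines.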
2.5.2 while loops
while loops will run a chunk of code over and over again for as long as some logical condition remains TRUE. You have to be careful with these, because if your code never makes that condition FALSE, then the loop will execute until your computer turns to dust.
The general syntax of a while loop is as follows:
condition <- TRUE
while(condition) {
  # The code in here runs repeatedly for as long as
  # `condition` is `TRUE`; `condition` is checked again
  # before every iteration.
  # The loop will end only when `condition` is set to `FALSE`,
  # so something inside the loop must eventually do that.
  condition <- FALSE
}
- Let's again consider the example of printing the numbers 1, 2, 3.
## [1] 1
## [1] 2
## [1] 3
# smart monkey way to print 1, 2, 3
x <- 1
while(x < 4) {
  print(x)
  x <- x + 1 # without this line the loop would run forever
}
## [1] 1
## [1] 2
## [1] 3
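The general template and the counting example can also be combined. The sketch below assumes a logical flag named condition, as in the template, which is updated at the end of every pass through the loop. It is one possible way to write the loop, not the only one.

```r
x <- 1
condition <- TRUE
while (condition) {
  print(x)
  x <- x + 1
  # Re-evaluate the stopping rule; this becomes FALSE once x reaches 4,
  # which ends the loop.
  condition <- x < 4
}
```

This should print 1, 2, and 3, just like the version that tests x < 4 directly in the while statement.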
2.6 Conditional flow
Very often we will want to execute a chunk of code only in some situations (e.g., for a particular experiment condition) and we will want to run some other chunk of code in other situations.
The primary method for doing this is to use if statements. The general syntax of an if statement is as follows:
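A minimal sketch of that syntax is given here. The names condition and result are placeholders chosen for illustration, and the else branch is optional.

```r
condition <- TRUE  # placeholder; any expression that is TRUE or FALSE

if (condition) {
  # This chunk runs only when `condition` is TRUE.
  result <- "condition was TRUE"
} else {
  # This chunk runs only when `condition` is FALSE.
  result <- "condition was FALSE"
}
print(result)
```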
- For example, suppose we want to print whether or not a number is less than 5.
## [1] "x is less than 5"
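A sketch of how this example might be written. The starting value x <- 3 and the variable msg are assumptions; any value of x below 5 would give the message shown above.

```r
x <- 3  # assumed value for illustration

if (x < 5) {
  msg <- "x is less than 5"
} else {
  msg <- "x is not less than 5"
}
print(msg)
```

Changing x to a value of 5 or more should make the else branch run instead.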
2.7 Custom functions
Custom functions are very useful because they allow us to flexibly reuse the same chunk of code in different places without having to rewrite the entire chunk.
The general syntax for defining functions is as follows:
function_name <- function(argument_1, argument_2, ...) {
  ## code to run when the function is called. Can use
  ## `argument_1`, `argument_2`, and any other argument
  ## passed in.
  ## ...
  return(the_result)
}
- function_name is the name of the function.
- argument_1, argument_2, etc. are variables that you want the code chunk inside the function to use.
- the_result is a variable that you will have to be careful to define in the code chunk in the function.
- Consider the following example:
## [1] 4
my_func takes two arguments, x and y, and returns x + y - 1.
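A minimal sketch of such a function. The argument values 2 and 3 in the call are assumptions, picked because they give a result of 4.

```r
# Define a function that takes two arguments and returns x + y - 1.
my_func <- function(x, y) {
  the_result <- x + y - 1
  return(the_result)
}

# Call the function with example arguments (assumed values).
my_func(2, 3)   # 2 + 3 - 1 = 4
```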
2.8 Inspect existing objects
R has many built-in functions that are useful for inspecting what sort of thing an existing object is. We illustrate the use of some of these functions below.
# define some variables (objects) to inspect
var_1 <- 10
var_2 <- "apple"
var_3 <- TRUE
var_4 <- c(1, 2, 3)
var_5 <- list('a', 'b', 'c')
var_6 <- data.frame(v4=var_4, v5=var_5)
# substitute different variables into the following
# functions to see how they help you inspect existing
# objects.
str(var_1)
##  num 10
class(var_1)
## [1] "numeric"
head(var_1)
## [1] 10
tail(var_1)
## [1] 10
summary(var_1)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##      10      10      10      10      10      10
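As a sketch of substituting a different variable, the same functions can be applied to the vector var_4. The second argument to head() and tail(), which sets how many elements to return, is an addition not used in the example above.

```r
var_4 <- c(1, 2, 3)

class(var_4)     # type of the object; "numeric" for this vector
str(var_4)       # compact description of the object's structure
head(var_4, 2)   # first two elements
tail(var_4, 2)   # last two elements
summary(var_4)   # minimum, quartiles, median, mean, and maximum
```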
2.9 Summary
Here’s a list of the R functions we saw in the above examples:
- ls(): Lists objects in the current environment.
- rm(): Removes objects from the current environment.
- class(): Returns the class (type) of an object.
- factor(): Creates a factor (categorical data type).
- c(): Combines values into a vector or list.
- list(): Creates a list.
- data.frame(): Creates a data frame.
- rnorm(): Generates normally distributed random numbers.
- print(): Prints its argument.
- str(): Displays the structure of an R object.
- head(): Returns the first parts of an object.
- tail(): Returns the last parts of an object.
- summary(): Provides a summary of an object’s properties.
Additionally, we saw how to create and use custom functions, as well as how to use control flow constructs like for loops, while loops, and if statements.