STA141A: Fundamentals of Statistical Data Science
R built-in functions are, e.g., sum() or mean(), where the input is a vector and the output is a number.
A function you define is stored in the R environment as an object with the name you assign to it.
When calling a function, you can specify the arguments by:
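In R, arguments can be matched by position, by name, or by a mix of the two. A minimal sketch using the built-in seq() and mean(); the numeric values are made up for illustration:

```r
# By position: arguments are matched in the order from, to, by.
by_position <- seq(2, 10, 4)

# By name: each argument is labelled explicitly.
by_name <- seq(from = 2, to = 10, by = 4)

# Named arguments may be given in any order.
reordered <- seq(by = 4, to = 10, from = 2)

# Mixed: the first argument by position, an optional one by name.
mixed <- mean(c(1, 5, NA), na.rm = TRUE)

print(by_position)
print(mixed)
```

Naming arguments tends to make calls easier to read once a function has more than one or two of them.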
How to write your own function:
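A minimal sketch of the general pattern: give the function a name, list its arguments, and put the commands in the body. The helper my_var() and the vector x below are hypothetical and only serve as an illustration; the result is compared against the built-in var().

```r
# name <- function(arguments) { body }
# The value of the last expression in the body is returned.
my_var <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)  # sample variance, computed by hand
}

x <- c(2, 4, 4, 5, 7)  # made-up data
print(my_var(x))       # 3.3
print(var(x))          # the built-in function gives the same value
```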
Example
Let x and y be numeric vectors of the same length. We can calculate:
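the sample mean of x by mean(x);
the sample variance of x by var(x);
the sample standard deviation of x by sd(x);
the sample covariance of x and y using cov(x, y);
the sample correlation of x and y using cor(x, y).
for loop
Template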
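A minimal sketch of the general shape of a for loop: the loop variable takes each value of a vector in turn, and the body runs once per value. The counter n_runs is a hypothetical name used only to show how often the body executes.

```r
# for (<variable> in <vector>) { <commands> }
n_runs <- 0
for (i in 1:3) {
  print(i)              # i is 1, then 2, then 3
  n_runs <- n_runs + 1  # the body runs once per element of 1:3
}
print(n_runs)
```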
Example for loop: element-wise squaring
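One way this example could look, assuming a made-up input vector x. The result vector is allocated before the loop and filled one index at a time.

```r
x <- c(1, 2, 3, 4, 5)            # hypothetical input
x_squared <- numeric(length(x))  # pre-allocate the result

for (i in seq_along(x)) {
  x_squared[i] <- x[i]^2         # square the i-th element
}

print(x_squared)  # 1 4 9 16 25
```

In practice the vectorized expression x^2 gives the same result without a loop; the loop version is shown only to illustrate the mechanics.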
for loop
Example for loop: cumulative sum.
y <- c(1, 10, 8, 7, 3)
y
[1] 1 10 8 7 3
z <- 0
for (i in 1:5) {
  z <- z + y[i] # uses previous iteration's value of z
  print(paste("the cumulative sum of the vector y at index", i, "is:", z))
}
[1] "the cumulative sum of the vector y at index 1 is: 1"
[1] "the cumulative sum of the vector y at index 2 is: 11"
[1] "the cumulative sum of the vector y at index 3 is: 19"
[1] "the cumulative sum of the vector y at index 4 is: 26"
[1] "the cumulative sum of the vector y at index 5 is: 29"
[1] 29
for loop
Example for loop: compute \(\sum_{n=1}^5 n!\)
z <- 0
for (i in 1:5) {
  y <- factorial(i) # the factorial() function is built into R
  print(paste0("the value of ", i, "! is: ", y))
  z <- z + y # uses previous iteration's value of z
}
[1] "the value of 1! is: 1"
[1] "the value of 2! is: 2"
[1] "the value of 3! is: 6"
[1] "the value of 4! is: 24"
[1] "the value of 5! is: 120"
[1] 153
Q: what happens if we omit z <- 0 at line 1?
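One way to explore this question, as a sketch: the answer depends on whether an object named z already exists. The starting value 29 in the second case is an assumption, chosen as if z were left over from earlier code.

```r
# Case 1: z has never been created, so z + y fails on the first iteration
# with an "object not found" error.
if (exists("z", inherits = FALSE)) rm(z)
msg <- tryCatch({
  for (i in 1:5) {
    z <- z + factorial(i)
  }
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Case 2: z is left over from earlier code (here assumed to be 29), so the
# loop runs without complaint but starts from the stale value.
z <- 29
for (i in 1:5) {
  z <- z + factorial(i)
}
print(z)  # 182 instead of 153
```

The second case is the more dangerous one, because nothing signals that the result is wrong.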
while loop
Useful when we do not know in advance how many times we want to execute commands.
Example while loop (random walk)
while loop
Example while loop (random walk): another set of random steps
x <- 0
while (-2 <= x && x <= 2) {
  curr_step <- sample(c(-1, 1), size=1)
  print(paste0("moving x=", x, " by step of size ", curr_step))
  x <- x + curr_step # uses previous iteration's value of x
}
[1] "moving x=0 by step of size -1"
[1] "moving x=-1 by step of size -1"
[1] "moving x=-2 by step of size -1"
while loop
Example while loop (random walk): fix the set of “random” steps
set.seed(42) # for reproducibility; fixes any subsequent "random" results
x <- 0
while (-2 <= x && x <= 2) {
  curr_step <- sample(c(-1, 1), size=1)
  print(paste0("moving x=", x, " by step of size ", curr_step))
  x <- x + curr_step # uses previous iteration's value of x
}
[1] "moving x=0 by step of size -1"
[1] "moving x=-1 by step of size -1"
[1] "moving x=-2 by step of size -1"
while loop
It is possible that the body of a while() loop will never be executed.
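A minimal sketch of this situation, with made-up values: the condition is checked before the first pass, so if it is FALSE from the start the body is skipped entirely. The counter n_iterations is a hypothetical name used to make this visible.

```r
x <- 5             # starts outside the range the loop is waiting for
n_iterations <- 0

while (x < 3) {    # FALSE on the very first check
  n_iterations <- n_iterations + 1
  x <- x + 1
}

print(n_iterations)  # 0: the body never ran
print(x)             # 5: x is unchanged
```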
apply() family of functions
apply() and related functions
lapply(X, FUN, ...): returns a list containing the result of the function FUN applied to each element of the list/vector X.
sapply(X, FUN, ...): essentially does lapply(X, FUN, ...) first and then tries to coerce the output into a vector.
grades
Calculate group means using…
group1 group2 group3
4.44 8.13 12.54
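A sketch of three ways to compute group means, assuming grades is a named list with one numeric vector per group. The values below are made up and do not reproduce the output shown above.

```r
# Hypothetical data: one numeric vector of grades per group.
grades <- list(
  group1 = c(4, 5, 3),
  group2 = c(8, 9, 7),
  group3 = c(12, 13, 14)
)

# 1. for loop: fill a pre-allocated, named vector.
means_for <- setNames(numeric(length(grades)), names(grades))
for (g in names(grades)) {
  means_for[g] <- mean(grades[[g]])
}

# 2. lapply(): returns a list with one mean per group.
means_lapply <- lapply(grades, mean)

# 3. sapply(): same computation, simplified to a named numeric vector.
means_sapply <- sapply(grades, mean)

print(means_sapply)
```

The sapply() version is the most compact here because every group yields a single number, so the results simplify cleanly to a vector.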
R is a functional programming language: functions can be used as objects!
funlist <- list(sum, mean, var, sd)
dat <- runif(10)
# Use a for loop to apply every function in funlist to dat
for (f in funlist) {
  print(f(dat)) # prints values
}
[1] 4.108169
[1] 0.4108169
[1] 0.116695
[1] 0.3416066
# Use sapply to apply every function in funlist to dat
sapply(funlist, \(f) f(dat)) # also stores values in a vector
[1] 4.1081692 0.4108169 0.1166950 0.3416066
for() and apply()
Beyond aesthetic differences…
for() executes commands sequentially.
The apply() family can execute commands in parallel, e.g., through parallel versions such as parallel::mclapply() (but does not by default).
purrr::reduce()
Repeatedly applies a binary function to the elements of a vector or list.
Base R has Reduce(), but the version from the purrr package has nicer functionality.
reduce(<list or vector>, <binary function>)
purrr::reduce(): example using paste()
[1] "eek a bear"
[1] "eekabear"
[1] "eek...a...bear"
paste() also does element-wise pasting
So what if you want to paste together all elements of a character vector?
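One possible answer, as a sketch: reduce() pastes the first two elements, then pastes that result with the third, and so on. The vector words is made up, and the code assumes the purrr package is installed; if it is not, it falls back to a thin wrapper around base R's Reduce().

```r
# Assumption: purrr is installed. Otherwise fall back to base R's Reduce(),
# which takes its arguments in the opposite order.
reduce <- if (requireNamespace("purrr", quietly = TRUE)) {
  purrr::reduce
} else {
  function(.x, .f) Reduce(.f, .x)
}

words <- c("eek", "a", "bear")  # hypothetical character vector

# Element-wise pasting: two vectors are combined position by position.
elementwise <- paste(c("a", "b"), c("x", "y"))  # "a x" "b y"

# Pasting all elements of ONE vector together with reduce().
all_words <- reduce(words, paste)                             # "eek a bear"
no_sep    <- reduce(words, paste0)                            # "eekabear"
dotted    <- reduce(words, \(a, b) paste(a, b, sep = "..."))  # "eek...a...bear"

print(all_words)
```

For this particular task, paste(words, collapse = " ") gives the same string without reduce(); the reduce() version generalizes to any binary function.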
purrr::reduce(): example using set intersection
[[1]]
[1] 0 5 10 15 20 25 30
[[2]]
[1] 0 3 6 9 12 15 18 21 24 27 30
[[3]]
[1] 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
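A sketch of how the three sets above could be intersected in one call. The seq() calls are an assumption chosen to reproduce the printed vectors; as before, the code assumes purrr is installed and otherwise falls back to base R's Reduce().

```r
# Assumption: purrr is installed. Otherwise fall back to base R's Reduce().
reduce <- if (requireNamespace("purrr", quietly = TRUE)) {
  purrr::reduce
} else {
  function(.x, .f) Reduce(.f, .x)
}

# Multiples of 5, 3, and 2 between 0 and 30.
sets <- list(
  seq(0, 30, by = 5),
  seq(0, 30, by = 3),
  seq(0, 30, by = 2)
)

# intersect() takes two sets; reduce() chains it across the whole list:
# intersect(intersect(sets[[1]], sets[[2]]), sets[[3]])
common <- reduce(sets, intersect)
print(common)  # 0 30
```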
purrr::reduce(): example of stacking data frames
 [1] -0.2992672 -0.2799575 -0.2311141 -1.2870888 -0.5693856 -3.0185564
[7] -0.4975908 -0.3545537 -1.7564090 -0.7374990
[[1]]
param u
1 -0.2992672 0.3690927
2 -0.2992672 0.5785272
3 -0.2992672 0.9776749
4 -0.2992672 0.6875838
[[2]]
param u
1 -0.2799575 0.44512359
2 -0.2799575 0.80760922
3 -0.2799575 -0.03743895
4 -0.2799575 0.06727781
[[3]]
param u
1 -0.2311141 0.7884435
2 -0.2311141 0.6223001
3 -0.2311141 0.0650239
4 -0.2311141 -0.1781900
[[4]]
param u
1 -1.287089 -0.9658007
2 -1.287089 -0.7921962
3 -1.287089 -0.1906617
4 -1.287089 -0.8355938
[[5]]
param u
1 -0.5693856 0.55956111
2 -0.5693856 -0.55701136
3 -0.5693856 0.01990297
4 -0.5693856 0.23791847
[[6]]
param u
1 -3.018556 -3.0122451
2 -3.018556 -0.6813479
3 -3.018556 -2.3840054
4 -3.018556 -1.5757809
[[7]]
param u
1 -0.4975908 0.4693016
2 -0.4975908 0.6642751
3 -0.4975908 0.3465215
4 -0.4975908 -0.1475987
[[8]]
param u
1 -0.3545537 -0.2326702
2 -0.3545537 -0.2385875
3 -0.3545537 0.0588810
4 -0.3545537 0.5495114
[[9]]
param u
1 -1.756409 -1.7557505
2 -1.756409 -1.1815049
3 -1.756409 0.8154147
4 -1.756409 0.7950465
[[10]]
param u
1 -0.737499 0.5379891
2 -0.737499 -0.1587868
3 -0.737499 0.1574230
4 -0.737499 0.5551562
param u
1 -0.2992672 0.36909267
2 -0.2992672 0.57852718
3 -0.2992672 0.97767495
4 -0.2992672 0.68758376
5 -0.2799575 0.44512359
6 -0.2799575 0.80760922
7 -0.2799575 -0.03743895
8 -0.2799575 0.06727781
9 -0.2311141 0.78844348
10 -0.2311141 0.62230012
11 -0.2311141 0.06502390
12 -0.2311141 -0.17819002
13 -1.2870888 -0.96580066
14 -1.2870888 -0.79219616
15 -1.2870888 -0.19066173
16 -1.2870888 -0.83559384
17 -0.5693856 0.55956111
18 -0.5693856 -0.55701136
19 -0.5693856 0.01990297
20 -0.5693856 0.23791847
21 -3.0185564 -3.01224508
22 -3.0185564 -0.68134793
23 -3.0185564 -2.38400545
24 -3.0185564 -1.57578093
25 -0.4975908 0.46930157
26 -0.4975908 0.66427514
27 -0.4975908 0.34652154
28 -0.4975908 -0.14759872
29 -0.3545537 -0.23267022
30 -0.3545537 -0.23858752
31 -0.3545537 0.05888100
32 -0.3545537 0.54951137
33 -1.7564090 -1.75575051
34 -1.7564090 -1.18150490
35 -1.7564090 0.81541467
36 -1.7564090 0.79504652
37 -0.7374990 0.53798910
38 -0.7374990 -0.15878679
39 -0.7374990 0.15742300
40 -0.7374990 0.55515619
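A sketch of how a list of data frames like the one above could be stacked with reduce() and rbind(). The parameter values and the way u is drawn (uniformly between param and 1) are assumptions for illustration, not the code that produced the output above; purrr is assumed to be installed, with a fallback to base R's Reduce().

```r
# Assumption: purrr is installed. Otherwise fall back to base R's Reduce().
reduce <- if (requireNamespace("purrr", quietly = TRUE)) {
  purrr::reduce
} else {
  function(.x, .f) Reduce(.f, .x)
}

set.seed(1)
params <- c(-0.5, -1, -2)  # hypothetical parameter values

# One small data frame per parameter value (4 rows each).
dfs <- lapply(params, \(p) data.frame(param = p, u = runif(4, min = p, max = 1)))

# rbind() stacks two data frames; reduce() chains it across the list.
stacked <- reduce(dfs, rbind)

print(dim(stacked))  # 12 rows, 2 columns
```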
if and else
Only one condition: if statement
Example (one coin toss)
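One way this example could look, as a sketch: the toss is simulated with sample(), and the messages are made up. The condition in if() must be a single TRUE or FALSE.

```r
set.seed(1)  # for reproducibility
toss <- sample(c("H", "T"), size = 1)

# if alone: the command runs only when the condition is TRUE.
if (toss == "H") {
  print("the coin shows heads")
}

# if with else: exactly one of the two branches runs.
if (toss == "H") {
  result <- "heads"
} else {
  result <- "tails"
}
print(result)
```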
if and else: vectorized version
ifelse(condition1, statement1, statement2)
Example (five coin tosses)
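A sketch of the vectorized version, with simulated tosses and made-up labels: ifelse() checks the condition for every element and picks the first value where it is TRUE and the second where it is FALSE.

```r
set.seed(1)  # for reproducibility
tosses <- sample(c("H", "T"), size = 5, replace = TRUE)

# One result per toss, without writing a loop.
outcome <- ifelse(tosses == "H", "win", "lose")

print(tosses)
print(outcome)
```

A plain if() would not work here, because tosses == "H" is a vector of five logical values rather than a single one.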
else if
More than two conditions:
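A minimal sketch with three branches, using a hypothetical helper classify_sign(): the conditions are checked from top to bottom, and only the first branch whose condition is TRUE runs.

```r
# Hypothetical helper that labels a single number by its sign.
classify_sign <- function(x) {
  if (x < 0) {
    "negative"
  } else if (x == 0) {
    "zero"
  } else {
    "positive"
  }
}

print(classify_sign(-3))  # "negative"
print(classify_sign(0))   # "zero"
print(classify_sign(2))   # "positive"
```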
Comments on loops
Performs commands sequentially
Often we will want to perform the same set of (complicated) commands on different chunks of data.
for loop, but can be difficult to understand because it is so flexible
apply() family of functions