Control statements are most often used in connection with functions.
Think about functions as tools that we can use to do things. So far we have been working with built-in functions. We can also define our own functions in R, assign them a name, and then call them just like the built-in functions.
To create a function, use the function
keyword, followed by a list of arguments and the function body.
How it works:
When the function body is a single expression:
function(argument1, ...., argumentN) expression
If there are a series of expressions, curly braces should be used around the function body:
function(argument1, ...., argumentN) {
expression1
.
.
.
expressionrM
}
We have three ways to create a function.
<-
.my_function <- function(x,y,z) {c(x + 1, y + 1, z + 1)}
lapply(mtcars, function(x) length(unique(x)))
open.account <- function(total) {
list(
deposit = function(amount) {
if(amount <= 0)
stop("Deposits must be positive!")
total <<- total + amount
print(paste(amount, "deposited. Your balance is", total))
},
withdraw = function(amount) {
if(amount > total)
stop("You don’t have that much money!")
total <<- total - amount
print(paste(amount, "withdrawn. Your balance is", total))
},
balance = function() {
print(paste("Your balance is", total))
}
)
}
Adapted from https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
To call a function, use the function name followed by parentheses.
my_function <- function(x,y,z) {c(x + 1, y + 1, z + 1)}
my_function(1,2,3)
## [1] 2 3 4
A useful trick: We can type the name of a function to see the code that runs when we call it.
my_function
## function(x,y,z) {c(x + 1, y + 1, z + 1)}
Arguments control how we call the function.
When we call a function in R, there are different ways to specify the arguments.
When arguments are given in the named form name = object
, they may be given in any order.
If we don’t name the arguments, then R will match them based on position.
Besides, we may also specify named arguments after the positional arguments.
my_function(1,2,3)
## [1] 2 3 4
my_function(x = 1, y = 2, z = 3)
## [1] 2 3 4
my_function(1, 2, z = 3)
## [1] 2 3 4
However, it’s a good practice to use exact argument names. It removes ambiguity and makes our code more readable.
If we don’t pass anything to the function, it uses the default value.
my_function <- function(x = 1, y = 2, z = 3) {c(x + 1, y + 1, z + 1)}
my_function()
## [1] 2 3 4
my_function(x = 3, y = 4)
## [1] 4 5 4
Example 1:
Write an R program to calculate a dog’s age in dog’s years. For the first two years, a dog year is equal to 10.5 human years. After that, each dog year equals 4 human years. When a dog’s age in human years is 15, what is the dog’s age in dog’s years?
age <- function(human){
if (human <= 2){
dog <- human*10.5
} else {
dog <- 2*10.5 + (human-2)*4
}
print(dog)
}
age(15)
## [1] 73
Example 2:
Write an R function that calculates the number of uppercase letters and lowercase letters in a string. Hint: Use ?"regular expression"
.
function_letters <- function(string) {
string <- strsplit(string, split = "")[[1]]
c1 <- 0
c2 <- 0
for (i in string){
if (grepl("[A-Z]", i)){
c1 <- c1 + 1
} else if (grepl("[a-z]", i)){
c2 <- c2 + 1
}
}
print(paste("upper case: ", c1))
print(paste("lower case: ", c2))
}
string1 <- "Mr. and Mrs. Dursley of number four, Privet Drive, were proud to say that they were perfectly normal, thank you very much."
string2 <- "One hot spring evening, just as the sun was going down, two men appeared at Patriarch's Ponds."
function_letters(string1)
## [1] "upper case: 5"
## [1] "lower case: 90"
function_letters(string2)
## [1] "upper case: 3"
## [1] "lower case: 71"
R returns the last evaluated expression in the function. We may also use the return()
function to specify the value returned by the function. But it is common to omit the return statement.
my_function <- function(x,y,z) {return(c(x + 1, y + 1, z + 1))}
my_function(1,2,3)
## [1] 2 3 4
print()
or return()
print()
shows the human user a string representing what is going on inside the computer. return()
gives back a value.
print()
is very useful for understanding how a program works and can be used in debugging to check various values in a program without interrupting the program.
return()
is the main way that a function returns a value. All functions will return a value, and if there is no return statement, it will return nothing. The value that is returned by a function can then be further used as an argument passed to another function, stored as a variable, or just printed for the benefit of the human user.
Example:
return()
:
tri <- function(k) {
if (k > 0) {
result <- k + tri(k - 1)
return(result)
} else {
result <- 0
return(result)
}
}
tri(6)
## [1] 21
print()
:
tri <- function(k) {
if (k > 0) {
result <- k + tri(k - 1)
print(result)
} else {
result <- 0
print(result)
}
}
tri(6)
## [1] 0
## [1] 1
## [1] 3
## [1] 6
## [1] 10
## [1] 15
## [1] 21
stop()
stop()
stops execution of the current expression and executes an error action.
deposit <- function(amount) {
total <- 100
if(amount <= 0)
stop("Deposits must be positive!\n")
total <- total + amount
print(paste(amount, "deposited. Your balance is", total))
}
deposit(6)
## [1] "6 deposited. Your balance is 106"
If we try to deposit -6 into our account by deposit(-6)
, the function will return the error message Error in deposit(-6) : Deposits must be positive!
.
In R, a function can call itself. We can loop through data to reach a result. Let’s first see an example; then we’ll explain it, and we will discuss three more examples (problem sets).
Example:
tri <- function(k) {
if (k > 0) {
result <- k + tri(k - 1)
return(result)
} else {
result <- 0
return(result)
}
}
tri(6)
## [1] 21
## Code from https://www.w3schools.com/r/r_functions.asp
Notes: (1) Recursion
k | result |
---|---|
6 | 6+tri(5) |
5 | 5+tri(4) |
4 | 4+tri(3) |
3 | 3+tri(2) |
2 | 2+tri(1) |
1 | 1+tri(0) |
0 | 0 |
result | output |
---|---|
tri(0) | 0 |
1+tri(0) | 1 |
2+tri(1) | 2+1=3 |
3+tri(2) | 3+2+1=6 |
4+tri(3) | 4+3+2+1=10 |
5+tri(4) | 5+4+3+2+1=15 |
6+tri(5) | 6+5+4+3+2+1=21 |
We can call a function within another function.
We’ll also see passing functions to other functions with the apply
family function in data frame manipulation, which we discuss in data manipulation.
stdError <- function(x) {sqrt(var(x)/length(x))}
tapply(data, label, stdError)
A variable’s scope is the set of places from which you can see the variable. R will try to find a variable in the current environment. If it doesn’t find that variable, it will look one level up in its parent environment, and then that environment’s parent, until it reaches the global environment.
Compare the two examples below:
Example 1:
y <- 10
f <- function(x) {
y <- 2
y*2 + g(x)
}
g <- function(x) {
x * y
}
f(3)
## [1] 34
Example 2:
f <- function(x) {
y <- 2
g <- function(x) {
x * y
}
y*2 + g(x)
}
f(3)
## [1] 10
In R, function arguments are only evaluated if accessed. This is an important feature because it allows us to do things like including potentially expensive computations in function arguments, which will be evaluated only when needed.