Lecture 8A: Functions

STAT 545 - Fall 2025

Learning Outcomes

By the end of today’s class, students are expected to be able to:

  • Understand when to use a function

  • Build functions from scratch in R

  • Document functions using roxygen2

  • Test functions with the testthat package

  • Specify what to return in a function using return()

Lecture Notes

YouTube Video

Set-up

Required packages:
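The package list did not survive in this copy of the notes. As a minimal sketch, assuming testthat is the only package the later examples call directly (roxygen2 tags are ordinary comments, so nothing needs loading to write them):

```r
# Assumption: testthat is already installed,
# e.g. via install.packages("testthat")
library(testthat)
```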

Functions

We’ve used functions throughout this course since Week 1:

  • mean(), mutate(), and pivot_longer()

  • These are built into R or loaded via a package

  • Even though they are pre-made, someone had to write them at some point!

Often we need more than what’s available in R or its packages.

Self-made R Functions

Why might we want to write our own functions?

  1. Shortens your code

  2. Easier to update code when repeated processes are used (fewer bugs and headaches)

  3. Reproducibility

A good rule of thumb: if you find yourself repeating code, consider writing a function.

Self-made R Functions

To make a function in R, we provide the function name and its arguments, like the following:


my_function_name <- function(argument1, argument2, ...){
  
  # code that involves argument1, argument2, and so on
  # and will calculate something to output
  #
  # by default, whatever is calculated in the last line of the code will be 
  # outputted. We can override this with a return() statement (more on that later)
  
}

Notes:

Self-made R Functions

Here’s a simple example of a function I wrote to simulate rolling a user-inputted number of D10s (a 10-sided die used for tabletop gaming) and returning the sum of the dice.
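The function itself is missing from this copy of the notes. The sketch below is one possible version, not necessarily the one shown in class: the name roll_d10 is hypothetical, while the argument name num_dice is taken from the error-handling exercise later in the lecture.

```r
# Hypothetical reconstruction: roll `num_dice` ten-sided dice and sum the faces.
roll_d10 <- function(num_dice) {
  # Draw num_dice values from 1 to 10; replace = TRUE lets faces repeat
  rolls <- sample(1:10, size = num_dice, replace = TRUE)

  # The last expression evaluated is what the function returns
  sum(rolls)
}

# try rolling two dice
roll_d10(2)
```

Because the dice are sampled at random, the total will usually differ from run to run.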

Note

Have you compared your answer with your neighbour? You may notice the output will change each time you run this function, which is what we want as we are randomly sampling. If I wanted to make this reproducible, then I would set the seed to some number before running my function with, for example, set.seed(123).

Choose a number. With a partner, set the seed to that number before the #try rolling two dice comments. Re-run your code. Do you get the same output?
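As a sketch of what the exercise should show, assuming the hypothetical roll_d10() from before and an arbitrary seed of 123: resetting the seed before each call makes the two rolls match.

```r
# Hypothetical dice function (name assumed, not from the original slides)
roll_d10 <- function(num_dice) {
  sum(sample(1:10, size = num_dice, replace = TRUE))
}

set.seed(123)               # fix the random number generator state
first_try <- roll_d10(2)    # try rolling two dice

set.seed(123)               # reset to the same state
second_try <- roll_d10(2)   # roll again

identical(first_try, second_try)  # TRUE: same seed, same rolls
```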

Function Documentation

You should have also noticed by now that other people’s functions in packages are documented (try running ?mutate in your R console or in the chunk below) - there’s information about:

  1. what the function does, at a high level

  2. the objects it expects you to input

  3. the object that the function outputs

At an absolute minimum, functions should have some comments indicating what the function does and what the inputs are.

Function Documentation

Try documenting this function by adding a description of what it does, what the inputs (arguments) are, and what the output (what it returns) is.

Documenting Functions with roxygen2

We can do even better than plain comments by using roxygen2 tags:

  • Tags are placed immediately above the function definition.

  • Designed for use when creating R packages, but also provide a standardized way to document a function

  • Make it easy for you to migrate your function to an R package if need be (more on packages next week)

  • Roxygen comment lines always start with #'

#' Description of function goes here
#' 
#' @param x description of the parameter input x goes here
#' @param y description of the parameter input y goes here
#' @returns description of what the function returns goes here
name_of_function <- function(x, y) {
  # your function body goes here!
}

Documenting Functions with roxygen2

For the dice example, we could write:
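The documented version is missing from this copy of the notes. One possible way to write it, assuming the hypothetical function name roll_d10 and the argument num_dice:

```r
#' Roll ten-sided dice and sum the result
#'
#' @param num_dice number of D10s to roll (a single whole number)
#' @returns the sum of the faces rolled, as a single number
roll_d10 <- function(num_dice) {
  sum(sample(1:10, size = num_dice, replace = TRUE))
}
```

When the function is only sourced in a script, the #' lines behave like ordinary comments. They become help pages once the function lives in a package built with roxygen2.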

Testing

When you’re using other people’s functions – like those in packages – they often work.

It is very easy to oversimplify a function and have it not work.

Because of this, it’s important to test the functions. We should use:

  • standard test cases (e.g., rolling 1, 3, or 100 dice), and

  • edge cases (conditions that fall outside the typical or expected parameters, e.g., rolling 0 or 2.5 dice)

to ensure the function works as expected. This includes ensuring errors are “thrown” when required.

Testing

Let’s try rolling 4 dice:

Now, let’s try rolling no dice. The expected output is 0.
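Both manual checks might look like the sketch below, again assuming the hypothetical roll_d10() function:

```r
# Hypothetical dice function (name assumed, not from the original slides)
roll_d10 <- function(num_dice) {
  sum(sample(1:10, size = num_dice, replace = TRUE))
}

# Standard case: four dice give a random total between 4 and 40
four_dice <- roll_d10(4)
four_dice

# Edge case: no dice means an empty draw, and the sum of nothing is 0
no_dice <- roll_d10(0)
no_dice
```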

Testing with testthat

Instead of manually coding test cases over and over, we can use functions from the testthat package in R. For example, when rolling no dice, we would expect the output to be 0. We can use the expect_equal() function to confirm this. The function won’t output anything if the output is as expected:
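A sketch of the passing case, assuming testthat is installed and using the hypothetical roll_d10() function:

```r
library(testthat)

# Hypothetical dice function (name assumed, not from the original slides)
roll_d10 <- function(num_dice) {
  sum(sample(1:10, size = num_dice, replace = TRUE))
}

# Rolling no dice should total 0; a passing expectation prints nothing
expect_equal(roll_d10(0), 0)
```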

or will throw an error if not:
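To see the failing case without halting a script, the sketch below wraps a deliberately wrong expectation in tryCatch(). The wrong value 5 and the handler message are illustrative choices, and roll_d10() is the hypothetical function name used throughout.

```r
library(testthat)

# Hypothetical dice function (name assumed, not from the original slides)
roll_d10 <- function(num_dice) {
  sum(sample(1:10, size = num_dice, replace = TRUE))
}

# Deliberately wrong expectation: rolling no dice gives 0, not 5.
# Outside test_that(), a failed expectation is raised as an error,
# so tryCatch() can intercept it.
result <- tryCatch(
  expect_equal(roll_d10(0), 5),
  error = function(e) "expectation failed"
)
result
```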

Testing with testthat

The test_that() function makes these tests even more readable:
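A sketch of how related expectations could be grouped, assuming the hypothetical roll_d10() function. The test descriptions are my own wording.

```r
library(testthat)

# Hypothetical dice function (name assumed, not from the original slides)
roll_d10 <- function(num_dice) {
  sum(sample(1:10, size = num_dice, replace = TRUE))
}

test_that("rolling no dice gives a total of 0", {
  expect_equal(roll_d10(0), 0)
})

test_that("rolling three dice gives a total in the possible range", {
  total <- roll_d10(3)
  expect_true(total >= 3 && total <= 30)
})
```

Each test_that() call pairs a plain-language description with the expectations that back it up, so a failure report says which behaviour broke.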

Error Handling

Let’s try inputting a nonsense input, like 2.5 dice. This input doesn’t make sense, so let’s see what happens:

Interesting! This is something we should consider controlling for when creating our function.

Error Handling

Inside a function, we can force errors to appear using the stop() function together with conditional statements.

For example, we may want to allow only whole numbers of dice as inputs:
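One way this check could be written, as a sketch: the function name roll_d10, the modulo test, and the wording of the error message are all assumptions rather than the version shown in class.

```r
# Hypothetical dice function with a whole-number check
roll_d10 <- function(num_dice) {
  # num_dice %% 1 is the fractional part; anything non-zero is not whole
  if (num_dice %% 1 != 0) {
    stop("num_dice must be a whole number")
  }

  sum(sample(1:10, size = num_dice, replace = TRUE))
}
```

stop() ends the function immediately, so the sampling line is never reached for an input like 2.5.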

Notes:

Error Handling

So rolling 2 dice shouldn’t throw an error, but rolling 2.5 dice should:
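Both behaviours can be checked with testthat, sketched here with the hypothetical roll_d10() and an assumed error message:

```r
library(testthat)

# Hypothetical dice function with a whole-number check (message assumed)
roll_d10 <- function(num_dice) {
  if (num_dice %% 1 != 0) {
    stop("num_dice must be a whole number")
  }
  sum(sample(1:10, size = num_dice, replace = TRUE))
}

test_that("whole numbers are accepted and fractions are rejected", {
  # Two dice: runs without error and returns a single number
  expect_true(is.numeric(roll_d10(2)))

  # 2.5 dice: an error whose message matches the given pattern
  expect_error(roll_d10(2.5), "whole number")
})
```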

Notes:

Error Handling: Exercise

Add another section of code that throws an error that says “num_dice must be positive” when a negative number of dice is inputted into the function.
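One possible solution, as a sketch: the function name and the whole-number message are assumptions, while the “num_dice must be positive” message comes from the exercise. Only negative inputs trigger it, so rolling 0 dice still returns 0.

```r
# Hypothetical dice function with two input checks
roll_d10 <- function(num_dice) {
  if (num_dice %% 1 != 0) {
    stop("num_dice must be a whole number")
  }

  # New check: reject a negative number of dice
  if (num_dice < 0) {
    stop("num_dice must be positive")
  }

  sum(sample(1:10, size = num_dice, replace = TRUE))
}
```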

Notes:

Returns

By default, your function will return the last thing computed in your function. However, we can return other objects, such as lists, vectors, and data frames, using return().

Although it is redundant here, because the last line of code is already what we want to output, we could explicitly tell R what to return:
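A sketch of the explicit version, assuming the hypothetical roll_d10() function. The intermediate name total is my own.

```r
# Hypothetical dice function with an explicit return()
roll_d10 <- function(num_dice) {
  rolls <- sample(1:10, size = num_dice, replace = TRUE)
  total <- sum(rolls)

  # Explicitly hand back the total instead of relying on the last line
  return(total)
}
```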

Returns

We could also return a vector containing the number of dice, the number of faces on each die, and the sum.
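A sketch of that idea, assuming the hypothetical function name roll_d10_summary and the element names num_dice, num_faces, and total. None of these names come from the original slides.

```r
# Hypothetical variant that returns a named vector instead of one number
roll_d10_summary <- function(num_dice) {
  num_faces <- 10
  rolls <- sample(1:num_faces, size = num_dice, replace = TRUE)

  # c() builds a named numeric vector; return() makes the output explicit
  return(c(num_dice = num_dice, num_faces = num_faces, total = sum(rolls)))
}

result <- roll_d10_summary(2)
result[["total"]]   # pull out a single element by name
```

Naming the elements lets callers extract values by name rather than by position.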

Notes:

Next Class (Thursday)

We will continue with self-made R functions and explore default values, ellipses, data masking, and missing value handling.

Worksheet B1 and Assignment B1

Now it’s your turn to explore functions!