purrr
STAT 545 - Fall 2025
Use the map family of functions from the purrr package to iteratively apply a function.
Create and operate on list columns in a tibble using nest(), unnest(), and the map_*() family of functions.
Define functions on-the-fly within a map function using shortcuts.
(No video lecture this week!)
We will require the following packages:
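A minimal sketch of a setup chunk. The package list here is an assumption, chosen because these packages provide the functions named in this lesson; loading the whole tidyverse would work as well.

```r
# Assumed package list: each one supplies functions used in this lesson.
library(purrr)   # map(), map_dbl(), map_dfr()
library(tidyr)   # nest(), unnest()
library(tibble)  # tibble()
library(dplyr)   # group_by(), plus the row-binding that map_dfr() relies on
```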
Here is a list in R; it holds multiple items.
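As an illustration, a list might be built like this. The name `my_list` and its contents are hypothetical, picked only to show that one list can hold items of different types and lengths.

```r
# Hypothetical sample list: a number, a string, a numeric vector, and a logical.
my_list <- list(3.14, "hello", c(1, 2, 3), TRUE)

# str() gives a compact overview of what each element holds.
str(my_list)

length(my_list)  # 4
```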
| Vectors | Lists |
|---|---|
| Access elements with square brackets [] | Access elements with [[]] |
| Each element must be an atomic data type (i.e., a single value) | Elements can be anything, even another list or another vector |
| Each element has to be of the same type | Elements can be as different as you like |
Let’s take our sample list and access some items stored in it.
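A small sketch of the difference between `[[ ]]` and `[ ]` on a list, assuming a hypothetical sample list called `my_list`:

```r
# Hypothetical sample list (an assumption, not the lecture's own data).
my_list <- list(3.14, "hello", c(1, 2, 3), TRUE)

# Double brackets extract the element itself.
second <- my_list[[2]]
class(second)          # "character"

# Single brackets return a shorter list that still wraps the element.
second_as_list <- my_list[2]
class(second_as_list)  # "list"
```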
To access the data within a vector stored in the list, we can index the extracted vector as well:
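For instance, assuming the third element of a hypothetical `my_list` is a numeric vector, indexing can be chained:

```r
# Hypothetical sample list whose third element is a numeric vector.
my_list <- list(3.14, "hello", c(1, 2, 3), TRUE)

# Step 1: pull the vector out of the list.
vec <- my_list[[3]]

# Step 2: index into the vector.
vec[2]                        # 2

# Both steps in one expression.
second_value <- my_list[[3]][2]
second_value                  # 2
```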
Elements within a list can also be named using names().
Once the elements are named, you can access them using the $ operator, similar to how you can grab columns from a data frame or tibble:
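A short sketch of naming and then accessing list elements. The list and the names assigned to it are hypothetical.

```r
# Hypothetical sample list.
my_list <- list(3.14, "hello", c(1, 2, 3), TRUE)

# Assign one name per element.
names(my_list) <- c("pi_approx", "greeting", "counts", "flag")

# Access by name with $ ...
my_list$greeting       # "hello"
my_list$counts[3]      # 3

# ... or with [[ ]] and a string.
my_list[["flag"]]      # TRUE
```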
Data frames and tibbles are actually a special type of list:
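One way to check this, using a small made-up tibble `df`: each column is one element of the underlying list.

```r
library(tibble)

# Made-up tibble with two columns.
df <- tibble(x = 1:3, y = c("a", "b", "c"))

is.list(df)   # TRUE
typeof(df)    # "list"
length(df)    # 2, one list element per column

# List-style access works on columns.
df[["y"]]
df$x
```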
Let’s write a “for loop” in R that iterates over the entries of a numeric vector x, squares each entry, and stores the result in a numeric vector output:
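A sketch of such a loop, assuming a made-up input vector `x`:

```r
# Made-up numeric vector to iterate over.
x <- c(1, 2, 3, 4)

# Pre-allocate the output vector with the same length as x.
output <- numeric(length(x))

# seq_along(x) yields 1, 2, ..., length(x) and is safe for empty vectors.
for (i in seq_along(x)) {
  output[i] <- x[i]^2
}

output  # 1 4 9 16
```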
purrr
Often, you can replace loops with a compact call to a function in the purrr package.
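As a sketch, squaring every entry of a made-up vector `x` takes a single `map_dbl()` call, with no pre-allocation or index bookkeeping:

```r
library(purrr)

# Made-up numeric vector.
x <- c(1, 2, 3, 4)

# Apply the squaring function to each entry and collect a numeric vector.
squared <- map_dbl(x, function(v) v^2)

squared  # 1 4 9 16
```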
There are many map_*() functions, a few of which we will highlight here:
map(): applies a function to each element in a list or vector, returns a list
map_dbl(): applies a function to each element in a list or vector, returns a numeric (double) vector
map_dfr(): applies a function to each element in a list or vector, returns a data frame built by binding the results together row by row
The first argument of each function specifies the list/vector we want to iterate over, and the second argument specifies a function that we want to apply to each entry.
Let’s use these map functions on a list containing the ages of some made-up people.
We will compare the outputs of these functions when applying a simple square function to their ages.
map()
A list is returned!
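A sketch with `map()`, assuming a hypothetical named list `ages` and a helper `square()`; neither name comes from the lecture.

```r
library(purrr)

# Hypothetical ages of made-up people.
ages <- list(alice = 25, bob = 40, carol = 31)

# Simple squaring function.
square <- function(x) x^2

# map() always returns a list with one element per input, keeping the names.
ages_list <- map(ages, square)

str(ages_list)
```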
map_dbl()
A vector is returned!
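The same idea with `map_dbl()`, again assuming a hypothetical `ages` list and `square()` helper. The function applied must return a single number for each element.

```r
library(purrr)

# Hypothetical ages of made-up people.
ages <- list(alice = 25, bob = 40, carol = 31)

square <- function(x) x^2

# map_dbl() simplifies the results into a named numeric vector.
ages_vec <- map_dbl(ages, square)

ages_vec
is.list(ages_vec)  # FALSE
```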
map_dfr()
A data frame is returned!
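A sketch with `map_dfr()`, assuming a hypothetical `ages` list. Here the applied function returns a one-row tibble so that the results can be stacked; the column names are made up. Note that purrr 1.0.0 marks `map_dfr()` as superseded in favour of `map()` followed by `list_rbind()`, although it still works.

```r
library(purrr)
library(tibble)

# Hypothetical ages of made-up people.
ages <- list(alice = 25, bob = 40, carol = 31)

# Each call returns a one-row tibble; map_dfr() binds the rows together.
# .id stores the list names in a column called "person".
ages_df <- map_dfr(
  ages,
  function(x) tibble(age = x, age_squared = x^2),
  .id = "person"
)

ages_df
```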
purrr Shortcuts
Here’s an example using purrr::map_dbl() and a custom function:
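A sketch of that pattern, with a hypothetical custom function `add_ten()` defined ahead of time and then passed by name:

```r
# Hypothetical custom function, defined separately from the map call.
add_ten <- function(x) {
  x + 10
}

# Pass the function by name as the second argument.
result <- purrr::map_dbl(c(1, 2, 3), add_ten)

result  # 11 12 13
```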
Here are two examples of “shortcuts”:
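Two common ways to define the function on the fly are sketched below. Which shortcuts the lecture originally showed is an assumption; both forms here are standard purrr usage, and the added constant is made up.

```r
library(purrr)

x <- c(1, 2, 3)

# Shortcut 1: an anonymous function written inline.
via_anonymous <- map_dbl(x, function(v) v + 10)

# Shortcut 2: purrr's formula shorthand, where .x stands for the current element.
via_formula <- map_dbl(x, ~ .x + 10)

identical(via_anonymous, via_formula)  # TRUE
```

Both calls avoid naming a helper that is only used once.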
Consider the following example: a snippet of the Game of Thrones data from An API of Ice and Fire.
## # A tibble: 6 × 3
##   name              gender titles
##   <chr>             <chr>  <list>
## 1 Theon Greyjoy     Male   <chr [2]>
## 2 Tyrion Lannister  Male   <chr [2]>
## 3 Victarion Greyjoy Male   <chr [2]>
## 4 Will              Male   <chr [1]>
## 5 Areo Hotah        Male   <chr [1]>
## 6 Chett             Male   <chr [1]>
Some characters have one title (e.g., Will); others have more than one title (e.g., Theon Greyjoy).
The titles column is a list column: each entry is a character vector that can hold as many or as few strings as we like.
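A minimal sketch of building and operating on a list column. The tibble `got` is hand-made for illustration, and the title strings are placeholders rather than real data; only the number of titles per character follows the printed output above.

```r
library(tibble)
library(dplyr)
library(purrr)

# Hand-made tibble; the title strings are placeholders, not real data.
got <- tibble(
  name = c("Theon Greyjoy", "Will", "Chett"),
  titles = list(
    c("placeholder title 1", "placeholder title 2"),
    "placeholder title 3",
    "placeholder title 4"
  )
)

# map_int() applies length() to each entry of the list column.
got <- mutate(got, n_titles = map_int(titles, length))

got
```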
nest()
Instead of having one row per child, perhaps we want to nest the data such that there is only one row per family.
We can nest the child column!
Aside: in general, data nested like this is not tidy; we use it here only to demonstrate how to nest and un-nest data.
We first need to group by family_id to gather children from each family, and then create a nested column called “children”:
Now we have a list column that contains vectors of varying lengths with the children’s names within each family.
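A sketch of the nesting step. The tibble `families` and the child names are made up; only the column names `family_id` and `child` follow the text. The call `nest(children = child)` is one way to do this and is an assumption about the original code. With it, each entry of `children` is a small one-column tibble holding that family's children.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Made-up data: one row per child.
families <- tibble(
  family_id = c(1, 1, 2, 2, 2),
  child = c("child_a", "child_b", "child_c", "child_d", "child_e")
)

# Group by family, then pack the child column into a list column.
nested <- families %>%
  group_by(family_id) %>%
  nest(children = child) %>%
  ungroup()

nested

# Number of children per family.
sapply(nested$children, nrow)  # 2 3
```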
unnest()
We can recover the original tibble using unnest().
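A self-contained sketch of the round trip, assuming a made-up tibble `families` with columns `family_id` and `child` that is nested into a `children` column and then un-nested again:

```r
library(tibble)
library(dplyr)
library(tidyr)

# Made-up data: one row per child.
families <- tibble(
  family_id = c(1, 1, 2, 2, 2),
  child = c("child_a", "child_b", "child_c", "child_d", "child_e")
)

# Nest: one row per family, with children packed into a list column.
nested <- families %>%
  group_by(family_id) %>%
  nest(children = child) %>%
  ungroup()

# Unnest: expand the list column back to one row per child.
restored <- nested %>%
  unnest(children)

restored
```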
You’ll dive deeper into these topics in Worksheet B3.
Take the rest of today’s class and Thursday’s class to work through the worksheet and ask for help if needed.