Lecture 7A: Reading and Writing Data
October 14, 2025
Learning Outcomes
From today’s class, students are anticipated to be able to:
- Read and write a delimited file, like a csv, from R using the readr package.
- Make relative paths using the here::here() function.
- Read data from a spreadsheet.
- Read and write R binary files (rds files) from R.
Lecture Slides
Set-up
Required packages:
library(tidyverse)
library(readr)
library(here)
Data Formats
Data has to be stored somewhere. When saving data locally, common file formats include:
Spreadsheets: Excel (.xlsx), Google Sheets (.gsheet)
Delimited files: Plaintext files containing data, e.g., text files (.txt), comma separated values (.csv), tab separated values (.tsv)
R binary: A serialization of an R object to a binary file (.rds). Basically, that means that it can be loaded in and out of R, but it can’t be opened by anything but R.
CSVs are the most “one-size-fits-all”: you can open them in spreadsheet software, but they are also plaintext under the hood, meaning they are lightweight (don’t take a lot of storage) and can be opened in any text editor.
Spreadsheets are nice for human interaction (like through Excel), but can be clunky in R and often use more memory to store due to their extra features.
R binary can be useful for storing results that you don’t want to rerun in R, but it is not as useful for storing raw data. The R binary data type is quite restrictive and we don’t tend to store data this way. Our lecture will focus on CSVs.
Comma Separated Values (CSVs)
Jenny Bryan’s website has a fabulous section on reading and writing files in R. We’re going to summarize a few of the important functions here, but if you’d like to learn more then check out that website for more in-depth explorations!
We will start by talking about how to read and write Comma Separated Value files. CSVs are often used to store data. When the penguins data set is stored as a .csv, the first few entries look like this when opened as a text file (see for yourself here):
species,island,bill_len,bill_dep,flipper_len,body_mass,sex,year
Adelie,Torgersen,39.1,18.7,181,3750,male,2007
Adelie,Torgersen,39.5,17.4,186,3800,female,2007
Adelie,Torgersen,40.3,18,195,3250,female,2007
Adelie,Torgersen,NA,NA,NA,NA,NA,2007
Adelie,Torgersen,36.7,19.3,193,3450,female,2007
Adelie,Torgersen,39.3,20.6,190,3650,male,2007
Adelie,Torgersen,38.9,17.8,181,3625,female,2007
Adelie,Torgersen,39.2,19.6,195,4675,male,2007
Now, this isn’t exactly easy for humans to read, but saving data as CSVs has its advantages. The data is stored in a simple form (lightweight - files aren’t large) that has broad compatibility and can be used in a wide range of applications. And of course, we can use functions in R to make it more readable. A few main functions of note, which are from the readr package, are:
- read_csv(): tidyverse equivalent of read.csv(), used to read from a CSV to a tibble
- write_csv(): tidyverse equivalent of write.csv(), used to export a tibble into CSV format
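As a minimal sketch of how these two functions fit together, the example below writes a small tibble to a CSV, prints the raw text of the file, and reads it back. The `toy` tibble and the temporary file path are made up for illustration and are not part of the lecture data.

```r
library(readr)

# A tiny made-up tibble, used only for illustration
toy <- tibble::tibble(
  species = c("Adelie", "Gentoo"),
  body_mass = c(3750, 5000)
)

# Write to a temporary file so the example leaves nothing behind in the project
csv_path <- tempfile(fileext = ".csv")
write_csv(toy, csv_path)

# A CSV is plain text: one header line, then one line per row
raw_lines <- readLines(csv_path)
print(raw_lines)

# Read the file back into a tibble
toy_again <- read_csv(csv_path, show_col_types = FALSE)
print(toy_again)
```

Printing `raw_lines` is a quick way to confirm that the file on disk is just comma-separated text.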
Let’s assume that a file called penguins.csv is saved in the same folder as our code. We can read in, and save the tibble as a variable called penguins using:
penguins <- read_csv("penguins.csv")
Rows: 344 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): species, island, sex
dbl (5): bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, year
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(penguins)
# A tibble: 6 × 8
  species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
  <chr>   <chr>              <dbl>         <dbl>             <dbl>       <dbl>
1 Adelie  Torgersen           39.1          18.7               181        3750
2 Adelie  Torgersen           39.5          17.4               186        3800
3 Adelie  Torgersen           40.3          18                 195        3250
4 Adelie  Torgersen           NA            NA                  NA          NA
5 Adelie  Torgersen           36.7          19.3               193        3450
6 Adelie  Torgersen           39.3          20.6               190        3650
# ℹ 2 more variables: sex <chr>, year <dbl>
Pretty easy! Note that the file path needs to be a string, given relative to your current working directory (often the folder where the R script you’re working on is saved). You can always call getwd() to see which directory you’re currently working in, and we’ll show more tools for dealing with directories later in this lecture.
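The message printed by read_csv() above suggests either setting `show_col_types = FALSE` or specifying the column types. Here is a rough sketch of both options. The three-row CSV and its column names are invented for this example, and the chosen types are just one reasonable possibility.

```r
library(readr)

# Made-up CSV text written to a temporary file, for illustration only
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "species,year,body_mass",
  "Adelie,2007,3750",
  "Gentoo,2008,NA"
), csv_path)

# Option 1: let readr guess the column types, but silence the message
guessed <- read_csv(csv_path, show_col_types = FALSE)

# Option 2: state the column types explicitly
typed <- read_csv(
  csv_path,
  col_types = cols(
    species = col_factor(),
    year = col_integer(),
    body_mass = col_double()
  )
)

# Compare the resulting column classes
print(sapply(guessed, class))
print(sapply(typed, class))
```

Stating the types explicitly means the result does not depend on what readr happens to guess from the first rows of the file.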
We can also manipulate the data, and save the output as a new CSV. For example,
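penguins_2007 <- penguins %>%
  filter(year == 2007) # keep only the rows from 2007
write_csv(penguins_2007, "penguins_2007.csv") # save new data as penguins_2007.csv
Want to read from an Excel file? The readxl package, which is installed with the tidyverse but loaded separately with library(readxl), is for you! Note that readxl only reads Excel files; writing one requires a different package, such as writexl.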
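As a small sketch of reading a spreadsheet, the example below uses a sample workbook that ships with readxl, so no Excel file of your own is needed. It assumes readxl is installed; the workbook is readxl's example data, not the penguins data.

```r
library(readxl)

# readxl_example() returns the path to a sample workbook bundled with readxl
xlsx_path <- readxl_example("datasets.xlsx")

# List the sheets in the workbook
sheets <- excel_sheets(xlsx_path)
print(sheets)

# Read the first sheet into a tibble
first_sheet <- read_excel(xlsx_path, sheet = sheets[1])
head(first_sheet)
```

For your own data, you would replace `xlsx_path` with the path to your .xlsx file.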
For the more niche option of R binary files, readr provides read_rds() and write_rds().
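To illustrate what an rds file offers compared with a CSV, here is a minimal sketch that saves the same made-up tibble both ways and reads it back. The `toy` tibble is invented for this example.

```r
library(readr)

# A made-up tibble with a factor column, for illustration only
toy <- tibble::tibble(
  sex = factor(c("male", "female", "male"), levels = c("female", "male")),
  body_mass = c(3750, 3800, 3250)
)

rds_path <- tempfile(fileext = ".rds")
csv_path <- tempfile(fileext = ".csv")

# Save the same object in both formats
write_rds(toy, rds_path)
write_csv(toy, csv_path)

from_rds <- read_rds(rds_path)
from_csv <- read_csv(csv_path, show_col_types = FALSE)

class(from_rds$sex) # the rds file keeps the factor
class(from_csv$sex) # the CSV comes back as plain character
```

The rds file stores the R object itself, so details such as factor levels survive the round trip, whereas a CSV only stores text.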
File Locations and Paths
In the previous example, we saved and read in data that was stored in the same folder. However, we will often want to read from or write to other locations, including sub-folders in our project.
To do so, we need to specify where we are reading/writing our data from/to.
Absolute Paths
Absolute paths begin at the root of your computer: on Mac and Linux they start with “/”, while on Windows they start with a drive letter such as “C:\”. An absolute path is a looooong set of “directions” that tell you where the file is located.
I could always read in my penguins dataset using an absolute file path, which begins at the root of my computer. Consider the following file structure:

The absolute path to the penguins.csv data set is /Users/grace/documents/STAT545/Lec7A/datasets/penguins.csv. Note that the “/” at the beginning of the string indicates that you start at the root of your computer. This will work to load in the data. However, it is not best practice in terms of reproducibility. If I moved my project folder anywhere else on my computer, or sent this code to someone else to read in the data, this long file path string would have to be updated.
Because I wrote this on a Mac, the slashes are forward “/”. Windows users usually see file paths written with back slashes “\”, although R also accepts forward slashes on Windows.
Later in the lecture, we discuss the here::here() function which solves this problem completely.
Relative Paths
The best practice is to use a relative path. This helps with reproducibility and automation!
Instead of starting at the root of your computer, you can give directions to the file you want to load in relative to the working directory (i.e., where you are now).

If we are working in the Lec7A directory on mycode.R, all we need to do in order to access penguins.csv is go into the datasets folder (which is in our working directory) and load it in! The relative file path is datasets/penguins.csv (note there is no back or forward slash at the beginning of the file path). This means that if I move my Lec7A folder, or share it with someone else, anyone can load in the data with this line of code (well, almost... so long as they have the same operating system!)
If you’re having trouble visualizing the working directory, you could consider the folders nested this way as well:

Some useful tips for relative paths:
- They do not start with a slash.
- . represents the current directory.
- .. means go up one folder from the current directory (open the parent folder).
  - e.g., to go to the thesis folder if my current working directory is Lec7A, the path is ../../thesis (leave the Lec7A folder to go to the STAT545 folder, then leave the STAT545 folder to go to documents, then go into thesis).
- You can call getwd() in R to confirm where your working directory is (it will show the absolute file path as the output).
- In R projects, by default your working directory is your R project folder.
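The tips above can be tried out with a small sketch in base R. It recreates a throwaway copy of the documents/STAT545/Lec7A layout inside a temporary directory (an assumption made so that the example runs anywhere) and checks that climbing two levels with .. really lands in the thesis folder.

```r
# Build a throwaway copy of the folder layout under a temporary directory
root <- file.path(tempdir(), "documents")
lec7a <- file.path(root, "STAT545", "Lec7A")
dir.create(file.path(lec7a, "datasets"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "thesis"), recursive = TRUE, showWarnings = FALSE)

# file.path() joins the pieces with "/", which R accepts on every operating system
rel_to_data <- file.path("datasets", "penguins.csv")
print(rel_to_data)

# ".." climbs to the parent folder: Lec7A -> STAT545 -> documents, then into thesis
via_dots <- normalizePath(file.path(lec7a, "..", "..", "thesis"))
direct <- normalizePath(file.path(root, "thesis"))
print(identical(via_dots, direct))
```

Here normalizePath() turns each path into its absolute form, which makes it easy to check that two different sets of directions lead to the same folder.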
The here Package
As we stated before, things can get frustrating when sharing files between operating systems. Even with relative paths, we may need to manually replace forward and back slashes when switching between Mac and Windows operating systems.
Thankfully, there is a package that allows us to use relative paths without specifying a filepath string that is operating system dependent. Let’s (install, if necessary, and) load the here package:
# install.packages("here")
library(here)
Now, let’s call here():
here::here()
[1] "/Users/gracetompkins/Desktop/STAT545.github.io"
Side note: we explicitly call here() from the here package using here:: because other packages (plyr, for example) also have a function named here().
I get a long chain of folders where this R Project (which I used to build this website) is stored. The cool thing about here is that I can specify a file path relative to my project root (the above location) without using any operating system-specific strings.
For example, the penguins.csv data set is located in webpages > lectures_i > datasets within my R project folder. I can access it by:
penguins <- read_csv(here("webpages", "lectures_i", "datasets", "penguins.csv"))
Rows: 344 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): species, island, sex
dbl (5): bill_len, bill_dep, flipper_len, body_mass, year
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(penguins) # view first few entries of the tibble
# A tibble: 6 × 8
  species island    bill_len bill_dep flipper_len body_mass sex     year
  <chr>   <chr>        <dbl>    <dbl>       <dbl>     <dbl> <chr>  <dbl>
1 Adelie  Torgersen     39.1     18.7         181      3750 male    2007
2 Adelie  Torgersen     39.5     17.4         186      3800 female  2007
3 Adelie  Torgersen     40.3     18           195      3250 female  2007
4 Adelie  Torgersen     NA       NA            NA        NA <NA>    2007
5 Adelie  Torgersen     36.7     19.3         193      3450 female  2007
6 Adelie  Torgersen     39.3     20.6         190      3650 male    2007
This is reproducible!
Some notes:
- By default in an R project, here::here() will be the project folder.
- I don’t think you can go outside of your root folder for the R project, unless you re-initialize the root somehow using here::i_am().
- Using here() does not change the working directory. However, we recommend against using setwd() and similar functions to play around with directories in R projects. This again affects reproducibility.
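As a hedged sketch of how here() might be used defensively, the example below builds the same path as in the lecture and only reads the file if it exists. The folder names come from the example above and will not exist in your own project, so expect the "No file found" message unless you adapt them. To my understanding, here falls back to the working directory when it cannot detect a project root.

```r
library(here)

# The project root that here has detected
root <- here()
print(root)

# Build a path from its components; no slashes are typed by hand
data_path <- here("webpages", "lectures_i", "datasets", "penguins.csv")
print(data_path)

# Check before reading, so a missing file gives a clear message
if (file.exists(data_path)) {
  penguins <- readr::read_csv(data_path, show_col_types = FALSE)
} else {
  message("No file found at: ", data_path)
}
```

Because the path is built from the project root rather than the working directory, the same line works no matter which sub-folder the script is run from.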
Resources
- Video lecture: Reading and Writing Data
- The “Writing and Reading files” chapter of stat545.com.