Lecture 2B: Version Control
September 11, 2025
Learning Objectives
From today’s topic, students are anticipated to be able to:
create an R Project,
use git on their own computer (locally),
connect a local git repository to the corresponding repository on GitHub, using RStudio,
make commits in git using RStudio,
make a branch in git using RStudio or GitHub, and
use collaborative GitHub features such as Issues and pull requests.
After this class, you should be able to start working on your Collaborative Project.
Video Lecture
Get Acquainted with Git and GitHub
“Git” is a version control system designed to track changes. This is what GitHub uses! Git essentially creates “snapshots” of your work as you progress through a project.
Repositories, Organizations, and Personal Accounts
A repository stores files and the history of the files; the usual convention is to use a single repository to organize a single project.
GitHub is a place where repositories can live online. Being online provides us a way to share and collaborate on projects. It also serves as a backup for your project.
Example 1: Grace’s GitHub page
Example 2: The STAT 545 GitHub page
The first repository lives under Grace’s personal GitHub account. The second repository lives under the UBC-STAT organization. Organizations are useful if you have lots of different projects with a common theme that lots of people are collaborating on.
We can interact with and send data between GitHub and RStudio using Git.
Git States and Lingo
To use Git, we should first understand the primary states that files can be in. These are:
- Modified: you’ve changed something in the files, but you haven’t yet told Git to include those changes in the next snapshot.
- Staged: you’ve marked the changes to go into the next snapshot of your work, but they haven’t been committed yet.
- Committed: the changes you’ve made have been stored and updated as the latest snapshot of your project.
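To make the three states concrete, here is a small Terminal sketch that walks one file through them. The throwaway directory, the user name, and the email address are placeholders chosen for this illustration, not part of the course setup.

```shell
# Create a throwaway repository (directory and identity are illustrative).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"
git config user.email "student@example.com"

# Commit a first version so there is a snapshot to compare against.
echo 'head(mtcars)' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Modified: the file differs from the last snapshot but is not staged.
echo 'mean(mtcars$mpg)' >> analysis.R
modified_state="$(git status --short)"
echo "modified:  $modified_state"

# Staged: the change is marked for inclusion in the next commit.
git add analysis.R
staged_state="$(git status --short)"
echo "staged:    $staged_state"

# Committed: the change is stored in a new snapshot; nothing is pending.
git commit -q -m "Compute mean mpg"
committed_state="$(git status --short)"
echo "committed: ${committed_state:-nothing to commit}"
```

In the short status format, the position of the `M` tells you the state: in the second column the change is only modified, and in the first column it is staged.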
After changes are committed locally (on your computer), we can push them to a GitHub repository. This sends the changes and updates the most recent version of your project on GitHub. We can also pull changes from GitHub to your local machine. This is useful when multiple people (or perhaps one person on multiple devices) are working on a project.
Other useful Git features include:
- Cloning a repository: copying an existing GitHub repository to your local machine. You can push and pull changes after cloning.
- Using Branches: creating a separate line of development within your repository, allowing for isolated work on new features. This is a safe way to ensure you aren’t overwriting the main repository branch. Branches can be merged together later in development.
- Pull Requests: GitHub’s way of proposing that changes on one branch be merged into another, usually the repository’s main branch.
R Projects
While Git and GitHub can be useful for more than just R, we will demonstrate how Git and GitHub can be used within an R Project.
R Projects are useful for organizing files and resources specific to a project in a self-contained directory. R Projects allow for more reproducible workflows, and allow users to resume exactly where they left off within a project by keeping variables in the environment even after the session has closed. They also allow us to integrate version control with Git.
Create an R Project
Open RStudio.
Go to “File” > “New Project…”.
Select “New Directory”, “New Project”, and enter a name for your project. For our example, I’ll be creating “STAT545_Example_Project”. Select “Browse” and find a location where you’d like to save your project (a new folder with the same name as your project will be created). I chose the Desktop as an example. Ensure “Create a Git Repository” is selected. Click “Create Project”.
RStudio will refresh. You will now see at the top of RStudio that the name of your project is displayed.
To add a new R Script to your R Project, go to “File” > “New File” > “R Script”. Write your code, then go to “File” > “Save As” to name your file, ensuring it’s saved in your project folder. Let’s add one called analysis.R with some simple code:
# view the first few rows of mtcars dataset
head(mtcars)
# calculate the average mpg among all cars, save it as a variable
mean_mpg <- mean(mtcars$mpg)
# print the mean mpg
mean_mpg
After running this code, you’ll notice mean_mpg is in your environment. The cool thing about R Projects is that the next time you open your project, you’ll continue exactly where you left off! As long as you save the workspace when you close RStudio, all variables you created will have been saved, as if you never closed RStudio.
You can add multiple files to projects, and save things like images/pdfs of plots, multiple R Scripts (code), and data sets. Really, anything that you can save in the R Project folder will be contained in the R Project.
Create a GitHub Repository for your Existing (Local) R Project
Syncing your R Project to GitHub is a smart idea! You can work on the project on multiple machines, back up your work, see previous versions of your work, and collaborate easily with others.
In this lecture, we will assume an R Project already exists and we want to create a new GitHub repository for it. Previously, we created an R Project “STAT545_Example_Project” with the R Script analysis.R. Let’s sync this project to GitHub!
In the R Console, load the usethis package by calling library(usethis).
In the R Console, call use_git(). When asked if it’s okay to commit some changes, select an option that means yes (it could be “OK” or “For sure” or “Yes”, for example) by writing the number in the console and hitting Enter. If it prompts you to restart RStudio, you can do so.
If you restarted RStudio, call library(usethis) again. Then, in the Console, call use_github(). A new repository on your GitHub account should be created! You should see your code analysis.R as one of the items in your repository.
If you have issues, see the last section of Lecture 1A: Installation to set up your GitHub credentials on RStudio.
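If you are curious what this kind of setup looks like in plain Git, the sketch below does a comparable job by hand in the Terminal. It is not what usethis runs internally. To keep the example runnable offline, a local bare repository (an assumption of this sketch) plays the role of GitHub, and the identity values are placeholders; with a real GitHub repository, the remote would be its HTTPS address.

```shell
# A local bare repository stands in for GitHub (assumption for this sketch).
workdir="$(mktemp -d)"
remote="$workdir/STAT545_Example_Project.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# The existing project folder with one script in it.
project="$workdir/STAT545_Example_Project"
mkdir "$project"
cd "$project"
echo 'head(mtcars)' > analysis.R

# Turn the folder into a Git repository whose first branch is called main.
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Student"
git config user.email "student@example.com"

# Take the first snapshot of the project.
git add .
git commit -q -m "Initial commit"

# Connect the local repository to the remote and push the main branch.
git remote add origin "$remote"
git push -q -u origin main

# List the files the remote now holds on main.
remote_files="$(git -C "$remote" ls-tree --name-only main)"
echo "$remote_files"
```

The `-u` flag records `origin/main` as the upstream of the local `main` branch, so later pushes and pulls know where to go by default.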
Clone a GitHub Repository (GitHub Repository Already Exists Remotely)
What if you want to use an existing R Project that is on GitHub in a repository? We can clone it!
In RStudio, go to “File > New Project”
Click on “Version Control: Checkout a project from a version control repository”
Click on “Git: Clone a project from a repository”
Fill in the info:
- URL: use the HTTPS address (found on GitHub when clicking the green “Code” button).
- Create as a subdirectory of: browse to where you would like to save this repository on your machine (a new folder is created in the location you select).
Now you have a local version of this project! You can interact with it using the instructions in the following section.
Side note: you can clone any public project; however, you need to be added as a collaborator to push your changes to it on GitHub. If you’d like to play around with an existing repository, you could “Fork” it. A fork is a copy of a repository. Forking a repository allows you to freely experiment with changes without affecting the original project. More info here: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo.
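The same cloning step can be done from the Terminal with git clone. The sketch below first builds a tiny repository to clone from. That repository is a local bare stand-in for GitHub (an assumption made so the example works offline), and the folder name my_copy is made up for the illustration; with GitHub you would pass the HTTPS address instead.

```shell
# Build a small "remote" to clone from (local stand-in for GitHub).
workdir="$(mktemp -d)"
remote="$workdir/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# Seed the remote with one commit containing a README.
seed="$workdir/seed"
git init -q "$seed"
cd "$seed"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Student"
git config user.email "student@example.com"
echo '## STAT545 Example Project' > README.md
git add README.md
git commit -q -m "Add README"
git push -q "$remote" main

# Clone: copy the repository, including its history, into a new folder.
cd "$workdir"
git clone -q "$remote" my_copy
cd my_copy

# The clone has the files, the history, and a remote called "origin".
cloned_files="$(ls)"
commit_count="$(git rev-list --count HEAD)"
origin_url="$(git remote get-url origin)"
echo "files: $cloned_files, commits: $commit_count, origin: $origin_url"
```

Because the clone remembers where it came from as `origin`, pushing and pulling afterwards need no extra setup.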
Syncing Your R Project with GitHub
You’ve created a repository for your R Project - but what happens when we edit our project on our computer? Well, we need to send those changes to our GitHub repository (it’s not automatic).
Let’s make some changes to our STAT545_Example_Project! Adding a README file could be a nice addition. Whatever is written in this file will appear on the home page of our repository and can provide useful information about our repository.
Let’s create a Markdown (NOT RMARKDOWN!) document called “README.md” (“File” > “New File” > “Markdown File”). Set the title to “README”.
Let’s add the following to the file:
## STAT545 Example Project
This is my first R Project, synced to GitHub! It follows the lecture notes at <https://ubc-stat.github.io/STAT545.github.io/webpages/lectures_i/lec2b_versioncontrol.html>
Save this file as “README.md”.
Let’s also change something in our analysis.R
file. Let’s add the following to our file:
# plot wt vs mpg
plot(mtcars$wt, mtcars$mpg)
Save the changes.
Now, there are two ways that I’ll show you how to interact with GitHub through RStudio.
Using RStudio’s “Git” Tab
Using the Terminal
(Secret third option: using GitHub Desktop. I won’t show this in detail but it is an option!)
From RStudio’s “Git” Tab
When you navigate to the “Git” tab in RStudio, you will see that README.md and analysis.R are listed. This is because these files have changes! README.md was added, and analysis.R was modified.
We need to stage the files, commit the changes, and then push the changes to the remote GitHub version.
- Stage Files: Select the files you want to stage by checking the boxes beside them in the “Staged” column. You will see symbols beside them - “A” stands for “Added” and “M” stands for “Modified”. If you were to delete a file, you would see a red “D”.
- Commit Changes: Click the Commit button. In the pop-up, you can scan through and see the changes (additions, deletions, modifications) for each file that has been staged in Step 1. Add a message explaining the changes, and click Commit. You will get a message that says 2 files have changed. Close this.
- Push: You now have nothing to commit, as all changes have been committed. You will see an information message saying “Your branch is ahead of ‘origin/main’ by 1 commit.” Time to Push! Click the Push button. If successful, the following message will appear:
>>> /usr/bin/git push origin HEAD:refs/heads/main
To https://github.com/<yourGitHubusername>/STAT545_Example_Project.git
37b4ec8..7b32df4 HEAD -> main
Close this pop-up, and the “Review Changes” pop-up.
Now, when you go to your GitHub account, you should see some changes to your STAT545_Example_Project repository! You can see when the changes were pushed, and now a README appears below the files!
We can also Pull from the repository using the “Pull” button on the Git pane in RStudio. This allows us to sync changes that are in our GitHub repository but not on our local machine (yet!).
From RStudio’s Terminal
Instead of using the Git pane, we can write out commands for Git in the Terminal.
Stage Changes: git add . stages all of your new and modified files (you can also name just the files you want to stage instead of using .)
Commit Changes: git commit -m "some message" commits your staged changes, along with a message summarizing the changes
Push Changes: git push origin main pushes your commits to the GitHub repo (main branch, which is typically the default)
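Put together, the three commands look like this in practice. The sketch is self-contained, so most of it is setup: a local bare repository stands in for GitHub (an assumption so that it runs offline), and the identity values and commit messages are placeholders.

```shell
# Setup: a local bare repository plays the role of GitHub in this sketch.
workdir="$(mktemp -d)"
remote="$workdir/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# Setup: a project with one pushed commit.
git init -q "$workdir/project"
cd "$workdir/project"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Student"
git config user.email "student@example.com"
git remote add origin "$remote"
echo 'head(mtcars)' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
git push -q origin main

# New work: add a README and extend the script.
echo '## STAT545 Example Project' > README.md
echo 'plot(mtcars$wt, mtcars$mpg)' >> analysis.R

# Stage everything, commit with a message, and push to main.
git add .
git commit -q -m "Add README and plot wt vs mpg"
git push -q origin main

# After the push, the local and remote main branches point to the same commit.
local_head="$(git rev-parse HEAD)"
remote_head="$(git -C "$remote" rev-parse main)"
pending="$(git status --short)"
echo "local:  $local_head"
echo "remote: $remote_head"
```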
Other useful Terminal commands include:
git status: see which of your changes are untracked, modified, or staged, and whether your branch is ahead of or behind the remote
git branch: see what branch you’re on
git checkout <branch-name>: change branches
git pull: pull changes from the remote repository (GitHub) to your local computer
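Here is a sketch of checking where you are and then pulling, using the "one person on multiple devices" situation. The two folders named desktop and laptop, the local bare repository standing in for GitHub, and the identity values are all assumptions made for the illustration.

```shell
# Identity for every commit in this sketch (placeholder values).
export GIT_AUTHOR_NAME="Example Student" GIT_AUTHOR_EMAIL="student@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# A local bare repository stands in for GitHub.
workdir="$(mktemp -d)"
remote="$workdir/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# The desktop copy creates the project and pushes it.
git init -q "$workdir/desktop"
cd "$workdir/desktop"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$remote"
echo 'head(mtcars)' > analysis.R
git add . && git commit -q -m "Add analysis script" && git push -q origin main

# The laptop copy is cloned at this point.
git clone -q "$remote" "$workdir/laptop"

# Later, the desktop pushes one more commit.
echo 'mean(mtcars$mpg)' >> analysis.R
git commit -q -am "Compute mean mpg"
git push -q origin main

# On the laptop: check the branch and status, then pull the new commit.
cd "$workdir/laptop"
current_branch="$(git branch --show-current)"
git status --short --branch
git pull -q origin main
laptop_lines="$(wc -l < analysis.R | tr -d ' ')"
echo "on branch $current_branch, analysis.R now has $laptop_lines lines"
```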
More on Git
Branches
Branches allow for parallel development within a repository. Branches are also a safe way to test out new features, fix bugs, or experiment with new ideas.
When creating a repository, a default branch is created. This is typically named “main” (or in older repositories, “master”). The default branch is the branch that GitHub displays when anyone visits your repository. The default branch is also the initial branch that Git checks out locally when someone clones the repository. Unless you specify a different branch, the default branch in a repository is the base branch for new pull requests and code commits.
To create a new branch, navigate to the Git pane and click the icon with two purple rectangles connected to a white diamond. Here, you can create a new branch and name it. Then, you can switch between branches using the drop-down menu beside it.
You can make changes on your new branch, and push them to GitHub.
- NOTE: if using Terminal, you can create a new branch (and switch to it) using git checkout -b <new-branch-name>, and switch between existing branches using git checkout <branch-name>. You can push changes to the new branch using git push origin <new-branch-name>, where <new-branch-name> is the name of your branch.
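As a sketch of the Terminal route, the commands below create a branch, commit on it, push it, and then switch back and forth to show that main is untouched. The branch name add-plot is made up for the example, and a local bare repository stands in for GitHub so that it runs offline.

```shell
# Setup: a local bare repository plays the role of GitHub.
workdir="$(mktemp -d)"
remote="$workdir/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# Setup: a project with one commit pushed to main.
git init -q "$workdir/project"
cd "$workdir/project"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Student"
git config user.email "student@example.com"
git remote add origin "$remote"
echo 'head(mtcars)' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
git push -q origin main

# Create a new branch and switch to it in one step.
git checkout -q -b add-plot

# Work on the branch, then push the branch to the remote.
echo 'plot(mtcars$wt, mtcars$mpg)' >> analysis.R
git commit -q -am "Plot wt vs mpg"
git push -q origin add-plot

# Switching branches swaps the files on disk to that branch's version.
git checkout -q main
lines_on_main="$(wc -l < analysis.R | tr -d ' ')"
git checkout -q add-plot
lines_on_branch="$(wc -l < analysis.R | tr -d ' ')"
echo "main: $lines_on_main line(s), add-plot: $lines_on_branch line(s)"
```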
After pushing changes to a branch, you will see a message on GitHub.
If this message does not appear, click “main” and switch the branch to the new one you just made. You will see a message that says “this branch is one commit ahead of main”. Click that message.
Click “Compare and pull request”. Here, you can see the changes between the main and new branch.
If you’re satisfied with the changes, you can click “Create pull request” to begin merging the changes from your new branch into main.
If no conflicts exist, click “Merge pull request”. Now, the changes will be on the main (default) branch!
Click “Confirm merge”.
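Pull requests themselves live on GitHub, but the merge they end with has a local counterpart in git merge. The sketch below is only that local analogue under made-up names (the add-plot branch and the commit messages are illustrative); it is not a description of what GitHub runs when you click the button.

```shell
# A throwaway repository with a main branch (identity values are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Student"
git config user.email "student@example.com"
echo 'head(mtcars)' > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"

# Do some work on a separate branch.
git checkout -q -b add-plot
echo 'plot(mtcars$wt, mtcars$mpg)' >> analysis.R
git commit -q -am "Plot wt vs mpg"
branch_tip="$(git rev-parse add-plot)"

# Switch back to main and merge the branch, recording a merge commit.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'add-plot'" add-plot

# main now contains the branch's work; the merge commit's second parent
# is the tip of the merged branch.
merged_lines="$(wc -l < analysis.R | tr -d ' ')"
second_parent="$(git rev-parse HEAD^2)"
git --no-pager log --oneline --graph
```

The `--no-ff` flag forces a merge commit even when Git could simply move main forward, which keeps a visible record of where the branch joined.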
Merging Conflicts
Merge conflicts happen when we’ve created multiple versions of files that can’t be obviously combined into one definitive version.
Here is an example of something that would not cause a merge conflict:
At 9am, my TA pulls from the STAT 545 repository, makes a local change to the course dashboard, and commits and pushes her changes.
At 10am, I forget to pull from the STAT 545 repository, and start working on the Day 1 notes locally.
When I commit and push, Git rejects the push at first, because I wasn’t working off of the “freshest” version of the STAT 545 repository. But once I pull, since my TA and I were working on different files, Git will fairly seamlessly figure out that the right thing to do is to combine her changes with my changes to the Day 1 notes, and I can then push the result to the STAT 545 repository online.
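This scenario can be replayed in the Terminal. In the sketch below, the folders ta and me, the file names, and the local bare repository standing in for GitHub are all assumptions for illustration. Note that the first push is refused until the remote's newer commit has been pulled; the pull then merges the two sets of changes without any manual work because they touch different files.

```shell
# Identity for every commit in this sketch (placeholder values).
export GIT_AUTHOR_NAME="Example Person" GIT_AUTHOR_EMAIL="person@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# A local bare repository stands in for the shared GitHub repository.
workdir="$(mktemp -d)"
remote="$workdir/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# Seed it with two files.
git init -q "$workdir/seed"
cd "$workdir/seed"
git symbolic-ref HEAD refs/heads/main
echo 'Dashboard v1' > dashboard.md
echo 'Day 1 notes v1' > day1.md
git add . && git commit -q -m "Initial course files"
git push -q "$remote" main

# Both people start from the same version.
git clone -q "$remote" "$workdir/ta"
git clone -q "$remote" "$workdir/me"

# The TA changes the dashboard, commits, and pushes.
cd "$workdir/ta"
echo 'Dashboard v2' > dashboard.md
git commit -q -am "Update dashboard"
git push -q origin main

# I change a different file without pulling first.
cd "$workdir/me"
echo 'Day 1 notes v2' > day1.md
git commit -q -am "Update Day 1 notes"

# My push is rejected: the remote has a commit I do not have yet.
if git push -q origin main 2>/dev/null; then first_push="accepted"; else first_push="rejected"; fi

# Pulling merges the TA's work with mine automatically, then the push works.
git pull -q --no-rebase --no-edit origin main
git push -q origin main
echo "first push: $first_push; now have: $(cat dashboard.md) / $(cat day1.md)"
```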
Here is an example of something that WOULD cause a merge conflict:
At 9am, my TA and I both pull from the STAT 545 repository.
At 10am, my TA changes the front page to say “STAT 555 @ UBC”, and commits/pushes those changes.
At 11am, I change the front page (without pulling!!!) to say “STAT 777 @ UBC”.
When I commit and push, Git doesn’t know what to do. Should it make the version that says “STAT 555 @ UBC” or “STAT 777 @ UBC” the definitive version? The push will fail, and when I pull, Git will tell us we need to fix the conflict and then commit the result.
How do we fix this?
Pull.
Open the file that caused the merge conflict. You should see something like this:
<<<<<<< HEAD
STAT 777 @ UBC
=======
STAT 555 @ UBC
>>>>>>> 526363991d21ed20e7e0c57b5e99d944ac5ce5aa
The stuff below the <<<<<<< and above the ======= is what was in your local version; the stuff above the >>>>>>> and below the ======= is what was in the remote conflicting version. Decide what you want to have on the offending line (e.g. “STAT 555 @ UBC”), and replace the whole block of text above with that.
Save and commit the file. (An informative message here might be “Fixing a merge conflict.”) You should now be able to successfully push!
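The whole conflict-and-fix cycle can be rehearsed safely in the Terminal. In this sketch, the folders ta and me, the file index.md, and the local bare repository standing in for GitHub are assumptions for illustration. The pull is run with --no-rebase so that it merges, which is the situation where the part above the ======= is the local version.

```shell
# Identity for every commit in this sketch (placeholder values).
export GIT_AUTHOR_NAME="Example Person" GIT_AUTHOR_EMAIL="person@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# A local bare repository stands in for the shared GitHub repository.
workdir="$(mktemp -d)"
remote="$workdir/remote.git"
git init -q --bare "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/main

# Seed it with a one-line front page.
git init -q "$workdir/seed"
cd "$workdir/seed"
git symbolic-ref HEAD refs/heads/main
echo 'STAT 545 @ UBC' > index.md
git add . && git commit -q -m "Add front page"
git push -q "$remote" main

# Both people start from the same version.
git clone -q "$remote" "$workdir/ta"
git clone -q "$remote" "$workdir/me"

# The TA edits the line and pushes.
cd "$workdir/ta"
echo 'STAT 555 @ UBC' > index.md
git commit -q -am "Rename to STAT 555"
git push -q origin main

# I edit the same line without pulling, then commit.
cd "$workdir/me"
echo 'STAT 777 @ UBC' > index.md
git commit -q -am "Rename to STAT 777"

# Pull: Git cannot combine the two edits and stops with a conflict.
if git pull -q --no-rebase origin main >/dev/null 2>&1; then pull_result="merged"; else pull_result="conflict"; fi
cat index.md                           # shows the conflict markers
local_side="$(sed -n '2p' index.md)"   # the line right below <<<<<<< HEAD

# Fix: replace the whole marked block with the text we want, then commit and push.
echo 'STAT 555 @ UBC' > index.md
git add index.md
git commit -q -m "Fixing a merge conflict"
git push -q origin main
final_text="$(cat index.md)"
```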
Other Tips
Commit changes OFTEN! Commit them as often as you would save your work!
Be careful when working with data containing personal identifying information (PII). This should NOT be stored on GitHub.
All of your projects in this class will live in the STAT 545 @ UBC Organization.
When you watch a GitHub repo, you get notifications when things happen in it. So if you “Watch” the STAT 545 webpage repo, then you will get email notifications when I update the site!
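One common way to act on the tip about personal identifying information is a .gitignore file, which tells Git to leave matching files out of commits. This is a sketch rather than a course requirement: the data/ folder and the participants.csv file are hypothetical. Keep in mind that .gitignore only protects files that have not been committed yet; it does not remove anything already in the history.

```shell
# A throwaway repository (identity values are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Student"
git config user.email "student@example.com"

# A script that is safe to share, and a hypothetical data file that is not.
echo 'head(mtcars)' > analysis.R
mkdir data
echo 'name,email' > data/participants.csv

# Tell Git to ignore everything inside the data/ folder.
echo 'data/' > .gitignore

# "git add ." now skips the ignored folder.
git add .
git commit -q -m "Add analysis script; keep raw data out of Git"

# Check what Git tracks and confirm the data file is ignored.
tracked_files="$(git ls-files)"
if git check-ignore -q data/participants.csv; then data_ignored="yes"; else data_ignored="no"; fi
echo "tracked files:"
echo "$tracked_files"
echo "data file ignored: $data_ignored"
```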
Resources
Today’s class is a digest of the following resources:
Grace’s Video Lecture: https://youtu.be/TigD_OxpGao
Online tutorials:
The basic version control workflow (without branching): Happy Git with R: Chapter 20.
Starting with GitHub: Chapter 15: New project, GitHub first
Starting with files on your computer, and didn’t set up git: Chapter 16: Existing project, GitHub first
Starting with files on your computer, and did set up git: Chapter 17: Existing project, GitHub last
Some additional resources that you might find useful:
- Read “Understanding the GitHub flow” to get a sense of branching.