Stat 550
Daniel J. McDonald
Last modified – 03 April 2024
Much of this lecture is based on material from Colin Rundel and Karl Broman.
Words of wisdom
Your closest collaborator is you six months ago, but you don’t reply to emails.
– Paul Wilson
This will hurt, but what doesn’t kill you, makes you stronger.
git is a command line program that lives on your machine.
Running git init in a directory creates a hidden .git directory there.
The .git directory contains a history of all changes made to “versioned” files.
.ipynb & .md
git
git/GitHub is broad and complicated. Here, just what you need.
Tip
First things first, RStudio and the Terminal
The command line is the “old” type of computing. You type commands at a prompt and the computer “does stuff”.
You may not have seen where this is. RStudio has one built in, called “Terminal”.
The macOS system version is also called “Terminal”. If you have a Linux machine, this should all be familiar.
Windows is not great at this.
To get the most out of Git, you have to use the command line.
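As a first exercise at the prompt, here is a minimal sketch of starting a repository. The scratch directory from mktemp is an assumption made so the example touches nothing in your real projects; in practice you would cd into your own project directory instead.

```shell
# Hypothetical scratch directory (an assumption for this example)
repo=$(mktemp -d)
cd "$repo"

# Start tracking this directory; git creates the hidden .git directory
git init -q

# List everything, including hidden entries such as .git
ls -a

# Confirm git sees a repository with nothing committed yet
git status
```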
Repeat 3–5 as needed. Once you’re satisfied
You decide what is “versioned”.
A file called .gitignore tells git which files or types to never track.
# History files
.Rhistory
.Rapp.history
# Session Data files
.RData
# Compiled junk
*.o
*.so
*.DS_Store
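To see the ignore rules at work, here is a small sketch in a hypothetical scratch repository. The file names analysis.R and model.o are made up for illustration, and only a few of the patterns above are used.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q

# A shortened version of the ignore rules above
printf '%s\n' '.Rhistory' '.RData' '*.o' > .gitignore

# Hypothetical files: one worth tracking, two that match ignore rules
touch analysis.R .Rhistory model.o

# Ignored files do not appear among the untracked (??) entries
git status --short

# Ask git which rule causes model.o to be ignored
git check-ignore -v model.o
```

Note that .gitignore itself shows up as untracked: it is an ordinary file that you normally add and commit.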
master vs main
git
main/develop/branch workflow
main is the protected, released version of the software (maybe renamed to release).
develop contains things not yet on main, but thoroughly tested.
develop gets merged to main.
Create a feature branch off develop to build your new feature.
Open a PR against develop. Supervisors review your contributions.
I (and many DS/CS/Stat faculty) use this workflow with my lab.
Typical for your PR to trigger tests to make sure you don’t break things
Typical for team members or supervisors to review your PR for compliance
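The workflow above can be sketched locally. This is only an approximation under stated assumptions: the branch name feature/new-plot and the file plot.R are hypothetical, and the final merge stands in for what would be a reviewed PR on GitHub.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "# project" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main                  # make sure the first branch is called main

git checkout -q -b develop          # develop branches off main
git checkout -q -b feature/new-plot # feature branch off develop (hypothetical name)

echo "plot(1:10)" > plot.R
git add plot.R
git commit -q -m "Add first plot script"

# Locally, this merge stands in for the reviewed PR against develop
git checkout -q develop
git merge -q --no-ff -m "Merge feature/new-plot into develop" feature/new-plot

# main is untouched until develop is merged into it
git --no-pager log --oneline --graph --all
```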
Tip
I suggest (require?) you adopt the “production” version for your HW 2
Initializing
git config --global user.name "Daniel J. McDonald"
git config --global user.email "daniel@stat.ubc.ca"
git config --global core.editor nano
# or emacs or ... (default is vim)
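To check what git has recorded, ask for a key without giving a value. The sketch below sets a repository-local identity (no --global, so it applies to one repository only) in a hypothetical scratch repository and reads it back. The name and email are placeholders.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q

# Without --global, the setting is stored in this repository's .git/config
git config user.name "Example Student"
git config user.email "student@example.com"

# Giving only the key prints the value git will use here
git config user.name
git config user.email
```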
Staging
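A minimal sketch of staging, assuming a hypothetical scratch repository with two new files (a.R and b.R are made-up names). git add moves changes into the staging area; nothing is committed yet.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q

echo "x <- 1" > a.R
echo "y <- 2" > b.R

# Stage a single file
git add a.R
git status --short   # a.R is staged (A), b.R is still untracked (??)

# Stage everything in the current directory
git add .
git status --short   # both files are now staged
```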
Committing
# stage/commit simultaneously
git commit -am "message"
# open editor to write long commit message
git commit
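One thing worth knowing about the -a shortcut: it stages changes to files git already tracks, but not brand new files. A small sketch in a hypothetical scratch repository (file names and identity are placeholders):

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "x <- 1" > tracked.R
git add tracked.R
git commit -q -m "Add tracked.R"

echo "x <- 2" > tracked.R   # modify a file git already tracks
echo "z <- 3" > new.R       # brand new file, never added

# -a picks up the change to tracked.R, but not new.R
git commit -q -am "Update x"

git status --short          # new.R still shows as untracked (??)
```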
Pushing
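A minimal sketch of pushing. A local bare repository stands in for GitHub here (an assumption so the example runs offline); with GitHub, origin would point at the repository URL instead. Names and identity are placeholders.

```shell
remote=$(mktemp -d)   # hypothetical stand-in for a GitHub repository
git init -q --bare "$remote"

repo=$(mktemp -d)     # hypothetical working repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "# project" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main

# Tell git where the remote lives, then push and set the upstream
git remote add origin "$remote"
git push -q -u origin main

# Later pushes on this branch need no arguments
echo "More notes" >> README.md
git commit -q -am "Expand README"
git push -q
```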
Branching
# switch to branchname, error if uncommitted changes
git checkout branchname
# switch to a previous commit
git checkout aec356
# create a new branch
git branch newbranchname
# create a new branch and check it out
git checkout -b newbranchname
# merge changes in branch2 onto branch1
git checkout branch1
git merge branch2
# grab a file from branch2 and put it on current
git checkout branch2 -- name/of/file
git branch -v # list all branches
Check the status
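A short sketch of the commands typically used to see where things stand, run in a hypothetical scratch repository (notes.md and scratch.txt are made-up files):

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "First draft" > notes.md
git add notes.md
git commit -q -m "Add notes"

echo "Second thoughts" >> notes.md   # edit a tracked file
touch scratch.txt                    # create an untracked file

git status                      # what is modified, staged, or untracked
git --no-pager diff             # line-by-line changes not yet staged
git --no-pager log --oneline    # history so far, one commit per line
```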
fixed stuff
or oops? maybe done?
Conventions: (see here for details)
Sometimes you merge things and “conflicts” happen.
This means that changes on one branch would overwrite changes on a different branch.
They look like this:
Here are lines that are either unchanged
from the common ancestor, or cleanly
resolved because only one side changed.
But below we have some troubles
<<<<<<< yours:sample.txt
Conflict resolution is hard;
let's go shopping.
=======
Git makes conflict resolution easy.
>>>>>>> theirs:sample.txt
And here is another line that is cleanly
resolved or unmodified.
You decide what to keep:
Your changes (above =======)
Their changes (below =======)
Always delete the <<<<<<<, =======, and >>>>>>> lines.
Once you’re satisfied, commit to resolve the conflict.
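To practice, a conflict can be produced on purpose. The sketch below uses a hypothetical scratch repository and a branch named theirs; choosing their version at the end is an arbitrary choice for illustration, not a recommendation.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "original line" > sample.txt
git add sample.txt
git commit -q -m "Add sample.txt"
git branch -M main

# Change the same line on two branches
git checkout -q -b theirs
echo "Git makes conflict resolution easy." > sample.txt
git commit -q -am "Their edit"

git checkout -q main
echo "Conflict resolution is hard;" > sample.txt
git commit -q -am "Your edit"

# The merge stops and reports a conflict in sample.txt
git merge theirs || true
cat sample.txt      # contains the conflict marker lines

# Decide what to keep (here: their version), removing all marker lines
echo "Git makes conflict resolution easy." > sample.txt
git add sample.txt
git commit -q -m "Resolve conflict in sample.txt"
```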
32b252c854c45d2f8dfda1076078eae8d5d7c81f
32b25
In the git docs, it’s reversed: they point to the thing on which they depend.
https://training.github.com/downloads/github-git-cheat-sheet.pdf
README.md
git status will give some of these as suggestions.
1. Saved but not staged
In RStudio, select the file and click then select Revert…
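On the command line, a minimal sketch of discarding a saved but unstaged edit, in a hypothetical scratch repository (analysis.R is a made-up file). Newer versions of git also offer git restore for this.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis.R"

echo "x <- 999" > analysis.R   # an edit you regret: saved, not staged

# Throw away the edit and return to the last committed version
git checkout -- analysis.R

cat analysis.R
```

The discarded edit is gone for good, so check git diff first if you are unsure.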
2. Staged but not committed
In RStudio, uncheck the box by the file, then use the method above.
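On the command line, the same two steps look roughly like this: unstage first, then discard. A sketch in a hypothetical scratch repository (analysis.R is a made-up file):

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "x <- 1" > analysis.R
git add analysis.R
git commit -q -m "Add analysis.R"

echo "x <- 999" > analysis.R
git add analysis.R             # the regretted edit is now staged

# Step 1: unstage (the file on disk still has the edit)
git reset -q HEAD analysis.R

# Step 2: discard the edit, as for a saved but unstaged change
git checkout -- analysis.R

cat analysis.R
```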
# make a new branch with everything, but stay on main
git branch newbranch
# find out where to go to
git log
# undo everything after ace2193
git reset --hard ace2193
git checkout newbranch
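The same rescue can be rehearsed end to end in a hypothetical scratch repository. Here the hash of the last good commit is captured in a shell variable (good, a name chosen for this example) rather than read off git log by hand.

```shell
repo=$(mktemp -d)   # hypothetical scratch repository
cd "$repo"
git init -q
git config user.name "Example Student"       # placeholder identity
git config user.email "student@example.com"

echo "stable" > work.txt
git add work.txt
git commit -q -m "Good commit"
git branch -M main
good=$(git rev-parse --short HEAD)   # the commit main should return to

echo "experimental" >> work.txt
git commit -q -am "Commit that belongs on a branch"

git branch newbranch           # keep everything, but stay on main
git reset -q --hard "$good"    # move main back to the good commit
git checkout -q newbranch      # continue the work here

git --no-pager log --oneline --all
```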
main to submit
Important
☠️☠️ Read all the instructions in the repo! ☠️☠️
UBC Stat 550 - 2024