Stat 406
Geoff Pleiss, Trevor Campbell
Last modified – 04 September 2024
Link to join on Canvas. This is our discussion board.
Note that this data is hosted on servers outside of Canada. You may wish to use a pseudonym to protect your privacy.
Anything super important will be posted to Slack and Canvas.
Be sure you get Canvas email.
Linked from the website.
This is where you complete / submit assignments / projects / in-class-work
This is also hosted on servers outside Canada: https://github.com/stat-406-2024/
Yes, some data is hosted on servers in the US.
But in the real world, no one uses Canvas / Piazza, so why not learn things they do use?
Much easier to communicate, “mark” or comment on your work
Much more DS friendly
Note that MDS uses both of these, the Stat and CS departments use both, many faculty use them, Google / Amazon / Meta use things like these, etc.
But I already know how to use git/Github…
Are you sure?
Yes. I really know how to use git/Github.
Then pull out your laptop and read my “How To Be a Git Wizard” slides.
I guarantee (with 99% confidence) that you will learn a new command.
Much of this lecture is based on material from Colin Rundel and Karl Broman
When you get really good
Version control can act as a living lab notebook
git is a command line program that lives on your machine.
Running git init in a directory creates a hidden .git directory there.
The .git directory contains a history of all changes made to “versioned” files.
.ipynb & .md
git/GitHub is broad and complicated. Here, just what you need.
Tip
First things first, RStudio and the Terminal
Command line is the “old” type of computing. You type commands at a prompt and the computer “does stuff”.
You may not have seen where this is. RStudio has one built in called “Terminal”
The Mac System version is also called “Terminal”. If you have a Linux machine, this should all be familiar.
Windows is not great at this.
To get the most out of Git, you have to use the command line.
Repeat 3–5 as needed. Once you’re satisfied
Instead, try “Update linear model in Question 1.2”
TLDR
Any file that YOU edit should be tracked
Any file that’s computer generated should PROBABLY NOT be tracked
However, in this course you will track rendered PDFs of your homeworks/labs. This makes it easier for the graders.
A file called .gitignore tells git which files or types to never track
```{bash}
# History files
.Rhistory
.Rapp.history
# Session Data files
.RData
# User-specific files
.Ruserdata
# Compiled junk
*.o
*.so
*.DS_Store
```
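To see which files a set of ignore patterns actually catches, git check-ignore can be run against candidate paths. The sketch below is only an illustration: the temporary directory and the file names analysis.R and model.o are hypothetical, and the .gitignore is a shortened version of the one above.

```bash
set -e
# Throwaway repository in a temporary directory (hypothetical example)
repo=$(mktemp -d)
cd "$repo"
git init -q

# A shortened version of the patterns listed above
printf '%s\n' '.Rhistory' '.RData' '*.o' '*.DS_Store' > .gitignore

# One file you edit yourself, two that are computer generated
touch analysis.R .RData model.o

# check-ignore prints only the paths that match an ignore rule;
# "|| true" keeps the script going if nothing matches
ignored=$(git check-ignore analysis.R .RData model.o || true)
echo "$ignored"

# Untracked files that git would still offer to track
git status --short
```

Here analysis.R is not reported by check-ignore, so git would still offer to track it.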
Shortcut to track everything (use carefully):
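The command this line refers to is not preserved here. A common shortcut, and the one assumed in this sketch, is git add . which stages everything in the current directory that is not ignored. The repository and file names below are hypothetical.

```bash
set -e
# Throwaway repository (hypothetical example)
repo=$(mktemp -d)
cd "$repo"
git init -q

echo '.RData' > .gitignore
touch homework.Rmd notes.md .RData

# Assumed shortcut: stage every file under the current directory
# that is not excluded by .gitignore
git add .

# List what is now in the staging area
staged=$(git ls-files)
echo "$staged"
```

The "use carefully" warning applies because anything not covered by .gitignore gets staged, including files you did not mean to track.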
You each have your own repo
You make a branch
DO NOT rename files
Make enough commits (3 for labs, 5 for HW).
Push your changes (at any time) and make a PR against main when done.
TAs review your work.
On HW, if you want to revise, make changes in response to feedback and push to the same branch. Then “re-request review”.
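The branch-and-commit part of this workflow can be sketched locally as follows. The branch name lab1-work, the file lab1.Rmd, and the user identity are hypothetical; the actual course repos, templates, and the PR step on GitHub are not shown.

```bash
set -e
# Throwaway repository standing in for your course repo (hypothetical)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch main
git config user.name "Example Student"
git config user.email "student@example.com"

# Starting point on main
echo "# Lab 1" > lab1.Rmd
git add lab1.Rmd
git commit -q -m "Add lab template"

# Make a branch for the work (hypothetical branch name)
git checkout -q -b lab1-work

# Three small commits, one per question, without renaming the file
for q in 1 2 3; do
  echo "Answer to question $q" >> lab1.Rmd
  git commit -q -am "Answer question $q"
done

# Count the commits on the branch that are not on main
n_commits=$(git rev-list --count main..lab1-work)
echo "$n_commits commits on lab1-work that are not on main"
```

Counting with git rev-list is one way to check the commit requirement before opening the PR.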
master vs main
git
Typical for your PR to trigger tests to make sure you don’t break things
Typical for team members or supervisors to review your PR for compliance
In this course, we protect main so that you can’t push there
Important
Read the PR template!!
Initializing
```{bash}
# set your identity and editor once, for every repo on this machine
git config --global user.name "Geoff Pleiss"
git config --global user.email "geoff.pleiss@stat.ubc.ca"
git config --global core.editor nano
# or emacs or ... (Geoff loves vim and you should too!)
```
Staging
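The commands for this step are not preserved here. As a minimal sketch, assuming the usual git add with a file path, staging one file and leaving another alone looks like this. The repository and file names are hypothetical.

```bash
set -e
# Throwaway repository (hypothetical example)
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "x <- 1" > analysis.R
echo "scratch work" > notes.txt

# Stage a single file by path
git add analysis.R

# Files in the staging area
staged=$(git ls-files)
# Files git sees but is not tracking
untracked=$(git ls-files --others --exclude-standard)

echo "staged: $staged"
echo "untracked: $untracked"
```

Only staged files are included in the next commit, so notes.txt stays out until it is added explicitly.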
Committing
```{bash}
# stage all modified tracked files and commit in one step (new files still need git add)
git commit -am "message"
# open editor to write long commit message
git commit
```
Pushing
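The commands for this step are not preserved here. The sketch below assumes the usual git push and uses a local bare repository as a stand-in for GitHub, so it runs without a network connection. All paths and the user identity are hypothetical.

```bash
set -e
work=$(mktemp -d)

# A bare repository standing in for the GitHub remote (assumption)
git init -q --bare "$work/remote.git"

# Your local working copy
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main   # name the first branch main
git config user.name "Example Student"
git config user.email "student@example.com"
git remote add origin "$work/remote.git"

echo "hello" > README.md
git add README.md
git commit -q -m "Add README"

# First push: -u records origin/main as the upstream of main,
# so later pushes from this branch can be a plain "git push"
git push -q -u origin main

local_head=$(git rev-parse main)
remote_head=$(git rev-parse origin/main)
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the push, the local branch and its remote-tracking branch point at the same commit.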
Branching
```{bash}
# switch to branchname, error if uncommitted changes
git checkout branchname
# switch to a previous commit (leaves you in "detached HEAD" state)
git checkout aec356
# create a new branch
git branch newbranchname
# create a new branch and check it out
git checkout -b newbranchname
# merge changes in branch2 onto branch1
git checkout branch1
git merge branch2
# grab a file from branch2 and put it on current
git checkout branch2 -- name/of/file
git branch -v # list all branches
```
Check the status
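The command for this step is not preserved here; the sketch assumes git status. It sets up one file in each of three states (modified but not staged, staged but not committed, untracked) and prints both the long and the short report. File names and the user identity are hypothetical.

```bash
set -e
# Throwaway repository (hypothetical example)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Student"
git config user.email "student@example.com"

echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "two" >> tracked.txt       # saved but not staged
echo "new" > staged.txt
git add staged.txt              # staged but not committed
echo "tmp" > untracked.txt      # never added

# Long form: includes suggested commands for each state
git status

# Short form: two status columns followed by the path
summary=$(git status --short)
echo "$summary"
```

In the short form, the first column describes the staging area and the second the working directory, with ?? marking untracked files.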
Sometimes you merge things and “conflicts” happen, meaning that changes on one branch would overwrite changes on a different branch.
Here are lines that are either unchanged from
the common ancestor, or cleanly resolved
because only one side changed.
But below we have some troubles
<<<<<<< yours:sample.txt
Conflict resolution is hard;
let's go shopping.
=======
Git makes conflict resolution easy.
>>>>>>> theirs:sample.txt
And here is another line that is cleanly
resolved or unmodified.
You get to decide, do you want to keep
1. Your changes (above ======)
2. Their changes (below ======)
But always delete the <<<<<, ======, and >>>>> lines.
Once you’re satisfied, committing resolves the conflict.
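A conflict like the one in sample.txt can be reproduced and resolved in a throwaway repository. This is a minimal sketch: the branch name theirs and the user identity are hypothetical, and the resolution shown (keeping their line) is just one of the possible choices.

```bash
set -e
# Throwaway repository (hypothetical example)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the first branch main
git config user.name "Example Student"
git config user.email "student@example.com"

echo "original line" > sample.txt
git add sample.txt
git commit -q -m "Add sample.txt"

# One branch changes the line one way...
git checkout -q -b theirs
echo "Git makes conflict resolution easy." > sample.txt
git commit -q -am "Their edit"

# ...and main changes the same line another way
git checkout -q main
echo "Conflict resolution is hard;" > sample.txt
git commit -q -am "Your edit"

# The merge stops because both sides changed the same line
git merge theirs || echo "merge stopped with a conflict"
cat sample.txt    # now contains the conflict marker lines

# Resolve by hand: keep their version and drop the marker lines
echo "Git makes conflict resolution easy." > sample.txt
git add sample.txt
git commit -q -m "Resolve conflict in sample.txt"

resolved=$(cat sample.txt)
echo "$resolved"
```

In practice you would edit the file in your editor rather than overwrite it with echo; the staging and committing steps are the same.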
32b252c854c45d2f8dfda1076078eae8d5d7c81f
32b25
git docs, it’s reversed, they point to the thing on which they depend
https://training.github.com/downloads/github-git-cheat-sheet.pdf
README.md
git status will give some of these as suggestions
1. Saved but not staged
2. Staged but not committed
```{bash}
# make a new branch with everything, but stay on main
git branch newbranch
# undo everything that hasn't been pushed to main
git fetch && git reset --hard origin/main
git checkout newbranch
```
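The recipe above can be tried safely in a throwaway setup. This sketch uses a local bare repository as a stand-in for GitHub and simulates a commit made on main by mistake. The paths, the file hw1.Rmd, and the user identity are hypothetical.

```bash
set -e
work=$(mktemp -d)

# Stand-in for the GitHub remote (assumption), plus a local working copy
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main   # name the first branch main
git config user.name "Example Student"
git config user.email "student@example.com"
git remote add origin "$work/remote.git"

echo "template" > hw1.Rmd
git add hw1.Rmd
git commit -q -m "Add homework template"
git push -q -u origin main

# Oops: a commit made directly on main instead of on a branch
echo "my answer" >> hw1.Rmd
git commit -q -am "Answer question 1"

# The recipe: save the work on a new branch, reset main, switch over
git branch newbranch
git fetch -q && git reset -q --hard origin/main
git checkout -q newbranch

ahead=$(git rev-list --count origin/main..newbranch)
main_head=$(git rev-parse main)
remote_head=$(git rev-parse origin/main)
echo "newbranch is $ahead commit(s) ahead of origin/main"
```

Note that git reset --hard discards uncommitted changes, so commit or stash anything you want to keep before running it.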
Anything more complicated, either post to Slack or LMGTFY
In the Lab next week, you’ll practice
UBC Stat 406 - 2024