Stat 406
Daniel J. McDonald
Last modified – 10 December 2023
Associate Professor, Department of Statistics
I and the TAs are here to help you learn. Ask questions.
We encourage engagement, curiosity and generosity
We favour steady work throughout the term (vs. sleeping until finals)
The assessments attempt to reflect this ethos.
When the term ends, I want
I do not want
I promise
I do not promise that you will all get the grade you want.
I work on COVID a lot.
Statistics is hugely important.
I encourage you to wear a mask
Do NOT come to class if you are possibly sick
Be kind and considerate to others
The marking scheme is flexible enough to allow some missed classes
centering / scaling / factors-to-dummies / basis expansion / missing values / dimension reduction / discretization / transformations
Which box do you use?
Repeat all the preprocessing on new data. But be careful.
Source: https://vas3k.com/blog/machine_learning/
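One thing to be careful about when repeating preprocessing on new data, sketched in R: constants such as means and standard deviations are usually estimated on the training data and then reused, unchanged, on the new data. This is one common reading of "be careful", not necessarily the only one. The simulated data and all variable names below are made up for illustration.

```r
set.seed(406)

# Simulated training data and a handful of new observations (illustrative only)
train <- data.frame(x = rnorm(50, mean = 10, sd = 3))
new   <- data.frame(x = rnorm(5,  mean = 10, sd = 3))

# Estimate the centering and scaling constants from the training data only
mu    <- mean(train$x)
sigma <- sd(train$x)

# Standardize the training data
train$x_std <- (train$x - mu) / sigma

# Reuse the *training* constants on the new data; do not recompute them
new$x_std <- (new$x - mu) / sigma
```

Recomputing `mu` and `sigma` from the five new points would put the new data on a different scale from the one the model was fit on.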
Each module is approximately 2 weeks long
Each module is based on a collection of readings and lectures
Each module (except the review) has a homework assignment
Effort-based
Total across three components: 65 points, any way you want
Knowledge-based
Final Exam, 35 points
You stay on top of the material
You come to class and participate
You gain coding practice in the labs
You work hard on the assignments
Most of this is effort-based
Work hard, guarantee yourself 65%
Coming to class – 3 hours
Reading the book – 1 hour
Labs – 1 hour
Homework – 4 hours
Study / thinking / playing – 1 hour
The goal is to “Do the work”
Assignments
Not easy, especially the first 2, especially if you are unfamiliar with R / Rmarkdown / ggplot
If you lose 3 or more points for content, you may revise to raise your score to 7/10 (see the Syllabus). Penalties can’t be redeemed.
Don’t leave these for the last minute
Labs
Labs should give you practice and a chance to ask the TAs questions.
They are due at 23:00 on the day of your lab and are lightly graded.
You may do them at home, but then you must submit individually (in lab, you may share a submission)
Labs are lightly graded
Questions are similar to the Final
0 points for skipping, 2 points for trying, 4 points for correct
total = max(0, min(5 * points / N - 5, 10))
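A minimal sketch of this formula in R, assuming `N` is the number of questions asked and `points` is the sum of your per-question scores (neither is defined on this slide). The function and argument names are hypothetical.

```r
# Hypothetical helper: rescale the raw points, subtract 5, then clamp to [0, 10]
clicker_total <- function(points, n_questions) {
  max(0, min(5 * points / n_questions - 5, 10))
}

clicker_total(points = 0,  n_questions = 20)  # skipped everything: 0
clicker_total(points = 40, n_questions = 20)  # 2 points on every question: 5
clicker_total(points = 80, n_questions = 20)  # 4 points on every question: 10
```

Under these assumptions, an average of 3 points per question already reaches the cap of 10.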
Be sure to sync your device in Canvas.
Don’t do this!
An average below 1 drops your final mark by one letter grade.
A- becomes B-, C+ becomes D.
Scheduled by the university.
It is hard
The median last year was 50% \(\Rightarrow\) A-
Philosophy:
If you put in the effort, you’re guaranteed a C+.
But to get an A+, you should really deeply understand the material.
No penalty for skipping the final.
If you’re cool with C+ and hate tests, then that’s fine.
Skipping HW makes it difficult to get to 65
Come to class!
Yes it’s at 8am. I hate it too.
To compensate, I will record the class and post to Canvas.
In last year’s class, attendance in lecture and active engagement (asking questions, coming to office hours, etc.) were the best predictors of success.
An Introduction to Statistical Learning
James, Witten, Hastie, Tibshirani, 2013, Springer, New York. (denoted [ISLR])
Available free online: http://statlearning.com/
The Elements of Statistical Learning
Hastie, Tibshirani, Friedman, 2009, Second Edition, Springer, New York. (denoted [ESL])
Also available free online: https://web.stanford.edu/~hastie/ElemStatLearn/
It’s worth your time to read.
If you need more practice, read the Worksheets.
All coding in R
I suggest you use the RStudio IDE
See https://ubc-stat.github.io/stat-406/ for instructions
It tells you how to install what you will need, hopefully all at once, for the whole Term.
We will use R and we assume some background knowledge.
Links to useful supplementary resources are available on the website.
This course is not an intro to R / Python / MongoDB / SQL.
All lectures will be recorded and posted
I cannot guarantee that they will all work properly (sometimes I mess it up)
Lectures are hard. It’s 8am, everyone’s tired.
Coding is hard. I hope you’ll get better at it.
I strongly urge you to get up at the same time every day. My plan is to go to the gym on MWF. It’s really hard to sleep in until 10 on MWF and make class at 8 on T/Th.
Let’s be kind and understanding to each other.
I have to give you a grade, but I want that grade to reflect your learning and effort, not other junk.
If you need help, please ask.
UBC Stat 406 - 2023