00 Review and bonus clickers

Stat 406

Daniel J. McDonald

Last modified – 06 December 2023


Office hours and such

(also in the Canvas announcement)

  1. Yes, there is lab as usual tomorrow. (But no Zoom OH)
  2. Homework 5 due tonight.
  3. Office hours next week:
    • Monday 5-6pm on Zoom (use the link on Canvas, TA)
    • Tuesday 3-4:30pm in ESB 4192 (me)
    • Wednesday 10-11am in ESB 4192 (TA)
    • Thursday 10-11am in ESB 3174 (me)
    • Friday 2-3pm on Zoom (use the link on Canvas, TA)

Final Exam on Monday, December 18 from 12-2pm

Grades etc.

  • Effort score done as soon as possible
  • HW 5, aiming for Friday Dec 15, but no guarantees
  • Clickers and Labs should be done soon
  • The Final is autograded
  • It usually takes me a few days to get the final grades in
  • Generally, no curves, no roundin’ up, etc.

Big picture

  • What is a model?
  • How do we evaluate models?
  • How do we decide which models to use?
  • How do we improve models?

General stuff

  • Linear algebra (SVD, matrix multiplication, matrix properties, etc.)
  • Optimization (derivative + set to 0, gradient descent, Newton’s method, etc.)
  • Probability (conditional probability, Bayes rule, etc.)
  • Statistics (likelihood, MLE, confidence intervals, etc.)

1. Model selection

  • What is a statistical model?
  • What is the goal of model selection?
  • What is the difference between training and test error?
  • What is overfitting?
  • What is the bias-variance tradeoff?
  • What is the difference between AIC / BIC / CV / Held-out validation?
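
To make the training-versus-test distinction concrete, here is a minimal sketch in R. The simulated data, the polynomial model family, and the helper name `kfold_cv_mse` are all assumptions made for illustration, not course code.

```r
# Simulated data (an assumption for illustration): smooth signal plus noise.
set.seed(406)
n <- 100
dat <- data.frame(x = runif(n, -2, 2))
dat$y <- sin(2 * dat$x) + rnorm(n, sd = 0.5)

# Hypothetical helper: K-fold CV estimate of mean squared prediction error
# for a polynomial regression of the given degree.
kfold_cv_mse <- function(dat, degree, K = 5) {
  folds <- sample(rep(seq_len(K), length.out = nrow(dat)))
  fold_mse <- sapply(seq_len(K), function(k) {
    fit <- lm(y ~ poly(x, degree), data = dat[folds != k, ])
    held_out <- dat[folds == k, ]
    mean((held_out$y - predict(fit, newdata = held_out))^2)
  })
  mean(fold_mse)
}

degrees <- 1:10
# Training error: fit and evaluate on the same data.
train_mse <- sapply(degrees, function(d) {
  mean(residuals(lm(y ~ poly(x, d), data = dat))^2)
})
# CV error: evaluate each fit only on observations it did not see.
cv_mse <- sapply(degrees, function(d) kfold_cv_mse(dat, d))

round(cbind(degree = degrees, train = train_mse, cv = cv_mse), 3)
```

Because the polynomial models are nested, the training error cannot increase with the degree, while the CV estimate is free to go back up. That gap is one way to see overfitting.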

2. Regression

  • What do we mean by regression?
  • What is the difference between linear and non-linear regression?
  • What are linear smoothers and why do we care?
  • What is feature creation?
  • What is regularization?
  • What is the difference between L1 and L2 regularization?
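
For reference on the last two questions, the two penalized least-squares problems can be written side by side. The notation (response \(y\), design matrix \(X\), tuning parameter \(\lambda \ge 0\), no intercept) is the usual convention and is assumed here.

\[
\widehat{\beta}^{\text{ridge}}_{\lambda} = \arg\min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_2^2,
\qquad
\widehat{\beta}^{\text{lasso}}_{\lambda} = \arg\min_{\beta}\ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 .
\]

With this scaling, the ridge (L2) problem has the closed form \(\widehat{\beta}^{\text{ridge}}_{\lambda} = (X^\top X + \lambda I)^{-1} X^\top y\). The lasso (L1) problem generally has no closed form, but its solution can set some coefficients exactly to zero.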

3. Classification

  • What is classification? Bayes Rule?
  • What are linear decision boundaries?
  • Compare logistic regression to discriminant analysis.
  • What are the positives and negatives of trees?
  • What about loss functions? How do we measure performance?

4. Modern methods

  • What is the difference between bagging and boosting?
  • What is the point of the bootstrap?
  • What is the difference between random forests and bagging?
  • How do we understand Neural Networks?

5. Unsupervised learning

  • What is unsupervised learning?
  • Can be used for feature creation / EDA.
  • Understanding linear vs. non-linear methods.
  • What does PCA / KPCA estimate?
  • Positives and negatives of clustering procedures.
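
One way to connect PCA back to the SVD is the following minimal sketch in base R. The simulated data are an assumption for illustration; `svd()`, `scale()`, and `prcomp()` are base R functions.

```r
# Simulated data (an assumption for illustration): 50 observations, 4 features.
set.seed(406)
x <- matrix(rnorm(50 * 4), nrow = 50, ncol = 4)
x[, 2] <- x[, 1] + 0.1 * x[, 2]  # make two features strongly correlated

# PCA via the SVD of the column-centered data matrix.
xc <- scale(x, center = TRUE, scale = FALSE)
s <- svd(xc)
pc_scores <- s$u %*% diag(s$d)      # principal component scores
pc_sdev <- s$d / sqrt(nrow(x) - 1)  # standard deviations of the scores

# Compare with prcomp(), which also centers the columns by default.
p <- prcomp(x)
all.equal(pc_sdev, p$sdev)
# Loadings and scores agree up to the sign of each column.
all.equal(abs(s$v), abs(unname(p$rotation)))
all.equal(abs(pc_scores), abs(unname(p$x)))
```

The columns of `s$v` are the estimated principal directions, and the squared values of `pc_sdev` are the variances of the data along them.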

Pause for course evals

Currently at 18/139.

A few clicker questions

The singular value decomposition applies to any matrix.

  1. True
  2. False
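
As a small illustration (not a proof), base R's `svd()` can be applied to a matrix that is neither square nor full rank. The matrix below is chosen arbitrarily for this sketch.

```r
# A non-square, rank-deficient matrix (chosen arbitrarily for illustration).
A <- matrix(c(1, 2, 3,
              2, 4, 6), nrow = 2, byrow = TRUE)  # 2 x 3, rank 1

s <- svd(A)  # returns singular values d and the matrices u and v
s$d          # the second singular value is numerically zero

# Reconstruct A = U D V^T from the pieces.
A_hat <- s$u %*% diag(s$d) %*% t(s$v)
max(abs(A - A_hat))  # reconstruction error, near machine precision
```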

Which of the following produces the ridge regression estimate of \(\beta\) with \(\lambda = 1\)?

  1. lm(y ~ x, lambda = 1)
  2. (crossprod(x) + diag(ncol(x))) %*% crossprod(x, y)
  3. solve(crossprod(x) + diag(ncol(x))) %*% crossprod(x, y)
  4. glmnet(x, y, lambda = 1, alpha = 0)
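
The closed-form ridge computation can be checked numerically. This is a minimal sketch in base R on simulated data (an assumption for illustration), with no intercept and no standardization. It compares the closed form with ordinary least squares on an augmented data set.

```r
# Simulated data (an assumption for illustration).
set.seed(406)
n <- 50
p <- 3
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(1, -2, 0.5) + rnorm(n))
lambda <- 1

# Closed form: (X'X + lambda I)^{-1} X'y.
beta_ridge <- solve(crossprod(x) + lambda * diag(ncol(x))) %*% crossprod(x, y)

# Check: the same estimate solves ordinary least squares on augmented data,
# with sqrt(lambda) * I stacked under x and p zeros appended to y.
x_aug <- rbind(x, sqrt(lambda) * diag(p))
y_aug <- c(y, rep(0, p))
beta_aug <- qr.solve(x_aug, y_aug)  # least-squares solution via QR

all.equal(drop(beta_ridge), beta_aug)
```

Note that `glmnet()` by default fits an intercept, standardizes the columns of `x`, and scales the squared-error loss by \(1/(2n)\), so its `lambda` is not on the same scale as the one used above.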

If Classifier A has higher AUC than Classifier B, then Classifier A is preferred.

  1. True
  2. False
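
For a reminder of what AUC measures, here is a minimal sketch in R. The helper `auc` is hypothetical, and the labels and scores are made up for illustration.

```r
# Hypothetical helper: AUC as the proportion of (positive, negative) pairs
# in which the positive case gets the higher score (ties count one half).
auc <- function(score, label) {
  pos <- score[label == 1]
  neg <- score[label == 0]
  mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))
}

# Toy labels and scores (made up for illustration).
label   <- c(1, 1, 1, 0, 0, 0)
score_a <- c(0.9, 0.8, 0.3, 0.7, 0.2, 0.1)
score_b <- c(0.9, 0.6, 0.5, 0.7, 0.4, 0.1)

auc(score_a, label)  # 8 of the 9 pairs are ordered correctly
auc(score_b, label)  # 7 of the 9 pairs are ordered correctly
```

AUC summarizes how well the scores rank positives above negatives across all thresholds. It does not by itself describe performance at the single threshold, or under the misclassification costs, that a given application cares about.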

Which of the following is true about the bootstrap?

  1. It is a method for estimating the sampling distribution of a statistic.
  2. It is a method for estimating expected prediction error.
  3. It is a method for improving the performance of a classifier.
  4. It is a method for estimating the variance of a statistic.
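
The basic bootstrap recipe can be sketched in a few lines of base R. The simulated sample, the choice of the median as the statistic, and the number of resamples `B` are assumptions for illustration.

```r
set.seed(406)
x <- rexp(200, rate = 1)  # simulated sample (an assumption for illustration)

# Resample the data with replacement B times and recompute the statistic.
B <- 2000
boot_medians <- replicate(B, median(sample(x, replace = TRUE)))

# The B replicates approximate the sampling distribution of the median.
sd(boot_medians)                         # bootstrap standard error
quantile(boot_medians, c(0.025, 0.975))  # percentile interval
```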

Which campus eatery is the best place to celebrate the end of the term?

  1. Koerner’s
  2. Sports Illustrated Clubhouse (formerly Biercraft)
  3. Brown’s Crafthouse
  4. Rain or Shine

Which would you prefer to hear about (briefly)?

  1. Daniel’s thoughts on stuff (grad school / undergrad school / life / etc.)
  2. Epidemiological forecasting
  3. Software for epidemiological forecasting
  4. Analysis of classical music
  5. Economic forecasting models