Office hours and such
(also in the Canvas announcement)
- Yes, there is lab as usual tomorrow. (But no Zoom OH)
- Homework 5 due tonight.
- Office hours next week:
- Monday 5-6pm on Zoom (use the link on Canvas, TA)
- Tuesday 3-4:30pm in ESB 4192 (me)
- Wednesday 10-11am in ESB 4192 (TA)
- Thursday 10-11am in ESB 3174 (me)
- Friday 2-3pm on Zoom (use the link on Canvas, TA)
Final Exam on Monday, December 18 from 12-2pm
Grades etc.
- Effort score done as soon as possible
- HW 5, aiming for Friday Dec 15, but no guarantees
- Clickers and Labs should be done soon
- The Final is autograded
- It usually takes me a few days to get the final grades in
- Generally, no curves, no roundin’ up, etc.
Big picture
- What is a model?
- How do we evaluate models?
- How do we decide which models to use?
- How do we improve models?
General stuff
- Linear algebra (SVD, matrix multiplication, matrix properties, etc.)
- Optimization (derivitive + set to 0, gradient descent, Newton’s method, etc.)
- Probability (conditional probability, Bayes rule, etc.)
- Statistics (likelihood, MLE, confidence intervals, etc.)
1. Model selection
- What is a statistical model?
- What is the goal of model selection?
- What is the difference between training and test error?
- What is overfitting?
- What is the bias-variance tradeoff?
- What is the difference between AIC / BIC / CV / Held-out validation?
2. Regression
- What do we mean by regression?
- What is the difference between linear and non-linear regression?
- What are linear smoothers and why do we care?
- What is feature creation?
- What is regularization?
- What is the difference between L1 and L2 regularization?
3. Classification
- What is classification? Bayes Rule?
- What are linear decision boundaries?
- Compare logistic regression to discriminant analysis.
- What are the positives and negatives of trees?
- What about loss functions? How do we measure performance?
4. Modern methods
- What is the difference between bagging and boosting?
- What is the point of the bootstrap?
- What is the difference between random forests and bagging?
- How do we understand Neural Networks?
5. Unsupervised learning
- What is unsupervised learning?
- Can be used for feature creation / EDA.
- Understanding linear vs. non-linear methods.
- What does PCA / KPCA estimate?
- Positives and negatives of clustering procedures.
Pause for course evals
Currently at 18/139.
The singular value decomposition applies to any matrix.
- True
- False
Which of the following produces the ridge regression estimate of \(\beta\) with \(\lambda = 1\)?
lm(y ~ x, lambda = 1)
(crossprod(x)) + diag(ncol(x))) %*% crossprod(x, y)
solve(crossprod(x) + diag(ncol(x))) %*% crossprod(x, y)
glmnet(x, y, lambda = 1, alpha = 0)
If Classifier A has higher AUC than Classifier B, then Classifier A is preferred.
- True
- False
Which of the following is true about the bootstrap?
- It is a method for estimating the sampling distribution of a statistic.
- It is a method for estimating expected prediction error.
- It is a method for improving the performance of a classifier.
- It is a method for estimating the variance of a statistic.
Which campus eatery is the best place to celebrate the end of the Term?
- Koerner’s
- Sports Illustrated Clubhouse (formerly Biercraft)
- Brown’s Crafthouse
- Rain or Shine