class: center, middle, inverse, title-slide # 01 Syllabus, Git, Github ### STAT 535A ### Daniel J. McDonald ### Last modified - 2021-01-11 --- ## About me * Started at UBC this summer (moved in mid-March, 2 days before the border closed) * Previously a Stats Prof at Indiana University for 8 years -- .center[ ![:scale 50%](https://weareiu.com/wp-content/uploads/2018/12/map-1.png) ] -- * I've taught this material or parts of it 4 or 5 times before * Much of my research uses optimization, though I never had a course on it -- .hand[I wish I had...] --- ## About you Assume working knowledge of/proficiency with: * Real analysis, calculus, linear algebra * Core problems in Machine Learning and Statistics * Programming (R, Python, Julia, your choice ...) * Data structures, computational complexity * Formal mathematical thinking If you fall short on any one of these things, it’s certainly possible to catch up; but don’t hesitate to talk to me --- ## Why learn convex optimization? .pull-left[ * Much of statistics relies on optimization * In most courses you learn how to go from a concept * ... to an optimization problem `\(P\)` * This course is about **how to solve** `\(P\)` * And why this is **a useful skill** ] .pull-right[.center[ ![:scale 40%](http://cdn.differencebetween.net/wp-content/uploads/2019/10/Difference-Between-Idea-and-Concept-1024x576.jpg) <svg style="height:100;fill:#084b7f;" viewBox="0 0 448 512"><path d="M413.1 222.5l22.2 22.2c9.4 9.4 9.4 24.6 0 33.9L241 473c-9.4 9.4-24.6 9.4-33.9 0L12.7 278.6c-9.4-9.4-9.4-24.6 0-33.9l22.2-22.2c9.5-9.5 25-9.3 34.3.4L184 343.4V56c0-13.3 10.7-24 24-24h32c13.3 0 24 10.7 24 24v287.4l114.8-120.5c9.3-9.8 24.8-10 34.3-.4z"/></svg> .large[ `$$P : \min_{\theta \in \Theta} f(\theta)$$` ]]] --- ## Motivation - Many software packages have built-in optimizers, - This is fine in certain cases (I have a function! Throw it in there.) - often, others have already solved `\(P\)` - Will work "fine" some of the time. -- - Immensely suboptimal most of the time -- ### Why bother? 1. Different algorithms can sometimes be better or worse on .hand[your] problem. 2. Solving `\(P\)` can improve your understanding of `\(P\)` 3. Solving `\(P\)` may lead you to propose `\(Q\)` --- ## Questions to answer when optimizing - Is it convex? - Can you take analytic derivatives? - Can you solve them for zero? - Is the problem small, low dimensional? - Do you need to solve it only once? - Can you find the analytic Hessian? Built-in optimizers are good if most of these answers are yes, or if they are all no. In other cases, it's better to write custom code. Knowing a little about optimization can reveal properties of the solutions. --- ## Example Two ways to solve `$$\min_\beta \frac{1}{2n}\sum_{i=1}^n (y_i-x_i^\top\beta)^2 + \lambda \sum_{j=1}^p |\beta_j|$$` .pull-left[ The LARS algorithm ```r system.time( out1 <- lars::lars(X, y) ) ``` ``` ## user system elapsed ## 1.417 0.309 1.731 ``` ] .pull-right[ The glmnet algorithm ```r system.time( out1 <- glmnet::glmnet(X, y) ) ``` ``` ## user system elapsed ## 0.685 0.045 0.753 ``` ] -- * That was a "small" problem, discrepancy gets much worse * `lars` won't work if `\(p>n\)` * But `lars` gives the **complete** path, `glmnet` doesn't --- ## Types of algorithms/problems - General purpose (0th-order) e.g. Simulated Annealing, Nelder-Mead - First order (Gradient descent, SGD, ADMM, Proximal methods) - Second order (Newton, BFGS) - Constrained/unconstrained? - Linear/non-linear? - Convex/Non-convex? -- We're going to talk (mainly) about first-order methods for convex problems. --- class: middle, inverse, center # Course mechanics --- ## Synchronous time * Meet M/W * I'm not going to record/post lectures, generally * This is a small graduate class. I won't force it, but I prefer you keep your videos on. * Try to "Raise a blue hand" and ask questions out loud rather than in the chat -- .hand[If you don't know, no one else does either. Please ask.] -- * About 50 minutes of lecture and 30 minutes of group activities (in break out rooms) --- ## Marks .pull-left[ **In class activities** * max 20 points (worth 1-4 each) * Done in group, 1 submission * honest effort gets full credit for all members **Little Quiz** * 10 points * During class on the last Monday * 30 minutes, all T/F or multiple choice ] -- .pull-right[ **Individual homework (x2)** * 15 points each * 1 week to revise for 80% **Group project (your choice)** * 35 points (5/25/5) * presentation on last day ] --- ## Course communication **Website:** <https://ubc-stat.github.io/stat-535-convexopt/> * Hosted on Github. * Links to slides and all materials * syllabus is there. Be sure to read it. **Slack:** * Link to join on Canvas. This is our discussion board. * Note that this data is hosted on servers outside of Canada. You may wish to use a pseudonym to protect your privacy. **Github organization** * Linked from the website. * This is where you complete/submit assignments/projects/in-class-work * This is hosted by UBC <https://learning.github.ubc.ca/STAT-535A-201-2020W/> --- ## Why these? * Yes, some data is hosted on servers in the US. * But in the real world, no one uses Canvas/Piazza, so why not learn things they do use? * Much easier to communicate, "grade" or comment on your work * Much more DS friendly * Note that MDS uses both of these, the department uses both, etc. -- Slack help from MDS [features](https://ubc-mds.github.io/resources_pages/slack/) and [rules](https://ubc-mds.github.io/resources_pages/slack_asking_for_help/) --- class: middle, inverse, center # Git and Github --- ## Why version control? .center[ ![:scale 30%](http://www.phdcomics.com/comics/archive/phd101212s.gif) ] * Much of this lecture is borrowed/stolen from Colin Rundel and Karl Broman --- ## Why version control? * Simple formal system for tracking all changes to a project * Time machine for your projects + Track blame and/or praise + Remove the fear of breaking things * Learning curve is steep, but when you need it you REALLY need it > Your closest collaborator is you six months ago, but you don’t reply to emails. > -- _Paul Wilson_ --- ## Why Git * You could use something like Box or Dropbox * These are poor-man's version control * Git is much more appropriate * It works with large groups * It's very fast * It's much better at fixing mistakes * Tech companies use it (so it's in your interest to have some experience) > This will hurt, but what doesn't kill you, makes you stronger. --- ## Overview * `git` is a command line program that lives on your machine * If you want to track changes in a directory, you type `git init` * This creates a (hidden) directory called `.git` * The `.git` directory contains a history of all changes made to "versioned" files * This top directory is referred to as a "repository" or "repo" * `github.com` is a service that hosts a repo remotely and has other features: issues, project boards, pull requests, renders `.ipynb` & `.md` * Some IDEs (pycharm, RStudio, VScode) have built in `git` (I don’t use this) * `git`/`github.com` is broad and complicated. Here, just what you **need** --- ## Typical workflow .pull-left[ 1. Download a repo from Github ``` git clone https://github.com/UBC-STAT/stat-535-convexopt.git ``` 2. Create a **branch** ``` git branch <branchname> ``` 3. Make changes to your files. 4. Add your changes to be tracked ("stage" them) ``` git add <name/of/tracked/file> ``` 5. Commit your changes ``` git commit -m "Some explanatory message" ``` **Repeat 3--5 as needed. Once you're satisfied** * Push to Github ``` git push ``` ] .pull-right[.center[ ![:scale 70%](gfx/git-clone.png) ]] --- ## What should be tracked? **Definitely** code, markdown documentation, tex files, bash scripts/makefiles, ... **Possibly** logs, jupyter notebooks, images (that won’t change), ... **Questionable** processed data, static pdfs, ... **Definitely not** full data, continually updated pdfs (other things compiled from source code), ... --- ## What things to track * You decide what is "versioned". * A file called `.gitignore` tells `git` files or types to never track ``` # History files .Rhistory .Rapp.history # Session Data files .RData # User-specific files .Ruserdata ``` * Shortcut to track everything (use carefully): ``` git add . ``` --- ## Rules **In class work** * Everyone shares a repo * Each group makes a branch (with a creative name) * Rename files that you clone. * You push your changes and make a PR against `master` * I review your work and merge the PR (I may ask for changes) **Homework** * You each have your own repo * You make a branch * No need to rename files * You must make at **least 5 commits** * Push your changes (at anytime) and make a PR against `master` * I review your work and merge the PR (I may ask for changes) **Project** * Each team gets a repo. * Make other branches as you wish to divide up work * When ready to submit, make a PR against `master` --- ## What's a PR? * This exists on Github (not git) * Demonstration -- .pull-left[.center[ ![:scale 100%](gfx/pr1.png) ]] .pull-right[.center[ ![:scale 80%](gfx/pr2.png) ]] --- ## Practice .pull-left[ [Website](https://ubc-stat.github.io/stat-535-convexopt/) Do you have `git`? ``` git --version ``` If it's there, you're done. Help on `git/github`: https://happygitwithr.com/install-git.html ] .pull-right[.center[ ![:scale 60%](https://imgs.xkcd.com/comics/git.png) ]]