1 Module 1: Introduction to statistical inference and the sampling distribution of parameter estimates

Learning objectives

By the end of this module, you will be able to:

Describe real-world examples of questions that can be answered with the statistical inference methods presented in this course (e.g., estimation, hypothesis testing).
Name common population parameters (mean, median, proportion) often estimated using sample data and write computer scripts to calculate estimates of these parameters.
Define the following terms concerning statistical inference: population, sample, population parameters, estimate, sampling distribution, and sample distribution.
Write an R script to draw random samples from a finite population (e.g., census data).
Write an R script to estimate a sampling distribution for a given statistic and population.
Define random variables and explain how they relate to sampling.
Explain random and representative sampling and how this can influence estimation.

Note that this is partly a review of many concepts covered in DSCI 100.

This is a work in progress! Please refer to the readings listed on Canvas.

Different types of questions (see https://datasciencebook.ca/intro.html#asking-a-question): descriptive, exploratory, predictive, inferential, causal, mechanistic
Box: Measures of centrality: median vs mean (explain the difference). Talk about the mode and why it is not a good summary in many circumstances.
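As a minimal sketch of the median vs mean comparison, the base R snippet below uses a small, invented right-skewed data set. The values and the helper `stat_mode()` are hypothetical and are not part of the course materials.

```r
# Hypothetical right-skewed data: nine typical incomes and one very large one
# (invented values, in thousands of dollars).
incomes <- c(30, 32, 35, 36, 38, 40, 41, 45, 48, 500)

mean_income <- mean(incomes)      # pulled upward by the single large value
median_income <- median(incomes)  # middle of the sorted values, robust to it

# Base R has no built-in function for the statistical mode, so this
# hypothetical helper returns the most frequent value(s).
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

# Every value occurs exactly once, so all ten values tie as "the mode".
mode_income <- stat_mode(incomes)

mean_income
median_income
mode_income
```

Here the mean is more than twice the median because of one extreme value, and the mode is uninformative because no value repeats, which is common for continuous measurements.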
General idea of taking a sample
- Population vs sample
- Parameter vs sample statistic
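The distinction between a population parameter and a sample statistic can be sketched in base R as follows. The population here is simulated and merely stands in for real census data; the seed, the distribution, the threshold of 100, and the sample size are all arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Hypothetical finite population: 10,000 simulated values standing in for
# census data.
population <- rexp(10000, rate = 1 / 50)

# Population parameters: computed from every unit in the population.
# In practice these are usually unknown.
pop_mean <- mean(population)
pop_median <- median(population)
pop_prop <- mean(population > 100)  # proportion of units above 100

# One simple random sample of size n, drawn without replacement.
n <- 40
one_sample <- sample(population, size = n, replace = FALSE)

# Sample statistics: point estimates of the parameters above.
est_mean <- mean(one_sample)
est_median <- median(one_sample)
est_prop <- mean(one_sample > 100)

# Compare parameters and estimates side by side.
data.frame(
  quantity = c("mean", "median", "proportion > 100"),
  parameter = c(pop_mean, pop_median, pop_prop),
  estimate = c(est_mean, est_median, est_prop)
)
```

The estimates will generally differ from the parameters, and rerunning the sampling step with a different seed gives different estimates while the parameters stay fixed.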
Box: what is a random variable
Why random sample
- Bias
- Representative sample
- Generalization
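To illustrate why random sampling matters, the sketch below compares a convenience sample with a simple random sample. The two-group population, the group labels, and the helper functions `draw_convenience()` and `draw_random()` are hypothetical, chosen only to make the bias visible.

```r
set.seed(2)  # arbitrary seed for reproducibility

# Hypothetical population in which the measured value differs between groups.
population <- data.frame(
  group = rep(c("A", "B"), times = c(7000, 3000)),
  value = c(rnorm(7000, mean = 50, sd = 10), rnorm(3000, mean = 80, sd = 10))
)
pop_mean <- mean(population$value)

# Convenience sampling: only group A can be reached, so group B is never
# represented in the sample.
draw_convenience <- function(n) {
  reachable <- population$value[population$group == "A"]
  mean(sample(reachable, size = n))
}

# Simple random sampling: every unit has the same chance of selection.
draw_random <- function(n) {
  mean(sample(population$value, size = n))
}

convenience_means <- replicate(1000, draw_convenience(50))
random_means <- replicate(1000, draw_random(50))

# Bias: how far the estimates are from the parameter, on average.
bias_convenience <- mean(convenience_means) - pop_mean
bias_random <- mean(random_means) - pop_mean

c(convenience = bias_convenience, random = bias_random)
```

Under these assumptions the convenience estimates are systematically too low, and taking more of them does not help, whereas the random-sample estimates are centered close to the population mean.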

What is the sampling distribution (emphasize all possible random samples of size n)
Introduction to the infer package
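For a population small enough to enumerate, "all possible samples of size n" can be listed explicitly. The sketch below does this in base R with `combn()` for an invented population of six values. It treats samples as unordered and drawn without replacement, which is an assumption of this example.

```r
# Hypothetical tiny population of N = 6 values, small enough to enumerate.
population <- c(2, 4, 6, 8, 10, 12)
n <- 2

# combn() generates every possible sample of size n (without replacement,
# ignoring order) and applies mean() to each one.
all_sample_means <- combn(population, n, FUN = mean)

# Number of possible samples: choose(6, 2).
n_samples <- length(all_sample_means)
n_samples

# The exact sampling distribution of the sample mean: each possible value
# of the statistic and how many samples produce it.
table(all_sample_means)

# Under simple random sampling, the sampling distribution of the sample
# mean is centered on the population mean.
mean(all_sample_means)
mean(population)
```

With realistic population and sample sizes this enumeration is not feasible, which is why the sampling distribution is usually approximated rather than computed exactly.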
Box: histograms, probability, reminder of ggplot
Where is it centered
Why we want the sampling distribution
How to approximate it computationally (re-emphasize all possible samples)
Why we don’t have access to it
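The two points above can be sketched in base R, assuming a simulated population in place of real data. The population, sample size, and number of repetitions are arbitrary choices. The infer package offers a tidyverse-style way to draw repeated samples, but this sketch uses only base functions to stay self-contained.

```r
set.seed(3)  # arbitrary seed for reproducibility

# Hypothetical finite population of 10,000 values.
population <- rnorm(10000, mean = 100, sd = 15)
n <- 40       # sample size
reps <- 2000  # number of repeated samples

# Number of distinct samples of size n from this population: far too many
# to enumerate, so the exact sampling distribution is out of reach.
n_possible <- choose(length(population), n)
n_possible

# Approximation: draw `reps` random samples and record the statistic
# (here, the mean) of each one.
sample_means <- replicate(reps, mean(sample(population, size = n)))

# Histogram of the approximate sampling distribution, with the population
# mean marked by a dashed vertical line.
hist(sample_means,
     main = "Approximate sampling distribution of the sample mean",
     xlab = "Sample mean")
abline(v = mean(population), lty = 2)
```

Even this approximation requires access to the whole population, which is not available in a real study where only a single sample is observed.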
Properties
- introduce the effect of increasing the sample size
- talk about increasing the number of repetitions
- Standard error (SE), with known population parameters
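These properties can be explored with a short simulation. The sketch below assumes a simulated population and a hypothetical helper `sampling_sd()`. It compares the empirical standard error with the familiar formula for the standard error of the sample mean when the population standard deviation $\sigma$ is known, $\sigma / \sqrt{n}$. The finite population correction is ignored here because n is much smaller than N.

```r
set.seed(4)  # arbitrary seed for reproducibility

# Hypothetical finite population of 10,000 values.
population <- rnorm(10000, mean = 100, sd = 15)

# Population standard deviation (divides by N, not N - 1).
sigma <- sqrt(mean((population - mean(population))^2))

# Hypothetical helper: standard deviation of `reps` sample means, each
# computed from a random sample of size n.
sampling_sd <- function(n, reps = 2000) {
  sd(replicate(reps, mean(sample(population, size = n))))
}

# Effect of sample size: larger samples give a narrower sampling
# distribution.
sizes <- c(10, 40, 160)
empirical_se <- sapply(sizes, sampling_sd)
theoretical_se <- sigma / sqrt(sizes)  # approximation, since n << N

data.frame(n = sizes, empirical_se, theoretical_se)

# Effect of repetitions: more repetitions give a smoother approximation of
# the same distribution, but do not make it narrower.
se_few_reps <- sampling_sd(40, reps = 200)
se_many_reps <- sampling_sd(40, reps = 5000)
c(few = se_few_reps, many = se_many_reps)
```

Quadrupling the sample size roughly halves the standard error, while changing the number of repetitions leaves it essentially unchanged.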