Module 01

Course introduction, set theory, and probability axioms


TC and DJM

Last modified — 09 Jan 2026

1 Welcome

Weekly “routine”:

  • Pre-class reading: 1 to 2 hours
  • Pre-class WebWork: 2 to 3 hours
  • Class meetings: 3 hours
    • no laptops/phones!
    • bring paper and pencil/pen
    • each class will have a few in-class problems
    • I / TAs will circulate around to help you with them
    • we will cover the solutions in class
    • we will collect one problem for grading
  • Office hours (TBD)

Syllabus

Effort-based component

  • Pre-class WebWork: 14 points total
  • In-class exercises: 14 points total

Total: min(20, Pre-Class + In-Class)%

  • Nothing dropped, no extensions, no make-ups, no weight transfers.
  • If you miss anything, make up for it somewhere else.
  • Choose your own adventure, no need to let me know.

Syllabus

Exams

  • Midterm 1: 20%

  • Midterm 2: 20%

  • Final: 40%

  • All closed-book, (probably) no notes.

  • The pre-class and in-class exercises will be designed to prepare you for the kinds of questions we ask on exams.

  • So: actually do them! They’re there to help you practice. Don’t rely on solutions you find online or AI assistance.

2 Basics of probability

The basics of probability

  • Formal treatment of randomness
  • Key to this are models
  • Defining a probability requires the following:
    • Random experiment
    • Sample space
    • Event
    • Rules to combine events (set operations)

Experiments

Definition

An action undertaken to make a discovery, test a hypothesis, or confirm a known fact.

Example

Release your pen from 4.9 meters above the ground.

Predicted outcomes

  • The pen will fall to the ground.

  • It will take about 1 sec to reach the ground.
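
The one-second prediction comes from constant-acceleration free fall. A sketch of the arithmetic, assuming the pen starts at rest, air resistance is negligible, and \(g \approx 9.8\ \text{m/s}^2\) (these assumptions are not stated on the slide):

\[ h = \tfrac{1}{2} g t^{2} \quad \Rightarrow \quad t = \sqrt{\frac{2h}{g}} = \sqrt{\frac{2 \times 4.9\ \text{m}}{9.8\ \text{m/s}^2}} = 1\ \text{s} \]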

Actual observations

  • The pen did touch the ground

  • Less sure if it took exactly 1 second to do so

Uncertainty

The outcome of some experiments cannot be determined beforehand.

Example
  • Roll a die: which side will show?

  • Draw a card from a well-shuffled deck: which one will you get?

  • How many students will be in the classroom today?

Probability theory

  • Even though die rolls are random, patterns emerge when we repeat the experiment many times.

  • Probability Theory describes such patterns via mathematical models.

  • It is a branch of Mathematics, and is based on a set of Axioms.

  • Axioms: Statements or propositions accepted to hold true

  • Theorems: Propositions which are established to hold true using sound logical reasoning.
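
The claim that patterns emerge under repetition can be illustrated with a small simulation. This is only a sketch: the helper name `relative_frequencies`, the fixed seed, and the roll counts are illustrative choices, and the die is assumed to be fair.

```python
import random
from collections import Counter

def relative_frequencies(n_rolls, seed=0):
    """Roll a simulated fair six-sided die n_rolls times.

    Returns a dict mapping each face (1..6) to the share of rolls
    that showed that face.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

# As the number of rolls grows, each share should settle near 1/6.
for n in (60, 6_000, 600_000):
    freqs = relative_frequencies(n)
    print(n, {face: round(share, 3) for face, share in freqs.items()})
```

Individual rolls stay unpredictable, but the shares for large `n` should all sit close to \(1/6 \approx 0.167\).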

Sample Space

Definition
The sample space is the set of all possible outcomes of a random experiment.

We denote it by \(\Omega\), and a generic outcome, also called sample point, by \(\omega\) (i.e. \(\omega \in \Omega\)).

Note

The text uses \(S\) and \(s \in S\).

Example
  • Roll a die: \(\Omega = \{ 1,2,3,4,5, 6 \} \subset \mathbb{N}\).

  • Draw a card from a poker deck: \(\Omega = \{ 2\spadesuit, 2\diamondsuit, \ldots, A\clubsuit ,A \heartsuit \}\)

  • Wind speed at YVR (km/h): \(\Omega =[0, \infty) \subset \mathbb{R}\).

  • Wait time for R4 at UBC (min): \(\Omega =[0, 720) \subset \mathbb{R}\)
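
The two finite sample spaces above can be written out explicitly. A minimal sketch, assuming "poker deck" means a standard 52-card deck without jokers; the variable names are illustrative. The two interval-valued sample spaces are uncountable and cannot be enumerated this way.

```python
from itertools import product

# Roll a die: six possible outcomes.
omega_die = set(range(1, 7))

# Draw a card: every (rank, suit) combination.
# Assumption: a standard 52-card deck with no jokers.
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["♠", "♦", "♣", "♥"]
omega_cards = {rank + suit for rank, suit in product(ranks, suits)}

print(len(omega_die))        # 6
print(len(omega_cards))      # 52
print("A♥" in omega_cards)   # True: the ace of hearts is a sample point
```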

Events

Definition
An event is a subset of the sample space \(\Omega\).

Notation: We commonly use upper case letters (\(A\), \(B\), \(C\), …) for events.

Events are sets:

  • \(\omega \in A\) means “\(\omega\) is an element of \(A\)”.

  • \(C \subset D\) means “\(C\) is a subset of \(D\)”.

Examples

Events are often formed by outcomes sharing some property. It’s a good idea to practice listing explicitly the sample points of events described with words.

Example
  • Roll a die:

    • \(A =\) “roll an even number” \(= \{ 2, 4, 6 \}\)
    • \(B =\) “roll a 3 or less” \(= \{1, 2, 3\}\)
    • \(F =\) “roll an even number no higher than 3” \(= \{ 2 \}\)
  • Bus wait time: \(H =\) “wait is less than half an hour” \(= [0, 30)\)

  • Max-wind-speed: \(G =\) “wind is over 80 km/hour” \(= (80, \infty )\)
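
One way to practice listing sample points is to filter the sample space by the property that defines the event. A small sketch for the die-roll events; the variable names mirror the slide but the code is only illustrative.

```python
omega = set(range(1, 7))  # sample space for one die roll

# Each event keeps the outcomes that satisfy its verbal description.
A = {w for w in omega if w % 2 == 0}             # "roll an even number"
B = {w for w in omega if w <= 3}                 # "roll a 3 or less"
F = {w for w in omega if w % 2 == 0 and w <= 3}  # "even and no higher than 3"

# Events are subsets of the sample space.
assert A <= omega and B <= omega and F <= omega

print(sorted(A), sorted(B), sorted(F))  # [2, 4, 6] [1, 2, 3] [2]
```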

More die rolling

Exercise 1

A die is rolled repeatedly until we see a 6.

  1. Specify/describe the sample space.

  2. Let \(E_{n}\) denote the event that the number of rolls is exactly \(n\) (\(n=1,2, \ldots\)). Describe the event \(E_{n}\).

Functioning systems

Exercise 2

A system has 5 components, which can either work or fail.

The experiment consists of observing the status (W/F) of the 5 components.

  1. Describe the sample space for this experiment.

  2. What is the value of \(\# \Omega\)?

  3. Let \(A= \{ \text{components 4 and 5 fail} \}\). What is \(\# A\)?

3 Set theory

Set Operations

Suppose \(A\), \(B\) are events (subsets of \(\Omega\)).

  • Union: \(A \cup B\) \[\omega \in A \cup B \Leftrightarrow \omega \in A \mbox{ or } \omega \in B\]

  • Intersection: \(A \cap B\) \[\omega \in A \cap B \Leftrightarrow \omega \in A \mbox{ and } \omega \in B\]

  • Complement: \(A^c\) \[\omega \in A^c\Leftrightarrow \omega \notin A\]

  • Symmetric difference: \(A \, \triangle \, B\) \[A \, \triangle \, B \, = \, \left( A \cap B^c \right) \, \cup \, \left( A^c \cap B \right)\]
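
These four operations map directly onto Python's built-in set operators. A short sketch using the die-roll events \(A = \{2, 4, 6\}\) and \(B = \{1, 2, 3\}\); the complement is taken relative to \(\Omega\), which Python has to be told explicitly.

```python
omega = set(range(1, 7))
A = {2, 4, 6}
B = {1, 2, 3}

union = A | B          # outcomes in A or B
intersection = A & B   # outcomes in both A and B
complement_A = omega - A  # outcomes of omega not in A
sym_diff = A ^ B       # outcomes in exactly one of A, B

# The built-in symmetric difference agrees with the definition above.
assert sym_diff == (A & (omega - B)) | ((omega - A) & B)

print(sorted(union))         # [1, 2, 3, 4, 6]
print(sorted(intersection))  # [2]
print(sorted(complement_A))  # [1, 3, 5]
print(sorted(sym_diff))      # [1, 3, 4, 6]
```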

Properties of set operations

  • Equality

    • \(A = B \quad \Leftrightarrow \quad A \subseteq B \ \text{ and } \ B \subseteq A\)
  • Commutative:

    • \(A \cup B \ = \ B \cup A\)

    • \(A \cap B \ = \ B \cap A\)

  • Associative:

    • \(A\cup B\cup C \, = \, \left( A\cup B\right) \cup C=A\cup \left( B\cup C\right)\)

    • \(A\cap B\cap C \, = \, \left( A\cap B\right) \cap C=A\cap \left( B\cap C\right)\)

  • Distributive:

    • \(\left( A\cup B\right) \cap C \, = \, \left( A\cap C\right) \cup \left( B\cap C\right)\)

    • \(\left( A\cap B\right) \cup C \, = \, \left( A\cup C\right) \cap \left( B\cup C\right)\)
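
A quick sanity check of these identities is to test them on every triple of subsets of a small sample space. This is a sketch, not a proof: it only covers \(\Omega = \{1, 2, 3\}\), and the helper `all_subsets` is an illustrative name.

```python
from itertools import combinations, product

def all_subsets(omega):
    """Return every subset of omega as a frozenset."""
    items = sorted(omega)
    return [frozenset(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

omega = {1, 2, 3}
subsets = all_subsets(omega)

for A, B, C in product(subsets, repeat=3):
    # Commutative
    assert A | B == B | A and A & B == B & A
    # Associative
    assert (A | B) | C == A | (B | C)
    assert (A & B) & C == A & (B & C)
    # Distributive
    assert (A | B) & C == (A & C) | (B & C)
    assert (A & B) | C == (A | C) & (B | C)

print("checked", len(subsets) ** 3, "triples")  # checked 512 triples
```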

Laws of Partitioning

Exercise 3
  1. \(A \ = ( A\cap B ) \, \cup \, ( A\cap B^{c} )\)

Hint: use the fact that \(B \cup B^c = \Omega\)

  2. \(A \, \cup\, B \ = \ A \, \cup \left( B\cap A^{c}\right)\)

Hint: use the first rule above to express \(B\) in terms of \(B\cap A\) and \(B\cap A^c\)

De Morgan’s Laws

Theorem
For any two events (sets) \(A\) and \(B\), we have \[ ( A\cup B ) ^{c} \, = \, A^{c}\cap B^{c} \]

To prove the theorem it is sufficient to show that \[ ( A\cup B )^{c} \subseteq A^{c}\cap B^{c} \] and that \[ A^{c}\cap B^{c} \subseteq ( A\cup B ) ^{c} \]
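
Before attempting the proof, it can help to see the identity on a concrete case. A sketch with the die-roll events \(A = \{2, 4, 6\}\) and \(B = \{1, 2, 3\}\); the `complement` helper is illustrative. The last assertion checks the companion identity \((A \cap B)^c = A^c \cup B^c\), which is not stated in the theorem above. One example does not replace the two-inclusion argument.

```python
omega = set(range(1, 7))
A = {2, 4, 6}
B = {1, 2, 3}

def complement(event):
    """Complement of an event relative to the sample space omega."""
    return omega - event

lhs = complement(A | B)               # (A ∪ B)^c
rhs = complement(A) & complement(B)   # A^c ∩ B^c
print(sorted(lhs), sorted(rhs))       # [5] [5]
assert lhs == rhs

# Companion identity (not stated in the theorem above).
assert complement(A & B) == complement(A) | complement(B)
```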

Proof of De Morgan’s Laws

Exercise 4
Prove De Morgan’s Laws

Power Set, Empty Set, Cardinality

The power set of \(\Omega\) (denoted \(2^\Omega\)) is the set of all possible subsets of \(\Omega\).

For example, if \(\Omega \ = \ \{ 1,2,3 \}\) then: \[2^{\Omega } \, = \Bigl\{ \varnothing , \{ 1\} , \{ 2 \} , \{ 3 \} , \{ 1,2 \} , \{ 1,3 \} , \{ 2,3 \}, \{ 1, 2, 3 \} \Bigr\}\]

  • The symbol \(\varnothing\) denotes the empty set: \(\varnothing = \{ \}\).

  • The symbols \(\#\) and \(|\cdot|\) denote the size of a set (number of elements): \(\#\Omega\) and \(|\Omega|\) both mean the number of elements in \(\Omega\).
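
For a small finite \(\Omega\), the power set can be listed mechanically by collecting the subsets of each size. A minimal sketch; `power_set` is an illustrative helper, not a standard-library function.

```python
from itertools import combinations

def power_set(omega):
    """List every subset of a finite set omega, smallest subsets first."""
    items = sorted(omega)
    return [set(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]

subsets = power_set({1, 2, 3})
for subset in subsets:
    # An empty Python set prints as "set()", so show the empty-set symbol instead.
    print(sorted(subset) if subset else "∅")

print(len(subsets))  # 8, matching the eight subsets listed above
```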

Size of the Power Set

Exercise 5
If \(\Omega\) has \(n\) elements, what is \(|2^\Omega|\)?

4 Probability

Intro to probability

  • We will define what a probability is, as a mathematical object

  • We will derive and discuss some basic rules that are helpful for computing probabilities (using set operations)

Probability of an event

  • Even though random outcomes cannot be predicted, in some cases we have an idea about the chance that an outcome occurs.

    • If you toss a fair coin, the chance of observing a head is the same as that of observing a tail.

    • If you buy a lottery ticket, the chance of winning is very small.

  • A probability function \(\mathbb{P}\) quantifies these chances.

  • Probability functions are computed on events \(A \in \mathcal{B}\). We calculate \(\mathbb{P}(A)\). Mathematically / formally, we have: \[ \mathbb{P}\, : \, \mathcal{B} \to [0, 1] \] where \(\mathcal{B}\) is a collection of possible events.

  • Probability functions need to do this “coherently”

Probability Axioms

Let \(\Omega\) be a sample space and \({\cal B}\) be a collection of events (i.e. subsets of \(\Omega\)).

Definition

A probability function is a function \(\mathbb{P}\) with domain \({\cal B}\) such that

  1. Axiom 1: \(\mathbb{P}( \Omega ) = 1\);

  2. Axiom 2: \(\mathbb{P}( A ) \geq 0\) for any \(A \in {\cal B}\);

  3. Axiom 3: If \(\{ A_{n}\}_{n \ge 1}\) is a sequence of disjoint events, then \[\mathbb{P}\left( \bigcup_{n=1}^{\infty }A_{n}\right) \, = \sum_{n=1}^{\infty }\mathbb{P}( A_{n})\]

Note: \(\{ A_{n}\}_{n \ge 1}\) is a sequence of disjoint events when \(A_i \cap A_j = \varnothing\) if \(i \ne j\)
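
As a concrete instance, the rule \(\mathbb{P}(A) = \#A / \#\Omega\) for one roll of a die can be checked against the axioms. This is a sketch under the assumption that all six faces are equally likely; the function name `prob` is illustrative, and Axiom 3 is only checked for a finite family of disjoint events.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))  # sample space for one die roll

def prob(event):
    """P(A) = #A / #Omega, assuming equally likely outcomes."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(omega))

# Axiom 1: the whole sample space has probability 1.
assert prob(omega) == 1

# Axiom 2: probabilities are non-negative (checked on a few events).
for event in (set(), {1}, {2, 4, 6}, omega):
    assert prob(event) >= 0

# Axiom 3 (finite instance): disjoint events add.
A1, A2, A3 = {1}, {2, 3}, {6}
assert not (A1 & A2) and not (A1 & A3) and not (A2 & A3)
assert prob(A1 | A2 | A3) == prob(A1) + prob(A2) + prob(A3)

print(prob(A1 | A2 | A3))  # 2/3
```

Using `Fraction` keeps the arithmetic exact, so the equality in Axiom 3 is not blurred by floating-point rounding.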

Probability

  • Kolmogorov showed how one can construct such functions, and that a probability function only needs to satisfy those three properties to be a “coherent” probability function

  • In other words: every desirable property of a probability \(\mathbb{P}\) can be shown to hold using only Axioms 1, 2, and 3 (and logic).

  • Alternatively: any function \(\mathbb{P}\) that satisfies Axioms 1, 2, and 3 is a “proper” probability function.

Properties of the probability function

In general, \(A\), \(B\), \(C\), etc. denote arbitrary events. \(\Omega\) is the sample space.

  • Probability of the complement: \(\mathbb{P}( A^{c} ) =1-\mathbb{P}( A )\)

  • Monotonicity: \(A\subset B\Rightarrow \mathbb{P}( A ) \leq \mathbb{P}(B )\)

  • Probability of the union: \(\mathbb{P}( A\cup B ) =\mathbb{P}( A ) +\mathbb{P}( B ) - \mathbb{P}( A\cap B )\)

  • Boole’s inequality: \(\mathbb{P}( \bigcup _{i=1}^{m}A_{i} ) \leq \sum_{i=1}^{m}\mathbb{P}( A_{i} )\)
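
The four properties can be checked numerically on a fair die before proving them. A sketch, assuming equally likely outcomes; the events and the `prob` helper are illustrative, and a check on one example is not a proof.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))

def prob(event):
    """P(A) = #A / #Omega for a fair die (an assumed model)."""
    return Fraction(len(set(event)), len(omega))

A = {2, 4, 6}
B = {1, 2, 3}
F = {2}  # F is a subset of A

# Probability of the complement
assert prob(omega - A) == 1 - prob(A)

# Monotonicity
assert F <= A and prob(F) <= prob(A)

# Probability of the union
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

# Boole's inequality on three overlapping events
events = [{1, 2}, {2, 3}, {3, 4}]
union = set().union(*events)
assert prob(union) <= sum(prob(E) for E in events)

print(prob(A | B), prob(union), sum(prob(E) for E in events))  # 5/6 2/3 1
```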

Exercise 6
Prove all 4 properties.

5 Extra problems

Problem 1

Marley borrows 2 books. Suppose that there is a 0.5 probability they like the first book, 0.4 that they like the second book, and 0.3 that they like both.

What is the probability that they like neither book (i.e. that they dislike both books)?

Problem 2

Jane must take two tests, call them \(T_1\) and \(T_2\). The probability that she passes test \(T_1\) is 0.8, that she passes test \(T_2\) is 0.7, and that of passing both tests is 0.6.

Calculate the probability that:

  1. She passes at least one test.

  2. She passes at most one test.

  3. She fails both tests.

  4. She passes only one test.

Problem 3

  1. Suppose that \(\mathbb{P}\left( A\right) =0.85\) and \(\mathbb{P}\left( B\right) =0.75.\) Show that \[\mathbb{P}\left( A\cap B\right) \geq 0.60.\]

  2. More generally, prove the Bonferroni inequality: \[\mathbb{P}\left( \bigcap _{i=1}^{n}A_{i}\right) \geq \sum_{i=1}^{n} \mathbb{P}\left( A_{i}\right) -\left( n-1\right) .\]