- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** [[Sampling]]
- **Next Topic:** [[Random Variables]]
- **Relevant Slides:** ![Topic 4 - Probability.pdf](../assets/Topic_4_-_Probability_1664204337770_0.pdf)
-
- Probability provides the *framework* for the study & application of statistics.
- # What are Probabilities?
    - Take, for example, a 6-sided die about to be tossed for the first time.
        - **Classical:** There are 6 possible outcomes and, by symmetry, each is equally likely to occur.
        - **Frequentist:** Empirical evidence shows that similar dice thrown in the past have landed on each side about equally often.
        - **Subjective:** The degree of individual belief in the occurrence of an event, which can be influenced by classical or frequentist arguments.
            - Subjective probabilities are also influenced by other reasons when symmetry arguments don't apply & repeated trials are not possible.
-
- # Probability
    - The probability of an event $A$ is the number of (equally likely & disjoint) outcomes in the event divided by the total number of (equally likely & disjoint) possible outcomes.
        - $$P(A) = \frac{\text{\# of outcomes in A}}{\text{\# of possible outcomes}}$$
        - $$(0 \leq P(A) \leq 1)$$
    - ## Sample Spaces
        - What is a **sample space**? #card
            - The set of all possible outcomes of a random experiment is called the **sample space**, $S$.
                - $S$ is **discrete** if it consists of a finite or countably infinite set of outcomes.
                - $S$ is **continuous** if it contains an interval of real numbers.
            - $$P(S) = 1$$
    - ## Events
        - What is an **event**? #card
            - An **event** is a specific collection of sample points / possible outcomes.
            - An event is denoted by $E$ or by capital letters, $A$, $B$, etc.
        - What is a **Simple Event**? #card
            - A **Simple Event** is a collection of only **one** sample point / possible outcome.
        - What is a **Compound Event**? #card
            - A **Compound Event** is a collection of **more than one** sample point / possible outcome.
    - ## Permutations
        - A **permutation** is an ordered arrangement of objects.
        - It can also be an arrangement of $r$ objects chosen from $n$ distinct objects where replacement in the selection is not allowed.
        - The symbol $P^n_r$ represents the number of permutations of $r$ objects selected from $n$ objects.
        - The calculation is given by the formula:
            - $$P^n_r = \frac{n!}{(n-r)!}$$
    - ## Joint Events (and / or)
        - Probabilities of **joint events** can often be determined from the probabilities of the individual events that comprise them.
        - Joint events are generated by applying basic set operations to individual events, specifically:
            - **Complement** of event $A$: $\bar{A} =$ all outcomes *not* in $A$.
            - **Union** of events $A$ **or** $B$ (or both): $A \cup B$.
            - **Intersection** of events $A$ **and** $B$: $A \cap B$.
            - **Disjoint** events cannot occur together: $A \cap B = \emptyset$.
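- The set operations and the permutation formula above can be tried out in a few lines of Python. This is only a minimal sketch: the fair six-sided die, the events `A`, `B`, `C`, and the helper names `prob` and `permutations_count` are assumptions made for illustration and do not come from the course material.

```python
from fractions import Fraction
from math import factorial, perm

# Assumed sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))


def prob(event, sample_space=S):
    """Classical probability: outcomes in the event / equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


# Hypothetical events, chosen only for illustration.
A = frozenset({2, 4, 6})  # the roll is even
B = frozenset({4, 5, 6})  # the roll is greater than 3
C = frozenset({1, 3})     # the roll is 1 or 3

complement_A = S - A        # all outcomes not in A
union_AB = A | B            # A or B (or both)
intersection_AB = A & B     # A and B
a_c_disjoint = A.isdisjoint(C)  # True when A and C share no outcomes


def permutations_count(n, r):
    """Number of permutations of r objects chosen from n: n! / (n - r)!"""
    return factorial(n) // factorial(n - r)


if __name__ == "__main__":
    print("complement of A:", sorted(complement_A))
    print("A or B:", sorted(union_AB), "with probability", prob(union_AB))
    print("A and B:", sorted(intersection_AB), "with probability", prob(intersection_AB))
    print("A and C disjoint:", a_c_disjoint)
    # math.perm (Python 3.8+) computes the same quantity as the formula.
    print("P(5, 2):", permutations_count(5, 2), "=", perm(5, 2))
```

- Using `Fraction` keeps the probabilities exact, so they can be compared against hand calculations without rounding.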
    -
    - ## Probability of a Union ($A$ **or** $B$) #card
        - For any two events $A$ and $B$, the probability of their union is given by:
            - $$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
        - For two **disjoint** (also called **mutually exclusive**) events $A$ and $B$, the probability that one *or* the other occurs is the sum of the probabilities of the two events.
            - $$P(A \cup B) = P(A) + P(B)$$
        - If adding $P(A) + P(B)$ gives a result greater than 1, then you have made a mistake by treating the events as mutually exclusive -> there is an intersection, and $P(A \cap B)$ must be subtracted.
    - ## Intersections ($A$ **and** $B$)
        - #### Multiplication Rule for Independent Events #card
            - For two **independent** events $A$ and $B$, the probability that *both* $A$ **and** $B$ occur is the product of the probabilities of the two events.
                - $$P(A \cap B) = P(A) \times P(B)$$
            - If two events are **independent**, that means that the occurrence of one event has no impact on the probability of the other event occurring.
    - ## Conditional Probability
        - What is **conditional probability**?
          #card
            - $P(B | A)$ is the probability of event $B$ occurring, given that event $A$ has already occurred.
        - The **conditional probability** of $B$ given $A$, denoted by $P(B | A)$, is defined by: #card
            - $$P(B|A) = \frac{P(A \cap B)}{P(A)} \text{, provided } P(A) > 0$$
            - **Note:** $P(A)$ cannot equal 0, since we know that $A$ *has* occurred.
        - ### General Multiplication Rule for Dependent Events #card
            - The conditional probability can be rewritten to further generalise the multiplication rule:
                - $$P(A \cap B) = P(A) \cdot P(B|A)$$
                - $$P(B \cap A) = P(B) \cdot P(A|B)$$
                - $$\text{Since } P(A \cap B) = P(B \cap A) \text{, it follows that}$$
                - $$P(A) \cdot P(B | A) = P(B) \cdot P(A |B)$$
            - These results mean that $P(A |B)$ can be calculated once we know $P(A)$, $P(B)$, and $P(B | A)$.
            - #### Bayes' Theorem #card
                - **Bayes' Theorem** states that:
                    - $$P(A | B) = \frac{P(B | A) \cdot P(A)}{P(B)} \text{ for } P(B)>0$$
    - ## Independence
        - Two events $A$ and $B$ are independent if and only if:
            - $$P(A \cap B) = P(A)\cdot P(B)$$
        - Therefore, to obtain the probability that two independent events will both occur, we simply find the product of their individual probabilities.
        - Equivalently, two events $A$ and $B$ are independent if and only if:
            - $$P(B | A) = P(B) \text{ or } P(A|B) = P(A)$$
            - assuming the existence of the conditional probabilities.
        - Otherwise, $A$ and $B$ are **dependent**.
        - If, in an experiment, the events $A$ and $B$ can both occur, then:
            - $$P(A \cap B) = P(A)P(B|A) \text{, provided } P(A) > 0$$
-
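- The conditional probability definition, the general multiplication rule, Bayes' Theorem, and the independence test can all be checked by brute-force counting. The sketch below assumes two fair six-sided dice; the events `A`, `B`, `C` and the helper names `prob`, `cond_prob`, and `independent` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered results of rolling two fair dice (36 outcomes).
S = frozenset(product(range(1, 7), repeat=2))


def prob(event):
    """Classical probability over the 36 equally likely outcomes."""
    return Fraction(len(event), len(S))


def cond_prob(b, a):
    """P(B | A) = P(A and B) / P(A), defined only when P(A) > 0."""
    if not a:
        raise ValueError("P(A) must be greater than 0")
    return prob(a & b) / prob(a)


def independent(a, b):
    """A and B are independent iff P(A and B) = P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


# Hypothetical events, chosen only for illustration.
A = frozenset(o for o in S if o[0] == 6)         # first die shows 6
B = frozenset(o for o in S if sum(o) >= 10)      # total is at least 10
C = frozenset(o for o in S if o[1] % 2 == 0)     # second die is even

p_b_given_a = cond_prob(B, A)
p_a_given_b = cond_prob(A, B)

# General multiplication rule: P(A and B) = P(A) * P(B | A)
multiplication_holds = prob(A & B) == prob(A) * p_b_given_a

# Bayes' Theorem: P(A | B) = P(B | A) * P(A) / P(B)
bayes_value = p_b_given_a * prob(A) / prob(B)

if __name__ == "__main__":
    print("P(B | A) =", p_b_given_a)
    print("P(A | B) =", p_a_given_b, "and via Bayes:", bayes_value)
    print("multiplication rule holds:", multiplication_holds)
    print("A, B independent:", independent(A, B))
    print("A, C independent:", independent(A, C))
```

- In this setup, `A` and `B` come out dependent (a 6 on the first die makes a high total more likely), while `A` and `C` come out independent because they concern different dice.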