Add second year

2023-12-07 01:19:12 +00:00
parent 3291e5c79e
commit 3d12031ab8
1168 changed files with 431409 additions and 0 deletions

@ -0,0 +1,142 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Stars & Bars]]
- **Next Topic:** [[Introduction to Graph Theory]]
- **Relevant Slides:** ![MA284-Week06.pdf](../assets/MA284-Week06_1665576169094_0.pdf)
-
- # Advanced Counting Using PIE
collapsed:: true
- The PIE also works for more than 2 or 3 sets, although it gets a little messy to write down.
- For 4 sets, we can think of it as:
- $|A \cup B \cup C \cup D| =$ (the sum of the sizes of each single set) $-$ (the sum of the sizes of each **intersection** of 2 sets) $+$ (the sum of the sizes of each **intersection** of 3 sets) $-$ (the sum of the sizes of the **intersection** of all 4 sets).
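- Written out in full for the four sets $A$, $B$, $C$, $D$ (a direct expansion of the verbal description above, not an extra result), this reads:
- $$\begin{aligned} |A \cup B \cup C \cup D| ={}& |A| + |B| + |C| + |D| \\ &- |A \cap B| - |A \cap C| - |A \cap D| - |B \cap C| - |B \cap D| - |C \cap D| \\ &+ |A \cap B \cap C| + |A \cap B \cap D| + |A \cap C \cap D| + |B \cap C \cap D| \\ &- |A \cap B \cap C \cap D| \end{aligned}$$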
- ## Example
- How many ways can we distribute 10 slices of pie to 4 kids such that no kid gets more than 2 slices (and each slice is distributed)? [See [the textbook](https://discrete.openmathbooks.org/dmoi3/sec_advPIE.html) for a more detailed solution.]
- The answer is obviously 0: even if every kid gets the maximum of 2 slices, only 8 slices are handed out and 2 are left over.
- Without the restriction that nobody gets more than 2 slices, there would be $\binom{13}{3} = 286$ ways of distributing the slices (10 stars & $4-1 = 3$ bars, i.e. $10+4-1 = 13$ symbols).
- Now, count the number of ways where a child gets more than 2 slices, i.e. some child gets $\geq 3$ slices.
- $$\binom{4}{1} \binom{7+3}{3} = 4(120) = 480$$
- (choose one of 4 kids)(number of ways of distributing).
- Add back in the doubly counted ones, subtract the triply counted ones, and so on:
-
- $$\binom{13}{3} - \binom{4}{1} \binom{10}{3} + \binom{4}{2} \binom{7}{3} - \binom{4}{3} \binom{4}{3} + \binom{4}{4} \binom{1}{3} \\ = 286 - 480 + 210 - 16 + 0 = 0$$
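- The alternating sum can be checked numerically. The following is a minimal Python sketch, not part of the course material: the helper `stars_and_bars` and the constant names are made up for illustration, and it treats a negative number of leftover slices as 0 ways.

```python
from itertools import product
from math import comb

KIDS, SLICES, MAX_EACH = 4, 10, 2

# Brute force: try every way of giving each kid 0..MAX_EACH slices and keep
# only the allocations that hand out all the slices.
brute_force = sum(
    1
    for shares in product(range(MAX_EACH + 1), repeat=KIDS)
    if sum(shares) == SLICES
)


def stars_and_bars(stars: int, bins: int) -> int:
    """Ways to split `stars` identical items among `bins` recipients."""
    return comb(stars + bins - 1, bins - 1) if stars >= 0 else 0


# PIE: force j chosen kids to take at least MAX_EACH + 1 slices each, then
# distribute whatever is left without restriction.
pie_terms = [
    (-1) ** j * comb(KIDS, j) * stars_and_bars(SLICES - (MAX_EACH + 1) * j, KIDS)
    for j in range(KIDS + 1)
]

print(pie_terms)                    # [286, -480, 210, -16, 0]
print(sum(pie_terms), brute_force)  # 0 0
```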
- ## Example
- Not all problems have such easy solutions.
- How many non-negative integer solutions are there to $x_1 + x_2 + x_3 +x_4 +x_5 = 13$ if:
- 1. There are no restrictions (other than $x_i$ being an **nni**).
2. $0\leq x_i \leq 3$ for each $i$.
- 1. $$\displaystyle \binom{13+4}{4} = \binom{17}{4}$$
- 2. Idea: All possibilities $-$ "the wrong ones", i.e., count the possibilities where at least one of the $x_i \geq 4$.
- There are $\binom{5}{1}$ ways of choosing the $x_i$ that is $\geq 4$; giving it 4 units up front leaves $x_1+x_2+x_3+x_4+x_5 = 9$, which has $\binom{9+4}{4} = \binom{13}{4}$ solutions, i.e. $\binom{5}{1} \binom{13}{4}$ in total.
- But we have double counted the solutions with two $x_i \geq 4$: there are $\binom{5}{2}$ choices of the pair, and $x_1+x_2+x_3+x_4+x_5 = 5$ has $\binom{5+4}{4} = \binom{9}{4}$ solutions.
- Answer: $\displaystyle \binom{17}{4} - \binom{5}{1} \binom{13}{4} + \binom{5}{2}\binom{9}{4} - \binom{5}{3}\binom{5}{4} +0 = 2380 - 3575 + 1260 - 50 = 15$.
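- As a sanity check on both parts, here is a small Python sketch (an illustration only; the function name `solutions` and the constants are assumptions) that compares the PIE sum with a brute-force enumeration of all tuples with $0 \leq x_i \leq 3$.

```python
from itertools import product
from math import comb

VARIABLES, TOTAL, UPPER = 5, 13, 3


def solutions(total: int, variables: int) -> int:
    """Non-negative integer solutions of x_1 + ... + x_variables = total."""
    return comb(total + variables - 1, variables - 1) if total >= 0 else 0


# Part 1: no restrictions.
unrestricted = solutions(TOTAL, VARIABLES)

# Part 2 by PIE: force j chosen variables to be at least UPPER + 1.
bounded_pie = sum(
    (-1) ** j * comb(VARIABLES, j) * solutions(TOTAL - (UPPER + 1) * j, VARIABLES)
    for j in range(VARIABLES + 1)
)

# Part 2 by brute force over all 4^5 candidate tuples.
bounded_brute = sum(
    1 for xs in product(range(UPPER + 1), repeat=VARIABLES) if sum(xs) == TOTAL
)

print(unrestricted)                # 2380
print(bounded_pie, bounded_brute)  # 15 15
```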
-
- # Derangements
- What is a **derangement**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-20T19:47:44.301Z
card-last-reviewed:: 2022-11-17T09:47:44.301Z
card-last-score:: 5
- A **derangement** is a permutation where no element is left in its original place, i.e., everything is moved.
- ## Example - Derangements of 4 Letters $\text{STARS}$.
- Let $D_n$ be the number of *derangements* of $n$ objects.
- First, we will work out the values of $D_1$, $D_2$, $D_3$, & $D_4$.
- $$D_1 = 0,\ D_2 = 1,\ D_3 = 2,\ D_4 = 9$$
- We derive a formula using PIE.
- We know that there are $4!$ permutations. Which ones are **not** derangements?
- Suppose that one item (at least) is left in place.
- There are $$\displaystyle \binom{4}{1} \cdot 3!$$ such permutations.
- (choose one item to not change from four)(number of ways of permuting the other items).
- However, any permutation that leaves two or more items in place is counted more than once.
- So, by PIE, the answer is
- $$D_4 = 4! - \binom{4}{1}3! + \binom{4}{2}2!-\binom{4}{3}1!+\binom{4}{4}0!$$
- $$D_4 = 4! - \frac{4!3!}{1!3!} +\frac{4!2!}{2!2!}-\frac{4!1!}{3!1!} + \frac{4!0!}{4!0!}$$
- $$D_4 = 4![1-\frac{1}{1!}+\frac{1}{2!}-\frac{1}{3!}+\frac{1}{4!}] = 9$$
- In general, the formula for $D_n$, the number of derangements of $n$ objects is
- $$D_n = n!(1-\frac{1}{1!}+\frac{1}{2!}-\frac{1}{3!}+ \dots + (-1)^n \frac{1}{n!})$$
- Note that the series expansion for $e^x$ is
- $$e^x = 1 + \frac{x}{1!} +\frac{x^2}{2!}+\frac{x^3}{3!} + \dots$$
- So $$\displaystyle e^{-1} = 1 - \frac{1}{1!} + \frac{1}{2!}-\frac{1}{3!}+ \dots$$
- So $$\displaystyle \lim_{n \to \infty} \frac{D_n}{n!} = e^{-1} \approx 0.36788$$
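- The formula for $D_n$ and the limit can be checked with a short Python sketch. This is an illustration under assumed names (`derangements_formula`, `derangements_brute`), comparing the PIE formula against a direct count over all permutations for small $n$.

```python
from itertools import permutations
from math import e, factorial


def derangements_formula(n: int) -> int:
    """D_n = n! * sum_{k=0}^{n} (-1)^k / k!, kept in exact integer arithmetic."""
    return sum((-1) ** k * (factorial(n) // factorial(k)) for k in range(n + 1))


def derangements_brute(n: int) -> int:
    """Count permutations of 0..n-1 that leave no element in place."""
    return sum(
        1
        for perm in permutations(range(n))
        if all(perm[i] != i for i in range(n))
    )


small_values = [derangements_formula(n) for n in range(1, 5)]
agree = all(derangements_formula(n) == derangements_brute(n) for n in range(1, 8))
ratio_10 = derangements_formula(10) / factorial(10)

print(small_values)  # [0, 1, 2, 9]
print(agree)         # True
print(ratio_10, 1 / e)
```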
- # Counting with Repetitions
- What is a **Multinomial Coefficient**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:07:58.830Z
card-last-score:: 1
- The number of different permutations of $n$ objects, where there are $n_1$ indistinguishable objects of Type 1, $n_2$ indistinguishable objects of Type 2, ..., and $n_k$ indistinguishable objects of Type $k$, is
- $$\frac{n!}{(n_1!)(n_2!) \dots (n_k!)}$$
- ## Example
- How many "words" can we make from the letters in $\{R,O,S,C,O,M,M,O,N\}$?
- If somehow the three $O$s were all distinguishable, and the two $M$s were distinguishable, the answer would be $9!$.
- But, since we can't distinguish the identical letters,
- Let's choose which 3 of the 9 positions hold the three $O$s.
- This can be done in $$\displaystyle \binom{9}{3}$$ ways.
- Now, let's choose which 2 of the remaining 6 positions hold the two $M$s.
- This can be done in $$\displaystyle \binom{6}{2}$$ ways.
- Finally, let's choose where to place the remaining 4 letters.
- This can be done in $$4!$$ ways.
- By the Multiplicative Principle, the answer is
- $$\binom{9}{3}\binom{6}{2}4! = \frac{9!}{3!6!} \cdot \frac{6!}{2!4!} \cdot 4! = \frac{9!}{3!2!} = 30240$$
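- The step-by-step count and the multinomial coefficient can be compared in a few lines of Python. This is only a sketch; the function name `arrangements` is an assumption made for illustration.

```python
from collections import Counter
from math import comb, factorial, prod


def arrangements(word: str) -> int:
    """Multinomial coefficient n! / (n_1! n_2! ... n_k!) for the letters of `word`."""
    counts = Counter(word)
    return factorial(len(word)) // prod(factorial(c) for c in counts.values())


# Place the three Os, then the two Ms, then the four distinct letters.
step_by_step = comb(9, 3) * comb(6, 2) * factorial(4)

print(step_by_step, arrangements("ROSCOMMON"))  # 30240 30240
```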
- # Example (MA284 Semester 1 Exam, 2014/2015)
- **(i) Find the number of different arrangements of the letters in the place name `WOLLONGONG`.**
- `OOOLLNNGGW`
- $$\frac{10!}{3!2!2!2!1!} = 75600$$
- **(ii) How many of these arrangements start with three `O`s?**
- `OOO` (one way) and 7 others.
- $$\frac{7!}{2!2!2!} = 630$$
- **(iii) How many contain the two `G`s consecutively?**
- Treat `GG` as a single letter and permute 9 letters.
- $$\frac{9!}{3!2!2!1!} = 15120$$
- **(iv) How many *do not* contain the two `G`s consecutively?**
- Use **(i)** $-$ **(iii)**.
- $$75600 - 15120 = 60480$$
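- All four answers can be confirmed by listing every distinct arrangement once. The Python sketch below is an illustration only (the generator `distinct_arrangements` is a made-up helper, not part of the exam solution); it builds arrangements letter by letter from the letter counts, so identical letters are never swapped.

```python
from collections import Counter


def distinct_arrangements(remaining: dict, length: int, prefix: str = ""):
    """Yield each distinct arrangement of a multiset of letters exactly once."""
    if len(prefix) == length:
        yield prefix
        return
    for letter in remaining:
        if remaining[letter] > 0:
            remaining[letter] -= 1  # use one copy of this letter
            yield from distinct_arrangements(remaining, length, prefix + letter)
            remaining[letter] += 1  # put it back before trying the next letter


WORD = "WOLLONGONG"
all_words = list(distinct_arrangements(dict(Counter(WORD)), len(WORD)))

total = len(all_words)                                  # part (i)
start_ooo = sum(w.startswith("OOO") for w in all_words)  # part (ii)
with_gg = sum("GG" in w for w in all_words)              # part (iii)
without_gg = total - with_gg                             # part (iv)

print(total, start_ooo, with_gg, without_gg)  # 75600 630 15120 60480
```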
- # Counting Functions
- Recall that $f: A \rightarrow B$ is a **function** that maps every element of the set $A$ to some element of the set $B$.
- We call $A$ the **domain** & $B$ the **codomain**.
- Each element of $A$ gets mapped to exactly one element of $B$.
- What does it mean if $b$ is the **image** of $a$? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:17:52.154Z
card-last-score:: 1
- If $f(a) = b$ where $a \in A$ and $b \in B$, we say that "the **image** of $a$ is $b$", or, equivalently, "$b$ is the **image** of $a$".
- What is a **surjective** function (surjection)? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:13:44.812Z
card-last-score:: 1
- For some function $f: A \rightarrow B$, if every element of $B$ is the image of some element of $A$, we say that the function is **surjective** (also called "**onto**").
- What is an **injective** function (injection)? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:16:06.397Z
card-last-score:: 1
- For some function $f: A \rightarrow B$, if no two elements of $A$ have the same image in $B$, we say that the function is **injective** (also called "one-to-one").
- What is a **bijective** function (bijection)? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-18T20:09:58.766Z
card-last-reviewed:: 2022-11-14T20:09:58.766Z
card-last-score:: 5
- The function $f: A \rightarrow B$ is a **bijection** if it is both **surjective** & **injective**.
- Then $f$ defines a **one-to-one correspondence** between $A$ & $B$.
- ## Examples
- **Let** $A$ **&** $B$ **be finite sets. How many functions** $f: A \rightarrow B$ **are there?**
- We can use ((6336be87-7dea-4ba3-b7d0-c77a73bae948)) to deduce that there are in total $|B|^{|A|}$ functions from $A$ to $B$.
- **How many functions** $f: \{1,2,3,4,5,6,7,8\} \rightarrow \{1,2,3,4,5,6,7,8\}$ **are bijective**?
- Remember what it means for a function to be **bijective:** ^^each element in the codomain must be the image of **exactly one** element of the domain.^^
- What we are really doing is just rearranging the elements of the codomain, so we are defining a **permutation** of 8 elements.
- Therefore, the answer to our question is $8! = 40320$.
- More generally, there are $n!$ bijections of the set $\{1,2,\cdots, n\}$ onto itself.
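- Both counts can be checked by enumeration on small sets. The Python sketch below is an illustration with assumed helper names and made-up example sets; it uses a 4-element set rather than 8 elements, since listing all $8^8$ functions would be slow.

```python
from itertools import product
from math import factorial


def all_functions(domain, codomain):
    """Yield every function domain -> codomain as a dict {a: f(a)}."""
    for images in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, images))


def is_injective(f):
    # No two elements of the domain share an image.
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    # Every element of the codomain is the image of something.
    return set(f.values()) == set(codomain)


A, B = [1, 2, 3], ["p", "q", "r", "s"]
total_functions = sum(1 for _ in all_functions(A, B))

S = [1, 2, 3, 4]
bijections = sum(
    1 for f in all_functions(S, S) if is_injective(f) and is_surjective(f, S)
)

print(total_functions, len(B) ** len(A))  # 64 64
print(bijections, factorial(len(S)))      # 24 24
```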
- [[2022年10月19日]]
-

@ -0,0 +1,55 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[SQL SELECT: Working with Strings & Subqueries]]
- **Next Topic:** [[Entity Relationship Models]]
- **Relevant Slides:** ![Topic 6 SQL_DML_aggregateFns and Group By Having.pdf](../assets/Topic_6_SQL_DML_aggregateFns_and_Group_By_Having_1664362673690_0.pdf)
-
- # Aggregate Functions #card
card-last-interval:: 3.58
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-03T21:34:29.060Z
card-last-reviewed:: 2022-09-30T08:34:29.061Z
card-last-score:: 3
- **Aggregate Functions** are only supported in `SELECT` clauses & `HAVING` clauses.
- Keywords `SUM`, `AVG`, `MIN`, `MAX` work as expected. `SUM` & `AVG` can only be applied to **numeric** data; `MIN` & `MAX` can also be applied to other ordered data such as strings & dates.
- Keyword `COUNT` can be used to count the number of tuples / values / rows specified in a query.
- We can also use mathematical operations as part of an aggregate function on **numeric** data, e.g., `+`, `-`, `*`, `/`.
- # `GROUP BY` #card
card-last-interval:: 3.58
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-03T21:36:41.416Z
card-last-reviewed:: 2022-09-30T08:36:41.416Z
card-last-score:: 3
- `GROUP BY <group attributes>`
- The `GROUP BY` clause allows the grouping of rows of a table together so that all occurrences within a specified group are collected together.
- Aggregate functions can then be applied to groups.
- ## Using Aggregate Functions with `GROUP BY` #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-06T23:00:00.000Z
card-last-reviewed:: 2022-10-06T09:42:27.555Z
card-last-score:: 1
- The `GROUP BY` clause specifies the group and the aggregate function is applied to the group.
- `COUNT(*)` can be used to *count* the number of rows (tuples) in the specified groups.
- `SUM`, `AVG`, `MIN`, `MAX` can be used to find the sum, average, min, & max of a *numerical value* in a specified group.
- ^^**Important:** You must `GROUP BY` **all** attributes in the `SELECT` clause *unless* they are involved in an aggregation.^^
- This **^^wouldn't work^^** as we do not `GROUP BY` all the attributes in the `SELECT` clause - `salary` remains ungrouped.
- ```SQL
SELECT dno, salary
FROM employee
GROUP BY dno -- THIS IS WRONG
```
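- For contrast, a valid version keeps only `dno` outside the aggregates. The sketch below runs the query from Python against an in-memory SQLite database so that it is self-contained; the `employee` table with `dno` & `salary` comes from the example above, while the `name` column and the rows are made up for illustration. Note that SQLite itself happens to tolerate the invalid query above and returns an arbitrary `salary` per group, so it is not a good system for demonstrating the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dno INTEGER, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", 1, 30000), ("Bob", 1, 50000), ("Cat", 2, 40000)],  # made-up rows
)

# The only non-aggregated column in SELECT (dno) appears in GROUP BY;
# salary is used only inside aggregate functions.
rows = conn.execute(
    """
    SELECT dno, COUNT(*), AVG(salary), MAX(salary)
    FROM employee
    GROUP BY dno
    ORDER BY dno
    """
).fetchall()

print(rows)  # [(1, 2, 40000.0, 50000.0), (2, 1, 40000.0, 40000.0)]
```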
-
- # `HAVING` #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T08:32:55.188Z
card-last-score:: 1
- `HAVING <group condition>`
- The `HAVING` clause is used in conjunction with `GROUP BY` and allows the specification of **conditions on groups**.
- The column names in the `HAVING` clause must also appear in the `GROUP BY` list or be contained within an aggregate function, i.e., you cannot apply a `HAVING` condition to something that has not been calculated already.
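- A minimal sketch of a `HAVING` condition, again run from Python on an in-memory SQLite database. The `employee` table with `dno` & `salary` follows the earlier example; the rows and the thresholds are assumptions chosen for illustration. The condition refers only to aggregates of each group, so it is evaluated once per group, after grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dno INTEGER, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [  # made-up rows
        ("Ann", 1, 30000),
        ("Bob", 1, 50000),
        ("Cat", 2, 40000),
        ("Dan", 3, 20000),
        ("Eve", 3, 30000),
    ],
)

# Keep only departments with at least 2 employees and an average salary
# above 35000. Department 2 fails the COUNT test, department 3 the AVG test.
rows = conn.execute(
    """
    SELECT dno, AVG(salary)
    FROM employee
    GROUP BY dno
    HAVING COUNT(*) >= 2 AND AVG(salary) > 35000
    ORDER BY dno
    """
).fetchall()

print(rows)  # [(1, 40000.0)]
```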
-

@ -0,0 +1,43 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Introduction to Agile Methods]]
- **Next Topic:** null
- **Relevant Slides:** ![Week 4 - Agile Methods, XP.pdf](../assets/Week_4_-_Agile_Methods,_XP_1664439416140_0.pdf)
-
- # XP
- **eXtreme Programming (XP)** is one of the most popular agile software development methods.
- Some characteristics of XP include:
- Pair programming.
- Refactoring.
- Test-Driven Development (TDD).
- Continuous Integration.
- Metaphor.
- Small releases.
- Simple design.
- Customer tests.
- ## Principles of XP #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:11:14.853Z
card-last-score:: 1
- Communication.
- Simplicity.
- Feedback.
- Courage.
- Respect.
- All the contributors to an XP project (members of one team) sit together. This team must include a business representative (Product Owner) - the "Customer" - who provides the requirements, sets the priorities, and steers the project.
- ## Planning
- XP planning addresses two key questions in software development: predicting what will be accomplished by the due date, and determining what to do next.
- **Release Planning** is a practice where the Customer presents the desired features to the programmers, and the programmers estimate their difficulty.
- **Iteration Planning** is the practice whereby the team is given direction every few weeks (Sprints).
- ## Customer Tests
- As part of presenting each desired feature, the XP Customer defines one or more automated acceptance tests to show that the feature is working.
- The team builds these tests and uses them to prove to themselves, and to the customer, that the feature is implemented correctly.
- ## Small Releases
- XP teams practice small releases in two important ways:
- First, the team releases running, tested software, delivering business value chosen by the Customer, every iteration.
- Second, XP teams release to their end users frequently as well.
- ## Coding Standards
- XP teams follow a common coding standard, so that all the code
-

@ -0,0 +1,156 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Principle of Inclusion-Exclusion]]
- **Next Topic:** [[Combinatorial Proofs]]
- **Relevant Slides:** ![MA284-Week03.pdf](../assets/MA284-Week03_1663699934644_0.pdf)
-
- # Binary Strings & Lattice Paths
collapsed:: true
- ## Binary Strings
- A **bit** is a "binary digit", i.e., 0 or 1.
- A **bit string** is a string (list) of bits, e.g., 1011010.
- What is the **length** of a string? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:13:43.295Z
card-last-score:: 1
- The **length** of the string is the number of bits.
- An $n$-bit string has length $n$.
- The set of all $n$-bit strings (for given $n$) is denoted $B^n$.
- What is the **weight** of a string? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T12:12:45.309Z
card-last-reviewed:: 2022-09-26T12:12:45.310Z
card-last-score:: 5
- The **weight** of the string is the number of 1s.
- The set of all $n$-bit strings of weight $k$ is denoted $B^n_k$.
- ## Lattice Paths
- What is a **lattice**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T12:15:21.865Z
card-last-reviewed:: 2022-09-26T12:15:21.865Z
card-last-score:: 3
- The (integer) **lattice** is the set of all points in the Cartesian plane for which both the $x$ & $y$ coordinates are integers.
- What is a **lattice path**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:15:00.437Z
card-last-score:: 1
- A **lattice path** is the ^^shortest possible path^^ connecting two points on the lattice, moving only horizontally & vertically.
- There can be multiple lattice paths, so long as they are of equally short length.
- ![image.png](../assets/image_1663745526135_0.png)
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as $|B_3^5|$.
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as the number from $(0,0)$ to $(2,2)$, plus the number from $(0,0)$ to $(3,1)$.
-
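	- The two counting claims above can be checked by brute force. This is a minimal sketch, not part of the lecture material: the helper names `bit_strings` and `lattice_paths` are hypothetical, and the lattice-path count assumes paths only step right or up from $(0,0)$.
	  ```python
	  from functools import lru_cache
	  from itertools import product
	  
	  def bit_strings(n, k):
	      """Return all n-bit strings of weight k, i.e. the set B^n_k."""
	      return ["".join(bits) for bits in product("01", repeat=n)
	              if bits.count("1") == k]
	  
	  @lru_cache(maxsize=None)
	  def lattice_paths(x, y):
	      """Count lattice paths from (0,0) to (x,y) via the recurrence above."""
	      if x == 0 or y == 0:
	          return 1  # only one straight path along an axis
	      # The last step arrives either from the left or from below.
	      return lattice_paths(x - 1, y) + lattice_paths(x, y - 1)
	  
	  print(len(bit_strings(5, 3)))  # size of B^5_3
	  print(lattice_paths(3, 2))     # paths from (0,0) to (3,2)
	  ```
	- Both calls print the same number, which matches the bijection: each path to $(3,2)$ is a 5-step string with 3 steps in one direction.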
- # Binomial Coefficients
	- What is the coefficient of, say, $x^3y^2$ in $(x+y)^5$?
- $$(x+y)^0=1
\newline
(x+y)^1=x+y
\newline
(x+y)^2=x^2+2xy+y^2
\newline
(x+y)^3=x^3+3x^2y+3xy^2+y^3
\newline
(x+y)^4=x^4+4x^3y+6x^2y^2+4xy^3+y^4
\newline
(x+y)^5=x^5+5x^4y+10x^3y^2+10x^2y^3+5xy^4+y^5
$$
- So, by doing a lot of multiplication, we have worked out that the coefficient of $x^3y^2$ is $10$.
- But, there is a more systematic way of answering this problem.
-
- $$(x+y)^5=(x+y)(x+y)(x+y)(x+y)(x+y)$$
- We can work out the coefficient of $x^3y^2$ in the expansion of $(x+y)^5$ by counting the number of ways we can **choose** $3$ $x$s & $2$ $y$s in
- $$(x+y)(x+y)(x+y)(x+y)(x+y)$$
-
- The numbers that occurred in all of our examples are called **binomial coefficients**, and are denoted
- $$\binom{n}{k}$$
- What are **Binomial Coefficients**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T12:13:19.107Z
card-last-reviewed:: 2022-09-26T12:13:19.108Z
card-last-score:: 3
- For each integer $n \geq 0$, and integer $k$ such that $0 \leq k \leq n$, there is a number $\binom{n}{k}$, read as "$n$ *choose* $k$".
- $\binom{n}{k} = |B^n_k|$, the number of $n$-bit strings of weight $k$.
- $\binom{n}{k}$ is the number of subsets of a set of size $n$, each with cardinality $k$.
- $\binom{n}{k}$ is the number of lattice paths of length $n$ containing $k$ steps to the right.
- $\binom{n}{k}$ is the coefficient of $x^k y^{n-k}$ in the expansion of $(x+y)^n$.
- $\binom{n}{k}$ is the number of ways to select $k$ objects from a total of $n$ objects.
- If we were to skip ahead, we would learn that there is a formula for $\binom{n}{k}$ (that is, "$n$ choose $k$") that is expressed in terms of **factorials**.
- Recall that the **factorial** of a natural number $n$ is:
			- $$n! = n \times (n-1) \times (n-2) \times (n-3) \times ... \times 2 \times 1$$
- We will eventually learn that
- $$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$
- However, the formula $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is not very useful in practice.
-
-
- # Pascal's Triangle
collapsed:: true
- ![image.png](../assets/image_1663751328603_0.png)
- Earlier, we learned that if the set of all $n$-bit strings with weight $k$ is written $B^n_k$, then
		- $$|B^n_k| = |B^{n-1}_{k-1}| + |B^{n-1}_k|$$
- Similarly, we find that:
- #### Pascal's Identity: A recurrence relation for $\binom{n}{k}$ #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:14:13.327Z
card-last-score:: 1
- $$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
- This is often presented as **Pascal's Triangle**
- ![image.png](../assets/image_1663751709631_0.png)
-
-
- # Permutations
- What is a **permutation**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T12:14:28.732Z
card-last-reviewed:: 2022-09-26T12:14:28.732Z
card-last-score:: 5
- A **permutation** is an arrangement of objects. Changing the order of the objects gives a different permutation.
- Important: order matters!
- ### Number of Permutations
- How many **permutations** are there of $n$ objects? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-22T23:00:00.000Z
card-last-reviewed:: 2022-09-22T20:27:24.238Z
card-last-score:: 1
- There are $n!$ (i.e., $n$ *factorial*) permutations of $n$ (distinct) objects.
- How many permutations are there of $k$ objects from $n$? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:13:06.669Z
card-last-score:: 1
- The number of permutations of $k$ objects out of $n$, $P(n,k)$ is
- $$P(n,k) = n \times (n-1) \times ... \times (n - k + 1) = \frac{n!}{(n-k)!}$$
-
- ## The Binomial Coefficient Formula
- (1) We know that there are $P(n,k)$ permutations of $k$ objects out of $n$.
- (2) We know that
- $$P(n,k) = \frac{n!}{(n-k)!}$$
- (3) Another way of making a permutation of $k$ objects out of $n$ is to
			- (a) Choose $k$ from $n$ without order. There are $\binom{n}{k}$ ways of doing this.
- (b) Then count all the ways of ordering these $k$ objects. There are $k!$ ways of doing this.
- (c) By the Multiplicative Principle,
- $$P(n,k) = \binom{n}{k}k!$$
- (4) So now we know that
- $$\frac{n!}{(n-k)!} = \binom{n}{k}k!$$
- (5) This gives the formula
- $$\binom{n}{k} = \frac{n!}{(n-k)!k!}$$
-

View File

@ -0,0 +1,156 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Principle of Inclusion-Exclusion]]
- **Next Topic:** [[Combinatorial Proofs]]
- **Relevant Slides:** ![MA284-Week03.pdf](../assets/MA284-Week03_1663699934644_0.pdf)
-
- # Binary Strings & Lattice Paths
collapsed:: true
- ## Binary Strings
- A **bit** is a "binary digit", e.g., 1 or 0.
- A **bit string** is a string (list) of bits, e.g., 1011010.
- What is the **length** of a string? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:13:43.295Z
card-last-score:: 1
- The **length** of the string is the number of bits.
- An $n$-bit string has length $n$.
- The set of all $n$-bit strings (for given $n$) is denoted $B^n$.
- What is the **weight** of a string? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T12:12:45.309Z
card-last-reviewed:: 2022-09-26T12:12:45.310Z
card-last-score:: 5
- The **weight** of the string is the number of 1s.
- The set of all $n$-bit strings of weight $k$ is denoted $B^n_k$.
- ## Lattice Paths
- What is a **lattice**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T12:15:21.865Z
card-last-reviewed:: 2022-09-26T12:15:21.865Z
card-last-score:: 3
- The (integer) **lattice** is the set of all points in the Cartesian plane for which both the $x$ & $y$ coordinates are integers.
- What is a **lattice path**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:15:00.437Z
card-last-score:: 1
- A **lattice path** is the ^^shortest possible path^^ connecting two points on the lattice, moving only horizontally & vertically.
- There can be multiple lattice paths, so long as they are of equally short length.
- ![image.png](../assets/image_1663745526135_0.png)
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as $|B_3^5|$.
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as the number from $(0,0)$ to $(2,2)$, plus the number from $(0,0)$ to $(3,1)$.
-
- # Binomial Coefficients
	- What is the coefficient of, say, $x^3y^2$ in $(x+y)^5$?
- $$(x+y)^0=1
\newline
(x+y)^1=x+y
\newline
(x+y)^2=x^2+2xy+y^2
\newline
(x+y)^3=x^3+3x^2y+3xy^2+y^3
\newline
(x+y)^4=x^4+4x^3y+6x^2y^2+4xy^3+y^4
\newline
(x+y)^5=x^5+5x^4y+10x^3y^2+10x^2y^3+5xy^4+y^5
$$
- So, by doing a lot of multiplication, we have worked out that the coefficient of $x^3y^2$ is $10$.
- But, there is a more systematic way of answering this problem.
-
- $$(x+y)^5=(x+y)(x+y)(x+y)(x+y)(x+y)$$
- We can work out the coefficient of $x^3y^2$ in the expansion of $(x+y)^5$ by counting the number of ways we can **choose** $3$ $x$s & $2$ $y$s in
- $$(x+y)(x+y)(x+y)(x+y)(x+y)$$
-
- The numbers that occurred in all of our examples are called **binomial coefficients**, and are denoted
- $$\binom{n}{k}$$
- What are **Binomial Coefficients**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T12:13:19.107Z
card-last-reviewed:: 2022-09-26T12:13:19.108Z
card-last-score:: 3
- For each integer $n \geq 0$, and integer $k$ such that $0 \leq k \leq n$, there is a number $\binom{n}{k}$, read as "$n$ *choose* $k$".
- $\binom{n}{k} = |B^n_k|$, the number of $n$-bit strings of weight $k$.
- $\binom{n}{k}$ is the number of subsets of a set of size $n$, each with cardinality $k$.
- $\binom{n}{k}$ is the number of lattice paths of length $n$ containing $k$ steps to the right.
- $\binom{n}{k}$ is the coefficient of $x^k y^{n-k}$ in the expansion of $(x+y)^n$.
- $\binom{n}{k}$ is the number of ways to select $k$ objects from a total of $n$ objects.
- If we were to skip ahead, we would learn that there is a formula for $\binom{n}{k}$ (that is, "$n$ choose $k$") that is expressed in terms of **factorials**.
- Recall that the **factorial** of a natural number $n$ is:
			- $$n! = n \times (n-1) \times (n-2) \times (n-3) \times ... \times 2 \times 1$$
- We will eventually learn that
- $$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$
- However, the formula $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is not very useful in practice.
-
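	- The "choose" argument for the coefficient of $x^3y^2$ can be sketched in code. This is an illustrative brute-force check, not the lecture's method: `coefficient_by_choice` is a hypothetical helper that picks either $x$ or $y$ from each of the $n$ factors and counts the picks containing exactly $k$ copies of $x$. `math.comb` is the standard-library binomial coefficient (Python 3.8+).
	  ```python
	  from itertools import product
	  from math import comb
	  
	  def coefficient_by_choice(n, k):
	      """Count the ways to take x from exactly k of the n factors of (x+y)^n."""
	      return sum(1 for picks in product("xy", repeat=n)
	                 if picks.count("x") == k)
	  
	  print(coefficient_by_choice(5, 3))  # coefficient of x^3 y^2 in (x+y)^5
	  print(comb(5, 3))                   # the same value from the standard library
	  ```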
-
- # Pascal's Triangle
collapsed:: true
- ![image.png](../assets/image_1663751328603_0.png)
- Earlier, we learned that if the set of all $n$-bit strings with weight $k$ is written $B^n_k$, then
		- $$|B^n_k| = |B^{n-1}_{k-1}| + |B^{n-1}_k|$$
- Similarly, we find that:
- #### Pascal's Identity: A recurrence relation for $\binom{n}{k}$ #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:14:13.327Z
card-last-score:: 1
- $$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
- This is often presented as **Pascal's Triangle**
- ![image.png](../assets/image_1663751709631_0.png)
-
-
- # Permutations
- What is a **permutation**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T12:14:28.732Z
card-last-reviewed:: 2022-09-26T12:14:28.732Z
card-last-score:: 5
- A **permutation** is an arrangement of objects. Changing the order of the objects gives a different permutation.
- Important: order matters!
- ### Number of Permutations
- How many **permutations** are there of $n$ objects? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-22T23:00:00.000Z
card-last-reviewed:: 2022-09-22T20:27:24.238Z
card-last-score:: 1
- There are $n!$ (i.e., $n$ *factorial*) permutations of $n$ (distinct) objects.
- How many permutations are there of $k$ objects from $n$? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-26T23:00:00.000Z
card-last-reviewed:: 2022-09-26T12:13:06.669Z
card-last-score:: 1
- The number of permutations of $k$ objects out of $n$, $P(n,k)$ is
- $$P(n,k) = n \times (n-1) \times ... \times (n - k + 1) = \frac{n!}{(n-k)!}$$
-
- ## The Binomial Coefficient Formula
- (1) We know that there are $P(n,k)$ permutations of $k$ objects out of $n$.
- (2) We know that
- $$P(n,k) = \frac{n!}{(n-k)!}$$
- (3) Another way of making a permutation of $k$ objects out of $n$ is to
			- (a) Choose $k$ from $n$ without order. There are $\binom{n}{k}$ ways of doing this.
- (b) Then count all the ways of ordering these $k$ objects. There are $k!$ ways of doing this.
- (c) By the Multiplicative Principle,
- $$P(n,k) = \binom{n}{k}k!$$
- (4) So now we know that
- $$\frac{n!}{(n-k)!} = \binom{n}{k}k!$$
- (5) This gives the formula
- $$\binom{n}{k} = \frac{n!}{(n-k)!k!}$$
-

View File

@ -0,0 +1,154 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Principle of Inclusion-Exclusion]]
- **Next Topic:** [[Combinatorial Proofs]]
- **Relevant Slides:** ![MA284-Week03.pdf](../assets/MA284-Week03_1663699934644_0.pdf)
-
- # Binary Strings & Lattice Paths
- ## Binary Strings
- A **bit** is a "binary digit", e.g., 1 or 0.
- A **bit string** is a string (list) of bits, e.g., 1011010.
- What is the **length** of a string? #card
card-last-interval:: 3.58
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-03T21:29:15.109Z
card-last-reviewed:: 2022-09-30T08:29:15.109Z
card-last-score:: 5
- The **length** of the string is the number of bits.
- An $n$-bit string has length $n$.
- The set of all $n$-bit strings (for given $n$) is denoted $B^n$.
- What is the **weight** of a string? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-08T12:11:29.792Z
card-last-reviewed:: 2022-10-04T12:11:29.793Z
card-last-score:: 5
- The **weight** of the string is the number of 1s.
- The set of all $n$-bit strings of weight $k$ is denoted $B^n_k$.
- ## Lattice Paths
- What is a **lattice**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-08T12:15:35.955Z
card-last-reviewed:: 2022-10-04T12:15:35.956Z
card-last-score:: 3
- The (integer) **lattice** is the set of all points in the Cartesian plane for which both the $x$ & $y$ coordinates are integers.
- What is a **lattice path**? #card
card-last-interval:: 4.43
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-08T00:33:48.083Z
card-last-reviewed:: 2022-10-03T14:33:48.083Z
card-last-score:: 5
- A **lattice path** is the ^^shortest possible path^^ connecting two points on the lattice, moving only horizontally & vertically.
- There can be multiple lattice paths, so long as they are of equally short length.
- ![image.png](../assets/image_1663745526135_0.png)
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as $|B_3^5|$.
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as the number from $(0,0)$ to $(2,2)$, plus the number from $(0,0)$ to $(3,1)$.
-
- # Binomial Coefficients
	- What is the coefficient of, say, $x^3y^2$ in $(x+y)^5$?
- $$(x+y)^0=1
\newline
(x+y)^1=x+y
\newline
(x+y)^2=x^2+2xy+y^2
\newline
(x+y)^3=x^3+3x^2y+3xy^2+y^3
\newline
(x+y)^4=x^4+4x^3y+6x^2y^2+4xy^3+y^4
\newline
(x+y)^5=x^5+5x^4y+10x^3y^2+10x^2y^3+5xy^4+y^5
$$
- So, by doing a lot of multiplication, we have worked out that the coefficient of $x^3y^2$ is $10$.
- But, there is a more systematic way of answering this problem.
-
- $$(x+y)^5=(x+y)(x+y)(x+y)(x+y)(x+y)$$
- We can work out the coefficient of $x^3y^2$ in the expansion of $(x+y)^5$ by counting the number of ways we can **choose** $3$ $x$s & $2$ $y$s in
- $$(x+y)(x+y)(x+y)(x+y)(x+y)$$
-
- The numbers that occurred in all of our examples are called **binomial coefficients**, and are denoted
- $$\binom{n}{k}$$
- What are **Binomial Coefficients**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:13:14.754Z
card-last-score:: 1
- For each integer $n \geq 0$, and integer $k$ such that $0 \leq k \leq n$, there is a number $\binom{n}{k}$, read as "$n$ *choose* $k$".
- $\binom{n}{k} = |B^n_k|$, the number of $n$-bit strings of weight $k$.
- $\binom{n}{k}$ is the number of subsets of a set of size $n$, each with cardinality $k$.
- $\binom{n}{k}$ is the number of lattice paths of length $n$ containing $k$ steps to the right.
- $\binom{n}{k}$ is the coefficient of $x^k y^{n-k}$ in the expansion of $(x+y)^n$.
- $\binom{n}{k}$ is the number of ways to select $k$ objects from a total of $n$ objects.
- If we were to skip ahead, we would learn that there is a formula for $\binom{n}{k}$ (that is, "$n$ choose $k$") that is expressed in terms of **factorials**.
- Recall that the **factorial** of a natural number $n$ is:
			- $$n! = n \times (n-1) \times (n-2) \times (n-3) \times ... \times 2 \times 1$$
- We will eventually learn that
- $$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$
- However, the formula $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is not very useful in practice.
-
-
- # Pascal's Triangle
- ![image.png](../assets/image_1663751328603_0.png)
- Earlier, we learned that if the set of all $n$-bit strings with weight $k$ is written $B^n_k$, then
		- $$|B^n_k| = |B^{n-1}_{k-1}| + |B^{n-1}_k|$$
- Similarly, we find that:
- #### Pascal's Identity: A recurrence relation for $\binom{n}{k}$ #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:33:47.256Z
card-last-score:: 1
- $$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
- This is often presented as **Pascal's Triangle**
- ![image.png](../assets/image_1663751709631_0.png)
-
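	- Pascal's Identity gives a direct way to build the triangle row by row without any factorials. The sketch below is illustrative only; `pascal_rows` is a hypothetical helper name, and row $n$ is taken to hold $\binom{n}{0}, \dots, \binom{n}{n}$.
	  ```python
	  def pascal_rows(n_max):
	      """Return rows 0..n_max of Pascal's Triangle using Pascal's Identity."""
	      rows = [[1]]  # row 0
	      for n in range(1, n_max + 1):
	          prev = rows[-1]
	          # Interior entries: C(n, k) = C(n-1, k-1) + C(n-1, k)
	          interior = [prev[k - 1] + prev[k] for k in range(1, n)]
	          rows.append([1] + interior + [1])
	      return rows
	  
	  for row in pascal_rows(5):
	      print(row)
	  ```
	- The last row printed holds the coefficients of $(x+y)^5$, including the $10$ found earlier for $x^3y^2$.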
-
- # Permutations
- What is a **permutation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-08T12:14:42.530Z
card-last-reviewed:: 2022-10-04T12:14:42.530Z
card-last-score:: 5
- A **permutation** is an arrangement of objects. Changing the order of the objects gives a different permutation.
- A permutation of a set must have the same cardinality as that set.
- Important: order matters!
- ### Number of Permutations
- How many **permutations** are there of $n$ objects? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T11:41:35.217Z
card-last-score:: 1
- There are $n!$ (i.e., $n$ *factorial*) permutations of $n$ (distinct) objects.
- How many permutations are there of $k$ objects from $n$? #card
card-last-interval:: 4.43
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-08T00:33:41.297Z
card-last-reviewed:: 2022-10-03T14:33:41.298Z
card-last-score:: 5
			- The number of permutations of $k$ objects out of $n$, denoted $P(n,k)$, is
				- $$P(n,k) = n \times (n-1) \times ... \times (n - k + 1) = \frac{n!}{(n-k)!}$$
- ## The Binomial Coefficient Formula
- (1) We know that there are $P(n,k)$ permutations of $k$ objects out of $n$.
- (2) We know that
- $$P(n,k) = \frac{n!}{(n-k)!}$$
- (3) Another way of making a permutation of $k$ objects out of $n$ is to
			- (a) Choose $k$ from $n$ without order. There are $\binom{n}{k}$ ways of doing this.
- (b) Then count all the ways of ordering these $k$ objects. There are $k!$ ways of doing this.
- (c) By the Multiplicative Principle,
- $$P(n,k) = \binom{n}{k}k!$$
- (4) So now we know that
- $$\frac{n!}{(n-k)!} = \binom{n}{k}k!$$
- (5) This gives the formula
- $$\binom{n}{k} = \frac{n!}{(n-k)!k!}$$
-

View File

@ -0,0 +1,154 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Principle of Inclusion-Exclusion]]
- **Next Topic:** [[Combinatorial Proofs]]
- **Relevant Slides:** ![MA284-Week03.pdf](../assets/MA284-Week03_1663699934644_0.pdf)
-
- # Binary Strings & Lattice Paths
- ## Binary Strings
- A **bit** is a "binary digit", e.g., 1 or 0.
- A **bit string** is a string (list) of bits, e.g., 1011010.
- What is the **length** of a string? #card
card-last-interval:: 11.55
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-18T06:19:03.162Z
card-last-reviewed:: 2022-10-06T17:19:03.163Z
card-last-score:: 5
- The **length** of the string is the number of bits.
- An $n$-bit string has length $n$.
- The set of all $n$-bit strings (for given $n$) is denoted $B^n$.
- What is the **weight** of a string? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T02:52:44.155Z
card-last-reviewed:: 2022-10-08T22:52:44.155Z
card-last-score:: 5
- The **weight** of the string is the number of 1s.
- The set of all $n$-bit strings of weight $k$ is denoted $B^n_k$.
- ## Lattice Paths
- What is a **lattice**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-18T04:54:49.128Z
card-last-reviewed:: 2022-10-08T22:54:49.129Z
card-last-score:: 5
- The (integer) **lattice** is the set of all points in the Cartesian plane for which both the $x$ & $y$ coordinates are integers.
- What is a **lattice path**? #card
card-last-interval:: 14.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-23T02:49:59.194Z
card-last-reviewed:: 2022-10-08T22:49:59.194Z
card-last-score:: 5
- A **lattice path** is the ^^shortest possible path^^ connecting two points on the lattice, moving only horizontally & vertically.
- There can be multiple lattice paths, so long as they are of equally short length.
- ![image.png](../assets/image_1663745526135_0.png)
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as $|B_3^5|$.
- The number of lattice paths from $(0,0)$ to $(3,2)$ is the same as the number from $(0,0)$ to $(2,2)$, plus the number from $(0,0)$ to $(3,1)$.
-
- # Binomial Coefficients
	- What is the coefficient of, say, $x^3y^2$ in $(x+y)^5$?
- $$(x+y)^0=1
\newline
(x+y)^1=x+y
\newline
(x+y)^2=x^2+2xy+y^2
\newline
(x+y)^3=x^3+3x^2y+3xy^2+y^3
\newline
(x+y)^4=x^4+4x^3y+6x^2y^2+4xy^3+y^4
\newline
(x+y)^5=x^5+5x^4y+10x^3y^2+10x^2y^3+5xy^4+y^5
$$
- So, by doing a lot of multiplication, we have worked out that the coefficient of $x^3y^2$ is $10$.
- But, there is a more systematic way of answering this problem.
-
- $$(x+y)^5=(x+y)(x+y)(x+y)(x+y)(x+y)$$
- We can work out the coefficient of $x^3y^2$ in the expansion of $(x+y)^5$ by counting the number of ways we can **choose** $3$ $x$s & $2$ $y$s in
- $$(x+y)(x+y)(x+y)(x+y)(x+y)$$
-
- The numbers that occurred in all of our examples are called **binomial coefficients**, and are denoted
- $$\binom{n}{k}$$
- What are **Binomial Coefficients**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-12T08:41:44.758Z
card-last-reviewed:: 2022-10-08T22:41:44.758Z
card-last-score:: 5
- For each integer $n \geq 0$, and integer $k$ such that $0 \leq k \leq n$, there is a number $\binom{n}{k}$, read as "$n$ *choose* $k$".
- $\binom{n}{k} = |B^n_k|$, the number of $n$-bit strings of weight $k$.
- $\binom{n}{k}$ is the number of subsets of a set of size $n$, each with cardinality $k$.
- $\binom{n}{k}$ is the number of lattice paths of length $n$ containing $k$ steps to the right.
- $\binom{n}{k}$ is the coefficient of $x^k y^{n-k}$ in the expansion of $(x+y)^n$.
- $\binom{n}{k}$ is the number of ways to select $k$ objects from a total of $n$ objects.
- If we were to skip ahead, we would learn that there is a formula for $\binom{n}{k}$ (that is, "$n$ choose $k$") that is expressed in terms of **factorials**.
- Recall that the **factorial** of a natural number $n$ is:
			- $$n! = n \times (n-1) \times (n-2) \times (n-3) \times ... \times 2 \times 1$$
- We will eventually learn that
- $$\binom{n}{k} = \frac{n!}{k!(n-k)!}$$
- However, the formula $\displaystyle \binom{n}{k} = \frac{n!}{k!(n-k)!}$ is not very useful in practice.
-
-
- # Pascal's Triangle
- ![image.png](../assets/image_1663751328603_0.png)
- Earlier, we learned that if the set of all $n$-bit strings with weight $k$ is written $B^n_k$, then
		- $$|B^n_k| = |B^{n-1}_{k-1}| + |B^{n-1}_k|$$
- Similarly, we find that:
- #### Pascal's Identity: A recurrence relation for $\binom{n}{k}$ #card
card-last-interval:: 11.56
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-18T23:22:56.816Z
card-last-reviewed:: 2022-10-07T10:22:56.817Z
card-last-score:: 5
- $$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
- This is often presented as **Pascal's Triangle**
- ![image.png](../assets/image_1663751709631_0.png)
-
-
- # Permutations
- What is a **permutation**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T02:53:38.455Z
card-last-reviewed:: 2022-10-08T22:53:38.455Z
card-last-score:: 5
- A **permutation** is an arrangement of objects. Changing the order of the objects gives a different permutation.
- A permutation of a set must have the same cardinality as that set.
- Important: order matters!
- ### Number of Permutations
- How many **permutations** are there of $n$ objects? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T17:14:28.391Z
card-last-reviewed:: 2022-10-06T17:14:28.392Z
card-last-score:: 5
- There are $n!$ (i.e., $n$ *factorial*) permutations of $n$ (distinct) objects.
- How many permutations are there of $k$ objects from $n$? #card
card-last-interval:: 14.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-23T02:49:51.823Z
card-last-reviewed:: 2022-10-08T22:49:51.823Z
card-last-score:: 5
			- The number of permutations of $k$ objects out of $n$, denoted $P(n,k)$, is
				- $$P(n,k) = n \times (n-1) \times ... \times (n - k + 1) = \frac{n!}{(n-k)!}$$
- ## The Binomial Coefficient Formula
- (1) We know that there are $P(n,k)$ permutations of $k$ objects out of $n$.
- (2) We know that
- $$P(n,k) = \frac{n!}{(n-k)!}$$
- (3) Another way of making a permutation of $k$ objects out of $n$ is to
			- (a) Choose $k$ from $n$ without order. There are $\binom{n}{k}$ ways of doing this.
- (b) Then count all the ways of ordering these $k$ objects. There are $k!$ ways of doing this.
- (c) By the Multiplicative Principle,
- $$P(n,k) = \binom{n}{k}k!$$
- (4) So now we know that
- $$\frac{n!}{(n-k)!} = \binom{n}{k}k!$$
- (5) This gives the formula
- $$\binom{n}{k} = \frac{n!}{(n-k)!k!}$$
-
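	- Steps (1) to (5) can be checked numerically for small values. This is a sketch under the assumption that enumerating with `itertools` is an acceptable stand-in for counting by hand; `permutations_count` is a hypothetical helper name, not notation from the lecture.
	  ```python
	  from itertools import combinations, permutations
	  from math import factorial
	  
	  def permutations_count(n, k):
	      """P(n, k) = n! / (n - k)!"""
	      return factorial(n) // factorial(n - k)
	  
	  n, k = 5, 3
	  num_perms = len(list(permutations(range(n), k)))  # ordered selections
	  num_combs = len(list(combinations(range(n), k)))  # unordered selections
	  
	  print(num_perms, permutations_count(n, k))  # both count P(5, 3)
	  print(num_combs * factorial(k))             # C(5, 3) * 3!, step (c)
	  print(num_perms // factorial(k))            # recovers C(5, 3), step (5)
	  ```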

View File

@ -0,0 +1,136 @@
- #[[CT213 - Computer Systems & Organisation]]
- **Previous Topic:** [[Process Management]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture 5.pdf](../assets/Lecture_5_1664977343897_0.pdf)
-
- # Scheduling
- What is **scheduling**? #card
		- **Scheduling** allows one process to use the CPU while the execution of another process is on hold (i.e., in the waiting state) due to the unavailability of a resource, such as I/O.
- It aims to make the system efficient, fast, & fair.
- It is part of the **process manager**.
- Scheduling is the mechanism that handles the ^^**removal** of the running processes from the CPU and the **selection** of another process.^^
- It is responsible for **multiplexing** processes in the CPU.
- When it is time for the **running** process to be removed from the CPU (into a *ready* or *suspended* state), a different process is selected from the set of processes in the ready state.
- The selection of another process is based on a particular strategy - the **scheduling algorithm** will determine the order in which the OS will execute the processes.
-
- ## Scheduler Organisation #card
		- When a process is changed to the *ready* state, the **enqueuer** places a pointer to the process descriptor into a **ready list**.
- Whenever the scheduler switches the CPU from executing one process to another, the **context switcher** saves the contents of all the processor registers of the process being removed into the **process' descriptor**.
- There are two types of context switch: **Voluntary** & **Involuntary**.
		- The **dispatcher** is invoked after the current process has been removed from the CPU.
- The dispatcher chooses one of the processes enqueued in the ready list and then allocates CPU to that process by performing another context switch from *itself* to the selected process.
- ![image.png](../assets/image_1664978282949_0.png)
- ## Scheduler Types
- What are the two main types of scheduler? #card
- **Cooperative** Scheduler (Voluntary CPU Sharing).
- **Preemptive** Scheduler (Involuntary CPU Sharing).
- ### Cooperative Scheduler (Voluntary CPU Sharing) #card
- Each process will **periodically invoke** the process scheduler, voluntarily sharing the CPU.
- Each process should call a function that will implement the process scheduling.
				- $\text{yield}(P_{current}, P_{next})$ (sometimes implemented as an instruction in hardware), where $P_{current}$ is an identifier of the current process and $P_{next}$ is an identifier of the next process.
- Cooperative multitasking allows much simpler implementation of applications, because their ^^execution is never unexpectedly interrupted by the process scheduler.^^
- Possible problem: If the process does not voluntarily cooperate with the others, one process could keep the CPU forever.
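			- Voluntary CPU sharing can be mimicked with Python generators, where each `yield` statement plays the role of the yield call described above. This is only an analogy under stated assumptions: `process` and `run_cooperative` are hypothetical names, and the "scheduler" here simply picks the next process in the ready list rather than taking a $P_{next}$ argument.
			  ```python
			  from collections import deque
			  
			  def process(name, steps, log):
			      """A toy process: does one unit of work, then voluntarily yields."""
			      for i in range(steps):
			          log.append(f"{name}:{i}")
			          yield  # voluntarily hand the CPU back to the scheduler
			  
			  def run_cooperative(processes):
			      """Run each process until it yields, then move on to the next one."""
			      ready = deque(processes)
			      while ready:
			          current = ready.popleft()
			          try:
			              next(current)          # run until the process yields
			              ready.append(current)  # still has work: back of the ready list
			          except StopIteration:
			              pass                   # process finished, drop it
			  
			  log = []
			  run_cooperative([process("A", 2, log), process("B", 3, log)])
			  print(log)
			  ```
			- A process whose loop never reaches `yield` would keep `run_cooperative` stuck inside `next(current)`, which is the "possible problem" noted above.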
- ### Preemptive Scheduler (Involuntary CPU Sharing) #card
- The interrupt system **enforces periodic involuntary interruption** of any process's execution; it can force a process to involuntarily execute a yield type function (or instruction).
- This is done by incorporating an **interval timer** device that produces an interrupt whenever the time expires.
				- The programmable interval timer will raise an **interrupt** every $K$ clock ticks, thus causing the hardware to execute the logical equivalent of a yield instruction to invoke the **interrupt handler**.
- The **interrupt handler** for the timer interrupt will call the scheduler to reschedule the processor **without** any action on the part of the running process.
- The scheduler decides which process is run next.
- The scheduler is guaranteed to be invoked once every $K$ clock ticks.
- Even if a certain process executes in an infinite loop, it will **not** block the execution of the other processes.
-
- ## Performance Elements
- Having a set of processes $P = \{p_i, 0 \leq i \leq n\}$.
- **Service Time -** $\tau(p_i)$: The amount of time that a process needs to spend in the active/running state before it completes.
- **Wait Time -** $W(p_i)$: The time that the process spends waiting in the ready state before its first transition to the active state.
- **Turn-around Time -** $T_{TRnd}(p_i)$: The amount of time between the moment that a process enters the ready state and the moment that the process exits the running state for the last time.
- These elements are used to measure the performance of each scheduling algorithm.
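		- As a hedged illustration (assuming a non-preemptive schedule in which a process runs to completion once dispatched and never blocks for I/O, which the definitions above do not require), the three quantities are related by
			- $$T_{TRnd}(p_i) = W(p_i) + \tau(p_i)$$
			- For example, a process that waits $7$ time units in the ready state and then needs $4$ time units of service has $T_{TRnd}(p_i) = 7 + 4 = 11$.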
-
- ## Selection Strategies #card
- ### Non-Preemptive Strategies #card
- Allow any process to run to completion once it has been allocated control of the CPU.
- A process that gets the control of the CPU releases the CPU whenever it ends or when it voluntarily gives up control of the CPU.
- ### Preemptive Strategies #card
			- The process with the highest priority among all the *ready* processes is allocated the CPU.
- All lower priority processes are made to yield to the highest priority process whenever it requests the CPU.
- The scheduler is called every time a process enters the *ready* queue as well as when an interval timer expires.
- Preemptive strategies allow for equitable resource sharing among processes, at the expense of overloading the system.
-
- ## Scheduling Algorithms
- ### First Come, First Served (FCFS) #card
- **Non-preemptive** algorithm.
- This scheduling strategy assigns priority to processes by the order in which they request the processor.
			- The priority of a process is computed by the enqueuer by **time stamping** all incoming processes and then having the dispatcher select the process that has the ^^oldest time stamp.^^
- Possible implementation: Using a FIFO data structure (where each entry points to a process descriptor). The enqueuer adds processes to the tail of the queue and the dispatcher removes processes from the head of the queue.
- Easy to implement.
- Not widely used because of ^^unpredictable **turn-around time** & **waiting time**.^^
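			- The FIFO implementation described above can be sketched as a small simulation that also reports wait & turn-around times. The tuple layout `(name, arrival_time, service_time)`, the function name `fcfs`, and the sample workload are assumptions made for illustration; context-switch cost is ignored.
			  ```python
			  from collections import deque
			  
			  def fcfs(processes):
			      """Simulate FCFS for (name, arrival_time, service_time) tuples."""
			      # The enqueuer orders processes by time stamp (arrival time).
			      queue = deque(sorted(processes, key=lambda p: p[1]))
			      clock = 0
			      results = {}
			      while queue:
			          # The dispatcher takes the process at the head of the queue.
			          name, arrival, service = queue.popleft()
			          start = max(clock, arrival)  # CPU may sit idle until arrival
			          finish = start + service     # runs to completion
			          results[name] = {"wait": start - arrival,
			                           "turnaround": finish - arrival}
			          clock = finish
			      return results
			  
			  results = fcfs([("p0", 0, 8), ("p1", 1, 4), ("p2", 2, 1)])
			  for name, stats in results.items():
			      print(name, stats)
			  ```
			- In this workload the one-tick job `p2` waits behind both longer jobs, which illustrates why FCFS wait times depend heavily on arrival order.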
- ### Shortest Job First (SJF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-06T23:00:00.000Z
card-last-reviewed:: 2022-10-06T09:46:22.373Z
card-last-score:: 1
- **Non-preemptive** algorithm.
- SJF is an optimal algorithm from the perspective of **average turn-around time** - it minimises the average turn-around time.
- Preferential service of small jobs.
- Requires the ^^knowledge of the **service time**^^ for each process.
- In extreme cases where the system has little idle time, processes with large service time will never be served.
- In the case where it is not possible to know the service time for each process, the service time is estimated using predictors.
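			- The claim that SJF minimises average turn-around time can be illustrated on a small workload. This sketch assumes all jobs are ready at time $0$ and that service times are known exactly; `average_turnaround` and the job list are hypothetical.
			  ```python
			  def average_turnaround(service_times):
			      """Average turn-around time when jobs run to completion in list order,
			      assuming every job is ready at time 0."""
			      clock = 0
			      total = 0
			      for service in service_times:
			          clock += service  # this job finishes at the current clock value
			          total += clock    # its turn-around time equals its finish time
			      return total / len(service_times)
			  
			  jobs = [8, 4, 1]                            # service times in arrival order
			  fcfs_avg = average_turnaround(jobs)         # run in arrival order
			  sjf_avg = average_turnaround(sorted(jobs))  # run shortest job first
			  print(fcfs_avg, sjf_avg)
			  ```
			- Running the short jobs first lowers the average because every job queued behind a long job inherits that job's full service time in its own turn-around time.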
- ### Shortest Remaining Time Next (SRTN) #card
- Similar to SJF, but **preemptive**.
- If a long job is mostly complete, it might have a very short time remaining, and therefore would be prioritised.
- ### Time Slice (Round Robin) #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T09:39:03.608Z
card-last-reviewed:: 2022-10-06T09:39:03.608Z
card-last-score:: 3
- **Preemptive** algorithm.
- Each process gets a time slice of CPU time, distributing the processing time equitably among all processes that are requesting the processor.
- Whenever the time slice expires, control of the CPU is given to the next process in the ready list, and the process being switched from is placed back into the ready process list.
- Time Slice implies the existence of a **specialised timer** that measures the processor time for each process.
- Every time a process becomes active, the timer is initialised.
- Not very well suited for long jobs, as the scheduler will be called multiple times until the job is done.
- Very sensitive to the size of the time slice.
- Too big -> large delays in the response time for interactive processes.
- Too small -> too much time spent running the scheduler.
- Very big -> turns into FCFS.
- The time slice is determined by analysing the number of instructions that the processor can execute in a given time slice.
-
- ### Priority-Based Preemptive Scheduling (Event Driven) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-06T23:00:00.000Z
card-last-reviewed:: 2022-10-06T09:43:34.034Z
card-last-score:: 1
- Both **preemptive** & **non-preemptive** variants exist.
- Each process has an ^^externally assigned priority.^^
- Every time an event occurs that generates a process switch, the ^^process with the highest priority^^ is chosen from the ready process list.
- There is a possibility that processes with low priority will never gain CPU time.
- There are variants with **static** & **dynamic** priorities.
- The **dynamic priority** computation solves the problem that some processes may never gain CPU time - the longer a process waits, the higher its priority becomes.
- Used for real-time systems.
- ### Multiple Level Queue Scheduling #card
card-last-interval:: 3.57
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T22:43:53.128Z
card-last-reviewed:: 2022-10-06T09:43:53.128Z
card-last-score:: 5
- Complex systems have requirements for real-time, interactive users and batch jobs - Therefore, a **combined scheduling mechanism** should be used.
- The processes are divided into **classes**.
- Each class has a process queue, and has been assigned a specific scheduling algorithm.
- Each process queue is treated according to its queue scheduling algorithm.
- Each queue is assigned a priority.
- As long as there are processes in a higher priority queue, those will be serviced.
- #### Multiple Level Queue (with Feedback) #card
- Same as MLQ, but the ^^processes can migrate from class to class^^ in a dynamic fashion.
- Different strategies to modify the priority.
- Increase the priority for a given process. (E.g., the user needs a larger share of the CPU to sustain acceptable service).
- Decrease the priority for a given process. (E.g., the user process is trying to get more CPU share, which may impact on the other users).
- If a process is giving up the CPU before its time slice expires, then the process is assigned to a higher priority queue.
- During the evolution to completion, a process may go through a number of different classes.
- Any of the previous algorithms covered may be used for treating a specific process class.


@ -0,0 +1,203 @@
- #[[CT213 - Computer Systems & Organisation]]
- **Previous Topic:** [[Process Management]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture 5.pdf](../assets/Lecture_5_1664977343897_0.pdf)
-
- # Scheduling
- What is **scheduling**? #card
card-last-interval:: 0.91
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T19:43:11.047Z
card-last-reviewed:: 2022-10-08T22:43:11.048Z
card-last-score:: 3
- **Scheduling** allows one process to use the CPU while the execution of another process is on hold (i.e., in the waiting state) because a resource that it needs, such as I/O, is unavailable.
- It aims to make the system efficient, fast, & fair.
- It is part of the **process manager**.
- Scheduling is the mechanism that handles the ^^**removal** of the running processes from the CPU and the **selection** of another process.^^
- It is responsible for **multiplexing** processes in the CPU.
- When it is time for the **running** process to be removed from the CPU (into a *ready* or *suspended* state), a different process is selected from the set of processes in the ready state.
- The selection of another process is based on a particular strategy - the **scheduling algorithm** will determine the order in which the OS will execute the processes.
-
- ## Scheduler Organisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:09:47.382Z
card-last-score:: 1
- When a process is changed to the *ready* state, the **enqueuer** places a pointer to the process descriptor into a **ready list**.
- Whenever the scheduler switches the CPU from executing one process to another, the **context switcher** saves the contents of all the processor registers of the process being removed into the **process' descriptor**.
- There are two types of context switch: **Voluntary** & **Involuntary**.
- The **dispatcher** is invoked after the current process has been removed from the CPU.
- The dispatcher chooses one of the processes enqueued in the ready list and then allocates CPU to that process by performing another context switch from *itself* to the selected process.
- ![image.png](../assets/image_1664978282949_0.png)
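- The interplay of the enqueuer, context switcher, and dispatcher can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the names `ProcessDescriptor`, `Scheduler`, `enqueue`, `context_switch_out`, and `dispatch` are hypothetical, the "registers" are a plain dictionary, and the dispatcher simply takes the head of the ready list.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ProcessDescriptor:
    pid: int
    registers: dict = field(default_factory=dict)  # saved CPU context
    state: str = "new"


class Scheduler:
    def __init__(self):
        self.ready_list = deque()  # holds references to process descriptors
        self.running = None

    def enqueue(self, proc):
        """Enqueuer: mark the process ready and add it to the ready list."""
        proc.state = "ready"
        self.ready_list.append(proc)

    def context_switch_out(self, cpu_registers):
        """Context switcher: save the CPU registers into the descriptor."""
        if self.running is not None:
            self.running.registers = dict(cpu_registers)
            self.enqueue(self.running)
            self.running = None

    def dispatch(self):
        """Dispatcher: choose a ready process and allocate the CPU to it."""
        if not self.ready_list:
            return None
        self.running = self.ready_list.popleft()
        self.running.state = "running"
        return self.running
```

- In this sketch the removed process always goes back to the ready list; a fuller model would also handle processes that block or terminate.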
- ## Scheduler Types
collapsed:: true
- What are the two main types of scheduler? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T17:41:55.585Z
card-last-reviewed:: 2022-10-08T22:41:55.587Z
card-last-score:: 5
- **Cooperative** Scheduler (Voluntary CPU Sharing).
- **Preemptive** Scheduler (Involuntary CPU Sharing).
- ### Cooperative Scheduler (Voluntary CPU Sharing) #card
card-last-interval:: 3.71
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T10:14:20.540Z
card-last-reviewed:: 2022-10-06T17:14:20.542Z
card-last-score:: 3
- Each process will **periodically invoke** the process scheduler, voluntarily sharing the CPU.
- Each process should call a function that will implement the process scheduling.
- $\text{yield}(P_{current}, P_{next})$ (sometimes implemented as an instruction in hardware), where $P_{current}$ is an identifier of the current process and $P_{next}$ is an identifier of the next process.
- Cooperative multitasking allows much simpler implementation of applications, because their ^^execution is never unexpectedly interrupted by the process scheduler.^^
- Possible problem: If the process does not voluntarily cooperate with the others, one process could keep the CPU forever.
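- Cooperative sharing can be imitated with Python generators, where each `yield` statement plays the role of a voluntary yield call. This is only an analogy under stated assumptions: `task` and `run_cooperative` are hypothetical names, and here the scheduler loop (not the yielding task) chooses the next process.

```python
from collections import deque


def task(name, steps, log):
    """A 'process' that does one unit of work, then voluntarily yields."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # voluntarily give up the CPU


def run_cooperative(tasks):
    """Run tasks in turn; a task keeps the CPU until it yields or ends."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the task's next yield
            ready.append(current)  # still has work: back of the ready list
        except StopIteration:
            pass                   # task finished, drop it


log = []
run_cooperative([task("A", 2, log), task("B", 3, log)])
print(log)
```

- A task whose body loops without ever reaching `yield` would never return control to `run_cooperative`, which is exactly the problem described above.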
- ### Preemptive Scheduler (Involuntary CPU Sharing) #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-17T19:51:16.969Z
card-last-reviewed:: 2022-10-08T22:51:16.969Z
card-last-score:: 3
- The interrupt system **enforces periodic involuntary interruption** of any process's execution; it can force a process to involuntarily execute a yield type function (or instruction).
- This is done by incorporating an **interval timer** device that produces an interrupt whenever the time expires.
- The programmable interval timer raises an **interrupt** every $K$ clock ticks, causing the hardware to execute the logical equivalent of a yield instruction and invoke the **interrupt handler**.
- The **interrupt handler** for the timer interrupt will call the scheduler to reschedule the processor **without** any action on the part of the running process.
- The scheduler decides which process is run next.
- The scheduler is guaranteed to be invoked once every $K$ clock ticks.
- Even if a certain process executes in an infinite loop, it will **not** block the execution of the other processes.
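- The effect of the interval timer can be illustrated with a small tick-by-tick simulation. This is a sketch under assumptions, not a model of real hardware: `simulate` is a hypothetical function, `work` maps each process name to the ticks it needs (`None` stands for an infinite loop), and the "interrupt" is just a counter reaching $K$.

```python
from collections import deque


def simulate(work, K, total_ticks):
    """Return which process held the CPU on each tick."""
    ready = deque(work)       # process names in arrival order
    remaining = dict(work)    # ticks still needed (None = never finishes)
    trace = []
    current, used = None, 0
    for _ in range(total_ticks):
        if current is None:
            if not ready:
                break
            current, used = ready.popleft(), 0
        trace.append(current)
        used += 1
        if remaining[current] is not None:
            remaining[current] -= 1
        if remaining[current] == 0:
            current = None             # process finished
        elif used == K:
            ready.append(current)      # timer interrupt: forced yield
            current = None
    return trace


print(simulate({"loop": None, "job": 3}, K=2, total_ticks=8))
```

- Even though `loop` never finishes, `job` still receives the CPU after every $K$ ticks and completes.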
-
- ## Performance Elements
- Having a set of processes $P = \{p_i, 0 \leq i \leq n\}$.
- **Service Time -** $\tau(p_i)$: The amount of time that a process needs to spend in the active/running state before it completes.
- **Wait Time -** $W(p_i)$: The time that the process spends waiting in the ready state before its first transition to the active state.
- **Turn-around Time -** $T_{TRnd}(p_i)$: The amount of time between the moment that a process enters the ready state and the moment that the process exits the running state for the last time.
- These elements are used to measure the performance of each scheduling algorithm.
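- As a hedged worked example (the numbers are illustrative and not from the slides): if every process enters the ready state at time 0 and, once started, runs to completion without blocking, the definitions above give
- $$T_{TRnd}(p_i) = W(p_i) + \tau(p_i)$$
- For service times $\tau = 5, 3, 1$ run in that order, $W = 0, 5, 8$ and $T_{TRnd} = 5, 8, 9$, so the average turn-around time is $\frac{5 + 8 + 9}{3} = \frac{22}{3} \approx 7.33$.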
-
- ## Selection Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:16:52.341Z
card-last-score:: 1
- ### Non-Preemptive Strategies #card
card-last-interval:: 3.09
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T19:27:02.886Z
card-last-reviewed:: 2022-10-06T17:27:02.886Z
card-last-score:: 5
- Allow any process to run to completion once it has been allocated control of the CPU.
- A process that gets the control of the CPU releases the CPU whenever it ends or when it voluntarily gives up control of the CPU.
- ### Preemptive Strategies #card
card-last-interval:: 3.1
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T12:22:34.748Z
card-last-reviewed:: 2022-10-07T10:22:34.748Z
card-last-score:: 3
- The process with the highest priority among all the *ready* processes is allocated the CPU.
- All lower priority processes are made to yield to the highest priority process whenever it requests the CPU.
- The **scheduler** is called every time a process enters the *ready* queue as well as when an interval timer expires.
- Preemptive strategies allow for equitable resource sharing among processes, at the expense of overloading the system.
-
- ## Scheduling Algorithms
- ### First Come, First Served (FCFS) #card
card-last-interval:: 2.14
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T13:31:14.568Z
card-last-reviewed:: 2022-10-07T10:31:14.569Z
card-last-score:: 3
- **Non-preemptive** algorithm.
- This scheduling strategy assigns priority to processes by the order in which they request the processor.
- The priority of a process is computed by the enqueuer by **time stamping** all incoming processes and then having the dispatcher select the process that has the ^^oldest time stamp.^^
- Possible implementation: Using a FIFO data structure (where each entry points to a process descriptor). The enqueuer adds processes to the tail of the queue and the dispatcher removes processes from the head of the queue.
- Easy to implement.
- Not widely used because of ^^unpredictable **turn-around time** & **waiting time**.^^
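- The FIFO implementation described above might look like the following sketch. The function name `fcfs` and the `(name, arrival_time, service_time)` tuples are assumptions made for illustration; the input is assumed to be sorted by arrival time already.

```python
from collections import deque


def fcfs(processes):
    """processes: list of (name, arrival_time, service_time), sorted by arrival."""
    queue = deque(processes)  # the enqueuer adds at the tail
    clock = 0
    results = {}
    while queue:
        name, arrival, service = queue.popleft()  # dispatcher takes the head
        start = max(clock, arrival)               # CPU may idle until arrival
        clock = start + service                   # runs to completion
        results[name] = {"wait": start - arrival, "turnaround": clock - arrival}
    return results


print(fcfs([("A", 0, 8), ("B", 1, 2), ("C", 2, 1)]))
```

- In this example the short jobs `B` and `C` wait 7 and 8 time units behind the long job `A`, which shows how waiting time under FCFS depends heavily on arrival order.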
- ### Shortest Job First (SJF) #card
card-last-interval:: 2.77
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T04:23:33.101Z
card-last-reviewed:: 2022-10-07T10:23:33.101Z
card-last-score:: 3
- **Non-preemptive** algorithm.
- SJF is an optimal algorithm from the perspective of **average turn-around time** - it minimises the average turn-around time.
- Preferential service of small jobs.
- Requires the ^^knowledge of the **service time**^^ for each process.
- In extreme cases where the system has little idle time, processes with large service time will never be served.
- In the case where it is not possible to know the service time for each process, the service time is estimated using predictors.
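- A small sketch of why running short jobs first lowers the average turn-around time, assuming all jobs are ready at time 0. The helper names are hypothetical. `predict_next` shows exponential averaging, which is one common way to estimate service time; the notes do not say which predictor the course has in mind, so treat it as an assumption.

```python
def average_turnaround(service_times):
    """Average turn-around time when jobs run to completion in list order."""
    clock, total = 0, 0
    for tau in service_times:
        clock += tau    # this job finishes at the current clock value
        total += clock
    return total / len(service_times)


def predict_next(previous_estimate, measured, alpha=0.5):
    """Exponential average of the last estimate and the last measured burst."""
    return alpha * measured + (1 - alpha) * previous_estimate


jobs = [8, 2, 1]
fcfs_avg = average_turnaround(jobs)          # arrival order
sjf_avg = average_turnaround(sorted(jobs))   # shortest job first
print(fcfs_avg, sjf_avg)
```

- With these illustrative numbers, sorting by service time cuts the average turn-around time from about 9.67 to 5.0.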
- ### Shortest Remaining Time Next (SRTN) #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:22:34.266Z
card-last-reviewed:: 2022-10-06T17:22:34.267Z
card-last-score:: 5
- Similar to SJF, but **preemptive**.
- If a long job is mostly complete, it might have a very short time remaining, and therefore would be prioritised.
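- A minimal SRTN simulation, assuming service times are known exactly and that the scheduler is invoked whenever a new process arrives. The function `srtn` and the tuple layout are hypothetical; ties are broken by name, which is an arbitrary choice of this sketch.

```python
import heapq


def srtn(processes):
    """processes: list of (name, arrival, service). Returns completion times."""
    pending = sorted(processes, key=lambda p: p[1])
    ready = []  # min-heap of (remaining_time, name)
    clock, i, done = 0, 0, {}
    while i < len(pending) or ready:
        if not ready:
            clock = max(clock, pending[i][1])  # CPU idles until next arrival
        while i < len(pending) and pending[i][1] <= clock:
            name, _, service = pending[i]
            heapq.heappush(ready, (service, name))
            i += 1
        remaining, name = heapq.heappop(ready)  # shortest remaining time
        next_arrival = pending[i][1] if i < len(pending) else float("inf")
        run = min(remaining, next_arrival - clock)  # run until done or preempted
        clock += run
        if remaining == run:
            done[name] = clock
        else:
            heapq.heappush(ready, (remaining - run, name))
    return done


print(srtn([("A", 0, 8), ("B", 1, 2), ("C", 2, 1)]))
```

- Here `A` is preempted at time 1 when the shorter `B` arrives, and it only resumes once `B` and `C` have finished.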
- ### Time Slice (Round Robin) #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T09:39:03.608Z
card-last-reviewed:: 2022-10-06T09:39:03.608Z
card-last-score:: 3
- **Preemptive** algorithm.
- Each process gets a time slice of CPU time, distributing the processing time equitably among all processes that are requesting the processor.
- Whenever the time slice expires, control of the CPU is given to the next process in the ready list, and the process being switched from is placed back into the ready process list.
- Time Slice implies the existence of a **specialised timer** that measures the processor time for each process.
- Every time a process becomes active, the timer is initialised.
- Not very well suited for long jobs, as the scheduler will be called multiple times until the job is done.
- Very sensitive to the size of the time slice.
- Too big -> large delays in the response time for interactive processes.
- Too small -> too much time spent running the scheduler.
- Very big -> turns into FCFS.
- The time slice is determined by analysing the number of instructions that the processor can execute in a given time slice.
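- The sensitivity to the time slice can be seen in a short simulation. This sketch assumes all processes are ready at time 0 and ignores the cost of a context switch itself; `round_robin` and its return values are hypothetical names chosen for illustration.

```python
from collections import deque


def round_robin(service_times, quantum):
    """service_times: dict name -> service time. Returns (completion, dispatches)."""
    ready = deque(service_times.items())
    clock, dispatches, completion = 0, 0, {}
    while ready:
        name, remaining = ready.popleft()
        dispatches += 1                  # the scheduler runs once per dispatch
        run = min(quantum, remaining)
        clock += run
        if remaining > run:
            ready.append((name, remaining - run))  # slice expired: requeue
        else:
            completion[name] = clock
    return completion, dispatches


jobs = {"A": 5, "B": 2, "C": 1}
for q in (1, 2, 100):
    print(q, round_robin(jobs, q))
```

- With a quantum of 1 the scheduler is invoked 8 times, with a quantum of 2 it is invoked 5 times, and with a very large quantum it is invoked 3 times and the completion order matches FCFS.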
-
- ### Priority-Based Preemptive Scheduling (Event Driven) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:19:24.014Z
card-last-score:: 1
- Both **preemptive** & **non-preemptive** variants exist.
- Each process has an ^^externally assigned priority.^^
- Every time an event occurs that generates a process switch, the ^^process with the highest priority^^ is chosen from the ready process list.
- There is a possibility that processes with low priority will never gain CPU time.
- There are variants with **static** & **dynamic** priorities.
- The **dynamic priority** computation solves the problem that some processes may never gain CPU time - the longer a process waits, the higher its priority becomes.
- Used for real-time systems.
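- Dynamic priorities are often implemented as "aging". The sketch below is one possible formulation, not necessarily the one from the lecture: `pick_next`, the dictionary keys, and `aging_rate` are hypothetical, and a larger number is assumed to mean a higher priority.

```python
def pick_next(ready, now, aging_rate=1):
    """Choose the ready process with the highest effective priority.

    ready: list of dicts with 'name', 'base_priority', 'enqueued_at'.
    The effective priority grows with the time spent waiting.
    """
    def effective(p):
        return p["base_priority"] + aging_rate * (now - p["enqueued_at"])

    return max(ready, key=effective)


ready = [
    {"name": "low", "base_priority": 1, "enqueued_at": 0},
    {"name": "high", "base_priority": 5, "enqueued_at": 9},
]
print(pick_next(ready, now=10)["name"])                # aging enabled
print(pick_next(ready, now=10, aging_rate=0)["name"])  # static priorities
```

- With aging enabled, the long-waiting `low` process overtakes `high`; with `aging_rate=0` the priorities are static and `high` always wins.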
- ### Multiple Level Queue Scheduling #card
card-last-interval:: 3.57
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T22:43:53.128Z
card-last-reviewed:: 2022-10-06T09:43:53.128Z
card-last-score:: 5
- Complex systems have requirements for real-time, interactive users and batch jobs - Therefore, a **combined scheduling mechanism** should be used.
- The processes are divided into **classes**.
- Each class has a process queue, and has been assigned a specific scheduling algorithm.
- Each process queue is treated according to its queue scheduling algorithm.
- Each queue is assigned a priority.
- As long as there are processes in a higher priority queue, those will be serviced.
- #### Multiple Level Queue (with Feedback) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:32:43.331Z
card-last-score:: 1
- Same as MLQ, but the ^^processes can migrate from class to class^^ in a dynamic fashion.
- Different strategies exist to modify the priority of a process.
- Increase the priority for a given process. (E.g., the user needs a larger share of the CPU to sustain acceptable service).
- Decrease the priority for a given process. (E.g., the user process is trying to get more CPU share, which may impact on the other users).
- If a process is giving up the CPU before its time slice expires, then the process is assigned to a higher priority queue.
- During the evolution to completion, a process may go through a number of different classes.
- Any of the previous algorithms covered may be used for treating a specific process class.
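- A minimal sketch of migration between queues. The class `FeedbackQueues` and its methods are hypothetical. Promoting a process that gives up the CPU early follows the rule above; demoting a process that uses its whole time slice is an assumption of this sketch (a common policy, but not stated in these notes).

```python
from collections import deque


class FeedbackQueues:
    def __init__(self, levels=3):
        # Index 0 is the highest-priority queue.
        self.queues = [deque() for _ in range(levels)]

    def add(self, name, level=0):
        self.queues[level].append(name)

    def pick(self):
        """Serve the first non-empty queue, highest priority first."""
        for level, queue in enumerate(self.queues):
            if queue:
                return queue.popleft(), level
        return None

    def requeue(self, name, level, used_full_slice):
        """Move the process to a new class based on its last behaviour."""
        if used_full_slice:
            level = min(level + 1, len(self.queues) - 1)  # demote (assumption)
        else:
            level = max(level - 1, 0)                     # promote
        self.queues[level].append(name)
        return level
```

- Within each level this sketch uses plain FIFO order; as noted above, any of the earlier algorithms could be substituted per class.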


@ -0,0 +1,203 @@
- #[[CT213 - Computer Systems & Organisation]]
- **Previous Topic:** [[Process Management]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture 5.pdf](../assets/Lecture_5_1664977343897_0.pdf)
-
- # Scheduling
- What is **scheduling**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-19T08:45:36.066Z
card-last-reviewed:: 2022-10-10T11:45:36.066Z
card-last-score:: 3
- **Scheduling** allows one process to use the CPU while the execution of another process is on hold (i.e., in the waiting state) because a resource that it needs, such as I/O, is unavailable.
- It aims to make the system efficient, fast, & fair.
- It is part of the **process manager**.
- Scheduling is the mechanism that handles the ^^**removal** of the running processes from the CPU and the **selection** of another process.^^
- It is responsible for **multiplexing** processes in the CPU.
- When it is time for the **running** process to be removed from the CPU (into a *ready* or *suspended* state), a different process is selected from the set of processes in the ready state.
- The selection of another process is based on a particular strategy - the **scheduling algorithm** will determine the order in which the OS will execute the processes.
-
- ## Scheduler Organisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:18.238Z
card-last-score:: 1
- When a process is changed to the *ready* state, the **enqueuer** places a pointer to the process descriptor into a **ready list**.
- Whenever the scheduler switches the CPU from executing one process to another, the **context switcher** saves the contents of all the processor registers of the process being removed into the **process' descriptor**.
- There are two types of context switch: **Voluntary** & **Involuntary**.
- The **dispatcher** is invoked after the current process has been removed from the CPU.
- The dispatcher chooses one of the processes enqueued in the ready list and then allocates CPU to that process by performing another context switch from *itself* to the selected process.
- ![image.png](../assets/image_1664978282949_0.png)
- ## Scheduler Types
collapsed:: true
- What are the two main types of scheduler? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T17:41:55.585Z
card-last-reviewed:: 2022-10-08T22:41:55.587Z
card-last-score:: 5
- **Cooperative** Scheduler (Voluntary CPU Sharing).
- **Preemptive** Scheduler (Involuntary CPU Sharing).
- ### Cooperative Scheduler (Voluntary CPU Sharing) #card
card-last-interval:: 3.71
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T10:14:20.540Z
card-last-reviewed:: 2022-10-06T17:14:20.542Z
card-last-score:: 3
- Each process will **periodically invoke** the process scheduler, voluntarily sharing the CPU.
- Each process should call a function that will implement the process scheduling.
- $\text{yield}(P_{current}, P_{next})$ (sometimes implemented as an instruction in hardware), where $P_{current}$ is an identifier of the current process and $P_{next}$ is an identifier of the next process.
- Cooperative multitasking allows much simpler implementation of applications, because their ^^execution is never unexpectedly interrupted by the process scheduler.^^
- Possible problem: If the process does not voluntarily cooperate with the others, one process could keep the CPU forever.
- ### Preemptive Scheduler (Involuntary CPU Sharing) #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-17T19:51:16.969Z
card-last-reviewed:: 2022-10-08T22:51:16.969Z
card-last-score:: 3
- The interrupt system **enforces periodic involuntary interruption** of any process's execution; it can force a process to involuntarily execute a yield type function (or instruction).
- This is done by incorporating an **interval timer** device that produces an interrupt whenever the time expires.
- The programmable interval timer raises an **interrupt** every $K$ clock ticks, causing the hardware to execute the logical equivalent of a yield instruction and invoke the **interrupt handler**.
- The **interrupt handler** for the timer interrupt will call the scheduler to reschedule the processor **without** any action on the part of the running process.
- The scheduler decides which process is run next.
- The scheduler is guaranteed to be invoked once every $K$ clock ticks.
- Even if a certain process executes in an infinite loop, it will **not** block the execution of the other processes.
-
- ## Performance Elements
- Having a set of processes $P = \{p_i, 0 \leq i \leq n\}$.
- **Service Time -** $\tau(p_i)$: The amount of time that a process needs to spend in the active/running state before it completes.
- **Wait Time -** $W(p_i)$: The time that the process spends waiting in the ready state before its first transition to the active state.
- **Turn-around Time -** $T_{TRnd}(p_i)$: The amount of time between the moment that a process enters the ready state and the moment that the process exits the running state for the last time.
- These elements are used to measure the performance of each scheduling algorithm.
-
- ## Selection Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:49.280Z
card-last-score:: 1
- ### Non-Preemptive Strategies #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-14T11:45:13.446Z
card-last-reviewed:: 2022-10-10T11:45:13.446Z
card-last-score:: 5
- Allow any process to run to completion once it has been allocated control of the CPU.
- A process that gets the control of the CPU releases the CPU whenever it ends or when it voluntarily gives up control of the CPU.
- ### Preemptive Strategies #card
card-last-interval:: 3.1
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T12:22:34.748Z
card-last-reviewed:: 2022-10-07T10:22:34.748Z
card-last-score:: 3
- The process with the highest priority among all the *ready* processes is allocated the CPU.
- All lower priority processes are made to yield to the highest priority process whenever it requests the CPU.
- The **scheduler** is called every time a process enters the *ready* queue as well as when an interval timer expires.
- Preemptive strategies allow for equitable resource sharing among processes, at the expense of overloading the system.
-
- ## Scheduling Algorithms
- ### First Come, First Served (FCFS) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:44:24.451Z
card-last-score:: 1
- **Non-preemptive** algorithm.
- This scheduling strategy assigns priority to processes by the order in which they request the processor.
- The priority of a process is computed by the enqueuer by **time stamping** all incoming processes and then having the dispatcher select the process that has the ^^oldest time stamp.^^
- Possible implementation: Using a FIFO data structure (where each entry points to a process descriptor). The enqueuer adds processes to the tail of the queue and the dispatcher removes processes from the head of the queue.
- Easy to implement.
- Not widely used because of ^^unpredictable **turn-around time** & **waiting time**.^^
- ### Shortest Job First (SJF) #card
card-last-interval:: 2.77
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T04:23:33.101Z
card-last-reviewed:: 2022-10-07T10:23:33.101Z
card-last-score:: 3
- **Non-preemptive** algorithm.
- SJF is an optimal algorithm from the perspective of **average turn-around time** - it minimises the average turn-around time.
- Preferential service of small jobs.
- Requires the ^^knowledge of the **service time**^^ for each process.
- In extreme cases where the system has little idle time, processes with large service time will never be served.
- In the case where it is not possible to know the service time for each process, the service time is estimated using predictors.
- ### Shortest Remaining Time Next (SRTN) #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:22:34.266Z
card-last-reviewed:: 2022-10-06T17:22:34.267Z
card-last-score:: 5
- Similar to SJF, but **preemptive**.
- If a long job is mostly complete, it might have a very short time remaining, and therefore would be prioritised.
- ### Time Slice (Round Robin) #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T09:39:03.608Z
card-last-reviewed:: 2022-10-06T09:39:03.608Z
card-last-score:: 3
- **Preemptive** algorithm.
- Each process gets a time slice of CPU time, distributing the processing time equitably among all processes that are requesting the processor.
- Whenever the time slice expires, control of the CPU is given to the next process in the ready list, and the process being switched from is placed back into the ready process list.
- Time Slice implies the existence of a **specialised timer** that measures the processor time for each process.
- Every time a process becomes active, the timer is initialised.
- Not very well suited for long jobs, as the scheduler will be called multiple times until the job is done.
- Very sensitive to the size of the time slice.
- Too big -> large delays in the response time for interactive processes.
- Too small -> too much time spent running the scheduler.
- Very big -> turns into FCFS.
- The time slice is determined by analysing the number of instructions that the processor can execute in a given time slice.
-
- ### Priority-Based Preemptive Scheduling (Event Driven) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:39.065Z
card-last-score:: 1
- Both **preemptive** & **non-preemptive** variants exist.
- Each process has an ^^externally assigned priority.^^
- Every time an event occurs that generates a process switch, the ^^process with the highest priority^^ is chosen from the ready process list.
- There is a possibility that processes with low priority will never gain CPU time.
- There are variants with **static** & **dynamic** priorities.
- The **dynamic priority** computation solves the problem that some processes may never gain CPU time - the longer a process waits, the higher its priority becomes.
- Used for real-time systems.
- ### Multiple Level Queue Scheduling #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:46:27.395Z
card-last-score:: 1
- Complex systems have requirements for real-time, interactive users and batch jobs - Therefore, a **combined scheduling mechanism** should be used.
- The processes are divided into **classes**.
- Each class has a process queue, and has been assigned a specific scheduling algorithm.
- Each process queue is treated according to its queue scheduling algorithm.
- Each queue is assigned a priority.
- As long as there are processes in a higher priority queue, those will be serviced.
- #### Multiple Level Queue (with Feedback) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:40:03.804Z
card-last-score:: 1
- Same as MLQ, but the ^^processes can migrate from class to class^^ in a dynamic fashion.
- Different strategies exist to modify the priority of a process.
- Increase the priority for a given process. (E.g., the user needs a larger share of the CPU to sustain acceptable service).
- Decrease the priority for a given process. (E.g., the user process is trying to get more CPU share, which may impact on the other users).
- If a process is giving up the CPU before its time slice expires, then the process is assigned to a higher priority queue.
- During the evolution to completion, a process may go through a number of different classes.
- Any of the previous algorithms covered may be used for treating a specific process class.


@ -0,0 +1,203 @@
- #[[CT213 - Computer Systems & Organisation]]
- **Previous Topic:** [[Process Management]]
- **Next Topic:** [[Process Synchronisation]]
- **Relevant Slides:** ![Lecture 5.pdf](../assets/Lecture_5_1664977343897_0.pdf)
-
- # Scheduling
- What is **scheduling**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-19T08:45:36.066Z
card-last-reviewed:: 2022-10-10T11:45:36.066Z
card-last-score:: 3
- **Scheduling** allows one process to use the CPU while the execution of another process is on hold (i.e., in the waiting state) because a resource that it needs, such as I/O, is unavailable.
- It aims to make the system efficient, fast, & fair.
- It is part of the **process manager**.
- Scheduling is the mechanism that handles the ^^**removal** of the running processes from the CPU and the **selection** of another process.^^
- It is responsible for **multiplexing** processes in the CPU.
- When it is time for the **running** process to be removed from the CPU (into a *ready* or *suspended* state), a different process is selected from the set of processes in the ready state.
- The selection of another process is based on a particular strategy - the **scheduling algorithm** will determine the order in which the OS will execute the processes.
-
- ## Scheduler Organisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:18.238Z
card-last-score:: 1
- When a process is changed to the *ready* state, the **enqueuer** places a pointer to the process descriptor into a **ready list**.
- Whenever the scheduler switches the CPU from executing one process to another, the **context switcher** saves the contents of all the processor registers of the process being removed into the **process' descriptor**.
- There are two types of context switch: **Voluntary** & **Involuntary**.
- The **dispatcher** is invoked after the current process has been removed from the CPU.
- The dispatcher chooses one of the processes enqueued in the ready list and then allocates CPU to that process by performing another context switch from *itself* to the selected process.
- ![image.png](../assets/image_1664978282949_0.png)
- ## Scheduler Types
collapsed:: true
- What are the two main types of scheduler? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T17:41:55.585Z
card-last-reviewed:: 2022-10-08T22:41:55.587Z
card-last-score:: 5
- **Cooperative** Scheduler (Voluntary CPU Sharing).
- **Preemptive** Scheduler (Involuntary CPU Sharing).
- ### Cooperative Scheduler (Voluntary CPU Sharing) #card
card-last-interval:: 3.71
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T10:14:20.540Z
card-last-reviewed:: 2022-10-06T17:14:20.542Z
card-last-score:: 3
- Each process will **periodically invoke** the process scheduler, voluntarily sharing the CPU.
- Each process should call a function that will implement the process scheduling.
- $\text{yield}(P_{current}, P_{next})$ (sometimes implemented as an instruction in hardware), where $P_{current}$ is an identifier of the current process and $P_{next}$ is an identifier of the next process.
- Cooperative multitasking allows much simpler implementation of applications, because their ^^execution is never unexpectedly interrupted by the process scheduler.^^
- Possible problem: If the process does not voluntarily cooperate with the others, one process could keep the CPU forever.
- ### Preemptive Scheduler (Involuntary CPU Sharing) #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-17T19:51:16.969Z
card-last-reviewed:: 2022-10-08T22:51:16.969Z
card-last-score:: 3
- The interrupt system **enforces periodic involuntary interruption** of any process's execution; it can force a process to involuntarily execute a yield type function (or instruction).
- This is done by incorporating an **interval timer** device that produces an interrupt whenever the time expires.
- The programmable interval timer raises an **interrupt** every $K$ clock ticks, causing the hardware to execute the logical equivalent of a yield instruction and invoke the **interrupt handler**.
- The **interrupt handler** for the timer interrupt will call the scheduler to reschedule the processor **without** any action on the part of the running process.
- The scheduler decides which process is run next.
- The scheduler is guaranteed to be invoked once every $K$ clock ticks.
- Even if a certain process executes in an infinite loop, it will **not** block the execution of the other processes.
-
- ## Performance Elements
- Consider a set of processes $P = \{p_i, 0 \leq i \leq n\}$:
- **Service Time -** $\tau(p_i)$: The amount of time that a process needs to spend in the active/running state before it completes.
- **Wait Time -** $W(p_i)$: The time that the process spends waiting in the ready state before its first transition to the active state.
- **Turn-around Time -** $T_{TRnd}(p_i)$: The amount of time between the moment that a process enters the ready state and the moment that the process exits the running state for the last time.
- These elements are used to measure the performance of each scheduling algorithm.
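- As a worked illustration, assume a non-preemptive scheduler where every process enters the ready state at time 0, the processes run in the order $p_0, p_1, \dots, p_n$, and no process blocks while running. Under those assumptions (which are ours, not a general rule):
- $$W(p_i) = \sum_{j=0}^{i-1} \tau(p_j), \qquad T_{TRnd}(p_i) = W(p_i) + \tau(p_i)$$
- For example, with service times $\tau = 5, 3, 1$, the wait times are $W = 0, 5, 8$ and the turn-around times are $T_{TRnd} = 5, 8, 9$, giving:
- $$\overline{W} = \frac{0 + 5 + 8}{3} \approx 4.33, \qquad \overline{T}_{TRnd} = \frac{5 + 8 + 9}{3} \approx 7.33$$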
-
- ## Selection Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:49.280Z
card-last-score:: 1
- ### Non-Preemptive Strategies #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-14T11:45:13.446Z
card-last-reviewed:: 2022-10-10T11:45:13.446Z
card-last-score:: 5
- Allow any process to run to completion once it has been allocated control of the CPU.
- A process that gets the control of the CPU releases the CPU whenever it ends or when it voluntarily gives up control of the CPU.
- ### Preemptive Strategies #card
card-last-interval:: 3.1
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T12:22:34.748Z
card-last-reviewed:: 2022-10-07T10:22:34.748Z
card-last-score:: 3
- The process with the highest priority among all the *ready* processes is allocated the CPU.
- All lower priority processes are made to yield to the highest priority process whenever it requests the CPU.
- The **scheduler** is called every time a process enters the *ready* queue as well as when an interval timer expires.
- Preemptive strategies allow for equitable resource sharing among processes, at the expense of overloading the system.
-
- ## Scheduling Algorithms
- ### First Come, First Served (FCFS) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:44:24.451Z
card-last-score:: 1
- **Non-preemptive** algorithm.
- This scheduling strategy assigns priority to processes by the order in which they request the processor.
- The priority of a process is computed by the enqueuer by **time stamping** all incoming processes and then having the dispatcher select the process that has the ^^oldest time stamp.^^
- Possible implementation: Using a FIFO data structure (where each entry points to a process descriptor). The enqueuer adds processes to the tail of the queue and the dispatcher removes processes from the head of the queue.
- Easy to implement.
- Not widely used because of ^^unpredictable **turn-around time** & **waiting time**.^^
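- The FIFO implementation described above can be sketched as a small simulation. The function name `fcfs` is hypothetical, and the sketch assumes that all processes enter the ready list at time 0 and never block.

```python
from collections import deque


def fcfs(service_times):
    """Simulate FCFS for processes that all enter the ready list at time 0.

    service_times: list of service times, in order of arrival.
    Returns (wait_times, turnaround_times), indexed like the input.
    """
    ready = deque(enumerate(service_times))  # enqueuer adds at the tail
    clock = 0
    wait = [0] * len(service_times)
    turnaround = [0] * len(service_times)
    while ready:
        pid, service = ready.popleft()       # dispatcher removes from the head
        wait[pid] = clock
        clock += service                     # the process runs to completion
        turnaround[pid] = clock
    return wait, turnaround


wait, turnaround = fcfs([5, 3, 1])
print(wait, turnaround)  # [0, 5, 8] [5, 8, 9]
```

- The short job with service time 1 waits 8 time units behind the longer jobs, which illustrates why wait & turn-around times under FCFS depend heavily on arrival order.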
- ### Shortest Job First (SJF) #card
card-last-interval:: 2.77
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T04:23:33.101Z
card-last-reviewed:: 2022-10-07T10:23:33.101Z
card-last-score:: 3
- **Non-preemptive** algorithm.
- SJF is an optimal algorithm from the perspective of **average turn-around time** - it minimises the average turn-around time.
- Preferential service of small jobs.
- Requires the ^^knowledge of the **service time**^^ for each process.
- In extreme cases where the system has little idle time, processes with large service time will never be served.
- In the case where it is not possible to know the service time for each process, the service time is estimated using predictors.
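- A minimal sketch of non-preemptive SJF, assuming that all processes are ready at time 0 and that their service times are known exactly. The function name `sjf` is hypothetical.

```python
def sjf(service_times):
    """Non-preemptive SJF for processes that are all ready at time 0.

    Returns (execution_order, turnaround_times); turnaround_times is
    indexed like the input.
    """
    order = sorted(range(len(service_times)), key=lambda pid: service_times[pid])
    clock = 0
    turnaround = [0] * len(service_times)
    for pid in order:
        clock += service_times[pid]   # the job runs to completion
        turnaround[pid] = clock
    return order, turnaround


order, turnaround = sjf([5, 3, 1])
print(order, turnaround)  # [2, 1, 0] [9, 4, 1]
```

- The average turn-around time here is $14/3 \approx 4.67$. For comparison, running the same jobs in arrival order (5, then 3, then 1) gives turn-around times 5, 8, & 9, an average of $22/3 \approx 7.33$.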
- ### Shortest Remaining Time Next (SRTN) #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:22:34.266Z
card-last-reviewed:: 2022-10-06T17:22:34.267Z
card-last-score:: 5
- Similar to SJF, but **preemptive**.
- If a long job is mostly complete, it might have a very short time remaining, and therefore would be prioritised.
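- A sketch of SRTN as a simulation, assuming known service times and no blocking. The function name `srtn` and the `(arrival_time, service_time)` job format are hypothetical choices for illustration; the ready list is kept as a min-heap keyed on remaining time.

```python
import heapq


def srtn(jobs):
    """Simulate SRTN. jobs: list of (arrival_time, service_time) tuples.

    Returns a list of completion times indexed like jobs.
    """
    pending = sorted((arrival, pid) for pid, (arrival, _) in enumerate(jobs))
    remaining = [service for _, service in jobs]
    completion = [None] * len(jobs)
    ready = []   # min-heap of (remaining_time, pid)
    clock = 0
    i = 0        # index of the next job in pending that has not arrived yet
    while i < len(pending) or ready:
        if not ready:
            clock = max(clock, pending[i][0])  # CPU idle: jump to next arrival
        while i < len(pending) and pending[i][0] <= clock:
            pid = pending[i][1]
            heapq.heappush(ready, (remaining[pid], pid))
            i += 1
        rem, pid = heapq.heappop(ready)
        # Run until this job finishes or the next arrival, whichever is first.
        next_arrival = pending[i][0] if i < len(pending) else float("inf")
        run = min(rem, next_arrival - clock)
        clock += run
        remaining[pid] = rem - run
        if remaining[pid] == 0:
            completion[pid] = clock
        else:
            heapq.heappush(ready, (remaining[pid], pid))  # may be preempted
    return completion


print(srtn([(0, 8), (1, 4), (2, 2)]))  # [14, 7, 4]
```

- In this run the job that arrives at time 2 with service time 2 preempts the others and finishes first, while the long job that arrived first finishes last.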
- ### Time Slice (Round Robin) #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T09:39:03.608Z
card-last-reviewed:: 2022-10-06T09:39:03.608Z
card-last-score:: 3
- **Preemptive** algorithm.
- Each process gets a time slice of CPU time, distributing the processing time equitably among all processes that are requesting the processor.
- Whenever the time slice expires, control of the CPU is given to the next process in the ready list, and the process being switched from is placed back into the ready process list.
- Time Slice implies the existence of a **specialised timer** that measures the processor time for each process.
- Every time a process becomes active, the timer is initialised.
- Not very well suited for long jobs, as the scheduler will be called multiple times until the job is done.
- Very sensitive to the size of the time slice.
- Too big -> large delays in the response time for interactive processes.
- Too small -> too much time spent running the scheduler.
- Very big -> turns into FCFS.
- The time slice is determined by analysing the number of instructions that the processor can execute in a given time slice.
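- The effect of the time slice size can be explored with a small simulation. This is a sketch under the assumptions that all processes are ready at time 0, never block, and that context switches cost no time; the function name `round_robin` is hypothetical.

```python
from collections import deque


def round_robin(service_times, time_slice):
    """Simulate Round Robin for processes that are all ready at time 0.

    Returns (completion_times, dispatches), where dispatches counts how
    many times a process was given the CPU.
    """
    ready = deque(enumerate(service_times))
    clock = 0
    dispatches = 0
    completion = [0] * len(service_times)
    while ready:
        pid, remaining = ready.popleft()
        run = min(time_slice, remaining)  # timer is initialised per activation
        clock += run
        dispatches += 1
        if remaining > run:
            ready.append((pid, remaining - run))  # back into the ready list
        else:
            completion[pid] = clock
    return completion, dispatches


print(round_robin([5, 3, 1], time_slice=2))    # ([9, 8, 5], 6)
print(round_robin([5, 3, 1], time_slice=100))  # ([5, 8, 9], 3)
```

- With a very large time slice the completion times match what FCFS would produce, and with a smaller slice the number of dispatches (and therefore the scheduling overhead in a real system) goes up.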
-
- ### Priority-Based Preemptive Scheduling (Event Driven) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:39.065Z
card-last-score:: 1
- Both **preemptive** & **non-preemptive** variants exist.
- Each process has an ^^externally assigned priority.^^
- Every time an event occurs that generates a process switch, the ^^process with the highest priority^^ is chosen from the ready process list.
- There is a possibility that processes with low priority will never gain CPU time.
- There are variants with **static** & **dynamic** priorities.
- The **dynamic priority** computation solves the problem that some processes may never gain CPU time - the longer a process waits, the higher its priority becomes.
- Used for real-time systems.
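- One possible way to compute a dynamic priority is to add a bonus proportional to the time spent waiting. The sketch below assumes that larger numbers mean higher priority; the function `pick_next`, the `aging_rate` parameter, and the dictionary fields are all hypothetical names chosen for illustration.

```python
def pick_next(ready, now, aging_rate=0.0):
    """Return the ready process with the highest effective priority.

    ready: list of dicts with 'pid', 'priority' (higher runs first) and
    'ready_since' (time the process entered the ready list).
    aging_rate: priority gained per time unit spent waiting; 0 means static.
    """
    def effective(p):
        return p["priority"] + aging_rate * (now - p["ready_since"])
    return max(ready, key=effective)


ready = [
    {"pid": "low", "priority": 1, "ready_since": 0},
    {"pid": "high", "priority": 5, "ready_since": 90},
]
print(pick_next(ready, now=100)["pid"])                  # high
print(pick_next(ready, now=100, aging_rate=0.1)["pid"])  # low
```

- With static priorities the low-priority process keeps losing to the newer high-priority one; with the waiting bonus, its long wait eventually outweighs the difference.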
- ### Multiple Level Queue Scheduling #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:46:27.395Z
card-last-score:: 1
- Complex systems have requirements for real-time processes, interactive users, and batch jobs; therefore, a **combined scheduling mechanism** should be used.
- The processes are divided into **classes**.
- Each class has a process queue, and has been assigned a specific scheduling algorithm.
- Each process queue is treated according to its queue scheduling algorithm.
- Each queue is assigned a priority.
- As long as there are processes in a higher priority queue, those will be serviced.
- #### Multiple Level Queue (with Feedback) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:40:03.804Z
card-last-score:: 1
- Same as MLQ, but the ^^processes can migrate from class to class^^ in a dynamic fashion.
- Different strategies exist to modify the priority of a process.
- Increase the priority for a given process. (E.g., the user needs a larger share of the CPU to sustain acceptable service).
- Decrease the priority for a given process. (E.g., the user process is trying to get more CPU share, which may impact on the other users).
- If a process is giving up the CPU before its time slice expires, then the process is assigned to a higher priority queue.
- During the evolution to completion, a process may go through a number of different classes.
- Any of the previous algorithms covered may be used for treating a specific process class.
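- A minimal sketch of the feedback step, assuming that level 0 is the highest-priority queue, that a process which uses its whole time slice is moved down one level, and that a process which gives up the CPU early is moved up one level. The class `FeedbackQueues` and its methods are hypothetical, and each queue is treated as FIFO for simplicity.

```python
from collections import deque


class FeedbackQueues:
    """Multi-level queues with feedback; level 0 is the highest priority."""

    def __init__(self, levels=3):
        self.queues = [deque() for _ in range(levels)]

    def add(self, pid, level=0):
        self.queues[level].append(pid)

    def pick(self):
        """Return (pid, level) from the highest-priority non-empty queue."""
        for level, queue in enumerate(self.queues):
            if queue:
                return queue.popleft(), level
        return None

    def requeue(self, pid, level, used_full_slice):
        """Feedback step: demote CPU-heavy processes, promote the others."""
        if used_full_slice:
            level = min(level + 1, len(self.queues) - 1)
        else:
            level = max(level - 1, 0)
        self.queues[level].append(pid)
        return level
```

- For example, after `mlq = FeedbackQueues()`, `mlq.add("batch")`, and `pid, level = mlq.pick()`, calling `mlq.requeue(pid, level, used_full_slice=True)` places `"batch"` in level 1, so any process still in level 0 is served before it.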

View File

@ -0,0 +1,203 @@
- #[[CT213 - Computer Systems & Organisation]]
- **Previous Topic:** [[Process Management]]
- **Next Topic:** [[Process Synchronisation]]
- **Relevant Slides:** ![Lecture 5.pdf](../assets/Lecture_5_1664977343897_0.pdf)
-
- # Scheduling
- What is **scheduling**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-19T08:45:36.066Z
card-last-reviewed:: 2022-10-10T11:45:36.066Z
card-last-score:: 3
- **Scheduling** allows one process to use the CPU while the execution of another process is on hold (i.e., in the waiting state) due to unavailability of any resource like I/O etc.
- It aims to make the system efficient, fast, & fair.
- It is part of the **process manager**.
- Scheduling is the mechanism that handles the ^^**removal** of the running processes from the CPU and the **selection** of another process.^^
- It is responsible for **multiplexing** processes in the CPU.
- When it is time for the **running** process to be removed from the CPU (into a *ready* or *suspended* state), a different process is selected from the set of processes in the ready state.
- The selection of another process is based on a particular strategy - the **scheduling algorithm** will determine the order in which the OS will execute the processes.
-
- ## Scheduler Organisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:18.238Z
card-last-score:: 1
- When a process is changed to the *ready* state, the **enqueuer** places a pointer to the process descriptor into a **ready list**.
- Whenever the scheduler switches the CPU from executing one process to another, the **context switcher** saves the contents of all the processor registers of the process being removed into the **process' descriptor**.
- There are two types of context switch: **Voluntary** & **Involuntary**.
- The **dispatcher** is invoked after the current process has been removed from the CPU.
- The dispatcher chooses one of the processes enqueued in the ready list and then allocates CPU to that process by performing another context switch from *itself* to the selected process.
- ![image.png](../assets/image_1664978282949_0.png)
- ## Scheduler Types
collapsed:: true
- What are the two main types of scheduler? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T17:41:55.585Z
card-last-reviewed:: 2022-10-08T22:41:55.587Z
card-last-score:: 5
- **Cooperative** Scheduler (Voluntary CPU Sharing).
- **Preemptive** Scheduler (Involuntary CPU Sharing).
- ### Cooperative Scheduler (Voluntary CPU Sharing) #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-23T18:31:15.409Z
card-last-reviewed:: 2022-10-20T08:31:15.409Z
card-last-score:: 5
- Each process will **periodically invoke** the process scheduler, voluntarily sharing the CPU.
- Each process should call a function that will implement the process scheduling.
- $\text{yield}(P_{current}, P_{next})$ (sometimes implemented as instruction in hardware), where $P_{current}$ is an identifier of the current process and $P_{next}$ is an identifier of the next process.
- Cooperative multitasking allows much simpler implementation of applications, because their ^^execution is never unexpectedly interrupted by the process scheduler.^^
- Possible problem: If the process does not voluntarily cooperate with the others, one process could keep the CPU forever.
- ### Preemptive Scheduler (Involuntary CPU Sharing) #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-17T19:51:16.969Z
card-last-reviewed:: 2022-10-08T22:51:16.969Z
card-last-score:: 3
- The interrupt system **enforces periodic involuntary interruption** of any process's execution; it can force a process to involuntarily execute a yield type function (or instruction).
- This is done by incorporating an **interval timer** device that produces an interrupt whenever the time expires.
- The programmable interval timer raises an **interrupt** every $K$ clock ticks, causing the hardware to execute the logical equivalent of a yield instruction and invoke the **interrupt handler**.
- The **interrupt handler** for the timer interrupt will call the scheduler to reschedule the processor **without** any action on the part of the running process.
- The scheduler decides which process is run next.
- The scheduler is guaranteed to be invoked once every $K$ clock ticks.
- Even if a certain process executes in an infinite loop, it will **not** block the execution of the other processes.
-
- ## Performance Elements
- Consider a set of processes $P = \{p_i, 0 \leq i \leq n\}$:
- **Service Time -** $\tau(p_i)$: The amount of time that a process needs to spend in the active/running state before it completes.
- **Wait Time -** $W(p_i)$: The time that the process spends waiting in the ready state before its first transition to the active state.
- **Turn-around Time -** $T_{TRnd}(p_i)$: The amount of time between the moment that a process enters the ready state and the moment that the process exits the running state for the last time.
- These elements are used to measure the performance of each scheduling algorithm.
-
- ## Selection Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:50:48.551Z
card-last-score:: 1
- ### Non-Preemptive Strategies #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-14T11:45:13.446Z
card-last-reviewed:: 2022-10-10T11:45:13.446Z
card-last-score:: 5
- Allow any process to run to completion once it has been allocated control of the CPU.
- A process that gets the control of the CPU releases the CPU whenever it ends or when it voluntarily gives up control of the CPU.
- ### Preemptive Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:31:38.171Z
card-last-score:: 1
- The process with the highest priority among all the *ready* processes is allocated the CPU.
- All lower priority processes are made to yield to the highest priority process whenever it requests the CPU.
- The **scheduler** is called every time a process enters the *ready* queue as well as when an interval timer expires.
- Preemptive strategies allow for equitable resource sharing among processes, at the expense of overloading the system.
-
- ## Scheduling Algorithms
- ### First Come, First Served (FCFS) #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-24T08:38:33.286Z
card-last-reviewed:: 2022-10-20T08:38:33.286Z
card-last-score:: 3
- **Non-preemptive** algorithm.
- This scheduling strategy assigns priority to processes by the order in which they request the processor.
- The priority of a process is computed by the enqueuer by **time stamping** all incoming processes and then having the dispatcher select the process that has the ^^oldest time stamp.^^
- Possible implementation: Using a FIFO data structure (where each entry points to a process descriptor). The enqueuer adds processes to the tail of the queue and the dispatcher removes processes from the head of the queue.
- Easy to implement.
- Not widely used because of ^^unpredictable **turn-around time** & **waiting time**.^^
- ### Shortest Job First (SJF) #card
card-last-interval:: 8.63
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-28T23:29:59.692Z
card-last-reviewed:: 2022-10-20T08:29:59.692Z
card-last-score:: 5
- **Non-preemptive** algorithm.
- SJF is an optimal algorithm from the perspective of **average turn-around time** - it minimises the average turn-around time.
- Preferential service of small jobs.
- Requires the ^^knowledge of the **service time**^^ for each process.
- In extreme cases where the system has little idle time, processes with large service time will never be served.
- In the case where it is not possible to know the service time for each process, the service time is estimated using predictors.
- ### Shortest Remaining Time Next (SRTN) #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-23T18:29:18.651Z
card-last-reviewed:: 2022-10-20T08:29:18.651Z
card-last-score:: 3
- Similar to SJF, but **preemptive**.
- If a long job is mostly complete, it might have a very short time remaining, and therefore would be prioritised.
- ### Time Slice (Round Robin) #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-24T08:30:13.805Z
card-last-reviewed:: 2022-10-20T08:30:13.806Z
card-last-score:: 3
- **Preemptive** algorithm.
- Each process gets a time slice of CPU time, distributing the processing time equitably among all processes that are requesting the processor.
- Whenever the time slice expires, control of the CPU is given to the next process in the ready list, and the process being switched from is placed back into the ready process list.
- Time Slice implies the existence of a **specialised timer** that measures the processor time for each process.
- Every time a process becomes active, the timer is initialised.
- Not very well suited for long jobs, as the scheduler will be called multiple times until the job is done.
- Very sensitive to the size of the time slice.
- Too big -> large delays in the response time for interactive processes.
- Too small -> too much time spent running the scheduler.
- Very big -> turns into FCFS.
- The time slice is determined by analysing the number of instructions that the processor can execute in a given time slice.
-
- ### Priority-Based Preemptive Scheduling (Event Driven) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:39.065Z
card-last-score:: 1
- Both **preemptive** & **non-preemptive** variants exist.
- Each process has an ^^externally assigned priority.^^
- Every time an event occurs that generates a process switch, the ^^process with the highest priority^^ is chosen from the ready process list.
- There is a possibility that processes with low priority will never gain CPU time.
- There are variants with **static** & **dynamic** priorities.
- The **dynamic priority** computation solves the problem that some processes may never gain CPU time - the longer a process waits, the higher its priority becomes.
- Used for real-time systems.
- ### Multiple Level Queue Scheduling #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:43:05.332Z
card-last-score:: 1
- Complex systems have requirements for real-time processes, interactive users, and batch jobs; therefore, a **combined scheduling mechanism** should be used.
- The processes are divided into **classes**.
- Each class has a process queue, and has been assigned a specific scheduling algorithm.
- Each process queue is treated according to its queue scheduling algorithm.
- Each queue is assigned a priority.
- As long as there are processes in a higher priority queue, those will be serviced.
- #### Multiple Level Queue (with Feedback) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:40:03.804Z
card-last-score:: 1
- Same as MLQ, but the ^^processes can migrate from class to class^^ in a dynamic fashion.
- Different strategies exist to modify the priority of a process.
- Increase the priority for a given process. (E.g., the user needs a larger share of the CPU to sustain acceptable service).
- Decrease the priority for a given process. (E.g., the user process is trying to get more CPU share, which may impact on the other users).
- If a process is giving up the CPU before its time slice expires, then the process is assigned to a higher priority queue.
- During the evolution to completion, a process may go through a number of different classes.
- Any of the previous algorithms covered may be used for treating a specific process class.

View File

@ -0,0 +1,203 @@
- #[[CT213 - Computer Systems & Organisation]]
- **Previous Topic:** [[Process Management]]
- **Next Topic:** [[Process Synchronisation]]
- **Relevant Slides:** ![Lecture 5.pdf](../assets/Lecture_5_1664977343897_0.pdf)
-
- # Scheduling
- What is **scheduling**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-19T08:45:36.066Z
card-last-reviewed:: 2022-10-10T11:45:36.066Z
card-last-score:: 3
- **Scheduling** allows one process to use the CPU while the execution of another process is on hold (i.e., in the waiting state) due to unavailability of any resource like I/O etc.
- It aims to make the system efficient, fast, & fair.
- It is part of the **process manager**.
- Scheduling is the mechanism that handles the ^^**removal** of the running processes from the CPU and the **selection** of another process.^^
- It is responsible for **multiplexing** processes in the CPU.
- When it is time for the **running** process to be removed from the CPU (into a *ready* or *suspended* state), a different process is selected from the set of processes in the ready state.
- The selection of another process is based on a particular strategy - the **scheduling algorithm** will determine the order in which the OS will execute the processes.
-
- ## Scheduler Organisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:18.238Z
card-last-score:: 1
- When a process is changed to the *ready* state, the **enqueuer** places a pointer to the process descriptor into a **ready list**.
- Whenever the scheduler switches the CPU from executing one process to another, the **context switcher** saves the contents of all the processor registers of the process being removed into the **process' descriptor**.
- There are two types of context switch: **Voluntary** & **Involuntary**.
- The **dispatcher** is invoked after the current process has been removed from the CPU.
- The dispatcher chooses one of the processes enqueued in the ready list and then allocates CPU to that process by performing another context switch from *itself* to the selected process.
- ![image.png](../assets/image_1664978282949_0.png)
- ## Scheduler Types
collapsed:: true
- What are the two main types of scheduler? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T17:41:55.585Z
card-last-reviewed:: 2022-10-08T22:41:55.587Z
card-last-score:: 5
- **Cooperative** Scheduler (Voluntary CPU Sharing).
- **Preemptive** Scheduler (Involuntary CPU Sharing).
- ### Cooperative Scheduler (Voluntary CPU Sharing) #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-23T18:31:15.409Z
card-last-reviewed:: 2022-10-20T08:31:15.409Z
card-last-score:: 5
- Each process will **periodically invoke** the process scheduler, voluntarily sharing the CPU.
- Each process should call a function that will implement the process scheduling.
- $\text{yield}(P_{current}, P_{next})$ (sometimes implemented as instruction in hardware), where $P_{current}$ is an identifier of the current process and $P_{next}$ is an identifier of the next process.
- Cooperative multitasking allows much simpler implementation of applications, because their ^^execution is never unexpectedly interrupted by the process scheduler.^^
- Possible problem: If the process does not voluntarily cooperate with the others, one process could keep the CPU forever.
- ### Preemptive Scheduler (Involuntary CPU Sharing) #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-17T19:51:16.969Z
card-last-reviewed:: 2022-10-08T22:51:16.969Z
card-last-score:: 3
- The interrupt system **enforces periodic involuntary interruption** of any process's execution; it can force a process to involuntarily execute a yield type function (or instruction).
- This is done by incorporating an **interval timer** device that produces an interrupt whenever the time expires.
- The programmable interval timer raises an **interrupt** every $K$ clock ticks, causing the hardware to execute the logical equivalent of a yield instruction and invoke the **interrupt handler**.
- The **interrupt handler** for the timer interrupt will call the scheduler to reschedule the processor **without** any action on the part of the running process.
- The scheduler decides which process is run next.
- The scheduler is guaranteed to be invoked once every $K$ clock ticks.
- Even if a certain process executes in an infinite loop, it will **not** block the execution of the other processes.
-
- ## Performance Elements
- Consider a set of processes $P = \{p_i, 0 \leq i \leq n\}$:
- **Service Time -** $\tau(p_i)$: The amount of time that a process needs to spend in the active/running state before it completes.
- **Wait Time -** $W(p_i)$: The time that the process spends waiting in the ready state before its first transition to the active state.
- **Turn-around Time -** $T_{TRnd}(p_i)$: The amount of time between the moment that a process enters the ready state and the moment that the process exits the running state for the last time.
- These elements are used to measure the performance of each scheduling algorithm.
-
- ## Selection Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:50:48.551Z
card-last-score:: 1
- ### Non-Preemptive Strategies #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-14T11:45:13.446Z
card-last-reviewed:: 2022-10-10T11:45:13.446Z
card-last-score:: 5
- Allow any process to run to completion once it has been allocated control of the CPU.
- A process that gets the control of the CPU releases the CPU whenever it ends or when it voluntarily gives up control of the CPU.
- ### Preemptive Strategies #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:31:38.171Z
card-last-score:: 1
- The process with the highest priority among all the *ready* processes is allocated the CPU.
- All lower priority processes are made to yield to the highest priority process whenever it requests the CPU.
- The **scheduler** is called every time a process enters the *ready* queue as well as when an interval timer expires.
- Preemptive strategies allow for equitable resource sharing among processes, at the expense of overloading the system.
-
- ## Scheduling Algorithms
- ### First Come, First Served (FCFS) #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-24T08:38:33.286Z
card-last-reviewed:: 2022-10-20T08:38:33.286Z
card-last-score:: 3
- **Non-preemptive** algorithm.
- This scheduling strategy assigns priority to processes by the order in which they request the processor.
- The priority of a process is computed by the enqueuer by **time stamping** all incoming processes and then having the dispatcher select the process that has the ^^oldest time stamp.^^
- Possible implementation: Using a FIFO data structure (where each entry points to a process descriptor). The enqueuer adds processes to the tail of the queue and the dispatcher removes processes from the head of the queue.
- Easy to implement.
- Not widely used because of ^^unpredictable **turn-around time** & **waiting time**.^^
- ### Shortest Job First (SJF) #card
card-last-interval:: 8.63
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-28T23:29:59.692Z
card-last-reviewed:: 2022-10-20T08:29:59.692Z
card-last-score:: 5
- **Non-preemptive** algorithm.
- SJF is an optimal algorithm from the perspective of **average turn-around time** - it minimises the average turn-around time.
- Preferential service of small jobs.
- Requires the ^^knowledge of the **service time**^^ for each process.
- In extreme cases where the system has little idle time, processes with large service time will never be served.
- In the case where it is not possible to know the service time for each process, the service time is estimated using predictors.
- ### Shortest Remaining Time Next (SRTN) #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-23T18:29:18.651Z
card-last-reviewed:: 2022-10-20T08:29:18.651Z
card-last-score:: 3
- Similar to SJF, but **preemptive**.
- If a long job is mostly complete, it might have a very short time remaining, and therefore would be prioritised.
- ### Time Slice (Round Robin) #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-24T08:30:13.805Z
card-last-reviewed:: 2022-10-20T08:30:13.806Z
card-last-score:: 3
- **Preemptive** algorithm.
- Each process gets a time slice of CPU time, distributing the processing time equitably among all processes that are requesting the processor.
- Whenever the time slice expires, control of the CPU is given to the next process in the ready list, and the process being switched from is placed back into the ready process list.
- Time Slice implies the existence of a **specialised timer** that measures the processor time for each process.
- Every time a process becomes active, the timer is initialised.
- Not very well suited for long jobs, as the scheduler will be called multiple times until the job is done.
- Very sensitive to the size of the time slice.
- Too big -> large delays in the response time for interactive processes.
- Too small -> too much time spent running the scheduler.
- Very big -> turns into FCFS.
- The time slice is determined by analysing the number of instructions that the processor can execute in a given time slice.
-
- ### Priority-Based Preemptive Scheduling (Event Driven) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:39.065Z
card-last-score:: 1
- Both **preemptive** & **non-preemptive** variants exist.
- Each process has an ^^externally assigned priority.^^
- Every time an event occurs that generates a process switch, the ^^process with the highest priority^^ is chosen from the ready process list.
- There is a possibility that processes with low priority will never gain CPU time.
- There are variants with **static** & **dynamic** priorities.
- The **dynamic priority** computation solves the problem that some processes may never gain CPU time - the longer a process waits, the higher its priority becomes.
- Used for real-time systems.
- ### Multiple Level Queue Scheduling #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-10T00:00:00.000Z
card-last-reviewed:: 2022-11-09T12:43:47.911Z
card-last-score:: 1
- Complex systems have requirements for real-time tasks, interactive users, & batch jobs; therefore, a **combined scheduling mechanism** should be used.
- The processes are divided into **classes**.
- Each class has a process queue, and has been assigned a specific scheduling algorithm.
- Each process queue is treated according to its queue scheduling algorithm.
- Each queue is assigned a priority.
- As long as there are processes in a higher priority queue, those will be serviced.
- #### Multiple Level Queue (with Feedback) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:40:03.804Z
card-last-score:: 1
- Same as MLQ, but the ^^processes can migrate from class to class^^ in a dynamic fashion.
- Different strategies exist to modify the priority of a process.
- Increase the priority for a given process. (E.g., the user needs a larger share of the CPU to sustain acceptable service).
- Decrease the priority for a given process. (E.g., the user process is trying to get more CPU share, which may impact on the other users).
- If a process is giving up the CPU before its time slice expires, then the process is assigned to a higher priority queue.
- During the evolution to completion, a process may go through a number of different classes.
- Any of the previous algorithms covered may be used for treating a specific process class.
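- A rough sketch of a two-class queue with feedback. Promotion for giving up the CPU early follows the rule above; demotion after using a whole time slice is a common convention assumed here for illustration, and `run_one` is a hypothetical helper.

```python
from collections import deque

def run_one(queues, time_slice):
    """Dispatch one process from the highest-priority non-empty queue.

    queues[0] is the highest-priority class. Each entry is (name, bursts), where
    bursts is a list of the CPU bursts that the process still wants to run.
    Returns the name of the process that ran, or None if every queue is empty.
    """
    for level, queue in enumerate(queues):
        if not queue:
            continue
        name, bursts = queue.popleft()
        if bursts[0] > time_slice:
            # used the whole slice: keep the leftover and move down one class
            bursts[0] -= time_slice
            target = min(level + 1, len(queues) - 1)
        else:
            # finished its burst within the slice: move up one class
            bursts.pop(0)
            target = max(level - 1, 0)
        if bursts:
            queues[target].append((name, bursts))
        return name
    return None

queues = [deque([("cpu_bound", [5]), ("interactive", [1, 1])]), deque()]
order = []
while (name := run_one(queues, time_slice=2)) is not None:
    order.append(name)
```

- The `cpu_bound` process is pushed down after its first slice, so it only runs again once the higher-priority queue is empty.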

@ -0,0 +1,80 @@
- #[[CT216 - Software Engineering I]]
- No previous topic
- **Next Topic:** [[Software Processes]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662846749778_0.pdf)
-
- What is **Cloud Computing**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T15:15:54.781Z
card-last-reviewed:: 2022-09-18T15:15:54.782Z
card-last-score:: 3
- **Cloud Computing** is a model for enabling convenient, on-demand network access to a ^^shared pool of configurable computing resources^^ (e.g., networks, servers, storage, applications, & services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
- What is a **Public Cloud**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T14:58:45.902Z
card-last-reviewed:: 2022-09-18T14:58:45.902Z
card-last-score:: 5
- Amazon, MS Azure, & Google Cloud are examples of **public clouds**.
- Any member of the public can sign up and start provisioning computing resources within minutes.
- They are **highly scalable** and allow an organisation to grow its infrastructure rapidly.
- What is a **Private Cloud**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:09:14.946Z
card-last-reviewed:: 2022-09-18T15:09:14.947Z
card-last-score:: 5
- Computing resources are dedicated to a single customer and not shared with other customers.
- Considered to be more **secure**.
- What is a **Hybrid Cloud**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T14:57:26.074Z
card-last-reviewed:: 2022-09-18T14:57:26.075Z
card-last-score:: 5
- A **hybrid cloud** is simply a mix of public & private cloud resources.
- An organisation may choose this option if there is a mixture in the criticality of their data or computational requirements.
- Data that doesn't require heightened security can be pushed onto the **public cloud**, while data which does can be hosted on the **private cloud**.
- ## Cloud Services
- What is **SaaS**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:27:14.210Z
card-last-reviewed:: 2022-09-19T18:27:14.211Z
card-last-score:: 3
- **Software as a Service (SaaS)** provides users with (essentially) ^^a **cloud application**, the platform on which it runs, & the platform's underlying infrastructure.^^
- What is **PaaS**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:29:40.779Z
card-last-score:: 1
- **Platform as a Service (PaaS)** provides users with ^^a platform on which applications can be developed & run, together with the underlying compute, networking, & storage infrastructure.^^
-
- What are the advantages of cloud computing? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:24:40.121Z
card-last-reviewed:: 2022-09-19T18:24:40.122Z
card-last-score:: 5
- **Elasticity** - if your application becomes very popular, you can procure new resources within minutes.
- Reduced capital expenditure.
- Economies of scale.
- What are the disadvantages of cloud computing? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:05:53.952Z
card-last-reviewed:: 2022-09-19T18:05:53.952Z
card-last-score:: 5
- Security / privacy
- Cost
- Migration issues

@ -0,0 +1,80 @@
- #[[CT216 - Software Engineering I]]
- No previous topic
- **Next Topic:** [[Software Processes]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662846749778_0.pdf)
-
- What is **Cloud Computing**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:29:53.364Z
card-last-reviewed:: 2022-10-01T13:29:53.365Z
card-last-score:: 5
- **Cloud Computing** is a model for enabling convenient, on-demand network access to a ^^shared pool of configurable computing resources^^ (e.g., networks, servers, storage, applications, & services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
- What is a **Public Cloud**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T16:16:02.512Z
card-last-reviewed:: 2022-09-30T12:16:02.512Z
card-last-score:: 5
- Amazon, MS Azure, & Google Cloud are examples of **public clouds**.
- Any member of the public can sign up and start provisioning computing resources within minutes.
- They are **highly scalable** and allow an organisation to grow its infrastructure rapidly.
- What is a **Private Cloud**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:17:34.107Z
card-last-reviewed:: 2022-10-01T13:17:34.107Z
card-last-score:: 5
- Computing resources are dedicated to a single customer and not shared with other customers.
- Considered to be more **secure**.
- What is a **Hybrid Cloud**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T16:15:46.810Z
card-last-reviewed:: 2022-09-30T12:15:46.810Z
card-last-score:: 5
- A **hybrid cloud** is simply a mix of public & private cloud resources.
- An organisation may choose this option if there is a mixture in the criticality of their data or computational requirements.
- Data that doesn't require heightened security can be pushed onto the **public cloud**, while data which does can be hosted on the **private cloud**.
- ## Cloud Services
- What is **SaaS**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T21:29:20.765Z
card-last-reviewed:: 2022-10-03T14:29:20.766Z
card-last-score:: 3
- **Software as a Service (SaaS)** provides users with (essentially) ^^a **cloud application**, the platform on which it runs, & the platform's underlying infrastructure.^^
- What is **PaaS**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-17T13:41:30.969Z
card-last-reviewed:: 2022-10-06T09:41:30.970Z
card-last-score:: 5
- **Platform as a Service (PaaS)** provides users with ^^a platform on which applications can be developed & run, together with the underlying compute, networking, & storage infrastructure.^^
-
- What are the advantages of cloud computing? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:27:49.635Z
card-last-reviewed:: 2022-10-03T14:27:49.635Z
card-last-score:: 3
- **Elasticity** - if your application becomes very popular, you can procure new resources within minutes.
- Reduced capital expenditure.
- Economies of scale.
- What are the disadvantages of cloud computing? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:43:27.437Z
card-last-reviewed:: 2022-10-03T11:43:27.437Z
card-last-score:: 5
- Security / privacy
- Cost
- Migration issues

@ -0,0 +1,156 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Convex Polyhedra]]
- **Next Topic:** [[Trees]]
- **Relevant Slides:** ![MA284-Week10.pdf](../assets/MA284-Week10_1667999565189_0.pdf)
-
- # Vertex Colouring
- There are maps that can be coloured with a single colour, two colours, three colours, or four colours.
- For all maps, no matter how complicated, four colours are always sufficient.
- # Colouring Graphs
- If we think of a map as a way of showing which regions share borders, then we can represent it as a **graph**, where:
- A vertex in the graph corresponds to a region in the map.
- There is an edge between two vertices in the graph if the corresponding regions share a border.
- Colouring regions of a map corresponds to colouring vertices of the graph. Since neighbouring regions in the map must have different colours, so too must adjacent vertices.
- More precisely:
- **Vertex Colouring:** An assignment of colours to the vertices of a graph.
- **Proper Colouring:** If the vertex colouring has the property that adjacent vertices are coloured differently, then the colouring is called **proper**.
- **Minimal Colouring:** A proper colouring that is done with the fewest possible number of colours.
- Lots of different proper colourings are possible. If the graph has $v$ vertices, then clearly at most $v$ colours are needed. However, usually, we need far fewer.
- ## Chromatic Numbers
- What is the **chromatic number** of a graph? #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T10:50:36.778Z
card-last-reviewed:: 2022-11-14T15:50:36.779Z
card-last-score:: 5
- The **chromatic number** of a graph $G$, written $\chi(G)$, is the smallest number of colours needed to get a proper vertex colouring of $G$.
- ### Example
- Determine the **chromatic number** of the graphs $C_2$, $C_3$, $C_4$, & $C_5$.
background-color:: green
- $$\chi(C_2) = 2$$
- $$\chi(C_3) = 3$$
- $$\chi(C_4) = 2$$
- $$\chi(C_5) = 3$$
- Determine the **chromatic number** of $K_n$ & $K_{p,q}$ for any $n$, $p$, $q$.
background-color:: green
- $$\chi(K_4) = 4$$
- $$\chi(K_n) = n$$
- $$\chi(K_{3,3}) = 2$$
- $$\chi(K_{p,q}) = 2$$
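- For small graphs, chromatic numbers like these can be checked by brute force. The sketch below tries every assignment of $k$ colours for $k = 1, 2, \dots$ and is only practical for a handful of vertices; the helper names (`chromatic_number`, `cycle`, `complete`) are made up for this example.

```python
from itertools import product

def chromatic_number(vertices, edges):
    """Smallest k for which some assignment of k colours is proper (brute force)."""
    for k in range(1, len(vertices) + 1):
        for assignment in product(range(k), repeat=len(vertices)):
            colour = dict(zip(vertices, assignment))
            if all(colour[u] != colour[v] for u, v in edges):
                return k
    return len(vertices)

def cycle(n):
    """Vertices and edges of the cycle graph C_n."""
    return list(range(n)), [(i, (i + 1) % n) for i in range(n)]

def complete(n):
    """Vertices and edges of the complete graph K_n."""
    return list(range(n)), [(i, j) for i in range(n) for j in range(i + 1, n)]

cycle_results = {n: chromatic_number(*cycle(n)) for n in (2, 3, 4, 5)}
k4_result = chromatic_number(*complete(4))
k33_result = chromatic_number(list(range(6)), [(i, j) for i in range(3) for j in range(3, 6)])
```

- Even cycles come out as 2 & odd cycles as 3, while $K_4$ needs 4 colours and $K_{3,3}$ needs only 2.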
- In general, calculating $\chi(G)$ is not easy, but there are some ideas that can help. For example, it is clearly true that if a graph has $v$ vertices, then
- $$1 \leq \chi(G) \leq v$$
- ### Cliques
- If the graph happens to be **complete**, then $\chi(G) = v$. If it is **not** complete, then we can look at ***cliques*** in the graph.
- What is a **clique** of a graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:14:53.505Z
card-last-score:: 1
- A **clique** is a subgraph of a graph, all of whose vertices are connected to each other.
- (Clique numbers will not be on the exam).
- What is the **clique number** of a graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:21:22.385Z
card-last-score:: 1
- The **clique number** of a graph, $G$, is the number of vertices in the largest clique in $G$.
- **Lower Bound:** The chromatic number of a graph is *at least* its clique number.
- **Upper Bound:** $\chi(G) \leq \Delta(G) + 1$, where $\Delta(G)$ denotes the largest degree of any vertex in the graph $G$.
- # Algorithms for $\chi(G)$
- In general, finding a minimal colouring for a graph is hard. There are some algorithms that are efficient, but not always optimal. We'll look at two:
- The **Greedy Algorithm**.
- The **Welsh-Powell Algorithm**.
- The **Greedy Algorithm** is simple & efficient, but the result can depend on the ordering of the vertices.
- The **Welsh-Powell Algorithm** is slightly more complicated, but can give better colourings.
- ## Greedy Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:22:55.701Z
card-last-score:: 1
- 1. Number all the vertices. Number your colours.
2. Give a colour to vertex 1.
  3. Take the remaining vertices in order. Assign each one the lowest-numbered colour that is different from the colours of its neighbours.
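- The three steps above can be sketched as follows. The adjacency-dictionary representation and the name `greedy_colouring` are choices made for this illustration.

```python
def greedy_colouring(adjacency, order):
    """Colour vertices in the given order with the lowest-numbered free colour.

    adjacency: dict mapping each vertex to the set of its neighbours.
    Colours are numbered 1, 2, 3, ...
    """
    colour = {}
    for v in order:
        used = {colour[u] for u in adjacency[v] if u in colour}
        c = 1
        while c in used:
            c += 1
        colour[v] = c
    return colour

# The path graph 1 - 2 - 3 - 4, coloured using two different vertex orderings.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
good = greedy_colouring(path, order=[1, 2, 3, 4])
bad = greedy_colouring(path, order=[1, 4, 2, 3])
```

- The first ordering uses 2 colours, but the second uses 3, which shows how the result depends on the ordering of the vertices.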
- ## Welsh-Powell Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:51:29.529Z
card-last-score:: 1
- 1. List all vertices in decreasing order of their degree (i.e., largest degree first). If two or more share the same degree, list them in any way you want.
2. Colour the first listed vertex (with the first unused colour).
  3. Work down the list, giving that colour to all vertices **not** connected to one already given that colour.
  4. Cross the coloured vertices off the list, and return to the start of the list, repeating from step 2 until every vertex is coloured.
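- A sketch of the Welsh-Powell steps, assuming the graph is given as a dictionary from each vertex to the set of its neighbours (the function name `welsh_powell` and the example graph are illustrative).

```python
def welsh_powell(adjacency):
    """Welsh-Powell colouring; returns a dict vertex -> colour number (1, 2, ...)."""
    # Step 1: list the vertices by decreasing degree (ties keep their original order).
    remaining = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=True)
    colour = {}
    current = 0
    while remaining:
        current += 1  # the first unused colour
        same_colour = []
        for v in remaining:
            # Steps 2-3: take v if it is not adjacent to a vertex already given this colour.
            if all(u not in adjacency[v] for u in same_colour):
                same_colour.append(v)
                colour[v] = current
        # Step 4: cross the coloured vertices off the list and start again.
        remaining = [v for v in remaining if v not in colour]
    return colour

# A triangle a-b-c with an extra vertex d attached to a.
graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
result = welsh_powell(graph)
```

- Here `a` gets colour 1, `b` & `d` share colour 2, and `c` gets colour 3, which is minimal since the graph contains a triangle.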
- # Eulerian Paths & Circuits
- Recall that a **path** is a sequence of adjacent vertices in a graph.
- What is an **Eulerian Path**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:18:26.028Z
card-last-score:: 1
- An **Eulerian Path** (also called an *Euler Path* or an *Eulerian trail*) in a graph is a path which uses every edge exactly once.
- ![image.png](../assets/image_1668164848583_0.png)
- Recall that a **circuit** is a path that begins & ends at the same vertex, and in which no edge is repeated.
- What is an **Eulerian Circuit**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:21:29.522Z
card-last-score:: 1
- An **Eulerian Circuit** (also called an *Eulerian Cycle*) is an *Eulerian path* that starts & finishes at the same vertex.
- If a graph has such a circuit, we say that it is *Eulerian*.
- It is possible to come up with a condition that guarantees that a graph has an Eulerian Path, and, additionally, one that ensures that it has an Eulerian Circuit.
- To begin with, we'll reason that the following graph could *not* have an Eulerian circuit, although it *does* have an Eulerian path.
- ![image.png](../assets/image_1668165195209_0.png)
- Suppose, first, that we have a graph that ==**does** have an Eulerian circuit.== Then, for every edge in the circuit that "exits" a vertex, there is another that "enters" that vertex. So, every vertex must have even degree.
- A graph has an **Eulerian Circuit** if and only if every vertex has even degree.
- Next, suppose that a graph ==does **not** have an Eulerian circuit==, but does have an **Eulerian path**. Then, the degree at the "start" & "end" vertices must be odd, and every other vertex has even degree.
- A graph has an **Eulerian Path** if and only if it has either **zero** or **two** vertices with odd degree.
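- The two degree conditions above are easy to test in code. The sketch below assumes the graph is connected (it does not check this) and uses a made-up function name, `euler_classification`.

```python
def euler_classification(adjacency):
    """Classify a connected graph by counting its odd-degree vertices.

    adjacency: dict mapping each vertex to a list of its neighbours.
    Assumes the graph is connected; this sketch does not verify that.
    """
    odd = [v for v, neighbours in adjacency.items() if len(neighbours) % 2 == 1]
    if len(odd) == 0:
        return "circuit"  # every vertex has even degree
    if len(odd) == 2:
        return "path"     # a path exists, starting & ending at the two odd vertices
    return "neither"

c4 = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
path3 = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
kinds = [euler_classification(g) for g in (c4, path3, k4)]
```

- $C_4$ has an Eulerian circuit, the 3-vertex path has only an Eulerian path, and $K_4$ (four vertices of degree 3) has neither.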
- # Hamiltonian Paths & Cycles
- What is a **Hamiltonian Path**? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-24T08:06:13.770Z
card-last-reviewed:: 2022-11-21T13:06:13.770Z
card-last-score:: 5
- A **Hamiltonian Path** is a path that visits every vertex exactly once.
- What is a **Hamiltonian Cycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:23:21.600Z
card-last-score:: 1
- A **Hamiltonian Cycle** is a cycle which visits the start / end vertex twice, and every other vertex exactly once.
- A graph that has a Hamiltonian Cycle is called a **Hamiltonian Graph**.
- Important examples of Hamiltonian Graphs include cycle graphs, complete graphs, & graphs of the platonic solids.
- In general, the problem of finding a Hamiltonian path or cycle in a large graph is hard - it is known to be NP-complete. However, there are two relatively simple *sufficient conditions* for testing whether a graph is Hamiltonian:
- ## Ore's Theorem #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:50:30.834Z
card-last-score:: 1
- A graph with $v$ vertices, where $v \geq 3$, is **Hamiltonian** if, for every pair of non-adjacent vertices, the sum of their degrees is $\geq v$.
- ## Dirac's Theorem #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:15:37.739Z
card-last-score:: 1
- A simple graph with $v$ vertices, where $v \geq 3$, is **Hamiltonian** if every vertex has degree $\geq v / 2$.
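- Both theorems are simple degree checks, sketched below for a simple graph stored as a dictionary from each vertex to the set of its neighbours. The function names are illustrative. Note that the conditions are only *sufficient*: a graph can fail both and still be Hamiltonian.

```python
from itertools import combinations

def satisfies_dirac(adjacency):
    """True if v >= 3 and every vertex has degree >= v / 2."""
    v = len(adjacency)
    return v >= 3 and all(len(nbrs) >= v / 2 for nbrs in adjacency.values())

def satisfies_ore(adjacency):
    """True if v >= 3 and deg(a) + deg(b) >= v for every non-adjacent pair a, b."""
    v = len(adjacency)
    return v >= 3 and all(
        len(adjacency[a]) + len(adjacency[b]) >= v
        for a, b in combinations(adjacency, 2)
        if b not in adjacency[a]
    )

k4 = {i: set(range(4)) - {i} for i in range(4)}
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
k4_checks = (satisfies_dirac(k4), satisfies_ore(k4))
c5_checks = (satisfies_dirac(c5), satisfies_ore(c5))
```

- $K_4$ passes both tests. $C_5$ fails both, even though it is clearly Hamiltonian (the cycle itself is a Hamiltonian Cycle).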
-
-

@ -0,0 +1,156 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Convex Polyhedra]]
- **Next Topic:** [[Trees]]
- **Relevant Slides:** ![MA284-Week10.pdf](../assets/MA284-Week10_1667999565189_0.pdf)
-
- # Vertex Colouring
- There are maps that can be coloured with a single colour, two colours, three colours, or four colours.
- For all maps, no matter how complicated, four colours are always sufficient.
- # Colouring Graphs
- If we think of a map as a way of showing which regions share borders, then we can represent it as a **graph**, where:
- A vertex in the graph corresponds to a region in the map.
- There is an edge between two vertices in the graph if the corresponding regions share a border.
- Colouring regions of a map corresponds to colouring vertices of the graph. Since neighbouring regions in the map must have different colours, so too must adjacent vertices.
- More precisely:
- **Vertex Colouring:** An assignment of colours to the vertices of a graph.
- **Proper Colouring:** If the vertex colouring has the property that adjacent vertices are coloured differently, then the colouring is called **proper**.
- **Minimal Colouring:** A proper colouring that is done with the fewest possible number of colours.
- Lots of different proper colourings are possible. If the graph has $v$ vertices, then clearly at most $v$ colours are needed. However, usually, we need far fewer.
- ## Chromatic Numbers
- What is the **chromatic number** of a graph? #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T10:50:36.778Z
card-last-reviewed:: 2022-11-14T15:50:36.779Z
card-last-score:: 5
- The **chromatic number** of a graph $G$, written $\chi(G)$, is the smallest number of colours needed to get a proper vertex colouring of $G$.
- ### Example
- Determine the **chromatic number** of the graphs $C_2$, $C_3$, $C_4$, & $C_5$.
background-color:: green
- $$\chi(C_2) = 2$$
- $$\chi(C_3) = 3$$
- $$\chi(C_4) = 2$$
- $$\chi(C_5) = 3$$
- Determine the **chromatic number** of $K_n$ & $K_{p,q}$ for any $n$, $p$, $q$.
background-color:: green
- $$\chi(K_4) = 4$$
- $$\chi(K_n) = n$$
- $$\chi(K_{3,3}) = 2$$
- $$\chi(K_{p,q}) = 2$$
- In general, calculating $\chi(G)$ is not easy, but there are some ideas that can help. For example, it is clearly true that if a graph has $v$ vertices, then
- $$1 \leq \chi(G) \leq v$$
- ### Cliques
- If the graph happens to be **complete**, then $\chi(G) = v$. If it is **not** complete, then we can look at ***cliques*** in the graph.
- What is a **clique** of a graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:14:53.505Z
card-last-score:: 1
- A **clique** is a subgraph of a graph, all of whose vertices are connected to each other.
- (Clique numbers will not be on the exam).
- What is the **clique number** of a graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:21:22.385Z
card-last-score:: 1
- The **clique number** of a graph, $G$, is the number of vertices in the largest clique in $G$.
- **Lower Bound:** The chromatic number of a graph is *at least* its clique number.
- **Upper Bound:** $\chi(G) \leq \Delta(G) + 1$, where $\Delta(G)$ denotes the largest degree of any vertex in the graph $G$.
- # Algorithms for $\chi(G)$
- In general, finding a minimal colouring for a graph is hard. There are some algorithms that are efficient, but not always optimal. We'll look at two:
- The **Greedy Algorithm**.
- The **Welsh-Powell Algorithm**.
- The **Greedy Algorithm** is simple & efficient, but the result can depend on the ordering of the vertices.
- The **Welsh-Powell Algorithm** is slightly more complicated, but can give better colourings.
- ## Greedy Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:22:55.701Z
card-last-score:: 1
- 1. Number all the vertices. Number your colours.
2. Give a colour to vertex 1.
  3. Take the remaining vertices in order. Assign each one the lowest-numbered colour that is different from the colours of its neighbours.
- ## Welsh-Powell Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:51:29.529Z
card-last-score:: 1
- 1. List all vertices in decreasing order of their degree (i.e., largest degree first). If two or more share the same degree, list them in any way you want.
2. Colour the first listed vertex (with the first unused colour).
  3. Work down the list, giving that colour to all vertices **not** connected to one already given that colour.
  4. Cross the coloured vertices off the list, and return to the start of the list, repeating from step 2 until every vertex is coloured.
- # Eulerian Paths & Circuits
- Recall that a **path** is a sequence of adjacent vertices in a graph.
- What is an **Eulerian Path**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:18:26.028Z
card-last-score:: 1
- An **Eulerian Path** (also called an *Euler Path* or an *Eulerian trail*) in a graph is a path which uses every edge exactly once.
- ![image.png](../assets/image_1668164848583_0.png)
- Recall that a **circuit** is a path that begins & ends at the same vertex, and in which no edge is repeated.
- What is an **Eulerian Circuit**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:21:29.522Z
card-last-score:: 1
- An **Eulerian Circuit** (also called an *Eulerian Cycle*) is an *Eulerian path* that starts & finishes at the same vertex.
- If a graph has such a circuit, we say that it is *Eulerian*.
- It is possible to come up with a condition that guarantees that a graph has an Eulerian Path, and, additionally, one that ensures that it has an Eulerian Circuit.
- To begin with, we'll reason that the following graph could *not* have an Eulerian circuit, although it *does* have an Eulerian path.
- ![image.png](../assets/image_1668165195209_0.png)
- Suppose, first, that we have a graph that ==**does** have an Eulerian circuit.== Then, for every edge in the circuit that "exits" a vertex, there is another that "enters" that vertex. So, every vertex must have even degree.
- A graph has an **Eulerian Circuit** if and only if every vertex has even degree.
- Next, suppose that a graph ==does **not** have an Eulerian circuit==, but does have an **Eulerian path**. Then, the degree at the "start" & "end" vertices must be odd, and every other vertex has even degree.
- A graph has an **Eulerian Path** if and only if it has either **zero** or **two** vertices with odd degree.
- # Hamiltonian Paths & Cycles
- What is a **Hamiltonian Path**? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-24T08:06:13.770Z
card-last-reviewed:: 2022-11-21T13:06:13.770Z
card-last-score:: 5
- A **Hamiltonian Path** is a path that visits every vertex exactly once.
- What is a **Hamiltonian Cycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:23:21.600Z
card-last-score:: 1
- A **Hamiltonian Cycle** is a cycle which visits the start / end vertex twice, and every other vertex exactly once.
- A graph that has a Hamiltonian Cycle is called a **Hamiltonian Graph**.
- Important examples of Hamiltonian Graphs include cycle graphs, complete graphs, & graphs of the platonic solids.
- In general, the problem of finding a Hamiltonian path or cycle in a large graph is hard - it is known to be NP-complete. However, there are two relatively simple *sufficient conditions* for testing whether a graph is Hamiltonian:
- ## Ore's Theorem #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:50:30.834Z
card-last-score:: 1
- A graph with $v$ vertices, where $v \geq 3$, is **Hamiltonian** if, for every pair of non-adjacent vertices, the sum of their degrees is $\geq v$.
- ## Dirac's Theorem #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:15:37.739Z
card-last-score:: 1
- A simple graph with $v$ vertices, where $v \geq 3$, is **Hamiltonian** if every vertex has degree $\geq v / 2$.
-
-

@ -0,0 +1,87 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Binomial Coefficients]]
- **Next Topic:** null
- **Relevant Slides:** ![MA284-Week04.pdf](../assets/MA284-Week04_1664365603740_0.pdf)
-
- # Pascal's Triangle
- **Pascal's Identity:** A recurrence relation for $\binom{n}{k}$:
- $$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$$
- In the previous topic, we "proved" that $\binom{n}{k} = \frac{n!}{(n-k)!k!}$ by counting $P(n,k)$ in two different ways.
- This is a classic example of a **Combinatorial Proof**, where we establish a formula by counting something in 2 different ways.
- ![image.png](../assets/image_1664366784504_0.png)
- Binomial coefficients have many important properties. Looking at their arrangement in Pascal's Triangle, we can spot some:
- (i) For all $n$, $\binom{n}{0} = \binom{n}{n} = 1$.
- (ii) $\displaystyle\sum^{n}_{i=0} \binom{n}{i}= 2^n$
- (iii) $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$ (Pascal's Identity).
- (iv) $\binom{n}{k} = \binom{n}{n-k}$
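- Properties (i) to (iv) can be checked numerically for small $n$. The sketch below builds the triangle using only Pascal's Identity and compares it against Python's `math.comb`; the function name `pascal_rows` is made up for this example.

```python
from math import comb

def pascal_rows(n_max):
    """Build rows 0..n_max of Pascal's Triangle using only Pascal's Identity."""
    rows = [[1]]
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # interior entries: C(n, k) = C(n-1, k-1) + C(n-1, k); both ends are 1
        rows.append([1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1])
    return rows

rows = pascal_rows(6)
matches_formula = all(row == [comb(n, k) for k in range(n + 1)] for n, row in enumerate(rows))
row_sums = [sum(row) for row in rows]          # property (ii): powers of 2
symmetric = all(row == row[::-1] for row in rows)  # property (iv)
```

- This is a numerical check for the first few rows only, not a proof.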
-
- # Algebraic & Combinatorial Proofs
- Proofs of identities involving **Binomial Coefficients** can be classified as:
- **Algebraic:** If they rely mainly on the formula for binomial coefficients.
- $$
\binom{n}{k} = \frac{n!}{k!(n-k)!}
\newline
  \therefore \binom{n}{n-k} = \frac{n!}{(n-k)!(n-(n-k))!} = \frac{n!}{(n-k)!k!} = \binom{n}{k}
$$
- **Combinatorial:** If they involve counting a set in two different ways.
- Let $A$ be a set of size $n$.
- $\binom{n}{k} =$ number of subsets of $A$ of cardinality $k$, but for each such subset there is a one-to-one correspondence with a subset of size $n-k$.
- i.e., $\binom{n}{n-k} = \binom{n}{k}$.
- ## Algebraic Proof of Pascal's Triangle Recurrence Relation
- $$\binom{n}{k} = \frac{n!}{k!(n-k)!} \newline \newline$$
- $$\binom{n-1}{k-1} + \binom{n-1}{k} = \frac{(n-1)!}{(k-1)!(n-k)!} + \frac{(n-1)!}{k!(n-k-1)!}$$
- $$ = \frac{k(n-1)!}{k(k-1)!(n-k)!} + \frac{(n-1)!(n-k)}{k!(n-k-1)!(n-k)} $$
- $$= \frac{k(n-1)!+(n-k)(n-1)!}{k!(n-k)!}$$
- $$= \frac{(n-1)!(k+n-k)}{k!(n-k)!} = \frac{n!}{k!(n-k)!} = \binom{n}{k}$$
-
-
- ## Example:
- Let $A$ be a set with $n$ elements.
- Then, the total number of subsets of $A$ can be counted as follows:
- Generic subset: An element is either in it or not
- For each of $n$ elements, there are 2 choices: in or not in.
- By the Multiplicative Principle, there are $2 \times 2 \times \dots \times 2 = 2^n$ such subsets [$2^n = |P(n)|$].
- Number of subsets with:
- 0
-
-
-
- # How Combinatorial Proofs Work
- ## Which are better: Algebraic or Combinatorial proofs?
- When we first study discrete mathematics, **algebraic** proofs may seem to be the easiest: they rely only on some standard formulae, and don't require any deeper insight. They are also more "familiar".
- However:
- Often, algebraic proofs are quite tricky.
- Usually, algebraic proofs give no insight as to why a fact is true.
- ## Example
- We wish to show that:
- $${\binom{n}{0}}^2 + {\binom{n}{1}}^2 + {\binom{n}{2}}^2 + ... + {\binom{n}{n}}^2 = \binom{2n}{n}$$
- We note that $\binom{2n}{n}$ is the total number of subsets of size $n$ in a set with $2n$ elements.
- Let $A$ be a set with $2n$ elements, and label them $A = \{a_1, a_2, ..., a_n, a_{n+1}, ..., a_{2n}\}$.
- Any subset of $A$ with $n$ elements has $k$ elements from $\{a_1, a_2, ..., a_n\}$ and $n-k$ elements from $\{a_{n+1}, ..., a_{2n}\}$, where $k$ ranges from $0$ to $n$.
- There are $\displaystyle \binom{n}{k} \cdot \binom{n}{n-k}$ ways of choosing these $n$ elements by the Multiplicative Principle.
- So the total number of subsets with $n$ elements is $\displaystyle\sum^n_{k=0} \binom{n}{k} \binom{n}{n-k}$. Noting that $\displaystyle \binom{n}{n-k} = \binom{n}{k}$, the result follows.
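- The counting argument above can be checked by brute force for a small case. The sketch takes $n = 4$ (an arbitrary choice), lists every 4-element subset of $\{0, ..., 7\}$, and tallies them by $k$, the number of elements that come from the first half.

```python
from collections import Counter
from itertools import combinations
from math import comb

n = 4
first_half = set(range(n))  # plays the role of {a_1, ..., a_n}

# Tally each n-element subset of the 2n-element set by k = elements taken from the first half.
tally = Counter(
    sum(1 for x in subset if x in first_half)
    for subset in combinations(range(2 * n), n)
)
total = sum(tally.values())
```

- Each `tally[k]` equals $\binom{4}{k}\binom{4}{4-k} = \binom{4}{k}^2$, and the total is $\binom{8}{4} = 70$.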
- ## Example
- Using a combinatorial argument, or otherwise, prove that
- $$k\binom{n}{k} = n\binom{n-1}{k-1}$$
- **Combinatorial Proof**:
- Suppose we have a panel of $n$ players and we need to choose a team of $k$ players with a distinguished player (e.g., the goalkeeper).
- We count how many ways we can do this.
- [A] Pick the team, then pick the goalie.
- By the Multiplicative Principle, we can pick a team of $k$ from $n$ in $\binom{n}{k}$ ways, and then have $k$ ways of choosing a keeper from it.
- $$= k\binom{n}{k}$$
- [B] Pick the goalie, then pick the remainder of the team.
- We have $n$ choices for the goalie, then choose $k-1$ from the $n-1$ remaining players.
- By the Multiplicative Principle we have $\displaystyle n\binom{n-1}{k-1}$ ways.
- Result follows.
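- The two ways of counting, [A] and [B], can be enumerated directly for a small panel. The values $n = 6$ & $k = 3$ are arbitrary choices for this sketch.

```python
from itertools import combinations
from math import comb

n, k = 6, 3
players = range(n)

# [A] Pick the team, then pick the goalie from within it.
team_then_goalie = {(team, g) for team in combinations(players, k) for g in team}

# [B] Pick the goalie, then pick the other k-1 players from the remaining n-1.
goalie_then_team = {
    (tuple(sorted(rest + (g,))), g)
    for g in players
    for rest in combinations([p for p in players if p != g], k - 1)
}
```

- Both constructions produce exactly the same set of (team, goalie) pairs, so their sizes $k\binom{n}{k}$ and $n\binom{n-1}{k-1}$ must agree (60 in this case).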
- ## What is a "Combinatorial Proof" really? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:37:08.312Z
card-last-reviewed:: 2022-10-07T10:37:08.313Z
card-last-score:: 3
- [1] These proofs involve finding two different ways to answer the same counting question.
- [2] Then, we explain why the answer to the problem posed one way is $A$.
- [3] Next, we explain why the answer to the problem posed the other way is $B$.
- [4] Since $A$ and $B$ are answers to the same question, we have shown that it must be that $A = B$.
-

@ -0,0 +1,39 @@
- #[[MA284 - Discrete Mathematics]]
- No previous topic
- **Next topic:** [[Principle of Inclusion-Exclusion]]
- **Relevant Slides:** ![Week01.pdf](../assets/Week01_1662844828934_0.pdf)
-
- What is **Combinatorics**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-17T10:51:03.296Z
card-last-reviewed:: 2022-09-13T10:51:03.297Z
card-last-score:: 5
- **Combinatorics** is the mathematics of *counting*.
-
- ## The Additive & Multiplicative Principles
- ### The Additive Principle
- What is the **Additive Principle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-13T23:00:00.000Z
card-last-reviewed:: 2022-09-13T10:46:53.779Z
card-last-score:: 1
- If an event $A$ can occur $m$ ways, and event $B$ can occur $n$ (disjoint) ways, then event "$A$ **or** $B$" can occur $m + n$ ways.
- ### The Multiplicative Principle
- What is the **Multiplicative Principle**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-16T11:37:21.246Z
card-last-reviewed:: 2022-09-12T11:37:21.246Z
card-last-score:: 5
- If event $A$ can occur $m$ ways, and each possibility allows for event $B$ to occur in $n$ (disjoint) ways, then the event "$A$ **and** $B$" can occur in $m \times n$ ways.
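    - Both principles can be illustrated by counting small sets. This is a minimal Python sketch under assumed example data (the `shirts` and `hats` sets are hypothetical, with $m = 3$ and $n = 2$): the union of disjoint sets models "$A$ **or** $B$", and the Cartesian product models "$A$ **and** $B$".
      ```python
      from itertools import product

      # Hypothetical example data: m = 3 ways for event A, n = 2 ways for event B.
      shirts = {"red", "blue", "green"}
      hats = {"cap", "beanie"}

      # Additive Principle: pick a shirt OR a hat (the two sets are disjoint).
      either = shirts | hats

      # Multiplicative Principle: pick a shirt AND a hat.
      both = set(product(shirts, hats))

      print(len(either), len(both))
      ```
      The union only has $m + n$ elements because the sets share nothing; if they overlapped, the Principle of Inclusion-Exclusion would be needed instead.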
-
- ## Counting with Sets
- What is a **set**?
- A **set** is a collection of things.
- The items in a set are called *elements*.
- A set is **unordered**.

View File

@ -0,0 +1,45 @@
- #[[MA284 - Discrete Mathematics]]
- No previous topic
- **Next topic:** [[Principle of Inclusion-Exclusion]]
- **Relevant Slides:** ![Week01.pdf](../assets/Week01_1662844828934_0.pdf)
-
- What is **Combinatorics**? #card
card-last-interval:: 11.16
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-30T21:29:52.224Z
card-last-reviewed:: 2022-09-19T18:29:52.225Z
card-last-score:: 5
- **Combinatorics** is the mathematics of *counting*.
-
- ## The Additive & Multiplicative Principles
- ### The Additive Principle
- What is the **Additive Principle**? #card
card-last-interval:: 9.84
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-29T13:43:04.372Z
card-last-reviewed:: 2022-09-19T17:43:04.373Z
card-last-score:: 5
- If an event $A$ can occur $m$ ways, and event $B$ can occur $n$ (disjoint) ways, then event "$A$ **or** $B$" can occur $m + n$ ways.
- ### The Multiplicative Principle
- What is the **Multiplicative Principle**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:18:14.874Z
card-last-reviewed:: 2022-09-18T15:18:14.874Z
card-last-score:: 3
- If event $A$ can occur $m$ ways, and each possibility allows for event $B$ to occur in $n$ (disjoint) ways, then the event "$A$ **and** $B$" can occur in $m \times n$ ways.
-
- ## Counting with Sets
- What is a **set**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-21T20:51:01.873Z
card-last-reviewed:: 2022-09-17T20:51:01.875Z
card-last-score:: 5
- A **set** is a collection of things.
- The items in a set are called *elements*.
- A set is **unordered**.

View File

@ -0,0 +1,67 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[Social Engineering]]
- **Next Topic:** null
- **Relevant Slides:** ![ct255_07.pdf](../assets/ct255_07_1667826292487_0.pdf)
-
- # Groups, Rings, & Fields
- In mathematics,
- a **group** is a set equipped with a binary operation that is associative, has an identity element, and is such that every element has an inverse, e.g., $(\mathbb{Z}, +)$.
- a **ring** is a set equipped with two binary operations satisfying properties analogous to those of addition & multiplication of integers, e.g. $(\mathbb{Z}, +, *)$.
- a **field** is a set on which addition, subtraction, multiplication, & division are defined and behave as the corresponding operations on rational & real numbers do.
-
- # Diffie-Hellman Key Exchange
- What is the **Diffie-Hellman Key Exchange**? #card
card-last-interval:: 0.98
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-10T11:39:05.685Z
card-last-reviewed:: 2022-11-09T12:39:05.686Z
card-last-score:: 3
- **Diffie-Hellman** provides **secure key exchange** between two partners.
- The negotiated key is subsequently used for private key encryption / authentication.
    - It uses the multiplicative group of integers modulo $n$, $(\mathbb{Z} / n \mathbb{Z})^\times$.
- It is based on the difficulty of computing discrete logarithms over such groups, e.g.:
- $$6^3 \text{ mod } 17 = 216 \text{ mod } 17 =12 \text{ (easy) }$$
- $$12 = 6 ^y \text{ mod } 17 ? \text{ hard }$$
- The core equation for the key exchange is
- $$K = (A)^B \text{ mod } q$$
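  - The asymmetry between the "easy" and "hard" directions above can be sketched in Python. This is an illustrative sketch only: `discrete_log_bruteforce` is a hypothetical helper, and exhaustive search is just the simplest way to invert the exponentiation, not the best known attack. Python's built-in three-argument `pow` performs fast modular exponentiation.
    ```python
    def discrete_log_bruteforce(base: int, target: int, modulus: int):
        """Return the smallest y >= 1 with base**y % modulus == target,
        or None if no such y exists. Tries every exponent in turn."""
        value = 1
        for y in range(1, modulus):
            value = (value * base) % modulus
            if value == target:
                return y
        return None


    easy = pow(6, 3, 17)                        # forward direction: one call
    hard = discrete_log_bruteforce(6, 12, 17)   # reverse direction: search
    print(easy, hard)
    ```
    For a 17-element modulus the search is instant, but the loop grows linearly with the modulus, so it is hopeless for a modulus that is hundreds of digits long.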
- ## Diffie-Hellman: Global Public Elements
    - Select a prime number $q$ and a positive integer $a$, where $a < q$ and $a$ is a **primitive root** of $q$.
- What is a **primitive root**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-10T00:00:00.000Z
card-last-reviewed:: 2022-11-09T12:41:19.113Z
card-last-score:: 1
        - $a$ is a **primitive root** of $q$ if the numbers $a \text{ mod } q, a^2 \text{ mod } q, \cdots , a^{q-1} \text{ mod } q$ are distinct integer values between $1$ and $(q-1)$ in some permutation, i.e., they are all of the elements of $(\mathbb{Z} / q \mathbb{Z})^\times$.
          - **Example:** $a = 3$ is a primitive root of $(\mathbb{Z} / 5\mathbb{Z})^\times$ (its powers modulo $5$ are $3, 4, 2, 1$), whereas $a=4$ is not (its powers modulo $5$ are $4, 1, 4, 1$).
background-color:: green
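    - The primitive-root definition can be checked directly for small primes. This is a minimal Python sketch; the function name `is_primitive_root` is an assumption for illustration, and the check is a brute-force one that is only practical for toy values of $q$.
      ```python
      def is_primitive_root(a: int, q: int) -> bool:
          """True if a, a^2, ..., a^(q-1) mod q take every value 1..q-1
          exactly once (q is assumed to be prime)."""
          powers = {pow(a, k, q) for k in range(1, q)}
          return powers == set(range(1, q))


      print([pow(3, k, 5) for k in range(1, 5)])  # [3, 4, 2, 1]
      print([pow(4, k, 5) for k in range(1, 5)])  # [4, 1, 4, 1]
      print(is_primitive_root(3, 5), is_primitive_root(4, 5))
      ```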
- ## Generation of Secret-Key
- Both users share a public prime number $q$ and primitive root $a$.
- User A:
- 1. Select secret number $XA$ with $XA < q$.
2. Calculate public value $YA = a^{XA} \text{ mod } q$ (difficult to reverse).
3. $YA$ is sent to User B.
- User B:
- 1. Select secret number $XB$ with $XB < q$.
2. Calculate public value $YB = a^{XB} \text{ mod } q$ (difficult to reverse).
3. $YB$ is sent to User A.
- User A:
- User A owns $XA$ and receives $YB$.
- Generate secret key: $K = (YB)^{XA} \text{ mod } q$.
- User B:
- User B owns $XB$ and receives $YA$.
- Generate secret key: $K = (YA)^{XB} \text{ mod } q$.
- Both keys are identical.
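    - The exchange above can be traced end to end with toy numbers. This is a sketch under stated assumptions, not a usable implementation: $q = 17$ and $a = 3$ are chosen only because they are small, the helper `make_keypair` is hypothetical, and real deployments use far larger parameters.
      ```python
      import secrets

      q = 17  # public prime (toy size, for illustration only)
      a = 3   # public primitive root of q


      def make_keypair(a: int, q: int):
          """Pick a secret X with 1 <= X < q - 1 and compute Y = a^X mod q."""
          x = secrets.randbelow(q - 2) + 1
          y = pow(a, x, q)
          return x, y


      XA, YA = make_keypair(a, q)  # User A keeps XA, sends YA
      XB, YB = make_keypair(a, q)  # User B keeps XB, sends YB

      K_A = pow(YB, XA, q)  # computed by User A
      K_B = pow(YA, XB, q)  # computed by User B
      assert K_A == K_B     # both equal a^(XA * XB) mod q
      print(K_A, K_B)
      ```
      The keys agree because $(a^{XB})^{XA} = (a^{XA})^{XB} = a^{XA \cdot XB} \text{ mod } q$; only $YA$ and $YB$ ever travel over the network.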
- ## Diffie-Hellman in Practice
- The algorithm is used in tandem with a variety of secure network protocols.
- Provision of secure end-to-end connection.
- No endpoint authentication - you can't validate who you are talking to.
    - The modulus $q$ typically has a minimum length of 1024 bits.
- ## DH & Man-in-the-Middle (MitM) Attacks
- ![image.png](../assets/image_1667828493859_0.png)
- Mallory is a MitM attacker and performs message interception & message fabrication.
- Mallory establishes two individual (secure) connections with Alice & Bob.
    - Neither Alice nor Bob is aware of Mallory's existence (as there is no authentication).
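    - Why the missing authentication matters can be seen by tracing the arithmetic. This is a toy Python sketch with assumed values ($q = 17$, $a = 3$, and fixed secrets chosen for illustration): Mallory replaces each public value in transit with her own, so each victim ends up sharing a key with Mallory rather than with the other party.
      ```python
      q, a = 17, 3            # public parameters (toy size)
      XA, XB, XM = 5, 11, 7   # secrets of Alice, Bob, and Mallory (illustrative)

      YA = pow(a, XA, q)      # sent by Alice, intercepted by Mallory
      YB = pow(a, XB, q)      # sent by Bob, intercepted by Mallory
      YM = pow(a, XM, q)      # forwarded by Mallory to both Alice and Bob

      K_alice = pow(YM, XA, q)          # Alice believes this is shared with Bob
      K_bob = pow(YM, XB, q)            # Bob believes this is shared with Alice
      K_mallory_alice = pow(YA, XM, q)  # Mallory's key for the Alice connection
      K_mallory_bob = pow(YB, XM, q)    # Mallory's key for the Bob connection

      print(K_alice == K_mallory_alice, K_bob == K_mallory_bob)
      ```
      Nothing in the exchange lets Alice tell $YM$ apart from a genuine $YB$, which is why Diffie-Hellman is usually combined with some form of endpoint authentication.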
-

View File

@ -0,0 +1,108 @@
- #[[CT230 - Database Systems I]]
- No previous topic
- **Next Topic:** [[The Relational Model]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662845512365_0.pdf)
-
- What is a **database**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:12:15.766Z
card-last-reviewed:: 2022-09-30T12:12:15.767Z
card-last-score:: 3
- A **database** is ^^a collection of related data.^^
-
- What is the **Database Approach**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T10:35:42.411Z
card-last-reviewed:: 2022-10-07T10:35:42.412Z
card-last-score:: 3
- A **single repository** of data (which may be distributed) is maintained that is **defined once** and then accessed by various users via a **DBMS**.
- ## Database Management Systems
- What is a **DBMS**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:02.196Z
card-last-reviewed:: 2022-10-03T11:42:02.196Z
card-last-score:: 3
- The **DataBase Management System (DBMS)** is a collection of programs that facilitates the process of ^^defining, constructing, & manipulating^^ databases for various applications.
- ### DBMS Capabilities #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:27:57.126Z
card-last-reviewed:: 2022-10-03T14:27:57.127Z
card-last-score:: 3
- 1. **Define** database (DDL)
2. **Manipulate** database (SQL)
3. **Control** redundancy
4. **Restrict** unauthorised access
5. **Enforce** integrity constraints
6. Provide multiple user interfaces / **views**
7. Provide **concurrent access**
8. Provide mechanism for **recovery**
9. Provide **back-up**
      10. Allow representation of complex relationships between data (for efficiency & optimisation reasons)
-
- ### Disadvantages of DBMS approach #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:17:45.706Z
card-last-reviewed:: 2022-10-01T13:17:45.707Z
card-last-score:: 5
- Strict schema & multiple tables / relations
- Complexity
- Size
- Cost of DBMS
- Additional hardware costs
- Cost of conversion
- Performance
- Higher impact of failure
- ### DBMS Users #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T10:38:38.838Z
card-last-reviewed:: 2022-10-07T10:38:38.839Z
card-last-score:: 3
    - **Administrators (DBA)** - accounts, passwords, privileges; the role requires constant vigilance
- **System Analysts** - "What's required to solve a problem? What does the business need?"
- **Designers** - ER diagrams, mapping ER diagrams to tables
- **Application Programmers** - creating tables, adding data, creating queries
- **End users**
-
- What is **Database Abstraction**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:18:04.573Z
card-last-reviewed:: 2022-10-01T13:18:04.573Z
card-last-score:: 3
- **Database Abstraction** refers to the hiding of the details of data storage that are not needed by most database users.
- The aim is to separate user's views of the database from the way that it is "physically" represented.
- 3 ways in which data can be described:
- **External:** user's view
- **Conceptual:** logical structure as seen by DBA
- **Internal:** DBMS and OS view of data
- What is the database **schema**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:10:33.547Z
card-last-reviewed:: 2022-09-30T12:10:33.548Z
card-last-score:: 3
- The database **schema** is the ^^logical structure of the database.^^
- What is the database **instance**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:29:01.333Z
card-last-reviewed:: 2022-10-03T14:29:01.333Z
card-last-score:: 5
- The database **instance** is ^^the actual content of the database at some point in time.^^
-
-
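- The difference between the **schema** and the **instance** can be made concrete with a small sketch. This one uses Python's built-in `sqlite3` module with an in-memory database; the `student` table and its rows are hypothetical examples, not taken from the lecture. The schema is defined once with DDL, while the instance changes every time data is manipulated.
  ```python
  import sqlite3

  conn = sqlite3.connect(":memory:")

  # Schema: the logical structure, defined once (DDL).
  conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
  schema = conn.execute(
      "SELECT sql FROM sqlite_master WHERE name = 'student'"
  ).fetchone()[0]

  # Instance: the actual content at some point in time (data manipulation).
  conn.executemany("INSERT INTO student (name) VALUES (?)", [("Ada",), ("Alan",)])
  instance_before = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

  conn.execute("DELETE FROM student WHERE name = 'Alan'")
  instance_after = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

  print(schema)
  print(instance_before)
  print(instance_after)
  conn.close()
  ```
  The stored `CREATE TABLE` statement is the same before and after the `DELETE`, while the rows returned by the query differ: same schema, two different instances.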

View File

@ -0,0 +1,108 @@
- #[[CT230 - Database Systems I]]
- No previous topic
- **Next Topic:** [[The Relational Model]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662845512365_0.pdf)
-
- What is a **database**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:12:15.766Z
card-last-reviewed:: 2022-09-30T12:12:15.767Z
card-last-score:: 3
- A **database** is ^^a collection of related data.^^
-
- What is the **Database Approach**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T10:35:42.411Z
card-last-reviewed:: 2022-10-07T10:35:42.412Z
card-last-score:: 3
- A **single repository** of data (which may be distributed) is maintained that is **defined once** and then accessed by various users via a **DBMS**.
- ## Database Management Systems
- What is a **DBMS**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:02.196Z
card-last-reviewed:: 2022-10-03T11:42:02.196Z
card-last-score:: 3
- The **DataBase Management System (DBMS)** is a collection of programs that facilitates the process of ^^defining, constructing, & manipulating^^ databases for various applications.
- ### DBMS Capabilities #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:27:57.126Z
card-last-reviewed:: 2022-10-03T14:27:57.127Z
card-last-score:: 3
- 1. **Define** database (DDL)
2. **Manipulate** database (SQL)
3. **Control** redundancy
4. **Restrict** unauthorised access
5. **Enforce** integrity constraints
6. Provide multiple user interfaces / **views**
7. Provide **concurrent access**
8. Provide mechanism for **recovery**
9. Provide **back-up**
      10. Allow representation of complex relationships between data (for efficiency & optimisation reasons)
-
- ### Disadvantages of DBMS approach #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:17:45.706Z
card-last-reviewed:: 2022-10-01T13:17:45.707Z
card-last-score:: 5
- Strict schema & multiple tables / relations
- Complexity
- Size
- Cost of DBMS
- Additional hardware costs
- Cost of conversion
- Performance
- Higher impact of failure
- ### DBMS Users #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T10:38:38.838Z
card-last-reviewed:: 2022-10-07T10:38:38.839Z
card-last-score:: 3
    - **Administrators (DBA)** - accounts, passwords, privileges; the role requires constant vigilance
- **System Analysts** - "What's required to solve a problem? What does the business need?"
- **Designers** - ER diagrams, mapping ER diagrams to tables
- **Application Programmers** - creating tables, adding data, creating queries
- **End users**
-
- What is **Database Abstraction**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:18:04.573Z
card-last-reviewed:: 2022-10-01T13:18:04.573Z
card-last-score:: 3
- **Database Abstraction** refers to the hiding of the details of data storage that are not needed by most database users.
- The aim is to separate user's views of the database from the way that it is "physically" represented.
- 3 ways in which data can be described:
- **External:** user's view
- **Conceptual:** logical structure as seen by DBA
- **Internal:** DBMS and OS view of data
- What is the database **schema**? #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-11-02T21:44:54.610Z
card-last-reviewed:: 2022-10-10T11:44:54.610Z
card-last-score:: 5
- The database **schema** is the ^^logical structure of the database.^^
- What is the database **instance**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:29:01.333Z
card-last-reviewed:: 2022-10-03T14:29:01.333Z
card-last-score:: 5
- The database **instance** is ^^the actual content of the database at some point in time.^^
-
-

View File

@ -0,0 +1,108 @@
- #[[CT230 - Database Systems I]]
- No previous topic
- **Next Topic:** [[The Relational Model]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662845512365_0.pdf)
-
- What is a **database**? #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-11-12T18:33:52.240Z
card-last-reviewed:: 2022-10-20T08:33:52.241Z
card-last-score:: 3
- A **database** is ^^a collection of related data.^^
-
- What is the **Database Approach**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T10:35:42.411Z
card-last-reviewed:: 2022-10-07T10:35:42.412Z
card-last-score:: 3
- A **single repository** of data (which may be distributed) is maintained that is **defined once** and then accessed by various users via a **DBMS**.
- ## Database Management Systems
- What is a **DBMS**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:02.196Z
card-last-reviewed:: 2022-10-03T11:42:02.196Z
card-last-score:: 3
- The **DataBase Management System (DBMS)** is a collection of programs that facilitates the process of ^^defining, constructing, & manipulating^^ databases for various applications.
- ### DBMS Capabilities #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:27:57.126Z
card-last-reviewed:: 2022-10-03T14:27:57.127Z
card-last-score:: 3
- 1. **Define** database (DDL)
2. **Manipulate** database (SQL)
3. **Control** redundancy
4. **Restrict** unauthorised access
5. **Enforce** integrity constraints
6. Provide multiple user interfaces / **views**
7. Provide **concurrent access**
8. Provide mechanism for **recovery**
9. Provide **back-up**
      10. Allow representation of complex relationships between data (for efficiency & optimisation reasons)
-
- ### Disadvantages of DBMS approach #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:17:45.706Z
card-last-reviewed:: 2022-10-01T13:17:45.707Z
card-last-score:: 5
- Strict schema & multiple tables / relations
- Complexity
- Size
- Cost of DBMS
- Additional hardware costs
- Cost of conversion
- Performance
- Higher impact of failure
- ### DBMS Users #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T10:38:38.838Z
card-last-reviewed:: 2022-10-07T10:38:38.839Z
card-last-score:: 3
    - **Administrators (DBA)** - accounts, passwords, privileges; the role requires constant vigilance
- **System Analysts** - "What's required to solve a problem? What does the business need?"
- **Designers** - ER diagrams, mapping ER diagrams to tables
- **Application Programmers** - creating tables, adding data, creating queries
- **End users**
-
- What is **Database Abstraction**? #card
card-last-interval:: 19.01
card-repeats:: 4
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-08T08:37:10.629Z
card-last-reviewed:: 2022-10-20T08:37:10.629Z
card-last-score:: 3
- **Database Abstraction** refers to the hiding of the details of data storage that are not needed by most database users.
- The aim is to separate user's views of the database from the way that it is "physically" represented.
- 3 ways in which data can be described:
- **External:** user's view
- **Conceptual:** logical structure as seen by DBA
- **Internal:** DBMS and OS view of data
- What is the database **schema**? #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-11-02T21:44:54.610Z
card-last-reviewed:: 2022-10-10T11:44:54.610Z
card-last-score:: 5
- The database **schema** is the ^^logical structure of the database.^^
- What is the database **instance**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:29:01.333Z
card-last-reviewed:: 2022-10-03T14:29:01.333Z
card-last-score:: 5
- The database **instance** is ^^the actual content of the database at some point in time.^^
-
-

View File

@ -0,0 +1,128 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Introduction to Graph Theory]]
- **Next Topic:** [[Convex Polyhedra]]
- **Relevant Slides:** ![MA284-Week08.pdf](../assets/MA284-Week08_1666785726176_0.pdf)
-
- # Definitions
- What is a **walk**? #card
- A **walk** is a sequence of vertices such that consecutive vertices are adjacent.
- What is a **trail**? #card
- A **trail** is a walk in which no edge is repeated.
- What is a **path**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-08T00:00:00.000Z
card-last-reviewed:: 2022-11-07T08:29:15.009Z
card-last-score:: 1
- A **path** is a trail in which no vertex is repeated, except possibly the first & last.
- The path on $n$ vertices is denoted $P_n$.
- ## Example
- <img src="https://mermaid.ink/img/ICBmbG93Y2hhcnQgTFIKQigoQikpIC0tLSBBKChBKSkKQiAtLS0gQygoQykpCkMgLS0tIEUoKEUpKQpFIC0tLSBEKChEKSkKQiAtLS0gRSAtLS0gQQoK" />
{{renderer :mermaid_uiukfrr}}
- ```mermaid
flowchart LR
B((B)) --- A((A))
B --- C((C))
C --- E((E))
E --- D((D))
B --- E --- A
```
- $(a,b,c,e,d)$ is a **walk**.
- So too is $(a,b,e,a,b,c)$ (not a trail or a path).
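    - The three definitions can be tested mechanically on the example graph. This is a minimal Python sketch; the function names and the representation of edges as `frozenset` pairs are illustrative choices, with the edge list copied from the diagram above.
      ```python
      # Edges of the example graph, stored as unordered pairs.
      EDGES = {
          frozenset(pair)
          for pair in [("b", "a"), ("b", "c"), ("c", "e"),
                       ("e", "d"), ("b", "e"), ("e", "a")]
      }


      def is_walk(seq):
          """Consecutive vertices must be adjacent."""
          return all(frozenset(step) in EDGES for step in zip(seq, seq[1:]))


      def is_trail(seq):
          """A walk in which no edge is repeated."""
          steps = [frozenset(step) for step in zip(seq, seq[1:])]
          return is_walk(seq) and len(steps) == len(set(steps))


      def is_path(seq):
          """A trail in which no vertex is repeated, except possibly first & last."""
          inner = seq[:-1] if len(seq) > 1 and seq[0] == seq[-1] else seq
          return is_trail(seq) and len(inner) == len(set(inner))


      first = ("a", "b", "c", "e", "d")
      second = ("a", "b", "e", "a", "b", "c")
      print(is_walk(first), is_trail(first), is_path(first))
      print(is_walk(second), is_trail(second), is_path(second))
      ```
      The second sequence fails the trail test because it uses the edge between $a$ and $b$ twice.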
-
- What is the **length** of a path? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-08T00:00:00.000Z
card-last-reviewed:: 2022-11-07T08:29:37.150Z
card-last-score:: 1
- The **length** of a path is the number of edges in the sequence.
- ## Cycles & Circuits
- What is a **cycle**? #card
- A **cycle** is a path that begins & ends at the same vertex, but no other vertex is repeated.
- A cycle on $n$ vertices is denoted $C_n$.
- What is a **circuit**? #card
      - A **circuit** is a walk that begins & ends at the same vertex, and no edge is repeated.
- What does it mean if a graph is **connected**? #card
- A graph is **connected** if there is a path between every pair of vertices.
- What is the **degree** of a vertex? #card
- The **degree** of a vertex is the number of edges emanating from it.
- If $v$ is a vertex, we denote its degree as $d(v)$.
- ## Handshaking Lemma
- If we know the degree of every vertex in the graph, then we know the number of edges. This is the **Handshaking Lemma**.
- What is the **Handshaking Lemma**? #card
      - In any graph, the sum of the degrees of the vertices is always twice the number of edges.
- $$\sum_{v \in V} d(v) = 2|E|$$
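    - The lemma holds because each edge contributes 1 to the degree of each of its two endpoints. A minimal Python sketch of that count follows, using the six-edge example graph from the Definitions section (the variable names are illustrative).
      ```python
      from collections import Counter

      # Edge list of the example graph on vertices a, b, c, d, e.
      edges = [("b", "a"), ("b", "c"), ("c", "e"),
               ("e", "d"), ("b", "e"), ("e", "a")]

      degree = Counter()
      for u, v in edges:
          degree[u] += 1  # each edge adds 1 to the degree of one endpoint...
          degree[v] += 1  # ...and 1 to the degree of the other

      print(dict(degree))
      print(sum(degree.values()), 2 * len(edges))
      ```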
- # Types of Graphs
- What is a **Complete** Graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-08T00:00:00.000Z
card-last-reviewed:: 2022-11-07T08:29:22.503Z
card-last-score:: 1
- A graph is **complete** if every pair of vertices is adjacent.
- This family of graphs is very important.
- Complete graphs are denoted $K_n$ - the complete graph on $n$ vertices.
- What is a **Bipartite Graph**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-08T00:00:00.000Z
card-last-reviewed:: 2022-11-07T08:30:00.319Z
card-last-score:: 1
- A graph is **bipartite** if it is possible to partition the vertex set, $V$, into two disjoint sets, $V_1$ & $V_2$, such that there are no edges between any two vertices in the same set.
- What is a **Complete Bipartite** graph? #card
- If a bipartite graph is such that *every* vertex in $V_1$ is connected to *every* vertex in $V_2$ (and vice versa), the graph is a **complete bipartite graph**.
- If $|V_1| = m$ and $|V_2| = n$, we denote it $K_{m,n}$.
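  - Both families are easy to generate, which also gives their edge counts: $K_n$ has $\binom{n}{2}$ edges and $K_{m,n}$ has $mn$ edges. This is a small Python sketch; the function names and the vertex labelling are assumptions made for illustration.
    ```python
    from itertools import combinations, product


    def complete_graph(n: int):
        """Edge set of K_n on vertices 0..n-1: every pair is adjacent."""
        return set(combinations(range(n), 2))


    def complete_bipartite(m: int, n: int):
        """Edge set of K_{m,n}: every vertex of V1 joined to every vertex of V2."""
        v1 = [("u", i) for i in range(m)]
        v2 = [("w", j) for j in range(n)]
        return set(product(v1, v2))


    k4 = complete_graph(4)
    k23 = complete_bipartite(2, 3)
    print(len(k4), len(k23))
    ```
    Every edge of `k23` joins a `"u"` vertex to a `"w"` vertex, so no edge lies inside either part, as the definition of bipartite requires.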
- What is a **subgraph**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-08T00:00:00.000Z
card-last-reviewed:: 2022-11-07T08:30:55.795Z
card-last-score:: 1
- We say that $G_1 = (V_1, E_1)$ is a **subgraph** of $G_2 = (V_2, E_2)$ provided $V_1 \subset V_2$ and $E_1 \subset E_2$.
- What is an **induced subgraph**? #card
      - We say that $G_1 = (V_1, E_1)$ is an **induced subgraph** of $G_2 = (V_2, E_2)$ provided that $V_1 \subset V_2$ and $E_1$ contains **all** edges of $E_2$ which join vertices in $V_1$.
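  - An induced subgraph is determined entirely by its vertex set: keep the chosen vertices and every edge of the larger graph whose two endpoints are both kept. This is a minimal Python sketch; `induced_subgraph` is a hypothetical helper, applied to the example graph from the Definitions section.
    ```python
    def induced_subgraph(edges, keep):
        """Return (V1, E1) where E1 holds every edge with both endpoints in keep."""
        keep = set(keep)
        return keep, {e for e in edges if set(e) <= keep}


    # Edge list of the example graph on vertices a, b, c, d, e.
    edges = [("b", "a"), ("b", "c"), ("c", "e"),
             ("e", "d"), ("b", "e"), ("e", "a")]

    V1, E1 = induced_subgraph(edges, {"a", "b", "e"})
    print(sorted(V1), sorted(E1))
    ```
    Dropping one of the three returned edges would still give a subgraph on the same vertices, but it would no longer be induced.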
-
- # Planar Graphs
- What is a **planar graph**? #card
- If you can sketch a graph such that none of its edges cross, it is a **planar graph**.
- What is a **face**? #card
- When a planar graph is drawn without edges crossing, the edges & vertices of the graph divide the plane into regions called **faces**.
- The number of faces does not change no matter how you draw the graph, as long as no edges cross.
-
- ## Example
- The graph $K_{2,3}$ is **planar**.
background-color:: red
- ![image.png](../assets/image_1666951300835_0.png)
- [[draws/2022-10-28-11-04-05.excalidraw]]
      - The planar representation of $K_{2,3}$ has **3 faces** (the "outside" region counts as a face).
- Give a planar representation of $K_4$, and count how many faces it has.
background-color:: red
- [[draws/2022-10-28-11-22-12.excalidraw]]
- Why "face"?
background-color:: red
- [[draws/2022-10-28-11-25-20.excalidraw]]
- # Euler's Formula for Planar Graphs #card
- For any ^^(connected) planar graph^^ with $v$ vertices, $e$ edges, and $f$ faces, we have:
- $$v - e + f = 2$$
- ## Outline of Proof
- Start with $P_2$.
- Here, $v=2$, $e = 1$, $f=1$. So $v-e+f=2$.
- Any other graph can be made by adding vertices & edges (or just edges) to $P_2$.
- Suppose $v-e+f=2$ for a graph.
- If we add a new edge *with* a new vertex, then no new face is created, so $v-e+f$ does not change.
- If we add a new edge *without* a new vertex, then $f$ will increase by 1, so again, $v-e+f$ does not change.
- ## Example
- Is it possible for a connected planar graph to have 5 vertices, 7 edges, and 3 faces? Explain.
background-color:: red
- No. Euler's formula tells us that $v-e+f=2$.
- Here, $v=5$, $e=7$, $f=3$, so $v-e+f=1$.
- Any such graph is **not planar**.
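    - The same check can be written as a short calculation. This Python sketch is illustrative only (both function names are assumptions): it evaluates $v - e + f$, and rearranges Euler's formula to give the number of faces that a connected planar graph with $v$ vertices and $e$ edges must have.
      ```python
      def euler_characteristic(v: int, e: int, f: int) -> int:
          """Value of v - e + f; equals 2 for any connected planar graph."""
          return v - e + f


      def required_faces(v: int, e: int) -> int:
          """Faces forced by Euler's formula: f = 2 - v + e."""
          return 2 - v + e


      print(euler_characteristic(5, 7, 3))  # the example above
      print(required_faces(5, 7))           # faces such a graph would need
      print(required_faces(5, 6))           # K_{2,3}: 5 vertices, 6 edges
      print(required_faces(4, 6))           # K_4: 4 vertices, 6 edges
      ```
      A connected planar graph with 5 vertices and 7 edges would therefore have 4 faces, not 3.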
-
-

View File

@ -0,0 +1,200 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Introduction to Graph Theory]]
- **Next Topic:** [[Convex Polyhedra]]
- **Relevant Slides:** ![MA284-Week08.pdf](../assets/MA284-Week08_1666785726176_0.pdf)
-
- # Definitions
- What is a **walk**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:23:27.284Z
card-last-score:: 1
- A **walk** is a sequence of vertices such that consecutive vertices are adjacent.
- What is a **trail**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:14:59.935Z
card-last-score:: 1
- A **trail** is a walk in which no edge is repeated.
- What is a **path**? #card
card-last-interval:: 0.9
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T17:22:40.523Z
card-last-reviewed:: 2022-11-14T20:22:40.524Z
card-last-score:: 3
- A **path** is a trail in which no vertex is repeated, except possibly the first & last.
- The path on $n$ vertices is denoted $P_n$.
- ## Example
- <img src="https://mermaid.ink/img/ICBmbG93Y2hhcnQgTFIKQigoQikpIC0tLSBBKChBKSkKQiAtLS0gQygoQykpCkMgLS0tIEUoKEUpKQpFIC0tLSBEKChEKSkKQiAtLS0gRSAtLS0gQQoK" />
{{renderer :mermaid_uiukfrr}}
- ```mermaid
flowchart LR
B((B)) --- A((A))
B --- C((C))
C --- E((E))
E --- D((D))
B --- E --- A
```
- $(a,b,c,e,d)$ is a **walk**.
- So too is $(a,b,e,a,b,c)$ (not a trail or a path).
-
- What is the **length** of a path? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:22:48.413Z
card-last-score:: 1
- The **length** of a path is the number of edges in the sequence.
- ## Cycles & Circuits
- What is a **cycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:50:16.908Z
card-last-score:: 1
- A **cycle** is a path that begins & ends at the same vertex, but no other vertex is repeated.
- A cycle on $n$ vertices is denoted $C_n$.
- What is a **circuit**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-22T00:00:00.000Z
card-last-reviewed:: 2022-11-21T13:05:28.290Z
card-last-score:: 1
      - A **circuit** is a walk that begins & ends at the same vertex, and no edge is repeated.
- What does it mean if a graph is **connected**? #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T11:16:09.571Z
card-last-reviewed:: 2022-11-14T16:16:09.572Z
card-last-score:: 5
- A graph is **connected** if there is a path between every pair of vertices.
- What is the **degree** of a vertex? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-25T13:10:12.122Z
card-last-reviewed:: 2022-11-21T13:10:12.123Z
card-last-score:: 5
- The **degree** of a vertex is the number of edges emanating from it.
- If $v$ is a vertex, we denote its degree as $d(v)$.
- ## Handshaking Lemma
- If we know the degree of every vertex in the graph, then we know the number of edges. This is the **Handshaking Lemma**.
- What is the **Handshaking Lemma**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:54:11.744Z
card-last-score:: 1
      - In any graph, the sum of the degrees of the vertices is always twice the number of edges.
- $$\sum_{v \in V} d(v) = 2|E|$$
- # Types of Graphs
- What is a **Complete** Graph? #card
card-last-interval:: 0.9
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T17:23:00.989Z
card-last-reviewed:: 2022-11-14T20:23:00.990Z
card-last-score:: 3
- A graph is **complete** if every pair of vertices is adjacent.
- This family of graphs is very important.
- Complete graphs are denoted $K_n$ - the complete graph on $n$ vertices.
- What is a **Bipartite Graph**? #card
card-last-interval:: 8.35
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-29T21:09:36.899Z
card-last-reviewed:: 2022-11-21T13:09:36.899Z
card-last-score:: 5
- A graph is **bipartite** if it is possible to partition the vertex set, $V$, into two disjoint sets, $V_1$ & $V_2$, such that there are no edges between any two vertices in the same set.
- What is a **Complete Bipartite** graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:51:45.394Z
card-last-score:: 1
- If a bipartite graph is such that *every* vertex in $V_1$ is connected to *every* vertex in $V_2$ (and vice versa), the graph is a **complete bipartite graph**.
- If $|V_1| = m$ and $|V_2| = n$, we denote it $K_{m,n}$.
- What is a **subgraph**? #card
card-last-interval:: 3.05
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T21:23:07.436Z
card-last-reviewed:: 2022-11-14T20:23:07.436Z
card-last-score:: 5
- We say that $G_1 = (V_1, E_1)$ is a **subgraph** of $G_2 = (V_2, E_2)$ provided $V_1 \subset V_2$ and $E_1 \subset E_2$.
- What is an **induced subgraph**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:49:58.981Z
card-last-score:: 1
      - We say that $G_1 = (V_1, E_1)$ is an **induced subgraph** of $G_2 = (V_2, E_2)$ provided that $V_1 \subset V_2$ and $E_1$ contains **all** edges of $E_2$ which join vertices in $V_1$.
-
- # Planar Graphs
- What is a **planar graph**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:19:45.615Z
card-last-score:: 1
- If you can sketch a graph such that none of its edges cross, it is a **planar graph**.
- What is a **face**? #card
card-last-interval:: 0.98
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T15:22:52.013Z
card-last-reviewed:: 2022-11-14T16:22:52.013Z
card-last-score:: 3
- When a planar graph is drawn without edges crossing, the edges & vertices of the graph divide the plane into regions called **faces**.
- The number of faces does not change no matter how you draw the graph, as long as no edges cross.
-
- ## Example
- The graph $K_{2,3}$ is **planar**.
background-color:: red
- ![image.png](../assets/image_1666951300835_0.png)
- [[draws/2022-10-28-11-04-05.excalidraw]]
      - The planar representation of $K_{2,3}$ has **3 faces** (the "outside" region counts as a face).
- Give a planar representation of $K_4$, and count how many faces it has.
background-color:: red
- [[draws/2022-10-28-11-22-12.excalidraw]]
- Why "face"?
background-color:: red
- [[draws/2022-10-28-11-25-20.excalidraw]]
- # Euler's Formula for Planar Graphs #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:18:07.821Z
card-last-score:: 1
- For any ^^(connected) planar graph^^ with $v$ vertices, $e$ edges, and $f$ faces, we have:
- $$v - e + f = 2$$
- ## Outline of Proof
- Start with $P_2$.
- Here, $v=2$, $e = 1$, $f=1$. So $v-e+f=2$.
- Any other graph can be made by adding vertices & edges (or just edges) to $P_2$.
- Suppose $v-e+f=2$ for a graph.
- If we add a new edge *with* a new vertex, then no new face is created, so $v-e+f$ does not change.
- If we add a new edge *without* a new vertex, then $f$ will increase by 1, so again, $v-e+f$ does not change.
- ## Example
- Is it possible for a connected planar graph to have 5 vertices, 7 edges, and 3 faces? Explain.
background-color:: red
- No. Euler's formula tells us that $v-e+f=2$.
- Here, $v=5$, $e=7$, $f=3$, so $v-e+f=1$.
- Any such graph is **not planar**.
-
-

View File

@ -0,0 +1,200 @@
- #[[MA284 - Discrete Mathematics]]
- **Previous Topic:** [[Introduction to Graph Theory]]
- **Next Topic:** [[Convex Polyhedra]]
- **Relevant Slides:** ![MA284-Week08.pdf](../assets/MA284-Week08_1666785726176_0.pdf)
-
- # Definitions
- What is a **walk**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:23:27.284Z
card-last-score:: 1
- A **walk** is a sequence of vertices such that consecutive vertices are adjacent.
- What is a **trail**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:14:59.935Z
card-last-score:: 1
- A **trail** is a walk in which no edge is repeated.
- What is a **path**? #card
card-last-interval:: 0.9
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T17:22:40.523Z
card-last-reviewed:: 2022-11-14T20:22:40.524Z
card-last-score:: 3
- A **path** is a trail in which no vertex is repeated, except possibly the first & last.
- The path on $n$ vertices is denoted $P_n$.
- ## Example
- <img src="https://mermaid.ink/img/ICBmbG93Y2hhcnQgTFIKQigoQikpIC0tLSBBKChBKSkKQiAtLS0gQygoQykpCkMgLS0tIEUoKEUpKQpFIC0tLSBEKChEKSkKQiAtLS0gRSAtLS0gQQoK" />
{{renderer :mermaid_uiukfrr}}
- ```mermaid
flowchart LR
B((B)) --- A((A))
B --- C((C))
C --- E((E))
E --- D((D))
B --- E --- A
```
- $(a,b,c,e,d)$ is a **walk**.
- So too is $(a,b,e,a,b,c)$ (not a trail or a path).
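- The three definitions above can be checked mechanically. The following is a minimal Python sketch, assuming the graph is stored as a set of unordered edges copied from the diagram; the helper names `steps`, `is_walk`, `is_trail`, and `is_path` are illustrative and not from the lecture.
- ```python
# Edges of the example graph above, stored as unordered pairs (assumed encoding).
EDGES = {frozenset(e) for e in [("b", "a"), ("b", "c"), ("c", "e"),
                                ("e", "d"), ("b", "e"), ("e", "a")]}


def steps(seq):
    """Return the list of edges traversed by a sequence of vertices."""
    return [frozenset(pair) for pair in zip(seq, seq[1:])]


def is_walk(seq):
    # Every pair of consecutive vertices must be adjacent.
    return all(edge in EDGES for edge in steps(seq))


def is_trail(seq):
    # A walk in which no edge is repeated.
    used = steps(seq)
    return is_walk(seq) and len(used) == len(set(used))


def is_path(seq):
    # A trail in which no vertex is repeated, except possibly the first & last.
    inner = seq[:-1] if len(seq) > 1 and seq[0] == seq[-1] else seq
    return is_trail(seq) and len(inner) == len(set(inner))


for seq in [("a", "b", "c", "e", "d"), ("a", "b", "e", "a", "b", "c")]:
    print(seq, is_walk(seq), is_trail(seq), is_path(seq))
```
- The first sequence passes all three checks; the second is a walk only, because it uses the edge between $a$ and $b$ twice.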
-
- What is the **length** of a path? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:22:48.413Z
card-last-score:: 1
- The **length** of a path is the number of edges in the sequence.
- ## Cycles & Circuits
- What is a **cycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:50:16.908Z
card-last-score:: 1
- A **cycle** is a path that begins & ends at the same vertex, but no other vertex is repeated.
- A cycle on $n$ vertices is denoted $C_n$.
- What is a **circuit**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-22T00:00:00.000Z
card-last-reviewed:: 2022-11-21T13:05:28.290Z
card-last-score:: 1
- A **circuit** is a walk that begins & ends at the same vertex, and in which no edge is repeated (i.e., a closed trail).
- What does it mean if a graph is **connected**? #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T11:16:09.571Z
card-last-reviewed:: 2022-11-14T16:16:09.572Z
card-last-score:: 5
- A graph is **connected** if there is a path between every pair of vertices.
- What is the **degree** of a vertex? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-25T13:10:12.122Z
card-last-reviewed:: 2022-11-21T13:10:12.123Z
card-last-score:: 5
- The **degree** of a vertex is the number of edges emanating from it.
- If $v$ is a vertex, we denote its degree as $d(v)$.
- ## Handshaking Lemma
- If we know the degree of every vertex in the graph, then we know the number of edges. This is the **Handshaking Lemma**.
- What is the **Handshaking Lemma**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:54:11.744Z
card-last-score:: 1
- In any graph, the sum of the degrees of the vertices is always twice the number of edges.
- $$\sum_{v \in V} d(v) = 2|E|$$
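- As a quick check of the lemma, the sketch below counts degrees for the six-edge example graph from the Definitions section. The edge list is an assumption read off that diagram, and the variable names are illustrative.
- ```python
from collections import Counter

# Assumed edge list: the six edges of the example graph in the Definitions section.
edges = [("B", "A"), ("B", "C"), ("C", "E"), ("E", "D"), ("B", "E"), ("E", "A")]

# Each edge contributes 1 to the degree of both of its endpoints.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

degree_sum = sum(degree.values())
print(dict(degree))
print(degree_sum, 2 * len(edges))
```
- Every edge is counted once at each endpoint, which is why the two printed totals agree.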
- # Types of Graphs
- What is a **Complete** Graph? #card
card-last-interval:: 0.9
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T17:23:00.989Z
card-last-reviewed:: 2022-11-14T20:23:00.990Z
card-last-score:: 3
- A graph is **complete** if every pair of vertices is adjacent.
- This family of graphs is very important.
- Complete graphs are denoted $K_n$ - the complete graph on $n$ vertices.
- What is a **Bipartite Graph**? #card
card-last-interval:: 8.35
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-29T21:09:36.899Z
card-last-reviewed:: 2022-11-21T13:09:36.899Z
card-last-score:: 5
- A graph is **bipartite** if it is possible to partition the vertex set, $V$, into two disjoint sets, $V_1$ & $V_2$, such that there are no edges between any two vertices in the same set.
- What is a **Complete Bipartite** graph? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:51:45.394Z
card-last-score:: 1
- If a bipartite graph is such that *every* vertex in $V_1$ is connected to *every* vertex in $V_2$ (and vice versa), the graph is a **complete bipartite graph**.
- If $|V_1| = m$ and $|V_2| = n$, we denote it $K_{m,n}$.
- What is a **subgraph**? #card
card-last-interval:: 3.05
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T21:23:07.436Z
card-last-reviewed:: 2022-11-14T20:23:07.436Z
card-last-score:: 5
- We say that $G_1 = (V_1, E_1)$ is a **subgraph** of $G_2 = (V_2, E_2)$ provided $V_1 \subset V_2$ and $E_1 \subset E_2$.
- What is an **induced subgraph**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:49:58.981Z
card-last-score:: 1
- We say that $G_1 = (V_1, E_1)$ is an **induced subgraph** of $G_2 = (V_2, E_2)$ provided that $V_1 \subset V_2$ and $E_1$ contains **all** edges of $E_2$ which join vertices in $V_1$.
-
- # Planar Graphs
- What is a **planar graph**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:19:45.615Z
card-last-score:: 1
- If you can sketch a graph such that none of its edges cross, it is a **planar graph**.
- What is a **face**? #card
card-last-interval:: 0.98
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-15T15:22:52.013Z
card-last-reviewed:: 2022-11-14T16:22:52.013Z
card-last-score:: 3
- When a planar graph is drawn without edges crossing, the edges & vertices of the graph divide the plane into regions called **faces**.
- The number of faces does not change no matter how you draw the graph, as long as no edges cross.
-
- ## Example
- The graph $K_{2,3}$ is **planar**.
background-color:: red
- ![image.png](../assets/image_1666951300835_0.png)
- [[draws/2022-10-28-11-04-05.excalidraw]]
- The planar representation of $K_{2,3}$ has **3 faces** (the "outside" region counts as a face).
- Give a planar representation of $K_4$, and count how many faces it has.
background-color:: red
- [[draws/2022-10-28-11-22-12.excalidraw]]
- Why "face"?
background-color:: red
- [[draws/2022-10-28-11-25-20.excalidraw]]
- # Euler's Formula for Planar Graphs #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:18:07.821Z
card-last-score:: 1
- For any ^^(connected) planar graph^^ with $v$ vertices, $e$ edges, and $f$ faces, we have:
- $$v - e + f = 2$$
- ## Outline of Proof
- Start with $P_2$.
- Here, $v=2$, $e = 1$, $f=1$. So $v-e+f=2$.
- Any other connected planar graph can be built from $P_2$ by adding edges one at a time, each either with or without a new vertex.
- Suppose $v-e+f=2$ for a graph.
- If we add a new edge *with* a new vertex, then no new face is created, so $v-e+f$ does not change.
- If we add a new edge *without* a new vertex, then $f$ will increase by 1, so again, $v-e+f$ does not change.
- ## Example
- Is it possible for a connected planar graph to have 5 vertices, 7 edges, and 3 faces? Explain.
background-color:: red
- No. Euler's formula tells us that $v-e+f=2$.
- Here, $v=5$, $e=7$, $f=3$, so $v-e+f=1$.
- Any such graph is **not planar**.
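- The arithmetic in this example, and the face counts from the Planar Graphs examples, can be reproduced with a small Python sketch. The function names are illustrative, and the $K_4$ line assumes that $K_4$ has been drawn without crossings.
- ```python
def euler_characteristic(v, e, f):
    """Return v - e + f, which must equal 2 for a connected planar graph."""
    return v - e + f


def faces_if_planar(v, e):
    """Number of faces forced by Euler's formula, f = 2 - v + e."""
    return 2 - v + e


print(euler_characteristic(5, 6, 3))  # K_{2,3}: 5 vertices, 6 edges, 3 faces
print(euler_characteristic(5, 7, 3))  # the example above: not equal to 2
print(faces_if_planar(4, 6))          # K_4: 4 vertices, 6 edges
```
- A planar drawing of a connected graph with 5 vertices and 7 edges would need `faces_if_planar(5, 7)` faces, i.e. 4 rather than 3.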
-
-

View File

@ -0,0 +1,24 @@
- #[[ST2001 Labs]]
- **Previous Topic:** [[Using R as a Calculator]]
- **Next Topic:** null
-
- Most of our computation in R will be using data in table form, with rows & columns, which are often stored as a `data.frame` object.
- The rows usually represent different *observations*, (e.g., humans, cars, etc.) and the columns represent different *variables* (features of the observation), (e.g., height, weight, etc.).
-
- What does `glimpse()` do? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T17:43:30.690Z
card-last-score:: 1
- The `glimpse()` function gives us an overview of the dataset on each experimental unit.
- What does `dim()` do? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:03:27.442Z
card-last-score:: 1
- The `dim()` function returns the number of rows & columns in a dataframe.
-

View File

@ -0,0 +1,24 @@
- #[[ST2001 Labs]]
- **Previous Topic:** [[Using R as a Calculator]]
- **Next Topic:** null
-
- Most of our computation in R will be using data in table form, with rows & columns, which are often stored as a `data.frame` object.
- The rows usually represent different *observations*, (e.g., humans, cars, etc.) and the columns represent different *variables* (features of the observation), (e.g., height, weight, etc.).
-
- What does `glimpse()` do? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:23:35.496Z
card-last-score:: 1
- The `glimpse()` function gives us an overview of the dataset on each experimental unit.
- What does `dim()` do? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:23:32.281Z
card-last-score:: 1
- The `dim()` function returns the number of rows & columns in a dataframe.
-

View File

@ -0,0 +1,24 @@
- #[[ST2001 Labs]]
- **Previous Topic:** [[Using R as a Calculator]]
- **Next Topic:** null
-
- Most of our computation in R will be using data in table form, with rows & columns, which are often stored as a `data.frame` object.
- The rows usually represent different *observations*, (e.g., humans, cars, etc.) and the columns represent different *variables* (features of the observation), (e.g., height, weight, etc.).
-
- What does `glimpse()` do? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:21:42.690Z
card-last-reviewed:: 2022-10-06T17:21:42.691Z
card-last-score:: 5
- The `glimpse()` function gives us an overview of the dataset on each experimental unit.
- What does `dim()` do? #card
card-last-interval:: 0.81
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T17:48:26.089Z
card-last-reviewed:: 2022-10-08T22:48:26.089Z
card-last-score:: 3
- The `dim()` function returns the number of rows & columns in a dataframe.
-

View File

@ -0,0 +1,24 @@
- #[[ST2001 Labs]]
- **Previous Topic:** [[Using R as a Calculator]]
- **Next Topic:** null
-
- Most of our computation in R will be using data in table form, with rows & columns, which are often stored as a `data.frame` object.
- The rows usually represent different *observations*, (e.g., humans, cars, etc.) and the columns represent different *variables* (features of the observation), (e.g., height, weight, etc.).
-
- What does `glimpse()` do? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:21:42.690Z
card-last-reviewed:: 2022-10-06T17:21:42.691Z
card-last-score:: 5
- The `glimpse()` function gives us an overview of the dataset on each experimental unit.
- What does `dim()` do? #card
card-last-interval:: 7.48
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-17T22:44:50.782Z
card-last-reviewed:: 2022-10-10T11:44:50.783Z
card-last-score:: 5
- The `dim()` function returns the number of rows & columns in a dataframe.
-

View File

@ -0,0 +1,11 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** [[Random Variables]]
- **Next Topic:** [[The Normal Distribution]]
- **Relevant Slides:** ![Topic 6 - Binomial and Poisson.pdf](../assets/Topic_6_-_Binomial_and_Poisson_1665414148124_0.pdf)
-
- What is a **Bernoulli Trial**? #card
- A **Bernoulli Trial** is a random experiment with just two outcomes - success / failure.
- For a single trial, the random variable is:
- $$X = \begin{cases}1, & \text{success,} \\0, & \text{failure.}\end{cases}$$
-
-

View File

@ -0,0 +1,125 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** [[Random Variables]]
- **Next Topic:** [[The Normal Distribution]]
- **Relevant Slides:** ![Topic 6 - Binomial and Poisson.pdf](../assets/Topic_6_-_Binomial_and_Poisson_1665414148124_0.pdf)
-
- Often, the observations generated by different statistical experiments have the same type of behaviour.
- In general, only a handful of important probability distributions are needed to describe many of the discrete random variables encountered in practice.
-
- # Bernoulli Trials
collapsed:: true
- What is a **Bernoulli Trial**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-18T23:00:00.000Z
card-last-reviewed:: 2022-10-18T08:48:37.965Z
card-last-score:: 1
- A **Bernoulli Trial** is a random experiment with just two outcomes - success / failure.
- For a single trial, random variable:
- $$X = \begin{cases}1, & \text{success,} \\0, & \text{failure.}\end{cases}$$
- $P(X = 1) = p$ and $P(X=0) = 1 -p$, where $p$ is the success probability, or more compactly:
- $$P(X = x) = p^x{(1-p)^{1-x}} \ \ \ \ \ x = 0,1$$
- What is the **expected value** of a Bernoulli Trial? #card
- $$E[X] = (0)(1-p)+(1)p = p$$
- What is the **variance** of a Bernoulli Trial? #card
- $$Var(X) = p(1-p)$$
- ## Bernoulli Trial Assumptions #card
- The outcomes of the trials are mutually **independent**.
- The probability of success $p$ is **constant** over trials.
- Note that these assumptions may not always be appropriate.
- ## Example: Camera Flash Tests
id:: 6368f276-bc7e-4d91-b7fb-c5b34c4c6feb
- The time to recharge the flash is tested in three mobile phone cameras. The probability that a camera passes the test is 0.8, and the cameras perform independently.
background-color:: green
- The random variable $X$ denotes the number of cameras that pass the test. The last column of the table shows the values of $X$ assigned to each outcome of the experiment.
background-color:: green
- What is the probability that the first & second cameras pass the test, and the third one fails?
background-color:: green
- ![image.png](../assets/image_1667822368192_0.png)
- Each camera test can be treated as a **Bernoulli Trial**.
- $$P(PPF) = (0.8)(0.8)(0.2) = 0.128$$
- What is the probability that two cameras pass the test in three trials?
background-color:: green
- How many ways can this event happen?
- $$\binom{n}{r} = \frac{n!}{r!(n-r)!} = \frac{3!}{2!(3-2)!} = 3$$
- What is the probability of this event?
- 0.128 for each of the three ways.
- Probability = $3(0.128) = 0.384$.
- This is an example of the **Binomial Distribution**.
-
- # The Binomial Distribution
collapsed:: true
- What is the **binomial random variable**? #card
- A random experiment consists of $n$ Bernoulli trials such that:
- 1. The trials are independent.
2. Each trial results in only two possible outcomes, labelled as "success" & "failure".
3. The probability of a success in each trial, denoted $p$, remains constant.
- The random variable $X$ that equals the number of trials that result in a success is a **binomial random variable** with parameters $0 < p < 1$ and $n = 1, 2, \cdots$.
- The **probability mass function** of $X$ is
- $$f(x) = \binom{n}{x}p^x (1-p)^{n-x} \ \ \ \ \ x = 0,1,\cdots, n$$
- ## Example: Camera Flash Tests
- See ((6368f276-bc7e-4d91-b7fb-c5b34c4c6feb)) for whole question.
background-color:: green
- Calculate the probability of 2 passes in 3 tests.
background-color:: green
- We are given that $n = 3$ and $p = 0.8$.
- Use the Binomial Distribution formula where $X$ is the number of passes:
- $$P(X = 2) = \binom{3}{2}(0.8)^2(0.2)^1 = 3(0.128) = 0.384$$
- ## Example: Organic Pollution
id:: 6368f570-83e7-4642-a881-7ccd40bb0399
- Each sample of water has a 10% chance of containing a particular organic pollutant. Assume that the samples are independent with regard to the presence of the pollutant.
background-color:: green
- Find the probability that, in the next 18 samples, exactly 2 contain the pollutant.
background-color:: green
- Let $X$ denote the number of samples that contain the pollutant in the next 18 samples analysed. Then $X$ is a binomial random variable with $p = 0.1$ and $n = 18$.
- $$P(X = 2) = \binom{18}{2}(0.1)^2(0.9)^{18-2} = 153(0.1)^2(0.9)^{16} = 0.2835$$
- Determine the probability that $3 \leq X < 7$.
background-color:: green
- $$X = 3,4,5,6$$
- $$P(3 \leq X < 7) = P(X=3) + P(X=4) + P(X=5) + P(X=6)$$
- $$ \text{or}$$
- $$P(3 \leq X < 7) = \sum^6_{x=3} \binom{18}{x}(0.1)^x(0.9)^{18-x}$$
- $$ = 0.168 + 0.070 + 0.022 + 0.005 = 0.265$$
- ## Binomial Distributions in R #card
- `dbinom(x, size, prob)`, where `x` is the number of events required, `size` is the total number of trials, & `prob` is the probability of the event occurring.
- ### Example: Organic Pollution
- In ((6368f570-83e7-4642-a881-7ccd40bb0399)), `x=2`, `size=18`, & `prob=0.1`.
background-color:: green
- ```R
dbinom(x=2, size=18, prob=0.1)
[1] 0.2835121
```
- ## Binomial Mean & Variance #card
- If $X$ is a **binomial random variable** with parameters $p$ & $n$:
- The **mean** & **variance** of the binomial distribution $b(x; n,p)$ are
- $$\mu = np \text{ and } \sigma^2 = npq \text{, where } q = 1-p$$
- ## Chebyshev's Inequality
- What is **Chebyshev's Inequality**? #card
- **Chebyshev's Inequality** provides an estimate as to where a certain percentage of observations will lie relative to the mean once the **standard deviation** is known.
- For example, at least 75% of values will lie within two standard deviations of the mean.
-
- # Poisson Distribution
- What are **Poisson Experiments**? #card
- Experiments yielding numerical values of a random variable $X$, the number of outcomes occurring during a given time interval or in a specified region, are called **Poisson Experiments**.
- The given time interval may be of any length, such as a minute, a day, a week, a month, or even a year.
- A Poisson Experiment is derived from the **Poisson Process** and possesses the following properties:
- The number of outcomes occurring in one time interval or specified region of space is **independent** of the number that occur in any other disjoint time interval or region. In this sense, we say that the Poisson Process "has no memory".
- The probability that a single outcome will occur during a very short time interval or in a small region is **proportional** to the **length** of the time interval or the size of the region, and does not depend on the number of outcomes occurring outside this time interval or region.
- The probability that more than one outcome will occur in such a short time interval or fall in such a small region is **negligible**.
- What is the **Poisson Distribution**? #card
- The random variable $X$ that equals the number of events in a Poisson Process is a **Poisson Random Variable** with parameter $\lambda > 0$, and its probability mass function (referred to in these notes as its density function) is
- $$f(x) = \frac{e^{-\lambda}\lambda^x}{x!} \text{ for } x = 0,1,2,3,\cdots$$
- ## Mean & Variance of Poisson Distribution
- If $\lambda$ is the average number of successes occurring in a given time interval or region in the Poisson Distribution, then the **mean** & the **variance** of the Poisson distribution are both equal to $\lambda$.
- Mean = $\lambda$, variance = $\lambda$.
- It is a one-parameter distribution.
- ## Poisson Density Functions for Different Means
- ![image.png](../assets/image_1667824994941_0.png)
- If the variance is much greater than the mean, then the Poisson Distribution would not be a good model for the distribution of the random variable.
- ## Poisson Example: Calculations for Wire Flaws
- Suppose that the number of flaws on a thin copper wire follows a Poisson Distribution with a mean of 2.3 flaws per millimetre.
background-color:: green
- Find the probability of exactly 2 flaws in 1mm of wire.
background-color:: green
- $$P(X = 2) = \frac{e^{-2.3}(2.3)^{2}}{2!} = 0.265$$

View File

@ -0,0 +1,151 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** [[Random Variables]]
- **Next Topic:** [[The Normal Distribution]]
- **Relevant Slides:** ![Topic 6 - Binomial and Poisson.pdf](../assets/Topic_6_-_Binomial_and_Poisson_1665414148124_0.pdf)
-
- Often, the observations generated by different statistical experiments have the same type of behaviour.
- In general, only a handful of important probability distributions are needed to describe many of the discrete random variables encountered in practice.
-
- # Bernoulli Trials
collapsed:: true
- What is a **Bernoulli Trial**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-18T23:00:00.000Z
card-last-reviewed:: 2022-10-18T08:48:37.965Z
card-last-score:: 1
- A **Bernoulli Trial** is a random experiment with just two outcomes - success / failure.
- For a single trial, random variable:
- $$X = \begin{cases}1, & \text{success,} \\0, & \text{failure.}\end{cases}$$
- $P(X = 1) = p$ and $P(X=0) = 1 -p$, where $p$ is the success probability, or more compactly:
- $$P(X = x) = p^x{(1-p)^{1-x}} \ \ \ \ \ x = 0,1$$
- What is the **expected value** of a Bernoulli Trial? #card
- $$E[X] = (0)(1-p)+(1)p = p$$
- What is the **variance** of a Bernoulli Trial? #card
- $$Var(X) = p(1-p)$$
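- A short derivation of this result, assuming the standard identity $Var(X) = E[X^2] - (E[X])^2$ and noting that $E[X^2] = (0)^2(1-p) + (1)^2p = p$:
- $$Var(X) = E[X^2] - (E[X])^2 = p - p^2 = p(1-p)$$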
- ## Bernoulli Trial Assumptions #card
- The outcomes of the trials are mutually **independent**.
- The probability of success $p$ is **constant** over trials.
- Note that these assumptions may not always be appropriate.
- ## Example: Camera Flash Tests
id:: 6368f276-bc7e-4d91-b7fb-c5b34c4c6feb
- The time to recharge the flash is tested in three mobile phone cameras. The probability that a camera passes the test is 0.8, and the cameras perform independently.
background-color:: green
- The random variable $X$ denotes the number of cameras that pass the test. The last column of the table shows the values of $X$ assigned to each outcome of the experiment.
background-color:: green
- What is the probability that the first & second cameras pass the test, and the third one fails?
background-color:: green
- ![image.png](../assets/image_1667822368192_0.png)
- Each camera test can be treated as a **Bernoulli Trial**.
- $$P(PPF) = (0.8)(0.8)(0.2) = 0.128$$
- What is the probability that two cameras pass the test in three trials?
background-color:: green
- How many ways can this event happen?
- $$\binom{n}{r} = \frac{n!}{r!(n-r)!} = \frac{3!}{2!(3-2)!} = 3$$
- What is the probability of this event?
- 0.128 for each of the three ways.
- Probability = $3(0.128) = 0.384$.
- This is an example of the **Binomial Distribution**.
-
- # The Binomial Distribution
collapsed:: true
- What is a **binomial random variable**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-10T00:00:00.000Z
card-last-reviewed:: 2022-11-09T12:39:57.976Z
card-last-score:: 1
- A random experiment consists of $n$ Bernoulli trials such that:
- 1. The trials are independent.
2. Each trial results in only two possible outcomes, labelled as "success" & "failure".
3. The probability of a success in each trial, denoted $p$, remains constant.
- The random variable $X$ that equals the number of trials that result in a success is a **binomial random variable** with parameters $0 < p < 1$ and $n = 1, 2, \cdots$.
- The **probability mass function** of $X$ is
- $$f(x) = \binom{n}{x}p^x (1-p)^{n-x} \ \ \ \ \ x = 0,1,\cdots, n$$
- ## Example: Camera Flash Tests
- See ((6368f276-bc7e-4d91-b7fb-c5b34c4c6feb)) for whole question.
background-color:: green
- Calculate the probability of 2 passes in 3 tests.
background-color:: green
- We are given that $n = 3$ and $p = 0.8$.
- Use the Binomial Distribution formula where $X$ is the number of passes:
- $$P(X = 2) = \binom{3}{2}(0.8)^2(0.2)^1 = 3(0.128) = 0.384$$
- ## Example: Organic Pollution
id:: 6368f570-83e7-4642-a881-7ccd40bb0399
- Each sample of water has a 10% chance of containing a particular organic pollutant. Assume that the samples are independent with regard to the presence of the pollutant.
background-color:: green
- Find the probability that, in the next 18 samples, exactly 2 contain the pollutant.
background-color:: green
- Let $X$ denote the number of samples that contain the pollutant in the next 18 samples analysed. Then $X$ is a binomial random variable with $p = 0.1$ and $n = 18$.
- $$P(X = 2) = \binom{18}{2}(0.1)^2(0.9)^{18-2} = 153(0.1)^2(0.9)^{16} = 0.2835$$
- Determine the probability that $3 \leq X < 7$.
background-color:: green
- $$X = 3,4,5,6$$
- $$P(3 \leq X < 7) = P(X=3) + P(X=4) + P(X=5) + P(X=6)$$
- $$ \text{or}$$
- $$P(3 \leq X < 7) = \sum^6_{x=3} \binom{18}{x}(0.1)^x(0.9)^{18-x}$$
- $$ = 0.168 + 0.070 + 0.022 + 0.005 = 0.265$$
- ## Binomial Distributions in R #card
- `dbinom(x, size, prob)`, where `x` is the number of events required, `size` is the total number of trials, & `prob` is the probability of the event occurring.
- ### Example: Organic Pollution
- In ((6368f570-83e7-4642-a881-7ccd40bb0399)), `x=2`, `size=18`, & `prob=0.1`.
background-color:: green
- ```R
dbinom(x=2, size=18, prob=0.1)
[1] 0.2835121
```
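- The same figures can be cross-checked outside R. The following is a minimal Python sketch that evaluates the binomial probability mass function directly with the standard library; `binom_pmf` is an illustrative helper name, not an R or course function.
- ```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p_two = binom_pmf(2, 18, 0.1)                              # exactly 2 of 18 samples
p_range = sum(binom_pmf(x, 18, 0.1) for x in range(3, 7))  # 3 <= X < 7

print(round(p_two, 4))    # 0.2835
print(round(p_range, 3))  # 0.265
```
- Both values agree with the Organic Pollution results worked out by hand above.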
- ## Binomial Mean & Variance #card
- If $X$ is a **binomial random variable** with parameters $p$ & $n$:
- The **mean** & **variance** of the binomial distribution $b(x; n,p)$ are
- $$\mu = np \text{ and } \sigma^2 = npq \text{, where } q = 1-p$$
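- As a worked instance, applying these formulas to the Organic Pollution example ($n = 18$, $p = 0.1$, so $q = 0.9$) gives:
- $$\mu = 18(0.1) = 1.8 \text{ and } \sigma^2 = 18(0.1)(0.9) = 1.62$$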
- ## Chebyshev's Inequality
- What is **Chebyshev's Inequality**? #card
- **Chebyshev's Inequality** provides an estimate as to where a certain percentage of observations will lie relative to the mean once the **standard deviation** is known.
- For example, at least 75% of values will lie within two standard deviations of the mean.
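- The 75% figure follows from the usual statement of the inequality, given here as a reference form with mean $\mu$, standard deviation $\sigma$, and any $k > 1$ (this notation is assumed and may differ from the slides):
- $$P(|X - \mu| < k\sigma) \geq 1 - \frac{1}{k^2}$$
- Taking $k = 2$ gives $1 - \frac{1}{4} = 0.75$.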
-
- # Poisson Distribution
- What are **Poisson Experiments**? #card
- Experiments yielding numerical values of a random variable $X$, the number of outcomes occurring during a given time interval or in a specified region, are called **Poisson Experiments**.
- The given time interval may be of any length, such as a minute, a day, a week, a month, or even a year.
- A Poisson Experiment is derived from the **Poisson Process** and possesses the following properties:
- The number of outcomes occurring in one time interval or specified region of space is **independent** of the number that occur in any other disjoint time interval or region. In this sense, we say that the Poisson Process "has no memory".
- The probability that a single outcome will occur during a very short time interval or in a small region is **proportional** to the **length** of the time interval or the size of the region, and does not depend on the number of outcomes occurring outside this time interval or region.
- The probability that more than one outcome will occur in such a short time interval or fall in such a small region is **negligible**.
- What is the **Poisson Distribution**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-10T00:00:00.000Z
card-last-reviewed:: 2022-11-09T12:38:39.167Z
card-last-score:: 1
- The random variable $X$ that equals the number of events in a Poisson Process is a **Poisson Random Variable** with parameter $\lambda > 0$, and its probability mass function (referred to in these notes as its density function) is
- $$f(x) = \frac{e^{-\lambda}\lambda^x}{x!} \text{ for } x = 0,1,2,3,\cdots$$
- ## Mean & Variance of Poisson Distribution
- If $\lambda$ is the average number of successes occurring in a given time interval or region in the Poisson Distribution, then the **mean** & the **variance** of the Poisson distribution are both equal to $\lambda$.
- Mean = $\lambda$, variance = $\lambda$.
- It is a one-parameter distribution.
- ## Poisson Density Functions for Different Means
- ![image.png](../assets/image_1667824994941_0.png)
- If the variance is much greater than the mean, then the Poisson Distribution would not be a good model for the distribution of the random variable.
- ## Poisson Example: Calculations for Wire Flaws
- Suppose that the number of flaws on a thin copper wire follows a Poisson Distribution with a mean of 2.3 flaws per millimetre.
background-color:: green
- Find the probability of exactly 2 flaws in 1mm of wire.
background-color:: green
- $$P(X = 2) = \frac{e^{-2.3}(2.3)^{2}}{2!} = 0.265$$
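- The calculation can be checked numerically. This is a minimal Python sketch of the Poisson formula using only the standard library; `poisson_pmf` is an illustrative helper name.
- ```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


p_two_flaws = poisson_pmf(2, 2.3)  # exactly 2 flaws in 1mm of wire
print(round(p_two_flaws, 3))       # 0.265
```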
- ## Poisson Example: Car Park
- A car park has 3 entrances, $A$, $B$, & $C$. The number of cars per hour entering through each of these is Poisson-distributed with mean $\lambda_A = 1.5$, $\lambda_B = 1.0$, and $\lambda_C = 2.5$. Arrivals at each entrance are **independent**.
background-color:: green
- $T$ is the total number of cars entering in an hour.
- $$T \sim \text{ Poisson}(\lambda_A + \lambda_B + \lambda_C) \equiv \text{Poisson}(1.5 + 1.0 + 2.5) \equiv \text{Poisson}(5)$$
- $$P(T = 4) = \frac{e^{-5} 5^4}{4!} = 0.1755$$
- ## Sum of Independent Poisson Random Variables #card
- If $X_1, X_2, \cdots, X_n$ are independently Poisson distributed with parameters $\lambda_1, \lambda_2, \cdots, \lambda_n$ then
- $$T = X_1 + X_2 + \cdots + X_n \text{ is Poisson}(\lambda_1 + \lambda_2 + \cdots + \lambda_n)$$
- and
- $$E[T] = \lambda_1 + \lambda_2 + \cdots + \lambda_n$$
- and
- $$\text{Var}(T) = \lambda_1 + \lambda_2 + \cdots + \lambda_n$$
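- One way to see this result in action is to compute $P(T = 4)$ for the Car Park example twice: once by summing over every way the three entrances can share 4 cars, and once from a single Poisson distribution with mean 5. This is a sketch in Python with illustrative helper names (`poisson_pmf`, `total_pmf`); it is a numerical check, not a proof.
- ```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


def total_pmf(t, rates):
    """P(X_1 + ... + X_n = t) for independent Poisson counts, by direct convolution."""
    if len(rates) == 1:
        return poisson_pmf(t, rates[0])
    # Condition on the first count being k, then the rest must sum to t - k.
    return sum(poisson_pmf(k, rates[0]) * total_pmf(t - k, rates[1:])
               for k in range(t + 1))


rates = [1.5, 1.0, 2.5]                  # entrances A, B, and C
by_convolution = total_pmf(4, rates)
by_theorem = poisson_pmf(4, sum(rates))  # Poisson(5)

print(round(by_convolution, 4), round(by_theorem, 4))
```
- The two values should agree up to floating-point error, matching the 0.1755 found in the Car Park example.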
-

View File

@ -0,0 +1,313 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Aggregate Clauses, Group By, & Having Clauses]]
- **Next Topic:** null
- **Relevant Slides:** ![ER-models.pdf](../assets/ER-models_1664888140370_0.pdf)
-
- What are **Data Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-09T23:00:00.000Z
card-last-reviewed:: 2022-10-08T23:00:14.804Z
card-last-score:: 1
- **Data models** are concepts to describe the structure of a database.
- They comprise:
- High level or logical models;
- Representational / Implementational data models;
- Physical data models.
- Data models allow for database abstraction.
- What are **ER Models**? #card
card-last-interval:: 3.21
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T15:21:53.036Z
card-last-reviewed:: 2022-10-07T10:21:53.036Z
card-last-score:: 3
- **Entity Relationship Models** are a top-down approach to database design that provide a way to *model the data* that will be stored in a system. The models are then used to *create tables* in the relational model.
- ER Models are used to identify:
- [1] The important data to be stored in a database called **entities**.
- [2] The **relationships** between the entities.
- [3] The **attributes** of entities.
- [4] The **constraints** of relationships & entities.
-
- # ER Model Notation
- A number of different notations can be used to represent the same model.
- Chen Notation.
- IE Crow's Foot Notation.
- UML.
- Integrated Definition 1. Extended (IDEF1X).
-
- The original (Chen) notation uses diamonds, rectangles, and ellipses.
- This is easier to hand-draw, so it is useful in an exam situation.
- It is less implementation-oriented than other notations.
-
- # Some Definitions
collapsed:: true
- ## Entities
collapsed:: true
- What is an **entity type**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:23:11.260Z
card-last-score:: 1
- An **entity type** is a collection of *entity instances* that share common properties or characteristics.
- It is a group of objects, with the same properties, which are identified as having an independent existence.
- ![image.png](../assets/image_1664889325926_0.png)
- What is an **entity instance**? #card
card-last-interval:: 3.18
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T21:24:28.725Z
card-last-reviewed:: 2022-10-06T17:24:28.726Z
card-last-score:: 3
- An **entity instance** or **entity occurrence** is a single, uniquely identifiable occurrence of an entity type (e.g., row in a table).
- ![image.png](../assets/image_1664889343582_0.png)
- ## Relationships
- What is a **relationship type**? #card
card-last-interval:: 2.6
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T05:10:30.449Z
card-last-reviewed:: 2022-10-08T15:10:30.450Z
card-last-score:: 5
collapsed:: true
- A **relationship type** is a set of meaningful relationships among entity types.
- **Chen's Notation:** A diamond shape is used to name the relationship. 1 and M/N are used for the "1" and "many" sides respectively.
- **Crow's Foot Notation:** The titular "crow's foot" is used as the representation of "many", and one line is used for the representation of "1".
- ![image.png](../assets/image_1664890907305_0.png)
- **Example:** employee "works for" department. department "has" employee.
- ![image.png](../assets/image_1664889461415_0.png)
- What is a **relationship instance**? #card
collapsed:: true
card-last-interval:: 2.33
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T22:25:34.684Z
card-last-reviewed:: 2022-10-08T15:25:34.685Z
card-last-score:: 5
			- A **relationship instance** or **relationship occurrence** is a uniquely identifiable association that includes one occurrence from each participating entity type. It can be read left to right or right to left.
- **Example:**
- Left-to-Right: John Smith "works for" Research department.
- Right-to-Left: Research department "has" John Smith.
- What is the **degree** of a relationship type? #card
card-last-interval:: 3.57
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T22:44:53.027Z
card-last-reviewed:: 2022-10-06T09:44:53.027Z
card-last-score:: 5
collapsed:: true
- Whenever an attribute of one entity type refers to another entity type, some relationship exists.
- The **degree** of a relationship type is the number of participating entity types.
- Relationship types may have certain constraints.
- What is the **Cardinality Ratio**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T22:45:27.674Z
card-last-score:: 1
- The **cardinality ratio** specifies ^^the number of relationship instances that an entity can participate in.^^
- The possible cardinality ratios for binary relationship types are:
- $1:1$, "one to one" - at most one instance of entity $A$ is associated with one instance of entity $B$.
- $1:N$, "one to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$.
- $M:N$, "many to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, and for one instance of entity $B$, there are 0 or more instances of entity $A$.
-
- ### Structural Constraints on Relationships
- Often, we may know the min & max of the cardinalities.
- Example: limit on the number of books that can be borrowed from a library.
- What are **Structural Constraints**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:17:03.860Z
card-last-score:: 1
- **Structural constraints** specify a pair of integer numbers *(min, max)* for each entity participating in a relationship.
					- Examples: (0, 1), (1, 1), (1, N), (1, 7).
-
- ## Attributes
collapsed:: true
- What are **Attributes**? #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:21:54.114Z
card-last-reviewed:: 2022-10-06T17:21:54.115Z
card-last-score:: 5
collapsed:: true
			- **Attributes** are ^^named properties^^ or characteristics of an entity.
- Each entity has a set of attributes associated with it.
- Several types of attributes exist:
- Key.
- Composite.
- Derived.
- Multi-valued.
- **Notation:**
- **Chen:** An oval enclosing the name of the attribute.
- ![image.png](../assets/image_1664889737597_0.png)
- **Crow:** Listed in the entity box.
- ![image.png](../assets/image_1664889775094_0.png)
- What are **key attributes**? #card
card-last-interval:: 4.04
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T10:47:07.833Z
card-last-reviewed:: 2022-10-07T10:47:07.833Z
card-last-score:: 5
collapsed:: true
- Each entity type must have an attribute or set of attributes that ^^uniquely identifies^^ each instance from other instances of the same type.
- A **candidate key** is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- A **primary key (PK)** is a candidate key that has been selected as the identifier for an entity type.
- What is a **composite attribute**? #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T19:44:43.280Z
card-last-reviewed:: 2022-10-06T09:44:43.280Z
card-last-score:: 3
collapsed:: true
- A **composite attribute** is an attribute that is composed of several atomic (simple) attributes.
- If the composite attribute is referenced as a whole only, then there is no need to subdivide it into component attributes, otherwise you should divide it:
- ![image.png](../assets/image_1664890028074_0.png)
- What is a **derived attribute**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T20:18:30.906Z
card-last-reviewed:: 2022-10-07T10:18:30.906Z
card-last-score:: 3
collapsed:: true
- A **derived attribute** is an attribute whose values can be determined from another attribute.
- For Chen's notation, the notation is a *dotted oval*.
- ![image.png](../assets/image_1664890167971_0.png)
- For Crow's Foot notation, derived attributes can be represented by enclosing the attribute in [square brackets], e.g., [age].
- What is a **multi-valued attribute**? #card
collapsed:: true
card-last-interval:: 0.47
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T02:33:42.655Z
card-last-reviewed:: 2022-10-08T15:33:42.655Z
card-last-score:: 3
			- A **multi-valued attribute** is an attribute that can hold more than one value for a single entity instance, with lower & upper bounds on the number of values.
- For Chen's notation, multi-valued attributes can be represented by one oval inside another.
- ![image.png](../assets/image_1664890277505_0.png)
- For Crow's Foot notation, multi-valued attributes can be represented by enclosing the attribute in {curly brackets}.
- E.g., {skills}.
-
-
- # Total & Partial Participation
collapsed:: true
- What is **Total Participation**? #card
card-last-interval:: 3.84
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T13:20:26.940Z
card-last-reviewed:: 2022-10-06T17:20:26.941Z
card-last-score:: 3
- **Total Participation** (Mandatory Participation): All instances of an entity must participate in the relationship, i.e., *every* entity instance in one set *must* be related to an entity instance in the second via the relationship.
- For example, the entity "Student" must have **total participation** in the relationship "enrolled" with the entity "Course" - each student *must* be enrolled in a course.
- What is **Partial Participation**? #card
card-last-interval:: 2.51
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T03:07:33.950Z
card-last-reviewed:: 2022-10-08T15:07:33.952Z
card-last-score:: 5
- **Partial Participation** (Optional Participation): Some subset of instances of an entity will participate in the relationship, but not all, i.e., *some* entity instances in one set are related to an entity instance in the second via the relationship.
- For example, the entity "Course" would have a **partial participation** in the relationship "enrolled" with the entity "Student" - a course might not have any students enrolled in it.
- ## Chen's Notation for Participation #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-18T05:50:33.140Z
card-last-reviewed:: 2022-10-09T08:50:33.141Z
card-last-score:: 3
		- In both total & partial participation, the line(s) are drawn from the participating entity to the relationship (the diamond) to indicate that instances of that entity participate in the relationship:
- **Total Participation:** Double parallel lines.
- ![image.png](../assets/image_1664968200677_0.png)
- **Partial Participation:** Single line.
- ![image.png](../assets/image_1664968259633_0.png)
- Examples:
- ![image.png](../assets/image_1664968387620_0.png)
-
- ## Crow's Foot Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:33:14.077Z
card-last-score:: 1
collapsed:: true
- Use the idea of **Ordinality / Optionality**.
			- **Optionality of 0:** If an entity $A$ has partial participation in a relationship to entity $B$, then $A$ is associated with 0 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
				- The optionality sign for an **Optionality of 0** is a **circle** or "o": $\bigcirc$
			- **Optionality of 1:** If an entity $A$ has total participation in a relationship to entity $B$, then $A$ is associated with at least 1 instance of entity $B$, so the **optionality sign** goes beside $B$.
				- The optionality sign for an **Optionality of 1** is a **bar**: $|$
- [And vice-versa.]
- In Crow's Foot notation, there is no diamond, so there is always a direct relationship line between the entities.
- The **optionality sign** is drawn on this line.
- The optionality drawn beside some entity $A$ refers to how an instance of entity $B$ is related to entity $A$.
- That is, whether $B$ can be involved partially (0) or not (1).
- Example in *Right to Left* Relationships:
- ![image.png](../assets/image_1664969067049_0.png)
-
- ## Note on Weak Entities
- **Note:** ^^A **weak entity type** always has a total participation constraint.^^
- We need to show the "identifying relationship".
- ![image.png](../assets/image_1664969168410_0.png)
- ### Chen's Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:31:46.745Z
card-last-score:: 1
collapsed:: true
- **Entity:** Double rectangle.
- **Relationship:** Double diamond.
- The weak entity has full participation in the relationship.
- ![image.png](../assets/image_1664969357926_0.png)
-
- ### Crow's Foot Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:25:11.281Z
card-last-score:: 1
collapsed:: true
- In Crow's Foot Notation, we can represent the **weak entity** as a normal entity, but do not choose any attributes as the primary key.
- For an attribute that partially determines the entity instances, we choose the "required" option.
- We usually represent the relationship between entities using a **solid line**.
- This indicates that it is an "identifying" relationship.
- ![image.png](../assets/image_1664969566989_0.png)
-
-
- In general, with entities, there may be two valid solutions, one with a weak entity, and one without.
- There is not a huge difficulty if you do not identify weak entities in a solution so long as all the entities have **primary attributes**.
- It may be slightly non-optimal in terms of introducing an additional primary key that is not needed, but this is not a huge problem for us at this level.
-
- # Entities or Multi-Valued Attributes?
collapsed:: true
- Sometimes, it may not be clear whether something should be modelled as a multi-valued attribute or an entity.
- Both may be equally correct, as long as you represent all the information that you are asked to.
- You may see (very little) difference between the two approaches if you map either approach to tables in a database.
-
- # Steps to Create an ER Model
- 1. Identify entities.
2. Identify relationships between entities.
3. Draw entities & relationships.
4. Add attributes to entities (& relationships, if appropriate).
5. Add cardinalities to relationships.
6. Add participation constraints (total or partial) to relationships.
7. Check that all entities have primary keys identified.
-
- # Mapping ER Models to Tables in the Relational Model
	- Once you have your ER diagram, you need to convert it into a set of tables so that you can implement it in the relational model.
- Example: As MySQL tables using
-

View File

@ -0,0 +1,340 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Aggregate Clauses, Group By, & Having Clauses]]
- **Next Topic:** [[Joins & Union Queries]]
- **Relevant Slides:** ![ER-models.pdf](../assets/ER-models_1664888140370_0.pdf)
-
- What are **Data Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-09T23:00:00.000Z
card-last-reviewed:: 2022-10-08T23:00:14.804Z
card-last-score:: 1
- **Data models** are concepts to describe the structure of a database.
- They comprise:
- High level or logical models;
- Representational / Implementational data models;
- Physical data models.
- Data models allow for database abstraction.
- What are **ER Models**? #card
card-last-interval:: 3.21
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T15:21:53.036Z
card-last-reviewed:: 2022-10-07T10:21:53.036Z
card-last-score:: 3
- **Entity Relationship Models** are a top-down approach to database design that provide a way to *model the data* that will be stored in a system. The models are then used to *create tables* in the relational model.
- ER Models are used to identify:
- [1] The important data to be stored in a database called **entities**.
- [2] The **relationships** between the entities.
- [3] The **attributes** of entities.
		- [4] The **constraints** of relationships & entities.
-
- # ER Model Notation
- A number of different notations can be used to represent the same model.
- Chen Notation.
- IE Crow's Foot Notation.
- UML.
- Integrated Definition 1. Extended (IDEF1X).
-
	- The original (Chen) notation uses diamonds, rectangles, and ellipses.
- This is easier to hand-draw, so it is useful in an exam situation.
- It is less implementation-oriented than other notations.
-
- # Some Definitions
collapsed:: true
- ## Entities
collapsed:: true
- What is an **entity type**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:37:15.394Z
card-last-score:: 1
			- An **entity type** is a collection of *entity instances* that share common properties or characteristics.
- It is a group of objects, with the same properties, which are identified as having an independent existence.
- ![image.png](../assets/image_1664889325926_0.png)
- What is an **entity instance**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:45:57.663Z
card-last-reviewed:: 2022-10-10T11:45:57.664Z
card-last-score:: 5
- An **entity instance** or **entity occurrence** is a single, uniquely identifiable occurrence of an entity type (e.g., row in a table).
- ![image.png](../assets/image_1664889343582_0.png)
- ## Relationships
- What is a **relationship type**? #card
card-last-interval:: 2.6
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T05:10:30.449Z
card-last-reviewed:: 2022-10-08T15:10:30.450Z
card-last-score:: 5
collapsed:: true
- A **relationship type** is a set of meaningful relationships among entity types.
- **Chen's Notation:** A diamond shape is used to name the relationship. 1 and M/N are used for the "1" and "many" sides respectively.
- **Crow's Foot Notation:** The titular "crow's foot" is used as the representation of "many", and one line is used for the representation of "1".
- ![image.png](../assets/image_1664890907305_0.png)
- **Example:** employee "works for" department. department "has" employee.
- ![image.png](../assets/image_1664889461415_0.png)
- What is a **relationship instance**? #card
collapsed:: true
card-last-interval:: 2.33
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T22:25:34.684Z
card-last-reviewed:: 2022-10-08T15:25:34.685Z
card-last-score:: 5
			- A **relationship instance** or **relationship occurrence** is a uniquely identifiable association that includes one occurrence from each participating entity type. It can be read left to right or right to left.
- **Example:**
- Left-to-Right: John Smith "works for" Research department.
- Right-to-Left: Research department "has" John Smith.
- What is the **degree** of a relationship type? #card
card-last-interval:: 3.57
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T22:44:53.027Z
card-last-reviewed:: 2022-10-06T09:44:53.027Z
card-last-score:: 5
collapsed:: true
- Whenever an attribute of one entity type refers to another entity type, some relationship exists.
- The **degree** of a relationship type is the number of participating entity types.
- Relationship types may have certain constraints.
- What is the **Cardinality Ratio**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:41:32.658Z
card-last-score:: 1
- The **cardinality ratio** specifies ^^the number of relationship instances that an entity can participate in.^^
- The possible cardinality ratios for binary relationship types are:
- $1:1$, "one to one" - at most one instance of entity $A$ is associated with one instance of entity $B$.
- $1:N$, "one to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$.
- $M:N$, "many to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, and for one instance of entity $B$, there are 0 or more instances of entity $A$.
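			- As a rough illustration of the three ratios above, the sketch below classifies a binary relationship from a sample of its relationship instances. The helper name `cardinality_ratio` and the sample identifiers are hypothetical and not part of the course material:
			  ```python
			  from collections import Counter


			  def cardinality_ratio(pairs):
			      """Classify a binary relationship from its (a, b) instance pairs."""
			      pairs = set(pairs)  # ignore repeated relationship instances
			      per_a = Counter(a for a, _ in pairs)  # number of B instances linked to each A
			      per_b = Counter(b for _, b in pairs)  # number of A instances linked to each B
			      a_many = any(n > 1 for n in per_a.values())
			      b_many = any(n > 1 for n in per_b.values())
			      if a_many and b_many:
			          return "M:N"
			      if a_many or b_many:
			          return "1:N"
			      return "1:1"


			  # Two employees in the same department: "many" on one side only.
			  works_for = [("e1", "d1"), ("e2", "d1")]
			  ratio = cardinality_ratio(works_for)
			  ```
			  A sample of data can only suggest a ratio; the real cardinality ratio is a design decision taken from the requirements, so treat this as a sanity check rather than a definition.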
-
- ### Structural Constraints on Relationships
- Often, we may know the min & max of the cardinalities.
- Example: limit on the number of books that can be borrowed from a library.
- What are **Structural Constraints**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:34:55.501Z
card-last-score:: 1
- **Structural constraints** specify a pair of integer numbers *(min, max)* for each entity participating in a relationship.
					- Examples: (0, 1), (1, 1), (1, N), (1, 7).
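				- A minimal sketch of checking a *(min, max)* constraint against sample data, using the library example. The function `violations`, the constant `N`, and the member & book identifiers are assumptions made for illustration:
				  ```python
				  import math
				  from collections import Counter

				  N = math.inf  # stands in for an unbounded "many" maximum


				  def violations(instances, pairs, constraint):
				      """Return the instances whose number of relationship instances is outside (min, max)."""
				      lo, hi = constraint
				      counts = Counter(a for a, _ in pairs)  # instances that never appear count as 0
				      return [i for i in instances if not lo <= counts[i] <= hi]


				  members = ["m1", "m2"]
				  loans = [("m1", f"book{k}") for k in range(8)]  # m1 borrows 8 books, m2 borrows none

				  over_limit = violations(members, loans, (0, 7))  # at most 7 books per member
				  no_loans = violations(members, loans, (1, N))    # at least 1 book per member
				  ```
				  A minimum of 0 corresponds to partial participation and a minimum of 1 or more to total participation.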
-
- ## Attributes
collapsed:: true
- What are **Attributes**? #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:21:54.114Z
card-last-reviewed:: 2022-10-06T17:21:54.115Z
card-last-score:: 5
collapsed:: true
			- **Attributes** are ^^named properties^^ or characteristics of an entity.
- Each entity has a set of attributes associated with it.
- Several types of attributes exist:
- Key.
- Composite.
- Derived.
- Multi-valued.
- **Notation:**
- **Chen:** An oval enclosing the name of the attribute.
- ![image.png](../assets/image_1664889737597_0.png)
- **Crow:** Listed in the entity box.
- ![image.png](../assets/image_1664889775094_0.png)
- What are **key attributes**? #card
card-last-interval:: 4.04
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T10:47:07.833Z
card-last-reviewed:: 2022-10-07T10:47:07.833Z
card-last-score:: 5
collapsed:: true
- Each entity type must have an attribute or set of attributes that ^^uniquely identifies^^ each instance from other instances of the same type.
- A **candidate key** is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- A **primary key (PK)** is a candidate key that has been selected as the identifier for an entity type.
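			- The uniqueness requirement can be sketched as a check over sample entity instances. The helper `is_candidate_key` and the employee rows are hypothetical. The check only tests uniqueness on the data given; it does not prove that the attributes are unique for every possible instance, and it does not test that the key is minimal:
			  ```python
			  def is_candidate_key(rows, attrs):
			      """Return True if the values of `attrs` are unique across all rows (dicts)."""
			      seen = set()
			      for row in rows:
			          key = tuple(row[a] for a in attrs)
			          if key in seen:
			              return False  # two instances share the same values
			          seen.add(key)
			      return True


			  employees = [
			      {"emp_no": 1, "name": "John Smith", "dept": "Research"},
			      {"emp_no": 2, "name": "John Smith", "dept": "Sales"},
			  ]

			  by_number = is_candidate_key(employees, ["emp_no"])             # unique on this sample
			  by_name = is_candidate_key(employees, ["name"])                 # duplicated name
			  by_name_and_dept = is_candidate_key(employees, ["name", "dept"])
			  ```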
- What is a **composite attribute**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:45:42.013Z
card-last-reviewed:: 2022-10-10T11:45:42.014Z
card-last-score:: 5
collapsed:: true
- A **composite attribute** is an attribute that is composed of several atomic (simple) attributes.
- If the composite attribute is referenced as a whole only, then there is no need to subdivide it into component attributes, otherwise you should divide it:
- ![image.png](../assets/image_1664890028074_0.png)
- What is a **derived attribute**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T20:18:30.906Z
card-last-reviewed:: 2022-10-07T10:18:30.906Z
card-last-score:: 3
collapsed:: true
- A **derived attribute** is an attribute whose values can be determined from another attribute.
- For Chen's notation, the notation is a *dotted oval*.
- ![image.png](../assets/image_1664890167971_0.png)
- For Crow's Foot notation, derived attributes can be represented by enclosing the attribute in [square brackets], e.g., [age].
- What is a **multi-valued attribute**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:42:23.846Z
card-last-score:: 1
			- A **multi-valued attribute** is an attribute that can hold more than one value for a single entity instance, with lower & upper bounds on the number of values.
- For Chen's notation, multi-valued attributes can be represented by one oval inside another.
- ![image.png](../assets/image_1664890277505_0.png)
- For Crow's Foot notation, multi-valued attributes can be represented by enclosing the attribute in {curly brackets}.
- E.g., {skills}.
-
-
- # Total & Partial Participation
collapsed:: true
- What is **Total Participation**? #card
card-last-interval:: 3.84
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T13:20:26.940Z
card-last-reviewed:: 2022-10-06T17:20:26.941Z
card-last-score:: 3
- **Total Participation** (Mandatory Participation): All instances of an entity must participate in the relationship, i.e., *every* entity instance in one set *must* be related to an entity instance in the second via the relationship.
- For example, the entity "Student" must have **total participation** in the relationship "enrolled" with the entity "Course" - each student *must* be enrolled in a course.
- What is **Partial Participation**? #card
card-last-interval:: 2.51
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T03:07:33.950Z
card-last-reviewed:: 2022-10-08T15:07:33.952Z
card-last-score:: 5
- **Partial Participation** (Optional Participation): Some subset of instances of an entity will participate in the relationship, but not all, i.e., *some* entity instances in one set are related to an entity instance in the second via the relationship.
- For example, the entity "Course" would have a **partial participation** in the relationship "enrolled" with the entity "Student" - a course might not have any students enrolled in it.
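	- The two definitions above can be sketched as a check over sample data, reusing the Student / Course example. The function `participation` and the sample names are assumptions for illustration only:
	  ```python
	  def participation(instances, related):
	      """Classify how an entity type participates in a relationship.

	      instances: all instances of the entity type.
	      related:   the instances that appear in at least one relationship instance.
	      """
	      return "total" if set(instances) <= set(related) else "partial"


	  enrolled = [("alice", "CT230"), ("bob", "CT230")]  # hypothetical (student, course) pairs
	  students = ["alice", "bob"]
	  courses = ["CT230", "MA284"]

	  student_side = participation(students, [s for s, _ in enrolled])
	  course_side = participation(courses, [c for _, c in enrolled])
	  ```
	  Here every student is enrolled, so the student side looks total, while `MA284` has no students, so the course side is partial. As a constraint, total participation is decided at design time; sample data can only show that it has been violated.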
- ## Chen's Notation for Participation #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-18T05:50:33.140Z
card-last-reviewed:: 2022-10-09T08:50:33.141Z
card-last-score:: 3
		- In both total & partial participation, the line(s) are drawn from the participating entity to the relationship (the diamond) to indicate that instances of that entity participate in the relationship:
- **Total Participation:** Double parallel lines.
- ![image.png](../assets/image_1664968200677_0.png)
- **Partial Participation:** Single line.
- ![image.png](../assets/image_1664968259633_0.png)
- Examples:
- ![image.png](../assets/image_1664968387620_0.png)
-
- ## Crow's Foot Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:40:09.417Z
card-last-score:: 1
collapsed:: true
- Use the idea of **Ordinality / Optionality**.
			- **Optionality of 0:** If an entity $A$ has partial participation in a relationship to entity $B$, then $A$ is associated with 0 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
				- The optionality sign for an **Optionality of 0** is a **circle** or "o": $\bigcirc$
			- **Optionality of 1:** If an entity $A$ has total participation in a relationship to entity $B$, then $A$ is associated with at least 1 instance of entity $B$, so the **optionality sign** goes beside $B$.
				- The optionality sign for an **Optionality of 1** is a **bar**: $|$
- [And vice-versa.]
- In Crow's Foot notation, there is no diamond, so there is always a direct relationship line between the entities.
- The **optionality sign** is drawn on this line.
- The optionality drawn beside some entity $A$ refers to how an instance of entity $B$ is related to entity $A$.
- That is, whether $B$ can be involved partially (0) or not (1).
- Example in *Right to Left* Relationships:
- ![image.png](../assets/image_1664969067049_0.png)
-
- ## Note on Weak Entities
- **Note:** ^^A **weak entity type** always has a total participation constraint.^^
- We need to show the "identifying relationship".
- ![image.png](../assets/image_1664969168410_0.png)
- ### Chen's Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:39:33.993Z
card-last-score:: 1
collapsed:: true
- **Entity:** Double rectangle.
- **Relationship:** Double diamond.
- The weak entity has full participation in the relationship.
- ![image.png](../assets/image_1664969357926_0.png)
-
- ### Crow's Foot Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:37:34.222Z
card-last-score:: 1
collapsed:: true
- In Crow's Foot Notation, we can represent the **weak entity** as a normal entity, but do not choose any attributes as the primary key.
- For an attribute that partially determines the entity instances, we choose the "required" option.
- We usually represent the relationship between entities using a **solid line**.
- This indicates that it is an "identifying" relationship.
- ![image.png](../assets/image_1664969566989_0.png)
-
-
- In general, with entities, there may be two valid solutions, one with a weak entity, and one without.
- There is not a huge difficulty if you do not identify weak entities in a solution so long as all the entities have **primary attributes**.
- It may be slightly non-optimal in terms of introducing an additional primary key that is not needed, but this is not a huge problem for us at this level.
-
- # Entities or Multi-Valued Attributes?
collapsed:: true
- Sometimes, it may not be clear whether something should be modelled as a multi-valued attribute or an entity.
- Both may be equally correct, as long as you represent all the information that you are asked to.
- You may see (very little) difference between the two approaches if you map either approach to tables in a database.
-
- # Steps to Create an ER Model
- 1. Identify entities.
2. Identify relationships between entities.
3. Draw entities & relationships.
4. Add attributes to entities (& relationships, if appropriate).
5. Add cardinalities to relationships.
6. Add participation constraints (total or partial) to relationships.
7. Check that all entities have primary keys identified.
-
- # Mapping ER Models to Tables in the Relational Model
	- Once you have your ER diagram, you need to convert it into a set of tables so that you can implement it in the relational model.
		- Example: As MySQL tables using `CREATE TABLE` commands.
- This stage is called **Mapping ER Models to Tables in the Relational Model**, and it specifies a set of rules that must be followed in a certain order.
- The rules specified here are based on **Chen's Notation**.
- ## Steps
- [1] For each entity, create a table $R$ that includes all the **simple** attributes of the entity.
- [2] For strong entities, choose a key attribute as the primary key of the table.
- [3] For weak entities $R$, include the primary key attributes of the table that corresponds to the owner as foreign key attributes of $R$.
- The primary key of $R$ is a combination of the primary key of the owner and the partial key of the weak entity type.
- The relationship of the weak & strong entity is generally taken care of by this step.
		- [4] For each **binary** $1:1$ relationship, identify entities $S$ & $T$ that participate in the relationship.
- If applicable, choose the entity that has **total participation** in the relation.
- Include the primary key of the other relation as the foreign key in this table.
- Include any attributes of the relationship as attributes of the chosen table.
- If both entities have total participation in the relationship, you can choose either one for the foreign key and proceed as above, or you can map two entities & their associated attributes & relationship attributes into one table.
- [5] For each **binary** $1:N$ relationship, identify the table $S$ that represents the $N$ side and the table $T$ that represents the $1$ side.
- Include the primary key of table $T$ as a foreign key in $S$ such that each entity on the $N$ side is related to at most one entity instance on the $1$ side.
- Include any attributes of the relationship as attributes of $S$.
- For recursive $1:N$ relationships, choose the primary key of the table and include it as a foreign key in the same table with a different name.
- [6] For each $M:N$ relationship, create a new table $S$ to represent the relationship.
- Include the primary keys of the tables that represent the participating entity types as foreign keys in $S$ - their combination will form the primary key of $S$.
- Also include in $S$ any attributes of the relationship.
- For a recursive $M:N$ relationship, both foreign keys come from the same table (give different names to each) and become the new primary key.
- [7] For each **multi-valued attribute** $A$ of an entity $S$, create a new table $R$.
- $R$ will include:
- An attribute corresponding to $A$.
- The primary key of $S$, which will be a foreign key in $R$. Call this $K$.
- The primary key of $R$ is a combination of $A$ & $K$.
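		- The steps above can be sketched as `CREATE TABLE` statements. This is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and every table & column name (`department`, `employee`, `dependant`, `project`, `works_on`, `employee_skill`) is hypothetical:
		  ```python
		  import sqlite3

		  conn = sqlite3.connect(":memory:")
		  conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

		  conn.executescript("""
		  -- Steps 1 & 2: strong entities with their simple attributes and a primary key.
		  CREATE TABLE department (
		      dept_no INTEGER PRIMARY KEY,
		      name    TEXT NOT NULL
		  );
		  CREATE TABLE project (
		      proj_no INTEGER PRIMARY KEY,
		      title   TEXT NOT NULL
		  );
		  -- Step 5 (1:N): the N side (employee) holds the key of the 1 side (department).
		  CREATE TABLE employee (
		      emp_no  INTEGER PRIMARY KEY,
		      name    TEXT NOT NULL,
		      dept_no INTEGER NOT NULL REFERENCES department(dept_no)
		  );
		  -- Step 3: weak entity; primary key = owner's key + partial key.
		  CREATE TABLE dependant (
		      emp_no INTEGER NOT NULL REFERENCES employee(emp_no),
		      name   TEXT NOT NULL,
		      PRIMARY KEY (emp_no, name)
		  );
		  -- Step 6 (M:N): a new table whose primary key combines both foreign keys.
		  CREATE TABLE works_on (
		      emp_no  INTEGER NOT NULL REFERENCES employee(emp_no),
		      proj_no INTEGER NOT NULL REFERENCES project(proj_no),
		      hours   REAL,  -- an attribute of the relationship itself
		      PRIMARY KEY (emp_no, proj_no)
		  );
		  -- Step 7: multi-valued attribute A (skill) plus the owner's key K (emp_no).
		  CREATE TABLE employee_skill (
		      emp_no INTEGER NOT NULL REFERENCES employee(emp_no),
		      skill  TEXT NOT NULL,
		      PRIMARY KEY (emp_no, skill)
		  );
		  """)

		  conn.execute("INSERT INTO department VALUES (1, 'Research')")
		  conn.execute("INSERT INTO employee VALUES (10, 'John Smith', 1)")
		  conn.execute("INSERT INTO employee_skill VALUES (10, 'SQL')")
		  ```
		  The `NOT NULL` on `employee.dept_no` assumes that every employee must belong to a department (total participation); drop it if participation is partial. MySQL syntax differs slightly, for example in how foreign keys are declared.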
-
-

View File

@ -0,0 +1,340 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Aggregate Clauses, Group By, & Having Clauses]]
- **Next Topic:** [[Joins & Union Queries]]
- **Relevant Slides:** ![ER-models.pdf](../assets/ER-models_1664888140370_0.pdf)
-
- What are **Data Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:28:27.254Z
card-last-score:: 1
- **Data models** are concepts to describe the structure of a database.
- They comprise:
- High level or logical models;
- Representational / Implementational data models;
- Physical data models.
- Data models allow for database abstraction.
- What are **ER Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:33:37.565Z
card-last-score:: 1
- **Entity Relationship Models** are a top-down approach to database design that provide a way to *model the data* that will be stored in a system. The models are then used to *create tables* in the relational model.
- ER Models are used to identify:
- [1] The important data to be stored in a database called **entities**.
- [2] The **relationships** between the entities.
- [3] The **attributes** of entities.
- [4] The **constraints** of relationships & entities.
-
- # ER Model Notation
- A number of different notations can be used to represent the same model.
- Chen Notation.
- IE Crow's Foot Notation.
- UML.
- Integrated Definition 1. Extended (IDEF1X).
-
	- The original (Chen) notation uses diamonds, rectangles, and ellipses.
- This is easier to hand-draw, so it is useful in an exam situation.
- It is less implementation-oriented than other notations.
-
- # Some Definitions
collapsed:: true
- ## Entities
collapsed:: true
- What is an **entity type**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:42:58.469Z
card-last-score:: 1
            - An **entity type** is a collection of *entity instances* that share common properties or characteristics.
- It is a group of objects, with the same properties, which are identified as having an independent existence.
- ![image.png](../assets/image_1664889325926_0.png)
- What is an **entity instance**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:45:57.663Z
card-last-reviewed:: 2022-10-10T11:45:57.664Z
card-last-score:: 5
- An **entity instance** or **entity occurrence** is a single, uniquely identifiable occurrence of an entity type (e.g., row in a table).
- ![image.png](../assets/image_1664889343582_0.png)
- ## Relationships
- What is a **relationship type**? #card
card-last-interval:: 2.6
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T05:10:30.449Z
card-last-reviewed:: 2022-10-08T15:10:30.450Z
card-last-score:: 5
collapsed:: true
- A **relationship type** is a set of meaningful relationships among entity types.
- **Chen's Notation:** A diamond shape is used to name the relationship. 1 and M/N are used for the "1" and "many" sides respectively.
- **Crow's Foot Notation:** The titular "crow's foot" is used as the representation of "many", and one line is used for the representation of "1".
- ![image.png](../assets/image_1664890907305_0.png)
            - **Example:** Employee "works for" Department; Department "has" Employee.
- ![image.png](../assets/image_1664889461415_0.png)
- What is a **relationship instance**? #card
collapsed:: true
card-last-interval:: 15.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-04T12:37:54.349Z
card-last-reviewed:: 2022-10-20T08:37:54.349Z
card-last-score:: 5
            - A **relationship instance** or **relationship occurrence** is a uniquely identifiable association that includes one occurrence from each participating entity type. It can be read left to right or right to left.
- **Example:**
- Left-to-Right: John Smith "works for" Research department.
- Right-to-Left: Research department "has" John Smith.
- What is the **degree** of a relationship type? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-23T18:28:15.095Z
card-last-reviewed:: 2022-10-20T08:28:15.096Z
card-last-score:: 3
collapsed:: true
- Whenever an attribute of one entity type refers to another entity type, some relationship exists.
- The **degree** of a relationship type is ^^the number of participating entity types.^^
- Relationship types may have certain constraints.
- What is the **Cardinality Ratio**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:25:13.663Z
card-last-score:: 1
            - The **cardinality ratio** specifies ^^the maximum number of relationship instances that an entity can participate in.^^
            - The possible cardinality ratios for binary relationship types are:
                - $1:1$, "one to one" - one instance of entity $A$ is associated with at most one instance of entity $B$, and vice versa.
                - $1:N$, "one to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, but each instance of entity $B$ is associated with at most one instance of entity $A$.
- $M:N$, "many to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, and for one instance of entity $B$, there are 0 or more instances of entity $A$.
-
- ### Structural Constraints on Relationships
- Often, we may know the min & max of the cardinalities.
- Example: limit on the number of books that can be borrowed from a library.
- What are **Structural Constraints**? #card
card-last-interval:: 0.85
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-27T07:51:05.213Z
card-last-reviewed:: 2022-10-26T11:51:05.213Z
card-last-score:: 3
- **Structural constraints** specify a pair of integer numbers *(min, max)* for each entity participating in a relationship.
                - Examples: (0, 1), (1, 1), (1, N), (1, 7).
-
- ## Attributes
collapsed:: true
- What are **Attributes**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:29:04.692Z
card-last-score:: 1
collapsed:: true
            - **Attributes** are ^^named properties^^ or characteristics of an entity.
- Each entity has a set of attributes associated with it.
- Several types of attributes exist:
- Key.
- Composite.
- Derived.
- Multi-valued.
- **Notation:**
- **Chen:** An oval enclosing the name of the attribute.
- ![image.png](../assets/image_1664889737597_0.png)
- **Crow:** Listed in the entity box.
- ![image.png](../assets/image_1664889775094_0.png)
- What are **key attributes**? #card
card-last-interval:: 4.04
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T10:47:07.833Z
card-last-reviewed:: 2022-10-07T10:47:07.833Z
card-last-score:: 5
collapsed:: true
            - Each entity type must have an attribute or set of attributes that ^^uniquely identifies^^ each instance, distinguishing it from other instances of the same type.
- A **candidate key** is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- A **primary key (PK)** is a candidate key that has been selected as the identifier for an entity type.
- What is a **composite attribute**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:45:42.013Z
card-last-reviewed:: 2022-10-10T11:45:42.014Z
card-last-score:: 5
collapsed:: true
- A **composite attribute** is an attribute that is composed of several atomic (simple) attributes.
                - If the composite attribute is only ever referenced as a whole, then there is no need to subdivide it into component attributes; otherwise, you should divide it:
- ![image.png](../assets/image_1664890028074_0.png)
- What is a **derived attribute**? #card
card-last-interval:: 5.14
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-25T11:37:32.828Z
card-last-reviewed:: 2022-10-20T08:37:32.828Z
card-last-score:: 5
collapsed:: true
- A **derived attribute** is an attribute whose values can be determined from another attribute.
                - In Chen's notation, a derived attribute is drawn as a *dotted oval*.
- ![image.png](../assets/image_1664890167971_0.png)
- For Crow's Foot notation, derived attributes can be represented by enclosing the attribute in [square brackets], e.g., [age].
- What is a **multi-valued attribute**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:39:12.380Z
card-last-score:: 1
            - A **multi-valued attribute** is an attribute that can hold more than one value for a single entity instance; it may have lower & upper bounds on the number of values for an individual entry.
- For Chen's notation, multi-valued attributes can be represented by one oval inside another.
- ![image.png](../assets/image_1664890277505_0.png)
- For Crow's Foot notation, multi-valued attributes can be represented by enclosing the attribute in {curly brackets}.
- E.g., {skills}.
-
-
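- A minimal Python sketch (not from the slides) of the cardinality ratio and *(min, max)* structural constraints defined under Relationships above. It reads the *observed* values off one snapshot of relationship instances; the real constraints are design decisions, so this is only an illustration. The `works_for` data and both helper names are hypothetical.

```python
from collections import Counter

# Hypothetical relationship instances: (employee, department) pairs for "works for".
works_for = [
    ("John Smith", "Research"),
    ("Mary Jones", "Research"),
    ("Ann Lee", "Sales"),
]
employees = ["John Smith", "Mary Jones", "Ann Lee", "Tom Ray"]
departments = ["Research", "Sales", "HR"]


def observed_min_max(instances, pairs, position):
    """Observed (min, max) number of relationship instances per entity instance."""
    counts = Counter(pair[position] for pair in pairs)
    per_instance = [counts.get(instance, 0) for instance in instances]
    return min(per_instance), max(per_instance)


def observed_ratio(pairs):
    """Classify the pairs as 1:1, 1:N, N:1 or M:N (left side : right side)."""
    left_max = max(Counter(left for left, _ in pairs).values())
    right_max = max(Counter(right for _, right in pairs).values())
    if left_max == 1 and right_max == 1:
        return "1:1"
    if left_max == 1:
        return "N:1"  # each left instance has one partner, a right instance has many
    if right_max == 1:
        return "1:N"
    return "M:N"


employee_constraint = observed_min_max(employees, works_for, 0)
department_constraint = observed_min_max(departments, works_for, 1)
ratio = observed_ratio(works_for)
```

- Here an employee takes part in between 0 and 1 "works for" instances and a department in between 0 and 2, so Employee:Department is observed as $N:1$.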
- # Total & Partial Participation
collapsed:: true
- What is **Total Participation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:31:55.731Z
card-last-score:: 1
- **Total Participation** (Mandatory Participation): All instances of an entity must participate in the relationship, i.e., *every* entity instance in one set *must* be related to an entity instance in the second via the relationship.
- For example, the entity "Student" must have **total participation** in the relationship "enrolled" with the entity "Course" - each student *must* be enrolled in a course.
- What is **Partial Participation**? #card
card-last-interval:: 2.51
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T03:07:33.950Z
card-last-reviewed:: 2022-10-08T15:07:33.952Z
card-last-score:: 5
- **Partial Participation** (Optional Participation): Some subset of instances of an entity will participate in the relationship, but not all, i.e., *some* entity instances in one set are related to an entity instance in the second via the relationship.
- For example, the entity "Course" would have a **partial participation** in the relationship "enrolled" with the entity "Student" - a course might not have any students enrolled in it.
- ## Chen's Notation for Participation #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-18T05:50:33.140Z
card-last-reviewed:: 2022-10-09T08:50:33.141Z
card-last-score:: 3
        - In both total & partial participation, line(s) are drawn from the participating entity to the relationship (the diamond) to indicate that instances of that entity participate in the relationship.
- **Total Participation:** Double parallel lines.
- ![image.png](../assets/image_1664968200677_0.png)
- **Partial Participation:** Single line.
- ![image.png](../assets/image_1664968259633_0.png)
- Examples:
- ![image.png](../assets/image_1664968387620_0.png)
-
- ## Crow's Foot Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:42:23.258Z
card-last-score:: 1
collapsed:: true
- Use the idea of **Ordinality / Optionality**.
        - **Optionality of 0:** If an entity $A$ has partial participation in a relationship to entity $B$, then $A$ is associated with 0 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
            - The optionality sign for an **Optionality of 0** is a **circle** or "o": $\bigcirc$
        - **Optionality of 1:** If an entity $A$ has total participation in a relationship to entity $B$, then $A$ is associated with 1 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
            - The optionality sign for an **Optionality of 1** is a **bar**: $|$
- [And vice-versa.]
- In Crow's Foot notation, there is no diamond, so there is always a direct relationship line between the entities.
- The **optionality sign** is drawn on this line.
- The optionality drawn beside some entity $A$ refers to how an instance of entity $B$ is related to entity $A$.
- That is, whether $B$ can be involved partially (0) or not (1).
- Example in *Right to Left* Relationships:
- ![image.png](../assets/image_1664969067049_0.png)
-
- ## Note on Weak Entities
- **Note:** ^^A **weak entity type** always has a total participation constraint.^^
- We need to show the "identifying relationship".
- ![image.png](../assets/image_1664969168410_0.png)
- ### Chen's Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:38:00.499Z
card-last-score:: 1
collapsed:: true
- **Entity:** Double rectangle.
- **Relationship:** Double diamond.
                - The weak entity has total participation in the relationship.
- ![image.png](../assets/image_1664969357926_0.png)
-
- ### Crow's Foot Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:41:13.705Z
card-last-score:: 1
collapsed:: true
- In Crow's Foot Notation, we can represent the **weak entity** as a normal entity, but do not choose any attributes as the primary key.
- For an attribute that partially determines the entity instances, we choose the "required" option.
- We usually represent the relationship between entities using a **solid line**.
- This indicates that it is an "identifying" relationship.
- ![image.png](../assets/image_1664969566989_0.png)
-
-
- In general, with entities, there may be two valid solutions, one with a weak entity, and one without.
        - There is not a huge difficulty if you do not identify weak entities in a solution, so long as all the entities have **primary keys**.
- It may be slightly non-optimal in terms of introducing an additional primary key that is not needed, but this is not a huge problem for us at this level.
-
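- A minimal Python sketch (not from the slides) of the difference between total & partial participation, using the Student "enrolled" Course example above. It only checks one snapshot of data, whereas a participation constraint is a rule about every valid state; the instance data and the `participation` helper are hypothetical.

```python
# Hypothetical entity instances and "enrolled" relationship instances (student, course).
students = {"s1", "s2", "s3"}
courses = {"CT230", "MA284", "CT2101"}
enrolled = {("s1", "CT230"), ("s2", "CT230"), ("s3", "MA284")}


def participation(instances, pairs, position):
    """Return "total" if every instance appears in some pair, else "partial"."""
    participating = {pair[position] for pair in pairs}
    return "total" if instances <= participating else "partial"


student_side = participation(students, enrolled, 0)
course_side = participation(courses, enrolled, 1)
```

- Every student appears in some "enrolled" instance, so the Student side is total; one course has no students, so the Course side is partial.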
- # Entities or Multi-Valued Attributes?
collapsed:: true
- Sometimes, it may not be clear whether something should be modelled as a multi-valued attribute or an entity.
- Both may be equally correct, as long as you represent all the information that you are asked to.
- You may see (very little) difference between the two approaches if you map either approach to tables in a database.
-
- # Steps to Create an ER Model
- 1. Identify entities.
2. Identify relationships between entities.
3. Draw entities & relationships.
4. Add attributes to entities (& relationships, if appropriate).
5. Add cardinalities to relationships.
6. Add participation constraints (total or partial) to relationships.
7. Check that all entities have primary keys identified.
-
- # Mapping ER Models to Tables in the Relational Model
    - Once you have your ER diagram, you need to convert it into a set of tables so that you can implement it in the relational model.
        - Example: As MySQL tables using `CREATE TABLE` commands.
- This stage is called **Mapping ER Models to Tables in the Relational Model**, and it specifies a set of rules that must be followed in a certain order.
- The rules specified here are based on **Chen's Notation**.
- ## Steps
- [1] For each entity, create a table $R$ that includes all the **simple** attributes of the entity.
- [2] For strong entities, choose a key attribute as the primary key of the table.
- [3] For weak entities $R$, include the primary key attributes of the table that corresponds to the owner as foreign key attributes of $R$.
- The primary key of $R$ is a combination of the primary key of the owner and the partial key of the weak entity type.
- The relationship of the weak & strong entity is generally taken care of by this step.
        - [4] For each **binary** $1:1$ relationship, identify the entities $S$ & $T$ that participate in the relationship.
            - If applicable, choose the entity that has **total participation** in the relationship.
                - Include the primary key of the other entity's table as a foreign key in this table.
                - Include any attributes of the relationship as attributes of the chosen table.
            - If both entities have total participation in the relationship, you can choose either one for the foreign key and proceed as above, or you can map the two entities, their attributes, & the relationship attributes into one table.
- [5] For each **binary** $1:N$ relationship, identify the table $S$ that represents the $N$ side and the table $T$ that represents the $1$ side.
- Include the primary key of table $T$ as a foreign key in $S$ such that each entity on the $N$ side is related to at most one entity instance on the $1$ side.
- Include any attributes of the relationship as attributes of $S$.
- For recursive $1:N$ relationships, choose the primary key of the table and include it as a foreign key in the same table with a different name.
- [6] For each $M:N$ relationship, create a new table $S$ to represent the relationship.
- Include the primary keys of the tables that represent the participating entity types as foreign keys in $S$ - their combination will form the primary key of $S$.
- Also include in $S$ any attributes of the relationship.
- For a recursive $M:N$ relationship, both foreign keys come from the same table (give different names to each) and become the new primary key.
- [7] For each **multi-valued attribute** $A$ of an entity $S$, create a new table $R$.
- $R$ will include:
- An attribute corresponding to $A$.
- The primary key of $S$, which will be a foreign key in $R$. Call this $K$.
- The primary key of $R$ is a combination of $A$ & $K$.
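- Steps [1] to [3] above can be sketched as follows. This is an illustration under stated assumptions, not the course's own solution: it uses SQLite through Python's built-in `sqlite3` module instead of MySQL so that it runs on its own, and the `employee` / `dependent` tables and their columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- Steps [1] & [2]: strong entity -> table with its simple attributes and a primary key.
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);

-- Step [3]: weak entity -> owner's primary key as a foreign key, plus the partial key.
CREATE TABLE dependent (
    emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
    dep_name TEXT NOT NULL,          -- partial key of the weak entity
    relation TEXT,
    PRIMARY KEY (emp_id, dep_name)   -- owner's key + partial key
);
""")

conn.execute("INSERT INTO employee VALUES (1, 'John Smith')")
conn.execute("INSERT INTO dependent VALUES (1, 'Alice', 'daughter')")

# A dependent without an existing owner is rejected by the foreign key.
try:
    conn.execute("INSERT INTO dependent VALUES (99, 'Bob', 'son')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The same partial key under the same owner is rejected by the composite primary key.
try:
    conn.execute("INSERT INTO dependent VALUES (1, 'Alice', 'niece')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

dependent_count = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```

- The composite primary key is what makes a dependent identifiable only together with its owner, which is why the identifying relationship needs no table of its own.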
-
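- Steps [4] & [5] above can be sketched in the same way, again with SQLite via Python's `sqlite3` standing in for MySQL. All table & column names are hypothetical. The `NOT NULL` and `UNIQUE` constraints on `parking_space.emp_id` are one possible way (an assumption, not something the steps prescribe) to enforce total participation and the "1" side of a $1:1$ relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);

CREATE TABLE employee (
    emp_id        INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    -- Step [5]: employee is the N side of "works for", so it holds the foreign key.
    dept_id       INTEGER REFERENCES department(dept_id),
    -- Recursive 1:N "supervises": the table's own key again, under a different name.
    supervisor_id INTEGER REFERENCES employee(emp_id)
);

-- Step [4]: 1:1 "assigned to"; parking_space has total participation, so it holds
-- the foreign key and the relationship attribute assigned_on.
CREATE TABLE parking_space (
    space_id    INTEGER PRIMARY KEY,
    emp_id      INTEGER NOT NULL UNIQUE REFERENCES employee(emp_id),
    assigned_on TEXT
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann Lee', 1, NULL)")
conn.execute("INSERT INTO employee VALUES (2, 'John Smith', 1, 1)")
conn.execute("INSERT INTO parking_space VALUES (10, 2, '2022-10-04')")

# A second space for the same employee would break the 1:1 ratio.
try:
    conn.execute("INSERT INTO parking_space VALUES (11, 2, '2022-10-05')")
    second_space_rejected = False
except sqlite3.IntegrityError:
    second_space_rejected = True

supervisor = conn.execute(
    "SELECT s.name FROM employee e JOIN employee s ON e.supervisor_id = s.emp_id "
    "WHERE e.name = 'John Smith'").fetchone()[0]
research_staff = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE dept_id = 1").fetchone()[0]
```

- Placing the foreign key on the $N$ side keeps one row per employee; placing it on the $1$ side would need a multi-valued column.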
-
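- Steps [6] & [7] above can be sketched as follows, once more using SQLite via Python's `sqlite3` in place of MySQL. The `student`, `course`, `enrolled` & `student_skill` tables, their columns, and the sample rows are hypothetical and only illustrate the shape of the mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_code TEXT PRIMARY KEY, title TEXT NOT NULL);

-- Step [6]: M:N "enrolled" -> its own table S; both foreign keys form the primary key.
CREATE TABLE enrolled (
    student_id  INTEGER NOT NULL REFERENCES student(student_id),
    course_code TEXT    NOT NULL REFERENCES course(course_code),
    grade       TEXT,                        -- attribute of the relationship
    PRIMARY KEY (student_id, course_code)
);

-- Step [7]: multi-valued attribute {skills} -> table R with primary key (K, A).
CREATE TABLE student_skill (
    student_id INTEGER NOT NULL REFERENCES student(student_id),  -- K
    skill      TEXT    NOT NULL,                                 -- A
    PRIMARY KEY (student_id, skill)
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ann Lee"), (2, "Tom Ray")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [("CT230", "Database Systems I"), ("MA284", "Discrete Mathematics")])
conn.executemany("INSERT INTO enrolled VALUES (?, ?, ?)",
                 [(1, "CT230", "A"), (1, "MA284", None), (2, "CT230", "B")])
conn.executemany("INSERT INTO student_skill VALUES (?, ?)",
                 [(1, "SQL"), (1, "Python"), (2, "SQL")])

# Follow the M:N relationship from course to students through the enrolled table.
ct230_students = [row[0] for row in conn.execute(
    "SELECT s.name FROM student s JOIN enrolled e ON e.student_id = s.student_id "
    "WHERE e.course_code = 'CT230' ORDER BY s.name")]
# Read back all values of the multi-valued attribute for one student.
ann_skills = [row[0] for row in conn.execute(
    "SELECT skill FROM student_skill WHERE student_id = 1 ORDER BY skill")]
```

- Both new tables use a composite primary key, so the same student cannot be enrolled in the same course twice or list the same skill twice.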

View File

@ -0,0 +1,340 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Aggregate Clauses, Group By, & Having Clauses]]
- **Next Topic:** [[Joins & Union Queries]]
- **Relevant Slides:** ![ER-models.pdf](../assets/ER-models_1664888140370_0.pdf)
-
- What are **Data Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:28:27.254Z
card-last-score:: 1
- **Data models** are concepts to describe the structure of a database.
- They comprise:
- High level or logical models;
- Representational / Implementational data models;
- Physical data models.
- Data models allow for database abstraction.
- What are **ER Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:33:37.565Z
card-last-score:: 1
    - **Entity Relationship Models** are a top-down approach to database design that provides a way to *model the data* that will be stored in a system. The models are then used to *create tables* in the relational model.
- ER Models are used to identify:
        - [1] The important data to be stored in a database, called **entities**.
- [2] The **relationships** between the entities.
- [3] The **attributes** of entities.
- [4] The **constraints** of relationships & entities.
-
- # ER Model Notation
- A number of different notations can be used to represent the same model.
- Chen Notation.
- IE Crow's Foot Notation.
- UML.
        - Integration Definition for Information Modeling (IDEF1X).
-
    - The original (Chen) notation uses diamonds, rectangles, and ellipses.
- This is easier to hand-draw, so it is useful in an exam situation.
- It is less implementation-oriented than other notations.
-
- # Some Definitions
collapsed:: true
- ## Entities
collapsed:: true
- What is an **entity type**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-10T00:00:00.000Z
card-last-reviewed:: 2022-11-09T12:46:10.209Z
card-last-score:: 1
            - An **entity type** is a collection of *entity instances* that share common properties or characteristics.
- It is a group of objects, with the same properties, which are identified as having an independent existence.
- ![image.png](../assets/image_1664889325926_0.png)
- What is an **entity instance**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:45:57.663Z
card-last-reviewed:: 2022-10-10T11:45:57.664Z
card-last-score:: 5
- An **entity instance** or **entity occurrence** is a single, uniquely identifiable occurrence of an entity type (e.g., row in a table).
- ![image.png](../assets/image_1664889343582_0.png)
- ## Relationships
- What is a **relationship type**? #card
card-last-interval:: 2.6
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T05:10:30.449Z
card-last-reviewed:: 2022-10-08T15:10:30.450Z
card-last-score:: 5
collapsed:: true
- A **relationship type** is a set of meaningful relationships among entity types.
- **Chen's Notation:** A diamond shape is used to name the relationship. 1 and M/N are used for the "1" and "many" sides respectively.
- **Crow's Foot Notation:** The titular "crow's foot" is used as the representation of "many", and one line is used for the representation of "1".
- ![image.png](../assets/image_1664890907305_0.png)
            - **Example:** Employee "works for" Department; Department "has" Employee.
- ![image.png](../assets/image_1664889461415_0.png)
- What is a **relationship instance**? #card
collapsed:: true
card-last-interval:: 15.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-04T12:37:54.349Z
card-last-reviewed:: 2022-10-20T08:37:54.349Z
card-last-score:: 5
            - A **relationship instance** or **relationship occurrence** is a uniquely identifiable association that includes one occurrence from each participating entity type. It can be read left to right or right to left.
- **Example:**
- Left-to-Right: John Smith "works for" Research department.
- Right-to-Left: Research department "has" John Smith.
- What is the **degree** of a relationship type? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-23T18:28:15.095Z
card-last-reviewed:: 2022-10-20T08:28:15.096Z
card-last-score:: 3
collapsed:: true
- Whenever an attribute of one entity type refers to another entity type, some relationship exists.
- The **degree** of a relationship type is ^^the number of participating entity types.^^
- Relationship types may have certain constraints.
- What is the **Cardinality Ratio**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:25:13.663Z
card-last-score:: 1
            - The **cardinality ratio** specifies ^^the maximum number of relationship instances that an entity can participate in.^^
            - The possible cardinality ratios for binary relationship types are:
                - $1:1$, "one to one" - one instance of entity $A$ is associated with at most one instance of entity $B$, and vice versa.
                - $1:N$, "one to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, but each instance of entity $B$ is associated with at most one instance of entity $A$.
- $M:N$, "many to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, and for one instance of entity $B$, there are 0 or more instances of entity $A$.
-
- ### Structural Constraints on Relationships
- Often, we may know the min & max of the cardinalities.
- Example: limit on the number of books that can be borrowed from a library.
- What are **Structural Constraints**? #card
card-last-interval:: 0.85
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-27T07:51:05.213Z
card-last-reviewed:: 2022-10-26T11:51:05.213Z
card-last-score:: 3
- **Structural constraints** specify a pair of integer numbers *(min, max)* for each entity participating in a relationship.
                - Examples: (0, 1), (1, 1), (1, N), (1, 7).
-
- ## Attributes
collapsed:: true
- What are **Attributes**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:29:04.692Z
card-last-score:: 1
collapsed:: true
            - **Attributes** are ^^named properties^^ or characteristics of an entity.
- Each entity has a set of attributes associated with it.
- Several types of attributes exist:
- Key.
- Composite.
- Derived.
- Multi-valued.
- **Notation:**
- **Chen:** An oval enclosing the name of the attribute.
- ![image.png](../assets/image_1664889737597_0.png)
- **Crow:** Listed in the entity box.
- ![image.png](../assets/image_1664889775094_0.png)
- What are **key attributes**? #card
card-last-interval:: 4.04
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T10:47:07.833Z
card-last-reviewed:: 2022-10-07T10:47:07.833Z
card-last-score:: 5
collapsed:: true
            - Each entity type must have an attribute or set of attributes that ^^uniquely identifies^^ each instance, distinguishing it from other instances of the same type.
- A **candidate key** is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- A **primary key (PK)** is a candidate key that has been selected as the identifier for an entity type.
- What is a **composite attribute**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:45:42.013Z
card-last-reviewed:: 2022-10-10T11:45:42.014Z
card-last-score:: 5
collapsed:: true
- A **composite attribute** is an attribute that is composed of several atomic (simple) attributes.
                - If the composite attribute is only ever referenced as a whole, then there is no need to subdivide it into component attributes; otherwise, you should divide it:
- ![image.png](../assets/image_1664890028074_0.png)
- What is a **derived attribute**? #card
card-last-interval:: 5.14
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-25T11:37:32.828Z
card-last-reviewed:: 2022-10-20T08:37:32.828Z
card-last-score:: 5
collapsed:: true
- A **derived attribute** is an attribute whose values can be determined from another attribute.
                - In Chen's notation, a derived attribute is drawn as a *dotted oval*.
- ![image.png](../assets/image_1664890167971_0.png)
- For Crow's Foot notation, derived attributes can be represented by enclosing the attribute in [square brackets], e.g., [age].
- What is a **multi-valued attribute**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:39:12.380Z
card-last-score:: 1
            - A **multi-valued attribute** is an attribute that can hold more than one value for a single entity instance; it may have lower & upper bounds on the number of values for an individual entry.
- For Chen's notation, multi-valued attributes can be represented by one oval inside another.
- ![image.png](../assets/image_1664890277505_0.png)
- For Crow's Foot notation, multi-valued attributes can be represented by enclosing the attribute in {curly brackets}.
- E.g., {skills}.
-
-
- # Total & Partial Participation
collapsed:: true
- What is **Total Participation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-10T00:00:00.000Z
card-last-reviewed:: 2022-11-09T12:45:13.796Z
card-last-score:: 1
- **Total Participation** (Mandatory Participation): ==All instances of an entity must participate in the relationship==, i.e., *every* entity instance in one set *must* be related to an entity instance in the second via the relationship.
- For example, the entity "Student" must have **total participation** in the relationship "enrolled" with the entity "Course" - each student *must* be enrolled in a course.
- What is **Partial Participation**? #card
card-last-interval:: 2.51
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T03:07:33.950Z
card-last-reviewed:: 2022-10-08T15:07:33.952Z
card-last-score:: 5
- **Partial Participation** (Optional Participation): Some subset of instances of an entity will participate in the relationship, but not all, i.e., *some* entity instances in one set are related to an entity instance in the second via the relationship.
- For example, the entity "Course" would have a **partial participation** in the relationship "enrolled" with the entity "Student" - a course might not have any students enrolled in it.
- ## Chen's Notation for Participation #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-18T05:50:33.140Z
card-last-reviewed:: 2022-10-09T08:50:33.141Z
card-last-score:: 3
        - In both total & partial participation, line(s) are drawn from the participating entity to the relationship (the diamond) to indicate that instances of that entity participate in the relationship.
- **Total Participation:** Double parallel lines.
- ![image.png](../assets/image_1664968200677_0.png)
- **Partial Participation:** Single line.
- ![image.png](../assets/image_1664968259633_0.png)
- Examples:
- ![image.png](../assets/image_1664968387620_0.png)
-
- ## Crow's Foot Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:42:23.258Z
card-last-score:: 1
collapsed:: true
- Use the idea of **Ordinality / Optionality**.
        - **Optionality of 0:** If an entity $A$ has partial participation in a relationship to entity $B$, then $A$ is associated with 0 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
            - The optionality sign for an **Optionality of 0** is a **circle** or "o": $\bigcirc$
        - **Optionality of 1:** If an entity $A$ has total participation in a relationship to entity $B$, then $A$ is associated with 1 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
            - The optionality sign for an **Optionality of 1** is a **bar**: $|$
- [And vice-versa.]
- In Crow's Foot notation, there is no diamond, so there is always a direct relationship line between the entities.
- The **optionality sign** is drawn on this line.
- The optionality drawn beside some entity $A$ refers to how an instance of entity $B$ is related to entity $A$.
- That is, whether $B$ can be involved partially (0) or not (1).
- Example in *Right to Left* Relationships:
- ![image.png](../assets/image_1664969067049_0.png)
-
- ## Note on Weak Entities
- **Note:** ^^A **weak entity type** always has a total participation constraint.^^
- We need to show the "identifying relationship".
- ![image.png](../assets/image_1664969168410_0.png)
- ### Chen's Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:38:00.499Z
card-last-score:: 1
collapsed:: true
- **Entity:** Double rectangle.
- **Relationship:** Double diamond.
                - The weak entity has total participation in the relationship.
- ![image.png](../assets/image_1664969357926_0.png)
-
- ### Crow's Foot Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:41:13.705Z
card-last-score:: 1
collapsed:: true
- In Crow's Foot Notation, we can represent the **weak entity** as a normal entity, but do not choose any attributes as the primary key.
- For an attribute that partially determines the entity instances, we choose the "required" option.
- We usually represent the relationship between entities using a **solid line**.
- This indicates that it is an "identifying" relationship.
- ![image.png](../assets/image_1664969566989_0.png)
-
-
- In general, with entities, there may be two valid solutions, one with a weak entity, and one without.
        - There is not a huge difficulty if you do not identify weak entities in a solution, so long as all the entities have **primary keys**.
- It may be slightly non-optimal in terms of introducing an additional primary key that is not needed, but this is not a huge problem for us at this level.
-
- # Entities or Multi-Valued Attributes?
collapsed:: true
- Sometimes, it may not be clear whether something should be modelled as a multi-valued attribute or an entity.
- Both may be equally correct, as long as you represent all the information that you are asked to.
- You may see (very little) difference between the two approaches if you map either approach to tables in a database.
-
- # Steps to Create an ER Model
- 1. Identify entities.
2. Identify relationships between entities.
3. Draw entities & relationships.
4. Add attributes to entities (& relationships, if appropriate).
5. Add cardinalities to relationships.
6. Add participation constraints (total or partial) to relationships.
7. Check that all entities have primary keys identified.
-
- # Mapping ER Models to Tables in the Relational Model
    - Once you have your ER diagram, you need to convert it into a set of tables so that you can implement it in the relational model.
        - Example: As MySQL tables using `CREATE TABLE` commands.
- This stage is called **Mapping ER Models to Tables in the Relational Model**, and it specifies a set of rules that must be followed in a certain order.
- The rules specified here are based on **Chen's Notation**.
- ## Steps
- [1] For each entity, create a table $R$ that includes all the **simple** attributes of the entity.
- [2] For strong entities, choose a key attribute as the primary key of the table.
- [3] For weak entities $R$, include the primary key attributes of the table that corresponds to the owner as foreign key attributes of $R$.
- The primary key of $R$ is a combination of the primary key of the owner and the partial key of the weak entity type.
- The relationship of the weak & strong entity is generally taken care of by this step.
        - [4] For each **binary** $1:1$ relationship, identify the entities $S$ & $T$ that participate in the relationship.
            - If applicable, choose the entity that has **total participation** in the relationship.
                - Include the primary key of the other entity's table as a foreign key in this table.
                - Include any attributes of the relationship as attributes of the chosen table.
            - If both entities have total participation in the relationship, you can choose either one for the foreign key and proceed as above, or you can map the two entities, their attributes, & the relationship attributes into one table.
- [5] For each **binary** $1:N$ relationship, identify the table $S$ that represents the $N$ side and the table $T$ that represents the $1$ side.
- Include the primary key of table $T$ as a foreign key in $S$ such that each entity on the $N$ side is related to at most one entity instance on the $1$ side.
- Include any attributes of the relationship as attributes of $S$.
- For recursive $1:N$ relationships, choose the primary key of the table and include it as a foreign key in the same table with a different name.
- [6] For each $M:N$ relationship, create a new table $S$ to represent the relationship.
- Include the primary keys of the tables that represent the participating entity types as foreign keys in $S$ - their combination will form the primary key of $S$.
- Also include in $S$ any attributes of the relationship.
- For a recursive $M:N$ relationship, both foreign keys come from the same table (give different names to each) and become the new primary key.
- [7] For each **multi-valued attribute** $A$ of an entity $S$, create a new table $R$.
- $R$ will include:
- An attribute corresponding to $A$.
- The primary key of $S$, which will be a foreign key in $R$. Call this $K$.
- The primary key of $R$ is a combination of $A$ & $K$.
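- A minimal sketch of steps [1], [2], [3], & [5], assuming Python's built-in `sqlite3` module as a stand-in for MySQL. The `department`, `employee`, & `dependent` tables, their columns, and the `insert_fails` helper are hypothetical names chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.executescript("""
-- Steps 1 & 2: strong entity -> table with its simple attributes and a primary key.
CREATE TABLE department (
    dept_no INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);

-- Step 5: 1:N "works for" -> the N side (employee) holds the foreign key,
-- together with the (hypothetical) relationship attribute start_date.
CREATE TABLE employee (
    emp_no        INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    dept_no       INTEGER REFERENCES department(dept_no),
    start_date    TEXT,
    -- Recursive 1:N (hypothetical "supervises"): same key, different column name.
    supervisor_no INTEGER REFERENCES employee(emp_no)
);

-- Step 3: weak entity -> primary key = owner's key + partial key.
CREATE TABLE dependent (
    emp_no     INTEGER NOT NULL REFERENCES employee(emp_no),
    first_name TEXT NOT NULL,
    PRIMARY KEY (emp_no, first_name)
);
""")

conn.execute("INSERT INTO department VALUES (1, 'Research')")
conn.execute("INSERT INTO employee VALUES (10, 'John Smith', 1, '2022-09-01', NULL)")
conn.execute("INSERT INTO dependent VALUES (10, 'Alice')")

def insert_fails(sql):
    """Return True if the statement violates a key or foreign-key constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

- With this schema, an employee that points at a non-existent department is rejected, as is a second dependent with the same owner & partial key.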
-
-

View File

@ -0,0 +1,340 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Aggregate Clauses, Group By, & Having Clauses]]
- **Next Topic:** [[Joins & Union Queries]]
- **Relevant Slides:** ![ER-models.pdf](../assets/ER-models_1664888140370_0.pdf)
-
- What are **Data Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:07.942Z
card-last-score:: 1
- **Data models** are concepts to describe the structure of a database.
- They comprise:
- High level or logical models;
- Representational / Implementational data models;
- Physical data models.
- Data models allow for database abstraction.
- What are **ER Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:30.174Z
card-last-score:: 1
- **Entity Relationship Models** are a top-down approach to database design that provides a way to *model the data* that will be stored in a system. The models are then used to *create tables* in the relational model.
- ER Models are used to identify:
- [1] The important data to be stored in a database, called **entities**.
- [2] The **relationships** between the entities.
- [3] The **attributes** of entities.
- [4] The **constraints** of relationships & entities.
-
- # ER Model Notation
- A number of different notations can be used to represent the same model.
- Chen Notation.
- IE Crow's Foot Notation.
- UML.
- Integration Definition 1, Extended (IDEF1X).
-
- The original (Chen) notation uses diamonds, rectangles, and ellipses.
- This is easier to hand-draw, so it is useful in an exam situation.
- It is less implementation-oriented than other notations.
-
- # Some Definitions
collapsed:: true
- ## Entities
collapsed:: true
- What is an **entity type**? #card
card-last-interval:: 3.05
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T21:24:38.127Z
card-last-reviewed:: 2022-11-14T20:24:38.127Z
card-last-score:: 5
- An **entity type** is a collection of *entity instances* that share common properties or characteristics.
- It is a group of objects, with the same properties, which are identified as having an independent existence.
- ![image.png](../assets/image_1664889325926_0.png)
- What is an **entity instance**? #card
card-last-interval:: 11.34
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-26T00:47:05.851Z
card-last-reviewed:: 2022-11-14T16:47:05.851Z
card-last-score:: 5
- An **entity instance** or **entity occurrence** is a single, uniquely identifiable occurrence of an entity type (e.g., row in a table).
- ![image.png](../assets/image_1664889343582_0.png)
- ## Relationships
- What is a **relationship type**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:30:24.202Z
card-last-score:: 1
collapsed:: true
- A **relationship type** is a set of meaningful relationships among entity types.
- **Chen's Notation:** A diamond shape is used to name the relationship. 1 and M/N are used for the "1" and "many" sides respectively.
- **Crow's Foot Notation:** The titular "crow's foot" is used as the representation of "many", and one line is used for the representation of "1".
- ![image.png](../assets/image_1664890907305_0.png)
- **Example:** employee "works for" department. department "has" employee.
- ![image.png](../assets/image_1664889461415_0.png)
- What is a **relationship instance**? #card
collapsed:: true
card-last-interval:: 29.04
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-12-10T11:35:42.874Z
card-last-reviewed:: 2022-11-11T11:35:42.874Z
card-last-score:: 3
- A **relationship instance** or **relationship occurrence** is ==a uniquely identifiable association which includes one occurrence from each participating entity type==; reading left to right and right to left.
- **Example:**
- Left-to-Right: John Smith "works for" Research department.
- Right-to-Left: Research department "has" John Smith.
- What is the **degree** of a relationship type? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-22T01:30:24.968Z
card-last-reviewed:: 2022-11-11T11:30:24.969Z
card-last-score:: 5
collapsed:: true
- Whenever an attribute of one entity type refers to another entity type, some relationship exists.
- The **degree** of a relationship type is ^^the number of participating entity types.^^
- Relationship types may have certain constraints.
- What is the **Cardinality Ratio**? #card
collapsed:: true
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-20T15:15:43.390Z
card-last-reviewed:: 2022-11-17T20:15:43.391Z
card-last-score:: 5
- The **cardinality ratio** specifies ^^the number of relationship instances that an entity can participate in.^^
- The possible cardinality ratios for binary relationship types are:
- $1:1$, "one to one" - at most one instance of entity $A$ is associated with one instance of entity $B$.
- $1:N$, "one to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$.
- $M:N$, "many to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, and for one instance of entity $B$, there are 0 or more instances of entity $A$.
-
- ### Structural Constraints on Relationships
- Often, we may know the min & max of the cardinalities.
- Example: limit on the number of books that can be borrowed from a library.
- What are **Structural Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-25T13:06:19.301Z
card-last-reviewed:: 2022-11-21T13:06:19.301Z
card-last-score:: 3
- **Structural constraints** specify a pair of integer numbers *(min, max)* for each entity participating in a relationship.
- Examples: (0, 1), (1, 1), (1, N), (1, 7).
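- A small sketch of how a *(min, max)* pair can be checked against the number of relationship instances an entity takes part in. The `satisfies` function and the 7-book library limit are hypothetical; `"N"` is used here to mean "no upper bound".

```python
def satisfies(count, min_max):
    """Check a participation count against a (min, max) pair; max may be "N" for unbounded."""
    lo, hi = min_max
    return count >= lo and (hi == "N" or count <= hi)

# Hypothetical library rule: a member may borrow between 0 and 7 books.
BORROWS = (0, 7)
print(satisfies(3, BORROWS))   # True
print(satisfies(8, BORROWS))   # False
print(satisfies(0, (1, "N")))  # False: a minimum of 1 means participation is mandatory
```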
-
- ## Attributes
collapsed:: true
- What are **Attributes**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:19:08.122Z
card-last-score:: 1
collapsed:: true
- An **attribute** is a ^^named property^^ or characteristic of an entity.
- Each entity has a set of attributes associated with it.
- Several types of attributes exist:
- Key.
- Composite.
- Derived.
- Multi-valued.
- **Notation:**
- **Chen:** An oval enclosing the name of the attribute.
- ![image.png](../assets/image_1664889737597_0.png)
- **Crow:** Listed in the entity box.
- ![image.png](../assets/image_1664889775094_0.png)
- What are **key attributes**? #card
card-last-interval:: 5.52
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-20T04:37:24.678Z
card-last-reviewed:: 2022-11-14T16:37:24.678Z
card-last-score:: 3
collapsed:: true
- Each entity type must have an attribute or set of attributes that ^^uniquely identifies^^ each instance from other instances of the same type.
- A **candidate key** is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- A **primary key (PK)** is a candidate key that has been selected as the identifier for an entity type.
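- A sketch of the uniqueness test behind keys, assuming entity instances are held as Python dictionaries. The `uniquely_identifies` function and the sample `students` data are hypothetical. Note that a key is a property of the design, not of one data sample: sample data can only rule a candidate out, and this check does not test minimality.

```python
def uniquely_identifies(rows, attrs):
    """True if the combined values of attrs are different for every row."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

# Hypothetical instances of a Student entity type.
students = [
    {"student_id": 1, "name": "Ann", "email": "ann@example.com"},
    {"student_id": 2, "name": "Ann", "email": "ann2@example.com"},
]

print(uniquely_identifies(students, ["student_id"]))  # True
print(uniquely_identifies(students, ["name"]))        # False: two students are called Ann
```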
- What is a **composite attribute**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:46:49.892Z
card-last-score:: 1
collapsed:: true
- A **composite attribute** is an attribute that is composed of several atomic (simple) attributes.
- If the composite attribute is referenced as a whole only, then there is no need to subdivide it into component attributes, otherwise you should divide it:
- ![image.png](../assets/image_1664890028074_0.png)
- What is a **derived attribute**? #card
card-last-interval:: 31.05
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-12-15T21:12:18.376Z
card-last-reviewed:: 2022-11-14T20:12:18.376Z
card-last-score:: 5
collapsed:: true
- A **derived attribute** is an attribute whose values can be determined from another attribute.
- For Chen's notation, the notation is a *dotted oval*.
- ![image.png](../assets/image_1664890167971_0.png)
- For Crow's Foot notation, derived attributes can be represented by enclosing the attribute in [square brackets], e.g., [age].
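- As an illustration of the [age] example, a derived attribute can be computed on demand rather than stored. This sketch assumes a stored `date_of_birth` attribute; the function name `age_on` is hypothetical.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive age in whole years from the stored date of birth."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

print(age_on(date(2000, 6, 15), date(2022, 6, 14)))  # 21
print(age_on(date(2000, 6, 15), date(2022, 6, 15)))  # 22
```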
- What is a **multi-valued attribute**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:18:04.379Z
card-last-score:: 1
- A **multi-valued attribute** is an attribute that can hold more than one value for a single entity instance; it may have lower & upper bounds on the number of values for an individual entry.
- For Chen's notation, multi-valued attributes can be represented by one oval inside another.
- ![image.png](../assets/image_1664890277505_0.png)
- For Crow's Foot notation, multi-valued attributes can be represented by enclosing the attribute in {curly brackets}.
- E.g., {skills}.
-
-
- # Total & Partial Participation
collapsed:: true
- What is **Total Participation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-18T20:26:04.052Z
card-last-reviewed:: 2022-11-14T20:26:04.052Z
card-last-score:: 3
- **Total Participation** (Mandatory Participation): ==All instances of an entity must participate in the relationship==, i.e., *every* entity instance in one set *must* be related to an entity instance in the second via the relationship.
- For example, the entity "Student" must have **total participation** in the relationship "enrolled" with the entity "Course" - each student *must* be enrolled in a course.
- What is **Partial Participation**? #card
card-last-interval:: 7.45
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-22T02:28:50.125Z
card-last-reviewed:: 2022-11-14T16:28:50.125Z
card-last-score:: 3
- **Partial Participation** (Optional Participation): Some subset of instances of an entity will participate in the relationship, but not all, i.e., *some* entity instances in one set are related to an entity instance in the second via the relationship.
- For example, the entity "Course" would have a **partial participation** in the relationship "enrolled" with the entity "Student" - a course might not have any students enrolled in it.
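- A sketch of how the Student / Course example could be checked on sample data, assuming relationship instances are stored as pairs. The `non_participants` function and the sample identifiers are hypothetical. An empty result means every instance participates in this sample, as total participation requires.

```python
def non_participants(instances, relationship_pairs, position):
    """Return the instances that appear in no pair of the relationship."""
    participating = {pair[position] for pair in relationship_pairs}
    return [i for i in instances if i not in participating]

# Hypothetical sample data for the "enrolled" relationship.
students = ["s1", "s2"]
courses = ["CT230", "MA284", "CT216"]
enrolled = [("s1", "CT230"), ("s2", "CT230"), ("s2", "MA284")]

print(non_participants(students, enrolled, 0))  # []
print(non_participants(courses, enrolled, 1))   # ['CT216']
```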
- ## Chen's Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:06:29.661Z
card-last-score:: 1
- In both total & partial participation, lines are drawn from the participating entity to the relationship (the diamond) to indicate how instances of that entity participate in the relationship:
- **Total Participation:** Double parallel lines.
- ![image.png](../assets/image_1664968200677_0.png)
- **Partial Participation:** Single line.
- ![image.png](../assets/image_1664968259633_0.png)
- Examples:
- ![image.png](../assets/image_1664968387620_0.png)
-
- ## Crow's Foot Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:19:01.789Z
card-last-score:: 1
collapsed:: true
- Use the idea of **Ordinality / Optionality**.
- **Optionality of 0:** If an entity $A$ has partial participation in a relationship to entity $B$, then an instance of $A$ is associated with 0 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
- The optionality sign for an **Optionality of 0** is a **circle** or "o": $\bigcirc$
- **Optionality of 1:** If an entity $A$ has total participation in a relationship to entity $B$, then an instance of $A$ is associated with at least 1 instance of $B$, so the **optionality sign** goes beside $B$.
- The optionality sign for an **Optionality of 1** is a **bar**: $|$
- [And vice-versa.]
- In Crow's Foot notation, there is no diamond, so there is always a direct relationship line between the entities.
- The **optionality sign** is drawn on this line.
- The optionality drawn beside some entity $A$ refers to how an instance of entity $B$ is related to entity $A$.
- That is, whether $B$ can be involved partially (0) or not (1).
- Example in *Right to Left* Relationships:
- ![image.png](../assets/image_1664969067049_0.png)
-
- ## Note on Weak Entities
- **Note:** ^^A **weak entity type** always has a total participation constraint.^^
- We need to show the "identifying relationship".
- ![image.png](../assets/image_1664969168410_0.png)
- ### Chen's Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:03.316Z
card-last-score:: 1
collapsed:: true
- **Entity:** Double rectangle.
- **Relationship:** Double diamond.
- The weak entity has full participation in the relationship.
- ![image.png](../assets/image_1664969357926_0.png)
-
- ### Crow's Foot Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T19:36:10.883Z
card-last-score:: 1
collapsed:: true
- In Crow's Foot Notation, we can represent the **weak entity** as a normal entity, but do not choose any attributes as the primary key.
- For an attribute that partially determines the entity instances, we choose the "required" option.
- We usually represent the relationship between entities using a **solid line**.
- This indicates that it is an "identifying" relationship.
- ![image.png](../assets/image_1664969566989_0.png)
-
-
- In general, with entities, there may be two valid solutions, one with a weak entity, and one without.
- It is not a serious problem if you do not identify weak entities in a solution, so long as all the entities have **primary keys**.
- It may be slightly non-optimal in terms of introducing an additional primary key that is not needed, but this is not a huge problem for us at this level.
-
- # Entities or Multi-Valued Attributes?
collapsed:: true
- Sometimes, it may not be clear whether something should be modelled as a multi-valued attribute or an entity.
- Both may be equally correct, as long as you represent all the information that you are asked to.
- You may see (very little) difference between the two approaches if you map either approach to tables in a database.
-
- # Steps to Create an ER Model
- 1. Identify entities.
2. Identify relationships between entities.
3. Draw entities & relationships.
4. Add attributes to entities (& relationships, if appropriate).
5. Add cardinalities to relationships.
6. Add participation constraints (total or partial) to relationships.
7. Check that all entities have primary keys identified.
-
- # Mapping ER Models to Tables in the Relational Model
- Once you have your ER diagram, you need to convert it into a set of tables so that you can implement it in the relational model.
- Example: As MySQL tables using `CREATE TABLE` statements.
- This stage is called **Mapping ER Models to Tables in the Relational Model**, and it specifies a set of rules that must be followed in a certain order.
- The rules specified here are based on **Chen's Notation**.
- ## Steps
- [1] For each entity, create a table $R$ that includes all the **simple** attributes of the entity.
- [2] For strong entities, choose a key attribute as the primary key of the table.
- [3] For weak entities $R$, include the primary key attributes of the table that corresponds to the owner as foreign key attributes of $R$.
- The primary key of $R$ is a combination of the primary key of the owner and the partial key of the weak entity type.
- The relationship of the weak & strong entity is generally taken care of by this step.
- [4] For each **binary** $1:1$ relationship, identify the entities $S$ & $T$ that participate in the relationship.
- If applicable, choose the entity that has **total participation** in the relationship.
- Include the primary key of the other entity's table as a foreign key in the chosen entity's table.
- Include any attributes of the relationship as attributes of the chosen table.
- If both entities have total participation in the relationship, you can choose either one for the foreign key and proceed as above, or you can map two entities & their associated attributes & relationship attributes into one table.
- [5] For each **binary** $1:N$ relationship, identify the table $S$ that represents the $N$ side and the table $T$ that represents the $1$ side.
- Include the primary key of table $T$ as a foreign key in $S$ such that each entity on the $N$ side is related to at most one entity instance on the $1$ side.
- Include any attributes of the relationship as attributes of $S$.
- For recursive $1:N$ relationships, choose the primary key of the table and include it as a foreign key in the same table with a different name.
- [6] For each $M:N$ relationship, create a new table $S$ to represent the relationship.
- Include the primary keys of the tables that represent the participating entity types as foreign keys in $S$ - their combination will form the primary key of $S$.
- Also include in $S$ any attributes of the relationship.
- For a recursive $M:N$ relationship, both foreign keys come from the same table (give different names to each) and become the new primary key.
- [7] For each **multi-valued attribute** $A$ of an entity $S$, create a new table $R$.
- $R$ will include:
- An attribute corresponding to $A$.
- The primary key of $S$, which will be a foreign key in $R$. Call this $K$.
- The primary key of $R$ is a combination of $A$ & $K$.
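- A minimal sketch of steps [6] & [7], assuming Python's built-in `sqlite3` module as a stand-in for MySQL. The `student`, `course`, `enrolled`, & `student_skill` tables and their columns (including `grade`) are hypothetical names chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_code TEXT PRIMARY KEY, title TEXT NOT NULL);

-- Step 6: M:N "enrolled" -> new table; the two foreign keys together form the primary key.
CREATE TABLE enrolled (
    student_id  INTEGER NOT NULL REFERENCES student(student_id),
    course_code TEXT    NOT NULL REFERENCES course(course_code),
    grade       TEXT,   -- hypothetical attribute of the relationship
    PRIMARY KEY (student_id, course_code)
);

-- Step 7: multi-valued attribute {skills} -> new table keyed on (K, A).
CREATE TABLE student_skill (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    skill      TEXT    NOT NULL,
    PRIMARY KEY (student_id, skill)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Ann')")
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [("CT230", "Database Systems I"), ("MA284", "Discrete Mathematics")])
conn.executemany("INSERT INTO enrolled VALUES (1, ?, NULL)", [("CT230",), ("MA284",)])
conn.executemany("INSERT INTO student_skill VALUES (1, ?)", [("SQL",), ("Python",)])

# Each value of the multi-valued attribute is now one row in student_skill.
skills = [row[0] for row in conn.execute(
    "SELECT skill FROM student_skill WHERE student_id = 1 ORDER BY skill")]
print(skills)  # ['Python', 'SQL']
```

- Because the primary key of `enrolled` is the pair of foreign keys, the same student cannot be enrolled in the same course twice.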
-
-

View File

@ -0,0 +1,340 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Aggregate Clauses, Group By, & Having Clauses]]
- **Next Topic:** [[Joins & Union Queries]]
- **Relevant Slides:** ![ER-models.pdf](../assets/ER-models_1664888140370_0.pdf)
-
- What are **Data Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:07.942Z
card-last-score:: 1
- **Data models** are concepts to describe the structure of a database.
- They comprise:
- High level or logical models;
- Representational / Implementational data models;
- Physical data models.
- Data models allow for database abstraction.
- What are **ER Models**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:30.174Z
card-last-score:: 1
- **Entity Relationship Models** are a top-down approach to database design that provides a way to *model the data* that will be stored in a system. The models are then used to *create tables* in the relational model.
- ER Models are used to identify:
- [1] The important data to be stored in a database, called **entities**.
- [2] The **relationships** between the entities.
- [3] The **attributes** of entities.
- [4] The **constraints** of relationships & entities.
-
- # ER Model Notation
- A number of different notations can be used to represent the same model.
- Chen Notation.
- IE Crow's Foot Notation.
- UML.
- Integration Definition 1, Extended (IDEF1X).
-
- The original (Chen) notation uses diamonds, rectangles, and ellipses.
- This is easier to hand-draw, so it is useful in an exam situation.
- It is less implementation-oriented than other notations.
-
- # Some Definitions
collapsed:: true
- ## Entities
collapsed:: true
- What is an **entity type**? #card
card-last-interval:: 3.05
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T21:24:38.127Z
card-last-reviewed:: 2022-11-14T20:24:38.127Z
card-last-score:: 5
- An **entity type** is a collection of *entity instances* that share common properties or characteristics.
- It is a group of objects, with the same properties, which are identified as having an independent existence.
- ![image.png](../assets/image_1664889325926_0.png)
- What is an **entity instance**? #card
card-last-interval:: 11.34
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-26T00:47:05.851Z
card-last-reviewed:: 2022-11-14T16:47:05.851Z
card-last-score:: 5
- An **entity instance** or **entity occurrence** is a single, uniquely identifiable occurrence of an entity type (e.g., row in a table).
- ![image.png](../assets/image_1664889343582_0.png)
- ## Relationships
- What is a **relationship type**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:30:24.202Z
card-last-score:: 1
collapsed:: true
- A **relationship type** is a set of meaningful relationships among entity types.
- **Chen's Notation:** A diamond shape is used to name the relationship. 1 and M/N are used for the "1" and "many" sides respectively.
- **Crow's Foot Notation:** The titular "crow's foot" is used as the representation of "many", and one line is used for the representation of "1".
- ![image.png](../assets/image_1664890907305_0.png)
- **Example:** employee "works for" department. department "has" employee.
- ![image.png](../assets/image_1664889461415_0.png)
- What is a **relationship instance**? #card
collapsed:: true
card-last-interval:: 29.04
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-12-10T11:35:42.874Z
card-last-reviewed:: 2022-11-11T11:35:42.874Z
card-last-score:: 3
- A **relationship instance** or **relationship occurrence** is ==a uniquely identifiable association which includes one occurrence from each participating entity type==; reading left to right and right to left.
- **Example:**
- Left-to-Right: John Smith "works for" Research department.
- Right-to-Left: Research department "has" John Smith.
- What is the **degree** of a relationship type? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-22T01:30:24.968Z
card-last-reviewed:: 2022-11-11T11:30:24.969Z
card-last-score:: 5
collapsed:: true
- Whenever an attribute of one entity type refers to another entity type, some relationship exists.
- The **degree** of a relationship type is ^^the number of participating entity types.^^
- Relationship types may have certain constraints.
- What is the **Cardinality Ratio**? #card
collapsed:: true
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-20T15:15:43.390Z
card-last-reviewed:: 2022-11-17T20:15:43.391Z
card-last-score:: 5
- The **cardinality ratio** specifies ^^the number of relationship instances that an entity can participate in.^^
- The possible cardinality ratios for binary relationship types are:
- $1:1$, "one to one" - at most one instance of entity $A$ is associated with one instance of entity $B$.
- $1:N$, "one to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$.
- $M:N$, "many to many" - for one instance of entity $A$, there are 0 or more instances of entity $B$, and for one instance of entity $B$, there are 0 or more instances of entity $A$.
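- A sketch that classifies the ratio observed in a set of relationship instances, assuming each instance is an `(a, b)` pair. The `cardinality_ratio` function and sample data are hypothetical. The true cardinality ratio is a design constraint, so sample data can only show the tightest ratio that the data is consistent with.

```python
from collections import Counter

def cardinality_ratio(pairs):
    """Classify a binary relationship from its (a, b) instance pairs."""
    pairs = set(pairs)
    max_b_per_a = max(Counter(a for a, _ in pairs).values(), default=0)
    max_a_per_b = max(Counter(b for _, b in pairs).values(), default=0)
    if max_b_per_a <= 1 and max_a_per_b <= 1:
        return "1:1"
    if max_b_per_a > 1 and max_a_per_b > 1:
        return "M:N"
    return "1:N"

# One department has many employees, but each employee has one department.
print(cardinality_ratio([("Research", "John"), ("Research", "Mary")]))  # 1:N
```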
-
- ### Structural Constraints on Relationships
- Often, we may know the min & max of the cardinalities.
- Example: limit on the number of books that can be borrowed from a library.
- What are **Structural Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-25T13:06:19.301Z
card-last-reviewed:: 2022-11-21T13:06:19.301Z
card-last-score:: 3
- **Structural constraints** specify a pair of integer numbers *(min, max)* for each entity participating in a relationship.
- Examples: (0, 1), (1, 1), (1, N), (1, 7).
-
- ## Attributes
collapsed:: true
- What are **Attributes**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:19:08.122Z
card-last-score:: 1
collapsed:: true
- An **attribute** is a ^^named property^^ or characteristic of an entity.
- Each entity has a set of attributes associated with it.
- Several types of attributes exist:
- Key.
- Composite.
- Derived.
- Multi-valued.
- **Notation:**
- **Chen:** An oval enclosing the name of the attribute.
- ![image.png](../assets/image_1664889737597_0.png)
- **Crow:** Listed in the entity box.
- ![image.png](../assets/image_1664889775094_0.png)
- What are **key attributes**? #card
card-last-interval:: 5.52
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-20T04:37:24.678Z
card-last-reviewed:: 2022-11-14T16:37:24.678Z
card-last-score:: 3
collapsed:: true
- Each entity type must have an attribute or set of attributes that ^^uniquely identifies^^ each instance from other instances of the same type.
- A **candidate key** is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
- A **primary key (PK)** is a candidate key that has been selected as the identifier for an entity type.
- What is a **composite attribute**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:46:49.892Z
card-last-score:: 1
collapsed:: true
- A **composite attribute** is an attribute that is composed of several atomic (simple) attributes.
- If the composite attribute is referenced as a whole only, then there is no need to subdivide it into component attributes, otherwise you should divide it:
- ![image.png](../assets/image_1664890028074_0.png)
- What is a **derived attribute**? #card
card-last-interval:: 31.05
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-12-15T21:12:18.376Z
card-last-reviewed:: 2022-11-14T20:12:18.376Z
card-last-score:: 5
collapsed:: true
- A **derived attribute** is an attribute whose values can be determined from another attribute.
- For Chen's notation, the notation is a *dotted oval*.
- ![image.png](../assets/image_1664890167971_0.png)
- For Crow's Foot notation, derived attributes can be represented by enclosing the attribute in [square brackets], e.g., [age].
- What is a **multi-valued attribute**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:18:04.379Z
card-last-score:: 1
- A **multi-valued attribute** is an attribute that can hold more than one value for a single entity instance; it may have lower & upper bounds on the number of values for an individual entry.
- For Chen's notation, multi-valued attributes can be represented by one oval inside another.
- ![image.png](../assets/image_1664890277505_0.png)
- For Crow's Foot notation, multi-valued attributes can be represented by enclosing the attribute in {curly brackets}.
- E.g., {skills}.
-
-
- # Total & Partial Participation
collapsed:: true
- What is **Total Participation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-18T20:26:04.052Z
card-last-reviewed:: 2022-11-14T20:26:04.052Z
card-last-score:: 3
- **Total Participation** (Mandatory Participation): ==All instances of an entity must participate in the relationship==, i.e., *every* entity instance in one set *must* be related to an entity instance in the second via the relationship.
- For example, the entity "Student" must have **total participation** in the relationship "enrolled" with the entity "Course" - each student *must* be enrolled in a course.
- What is **Partial Participation**? #card
card-last-interval:: 7.45
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-22T02:28:50.125Z
card-last-reviewed:: 2022-11-14T16:28:50.125Z
card-last-score:: 3
- **Partial Participation** (Optional Participation): Some subset of instances of an entity will participate in the relationship, but not all, i.e., *some* entity instances in one set are related to an entity instance in the second via the relationship.
- For example, the entity "Course" would have a **partial participation** in the relationship "enrolled" with the entity "Student" - a course might not have any students enrolled in it.
- ## Chen's Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:06:29.661Z
card-last-score:: 1
- In both total & partial participation, lines are drawn from the participating entity to the relationship (the diamond) to indicate how instances of that entity participate in the relationship:
- **Total Participation:** Double parallel lines.
- ![image.png](../assets/image_1664968200677_0.png)
- **Partial Participation:** Single line.
- ![image.png](../assets/image_1664968259633_0.png)
- Examples:
- ![image.png](../assets/image_1664968387620_0.png)
-
- ## Crow's Foot Notation for Participation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:19:01.789Z
card-last-score:: 1
collapsed:: true
- Use the idea of **Ordinality / Optionality**.
- **Optionality of 0:** If an entity $A$ has partial participation in a relationship to entity $B$, then an instance of $A$ is associated with 0 or more instances of entity $B$, so the **optionality sign** goes beside $B$.
- The optionality sign for an **Optionality of 0** is a **circle** or "o": $\bigcirc$
- **Optionality of 1:** If an entity $A$ has total participation in a relationship to entity $B$, then an instance of $A$ is associated with at least 1 instance of $B$, so the **optionality sign** goes beside $B$.
- The optionality sign for an **Optionality of 1** is a **bar**: $|$
- [And vice-versa.]
- In Crow's Foot notation, there is no diamond, so there is always a direct relationship line between the entities.
- The **optionality sign** is drawn on this line.
- The optionality drawn beside some entity $A$ refers to how an instance of entity $B$ is related to entity $A$.
- That is, whether $B$ can be involved partially (0) or not (1).
- Example in *Right to Left* Relationships:
- ![image.png](../assets/image_1664969067049_0.png)
-
- ## Note on Weak Entities
- **Note:** ^^A **weak entity type** always has a total participation constraint.^^
- We need to show the "identifying relationship".
- ![image.png](../assets/image_1664969168410_0.png)
- ### Chen's Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:03.316Z
card-last-score:: 1
collapsed:: true
- **Entity:** Double rectangle.
- **Relationship:** Double diamond.
- The weak entity has full participation in the relationship.
- ![image.png](../assets/image_1664969357926_0.png)
-
- ### Crow's Foot Notation for Weak Entities #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T19:36:10.883Z
card-last-score:: 1
collapsed:: true
- In Crow's Foot Notation, we can represent the **weak entity** as a normal entity, but do not choose any attributes as the primary key.
- For an attribute that partially determines the entity instances, we choose the "required" option.
- We usually represent the relationship between entities using a **solid line**.
- This indicates that it is an "identifying" relationship.
- ![image.png](../assets/image_1664969566989_0.png)
-
-
- In general, with entities, there may be two valid solutions, one with a weak entity, and one without.
- There is not a huge difficulty if you do not identify weak entities in a solution so long as all the entities have **primary attributes**.
- It may be slightly non-optimal in terms of introducing an additional primary key that is not needed, but this is not a huge problem for us at this level.
-
- # Entities or Multi-Valued Attributes?
collapsed:: true
- Sometimes, it may not be clear whether something should be modelled as a multi-valued attribute or an entity.
- Both may be equally correct, as long as you represent all the information that you are asked to.
- You may see (very little) difference between the two approaches if you map either approach to tables in a database.
-
- # Steps to Create an ER Model
- 1. Identify entities.
2. Identify relationships between entities.
3. Draw entities & relationships.
4. Add attributes to entities (& relationships, if appropriate).
5. Add cardinalities to relationships.
6. Add participation constraints (total or partial) to relationships.
7. Check that all entities have primary keys identified.
-
- # Mapping ER Models to Tables in the Relational Model
	- Once you have your ER diagram, you need to convert it into a set of tables so that you can implement it in the relational model.
		- Example: As MySQL tables using `CREATE TABLE` commands.
- This stage is called **Mapping ER Models to Tables in the Relational Model**, and it specifies a set of rules that must be followed in a certain order.
- The rules specified here are based on **Chen's Notation**.
- ## Steps
- [1] For each entity, create a table $R$ that includes all the **simple** attributes of the entity.
- [2] For strong entities, choose a key attribute as the primary key of the table.
- [3] For weak entities $R$, include the primary key attributes of the table that corresponds to the owner as foreign key attributes of $R$.
- The primary key of $R$ is a combination of the primary key of the owner and the partial key of the weak entity type.
- The relationship of the weak & strong entity is generally taken care of by this step.
		- [4] For each **binary** $1:1$ relationship, identify the entities $S$ & $T$ that participate in the relationship.
			- If applicable, choose the entity that has **total participation** in the relationship.
				- Include the primary key of the other entity's table as a foreign key in this table.
- Include any attributes of the relationship as attributes of the chosen table.
- If both entities have total participation in the relationship, you can choose either one for the foreign key and proceed as above, or you can map two entities & their associated attributes & relationship attributes into one table.
- [5] For each **binary** $1:N$ relationship, identify the table $S$ that represents the $N$ side and the table $T$ that represents the $1$ side.
- Include the primary key of table $T$ as a foreign key in $S$ such that each entity on the $N$ side is related to at most one entity instance on the $1$ side.
- Include any attributes of the relationship as attributes of $S$.
- For recursive $1:N$ relationships, choose the primary key of the table and include it as a foreign key in the same table with a different name.
- [6] For each $M:N$ relationship, create a new table $S$ to represent the relationship.
- Include the primary keys of the tables that represent the participating entity types as foreign keys in $S$ - their combination will form the primary key of $S$.
- Also include in $S$ any attributes of the relationship.
- For a recursive $M:N$ relationship, both foreign keys come from the same table (give different names to each) and become the new primary key.
- [7] For each **multi-valued attribute** $A$ of an entity $S$, create a new table $R$.
- $R$ will include:
- An attribute corresponding to $A$.
- The primary key of $S$, which will be a foreign key in $R$. Call this $K$.
- The primary key of $R$ is a combination of $A$ & $K$.
-
-
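- The mapping steps above can be sketched on a small, made-up ER model. This is only an illustration under stated assumptions: every table & column name (`department`, `employee`, `dependent`, `project`, `works_on`, `department_location`) is hypothetical, and it uses SQLite through Python's built-in `sqlite3` module rather than MySQL, so the exact `CREATE TABLE` syntax may differ slightly from what the module expects.

```python
import sqlite3

# Hypothetical ER model (all names are assumptions made for illustration):
#   strong entities:        department, employee, project
#   weak entity:            dependent (owner: employee, partial key: name)
#   1:N relationship:       department employs employee
#   M:N relationship:       employee works_on project (relationship attribute: hours)
#   multi-valued attribute: department.location

SCHEMA = """
-- Steps 1 & 2: one table per strong entity, with a key attribute as primary key.
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE project (
    proj_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
-- Step 5: the N side (employee) holds the primary key of the 1 side as a foreign key.
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES department(dept_id)
);
-- Step 3: weak entity; primary key = owner's primary key + partial key.
CREATE TABLE dependent (
    emp_id INTEGER NOT NULL REFERENCES employee(emp_id),
    name   TEXT NOT NULL,
    PRIMARY KEY (emp_id, name)
);
-- Step 6: the M:N relationship becomes its own table, keyed by both foreign keys.
CREATE TABLE works_on (
    emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),
    proj_id INTEGER NOT NULL REFERENCES project(proj_id),
    hours   REAL,
    PRIMARY KEY (emp_id, proj_id)
);
-- Step 7: the multi-valued attribute becomes its own table.
CREATE TABLE department_location (
    dept_id  INTEGER NOT NULL REFERENCES department(dept_id),
    location TEXT NOT NULL,
    PRIMARY KEY (dept_id, location)
);
"""


def build_database():
    """Create an in-memory database containing the mapped tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
    conn.executescript(SCHEMA)
    return conn


def table_names(conn):
    """Return the names of all tables, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = build_database()
print(table_names(conn))
```

- Three strong entities, one weak entity, one $M:N$ relationship, and one multi-valued attribute give six tables; the $1:N$ relationship adds only a foreign key column, not a table.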

View File

@ -0,0 +1,282 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** null
- **Next Topic:** [[Sampling]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662914505882_0.pdf)
-
- ## What is / are Statistics?
collapsed:: true
- What is a **statistic**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:13:39.783Z
card-last-reviewed:: 2022-09-18T15:13:39.783Z
card-last-score:: 5
- A **statistic** is any quantity computed from sample data.
- What is the **Science of Statistics**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:19:14.009Z
card-last-reviewed:: 2022-09-18T15:19:14.009Z
card-last-score:: 5
- The collecting, classifying, summarising, organising, analysing, estimation, and interpretation of information.
- What is the **role of statistics**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T15:13:48.388Z
card-last-reviewed:: 2022-09-18T15:13:48.389Z
card-last-score:: 3
- The field of statistics deals with the collection, presentation, analysis, and use of data to:
- make decisions
- solve problems
- design products & processes
- Statistics is the ^^science of uncertainty.^^
- What is the **role of probability** in statistics?
- **Probability** provides the framework for the study & application of statistics.
-
- ## Types of Statistics
collapsed:: true
- What is **Descriptive Statistics**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:50:07.222Z
card-last-score:: 1
- **Descriptive Statistics** is the science of summarising data, both numerically & graphically.
			- The applicable analysis methods depend on the variable being measured and the research questions that you are trying to answer.
- What is **Inferential Statistics**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:22:10.041Z
card-last-reviewed:: 2022-09-18T15:22:10.041Z
card-last-score:: 5
		- **Inferential Statistics** is the science of using the ^^information in your sample^^ to ^^infer^^ something about the population of interest.
-
- ## Important Terms
collapsed:: true
- What is an **experimental unit**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:06.471Z
card-last-reviewed:: 2022-09-18T15:17:06.471Z
card-last-score:: 5
- An **experimental unit** / individual is a single object upon which we collect data. e.g., a person, thing, transaction, or event.
- What is a **population**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:19:17.126Z
card-last-reviewed:: 2022-09-18T15:19:17.126Z
card-last-score:: 5
- A **population** is a ^^collection of experimental units^^ / individuals that we are interested in studying.
- What is a **sample**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:22:15.918Z
card-last-reviewed:: 2022-09-18T15:22:15.919Z
card-last-score:: 5
- A **sample** is a subset of experimental units from the population.
- What is a **variable**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:19:22.661Z
card-last-reviewed:: 2022-09-18T15:19:22.662Z
card-last-score:: 5
- A **variable** is a ^^characteristic or property of an individual experimental unit^^.
- A variable may be measured, or more generally "observed" on each individual.
- What is **Qualitative Data**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T15:29:26.142Z
card-last-reviewed:: 2022-09-19T18:29:26.143Z
card-last-score:: 3
- **Qualitative Data** is data which can be classified into categories.
- Two types of Qualitative Data:
- **Ordinal:** ordered qualitative data - e.g., a grade,
- **Nominal:** unordered qualitative data - e.g., a gender, a method of payment
- What is **Quantitative Data**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:18:00.114Z
card-last-reviewed:: 2022-09-18T15:18:00.114Z
card-last-score:: 5
- **Quantitative Data** is data in the form of counts or numbers - it cannot be classified into categories.
- Two types of Quantitative Data:
- **Discrete:** non-divisible, single points of data, **counts** - e.g., number of texts sent
- **Continuous:** measurements that, if placed on a number scale, can be placed in an infinite number of spaces between two whole numbers - e.g., age, rent, temperature
-
- Pie charts make data very difficult to interpret & read - **don't use them**.
-
- ## Numerical Summaries
- ### Central Tendency
- What is a **numerical summary**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-09-29T22:46:04.368Z
card-last-reviewed:: 2022-09-19T17:46:04.368Z
card-last-score:: 5
- A **numerical summary** is a way of summarising categorical data using a frequency count or percentage.
- How do you calculate the **sample mean**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:23:34.273Z
card-last-reviewed:: 2022-09-18T15:23:34.273Z
card-last-score:: 5
- Suppose that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$. The **sample mean**, denoted by $\bar{x}$, is:
				- $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1+x_2+\dots+x_n}{n}$$
:LOGBOOK:
CLOCK: [2022-09-12 Mon 18:35:39]
:END:
- ^^The sample mean is **sensitive** to extreme values^^
- How do you calculate the **sample median**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:21:34.890Z
card-last-reviewed:: 2022-09-19T18:21:34.891Z
card-last-score:: 3
- Given that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$, arranged in **increasing order** of magnitude, the **sample median** is:
				- $$\tilde{x} = \begin{cases}x_{(n+1)/2}, & \text{if $n$ is odd},\\ \frac{1}{2}(x_{n/2} + x_{n/2+1}), &\text{if $n$ is even.} \\ \end{cases}$$
- ^^The sample median is **not** sensitive to extreme values.^^
- What is the **mode**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T17:43:51.718Z
card-last-reviewed:: 2022-09-19T17:43:51.719Z
card-last-score:: 5
- The **mode** is the most frequent observation in a dataset.
- ### Variation
collapsed:: true
- What is the **range** of a sample? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T17:45:55.511Z
card-last-reviewed:: 2022-09-19T17:45:55.512Z
card-last-score:: 5
- The **range** of a sample is the **maximum** - **minimum**.
- The range is a ^^poor measure of spread and is badly affected by outliers.^^
- #### Interquartile Range
- What is the **interquartile range** of a sample? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:50:39.855Z
card-last-reviewed:: 2022-09-18T14:50:39.855Z
card-last-score:: 3
				- The **interquartile range** is the range of the middle 50% of the data.
- Therefore, it is ^^robust to outliers.^^
				- To calculate the **IQR**, first split the ordered data into 4 quarters, then subtract the value at $Q_1$ from the value at $Q_3$.
- $$IQR=Q_3-Q_1$$
- ![image.png](../assets/image_1663005545935_0.png)
- #### Tukey's Method for IQR
- There are also many other methods for calculating IQR.
- 1. Put data in **ascending** order.
				  2. The **lower quartile** ($Q_1$) is the **median** of the **lower** 50% of the data, including the median.
3. The **upper quartile** ($Q_3$) is the **median** of the **upper** 50% of the data, including the median.
- #### Standard Deviation #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:52:05.581Z
card-last-reviewed:: 2022-09-18T14:52:05.582Z
card-last-score:: 3
- A common measure of spread is the **standard deviation**, which takes into account how far *each* data value is from the mean.
- A **deviation** is the distance of a datapoint from the mean.
					- Since the sum of all the deviations is always zero, we square each deviation and find an average of the squared deviations, called the **variance**.
						- We then take the positive square root of the **sample variance** to get the **sample standard deviation**, which is preferable to the sample variance because it is in the same units as the data, whereas the sample variance is in squared units.
- The **standard deviation** is ^^sensitive to outliers.^^
- How do you calculate the **sample variance**, and hence, the **sample standard deviation**? #card
card-last-interval:: 4.55
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-24T06:47:06.630Z
card-last-reviewed:: 2022-09-19T17:47:06.630Z
card-last-score:: 5
- The **sample variance**, denoted by $s^2$, is given by:
- $$s^2=\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
- The **sample standard deviation**, denoted by $s$, is the **positive square root** of $s^2$, that is:
- $$s=\sqrt{s^2}$$
- ### Shape
- #### Graphical Summaries of Data
- Depends on the variable of interest.
- **Categorical** response variable -> bar chart or pie chart.
- **Categorical** response variable ^^with an explanatory variable^^ -> grouped bar chart.
				- **Continuous** response variable -> histogram, boxplot, density plot.
- **Continuous** response variable ^^with an explanatory variable^^ -> grouped boxplot.
-
- What is a **boxplot**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:30:23.072Z
card-last-score:: 1
- A **boxplot** is a graphical display showing centre, spread, shape, & outliers.
- It displays the **5-number summary**:
- *min, Q1, median, Q3, max*
- ![image.png](../assets/image_1663236210540_0.png)
- What is a **histogram**? #card
card-last-interval:: 9.55
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-29T07:30:03.396Z
card-last-reviewed:: 2022-09-19T18:30:03.397Z
card-last-score:: 5
- **Histograms** are useful to show the general shape, location, and spread of data values.
- Representation by *area*.
- **Construction**
- Determine range of data *minimum, maximum*.
- Split into convenient intervals or *bins*.
- Usually use 5 to 15 intervals.
- Count number of observations in each interval - *frequency*.
- When talking about the shape of the data, make sure to address the following 3 questions:
- 1. Does the histogram have a single, central hump or several well-separated bumps?
2. Is the histogram or boxplot **symmetric**, or more spread out in one direction (skewed)?
				  3. Any unusual features? e.g., outliers, spikes.
- ![image.png](../assets/image_1663237164731_0.png)
- ![image.png](../assets/image_1663237245117_0.png)
-
- #### Explanatory & Response Variables
collapsed:: true
			- To identify the **explanatory** variable in a pair of variables, identify which of the two is suspected of affecting the other and plan an appropriate analysis.
				- explanatory variable -might affect-> response variable
					- continent -might affect-> life expectancy.
-
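- The formulas for the sample mean, median, variance & standard deviation above can be checked with a short Python sketch. The function names and the sample values are made up for illustration; the module itself works in R, so treat this as a hedged cross-check rather than the course's method.

```python
import math
import statistics


def sample_mean(xs):
    """x-bar = (x_1 + ... + x_n) / n."""
    return sum(xs) / len(xs)


def sample_median(xs):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                    # x_{(n+1)/2} in 1-based indexing
    return (s[mid - 1] + s[mid]) / 2     # average of x_{n/2} and x_{n/2+1}


def sample_variance(xs):
    """s^2 = sum of squared deviations from the mean, divided by n - 1."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


def sample_sd(xs):
    """s = positive square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


data = [2, 4, 4, 5, 7, 9, 11]        # made-up sample
with_outlier = data + [100]          # same sample plus one extreme value

print("mean:", sample_mean(data))
print("median:", sample_median(data))
print("mode:", statistics.mode(data))
print("variance:", sample_variance(data))
print("standard deviation:", sample_sd(data))
print("mean with outlier:", sample_mean(with_outlier))
print("median with outlier:", sample_median(with_outlier))
```

- Adding the single value 100 moves the mean from 6.0 to 17.75 but the median only from 5 to 6.0, which is the sensitivity to extreme values noted above.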
- ## R Markdown
- What is **R Markdown**? #card
card-last-interval:: 3.02
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T18:30:58.574Z
card-last-reviewed:: 2022-09-19T18:30:58.574Z
card-last-score:: 3
- **R Markdown** is a file format for making ^^dynamic documents in R.^^
- R Markdown is written in Markdown and contains chunks of embedded R code (data management, summaries, graphics, analysis & interpretation) all in one document.
- Documents can be **knitted** to HTML, PDF, Word, and many other formats.
- ### Key Benefits of R Markdown
- Makes it easy to produce statistical reports with code, analysis, outputs, and write-up all in one place.
- Perfect for reproducible research.
- Easy to convert to different document types.
- ### Structure
- R Markdown contains **three** types of content:
- A **YAML Header**.
- Text, formatted with Markdown.
- Code chunks.

View File

@ -0,0 +1,282 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** null
- **Next Topic:** [[Sampling]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662914505882_0.pdf)
-
- ## What is / are Statistics?
collapsed:: true
- What is a **statistic**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:29:35.052Z
card-last-reviewed:: 2022-10-01T13:29:35.052Z
card-last-score:: 5
- A **statistic** is any quantity computed from sample data.
- What is the **Science of Statistics**?
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:37:15.281Z
card-last-reviewed:: 2022-10-03T11:37:15.284Z
card-last-score:: 5
- The collecting, classifying, summarising, organising, analysing, estimation, and interpretation of information.
- What is the **role of statistics**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:10:56.000Z
card-last-reviewed:: 2022-09-30T12:10:56.000Z
card-last-score:: 5
- The field of statistics deals with the collection, presentation, analysis, and use of data to:
- make decisions
- solve problems
- design products & processes
- Statistics is the ^^science of uncertainty.^^
- What is the **role of probability** in statistics?
- **Probability** provides the framework for the study & application of statistics.
-
- ## Types of Statistics
collapsed:: true
- What is **Descriptive Statistics**? #card
card-last-interval:: 3.57
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T22:41:41.392Z
card-last-reviewed:: 2022-10-06T09:41:41.392Z
card-last-score:: 5
- **Descriptive Statistics** is the science of summarising data, both numerically & graphically.
			- The applicable analysis methods depend on the variable being measured and the research questions that you are trying to answer.
- What is **Inferential Statistics**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:40:04.081Z
card-last-reviewed:: 2022-10-03T11:40:04.082Z
card-last-score:: 5
		- **Inferential Statistics** is the science of using the ^^information in your sample^^ to ^^infer^^ something about the population of interest.
-
- ## Important Terms
collapsed:: true
- What is an **experimental unit**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:30:26.549Z
card-last-reviewed:: 2022-10-01T17:30:26.549Z
card-last-score:: 5
- An **experimental unit** / individual is a single object upon which we collect data. e.g., a person, thing, transaction, or event.
- What is a **population**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:37:17.248Z
card-last-reviewed:: 2022-10-03T11:37:17.248Z
card-last-score:: 5
- A **population** is a ^^collection of experimental units^^ / individuals that we are interested in studying.
- What is a **sample**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:40:07.495Z
card-last-reviewed:: 2022-10-03T11:40:07.496Z
card-last-score:: 5
- A **sample** is a subset of experimental units from the population.
- What is a **variable**? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T12:12:49.015Z
card-last-score:: 1
- A **variable** is a ^^characteristic or property of an individual experimental unit^^.
- A variable may be measured, or more generally "observed" on each individual.
- What is **Qualitative Data**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:02:29.057Z
card-last-score:: 1
- **Qualitative Data** is data which can be classified into categories.
- Two types of Qualitative Data:
- **Ordinal:** ordered qualitative data - e.g., a grade,
- **Nominal:** unordered qualitative data - e.g., a gender, a method of payment
- What is **Quantitative Data**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T23:00:00.000Z
card-last-reviewed:: 2022-10-01T17:34:22.458Z
card-last-score:: 1
- **Quantitative Data** is data in the form of counts or numbers - it cannot be classified into categories.
- Two types of Quantitative Data:
- **Discrete:** non-divisible, single points of data, **counts** - e.g., number of texts sent
- **Continuous:** measurements that, if placed on a number scale, can be placed in an infinite number of spaces between two whole numbers - e.g., age, rent, temperature
-
- Pie charts make data very difficult to interpret & read - **don't use them**.
-
- ## Numerical Summaries
- ### Central Tendency
- What is a **numerical summary**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-11-01T19:10:35.516Z
card-last-interval:: 28.3
card-ease-factor:: 2.66
card-last-reviewed:: 2022-10-04T12:10:35.517Z
- A **numerical summary** is a way of summarising categorical data using a frequency count or percentage.
- How do you calculate the **sample mean**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:40:44.143Z
card-last-reviewed:: 2022-10-03T11:40:44.143Z
card-last-score:: 5
- Suppose that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$. The **sample mean**, denoted by $\bar{x}$, is:
				- $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1+x_2+\dots+x_n}{n}$$
:LOGBOOK:
CLOCK: [2022-09-12 Mon 18:35:39]
:END:
- ^^The sample mean is **sensitive** to extreme values^^
- How do you calculate the **sample median**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T11:44:18.197Z
card-last-score:: 1
- Given that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$, arranged in **increasing order** of magnitude, the **sample median** is:
				- $$\tilde{x} = \begin{cases}x_{(n+1)/2}, & \text{if $n$ is odd},\\ \frac{1}{2}(x_{n/2} + x_{n/2+1}), &\text{if $n$ is even.} \\ \end{cases}$$
- ^^The sample median is **not** sensitive to extreme values.^^
- What is the **mode**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:42:24.167Z
card-last-reviewed:: 2022-10-03T11:42:24.168Z
card-last-score:: 5
- The **mode** is the most frequent observation in a dataset.
- ### Variation
collapsed:: true
- What is the **range** of a sample? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:42:37.291Z
card-last-reviewed:: 2022-10-03T11:42:37.291Z
card-last-score:: 5
- The **range** of a sample is the **maximum** - **minimum**.
- The range is a ^^poor measure of spread and is badly affected by outliers.^^
- #### Interquartile Range
- What is the **interquartile range** of a sample? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:13:50.901Z
card-last-reviewed:: 2022-09-30T12:13:50.901Z
card-last-score:: 3
				- The **interquartile range** is the range of the middle 50% of the data.
- Therefore, it is ^^robust to outliers.^^
				- To calculate the **IQR**, first split the ordered data into 4 quarters, then subtract the value at $Q_1$ from the value at $Q_3$.
- $$IQR=Q_3-Q_1$$
- ![image.png](../assets/image_1663005545935_0.png)
- #### Tukey's Method for IQR
- There are also many other methods for calculating IQR.
- 1. Put data in **ascending** order.
				  2. The **lower quartile** ($Q_1$) is the **median** of the **lower** 50% of the data, including the median.
3. The **upper quartile** ($Q_3$) is the **median** of the **upper** 50% of the data, including the median.
- #### Standard Deviation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:28:57.649Z
card-last-score:: 1
- A common measure of spread is the **standard deviation**, which takes into account how far *each* data value is from the mean.
- A **deviation** is the distance of a datapoint from the mean.
					- Since the sum of all the deviations is always zero, we square each deviation and find an average of the squared deviations, called the **variance**.
						- We then take the positive square root of the **sample variance** to get the **sample standard deviation**, which is preferable to the sample variance because it is in the same units as the data, whereas the sample variance is in squared units.
- The **standard deviation** is ^^sensitive to outliers.^^
- How do you calculate the **sample variance**, and hence, the **sample standard deviation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:31:20.377Z
card-last-score:: 1
- The **sample variance**, denoted by $s^2$, is given by:
- $$s^2=\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
- The **sample standard deviation**, denoted by $s$, is the **positive square root** of $s^2$, that is:
- $$s=\sqrt{s^2}$$
- ### Shape
- #### Graphical Summaries of Data
- Depends on the variable of interest.
- **Categorical** response variable -> bar chart or pie chart.
- **Categorical** response variable ^^with an explanatory variable^^ -> grouped bar chart.
				- **Continuous** response variable -> histogram, boxplot, density plot.
- **Continuous** response variable ^^with an explanatory variable^^ -> grouped boxplot.
-
- What is a **boxplot**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:21:43.026Z
card-last-score:: 1
- A **boxplot** is a graphical display showing centre, spread, shape, & outliers.
- It displays the **5-number summary**:
- *min, Q1, median, Q3, max*
- ![image.png](../assets/image_1663236210540_0.png)
- What is a **histogram**? #card
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-15T07:07:49.634Z
card-last-reviewed:: 2022-10-04T12:07:49.636Z
card-last-score:: 5
- **Histograms** are useful to show the general shape, location, and spread of data values.
- Representation by *area*.
- **Construction**
- Determine range of data *minimum, maximum*.
- Split into convenient intervals or *bins*.
- Usually use 5 to 15 intervals.
- Count number of observations in each interval - *frequency*.
- When talking about the shape of the data, make sure to address the following 3 questions:
- 1. Does the histogram have a single, central hump or several well-separated bumps?
2. Is the histogram or boxplot **symmetric**, or more spread out in one direction (skewed)?
				  3. Any unusual features? e.g., outliers, spikes.
- ![image.png](../assets/image_1663237164731_0.png)
- ![image.png](../assets/image_1663237245117_0.png)
-
- #### Explanatory & Response Variables
collapsed:: true
			- To identify the **explanatory** variable in a pair of variables, identify which of the two is suspected of affecting the other and plan an appropriate analysis.
				- explanatory variable -might affect-> response variable
					- continent -might affect-> life expectancy.
-
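- Tukey's method for the quartiles, the five-number summary shown by a boxplot, and the frequency counts behind a histogram can be sketched in Python as follows. The function names and the sample are made up for illustration, and the equal-width binning is an assumption (it requires at least two distinct data values); statistical software may use other quartile or binning rules.

```python
import statistics


def tukey_quartiles(xs):
    """Q1 and Q3 as medians of the lower and upper halves, each including the median."""
    s = sorted(xs)
    n = len(s)
    half = (n + 1) // 2          # when n is odd, the median belongs to both halves
    lower = s[:half]
    upper = s[n - half:]
    return statistics.median(lower), statistics.median(upper)


def iqr(xs):
    """Interquartile range: Q3 - Q1."""
    q1, q3 = tukey_quartiles(xs)
    return q3 - q1


def five_number_summary(xs):
    """(min, Q1, median, Q3, max), the values displayed by a boxplot."""
    q1, q3 = tukey_quartiles(xs)
    return min(xs), q1, statistics.median(xs), q3, max(xs)


def histogram_counts(xs, bins):
    """Frequency of observations in `bins` equal-width intervals from min to max."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in xs:
        index = min(int((x - lo) / width), bins - 1)   # the maximum goes in the last bin
        counts[index] += 1
    return counts


data = [2, 4, 4, 5, 7, 9, 11]        # made-up sample
with_outlier = data + [100]          # same sample plus one extreme value

print("five-number summary:", five_number_summary(data))
print("range:", max(data) - min(data), "IQR:", iqr(data))
print("with outlier -> range:", max(with_outlier) - min(with_outlier), "IQR:", iqr(with_outlier))
print("histogram counts (3 bins):", histogram_counts(data, 3))
```

- With the value 100 added, the range grows from 9 to 98 while the IQR only changes from 4.0 to 6.0, which illustrates why the IQR is described as robust to outliers.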
- ## R Markdown
- What is **R Markdown**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:15.771Z
card-last-reviewed:: 2022-10-03T11:41:15.771Z
card-last-score:: 5
- **R Markdown** is a file format for making ^^dynamic documents in R.^^
- R Markdown is written in Markdown and contains chunks of embedded R code (data management, summaries, graphics, analysis & interpretation) all in one document.
- Documents can be **knitted** to HTML, PDF, Word, and many other formats.
- ### Key Benefits of R Markdown
- Makes it easy to produce statistical reports with code, analysis, outputs, and write-up all in one place.
- Perfect for reproducible research.
- Easy to convert to different document types.
- ### Structure
- R Markdown contains **three** types of content:
- A **YAML Header**.
- Text, formatted with Markdown.
- Code chunks.

View File

@ -0,0 +1,282 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** null
- **Next Topic:** [[Sampling]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662914505882_0.pdf)
-
- ## What is / are Statistics?
collapsed:: true
- What is a **statistic**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:29:35.052Z
card-last-reviewed:: 2022-10-01T13:29:35.052Z
card-last-score:: 5
- A **statistic** is any quantity computed from sample data.
- What is the **Science of Statistics**?
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:37:15.281Z
card-last-reviewed:: 2022-10-03T11:37:15.284Z
card-last-score:: 5
- The collecting, classifying, summarising, organising, analysing, estimation, and interpretation of information.
- What is the **role of statistics**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:10:56.000Z
card-last-reviewed:: 2022-09-30T12:10:56.000Z
card-last-score:: 5
- The field of statistics deals with the collection, presentation, analysis, and use of data to:
- make decisions
- solve problems
- design products & processes
- Statistics is the ^^science of uncertainty.^^
- What is the **role of probability** in statistics?
- **Probability** provides the framework for the study & application of statistics.
-
- ## Types of Statistics
collapsed:: true
- What is **Descriptive Statistics**? #card
card-last-interval:: 3.57
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-09T22:41:41.392Z
card-last-reviewed:: 2022-10-06T09:41:41.392Z
card-last-score:: 5
- **Descriptive Statistics** is the science of summarising data, both numerically & graphically.
			- The applicable analysis methods depend on the variable being measured and the research questions that you are trying to answer.
- What is **Inferential Statistics**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:40:04.081Z
card-last-reviewed:: 2022-10-03T11:40:04.082Z
card-last-score:: 5
		- **Inferential Statistics** is the science of using the ^^information in your sample^^ to ^^infer^^ something about the population of interest.
-
- ## Important Terms
collapsed:: true
- What is an **experimental unit**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:30:26.549Z
card-last-reviewed:: 2022-10-01T17:30:26.549Z
card-last-score:: 5
- An **experimental unit** / individual is a single object upon which we collect data, e.g., a person, thing, transaction, or event.
- What is a **population**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:37:17.248Z
card-last-reviewed:: 2022-10-03T11:37:17.248Z
card-last-score:: 5
- A **population** is a ^^collection of experimental units^^ / individuals that we are interested in studying.
- What is a **sample**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:40:07.495Z
card-last-reviewed:: 2022-10-03T11:40:07.496Z
card-last-score:: 5
- A **sample** is a subset of experimental units from the population.
- What is a **variable**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T10:31:37.141Z
card-last-reviewed:: 2022-10-07T10:31:37.142Z
card-last-score:: 5
- A **variable** is a ^^characteristic or property of an individual experimental unit^^.
- A variable may be measured, or more generally "observed" on each individual.
- What is **Qualitative Data**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:46:08.943Z
card-last-reviewed:: 2022-10-07T10:46:08.944Z
card-last-score:: 5
- **Qualitative Data** is data which can be classified into categories.
- Two types of Qualitative Data:
- **Ordinal:** ordered qualitative data - e.g., a grade
- **Nominal:** unordered qualitative data - e.g., a gender, a method of payment
- What is **Quantitative Data**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:29:23.069Z
card-last-reviewed:: 2022-10-06T17:29:23.069Z
card-last-score:: 5
- **Quantitative Data** is data in the form of counts or numbers - it cannot be classified into categories.
- Two types of Quantitative Data:
- **Discrete:** non-divisible, single points of data, **counts** - e.g., number of texts sent
- **Continuous:** measurements that can take any value within an interval, so there are infinitely many possible values between any two points on the scale - e.g., age, rent, temperature
-
- Pie charts make data very difficult to interpret & read - **don't use them**.
-
- ## Numerical Summaries
- ### Central Tendency
- What is a **numerical summary**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-11-01T19:10:35.516Z
card-last-interval:: 28.3
card-ease-factor:: 2.66
card-last-reviewed:: 2022-10-04T12:10:35.517Z
- A **numerical summary** is a way of summarising categorical data using a frequency count or percentage.
- How do you calculate the **sample mean**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:40:44.143Z
card-last-reviewed:: 2022-10-03T11:40:44.143Z
card-last-score:: 5
- Suppose that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$. The **sample mean**, denoted by $\bar{x}$, is:
- $$\bar{x} = \sum_{i=1}^{n}\frac{x_i}{n}=\frac{x_1+x_2+...+x_n}{n}$$
:LOGBOOK:
CLOCK: [2022-09-12 Mon 18:35:39]
:END:
- ^^The sample mean is **sensitive** to extreme values^^
- How do you calculate the **sample median**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T22:40:49.278Z
card-last-score:: 1
- Given that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$, arranged in **increasing order** of magnitude, the **sample median**, denoted by $\tilde{x}$, is:
- $$\tilde{x} = \begin{cases}x_{(n+1)/2}, & \text{if $n$ is odd},\\ \frac{1}{2}(x_{n/2} + x_{n/2+1}), &\text{if $n$ is even.} \\ \end{cases}$$
- ^^The sample median is **not** sensitive to extreme values.^^
- What is the **mode**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:42:24.167Z
card-last-reviewed:: 2022-10-03T11:42:24.168Z
card-last-score:: 5
- The **mode** is the most frequent observation in a dataset.
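- The three measures of central tendency above can be sketched in Python as follows. This is only an illustration: the data values and the helper names `sample_mean` and `sample_median` are made up for this example, and `mode` comes from Python's standard `statistics` module.

```python
from statistics import mode

def sample_mean(xs):
    """Sum of the observations divided by n."""
    return sum(xs) / len(xs)

def sample_median(xs):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    s = sorted(xs)
    n = len(s)
    mid = n // 2  # 0-based index of x_{(n+1)/2} when n is odd
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2

# Hypothetical sample; 40 plays the role of an extreme value.
sample = [2, 3, 3, 5, 7, 40]
trimmed = sample[:-1]  # the same sample without the extreme value

print(sample_mean(sample), sample_mean(trimmed))      # the mean moves a lot
print(sample_median(sample), sample_median(trimmed))  # the median barely moves
print(mode(sample))                                   # most frequent observation
```

- Comparing the results with and without the extreme value shows why the mean is described as sensitive to extreme values while the median is not.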
- ### Variation
collapsed:: true
- What is the **range** of a sample? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:42:37.291Z
card-last-reviewed:: 2022-10-03T11:42:37.291Z
card-last-score:: 5
- The **range** of a sample is the **maximum** - **minimum**.
- The range is a ^^poor measure of spread^^, as it uses only the two most extreme observations.
- The range is therefore ^^badly affected by outliers.^^
- #### Interquartile Range
- What is the **interquartile range** of a sample? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:13:50.901Z
card-last-reviewed:: 2022-09-30T12:13:50.901Z
card-last-score:: 3
- The **interquartile range** is the range spanned by the middle 50% of the data.
- Therefore, it is ^^robust to outliers.^^
- To calculate the **IQR**, first split the ordered data into four quarters, then subtract the value at $Q_1$ from the value at $Q_3$.
- $$IQR=Q_3-Q_1$$
- ![image.png](../assets/image_1663005545935_0.png)
- #### Tukey's Method for IQR
- There are also many other methods for calculating IQR.
- 1. Put data in **ascending** order.
2. The **lower quartile** ($Q_1$) is the **median** of the **lower** 50% of the data, including the median.
3. The **upper quartile** ($Q_3$) is the **median** of the **upper** 50% of the data, including the median.
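- A minimal Python sketch of the steps above, assuming that "including the median" means the overall median is placed in both halves when $n$ is odd. The function name `tukey_quartiles` and the data are hypothetical; `median` is from the standard `statistics` module.

```python
from statistics import median

def tukey_quartiles(xs):
    """Return (Q1, Q3) as the medians of the lower and upper halves of the data."""
    s = sorted(xs)           # step 1: ascending order
    n = len(s)
    half = (n + 1) // 2      # size of each half; includes the median when n is odd
    lower = s[:half]         # lower 50% of the data
    upper = s[n - half:]     # upper 50% of the data
    return median(lower), median(upper)

data = [13, 1, 9, 5, 3, 11, 7]  # hypothetical sample
q1, q3 = tukey_quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)
```

- Other software may use a different quartile convention, so values computed elsewhere can differ slightly from this method.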
- #### Standard Deviation #card
card-last-interval:: 3.33
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T22:24:00.798Z
card-last-reviewed:: 2022-10-08T15:24:00.798Z
card-last-score:: 5
- A common measure of spread is the **standard deviation**, which takes into account how far *each* data value is from the mean.
- A **deviation** is the signed difference between a data point and the mean, $x_i - \bar{x}$.
- Since the sum of all the deviations is always zero, we square each deviation and find an average of the squared deviations, called the **variance**.
- We then take the positive square root of the **sample variance** to get the **sample standard deviation**, which is preferable to the sample variance because it is in the same units as the data, whereas the sample variance is in squared units.
- The **standard deviation** is ^^sensitive to outliers.^^
- How do you calculate the **sample variance**, and hence, the **sample standard deviation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T10:46:54.597Z
card-last-reviewed:: 2022-10-07T10:46:54.597Z
card-last-score:: 5
- The **sample variance**, denoted by $s^2$, is given by:
- $$s^2=\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
- The **sample standard deviation**, denoted by $s$, is the **positive square root** of $s^2$, that is:
- $$s=\sqrt{s^2}$$
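- The two formulas above can be checked with a short Python sketch. The data and the function names `sample_variance` and `sample_sd` are assumptions made for this illustration only.

```python
import math

def sample_variance(xs):
    """s^2: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """s: positive square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample with mean 5

# The raw deviations cancel out, which is why they are squared first.
deviations = [x - sum(data) / len(data) for x in data]
print(sum(deviations))

print(sample_variance(data))  # squared units
print(sample_sd(data))        # same units as the data
```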
- ### Shape
- #### Graphical Summaries of Data
- Depends on the variable of interest.
- **Categorical** response variable -> bar chart or pie chart.
- **Categorical** response variable ^^with an explanatory variable^^ -> grouped bar chart.
- **Continuous** response variable -> histogram, boxplot, density plot.
- **Continuous** response variable ^^with an explanatory variable^^ -> grouped boxplot.
-
- What is a **boxplot**? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T10:00:03.971Z
card-last-reviewed:: 2022-10-08T15:00:03.971Z
card-last-score:: 5
- A **boxplot** is a graphical display showing centre, spread, shape, & outliers.
- It displays the **5-number summary**:
- *min, Q1, median, Q3, max*
- ![image.png](../assets/image_1663236210540_0.png)
- What is a **histogram**? #card
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-15T07:07:49.634Z
card-last-reviewed:: 2022-10-04T12:07:49.636Z
card-last-score:: 5
- **Histograms** are useful to show the general shape, location, and spread of data values.
- Representation by *area*.
- **Construction**
- Determine range of data *minimum, maximum*.
- Split into convenient intervals or *bins*.
- Usually use 5 to 15 intervals.
- Count number of observations in each interval - *frequency*.
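- The construction steps above can be sketched in Python. This assumes equal-width bins, with the maximum value placed in the last bin; the function name `histogram_counts`, the data, and the choice of 3 bins (fewer than the usual 5 to 15, to keep the example small) are all hypothetical.

```python
def histogram_counts(xs, bins):
    """Count how many observations fall into each of `bins` equal-width intervals."""
    lo, hi = min(xs), max(xs)          # range of the data
    if hi == lo:
        raise ValueError("all observations are equal; cannot form intervals")
    width = (hi - lo) / bins           # width of each interval
    counts = [0] * bins
    for x in xs:
        # Index of the interval containing x; the maximum goes in the last bin.
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    return counts

data = [1, 2, 2, 3, 5, 8, 9, 10]       # hypothetical sample
print(histogram_counts(data, bins=3))  # frequency in each interval
```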
- When talking about the shape of the data, make sure to address the following 3 questions:
- 1. Does the histogram have a single, central hump or several well-separated bumps?
2. Is the histogram or boxplot **symmetric**, or more spread out in one direction (skewed)?
3. Any unusual features? e.g., outliers, spikes.
- ![image.png](../assets/image_1663237164731_0.png)
- ![image.png](../assets/image_1663237245117_0.png)
-
- #### Explanatory & Response Variables
collapsed:: true
- To identify the **explanatory** variable in a pair of variables, identify which of the two is suspected of affecting the other, and plan an appropriate analysis.
- explanatory variable -might affect-> response variable
- continent -might affect-> life expectancy.
-
- ## R Markdown
- What is **R Markdown**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:15.771Z
card-last-reviewed:: 2022-10-03T11:41:15.771Z
card-last-score:: 5
- **R Markdown** is a file format for making ^^dynamic documents in R.^^
- R Markdown is written in Markdown and contains chunks of embedded R code (data management, summaries, graphics, analysis & interpretation) all in one document.
- Documents can be **knitted** to HTML, PDF, Word, and many other formats.
- ### Key Benefits of R Markdown
- Makes it easy to produce statistical reports with code, analysis, outputs, and write-up all in one place.
- Perfect for reproducible research.
- Easy to convert to different document types.
- ### Structure
- R Markdown contains **three** types of content:
- A **YAML Header**.
- Text, formatted with Markdown.
- Code chunks.

View File

@ -0,0 +1,282 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** null
- **Next Topic:** [[Sampling]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662914505882_0.pdf)
-
- ## What is / are Statistics?
collapsed:: true
- What is a **statistic**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:29:35.052Z
card-last-reviewed:: 2022-10-01T13:29:35.052Z
card-last-score:: 5
- A **statistic** is any quantity computed from sample data.
- What is the **Science of Statistics**?
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:37:15.281Z
card-last-reviewed:: 2022-10-03T11:37:15.284Z
card-last-score:: 5
- The collecting, classifying, summarising, organising, analysing, estimating, and interpreting of information.
- What is the **role of statistics**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:10:56.000Z
card-last-reviewed:: 2022-09-30T12:10:56.000Z
card-last-score:: 5
- The field of statistics deals with the collection, presentation, analysis, and use of data to:
- make decisions
- solve problems
- design products & processes
- Statistics is the ^^science of uncertainty.^^
- What is the **role of probability** in statistics?
- **Probability** provides the framework for the study & application of statistics.
-
- ## Types of Statistics
collapsed:: true
- What is **Descriptive Statistics**? #card
card-last-interval:: 7.48
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-17T22:46:14.707Z
card-last-reviewed:: 2022-10-10T11:46:14.707Z
card-last-score:: 3
- **Descriptive Statistics** is the science of summarising data, both numerically & graphically.
- The analysis methods applicable depend on the variable being measured and the research questions that you are trying to answer.
- What is **Inferential Statistics**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:40:04.081Z
card-last-reviewed:: 2022-10-03T11:40:04.082Z
card-last-score:: 5
- **Inferential Statistics** is the science of using the ^^information in your sample^^ to ^^infer^^ something about the population of interest.
-
- ## Important Terms
collapsed:: true
- What is an **experimental unit**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:30:26.549Z
card-last-reviewed:: 2022-10-01T17:30:26.549Z
card-last-score:: 5
- An **experimental unit** / individual is a single object upon which we collect data, e.g., a person, thing, transaction, or event.
- What is a **population**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:37:17.248Z
card-last-reviewed:: 2022-10-03T11:37:17.248Z
card-last-score:: 5
- A **population** is a ^^collection of experimental units^^ / individuals that we are interested in studying.
- What is a **sample**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:40:07.495Z
card-last-reviewed:: 2022-10-03T11:40:07.496Z
card-last-score:: 5
- A **sample** is a subset of experimental units from the population.
- What is a **variable**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T10:31:37.141Z
card-last-reviewed:: 2022-10-07T10:31:37.142Z
card-last-score:: 5
- A **variable** is a ^^characteristic or property of an individual experimental unit^^.
- A variable may be measured, or more generally "observed" on each individual.
- What is **Qualitative Data**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:46:08.943Z
card-last-reviewed:: 2022-10-07T10:46:08.944Z
card-last-score:: 5
- **Qualitative Data** is data which can be classified into categories.
- Two types of Qualitative Data:
- **Ordinal:** ordered qualitative data - e.g., a grade
- **Nominal:** unordered qualitative data - e.g., a gender, a method of payment
- What is **Quantitative Data**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:29:23.069Z
card-last-reviewed:: 2022-10-06T17:29:23.069Z
card-last-score:: 5
- **Quantitative Data** is data in the form of counts or numbers - it cannot be classified into categories.
- Two types of Quantitative Data:
- **Discrete:** non-divisible, single points of data, **counts** - e.g., number of texts sent
- **Continuous:** measurements that can take any value within an interval, so there are infinitely many possible values between any two points on the scale - e.g., age, rent, temperature
-
- Pie charts make data very difficult to interpret & read - **don't use them**.
-
- ## Numerical Summaries
- ### Central Tendency
- What is a **numerical summary**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-11-01T19:10:35.516Z
card-last-interval:: 28.3
card-ease-factor:: 2.66
card-last-reviewed:: 2022-10-04T12:10:35.517Z
- A **numerical summary** is a way of summarising categorical data using a frequency count or percentage.
- How do you calculate the **sample mean**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:40:44.143Z
card-last-reviewed:: 2022-10-03T11:40:44.143Z
card-last-score:: 5
- Suppose that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$. The **sample mean**, denoted by $\bar{x}$, is:
- $$\bar{x} = \sum_{i=1}^{n}\frac{x_i}{n}=\frac{x_1+x_2+...+x_n}{n}$$
:LOGBOOK:
CLOCK: [2022-09-12 Mon 18:35:39]
:END:
- ^^The sample mean is **sensitive** to extreme values^^
- How do you calculate the **sample median**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-14T11:40:32.583Z
card-last-reviewed:: 2022-10-10T11:40:32.584Z
card-last-score:: 3
- Given that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$, arranged in **increasing order** of magnitude, the **sample median**, denoted by $\tilde{x}$, is:
- $$\tilde{x} = \begin{cases}x_{(n+1)/2}, & \text{if $n$ is odd},\\ \frac{1}{2}(x_{n/2} + x_{n/2+1}), &\text{if $n$ is even.} \\ \end{cases}$$
- ^^The sample median is **not** sensitive to extreme values.^^
- What is the **mode**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:42:24.167Z
card-last-reviewed:: 2022-10-03T11:42:24.168Z
card-last-score:: 5
- The **mode** is the most frequent observation in a dataset.
- ### Variation
collapsed:: true
- What is the **range** of a sample? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:42:37.291Z
card-last-reviewed:: 2022-10-03T11:42:37.291Z
card-last-score:: 5
- The **range** of a sample is the **maximum** - **minimum**.
- The range is a ^^poor measure of spread^^, as it uses only the two most extreme observations.
- The range is therefore ^^badly affected by outliers.^^
- #### Interquartile Range
- What is the **interquartile range** of a sample? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:43:51.820Z
card-last-reviewed:: 2022-10-10T11:43:51.821Z
card-last-score:: 3
- The **interquartile range** is the range spanned by the middle 50% of the data.
- Therefore, it is ^^robust to outliers.^^
- To calculate the **IQR**, first split the ordered data into four quarters, then subtract the value at $Q_1$ from the value at $Q_3$.
- $$IQR=Q_3-Q_1$$
- ![image.png](../assets/image_1663005545935_0.png)
- #### Tukey's Method for IQR
- There are also many other methods for calculating IQR.
- 1. Put data in **ascending** order.
2. The **lower quartile** ($Q_1$) is the **median** of the **lower** 50% of the data, including the median.
3. The **upper quartile** ($Q_3$) is the **median** of the **upper** 50% of the data, including the median.
- #### Standard Deviation #card
card-last-interval:: 3.33
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T22:24:00.798Z
card-last-reviewed:: 2022-10-08T15:24:00.798Z
card-last-score:: 5
- A common measure of spread is the **standard deviation**, which takes into account how far *each* data value is from the mean.
- A **deviation** is the signed difference between a data point and the mean, $x_i - \bar{x}$.
- Since the sum of all the deviations is always zero, we square each deviation and find an average of the squared deviations, called the **variance**.
- We then take the positive square root of the **sample variance** to get the **sample standard deviation**, which is preferable to the sample variance because it is in the same units as the data, whereas the sample variance is in squared units.
- The **standard deviation** is ^^sensitive to outliers.^^
- How do you calculate the **sample variance**, and hence, the **sample standard deviation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T10:46:54.597Z
card-last-reviewed:: 2022-10-07T10:46:54.597Z
card-last-score:: 5
- The **sample variance**, denoted by $s^2$, is given by:
- $$s^2=\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
- The **sample standard deviation**, denoted by $s$, is the **positive square root** of $s^2$, that is:
- $$s=\sqrt{s^2}$$
- ### Shape
- #### Graphical Summaries of Data
- Depends on the variable of interest.
- **Categorical** response variable -> bar chart or pie chart.
- **Categorical** response variable ^^with an explanatory variable^^ -> grouped bar chart.
- **Continuous** response variable -> histogram, boxplot, density plot.
- **Continuous** response variable ^^with an explanatory variable^^ -> grouped boxplot.
-
- What is a **boxplot**? #card
card-last-interval:: 2.8
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T10:00:03.971Z
card-last-reviewed:: 2022-10-08T15:00:03.971Z
card-last-score:: 5
- A **boxplot** is a graphical display showing centre, spread, shape, & outliers.
- It displays the **5-number summary**:
- *min, Q1, median, Q3, max*
- ![image.png](../assets/image_1663236210540_0.png)
- What is a **histogram**? #card
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-15T07:07:49.634Z
card-last-reviewed:: 2022-10-04T12:07:49.636Z
card-last-score:: 5
- **Histograms** are useful to show the general shape, location, and spread of data values.
- Representation by *area*.
- **Construction**
- Determine range of data *minimum, maximum*.
- Split into convenient intervals or *bins*.
- Usually use 5 to 15 intervals.
- Count number of observations in each interval - *frequency*.
- When talking about the shape of the data, make sure to address the following 3 questions:
- 1. Does the histogram have a single, central hump or several well-separated bumps?
2. Is the histogram or boxplot **symmetric**, or more spread out in one direction (skewed)?
3. Any unusual features? e.g., outliers, spikes.
- ![image.png](../assets/image_1663237164731_0.png)
- ![image.png](../assets/image_1663237245117_0.png)
-
- #### Explanatory & Response Variables
collapsed:: true
- To identify the **explanatory** variable in a pair of variables, identify which of the two is suspected of affecting the other, and plan an appropriate analysis.
- explanatory variable -might affect-> response variable
- continent -might affect-> life expectancy.
-
- ## R Markdown
- What is **R Markdown**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:15.771Z
card-last-reviewed:: 2022-10-03T11:41:15.771Z
card-last-score:: 5
- **R Markdown** is a file format for making ^^dynamic documents in R.^^
- R Markdown is written in Markdown and contains chunks of embedded R code (data management, summaries, graphics, analysis & interpretation) all in one document.
- Documents can be **knitted** to HTML, PDF, Word, and many other formats.
- ### Key Benefits of R Markdown
- Makes it easy to produce statistical reports with code, analysis, outputs, and write-up all in one place.
- Perfect for reproducible research.
- Easy to convert to different document types.
- ### Structure
- R Markdown contains **three** types of content:
- A **YAML Header**.
- Text, formatted with Markdown.
- Code chunks.

View File

@ -0,0 +1,282 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** null
- **Next Topic:** [[Sampling]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662914505882_0.pdf)
-
- ## What is / are Statistics?
collapsed:: true
- What is a **statistic**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:39:44.342Z
card-last-score:: 1
- A **statistic** is any quantity computed from sample data.
- What is the **Science of Statistics**?
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:37:15.281Z
card-last-reviewed:: 2022-10-03T11:37:15.284Z
card-last-score:: 5
- The collecting, classifying, summarising, organising, analysing, estimating, and interpreting of information.
- What is the **role of statistics**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:10:56.000Z
card-last-reviewed:: 2022-09-30T12:10:56.000Z
card-last-score:: 5
- The field of statistics deals with the collection, presentation, analysis, and use of data to:
- make decisions
- solve problems
- design products & processes
- Statistics is the ^^science of uncertainty.^^
- What is the **role of probability** in statistics?
- **Probability** provides the framework for the study & application of statistics.
-
- ## Types of Statistics
collapsed:: true
- What is **Descriptive Statistics**? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-30T04:36:51.695Z
card-last-reviewed:: 2022-11-18T18:36:51.696Z
card-last-score:: 3
- **Descriptive Statistics** is the science of summarising data, both numerically & graphically.
- The analysis methods applicable depend on the variable being measured and the research questions that you are trying to answer.
- What is **Inferential Statistics**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.8
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:51:33.945Z
card-last-score:: 1
- **Inferential Statistics** is the science of using the ^^information in your sample^^ to ^^infer^^ something about the population of interest.
-
- ## Important Terms
collapsed:: true
- What is an **experimental unit**? #card
collapsed:: true
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:44:53.701Z
card-last-reviewed:: 2022-11-14T16:44:53.701Z
card-last-score:: 5
- An **experimental unit** / individual is a single object upon which we collect data, e.g., a person, thing, transaction, or event.
- What is a **population**? #card
collapsed:: true
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:49:52.921Z
card-last-reviewed:: 2022-11-14T16:49:52.922Z
card-last-score:: 5
- A **population** is a ^^collection of experimental units^^ / individuals that we are interested in studying.
- What is a **sample**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:51:36.002Z
card-last-reviewed:: 2022-11-14T16:51:36.002Z
card-last-score:: 5
- A **sample** is a subset of experimental units from the population.
- What is a **variable**? #card
collapsed:: true
card-last-interval:: 10.64
card-repeats:: 3
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-25T07:35:49.776Z
card-last-reviewed:: 2022-11-14T16:35:49.776Z
card-last-score:: 5
- A **variable** is a ^^characteristic or property of an individual experimental unit^^.
- A variable may be measured, or more generally "observed" on each individual.
- What is **Qualitative Data**? #card
card-last-interval:: 8.72
card-repeats:: 3
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-23T09:37:05.082Z
card-last-reviewed:: 2022-11-14T16:37:05.082Z
card-last-score:: 3
- **Qualitative Data** is data which can be classified into categories.
- Two types of Qualitative Data:
- **Ordinal:** ordered qualitative data - e.g., a grade
- **Nominal:** unordered qualitative data - e.g., a gender, a method of payment
- What is **Quantitative Data**? #card
card-last-interval:: 30.47
card-repeats:: 4
card-ease-factor:: 2.76
card-next-schedule:: 2022-12-15T07:21:06.252Z
card-last-reviewed:: 2022-11-14T20:21:06.252Z
card-last-score:: 5
- **Quantitative Data** is data in the form of counts or numbers - it cannot be classified into categories.
- Two types of Quantitative Data:
- **Discrete:** non-divisible, single points of data, **counts** - e.g., number of texts sent
- **Continuous:** measurements that can take any value within an interval, so there are infinitely many possible values between any two points on the scale - e.g., age, rent, temperature
-
- Pie charts make data very difficult to interpret & read - **don't use them**.
-
- ## Numerical Summaries
- ### Central Tendency
- What is a **numerical summary**? #card
card-last-score:: 5
card-repeats:: 5
card-next-schedule:: 2023-02-06T22:21:22.599Z
card-last-interval:: 84.1
card-ease-factor:: 2.76
card-last-reviewed:: 2022-11-14T20:21:22.599Z
- A **numerical summary** is a way of summarising categorical data using a frequency count or percentage.
- How do you calculate the **sample mean**? #card
card-last-interval:: 29.26
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-12-13T22:47:38.996Z
card-last-reviewed:: 2022-11-14T16:47:38.997Z
card-last-score:: 5
- Suppose that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$. The **sample mean**, denoted by $\bar{x}$, is:
- $$\bar{x} = \sum_{i=1}^{n}\frac{x_i}{n}=\frac{x_1+x_2+...+x_n}{n}$$
:LOGBOOK:
CLOCK: [2022-09-12 Mon 18:35:39]
:END:
- ^^The sample mean is **sensitive** to extreme values^^
- How do you calculate the **sample median**? #card
card-last-interval:: 8.72
card-repeats:: 3
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-23T09:49:27.636Z
card-last-reviewed:: 2022-11-14T16:49:27.637Z
card-last-score:: 5
- Given that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$, arranged in **increasing order** of magnitude, the **sample median**, denoted by $\tilde{x}$, is:
- $$\tilde{x} = \begin{cases}x_{(n+1)/2}, & \text{if $n$ is odd},\\ \frac{1}{2}(x_{n/2} + x_{n/2+1}), &\text{if $n$ is even.} \\ \end{cases}$$
- ^^The sample median is **not** sensitive to extreme values.^^
- What is the **mode**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:51:42.814Z
card-last-reviewed:: 2022-11-14T16:51:42.815Z
card-last-score:: 5
- The **mode** is the most frequent observation in a dataset.
- ### Variation
collapsed:: true
- What is the **range** of a sample? #card
collapsed:: true
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:51:53.283Z
card-last-reviewed:: 2022-11-14T16:51:53.283Z
card-last-score:: 5
- The **range** of a sample is the **maximum** - **minimum**.
- The range is a ^^poor measure of spread^^, as it uses only the two most extreme observations.
- The range is therefore ^^badly affected by outliers.^^
- #### Interquartile Range
- What is the **interquartile range** of a sample? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-26T06:18:35.901Z
card-last-reviewed:: 2022-11-14T20:18:35.901Z
card-last-score:: 5
- The **interquartile range** is the range spanned by the middle 50% of the data.
- Therefore, it is ^^robust to outliers.^^
- To calculate the **IQR**, first split the ordered data into four quarters, then subtract the value at $Q_1$ from the value at $Q_3$.
- $$IQR=Q_3-Q_1$$
- ![image.png](../assets/image_1663005545935_0.png)
- #### Tukey's Method for IQR
- There are also many other methods for calculating IQR.
- 1. Put data in **ascending** order.
2. The **lower quartile** ($Q_1$) is the **median** of the **lower** 50% of the data, including the median.
3. The **upper quartile** ($Q_3$) is the **median** of the **upper** 50% of the data, including the median.
- #### Standard Deviation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:40:52.313Z
card-last-score:: 1
- A common measure of spread is the **standard deviation**, which takes into account how far *each* data value is from the mean.
- A **deviation** is the signed difference between a data point and the mean, $x_i - \bar{x}$.
- Since the sum of all the deviations is always zero, we square each deviation and find an average of the squared deviations, called the **variance**.
- We then take the positive square root of the **sample variance** to get the **sample standard deviation**, which is preferable to the sample variance because it is in the same units as the data, whereas the sample variance is in squared units.
- The **standard deviation** is ^^sensitive to outliers.^^
- How do you calculate the **sample variance**, and hence, the **sample standard deviation**? #card
card-last-interval:: 10.97
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-25T15:37:19.443Z
card-last-reviewed:: 2022-11-14T16:37:19.444Z
card-last-score:: 3
- The **sample variance**, denoted by $s^2$, is given by:
- $$s^2=\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
- The **sample standard deviation**, denoted by $s$, is the **positive square root** of $s^2$, that is:
- $$s=\sqrt{s^2}$$
- ### Shape
- #### Graphical Summaries of Data
- Depends on the variable of interest.
- **Categorical** response variable -> bar chart or pie chart.
- **Categorical** response variable ^^with an explanatory variable^^ -> grouped bar chart.
- **Continuous** response variable -> histogram, boxplot, density plot.
- **Continuous** response variable ^^with an explanatory variable^^ -> grouped boxplot.
-
- What is a **boxplot**? #card
card-last-interval:: 5.52
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-20T04:34:53.491Z
card-last-reviewed:: 2022-11-14T16:34:53.492Z
card-last-score:: 3
- A **boxplot** is a graphical display showing centre, spread, shape, & outliers.
- It displays the **5-number summary**:
- *min, Q1, median, Q3, max*
- ![image.png](../assets/image_1663236210540_0.png)
- What is a **histogram**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:00:35.072Z
card-last-score:: 1
- **Histograms** are useful to show the general shape, location, and spread of data values.
- Representation by *area*.
- **Construction**
- Determine range of data *minimum, maximum*.
- Split into convenient intervals or *bins*.
- Usually use 5 to 15 intervals.
- Count number of observations in each interval - *frequency*.
- When talking about the shape of the data, make sure to address the following 3 questions:
- 1. Does the histogram have a single, central hump or several well-separated bumps?
2. Is the histogram or boxplot **symmetric**, or more spread out in one direction (skewed)?
3. Are there any unusual features, e.g., outliers or spikes?
- ![image.png](../assets/image_1663237164731_0.png)
- ![image.png](../assets/image_1663237245117_0.png)
-
- #### Explanatory & Response Variables
collapsed:: true
- To identify the **explanatory** variable in a pair of variables, identify which of the two is suspected of affecting the other and plan an appropriate analysis.
- explanatory variable -might affect-> response variable
- continent -might affect-> life expectancy.
-
- ## R Markdown
- What is **R Markdown**? #card
card-last-interval:: 28.93
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-12-13T14:49:15.636Z
card-last-reviewed:: 2022-11-14T16:49:15.637Z
card-last-score:: 5
- **R Markdown** is a file format for making ^^dynamic documents in R.^^
- R Markdown is written in Markdown and contains chunks of embedded R code (data management, summaries, graphics, analysis & interpretation) all in one document.
- Documents can be **knitted** to HTML, PDF, Word, and many other formats.
- ### Key Benefits of R Markdown
- Makes it easy to produce statistical reports with code, analysis, outputs, and write-up all in one place.
- Perfect for reproducible research.
- Easy to convert to different document types.
- ### Structure
- R Markdown contains **three** types of content:
- A **YAML Header**.
- Text, formatted with Markdown.
- Code chunks.


@ -0,0 +1,282 @@
- #[[ST2001 - Statistics in Data Science I]]
- **Previous Topic:** null
- **Next Topic:** [[Sampling]]
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662914505882_0.pdf)
-
- ## What is / are Statistics?
collapsed:: true
- What is a **statistic**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:39:44.342Z
card-last-score:: 1
- A **statistic** is any quantity computed from sample data.
- What is the **Science of Statistics**?
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:37:15.281Z
card-last-reviewed:: 2022-10-03T11:37:15.284Z
card-last-score:: 5
- The collecting, classifying, summarising, organising, analysing, estimation, and interpretation of information.
- What is the **role of statistics**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:10:56.000Z
card-last-reviewed:: 2022-09-30T12:10:56.000Z
card-last-score:: 5
- The field of statistics deals with the collection, presentation, analysis, and use of data to:
- make decisions
- solve problems
- design products & processes
- Statistics is the ^^science of uncertainty.^^
- What is the **role of probability** in statistics?
- **Probability** provides the framework for the study & application of statistics.
-
- ## Types of Statistics
collapsed:: true
- What is **Descriptive Statistics**? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-30T04:36:51.695Z
card-last-reviewed:: 2022-11-18T18:36:51.696Z
card-last-score:: 3
- **Descriptive Statistics** is the science of summarising data, both numerically & graphically.
- The applicable analysis methods depend on the variable being measured and the research questions that you are trying to answer.
- What is **Inferential Statistics**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.8
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:51:33.945Z
card-last-score:: 1
- **Inferential Statistics** is the science of using the ^^information in your sample^^ to ^^infer^^ something about the population of interest.
-
- ## Important Terms
collapsed:: true
- What is an **experimental unit**? #card
collapsed:: true
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:44:53.701Z
card-last-reviewed:: 2022-11-14T16:44:53.701Z
card-last-score:: 5
- An **experimental unit** / individual is a single object upon which we collect data, e.g., a person, thing, transaction, or event.
- What is a **population**? #card
collapsed:: true
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:49:52.921Z
card-last-reviewed:: 2022-11-14T16:49:52.922Z
card-last-score:: 5
- A **population** is a ^^collection of experimental units^^ / individuals that we are interested in studying.
- What is a **sample**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:51:36.002Z
card-last-reviewed:: 2022-11-14T16:51:36.002Z
card-last-score:: 5
- A **sample** is a subset of experimental units from the population.
- What is a **variable**? #card
collapsed:: true
card-last-interval:: 10.64
card-repeats:: 3
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-25T07:35:49.776Z
card-last-reviewed:: 2022-11-14T16:35:49.776Z
card-last-score:: 5
- A **variable** is a ^^characteristic or property of an individual experimental unit^^.
- A variable may be measured, or more generally "observed" on each individual.
- What is **Qualitative Data**? #card
card-last-interval:: 8.72
card-repeats:: 3
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-23T09:37:05.082Z
card-last-reviewed:: 2022-11-14T16:37:05.082Z
card-last-score:: 3
- **Qualitative Data** is data which can be classified into categories.
- Two types of Qualitative Data:
- **Ordinal:** ordered qualitative data - e.g., a grade
- **Nominal:** unordered qualitative data - e.g., a gender, a method of payment
- What is **Quantitative Data**? #card
card-last-interval:: 30.47
card-repeats:: 4
card-ease-factor:: 2.76
card-next-schedule:: 2022-12-15T07:21:06.252Z
card-last-reviewed:: 2022-11-14T20:21:06.252Z
card-last-score:: 5
- **Quantitative Data** is data in the form of counts or numbers - it cannot be classified into categories.
- Two types of Quantitative Data:
- **Discrete:** non-divisible, single points of data, **counts** - e.g., number of texts sent
- **Continuous:** measurements that can take any value on a number scale, with infinitely many possible values between any two whole numbers - e.g., age, rent, temperature
-
- Pie charts make data very difficult to interpret & read - **don't use them**.
-
- ## Numerical Summaries
- ### Central Tendency
- What is a **numerical summary**? #card
card-last-score:: 5
card-repeats:: 5
card-next-schedule:: 2023-02-06T22:21:22.599Z
card-last-interval:: 84.1
card-ease-factor:: 2.76
card-last-reviewed:: 2022-11-14T20:21:22.599Z
- A **numerical summary** is a way of summarising categorical data using a frequency count or percentage.
- How do you calculate the **sample mean**? #card
card-last-interval:: 29.26
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-12-13T22:47:38.996Z
card-last-reviewed:: 2022-11-14T16:47:38.997Z
card-last-score:: 5
- Suppose that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$. The **sample mean**, denoted by $\bar{x}$, is:
- $$\bar{x} = \sum_{i=1}^{n}\frac{x_i}{n}=\frac{x_1+x_2+...+x_n}{n}$$
:LOGBOOK:
CLOCK: [2022-09-12 Mon 18:35:39]
:END:
- ^^The sample mean is **sensitive** to extreme values^^
- How do you calculate the **sample median**? #card
card-last-interval:: 8.72
card-repeats:: 3
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-23T09:49:27.636Z
card-last-reviewed:: 2022-11-14T16:49:27.637Z
card-last-score:: 5
- Given that the observations in a sample are $x_1,\ x_2,\ ...\ ,\ x_n$, arranged in **increasing order** of magnitude, the **sample median** is:
- $$\tilde{x} = \begin{cases}x_{(n+1)/2}, & \text{if $n$ is odd},\\ \frac{1}{2}(x_{n/2} + x_{n/2+1}), &\text{if $n$ is even.} \\ \end{cases}$$
- ^^The sample median is **not** sensitive to extreme values.^^
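- The mean and median calculations above can be sketched in Java. This is a minimal sketch under assumptions: the class name `CentralTendency` and its method names are made up for illustration, and the array index arithmetic is the 0-based equivalent of the 1-based formula in the notes.
```java
// Hypothetical helper class, written only to illustrate the definitions above.
class CentralTendency
{
    // Sample mean: sum of the observations divided by n.
    static double mean(double[] xs)
    {
        if (xs.length == 0)
        {
            throw new IllegalArgumentException("empty sample");
        }
        double sum = 0.0;
        for (double x : xs)
        {
            sum += x;
        }
        return sum / xs.length;
    }

    // Sample median: middle value of the sorted data (odd n),
    // or the average of the two middle values (even n).
    static double median(double[] xs)
    {
        if (xs.length == 0)
        {
            throw new IllegalArgumentException("empty sample");
        }
        double[] sorted = xs.clone();      // leave the caller's array untouched
        java.util.Arrays.sort(sorted);     // increasing order of magnitude
        int n = sorted.length;
        if (n % 2 == 1)
        {
            return sorted[n / 2];
        }
        return 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }
}
```
- With the made-up sample 1, 2, 3, 4, 100, the mean is 22 but the median is 3, which illustrates how the extreme value pulls the mean but not the median.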
- What is the **mode**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:51:42.814Z
card-last-reviewed:: 2022-11-14T16:51:42.815Z
card-last-score:: 5
- The **mode** is the most frequent observation in a dataset.
- ### Variation
collapsed:: true
- What is the **range** of a sample? #card
collapsed:: true
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:51:53.283Z
card-last-reviewed:: 2022-11-14T16:51:53.283Z
card-last-score:: 5
- The **range** of a sample is the **maximum** - **minimum**.
- The range is a ^^poor measure of spread and is badly affected by outliers.^^
- #### Interquartile Range
- What is the **interquartile range** of a sample? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-26T06:18:35.901Z
card-last-reviewed:: 2022-11-14T20:18:35.901Z
card-last-score:: 5
- The **interquartile range** is the range of the middle 50% of the data.
- Therefore, it is ^^robust to outliers.^^
- To calculate the **IQR**, first split the ordered data into 4 quarters, then subtract the value at $Q_1$ from the value at $Q_3$.
- $$IQR=Q_3-Q_1$$
- ![image.png](../assets/image_1663005545935_0.png)
- #### Tukey's Method for IQR
- There are also many other methods for calculating IQR.
- 1. Put data in **ascending** order.
2. The **lower quartile** ($Q_1$) is the **median** of the **lower** 50% of the data, including the median.
3. The **upper quartile** ($Q_3$) is the **median** of the **upper** 50% of the data, including the median.
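- The three steps above can be sketched in Java. This is a sketch under stated assumptions: the class name `TukeyIqr` and its methods are hypothetical, and "including the median" is interpreted as each half containing the middle observation when $n$ is odd (when $n$ is even the data simply splits into two equal halves).
```java
// Hypothetical helper class, written only to illustrate the steps above.
class TukeyIqr
{
    // Median of the sorted values in positions [from, to).
    static double median(double[] sorted, int from, int to)
    {
        int n = to - from;
        int mid = from + n / 2;
        if (n % 2 == 1)
        {
            return sorted[mid];
        }
        return 0.5 * (sorted[mid - 1] + sorted[mid]);
    }

    static double iqr(double[] xs)
    {
        if (xs.length == 0)
        {
            throw new IllegalArgumentException("empty sample");
        }
        double[] sorted = xs.clone();
        java.util.Arrays.sort(sorted);           // step 1: ascending order
        int n = sorted.length;
        int half = (n + 1) / 2;                  // each half includes the median when n is odd
        double q1 = median(sorted, 0, half);     // step 2: median of the lower half
        double q3 = median(sorted, n - half, n); // step 3: median of the upper half
        return q3 - q1;
    }
}
```
- For the made-up data 1 to 7, the lower half is 1, 2, 3, 4 ($Q_1 = 2.5$) and the upper half is 4, 5, 6, 7 ($Q_3 = 5.5$), so the IQR is 3.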
- #### Standard Deviation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:40:52.313Z
card-last-score:: 1
- A common measure of spread is the **standard deviation**, which takes into account how far *each* data value is from the mean.
- A **deviation** is the distance of a datapoint from the mean.
- Since the sum of all the deviations would be zero, we square each deviation and find an average of the squared deviations called the **variance**.
- We then take the positive square root of the **sample variance** to get the **sample standard deviation**, which is preferable to the sample variance because the sample variance is in squared units.
- The **standard deviation** is ^^sensitive to outliers.^^
- How do you calculate the **sample variance**, and hence, the **sample standard deviation**? #card
card-last-interval:: 10.97
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-25T15:37:19.443Z
card-last-reviewed:: 2022-11-14T16:37:19.444Z
card-last-score:: 3
- The **sample variance**, denoted by $s^2$, is given by:
- $$s^2=\sum_{i=1}^{n} \frac{(x_i - \bar{x})^2}{n-1}$$
- The **sample standard deviation**, denoted by $s$, is the **positive square root** of $s^2$, that is:
- $$s=\sqrt{s^2}$$
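- As a worked illustration (the data values are made up for this example), take the sample $1,\ 3,\ 5,\ 7$, so that $n=4$ and $\bar{x}=16/4=4$. The deviations from the mean are $-3,\ -1,\ 1,\ 3$, which sum to zero, so:
- $$s^2=\frac{(-3)^2+(-1)^2+1^2+3^2}{4-1}=\frac{20}{3}\approx 6.67, \qquad s=\sqrt{\frac{20}{3}}\approx 2.58$$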
- ### Shape
- #### Graphical Summaries of Data
- Depends on the variable of interest.
- **Categorical** response variable -> bar chart or pie chart.
- **Categorical** response variable ^^with an explanatory variable^^ -> grouped bar chart.
- **Continuous** response variable -> histogram, boxplot, density plot.
- **Continuous** response variable ^^with an explanatory variable^^ -> grouped boxplot.
-
- What is a **boxplot**? #card
card-last-interval:: 5.52
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-20T04:34:53.491Z
card-last-reviewed:: 2022-11-14T16:34:53.492Z
card-last-score:: 3
- A **boxplot** is a graphical display showing centre, spread, shape, & outliers.
- It displays the **5-number summary**:
- *min, Q1, median, Q3, max*
- ![image.png](../assets/image_1663236210540_0.png)
- What is a **histogram**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:00:35.072Z
card-last-score:: 1
- **Histograms** are useful to show the general shape, location, and spread of data values.
- Representation by *area*.
- **Construction**
- Determine range of data *minimum, maximum*.
- Split into convenient intervals or *bins*.
- Usually use 5 to 15 intervals.
- Count number of observations in each interval - *frequency*.
- When talking about the shape of the data, make sure to address the following 3 questions:
- 1. Does the histogram have a single, central hump or several well-separated bumps?
2. Is the histogram or boxplot **symmetric**, or more spread out in one direction (skewed)?
3. Are there any unusual features, e.g., outliers or spikes?
- ![image.png](../assets/image_1663237164731_0.png)
- ![image.png](../assets/image_1663237245117_0.png)
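- The histogram construction steps (find the range, split it into bins, count the frequency in each bin) can be sketched in Java. This is only an illustrative sketch: the class `HistogramBins` is hypothetical, and equal-width bins with the maximum placed in the last bin are assumptions made for the example.
```java
// Hypothetical helper class, written only to illustrate the construction steps.
class HistogramBins
{
    // Returns the frequency of observations in each of `bins` equal-width intervals.
    static int[] frequencies(double[] xs, int bins)
    {
        if (xs.length == 0 || bins < 1)
        {
            throw new IllegalArgumentException("need data and at least one bin");
        }
        // Determine the range of the data: minimum and maximum.
        double min = xs[0];
        double max = xs[0];
        for (double x : xs)
        {
            min = Math.min(min, x);
            max = Math.max(max, x);
        }
        // Split the range into equal-width bins.
        double width = (max - min) / bins;
        int[] counts = new int[bins];
        // Count the number of observations in each bin.
        for (double x : xs)
        {
            int i = (width == 0.0) ? 0 : (int) ((x - min) / width);
            if (i >= bins)
            {
                i = bins - 1;   // the maximum value belongs to the last bin
            }
            counts[i]++;
        }
        return counts;
    }
}
```
- For the made-up data 0, 1, 2, 3, 4, 5, 6, 7, 8, 10 and 5 bins, each bin has width 2 and a frequency of 2.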
-
- #### Explanatory & Response Variables
collapsed:: true
- To identify the **explanatory** variable in a pair of variables, identify which of the two is suspected of affecting the other and plan an appropriate analysis.
- explanatory variable -might affect-> response variable
- continent -might affect-> life expectancy.
-
- ## R Markdown
- What is **R Markdown**? #card
card-last-interval:: 28.93
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-12-13T14:49:15.636Z
card-last-reviewed:: 2022-11-14T16:49:15.637Z
card-last-score:: 5
- **R Markdown** is a file format for making ^^dynamic documents in R.^^
- R Markdown is written in Markdown and contains chunks of embedded R code (data management, summaries, graphics, analysis & interpretation) all in one document.
- Documents can be **knitted** to HTML, PDF, Word, and many other formats.
- ### Key Benefits of R Markdown
- Makes it easy to produce statistical reports with code, analysis, outputs, and write-up all in one place.
- Perfect for reproducible research.
- Easy to convert to different document types.
- ### Structure
- R Markdown contains **three** types of content:
- A **YAML Header**.
- Text, formatted with Markdown.
- Code chunks.


@ -0,0 +1,194 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Introduction to Java]]
- **Next Topic:** [[More Java Code]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663059993088_0.pdf)
-
- What is the **structure of a class**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:29:57.359Z
card-last-score:: 1
- Every class has the following structure:
- ```java
public class ClassName
{
// Fields
// Constructors
// Methods
}
```
- ## Fields
- What are **Fields**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:55:21.827Z
card-last-reviewed:: 2022-09-18T14:55:21.828Z
card-last-score:: 3
- **Fields**, also known as **instance variables**, store values for an object.
- Fields define the state of an object.
- In BlueJ, use *Inspect* to view the state.
- Some values change frequently, others rarely, or not at all.
- ## Encapsulation
- What is **Encapsulation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:22:08.303Z
card-last-reviewed:: 2022-09-19T18:22:08.303Z
card-last-score:: 3
- In **encapsulation**, the ^^variables of a class will be hidden from other classes^^ and can only be accessed through the methods of their current class.
- This is also known as **data hiding**.
- Why use encapsulation? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:23:15.213Z
card-last-reviewed:: 2022-09-19T18:23:15.213Z
card-last-score:: 3
- In OOP, ^^each object is responsible for its own data.^^
- This allows an object to have greater control over which data is available to be viewed externally, and how external objects can mutate the object's state.
- ### Encapsulation Type: Private
- What is the effect of making a field **private**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:25:31.450Z
card-last-reviewed:: 2022-09-19T18:25:31.450Z
card-last-score:: 5
- Making a field **private** encapsulates its value inside its object.
- No code outside the class can access a private field directly.
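- A minimal sketch of a private field that is only reachable through methods. The class `Counter` and its method names are hypothetical, chosen only for illustration.
```java
// Hypothetical class, written only to illustrate a private field.
class Counter
{
    private int count;   // hidden: code in other classes cannot read or write this directly

    // The only way for other classes to change the state.
    public void increment()
    {
        count++;
    }

    // The only way for other classes to read the state.
    public int getCount()
    {
        return count;
    }
}
```
- Code in another class can call `increment()` and `getCount()`, but an expression such as `someCounter.count` written outside `Counter` would be rejected by the compiler.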
-
- ## Constructors
- What are **constructors**? #card
card-last-interval:: 3.02
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T18:30:51.722Z
card-last-reviewed:: 2022-09-19T18:30:51.722Z
card-last-score:: 3
- Constructors:
- Initialise an object.
- Have the same name as their class.
- Have a close association with the fields:
- They store the initial values in the fields.
- They often receive parameter values for this.
- What is the point of the keyword `this`? #card
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-09-22T15:21:45.332Z
card-last-interval:: 4
card-ease-factor:: 2.7
card-last-reviewed:: 2022-09-18T15:21:45.332Z
- If the input parameter variables in your constructor have the **same name** as your fields, you must use the `this` keyword to distinguish between the two.
- `this` = "belonging to this object".
- E.g.,
- ```java
public Bicycle(int speed, int gear, int cadence)
{
this.speed = speed;
this.gear = gear;
this.cadence = cadence;
}
```
-
- ## Methods
- What are **methods**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:06:15.300Z
card-last-reviewed:: 2022-09-19T18:06:15.300Z
card-last-score:: 3
- **Methods** implement the *behaviour* of an object.
- They have a consistent structure comprised of a *header* and a *body*.
- ### Accessor Methods
- What are **accessor** methods? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:22:14.771Z
card-last-score:: 1
- **Accessor** methods provide information about the state of an object.
- An accessor method always returns a type that is **not** `void`.
- An accessor method returns a value (*result*) of the type given in the **header**.
- The method will contain a **return** statement to return the value.
- ### Mutator Methods
- What are **mutator** methods? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:23:53.666Z
card-last-reviewed:: 2022-09-19T18:23:53.666Z
card-last-score:: 3
- **Mutator** methods alter (*mutate*) the state of an object.
- Achieved through changing the value of one or more fields.
- They typically contain one or more *assignment* statements.
- Often receive parameters.
- ![image.png](../assets/image_1663063179688_0.png)
- ### Mutator Methods: Set
- Each field may have a dedicated **set** mutator method.
- These have a simple, distinctive form:
- **void** return type
- method name related to the field name
- a single formal parameter, with the same type as the type of the field
- a single assignment statement
- A typical "set" method:
- ```java
public void setGear (int number)
{
gear = number;
}
```
- ### Protector Mutators
- A set method does not always have to assign unconditionally to the field.
- The parameter may be checked for validity and rejected if inappropriate.
- Mutators thereby protect fields.
- Mutators also support *encapsulation*.
- #### Protecting a Field
- ```java
public void setGear (int gearing)
{
// This conditional statement prevents inappropriate action.
// It protects the "gear" field from values that are too large or too small.
if (gearing >= 1 && gearing <= 18)
{
gear = gearing;
}
else
{
System.out.println("Exceeds maximum gear ratio. Gear not set");
}
}
```
- ### Method Structure
- The **header**:
- The header tells us:
- the *visibility* of the method to objects of other classes.
- whether or not the method *returns a result*.
- the *name* of the method.
- whether or not the method takes *parameters*.
- E.g.,
- ```java
public int getSpeed()
```
- The **body** encloses the method's *statements*.
-
- ## C vs Java
- Unlike C, an OOP program will **not** have a pool of global variables that each method can access.
- Instead, ^^each object has its own data^^, and other objects rely on the *accessor* methods of the object to access the data.
-
- ## Conditional Statements
- Conditional statements in Java have the same format as in C.
- ```java
if (condition) {
// do something
}
else {
// do something else
}
```
- ![image.png](../assets/image_1663063508214_0.png)
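- A concrete instance of the if/else format. The class `GearCheck` and the 1 to 18 range are assumptions made only for this example.
```java
// Hypothetical class, written only to illustrate an if/else statement.
class GearCheck
{
    static String describe(int gear)
    {
        if (gear >= 1 && gear <= 18) {
            return "valid";
        }
        else {
            return "invalid";
        }
    }
}
```
- Here `GearCheck.describe(5)` takes the `if` branch and `GearCheck.describe(19)` takes the `else` branch.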


@ -0,0 +1,196 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Introduction to Java]]
- **Next Topic:** [[More Java Code]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663059993088_0.pdf)
-
- What is the **structure of a class**? #card
card-last-interval:: 2.3
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-02T16:26:57.406Z
card-last-reviewed:: 2022-09-30T09:26:57.407Z
card-last-score:: 3
- Every class has the following structure:
- ```java
public class ClassName
{
// Fields
// Constructors
// Methods
}
```
- ## Fields
- What are **Fields**? #card
card-last-interval:: 8.76
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-09T06:15:08.025Z
card-last-reviewed:: 2022-09-30T12:15:08.025Z
card-last-score:: 5
- **Fields**, also known as **instance variables**, store values for an object.
- Fields define the state of an object.
- In BlueJ, use *Inspect* to view the state.
- Some values change frequently, others rarely, or not at all.
- ## Encapsulation
- What is **Encapsulation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T11:44:39.961Z
card-last-score:: 1
- In **encapsulation**, the ^^variables of a class will be hidden from other classes^^ and can only be accessed through the methods of their current class.
- This is also known as **data hiding**.
- Why use encapsulation? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-08T15:30:57.356Z
card-last-reviewed:: 2022-09-30T08:30:57.357Z
card-last-score:: 3
- In OOP, ^^each object is responsible for its own data.^^
- This allows an object to have greater control over which data is available to be viewed externally, and how external objects can mutate the object's state.
- ### Encapsulation Type: Private
- What is the effect of making a field **private**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:49.649Z
card-last-reviewed:: 2022-09-30T08:30:49.650Z
card-last-score:: 5
- Making a field **private** encapsulates its value inside its object.
- No code outside the class can access a private field directly.
-
- ## Constructors
- What are **constructors**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:02.783Z
card-last-reviewed:: 2022-10-03T11:41:02.784Z
card-last-score:: 5
- Constructors:
- Initialise an object.
- Have the same name as their class.
- Have a close association with the fields:
- They store the initial values in the fields.
- They often receive parameter values for this.
- What is the point of the keyword `this`? #card
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-14T15:39:43.016Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-03T11:39:43.018Z
- The `this` keyword refers to the current object in a method or constructor.
- The most common use of `this` is to distinguish between class attributes & parameters of the same name.
- If the input parameter variables in your constructor have the **same name** as your fields, you must use the `this` keyword to distinguish between the two.
- `this` = "belonging to this object".
- E.g.,
- ```java
public Bicycle(int speed, int gear, int cadence)
{
this.speed = speed;
this.gear = gear;
this.cadence = cadence;
}
```
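- To see why `this` matters, the constructor above can be placed in a complete class and compared with a version that omits `this`. This is only a sketch: the accessor methods and the class `BrokenBicycle` are hypothetical additions made for illustration.
```java
// Only the Bicycle constructor comes from the notes; everything else is illustrative.
class Bicycle
{
    private int speed;
    private int gear;
    private int cadence;

    public Bicycle(int speed, int gear, int cadence)
    {
        this.speed = speed;       // field (left) receives the parameter (right)
        this.gear = gear;
        this.cadence = cadence;
    }

    public int getGear()
    {
        return gear;
    }
}

// Hypothetical counter-example: the same name without `this`.
class BrokenBicycle
{
    private int speed;

    public BrokenBicycle(int speed)
    {
        speed = speed;   // assigns the parameter to itself; the field keeps its default value 0
    }

    public int getSpeed()
    {
        return speed;
    }
}
```
- `new Bicycle(10, 3, 60).getGear()` returns 3, whereas `new BrokenBicycle(10).getSpeed()` returns 0 because the parameter shadows the field.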
-
- ## Methods
- What are **methods**? #card
card-last-score:: 3
card-repeats:: 3
card-next-schedule:: 2022-10-08T19:10:46.142Z
card-last-interval:: 8.32
card-ease-factor:: 2.08
card-last-reviewed:: 2022-09-30T12:10:46.142Z
- **Methods** implement the *behaviour* of an object.
- They have a consistent structure comprised of a *header* and a *body*.
- ### Accessor Methods
- What are **accessor** methods? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T09:21:53.359Z
card-last-reviewed:: 2022-09-30T09:21:53.359Z
card-last-score:: 5
- **Accessor** methods provide information about the state of an object.
- An accessor method always returns a type that is **not** `void`.
- An accessor method returns a value (*result*) of the type given in the **header**.
- The method will contain a **return** statement to return the value.
- ### Mutator Methods
- What are **mutator** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:27:22.160Z
card-last-reviewed:: 2022-10-03T14:27:22.161Z
card-last-score:: 5
- **Mutator** methods alter (*mutate*) the state of an object.
- Achieved through changing the value of one or more fields.
- They typically contain one or more *assignment* statements.
- Often receive parameters.
- ![image.png](../assets/image_1663063179688_0.png)
- ### Mutator Methods: Set
- Each field may have a dedicated **set** mutator method.
- These have a simple, distinctive form:
- **void** return type
- method name related to the field name
- a single formal parameter, with the same type as the type of the field
- a single assignment statement
- A typical "set" method:
- ```java
public void setGear (int number)
{
gear = number;
}
```
- ### Protector Mutators
- A set method does not always have to assign unconditionally to the field.
- The parameter may be checked for validity and rejected if inappropriate.
- Mutators thereby protect fields.
- Mutators also support *encapsulation*.
- #### Protecting a Field
- ```java
public void setGear (int gearing)
{
// This conditional statement prevents inappropriate action.
// It protects the "gear" field from values that are too large or too small.
if (gearing >= 1 && gearing <= 18)
{
gear = gearing;
}
else
{
System.out.println("Exceeds maximum gear ratio. Gear not set");
}
}
```
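- The guarded setter can be tried out in a small complete class. This is a sketch under assumptions: the starting gear of 1, the `getGear` accessor, and the wording of the message are made up for the example.
```java
// Hypothetical complete class built around the guarded setter.
class Bicycle
{
    private int gear = 1;   // assumed starting gear, chosen for the example

    public void setGear(int gearing)
    {
        // Only values in the valid range 1..18 are stored.
        if (gearing >= 1 && gearing <= 18)
        {
            gear = gearing;
        }
        else
        {
            System.out.println("Gear must be between 1 and 18. Gear not set");
        }
    }

    public int getGear()
    {
        return gear;
    }
}
```
- After `setGear(5)` the gear is 5. A later `setGear(42)` or `setGear(0)` prints the message and leaves the gear at 5, so the field never holds an invalid value.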
- ### Method Structure
- The **header**:
- The header tells us:
- the *visibility* of the method to objects of other classes.
- whether or not the method *returns a result*.
- the *name* of the method.
- whether or not the method takes *parameters*.
- E.g.,
- ```java
public int getSpeed()
```
- The **body** encloses the method's *statements*.
-
- ## C vs Java
- Unlike C, an OOP program will **not** have a pool of global variables that each method can access.
- Instead, ^^each object has its own data^^, and other objects rely on the *accessor* methods of the object to access the data.
-
- ## Conditional Statements
- Conditional statements in Java have the same format as in C.
- ```java
if (condition) {
// do something
}
else {
// do something else
}
```
- ![image.png](../assets/image_1663063508214_0.png)


@ -0,0 +1,196 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Introduction to Java]]
- **Next Topic:** [[More Java Code]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663059993088_0.pdf)
-
- What is the **structure of a class**? #card
card-last-interval:: 9.84
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-16T13:11:17.984Z
card-last-reviewed:: 2022-10-06T17:11:17.985Z
card-last-score:: 5
- Every class has the following structure:
- ```java
public class ClassName
{
// Fields
// Constructors
// Methods
}
```
- ## Fields
- What are **Fields**? #card
card-last-interval:: 8.76
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-09T06:15:08.025Z
card-last-reviewed:: 2022-09-30T12:15:08.025Z
card-last-score:: 5
- **Fields**, also known as **instance variables**, store values for an object.
- Fields define the state of an object.
- In BlueJ, use *Inspect* to view the state.
- Some values change frequently, others rarely, or not at all.
- ## Encapsulation
- What is **Encapsulation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:23:51.445Z
card-last-reviewed:: 2022-10-07T10:23:51.446Z
card-last-score:: 5
- In **encapsulation**, the ^^variables of a class will be hidden from other classes^^ and can only be accessed through the methods of their current class.
- This is also known as **data hiding**.
- Why use encapsulation? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-09T23:00:00.000Z
card-last-reviewed:: 2022-10-09T08:51:16.259Z
card-last-score:: 1
- In OOP, ^^each object is responsible for its own data.^^
- This allows an object to have greater control over which data is available to be viewed externally, and how external objects can mutate the object's state.
- ### Encapsulation Type: Private
- What is the effect of making a field **private**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:49.649Z
card-last-reviewed:: 2022-09-30T08:30:49.650Z
card-last-score:: 5
- Making a field **private** encapsulates its value inside its object.
- No code outside the class can access a private field directly.
-
- ## Constructors
- What are **constructors**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:02.783Z
card-last-reviewed:: 2022-10-03T11:41:02.784Z
card-last-score:: 5
- Constructors:
- Initialise an object.
- Have the same name as their class.
- Have a close association with the fields:
- They store the initial values in the fields.
- They often receive parameter values for this.
- What is the point of the keyword `this`? #card
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-14T15:39:43.016Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-03T11:39:43.018Z
- The `this` keyword refers to the current object in a method or constructor.
- The most common use of `this` is to distinguish between class attributes & parameters of the same name.
- If the input parameter variables in your constructor have the **same name** as your fields, you must use the `this` keyword to distinguish between the two.
- `this` = "belonging to this object".
- E.g.,
- ```java
public Bicycle(int speed, int gear, int cadence)
{
this.speed = speed;
this.gear = gear;
this.cadence = cadence;
}
```
-
- ## Methods
- What are **methods**? #card
card-last-score:: 3
card-repeats:: 4
card-next-schedule:: 2022-10-24T09:51:35.296Z
card-last-interval:: 15.05
card-ease-factor:: 1.94
card-last-reviewed:: 2022-10-09T08:51:35.296Z
- **Methods** implement the *behaviour* of an object.
- They have a consistent structure comprised of a *header* and a *body*.
- ### Accessor Methods
- What are **accessor** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:14.037Z
card-last-reviewed:: 2022-10-07T10:19:14.038Z
card-last-score:: 5
- **Accessor** methods provide information about the state of an object.
- An accessor method always returns a type that is **not** `void`.
- An accessor method returns a value (*result*) of the type given in the **header**.
- The method will contain a **return** statement to return the value.
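- A minimal sketch of an accessor, assuming a hypothetical `Bicycle` class with a private `speed` field (the class body and the `AccessorDemo` name are illustrative, not taken from the lecture):
- ```java
// Hypothetical class for illustration only.
class Bicycle
{
    private int speed;

    public Bicycle(int speed)
    {
        this.speed = speed;
    }

    // Accessor: the header declares a non-void return type (int),
    // and the body contains a return statement.
    public int getSpeed()
    {
        return speed;
    }
}

class AccessorDemo
{
    public static void main(String[] args)
    {
        Bicycle bike = new Bicycle(12);
        System.out.println(bike.getSpeed()); // prints 12
    }
}
```
- The caller learns the state of the object without touching the field itself.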
- ### Mutator Methods
- What are **mutator** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:27:22.160Z
card-last-reviewed:: 2022-10-03T14:27:22.161Z
card-last-score:: 5
- **Mutator** methods alter (*mutate*) the state of an object.
- Achieved through changing the value of one or more fields.
- They typically contain one or more *assignment* statements.
- Often receive parameters.
- ![image.png](../assets/image_1663063179688_0.png)
- ### Mutator Methods: Set
- Each field may have a dedicated **set** mutator method.
- These have a simple, distinctive form:
- **void** return type
- method name related to the field name
- a single formal parameter, with the same type as the type of the field
- a single assignment statement
- A typical "set" method:
- ```java
public void setGear (int number)
{
gear = number;
}
```
- ### Protector Mutators
- A set method does not always have to assign unconditionally to the field.
- The parameter may be checked for validity and rejected if inappropriate.
- Mutators thereby protect fields.
- Mutators also support *encapsulation*.
- #### Protecting a Field
- ```java
public void setGear (int gearing)
{
// This conditional statement prevents inappropriate action.
// It protects the "gear" field from values that are too large or too small.
if (gearing >= 1 && gearing <= 18)
{
gear = gearing;
}
else
{
System.out.println("Exceeds maximum gear ratio. Gear not set");
}
}
```
- ### Method Structure
- The **header**:
- The header tells us:
- the *visibility* of the method to objects of other classes.
- whether or not the method *returns a result*.
- the *name* of the method.
- whether or not the method takes *parameters*.
- E.g.,
- ```java
public int getSpeed()
```
- The **body** encloses the method's *statements*.
-
- ## C vs Java
- Unlike C, an OOP program will **not** have a pool of global variables that each method can access.
- Instead, ^^each object has its own data^^, and other objects rely on the *accessor* methods of the object to access the data.
-
- ## Conditional Statements
- Conditional statements in Java have the same format as in C.
- ```java
if (condition) {
    // do something
}
else {
    // do something else
}
```
- ![image.png](../assets/image_1663063508214_0.png)
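- A concrete instance of the `if`/`else` template, as a sketch only: the `ConditionalDemo` class and `classify` method are hypothetical names, and the 1 to 18 range is borrowed from the `setGear` example in these notes.
- ```java
// Hypothetical helper class for illustration only.
class ConditionalDemo
{
    // Returns a label depending on whether the gear lies in the range 1 to 18.
    static String classify(int gear)
    {
        if (gear >= 1 && gear <= 18) {
            return "valid gear";
        }
        else {
            return "invalid gear";
        }
    }

    public static void main(String[] args)
    {
        System.out.println(classify(5));   // prints "valid gear"
        System.out.println(classify(25));  // prints "invalid gear"
    }
}
```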


@ -0,0 +1,196 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Introduction to Java]]
- **Next Topic:** [[More Java Code]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663059993088_0.pdf)
-
- What is the **structure of a class**? #card
card-last-interval:: 9.84
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-16T13:11:17.984Z
card-last-reviewed:: 2022-10-06T17:11:17.985Z
card-last-score:: 5
- Every class has the following structure:
- ```java
public class ClassName
{
    // Fields
    // Constructors
    // Methods
}
```
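- The skeleton above can be filled in as follows. This is a sketch under assumptions: the `Bicycle` class, its `gear` field, and the method names are illustrative, and the class is declared without `public` only so that it compiles regardless of file name.
- ```java
// Hypothetical example class; field and method names are illustrative.
class Bicycle
{
    // Fields: store the state of the object.
    private int gear;

    // Constructor: initialises the object.
    public Bicycle(int gear)
    {
        this.gear = gear;
    }

    // Methods: implement the behaviour of the object.
    public int getGear()
    {
        return gear;
    }

    public void setGear(int number)
    {
        gear = number;
    }
}
```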
- ## Fields
- What are **Fields**? #card
card-last-interval:: 27.13
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-06T14:43:03.193Z
card-last-reviewed:: 2022-10-10T11:43:03.194Z
card-last-score:: 5
- **Fields**, also known as **instance variables**, store values for an object.
- Fields define the state of an object.
- In BlueJ, use *Inspect* to view the state.
- Some values change frequently, others rarely, or not at all.
- ## Encapsulation
- What is **Encapsulation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:23:51.445Z
card-last-reviewed:: 2022-10-07T10:23:51.446Z
card-last-score:: 5
- In **encapsulation**, the ^^variables of a class are hidden from other classes^^ and can only be accessed through the methods of their own class.
- This is also known as **data hiding**.
- Why use encapsulation? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-09T23:00:00.000Z
card-last-reviewed:: 2022-10-09T08:51:16.259Z
card-last-score:: 1
- In OOP, ^^each object is responsible for its own data.^^
- This allows an object to have greater control over which data is available to be viewed externally, and how external objects can mutate the object's state.
- ### Encapsulation Type: Private
- What is the effect of making a field **private**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:49.649Z
card-last-reviewed:: 2022-09-30T08:30:49.650Z
card-last-score:: 5
- Making a field **private** encapsulates its value inside its object.
- No code outside the class can access a private field directly.
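- A minimal sketch of a private field in practice, assuming a hypothetical `Bicycle` class (the `EncapsulationDemo` class and the starting value of `1` are illustrative assumptions):
- ```java
// Hypothetical class for illustration only.
class Bicycle
{
    private int gear = 1;   // hidden from other classes

    public int getGear()
    {
        return gear;
    }

    public void setGear(int gear)
    {
        this.gear = gear;
    }
}

class EncapsulationDemo
{
    public static void main(String[] args)
    {
        Bicycle bike = new Bicycle();
        // bike.gear = 5;   // would not compile: gear is private to Bicycle
        bike.setGear(5);    // the only route to the field is through a method
        System.out.println(bike.getGear()); // prints 5
    }
}
```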
-
- ## Constructors
- What are **constructors**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:02.783Z
card-last-reviewed:: 2022-10-03T11:41:02.784Z
card-last-score:: 5
- Constructors:
- Initialise an object.
- Have the same name as their class.
- Have a close association with the fields:
- They store the initial values into the fields.
- Parameter values are often used to supply these initial values.
- What is the point of the keyword `this`? #card
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-14T15:39:43.016Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-03T11:39:43.018Z
- The `this` keyword refers to the current object in a method or constructor.
- The most common use of `this` is to distinguish between class attributes & parameters of the same name.
- If the input parameter variables in your constructor have the **same name** as your fields, you must use the `this` keyword to distinguish between the two.
- `this` = "belonging to this object".
- E.g.,
- ```java
public Bicycle(int speed, int gear, int cadence)
{
this.speed = speed;
this.gear = gear;
this.cadence = cadence;
}
```
-
- ## Methods
- What are **methods**? #card
card-last-score:: 3
card-repeats:: 4
card-next-schedule:: 2022-10-24T09:51:35.296Z
card-last-interval:: 15.05
card-ease-factor:: 1.94
card-last-reviewed:: 2022-10-09T08:51:35.296Z
- **Methods** implement the *behaviour* of an object.
- They have a consistent structure comprised of a *header* and a *body*.
- ### Accessor Methods
- What are **accessor** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:14.037Z
card-last-reviewed:: 2022-10-07T10:19:14.038Z
card-last-score:: 5
- **Accessor** methods provide information about the state of an object.
- An accessor method always returns a type that is **not** `void`.
- An accessor method returns a value (*result*) of the type given in the **header**.
- The method will contain a **return** statement to return the value.
- ### Mutator Methods
- What are **mutator** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:27:22.160Z
card-last-reviewed:: 2022-10-03T14:27:22.161Z
card-last-score:: 5
- **Mutator** methods alter (*mutate*) the state of an object.
- Achieved through changing the value of one or more fields.
- They typically contain one or more *assignment* statements.
- Often receive parameters.
- ![image.png](../assets/image_1663063179688_0.png)
- ### Mutator Methods: Set
- Each field may have a dedicated **set** mutator method.
- These have a simple, distinctive form:
- **void** return type
- method name related to the field name
- a single formal parameter, with the same type as the type of the field
- a single assignment statement
- A typical "set" method:
- ```java
public void setGear (int number)
{
gear = number;
}
```
- ### Protector Mutators
- A set method does not always have to assign unconditionally to the field.
- The parameter may be checked for validity and rejected if inappropriate.
- Mutators thereby protect fields.
- Mutators also support *encapsulation*.
- #### Protecting a Field
- ```java
public void setGear (int gearing)
{
// This conditional statement prevents inappropriate action.
// It protects the "gear" field from values that are too large or too small.
if (gearing >= 1 && gearing <= 18)
{
gear = gearing;
}
else
{
System.out.println("Exceeds maximum gear ratio. Gear not set");
}
}
```
- ### Method Structure
- The **header**:
- The header tells us:
- the *visibility* of the method to objects of other classes.
- whether or not the method *returns a result*.
- the *name* of the method.
- whether or not the method takes *parameters*.
- E.g.,
- ```java
public int getSpeed()
```
- The **body** encloses the method's *statements*.
-
- ## C vs Java
- Unlike C, an OOP program will **not** have a pool of global variables that each method can access.
- Instead, ^^each object has its own data^^, and other objects rely on the *accessor* methods of the object to access the data.
-
- ## Conditional Statements
- Conditional statements in Java have the same format as in C.
- ```java
if (condition) {
    // do something
}
else {
    // do something else
}
```
- ![image.png](../assets/image_1663063508214_0.png)


@ -0,0 +1,196 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Introduction to Java]]
- **Next Topic:** [[More Java Code]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663059993088_0.pdf)
-
- What is the **structure of a class**? #card
card-last-interval:: 9.84
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-16T13:11:17.984Z
card-last-reviewed:: 2022-10-06T17:11:17.985Z
card-last-score:: 5
- Every class has the following structure:
- ```java
public class ClassName
{
    // Fields
    // Constructors
    // Methods
}
```
- ## Fields
- What are **Fields**? #card
card-last-interval:: 27.13
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-06T14:43:03.193Z
card-last-reviewed:: 2022-10-10T11:43:03.194Z
card-last-score:: 5
- **Fields**, also known as **instance variables**, store values for an object.
- Fields define the state of an object.
- In BlueJ, use *Inspect* to view the state.
- Some values change frequently, others rarely, or not at all.
- ## Encapsulation
- What is **Encapsulation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:23:51.445Z
card-last-reviewed:: 2022-10-07T10:23:51.446Z
card-last-score:: 5
- In **encapsulation**, the ^^variables of a class are hidden from other classes^^ and can only be accessed through the methods of their own class.
- This is also known as **data hiding**.
- Why use encapsulation? #card
card-last-interval:: 9.12
card-repeats:: 3
card-ease-factor:: 2.28
card-next-schedule:: 2022-10-29T10:28:34.075Z
card-last-reviewed:: 2022-10-20T08:28:34.075Z
card-last-score:: 5
- In OOP, ^^each object is responsible for its own data.^^
- This allows an object to have greater control over which data is available to be viewed externally, and how external objects can mutate the object's state.
- ### Encapsulation Type: Private
- What is the effect of making a field **private**? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:33:20.443Z
card-last-reviewed:: 2022-10-20T08:33:20.443Z
card-last-score:: 5
- Making a field **private** encapsulates its value inside its object.
- No code outside the class can access a private field directly.
-
- ## Constructors
- What are **constructors**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:02.783Z
card-last-reviewed:: 2022-10-03T11:41:02.784Z
card-last-score:: 5
- Constructors:
- Initialise an object.
- Have the same name as their class.
- Have a close association with the fields:
- They store the initial values into the fields.
- Parameter values are often used to supply these initial values.
- What is the point of the keyword `this`? #card
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-14T15:39:43.016Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-03T11:39:43.018Z
- The `this` keyword refers to the current object in a method or constructor.
- The most common use of `this` is to distinguish between class attributes & parameters of the same name.
- If the input parameter variables in your constructor have the **same name** as your fields, you must use the `this` keyword to distinguish between the two.
- `this` = "belonging to this object".
- E.g.,
- ```java
public Bicycle(int speed, int gear, int cadence)
{
this.speed = speed;
this.gear = gear;
this.cadence = cadence;
}
```
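- To see why `this` matters, the sketch below contrasts two hypothetical classes (`ShadowedBicycle` and `Bicycle`, both reduced to one field for brevity). In the first, the parameter shadows the field, so the assignment never reaches the field and it keeps its default value of `0`.
- ```java
// Hypothetical classes for illustration only.
class ShadowedBicycle
{
    private int speed;

    public ShadowedBicycle(int speed)
    {
        // Both names refer to the parameter, so the field is never set.
        speed = speed;
    }

    public int getSpeed()
    {
        return speed;
    }
}

class Bicycle
{
    private int speed;

    public Bicycle(int speed)
    {
        // this.speed is the field; speed is the parameter.
        this.speed = speed;
    }

    public int getSpeed()
    {
        return speed;
    }
}

class ThisDemo
{
    public static void main(String[] args)
    {
        System.out.println(new ShadowedBicycle(10).getSpeed()); // prints 0
        System.out.println(new Bicycle(10).getSpeed());         // prints 10
    }
}
```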
-
- ## Methods
- What are **methods**? #card
card-last-score:: 3
card-repeats:: 4
card-next-schedule:: 2022-10-24T09:51:35.296Z
card-last-interval:: 15.05
card-ease-factor:: 1.94
card-last-reviewed:: 2022-10-09T08:51:35.296Z
- **Methods** implement the *behaviour* of an object.
- They have a consistent structure comprised of a *header* and a *body*.
- ### Accessor Methods
- What are **accessor** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:14.037Z
card-last-reviewed:: 2022-10-07T10:19:14.038Z
card-last-score:: 5
- **Accessor** methods provide information about the state of an object.
- An accessor method always returns a type that is **not** `void`.
- An accessor method returns a value (*result*) of the type given in the **header**.
- The method will contain a **return** statement to return the value.
- ### Mutator Methods
- What are **mutator** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:27:22.160Z
card-last-reviewed:: 2022-10-03T14:27:22.161Z
card-last-score:: 5
- **Mutator** methods alter (*mutate*) the state of an object.
- Achieved through changing the value of one or more fields.
- They typically contain one or more *assignment* statements.
- Often receive parameters.
- ![image.png](../assets/image_1663063179688_0.png)
- ### Mutator Methods: Set
- Each field may have a dedicated **set** mutator method.
- These have a simple, distinctive form:
- **void** return type
- method name related to the field name
- a single formal parameter, with the same type as the type of the field
- a single assignment statement
- A typical "set" method:
- ```java
public void setGear (int number)
{
gear = number;
}
```
- ### Protector Mutators
- A set method does not always have to assign unconditionally to the field.
- The parameter may be checked for validity and rejected if inappropriate.
- Mutators thereby protect fields.
- Mutators also support *encapsulation*.
- #### Protecting a Field
- ```java
public void setGear (int gearing)
{
// This conditional statement prevents inappropriate action.
// It protects the "gear" field from values that are too large or too small.
if (gearing >= 1 && gearing <= 18)
{
gear = gearing;
}
else
{
System.out.println("Exceeds maximum gear ratio. Gear not set");
}
}
```
- ### Method Structure
- The **header**:
- The header tells us:
- the *visibility* of the method to objects of other classes.
- whether or not the method *returns a result*.
- the *name* of the method.
- whether or not the method takes *parameters*.
- E.g.,
- ```java
public int getSpeed()
```
- The **body** encloses the method's *statements*.
-
- ## C vs Java
- Unlike C, an OOP program will **not** have a pool of global variables that each method can access.
- Instead, ^^each object has its own data^^, and other objects rely on the *accessor* methods of the object to access the data.
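- A small sketch of the point above, assuming a hypothetical `Bicycle` class (the `OwnDataDemo` class and the values are illustrative): two objects of the same class each hold their own copy of the field, and other code reads it through the accessor.
- ```java
// Hypothetical class for illustration only.
class Bicycle
{
    private int speed;

    public Bicycle(int speed)
    {
        this.speed = speed;
    }

    public int getSpeed()
    {
        return speed;
    }

    public void setSpeed(int speed)
    {
        this.speed = speed;
    }
}

class OwnDataDemo
{
    public static void main(String[] args)
    {
        Bicycle first = new Bicycle(10);
        Bicycle second = new Bicycle(20);

        first.setSpeed(15); // changes only the state of "first"

        System.out.println(first.getSpeed());  // prints 15
        System.out.println(second.getSpeed()); // prints 20
    }
}
```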
-
- ## Conditional Statements
- Conditional statements in Java have the same format as in C.
- ```java
if (condition) {
    // do something
}
else {
    // do something else
}
```
- ![image.png](../assets/image_1663063508214_0.png)


@ -0,0 +1,196 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Introduction to Java]]
- **Next Topic:** [[More Java Code]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663059993088_0.pdf)
-
- What is the **structure of a class**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-12-01T00:46:31.046Z
card-last-reviewed:: 2022-11-09T12:46:31.046Z
card-last-score:: 3
- Every class has the following structure:
- ```java
public class ClassName
{
    // Fields
    // Constructors
    // Methods
}
```
- ## Fields
- What are **Fields**? #card
card-last-interval:: 27.13
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-06T14:43:03.193Z
card-last-reviewed:: 2022-10-10T11:43:03.194Z
card-last-score:: 5
- **Fields**, also known as **instance variables**, store values for an object.
- Fields define the state of an object.
- In BlueJ, use *Inspect* to view the state.
- Some values change frequently, others rarely, or not at all.
- ## Encapsulation
- What is **Encapsulation**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:23:51.445Z
card-last-reviewed:: 2022-10-07T10:23:51.446Z
card-last-score:: 5
- In **encapsulation**, the ^^variables of a class are hidden from other classes^^ and can only be accessed through the methods of their own class.
- This is also known as **data hiding**.
- Why use encapsulation? #card
card-last-interval:: 9.12
card-repeats:: 3
card-ease-factor:: 2.28
card-next-schedule:: 2022-10-29T10:28:34.075Z
card-last-reviewed:: 2022-10-20T08:28:34.075Z
card-last-score:: 5
- In OOP, ^^each object is responsible for its own data.^^
- This allows an object to have greater control over which data is available to be viewed externally, and how external objects can mutate the object's state.
- ### Encapsulation Type: Private
- What is the effect of making a field **private**? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:33:20.443Z
card-last-reviewed:: 2022-10-20T08:33:20.443Z
card-last-score:: 5
- Making a field **private** encapsulates its value inside its object.
- No code outside the class can access a private field directly.
-
- ## Constructors
- What are **constructors**? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:41:02.783Z
card-last-reviewed:: 2022-10-03T11:41:02.784Z
card-last-score:: 5
- Constructors:
- Initialise an object.
- Have the same name as their class.
- Have a close association with the fields:
- They store the initial values into the fields.
- Parameter values are often used to supply these initial values.
- What is the point of the keyword `this`? #card
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-14T15:39:43.016Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-03T11:39:43.018Z
- The `this` keyword refers to the current object in a method or constructor.
- The most common use of `this` is to distinguish between class attributes & parameters of the same name.
- If the input parameter variables in your constructor have the **same name** as your fields, you must use the `this` keyword to distinguish between the two.
- `this` = "belonging to this object".
- E.g.,
- ```java
public Bicycle(int speed, int gear, int cadence)
{
this.speed = speed;
this.gear = gear;
this.cadence = cadence;
}
```
-
- ## Methods
- What are **methods**? #card
card-last-score:: 3
card-repeats:: 4
card-next-schedule:: 2022-10-24T09:51:35.296Z
card-last-interval:: 15.05
card-ease-factor:: 1.94
card-last-reviewed:: 2022-10-09T08:51:35.296Z
- **Methods** implement the *behaviour* of an object.
- They have a consistent structure comprised of a *header* and a *body*.
- ### Accessor Methods
- What are **accessor** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:14.037Z
card-last-reviewed:: 2022-10-07T10:19:14.038Z
card-last-score:: 5
- **Accessor** methods provide information about the state of an object.
- An accessor method always returns a type that is **not** `void`.
- An accessor method returns a value (*result*) of the type given in the **header**.
- The method will contain a **return** statement to return the value.
- ### Mutator Methods
- What are **mutator** methods? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:27:22.160Z
card-last-reviewed:: 2022-10-03T14:27:22.161Z
card-last-score:: 5
- **Mutator** methods alter (*mutate*) the state of an object.
- Achieved through changing the value of one or more fields.
- They typically contain one or more *assignment* statements.
- Often receive parameters.
- ![image.png](../assets/image_1663063179688_0.png)
- ### Mutator Methods: Set
- Each field may have a dedicated **set** mutator method.
- These have a simple, distinctive form:
- **void** return type
- method name related to the field name
- a single formal parameter, with the same type as the type of the field
- a single assignment statement
- A typical "set" method:
- ```java
public void setGear (int number)
{
gear = number;
}
```
- ### Protector Mutators
- A set method does not always have to assign unconditionally to the field.
- The parameter may be checked for validity and rejected if inappropriate.
- Mutators thereby protect fields.
- Mutators also support *encapsulation*.
- #### Protecting a Field
- ```java
public void setGear (int gearing)
{
// This conditional statement prevents inappropriate action.
// It protects the "gear" field from values that are too large or too small.
if (gearing >= 1 && gearing <= 18)
{
gear = gearing;
}
else
{
System.out.println("Exceeds maximum gear ratio. Gear not set");
}
}
```
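- A usage sketch for the protected setter, assuming it lives in a hypothetical `Bicycle` class with a `getGear` accessor and a starting gear of `1` (both are assumptions added for illustration):
- ```java
// Hypothetical class wrapping the setGear method from these notes.
class Bicycle
{
    private int gear = 1;   // assumed starting gear

    public void setGear(int gearing)
    {
        // Only values in the range 1 to 18 are accepted.
        if (gearing >= 1 && gearing <= 18)
        {
            gear = gearing;
        }
        else
        {
            System.out.println("Exceeds maximum gear ratio. Gear not set");
        }
    }

    public int getGear()
    {
        return gear;
    }
}

class ProtectorDemo
{
    public static void main(String[] args)
    {
        Bicycle bike = new Bicycle();
        bike.setGear(7);    // valid: gear becomes 7
        bike.setGear(25);   // rejected: message printed, gear stays 7
        System.out.println(bike.getGear()); // prints 7
    }
}
```
- A rejected value leaves the field unchanged, which is how the mutator protects it.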
- ### Method Structure
- The **header**:
- The header tells us:
- the *visibility* of the method to objects of other classes.
- whether or not the method *returns a result*.
- the *name* of the method.
- whether or not the method takes *parameters*.
- E.g.,
- ```java
public int getSpeed()
```
- The **body** encloses the method's *statements*.
-
- ## C vs Java
- Unlike C, an OOP program will **not** have a pool of global variables that each method can access.
- Instead, ^^each object has its own data^^, and other objects rely on the *accessor* methods of the object to access the data.
-
- ## Conditional Statements
- Conditional statements in Java have the same format as in C.
- ```java
if (condition) {
    // do something
}
else {
    // do something else
}
```
- ![image.png](../assets/image_1663063508214_0.png)


@ -0,0 +1,455 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous topic:** [[Introduction to Cybersecurity]]
- **Next Topic:** [[Introduction to Cryptography]]
- **Relevant lecture slides:** ![Lecture01.pdf](../assets/Lecture01_1662819128126_0.pdf)
-
- ## Motivation
- What are **Cyberattacks**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:12:09.399Z
card-last-reviewed:: 2022-09-18T15:12:09.400Z
card-last-score:: 5
- Cyberattacks are aimed at **accessing, changing, or destroying sensitive information**, extorting money, or interrupting normal business processes.
- Managing sensitive data may reduce the attack probability, or at least its impact.
- **GDPR** provides a regulatory framework for managing such data.
-
- ## General Data Protection Regulation
- What is **GDPR**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T15:29:17.338Z
card-last-reviewed:: 2022-09-19T18:29:17.338Z
card-last-score:: 3
- The **General Data Protection Regulation** is a binding regulation in EU law on data protection in the European Union and the European Economic Area (EEA).
- The primary aim of GDPR is to ^^enhance individuals' control & rights over their personal data and to simplify the regulatory environment for international business.^^
- The regulation contains ^^provisions & requirements related to the processing of personal data of individuals^^ who are located in the EEA, and applies to any enterprise that is processing the personal data of individuals inside the EEA - ^^regardless of its location and the data subjects' citizenship or residence.^^
- ### GDPR Overview
- The GDPR sets out several key principles: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:47:35.615Z
card-last-score:: 1
- Lawfulness
- Fairness & Transparency
- Purpose Limitation
- Data Minimisation
- Accuracy
- Storage Limitation
- Integrity & Confidentiality (Security)
- Accountability
- What is **Lawfulness** in GDPR? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:07:45.168Z
card-last-reviewed:: 2022-09-18T15:07:45.169Z
card-last-score:: 5
- You must identify ^^**valid grounds** under the GDPR (known as a "**lawful basis**")^^ for collecting & using personal data.
- Processing shall be lawful if and to the extent that at least one of the following applies:
- Consensual
- Necessary for the performance of a contract
- Necessary for compliance with a legal obligation
- Necessary to protect the vital interests of the data subject or another person
- Necessary for the performance of a task carried out in public interest
- Necessary for the purpose of legitimate interests
- What is **Fairness & Transparency** in GDPR? #card
card-last-interval:: 11.32
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T01:28:42.062Z
card-last-reviewed:: 2022-09-19T18:28:42.063Z
card-last-score:: 3
- You must ^^use personal data in a way that is fair.^^ This means that you must not process the data in a way that is unduly detrimental, unexpected, or misleading to the individuals concerned.
- You must be ^^clear, open, & honest^^ with data subjects from the start about how you will use their personal data.
- At the time personal data is being collected from data subjects, they must be informed via a "**Data Protection Notice**".
- What is a **Data Protection Notice**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:31:34.947Z
card-last-score:: 1
- A **Data Protection Notice** entails:
- The identity & contact details of the **data controller**
- The contact details of the **data protection officer**
- The **purpose of the processing** & the legal basis for the processing
- The recipients or categories of **recipients of the data**
- Details of any transfers out of the EEA, the safeguards in place, and the means by which to obtain a copy of them
- The **data retention** period or the criteria to determine the data retention period
- The **individual's rights** (access, rectification & erasure, restriction, complaint)
- What is **Purpose Limitation** in GDPR? #card
card-last-interval:: 11.32
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T01:29:49.066Z
card-last-reviewed:: 2022-09-19T18:29:49.066Z
card-last-score:: 5
- You must be ^^clear about what your purposes for processing^^ are from the start.
- You must ^^record your purposes^^ as part of your documentation obligations and specify them in your privacy information for individuals.
- You ^^can only use the personal data for a new purpose^^ if the new purpose is compatible with your original purpose, you get **consent**, or you have a **clear basis in law**.
- What is **Data Minimisation** in GDPR? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:18:53.347Z
card-last-reviewed:: 2022-09-18T15:18:53.348Z
card-last-score:: 5
- You must ensure that the personal data that you are processing is:
- **adequate** - sufficient to properly fulfil your stated purpose
- **relevant** - has a rational link to that purpose
- **limited** to what is necessary - you do not hold more than what you need for your stated purpose
- What is **Accuracy** in GDPR? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:00:01.564Z
card-last-reviewed:: 2022-09-18T15:00:01.564Z
card-last-score:: 5
- You should take all reasonable steps to ensure that the personal data you hold is ^^not incorrect or misleading^^ as to any matter of fact.
- You may need to ^^keep the personal data updated^^, although this will depend on what you are using it for.
- If you ^^discover that personal data is incorrect or misleading^^, you must take reasonable steps to correct or erase it as soon as possible.
- You must ^^carefully consider any challenges to the accuracy^^ of personal data.
- What is **Storage Limitation** in GDPR? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T14:59:31.645Z
card-last-reviewed:: 2022-09-18T14:59:31.645Z
card-last-score:: 5
- You must not keep personal data for ^^longer than you need it^^.
- You need to think about - and be able to justify - ^^how long you keep personal data^^. This will depend on your purposes for holding the data.
- You need a policy ^^setting standard retention periods^^ wherever possible, to comply with documentation requirements.
- You should also ^^periodically review the data you hold^^, and erase or anonymise it when you no longer need it.
- You must ^^carefully consider any challenges to your retention of data^^.
- Individuals have a **right to erasure** if you no longer need the data.
- You can ^^keep personal data for longer^^ if you are only keeping it for ^^public interest archiving, scientific or historical research, or statistical purposes.^^
- What is **Accountability & Governance** in GDPR? #card
card-last-interval:: 3.02
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T18:30:45.595Z
card-last-reviewed:: 2022-09-19T18:30:45.595Z
card-last-score:: 3
- **Accountability** is one of the **data protection principles** - it makes you responsible for complying with the GDPR and says that ^^you must be able to demonstrate your compliance.^^
- You need to put in place appropriate technical & organisational measures to meet the requirements of accountability.
- Accountability requires controllers to maintain records of processing activities in order to demonstrate how they comply with the data protection principles, i.e.:
- Inventory of personal data
- Providing assurance of compliance
- Need to document
- Why it is held
- How it is collected
- When it will be deleted
- Who may gain access to it
- What is **Integrity & Confidentiality** in GDPR? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:48:22.142Z
card-last-score:: 1
- A key principle of GDPR is that you process personal data ^^securely by means of "appropriate technical & organisational measures"^^ - this is the "**security principle**".
- Doing this requires you to consider things like ^^risk analysis, organisational policies, and physical + technical measures.^^
- Where appropriate, you should look to use measures such as **pseudonymisation** and **encryption**.
- Your measures must ensure the ^^"confidentiality, integrity, & availability"^^ of your systems & services and the personal data you process with them.
- The measures must also enable you to ^^restore access & availability^^ to personal data in a timely manner in the event of a physical or technical incident.
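- As one possible illustration of pseudonymisation (not a method prescribed by the GDPR or by the lecture), the sketch below replaces an identifier with a keyed hash. The `Pseudonymiser` class, the choice of HMAC-SHA256, and the key handling are all assumptions; whether such a scheme is an "appropriate measure" depends on the risk analysis and on keeping the key separate from the data.
- ```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: maps an identifier to a stable pseudonym using a secret key.
class Pseudonymiser
{
    private final SecretKeySpec key;

    Pseudonymiser(byte[] secret)
    {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    // Returns the HMAC-SHA256 of the identifier as 64 lowercase hex characters.
    String pseudonym(String identifier)
    {
        try
        {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            byte[] digest = mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8));

            StringBuilder hex = new StringBuilder();
            for (byte b : digest)
            {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }
        catch (GeneralSecurityException e)
        {
            // HmacSHA256 is a standard algorithm name, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```
- The same identifier always yields the same pseudonym under a given key, so records can still be linked without storing the original identifier alongside them.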
- What is **Data Protection**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T17:43:35.395Z
card-last-score:: 1
- **Data Protection** is about an ^^individual's fundamental right to privacy.^^
- When an individual gives their personal data to any organisation, the recipient has the duty to keep the data both safe & private. This applies to both printed & electronic data.
- What does Data Protection Legislation do? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T17:42:08.681Z
card-last-reviewed:: 2022-09-19T17:42:08.682Z
card-last-score:: 3
- Data Protection Legislation:
- governs the way we deal with personal data / information
- provides a mechanism for safeguarding the privacy rights of individuals in relation to the processing of their data
- upholds rights and enforces obligations
- What is **Personal Data**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:21:56.198Z
card-last-reviewed:: 2022-09-19T18:21:56.199Z
card-last-score:: 5
- **Personal Data** is any information relating to an identified or ^^identifiable natural person^^ ("data subject").
- What is an **identifiable natural person**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:21:42.644Z
card-last-reviewed:: 2022-09-19T18:21:42.644Z
card-last-score:: 3
- An **identifiable natural person** is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, or an online identifier, or to one or more factors specific to the ^^physical, physiological, genetic, mental, economic, cultural, or social identity^^ of that natural person.
- What is **Data Processing**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:50:21.132Z
card-last-reviewed:: 2022-09-18T14:50:21.132Z
card-last-score:: 3
- **Data Processing** is ^^performing any operation on personal data^^, either manually or by automated means, including:
- Obtaining
- Storing
- Transmitting
- Recording
- Organising
- Altering
- Disclosing
- Erasing
- ### Entities in GDPR
- GDPR distinguishes between:
- The **Data Subject**
- The **Data Protection Officer (DPO)**
- The **Data Controller**
- The **Data Processor**
-
- What is the **Data Subject**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:24:52.976Z
card-last-reviewed:: 2022-09-19T18:24:52.976Z
card-last-score:: 5
- The **Data Subject** is the person to whom the data relates.
- GDPR only applies to living individuals, but any duty of confidence in place prior to the death extends beyond that point.
- In Ireland, the next of kin of the deceased are entitled to a Freedom of Information request to the deceased's personal data.
- What is the **DPO**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:31:23.177Z
card-last-score:: 1
- The primary role of the **Data Protection Officer** is to ^^ensure that their organisation processes the personal data of its staff, customers, and other data subjects in compliance with the applicable data protection rules.^^
- The Data Protection Officer is required to be an expert in this field and must report to the highest management level.
- As this can be a challenging aspect of GDPR compliance for smaller organisations, there is the option to appoint an external, third-party DPO.
-
- When is the DPO a mandatory role? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:50:53.979Z
card-last-reviewed:: 2022-09-18T14:50:53.980Z
card-last-score:: 3
- The DPO is a mandatory role within 3 different scenarios:
- 1. When the processing is undertaken by a public authority or body.
- 2. When an organisation's main activities require the frequent & large-scale monitoring of individual people.
- 3. Where large-scale processing of special categories of data or data relating to criminal records forms the core activities.
- What is the **Data Controller**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T14:47:48.190Z
card-last-reviewed:: 2022-09-19T17:47:48.191Z
card-last-score:: 3
- The **Data Controller** is the company or an individual who ^^has overall control over the processing of personal data.^^
- The Data Controller takes on the responsibility for GDPR compliance.
- A Data Controller needs to have had sufficient training and to be able to competently ensure the security & protection of data held within the organisation.
- What is the **Data Processor**? #card
card-last-interval:: 11.32
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T01:29:33.325Z
card-last-reviewed:: 2022-09-19T18:29:33.326Z
card-last-score:: 5
- The **Data Processor** is the person who is ^^responsible for the processing of personal information.^^
- Generally, this role is undertaken under the instruction of the **data controller**.
- This might mean obtaining or recording the data, adapting it, and using it. It may also include disclosing the data or making it available to others.
- Generally, the Data Processor is involved in the more technical elements of the operation, while the interpretation & main decision-making is the role of the Data Controller.
-
- ### Cloud Services & GDPR
- What makes a **Cloud Service Provider** a **Data Processor**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T17:49:31.101Z
card-last-score:: 1
- A **Cloud Service Provider** will be considered a **Data Processor** under GDPR if it provides data processing services (e.g., storage) on behalf of the Data Controller even without determining the purposes & means of processing.
- A Cloud Service Provider that offers personal data processing services directly to Data Subjects will be considered a **Data Controller**.
- What are some key benefits of GDPR for Data Subjects? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:22:42.639Z
card-last-reviewed:: 2022-09-19T18:22:42.640Z
card-last-score:: 3
- More information must be given to data subjects (e.g., how long the data will be kept, right to lodge a complaint).
- The Data Controller must explain & document the legal basis for processing the personal data.
- GDPR tightens the rules on how consent can be obtained.
- Must be distinguishable from other matters and in clear, plain language.
- It must be as easy to withdraw consent as it is to give it.
- Mandatory notification of security breaches without "undue delay" to the Data Protection Commissioner (within 72 hours).
- What are some key rights of Data Subjects? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T20:03:27.743Z
card-last-reviewed:: 2022-09-14T20:03:27.743Z
card-last-score:: 3
- Right of Access (copy to be provided within one month)
- Right to Erasure (the right to be forgotten)
- Right to Restriction of Processing
- Right to Object to Processing
- Right not to be subject to a decision based solely upon automated processing
- What are **Personal Data Security Breaches**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:24:23.550Z
card-last-reviewed:: 2022-09-19T18:24:23.551Z
card-last-score:: 5
- **Personal Data Security Breaches** include:
- Disclosure of confidential data to unauthorised individuals.
- Loss or theft of data or equipment upon which data is stored.
- Hacking, viruses, or other security attacks on IT equipment / systems / networks.
- Inappropriate access controls allowing unauthorised use of information.
- Emails containing personal data sent in error to the wrong recipient.
- Personal Data Security Breaches apply to both paper & electronic records.
-
- ## HTTP Cookies
- What is a **(HTTP) Cookie**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T15:12:42.615Z
card-last-score:: 1
- A **(HTTP) Cookie** is a small piece of data stored on the user's computer by the web browser while browsing a website.
- Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in the shopping cart in an online store) or to record the user's browsing activity.
- They can also be used to remember pieces of information that the user previously entered into form fields.
- **Authentication Cookies** are the most common method used by web servers to know whether the user is logged in or not, and which account they are logged into.
-
- #### Cookie Implementation
- How are cookies implemented? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:48:34.015Z
card-last-score:: 1
- Cookies are ^^arbitrary pieces of data^^ (i.e., large, random strings), usually chosen and first sent by the web server, and stored on the client computer by the web browser.
- The browser then sends them back to the server with every request.
- Browsers are required to:
- support cookies as large as 4,096 bytes in size
- support at least 50 cookies per domain
- support at least 3,000 cookies in total
-
- #### Cookie Structure
- What are the components of a cookie? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T17:43:45.262Z
card-last-score:: 1
- A cookie consists of the following components:
- Name
- Value
- Zero or more attributes (name - value pairs). These attributes store information such as the cookie's expiration, domain, and flags (such as *Secure* and *HttpOnly*)
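- The components listed above can be picked out of a `Set-Cookie` header value with Python's standard `http.cookies` module. This is only a sketch: the header string, cookie name, and value are invented for illustration and do not come from the lecture.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value, made up for this example.
header_value = "sessionToken=abc123; Domain=example.com; Path=/; Secure; HttpOnly"

cookie = SimpleCookie()
cookie.load(header_value)

morsel = cookie["sessionToken"]
name = morsel.key        # the cookie's name
value = morsel.value     # the cookie's value
# A Morsel is a dict of every known attribute; keep only those that were set.
attributes = {key: val for key, val in morsel.items() if val}

print(name, value, attributes)
```

- Attributes with a value (such as *Domain*) come back as strings, while flags (such as *Secure*) come back as `True`.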
-
- ### Session Cookies
- What is a **session cookie**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:22:21.715Z
card-last-reviewed:: 2022-09-19T18:22:21.716Z
card-last-score:: 3
- A **session cookie** (aka in-memory cookie, transient cookie, or non-persistent cookie) is a cookie that ^^exists only in temporary memory while the user navigates the website.^^
- Web browsers normally delete session cookies when the user closes the browser.
- Session cookies do not have an expiration date assigned to them, which is how the browser knows to treat them as session cookies.
-
-
- ### Persistent Cookies
- What is a **persistent cookie**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-23T18:27:39.082Z
card-last-reviewed:: 2022-09-19T18:27:39.082Z
card-last-score:: 3
- A **persistent cookie** is a cookie which ^^expires at a specific date or after a specific length of time.^^
- A persistent cookie's information will be transmitted to the server every time the user visits the website that the cookie belongs to, for the lifespan of the persistent cookie (as set by its creator), or every time that the user views a resource belonging to that website from another website (such as an advertisement).
-
- Persistent cookies are sometimes referred to as **tracking cookies** because they can be used by advertisers to record information about a user's web browsing habits.
- However, persistent cookies are also used for legitimate reasons, such as keeping users logged into their accounts on websites so that they do not have to re-enter login credentials at every visit.
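- The difference between the two kinds of cookie comes down to whether an expiry is set. A minimal sketch using the standard `http.cookies` module; the cookie names, values, and the 30-day lifetime are hypothetical, and `is_session_cookie` is a helper written for this example:

```python
from http.cookies import SimpleCookie

def is_session_cookie(morsel):
    """A cookie with neither Expires nor Max-Age is treated as a session cookie."""
    return not morsel["expires"] and not morsel["max-age"]

jar = SimpleCookie()
jar["basket"] = "item42"                             # no expiry set: session cookie
jar["remember_me"] = "user-7f3a"
jar["remember_me"]["max-age"] = 60 * 60 * 24 * 30    # 30 days: persistent cookie

kinds = {
    name: "session" if is_session_cookie(morsel) else "persistent"
    for name, morsel in jar.items()
}
print(kinds)
```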
-
- ### Cookie Attributes
- Consider the following response header sent by a webserver that contains 3 persistent cookies:
- ![image.png](../assets/image_1662819462897_0.png)
- What do the *Domain* and *Path* attributes do? #card
card-last-interval:: 9.55
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-29T07:30:31.159Z
card-last-reviewed:: 2022-09-19T18:30:31.160Z
card-last-score:: 5
- The *Domain* and *Path* attributes define the cookie's scope.
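- As a rough illustration of what "scope" means, the sketch below checks whether a request falls inside a cookie's *Domain* and *Path*. It is a simplified version of the matching rules browsers apply (it ignores IP addresses and other special cases), and all function names and hosts are invented for the example:

```python
def domain_matches(request_host, cookie_domain):
    """True if the host is the cookie's domain or one of its subdomains."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)

def path_matches(request_path, cookie_path):
    """True if the request path is the cookie path or lies beneath it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # "/account" covers "/account/orders" but not "/accounting".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False

def in_scope(host, path, cookie_domain, cookie_path):
    """A cookie is sent only when both the domain and the path match."""
    return domain_matches(host, cookie_domain) and path_matches(path, cookie_path)

print(in_scope("shop.example.com", "/account/orders", "example.com", "/account"))
```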
- What does the *Secure* attribute do? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-09-29T22:47:37.954Z
card-last-reviewed:: 2022-09-19T17:47:37.955Z
card-last-score:: 5
- The *Secure* attribute ensures that the cookie can only be transmitted over an **encrypted connection**, making it a "**secure cookie**".
- What does the *HttpOnly* attribute do? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:10:53.386Z
card-last-reviewed:: 2022-09-18T15:10:53.387Z
card-last-score:: 5
- The *HttpOnly* attribute ^^directs browsers not to expose the cookie through channels other than HTTP / HTTPS requests.^^
- This means that this HttpOnly cookie cannot be accessed via client-side scripting languages (notably JavaScript).
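- The effect of the two flags can be modelled in a few lines. This is a toy model of browser behaviour rather than real browser code; the `Cookie` class and both helper functions are hypothetical names introduced for the example:

```python
from dataclasses import dataclass

@dataclass
class Cookie:
    name: str
    value: str
    secure: bool = False      # the Secure flag
    http_only: bool = False   # the HttpOnly flag

def sent_with_request(cookie, scheme):
    """A Secure cookie is only attached to requests over an encrypted connection."""
    return scheme == "https" or not cookie.secure

def visible_to_scripts(cookie):
    """An HttpOnly cookie is hidden from client-side scripts such as JavaScript."""
    return not cookie.http_only

session = Cookie("session_id", "abc123", secure=True, http_only=True)
print(sent_with_request(session, "http"), visible_to_scripts(session))
```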
-
- ## GDPR & Cookies
- Generally, a user's consent must be sought before a cookie is installed in a web browser.
- There are **two** exemptions:
- The **Communications Exemption**
- The **Strictly Necessary Exemption**
-
- What is the **Communications Exemption**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:48:01.065Z
card-last-reviewed:: 2022-09-18T14:48:01.066Z
card-last-score:: 3
- The **Communications Exemption** applies to cookies ^^whose sole purpose is for carrying out the transmission of a communication over a network^^, for example, to identify the communication endpoints.
- Cookies that meet these criteria are exempted from being required to ask for the user's consent prior to installation.
- **Example:** load-balancing cookies that distribute network traffic across different backend servers, also known as **session stickiness**.
- Here, a **load-balancer** creates an affinity between a client and a specific network server for the duration of a session using a cookie with a random & unique tracking ID.
- Subsequently, the load-balancer routes all of the requests from this client to a specific backend server using the tracking ID, for the duration of the session.
- ![image.png](../assets/image_1662820187995_0.png){:height 426, :width 529}
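- Session stickiness as described above can be sketched as follows. The class name, the round-robin choice of backend, and the backend names are assumptions made for illustration; a real load-balancer would also expire old sessions and handle backend failures:

```python
import secrets

class StickyLoadBalancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self.affinity = {}      # tracking ID -> backend server
        self.next_index = 0

    def route(self, tracking_id=None):
        """Return (tracking_id, backend) for one incoming request."""
        if tracking_id in self.affinity:
            # Known client: keep sending it to the same backend.
            return tracking_id, self.affinity[tracking_id]
        # New client: issue a random, unique tracking ID and pick a backend in turn.
        tracking_id = secrets.token_hex(16)
        backend = self.backends[self.next_index % len(self.backends)]
        self.next_index += 1
        self.affinity[tracking_id] = backend
        return tracking_id, backend

balancer = StickyLoadBalancer(["server-a", "server-b"])
first_id, first_backend = balancer.route()          # first request carries no cookie
print(balancer.route(first_id) == (first_id, first_backend))
```

- The tracking ID is what would be stored in the load-balancing cookie and sent back by the browser on each later request.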
-
-
- What is the **Strictly Necessary** exemption? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T14:58:17.880Z
card-last-reviewed:: 2022-09-18T14:58:17.881Z
card-last-score:: 5
- The **Strictly Necessary** exemption exempts cookies that are strictly necessary to provide a service delivered over the internet (i.e., a website or app) from being required to ask for the user's consent prior to installation.
- ^^This service must have been explicitly requested by the user (e.g., by typing in the URL), and the use of the cookie must be restricted to what is strictly necessary to provide that service.^^
- Cookies related to advertising are **not** strictly necessary, and must be consented to.
- Examples:
- A website uses session cookies to keep track of items that a user places in an online shopping basket (assuming that this cookie will be deleted once the session is over).
- Cookies that record a user's language or country preference.

View File

@ -0,0 +1,454 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous topic:** [[Introduction to Cybersecurity]]
- **Next Topic:** [[Introduction to Cryptography]]
- **Relevant lecture slides:** ![Lecture01.pdf](../assets/Lecture01_1662819128126_0.pdf)
-
- ## Motivation
- What are **Cyberattacks**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:25:05.425Z
card-last-reviewed:: 2022-10-01T13:25:05.425Z
card-last-score:: 5
- Cyberattacks are aimed at **accessing, changing, or destroying sensitive information**, extorting money, or interrupting normal business processes.
- Managing sensitive data may reduce the attack probability, or at least its impact.
- **GDPR** provides a regulatory framework for managing such data.
-
- ## General Data Protection Regulation
- What is **GDPR**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-25T02:35:29.600Z
card-last-reviewed:: 2022-10-03T14:35:29.600Z
card-last-score:: 5
- The **General Data Protection Regulation** is a binding regulation in EU law on data protection in the European Union and the European Economic Area (EEA).
- The primary aim of GDPR is to ^^enhance individuals' control & rights over their personal data and to simplify the regulatory environment for international business.^^
- The regulation contains ^^provisions & requirements related to the processing of personal data of individuals^^ who are located in the EEA, and applies to any enterprise that is processing the personal data of individuals inside the EEA - ^^regardless of its location and the data subjects' citizenship or residence.^^
- ### GDPR Overview
- The GDPR sets out several key principles: #card
card-last-interval:: 3.33
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-03T16:20:39.425Z
card-last-reviewed:: 2022-09-30T09:20:39.425Z
card-last-score:: 3
- Lawfulness
- Fairness & Transparency
- Purpose Limitation
- Data Minimisation
- Accuracy
- Storage Limitation
- Integrity & Confidentiality (Security)
- Accountability
- What is **Lawfulness** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:40.152Z
card-last-reviewed:: 2022-09-30T08:30:40.153Z
card-last-score:: 5
- You must identify ^^**valid grounds** under the GDPR (known as a "**lawful basis**")^^ for collecting & using personal data.
- Processing shall be lawful if and to the extent that at least one of the following applies:
- Consensual
- Necessary for the performance of a contract
- Necessary for compliance with a legal obligation
- Necessary to protect the vital interests of the data subject or another person
- Necessary for the performance of a task carried out in public interest
- Necessary for the purpose of legitimate interests
- What is **Fairness & Transparency** in GDPR? #card
card-last-interval:: 11.32
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T01:28:42.062Z
card-last-reviewed:: 2022-09-19T18:28:42.063Z
card-last-score:: 3
- You must ^^use personal data in a way that is fair.^^ This means that you must not process the data in a way that is unduly detrimental, unexpected, or misleading to the individuals concerned.
- You must be ^^clear, open, & honest^^ with data subjects from the start about how you will use their personal data.
- At the time personal data is being collected from data subjects, they must be informed via a "**Data Protection Notice**".
- What is a **Data Protection Notice**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-03T11:22:24.235Z
card-last-reviewed:: 2022-09-30T09:22:24.235Z
card-last-score:: 3
- A **Data Protection Notice** entails:
- The identity & contact details of the **data controller**
- The contact details of the **data protection officer**
- The **purpose of the processing** & the legal basis for the processing
- The recipients or categories of **recipients of the data**
- Details of any transfers out of the EEA, the safeguards in place, and the means by which to obtain a copy of them
- The **data retention** period or the criteria to determine the data retention period
- The **individual's rights** (access, rectification & erasure, restriction, complaint)
- What is **Purpose Limitation** in GDPR? #card
card-last-interval:: 11.32
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T01:29:49.066Z
card-last-reviewed:: 2022-09-19T18:29:49.066Z
card-last-score:: 5
- You must be ^^clear about what your purposes for processing^^ are from the start.
- You must ^^record your purposes^^ as part of your documentation obligations and specify them in your privacy information for individuals.
- You ^^can only use the personal data for a new purpose^^ if it is either compatible with your original purpose, you get **consent**, or you have a **clear basis in law**.
- What is **Data Minimisation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:35:23.953Z
card-last-reviewed:: 2022-10-01T17:35:23.953Z
card-last-score:: 5
- You must ensure that the personal data that you are processing is:
- **adequate** - sufficient to properly fulfil your stated purpose
- **relevant** - has a rational link to that purpose
- **limited** to what is necessary - you do not hold more than what you need for your stated purpose
- What is **Accuracy** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:16:08.483Z
card-last-reviewed:: 2022-09-30T12:16:08.484Z
card-last-score:: 5
- You should take all reasonable steps to ensure that the personal data you hold is ^^not incorrect or misleading^^ as to any matter of fact.
- You may need to ^^keep the personal data updated^^, although this will depend on what you are using it for.
- If you ^^discover that personal data is incorrect or misleading^^, you must take reasonable steps to correct or erase it as soon as possible.
- You must ^^carefully consider any challenges to the accuracy^^ of personal data.
- What is **Storage Limitation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:30.278Z
card-last-reviewed:: 2022-09-30T08:26:30.278Z
card-last-score:: 5
- You must not keep personal data for ^^longer than you need it^^.
- You need to think about - and be able to justify - ^^how long you keep personal data^^. This will depend on your purposes for holding the data.
- You need a policy ^^setting standard retention periods^^ wherever possible, to comply with documentation requirements.
- You should also ^^periodically review the data you hold^^, and erase or anonymise it when you no longer need it.
- You must ^^carefully consider any challenges to your retention of data^^.
- Individuals have a **right to erasure** if you no longer need the data.
- You can ^^keep personal data for longer^^ if you are only keeping it for ^^public interest archiving, scientific or historical research, or statistical purposes.^^
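- A periodic retention review could look something like the sketch below. The categories and retention periods are invented placeholders, not values taken from the GDPR or the lecture, and `due_for_erasure` is a hypothetical helper:

```python
from datetime import date, timedelta

# Hypothetical retention policy: data category -> standard retention period.
RETENTION = {
    "support_ticket": timedelta(days=365),
    "invoice": timedelta(days=6 * 365),
}

def due_for_erasure(records, today):
    """Return the records that have been held longer than their retention period."""
    return [
        record for record in records
        if today - record["collected_on"] > RETENTION[record["category"]]
    ]

records = [
    {"id": 1, "category": "support_ticket", "collected_on": date(2021, 1, 1)},
    {"id": 2, "category": "invoice", "collected_on": date(2021, 1, 1)},
]
overdue = due_for_erasure(records, today=date(2022, 10, 1))
print([record["id"] for record in overdue])
```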
- What is **Accountability & Governance** in GDPR? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:40:56.633Z
card-last-reviewed:: 2022-10-03T11:40:56.633Z
card-last-score:: 5
- **Accountability** is one of the **data protection principles** - it makes you responsible for complying with the GDPR and says that ^^you must be able to demonstrate your compliance.^^
- You need to put in place appropriate technical & organisational measures to meet the requirements of accountability.
- Accountability requires controllers to maintain records of processing activities in order to demonstrate how they comply with the data protection principles, i.e.:
- Inventory of personal data
- Providing assurance of compliance
- Need to document
- Why it is held
- How it is collected
- When it will be deleted
- Who may gain access to it
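- One way to keep such a record of processing is a small structured entry per category of personal data. The sketch below is only an illustration: the class, its field names, and the sample values are assumptions, chosen to mirror the four questions listed above:

```python
from dataclasses import dataclass, field
from datetime import timedelta

@dataclass
class ProcessingRecord:
    data_category: str
    purpose: str              # why it is held
    collection_method: str    # how it is collected
    retention: timedelta      # when it will be deleted, counted from collection
    access_roles: list = field(default_factory=list)   # who may gain access to it

def undocumented_fields(record):
    """Names of the fields that have been left empty and still need documenting."""
    return [name for name, value in vars(record).items() if not value]

complete = ProcessingRecord(
    "email address", "newsletter delivery", "sign-up form",
    timedelta(days=730), ["marketing"],
)
incomplete = ProcessingRecord("phone number", "", "sign-up form", timedelta(days=30))
print(undocumented_fields(incomplete))
```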
- What is **Integrity & Confidentiality** in GDPR? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T09:08:33.941Z
card-last-reviewed:: 2022-09-30T09:08:33.941Z
card-last-score:: 3
- A key principle of GDPR is that you process personal data ^^securely by means of "appropriate technical & organisational measures"^^ - this is the "**security principle**".
- Doing this requires you to consider things like ^^risk analysis, organisational policies, and physical + technical measures.^^
- Where appropriate, you should look to use measures such as **pseudonymisation** and **encryption**.
- Your measures must ensure the ^^"confidentiality, integrity, & availability"^^ of your systems & services and the personal data you process with them.
- The measures must also enable you to ^^restore access & availability^^ to personal data in a timely manner in the event of a physical or technical incident.
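- As one possible illustration of pseudonymisation, a direct identifier can be replaced with a keyed hash so that records can still be linked to each other without exposing the identifier itself. Keyed hashing (HMAC-SHA256) is a technique chosen for this sketch, not one prescribed by the lecture; the function name, key, and email address are hypothetical:

```python
import hashlib
import hmac

def pseudonymise(identifier, secret_key):
    """Replace an identifier with a keyed hash (HMAC-SHA256) of it."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"example-key-kept-separately"   # placeholder; a real key must be stored securely
token = pseudonymise("alice@example.com", key)
print(token)
```

- The same identifier always maps to the same token under the same key, so the key has to be kept separately from the pseudonymised data.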
- What is **Data Protection**? #card
card-last-interval:: 4.14
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-04T12:23:04.068Z
card-last-reviewed:: 2022-09-30T09:23:04.069Z
card-last-score:: 5
- **Data Protection** is about an ^^individual's fundamental right to privacy.^^
- When an individual gives their personal data to any organisation, the recipient has the duty to keep the data both safe & private. This applies to both printed & electronic data.
- What does Data Protection Legislation do? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:07.775Z
card-last-reviewed:: 2022-10-03T11:42:07.775Z
card-last-score:: 3
- Data Protection Legislation:
- governs the way we deal with personal data / information
- provides a mechanism for safeguarding the privacy rights of individuals in relation to the processing of their data
- upholds rights and enforces obligations
- What is **Personal Data**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:44:35.417Z
card-last-reviewed:: 2022-10-03T11:44:35.417Z
card-last-score:: 5
- **Personal Data** is any information relating to an identified or ^^identifiable natural person^^ ("data subject").
- What is an **identifiable natural person**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T17:44:29.504Z
card-last-reviewed:: 2022-10-03T11:44:29.504Z
card-last-score:: 5
- An **identifiable natural person** is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, or an online identifier, or to one or more factors specific to the ^^physical, physiological, genetic, mental, economic, cultural, or social identity^^ of that natural person.
- What is **Data Processing**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:13:46.167Z
card-last-reviewed:: 2022-09-30T12:13:46.167Z
card-last-score:: 3
- **Data Processing** is ^^performing any operation on personal data^^, either manually or by automated means, including:
- Obtaining
- Storing
- Transmitting
- Recording
- Organising
- Altering
- Disclosing
- Erasing
- ### Entities in GDPR
- GDPR distinguishes between:
- The **Data Subject**
- The **Data Protection Officer (DPO)**
- The **Data Controller**
- The **Data Processor**
-
- What is the **Data Subject**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:28:03.265Z
card-last-reviewed:: 2022-10-03T14:28:03.266Z
card-last-score:: 5
- The **Data Subject** is the person to whom the data relates.
- GDPR only applies to living individuals, but any duty of confidence in place prior to the death extends beyond that point.
- In Ireland, the next of kin of the deceased are entitled to a Freedom of Information request to the deceased's personal data.
- What is the **DPO**? #card
card-last-interval:: 3.51
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-08T00:26:04.053Z
card-last-reviewed:: 2022-10-04T12:26:04.054Z
card-last-score:: 5
- The primary role of the **Data Protection Officer** is to ^^ensure that their organisation processes the personal data of its staff, customers, and other data subjects in compliance with the applicable data protection rules.^^
- The Data Protection Officer is required to be an expert in this field and must report to the highest management level.
- As this can be a challenging aspect of GDPR compliance for smaller organisations, there is the option to appoint an external, third-party DPO.
-
- When is the DPO a mandatory role? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:34:15.483Z
card-last-score:: 1
- The DPO is a mandatory role within 3 different scenarios:
- 1. When the processing is undertaken by a public authority or body.
- 2. When an organisation's main activities require the frequent & large-scale monitoring of individual people.
- 3. Where large-scale processing of special categories of data or data relating to criminal records forms the core activities.
- What is the **Data Controller**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:34:50.476Z
card-last-score:: 1
- The **Data Controller** is the company or an individual who ^^has overall control over the processing of personal data.^^
- The Data Controller takes on the responsibility for GDPR compliance.
- A Data Controller needs to have had sufficient training and to be able to competently ensure the security & protection of data held within the organisation.
- What is the **Data Processor**? #card
card-last-interval:: 11.32
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-01T01:29:33.325Z
card-last-reviewed:: 2022-09-19T18:29:33.326Z
card-last-score:: 5
- The **Data Processor** is the person who is ^^responsible for the processing of personal information.^^
- Generally, this role is undertaken under the instruction of the **data controller**.
- This might mean obtaining or recording the data, adapting it, and using it. It may also include disclosing the data or making it available to others.
- Generally, the Data Processor is involved in the more technical elements of the operation, while the interpretation & main decision-making is the role of the Data Controller.
-
- ### Cloud Services & GDPR
- What makes a **Cloud Service Provider** a **Data Processor**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-08T12:32:03.127Z
card-last-reviewed:: 2022-10-04T12:32:03.127Z
card-last-score:: 3
- A **Cloud Service Provider** will be considered a **Data Processor** under GDPR if it provides data processing services (e.g., storage) on behalf of the Data Controller even without determining the purposes & means of processing.
- A Cloud Service Provider that offers personal data processing services directly to Data Subjects will be considered a **Data Controller**.
- What are some key benefits of GDPR for Data Subjects? #card
collapsed:: true
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:45:06.983Z
card-last-reviewed:: 2022-10-03T11:45:06.983Z
card-last-score:: 3
- More information must be given to data subjects (e.g., how long the data will be kept, right to lodge a complaint).
- The Data Controller must explain & document the legal basis for processing the personal data.
- GDPR tightens the rules on how consent can be obtained.
- Must be distinguishable from other matters and in clear, plain language.
- It must be as easy to withdraw consent as it is to give it.
- Mandatory notification of security breaches without "undue delay" to the Data Protection Commissioner (within 72 hours).
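- The 72-hour notification window is simple date arithmetic. In this sketch the clock is assumed to start when the controller becomes aware of the breach; the function names and timestamps are invented for the example:

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)

def notification_deadline(became_aware_at):
    """Latest time by which the breach should be notified."""
    return became_aware_at + NOTIFICATION_WINDOW

def is_overdue(became_aware_at, now):
    """True once the notification window has passed."""
    return now > notification_deadline(became_aware_at)

aware = datetime(2022, 10, 3, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())
```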
- What are some key rights of Data Subjects? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T05:45:53.547Z
card-last-reviewed:: 2022-09-30T08:45:53.547Z
card-last-score:: 3
- Right of Access (copy to be provided within one month)
- Right to Erasure (the right to be forgotten)
- Right to Restriction of Processing
- Right to Object to Processing
- Right not to be subject to a decision based solely upon automated processing
- What are **Personal Data Security Breaches**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:28:54.471Z
card-last-reviewed:: 2022-09-30T08:28:54.471Z
card-last-score:: 5
- **Personal Data Security Breaches** include:
- Disclosure of confidential data to unauthorised individuals.
- Loss or theft of data or equipment upon which data is stored.
- Hacking, viruses, or other security attacks on IT equipment / systems / networks.
- Inappropriate access controls allowing unauthorised use of information.
- Emails containing personal data sent in error to the wrong recipient.
- Personal Data Security Breaches apply to both paper & electronic records.
-
- ## HTTP Cookies
- What is a **(HTTP) Cookie**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:09:15.904Z
card-last-score:: 1
- A **(HTTP) Cookie** is a small piece of data stored on the user's computer by the web browser while browsing a website.
- Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in the shopping cart in an online store) or to record the user's browsing activity.
- They can also be used to remember pieces of information that the user previously entered into form fields.
- **Authentication Cookies** are the most common method used by web servers to know whether the user is logged in or not, and which account they are logged into.
-
- #### Cookie Implementation
- How are cookies implemented? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:28:04.276Z
card-last-score:: 1
- Cookies are ^^arbitrary pieces of data^^ (i.e., large, random strings), usually chosen and first sent by the web server, and stored on the client computer by the web browser.
- The browser then sends them back to the server with every request.
- Browsers are required to:
- support cookies as large as 4,096 bytes in size
- support at least 50 cookies per domain
- support at least 3,000 cookies in total
-
- #### Cookie Structure
- What are the components of a cookie? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:27:31.803Z
card-last-score:: 1
- A cookie consists of the following components:
- Name
- Value
- Zero or more attributes (name - value pairs). These attributes store information such as the cookie's expiration, domain, and flags (such as *Secure* and *HttpOnly*)
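- The components listed above can be seen by parsing a header value. This is a minimal sketch using Python's standard `http.cookies` module; the cookie name, value, and attribute values are made up for illustration and do not come from the lecture.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value (name, value, and attributes are invented).
raw = "sessionId=abc123; Domain=example.com; Path=/; Secure; HttpOnly"

jar = SimpleCookie()
jar.load(raw)

morsel = jar["sessionId"]
name = morsel.key        # the cookie's name
value = morsel.value     # the cookie's value
# Keep only the attributes that were actually set in the header.
attributes = {key: val for key, val in morsel.items() if val}

print(name, value, attributes)
```

- Attributes that were not present in the header (such as an expiration) stay empty in the parsed result, so they are filtered out here.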
-
- ### Session Cookies
- What is a **session cookie**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:44:48.596Z
card-last-reviewed:: 2022-10-03T11:44:48.597Z
card-last-score:: 3
- A **session cookie** (aka in-memory cookie, transient cookie, or non-persistent cookie) is a cookie that ^^exists only in temporary memory while the user navigates its website.^^
- Web browsers normally delete session cookies when the user closes the browser.
- Session cookies do not have an expiration date assigned to them, which is how the browser knows to treat them as session cookies.
-
-
- ### Persistent Cookies
- What is a **persistent cookie**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:29:49.936Z
card-last-reviewed:: 2022-10-03T14:29:49.936Z
card-last-score:: 5
- A **persistent cookie** is a cookie which ^^expires at a specific date or after a specific length of time.^^
- A persistent cookie's information will be transmitted to the server every time the user visits the website that the cookie belongs to, for the lifespan of the persistent cookie (as set by its creator), or every time that the user views a resource belonging to that website from another website (such as an advertisement).
-
- Persistent cookies are sometimes referred to as **tracking cookies** because they can be used by advertisers to record information about a user's web browsing habits.
- However, tracking cookies are mainly used for legitimate reasons, such as keeping users logged into their accounts on a website to avoid re-entering login credentials at every visit.
-
- ### Cookie Attributes
- Consider the following response header sent by a webserver that contains 3 persistent cookies:
- ![image.png](../assets/image_1662819462897_0.png)
- What do the *Domain* and *Path* attributes do? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:07:55.485Z
card-last-score:: 1
- The *Domain* and *Path* attributes define the cookie's scope.
- What does the *Secure* attribute do? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-01T19:10:41.850Z
card-last-reviewed:: 2022-10-04T12:10:41.851Z
card-last-score:: 5
- The *Secure* attribute ensures that the cookie can only be transmitted over an **encrypted connection**, making it a "**secure cookie**".
- What does the *HttpOnly* attribute do? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:21:36.210Z
card-last-reviewed:: 2022-10-01T13:21:36.211Z
card-last-score:: 3
- The *HttpOnly* attribute ^^directs browsers not to expose cookies through channels other than HTTP / HTTPS.^^
- This means that this HttpOnly cookie cannot be accessed via client-side scripting languages (notably JavaScript).
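- The effect of *Domain*, *Path*, *Secure*, and *HttpOnly* can be sketched as the checks a browser makes before sending or exposing a cookie. This is a simplified illustration, not a full implementation of the browser rules; the function names and the dictionary layout are assumptions chosen for this example.

```python
def domain_matches(host, cookie_domain):
    """True if the request host is the cookie's domain or a subdomain of it."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def path_matches(request_path, cookie_path):
    """True if the request path equals the cookie path or lies beneath it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Require a path-segment boundary so "/account" does not match "/accounting".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def should_send(cookie, scheme, host, path):
    """Decide whether the cookie accompanies a request (simplified)."""
    if cookie.get("secure") and scheme != "https":
        return False  # Secure cookies only travel over encrypted connections
    return domain_matches(host, cookie["domain"]) and path_matches(path, cookie["path"])


def visible_to_script(cookie):
    """HttpOnly cookies are hidden from client-side scripts."""
    return not cookie.get("httponly", False)


# Hypothetical cookie used only for illustration.
cookie = {"domain": "example.com", "path": "/account", "secure": True, "httponly": True}
print(should_send(cookie, "https", "www.example.com", "/account/settings"))
print(should_send(cookie, "http", "www.example.com", "/account/settings"))
print(visible_to_script(cookie))
```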
-
- ## GDPR & Cookies
- Generally, a user's consent must be sought before a cookie is installed in a web browser.
- There are **two** exemptions:
- The **Communications Exemption**
- The **Strictly Necessary Exemption**
-
- What is the **Communications Exemption**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:13:34.042Z
card-last-reviewed:: 2022-09-30T12:13:34.042Z
card-last-score:: 3
- The **Communications Exemption** applies to cookies ^^whose sole purpose is for carrying out the transmission of a communication over a network^^, for example, to identify the communication endpoints.
- Cookies that meet these criteria are exempted from being required to ask for the user's consent prior to installation.
- **Example:** load-balancing cookies that distribute network traffic across different backend servers, also known as **session stickiness**.
- Here, a **load-balancer** creates an affinity between a client and a specific network server for the duration of a session using a cookie with a random & unique tracking ID.
- Subsequently, the load-balancer routes all of the requests from this client to a specific backend server using the tracking ID, for the duration of the session.
- ![image.png](../assets/image_1662820187995_0.png){:height 426, :width 529}
-
-
- What is the **Strictly Necessary** exemption? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:15:51.772Z
card-last-reviewed:: 2022-09-30T12:15:51.773Z
card-last-score:: 5
- The **Strictly Necessary** exemption exempts cookies that are strictly necessary to provide a service delivered over the internet (i.e., a website or app) from the requirement to ask for the user's consent prior to installation.
- ^^This service must have been explicitly requested by the user (e.g., by typing in the URL), and the use of the cookie must be restricted to what is strictly necessary to provide that service.^^
- Cookies related to advertising are **not** strictly necessary, and must be consented to.
- Examples:
- A website uses session cookies to keep track of items that a user places in an online shopping basket (assuming that this cookie will be deleted once the session is over).
- Cookies that record a user's language or country preference.

View File

@ -0,0 +1,459 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous topic:** [[Introduction to Cybersecurity]]
- **Next Topic:** [[Introduction to Cryptography]]
- **Relevant lecture slides:** ![Lecture01.pdf](../assets/Lecture01_1662819128126_0.pdf)
-
- ## Motivation
- What are **Cyberattacks**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:25:05.425Z
card-last-reviewed:: 2022-10-01T13:25:05.425Z
card-last-score:: 5
- Cyberattacks are aimed at **accessing, changing, or destroying sensitive information**, extorting money, or interrupting normal business processes.
- Managing sensitive data may reduce the attack probability, or at least its impact.
- **GDPR** provides a regulatory framework for managing such data.
-
- ## General Data Protection Regulation
- What is **GDPR**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-25T02:35:29.600Z
card-last-reviewed:: 2022-10-03T14:35:29.600Z
card-last-score:: 5
- The **General Data Protection Regulation** is a binding regulation in EU law on data protection in the European Union and the European Economic Area (EEA).
- The primary aim of GDPR is to ^^enhance individuals' control & rights over their personal data and to simplify the regulatory environment for international business.^^
- The regulation contains ^^provisions & requirements related to the processing of personal data of individuals^^ who are located in the EEA, and applies to any enterprise that is processing the personal data of individuals inside the EEA - ^^regardless of its location and the data subjects' citizenship or residence.^^
- ### GDPR Overview
- The GDPR sets out several key principles: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:20:12.573Z
card-last-score:: 1
- Lawfulness
- Fairness & Transparency
- Purpose Limitation
- Data Minimisation
- Accuracy
- Storage Limitation
- Integrity & Confidentiality (Security)
- Accountability
- What is **Lawfulness** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:40.152Z
card-last-reviewed:: 2022-09-30T08:30:40.153Z
card-last-score:: 5
- You must identify ^^**valid grounds** under the GDPR (known as a "**lawful basis**")^^ for collecting & using personal data.
- Processing shall be lawful if and to the extent that at least one of the following applies:
- Consensual
- Necessary for the performance of a contract
- Necessary for compliance with a legal obligation
- Necessary to protect the vital interests of the data subject or another person
- Necessary for the performance of a task carried out in public interest
- Necessary for the purpose of legitimate interests
- What is **Fairness & Transparency** in GDPR? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-28T22:35:32.027Z
card-last-reviewed:: 2022-10-07T10:35:32.027Z
card-last-score:: 3
- You must ^^use personal data in a way that is fair.^^ This means that you must not process the data in a way that is unduly detrimental, unexpected, or misleading to the individuals concerned.
- You must be ^^clear, open, & honest^^ with data subjects from the start about how you will use their personal data.
- At the time personal data is being collected from data subjects, they must be informed via a "**Data Protection Notice**".
- What is a **Data Protection Notice**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:36:37.829Z
card-last-reviewed:: 2022-10-07T10:36:37.829Z
card-last-score:: 3
- A **Data Protection Notice** entails:
- The identity & contact details of the **data controller**
- The contact details of the **data protection officer**
- The **purpose of the processing** & the legal basis for the processing
- The recipients or categories of **recipients of the data**
- Details of any transfers out of the EEA, the safeguards in place, and the means by which to obtain a copy of them
- The **data retention** period or the criteria to determine the data retention period
- The **individual's rights** (access, rectification & erasure, restriction, complaint)
- What is **Purpose Limitation** in GDPR? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-02T15:18:23.577Z
card-last-reviewed:: 2022-10-07T10:18:23.578Z
card-last-score:: 5
- You must be ^^clear about what your purposes for processing^^ are from the start.
- You must ^^record your purposes^^ as part of your documentation obligations and specify them in your privacy information for individuals.
- You ^^can only use the personal data for a new purpose^^ if it is either compatible with your original purpose, you get **consent**, or you have a **clear basis in law**.
- What is **Data Minimisation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:35:23.953Z
card-last-reviewed:: 2022-10-01T17:35:23.953Z
card-last-score:: 5
- You must ensure that the personal data that you are processing is:
- **adequate** - sufficient to properly fulfil your stated purpose
- **relevant** - has a rational link to that purpose
- **limited** to what is necessary - you do not hold more than what you need for your stated purpose
- What is **Accuracy** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:16:08.483Z
card-last-reviewed:: 2022-09-30T12:16:08.484Z
card-last-score:: 5
- You should take all reasonable steps to ensure that the personal data you hold is ^^not incorrect or misleading^^ as to any matter of fact.
- You may need to ^^keep the personal data updated^^, although this will depend on what you are using it for.
- If you ^^discover that personal data is incorrect or misleading^^, you must take reasonable steps to correct or erase it as soon as possible.
- You must ^^carefully consider any challenges to the accuracy^^ of personal data.
- What is **Storage Limitation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:30.278Z
card-last-reviewed:: 2022-09-30T08:26:30.278Z
card-last-score:: 5
- You must not keep personal data for ^^longer than you need it^^.
- You need to think about - and be able to justify - ^^how long you keep personal data^^. This will depend on your purposes for holding the data.
- You need a policy ^^setting standard retention periods^^ wherever possible, to comply with documentation requirements.
- You should also ^^periodically review the data you hold^^, and erase or anonymise it when you no longer need it.
- You must ^^carefully consider any challenges to your retention of data^^.
- Individuals have a **right to erasure** if you no longer need the data.
- You can ^^keep personal data for longer^^ if you are only keeping it for ^^public interest archiving, scientific or historical research, or statistical purposes.^^
- What is **Accountability & Governance** in GDPR? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:40:56.633Z
card-last-reviewed:: 2022-10-03T11:40:56.633Z
card-last-score:: 5
- **Accountability** is one of the **data protection principles** - it makes you responsible for complying with the GDPR and says that ^^you must be able to demonstrate your compliance.^^
- You need to put in place appropriate technical & organisational measures to meet the requirements of accountability.
- Accountability requires controllers to maintain records of processing activities in order to demonstrate how they comply with the data protection principles, i.e.:
- Inventory of personal data
- Providing assurance of compliance
- Need to document
- Why it is held
- How it is collected
- When it will be deleted
- Who may gain access to it
- What is **Integrity & Confidentiality** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:04.604Z
card-last-reviewed:: 2022-10-07T10:19:04.605Z
card-last-score:: 5
- A key principle of GDPR is that you process personal data ^^securely by means of "appropriate technical & organisational measures"^^ - this is the "**security principle**".
- Doing this requires you to consider things like ^^risk analysis, organisational policies, and physical + technical measures.^^
- Where appropriate, you should look to use measures such as **pseudonymisation** and **encryption**.
- Your measures must ensure the ^^"confidentiality, integrity, & availability"^^ of your systems & services and the personal data you process with them.
- The measures must also enable you to ^^restore access & availability^^ to personal data in a timely manner in the event of a physical or technical incident.
- What is **Data Protection**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:27:56.517Z
card-last-score:: 1
- **Data Protection** is about an ^^individual's fundamental right to privacy.^^
- When an individual gives their personal data to any organisation, the recipient has the duty to keep the data both safe & private. This applies to both printed & electronic data.
- What does Data Protection Legislation do? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:07.775Z
card-last-reviewed:: 2022-10-03T11:42:07.775Z
card-last-score:: 3
- Data Protection Legislation:
- governs the way we deal with personal data / information
- provides a mechanism for safeguarding the privacy rights of individuals in relation to the processing of their data
- upholds rights and enforces obligations
- What is **Personal Data**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:44:35.417Z
card-last-reviewed:: 2022-10-03T11:44:35.417Z
card-last-score:: 5
- **Personal Data** is any information relating to an identified or ^^identifiable natural person^^ ("data subject").
- What is an **identifiable natural person**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T17:44:29.504Z
card-last-reviewed:: 2022-10-03T11:44:29.504Z
card-last-score:: 5
- An **identifiable natural person** is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier, or to one or more factors specific to the ^^physical, physiological, genetic, mental, economic, cultural, or social identity^^ of that natural person.
- What is **Data Processing**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:13:46.167Z
card-last-reviewed:: 2022-09-30T12:13:46.167Z
card-last-score:: 3
- **Data Processing** is ^^performing any operation on personal data^^, either manually or by automated means, including:
- Obtaining
- Storing
- Transmitting
- Recording
- Organising
- Altering
- Disclosing
- Erasing
- ### Entities in GDPR
- GDPR distinguishes between:
- The **Data Subject**
- The **Data Protection Officer (DPO)**
- The **Data Controller**
- The **Data Processor**
-
- What is the **Data Subject**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:28:03.265Z
card-last-reviewed:: 2022-10-03T14:28:03.266Z
card-last-score:: 5
- The **Data Subject** is the person to whom the data relates.
- GDPR only applies to living individuals, but any duty of confidence in place prior to the death extends beyond that point.
- In Ireland, the next of kin of the deceased are entitled to a Freedom of Information request to the deceased's personal data.
- What is the **DPO**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T22:48:55.229Z
card-last-score:: 1
- The primary role of the **Data Protection Officer (DPO)** is to ^^ensure that their organisation processes the personal data of its staff, customers, and other data subjects in compliance with the applicable data protection rules.^^
- The Data Protection officer is required to be an expert within this field, along with the requirement for them to report to the highest management level.
- With this being a challenging aspect of GDPR compliance for smaller organisations, there is the option to make an external appointment of a third-party DPO.
- When is the DPO a mandatory role? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:15:05.911Z
card-last-reviewed:: 2022-10-07T10:15:05.911Z
card-last-score:: 3
- The DPO is a mandatory role within 3 different scenarios:
- 1. When the processing is undertaken by a public authority or body.
- 2. When an organisation's main activities require the frequent & large-scale monitoring of individual people.
- 3. Where large-scale processing of special categories of data or data relating to criminal records forms the core activities.
- What is the **Data Controller**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:33:20.988Z
card-last-reviewed:: 2022-10-08T15:33:20.988Z
card-last-score:: 5
- The **Data Controller** is the company or an individual who ^^has overall control over the processing of personal data.^^
- The Data Controller takes on the responsibility for GDPR compliance.
- A Data Controller needs to have had sufficient training and to be able to competently ensure the security & protection of data held within the organisation.
- What is the **Data Processor**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:26:05.237Z
card-last-score:: 1
- The **Data Processor** is the person who is ^^responsible for the processing of personal information.^^
- Generally, this role is undertaken under the instruction of the **data controller**.
- This might mean obtaining or recording the data, its adaptation, and use. It may also include the disclosure of the data or making it available to others.
- Generally, the Data Processor is involved in the more technical elements of the operation, while the interpretation & main decision-making is the role of the Data Controller.
-
- ### Cloud Services & GDPR
- What makes a **Cloud Service Provider** a **Data Processor**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-18T05:00:00.250Z
card-last-reviewed:: 2022-10-08T23:00:00.251Z
card-last-score:: 5
- A **Cloud Service Provider** will be considered a **Data Processor** under GDPR if it provides **data processing services** (e.g., storage) on behalf of the **Data Controller** ^^even without determining the purposes & means of processing.^^
- A Cloud Service Provider that offers personal data processing services directly to Data Subjects will be considered a **Data Controller**.
- What are some key benefits of GDPR for Data Subjects? #card
collapsed:: true
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:45:06.983Z
card-last-reviewed:: 2022-10-03T11:45:06.983Z
card-last-score:: 3
- More information must be given to data subjects (e.g., how long the data will be kept, right to lodge a complaint).
- The Data Controller must explain & document the legal basis for processing the personal data.
- GDPR tightens the rules on how consent can be obtained.
- Must be distinguishable from other matters and in clear, plain language.
- It must be as easy to withdraw consent as it is to give it.
- Mandatory notification of security breaches without "undue delay" to the Data Protection Commissioner (within 72 hours).
- What are some key rights of Data Subjects? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T05:45:53.547Z
card-last-reviewed:: 2022-09-30T08:45:53.547Z
card-last-score:: 3
- Right of Access (copy to be provided within one month)
- Right to Erasure (the right to be forgotten)
- Right to Restriction of Processing
- Right to Object to Processing
- Right not to be subject to a decision based solely upon automated processing
- What are **Personal Data Security Breaches**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:28:54.471Z
card-last-reviewed:: 2022-09-30T08:28:54.471Z
card-last-score:: 5
- **Personal Data Security Breaches** include:
- Disclosure of confidential data to unauthorised individuals.
- Loss or theft of data or equipment upon which data is stored.
- Hacking, viruses, or other security attacks on IT equipment / systems / networks.
- Inappropriate access controls allowing unauthorised use of information.
- Emails containing personal data sent in error to the wrong recipient.
- Personal Data Security Breaches apply to both paper & electronic records.
-
- ## HTTP Cookies
- What is a **(HTTP) Cookie**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:31:55.799Z
card-last-reviewed:: 2022-10-07T10:31:55.800Z
card-last-score:: 3
- A **(HTTP) Cookie** is a small piece of data stored on the user's computer by the web browser while browsing a website.
- Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in the shopping cart in an online store) or to record the user's browsing activity.
- They can also be used to remember pieces of information that the user previously entered into form fields.
- **Authentication Cookies** are the most common method used by web servers to know whether the user is logged in or not, and which account they are logged into.
-
- #### Cookie Implementation
- How are cookies implemented? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:26:56.144Z
card-last-score:: 1
- Cookies are ^^arbitrary pieces of data^^ (i.e., large, random strings), usually chosen & first sent by the web server, and stored on the client computer by the web browser.
- The browser then sends them back to the server with every request.
- Browsers are required to: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:27:06.498Z
card-last-score:: 1
- support cookies as large as 4,096 bytes in size
- support at least 50 cookies per domain
- support at least 3,000 cookies in total
-
- #### Cookie Structure
- What are the components of a cookie? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:19:47.270Z
card-last-score:: 1
- A cookie consists of the following components:
- Name
- Value
- Zero or more attributes (name - value pairs). These attributes store information such as the cookie's expiration, domain, and flags (such as *Secure* and *HttpOnly*)
-
- ### Session Cookies
- What is a **session cookie**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:44:48.596Z
card-last-reviewed:: 2022-10-03T11:44:48.597Z
card-last-score:: 3
- A **session cookie** (aka in-memory cookie, transient cookie, or non-persistent cookie) is a cookie that ^^exists only in temporary memory while the user navigates its website.^^
- Web browsers normally delete session cookies when the user closes the browser.
- Session cookies do not have an expiration date assigned to them, which is how the browser knows to treat them as session cookies.
-
-
- ### Persistent Cookies
- What is a **persistent cookie**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:29:49.936Z
card-last-reviewed:: 2022-10-03T14:29:49.936Z
card-last-score:: 5
- A **persistent cookie** is a cookie which ^^expires at a specific date or after a specific length of time.^^
- A persistent cookie's information will be transmitted to the server every time the user visits the website that the cookie belongs to, for the lifespan of the persistent cookie (as set by its creator), or every time that the user views a resource belonging to that website from another website (such as an advertisement).
-
- Persistent cookies are sometimes referred to as **tracking cookies** because they can be used by advertisers to record information about a user's web browsing habits.
- However, tracking cookies are mainly used for legitimate reasons, such as keeping users logged into their accounts on a website to avoid re-entering login credentials at every visit.
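- The difference between a session cookie and a persistent cookie comes down to whether an expiration is attached. The sketch below builds both kinds with Python's standard `http.cookies` module; the helper name `build_set_cookie` and the cookie names and values are hypothetical, chosen only for this example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, max_age=None):
    """Return a Set-Cookie header value; max_age=None yields a session cookie."""
    jar = SimpleCookie()
    jar[name] = value
    if max_age is not None:
        jar[name]["max-age"] = max_age  # lifetime in seconds makes it persistent
    return jar[name].OutputString()


# No expiration: the browser treats this as a session cookie.
session_header = build_set_cookie("basket", "item42")
# One-hour lifetime: the browser keeps this across restarts until it expires.
persistent_header = build_set_cookie("lang", "en", max_age=3600)

print(session_header)
print(persistent_header)
```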
-
- ### Cookie Attributes
- Consider the following response header sent by a webserver that contains 3 persistent cookies:
- ![image.png](../assets/image_1662819462897_0.png)
- What do the *Domain* and *Path* attributes do? #card
card-last-interval:: 2.98
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T14:29:23.228Z
card-last-reviewed:: 2022-10-08T15:29:23.228Z
card-last-score:: 3
- The *Domain* and *Path* attributes define the cookie's scope.
- What does the *Secure* attribute do? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-01T19:10:41.850Z
card-last-reviewed:: 2022-10-04T12:10:41.851Z
card-last-score:: 5
- The *Secure* attribute ensures that the cookie can only be transmitted over an **encrypted connection**, making it a "**secure cookie**".
- What does the *HttpOnly* attribute do? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:21:36.210Z
card-last-reviewed:: 2022-10-01T13:21:36.211Z
card-last-score:: 3
- The *HttpOnly* attribute ^^directs browsers not to expose cookies through channels other than HTTP / HTTPS.^^
- This means that this HttpOnly cookie cannot be accessed via client-side scripting languages (notably JavaScript).
-
- ## GDPR & Cookies
- Generally, a user's consent must be sought before a cookie is installed in a web browser.
- There are **two** exemptions:
- The **Communications Exemption**
- The **Strictly Necessary Exemption**
-
- What is the **Communications Exemption**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:13:34.042Z
card-last-reviewed:: 2022-09-30T12:13:34.042Z
card-last-score:: 3
- The **Communications Exemption** applies to cookies ^^whose sole purpose is for carrying out the transmission of a communication over a network^^, for example, to identify the communication endpoints.
- Cookies that meet these criteria are exempted from being required to ask for the user's consent prior to installation.
- **Example:** load-balancing cookies that distribute network traffic across different backend servers, also known as **session stickiness**.
- Here, a **load-balancer** creates an affinity between a client and a specific network server for the duration of a session using a cookie with a random & unique tracking ID.
- Subsequently, the load-balancer routes all of the requests from this client to a specific backend server using the tracking ID, for the duration of the session.
- ![image.png](../assets/image_1662820187995_0.png){:height 426, :width 529}
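- Session stickiness as described above can be sketched in a few lines. This is an illustrative toy, not how any particular load-balancer product works; the class name, the round-robin assignment, and the backend names are assumptions made for this example.

```python
import secrets


class StickyLoadBalancer:
    """Toy load-balancer that pins each client to one backend via a tracking ID."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.affinity = {}   # tracking ID -> backend server
        self.next_index = 0  # round-robin position for new clients

    def route(self, tracking_id=None):
        """Return (tracking_id, backend) for a request.

        A request without a known tracking ID is assigned a backend and given
        a fresh random ID, which would be sent to the client as a cookie.
        """
        if tracking_id not in self.affinity:
            tracking_id = secrets.token_hex(16)  # random & unique ID
            backend = self.backends[self.next_index % len(self.backends)]
            self.next_index += 1
            self.affinity[tracking_id] = backend
        return tracking_id, self.affinity[tracking_id]


lb = StickyLoadBalancer(["backend-a", "backend-b"])
first_id, first_backend = lb.route()            # first request: no cookie yet
_, repeat_backend = lb.route(first_id)          # later request carries the cookie
second_id, second_backend = lb.route()          # a different client
print(first_backend, repeat_backend, second_backend)
```

- The tracking ID carries no information about the user; it only lets the load-balancer look up which backend already holds the session.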
-
-
- What is the **Strictly Necessary** exemption? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:15:51.772Z
card-last-reviewed:: 2022-09-30T12:15:51.773Z
card-last-score:: 5
- The **Strictly Necessary** exemption exempts cookies that are strictly necessary to provide a service delivered over the internet (i.e., a website or app) from the requirement to ask for the user's consent prior to installation.
- ^^This service must have been explicitly requested by the user (e.g., by typing in the URL), and the use of the cookie must be restricted to what is strictly necessary to provide that service.^^
- Cookies related to advertising are **not** strictly necessary, and must be consented to.
- Examples:
- A website uses session cookies to keep track of items that a user places in an online shopping basket (assuming that this cookie will be deleted once the session is over).
- Cookies that record a user's language or country preference.

View File

@ -0,0 +1,459 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous topic:** [[Introduction to Cybersecurity]]
- **Next Topic:** [[Introduction to Cryptography]]
- **Relevant lecture slides:** ![Lecture01.pdf](../assets/Lecture01_1662819128126_0.pdf)
-
- ## Motivation
- What are **Cyberattacks**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:25:05.425Z
card-last-reviewed:: 2022-10-01T13:25:05.425Z
card-last-score:: 5
- Cyberattacks are aimed at **accessing, changing, or destroying sensitive information**, extorting money, or interrupting normal business processes.
- Managing sensitive data may reduce the attack probability, or at least its impact.
- **GDPR** provides a regulatory framework for managing such data.
-
- ## General Data Protection Regulation
- What is **GDPR**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-25T02:35:29.600Z
card-last-reviewed:: 2022-10-03T14:35:29.600Z
card-last-score:: 5
- The **General Data Protection Regulation** is a binding regulation in EU law on data protection in the European Union and the European Economic Area (EEA).
- The primary aim of GDPR is to ^^enhance individuals' control & rights over their personal data and to simplify the regulatory environment for international business.^^
- The regulation contains ^^provisions & requirements related to the processing of personal data of individuals^^ who are located in the EEA, and applies to any enterprise that is processing the personal data of individuals inside the EEA - ^^regardless of its location and the data subjects' citizenship or residence.^^
- ### GDPR Overview
- The GDPR sets out several key principles: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:36:09.651Z
card-last-score:: 1
- Lawfulness
- Fairness & Transparency
- Purpose Limitation
- Data Minimisation
- Accuracy
- Storage Limitation
- Integrity & Confidentiality (Security)
- Accountability
- What is **Lawfulness** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:40.152Z
card-last-reviewed:: 2022-09-30T08:30:40.153Z
card-last-score:: 5
- You must identify ^^**valid grounds** under the GDPR (known as a "**lawful basis**")^^ for collecting & using personal data.
- Processing shall be lawful if and to the extent that at least one of the following applies:
- The data subject has given consent
- Necessary for the performance of a contract
- Necessary for compliance with a legal obligation
- Necessary to protect the vital interests of the data subject or another person
- Necessary for the performance of a task carried out in public interest
- Necessary for the purpose of legitimate interests
- What is **Fairness & Transparency** in GDPR? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-28T22:35:32.027Z
card-last-reviewed:: 2022-10-07T10:35:32.027Z
card-last-score:: 3
- You must ^^use personal data in a way that is fair.^^ This means that you must not process the data in a way that is unduly detrimental, unexpected, or misleading to the individuals concerned.
- You must be ^^clear, open, & honest^^ with data subjects from the start about how you will use their personal data.
- At the time personal data is being collected from data subjects, they must be informed via a "**Data Protection Notice**".
- What is a **Data Protection Notice**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:36:37.829Z
card-last-reviewed:: 2022-10-07T10:36:37.829Z
card-last-score:: 3
- A **Data Protection Notice** entails:
- The identity & contact details of the **data controller**
- The contact details of the **data protection officer**
- The **purpose of the processing** & the legal basis for the processing
- The recipients or categories of **recipients of the data**
- Details of any transfers out of the EEA, the safeguards in place, and the means by which to obtain a copy of them
- The **data retention** period or the criteria to determine the data retention period
- The **individual's rights** (access, rectification & erasure, restriction, complaint)
- What is **Purpose Limitation** in GDPR? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-02T15:18:23.577Z
card-last-reviewed:: 2022-10-07T10:18:23.578Z
card-last-score:: 5
- You must be ^^clear about what your purposes for processing^^ are from the start.
- You must ^^record your purposes^^ as part of your documentation obligations and specify them in your privacy information for individuals.
- You ^^can only use the personal data for a new purpose^^ if it is either compatible with your original purpose, you get **consent**, or you have a **clear basis in law**.
- What is **Data Minimisation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:35:23.953Z
card-last-reviewed:: 2022-10-01T17:35:23.953Z
card-last-score:: 5
- You must ensure that the personal data that you are processing is:
- **adequate** - sufficient to properly fulfil your stated purpose
- **relevant** - has a rational link to that purpose
- **limited** to what is necessary - you do not hold more than what you need for your stated purpose
- What is **Accuracy** in GDPR? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:16:08.483Z
card-last-reviewed:: 2022-09-30T12:16:08.484Z
card-last-score:: 5
- You should take all reasonable steps to ensure that the personal data you hold is ^^not incorrect or misleading^^ as to any matter of fact.
- You may need to ^^keep the personal data updated^^, although this will depend on what you are using it for.
- If you ^^discover that personal data is incorrect or misleading^^, you must take reasonable steps to correct or erase it as soon as possible.
- You must ^^carefully consider any challenges to the accuracy^^ of personal data.
- What is **Storage Limitation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:30.278Z
card-last-reviewed:: 2022-09-30T08:26:30.278Z
card-last-score:: 5
- You must not keep personal data for ^^longer than you need it^^.
- You need to think about - and be able to justify - ^^how long you keep personal data^^. This will depend on your purposes for holding the data.
- You need a policy ^^setting standard retention periods^^ wherever possible, to comply with documentation requirements.
- You should also ^^periodically review the data you hold^^, and erase or anonymise it when you no longer need it.
- You must ^^carefully consider any challenges to your retention of data^^.
- Individuals have a **right to erasure** if you no longer need the data.
- You can ^^keep personal data for longer^^ if you are only keeping it for ^^public interest archiving, scientific or historical research, or statistical purposes.^^
- What is **Accountability & Governance** in GDPR? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:40:56.633Z
card-last-reviewed:: 2022-10-03T11:40:56.633Z
card-last-score:: 5
- **Accountability** is one of the **data protection principles** - it makes you responsible for complying with the GDPR and says that ^^you must be able to demonstrate your compliance.^^
- You need to put in place appropriate technical & organisational measures to meet the requirements of accountability.
- Accountability requires controllers to maintain records of processing activities in order to demonstrate how they comply with the data protection principles, i.e.:
- Inventory of personal data
- Providing assurance of compliance
- Need to document
- Why it is held
- How it is collected
- When it will be deleted
- Who may gain access to it
- What is **Integrity & Confidentiality** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-16T15:43:04.896Z
card-last-reviewed:: 2022-10-19T08:43:04.897Z
card-last-score:: 5
- A key principle of GDPR is that you process personal data ^^securely by means of "appropriate technical & organisational measures"^^ - this is the "**security principle**".
- Doing this requires you to consider things like ^^risk analysis, organisational policies, and physical + technical measures.^^
- Where appropriate, you should look to use measures such as **pseudonymisation** and **encryption**.
- Your measures must ensure the ^^"confidentiality, integrity, & availability"^^ of your systems & services and the personal data you process with them.
- The measures must also enable you to ^^restore access & availability^^ to personal data in a timely manner in the event of a physical or technical incident.
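- As an illustration of the pseudonymisation measure mentioned above, here is a minimal sketch that replaces a direct identifier with a keyed hash. The function name `pseudonymise`, the sample key, and the choice of HMAC-SHA256 are assumptions made for this example; the notes do not prescribe any particular technique.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash (HMAC-SHA256).

    Assumption: the secret key is stored separately from the pseudonymised
    data, so the pseudonym cannot be linked back without access to the key.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key and identifiers, for illustration only.
key = b"example-key-kept-in-a-separate-system"
alice = pseudonymise("alice@example.com", key)
alice_again = pseudonymise("alice@example.com", key)
bob = pseudonymise("bob@example.com", key)
```

- The same identifier always maps to the same pseudonym, so records can still be joined, while different identifiers yield different pseudonyms.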
- What is **Data Protection**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:38:02.257Z
card-last-reviewed:: 2022-10-10T11:38:02.257Z
card-last-score:: 3
- **Data Protection** is about an ^^individual's fundamental right to privacy.^^
- When an individual gives their personal data to any organisation, the recipient has the duty to keep the data both safe & private. This applies to both printed & electronic data.
- What does Data Protection Legislation do? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:07.775Z
card-last-reviewed:: 2022-10-03T11:42:07.775Z
card-last-score:: 3
- Data Protection Legislation:
- governs the way we deal with personal data / information
- provides a mechanism for safeguarding the privacy rights of individuals in relation to the processing of their data
- upholds rights and enforces obligations
- What is **Personal Data**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:44:35.417Z
card-last-reviewed:: 2022-10-03T11:44:35.417Z
card-last-score:: 5
- **Personal Data** is any information relating to an identified or ^^identifiable natural person^^ ("data subject").
- What is an **identifiable natural person**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T17:44:29.504Z
card-last-reviewed:: 2022-10-03T11:44:29.504Z
card-last-score:: 5
- An **identifiable natural person** is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, or an online identifier, or to one or more factors specific to the ^^physical, physiological, genetic, mental, economic, cultural, or social identity^^ of that natural person.
- What is **Data Processing**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:43:37.835Z
card-last-reviewed:: 2022-10-10T11:43:37.836Z
card-last-score:: 5
- **Data Processing** is ^^performing any operation on personal data^^, either manually or by automated means, including:
- Obtaining
- Storing
- Transmitting
- Recording
- Organising
- Altering
- Disclosing
- Erasing
- ### Entities in GDPR
- GDPR distinguishes between:
- The **Data Subject**
- The **Data Protection Officer (DPO)**
- The **Data Controller**
- The **Data Processor**
-
- What is the **Data Subject**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:28:03.265Z
card-last-reviewed:: 2022-10-03T14:28:03.266Z
card-last-score:: 5
- The **Data Subject** is the person to whom the data relates.
- GDPR only applies to living individuals, but any duty of confidence in place prior to the death extends beyond that point.
- In Ireland, the next of kin of the deceased are entitled to make a Freedom of Information request for the deceased's personal data.
- What is the **DPO**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:39:55.084Z
card-last-reviewed:: 2022-10-10T11:39:55.085Z
card-last-score:: 3
- The primary role of the **Data Protection Officer (DPO)** is to ^^ensure that their organisation processes the personal data of its staff, customers, and other data subjects in compliance with the applicable data protection rules.^^
- The Data Protection Officer is required to be an expert in this field and to report to the highest management level.
- As this can be a challenging aspect of GDPR compliance for smaller organisations, there is the option of appointing an external, third-party DPO.
- When is the DPO a mandatory role? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:15:05.911Z
card-last-reviewed:: 2022-10-07T10:15:05.911Z
card-last-score:: 3
- The DPO is a mandatory role within 3 different scenarios:
- 1. When the processing is undertaken by a public authority or body.
- 2. When an organisation's main activities require the frequent & large-scale monitoring of individual people.
- 3. Where large-scale processing of special categories of data or data relating to criminal records forms the core activities.
- What is the **Data Controller**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:33:20.988Z
card-last-reviewed:: 2022-10-08T15:33:20.988Z
card-last-score:: 5
- The **Data Controller** is the company or an individual who ^^has overall control over the processing of personal data.^^
- The Data Controller takes on the responsibility for GDPR compliance.
- A Data Controller needs to have had sufficient training and to be able to competently ensure the security & protection of data held within the organisation.
- What is the **Data Processor**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:37:43.256Z
card-last-score:: 1
- The **Data Processor** is the person who is ^^responsible for the processing of personal information.^^
- Generally, this role is undertaken under the instruction of the **data controller**.
- This might mean obtaining or recording the data, its adaptation, and its use. It may also include the disclosure of the data or making it available to others.
- Generally, the Data Processor is involved in the more technical elements of the operation, while the interpretation & main decision-making is the role of the Data Controller.
-
- ### Cloud Services & GDPR
- What makes a **Cloud Service Provider** a **Data Processor**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-18T05:00:00.250Z
card-last-reviewed:: 2022-10-08T23:00:00.251Z
card-last-score:: 5
- A **Cloud Service Provider** will be considered a **Data Processor** under GDPR if it provides **data processing services** (e.g., storage) on behalf of the **Data Controller** ^^even without determining the purposes & means of processing.^^
- A Cloud Service Provider that offers personal data processing services directly to Data Subjects will be considered a **Data Controller**.
- What are some key benefits of GDPR for Data Subjects? #card
collapsed:: true
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:45:06.983Z
card-last-reviewed:: 2022-10-03T11:45:06.983Z
card-last-score:: 3
- More information must be given to data subjects (e.g., how long the data will be kept, right to lodge a complaint).
- The Data Controller must explain & document the legal basis for processing the personal data.
- GDPR tightens the rules on how consent can be obtained.
- The request for consent must be distinguishable from other matters and in clear, plain language.
- It must be as easy to withdraw consent as it is to give it.
- Mandatory notification of security breaches without "undue delay" to the Data Protection Commissioner (within 72 hours).
- What are some key rights of Data Subjects? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:42:55.024Z
card-last-reviewed:: 2022-10-10T11:42:55.025Z
card-last-score:: 3
- Right of Access (copy to be provided within one month)
- Right to Erasure (the right to be forgotten)
- Right to Restriction of Processing
- Right to Object to Processing
- Right not to be subject to a decision based solely upon automated processing
- What are **Personal Data Security Breaches**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:28:54.471Z
card-last-reviewed:: 2022-09-30T08:28:54.471Z
card-last-score:: 5
- **Personal Data Security Breaches** include:
- Disclosure of confidential data to unauthorised individuals.
- Loss or theft of data or equipment upon which data is stored.
- Hacking, viruses, or other security attacks on IT equipment / systems / networks.
- Inappropriate access controls allowing unauthorised use of information.
- Emails containing personal data sent in error to the wrong recipient.
- Personal Data Security Breaches apply to both paper & electronic records.
-
- ## HTTP Cookies
- What is a **(HTTP) Cookie**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:31:55.799Z
card-last-reviewed:: 2022-10-07T10:31:55.800Z
card-last-score:: 3
- A **(HTTP) Cookie** is a small piece of data stored on the user's computer by the web browser while browsing a website.
- Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in the shopping cart in an online store) or to record the user's browsing activity.
- They can also be used to remember pieces of information that the user previously entered into form fields.
- **Authentication Cookies** are the most common method used by web servers to know whether the user is logged in or not, and which account they are logged into.
-
- #### Cookie Implementation
- How are cookies implemented? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:37:51.259Z
card-last-score:: 1
- Cookies are ^^arbitrary pieces of data^^ (i.e., large, random strings), usually chosen & first sent by the web server, and stored on the client computer by the web browser.
- The browser then sends them back to the server with every request.
- Browsers are required to: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:33:35.161Z
card-last-score:: 1
- support cookies as large as 4,096 bytes in size
- support at least 50 cookies per domain
- support at least 3,000 cookies in total
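- The round trip described above (the server sends a cookie, the browser stores it and returns it with later requests) can be sketched as a toy cookie jar. `TinyCookieJar`, `MAX_COOKIE_BYTES`, and the sample cookies are hypothetical names for this example, and silently ignoring an oversized cookie is an assumption; real browsers handle scope, expiry, and limits in far more detail.

```python
from http.cookies import SimpleCookie

# Per-cookie size used in this sketch (assumption: measured on the raw header value).
MAX_COOKIE_BYTES = 4096


class TinyCookieJar:
    """Toy stand-in for a browser's cookie store (ignores scope and expiry)."""

    def __init__(self):
        self.store = {}

    def receive(self, set_cookie_value: str) -> None:
        """Store the cookie carried by one Set-Cookie header value."""
        if len(set_cookie_value.encode("utf-8")) > MAX_COOKIE_BYTES:
            return  # assumption: oversized cookies are simply ignored here
        parsed = SimpleCookie()
        parsed.load(set_cookie_value)
        for name, morsel in parsed.items():
            self.store[name] = morsel.value

    def cookie_header(self) -> str:
        """Build the Cookie request header value sent back to the server."""
        return "; ".join(f"{name}={value}" for name, value in self.store.items())


jar = TinyCookieJar()
jar.receive("sid=abc123; Path=/")
jar.receive("lang=en")
jar.receive("big=" + "x" * 5000)  # exceeds the size used in this sketch
```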
-
- #### Cookie Structure
- What are the components of a cookie? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:57.374Z
card-last-score:: 1
- A cookie consists of the following components:
- Name
- Value
- Zero or more attributes (name - value pairs). These attributes store information such as the cookie's expiration, domain, and flags (such as *Secure* and *HttpOnly*)
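- To make the name / value / attributes split concrete, here is a small sketch that parses one cookie with Python's standard `http.cookies` module. The header value is made up for illustration.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value, for illustration only.
header = "sessionId=abc123; Domain=example.com; Path=/; Secure; HttpOnly"

cookie = SimpleCookie()
cookie.load(header)

morsel = cookie["sessionId"]
name = morsel.key      # the cookie's name
value = morsel.value   # the cookie's value
# A Morsel holds every known attribute; keep only the ones that were actually set.
attributes = {key: val for key, val in morsel.items() if val}
```

- Note that `http.cookies` stores attribute names in lower case, and flag attributes such as *Secure* and *HttpOnly* are stored as `True`.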
-
- ### Session Cookies
- What is a **session cookie**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:44:48.596Z
card-last-reviewed:: 2022-10-03T11:44:48.597Z
card-last-score:: 3
- A **session cookie** (aka in-memory cookie, transient cookie, or non-persistent cookie) is a cookie that ^^exists only in temporary memory while the user navigates the website.^^
- Web browsers normally delete session cookies when the user closes the browser.
- Session cookies do not have an expiration date assigned to them, which is how the browser knows to treat them as session cookies.
-
-
- ### Persistent Cookies
- What is a **persistent cookie**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:29:49.936Z
card-last-reviewed:: 2022-10-03T14:29:49.936Z
card-last-score:: 5
- A **persistent cookie** is a cookie which ^^expires at a specific date or after a specific length of time.^^
- A persistent cookie's information will be transmitted to the server every time the user visits the website that the cookie belongs to, for the lifespan of the persistent cookie (as set by its creator), or every time that the user views a resource belonging to that website from another website (such as an advertisement).
-
- Persistent cookies are sometimes referred to as **tracking cookies** because they can be used by advertisers to record information about a user's web browsing habits.
- However, tracking cookies are mainly used for legitimate reasons, such as keeping users logged into their accounts on websites to avoid re-entering login credentials at every visit.
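- The distinction between session and persistent cookies comes down to whether an expiry is set. A minimal sketch, assuming the expiry is expressed through the standard *Expires* or *Max-Age* attributes; `is_persistent` and the sample headers are hypothetical.

```python
from http.cookies import SimpleCookie


def is_persistent(set_cookie_value: str) -> bool:
    """Return True if the cookie carries an Expires or Max-Age attribute.

    Assumption: the header value contains exactly one cookie.
    """
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    morsel = next(iter(cookie.values()))
    return bool(morsel["expires"] or morsel["max-age"])


# Hypothetical Set-Cookie header values.
session_like = is_persistent("sid=xyz; Path=/")
with_max_age = is_persistent("theme=dark; Max-Age=3600")
with_expires = is_persistent("lang=en; Expires=Wed, 09 Jun 2032 10:18:14 GMT")
```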
-
- ### Cookie Attributes
- Consider the following response header sent by a webserver that contains 3 persistent cookies:
- ![image.png](../assets/image_1662819462897_0.png)
- What do the *Domain* and *Path* attributes do? #card
card-last-interval:: 2.98
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T14:29:23.228Z
card-last-reviewed:: 2022-10-08T15:29:23.228Z
card-last-score:: 3
- The *Domain* and *Path* attributes define the cookie's scope.
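- A rough sketch of how a cookie's scope could be checked against a request. It is loosely modelled on the domain-matching and path-matching rules in RFC 6265, but simplified: it ignores IP addresses and cookies set without a *Domain* attribute. Both function names are made up for this example.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if the host equals the cookie's domain or is a subdomain of it."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if the request path equals the cookie path or lies beneath it."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end on a "/" boundary, so "/docs" does not cover "/docsets".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def in_scope(host: str, path: str, cookie_domain: str, cookie_path: str) -> bool:
    """A cookie is sent only when both the domain and the path match."""
    return domain_matches(host, cookie_domain) and path_matches(path, cookie_path)
```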
- What does the *Secure* attribute do? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-01T19:10:41.850Z
card-last-reviewed:: 2022-10-04T12:10:41.851Z
card-last-score:: 5
- The *Secure* attribute ensures that the cookie can only be transmitted over an **encrypted connection**, making it a "**secure cookie**".
- What does the *HttpOnly* attribute do? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:21:36.210Z
card-last-reviewed:: 2022-10-01T13:21:36.211Z
card-last-score:: 3
- The *HttpOnly* attribute ^^directs browsers not to expose the cookie through channels other than HTTP / HTTPS requests.^^
- This means that this HttpOnly cookie cannot be accessed via client-side scripting languages (notably JavaScript).
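- For reference, a minimal sketch of how a server-side script might emit a cookie with both flags set, using Python's standard `http.cookies` module. The cookie name and value are invented for illustration.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"           # hypothetical name and value
cookie["session_id"]["path"] = "/"
cookie["session_id"]["secure"] = True     # only send over encrypted connections
cookie["session_id"]["httponly"] = True   # hide from client-side scripts

# Render the header line a server would include in its response.
header_line = cookie.output()
```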
-
- ## GDPR & Cookies
- Generally, a user's consent must be sought before a cookie is installed in a web browser.
- There are **two** exemptions:
- The **Communications Exemption**
- The **Strictly Necessary Exemption**
-
- What is the **Communications Exemption**? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:43:32.762Z
card-last-reviewed:: 2022-10-10T11:43:32.763Z
card-last-score:: 3
- The **Communications Exemption** applies to cookies ^^whose sole purpose is for carrying out the transmission of a communication over a network^^, for example, to identify the communication endpoints.
- Cookies that meet these criteria are exempted from being required to ask for the user's consent prior to installation.
- **Example:** load-balancing cookies that distribute network traffic across different backend servers, also known as **session stickiness**.
- Here, a **load-balancer** creates an affinity between a client and a specific network server for the duration of a session using a cookie with a random & unique tracking ID.
- Subsequently, the load-balancer routes all of the requests from this client to a specific backend server using the tracking ID, for the duration of the session.
- ![image.png](../assets/image_1662820187995_0.png){:height 426, :width 529}
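- The session-stickiness behaviour described above can be sketched as follows. This is a toy model under stated assumptions: the class name, the cookie name `lb_id`, the backend names, and the round-robin assignment of new clients are all hypothetical, and a real load-balancer would also handle expiry and backend failures.

```python
import secrets

# Hypothetical backend pool.
BACKENDS = ["backend-a", "backend-b", "backend-c"]


class StickyLoadBalancer:
    """Pins each client to one backend via a random tracking ID cookie."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.affinity = {}    # tracking ID -> backend
        self.next_index = 0   # round-robin pointer for new clients

    def route(self, cookies: dict):
        """Return (backend, tracking_id) for a request carrying `cookies`."""
        tracking_id = cookies.get("lb_id")
        if tracking_id not in self.affinity:
            # New or unknown client: issue a random, unique tracking ID.
            tracking_id = secrets.token_hex(16)
            backend = self.backends[self.next_index % len(self.backends)]
            self.affinity[tracking_id] = backend
            self.next_index += 1
        return self.affinity[tracking_id], tracking_id


lb = StickyLoadBalancer(BACKENDS)
first_backend, issued_id = lb.route({})                     # first request, no cookie yet
second_backend, same_id = lb.route({"lb_id": issued_id})    # later request returns the cookie
other_backend, other_id = lb.route({})                      # a different client
```

- The tracking ID identifies only the communication endpoint, which is why such a cookie can fall under the Communications Exemption.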
-
-
- What is the **Strictly Necessary** exemption? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:15:51.772Z
card-last-reviewed:: 2022-09-30T12:15:51.773Z
card-last-score:: 5
- The **Strictly Necessary** exemption exempts cookies that are strictly necessary to provide a service delivered over the internet (i.e., a website or app) from the requirement to ask for the user's consent prior to installation.
- ^^This service must have been explicitly requested by the user (e.g., by typing in the URL), and the use of the cookie must be restricted to what is strictly necessary to provide that service.^^
- Cookies related to advertising are **not** strictly necessary, and must be consented to.
- Examples:
- A website uses session cookies to keep track of items that a user places in an online shopping basket (assuming that this cookie will be deleted once the session is over).
- Cookies that record a user's language or country preference.
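- The consent rule and its two exemptions could be encoded roughly as below. This is an illustrative sketch, not legal guidance: the purpose labels and the `requires_consent` helper are invented for the example and cover only the cases named in these notes.

```python
# Purposes taken from the examples in these notes; the labels are hypothetical.
COMMUNICATIONS_EXEMPT = {"load_balancing"}
STRICTLY_NECESSARY = {"shopping_basket", "language_preference"}


def requires_consent(purpose: str) -> bool:
    """Consent is needed unless the cookie's purpose falls under an exemption."""
    exempt = purpose in COMMUNICATIONS_EXEMPT or purpose in STRICTLY_NECESSARY
    return not exempt


advertising_needs_consent = requires_consent("advertising")
basket_needs_consent = requires_consent("shopping_basket")
load_balancing_needs_consent = requires_consent("load_balancing")
```

- Defaulting to "consent required" for any purpose not explicitly listed mirrors the general rule that consent must be sought unless an exemption applies.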

View File

@ -0,0 +1,459 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous topic:** [[Introduction to Cybersecurity]]
- **Next Topic:** [[Introduction to Cryptography]]
- **Relevant lecture slides:** ![Lecture01.pdf](../assets/Lecture01_1662819128126_0.pdf)
-
- ## Motivation
- What are **Cyberattacks**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:25:05.425Z
card-last-reviewed:: 2022-10-01T13:25:05.425Z
card-last-score:: 5
- Cyberattacks are aimed at **accessing, changing, or destroying sensitive information**, extorting money, or interrupting normal business processes.
- Managing sensitive data may reduce the attack probability, or at least its impact.
- **GDPR** provides a regulatory framework for managing such data.
-
- ## General Data Protection Regulation
- What is **GDPR**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-25T02:35:29.600Z
card-last-reviewed:: 2022-10-03T14:35:29.600Z
card-last-score:: 5
- The **General Data Protection Regulation** is a binding regulation in EU law on data protection in the European Union and the European Economic Area (EEA).
- The primary aim of GDPR is to ^^enhance individuals' control & rights over their personal data and to simplify the regulatory environment for international business.^^
- The regulation contains ^^provisions & requirements related to the processing of personal data of individuals^^ who are located in the EEA, and applies to any enterprise that is processing the personal data of individuals inside the EEA - ^^regardless of its location and the data subjects' citizenship or residence.^^
- ### GDPR Overview
- The GDPR sets out several key principles: #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-30T11:48:25.295Z
card-last-reviewed:: 2022-10-26T11:48:25.296Z
card-last-score:: 3
- Lawfulness
- Fairness & Transparency
- Purpose Limitation
- Data Minimisation
- Accuracy
- Storage Limitation
- Integrity & Confidentiality (Security)
- Accountability
- What is **Lawfulness** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:33:16.784Z
card-last-reviewed:: 2022-10-20T08:33:16.785Z
card-last-score:: 5
- You must identify ^^**valid grounds** under the GDPR (known as a "**lawful basis**")^^ for collecting & using personal data.
- Processing shall be lawful if and to the extent that at least one of the following applies:
- The data subject has given consent
- Necessary for the performance of a contract
- Necessary for compliance with a legal obligation
- Necessary to protect the vital interests of the data subject or another person
- Necessary for the performance of a task carried out in public interest
- Necessary for the purpose of legitimate interests
- What is **Fairness & Transparency** in GDPR? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-28T22:35:32.027Z
card-last-reviewed:: 2022-10-07T10:35:32.027Z
card-last-score:: 3
- You must ^^use personal data in a way that is fair.^^ This means that you must not process the data in a way that is unduly detrimental, unexpected, or misleading to the individuals concerned.
- You must be ^^clear, open, & honest^^ with data subjects from the start about how you will use their personal data.
- At the time personal data is being collected from data subjects, they must be informed via a "**Data Protection Notice**".
- What is a **Data Protection Notice**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:36:37.829Z
card-last-reviewed:: 2022-10-07T10:36:37.829Z
card-last-score:: 3
- A **Data Protection Notice** entails:
- The identity & contact details of the **data controller**
- The contact details of the **data protection officer**
- The **purpose of the processing** & the legal basis for the processing
- The recipients or categories of **recipients of the data**
- Details of any transfers out of the EEA, the safeguards in place, and the means by which to obtain a copy of them
- The **data retention** period or the criteria to determine the data retention period
- The **individual's rights** (access, rectification & erasure, restriction, complaint)
- What is **Purpose Limitation** in GDPR? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-02T15:18:23.577Z
card-last-reviewed:: 2022-10-07T10:18:23.578Z
card-last-score:: 5
- You must be ^^clear about what your purposes for processing^^ are from the start.
- You must ^^record your purposes^^ as part of your documentation obligations and specify them in your privacy information for individuals.
- You ^^can only use the personal data for a new purpose^^ if it is either compatible with your original purpose, you get **consent**, or you have a **clear basis in law**.
- What is **Data Minimisation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:35:23.953Z
card-last-reviewed:: 2022-10-01T17:35:23.953Z
card-last-score:: 5
- You must ensure that the personal data that you are processing is:
- **adequate** - sufficient to properly fulfil your stated purpose
- **relevant** - has a rational link to that purpose
- **limited** to what is necessary - you do not hold more than what you need for your stated purpose
- What is **Accuracy** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:35:02.681Z
card-last-reviewed:: 2022-10-20T08:35:02.682Z
card-last-score:: 5
- You should take all reasonable steps to ensure that the personal data you hold is ^^not incorrect or misleading^^ as to any matter of fact.
- You may need to ^^keep the personal data updated^^, although this will depend on what you are using it for.
- If you ^^discover that personal data is incorrect or misleading^^, you must take reasonable steps to correct or erase it as soon as possible.
- You must ^^carefully consider any challenges to the accuracy^^ of personal data.
- What is **Storage Limitation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:30.278Z
card-last-reviewed:: 2022-09-30T08:26:30.278Z
card-last-score:: 5
- You must not keep personal data for ^^longer than you need it^^.
- You need to think about - and be able to justify - ^^how long you keep personal data^^. This will depend on your purposes for holding the data.
- You need a policy ^^setting standard retention periods^^ wherever possible, to comply with documentation requirements.
- You should also ^^periodically review the data you hold^^, and erase or anonymise it when you no longer need it.
- You must ^^carefully consider any challenges to your retention of data^^.
- Individuals have a **right to erasure** if you no longer need the data.
- You can ^^keep personal data for longer^^ if you are only keeping it for ^^public interest archiving, scientific or historical research, or statistical purposes.^^
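- A minimal sketch of how standard retention periods and a periodic review could be expressed in code. The purposes, the periods, and the `Record` type are hypothetical and chosen only for illustration; real values come from an organisation's own retention policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods per processing purpose (illustrative only).
RETENTION = {
    "order_fulfilment": timedelta(days=365),
    "newsletter": timedelta(days=730),
}

@dataclass
class Record:
    purpose: str        # why the data is held
    collected_on: date  # when it was collected

def is_overdue(record: Record, today: date) -> bool:
    """Return True when the record has outlived the retention period for its purpose."""
    return today > record.collected_on + RETENTION[record.purpose]

# Periodic review: collect the records that should be erased or anonymised.
records = [
    Record("order_fulfilment", date(2021, 1, 10)),
    Record("newsletter", date(2022, 6, 1)),
]
overdue = [r for r in records if is_overdue(r, today=date(2022, 10, 1))]
```

- Keying the period on the purpose reflects the point that how long data may be kept depends on why it is held.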
- What is **Accountability & Governance** in GDPR? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:40:56.633Z
card-last-reviewed:: 2022-10-03T11:40:56.633Z
card-last-score:: 5
- **Accountability** is one of the **data protection principles** - it makes you responsible for complying with the GDPR and says that ^^you must be able to demonstrate your compliance.^^
- You need to put in place appropriate technical & organisational measures to meet the requirements of accountability.
- Accountability requires controllers to maintain records of processing activities in order to demonstrate how they comply with the data protection principles, i.e.:
- Inventory of personal data
- Providing assurance of compliance
- Need to document
- Why it is held
- How it is collected
- When it will be deleted
- Who may gain access to it
- What is **Integrity & Confidentiality** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-16T15:43:04.896Z
card-last-reviewed:: 2022-10-19T08:43:04.897Z
card-last-score:: 5
- A key principle of GDPR is that you process personal data ^^securely by means of "appropriate technical & organisational measures"^^ - this is the "**security principle**".
- Doing this requires you to consider things like ^^risk analysis, organisational policies, and physical + technical measures.^^
- Where appropriate, you should look to use measures such as **pseudonymisation** and **encryption**.
- Your measures must ensure the ^^"confidentiality, integrity, & availability"^^ of your systems & services and the personal data you process with them.
- The measures must also enable you to ^^restore access & availability^^ to personal data in a timely manner in the event of a physical or technical incident.
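- One common way to replace a direct identifier with a pseudonym is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library; the technique, the function name `pseudonymise`, and the placeholder key are assumptions for illustration, not something these notes prescribe.

```python
import hashlib
import hmac

def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Placeholder key: in practice it should be random and stored apart from the data.
key = b"example-key-kept-separately"

token_a = pseudonymise("alice@example.com", key)
token_b = pseudonymise("alice@example.com", key)  # same input, same pseudonym
token_c = pseudonymise("bob@example.com", key)    # different input, different pseudonym
```

- The same identifier always maps to the same pseudonym, so records can still be linked, while anyone without the key cannot recompute the mapping from a guessed identifier.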
- What is **Data Protection**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:38:02.257Z
card-last-reviewed:: 2022-10-10T11:38:02.257Z
card-last-score:: 3
- **Data Protection** is about an ^^individual's fundamental right to privacy.^^
- When an individual gives their personal data to any organisation, the recipient has the duty to keep the data both safe & private. This applies to both printed & electronic data.
- What does Data Protection Legislation do? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:07.775Z
card-last-reviewed:: 2022-10-03T11:42:07.775Z
card-last-score:: 3
- Data Protection Legislation:
- governs the way we deal with personal data / information
- provides a mechanism for safeguarding the privacy rights of individuals in relation to the processing of their data
- upholds rights and enforces obligations
- What is **Personal Data**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:44:35.417Z
card-last-reviewed:: 2022-10-03T11:44:35.417Z
card-last-score:: 5
- **Personal Data** is any information relating to an identified or ^^identifiable natural person^^ ("data subject").
- What is an **identifiable natural person**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T17:44:29.504Z
card-last-reviewed:: 2022-10-03T11:44:29.504Z
card-last-score:: 5
- An **identifiable natural person** is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier, or to one or more factors specific to the ^^physical, physiological, genetic, mental, economic, cultural, or social identity^^ of that natural person.
- What is **Data Processing**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:43:37.835Z
card-last-reviewed:: 2022-10-10T11:43:37.836Z
card-last-score:: 5
- **Data Processing** is ^^performing any operation on personal data^^, either manually or by automated means, including:
- Obtaining
- Storing
- Transmitting
- Recording
- Organising
- Altering
- Disclosing
- Erasing
- ### Entities in GDPR
- GDPR distinguishes between:
- The **Data Subject**
- The **Data Protection Officer (DPO)**
- The **Data Controller**
- The **Data Processor**
-
- What is the **Data Subject**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:28:03.265Z
card-last-reviewed:: 2022-10-03T14:28:03.266Z
card-last-score:: 5
- The **Data Subject** is the person to whom the data relates.
- GDPR only applies to living individuals, but any duty of confidence in place prior to the death extends beyond that point.
- In Ireland, the next of kin of the deceased are entitled to a Freedom of Information request to the deceased's personal data.
- What is the **DPO**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:39:55.084Z
card-last-reviewed:: 2022-10-10T11:39:55.085Z
card-last-score:: 3
- The primary role of the **Data Protection Officer (DPO)** is to ^^ensure that their organisation processes the personal data of its staff, customers, and other data subjects in compliance with the applicable data protection rules.^^
- The Data Protection Officer is required to be an expert in this field and must report to the highest management level.
- As this can be a challenging aspect of GDPR compliance for smaller organisations, there is the option to appoint an external, third-party DPO.
- When is the DPO a mandatory role? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:15:05.911Z
card-last-reviewed:: 2022-10-07T10:15:05.911Z
card-last-score:: 3
- The DPO is a mandatory role within 3 different scenarios:
- 1. When the processing is undertaken by a public authority or body.
- 2. When an organisation's main activities require the frequent & large-scale monitoring of individual people.
- 3. Where large-scale processing of special categories of data or data relating to criminal records forms the core activities.
- What is the **Data Controller**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:33:20.988Z
card-last-reviewed:: 2022-10-08T15:33:20.988Z
card-last-score:: 5
- The **Data Controller** is the company or an individual who ^^has overall control over the processing of personal data.^^
- The Data Controller takes on the responsibility for GDPR compliance.
- A Data Controller needs to have had sufficient training and to be able to competently ensure the security & protection of data held within the organisation.
- What is the **Data Processor**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:38:22.885Z
card-last-score:: 1
- The **Data Processor** is the person who is ^^responsible for the processing of personal information.^^
- Generally, this role is undertaken under the instruction of the **data controller**.
- This might mean obtaining or recording the data, its adaptation, and use. It may also include the disclosure of the data or making it available to others.
- Generally, the Data Processor is involved in the more technical elements of the operation, while the interpretation & main decision-making is the role of the Data Controller.
-
- ### Cloud Services & GDPR
- What makes a **Cloud Service Provider** a **Data Processor**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-18T05:00:00.250Z
card-last-reviewed:: 2022-10-08T23:00:00.251Z
card-last-score:: 5
- A **Cloud Service Provider** will be considered a **Data Processor** under GDPR if it provides **data processing services** (e.g., storage) on behalf of the **Data Controller** ^^even without determining the purposes & means of processing.^^
- A Cloud Service Provider that offers personal data processing services directly to Data Subjects will be considered a **Data Controller**.
- What are some key benefits of GDPR for Data Subjects? #card
collapsed:: true
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:45:06.983Z
card-last-reviewed:: 2022-10-03T11:45:06.983Z
card-last-score:: 3
- More information must be given to data subjects (e.g., how long the data will be kept, right to lodge a complaint).
- The Data Controller must explain & document the legal basis for processing the personal data.
- GDPR tightens the rules on how consent can be obtained.
- Must be distinguishable from other matters and in clear, plain language.
- It must be as easy to withdraw consent as it is to give it.
- Mandatory notification of security breaches without "undue delay" to the Data Protection Commissioner (within 72 hours).
- What are some key rights of Data Subjects? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:42:55.024Z
card-last-reviewed:: 2022-10-10T11:42:55.025Z
card-last-score:: 3
- Right of Access (copy to be provided within one month)
- Right to Erasure (the right to be forgotten)
- Right to Restriction of Processing
- Right to Object to Processing
- Right not to be subject to a decision based solely upon automated processing
- What are **Personal Data Security Breaches**? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:32:03.843Z
card-last-reviewed:: 2022-10-20T08:32:03.844Z
card-last-score:: 5
- **Personal Data Security Breaches** include:
- Disclosure of confidential data to unauthorised individuals.
- Loss or theft of data or equipment upon which data is stored.
- Hacking, viruses, or other security attacks on IT equipment / systems / networks.
- Inappropriate access controls allowing unauthorised use of information.
- Emails containing personal data sent in error to the wrong recipient.
- Personal Data Security Breaches apply to both paper & electronic records.
-
- ## HTTP Cookies
- What is a **(HTTP) Cookie**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:31:55.799Z
card-last-reviewed:: 2022-10-07T10:31:55.800Z
card-last-score:: 3
- A **(HTTP) Cookie** is a small piece of data stored on the user's computer by the web browser while browsing a website.
- Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in the shopping cart in an online store) or to record the user's browsing activity.
- They can also be used to remember pieces of information that the user previously entered into form fields.
- **Authentication Cookies** are the most common method used by web servers to know whether the user is logged in or not, and which account they are logged into.
-
- #### Cookie Implementation
- How are cookies implemented? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:37:51.259Z
card-last-score:: 1
- Cookies are ^^arbitrary pieces of data^^ (i.e., large, random strings), usually chosen & first sent by the web server, and stored on the client computer by the web browser.
- The browser then sends them back to the server with every request.
- Browsers are required to: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:33:35.161Z
card-last-score:: 1
- support cookies as large as 4,096 bytes in size
- support at least 50 cookies per domain
- support at least 3,000 cookies in total
-
- #### Cookie Structure
- What are the components of a cookie? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:57.374Z
card-last-score:: 1
- A cookie consists of the following components:
- Name
- Value
- Zero or more attributes (name-value pairs). These attributes store information such as the cookie's expiration, domain, and flags (such as *Secure* and *HttpOnly*).
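- The three components can be seen by parsing a cookie string with Python's standard `http.cookies` module. The cookie name `sessionId`, its value, and the domain are made-up examples.

```python
from http.cookies import SimpleCookie

# A hypothetical Set-Cookie value: name=value followed by attributes.
header_value = "sessionId=abc123; Domain=example.com; Path=/; Secure; HttpOnly"

cookie = SimpleCookie()
cookie.load(header_value)

morsel = cookie["sessionId"]   # one Morsel object per cookie
name = morsel.key              # the cookie's name
value = morsel.value           # the cookie's value
domain = morsel["domain"]      # attributes are read by lower-case key
path = morsel["path"]
secure = bool(morsel["secure"])       # flag attributes carry no value of their own
http_only = bool(morsel["httponly"])
```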
-
- ### Session Cookies
- What is a **session cookie**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:44:48.596Z
card-last-reviewed:: 2022-10-03T11:44:48.597Z
card-last-score:: 3
- A **session cookie** (aka in-memory cookie, transient cookie, or non-persistent cookie) is a cookie that ^^exists only in temporary memory while the user navigates the website.^^
- Web browsers normally delete session cookies when the user closes the browser.
- Session cookies do not have an expiration date assigned to them, which is how the browser knows to treat them as session cookies.
-
-
- ### Persistent Cookies
- What is a **persistent cookie**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:29:49.936Z
card-last-reviewed:: 2022-10-03T14:29:49.936Z
card-last-score:: 5
- A **persistent cookie** is a cookie which ^^expires at a specific date or after a specific length of time.^^
- A persistent cookie's information will be transmitted to the server every time the user visits the website that the cookie belongs to, for the lifespan of the persistent cookie (as set by its creator), or every time that the user views a resource belonging to that website from another website (such as an advertisement).
-
- Persistent cookies are sometimes referred to as **tracking cookies** because they can be used by advertisers to record information about a user's web browsing habits.
- However, tracking cookies are mainly used for legitimate reasons, such as keeping users logged into their accounts on websites to avoid re-entering login credentials at every visit.
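- The difference between session and persistent cookies comes down to whether an expiry is present. A rough sketch, assuming the expiry is given through either the `Expires` or the `Max-Age` attribute; the helper `is_persistent` and the example cookies are hypothetical.

```python
from http.cookies import SimpleCookie

def is_persistent(set_cookie_value: str) -> bool:
    """Return True if the cookie carries an Expires or Max-Age attribute."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    morsel = next(iter(cookie.values()))  # the single cookie in this string
    # Unset attributes are empty strings, so this is False for session cookies.
    return bool(morsel["expires"] or morsel["max-age"])

# No expiry: the browser treats it as a session cookie.
basket_is_persistent = is_persistent("basket=42; Path=/")

# Expiry given as a length of time (30 days in seconds).
lang_is_persistent = is_persistent("lang=en; Max-Age=2592000; Path=/")

# Expiry given as a specific date.
theme_is_persistent = is_persistent("theme=dark; Expires=Wed, 09 Jun 2021 10:18:14 GMT")
```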
-
- ### Cookie Attributes
- Consider the following response header sent by a webserver that contains 3 persistent cookies:
- ![image.png](../assets/image_1662819462897_0.png)
- What do the *Domain* and *Path* attributes do? #card
card-last-interval:: 2.98
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T14:29:23.228Z
card-last-reviewed:: 2022-10-08T15:29:23.228Z
card-last-score:: 3
- The *Domain* and *Path* attributes define the cookie's scope.
- What does the *Secure* attribute do? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-01T19:10:41.850Z
card-last-reviewed:: 2022-10-04T12:10:41.851Z
card-last-score:: 5
- The *Secure* attribute ensures that the cookie can only be transmitted over an **encrypted connection**, making it a "**secure cookie**".
- What does the *HttpOnly* attribute do? #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-11-12T18:37:21.841Z
card-last-reviewed:: 2022-10-20T08:37:21.841Z
card-last-score:: 5
- The *HttpOnly* attribute ^^directs browsers not to expose cookies through channels other than HTTP / HTTPS.^^
- This means that this HttpOnly cookie cannot be accessed via client-side scripting languages (notably JavaScript).
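- As an illustration of how a server might set these attributes, the sketch below builds a `Set-Cookie` header line with Python's standard `http.cookies` module. The cookie name, value, domain, and path are invented for the example.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["sessionId"] = "abc123"                  # hypothetical name and value
cookie["sessionId"]["domain"] = "example.com"   # scope: which hosts receive it
cookie["sessionId"]["path"] = "/account"        # scope: which URL paths receive it
cookie["sessionId"]["secure"] = True            # only sent over encrypted connections
cookie["sessionId"]["httponly"] = True          # hidden from client-side scripts

# Renders a single "Set-Cookie: ..." header line containing all attributes.
header_line = cookie.output()
```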
-
- ## GDPR & Cookies
- Generally, a user's consent must be sought before a cookie is installed in a web browser.
- There are **two** exemptions:
- The **Communications Exemption**
- The **Strictly Necessary Exemption**
-
- What is the **Communications Exemption**? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:43:32.762Z
card-last-reviewed:: 2022-10-10T11:43:32.763Z
card-last-score:: 3
- The **Communications Exemption** applies to cookies ^^whose sole purpose is for carrying out the transmission of a communication over a network^^, for example, to identify the communication endpoints.
- Cookies that meet these criteria are exempted from being required to ask for the user's consent prior to installation.
- **Example:** load-balancing cookies that distribute network traffic across different backend servers, also known as **session stickiness**.
- Here, a **load-balancer** creates an affinity between a client and a specific network server for the duration of a session using a cookie with a random & unique tracking ID.
- Subsequently, the load-balancer routes all of the requests from this client to a specific backend server using the tracking ID, for the duration of the session.
- ![image.png](../assets/image_1662820187995_0.png){:height 426, :width 529}
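- Session stickiness can be sketched as a lookup table from tracking ID to backend. The class `StickyLoadBalancer`, the backend names, and the round-robin choice for new clients are assumptions made for illustration; real load-balancers differ in how they pick a server and store the affinity.

```python
import secrets

BACKENDS = ["backend-a", "backend-b", "backend-c"]  # hypothetical server names

class StickyLoadBalancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self.affinity = {}    # tracking ID -> backend chosen for that client
        self.next_index = 0   # round-robin position for new clients

    def route(self, tracking_id=None):
        """Return (tracking_id, backend); a new ID is issued on the first request."""
        if tracking_id in self.affinity:
            # Known client: keep sending it to the same backend.
            return tracking_id, self.affinity[tracking_id]
        # New client: create a random, unique ID (this would be the cookie value).
        new_id = secrets.token_hex(16)
        backend = self.backends[self.next_index % len(self.backends)]
        self.next_index += 1
        self.affinity[new_id] = backend
        return new_id, backend

lb = StickyLoadBalancer(BACKENDS)
first_id, first_backend = lb.route()          # first request, no cookie yet
_, repeat_backend = lb.route(first_id)        # later request presents the cookie
second_id, second_backend = lb.route()        # a different client
```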
-
-
- What is the **Strictly Necessary** exemption? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:34:39.740Z
card-last-reviewed:: 2022-10-20T08:34:39.740Z
card-last-score:: 5
- The **Strictly Necessary** exemption applies to cookies that are strictly necessary to provide a service delivered over the internet (i.e., a website or app); such cookies are exempted from being required to ask for the user's consent prior to installation.
- ^^This service must have been explicitly requested by the user (e.g., by typing in the URL), and the use of the cookie must be restricted to what is strictly necessary to provide that service.^^
- Cookies related to advertising are **not** strictly necessary, and must be consented to.
- Examples:
- A website uses session cookies to keep track of items that a user places in an online shopping basket (assuming that this cookie will be deleted once the session is over).
- Cookies that record a user's language or country preference.
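- A rough sketch of how a site might decide whether to ask for consent before setting a cookie. The `CookiePurpose` categories and their mapping to the two exemptions simply mirror the examples in these notes; they are illustrative, not an authoritative classification.

```python
from enum import Enum

class CookiePurpose(Enum):
    LOAD_BALANCING = "load_balancing"
    SHOPPING_BASKET = "shopping_basket"
    LANGUAGE_PREFERENCE = "language_preference"
    ADVERTISING = "advertising"

# Purposes covered by each exemption, following the examples in the notes.
COMMUNICATIONS_EXEMPT = {CookiePurpose.LOAD_BALANCING}
STRICTLY_NECESSARY_EXEMPT = {
    CookiePurpose.SHOPPING_BASKET,
    CookiePurpose.LANGUAGE_PREFERENCE,
}

def needs_consent(purpose: CookiePurpose) -> bool:
    """Consent is needed unless the purpose falls under one of the two exemptions."""
    return purpose not in (COMMUNICATIONS_EXEMPT | STRICTLY_NECESSARY_EXEMPT)

advertising_needs_consent = needs_consent(CookiePurpose.ADVERTISING)
basket_needs_consent = needs_consent(CookiePurpose.SHOPPING_BASKET)
```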

View File

@ -0,0 +1,459 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous topic:** [[Introduction to Cybersecurity]]
- **Next Topic:** [[Introduction to Cryptography]]
- **Relevant lecture slides:** ![Lecture01.pdf](../assets/Lecture01_1662819128126_0.pdf)
-
- ## Motivation
- What are **Cyberattacks**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:25:05.425Z
card-last-reviewed:: 2022-10-01T13:25:05.425Z
card-last-score:: 5
- Cyberattacks are aimed at **accessing, changing, or destroying sensitive information**, extorting money, or interrupting normal business processes.
- Managing sensitive data may reduce the attack probability, or at least its impact.
- **GDPR** provides a regulatory framework for managing such data.
-
- ## General Data Protection Regulation
- What is **GDPR**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-25T02:35:29.600Z
card-last-reviewed:: 2022-10-03T14:35:29.600Z
card-last-score:: 5
- The **General Data Protection Regulation** is a binding regulation in EU law on data protection in the European Union and the European Economic Area (EEA).
- The primary aim of GDPR is to ^^enhance individuals' control & rights over their personal data and to simplify the regulatory environment for international business.^^
- The regulation contains ^^provisions & requirements related to the processing of personal data of individuals^^ who are located in the EEA, and applies to any enterprise that is processing the personal data of individuals inside the EEA - ^^regardless of its location and the data subjects' citizenship or residence.^^
- ### GDPR Overview
- The GDPR sets out several key principles: #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-30T11:48:25.295Z
card-last-reviewed:: 2022-10-26T11:48:25.296Z
card-last-score:: 3
- Lawfulness
- Fairness & Transparency
- Purpose Limitation
- Data Minimisation
- Accuracy
- Storage Limitation
- Integrity & Confidentiality (Security)
- Accountability
- What is **Lawfulness** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:33:16.784Z
card-last-reviewed:: 2022-10-20T08:33:16.785Z
card-last-score:: 5
- You must identify ^^**valid grounds** under the GDPR (known as a "**lawful basis**")^^ for collecting & using personal data.
- Processing shall be lawful if and to the extent that at least one of the following applies:
- Consensual
- Necessary for the performance of a contract
- Necessary for compliance with a legal obligation
- Necessary to protect the vital interests of the data subject or another person
- Necessary for the performance of a task carried out in public interest
- Necessary for the purpose of legitimate interests
- What is **Fairness & Transparency** in GDPR? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-28T22:35:32.027Z
card-last-reviewed:: 2022-10-07T10:35:32.027Z
card-last-score:: 3
- You must ^^use personal data in a way that is fair.^^ This means that you must not process the data in a way that is unduly detrimental, unexpected, or misleading to the individuals concerned.
- You must be ^^clear, open, & honest^^ with data subjects from the start about how you will use their personal data.
- At the time personal data is being collected from data subjects, they must be informed via a "**Data Protection Notice**".
- What is a **Data Protection Notice**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:36:37.829Z
card-last-reviewed:: 2022-10-07T10:36:37.829Z
card-last-score:: 3
- A **Data Protection Notice** entails:
- The identity & contact details of the **data controller**
- The contact details of the **data protection officer**
- The **purpose of the processing** & the legal basis for the processing
- The recipients or categories of **recipients of the data**
- Details of any transfers out of the EEA, the safeguards in place, and the means by which to obtain a copy of them
- The **data retention** period or the criteria to determine the data retention period
- The **individual's rights** (access, rectification & erasure, restriction, complaint)
- What is **Purpose Limitation** in GDPR? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-02T15:18:23.577Z
card-last-reviewed:: 2022-10-07T10:18:23.578Z
card-last-score:: 5
- You must be ^^clear about what your purposes for processing^^ are from the start.
- You must ^^record your purposes^^ as part of your documentation obligations and specify them in your privacy information for individuals.
- You ^^can only use the personal data for a new purpose^^ if it is either compatible with your original purpose, you get **consent**, or you have a **clear basis in law**.
- What is **Data Minimisation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:35:23.953Z
card-last-reviewed:: 2022-10-01T17:35:23.953Z
card-last-score:: 5
- You must ensure that the personal data that you are processing is:
- **adequate** - sufficient to properly fulfil your stated purpose
- **relevant** - has a rational link to that purpose
- **limited** to what is necessary - you do not hold more than what you need for your stated purpose
- What is **Accuracy** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:35:02.681Z
card-last-reviewed:: 2022-10-20T08:35:02.682Z
card-last-score:: 5
- You should take all reasonable steps to ensure that the personal data you hold is ^^not incorrect or misleading^^ as to any matter of fact.
- You may need to ^^keep the personal data updated^^, although this will depend on what you are using it for.
- If you ^^discover that personal data is incorrect or misleading^^, you must take reasonable steps to correct or erase it as soon as possible.
- You must ^^carefully consider any challenges to the accuracy^^ of personal data.
- What is **Storage Limitation** in GDPR? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:30.278Z
card-last-reviewed:: 2022-09-30T08:26:30.278Z
card-last-score:: 5
- You must not keep personal data for ^^longer than you need it^^.
- You need to think about - and be able to justify - ^^how long you keep personal data^^. This will depend on your purposes for holding the data.
- You need a policy ^^setting standard retention periods^^ wherever possible, to comply with documentation requirements.
- You should also ^^periodically review the data you hold^^, and erase or anonymise it when you no longer need it.
- You must ^^carefully consider any challenges to your retention of data^^.
- Individuals have a **right to erasure** if you no longer need the data.
- You can ^^keep personal data for longer^^ if you are only keeping it for ^^public interest archiving, scientific or historical research, or statistical purposes.^^
- What is **Accountability & Governance** in GDPR? #card
card-last-interval:: 10.56
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-14T00:40:56.633Z
card-last-reviewed:: 2022-10-03T11:40:56.633Z
card-last-score:: 5
- **Accountability** is one of the **data protection principles** - it makes you responsible for complying with the GDPR and says that ^^you must be able to demonstrate your compliance.^^
- You need to put in place appropriate technical & organisational measures to meet the requirements of accountability.
- Accountability requires controllers to maintain records of processing activities in order to demonstrate how they comply with the data protection principles, i.e.:
- Inventory of personal data
- Providing assurance of compliance
- Need to document
- Why it is held
- How it is collected
- When it will be deleted
- Who may gain access to it
- What is **Integrity & Confidentiality** in GDPR? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-16T15:43:04.896Z
card-last-reviewed:: 2022-10-19T08:43:04.897Z
card-last-score:: 5
- A key principle of GDPR is that you process personal data ^^securely by means of "appropriate technical & organisational measures"^^ - this is the "**security principle**".
- Doing this requires you to consider things like ^^risk analysis, organisational policies, and physical + technical measures.^^
- Where appropriate, you should look to use measures such as **pseudonymisation** and **encryption**.
- Your measures must ensure the ^^"confidentiality, integrity, & availability"^^ of your systems & services and the personal data you process with them.
- The measures must also enable you to ^^restore access & availability^^ to personal data in a timely manner in the event of a physical or technical incident.
- What is **Data Protection**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:38:02.257Z
card-last-reviewed:: 2022-10-10T11:38:02.257Z
card-last-score:: 3
- **Data Protection** is about an ^^individual's fundamental right to privacy.^^
- When an individual gives their personal data to any organisation, the recipient has the duty to keep the data both safe & private. This applies to both printed & electronic data.
- What does Data Protection Legislation do? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:07.775Z
card-last-reviewed:: 2022-10-03T11:42:07.775Z
card-last-score:: 3
- Data Protection Legislation:
- governs the way we deal with personal data / information
- provides a mechanism for safeguarding the privacy rights of individuals in relation to the processing of their data
- upholds rights and enforces obligations
- What is **Personal Data**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:44:35.417Z
card-last-reviewed:: 2022-10-03T11:44:35.417Z
card-last-score:: 5
- **Personal Data** is any information relating to an identified or ^^identifiable natural person^^ ("data subject").
- What is an **identifiable natural person**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T17:44:29.504Z
card-last-reviewed:: 2022-10-03T11:44:29.504Z
card-last-score:: 5
- An **identifiable natural person** is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier, or to one or more factors specific to the ^^physical, physiological, genetic, mental, economic, cultural, or social identity^^ of that natural person.
- What is **Data Processing**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:43:37.835Z
card-last-reviewed:: 2022-10-10T11:43:37.836Z
card-last-score:: 5
- **Data Processing** is ^^performing any operation on personal data^^, either manually or by automated means, including:
- Obtaining
- Storing
- Transmitting
- Recording
- Organising
- Altering
- Disclosing
- Erasing
- ### Entities in GDPR
- GDPR distinguishes between:
- The **Data Subject**
- The **Data Protection Officer (DPO)**
- The **Data Controller**
- The **Data Processor**
-
- What is the **Data Subject**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:28:03.265Z
card-last-reviewed:: 2022-10-03T14:28:03.266Z
card-last-score:: 5
- The **Data Subject** is the person to whom the data relates.
- GDPR only applies to living individuals, but any duty of confidence in place prior to the death extends beyond that point.
- In Ireland, the next of kin of the deceased are entitled to a Freedom of Information request to the deceased's personal data.
- What is the **DPO**? #card
card-last-interval:: 3.09
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-13T13:39:55.084Z
card-last-reviewed:: 2022-10-10T11:39:55.085Z
card-last-score:: 3
- The primary role of the **Data Protection Officer (DPO)** is to ^^ensure that their organisation processes the personal data of its staff, customers, and other data subjects in compliance with the applicable data protection rules.^^
- The Data Protection Officer is required to be an expert in this field and must report to the highest management level.
- As this can be a challenging aspect of GDPR compliance for smaller organisations, there is the option to appoint an external, third-party DPO.
- When is the DPO a mandatory role? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:15:05.911Z
card-last-reviewed:: 2022-10-07T10:15:05.911Z
card-last-score:: 3
- The DPO is a mandatory role within 3 different scenarios:
- 1. When the processing is undertaken by a public authority or body.
- 2. When an organisation's main activities require the frequent & large-scale monitoring of individual people.
- 3. Where large-scale processing of special categories of data or data relating to criminal records forms the core activities.
- What is the **Data Controller**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:33:20.988Z
card-last-reviewed:: 2022-10-08T15:33:20.988Z
card-last-score:: 5
- The **Data Controller** is the company or an individual who ^^has overall control over the processing of personal data.^^
- The Data Controller takes on the responsibility for GDPR compliance.
- A Data Controller needs to have had sufficient training and to be able to competently ensure the security & protection of data held within the organisation.
- What is the **Data Processor**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:38:22.885Z
card-last-score:: 1
- The **Data Processor** is the person who is ^^responsible for the processing of personal information.^^
- Generally, this role is undertaken under the instruction of the **data controller**.
- This might mean obtaining or recording the data, its adaptation, and use. It may also include the disclosure of the data or making it available to others.
- Generally, the Data Processor is involved in the more technical elements of the operation, while the interpretation & main decision-making is the role of the Data Controller.
-
- ### Cloud Services & GDPR
- What makes a **Cloud Service Provider** a **Data Processor**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-18T05:00:00.250Z
card-last-reviewed:: 2022-10-08T23:00:00.251Z
card-last-score:: 5
- A **Cloud Service Provider** will be considered a **Data Processor** under GDPR if it provides **data processing services** (e.g., storage) on behalf of the **Data Controller** ^^even without determining the purposes & means of processing.^^
- A Cloud Service Provider that offers personal data processing services directly to Data Subjects will be considered a **Data Controller**.
- What are some key benefits of GDPR for Data Subjects? #card
collapsed:: true
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:45:06.983Z
card-last-reviewed:: 2022-10-03T11:45:06.983Z
card-last-score:: 3
- More information must be given to data subjects (e.g., how long the data will be kept, right to lodge a complaint).
- The Data Controller must explain & document the legal basis for processing the personal data.
- GDPR tightens the rules on how consent can be obtained.
- Must be distinguishable from other matters and in clear, plain language.
- It must be as easy to withdraw consent as it is to give it.
- Mandatory notification of security breaches without "undue delay" to the Data Protection Commissioner (within 72 hours).
- What are some key rights of Data Subjects? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:42:55.024Z
card-last-reviewed:: 2022-10-10T11:42:55.025Z
card-last-score:: 3
- Right of Access (copy to be provided within one month)
- Right to Erasure (the right to be forgotten)
- Right to Restriction of Processing
- Right to Object to Processing
- Right not to be subject to a decision based solely upon automated processing
- What are **Personal Data Security Breaches**? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:32:03.843Z
card-last-reviewed:: 2022-10-20T08:32:03.844Z
card-last-score:: 5
- **Personal Data Security Breaches** include:
- Disclosure of confidential data to unauthorised individuals.
- Loss or theft of data or equipment upon which data is stored.
- Hacking, viruses, or other security attacks on IT equipment / systems / networks.
- Inappropriate access controls allowing unauthorised use of information.
- Emails containing personal data sent in error to the wrong recipient.
- Personal Data Security Breaches apply to both paper & electronic records.
-
- ## HTTP Cookies
- What is a **(HTTP) Cookie**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:31:55.799Z
card-last-reviewed:: 2022-10-07T10:31:55.800Z
card-last-score:: 3
- A **(HTTP) Cookie** is a small piece of data stored on the user's computer by the web browser while browsing a website.
- Cookies were designed to be a reliable mechanism for websites to remember stateful information (such as items in the shopping cart in an online store) or to record the user's browsing activity.
- They can also be used to remember pieces of information that the user previously entered into form fields.
- **Authentication Cookies** are the most common method used by web servers to know whether the user is logged in or not, and which account they are logged into.
-
- #### Cookie Implementation
- How are cookies implemented? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:37:51.259Z
card-last-score:: 1
- Cookies are ^^arbitrary pieces of data^^ (i.e., large, random strings), usually chosen & first sent by the web server, and stored on the client computer by the web browser.
- The browser then sends them back to the server with every request.
- Browsers are required to: #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:33:35.161Z
card-last-score:: 1
- support cookies as large as 4,096 bytes in size
- support at least 50 cookies per domain
- support at least 3,000 cookies in total
-
- #### Cookie Structure
- What are the components of a cookie? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:35:57.374Z
card-last-score:: 1
- A cookie consists of the following components:
- Name
- Value
- Zero or more attributes (name - value pairs). These attributes store information such as the cookie's expiration, domain, and flags (such as *Secure* and *HttpOnly*)
-
- ### Session Cookies
- What is a **session cookie**? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:44:48.596Z
card-last-reviewed:: 2022-10-03T11:44:48.597Z
card-last-score:: 3
- A **session cookie** (aka in-memory cookie, transient cookie, or non-persistent cookie) is a cookie that ^^exists only in temporary memory while the user navigates the website.^^
- Web browsers normally delete session cookies when the user closes the browser.
- Session cookies do not have an expiration date assigned to them, which is how the browser knows to treat them as session cookies.
-
-
- ### Persistent Cookies
- What is a **persistent cookie**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T20:29:49.936Z
card-last-reviewed:: 2022-10-03T14:29:49.936Z
card-last-score:: 5
- A **persistent cookie** is a cookie which ^^expires at a specific date or after a specific length of time.^^
- A persistent cookie's information will be transmitted to the server every time the user visits the website that the cookie belongs to, for the lifespan of the persistent cookie (as set by its creator), or every time that the user views a resource belonging to that website from another website (such as an advertisement).
-
- Persistent cookies are sometimes referred to as **tracking cookies** because they can be used by advertisers to record information about a user's web browsing habits.
- However, tracking cookies are mainly used for legitimate reasons, such as keeping users logged into their accounts on a website so that they do not have to re-enter login credentials at every visit.
-
- ### Cookie Attributes
- Consider the following response header sent by a webserver that contains 3 persistent cookies:
- ![image.png](../assets/image_1662819462897_0.png)
- What do the *Domain* and *Path* attributes do? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-11-18T18:45:59.521Z
card-last-reviewed:: 2022-11-09T12:45:59.521Z
card-last-score:: 3
- The *Domain* and *Path* attributes define the cookie's scope.
- What does the *Secure* attribute do? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-01T19:10:41.850Z
card-last-reviewed:: 2022-10-04T12:10:41.851Z
card-last-score:: 5
- The *Secure* attribute ensures that the cookie can only be transmitted over an **encrypted connection**, making it a "**secure cookie**".
- What does the *HttpOnly* attribute do? #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-11-12T18:37:21.841Z
card-last-reviewed:: 2022-10-20T08:37:21.841Z
card-last-score:: 5
- The *HttpOnly* attribute ^^directs browsers not to expose cookies through channels other than HTTP / HTTPS.^^
- This means that this HttpOnly cookie cannot be accessed via client-side scripting languages (notably JavaScript).
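- As a minimal sketch of the cookie types and attributes described above, the following uses Python's standard `http.cookies` module to build two `Set-Cookie` header values: one session cookie and one persistent cookie. The cookie names (`session_id`, `lang`), their values, and the domain `example.com` are illustrative assumptions and are not taken from the response header pictured above.

```python
from http.cookies import SimpleCookie


def build_set_cookie_headers() -> list[str]:
    """Return example Set-Cookie header values (all names and values are hypothetical)."""
    cookies = SimpleCookie()

    # Session cookie: no Expires / Max-Age attribute, so the browser
    # is expected to discard it when the browsing session ends.
    cookies["session_id"] = "abc123"
    cookies["session_id"]["path"] = "/"
    cookies["session_id"]["secure"] = True    # only sent over encrypted connections
    cookies["session_id"]["httponly"] = True  # not exposed to client-side scripts

    # Persistent cookie: Max-Age gives it a lifetime of 30 days (in seconds).
    cookies["lang"] = "en"
    cookies["lang"]["domain"] = "example.com"  # Domain + Path define the scope
    cookies["lang"]["path"] = "/"
    cookies["lang"]["max-age"] = 30 * 24 * 60 * 60

    return [morsel.OutputString() for morsel in cookies.values()]


for header_value in build_set_cookie_headers():
    print("Set-Cookie:", header_value)
```

- The only difference between the two cookies' lifetimes is the presence of `Max-Age`: leaving it (and `Expires`) out is what makes the first one a session cookie.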
-
- ## GDPR & Cookies
- Generally, a user's consent must be sought before a cookie is installed in a web browser.
- There are **two** exemptions:
- The **Communications Exemption**
- The **Strictly Necessary Exemption**
-
- What is the **Communications Exemption**? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:43:32.762Z
card-last-reviewed:: 2022-10-10T11:43:32.763Z
card-last-score:: 3
- The **Communications Exemption** applies to cookies ^^whose sole purpose is for carrying out the transmission of a communication over a network^^, for example, to identify the communication endpoints.
- Cookies that meet these criteria are exempted from being required to ask for the user's consent prior to installation.
- **Example:** load-balancing cookies that distribute network traffic across different backend servers, also known as **session stickiness**.
- Here, a **load-balancer** creates an affinity between a client and a specific network server for the duration of a session using a cookie with a random & unique tracking ID.
- Subsequently, the load-balancer routes all of the requests from this client to a specific backend server using the tracking ID, for the duration of the session.
- ![image.png](../assets/image_1662820187995_0.png){:height 426, :width 529}
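- The session stickiness described above can be sketched roughly as follows. This is a simplified illustration, not how any particular load-balancer is implemented: the class name `StickyLoadBalancer`, the round-robin choice of backend, and the backend names are all assumptions, and a real system would also handle session expiry and backend failures.

```python
import itertools
import secrets


class StickyLoadBalancer:
    """Toy load-balancer that pins each client to one backend via a tracking ID."""

    def __init__(self, backends):
        self._next_backend = itertools.cycle(backends)  # round-robin for new clients
        self._affinity = {}                             # tracking ID -> backend

    def route(self, tracking_id=None):
        """Return (tracking_id, backend); the ID would be sent back as a cookie."""
        if tracking_id not in self._affinity:
            # Unknown or missing cookie: issue a random, unique tracking ID
            # and create the affinity to the next backend in turn.
            tracking_id = secrets.token_hex(16)
            self._affinity[tracking_id] = next(self._next_backend)
        return tracking_id, self._affinity[tracking_id]


balancer = StickyLoadBalancer(["backend-1", "backend-2"])
cookie_value, first_backend = balancer.route()           # first request, no cookie yet
_, second_backend = balancer.route(cookie_value)         # later request with the cookie
print(first_backend == second_backend)
```

- The cookie here carries nothing except the routing identifier, which is why it fits the description of a cookie whose sole purpose is transmitting the communication.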
-
-
- What is the **Strictly Necessary** exemption? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-17T15:34:39.740Z
card-last-reviewed:: 2022-10-20T08:34:39.740Z
card-last-score:: 5
- The **Strictly Necessary** exemption exempts cookies that are strictly necessary to provide a service delivered over the internet (i.e., a website or app) from the requirement to ask for the user's consent prior to installation.
- ^^This service must have been explicitly requested by the user (i.e., typing in the URL), and the use of the cookie must be restricted to what is strictly necessary to provide that service.^^
- Cookies related to advertising are **not** strictly necessary, and must be consented to.
- Examples:
- A website uses session cookies to keep track of items that a user places in an online shopping basket (assuming that this cookie will be deleted once the session is over).
- Cookies that record a user's language or country preference.

View File

@ -0,0 +1,87 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[Introduction to Cryptography]]
- **Next Topic:** null
- **Relevant Slides:** ![ct255_03.pdf](../assets/ct255_03_1664798420872_0.pdf)
-
- What is a **password**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-07T14:27:05.787Z
card-last-reviewed:: 2022-10-03T14:27:05.787Z
card-last-score:: 5
- A **password** is a memorised secret used to confirm the identity of a user.
- Typically, an arbitrary string of characters including letters, digits, or other symbols.
- A purely numeric secret is called a **Personal Identification Number (PIN)**.
- The secret is memorised by a party called the **claimant** while the party verifying the identity of the claimant is called the **verifier**.
- The claimant & the verifier communicate via an **authentication protocol**.
- # Some Password Alternatives
- One-Time Password (OTP).
- Transaction Authentication Number (TAN) list used for online banking - they can only be used once.
- Time-synchronised one-time passwords.
- Biometric methods.
- Fingerprints, irises, voice, face.
- Cognitive passwords.
- Use question & answer cue/response pairs to verify identity.
-
- # Algorithmic Generation of OTP
- Paper-based TANs are hard to manage -> both the claimant and the verifier need to have a copy of every OTP (possibly hundreds of them).
- Idea: each OTP may be created from the past OTPs used.
- An example of this type of algorithm, credited to Leslie Lamport, uses a **one-way function** (hash function).
- ## One-Way Functions
- What is a **hash function**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:27:00.494Z
card-last-score:: 1
- A **one-way function** $H$ produces a fixed-size output $h$ based on a variable size input $s$.
- $$H(s) = h$$
- $H$ is also called a **hash function**, $h$ is called a **hash** (value).
- Important: *one-way property*:
- For a given hash code $h$, it is infeasible to find $s$ such that $H(s) = h$.
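- A small illustration of the fixed-size output property, using SHA-256 from Python's standard `hashlib` module. The notes do not name a specific hash function, so SHA-256 and the sample inputs are assumptions chosen for the example.

```python
import hashlib


def H(s: bytes) -> str:
    """Hash function H: variable-size input s -> fixed-size hash h (hex string)."""
    return hashlib.sha256(s).hexdigest()


short_hash = H(b"a")            # 1-byte input
long_hash = H(b"a" * 10_000)    # 10,000-byte input

# Both hashes have the same length (64 hex characters = 256 bits),
# regardless of how long the input was.
print(len(short_hash), len(long_hash))
```

- Computing $h$ from $s$ is a single function call; the one-way property says that there is no comparably cheap way to go from $h$ back to a matching $s$.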
- ### Leslie Lamport's Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:26:52.424Z
card-last-score:: 1
- For every claimant, a random seed (starting value) $s$ is chosen.
- A hash function $H(s)$ is applied repeatedly (e.g., 1,000 times) to the seed, giving a value of:
- $$H^{1000}(s) = H(H(H(\cdots H(s) \cdots)))$$
- The user's first login uses an OTP $p$ derived by applying $H$ 999 times to the seed, i.e., $H^{999}(s)$.
- The verifier can authenticate that this is the correct OTP, because $H(p) = H^{1000}(s)$, the value stored.
- The value stored is then replaced by $p$ and the user is allowed to log in.
- The next login must be accompanied by $H^{998}(s)$.
- Again, this can be validated because hashing gives $H^{999}(s)$ which is $p$, the value stored after the previous login.
- The new value replaces $p$ and the user is authenticated.
- This process can be repeated another 997 times; each time, the password is $H$ applied one fewer time to the seed.
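- The login sequence above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for $H$, and the names `hash_n` and `Verifier` as well as the seed value are hypothetical, not part of the original algorithm description.

```python
import hashlib


def H(data: bytes) -> bytes:
    """One-way function H (SHA-256 is an assumption made for this sketch)."""
    return hashlib.sha256(data).digest()


def hash_n(seed: bytes, n: int) -> bytes:
    """Apply H to the seed n times, i.e. compute H^n(seed)."""
    value = seed
    for _ in range(n):
        value = H(value)
    return value


class Verifier:
    """Stores only the most recent value of the hash chain."""

    def __init__(self, initial_value: bytes):
        self.stored = initial_value  # H^1000(s) at enrolment

    def login(self, otp: bytes) -> bool:
        if H(otp) == self.stored:    # hashing the OTP must give the stored value
            self.stored = otp        # the accepted OTP replaces the stored value
            return True
        return False


seed = b"example random seed"                # chosen per claimant
verifier = Verifier(hash_n(seed, 1000))

print(verifier.login(hash_n(seed, 999)))     # first login
print(verifier.login(hash_n(seed, 999)))     # replaying the same OTP
print(verifier.login(hash_n(seed, 998)))     # second login
```

- The replayed OTP is rejected because the stored value has already moved one step down the chain, and an eavesdropper who saw $H^{999}(s)$ would have to invert $H$ to obtain $H^{998}(s)$.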
- ### Time-Synchronised OTP #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:26:07.581Z
card-last-score:: 1
- Each user has a unique piece of hardware called a **security token** that generates an OTP (e.g., mobile phone).
- Inside the token is an accurate clock that has been synchronised with the clock of the verifier.
- Both claimant token and verifier server calculate identical OTPs that are based on time.
- ![image.png](../assets/image_1664799869963_0.png)
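- The notes do not say how the OTP is derived from the time. One widely used approach, assumed here, follows the TOTP scheme (RFC 6238): the token and the verifier share a secret key and each computes an HMAC over the current time step. The function name `time_based_otp`, the 30-second step, and the sample secret are illustrative choices for this sketch.

```python
import hashlib
import hmac
import struct


def time_based_otp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Derive an OTP from a shared secret and the current time (TOTP-style sketch)."""
    counter = unix_time // step  # same value for every second within one time step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F      # dynamic truncation: pick 4 bytes of the MAC
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


shared_secret = b"12345678901234567890"  # known to both the token and the verifier

# Token and verifier run the same computation on their own (synchronised) clocks.
token_otp = time_based_otp(shared_secret, unix_time=1_700_000_010)
verifier_otp = time_based_otp(shared_secret, unix_time=1_700_000_012)
print(token_otp == verifier_otp)
```

- Because the time is divided into steps, small differences between the two clocks usually fall within the same step and still produce identical OTPs.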
-
- # Some New Biometric Methods
- **Hand geometry:** Measurement & comparison of the (unique) different physical characteristics of the hand.
- **Palm vein authentication:** Uses an infrared beam to penetrate the user's hand as it is waved over the system; the veins within the palm are returned as black lines.
- **Retina scan:** Provides an analysis of the capillary blood vessels located in the back of the eye.
- **Iris scan:** Provides an analysis of the rings, furrows, & freckles in the coloured ring that surrounds the pupil of the eye.
- Face recognition, signature, & voice analysis.
- **Behavioural biometrics:**
- ![image.png](../assets/image_1664800188644_0.png)
-
- # Multi-Factor Authentication
- This may include a combination of the following:
- Some physical object in the possession of the user, e.g., a USB stick with a secret token, a bank card, a key, etc.
- Some secret known only to the user, such as a password, PIN, TAN, etc.
- Some physical characteristic of the user (biometrics), such as a fingerprint, eye iris, voice, typing speed, pattern in key press intervals, etc.
- Somewhere you are, such as connection to a specific computing network or utilising a GPS signal to identify the location.

View File

@ -0,0 +1,87 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[Introduction to Cryptography]]
- **Next Topic:** null
- **Relevant Slides:** ![ct255_03.pdf](../assets/ct255_03_1664798420872_0.pdf)
-
- What is a **password**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-07T14:27:05.787Z
card-last-reviewed:: 2022-10-03T14:27:05.787Z
card-last-score:: 5
- A **password** is a memorised secret used to confirm the identity of a user.
- Typically, an arbitrary string of characters including letters, digits, or other symbols.
- A purely numeric secret is called a **Personal Identification Number (PIN)**.
- The secret is memorised by a party called the **claimant** while the party verifying the identity of the claimant is called the **verifier**.
- The claimant & the verifier communicate via an **authentication protocol**.
- # Some Password Alternatives
- One-Time Password (OTP).
- Transaction Authentication Number (TAN) list used for online banking - they can only be used once.
- Time-synchronised one-time passwords.
- Biometric methods.
- Fingerprints, irises, voice, face.
- Cognitive passwords.
- Use question & answer cue/response pairs to verify identity.
-
- # Algorithmic Generation of OTP
- Paper-based TANs are hard to manage -> both the claimant and the verifier need to have a copy of every OTP (possibly hundreds of them).
- Idea: each OTP may be created from the past OTPs used.
- An example of this type of algorithm, credited to Leslie Lamport, uses a **one-way function** (hash function).
- ## One-Way Functions
- What is a **hash function**? #card
card-last-interval:: 1.47
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-08T21:37:51.695Z
card-last-reviewed:: 2022-10-07T10:37:51.696Z
card-last-score:: 3
- A **one-way function** $H$ produces a fixed-size output $h$ based on a variable size input $s$.
- $$H(s) = h$$
- $H$ is also called a **hash function**, $h$ is called a **hash** (value).
- Important: *one-way property*:
- For a given hash code $h$, it is infeasible to find $s$ such that $H(s) = h$.
- ### Leslie Lamport's Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-reviewed:: 2022-10-07T10:48:57.261Z
card-last-score:: 1
- For every claimant, a random seed (starting value) $s$ is chosen.
- A hash function $H(s)$ is applied repeatedly (e.g., 1,000 times) to the seed, giving a value of:
- $$H^{1000}(s) = H(H(H(\cdots H(s) \cdots)))$$
- The user's first login uses an OTP $p$ derived by applying $H$ 999 times to the seed, i.e., $H^{999}(s)$.
- The verifier can authenticate that this is the correct OTP, because $H(p) = H^{1000}(s)$, the value stored.
- The value stored is then replaced by $p$ and the user is allowed to log in.
- The next login must be accompanied by $H^{998}(s)$.
- Again, this can be validated because hashing gives $H^{999}(s)$ which is $p$, the value stored after the previous login.
- The new value replaces $p$ and the user is authenticated.
- This process can be repeated another 997 times; each time, the password is $H$ applied one fewer time to the seed.
- ### Time-Synchronised OTP #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-reviewed:: 2022-10-07T10:17:02.675Z
card-last-score:: 1
- Each user has a unique piece of hardware called a **security token** that generates an OTP (e.g., mobile phone).
- Inside the token is an accurate clock that has been synchronised with the clock of the verifier.
- Both claimant token and verifier server calculate identical OTPs that are based on time.
- ![image.png](../assets/image_1664799869963_0.png)
-
- # Some New Biometric Methods
- **Hand geometry:** Measurement & comparison of the (unique) different physical characteristics of the hand.
- **Palm vein authentication:** Uses an infrared beam to penetrate the user's hand as it is waved over the system; the veins within the palm are returned as black lines.
- **Retina scan:** Provides an analysis of the capillary blood vessels located in the back of the eye.
- **Iris scan:** Provides an analysis of the rings, furrows, & freckles in the coloured ring that surrounds the pupil of the eye.
- Face recognition, signature, & voice analysis.
- **Behavioural biometrics:**
- ![image.png](../assets/image_1664800188644_0.png)
-
- # Multi-Factor Authentication
- This may include a combination of the following:
- Some physical object in the possession of the user, e.g., a USB stick with a secret token, a bank card, a key, etc.
- Some secret known only to the user, such as a password, PIN, TAN, etc.
- Some physical characteristic of the user (biometrics), such as a fingerprint, eye iris, voice, typing speed, pattern in key press intervals, etc.
- Somewhere you are, such as connection to a specific computing network or utilising a GPS signal to identify the location.

View File

@ -0,0 +1,87 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[Introduction to Cryptography]]
- **Next Topic:** null
- **Relevant Slides:** ![ct255_03.pdf](../assets/ct255_03_1664798420872_0.pdf)
-
- What is a **password**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T01:20:19.049Z
card-last-reviewed:: 2022-10-07T15:20:19.050Z
card-last-score:: 3
- A **password** is a memorised secret used to confirm the identity of a user.
- Typically, an arbitrary string of characters including letters, digits, or other symbols.
- A purely numeric secret is called a **Personal Identification Number (PIN)**.
- The secret is memorised by a party called the **claimant** while the party verifying the identity of the claimant is called the **verifier**.
- The claimant & the verifier communicate via an **authentication protocol**.
- # Some Password Alternatives
- One-Time Password (OTP).
- Transaction Authentication Number (TAN) list used for online banking - they can only be used once.
- Time-synchronised one-time passwords.
- Biometric methods.
- Fingerprints, irises, voice, face.
- Cognitive passwords.
- Use question & answer cue/response pairs to verify identity.
-
- # Algorithmic Generation of OTP
- Paper-based TANs are hard to manage -> both the claimant and the verifier need to have a copy of every OTP (possibly hundreds of them).
- Idea: each OTP may be created from the past OTPs used.
- An example of this type of algorithm, credited to Leslie Lamport, uses a **one-way function** (hash function).
- ## One-Way Functions
- What is a **hash function**? #card
card-last-interval:: 8.35
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-17T16:51:58.310Z
card-last-reviewed:: 2022-10-09T08:51:58.310Z
card-last-score:: 5
- A **one-way function** $H$ produces a fixed-size output $h$ based on a variable size input $s$.
- $$H(s) = h$$
- $H$ is also called a **hash function**, $h$ is called a **hash** (value).
- Important: *one-way property*:
- For a given hash code $h$, it is infeasible to find $s$ such that $H(s) = h$.
- ### Leslie Lamport's Algorithm #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:02:53.292Z
card-last-score:: 1
- For every claimant, a random seed (starting value) $s$ is chosen.
- A hash function $H(s)$ is applied repeatedly (e.g., 1,000 times) to the seed, giving a value of:
- $$H^{1000}(s) = H(H(H(\cdots H(s) \cdots)))$$
- The user's first login uses an OTP $p$ derived by applying $H$ 999 times to the seed, i.e., $H^{999}(s)$.
- The verifier can authenticate that this is the correct OTP, because $H(p) = H^{1000}(s)$, the value stored.
- The value stored is then replaced by $p$ and the user is allowed to log in.
- The next login must be accompanied by $H^{998}(s)$.
- Again, this can be validated because hashing gives $H^{999}(s)$ which is $p$, the value stored after the previous login.
- The new value replaces $p$ and the user is authenticated.
- This process can be repeated another 997 times; each time, the password is $H$ applied one fewer time to the seed.
- ### Time-Synchronised OTP #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:19:40.334Z
card-last-score:: 1
- Each user has a unique piece of hardware called a **security token** that generates an OTP (e.g., mobile phone).
- Inside the token is an accurate clock that has been synchronised with the clock of the verifier.
- Both claimant token and verifier server calculate identical OTPs that are based on time.
- ![image.png](../assets/image_1664799869963_0.png)
-
- # Some New Biometric Methods
- **Hand geometry:** Measurement & comparison of the (unique) different physical characteristics of the hand.
- **Palm vein authentication:** Uses an infrared beam to penetrate the user's hand as it is waved over the system; the veins within the palm are returned as black lines.
- **Retina scan:** Provides an analysis of the capillary blood vessels located in the back of the eye.
- **Iris scan:** Provides an analysis of the rings, furrows, & freckles in the coloured ring that surrounds the pupil of the eye.
- Face recognition, signature, & voice analysis.
- **Behavioural biometrics:**
- ![image.png](../assets/image_1664800188644_0.png)
-
- # Multi-Factor Authentication
- This may include a combination of the following:
- Some physical object in the possession of the user, e.g., a USB stick with a secret token, a bank card, a key, etc.
- Some secret known only to the user, such as a password, PIN, TAN, etc.
- Some physical characteristic of the user (biometrics), such as a fingerprint, eye iris, voice, typing speed, pattern in key press intervals, etc.
- Somewhere you are, such as connection to a specific computing network or utilising a GPS signal to identify the location.

View File

@ -0,0 +1,87 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[Introduction to Cryptography]]
- **Next Topic:** [[Hash Cracking Using Rainbow Tables]]
- **Relevant Slides:** ![ct255_03.pdf](../assets/ct255_03_1664798420872_0.pdf)
-
- What is a **password**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T01:20:19.049Z
card-last-reviewed:: 2022-10-07T15:20:19.050Z
card-last-score:: 3
- A **password** is a memorised secret used to confirm the identity of a user.
- Typically, an arbitrary string of characters including letters, digits, or other symbols.
- A purely numeric secret is called a **Personal Identification Number (PIN)**.
- The secret is memorised by a party called the **claimant** while the party verifying the identity of the claimant is called the **verifier**.
- The claimant & the verifier communicate via an **authentication protocol**.
- # Some Password Alternatives
- One-Time Password (OTP).
- Transaction Authentication Number (TAN) list used for online banking - they can only be used once.
- Time-synchronised one-time passwords.
- Biometric methods.
- Fingerprints, irises, voice, face.
- Cognitive passwords.
- Use question & answer cue/response pairs to verify identity.
-
- # Algorithmic Generation of OTP
- Paper-based TANs are hard to manage -> both the claimant and the verifier need to have a copy of every OTP (possibly hundreds of them).
- Idea: each OTP may be created from the past OTPs used.
- An example of this type of algorithm, credited to Leslie Lamport, uses a **one-way function** (hash function).
- ## One-Way Functions
- What is a **hash function**? #card
card-last-interval:: 8.35
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-17T16:51:58.310Z
card-last-reviewed:: 2022-10-09T08:51:58.310Z
card-last-score:: 5
- A **one-way function** $H$ produces a fixed-size output $h$ based on a variable size input $s$.
- $$H(s) = h$$
- $H$ is also called a **hash function**, $h$ is called a **hash** (value).
- Important: *one-way property*:
- For a given hash code $h$, it is infeasible to find $s$ such that $H(s) = h$.
- ### Leslie Lamport's Algorithm #card
card-last-interval:: 0.85
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-27T07:50:40.061Z
card-last-reviewed:: 2022-10-26T11:50:40.061Z
card-last-score:: 3
- For every claimant, a random seed (starting value) $s$ is chosen.
- A hash function $H(s)$ is applied repeatedly (e.g., 1,000 times) to the seed, giving a value of:
- $$H^{1000}(s) = H(H(H(\cdots H(s) \cdots)))$$
- The user's first login uses an OTP $p$ derived by applying $H$ 999 times to the seed, i.e., $H^{999}(s)$.
- The verifier can authenticate that this is the correct OTP, because $H(p) = H^{1000}(s)$, the value stored.
- The value stored is then replaced by $p$ and the user is allowed to log in.
- The next login must be accompanied by $H^{998}(s)$.
- Again, this can be validated because hashing gives $H^{999}(s)$ which is $p$, the value stored after the previous login.
- The new value replaces $p$ and the user is authenticated.
- This process can be repeated another 997 times; each time, the password is $H$ applied one fewer time to the seed.
- ### Time-Synchronised OTP #card
card-last-interval:: 0.84
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-11T07:35:51.251Z
card-last-reviewed:: 2022-10-10T11:35:51.251Z
card-last-score:: 3
- Each user has a unique piece of hardware called a **security token** that generates an OTP (e.g., mobile phone).
- Inside the token is an accurate clock that has been synchronised with the clock of the verifier.
- Both claimant token and verifier server calculate identical OTPs that are based on time.
- ![image.png](../assets/image_1664799869963_0.png)
-
- # Some New Biometric Methods
- **Hand geometry:** Measurement & comparison of the (unique) different physical characteristics of the hand.
- **Palm vein authentication:** Uses an infrared beam to penetrate the user's hand as it is waved over the system; the veins within the palm are returned as black lines.
- **Retina scan:** Provides an analysis of the capillary blood vessels located in the back of the eye.
- **Iris scan:** Provides an analysis of the rings, furrows, & freckles in the coloured ring that surrounds the pupil of the eye.
- Face recognition, signature, & voice analysis.
- **Behavioural biometrics:**
- ![image.png](../assets/image_1664800188644_0.png)
-
- # Multi-Factor Authentication
- This may include a combination of the following:
- Some physical object in the possession of the user, e.g., a USB stick with a secret token, a bank card, a key, etc.
- Some secret known only to the user, such as a password, PIN, TAN, etc.
- Some physical characteristic of the user (biometrics), such as a fingerprint, eye iris, voice, typing speed, pattern in key press intervals, etc.
- Somewhere you are, such as connection to a specific computing network or utilising a GPS signal to identify the location.

View File

@ -0,0 +1,101 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Software Processes]]
- **Next Topic:** [[SCRUM Roles & Ceremonies]]
- **Relevant Slides:** ![Week 3 - Introduction to Agile Methods - Scrum(1).pdf](../assets/Week_3_-_Introduction_to_Agile_Methods_-_Scrum(1)_1663848442133_0.pdf)
-
- # Software Development Lifecycle
- What is the **Software Lifecycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-22T23:00:00.000Z
card-last-reviewed:: 2022-09-22T20:28:27.102Z
card-last-score:: 1
- The **software lifecycle** is an abstract representation of a software process. It defines the steps, methods, tools, activities, and deliverables of a software development project.
- The following **lifecycle phases** are considered:
- 1. Requirement Analysis
2. System Design
3. Implementation
4. Integration & Deployment
5. Operation & Maintenance
- ## SDLC Limitations
- Classical project planning methods have a lot of disadvantages:
- Huge efforts during the planning phase (requirements + design).
- Poor requirements conversion in a rapidly changing environment.
- Treatment of staff as a factor of production.
-
- # Agile
- What is **Agile**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T12:14:38.542Z
card-last-reviewed:: 2022-09-26T12:14:38.542Z
card-last-score:: 3
- There is no single definition of Agile, but the Agile Manifesto is the closest to a definition.
- Set of principles.
- Developed by Agile Alliance.
- Agile methods focus on:
- Individuals & interactions over processes & tools.
- Working software over comprehensive documentation.
- Customer collaboration over contract negotiation.
- Responding to change over following a plan.
- The [Agile Alliance](www.agilealliance.org) is a non-profit organisation that promotes agile development.
- ## Agile Motivation
- Agile proponents argue:
- Software development processes relying on lifecycle models are too heavyweight or cumbersome.
- Too many things are done that are not directly related to the software product being produced, i.e., design, models, requirements docs, documentation that isn't shipped as part of the product.
- Difficulty with incomplete or changing requirements.
- Short development cycles (Mobile Apps).
- More active customer involvement needed.
- There are several Agile methods, including **Scrum** and **Extreme Programming (XP)**.
- ## SCRUM
- What is **SCRUM**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-22T23:00:00.000Z
card-last-reviewed:: 2022-09-22T20:41:57.926Z
card-last-score:: 1
- **Scrum** is an agile project management methodology for managing product development. (The name is not an acronym, although it is often written in capitals as SCRUM.)
- It allows us to rapidly and repeatedly inspect actual working software (every two weeks to one month).
- The business sets the priorities. The teams **self-manage** to determine the best way to deliver the highest priority features.
- Every two weeks to a month, anyone can see real, working software and decide to release it as is or continue to enhance it for another iteration.
- ### Characteristics of SCRUM #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T12:13:39.492Z
card-last-reviewed:: 2022-09-26T12:13:39.493Z
card-last-score:: 3
- Self-organising teams.
- No need for project manager (in theory).
- Product progresses in a series of month-long or biweekly **sprints**.
- Assumes that the software cannot be well defined and requirements will change frequently.
- Requirements are captured as items in a list called the **product backlog**.
- No specific engineering practices prescribed.
- XP, TDD, FDD.
- Best approach is to start with Scrum and then invent your own version using XP, TDD, FDD, etc.
- ### Daily SCRUM / Standup #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T12:14:54.429Z
card-last-reviewed:: 2022-09-26T12:14:54.429Z
card-last-score:: 5
- Parameters:
- Daily.
- 15-minutes.
- Stand-up.
- Not for problem solving.
- Only team members, ScrumMaster, & Product Owners should talk.
- Should help to avoid additional unnecessary meetings.
- Commitment in front of peers to complete tasks.
- Answer 3 questions:
- What did you do yesterday?
- What will you do today?
- Is anything in your way?
- The Daily SCRUM is **not** a problem-solving session and is **not** a way to collect information about who is behind the schedule.
- It is a meeting in which members make commitments to each other and to the SCRUM Master.
- It is a good way for a SCRUM Master to track the progress of the team.
-

View File

@ -0,0 +1,101 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Software Processes]]
- **Next Topic:** [[SCRUM Roles & Ceremonies]]
- **Relevant Slides:** ![Week 3 - Introduction to Agile Methods - Scrum(1).pdf](../assets/Week_3_-_Introduction_to_Agile_Methods_-_Scrum(1)_1663848442133_0.pdf)
-
- # Software Development Lifecycle
- What is the **Software Lifecycle**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-07T11:41:44.734Z
card-last-reviewed:: 2022-10-03T11:41:44.735Z
card-last-score:: 5
- The **software lifecycle** is an abstract representation of a software process. It defines the steps, methods, tools, activities, and deliverables of a software development project.
- The following **lifecycle phases** are considered:
- 1. Requirement Analysis
2. System Design
3. Implementation
4. Integration & Deployment
5. Operation & Maintenance
- ## SDLC Limitations
- Classical project planning methods have a lot of disadvantages:
- Huge efforts during the planning phase (requirements + design).
- Poor requirements conversion in a rapidly changing environment.
- Treatment of staff as a factor of production.
-
- # Agile
- What is **Agile**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:14:50.132Z
card-last-score:: 1
- There is no single definition of Agile, but the Agile Manifesto is the closest to a definition.
- Set of principles.
- Developed by Agile Alliance.
- Agile methods focus on:
- Individuals & interactions over processes & tools.
- Working software over comprehensive documentation.
- Customer collaboration over contract negotiation.
- Responding to change over following a plan.
- The [Agile Alliance](www.agilealliance.org) is a non-profit organisation that promotes agile development.
- ## Agile Motivation
- Agile proponents argue:
- Software development processes relying on lifecycle models are too heavyweight or cumbersome.
- Too many things are done that are not directly related to the software product being produced, i.e., design, models, requirements docs, documentation that isn't shipped as part of the product.
- Difficulty with incomplete or changing requirements.
- Short development cycles (Mobile Apps).
- More active customer involvement needed.
- There are several Agile methods, including **Scrum** and **Extreme Programming (XP)**.
- ## SCRUM
- What is **Scrum**? #card
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.5
card-last-reviewed:: 2022-10-04T12:33:02.463Z
- **Scrum** is an agile project management methodology for managing product development.
- It allows us to rapidly and repeatedly inspect actual working software (every two weeks to one month).
- The business sets the priorities. The teams **self-manage** to determine the best way to deliver the highest priority features.
- Every two weeks to a month, anyone can see real, working software and decide to release it as is or continue to enhance it for another iteration.
- ### Characteristics of SCRUM #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:13:22.626Z
card-last-score:: 1
- Self-organising teams.
- No need for project manager (in theory).
- Product progresses in a series of month-long or biweekly **sprints**.
- Assumes that the software cannot be well defined and requirements will change frequently.
- Requirements are captured as items in a list called the **product backlog**.
- No specific engineering practices prescribed.
- XP, TDD, FDD.
- Best approach is to start with Scrum and then invent your own version using XP, TDD, FDD, etc.
- ### Daily SCRUM / Standup #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-08T12:15:05.415Z
card-last-reviewed:: 2022-10-04T12:15:05.415Z
card-last-score:: 5
- Parameters:
- Daily.
- 15-minutes.
- Stand-up.
- Not for problem solving.
- Only team members, ScrumMaster, & Product Owners should talk.
- Should help to avoid additional unnecessary meetings.
- Commitment in front of peers to complete tasks.
- Answer 3 questions:
- What did you do yesterday?
- What will you do today?
- Is anything in your way?
- The Daily SCRUM is **not** a problem-solving session and is **not** a way to collect information about who is behind the schedule.
- It is a meeting in which members make commitments to each other and to the SCRUM Master.
- It is a good way for a SCRUM Master to track the progress of the team.
-

View File

@ -0,0 +1,101 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Software Processes]]
- **Next Topic:** [[SCRUM Roles & Ceremonies]]
- **Relevant Slides:** ![Week 3 - Introduction to Agile Methods - Scrum(1).pdf](../assets/Week_3_-_Introduction_to_Agile_Methods_-_Scrum(1)_1663848442133_0.pdf)
-
- # Software Development Lifecycle
- What is the **Software Lifecycle**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-07T11:41:44.734Z
card-last-reviewed:: 2022-10-03T11:41:44.735Z
card-last-score:: 5
- The **software lifecycle** is an abstract representation of a software process. It defines the steps, methods, tools, activities, and deliverables of a software development project.
- The following **lifecycle phases** are considered:
- 1. Requirement Analysis
2. System Design
3. Implementation
4. Integration & Deployment
5. Operation & Maintenance
- ## SDLC Limitations
- Classical project planning methods have a lot of disadvantages:
- Huge efforts during the planning phase (requirements + design).
- Poor requirements conversion in a rapidly changing environment.
- Treatment of staff as a factor of production.
-
- # Agile
- What is **Agile**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:14:18.767Z
card-last-reviewed:: 2022-10-07T10:14:18.769Z
card-last-score:: 3
- There is no single definition of Agile, but the Agile Manifesto is the closest to a definition.
- Set of principles.
- Developed by Agile Alliance.
- Agile methods focus on:
- Individuals & interactions over processes & tools.
- Working software over comprehensive documentation.
- Customer collaboration over contract negotiation.
- Responding to change over following a plan.
- The [Agile Alliance](www.agilealliance.org) is a non-profit organisation that promotes agile development.
- ## Agile Motivation
- Agile proponents argue:
- Software development processes relying on lifecycle models are too heavyweight or cumbersome.
- Too many things are done that are not directly related to the software product being produced, i.e., design, models, requirements docs, documentation that isn't shipped as part of the product.
- Difficulty with incomplete or changing requirements.
- Short development cycles (Mobile Apps).
- More active customer involvement needed.
- There are several Agile methods, including **Scrum** and **Extreme Programming (XP)**.
- ## SCRUM
- What is **Scrum**? #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-10T04:23:23.437Z
card-last-interval:: 2.77
card-ease-factor:: 2.36
card-last-reviewed:: 2022-10-07T10:23:23.438Z
- **Scrum** is an agile project management methodology for managing product development.
- It allows us to rapidly and repeatedly inspect actual working software (every two weeks to one month).
- The business sets the priorities. The teams **self-manage** to determine the best way to deliver the highest priority features.
- Every two weeks to a month, anyone can see real, working software and decide to release it as is or continue to enhance it for another iteration.
- ### Characteristics of Scrum #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-11T10:49:04.551Z
card-last-interval:: 4
card-ease-factor:: 2.22
card-last-reviewed:: 2022-10-07T10:49:04.552Z
- Self-organising teams.
- No need for project manager (in theory).
- Product progresses in a series of month-long or biweekly **sprints**.
- Assumes that the software cannot be well defined and requirements will change frequently.
- Requirements are captured as items in a list called the **product backlog**.
- No specific engineering practices prescribed.
- XP, TDD, FDD.
- Best approach is to start with Scrum and then invent your own version using XP, TDD, FDD, etc.
- ### Daily SCRUM / Standup #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-08T12:15:05.415Z
card-last-reviewed:: 2022-10-04T12:15:05.415Z
card-last-score:: 5
- Parameters:
- Daily.
- 15-minutes.
- Stand-up.
- Not for problem solving.
- Only team members, ScrumMaster, & Product Owners should talk.
- Should help to avoid additional unnecessary meetings.
- Commitment in front of peers to complete tasks.
- Answer 3 questions:
- What did you do yesterday?
- What will you do today?
- Is anything in your way?
- The Daily SCRUM is **not** a problem-solving session and is **not** a way to collect information about who is behind the schedule.
- It is a meeting in which members make commitments to each other and to the SCRUM Master.
- It is a good way for a SCRUM Master to track the progress of the team.
-

View File

@ -0,0 +1,101 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Software Processes]]
- **Next Topic:** [[SCRUM Roles & Ceremonies]]
- **Relevant Slides:** ![Week 3 - Introduction to Agile Methods - Scrum(1).pdf](../assets/Week_3_-_Introduction_to_Agile_Methods_-_Scrum(1)_1663848442133_0.pdf)
-
- # Software Development Lifecycle
- What is the **Software Lifecycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:00:18.476Z
card-last-score:: 1
- The **software lifecycle** is an abstract representation of a software process. It defines the steps, methods, tools, activities, and deliverables of a software development project.
- The following **lifecycle phases** are considered:
- 1. Requirement Analysis
2. System Design
3. Implementation
4. Integration & Deployment
5. Operation & Maintenance
- ## SDLC Limitations
- Classical project planning methods have a lot of disadvantages:
- Huge efforts during the planning phase (requirements + design).
- Poor requirements conversion in a rapidly changing environment.
- Treatment of staff as a factor of production.
-
- # Agile
- What is **Agile**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:14:18.767Z
card-last-reviewed:: 2022-10-07T10:14:18.769Z
card-last-score:: 3
- There is no single definition of Agile, but the Agile Manifesto is the closest to a definition.
- Set of principles.
- Developed by Agile Alliance.
- Agile methods focus on:
- Individuals & interactions over processes & tools.
- Working software over comprehensive documentation.
- Customer collaboration over contract negotiation.
- Responding to change over following a plan.
- The [Agile Alliance](www.agilealliance.org) is a non-profit organisation that promotes agile development.
- ## Agile Motivation
- Agile proponents argue:
- Software development processes relying on lifecycle models are too heavyweight or cumbersome.
- Too many things are done that are not directly related to the software product being produced, i.e., design, models, requirements docs, documentation that isn't shipped as part of the product.
- Difficulty with incomplete or changing requirements.
- Short development cycles (Mobile Apps).
- More active customer involvement needed.
- There are several Agile methods, including **Scrum** and **Extreme Programming (XP)**.
- ## SCRUM
- What is **Scrum**? #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-10T04:23:23.437Z
card-last-interval:: 2.77
card-ease-factor:: 2.36
card-last-reviewed:: 2022-10-07T10:23:23.438Z
- **Scrum** is an agile project management methodology for managing product development.
- It allows us to rapidly and repeatedly inspect actual working software (every two weeks to one month).
- The business sets the priorities. The teams **self-manage** to determine the best way to deliver the highest priority features.
- Every two weeks to a month, anyone can see real, working software and decide to release it as is or continue to enhance it for another iteration.
- ### Characteristics of Scrum #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-11T10:49:04.551Z
card-last-interval:: 4
card-ease-factor:: 2.22
card-last-reviewed:: 2022-10-07T10:49:04.552Z
- Self-organising teams.
- No need for project manager (in theory).
- Product progresses in a series of month-long or biweekly **sprints**.
- Assumes that the software cannot be well defined and requirements will change frequently.
- Requirements are captured as items in a list called the **product backlog**.
- No specific engineering practices prescribed.
- XP, TDD, FDD.
- Best approach is to start with Scrum and then invent your own version using XP, TDD, FDD, etc.
- ### Daily SCRUM / Standup #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T02:54:31.522Z
card-last-reviewed:: 2022-10-08T22:54:31.568Z
card-last-score:: 5
- Parameters:
- Daily.
- 15-minutes.
- Stand-up.
- **Not** for problem solving.
- Only team members, Scrum Master, & Product Owners should talk.
- Should help to avoid additional unnecessary meetings.
- Commitment in front of peers to complete tasks.
- ^^Answer 3 questions:^^
- ^^What did you do yesterday?^^
- ^^What will you do today?^^
- ^^Is anything in your way?^^
- The Daily SCRUM is **not** a problem-solving session and is **not** a way to collect information about who is behind the schedule.
- It is a meeting in which members make commitments to each other and to the SCRUM Master.
- It is a good way for a SCRUM Master to track the progress of the team.
-

View File

@ -0,0 +1,101 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Software Processes]]
- **Next Topic:** [[SCRUM Roles & Ceremonies]]
- **Relevant Slides:** ![Week 3 - Introduction to Agile Methods - Scrum(1).pdf](../assets/Week_3_-_Introduction_to_Agile_Methods_-_Scrum(1)_1663848442133_0.pdf)
-
- # Software Development Lifecycle
- What is the **Software Lifecycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T23:00:00.000Z
card-last-reviewed:: 2022-10-10T11:32:31.602Z
card-last-score:: 1
- The **software lifecycle** is an abstract representation of a software process. It defines the steps, methods, tools, activities, and deliverables of a software development project.
- The following **lifecycle phases** are considered:
- 1. Requirement Analysis
2. System Design
3. Implementation
4. Integration & Deployment
5. Operation & Maintenance
- ## SDLC Limitations
- Classical project planning methods have a lot of disadvantages:
- Huge efforts during the planning phase (requirements + design).
- Poor requirements conversion in a rapidly changing environment.
- Treatment of staff as a factor of production.
-
- # Agile
- What is **Agile**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:14:18.767Z
card-last-reviewed:: 2022-10-07T10:14:18.769Z
card-last-score:: 3
- There is no single definition of Agile, but the Agile Manifesto is the closest to a definition.
- Set of principles.
- Developed by Agile Alliance.
- Agile methods focus on:
- Individuals & interactions over processes & tools.
- Working software over comprehensive documentation.
- Customer collaboration over contract negotiation.
- Responding to change over following a plan.
- The [Agile Alliance](www.agilealliance.org) is a non-profit organisation that promotes agile development.
- ## Agile Motivation
- Agile proponents argue:
- Software development processes relying on lifecycle models are too heavyweight or cumbersome.
- Too many things are done that are not directly related to the software product being produced, i.e., design, models, requirements docs, documentation that isn't shipped as part of the product.
- Difficulty with incomplete or changing requirements.
- Short development cycles (Mobile Apps).
- More active customer involvement needed.
- There are several Agile methods, including **Scrum** and **Extreme Programming (XP)**.
- ## SCRUM
- What is **Scrum**? #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-10T04:23:23.437Z
card-last-interval:: 2.77
card-ease-factor:: 2.36
card-last-reviewed:: 2022-10-07T10:23:23.438Z
- **Scrum** is an agile project management methodology for managing product development.
- It allows us to rapidly and repeatedly inspect actual working software (every two weeks to one month).
- The business sets the priorities. The teams **self-manage** to determine the best way to deliver the highest priority features.
- Every two weeks to a month, anyone can see real, working software and decide to release it as is or continue to enhance it for another iteration.
- ### Characteristics of Scrum #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-11T10:49:04.551Z
card-last-interval:: 4
card-ease-factor:: 2.22
card-last-reviewed:: 2022-10-07T10:49:04.552Z
- Self-organising teams.
- No need for project manager (in theory).
- Product progresses in a series of month-long or biweekly **sprints**.
- Assumes that the software cannot be well defined and requirements will change frequently.
- Requirements are captured as items in a list called the **product backlog**.
- No specific engineering practices prescribed.
- XP, TDD, FDD.
- Best approach is to start with Scrum and then invent your own version using XP, TDD, FDD, etc.
- ### Daily SCRUM / Standup #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T02:54:31.522Z
card-last-reviewed:: 2022-10-08T22:54:31.568Z
card-last-score:: 5
- Parameters:
- Daily.
- 15-minutes.
- Stand-up.
- **Not** for problem solving.
- Only team members, Scrum Master, & Product Owners should talk.
- Should help to avoid additional unnecessary meetings.
- Commitment in front of peers to complete tasks.
- ^^Answer 3 questions:^^
- ^^What did you do yesterday?^^
- ^^What will you do today?^^
- ^^Is anything in your way?^^
- The Daily SCRUM is **not** a problem-solving session and is **not** a way to collect information about who is behind the schedule.
- It is a meeting in which members make commitments to each other and to the SCRUM Master.
- It is a good way for a SCRUM Master to track the progress of the team.
-

View File

@ -0,0 +1,101 @@
- #[[CT216 - Software Engineering I]]
- **Previous Topic:** [[Software Processes]]
- **Next Topic:** [[SCRUM Roles & Ceremonies]]
- **Relevant Slides:** ![Week 3 - Introduction to Agile Methods - Scrum(1).pdf](../assets/Week_3_-_Introduction_to_Agile_Methods_-_Scrum(1)_1663848442133_0.pdf)
-
- # Software Development Lifecycle
- What is the **Software Lifecycle**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:38:27.177Z
card-last-score:: 1
- The **software lifecycle** is an abstract representation of a software process. It defines the steps, methods, tools, activities, and deliverables of a software development project.
- The following **lifecycle phases** are considered:
- 1. Requirement Analysis
2. System Design
3. Implementation
4. Integration & Deployment
5. Operation & Maintenance
- ## SDLC Limitations
- Classical project planning methods have a lot of disadvantages:
- Huge efforts during the planning phase (requirements + design).
- Poor requirements conversion in a rapidly changing environment.
- Treatment of staff as a factor of production.
-
- # Agile
- What is **Agile**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:14:18.767Z
card-last-reviewed:: 2022-10-07T10:14:18.769Z
card-last-score:: 3
- There is no single definition of Agile, but the Agile Manifesto is the closest to a definition.
- Set of principles.
- Developed by Agile Alliance.
- Agile methods focus on:
- Individuals & interactions over processes & tools.
- Working software over comprehensive documentation.
- Customer collaboration over contract negotiation.
- Responding to change over following a plan.
- The [Agile Alliance](www.agilealliance.org) is a non-profit organisation that promotes agile development.
- ## Agile Motivation
- Agile proponents argue:
- Software development processes relying on lifecycle models are too heavyweight or cumbersome.
- Too many things are done that are not directly related to the software product being produced, i.e., design, models, requirements docs, documentation that isn't shipped as part of the product.
- Difficulty with incomplete or changing requirements.
- Short development cycles (Mobile Apps).
- More active customer involvement needed.
- There are several Agile methods, including **Scrum** and **Extreme Programming (XP)**.
- ## SCRUM
- What is **Scrum**? #card
card-last-score:: 3
card-repeats:: 3
card-next-schedule:: 2022-10-29T05:29:46.102Z
card-last-interval:: 8.88
card-ease-factor:: 2.22
card-last-reviewed:: 2022-10-20T08:29:46.102Z
- **Scrum** is an agile project management methodology for managing product development.
- It allows us to rapidly and repeatedly inspect actual working software (every two weeks to one month).
- The business sets the priorities. The teams **self-manage** to determine the best way to deliver the highest priority features.
- Every two weeks to a month, anyone can see real, working software and decide to release it as is or continue to enhance it for another iteration.
- ### Characteristics of Scrum #card
card-last-score:: 3
card-repeats:: 2
card-next-schedule:: 2022-10-11T10:49:04.551Z
card-last-interval:: 4
card-ease-factor:: 2.22
card-last-reviewed:: 2022-10-07T10:49:04.552Z
- Self-organising teams.
- No need for project manager (in theory).
- Product progresses in a series of month-long or biweekly **sprints**.
- Assumes that the software cannot be well defined and requirements will change frequently.
- Requirements are captured as items in a list called the **product backlog**.
- No specific engineering practices prescribed.
- XP, TDD, FDD.
- Best approach is to start with Scrum and then invent your own version using XP, TDD, FDD, etc.
- ### Daily SCRUM / Standup #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T02:54:31.522Z
card-last-reviewed:: 2022-10-08T22:54:31.568Z
card-last-score:: 5
- Parameters:
- Daily.
- 15-minutes.
- Stand-up.
- **Not** for problem solving.
- Only team members, Scrum Master, & Product Owners should talk.
- Should help to avoid additional unnecessary meetings.
- Commitment in front of peers to complete tasks.
- ^^Answer 3 questions:^^
- ^^What did you do yesterday?^^
- ^^What will you do today?^^
- ^^Is anything in your way?^^
- The Daily SCRUM is **not** a problem-solving session and is **not** a way to collect information about who is behind the schedule.
- It is a meeting in which members make commitments to each other and to the SCRUM Master.
- It is a good way for a SCRUM Master to track the progress of the team.
-

View File

@ -0,0 +1,315 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[GDPR]]
- **Next Topic:** null
- **Relevant Slides:** ![ct255_02.pdf](../assets/ct255_02_1663458790357_0.pdf)
id:: 63265db7-1d41-44f7-b4cf-0bab377a7c1c
-
- ## SQL Injections
- What is an **SQL Injection**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:46:36.580Z
card-last-reviewed:: 2022-09-18T14:46:36.581Z
card-last-score:: 3
- An **SQL Injection** is a ^^code injection technique^^ used to attack data-driven applications, in which malicious SQL statements are inserted for execution.
- It is a way of exploiting user input & SQL statements to compromise the database & retrieve sensitive data.
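- The mechanism can be illustrated with a minimal sketch in C. The function name `build_login_query`, the `users` table, and the `name` column are hypothetical, chosen only for illustration. The sketch shows how pasting raw user input into the SQL text lets that input change the structure of the statement.
- ```c
#include <stdio.h>

/* Hypothetical, deliberately vulnerable helper: builds a query by pasting
   the raw user input straight into the SQL text.
   Returns the value of snprintf (the length the full query needs). */
int build_login_query(char *out, size_t out_size, const char *username) {
    return snprintf(out, out_size,
                    "SELECT * FROM users WHERE name = '%s';", username);
}
```
- With the input `alice` the query is `SELECT * FROM users WHERE name = 'alice';`, but with the input `' OR '1'='1` it becomes `SELECT * FROM users WHERE name = '' OR '1'='1';`, whose condition is always true. The usual defence is to pass user input as a bound parameter rather than concatenating it into the statement.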
-
- ## Basic Terminology
- What is **Cryptography**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:46:04.518Z
card-last-reviewed:: 2022-09-18T14:46:04.518Z
card-last-score:: 5
- **Cryptography** is the art or science encompassing the principles & methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back into its original form.
- What is **Plaintext**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:39:41.558Z
card-last-reviewed:: 2022-09-18T14:39:41.558Z
card-last-score:: 5
- **Plaintext** is the ^^original, intelligible message.^^
- What is **Ciphertext**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:39:52.620Z
card-last-reviewed:: 2022-09-18T14:39:52.621Z
card-last-score:: 5
- **Ciphertext** is the encrypted message.
- What is a **Cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:44:53.575Z
card-last-reviewed:: 2022-09-18T14:44:53.575Z
card-last-score:: 5
- A **Cipher** is an algorithm for transforming an intelligible message into one that is unintelligible.
- What is a **Key**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:43:53.075Z
card-last-reviewed:: 2022-09-18T14:43:53.075Z
card-last-score:: 3
- A **Key** is some critical information used by the cipher, known only to the sender & receiver, selected from a **keyspace** (the set of all possible keys).
- What does **Encipher** mean? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:46:57.114Z
card-last-reviewed:: 2022-09-18T14:46:57.114Z
card-last-score:: 5
- **Enciphering** is the process of converting plaintext into ciphertext using a cipher & a key.
- What does **Decipher** mean? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:43:30.989Z
card-last-reviewed:: 2022-09-18T14:43:30.989Z
card-last-score:: 5
- **Deciphering** is the process of converting ciphertext back into plaintext using a cipher & a key.
- What is **Encryption**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:44:36.754Z
card-last-reviewed:: 2022-09-18T14:44:36.755Z
card-last-score:: 3
- **Encryption** is some mathematical function $E_K()$ mapping plaintext $P$ to ciphertext $C$ using the specified key $K$.
- $$C=E_K(P)$$
- What is **Decryption**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:45:19.915Z
card-last-reviewed:: 2022-09-18T14:45:19.915Z
card-last-score:: 3
- **Decryption** is some mathematical function ${E_K}^{-1}()$ mapping the ciphertext $C$ to plaintext $P$ using the specified key $K$.
- $$P={E_K}^{-1}(C)$$
- What is **Cryptanalysis**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:41:43.525Z
card-last-reviewed:: 2022-09-18T14:41:43.525Z
card-last-score:: 5
- **Cryptanalysis** is the study of principles & methods of transforming an unintelligible message into an intelligible message without knowledge of the key.
- What is **Cryptology**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:39:33.475Z
card-last-reviewed:: 2022-09-18T14:39:33.475Z
card-last-score:: 3
- **Cryptology** is the field encompassing both cryptography & cryptanalysis.
-
-
- ## Model of Conventional Cryptosystem
- ![image.png](../assets/image_1663459919021_0.png){:height 304, :width 610}
- ## Cryptanalysis via Letter Frequency Distribution
- Human languages are **redundant** - letters are not equally commonly used.
- In the **English** language:
- **E** is by far the most common letter followed by T, R, N, I, O, A, and S.
- Other letters like Z, J, K, Q, and X are fairly rare.
- Certain letter combinations like **TH** are quite common.
- ![image.png](../assets/image_1663488626792_0.png)
-
- ### C Program for Frequency Analysis of Single Characters
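- ```c
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(int argc, char* argv[]) {
    FILE* fp;
    int data[26];   /* one counter per letter, A to Z */
    int c;          /* int rather than char so that EOF can be detected */
    memset(data, 0, sizeof(data));
    /* expect exactly one argument: the name of the file to analyse */
    if (argc != 2) {
        return(-1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        return(-2);
    }
    /* read until fgetc reports end of file, counting letters case-insensitively */
    while ((c = fgetc(fp)) != EOF) {
        c = toupper(c);
        if ((c >= 'A') && (c <= 'Z')) {
            data[c - 'A']++;
        }
    }
    /* print one line per letter in the form LETTER:COUNT */
    for (int i = 0; i < 26; i++) {
        printf("%c:%i\n", i + 'A', data[i]);
    }
    fclose(fp);
    return(0);
}
```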
- ## Known Plaintext Attacks (KPA)
- What is a **Known Plaintext Attack (KPA)**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:46:48.201Z
card-last-reviewed:: 2022-09-18T14:46:48.202Z
card-last-score:: 3
- The **Known Plaintext Attack (KPA)** is an attack model for cryptanalysis where the attacker has access to both:
- some of, or all of, the plaintext (called a **crib**)
- the ciphertext
-
-
- ## Caesar Cipher
- What is a **Caesar Cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:44:47.075Z
card-last-reviewed:: 2022-09-18T14:44:47.075Z
card-last-score:: 5
- A **Caesar Cipher** involves using an offset alphabet to encrypt a message.
- We can use any shift from 1 to 25 to replace each plaintext letter with a letter a fixed distance away.
- The **key letter** represents the start of this offset alphabet.
- For example, a key letter of F means that A -> F, B -> G, and so on.
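- The key-letter idea above can be sketched in C. This is a minimal illustration rather than code from the lecture: the function names `caesar_encrypt` and `caesar_decrypt` are made up for this example, the key is assumed to be a single letter, and output is forced to upper case.
```c
#include <ctype.h>
#include <stddef.h>

/* Encrypt `text` in place with a Caesar cipher.
 * `key_letter` is the letter that plaintext 'A' maps to, so 'F' means
 * a shift of 5 (A -> F, B -> G, ...). Non-letters are left unchanged. */
void caesar_encrypt(char *text, char key_letter) {
    int shift = toupper((unsigned char)key_letter) - 'A';
    for (size_t i = 0; text[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (isalpha(ch)) {
            text[i] = (char)('A' + (toupper(ch) - 'A' + shift) % 26);
        }
    }
}

/* Decrypt in place by shifting in the opposite direction. */
void caesar_decrypt(char *text, char key_letter) {
    int shift = toupper((unsigned char)key_letter) - 'A';
    for (size_t i = 0; text[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (isalpha(ch)) {
            text[i] = (char)('A' + (toupper(ch) - 'A' - shift + 26) % 26);
        }
    }
}
```
- With a key letter of F, calling `caesar_encrypt` on a buffer holding `ABC XYZ` turns it into `FGH CDE`, and `caesar_decrypt` with the same key restores it. Since there are only 25 useful shifts, trying them all is trivial.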
- ## Playfair Cipher
- Not even the large number of keys in a monoalphabetic cipher provides security.
- What is a **monoalphabetic cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:40:09.236Z
card-last-reviewed:: 2022-09-18T14:40:09.237Z
card-last-score:: 5
- A **monoalphabetic cipher** is any cipher in which the letters of the plaintext are mapped to ciphertext letters based on a single alphabetic key.
- One approach to improving security over monoalphabetic ciphers is to encrypt ^^multiple letters^^ at once.
- The **Playfair Cipher** is one example of such an approach.
- The algorithm was invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair.
- ### How does the Playfair Cipher work? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:43:16.213Z
card-last-score:: 1
- ![image.png](../assets/image_1663491286810_0.png)
- 1. Create a 5x5 grid of letters; insert the keyword as shown, with each letter only considered once; fill the grid with the remaining letters in alphabetic order.
- 2. The letters are then encrypted in pairs.
- 3. Repeats have an "X" inserted.
- BALLOON -> BA LX LO ON
- 4. Letters that fall in the same row are replaced with the letter on the right.
- OK -> GM
- 5. Letters in the same column are replaced with the letter below.
- FO -> OU
- 6. Otherwise, each letter gets replaced by the letter in its own row but in the other letter's column.
- QM -> TH
- ### Security of the Playfair Cipher
- The security is much improved over simple monoalphabetic ciphers, as the Playfair Cipher has $26^2 = 676$ combinations.
- This requires a 676 entry frequency table to analyse (as compared to a 26 entry frequency table for a monoalphabetic cipher) and correspondingly, more ciphertext.
- However, the Playfair Cipher *can* be cracked through frequency analysis of letter pairs, given a few hundred letters.
-
- ## Vigenère Cipher
- [Blaise de Vigenère](https://en.wikipedia.org/wiki/Blaise_de_Vigen%C3%A8re) is generally credited as the inventor of the **Polyalphabetic Substitution Cipher**.
- What is a **Polyalphabetic Substitution Cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:39:24.910Z
card-last-reviewed:: 2022-09-18T14:39:24.912Z
card-last-score:: 5
- A **Polyalphabetic Substitution Cipher** uses multiple substitution alphabets, as opposed to a monoalphabetic cipher which uses a single alphabetic key.
- The Vigenère Cipher improves security by using many monoalphabetic substitution alphabets, so each letter can be replaced by many others.
- You use a **key** to select which alphabet is used for each letter of the message.
- The $i^{th}$ letter of the key specifies the $i^{th}$ alphabet to use.
- Use each alphabet in turn.
- Repeat from the start after the end of the key is reached.
-
- ### Vigenère Steps
- ![image.png](../assets/image_1663494147352_0.png)
- 1. Write the plaintext out, and write the keyword underneath it, repeated, for the length of the plaintext.
- 2. Use each key letter in turn as a Caesar cipher key.
- 3. Encrypt the corresponding plaintext letter.
- In this example, we use the keyword "CIPHER". Hence, we have the following translation alphabets:
- ![image.png](../assets/image_1663494236099_0.png)
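- The three steps above can be sketched in C as follows. This is an illustrative sketch, not the lecture's code: `vigenere_encrypt` is a hypothetical name, the keyword is assumed to contain letters only, and the key position is assumed to advance only on letters of the plaintext (spaces and punctuation are copied through).
```c
#include <ctype.h>
#include <string.h>

/* Encrypt `plain` into `out` (which must hold strlen(plain) + 1 bytes).
 * Each key letter in turn is used as a Caesar key letter:
 * 'A' shifts by 0, 'B' by 1, and so on. The key repeats when exhausted. */
void vigenere_encrypt(const char *plain, const char *key, char *out) {
    size_t key_len = strlen(key);
    size_t k = 0; /* index of the next key letter to use */
    size_t i;
    for (i = 0; plain[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)plain[i];
        if (isalpha(ch) && key_len > 0) {
            int shift = toupper((unsigned char)key[k % key_len]) - 'A';
            out[i] = (char)('A' + (toupper(ch) - 'A' + shift) % 26);
            k++;
        } else {
            out[i] = plain[i];
        }
    }
    out[i] = '\0';
}
```
- With the keyword "CIPHER", the plaintext `HELLO` becomes `JMASS`: the two L's are encrypted with different key letters (P and H), so the same plaintext letter produces different ciphertext letters.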
- ### How to crack the Vigenère Cipher
- 1. Search the ciphertext for repeated strings of letters - the longer the string, the better.
- 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string, and add one.
- 3. Factorise that number.
- 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor, $n$, is most likely the length of the keyword used to encipher the plaintext.
- 5. Do a frequency count on the ciphertext, on every $n^{th}$ letter. You should end up with $n$ different frequency counts.
- 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.
- 7. Undo the shifts and read the message.
- ## Enigma (Rotor Ciphers)
- ### Rotor Ciphers
- The mechanisation / automation of encryption.
- An $\text{N}$-stage polyalphabetic algorithm modulo 26.
- $26^N$ steps before a repetition, where $N$ is the number of cylinders.
- The Enigma machine had 5 cylinders, so:
- $$26^N = 26^5 = 11{,}881{,}376 \text{ steps}$$
-
- ### Breaking Enigma using **Cribs**
- The starting point for breaking Enigma was based on the following:
- Plaintext messages were likely to contain certain phrases.
- Weather reports contained the term "WETTER VORHERSAGE".
- Military units often sent messages containing "KEINE BESONDEREN EREIGNISSE" ("nothing to report").
- A plaintext letter was never mapped onto the same ciphertext letter.
- While the cryptanalysts in Bletchley Park did not know exactly where these cribs were placed in an intercepted message, they could exclude certain positions.
- ![image.png](../assets/image_1663500888551_0.png)
- From here, possible rotor start positions & rotor wiring would be systematically examined using the "bombe" - an electromechanical device designed by Turing that replicated the action of several Enigma machines wired together.
-
- ## Transposition Ciphers
- What are **Transposition Ciphers**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:46:25.289Z
card-last-score:: 1
- **Transposition** or **Permutation Ciphers** hide the message by rearranging the letter order ^^without altering the actual letters used.^^
- This can be recognised since the ciphertext has the same frequency distribution as the original text.
- ### Rail Fence Cipher
- Write plaintext letters out diagonally over a number of rows, then read off the cipher row by row.
- ![image.png](../assets/image_1663501467907_0.png)
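- A possible C sketch of the rail fence idea is shown here. It assumes the usual zig-zag layout (down the rows, then back up), which may differ in detail from the figure; `rail_fence_encrypt` is a name invented for this example.
```c
#include <string.h>

/* Write `plain` in a zig-zag over `rails` rows, then read it off row by row
 * into `out` (which must hold strlen(plain) + 1 bytes). */
void rail_fence_encrypt(const char *plain, int rails, char *out) {
    size_t len = strlen(plain);
    size_t pos = 0;
    if (rails < 2) { /* a single row leaves the text unchanged */
        memcpy(out, plain, len + 1);
        return;
    }
    int period = 2 * (rails - 1); /* length of one down-and-up cycle */
    for (int r = 0; r < rails; r++) {
        for (size_t i = 0; i < len; i++) {
            int phase = (int)(i % (size_t)period);
            int row = (phase < rails) ? phase : period - phase;
            if (row == r) {
                out[pos++] = plain[i];
            }
        }
    }
    out[pos] = '\0';
}
```
- For example, `MEETMEAFTERTHETOGAPARTY` over 2 rows gives `MEMATRHTGPRYETEFETEOAAT`. The letters themselves are unchanged, so a single-letter frequency count of the ciphertext matches that of the plaintext.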
- ### Row Transposition Cipher
- What are **Row Transposition Ciphers**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:41:09.367Z
card-last-score:: 1
- **Row Transposition Ciphers** are a more complex kind of transposition cipher than Rail Fence Ciphers.
- Plaintext letters are written out in rows over a specified number of columns.
- The columns are then re-ordered according to some key before reading off the columns.
- ![image.png](../assets/image_1663501773385_0.png)
-
- ## Product Ciphers
- Ciphers using just substitutions or transpositions are not secure because of language characteristics.
- Consider using several ciphers in succession to make it harder to crack:
- Two substitutions make a more complex substitution.
- Two transpositions make a more complex transposition.
- However, a substitution followed by a transposition makes a much harder cipher.
-
-
- # Steganography
- What is **Steganography**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:45:30.141Z
card-last-reviewed:: 2022-09-18T14:45:30.142Z
card-last-score:: 5
- **Steganography** is an alternative to encryption that hides the existence of the message.
- For example:
- Using only a subset of letters / words in a message marked in some way.
- Using invisible ink.
- Hiding data in the least significant bits (LSBs) of a graphic image or sound file.
- The drawback of steganography is that it's not very economical in terms of overheads to hide a message.
-

View File

@ -0,0 +1,315 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[GDPR]]
- **Next Topic:** null
- **Relevant Slides:** ![ct255_02.pdf](../assets/ct255_02_1663458790357_0.pdf)
id:: 63265db7-1d41-44f7-b4cf-0bab377a7c1c
-
- ## SQL Injections
- What is an **SQL Injection**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:46:36.580Z
card-last-reviewed:: 2022-09-18T14:46:36.581Z
card-last-score:: 3
- An **SQL Injection** is a ^^code injection technique^^ used to attack data-driven applications, in which malicious SQL statements are inserted for execution.
- It is a way of exploiting user input & SQL statements to compromise the database & retrieve sensitive data.
-
- ## Basic Terminology
- What is **Cryptography**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:46:04.518Z
card-last-reviewed:: 2022-09-18T14:46:04.518Z
card-last-score:: 5
- **Cryptography** is the art encompassing the principles & methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back into its original form.
- What is **Plaintext**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:39:41.558Z
card-last-reviewed:: 2022-09-18T14:39:41.558Z
card-last-score:: 5
- **Plaintext** is the ^^original, intelligible message.^^
- What is **Ciphertext**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:39:52.620Z
card-last-reviewed:: 2022-09-18T14:39:52.621Z
card-last-score:: 5
- **Ciphertext** is the encrypted message.
- What is a **Cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:44:53.575Z
card-last-reviewed:: 2022-09-18T14:44:53.575Z
card-last-score:: 5
- A **Cipher** is an algorithm for transforming an intelligible message into one that is unintelligible.
- What is a **Key**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:43:53.075Z
card-last-reviewed:: 2022-09-18T14:43:53.075Z
card-last-score:: 3
- A **Key** is some critical information used by the cipher, known only to the sender & receiver, selected from a **keyspace** (the set of all possible keys).
- What does **Encipher** mean? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:46:57.114Z
card-last-reviewed:: 2022-09-18T14:46:57.114Z
card-last-score:: 5
- **Enciphering** is the process of converting plaintext into ciphertext using a cipher & a key.
- What does **Decipher** mean? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:43:30.989Z
card-last-reviewed:: 2022-09-18T14:43:30.989Z
card-last-score:: 5
- **Deciphering** is the process of converting ciphertext back into plaintext using a cipher & a key.
- What is **Encryption**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:44:36.754Z
card-last-reviewed:: 2022-09-18T14:44:36.755Z
card-last-score:: 3
- **Encryption** is some mathematical function $E_K()$ mapping plaintext $P$ to ciphertext $C$ using the specified key $K$.
- $$C=E_K(P)$$
- What is **Decryption**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:45:19.915Z
card-last-reviewed:: 2022-09-18T14:45:19.915Z
card-last-score:: 3
- **Decryption** is some mathematical function ${E_K}^{-1}()$ mapping the ciphertext $C$ to plaintext $P$ using the specified key $K$.
- $$P={E_K}^{-1}(C)$$
- What is **Cryptanalysis**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:41:43.525Z
card-last-reviewed:: 2022-09-18T14:41:43.525Z
card-last-score:: 5
- **Cryptanalysis** is the study of principles & methods of transforming an unintelligible message into an intelligible message without knowledge of the key.
- What is **Cryptology**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:39:33.475Z
card-last-reviewed:: 2022-09-18T14:39:33.475Z
card-last-score:: 3
- **Cryptology** is the field encompassing both cryptography & cryptanalysis.
-
-
- ## Model of Conventional Cryptosystem
- ![image.png](../assets/image_1663459919021_0.png){:height 304, :width 610}
- ## Cryptanalysis via Letter Frequency Distribution
- Human languages are **redundant** - letters are not equally commonly used.
- In the **English** language:
- **E** is by far the most common letter followed by T, R, N, I, O, A, and S.
- Other letters like Z, J, K, Q, and X are fairly rare.
- Certain letter combinations like **TH** are quite common.
- ![image.png](../assets/image_1663488626792_0.png)
-
- ### C Program for Frequency Analysis of single Characters
- ```c
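#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(int argc, char* argv[]) {
    FILE* fp;
    int data[26];
    int c; /* int rather than char, so EOF can be told apart from real characters */
    memset(data, 0, sizeof(data));
    if (argc != 2) {
        return(-1);
    }
    /* Assign first, then compare the returned pointer against NULL */
    if ((fp = fopen(argv[1], "r")) == NULL) {
        return(-2);
    }
    /* Read until fgetc() itself reports EOF, instead of testing feof() before reading */
    while ((c = fgetc(fp)) != EOF) {
        c = toupper(c);
        if ((c >= 'A') && (c <= 'Z')) {
            data[c - 'A']++;
        }
    }
    for (int i = 0; i < 26; i++) {
        printf("%c:%i\n", i + 'A', data[i]);
    }
    fclose(fp);
    return(0);
}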
```
- ## Known Plaintext Attacks (KPA)
- What is a **Known Plaintext Attack (KPA)**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:46:48.201Z
card-last-reviewed:: 2022-09-18T14:46:48.202Z
card-last-score:: 3
- The **Known Plaintext Attack (KPA)** is an attack model for cryptanalysis where the attacker has access to both:
- some of, or all of, the plaintext (called a **crib**)
- the ciphertext
-
-
- ## Caesar Cipher
- What is a **Caesar Cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:44:47.075Z
card-last-reviewed:: 2022-09-18T14:44:47.075Z
card-last-score:: 5
- A **Caesar Cipher** involves using an offset alphabet to encrypt a message.
- We can use any shift from 1 to 25 to replace each plaintext letter with a letter a fixed distance away.
- The **key letter** represents the start of this offset alphabet.
- For example, a key letter of F means that A -> F, B -> G, and so on.
- ## Playfair Cipher
- Not even the large number of keys in a monoalphabetic cipher provides security.
- What is a **monoalphabetic cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:40:09.236Z
card-last-reviewed:: 2022-09-18T14:40:09.237Z
card-last-score:: 5
- A **monoalphabetic cipher** is any cipher in which the letters of the plaintext are mapped to ciphertext letters based on a single alphabetic key.
- One approach to improving security over monoalphabetic ciphers is to encrypt ^^multiple letters^^ at once.
- The **Playfair Cipher** is one example of such an approach.
- The algorithm was invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair.
- ### How does the Playfair Cipher work? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:43:16.213Z
card-last-score:: 1
- ![image.png](../assets/image_1663491286810_0.png)
- 1. Create a 5x5 grid of letters; insert the keyword as shown, with each letter only considered once; fill the grid with the remaining letters in alphabetic order.
- 2. The letters are then encrypted in pairs.
- 3. Repeats have an "X" inserted.
- BALLOON -> BA LX LO ON
- 4. Letters that fall in the same row are replaced with the letter on the right.
- OK -> GM
- 5. Letters in the same column are replaced with the letter below.
- FO -> OU
- 6. Otherwise, each letter gets replaced by the letter in its own row but in the other letter's column.
- QM -> TH
- ### Security of the Playfair Cipher
- The security is much improved over simple monoalphabetic ciphers, as the Playfair Cipher has $26^2 = 676$ combinations.
- This requires a 676 entry frequency table to analyse (as compared to a 26 entry frequency table for a monoalphabetic cipher) and correspondingly, more ciphertext.
- However, the Playfair Cipher *can* be cracked through frequency analysis of letter pairs, given a few hundred letters.
-
- ## Vigenère Cipher
- [Blaise de Vigenère](https://en.wikipedia.org/wiki/Blaise_de_Vigen%C3%A8re) is generally credited as the inventor of the **Polyalphabetic Substitution Cipher**.
- What is a **Polyalphabetic Substitution Cipher**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:39:24.910Z
card-last-reviewed:: 2022-09-18T14:39:24.912Z
card-last-score:: 5
- A **Polyalphabetic Substitution Cipher** uses multiple substitution alphabets, as opposed to a monoalphabetic cipher which uses a single alphabetic key.
- The Vigenère Cipher improves security by using many monoalphabetic substitution alphabets, so each letter can be replaced by many others.
- You use a **key** to select which alphabet is used for each letter of the message.
- The $i^{th}$ letter of the key specifies the $i^{th}$ alphabet to use.
- Use each alphabet in turn.
- Repeat from the start after the end of the key is reached.
-
- ### Vigenère Steps
- ![image.png](../assets/image_1663494147352_0.png)
- 1. Write the plaintext out, and write the keyword underneath it, repeated, for the length of the plaintext.
- 2. Use each key letter in turn as a Caesar cipher key.
- 3. Encrypt the corresponding plaintext letter.
- In this example, we use the keyword "CIPHER". Hence, we have the following translation alphabets:
- ![image.png](../assets/image_1663494236099_0.png)
- ### How to crack the Vigenère Cipher
- 1. Search the ciphertext for repeated strings of letters - the longer the string, the better.
- 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string, and add one.
- 3. Factorise that number.
- 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor, $n$, is most likely the length of the keyword used to encipher the plaintext.
- 5. Do a frequency count on the ciphertext, on every $n^{th}$ letter. You should end up with $n$ different frequency counts.
- 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.
- 7. Undo the shifts and read the message.
- ## Enigma (Rotor Ciphers)
- ### Rotor Ciphers
- The mechanisation / automation of encryption.
- An $\text{N}$-stage polyalphabetic algorithm modulo 26.
- $26^N$ steps before a repetition, where $N$ is the number of cylinders.
- The Enigma machine had 5 cylinders, so:
- $$26^N = 26^5 = 11{,}881{,}376 \text{ steps}$$
-
- ### Breaking Enigma using **Cribs**
- The starting point for breaking Enigma was based on the following:
- Plaintext messages were likely to contain certain phrases.
- Weather reports contained the term "WETTER VORHERSAGE".
- Military units often sent messages containing "KEINE BESONDEREN EREIGNISSE" ("nothing to report").
- A plaintext letter was never mapped onto the same ciphertext letter.
- While the cryptanalysts in Bletchley Park did not know exactly where these cribs were placed in an intercepted message, they could exclude certain positions.
- ![image.png](../assets/image_1663500888551_0.png)
- From here, possible rotor start positions & rotor wiring would be systematically examined using the "bombe" - an electromechanical device designed by Turing that replicated the action of several Enigma machines wired together.
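- The position-exclusion step can be illustrated with a small C helper. It relies only on the property stated above, that a plaintext letter was never mapped onto the same ciphertext letter; the function name `crib_fits` and the idea of sliding the crib along the message one offset at a time are assumptions made for this sketch.
```c
#include <string.h>

/* Return 1 if `crib` could line up with `cipher` starting at `offset`,
 * i.e. no crib letter coincides with the ciphertext letter in the same
 * position. Return 0 if the position can be excluded. */
int crib_fits(const char *cipher, const char *crib, size_t offset) {
    size_t cipher_len = strlen(cipher);
    size_t crib_len = strlen(crib);
    if (offset + crib_len > cipher_len) {
        return 0; /* the crib would run past the end of the message */
    }
    for (size_t i = 0; i < crib_len; i++) {
        if (cipher[offset + i] == crib[i]) {
            return 0; /* a letter would encrypt to itself: impossible */
        }
    }
    return 1;
}
```
- Calling `crib_fits` for every offset in an intercepted message leaves only the candidate alignments that are worth testing further.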
-
- ## Transposition Ciphers
- What are **Transposition Ciphers**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:46:25.289Z
card-last-score:: 1
- **Transposition** or **Permutation Ciphers** hide the message by rearranging the letter order ^^without altering the actual letters used.^^
- This can be recognised since the ciphertext has the same frequency distribution as the original text.
- ### Rail Fence Cipher
- Write plaintext letters out diagonally over a number of rows, then read off the cipher row by row.
- ![image.png](../assets/image_1663501467907_0.png)
- ### Row Transposition Cipher
- What are **Row Transposition Ciphers**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:41:09.367Z
card-last-score:: 1
- **Row Transposition Ciphers** are a more complex kind of transposition cipher than Rail Fence Ciphers.
- Plaintext letters are written out in rows over a specified number of columns.
- The columns are then re-ordered according to some key before reading off the columns.
- ![image.png](../assets/image_1663501773385_0.png)
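- One way to sketch this in C is shown here. The conventions are assumptions for illustration and may not match the figure exactly: `key[c]` gives the order in which column `c` is read out (a permutation of 1..`cols`), and the plaintext length is assumed to be a multiple of `cols` (pad it beforehand, otherwise trailing letters are ignored). `row_transposition_encrypt` is a hypothetical name.
```c
#include <string.h>

/* Treat `plain` as rows of `cols` letters, then read whole columns in the
 * order given by `key` into `out` (which must hold strlen(plain) + 1 bytes). */
void row_transposition_encrypt(const char *plain, const int *key,
                               int cols, char *out) {
    size_t len = strlen(plain);
    size_t rows = len / (size_t)cols;
    size_t pos = 0;
    for (int rank = 1; rank <= cols; rank++) {
        for (int c = 0; c < cols; c++) {
            if (key[c] != rank) {
                continue; /* not this column's turn yet */
            }
            for (size_t r = 0; r < rows; r++) {
                out[pos++] = plain[r * (size_t)cols + (size_t)c];
            }
        }
    }
    out[pos] = '\0';
}
```
- With the key 4 3 1 2 5 6 7 and the padded plaintext `ATTACKPOSTPONEDUNTILTWOAMXYZ`, the third column is read first, giving `TTNAAPTMTSUOAODWCOIXKNLYPETZ`.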
-
- ## Product Ciphers
- Ciphers using just substitutions or transpositions are not secure because of language characteristics.
- Consider using several ciphers in succession to make it harder to crack:
- Two substitutions make a more complex substitution.
- Two transpositions make a more complex transposition.
- However, a substitution followed by a transposition makes a much harder cipher.
-
-
- # Steganography
- What is **Steganography**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-22T14:45:30.141Z
card-last-reviewed:: 2022-09-18T14:45:30.142Z
card-last-score:: 5
- **Steganography** is an alternative to encryption that hides the existence of the message.
- For example:
- Using only a subset of letters / words in a message marked in some way.
- Using invisible ink.
- Hiding data in the least significant bits (LSBs) of a graphic image or sound file.
- The drawback of steganography is that it's not very economical in terms of overheads to hide a message.
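- The LSB technique and its overhead can be illustrated with a short C sketch. It assumes one hidden bit per carrier byte (for instance one byte per pixel channel or audio sample), which is a common choice but not the only one; `lsb_embed` and `lsb_extract` are names invented for this example.
```c
#include <stddef.h>

/* Hide `msg_len` bytes of `msg` in the least significant bits of `carrier`,
 * one bit per carrier byte, most significant message bit first.
 * Returns 0 on success, or -1 if the carrier is too small. */
int lsb_embed(unsigned char *carrier, size_t carrier_len,
              const unsigned char *msg, size_t msg_len) {
    if (carrier_len / 8 < msg_len) {
        return -1; /* 8 carrier bytes are needed per message byte */
    }
    for (size_t i = 0; i < msg_len; i++) {
        for (int bit = 0; bit < 8; bit++) {
            unsigned char b = (unsigned char)((msg[i] >> (7 - bit)) & 1u);
            size_t idx = i * 8 + (size_t)bit;
            carrier[idx] = (unsigned char)((carrier[idx] & 0xFEu) | b);
        }
    }
    return 0;
}

/* Recover `msg_len` bytes from the least significant bits of `carrier`. */
void lsb_extract(const unsigned char *carrier, unsigned char *msg,
                 size_t msg_len) {
    for (size_t i = 0; i < msg_len; i++) {
        unsigned char value = 0;
        for (int bit = 0; bit < 8; bit++) {
            value = (unsigned char)((value << 1) |
                                    (carrier[i * 8 + (size_t)bit] & 1u));
        }
        msg[i] = value;
    }
}
```
- Under this assumption, every hidden byte costs eight bytes of carrier data, which shows how high the overhead can be.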
-

View File

@ -0,0 +1,316 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[GDPR]]
- **Next Topic:** [[Human Security & Passwords]]
- **Relevant Slides:** ![ct255_02.pdf](../assets/ct255_02_1663458790357_0.pdf)
id:: 63265db7-1d41-44f7-b4cf-0bab377a7c1c
-
- ## SQL Injections
- What is an **SQL Injection**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-08T12:29:41.822Z
card-last-reviewed:: 2022-10-04T12:29:41.822Z
card-last-score:: 5
-
- An **SQL Injection** is a ^^code injection technique^^ used to attack data-driven applications, in which malicious SQL statements are inserted for execution.
- It is a way of exploiting user input & SQL statements to compromise the database & retrieve sensitive data.
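- To see how user input can change the meaning of a statement, consider this deliberately unsafe C sketch that builds a query by pasting input straight into the SQL text. The table name `users`, the column `name`, and the function `build_login_query` are all hypothetical and exist only for this illustration.
```c
#include <stdio.h>

/* UNSAFE on purpose: the username is inserted into the SQL text unchanged,
 * so any quote characters it contains become part of the statement.
 * Returns the value of snprintf (the length the full query would have). */
int build_login_query(char *out, size_t out_size, const char *username) {
    return snprintf(out, out_size,
                    "SELECT * FROM users WHERE name = '%s';", username);
}
```
- For the input `alice` this yields `SELECT * FROM users WHERE name = 'alice';`, but for the input `' OR '1'='1` it yields `SELECT * FROM users WHERE name = '' OR '1'='1';`, whose condition is always true. The usual defence is to use parameterised queries (prepared statements) so that input is never interpreted as SQL.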
-
- ## Basic Terminology
- What is **Cryptography**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:13:18.768Z
card-last-reviewed:: 2022-09-30T12:13:18.768Z
card-last-score:: 5
- **Cryptography** is the art encompassing the principles & methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back into its original form.
- What is **Plaintext**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:07:59.453Z
card-last-reviewed:: 2022-09-30T12:07:59.456Z
card-last-score:: 5
- **Plaintext** is the ^^original, intelligible message.^^
- What is **Ciphertext**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T08:29:48.663Z
card-last-reviewed:: 2022-09-30T08:29:48.663Z
card-last-score:: 5
- **Ciphertext** is the encrypted message.
- What is a **Cipher**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T12:11:06.873Z
card-last-reviewed:: 2022-09-30T12:11:06.874Z
card-last-score:: 3
- A **Cipher** is an algorithm for transforming an intelligible message into one that is unintelligible.
- What is a **Key**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T12:09:48.972Z
card-last-reviewed:: 2022-09-30T12:09:48.972Z
card-last-score:: 5
- A **Key** is some critical information used by the cipher, known only to the sender & receiver, selected from a **keyspace** (the set of all possible keys).
- What does **Encipher** mean? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:13:23.717Z
card-last-reviewed:: 2022-09-30T12:13:23.717Z
card-last-score:: 5
- **Enciphering** is the process of converting plaintext into ciphertext using a cipher & a key.
- What does **Decipher** mean?
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:09:14.829Z
card-last-reviewed:: 2022-09-30T12:09:14.829Z
card-last-score:: 5
- **Deciphering** is the process of converting ciphertext back into plaintext using a cipher & a key.
- What is **Encryption**? #card
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-10-04T12:09:55.038Z
card-last-interval:: 4
card-ease-factor:: 2.46
card-last-reviewed:: 2022-09-30T12:09:55.039Z
- **Encryption** is some mathematical function $E_K()$ mapping plaintext $P$ to ciphertext $C$ using the specified key $K$.
- $$C=E_K(P)$$
- What is **Decryption**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-04T12:13:15.777Z
card-last-reviewed:: 2022-09-30T12:13:15.777Z
card-last-score:: 3
- **Decryption** is some mathematical function ${E_K}^{-1}()$ mapping the ciphertext $C$ to plaintext $P$ using the specified key $K$.
- $$P={E_K}^{-1}(C)$$
- What is **Cryptanalysis**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:08:29.141Z
card-last-reviewed:: 2022-09-30T12:08:29.141Z
card-last-score:: 5
- **Cryptanalysis** is the study of principles & methods of transforming an unintelligible message into an intelligible message without knowledge of the key.
- What is **Cryptology**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-04T09:29:41.789Z
card-last-reviewed:: 2022-09-30T09:29:41.789Z
card-last-score:: 3
- **Cryptology** is the field encompassing both cryptography & cryptanalysis.
-
-
- ## Model of Conventional Cryptosystem
- ![image.png](../assets/image_1663459919021_0.png){:height 304, :width 610}
- ## Cryptanalysis via Letter Frequency Distribution
- Human languages are **redundant** - letters are not equally commonly used.
- In the **English** language:
- **E** is by far the most common letter followed by T, R, N, I, O, A, and S.
- Other letters like Z, J, K, Q, and X are fairly rare.
- Certain letter combinations like **TH** are quite common.
- ![image.png](../assets/image_1663488626792_0.png)
-
- ### C Program for Frequency Analysis of single Characters
- ```c
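#include <stdio.h>
#include <string.h>
#include <ctype.h>
int main(int argc, char* argv[]) {
    FILE* fp;
    int data[26];
    int c; /* int rather than char, so EOF can be told apart from real characters */
    memset(data, 0, sizeof(data));
    if (argc != 2) {
        return(-1);
    }
    /* Assign first, then compare the returned pointer against NULL */
    if ((fp = fopen(argv[1], "r")) == NULL) {
        return(-2);
    }
    /* Read until fgetc() itself reports EOF, instead of testing feof() before reading */
    while ((c = fgetc(fp)) != EOF) {
        c = toupper(c);
        if ((c >= 'A') && (c <= 'Z')) {
            data[c - 'A']++;
        }
    }
    for (int i = 0; i < 26; i++) {
        printf("%c:%i\n", i + 'A', data[i]);
    }
    fclose(fp);
    return(0);
}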
```
- ## Known Plaintext Attacks (KPA)
- What is a **Known Plaintext Attack (KPA)**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T14:39:41.014Z
card-last-reviewed:: 2022-10-06T09:39:41.014Z
card-last-score:: 5
- The **Known Plaintext Attack (KPA)** is an attack model for cryptanalysis where the attacker has access to both:
- some of, or all of, the plaintext (called a **crib**)
- the ciphertext
-
-
- ## Caesar Cipher
- What is a **Caesar Cipher**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:09:58.027Z
card-last-reviewed:: 2022-09-30T12:09:58.027Z
card-last-score:: 5
- A **Caesar Cipher** involves using an offset alphabet to encrypt a message.
- We can use any shift from 1 to 25 to replace each plaintext letter with a letter a fixed distance away.
- The **key letter** represents the start of this offset alphabet.
- For example, a key letter of F means that A -> F, B -> G, and so on.
- ## Playfair Cipher
- Not even the large number of keys in a monoalphabetic cipher provides security.
- What is a **monoalphabetic cipher**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:08:03.733Z
card-last-reviewed:: 2022-09-30T12:08:03.733Z
card-last-score:: 5
- A **monoalphabetic cipher** is any cipher in which the letters of the plaintext are mapped to ciphertext letters based on a single alphabetic key.
- One approach to improving security over monoalphabetic ciphers is to encrypt ^^multiple letters^^ at once.
- The **Playfair Cipher** is one example of such an approach.
- The algorithm was invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair.
- ### How does the Playfair Cipher work?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-10-08T00:33:19.557Z
card-last-interval:: 3.51
card-ease-factor:: 2.6
card-last-reviewed:: 2022-10-04T12:33:19.558Z
- ![image.png](../assets/image_1663491286810_0.png)
- 1. Create a 5x5 grid of letters; insert the keyword as shown, with each letter only considered once; fill the grid with the remaining letters in alphabetic order.
- 2. The letters are then encrypted in pairs.
- 3. Repeats have an "X" inserted.
- BALLOON -> BA LX LO ON
- 4. Letters that fall in the same row are replaced with the letter on the right.
- OK -> GM
- 5. Letters in the same column are replaced with the letter below.
- FO -> OU
- 6. Otherwise, each letter gets replaced by the letter in its own row but in the other letter's column.
- QM -> TH
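- Steps 2 and 3 (splitting into pairs and handling repeats) can be sketched in C as below. The grid lookups of steps 4 to 6 are not covered. Padding a final unpaired letter with "X" is a common convention assumed here rather than something stated in the steps, and `playfair_pairs` is a hypothetical name.
```c
#include <ctype.h>
#include <stddef.h>

/* Split the letters of `plain` into upper-case pairs separated by spaces.
 * If both letters of a pair would be the same, an 'X' is inserted after the
 * first one. `out` must hold at least 3 * strlen(plain) + 1 bytes. */
void playfair_pairs(const char *plain, char *out) {
    size_t pos = 0;
    size_t i = 0;
    while (plain[i] != '\0') {
        if (!isalpha((unsigned char)plain[i])) {
            i++; /* skip spaces and punctuation */
            continue;
        }
        char first = (char)toupper((unsigned char)plain[i]);
        i++;
        /* Look ahead to the next letter, if there is one. */
        size_t j = i;
        while (plain[j] != '\0' && !isalpha((unsigned char)plain[j])) {
            j++;
        }
        char second = 'X'; /* filler for a repeat or a trailing letter */
        if (plain[j] != '\0' && toupper((unsigned char)plain[j]) != first) {
            second = (char)toupper((unsigned char)plain[j]);
            i = j + 1; /* both letters consumed */
        }
        if (pos > 0) {
            out[pos++] = ' ';
        }
        out[pos++] = first;
        out[pos++] = second;
    }
    out[pos] = '\0';
}
```
- For the input `BALLOON` this produces `BA LX LO ON`, matching the example in step 3.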
- ### Security of the Playfair Cipher
- The security is much improved over simple monoalphabetic ciphers, as the Playfair Cipher has $26^2 = 676$ combinations.
- This requires a 676 entry frequency table to analyse (as compared to a 26 entry frequency table for a monoalphabetic cipher) and correspondingly, more ciphertext.
- However, the Playfair Cipher *can* be cracked through frequency analysis of letter pairs, given a few hundred letters.
-
- ## Vigenère Cipher
- [Blaise de Vigenère](https://en.wikipedia.org/wiki/Blaise_de_Vigen%C3%A8re) is generally credited as the inventor of the **Polyalphabetic Substitution Cipher**.
- What is a **Polyalphabetic Substitution Cipher**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:29:32.337Z
card-last-score:: 1
- A **Polyalphabetic Substitution Cipher** uses multiple substitution alphabets, as opposed to a monoalphabetic cipher which uses a single alphabetic key.
- The Vigenère Cipher improves security by using many monoalphabetic substitution alphabets, so each letter can be replaced by many others.
- You use a **key** to select which alphabet is used for each letter of the message.
- The $i^{th}$ letter of the key specifies the $i^{th}$ alphabet to use.
- Use each alphabet in turn.
- Repeat from the start after the end of the key is reached.
-
- ### Vigenère Steps
- ![image.png](../assets/image_1663494147352_0.png)
- 1. Write the plaintext out, and write the keyword underneath it, repeated, for the length of the plaintext.
- 2. Use each key letter in turn as a Caesar cipher key.
- 3. Encrypt the corresponding plaintext letter.
- In this example, we use the keyword "CIPHER". Hence, we have the following translation alphabets:
- ![image.png](../assets/image_1663494236099_0.png)
- ### How to crack the Vigenère Cipher
- 1. Search the ciphertext for repeated strings of letters - the longer the string, the better.
- 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string, and add one.
- 3. Factorise that number.
- 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor, $n$, is most likely the length of the keyword used to encipher the ciphertext.
- 5. Do a frequency count on the ciphertext, on every $n^{th}$ letter. You should end up with $n$ different frequency counts.
- 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.
- 7. Undo the shifts and read the message.
- ## Enigma (Rotor Ciphers)
- ### Rotor Ciphers
- The mechanisation / automation of encryption.
- An $\text{N}$-stage polyalphabetic algorithm modulo 26.
- $26^N$ steps before a repetition, where $N$ is the number of cylinders.
- The Enigma machine had 5 cylinders, so:
- $$26^{N=5}=11,881,376 \text{ steps}$$
-
- ### Breaking Enigma using **Cribs**
- The starting point for breaking Enigma was based on the following:
- Plaintext messages were likely to contain certain phrases.
- Weather reports contained the term "WETTER VORHERSAGE".
- Military units often sent messages containing "KEINE BESONDEREN EREIGNISSE" ("nothing to report").
- A plaintext letter was never mapped onto the same ciphertext letter.
- While the cryptanalysts in Bletchley Park did not know exactly where these cribs were placed in an intercepted message, they could exclude certain positions.
- ![image.png](../assets/image_1663500888551_0.png)
- From here, possible rotor start positions & rotor wiring would be systematically examined using the "bombe" - an electromechanical device designed by Turing that replicated the action of several Enigma machines wired together.
-
- ## Transposition Ciphers
- What are **Transposition Ciphers**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-04T09:08:18.147Z
card-last-reviewed:: 2022-09-30T09:08:18.150Z
card-last-score:: 3
- **Transposition** or **Permutation Ciphers** hide the message by rearranging the letter order ^^without altering the actual letters used.^^
- This can be recognised since the ciphertext has the same frequency distribution as the original text.
- ### Rail Fence Cipher
- Write plaintext letters out diagonally over a number of rows, then read off the cipher row by row.
- ![image.png](../assets/image_1663501467907_0.png)
- ### Row Transposition Cipher
- What are **Row Transposition Ciphers**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-03T19:11:23.311Z
card-last-reviewed:: 2022-09-30T09:11:23.312Z
card-last-score:: 3
- **Row Transposition Ciphers** are a more complex kind of transposition cipher than Rail Fence Ciphers.
- Plaintext letters are written out in rows over a specified number of columns.
- The columns are then re-ordered according to some key before reading off the columns.
- ![image.png](../assets/image_1663501773385_0.png)
-
- ## Product Ciphers
- Ciphers using just substitutions or transpositions are not secure because of language characteristics.
- Consider using several ciphers in succession to make it harder to crack:
- Two substitutions make a more complex substitution.
- Two transpositions make a more complex transposition.
- However, a substitution followed by a transposition makes a much harder cipher.
-
-
- # Steganography
- What is **Steganography**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-08T12:33:34.285Z
card-last-reviewed:: 2022-10-04T12:33:34.285Z
card-last-score:: 5
- **Steganography** is an alternative to encryption that hides the existence of the message.
- For example:
- Using only a subset of letters / words in a message marked in some way.
- Using invisible ink.
- Hiding data in the least significant bits (LSBs) of a graphic image or sound file.
- The drawback of steganography is that it's not very economical in terms of overheads to hide a message.
-

View File

@ -0,0 +1,316 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[GDPR]]
- **Next Topic:** [[Human Security & Passwords]]
- **Relevant Slides:** ![ct255_02.pdf](../assets/ct255_02_1663458790357_0.pdf)
id:: 63265db7-1d41-44f7-b4cf-0bab377a7c1c
-
- ## SQL Injections
- What is an **SQL Injection**? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-19T12:58:54.850Z
card-last-reviewed:: 2022-10-08T22:58:54.851Z
card-last-score:: 5
-
- An **SQL Injection** is a ***code injection technique*** used to attack data-driven applications, in which malicious SQL statements are inserted for execution.
- It is a way of exploiting user input & SQL statements to compromise the database & retrieve sensitive data.
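- A minimal sketch of how such an injection arises, assuming a hypothetical `users` table with a `name` column and a hypothetical helper `build_login_query` (none of these names come from the slides). It only builds the query text and does not talk to a real database:
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* UNSAFE on purpose: pastes the user's input straight into the SQL text.
   The table and column names are hypothetical. */
int build_login_query(char *out, size_t out_size, const char *username) {
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size,
                    "SELECT * FROM users WHERE name = '%s'", username);
}
```
- With the input `alice` the query is `SELECT * FROM users WHERE name = 'alice'`, but the input `' OR '1'='1` turns it into `SELECT * FROM users WHERE name = '' OR '1'='1'`, whose condition is always true. Parameterised queries avoid this because the input is never parsed as SQL.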
-
- ## Basic Terminology
- What is **Cryptography**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-11-10T01:19:02.918Z
card-last-reviewed:: 2022-10-07T10:19:02.918Z
card-last-score:: 5
- **Cryptography** is the art or science encompassing the principles & methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back into its original form.
- What is **Plaintext**?
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-17T21:24:55.999Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-06T17:24:55.999Z
- **Plaintext** is the ^^original, intelligible message.^^
- What is **Ciphertext**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-18T14:39:00.206Z
card-last-reviewed:: 2022-10-07T10:39:00.207Z
card-last-score:: 5
- **Ciphertext** is the encrypted message.
- What is a **Cipher**? #card
card-last-interval:: 4.14
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T01:43:58.055Z
card-last-reviewed:: 2022-10-08T22:43:58.056Z
card-last-score:: 5
- A **Cipher** is an algorithm for transforming an intelligible message into one that is unintelligible.
- What is a **Key**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:47:20.304Z
card-last-reviewed:: 2022-10-07T10:47:20.304Z
card-last-score:: 3
- A **Key** is some critical information used by the cipher, known only to the sender & receiver, selected from a **keyspace** (the set of all possible keys).
- What does **Encipher** mean?
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-17T21:14:43.789Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-06T17:14:43.790Z
- **Enciphering** is the process of converting plaintext into ciphertext using a cipher & a key.
- What does **Decipher** mean?
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:09:14.829Z
card-last-reviewed:: 2022-09-30T12:09:14.829Z
card-last-score:: 5
- **Deciphering** is the process of converting ciphertext back into plaintext using a cipher & a key.
- What is **Encryption**? #card
card-last-score:: 3
card-repeats:: 3
card-next-schedule:: 2022-10-16T16:40:50.413Z
card-last-interval:: 9.28
card-ease-factor:: 2.32
card-last-reviewed:: 2022-10-07T10:40:50.414Z
- **Encryption** is some mathematical function $E_K()$ mapping plaintext $P$ to ciphertext $C$ using the specified key $K$.
- $$E_K(P) = C$$
- What is **Decryption**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:22:54.739Z
card-last-score:: 1
- **Decryption** is some mathematical function ${E_K}^{-1}()$ mapping the ciphertext $C$ to plaintext $P$ using the specified key $K$.
- $$P={E_K}^{-1}(C)$$
- What is **Cryptanalysis**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-17T21:28:14.325Z
card-last-reviewed:: 2022-10-06T17:28:14.325Z
card-last-score:: 5
- **Cryptanalysis** is the study of principles & methods of transforming an unintelligible message into an intelligible message without knowledge of the key.
- What is **Cryptology**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-16T16:40:03.776Z
card-last-reviewed:: 2022-10-07T10:40:03.777Z
card-last-score:: 5
- **Cryptology** is the field encompassing both cryptography & cryptanalysis.
-
-
- ## Model of Conventional Cryptosystem
- ![image.png](../assets/image_1663459919021_0.png){:height 304, :width 610}
- ## Cryptanalysis via Letter Frequency Distribution
- Human languages are **redundant** - letters are not equally commonly used.
- In the **English** language:
- **E** is by far the most common letter followed by T, R, N, I, O, A, and S.
- Other letters like Z, J, K, Q, and X are fairly rare.
- Certain letter combinations like **TH** are quite common.
- ![image.png](../assets/image_1663488626792_0.png)
-
- ### C Program for Frequency Analysis of single Characters
- ```c
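#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Count how often each letter A-Z occurs in the file named by argv[1]. */
int main(int argc, char* argv[]) {
    FILE* fp;
    int data[26];
    int c;  /* int, not char, so that EOF can be told apart from data */

    memset(data, 0, sizeof(data));
    if (argc != 2) {
        return(-1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        return(-2);
    }
    /* Stop when fgetc() itself reports EOF rather than testing feof() first */
    while ((c = fgetc(fp)) != EOF) {
        c = toupper(c);
        if ((c >= 'A') && (c <= 'Z')) {
            data[c - 'A']++;
        }
    }
    for (int i = 0; i < 26; i++) {
        printf("%c:%i\n", 'A' + i, data[i]);
    }
    fclose(fp);
    return(0);
}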
```
- ## Known Plaintext Attacks (KPA)
- What is a **Known Plaintext Attack (KPA)**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T14:39:41.014Z
card-last-reviewed:: 2022-10-06T09:39:41.014Z
card-last-score:: 5
- The **Known Plaintext Attack (KPA)** is an attack model for cryptanalysis where the attacker has access to both:
- some of, or all of, the plaintext (called a **crib**)
- the ciphertext
-
-
- ## Caesar Cipher
- What is a **Caesar Cipher**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-17T21:27:16.902Z
card-last-reviewed:: 2022-10-06T17:27:16.903Z
card-last-score:: 5
- A **Caesar Cipher** involves using an offset alphabet to encrypt a message.
- We can use any shift from 1 to 25 to replace each plaintext letter with a letter a fixed distance away.
- The **key letter** represents the start of this offset alphabet.
- For example, a key letter of F means that A -> F, B -> G, and so on.
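- The key-letter idea above can be sketched as follows. The function names `caesar_encrypt` and `caesar_decrypt` are illustrative assumptions, and leaving non-letters unchanged is a choice made here rather than something the notes specify:
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Shift each letter of `text` in place. `key_letter` is the letter that
   plaintext 'A' maps to, so 'F' means a shift of 5. Non-letters are kept. */
void caesar_encrypt(char *text, char key_letter) {
    assert(isalpha((unsigned char)key_letter));
    int shift = toupper((unsigned char)key_letter) - 'A';
    size_t len = strlen(text);
    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (isupper(ch)) {
            text[i] = (char)('A' + (ch - 'A' + shift) % 26);
        } else if (islower(ch)) {
            text[i] = (char)('a' + (ch - 'a' + shift) % 26);
        }
    }
}

/* Decryption is encryption with the complementary shift (26 - shift). */
void caesar_decrypt(char *text, char key_letter) {
    assert(isalpha((unsigned char)key_letter));
    int shift = toupper((unsigned char)key_letter) - 'A';
    caesar_encrypt(text, (char)('A' + (26 - shift) % 26));
}
```
- With key letter `F`, the text `ABC xyz` encrypts to `FGH cde`, and decrypting with the same key letter restores it.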
- ## Playfair Cipher
- Not even the large number of keys in a monoalphabetic cipher provides security.
- What is a **monoalphabetic cipher**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T22:27:13.914Z
card-last-reviewed:: 2022-10-06T17:27:13.915Z
card-last-score:: 3
- A **monoalphabetic cipher** is any cipher in which the letters of the plaintext are mapped to ciphertext letters based on a single alphabetic key.
- One approach to improving security over monoalphabetic ciphers is to encrypt ^^multiple letters^^ at a time.
- The **Playfair Cipher** is one example of such an approach.
- The algorithm was invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair.
- ### How does the Playfair Cipher work?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-10-08T00:33:19.557Z
card-last-interval:: 3.51
card-ease-factor:: 2.6
card-last-reviewed:: 2022-10-04T12:33:19.558Z
- ![image.png](../assets/image_1663491286810_0.png)
- 1. Create a 5x5 grid of letters; insert the keyword as shown, with each letter only considered once; fill the grid with the remaining letters in alphabetic order.
- 2. The letters are then encrypted in pairs.
- 3. Repeats have an "X" inserted.
- BALLOON -> BA LX LO ON
- 4. Letters that fall in the same row are replaced with the letter on the right.
- OK -> GM
- 5. Letters in the same column are replaced with the letter below.
- FO -> OU
- 6. Otherwise, each letter gets replaced by the letter in its own row but in the other letter's column.
- QM -> TH
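- A rough sketch of the six steps above. The keyword in the slide image is not reproduced in these notes, so the usage note assumes the keyword `MONARCHY`; the function names and the convention that I and J share one cell are also assumptions made for illustration:
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

static char grid[5][5];

/* Step 1: keyword letters first (each once), then the remaining letters. */
void playfair_build(const char *keyword) {
    int used[26] = {0};
    int n = 0;
    used['J' - 'A'] = 1;  /* J is folded into I (assumed convention) */
    for (int pass = 0; pass < 2; pass++) {
        const char *src = (pass == 0) ? keyword : "ABCDEFGHIKLMNOPQRSTUVWXYZ";
        for (size_t i = 0; i < strlen(src); i++) {
            int c = toupper((unsigned char)src[i]);
            if (c == 'J') c = 'I';
            if (c < 'A' || c > 'Z' || used[c - 'A']) continue;
            used[c - 'A'] = 1;
            grid[n / 5][n % 5] = (char)c;
            n++;
        }
    }
    assert(n == 25);
}

/* Steps 2-3: split into pairs, inserting 'X' into a repeated pair and
   padding an odd tail. `out` must hold 2 * strlen(in) + 1 bytes. */
void playfair_pairs(const char *in, char *out) {
    size_t n = 0;
    for (size_t i = 0; in[i] != '\0'; ) {
        char a = in[i++];
        char b = (in[i] != '\0' && in[i] != a) ? in[i++] : 'X';
        out[n++] = a;
        out[n++] = b;
    }
    out[n] = '\0';
}

static void locate(char c, int *row, int *col) {
    *row = 0;
    *col = 0;
    if (c == 'J') c = 'I';
    for (int i = 0; i < 25; i++) {
        if (grid[i / 5][i % 5] == c) {
            *row = i / 5;
            *col = i % 5;
            return;
        }
    }
    assert(0 && "letter not in grid");
}

/* Steps 4-6: encrypt one pair of distinct uppercase letters in place. */
void playfair_encrypt_pair(char pair[2]) {
    int r0, c0, r1, c1;
    locate(pair[0], &r0, &c0);
    locate(pair[1], &r1, &c1);
    if (r0 == r1) {            /* same row: letter to the right, wrapping */
        pair[0] = grid[r0][(c0 + 1) % 5];
        pair[1] = grid[r1][(c1 + 1) % 5];
    } else if (c0 == c1) {     /* same column: letter below, wrapping */
        pair[0] = grid[(r0 + 1) % 5][c0];
        pair[1] = grid[(r1 + 1) % 5][c1];
    } else {                   /* rectangle: own row, other letter's column */
        pair[0] = grid[r0][c1];
        pair[1] = grid[r1][c0];
    }
}
```
- `playfair_pairs("BALLOON", out)` yields `BALXLOON`, matching step 3. With the assumed keyword `MONARCHY`, the pairs `AR`, `MU`, and `HS` encrypt to `RM`, `CM`, and `BP`, covering the row, column, and rectangle rules respectively.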
- ### Security of the Playfair Cipher
- The security is much improved over simple monoalphabetic ciphers, as the Playfair Cipher operates on letter pairs, of which there are $26^2 = 676$ possible combinations.
- This requires a 676 entry frequency table to analyse (as compared to a 26 entry frequency table for a monoalphabetic cipher) and correspondingly, more ciphertext.
- However, the Playfair Cipher *can* be cracked through frequency analysis of letter pairs, given a few hundred letters.
-
- ## Vigenère Cipher
- [Blaise de Vigenère](https://en.wikipedia.org/wiki/Blaise_de_Vigen%C3%A8re) is generally credited as the inventor of the **Polyalphabetic Substitution Cipher**.
- What is a **Polyalphabetic Substitution Cipher**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T10:31:42.282Z
card-last-reviewed:: 2022-10-07T10:31:42.283Z
card-last-score:: 5
- A **Polyalphabetic Substitution Cipher** uses multiple substitution alphabets, as opposed to a monoalphabetic cipher which uses a single alphabetic key.
- The Vigenère Cipher improves security by using many monoalphabetic substitution alphabets, so each letter can be replaced by many others.
- You use a **key** to select which alphabet is used for each letter of the message.
- The $i^{th}$ letter of the key specifies the $i^{th}$ alphabet to use.
- Use each alphabet in turn.
- Repeat from the start after the end of the key is reached.
-
- ### Vigenère Steps
- ![image.png](../assets/image_1663494147352_0.png)
- 1. Write the plaintext out, and write the keyword underneath it, repeated, for the length of the plaintext.
- 2. Use each key letter in turn as a Caesar cipher key.
- 3. Encrypt the corresponding plaintext letter.
- In this example, we use the keyword "CIPHER". Hence, we have the following translation alphabets:
- ![image.png](../assets/image_1663494236099_0.png)
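- The steps above can be sketched in C as follows. The function name `vigenere` and its `decrypt` flag are illustrative assumptions; it also assumes the key consists only of letters and that the key advances only when a letter is processed:
```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Encrypt (decrypt == 0) or decrypt (decrypt != 0) `text` in place.
   Each key letter is used as a Caesar key letter: 'A' = shift 0, 'B' = 1...
   The key repeats from the start once its end is reached. */
void vigenere(char *text, const char *key, int decrypt) {
    size_t key_len = strlen(key);
    size_t k = 0;  /* index of the next key letter to use */
    assert(key_len > 0);
    for (size_t i = 0; text[i] != '\0'; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (!isalpha(ch)) continue;
        int shift = toupper((unsigned char)key[k % key_len]) - 'A';
        if (decrypt) shift = 26 - shift;
        char base = isupper(ch) ? 'A' : 'a';
        text[i] = (char)(base + (ch - base + shift) % 26);
        k++;
    }
}
```
- As a check that does not rely on the slide's example, encrypting `ATTACKATDAWN` with the key `LEMON` gives `LXFOPVEFRNHR`, and decrypting with the same key restores the plaintext.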
- ### How to crack the Vigenère Cipher
- 1. Search the ciphertext for repeated strings of letters - the longer the string, the better.
- 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string, and add one.
- 3. Factorise that number.
- 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor, $n$, is most likely the length of the keyword used to encipher the ciphertext.
- 5. Do a frequency count on the ciphertext, on every $n^{th}$ letter. You should end up with $n$ different frequency counts.
- 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.
- 7. Undo the shifts and read the message.
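- Steps 1 to 4 above can be sketched as a tally of candidate key lengths. The name `kasiski_votes`, the `MAX_KEY_LEN` limit of 20, and the fixed sequence length passed by the caller are all assumptions made for this illustration:
```c
#include <assert.h>
#include <string.h>

#define MAX_KEY_LEN 20

/* For every pair of positions where the same `seq_len`-letter string
   occurs, add a vote for each key length 2..MAX_KEY_LEN that divides the
   distance between the two starting positions. */
void kasiski_votes(const char *ct, size_t seq_len, int votes[MAX_KEY_LEN + 1]) {
    size_t len = strlen(ct);
    assert(seq_len > 0);
    memset(votes, 0, (MAX_KEY_LEN + 1) * sizeof votes[0]);
    for (size_t i = 0; i + seq_len <= len; i++) {
        for (size_t j = i + 1; j + seq_len <= len; j++) {
            if (strncmp(ct + i, ct + j, seq_len) != 0) continue;
            size_t distance = j - i;  /* letters in between, plus one */
            for (size_t n = 2; n <= MAX_KEY_LEN; n++) {
                if (distance % n == 0) votes[n]++;
            }
        }
    }
}
```
- The key length with the most votes is the best guess for $n$. For the toy input `ABCDEFABC` with `seq_len` 3, the only repeat is `ABC` at distance 6, so lengths 2, 3, and 6 each receive one vote.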
- ## Enigma (Rotor Ciphers)
- ### Rotor Ciphers
- The mechanisation / automation of encryption.
- An $\text{N}$-stage polyalphabetic algorithm modulo 26.
- $26^N$ steps before a repetition, where $N$ is the number of cylinders.
- The Enigma machine had 5 cylinders, so:
- $$26^{N=5}=11,881,376 \text{ steps}$$
-
- ### Breaking Enigma using **Cribs**
- The starting point for breaking Enigma was based on the following:
- Plaintext messages were likely to contain certain phrases.
- Weather reports contained the term "WETTER VORHERSAGE".
- Military units often sent messages containing "KEINE BESONDEREN EREIGNISSE" ("nothing to report").
- A plaintext letter was never mapped onto the same ciphertext letter.
- While the cryptanalysts in Bletchley Park did not know exactly where these cribs were placed in an intercepted message, they could exclude certain positions.
- ![image.png](../assets/image_1663500888551_0.png)
- From here, possible rotor start positions & rotor wiring would be systematically examined using the "bombe" - an electromechanical device designed by Turing that replicated the action of several Enigma machines wired together.
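- The position-exclusion idea above follows directly from the fact that no letter encrypts to itself: a crib cannot sit at an offset where any of its letters lines up with an identical ciphertext letter. A minimal sketch, with the hypothetical helper name `crib_fits`:
```c
#include <assert.h>
#include <string.h>

/* Return 1 if `crib` could start at `offset` in `ciphertext`, i.e. no crib
   letter coincides with the ciphertext letter at the same position.
   Return 0 if there is a clash or the crib would run past the end. */
int crib_fits(const char *ciphertext, const char *crib, size_t offset) {
    size_t ct_len = strlen(ciphertext);
    size_t crib_len = strlen(crib);
    assert(crib_len > 0);
    if (offset + crib_len > ct_len) return 0;
    for (size_t i = 0; i < crib_len; i++) {
        if (ciphertext[offset + i] == crib[i]) return 0;
    }
    return 1;
}
```
- For the toy ciphertext `XYZABCD` and crib `ABC`, offsets 0 and 2 remain possible, while offset 3 is excluded because `A` would have encrypted to `A`.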
-
- ## Transposition Ciphers
- What are **Transposition Ciphers**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:39:25.895Z
card-last-reviewed:: 2022-10-07T10:39:25.895Z
card-last-score:: 3
- **Transposition** or **Permutation Ciphers** hide the message by rearranging the letter order ^^without altering the actual letters used.^^
- This can be recognised since the ciphertext has the same frequency distribution as the original text.
- ### Rail Fence Cipher
- Write plaintext letters out diagonally over a number of rows, then read off the cipher row by row.
- ![image.png](../assets/image_1663501467907_0.png)
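- A minimal sketch of the rail fence encryption described above. The function name `rail_fence_encrypt` and the caller-supplied output buffer are assumptions made for illustration:
```c
#include <assert.h>
#include <string.h>

/* Write `in` in a zigzag over `rails` rows, then read it off row by row.
   `out` must hold strlen(in) + 1 bytes. */
void rail_fence_encrypt(const char *in, size_t rails, char *out) {
    size_t len = strlen(in);
    size_t n = 0;
    assert(rails >= 1);
    if (rails == 1) {              /* a single row leaves the text as it is */
        memcpy(out, in, len + 1);
        return;
    }
    size_t cycle = 2 * (rails - 1);  /* letters in one down-and-up zigzag */
    for (size_t r = 0; r < rails; r++) {
        for (size_t i = 0; i < len; i++) {
            size_t pos = i % cycle;
            size_t row = (pos < rails) ? pos : cycle - pos;
            if (row == r) out[n++] = in[i];
        }
    }
    out[n] = '\0';
}
```
- With 3 rows, `WEAREDISCOVEREDFLEEATONCE` becomes `WECRLTEERDSOEEFEAOCAIVDEN`. With 2 rows, `meetmeafterthetogaparty` becomes `mematrhtgpryetefeteoaat`.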
- ### Row Transposition Cipher
- What are **Row Transposition Ciphers**? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-01T22:18:29.590Z
card-last-reviewed:: 2022-10-06T17:18:29.591Z
card-last-score:: 5
- **Row Transposition Ciphers** are a more complex kind of transposition cipher than Rail Fence Ciphers.
- Plaintext letters are written out in rows over a specified number of columns.
- The columns are then re-ordered according to some key before reading off the columns.
- ![image.png](../assets/image_1663501773385_0.png)
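- A sketch of the row transposition described above. It assumes one common key convention, which may differ from the slide: `key[c]` is the 1-based rank of column `c`, and columns are read off in rank order. The function name and the requirement that the text is already padded to a whole number of rows are also assumptions:
```c
#include <assert.h>
#include <string.h>

/* Write `in` row by row over `cols` columns, then read the columns in the
   order given by `key` (the column with rank 1 first).
   `out` must hold strlen(in) + 1 bytes. */
void row_transposition_encrypt(const char *in, const int *key, size_t cols,
                               char *out) {
    assert(cols > 0);
    size_t len = strlen(in);
    assert(len % cols == 0);  /* pad the plaintext beforehand */
    size_t rows = len / cols;
    size_t n = 0;
    for (size_t rank = 1; rank <= cols; rank++) {
        for (size_t c = 0; c < cols; c++) {
            if ((size_t)key[c] != rank) continue;
            for (size_t r = 0; r < rows; r++) {
                out[n++] = in[r * cols + c];
            }
        }
    }
    out[n] = '\0';
}
```
- With the key `{4, 3, 1, 2, 5, 6, 7}`, the padded plaintext `attackpostponeduntiltwoamxyz` becomes `ttnaaptmtsuoaodwcoixknlypetz`.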
-
- ## Product Ciphers
- Ciphers using just substitutions or transpositions are not secure because of language characteristics.
- Consider using several ciphers in succession to make it harder to crack:
- Two substitutions make a more complex substitution.
- Two transpositions make a more complex transposition.
- However, a substitution followed by a transposition makes a much harder cipher.
-
-
- # Steganography
- What is **Steganography**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T12:50:18.491Z
card-last-reviewed:: 2022-10-09T08:50:18.492Z
card-last-score:: 5
- **Steganography** is an alternative to encryption that hides the existence of the message.
- For example:
- Using only a subset of letters / words in a message marked in some way.
- Using invisible ink.
- Hiding data in the least significant bits (LSBs) of a graphic image or sound file.
- The drawback of steganography is that it's not very economical in terms of overheads to hide a message.
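- The LSB technique and its overhead can be illustrated with a small sketch. The names `lsb_embed` and `lsb_extract`, the use of a raw byte buffer as the cover, and the NUL terminator as the end-of-message marker are all assumptions; real image or audio formats need format-aware handling:
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hide `msg` and its terminating NUL in the least significant bit of
   successive cover bytes, most significant message bit first.
   Returns 0 on success, or -1 if the cover is too small. */
int lsb_embed(unsigned char *cover, size_t cover_len, const char *msg) {
    size_t msg_len = strlen(msg) + 1;
    assert(cover != NULL);
    if (cover_len / 8 < msg_len) return -1;  /* 8 cover bytes per msg byte */
    for (size_t i = 0; i < msg_len; i++) {
        for (int bit = 0; bit < 8; bit++) {
            unsigned char b = ((unsigned char)msg[i] >> (7 - bit)) & 1u;
            size_t k = i * 8 + (size_t)bit;
            cover[k] = (unsigned char)((cover[k] & 0xFEu) | b);
        }
    }
    return 0;
}

/* Rebuild the message from the LSBs. Returns 0 once the NUL terminator is
   found, or -1 if none is found within the cover or the output buffer. */
int lsb_extract(const unsigned char *cover, size_t cover_len,
                char *out, size_t out_size) {
    for (size_t i = 0; i < cover_len / 8 && i < out_size; i++) {
        unsigned char ch = 0;
        for (int bit = 0; bit < 8; bit++) {
            ch = (unsigned char)((ch << 1) | (cover[i * 8 + (size_t)bit] & 1u));
        }
        out[i] = (char)ch;
        if (ch == 0) return 0;
    }
    return -1;
}
```
- Each message byte consumes eight cover bytes in this scheme, which shows the overhead mentioned above: a 32-byte cover can carry `Hi` (3 bytes including the terminator) but not `Hello` (6 bytes).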
-

View File

@ -0,0 +1,316 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[GDPR]]
- **Next Topic:** [[Human Security & Passwords]]
- **Relevant Slides:** ![ct255_02.pdf](../assets/ct255_02_1663458790357_0.pdf)
id:: 63265db7-1d41-44f7-b4cf-0bab377a7c1c
-
- ## SQL Injections
- What is an **SQL Injection**? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-19T12:58:54.850Z
card-last-reviewed:: 2022-10-08T22:58:54.851Z
card-last-score:: 5
-
- An **SQL Injection** is a ***code injection technique*** used to attack data-driven applications, in which malicious SQL statements are inserted for execution.
- It is a way of exploiting user input & SQL statements to compromise the database & retrieve sensitive data.
-
- ## Basic Terminology
- What is **Cryptography**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-11-10T01:19:02.918Z
card-last-reviewed:: 2022-10-07T10:19:02.918Z
card-last-score:: 5
- **Cryptography** is the art or science encompassing the principles & methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back into its original form.
- What is **Plaintext**?
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-17T21:24:55.999Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-06T17:24:55.999Z
- **Plaintext** is the ^^original, intelligible message.^^
- What is **Ciphertext**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-18T14:39:00.206Z
card-last-reviewed:: 2022-10-07T10:39:00.207Z
card-last-score:: 5
- **Ciphertext** is the encrypted message.
- What is a **Cipher**? #card
card-last-interval:: 4.14
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T01:43:58.055Z
card-last-reviewed:: 2022-10-08T22:43:58.056Z
card-last-score:: 5
- A **Cipher** is an algorithm for transforming an intelligible message into one that is unintelligible.
- What is a **Key**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:47:20.304Z
card-last-reviewed:: 2022-10-07T10:47:20.304Z
card-last-score:: 3
- A **Key** is some critical information used by the cipher, known only to the sender & receiver, selected from a **keyspace** (the set of all possible keys).
- What does **Encipher** mean?
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-17T21:14:43.789Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-06T17:14:43.790Z
- **Enciphering** is the process of converting plaintext into ciphertext using a cipher & a key.
- What does **Decipher** mean?
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:09:14.829Z
card-last-reviewed:: 2022-09-30T12:09:14.829Z
card-last-score:: 5
- **Deciphering** is the process of converting ciphertext back into plaintext using a cipher & a key.
- What is **Encryption**? #card
card-last-score:: 3
card-repeats:: 3
card-next-schedule:: 2022-10-16T16:40:50.413Z
card-last-interval:: 9.28
card-ease-factor:: 2.32
card-last-reviewed:: 2022-10-07T10:40:50.414Z
- **Encryption** is some mathematical function $E_K()$ mapping plaintext $P$ to ciphertext $C$ using the specified key $K$.
- $$E_K(P) = C$$
- What is **Decryption**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-14T11:37:04.045Z
card-last-reviewed:: 2022-10-10T11:37:04.046Z
card-last-score:: 5
- **Decryption** is some mathematical function ${E_K}^{-1}()$ mapping the ciphertext $C$ to plaintext $P$ using the specified key $K$.
- $$P={E_K}^{-1}(C)$$
- What is **Cryptanalysis**? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-16T15:43:11.780Z
card-last-reviewed:: 2022-10-19T08:43:11.781Z
card-last-score:: 3
- **Cryptanalysis** is the study of principles & methods of transforming an unintelligible message into an intelligible message without knowledge of the key.
- What is **Cryptology**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-16T16:40:03.776Z
card-last-reviewed:: 2022-10-07T10:40:03.777Z
card-last-score:: 5
- **Cryptology** is the field encompassing both cryptography & cryptanalysis.
-
-
- ## Model of Conventional Cryptosystem
- ![image.png](../assets/image_1663459919021_0.png){:height 304, :width 610}
- ## Cryptanalysis via Letter Frequency Distribution
- Human languages are **redundant** - letters are not equally commonly used.
- In the **English** language:
- **E** is by far the most common letter followed by T, R, N, I, O, A, and S.
- Other letters like Z, J, K, Q, and X are fairly rare.
- Certain letter combinations like **TH** are quite common.
- ![image.png](../assets/image_1663488626792_0.png)
-
- ### C Program for Frequency Analysis of single Characters
- ```c
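#include <stdio.h>
#include <string.h>
#include <ctype.h>

/* Count how often each letter A-Z occurs in the file named by argv[1]. */
int main(int argc, char* argv[]) {
    FILE* fp;
    int data[26];
    int c;  /* int, not char, so that EOF can be told apart from data */

    memset(data, 0, sizeof(data));
    if (argc != 2) {
        return(-1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        return(-2);
    }
    /* Stop when fgetc() itself reports EOF rather than testing feof() first */
    while ((c = fgetc(fp)) != EOF) {
        c = toupper(c);
        if ((c >= 'A') && (c <= 'Z')) {
            data[c - 'A']++;
        }
    }
    for (int i = 0; i < 26; i++) {
        printf("%c:%i\n", 'A' + i, data[i]);
    }
    fclose(fp);
    return(0);
}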
```
- ## Known Plaintext Attacks (KPA)
- What is a **Known Plaintext Attack (KPA)**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T14:39:41.014Z
card-last-reviewed:: 2022-10-06T09:39:41.014Z
card-last-score:: 5
- The **Known Plaintext Attack (KPA)** is an attack model for cryptanalysis where the attacker has access to both:
- some of, or all of, the plaintext (called a **crib**)
- the ciphertext
-
-
- ## Caesar Cipher
- What is a **Caesar Cipher**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-17T21:27:16.902Z
card-last-reviewed:: 2022-10-06T17:27:16.903Z
card-last-score:: 5
- A **Caesar Cipher** involves using an offset alphabet to encrypt a message.
- We can use any shift from 1 to 25 to replace each plaintext letter with a letter a fixed distance away.
- The **key letter** represents the start of this offset alphabet.
- For example, a key letter of F means that A -> F, B -> G, and so on.
- ## Playfair Cipher
- Not even the large number of keys in a monoalphabetic cipher provides security.
- What is a **monoalphabetic cipher**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T22:27:13.914Z
card-last-reviewed:: 2022-10-06T17:27:13.915Z
card-last-score:: 3
- A **monoalphabetic cipher** is any cipher in which the letters of the plaintext are mapped to ciphertext letters based on a single alphabetic key.
- One approach to improving security over monoalphabetic ciphers is to encrypt ^^multiple letters^^ at a time.
- The **Playfair Cipher** is one example of such an approach.
- The algorithm was invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair.
- ### How does the Playfair Cipher work?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-10-08T00:33:19.557Z
card-last-interval:: 3.51
card-ease-factor:: 2.6
card-last-reviewed:: 2022-10-04T12:33:19.558Z
- ![image.png](../assets/image_1663491286810_0.png)
- 1. Create a 5x5 grid of letters; insert the keyword as shown, with each letter only considered once; fill the grid with the remaining letters in alphabetic order.
- 2. The letters are then encrypted in pairs.
- 3. Repeats have an "X" inserted.
- BALLOON -> BA LX LO ON
- 4. Letters that fall in the same row are replaced with the letter on the right.
- OK -> GM
- 5. Letters in the same column are replaced with the letter below.
- FO -> OU
- 6. Otherwise, each letter gets replaced by the letter in its own row but in the other letter's column.
- QM -> TH
- ### Security of the Playfair Cipher
- The security is much improved over simple monoalphabetic ciphers, as the Playfair Cipher operates on letter pairs, of which there are $26^2 = 676$ possible combinations.
- This requires a 676 entry frequency table to analyse (as compared to a 26 entry frequency table for a monoalphabetic cipher) and correspondingly, more ciphertext.
- However, the Playfair Cipher *can* be cracked through frequency analysis of letter pairs, given a few hundred letters.
-
- ## Vigenère Cipher
- [Blaise de Vigenère](https://en.wikipedia.org/wiki/Blaise_de_Vigen%C3%A8re) is generally credited as the inventor of the **Polyalphabetic Substitution Cipher**.
- What is a **Polyalphabetic Substitution Cipher**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T10:31:42.282Z
card-last-reviewed:: 2022-10-07T10:31:42.283Z
card-last-score:: 5
- A **Polyalphabetic Substitution Cipher** uses multiple substitution alphabets, as opposed to a monoalphabetic cipher which uses a single alphabetic key.
- The Vigenère Cipher improves security by using many monoalphabetic substitution alphabets, so each letter can be replaced by many others.
- You use a **key** to select which alphabet is used for each letter of the message.
- The $i^{th}$ letter of the key specifies the $i^{th}$ alphabet to use.
- Use each alphabet in turn.
- Repeat from the start after the end of the key is reached.
-
- ### Vigenère Steps
- ![image.png](../assets/image_1663494147352_0.png)
- 1. Write the plaintext out, and write the keyword underneath it, repeated, for the length of the plaintext.
- 2. Use each key letter in turn as a Caesar cipher key.
- 3. Encrypt the corresponding plaintext letter.
- In this example, we use the keyword "CIPHER". Hence, we have the following translation alphabets:
- ![image.png](../assets/image_1663494236099_0.png)
- ### How to crack the Vigenère Cipher
- 1. Search the ciphertext for repeated strings of letters - the longer the string, the better.
- 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string, and add one.
- 3. Factorise that number.
- 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor, $n$, is most likely the length of the keyword used to encipher the ciphertext.
- 5. Do a frequency count on the ciphertext, on every $n^{th}$ letter. You should end up with $n$ different frequency counts.
- 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.
- 7. Undo the shifts and read the message.
- ## Enigma (Rotor Ciphers)
- ### Rotor Ciphers
- The mechanisation / automation of encryption.
- An $\text{N}$-stage polyalphabetic algorithm modulo 26.
- $26^N$ steps before a repetition, where $N$ is the number of cylinders.
- The Enigma machine had 5 cylinders, so:
- $$26^{N=5}=11,881,376 \text{ steps}$$
-
- ### Breaking Enigma using **Cribs**
- The starting point for breaking Enigma was based on the following:
- Plaintext messages were likely to contain certain phrases.
- Weather reports contained the term "WETTER VORHERSAGE".
- Military units often sent messages containing "KEINE BESONDEREN EREIGNISSE" ("nothing to report").
- A plaintext letter was never mapped onto the same ciphertext letter.
- While the cryptanalysts in Bletchley Park did not know exactly where these cribs were placed in an intercepted message, they could exclude certain positions.
- ![image.png](../assets/image_1663500888551_0.png)
- From here, possible rotor start positions & rotor wiring would be systematically examined using the "bombe" - an electromechanical device designed by Turing that replicated the action of several Enigma machines wired together.
-
- ## Transposition Ciphers
- What are **Transposition Ciphers**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:39:25.895Z
card-last-reviewed:: 2022-10-07T10:39:25.895Z
card-last-score:: 3
- **Transposition** or **Permutation Ciphers** hide the message by rearranging the letter order ^^without altering the actual letters used.^^
- This can be recognised since the ciphertext has the same frequency distribution as the original text.
- ### Rail Fence Cipher
- Write plaintext letters out diagonally over a number of rows, then read off the cipher row by row.
- ![image.png](../assets/image_1663501467907_0.png)
- ### Row Transposition Cipher
- What are **Row Transposition Ciphers**? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-01T22:18:29.590Z
card-last-reviewed:: 2022-10-06T17:18:29.591Z
card-last-score:: 5
- **Row Transposition Ciphers** are a more complex kind of transposition cipher than Rail Fence Ciphers.
- Plaintext letters are written out in rows over a specified number of columns.
- The columns are then re-ordered according to some key before reading off the columns.
- ![image.png](../assets/image_1663501773385_0.png)
-
- ## Product Ciphers
- Ciphers using just substitutions or transpositions are not secure because of language characteristics.
- Consider using several ciphers in succession to make it harder to crack:
- Two substitutions make a more complex substitution.
- Two transpositions make a more complex transposition.
- However, a substitution followed by a transposition makes a much harder cipher.
-
-
- # Steganography
- What is **Steganography**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T12:50:18.491Z
card-last-reviewed:: 2022-10-09T08:50:18.492Z
card-last-score:: 5
- **Steganography** is an alternative to encryption that hides the existence of the message.
- For example:
- Using only a subset of letters / words in a message marked in some way.
- Using invisible ink.
- Hiding data in the least significant bits (LSBs) of a graphic image or sound file.
- The drawback of steganography is that it's not very economical in terms of overheads to hide a message.
-

View File

@ -0,0 +1,317 @@
- #[[CT255 - Next Generation Technologies II]]
- **Previous Topic:** [[GDPR]]
- **Next Topic:** [[Human Security & Passwords]]
- **Relevant Slides:** ![ct255_02.pdf](../assets/ct255_02_1663458790357_0.pdf)
id:: 63265db7-1d41-44f7-b4cf-0bab377a7c1c
-
- ## SQL Injections
- What is an **SQL Injection**? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-19T12:58:54.850Z
card-last-reviewed:: 2022-10-08T22:58:54.851Z
card-last-score:: 5
-
- An **SQL Injection** is a ***code injection technique*** used to attack data-driven applications, in which malicious SQL statements are inserted for execution.
- It is a way of exploiting user input & SQL statements to compromise the database & retrieve sensitive data.
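- A minimal sketch of why pasting user input into a query is dangerous. The table `users`, the column `name`, and the class `SqlInjectionSketch` are hypothetical; no database is contacted, the code only prints the query text that would be sent.
```java
public class SqlInjectionSketch {
    // Builds the query by pasting user input straight into the SQL text (vulnerable).
    public static String buildUnsafeQuery(String userInput) {
        return "SELECT * FROM users WHERE name = '" + userInput + "'";
    }

    public static void main(String[] args) {
        // Ordinary input produces the intended query.
        System.out.println(buildUnsafeQuery("alice"));
        // Crafted input closes the string literal and adds a condition that is always true.
        System.out.println(buildUnsafeQuery("' OR '1'='1"));
    }
}
```
- The usual defence is a parameterised query (for example JDBC's `PreparedStatement` with `?` placeholders), so that the input is treated as data rather than as SQL.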
-
- ## Basic Terminology
- What is **Cryptography**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-11-10T01:19:02.918Z
card-last-reviewed:: 2022-10-07T10:19:02.918Z
card-last-score:: 5
- **Cryptography** is the art encompassing the principles & methods of transforming an intelligible message into one that is unintelligible, and then retransforming that message back into its original form.
- What is **Plaintext**?
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-17T21:24:55.999Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-06T17:24:55.999Z
- **Plaintext** is the ^^original, intelligible message.^^
- What is **Ciphertext**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-18T14:39:00.206Z
card-last-reviewed:: 2022-10-07T10:39:00.207Z
card-last-score:: 5
- **Ciphertext** is the encrypted message.
- What is a **Cipher**? #card
card-last-interval:: 4.14
card-repeats:: 2
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T01:43:58.055Z
card-last-reviewed:: 2022-10-08T22:43:58.056Z
card-last-score:: 5
- A **Cipher** is an algorithm for transforming an intelligible message into one that is unintelligible.
- What is a **Key**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-11T10:47:20.304Z
card-last-reviewed:: 2022-10-07T10:47:20.304Z
card-last-score:: 3
- A **Key** is some critical information used by the cipher, known only to the sender & receiver, selected from a **keyspace** (the set of all possible keys).
- What does **Encipher** mean?
card-last-score:: 5
card-repeats:: 3
card-next-schedule:: 2022-10-17T21:14:43.789Z
card-last-interval:: 11.2
card-ease-factor:: 2.8
card-last-reviewed:: 2022-10-06T17:14:43.790Z
- **Enciphering** is the process of converting plaintext into ciphertext using a cipher & a key.
- What does **Decipher** mean?
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T12:09:14.829Z
card-last-reviewed:: 2022-09-30T12:09:14.829Z
card-last-score:: 5
- **Deciphering** is the process of converting ciphertext back into plaintext using a cipher & a key.
- What is **Encryption**? #card
card-last-score:: 3
card-repeats:: 3
card-next-schedule:: 2022-10-16T16:40:50.413Z
card-last-interval:: 9.28
card-ease-factor:: 2.32
card-last-reviewed:: 2022-10-07T10:40:50.414Z
- **Encryption** is some mathematical function $E_K()$ mapping plaintext $P$ to ciphertext $C$ using the specified key $K$.
- $$E_K(P) = C$$
- What is **Decryption**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-14T11:37:04.045Z
card-last-reviewed:: 2022-10-10T11:37:04.046Z
card-last-score:: 5
- **Decryption** is some mathematical function ${E_K}^{-1}()$ mapping the ciphertext $C$ to plaintext $P$ using the specified key $K$.
- $$P={E_K}^{-1}(C)$$
- What is **Cryptanalysis**? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-11-16T15:43:11.780Z
card-last-reviewed:: 2022-10-19T08:43:11.781Z
card-last-score:: 3
- **Cryptanalysis** is the study of principles & methods of transforming an unintelligible message into an intelligible message without knowledge of the key.
- What is **Cryptology**?
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-16T16:40:03.776Z
card-last-reviewed:: 2022-10-07T10:40:03.777Z
card-last-score:: 5
- **Cryptology** is the field encompassing both cryptography & cryptanalysis.
-
-
- ## Model of Conventional Cryptosystem
- ![image.png](../assets/image_1663459919021_0.png){:height 304, :width 610}
- ## Cryptanalysis via Letter Frequency Distribution
- Human languages are **redundant** - letters are not equally commonly used.
- In the **English** language:
- **E** is by far the most common letter, followed by T, A, O, I, N, S, H, and R (the exact ranking varies slightly between text samples).
- Other letters like Z, J, K, Q, and X are fairly rare.
- Certain letter combinations like **TH** are quite common.
- ![image.png](../assets/image_1663488626792_0.png)
-
- ### C Program for Frequency Analysis of single Characters
- ```c
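#include <stdio.h>
#include <string.h>
#include <ctype.h>

int main(int argc, char* argv[]) {
    FILE* fp;
    int data[26];
    int c; /* int rather than char, so that EOF can be told apart from real characters */

    memset(data, 0, sizeof(data));
    if (argc != 2) {
        return -1; /* expects exactly one argument: the file to analyse */
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        return -2; /* file could not be opened */
    }
    /* Read until fgetc() reports EOF; testing feof() first would count EOF as a character. */
    while ((c = fgetc(fp)) != EOF) {
        c = toupper(c);
        if ((c >= 'A') && (c <= 'Z')) {
            data[c - 'A']++;
        }
    }
    for (int i = 0; i < 26; i++) {
        printf("%c:%i\n", i + 'A', data[i]);
    }
    fclose(fp);
    return 0;
}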
```
- ## Known Plaintext Attacks (KPA)
- What is a **Known Plaintext Attack (KPA)**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T14:39:41.014Z
card-last-reviewed:: 2022-10-06T09:39:41.014Z
card-last-score:: 5
- The **Known Plaintext Attack (KPA)** is an attack model for cryptanalysis where the attacker has access to both:
- some of, or all of, the plaintext (called a **crib**)
- the ciphertext
-
-
- ## Caesar Cipher
- What is a **Caesar Cipher**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-17T21:27:16.902Z
card-last-reviewed:: 2022-10-06T17:27:16.903Z
card-last-score:: 5
- A **Caesar Cipher** involves using an offset alphabet to encrypt a message.
- We can use any shift from 1 to 25 to replace each plaintext letter with a letter a fixed distance away.
- The **key letter** represents the start of this offset alphabet.
- For example, a key letter of F means that A -> F, B -> G, and so on.
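- A minimal sketch of the key-letter scheme described above, which also illustrates $E_K(P) = C$ and $P = {E_K}^{-1}(C)$. The class and method names are illustrative, and passing non-letters through unchanged is an assumption.
```java
public class CaesarSketch {
    // Shifts each letter A-Z by the offset implied by the key letter (A = 0, F = 5, ...).
    public static String encrypt(String plaintext, char keyLetter) {
        return shift(plaintext, keyLetter - 'A');
    }

    // Decryption applies the inverse shift.
    public static String decrypt(String ciphertext, char keyLetter) {
        return shift(ciphertext, 26 - (keyLetter - 'A'));
    }

    private static String shift(String text, int offset) {
        StringBuilder out = new StringBuilder();
        for (char ch : text.toUpperCase().toCharArray()) {
            if (ch >= 'A' && ch <= 'Z') {
                out.append((char) ('A' + (ch - 'A' + offset) % 26));
            } else {
                out.append(ch); // leave spaces and punctuation unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String c = encrypt("HELLO", 'F');
        System.out.println(c);               // MJQQT
        System.out.println(decrypt(c, 'F')); // HELLO
    }
}
```
- With only 25 useful shifts, the keyspace is small enough to try every key by hand.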
- ## Playfair Cipher
- Not even the large number of keys in a monoalphabetic cipher provides security.
- What is a **monoalphabetic cipher**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-16T22:27:13.914Z
card-last-reviewed:: 2022-10-06T17:27:13.915Z
card-last-score:: 3
- A **monoalphabetic cipher** is any cipher in which the letters of the plaintext are mapped to ciphertext letters based on a single alphabetic key.
- One approach to improving security over monoalphabetic ciphers is to encrypt ^^multiple letters.^^
- The **Playfair Cipher** is one example of such an approach.
- The algorithm was invented by Charles Wheatstone in 1854, but named after his friend Baron Playfair.
- ### How does the Playfair Cipher work?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-10-08T00:33:19.557Z
card-last-interval:: 3.51
card-ease-factor:: 2.6
card-last-reviewed:: 2022-10-04T12:33:19.558Z
- ![image.png](../assets/image_1663491286810_0.png)
- 1. Create a 5x5 grid of letters; insert the keyword as shown, with each letter only considered once; fill the grid with the remaining letters in alphabetic order.
- 2. The letters are then encrypted in pairs.
- 3. Repeats have an "X" inserted.
- BALLOON -> BA LX LO ON
- 4. Letters that fall in the same row are each replaced with the letter to their right (wrapping around to the start of the row).
- OK -> GM
- 5. Letters in the same column are each replaced with the letter below (wrapping around to the top of the column).
- FO -> OU
- 6. Otherwise, each letter gets replaced by the letter in its own row but in the other letter's column.
- QM -> TH
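- Steps 2 and 3 above (pairing the letters and breaking up repeats) can be sketched as follows. This covers only the pairing stage, not the grid lookup; the names are illustrative, and padding a leftover final letter with "X" is an assumption that the notes do not state.
```java
import java.util.ArrayList;
import java.util.List;

public class PlayfairPairsSketch {
    // Splits a message into letter pairs, inserting 'X' between repeated letters.
    public static List<String> toPairs(String message) {
        String text = message.toUpperCase().replaceAll("[^A-Z]", "");
        List<String> pairs = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char first = text.charAt(i);
            if (i + 1 < text.length() && text.charAt(i + 1) != first) {
                pairs.add("" + first + text.charAt(i + 1));
                i += 2; // consumed both letters
            } else {
                // Repeated letter, or a single letter left over at the end
                // (padding the final letter with 'X' is an assumption).
                pairs.add("" + first + 'X');
                i += 1; // the repeated letter starts the next pair
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(toPairs("BALLOON")); // [BA, LX, LO, ON]
    }
}
```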
- ### Security of the Playfair Cipher
- The security is much improved over simple monoalphabetic ciphers, as the Playfair Cipher works on letter pairs (digrams), of which there are $26^2 = 676$ combinations.
- This requires a 676-entry frequency table to analyse (as compared to a 26-entry frequency table for a monoalphabetic cipher) and, correspondingly, more ciphertext.
- However, the Playfair Cipher *can* be cracked through frequency analysis of letter pairs, given a few hundred letters.
-
- ## Vigenère Cipher
- [Blaise de Vigenère](https://en.wikipedia.org/wiki/Blaise_de_Vigen%C3%A8re) is generally credited as the inventor of the **Polyalphabetic Substitution Cipher**.
- What is a **Polyalphabetic Substitution Cipher**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T10:31:42.282Z
card-last-reviewed:: 2022-10-07T10:31:42.283Z
card-last-score:: 5
- A **Polyalphabetic Substitution Cipher** uses multiple substitution alphabets, as opposed to a monoalphabetic cipher which uses a single alphabetic key.
- The Vigenère Cipher improves security by using many monoalphabetic substitution alphabets, so each letter can be replaced by many others.
- You use a **key** to select which alphabet is used for each letter of the message.
- The $i^{th}$ letter of the key specifies the $i^{th}$ alphabet to use.
- Use each alphabet in turn.
- Repeat from the start after the end of the key is reached.
-
- ### Vigenère Steps
- ![image.png](../assets/image_1663494147352_0.png)
- 1. Write the plaintext out, and write the keyword underneath it, repeated, for the length of the plaintext.
- 2. Use each key letter in turn as a Caesar cipher key.
- 3. Encrypt the corresponding plaintext letter.
- In this example, we use the keyword "CIPHER". Hence, we have the following translation alphabets:
- ![image.png](../assets/image_1663494236099_0.png)
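- The steps above can be sketched in code as follows. The names are illustrative, the key is assumed to contain only letters, and the sample plaintext and keyword are a common textbook pair rather than the "CIPHER" example from the slides.
```java
public class VigenereSketch {
    // Encrypts A-Z text: the i-th key letter (repeating) is the Caesar shift for the i-th letter.
    public static String encrypt(String plaintext, String key) {
        return apply(plaintext, key, 1);
    }

    public static String decrypt(String ciphertext, String key) {
        return apply(ciphertext, key, -1);
    }

    private static String apply(String text, String key, int direction) {
        String k = key.toUpperCase();
        StringBuilder out = new StringBuilder();
        int keyIndex = 0;
        for (char ch : text.toUpperCase().toCharArray()) {
            if (ch < 'A' || ch > 'Z') {
                out.append(ch); // non-letters pass through and do not consume a key letter
                continue;
            }
            int shift = k.charAt(keyIndex % k.length()) - 'A';
            int pos = (ch - 'A' + direction * shift + 26) % 26;
            out.append((char) ('A' + pos));
            keyIndex++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String c = encrypt("ATTACKATDAWN", "LEMON");
        System.out.println(c);                   // LXFOPVEFRNHR
        System.out.println(decrypt(c, "LEMON")); // ATTACKATDAWN
    }
}
```
- Note how the two T's in "ATTACK" encrypt to different letters (X and F), which is what flattens the single-letter frequency distribution.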
- ### How to crack the Vigenère Cipher
- 1. Search the ciphertext for repeated strings of letters - the longer the string, the better.
- 2. For each occurrence of a repeated string, count how many letters are between the first letters in the string, and add one.
- 3. Factorise that number.
- 4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor, $n$, is most likely the length of the keyword used to encipher the ciphertext.
- 5. Do a frequency count on the ciphertext, on every $n^{th}$ letter. You should end up with $n$ different frequency counts.
- 6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.
- 7. Undo the shifts and read the message.
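- Steps 1 to 4 above can be sketched as follows, assuming a fixed repeat length (3 here) and a small upper bound on the keyword length. All names are illustrative and the sample string is made up purely to show the mechanics.
```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KasiskiSketch {
    // Returns the distances between successive occurrences of every repeated substring of the given length.
    public static List<Integer> repeatDistances(String ciphertext, int length) {
        Map<String, Integer> lastSeen = new HashMap<>();
        List<Integer> distances = new ArrayList<>();
        for (int i = 0; i + length <= ciphertext.length(); i++) {
            String chunk = ciphertext.substring(i, i + length);
            Integer previous = lastSeen.put(chunk, i); // put() returns the earlier position, if any
            if (previous != null) {
                distances.add(i - previous);
            }
        }
        return distances;
    }

    // Counts how often each candidate key length (2..maxLength) divides the distances.
    public static Map<Integer, Integer> factorCounts(List<Integer> distances, int maxLength) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int d : distances) {
            for (int n = 2; n <= maxLength; n++) {
                if (d % n == 0) {
                    counts.merge(n, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> d = repeatDistances("ABCXYZABCQRSABC", 3);
        System.out.println(d);
        System.out.println(factorCounts(d, 10));
    }
}
```
- The candidate with the highest count is the first keyword length to try in step 5; on real ciphertext, some repeats are coincidental, so the counts are a guide rather than a proof.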
- ## Enigma (Rotor Ciphers)
- ### Rotor Ciphers
- The mechanisation / automation of encryption.
- An $\text{N}$-stage polyalphabetic algorithm modulo 26.
- $26^N$ steps before a repetition, where $N$ is the number of cylinders.
- The Enigma machine had 5 cylinders, so:
- $$26^N = 26^5 = 11,881,376 \text{ steps}$$
-
- ### Breaking Enigma using **Cribs**
- The starting point for breaking Enigma was based on the following:
- Plaintext messages were likely to contain certain phrases.
- Weather reports contained the term "WETTER VORHERSAGE".
- Military units often sent messages containing "KEINE BESONDEREN EREIGNISSE" ("nothing to report").
- A plaintext letter was never mapped onto the same ciphertext letter.
- While the cryptanalysts in Bletchley Park did not know exactly where these cribs were placed in an intercepted message, they could exclude certain positions.
- ![image.png](../assets/image_1663500888551_0.png)
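- The exclusion rule can be sketched as follows: a crib cannot sit at an offset where any of its letters lines up with an identical ciphertext letter, since that letter would have been mapped onto itself. The names and the toy strings are illustrative, not taken from the slides.
```java
import java.util.ArrayList;
import java.util.List;

public class CribPlacementSketch {
    // Returns the offsets at which the crib could sit: no crib letter may equal
    // the ciphertext letter it is aligned with.
    public static List<Integer> possibleOffsets(String ciphertext, String crib) {
        List<Integer> offsets = new ArrayList<>();
        for (int start = 0; start + crib.length() <= ciphertext.length(); start++) {
            boolean clash = false;
            for (int i = 0; i < crib.length(); i++) {
                if (ciphertext.charAt(start + i) == crib.charAt(i)) {
                    clash = true; // a letter would have encrypted to itself
                    break;
                }
            }
            if (!clash) {
                offsets.add(start);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(possibleOffsets("ABCDE", "CD")); // [0, 1, 3]
    }
}
```
- The longer the crib, the more offsets are ruled out, which is why long stock phrases were so useful.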
- From here, possible rotor start positions & rotor wiring would be systematically examined using the "bombe" - an electromechanical device designed by Turing that replicated the action of several Enigma machines wired together.
-
- ## Transposition Ciphers
- What are **Transposition Ciphers**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-16T07:39:25.895Z
card-last-reviewed:: 2022-10-07T10:39:25.895Z
card-last-score:: 3
- **Transposition** or **Permutation Ciphers** hide the message by rearranging the letter order ^^without altering the actual letters used.^^
- This can be recognised since the ciphertext has the same frequency distribution as the original text.
- ### Rail Fence Cipher
id:: 6344093b-2f4f-4c58-95e4-39a8b30d16c3
- Write plaintext letters out diagonally over a number of rows, then read off the cipher row by row.
- ![image.png](../assets/image_1663501467907_0.png)
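- A minimal sketch of the zigzag write-out and row-by-row read-off. The names are illustrative, and the sample message with 3 rails is a common textbook example rather than the one in the image.
```java
public class RailFenceSketch {
    // Writes the text in a zigzag over `rails` rows, then reads the rows in order.
    public static String encrypt(String plaintext, int rails) {
        if (rails < 2) {
            return plaintext; // a single row leaves the text unchanged
        }
        StringBuilder[] rows = new StringBuilder[rails];
        for (int r = 0; r < rails; r++) {
            rows[r] = new StringBuilder();
        }
        int row = 0;
        int step = 1; // +1 while moving down, -1 while moving up
        for (char ch : plaintext.toCharArray()) {
            rows[row].append(ch);
            if (row == 0) {
                step = 1;
            } else if (row == rails - 1) {
                step = -1;
            }
            row += step;
        }
        StringBuilder out = new StringBuilder();
        for (StringBuilder r : rows) {
            out.append(r);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encrypt("WEAREDISCOVEREDFLEEATONCE", 3));
        // WECRLTEERDSOEEFEAOCAIVDEN
    }
}
```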
- ### Row Transposition Cipher
- What are **Row Transposition Ciphers**? #card
card-last-interval:: 26.21
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-01T22:18:29.590Z
card-last-reviewed:: 2022-10-06T17:18:29.591Z
card-last-score:: 5
- **Row Transposition Ciphers** are a more complex kind of transposition cipher than ((6344093b-2f4f-4c58-95e4-39a8b30d16c3))s.
- Plaintext letters are written out in rows over a specified number of columns.
- The columns are then re-ordered according to some key before reading off the columns.
- ![image.png](../assets/image_1663501773385_0.png)
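- A minimal sketch of the procedure, assuming the key is given as one read-off rank per column (so `key[c]` says when column `c` is read). The names, the key, and the sample message are illustrative and may differ from the example in the image.
```java
public class RowTranspositionSketch {
    // key[c] is the read-off rank (1-based) of column c; columns are read in rank order.
    public static String encrypt(String plaintext, int[] key) {
        int columns = key.length;
        StringBuilder out = new StringBuilder();
        for (int rank = 1; rank <= columns; rank++) {
            for (int c = 0; c < columns; c++) {
                if (key[c] != rank) {
                    continue;
                }
                // Letters of column c sit at positions c, c + columns, c + 2*columns, ...
                for (int i = c; i < plaintext.length(); i += columns) {
                    out.append(plaintext.charAt(i));
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        int[] key = {4, 3, 1, 2, 5, 6, 7};
        System.out.println(encrypt("ATTACKPOSTPONEDUNTILTWOAMXYZ", key));
        // TTNAAPTMTSUOAODWCOIXKNLYPETZ
    }
}
```
- The letters themselves are unchanged, so a single-letter frequency count of the output matches that of the input.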
-
- ## Product Ciphers
- Ciphers using just substitutions or transpositions are not secure because of language characteristics.
- Consider using several ciphers in succession to make it harder to crack:
- Two substitutions make a more complex substitution.
- Two transpositions make a more complex transposition.
- However, a substitution followed by a transposition makes a much harder cipher.
-
-
- # Steganography
- What is **Steganography**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T12:50:18.491Z
card-last-reviewed:: 2022-10-09T08:50:18.492Z
card-last-score:: 5
- **Steganography** is an alternative to encryption that hides the existence of the message.
- For example:
- Using only a subset of letters / words in a message marked in some way.
- Using invisible ink.
- Hiding data in the least significant bits (LSBs) of a graphic image or sound file.
- The drawback of steganography is that it's not very economical in terms of overheads to hide a message.
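- A minimal sketch of LSB hiding, assuming the carrier is simply an array of bytes (for example raw pixel values) and that one secret byte is spread over the lowest bits of 8 carrier bytes. The names and the bit order are illustrative choices.
```java
public class LsbStegoSketch {
    // Hides the 8 bits of `secret` in the lowest bit of 8 carrier bytes (e.g. pixel values).
    public static byte[] embed(byte[] carrier, byte secret) {
        byte[] out = carrier.clone();
        for (int i = 0; i < 8; i++) {
            int bit = (secret >> (7 - i)) & 1;       // most significant bit first
            out[i] = (byte) ((out[i] & 0xFE) | bit); // overwrite only the lowest bit
        }
        return out;
    }

    // Recovers the hidden byte by collecting the lowest bit of the first 8 bytes.
    public static byte extract(byte[] carrier) {
        int value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 1) | (carrier[i] & 1);
        }
        return (byte) value;
    }

    public static void main(String[] args) {
        byte[] pixels = {10, 20, 30, 40, 50, 60, 70, 80};
        byte[] stego = embed(pixels, (byte) 'A');
        System.out.println((char) extract(stego)); // A
    }
}
```
- Each carrier byte changes by at most 1, which is hard to notice, but 8 carrier bytes are spent per hidden byte, which illustrates the overhead mentioned above.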
-

View File

@ -0,0 +1,143 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[OOP Modelling]]
- **Next Topic:**
- **Relevant Slides:** ![Lecture-9__2022.pdf](../assets/Lecture-9_2022_1665043655336_0.pdf) ![Lecture-10__2022.pdf](../assets/Lecture-10_2022_1665044307581_0.pdf)
-
- # Object Equality #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:26:39.173Z
card-last-reviewed:: 2022-10-06T17:26:39.174Z
card-last-score:: 5
collapsed:: true
- When you use `==` with reference variables, you are checking if the variables **point** to the same object.
- So, using `==` on strings will only return true if the Strings are references to the same object. It will return false even if the strings contain the same data.
- The value of a string variable is the **memory location** where its String object is stored.
- When checking for equality between objects, you must use the `equals` method.
- The `equals` method is an instance method that ^^all objects of built-in classes have.^^
- However, for any class that you define, you will have to write your own equals method.
- All equals methods must have the following method signature:
- ```java
public boolean equals(Object object)
```
- Its specific purpose is to define equality between objects.
- It returns a **boolean** value.
- It is **commutative**.
- `str1.equals(str4)` returns the same value as `str4.equals(str1)`.
- Example:
- ```java
String str1 = "Java";
String str2 = "Ja";
String str3 = "va";
String str4 = str2 + str3;
System.out.println(str1.equals(str4) ? "true" : "false");
```
- # `instanceof` #card
card-last-interval:: 4.59
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T00:29:22.029Z
card-last-reviewed:: 2022-10-07T10:29:22.030Z
card-last-score:: 5
- `instanceof` is an operator that is used to determine if a variable is pointing to an object with a particular type.
- ```java
System.out.println(bike2 instanceof Bicycle ? "true" : "false");
```
- # Object
collapsed:: true
- What is the type of `Object obj`? #card
card-last-interval:: 4.59
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T00:29:39.466Z
card-last-reviewed:: 2022-10-07T10:29:39.466Z
card-last-score:: 5
- `obj` is a variable whose type is `java.lang.Object`.
- What is `java.lang.Object`? #card
card-last-interval:: 4.59
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T00:30:00.069Z
card-last-reviewed:: 2022-10-07T10:30:00.070Z
card-last-score:: 5
- `java.lang.Object` is a class that provides the ^^most generic definition^^ of an object in Java.
- It is the **parent class** of every class in Java.
- For example, a `Bicycle` object is a `Bicycle` object **and** a `java.lang.Object` object.
-
- # Casting
- ```java
Bicycle bike1 = (Bicycle) myObject;
String str1 = (String) obj;
```
- Here, we can **cast** (convert) a variable from a higher type (`Object`), to a lower type (`Bicycle`).
- This is allowed, as `myObject` points to a Bicycle object - we can check this using `instanceof`.
- `obj` points to a String object - we can check this using `instanceof`.
- Note that it is the variable's type that is being converted, ^^not the object itself.^^
- # Class Hierarchy
- ## Is-a Relationships
- Java organises all its classes in a class hierarchy.
- For example, a car is a type of vehicle, which is a type of object.
- These relationships can be described as "is-a" relationships.
- A car **is-a** vehicle; a vehicle **is-a**(n) object.
- We refer to the higher-up types as **parents** and the lower types as **children**.
- Car *is-a child* of Vehicle.
- Vehicle *is-a parent* of Car.
- Object *is the parent* of Vehicle & Car.
- ## Key Ideas in Class Hierarchy
- The top of the hierarchy represents the ^^most **generic** attributes & behaviours.^^
- The bottom (sometimes referred to as "leaves") represent the ^^most **specific** attributes & behaviours.^^
- Each level inherits and customises the attributes & behaviours from the level above it.
- `java.lang.Object` is *the* **superclass**, the parent of all classes in Java.
- Every class in Java has the `java.lang.Object` as its superclass (parent).
- ![image.png](../assets/image_1665133543483_0.png)
- All the classes shown above **inherit** (receive) methods from the superclass `java.lang.Object`.
- What is **OOP Inheritance**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T22:40:20.126Z
card-last-score:: 1
- **Inheritance** is the means by which objects automatically receive features (fields) & behaviours (methods) from their **superclass**.
- The methods of this superclass are available to all objects of this Class, even though these methods may not be shown in the Class code.
- For example: `.equals()`.
- ### Generic Methods
- All the methods provided by the `java.lang.Object` are *generic*.
- They only relate to `java.lang.Object` classes, not the subclasses.
- When a subclass inherits these methods, it needs to customise them.
- This is why we had to override `.equals()` with our own version for the example Bicycle class.
- ### Overriding
- What is **overriding**? #card
card-last-interval:: 0.81
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-09T17:46:07.249Z
card-last-reviewed:: 2022-10-08T22:46:07.249Z
card-last-score:: 3
- **Overriding** is when you write your own version of a method that you have inherited from a superclass.
- It is creating a specific version of a method inherited from a parent (superclass) class.
- When overriding a method, you must keep every part of the method signature the same - You can only change the code in the method body.
- Its name, its parameter types & order, its access level (e.g., public, protected), and its return type.
- #### Annotation
- It is good practice to **annotate** your overridden methods using `@Override`.
- Your code will compile & run without it, but it is considered good practice to annotate the methods that are overridden from the superclass.
- ```java
@Override
public boolean equals(Object obj)
{
if (obj == null) return false;
if (obj instanceof Bicycle)
{
Bicycle bike = (Bicycle) obj;
if (this.speed == bike.getSpeed() && this.gear == bike.getGear())
{
return true;
}
}
return false;
}
```

View File

@ -0,0 +1,143 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[OOP Modelling]]
- **Next Topic:** [[Coding Up Inheritance]]
- **Relevant Slides:** ![Lecture-9__2022.pdf](../assets/Lecture-9_2022_1665043655336_0.pdf) ![Lecture-10__2022.pdf](../assets/Lecture-10_2022_1665044307581_0.pdf)
-
- # Object Equality #card
card-last-interval:: 3.45
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:26:39.173Z
card-last-reviewed:: 2022-10-06T17:26:39.174Z
card-last-score:: 5
collapsed:: true
- When you use `==` with reference variables, you are checking if the variables **point** to the same object.
- So, using `==` on strings will only return true if the Strings are references to the same object. It will return false even if the strings contain the same data.
- The value of a string variable is the **memory location** where its String object is stored.
- When checking for equality between objects, you must use the `equals` method.
- The `equals` method is an instance method that ^^all objects of built-in classes have.^^
- However, for any class that you define, you will have to write your own equals method.
- All equals methods must have the following method signature:
- ```java
public boolean equals(Object object)
```
- Its specific purpose is to define equality between objects.
- It returns a **boolean** value.
- It is **commutative**.
- `str1.equals(str4)` returns the same value as `str4.equals(str1)`.
- Example:
- ```java
String str1 = "Java";
String str2 = "Ja";
String str3 = "va";
String str4 = str2 + str3;
System.out.println(str1.equals(str4) ? "true" : "false");
```
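- A minimal sketch contrasting `==` with `equals` on strings. The class and method names are illustrative; `new String(...)` is used so that the two variables are guaranteed to refer to separate objects.
```java
public class StringEqualitySketch {
    // True only when both variables refer to the very same object.
    public static boolean sameObject(String a, String b) {
        return a == b;
    }

    // True when the two strings hold the same characters.
    public static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String str1 = new String("Java"); // new String(...) forces a separate object
        String str2 = new String("Java");
        System.out.println(sameObject(str1, str2));   // false
        System.out.println(sameContents(str1, str2)); // true
    }
}
```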
- # `instanceof` #card
card-last-interval:: 4.59
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T00:29:22.029Z
card-last-reviewed:: 2022-10-07T10:29:22.030Z
card-last-score:: 5
- `instanceof` is an operator that is used to determine if a variable is pointing to an object with a particular type.
- ```java
System.out.println(bike2 instanceof Bicycle ? "true" : "false");
```
- # Object
collapsed:: true
- What is the type of `Object obj`? #card
card-last-interval:: 4.59
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T00:29:39.466Z
card-last-reviewed:: 2022-10-07T10:29:39.466Z
card-last-score:: 5
- `obj` is a variable whose type is `java.lang.Object`.
- What is `java.lang.Object`? #card
card-last-interval:: 4.59
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T00:30:00.069Z
card-last-reviewed:: 2022-10-07T10:30:00.070Z
card-last-score:: 5
- `java.lang.Object` is a class that provides the ^^most generic definition^^ of an object in Java.
- It is the **parent class** of every class in Java.
- For example, a `Bicycle` object is a `Bicycle` object **and** a `java.lang.Object` object.
-
- # Casting
- ```java
Bicycle bike1 = (Bicycle) myObject;
String str1 = (String) obj;
```
- Here, we can **cast** (convert) a variable from a higher type (`Object`), to a lower type (`Bicycle`).
- This is allowed, as `myObject` points to a Bicycle object - we can check this using `instanceof`.
- `obj` points to a String object - we can check this using `instanceof`.
- Note that it is the variable's type that is being converted, ^^not the object itself.^^
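- A minimal sketch of checking with `instanceof` before casting down from `Object`. The class and method names are illustrative, and returning -1 for non-strings is an arbitrary choice made for the example.
```java
public class CastingSketch {
    // Returns the length of the text if obj refers to a String, or -1 otherwise.
    public static int lengthIfString(Object obj) {
        if (obj instanceof String) {
            String str = (String) obj; // safe: the check above guarantees a String
            return str.length();
        }
        return -1; // casting here would throw ClassCastException
    }

    public static void main(String[] args) {
        Object obj = "hello";
        System.out.println(lengthIfString(obj));                // 5
        System.out.println(lengthIfString(Integer.valueOf(7))); // -1
    }
}
```
- The object itself is never changed by the cast; only the type through which the variable views it changes.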
- # Class Hierarchy
- ## Is-a Relationships
- Java organises all its classes in a class hierarchy.
- For example, a car is a type of vehicle, which is a type of object.
- These relationships can be described as "is-a" relationships.
- A car **is-a** vehicle; a vehicle **is-a**(n) object.
- We refer to the higher-up types as **parents** and the lower types as **children**.
- Car *is-a child* of Vehicle.
- Vehicle *is-a parent* of Car.
- Object *is the parent* of Vehicle & Car.
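- The is-a relationships above can be sketched with `extends` and `instanceof`. The empty `Vehicle` and `Car` classes are hypothetical stand-ins for the example in the notes.
```java
public class IsASketch {
    static class Vehicle { }
    static class Car extends Vehicle { }

    public static void main(String[] args) {
        Car car = new Car();
        System.out.println(car instanceof Car);     // true
        System.out.println(car instanceof Vehicle); // true
        System.out.println(car instanceof Object);  // true

        Vehicle vehicle = new Vehicle();
        System.out.println(vehicle instanceof Car); // false
    }
}
```
- The relationship only runs one way: every `Car` is a `Vehicle`, but a plain `Vehicle` is not a `Car`.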
- ## Key Ideas in Class Hierarchy
- The top of the hierarchy represents the ^^most **generic** attributes & behaviours.^^
- The bottom (sometimes referred to as "leaves") represent the ^^most **specific** attributes & behaviours.^^
- Each level inherits and customises the attributes & behaviours from the level above it.
- `java.lang.Object` is *the* **superclass**, the parent of all classes in Java.
- Every class in Java has the `java.lang.Object` as its superclass (parent).
- ![image.png](../assets/image_1665133543483_0.png)
- All the classes shown above **inherit** (receive) methods from the superclass `java.lang.Object`.
- What is **OOP Inheritance**? #card
card-last-interval:: 1.05
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-11T12:34:06.705Z
card-last-reviewed:: 2022-10-10T11:34:06.705Z
card-last-score:: 3
- **Inheritance** is the means by which objects automatically receive features (fields) & behaviours (methods) from their **superclass**.
- The methods of this superclass are available to all objects of this Class, even though these methods may not be shown in the Class code.
- For example: `.equals()`.
- ### Generic Methods
- All the methods provided by the `java.lang.Object` are *generic*.
- They only relate to `java.lang.Object` classes, not the subclasses.
- When a subclass inherits these methods, it needs to customise them.
- This is why we had to override `.equals()` with our own version for the example Bicycle class.
- ### Overriding
- What is **overriding**? #card
card-last-interval:: 7.48
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-17T22:44:46.299Z
card-last-reviewed:: 2022-10-10T11:44:46.300Z
card-last-score:: 5
- **Overriding** is when you write your own version of a method that you have inherited from a superclass.
- It is creating a specific version of a method inherited from a parent (superclass) class.
- When overriding a method, you must keep every part of the method signature the same - You can only change the code in the method body.
- Its name, its parameter types & order, its access level (e.g., public, protected), and its return type.
- #### Annotation
- It is good practice to **annotate** your overridden methods using `@Override`.
- Your code will compile & run without it, but it is considered good practice to annotate the methods that are overridden from the superclass.
- ```java
@Override
public boolean equals(Object obj)
{
if (obj == null) return false;
if (obj instanceof Bicycle)
{
Bicycle bike = (Bicycle) obj;
if (this.speed == bike.getSpeed() && this.gear == bike.getGear())
{
return true;
}
}
return false;
}
```
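- A self-contained sketch that puts the overridden `equals` into a complete class so it can be run. The `int` fields, the constructor, and the `hashCode` override are assumptions added for the example; only `speed`, `gear`, and their getters come from the notes.
```java
public class BicycleEqualsSketch {
    static class Bicycle {
        private final int speed;
        private final int gear;

        Bicycle(int speed, int gear) {
            this.speed = speed;
            this.gear = gear;
        }

        int getSpeed() { return speed; }
        int getGear() { return gear; }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Bicycle)) {
                return false; // also covers obj == null
            }
            Bicycle bike = (Bicycle) obj;
            return speed == bike.getSpeed() && gear == bike.getGear();
        }

        @Override
        public int hashCode() {
            return 31 * speed + gear; // keep hashCode consistent with equals
        }
    }

    public static void main(String[] args) {
        Bicycle a = new Bicycle(10, 3);
        Bicycle b = new Bicycle(10, 3);
        System.out.println(a == b);      // false
        System.out.println(a.equals(b)); // true
    }
}
```
- Overriding `hashCode` alongside `equals` is the usual convention, so that equal objects behave consistently in hash-based collections such as `HashSet`.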

View File

@ -0,0 +1,39 @@
- #[[CT2106 - Object-Oriented Programming]]
- No previous topic
- **Relevant Slides:** ![Lecture00.pdf](../assets/Lecture00_1662850272554_0.pdf)
-
- ## Definitions
- What is a **class**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:27:33.091Z
card-last-reviewed:: 2022-09-19T18:27:33.091Z
card-last-score:: 5
- A **class** is a type of *blueprint* or *template* from which you make objects.
- What is an **object**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T14:51:10.520Z
card-last-score:: 1
- A (Java) **object** is a self-contained component which consists of *methods* and *properties*.
- It is a piece of code that has a **state** and has **behaviour**.
- Often, they represent a "real-life" object.
- An object is created by *instantiating* a **class**.
- What is **bytecode**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:11:44.761Z
card-last-reviewed:: 2022-09-18T15:11:44.762Z
card-last-score:: 5
- Unlike languages such as C or C++, Java code is **not** compiled directly into machine-specific code that can be executed by a microprocessor.
- Instead, Java programs are compiled into **bytecode**. The bytecode is input into a **Java Virtual Machine (JVM)**, which interprets & executes the code. The JVM is usually a program itself.
- Bytecode is **platform independent**.
- The JVM is specific for each platform, but the bytecode for the program remains the same across different platforms.
- The main trade-off is the effect it has on the execution speed.
-
- **Next Topic:** [[Introduction to Java]]
-

View File

@ -0,0 +1,39 @@
- #[[CT2106 - Object-Oriented Programming]]
- No previous topic
- **Relevant Slides:** ![Lecture00.pdf](../assets/Lecture00_1662850272554_0.pdf)
-
- ## Definitions
- What is a **class**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:29:42.373Z
card-last-reviewed:: 2022-10-03T14:29:42.374Z
card-last-score:: 5
- A **class** is a type of *blueprint* or *template* from which you make objects.
- What is an **object**? #card
card-last-interval:: 3.33
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-03T16:21:16.230Z
card-last-reviewed:: 2022-09-30T09:21:16.231Z
card-last-score:: 3
- A (Java) **object** is a self-contained component which consists of *methods* and *properties*.
- It is a piece of code that has a **state** and has **behaviour**.
- Often, they represent a "real-life" object.
- An object is created by *instantiating* a **class**.
- What is **bytecode**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:21:49.464Z
card-last-reviewed:: 2022-10-01T13:21:49.465Z
card-last-score:: 3
- Unlike languages such as C or C++, Java code is **not** compiled directly into machine-specific code that can be executed by a microprocessor.
- Instead, Java programs are compiled into **bytecode**. The bytecode is input into a **Java Virtual Machine (JVM)**, which interprets & executes the code. The JVM is usually a program itself.
- Bytecode is **platform independent**.
- The JVM is specific for each platform, but the bytecode for the program remains the same across different platforms.
- The main trade-off is the effect it has on the execution speed.
-
- **Next Topic:** [[Introduction to Java]]
-

View File

@ -0,0 +1,39 @@
- #[[CT2106 - Object-Oriented Programming]]
- No previous topic
- **Relevant Slides:** ![Lecture00.pdf](../assets/Lecture00_1662850272554_0.pdf)
-
- ## Definitions
- What is a **class**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:29:42.373Z
card-last-reviewed:: 2022-10-03T14:29:42.374Z
card-last-score:: 5
- A **class** is a type of *blueprint* or *template* from which you make objects.
- What is an **object**? #card
card-last-interval:: 8.8
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-15T12:29:13.252Z
card-last-reviewed:: 2022-10-06T17:29:13.252Z
card-last-score:: 5
- A (Java) **object** is a self-contained component which consists of *methods* and *properties*.
- It is a piece of code that has a **state** and has **behaviour**.
- Often, they represent a "real-life" object.
- An object is created by *instantiating* a **class**.
- What is **bytecode**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:21:49.464Z
card-last-reviewed:: 2022-10-01T13:21:49.465Z
card-last-score:: 3
- Unlike many compiled high-level programming languages (e.g., C or C++), Java code is **not** compiled into machine-specific code that can be executed directly by a microprocessor.
- Instead, Java programs are compiled into **bytecode**. The bytecode is input into a **Java Virtual Machine (JVM)**, which interprets & executes the code. The JVM is usually a program itself.
- Bytecode is **platform independent**.
- The JVM is specific for each platform, but the bytecode for the program remains the same across different platforms.
- The main trade-off is execution speed: interpreting bytecode is generally slower than running native machine code.
-
- **Next Topic:** [[Introduction to Java]]
-

View File

@ -0,0 +1,357 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[The Relational Model]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf)
-
- # SQL
- What is **SQL**? #card
card-last-interval:: 15.72
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-19T04:41:54.766Z
card-last-reviewed:: 2022-10-03T11:41:54.767Z
card-last-score:: 5
- **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems.
- ### Features of SQL
- SQL is based on *relational algebra*.
- All relational, set, and hybrid operators are supported.
- SQL also has additional operators to allow easier query development.
- SQL has been *standardised* since 1987.
- The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part.
- Recent standards include XML-related features in addition to many others, including JSON data types.
- ### ANSI/ISO SQL
- Despite standards, there can be a lack of portability between database systems due to:
- Complexity & size of standards (not all vendors will implement all of the standard).
- The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base.
- The vendor may want to maintain backward compatibility.
- The vendor may want to maintain "Vendor lock-in".
- What is the **standardised SQL syntax** comprised of? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T12:15:02.333Z
card-last-score:: 1
- The **standardised SQL syntax** comprises 3 components:
- **DDL -** Data Definition Language
- **DCL -** Data Control Language
- **DML -** Data Manipulation Language
- ### DCL: Data Control Language
- What is **DCL** used for? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:28:18.563Z
card-last-score:: 1
- **Data Control Language** is used to control access to the database & to database relations.
- It is the role of the **database administrator**.
- Very important in multi-user systems.
- What are the typical **DCL** commands? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:14:42.612Z
card-last-reviewed:: 2022-09-30T12:14:42.612Z
card-last-score:: 3
- ```sql
GRANT
REVOKE
```
- These can be used to:
- grant / revoke access to the database.
- grant / revoke access to individual relations.
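- A minimal sketch of both uses, assuming MySQL syntax. The database `company`, the relation `employee`, and the account `'alice'@'localhost'` are hypothetical names, and the account is assumed to have been created already with `CREATE USER`:
- ```sql
  -- Grant access to one individual relation: read and add rows only.
  GRANT SELECT, INSERT ON company.employee TO 'alice'@'localhost';

  -- Revoke part of that access; SELECT on the relation remains granted.
  REVOKE INSERT ON company.employee FROM 'alice'@'localhost';

  -- Grant read access to every relation in the database.
  GRANT SELECT ON company.* TO 'alice'@'localhost';
  ```
- These statements would normally be run by the database administrator, since they change what other users are allowed to do.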
-
- ### DDL: Data Definition Language
- What is **DDL**? #card
card-last-interval:: 8.76
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-09T06:14:56.915Z
card-last-reviewed:: 2022-09-30T12:14:56.915Z
card-last-score:: 3
- **Data Definition Language** is a standardised language to ^^define the schema of a database.^^
- It is what runs behind the "design" options of a graphical interface (e.g., the Create options).
- What are the typical **DDL** commands? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T08:44:55.362Z
card-last-score:: 1
- The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes.
- Common DDL keywords include:
- ```sql
CREATE
ALTER
DROP
ADD
CONSTRAINT
```
- #### Create a table, its indexes, & constraints
- Steps:
- 1. Specify **table** (relation) name.
2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**.
3. Specify the **Primary Key** of the table: choose one or more attributes.
4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step).
- Steps 1-3 ^^must be completed for all tables.^^
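- The four steps above can be sketched as follows, assuming MySQL syntax. The tables `department` and `employee` and all of their attributes are hypothetical, and the snippet is meant to be run on its own in an empty database:
- ```sql
  -- The referenced table is created first so that step 4 can succeed later.
  CREATE TABLE department (                -- Step 1: table name
      dept_id   INT         NOT NULL,      -- Step 2: name, data type, constraints
      dept_name VARCHAR(50) NOT NULL,
      PRIMARY KEY (dept_id)                -- Step 3: primary key
  );

  CREATE TABLE employee (                  -- Step 1
      ssn       INT          NOT NULL,     -- Step 2
      full_name VARCHAR(100) NOT NULL,
      salary    DECIMAL(10, 2),
      dept_id   INT,
      PRIMARY KEY (ssn),                   -- Step 3
      FOREIGN KEY (dept_id) REFERENCES department(dept_id)   -- Step 4
  );
  ```
- Creating `department` before `employee` matters here: the foreign key in step 4 can only reference an attribute that already exists.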
- #### Data Types
- The main data types are **strings**, **numeric**, and **date/time**.
-
- **Strings**
- What can **strings** contain? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:33.849Z
card-last-reviewed:: 2022-09-30T08:26:33.849Z
card-last-score:: 5
- **Strings** can contain ^^letters, numbers, & special characters.^^
- Types of string: #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T09:26:44.930Z
card-last-reviewed:: 2022-09-30T09:26:44.930Z
card-last-score:: 5
- `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1.
- `VARCHAR(size)` is a string of **variable length**. `size` can be from 0 to 65,535.
- If `size` is not specified, some DBMS (e.g., PostgreSQL) treat the length as unlimited; MySQL requires `size` to be given.
- `TEXT` is the same thing as `VARCHAR` except it is unlimited by default, and takes no argument `size`.
-
- **Date/Time**
- Types of date/time: #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-27T22:08:23.636Z
card-last-reviewed:: 2022-10-04T12:08:23.637Z
card-last-score:: 3
- `DATE` Format: YYYY-MM-DD
- `TIME` Format: hh:mm:ss
- `DATETIME` Format: YYYY-MM-DD hh:mm:ss
- `YEAR` A year in four-digit format
-
- **Numeric**
- The maximum `size` value is 255.
- MySQL supports **unsigned** numeric types but not all DBMS do.
- Types of numerics: #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-13T18:17:13.863Z
card-last-reviewed:: 2022-10-04T12:17:13.863Z
card-last-score:: 3
- `INTEGERS` - *See next block*.
- `BOOL / BOOLEAN` - 0 is False; non-zero is True.
- `FLOAT` - A floating-point number. 4 bytes, single precision.
- `DOUBLE` - A floating-point number. 8 bytes, double precision.
- `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number.
- `size` = total number of digits (max 65, default 10)
- `d` = number of digits after the decimal point (max 30, default 0)
-
- **Integers**
- Types of integers:
- | **Type** | **Bytes** | **Range** |
| `TINYINT` | 1 | -128 to 127 |
| `SMALLINT` | 2 | -32,768 to 32,767 |
| `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 |
  | `INT` | 4 | -2,147,483,648 to 2,147,483,647 |
| `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
- ^^Note:^^ the number in brackets next to integers only refers to the number of digits to display, not size.
-
- **Others**
- Unicode char/string
- Binary
- BLOB, JSON, etc.
- ### DML: Data Manipulation Language
- What is **DML**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-08T12:35:13.598Z
card-last-reviewed:: 2022-10-04T12:35:13.598Z
card-last-score:: 5
- **Data Manipulation Language** is a standardised language used for ^^adding, deleting, & modifying data in a database.^^
- What are the typical **DML** commands? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:42.165Z
card-last-reviewed:: 2022-10-03T11:42:42.165Z
card-last-score:: 3
- ```sql
INSERT -- insert data
SELECT -- query (select) data
UPDATE -- update data
DELETE -- delete data
```
-
- # Autonumber
- What does `AUTO_INCREMENT` do in MySQL? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T08:45:10.629Z
card-last-reviewed:: 2022-09-30T08:45:10.629Z
card-last-score:: 3
- Specifying an attribute as `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^
- Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored.
- Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted.
- By default, the `AUTO_INCREMENT` value starts at **1** and is incremented by 1 for each new tuple inserted.
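- A short sketch of an artificial primary key, assuming MySQL. The table `customer` and its attributes are hypothetical:
- ```sql
  CREATE TABLE customer (
      customer_id   INT AUTO_INCREMENT,     -- generated by the DBMS
      customer_name VARCHAR(100) NOT NULL,
      PRIMARY KEY (customer_id)
  );

  -- customer_id is left out of the column list, so the DBMS fills it in.
  INSERT INTO customer (customer_name) VALUES ('Ada');
  INSERT INTO customer (customer_name) VALUES ('Grace');

  -- Shows the generated key next to each name.
  SELECT customer_id, customer_name FROM customer;
  ```
- With the default settings described above, the two rows would be expected to receive the keys 1 and 2.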
-
- # Constraints
- ## Types of Constraints
- ### Foreign Keys
- What is the syntax for specifying **Foreign Keys**? #card
card-last-interval:: 8.76
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-09T06:15:41.002Z
card-last-reviewed:: 2022-09-30T12:15:41.003Z
card-last-score:: 5
- ```sql
FOREIGN KEY (attributename) REFERENCES tablename(attributename)
```
- You need to specify:
- The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint.
- The attribute name(s) that will identify the foreign keys in the current table.
- If there is more than one attribute, they should be separated by commas.
- Attribute names should be enclosed in brackets.
- The keyword `REFERENCES` to specify the attribute that the foreign key references.
- The table name and the attribute name of the attribute being referenced by the foreign key.
- Again, the attribute name(s) should be in brackets.
- The table name should be **outside** the brackets.
- ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^
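- As an illustration of a foreign key made up of more than one attribute, here is a sketch assuming MySQL syntax. The tables `project` and `assignment` and their attributes are hypothetical:
- ```sql
  -- Referenced table: its primary key consists of two attributes.
  CREATE TABLE project (
      dept_id    INT NOT NULL,
      project_no INT NOT NULL,
      title      VARCHAR(100),
      PRIMARY KEY (dept_id, project_no)
  );

  -- Referencing table: both attributes are listed, separated by commas,
  -- inside the brackets; the referenced table name sits outside them.
  CREATE TABLE assignment (
      assignment_id INT NOT NULL,
      dept_id       INT NOT NULL,
      project_no    INT NOT NULL,
      PRIMARY KEY (assignment_id),
      FOREIGN KEY (dept_id, project_no) REFERENCES project(dept_id, project_no)
  );
  ```
- The attributes are matched by position, so the order inside the two bracketed lists should be the same.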
- ### Using `ALTER` to Modify Design
- **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists.
- If you want to create everything except the foreign keys initially, you can add the foreign keys later using the `ALTER TABLE` command.
-
- #### Syntax for `ALTER` Command
- To add a constraint:
- ```SQL
ALTER TABLE tablename
  ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attributename);
```
- To add an attribute (column):
- ```SQL
ALTER TABLE tablename
ADD attributename DATATYPE;
```
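- A concrete sketch of both forms, assuming MySQL syntax. The tables `lecturer` and `course`, their attributes, and the constraint name `fk_course_lecturer` are all hypothetical:
- ```sql
  -- Both tables are created first, without any foreign key.
  CREATE TABLE lecturer (
      lecturer_id INT NOT NULL,
      PRIMARY KEY (lecturer_id)
  );

  CREATE TABLE course (
      course_code CHAR(6) NOT NULL,
      PRIMARY KEY (course_code)
  );

  -- Add the attribute that will hold the foreign key.
  ALTER TABLE course
  ADD lecturer_id INT;

  -- Now that the referenced attribute exists, add the constraint.
  ALTER TABLE course
  ADD CONSTRAINT fk_course_lecturer
  FOREIGN KEY (lecturer_id) REFERENCES lecturer(lecturer_id);
  ```
- Naming the constraint makes it easier to refer to it later, for example if it needs to be dropped.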
- ### Domain Constraints
id:: 6321ba81-0b92-447e-9c6f-1953528d51a8
- What is the **Domain Constraint**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:34:07.625Z
card-last-score:: 1
- The value of each attribute A must be an **atomic** value from the **domain** dom(A).
- Essentially: ^^the data types & formats must match those specified.^^
- ### Entity Integrity Constraints (Primary Key Constraints)
id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba
- What is the **Primary Key / Entity Integrity Constraint**? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:14:37.382Z
card-last-reviewed:: 2022-09-30T12:14:37.382Z
card-last-score:: 5
- The **primary key** should uniquely identify each tuple in a relation.
- This means:
- No duplicate values allowed for the primary key
- No `NULL` values allowed for the primary key
- **Note:** `NULL` values may also be disallowed for other attributes.
- We often see this constraint when filling out forms online ("*required") and the constraint is often necessary for non-key attributes.
- However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary.
-
- ### Referential Integrity Constraints
- What are **Referential Integrity Constraints**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-10-28T18:17:48.915Z
card-last-interval:: 24.27
card-ease-factor:: 2.56
card-last-reviewed:: 2022-10-04T12:17:48.915Z
- **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database does **not** contain any unmatched foreign keys.^^
- Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^
- In addition, a value for a foreign key attribute **must** exist already as a candidate key value.
- Essentially: "no unmatched foreign keys".
-
- ### Semantic Integrity Constraints
- What are **Semantic Integrity Constraints**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:15:32.324Z
card-last-reviewed:: 2022-09-30T12:15:32.324Z
card-last-score:: 3
- **Semantic Integrity Constraints** ensure that the data entered into a row reflects an allowable value for that row. The value must be within the *domain*, or allowable set of values, for that column.
- How are **Semantic Integrity Constraints** specified? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-04T23:00:00.000Z
card-last-reviewed:: 2022-10-04T12:28:12.548Z
card-last-score:: 1
- **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*.
- What are the two types of **Semantic Integrity Constraints**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T11:43:05.343Z
card-last-score:: 1
- **State Constraints:** Constrain an entity to being in certain states.
- **Transition Constraints:** Constrain an entity to only being updated in certain ways.
- ## Setting Constraints
- **Domain Constraints** are set automatically once the data type is chosen.
- **Entity Constraints** are also set automatically once a primary key has been chosen.
- Usually default constraints are set for foreign keys, but these can be changed.
-
- ## Update Operations & Constraint Violations
- The DBMS must check that the constraints are not violated whenever **update operations** are applied.
-
- ### Insert Operation
- What does the **Insert Operation** do? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:15:27.475Z
card-last-reviewed:: 2022-09-30T12:15:27.476Z
card-last-score:: 5
- The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$.
- This can happen directly via the interface or via a query.
- If a constraint is violated, the DBMS will reject the insertion - usually with an explanation.
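- The following sketch shows insertions that a DBMS would be expected to reject, assuming MySQL with InnoDB tables. The tables `module` and `enrolment` are hypothetical:
- ```sql
  CREATE TABLE module (
      module_code CHAR(6)      NOT NULL,
      title       VARCHAR(100) NOT NULL,
      PRIMARY KEY (module_code)
  );

  CREATE TABLE enrolment (
      student_id  INT     NOT NULL,
      module_code CHAR(6) NOT NULL,
      PRIMARY KEY (student_id, module_code),
      FOREIGN KEY (module_code) REFERENCES module(module_code)
  );

  -- Accepted: no constraint is violated.
  INSERT INTO module (module_code, title) VALUES ('CT230', 'Database Systems I');

  -- Rejected: duplicate primary key value (entity integrity).
  INSERT INTO module (module_code, title) VALUES ('CT230', 'Duplicate');

  -- Rejected: NULL primary key value (entity integrity).
  INSERT INTO module (module_code, title) VALUES (NULL, 'No code');

  -- Rejected: 'XX999' matches no tuple in module (referential integrity).
  INSERT INTO enrolment (student_id, module_code) VALUES (1, 'XX999');
  ```
- Each rejected statement fails on its own; the tuples already stored are left unchanged.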
- ### Delete Operation
- How can a **Delete Operation** violate constraints? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T08:44:42.623Z
card-last-reviewed:: 2022-09-30T08:44:42.623Z
card-last-score:: 3
- A **delete operation** can only violate **referential integrity constraints**, i.e., when the tuple being deleted is referenced by foreign keys in other tuples.
- The DBMS can:
- reject deletion, without explanation.
- attempt to *cascade* deletion.
- modify referencing attribute.
- ### Update Operation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T08:44:33.401Z
card-last-score:: 1
- An **update** operation is used to change the values of one or more attributes in a tuple of a table.
- Issues already discussed with insert & delete could arise with this operation, specifically:
- If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place.
- If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the referenced relation.
- ### Cascade Update & Delete
- Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well.
- Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as:
- ```sql
DELETE FROM employee
WHERE ssn = 12345678;
```
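- Cascading is switched on per foreign key when the constraint is declared. A minimal sketch, assuming MySQL with InnoDB; the `dependent` table and the attributes shown are hypothetical, and the snippet is meant to be run on its own in an empty database:
- ```sql
  CREATE TABLE employee (
      ssn INT NOT NULL,
      PRIMARY KEY (ssn)
  );

  CREATE TABLE dependent (
      dependent_id INT NOT NULL,
      essn         INT NOT NULL,
      PRIMARY KEY (dependent_id),
      FOREIGN KEY (essn) REFERENCES employee(ssn)
          ON DELETE CASCADE     -- deleting an employee deletes their dependents
          ON UPDATE CASCADE     -- changing an employee's ssn updates essn to match
  );

  INSERT INTO employee (ssn) VALUES (12345678);
  INSERT INTO dependent (dependent_id, essn) VALUES (1, 12345678), (2, 12345678);

  -- One statement on the referenced table also removes both dependent tuples.
  DELETE FROM employee
  WHERE ssn = 12345678;
  ```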

View File

@ -0,0 +1,364 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[The Relational Model]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf)
-
- # SQL
- What is **SQL**? #card
card-last-interval:: 15.72
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-19T04:41:54.766Z
card-last-reviewed:: 2022-10-03T11:41:54.767Z
card-last-score:: 5
- **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems.
- ### Features of SQL
- SQL is based on *relational algebra*.
- All relational, set, and hybrid operators are supported.
- SQL also has additional operators to allow easier query development.
- SQL has been *standardised* since 1987.
- The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part.
- Recent standards include XML-related features in addition to many others, including JSON data types.
- ### ANSI/ISO SQL
- Despite standards, there can be a lack of portability between database systems due to:
- Complexity & size of standards (not all vendors will implement all of the standard).
- The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base.
- The vendor may want to maintain backward compatibility.
- The vendor may want to maintain "Vendor lock-in".
- What is the **standardised SQL syntax** comprised of? #card
card-last-interval:: 3.71
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T03:32:46.591Z
card-last-reviewed:: 2022-10-07T10:32:46.591Z
card-last-score:: 3
- The **standardised SQL syntax** comprises 3 components:
- **DDL -** Data Definition Language
- **DCL -** Data Control Language
- **DML -** Data Manipulation Language
- ### DCL: Data Control Language
- What is **DCL** used for? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:24:46.993Z
card-last-reviewed:: 2022-10-06T17:24:46.993Z
card-last-score:: 5
- **Data Control Language** is used to control access to the database & to database relations.
- It is the role of the **database administrator**.
- Very important in multi-user systems.
- What are the typical **DCL** commands? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:14:42.612Z
card-last-reviewed:: 2022-09-30T12:14:42.612Z
card-last-score:: 3
- ```sql
GRANT
REVOKE
```
- These can be used to:
- grant / revoke access to the database.
- grant / revoke access to individual relations.
-
- ### DDL: Data Definition Language
- What is **DDL**? #card
card-last-interval:: 8.76
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-09T06:14:56.915Z
card-last-reviewed:: 2022-09-30T12:14:56.915Z
card-last-score:: 3
- **Data Definition Language** is a standardised language to ^^define the schema of a database.^^
- It is what runs behind the "design" options of a graphical interface (e.g., the Create options).
- What are the typical **DDL** commands? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:33:30.852Z
card-last-reviewed:: 2022-10-07T10:33:30.852Z
card-last-score:: 3
- The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes.
- Common DDL keywords include:
- ```sql
CREATE
ALTER
DROP
ADD
CONSTRAINT
```
- #### Create a table, its indexes, & constraints
- Steps:
- 1. Specify **table** (relation) name.
2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**.
3. Specify the **Primary Key** of the table: choose one or more attributes.
4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step).
- Steps 1-3 ^^must be completed for all tables.^^
- #### Data Types
- The main data types are **strings**, **numeric**, and **date/time**.
-
- **Strings**
- What can **strings** contain? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:33.849Z
card-last-reviewed:: 2022-09-30T08:26:33.849Z
card-last-score:: 5
- **Strings** can contain ^^letters, numbers, & special characters.^^
- Types of string: #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:22.819Z
card-last-reviewed:: 2022-10-07T10:19:22.820Z
card-last-score:: 5
- `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1.
- `VARCHAR(size)` is a string of **variable length**. `size` can be from 0 to 65,535.
- If `size` is not specified, some DBMS (e.g., PostgreSQL) treat the length as unlimited; MySQL requires `size` to be given.
- `TEXT` is the same thing as `VARCHAR` except it is unlimited by default, and takes no argument `size`.
-
- **Date/Time**
- Types of date/time: #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-27T22:08:23.636Z
card-last-reviewed:: 2022-10-04T12:08:23.637Z
card-last-score:: 3
- `DATE` Format: YYYY-MM-DD
- `TIME` Format: hh:mm:ss
- `DATETIME` Format: YYYY-MM-DD hh:mm:ss
- `YEAR` A year in four-digit format
-
- **Numeric**
- The maximum `size` value is 255.
- MySQL supports **unsigned** numeric types but not all DBMS do.
- Types of numerics: #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-13T18:17:13.863Z
card-last-reviewed:: 2022-10-04T12:17:13.863Z
card-last-score:: 3
- `INTEGERS` - *See next block*.
- `BOOL / BOOLEAN` - 0 is False; non-zero is True.
- `FLOAT` - A floating-point number. 4 bytes, single precision.
- `DOUBLE` - A floating-point number. 8 bytes, double precision.
- `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number.
- `size` = total number of digits (max 65, default 10)
- `d` = number of digits after the decimal point (max 30, default 0)
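- To make `size` and `d` concrete, here is a sketch assuming MySQL. The table `price_list` is hypothetical. `DECIMAL(6, 2)` allows 6 digits in total with 2 of them after the decimal point, so it covers -9999.99 to 9999.99:
- ```sql
  CREATE TABLE price_list (
      item_id INT           NOT NULL,
      price   DECIMAL(6, 2) NOT NULL,
      PRIMARY KEY (item_id)
  );

  -- Fits: 4 digits before the decimal point and 2 after.
  INSERT INTO price_list (item_id, price) VALUES (1, 1234.56);

  -- Does not fit: 5 digits before the decimal point.
  -- In strict SQL mode MySQL rejects this with an out-of-range error.
  INSERT INTO price_list (item_id, price) VALUES (2, 12345.67);
  ```
- Because `DECIMAL` is exact, it is usually preferred over `FLOAT` or `DOUBLE` for values such as money.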
-
- **Integers**
- Types of integers:
- | **Type** | **Bytes** | **Range** |
| `TINYINT` | 1 | -128 to 127 |
| `SMALLINT` | 2 | -32,768 to 32,767 |
| `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 |
  | `INT` | 4 | -2,147,483,648 to 2,147,483,647 |
| `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
- ^^Note:^^ the number in brackets next to integers only refers to the number of digits to display, not size.
-
- **Others**
- Unicode char/string
- Binary
- BLOB, JSON, etc.
- ### DML: Data Manipulation Language
- What is **DML**? #card
card-last-interval:: 9.68
card-repeats:: 3
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-19T00:50:25.340Z
card-last-reviewed:: 2022-10-09T08:50:25.341Z
card-last-score:: 5
- **Data Manipulation Language** is a standardised language used for ^^adding, deleting, & modifying data in a database.^^
- What are the typical **DML** commands? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:42.165Z
card-last-reviewed:: 2022-10-03T11:42:42.165Z
card-last-score:: 3
- ```sql
INSERT -- insert data
SELECT -- query (select) data
UPDATE -- update data
DELETE -- delete data
```
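- A minimal sketch using each of the four commands once. The table `student` and its attributes are hypothetical:
- ```sql
  CREATE TABLE student (
      student_id  INT          NOT NULL,
      full_name   VARCHAR(100) NOT NULL,
      course_year INT,
      PRIMARY KEY (student_id)
  );

  -- INSERT: add a new tuple.
  INSERT INTO student (student_id, full_name, course_year)
  VALUES (1, 'Ada Lovelace', 2);

  -- SELECT: query the tuples that match a condition.
  SELECT full_name, course_year
  FROM student
  WHERE course_year = 2;

  -- UPDATE: change attribute values of existing tuples.
  UPDATE student
  SET course_year = 3
  WHERE student_id = 1;

  -- DELETE: remove the tuples that match a condition.
  DELETE FROM student
  WHERE student_id = 1;
  ```
- Omitting the `WHERE` clause from `UPDATE` or `DELETE` applies the statement to every tuple in the table, so it is worth double-checking before running either.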
-
- # Autonumber
- What does `AUTO_INCREMENT` do in MySQL? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:39:18.480Z
card-last-reviewed:: 2022-10-07T10:39:18.480Z
card-last-score:: 5
- Specifying an attribute as `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^
- Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored.
- Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted.
- By default, the `AUTO_INCREMENT` value starts at **1** and is incremented by 1 for each new tuple inserted.
-
- # Constraints
- ## Types of Constraints
- ### Foreign Keys
- What is the syntax for specifying **Foreign Keys**? #card
card-last-interval:: 8.76
card-repeats:: 3
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-09T06:15:41.002Z
card-last-reviewed:: 2022-09-30T12:15:41.003Z
card-last-score:: 5
- ```sql
FOREIGN KEY (attributename) REFERENCES tablename(attributename)
```
- You need to specify:
- The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint.
- The attribute name(s) that will identify the foreign keys in the current table.
- If there is more than one attribute, they should be separated by commas.
- Attribute names should be enclosed in brackets.
- The keyword `REFERENCES` to specify the attribute that the foreign key references.
- The table name and the attribute name of the attribute being referenced by the foreign key.
- Again, the attribute name(s) should be in brackets.
- The table name should be **outside** the brackets.
- ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^
- ### Using `ALTER` to Modify Design
- **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists.
- If you want to create everything except the foreign keys initially, you can add the foreign keys later using the `ALTER TABLE` command.
-
- #### Syntax for `ALTER` Command
- To add a constraint:
- ```SQL
ALTER TABLE tablename
  ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attributename);
```
- To add an attribute (column):
- ```SQL
ALTER TABLE tablename
ADD attributename DATATYPE;
```
- ### Domain Constraints
id:: 6321ba81-0b92-447e-9c6f-1953528d51a8
- What is the **Domain Constraint**? #card
card-last-interval:: 4.28
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T16:20:44.796Z
card-last-reviewed:: 2022-10-07T10:20:44.796Z
card-last-score:: 5
- The value of each attribute A must be an **atomic** value from the **domain** dom(A).
- Essentially: ^^the data types & formats must match those specified.^^
- ### Entity Integrity Constraints (Primary Key Constraints)
id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba
- What is the **Primary Key / Entity Integrity Constraint**? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:14:37.382Z
card-last-reviewed:: 2022-09-30T12:14:37.382Z
card-last-score:: 5
- The **primary key** should uniquely identify each tuple in a relation.
- This means:
- No duplicate values allowed for the primary key
- No `NULL` values allowed for the primary key
- **Note:** `NULL` values may also be disallowed for other attributes.
- We often see this constraint when filling out forms online ("*required") and the constraint is often necessary for non-key attributes.
- However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary.
-
- ### Referential Integrity Constraints
- What are **Referential Integrity Constraints**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-10-28T18:17:48.915Z
card-last-interval:: 24.27
card-ease-factor:: 2.56
card-last-reviewed:: 2022-10-04T12:17:48.915Z
- **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database does **not** contain any unmatched foreign keys.^^
- Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^
- In addition, a value for a foreign key attribute **must** exist already as a candidate key value.
- Essentially: "no unmatched foreign keys".
-
- ### Semantic Integrity Constraints
- What are **Semantic Integrity Constraints**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T09:15:32.324Z
card-last-reviewed:: 2022-09-30T12:15:32.324Z
card-last-score:: 3
- **Semantic Integrity Constraints** ensure that the data entered into a row reflects an allowable value for that row. The value must be within the *domain*, or allowable set of values, for that column.
- How are **Semantic Integrity Constraints** specified? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:32:01.681Z
card-last-score:: 1
- **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*.
- What are the two types of **Semantic Integrity Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:15:01.657Z
card-last-reviewed:: 2022-10-08T15:15:01.658Z
card-last-score:: 3
- **State Constraints:** Constrain an entity to being in certain states.
- **Transition Constraints:** Constrain an entity to only being updated in certain ways.
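- One common way to express a simple state constraint in SQL is a `CHECK` constraint. This is an assumption for illustration and not necessarily the constraint specification language the lecture has in mind. The table `account` and the constraint names are hypothetical, and MySQL only enforces `CHECK` from version 8.0.16 onwards:
- ```sql
  CREATE TABLE account (
      account_id INT            NOT NULL,
      balance    DECIMAL(10, 2) NOT NULL,
      status     VARCHAR(10)    NOT NULL,
      PRIMARY KEY (account_id),
      -- State constraints: every stored tuple must satisfy these conditions.
      CONSTRAINT chk_balance CHECK (balance >= 0),
      CONSTRAINT chk_status  CHECK (status IN ('open', 'frozen', 'closed'))
  );

  -- Accepted: both conditions hold.
  INSERT INTO account (account_id, balance, status) VALUES (1, 100.00, 'open');

  -- Rejected where CHECK is enforced: the balance is negative.
  INSERT INTO account (account_id, balance, status) VALUES (2, -5.00, 'open');
  ```
- A transition constraint (e.g., "a closed account cannot be reopened") depends on both the old and the new value, so it cannot be written as a plain `CHECK` and would typically need a trigger or application logic.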
- ## Setting Constraints
- **Domain Constraints** are set automatically once the data type is chosen.
- **Entity Constraints** are also set automatically once a primary key has been chosen.
- Usually default constraints are set for foreign keys, but these can be changed.
-
- ## Update Operations & Constraint Violations
- The DBMS must check that the constraints are not violated whenever **update operations** are applied.
-
- ### Insert Operation
- What does the **Insert Operation** do? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:15:27.475Z
card-last-reviewed:: 2022-09-30T12:15:27.476Z
card-last-score:: 5
- The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$.
- This can happen directly via the interface or via a query.
- If a constraint is violated, the DBMS will reject the insertion - usually with an explanation.
- ### Delete Operation
- How can a **Delete Operation** violate constraints? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:39:11.689Z
card-last-reviewed:: 2022-10-07T10:39:11.689Z
card-last-score:: 5
- A **delete operation** can only violate **referential integrity constraints**, i.e., when the tuple being deleted is referenced by foreign keys in other tuples.
- The DBMS can:
- reject deletion, without explanation.
- attempt to *cascade* deletion.
- modify referencing attribute.
- ### Update Operation
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.5
card-last-reviewed:: 2022-10-07T10:32:05.858Z
- What is an **Update** Operation? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-08T23:00:00.000Z
card-last-reviewed:: 2022-10-08T15:18:56.111Z
card-last-score:: 1
- An **update** operation is used to change the values of one or more attributes in a tuple of a table.
- Issues already discussed with insert & delete could arise with this operation, specifically:
- If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place.
- If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the referenced relation.
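- The foreign key case can be sketched as follows, assuming MySQL with InnoDB. The tables and values are hypothetical, and the snippet is meant to be run on its own in an empty database:
- ```sql
  CREATE TABLE department (
      dept_id INT NOT NULL,
      PRIMARY KEY (dept_id)
  );

  CREATE TABLE employee (
      ssn     INT NOT NULL,
      dept_id INT,
      PRIMARY KEY (ssn),
      FOREIGN KEY (dept_id) REFERENCES department(dept_id)
  );

  INSERT INTO department (dept_id) VALUES (1), (2);
  INSERT INTO employee (ssn, dept_id) VALUES (12345678, 1);

  -- Accepted: department 2 exists in the referenced relation.
  UPDATE employee
  SET dept_id = 2
  WHERE ssn = 12345678;

  -- Rejected: no tuple in department has dept_id = 99.
  UPDATE employee
  SET dept_id = 99
  WHERE ssn = 12345678;
  ```
- After the rejected statement the employee still belongs to department 2, because the failed update leaves the tuple unchanged.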
- ### Cascade Update & Delete
- Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well.
- Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as:
- ```sql
DELETE FROM employee
WHERE ssn = 12345678;
```


@ -0,0 +1,364 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[The Relational Model]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf)
-
- # SQL
- What is **SQL**? #card
card-last-interval:: 15.72
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-19T04:41:54.766Z
card-last-reviewed:: 2022-10-03T11:41:54.767Z
card-last-score:: 5
- **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems.
- ### Features of SQL
- SQL is based on *relational algebra*.
- All relational, set, and hybrid operators are supported.
- SQL also has additional operators to allow easier query development.
- SQL has been *standardised* since 1986 (ANSI), with ISO adopting the standard in 1987.
- The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part.
- Recent standards include XML-related features in addition to many others, including JSON data types.
- ### ANSI/ISO SQL
- Despite standards, there can be a lack of portability between database systems due to:
- Complexity & size of standards (not all vendors will implement all of the standard).
- The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base.
- The vendor may want to maintain backward compatibility.
- The vendor may want to maintain "Vendor lock-in".
- What is the **standardised SQL syntax** comprised of? #card
card-last-interval:: 3.71
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T03:32:46.591Z
card-last-reviewed:: 2022-10-07T10:32:46.591Z
card-last-score:: 3
- The **standardised SQL syntax** comprises 3 components:
- **DDL -** Data Definition Language
- **DCL -** Data Control Language
- **DML -** Data Manipulation Language
- ### DCL: Data Control Language
- What is **DCL** used for? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-10T03:24:46.993Z
card-last-reviewed:: 2022-10-06T17:24:46.993Z
card-last-score:: 5
- **Data Control Language** is used to control access to the database & to database relations.
- Controlling access is typically the role of the **database administrator**.
- Very important in multi-user systems.
- What are the typical **DCL** commands? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:44:02.090Z
card-last-reviewed:: 2022-10-10T11:44:02.091Z
card-last-score:: 3
- ```sql
GRANT
REVOKE
```
- These can be used to:
- grant / revoke access to the database.
- grant / revoke access to individual relations.
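- A minimal sketch of both uses in MySQL syntax. The database `company`, the relation `employee`, and the user `'alice'@'localhost'` are hypothetical names, and the user is assumed to exist already:
- ```sql
-- Grant access to every relation in the (hypothetical) company database.
GRANT ALL PRIVILEGES ON company.* TO 'alice'@'localhost';

-- Grant read and insert access to one individual relation.
GRANT SELECT, INSERT ON company.employee TO 'alice'@'localhost';

-- Take the insert privilege on that relation away again.
REVOKE INSERT ON company.employee FROM 'alice'@'localhost';
```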
-
- ### DDL: Data Definition Language
- What is **DDL**? #card
card-last-interval:: 27.13
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-06T14:42:59.239Z
card-last-reviewed:: 2022-10-10T11:42:59.239Z
card-last-score:: 5
- **Data Definition Language** is a standardised language to ^^define the schema of a database.^^
- It is what runs behind the "design" options of a graphical interface (e.g., the Create options).
- What are the typical **DDL** commands? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:33:30.852Z
card-last-reviewed:: 2022-10-07T10:33:30.852Z
card-last-score:: 3
- The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes.
- Common DDL keywords include:
- ```sql
CREATE
ALTER
DROP
ADD
CONSTRAINT
```
- #### Create a table, its indexes, & constraints
- Steps:
- 1. Specify **table** (relation) name.
2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**.
3. Specify the **Primary Key** of the table: choose one or more attributes.
4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step).
- Steps 1-3 ^^must be completed for all tables.^^
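- The steps above can be sketched as follows in MySQL syntax. The tables `department` and `employee` and all of their attributes are hypothetical, chosen only for illustration:
- ```sql
-- Step 1: specify the table (relation) name.
CREATE TABLE department (
    -- Step 2: attribute name, data type, and any constraints.
    dno   INT         NOT NULL,
    dname VARCHAR(50) NOT NULL,
    -- Step 3: specify the primary key.
    PRIMARY KEY (dno)
);

CREATE TABLE employee (
    ssn  INT          NOT NULL,
    name VARCHAR(100) NOT NULL,
    dno  INT,
    PRIMARY KEY (ssn),
    -- Step 4: the referenced table and attribute must already exist,
    -- which is why department is created first.
    FOREIGN KEY (dno) REFERENCES department(dno)
);
```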
- #### Data Types
- The main data types are **strings**, **numeric**, and **date/time**.
-
- **Strings**
- What can **strings** contain? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:33.849Z
card-last-reviewed:: 2022-09-30T08:26:33.849Z
card-last-score:: 5
- **Strings** can contain ^^letters, numbers, & special characters.^^
- Types of string: #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:22.819Z
card-last-reviewed:: 2022-10-07T10:19:22.820Z
card-last-score:: 5
- `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1.
- `VARCHAR(size)` is a string of **variable length**. `size` can be from 0 to 65,535.
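- In MySQL, `size` must be specified; some other DBMSs (e.g., PostgreSQL) treat a `VARCHAR` without a `size` as unlimited.
- `TEXT` is similar to `VARCHAR`, but is normally declared without a `size` argument. Its maximum length depends on the DBMS (65,535 bytes in MySQL; effectively unlimited in PostgreSQL).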
-
- **Date/Time**
- Types of date/time: #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-27T22:08:23.636Z
card-last-reviewed:: 2022-10-04T12:08:23.637Z
card-last-score:: 3
- `DATE` Format: YYYY-MM-DD
- `TIME` Format: hh:mm:ss
- `DATETIME` Format: YYYY-MM-DD hh:mm:ss
- `YEAR` A year in four-digit format
-
- **Numeric**
- The maximum `size` value is 255.
- MySQL supports **unsigned** numeric types but not all DBMS do.
- Types of numerics: #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-13T18:17:13.863Z
card-last-reviewed:: 2022-10-04T12:17:13.863Z
card-last-score:: 3
- `INTEGERS` - *See next block*.
- `BOOL / BOOLEAN` - 0 is False; non-zero is True.
- `FLOAT` - A floating-point number. 4 bytes, single precision.
- `DOUBLE` - A floating-point number. 8 bytes, double precision.
- `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number.
- `size` = total number of digits (max 65, default 10)
- `d` = number of digits after the decimal point (max 30, default 0)
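- As a small illustration of `size` and `d`, assuming a hypothetical table `price_demo`: `DECIMAL(6, 2)` holds 6 digits in total, 2 of them after the decimal point, so it can store values from -9999.99 to 9999.99.
- ```sql
-- Hypothetical table used only to illustrate DECIMAL(size, d).
CREATE TABLE price_demo (
    amount DECIMAL(6, 2)  -- 4 digits before the decimal point, 2 after
);

INSERT INTO price_demo (amount) VALUES (1234.56);
```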
-
- **Integers**
- Types of integers:
- | **Type** | **Bytes** | **Range** |
| `TINYINT` | 1 | -128 to 127 |
| `SMALLINT` | 2 | -32,768 to 32,767 |
| `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 |
| `INT` | 4| -2,147,483,648 to 2,147,483,647 |
| `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
- ^^Note:^^ the number in brackets after an integer type (e.g., `INT(11)`) only sets the number of digits to display, not the storage size.
-
- **Others**
- Unicode char/string
- Binary
- BLOB, JSON, etc.
- ### DML: Data Manipulation Language
- What is **DML**? #card
card-last-interval:: 9.68
card-repeats:: 3
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-19T00:50:25.340Z
card-last-reviewed:: 2022-10-09T08:50:25.341Z
card-last-score:: 5
- **Data Manipulation Language** is a standardised language used for ^^adding, querying, deleting, & modifying data in a database.^^
- What are the typical **DML** commands? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:42.165Z
card-last-reviewed:: 2022-10-03T11:42:42.165Z
card-last-score:: 3
- ```sql
INSERT -- insert data
SELECT -- query (select) data
UPDATE -- update data
DELETE -- delete data
```
-
- # Autonumber
- What does `AUTO_INCREMENT` do in MySQL? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:39:18.480Z
card-last-reviewed:: 2022-10-07T10:39:18.480Z
card-last-score:: 5
- Specifying an attribute as `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^
- Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored.
- Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted.
- By default, the `AUTO_INCREMENT` value starts at **1** and is incremented by 1 for each new tuple inserted.
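- A minimal sketch in MySQL syntax, assuming a hypothetical `project` table. Note that the `INSERT` statements do not supply a value for the key attribute at all:
- ```sql
-- Hypothetical table with an artificial primary key.
CREATE TABLE project (
    pnumber INT         NOT NULL AUTO_INCREMENT,
    pname   VARCHAR(50) NOT NULL,
    PRIMARY KEY (pnumber)
);

-- pnumber is omitted; the DBMS generates it for each new tuple.
INSERT INTO project (pname) VALUES ('Alpha');
INSERT INTO project (pname) VALUES ('Beta');
```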
-
- # Constraints
- ## Types of Constraints
- ### Foreign Keys
- What is the syntax for specifying **Foreign Keys**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:43:12.658Z
card-last-reviewed:: 2022-10-10T11:43:12.658Z
card-last-score:: 3
- ```sql
FOREIGN KEY (attributename) REFERENCES tablename(attributename)
```
- You need to specify:
- The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint.
- The attribute name(s) that will identify the foreign keys in the current table.
- If there is more than one attribute, they should be separated by commas.
- Attribute names should be enclosed in brackets.
- The keyword `REFERENCES` to specify the attribute that the foreign key references.
- The table name and the attribute name of the attribute being referenced by the foreign key.
- Again, the attribute name(s) should be in brackets.
- The table name should be **outside** the brackets.
- ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^
- ### Using `ALTER` to Modify Design
- **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists.
- If you want to create everything but the foreign keys initially, you can add a foreign key later using the `ALTER TABLE` command
-
- #### Syntax for `ALTER` Command
- To add a constraint:
- ```SQL
ALTER TABLE tablename
ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attributename);
```
- To add an attribute (column):
- ```SQL
ALTER TABLE tablename
ADD attributename DATATYPE;
```
- ### Domain Constraints
id:: 6321ba81-0b92-447e-9c6f-1953528d51a8
- What is the **Domain Constraint**? #card
card-last-interval:: 4.28
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T16:20:44.796Z
card-last-reviewed:: 2022-10-07T10:20:44.796Z
card-last-score:: 5
- The value of each attribute A must be an **atomic** value from the **domain** dom(A).
- Essentially: ^^the data types & formats must match those specified.^^
- ### Entity Integrity Constraints (Primary Key Constraints)
id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba
- What is the **Primary Key / Entity Integrity Constraint**? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:14:37.382Z
card-last-reviewed:: 2022-09-30T12:14:37.382Z
card-last-score:: 5
- The **primary key** should uniquely identify each tuple in a relation.
- This means:
- No duplicate values allowed for the primary key
- No `NULL` values allowed for the primary key
- **Note:** `NULL` values may also be disallowed for other attributes.
- We often see this constraint when filling out forms online ("*required") and the constraint is often necessary for non-key attributes.
- However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary.
-
- ### Referential Integrity Constraints
- What are **Referential Integrity Constraints**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-10-28T18:17:48.915Z
card-last-interval:: 24.27
card-ease-factor:: 2.56
card-last-reviewed:: 2022-10-04T12:17:48.915Z
- **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database must **not** contain any unmatched foreign keys.^^
- Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^
- In addition, a value for a foreign key attribute **must** already exist as a candidate key value in the referenced relation.
- Essentially: "no unmatched foreign keys".
-
- ### Semantic Integrity Constraints
- What are **Semantic Integrity Constraints**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:44:09.198Z
card-last-reviewed:: 2022-10-10T11:44:09.198Z
card-last-score:: 5
- **Semantic Integrity Constraints** ensure that the data entered into a column is an allowable value for that column, i.e., it lies within the column's *domain* (its allowable set of values).
- How are **Semantic Integrity Constraints** specified? #card
card-last-interval:: 2.09
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T13:39:40.453Z
card-last-reviewed:: 2022-10-10T11:39:40.453Z
card-last-score:: 5
- **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*.
- What are the two types of **Semantic Integrity Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:15:01.657Z
card-last-reviewed:: 2022-10-08T15:15:01.658Z
card-last-score:: 3
- **State Constraints:** Constrain an entity to being in certain states.
- **Transition Constraints:** Constrain an entity to only being updated in certain ways.
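- One common way to express a simple state constraint in SQL is a `CHECK` constraint. The sketch below assumes a hypothetical `account` table; MySQL only enforces `CHECK` from version 8.0.16 onwards (earlier versions parse it but ignore it). Transition constraints usually need triggers or application logic instead, which are not shown here.
- ```sql
-- Hypothetical table: each CHECK restricts the states a tuple may be in.
CREATE TABLE account (
    account_id INT            NOT NULL,
    balance    DECIMAL(10, 2) NOT NULL,
    status     VARCHAR(10)    NOT NULL,
    PRIMARY KEY (account_id),
    CONSTRAINT chk_balance CHECK (balance >= 0),
    CONSTRAINT chk_status  CHECK (status IN ('open', 'frozen', 'closed'))
);
```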
- ## Setting Constraints
- **Domain Constraints** are set automatically once the data type is chosen.
- **Entity Constraints** are also set automatically once a primary key has been chosen.
- Usually default constraints are set for foreign keys, but these can be changed.
-
- ## Update Operations & Constraint Violations
- The DBMS must check that the constraints are not violated whenever **update operations** are applied.
-
- ### Insert Operation
- What does the **Insert Operation** do? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:15:27.475Z
card-last-reviewed:: 2022-09-30T12:15:27.476Z
card-last-score:: 5
- The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$.
- This can happen directly via the interface or via a query.
- If a constraint is violated, the DBMS will reject the insertion - usually with an explanation.
- ### Delete Operation
- How can a **Delete Operation** violate constraints? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:39:11.689Z
card-last-reviewed:: 2022-10-07T10:39:11.689Z
card-last-score:: 5
- A **delete operation** can only violate **referential integrity constraints**, i.e., when the tuple being deleted is referenced by a foreign key in other tuples.
- The DBMS can:
- reject deletion, without explanation.
- attempt to *cascade* deletion.
- modify the referencing attribute (e.g., set it to `NULL` or to a default value).
- ### Update Operation
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.5
card-last-reviewed:: 2022-10-07T10:32:05.858Z
- What is an **Update** Operation? #card
card-last-interval:: 0.75
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-11T05:41:10.864Z
card-last-reviewed:: 2022-10-10T11:41:10.864Z
card-last-score:: 3
- An **update** operation is used to change the values of one or more attributes in a tuple of a table.
- Issues already discussed with insert & delete could arise with this operation, specifically:
- If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place.
- If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the referenced relation.
- ### Cascade Update & Delete
- Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well.
- Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as:
- ```sql
DELETE FROM employee
WHERE ssn = 12345678;
```
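- Cascading is typically switched on per foreign key when the referencing table is defined. A minimal sketch in MySQL syntax, assuming a hypothetical `dependent` table and that `employee(ssn)` already exists as an `INT` primary key:
- ```sql
-- Hypothetical referencing (child) table.
CREATE TABLE dependent (
    essn           INT         NOT NULL,
    dependent_name VARCHAR(50) NOT NULL,
    PRIMARY KEY (essn, dependent_name),
    FOREIGN KEY (essn) REFERENCES employee(ssn)
        ON DELETE CASCADE   -- deleting an employee deletes their dependents
        ON UPDATE CASCADE   -- changing employee.ssn updates essn to match
);
```
- With this definition, the single `DELETE` on `employee` also removes every `dependent` tuple whose `essn` matches.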


@ -0,0 +1,364 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[The Relational Model]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf)
-
- # SQL
- What is **SQL**? #card
card-last-interval:: 15.72
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-19T04:41:54.766Z
card-last-reviewed:: 2022-10-03T11:41:54.767Z
card-last-score:: 5
- **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems.
- ### Features of SQL
- SQL is based on *relational algebra*.
- All relational, set, and hybrid operators are supported.
- SQL also has additional operators to allow easier query development.
- SQL has been *standardised* since 1986 (ANSI), with ISO adopting the standard in 1987.
- The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part.
- Recent standards include XML-related features in addition to many others, including JSON data types.
- ### ANSI/ISO SQL
- Despite standards, there can be a lack of portability between database systems due to:
- Complexity & size of standards (not all vendors will implement all of the standard).
- The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base.
- The vendor may want to maintain backward compatibility.
- The vendor may want to maintain "Vendor lock-in".
- What is the **standardised SQL syntax** comprised of? #card
card-last-interval:: 3.71
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-11T03:32:46.591Z
card-last-reviewed:: 2022-10-07T10:32:46.591Z
card-last-score:: 3
- The **standardised SQL syntax** comprises 3 components:
- **DDL -** Data Definition Language
- **DCL -** Data Control Language
- **DML -** Data Manipulation Language
- ### DCL: Data Control Language
- What is **DCL** used for? #card
card-last-interval:: 14.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-03T12:29:26.988Z
card-last-reviewed:: 2022-10-20T08:29:26.989Z
card-last-score:: 5
- **Data Control Language** is used to control access to the database & to database relations.
- Controlling access is typically the role of the **database administrator**.
- Very important in multi-user systems.
- What are the typical **DCL** commands? #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-27T18:44:02.090Z
card-last-reviewed:: 2022-10-10T11:44:02.091Z
card-last-score:: 3
- ```sql
GRANT
REVOKE
```
- These can be used to:
- grant / revoke access to the database.
- grant / revoke access to individual relations.
-
- ### DDL: Data Definition Language
- What is **DDL**? #card
card-last-interval:: 27.13
card-repeats:: 4
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-06T14:42:59.239Z
card-last-reviewed:: 2022-10-10T11:42:59.239Z
card-last-score:: 5
- **Data Definition Language** is a standardised language to ^^define the schema of a database.^^
- It is what runs behind the "design" options of a graphical interface (e.g., the Create options).
- What are the typical **DDL** commands? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-11T10:33:30.852Z
card-last-reviewed:: 2022-10-07T10:33:30.852Z
card-last-score:: 3
- The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes.
- Common DDL keywords include:
- ```sql
CREATE
ALTER
DROP
ADD
CONSTRAINT
```
- #### Create a table, its indexes, & constraints
- Steps:
- 1. Specify **table** (relation) name.
2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**.
3. Specify the **Primary Key** of the table: choose one or more attributes.
4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step).
- Steps 1-3 ^^must be completed for all tables.^^
- #### Data Types
- The main data types are **strings**, **numeric**, and **date/time**.
-
- **Strings**
- What can **strings** contain? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T12:26:33.849Z
card-last-reviewed:: 2022-09-30T08:26:33.849Z
card-last-score:: 5
- **Strings** can contain ^^letters, numbers, & special characters.^^
- Types of string: #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:19:22.819Z
card-last-reviewed:: 2022-10-07T10:19:22.820Z
card-last-score:: 5
- `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1.
- `VARCHAR(size)` is a string of **variable length**. `size` can be from 0 to 65,535.
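- In MySQL, `size` must be specified; some other DBMSs (e.g., PostgreSQL) treat a `VARCHAR` without a `size` as unlimited.
- `TEXT` is similar to `VARCHAR`, but is normally declared without a `size` argument. Its maximum length depends on the DBMS (65,535 bytes in MySQL; effectively unlimited in PostgreSQL).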
-
- **Date/Time**
- Types of date/time: #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-27T22:08:23.636Z
card-last-reviewed:: 2022-10-04T12:08:23.637Z
card-last-score:: 3
- `DATE` Format: YYYY-MM-DD
- `TIME` Format: hh:mm:ss
- `DATETIME` Format: YYYY-MM-DD hh:mm:ss
- `YEAR` A year in four-digit format
-
- **Numeric**
- The maximum `size` value is 255.
- MySQL supports **unsigned** numeric types but not all DBMS do.
- Types of numerics: #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-13T18:17:13.863Z
card-last-reviewed:: 2022-10-04T12:17:13.863Z
card-last-score:: 3
- `INTEGERS` - *See next block*.
- `BOOL / BOOLEAN` - 0 is False; non-zero is True.
- `FLOAT` - A floating-point number. 4 bytes, single precision.
- `DOUBLE` - A floating-point number. 8 bytes, double precision.
- `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number.
- `size` = total number of digits (max 65, default 10)
- `d` = number of digits after the decimal point (max 30, default 0)
-
- **Integers**
- Types of integers:
- | **Type** | **Bytes** | **Range** |
| `TINYINT` | 1 | -128 to 127 |
| `SMALLINT` | 2 | -32,768 to 32,767 |
| `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 |
| `INT` | 4| -2,147,483,648 to 2,147,483,647 |
| `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
- ^^Note:^^ the number in brackets after an integer type (e.g., `INT(11)`) only sets the number of digits to display, not the storage size.
-
- **Others**
- Unicode char/string
- Binary
- BLOB, JSON, etc.
- ### DML: Data Manipulation Language
- What is **DML**? #card
card-last-interval:: 9.68
card-repeats:: 3
card-ease-factor:: 2.42
card-next-schedule:: 2022-10-19T00:50:25.340Z
card-last-reviewed:: 2022-10-09T08:50:25.341Z
card-last-score:: 5
- **Data Manipulation Language** is a standardised language used for ^^adding, querying, deleting, & modifying data in a database.^^
- What are the typical **DML** commands? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:42:42.165Z
card-last-reviewed:: 2022-10-03T11:42:42.165Z
card-last-score:: 3
- ```sql
INSERT -- insert data
SELECT -- query (select) data
UPDATE -- update data
DELETE -- delete data
```
-
- # Autonumber
- What does `AUTO_INCREMENT` do in MySQL? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:39:18.480Z
card-last-reviewed:: 2022-10-07T10:39:18.480Z
card-last-score:: 5
- Specifying an attribute as `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^
- Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored.
- Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted.
- By default, the `AUTO_INCREMENT` value starts at **1** and is incremented by 1 for each new tuple inserted.
-
- # Constraints
- ## Types of Constraints
- ### Foreign Keys
- What is the syntax for specifying **Foreign Keys**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:43:12.658Z
card-last-reviewed:: 2022-10-10T11:43:12.658Z
card-last-score:: 3
- ```sql
FOREIGN KEY (attributename) REFERENCES tablename(attributename)
```
- You need to specify:
- The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint.
- The attribute name(s) that will identify the foreign keys in the current table.
- If there is more than one attribute, they should be separated by commas.
- Attribute names should be enclosed in brackets.
- The keyword `REFERENCES` to specify the attribute that the foreign key references.
- The table name and the attribute name of the attribute being referenced by the foreign key.
- Again, the attribute name(s) should be in brackets.
- The table name should be **outside** the brackets.
- ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^
- ### Using `ALTER` to Modify Design
- **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists.
- If you want to create everything but the foreign keys initially, you can add a foreign key later using the `ALTER TABLE` command
-
- #### Syntax for `ALTER` Command
- To add a constraint:
- ```SQL
ALTER TABLE tablename
ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attributename);
```
- To add an attribute (column):
- ```SQL
ALTER TABLE tablename
ADD attributename DATATYPE;
```
- ### Domain Constraints
id:: 6321ba81-0b92-447e-9c6f-1953528d51a8
- What is the **Domain Constraint**? #card
card-last-interval:: 4.28
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-11T16:20:44.796Z
card-last-reviewed:: 2022-10-07T10:20:44.796Z
card-last-score:: 5
- The value of each attribute A must be an **atomic** value from the **domain** dom(A).
- Essentially: ^^the data types & formats must match those specified.^^
- ### Entity Integrity Constraints (Primary Key Constraints)
id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba
- What is the **Primary Key / Entity Integrity Constraint**? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:14:37.382Z
card-last-reviewed:: 2022-09-30T12:14:37.382Z
card-last-score:: 5
- The **primary key** should uniquely identify each tuple in a relation.
- This means:
- No duplicate values allowed for the primary key
- No `NULL` values allowed for the primary key
- **Note:** `NULL` values may also be disallowed for other attributes.
- We often see this constraint when filling out forms online ("*required") and the constraint is often necessary for non-key attributes.
- However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary.
-
- ### Referential Integrity Constraints
- What are **Referential Integrity Constraints**? #card
card-last-score:: 5
card-repeats:: 4
card-next-schedule:: 2022-10-28T18:17:48.915Z
card-last-interval:: 24.27
card-ease-factor:: 2.56
card-last-reviewed:: 2022-10-04T12:17:48.915Z
- **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database must **not** contain any unmatched foreign keys.^^
- Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^
- In addition, a value for a foreign key attribute **must** already exist as a candidate key value in the referenced relation.
- Essentially: "no unmatched foreign keys".
-
- ### Semantic Integrity Constraints
- What are **Semantic Integrity Constraints**? #card
card-last-interval:: 21.53
card-repeats:: 4
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-31T23:44:09.198Z
card-last-reviewed:: 2022-10-10T11:44:09.198Z
card-last-score:: 5
- **Semantic Integrity Constraints** ensure that the data entered into a column is an allowable value for that column, i.e., it lies within the column's *domain* (its allowable set of values).
- How are **Semantic Integrity Constraints** specified? #card
card-last-interval:: 2.09
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-12T13:39:40.453Z
card-last-reviewed:: 2022-10-10T11:39:40.453Z
card-last-score:: 5
- **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*.
- What are the two types of **Semantic Integrity Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T15:15:01.657Z
card-last-reviewed:: 2022-10-08T15:15:01.658Z
card-last-score:: 3
- **State Constraints:** Constrain an entity to being in certain states.
- **Transition Constraints:** Constrain an entity to only being updated in certain ways.
- ## Setting Constraints
- **Domain Constraints** are set automatically once the data type is chosen.
- **Entity Constraints** are also set automatically once a primary key has been chosen.
- Usually default constraints are set for foreign keys, but these can be changed.
-
- ## Update Operations & Constraint Violations
- The DBMS must check that the constraints are not violated whenever **update operations** are applied.
-
- ### Insert Operation
- What does the **Insert Operation** do? #card
card-last-interval:: 13.2
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-13T16:15:27.475Z
card-last-reviewed:: 2022-09-30T12:15:27.476Z
card-last-score:: 5
- The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$.
- This can happen directly via the interface or via a query.
- If a constraint is violated, the DBMS will reject the insertion - usually with an explanation.
- ### Delete Operation
- How can a **Delete Operation** violate constraints? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:39:11.689Z
card-last-reviewed:: 2022-10-07T10:39:11.689Z
card-last-score:: 5
- A **delete operation** can only violate **referential integrity constraints**, i.e., when the tuple being deleted is referenced by a foreign key in other tuples.
- The DBMS can:
- reject deletion, without explanation.
- attempt to *cascade* deletion.
- modify the referencing attribute (e.g., set it to `NULL` or to a default value).
- ### Update Operation
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.5
card-last-reviewed:: 2022-10-07T10:32:05.858Z
- What is an **Update** Operation? #card
card-last-interval:: 0.75
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-11T05:41:10.864Z
card-last-reviewed:: 2022-10-10T11:41:10.864Z
card-last-score:: 3
- An **update** operation is used to change the values of one or more attributes in a tuple of a table.
- Issues already discussed with insert & delete could arise with this operation, specifically:
- If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place.
- If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the referenced relation.
- ### Cascade Update & Delete
- Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well.
- Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as:
- ```sql
DELETE FROM employee
WHERE ssn = 12345678;
```


@ -0,0 +1,364 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[The Relational Model]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf)
-
- # SQL
- What is **SQL**? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-18T17:47:35.469Z
card-last-reviewed:: 2022-11-17T09:47:35.470Z
card-last-score:: 5
- **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems.
- ### Features of SQL
- SQL is based on *relational algebra*.
- All relational, set, and hybrid operators are supported.
- SQL also has additional operators to allow easier query development.
- SQL has been *standardised* since the late 1980s: ANSI published the first standard in 1986 and ISO adopted it in 1987.
- The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part.
- Recent standards add XML-related features and JSON support, among many others.
- ### ANSI/ISO SQL
- Despite standards, there can be a lack of portability between database systems due to:
- Complexity & size of standards (not all vendors will implement all of the standard).
- The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base.
- The vendor may want to maintain backward compatibility.
- The vendor may want to maintain "Vendor lock-in".
- What is the **standardised SQL syntax** comprised of? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-25T06:29:02.836Z
card-last-reviewed:: 2022-11-14T16:29:02.836Z
card-last-score:: 5
- The **standardised SQL syntax** comprises 3 components:
- **DDL -** Data Definition Language
- **DCL -** Data Control Language
- **DML -** Data Manipulation Language
- ### DCL: Data Control Language
- What is **DCL** used for? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-16T04:21:50.434Z
card-last-reviewed:: 2022-11-14T20:21:50.434Z
card-last-score:: 5
- **Data Control Language** is used to control access to the database & to database relations.
- It is the role of the **database administrator**.
- Very important in multi-user systems.
- What are the typical **DCL** commands? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-26T06:18:39.799Z
card-last-reviewed:: 2022-11-14T20:18:39.800Z
card-last-score:: 5
- ```sql
GRANT
REVOKE
```
- These can be used to:
- grant / revoke access to the database.
- grant / revoke access to individual relations.
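- A minimal sketch of both uses in MySQL syntax. The account `'alice'@'localhost'`, its password, and the `company` database are hypothetical names chosen for illustration, and the `company.employee` table is assumed to already exist.
- ```sql
-- Hypothetical account; replace the name and password with your own.
CREATE USER 'alice'@'localhost' IDENTIFIED BY 'change_me';

-- Access to the database: read-only rights on every table in `company`.
GRANT SELECT ON company.* TO 'alice'@'localhost';

-- Access to an individual relation: extra rights on `employee` only.
GRANT INSERT, UPDATE ON company.employee TO 'alice'@'localhost';

-- Withdraw one of those rights again.
REVOKE UPDATE ON company.employee FROM 'alice'@'localhost';
```
- After these statements, the account can still read every table and insert into `employee`, but can no longer update it.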
-
- ### DDL: Data Definition Language
- What is **DDL**? #card
card-last-interval:: 86.42
card-repeats:: 5
card-ease-factor:: 2.66
card-next-schedule:: 2023-02-09T06:22:06.927Z
card-last-reviewed:: 2022-11-14T20:22:06.927Z
card-last-score:: 5
- **Data Definition Language** is a standardised language to ^^define the schema of a database.^^
- It is what runs behind the "design" options of a graphical interface (e.g., the Create options).
- What are the typical **DDL** commands? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:36:00.957Z
card-last-score:: 1
- The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes.
- Common DDL keywords include:
- ```sql
CREATE
ALTER
DROP
ADD
CONSTRAINT
```
- #### Create a table, its indexes, & constraints
- Steps:
- 1. Specify **table** (relation) name.
2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**.
3. Specify the **Primary Key** of the table: choose one or more attributes.
4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step).
- Steps 1-3 ^^must be completed for all tables.^^
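- The steps above can be sketched as follows, in MySQL syntax. The attribute names (`dnumber`, `dname`, `fname`, `lname`, `dno`) are assumptions made for illustration; only `employee` and `ssn` appear elsewhere in these notes.
- ```sql
-- Steps 1-3 for the referenced table, which has no foreign keys.
CREATE TABLE department (
    dnumber INT         NOT NULL,   -- Step 2: name, data type, constraint
    dname   VARCHAR(30) NOT NULL,
    PRIMARY KEY (dnumber)           -- Step 3: primary key
);

-- Steps 1-4 for the referencing table.
CREATE TABLE employee (
    ssn   INT         NOT NULL,
    fname VARCHAR(30) NOT NULL,
    lname VARCHAR(30) NOT NULL,
    dno   INT,
    PRIMARY KEY (ssn),
    -- Step 4: only possible because department(dnumber) already exists.
    FOREIGN KEY (dno) REFERENCES department(dnumber)
);
```
- Creating `department` first is what makes Step 4 possible in the same statement; if the order were reversed, the foreign key would have to be added afterwards.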
- #### Data Types
- The main data types are **strings**, **numeric**, and **date/time**.
-
- **Strings**
- What can **strings** contain? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:37:34.017Z
card-last-reviewed:: 2022-11-14T16:37:34.018Z
card-last-score:: 5
- **Strings** can contain ^^letters, numbers, & special characters.^^
- Types of string: #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-12-08T06:03:14.822Z
card-last-reviewed:: 2022-11-14T20:03:14.822Z
card-last-score:: 3
- `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1.
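- `VARCHAR(size)` is a string of **variable length**. In MySQL, `size` can be from 0 to 65,535 and must be specified.
- In some other DBMS (e.g., PostgreSQL), a `VARCHAR` with no `size` is unlimited.
- `TEXT` is like `VARCHAR` but is intended for long text and is normally declared without a `size` argument. In MySQL it holds up to 65,535 bytes; in PostgreSQL it is unlimited.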
-
- **Date/Time**
- Types of date/time: #card
card-last-interval:: 47.41
card-repeats:: 5
card-ease-factor:: 2.28
card-next-schedule:: 2023-01-01T05:18:44.919Z
card-last-reviewed:: 2022-11-14T20:18:44.919Z
card-last-score:: 3
- `DATE` Format: YYYY-MM-DD
- `TIME` Format: hh:mm:ss
- `DATETIME` Format: YYYY-MM-DD hh:mm:ss
- `YEAR` A year in four-digit format
-
- **Numeric**
- The maximum `size` value is 255.
- MySQL supports **unsigned** numeric types but not all DBMS do.
- Types of numerics: #card
card-last-interval:: 19.01
card-repeats:: 4
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-03T16:48:07.382Z
card-last-reviewed:: 2022-11-14T16:48:07.383Z
card-last-score:: 3
- `INTEGERS` - *See next block*.
- `BOOL / BOOLEAN` - 0 is False; non-zero is True.
- `FLOAT` - A floating-point number. 4 bytes, single precision.
- `DOUBLE` - A floating-point number. 8 bytes, double precision.
- `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number.
- `size` = total number of digits (max 65, default 10)
- `d` = number of digits after the decimal point (max 30, default 0)
-
- **Integers**
- Types of integers:
- | **Type** | **Bytes** | **Range** |
| `TINYINT` | 1 | -128 to 127 |
| `SMALLINT` | 2 | -32,768 to 32,767 |
| `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 |
| `INT` | 4 | -2,147,483,648 to 2,147,483,647 |
| `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
- ^^Note:^^ the number in brackets after an integer type, e.g. `INT(11)`, is only the display width (the number of digits to display), not the storage size.
-
- **Others**
- Unicode char/string
- Binary
- BLOB, JSON, etc.
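- As an illustration of the data types in this section, here is a sketch of a single MySQL table that uses most of them. The table `product_sample` and all of its attribute names are hypothetical.
- ```sql
CREATE TABLE product_sample (
    product_id  INT UNSIGNED  NOT NULL,  -- 4-byte integer; UNSIGNED is MySQL-specific
    code        CHAR(8)       NOT NULL,  -- fixed length: always 8 characters
    name        VARCHAR(100)  NOT NULL,  -- variable length: up to 100 characters
    description TEXT,                    -- long free text, no size argument
    price       DECIMAL(8, 2) NOT NULL,  -- exact: 8 digits in total, 2 after the point
    weight_kg   FLOAT,                   -- approximate, single precision
    in_stock    BOOLEAN       NOT NULL,  -- 0 is false, non-zero is true
    released_on DATE,                    -- e.g. '2022-10-07'
    updated_at  DATETIME,                -- e.g. '2022-10-07 10:39:11'
    PRIMARY KEY (product_id)
);
```
- `DECIMAL` rather than `FLOAT` is used for `price` because money should be stored exactly; `DECIMAL(8, 2)` allows values up to 999999.99.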
- ### DML: Data Manipulation Language
- What is **DML**? #card
card-last-interval:: 25.4
card-repeats:: 4
card-ease-factor:: 2.52
card-next-schedule:: 2022-12-12T18:47:31.780Z
card-last-reviewed:: 2022-11-17T09:47:31.780Z
card-last-score:: 5
- **Data Manipulation Language** is a standardised language used for ^^adding, querying, deleting, & modifying data in a database.^^
- What are the typical **DML** commands? #card
card-last-interval:: 15.05
card-repeats:: 4
card-ease-factor:: 1.94
card-next-schedule:: 2022-11-29T17:40:29.100Z
card-last-reviewed:: 2022-11-14T16:40:29.100Z
card-last-score:: 3
- ```sql
INSERT -- insert data
SELECT -- query (select) data
UPDATE -- update data
DELETE -- delete data
```
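- A short sketch showing each of the four commands acting on one table. The `student` table and its contents are hypothetical.
- ```sql
CREATE TABLE student (
    student_id INT         NOT NULL,
    name       VARCHAR(50) NOT NULL,
    course     VARCHAR(10),
    PRIMARY KEY (student_id)
);

-- INSERT: add two tuples.
INSERT INTO student (student_id, name, course) VALUES (1, 'Ada', 'CT230');
INSERT INTO student (student_id, name, course) VALUES (2, 'Alan', 'MA284');

-- SELECT: query the tuples that match a condition.
SELECT name, course FROM student WHERE course = 'CT230';

-- UPDATE: change an attribute value in the matching tuple.
UPDATE student SET course = 'CT230' WHERE student_id = 2;

-- DELETE: remove the matching tuple.
DELETE FROM student WHERE student_id = 1;
```
- Note that `UPDATE` and `DELETE` without a `WHERE` clause act on every tuple in the table.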
-
- # Autonumber
- What does `AUTO_INCREMENT` do in MySQL? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-12-13T03:03:50.803Z
card-last-reviewed:: 2022-11-14T20:03:50.803Z
card-last-score:: 5
- Specifying an attribute as `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^
- Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored.
- Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted.
- By default, the `AUTO_INCREMENT` value starts at **1**, and is incremented by 1 for each new tuple inserted.
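- A minimal sketch of an artificial primary key in MySQL. The `note` table is hypothetical; the point is that the `INSERT` statements never mention `note_id`.
- ```sql
CREATE TABLE note (
    note_id INT          NOT NULL AUTO_INCREMENT,  -- generated by the DBMS
    body    VARCHAR(200) NOT NULL,
    PRIMARY KEY (note_id)
);

-- The key attribute is omitted; the DBMS supplies the next value.
INSERT INTO note (body) VALUES ('first note');
INSERT INTO note (body) VALUES ('second note');

-- With default server settings the two tuples receive note_id 1 and 2.
SELECT note_id, body FROM note ORDER BY note_id;
```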
-
- # Constraints
- ## Types of Constraints
- ### Foreign Keys
- What is the syntax for specifying **Foreign Keys**? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-26T06:21:18.467Z
card-last-reviewed:: 2022-11-14T20:21:18.468Z
card-last-score:: 3
- ```sql
FOREIGN KEY (attributename) REFERENCES tablename(attributename)
```
- You need to specify:
- The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint.
- The attribute name(s) that will identify the foreign keys in the current table.
- If there is more than one attribute, they should be separated by commas.
- Attribute names should be enclosed in brackets.
- The keyword `REFERENCES` to specify the attribute that the foreign key references.
- The table name and the attribute name of the attribute being referenced by the foreign key.
- Again, the attribute name(s) should be in brackets.
- The table name should be **outside** the brackets.
- ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^
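- A sketch of a foreign key with more than one attribute, assuming MySQL with the InnoDB engine. The tables `project` and `works_on` and their attributes are hypothetical.
- ```sql
-- The referenced table must exist first, with a matching key.
CREATE TABLE project (
    pnumber INT NOT NULL,
    dnum    INT NOT NULL,
    PRIMARY KEY (pnumber, dnum)
);

CREATE TABLE works_on (
    essn  INT NOT NULL,
    pno   INT NOT NULL,
    dnum  INT NOT NULL,
    hours DECIMAL(4, 1),
    PRIMARY KEY (essn, pno, dnum),
    -- Attribute names in brackets, separated by commas;
    -- the referenced table name sits outside the brackets.
    FOREIGN KEY (pno, dnum) REFERENCES project(pnumber, dnum)
);
```
- The attributes are matched by position, so `pno` refers to `pnumber` and `dnum` to `dnum`; each pair must have the same data type.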
- ### Using `ALTER` to Modify Design
- **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists.
- If you want to create everything but the foreign keys initially, you can add a foreign key later using the `ALTER TABLE` command.
-
- #### Syntax for `ALTER` Command
- To add a constraint:
- ```SQL
ALTER TABLE tablename
ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attributename);
```
- To add an attribute (column):
- ```SQL
ALTER TABLE tablename
ADD attributename DATATYPE;
```
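- A typical case where `ALTER TABLE` is needed is two tables that reference each other, since neither can be created first with both foreign keys in place. A sketch under assumed names (`dno`, `dnumber`, `fk_employee_department`, and `salary` are illustrative; `ssn` and `mgrssn` appear in these notes):
- ```sql
-- Create employee without its foreign key, as department does not exist yet.
CREATE TABLE employee (
    ssn INT NOT NULL,
    dno INT,
    PRIMARY KEY (ssn)
);

-- department can reference employee immediately.
CREATE TABLE department (
    dnumber INT NOT NULL,
    mgrssn  INT,
    PRIMARY KEY (dnumber),
    FOREIGN KEY (mgrssn) REFERENCES employee(ssn)
);

-- Now that department exists, add the missing foreign key to employee.
ALTER TABLE employee
ADD CONSTRAINT fk_employee_department
    FOREIGN KEY (dno) REFERENCES department(dnumber);

-- Add a new attribute (column) to an existing table.
ALTER TABLE employee
ADD salary DECIMAL(10, 2);
```
- Naming the constraint makes it easier to drop or change it later.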
- ### Domain Constraints
id:: 6321ba81-0b92-447e-9c6f-1953528d51a8
- What is the **Domain Constraint**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:38:56.391Z
card-last-score:: 1
- The value of each attribute A must be an **atomic** value from the **domain** dom(A).
- Essentially: ^^the data types & formats must match those specified.^^
- ### Entity Integrity Constraints (Primary Key Constraints)
id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba
- What is the **Primary Key / Entity Integrity Constraint**? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-16T00:47:09.544Z
card-last-reviewed:: 2022-11-14T16:47:09.544Z
card-last-score:: 5
- The **primary key** should uniquely identify each tuple in a relation.
- This means:
- No duplicate values allowed for the primary key
- No `NULL` values allowed for the primary key
- **Note:** `NULL` values may also be disallowed for other (non-key) attributes.
- We often meet this constraint as "*required" fields when filling out online forms; it is often needed for non-key attributes too.
- However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary.
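- A sketch of how a DBMS enforces entity integrity, assuming MySQL. The `member` table is hypothetical, and the last two statements are written to fail on purpose.
- ```sql
CREATE TABLE member (
    member_id INT          NOT NULL,
    email     VARCHAR(100) NOT NULL,  -- a "required" non-key attribute
    nickname  VARCHAR(30),            -- optional, so NULL is allowed
    PRIMARY KEY (member_id)
);

-- Accepted: new key value, all required attributes supplied.
INSERT INTO member (member_id, email) VALUES (1, 'a@example.com');

-- Expected to be rejected: duplicate primary key value.
INSERT INTO member (member_id, email) VALUES (1, 'b@example.com');

-- Expected to be rejected: NULL primary key value.
INSERT INTO member (member_id, email) VALUES (NULL, 'c@example.com');
```
- Only `email` is marked `NOT NULL` among the non-key attributes, in line with the advice to add that constraint only where it is really needed.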
-
- ### Referential Integrity Constraints
- What are **Referential Integrity Constraints**? #card
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.56
card-last-reviewed:: 2022-11-14T20:19:00.925Z
- **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database must **not** contain any unmatched foreign keys.^^
- Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^
- In addition, a value for a foreign key attribute **must** already exist as a candidate key value in the referenced relation (or be `NULL`, where that is allowed).
- Essentially: "no unmatched foreign keys".
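- A sketch of an unmatched foreign key being refused, assuming MySQL with the InnoDB engine. The attribute `dependent_name` is an assumption; `ssn` and `essn` are the names used elsewhere in these notes.
- ```sql
CREATE TABLE employee (
    ssn INT NOT NULL,
    PRIMARY KEY (ssn)
);

CREATE TABLE dependent (
    essn           INT         NOT NULL,  -- same type as employee.ssn
    dependent_name VARCHAR(30) NOT NULL,
    PRIMARY KEY (essn, dependent_name),
    FOREIGN KEY (essn) REFERENCES employee(ssn)
);

INSERT INTO employee (ssn) VALUES (12345678);

-- Accepted: 12345678 exists as a key value in employee.
INSERT INTO dependent (essn, dependent_name) VALUES (12345678, 'Alice');

-- Expected to be rejected: no employee has ssn 99999999.
INSERT INTO dependent (essn, dependent_name) VALUES (99999999, 'Bob');
```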
-
- ### Semantic Integrity Constraints
- What are **Semantic Integrity Constraints**? #card
card-last-interval:: 56.69
card-repeats:: 5
card-ease-factor:: 2.42
card-next-schedule:: 2023-01-07T03:37:12.551Z
card-last-reviewed:: 2022-11-11T11:37:12.551Z
card-last-score:: 5
- **Semantic Integrity Constraints** ensure that the data entered into a row reflects an allowable value for that row. The value must be within the *domain*, or allowable set of values, for that column.
- How are **Semantic Integrity Constraints** specified? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:42:45.174Z
card-last-score:: 1
- **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*.
- What are the two types of **Semantic Integrity Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-25T13:05:15.346Z
card-last-reviewed:: 2022-11-21T13:05:15.347Z
card-last-score:: 3
- **State Constraints:** Constrain an entity to being in certain states.
- **Transition Constraints:** Constrain an entity to only being updated in certain ways.
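- One possible way to express both kinds of constraint in MySQL is sketched below: `CHECK` clauses for state constraints (enforced from MySQL 8.0.16 onwards) and a trigger for a transition constraint. The `account` table, the status values, and the rule itself are hypothetical, and `DELIMITER` is a command of the `mysql` command-line client rather than SQL.
- ```sql
CREATE TABLE account (
    account_id INT            NOT NULL,
    balance    DECIMAL(10, 2) NOT NULL,
    status     VARCHAR(10)    NOT NULL,
    PRIMARY KEY (account_id),
    -- State constraints: restrict the states a tuple may be in.
    CONSTRAINT chk_balance CHECK (balance >= 0),
    CONSTRAINT chk_status  CHECK (status IN ('open', 'frozen', 'closed'))
);

-- Transition constraint: a closed account may not change status again.
DELIMITER //
CREATE TRIGGER account_no_reopen
BEFORE UPDATE ON account
FOR EACH ROW
BEGIN
    IF OLD.status = 'closed' AND NEW.status <> 'closed' THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'A closed account cannot be reopened';
    END IF;
END//
DELIMITER ;
```
- The `CHECK` clauses look only at the new tuple, whereas the trigger compares the `OLD` and `NEW` versions, which is what makes it a constraint on the transition.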
- ## Setting Constraints
- **Domain Constraints** are set automatically once the data type is chosen.
- **Entity Constraints** are also set automatically once a primary key has been chosen.
- Usually default constraints are set for foreign keys, but these can be changed.
-
- ## Update Operations & Constraint Violations
- The DBMS must check that the constraints are not violated whenever **update operations** are applied.
-
- ### Insert Operation
- What does the **Insert Operation** do? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-16T00:47:14.600Z
card-last-reviewed:: 2022-11-14T16:47:14.600Z
card-last-score:: 5
- The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$.
- This can happen directly via the interface or via a query.
- If a constraint is violated, the DBMS will reject the insertion - usually with an explanation.
- ### Delete Operation
- How can a **Delete Operation** violate constraints? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-12-13T03:03:33.858Z
card-last-reviewed:: 2022-11-14T20:03:33.859Z
card-last-score:: 5
- A **delete operation** can only violate **referential integrity constraints**, i.e., when the tuple being deleted is referenced by a foreign key in other tuples.
- The DBMS can:
- reject the deletion, usually with an explanation.
- attempt to *cascade* the deletion.
- modify the referencing attribute (e.g., set it to `NULL` or a default value).
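- In MySQL (InnoDB), these choices are made per foreign key with an `ON DELETE` clause. The sketch below shows rejection and modification of the referencing attribute; the tables and attribute names are assumptions for illustration.
- ```sql
CREATE TABLE department (
    dnumber INT NOT NULL,
    PRIMARY KEY (dnumber)
);

-- Reject the deletion (omitting ON DELETE behaves the same way in InnoDB).
CREATE TABLE project (
    pnumber INT NOT NULL,
    dnum    INT NOT NULL,
    PRIMARY KEY (pnumber),
    FOREIGN KEY (dnum) REFERENCES department(dnumber) ON DELETE RESTRICT
);

-- Modify the referencing attribute: it is set to NULL.
CREATE TABLE employee (
    ssn INT NOT NULL,
    dno INT,  -- must be nullable for SET NULL to work
    PRIMARY KEY (ssn),
    FOREIGN KEY (dno) REFERENCES department(dnumber) ON DELETE SET NULL
);

INSERT INTO department (dnumber) VALUES (1), (2);
INSERT INTO project (pnumber, dnum) VALUES (10, 1);
INSERT INTO employee (ssn, dno) VALUES (12345678, 2);

-- Expected to be rejected: project 10 still references department 1.
DELETE FROM department WHERE dnumber = 1;

-- Accepted: employee 12345678 is kept, with dno set to NULL.
DELETE FROM department WHERE dnumber = 2;
```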
- ### Update Operation
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.5
card-last-reviewed:: 2022-10-07T10:32:05.858Z
- What is an **Update** Operation? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-20T08:30:42.712Z
card-last-reviewed:: 2022-11-11T11:30:42.713Z
card-last-score:: 3
- An **update** operation is used to change the values of one or more attributes in a tuple of a table.
- Issues already discussed with insert & delete could arise with this operation, specifically:
- If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place.
- If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the referenced relation.
- ### Cascade Update & Delete
- Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well.
- Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as:
- ```sql
DELETE FROM employee
WHERE ssn = 12345678;
```
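- A sketch of how that single `DELETE` can remove several tuples, assuming MySQL with the InnoDB engine. The `dependent` table's `dependent_name` attribute and the sample data are assumptions.
- ```sql
CREATE TABLE employee (
    ssn INT NOT NULL,
    PRIMARY KEY (ssn)
);

CREATE TABLE dependent (
    essn           INT         NOT NULL,
    dependent_name VARCHAR(30) NOT NULL,
    PRIMARY KEY (essn, dependent_name),
    FOREIGN KEY (essn) REFERENCES employee(ssn)
        ON DELETE CASCADE   -- deleting an employee deletes their dependents
        ON UPDATE CASCADE   -- changing ssn updates essn to match
);

INSERT INTO employee (ssn) VALUES (12345678);
INSERT INTO dependent (essn, dependent_name)
VALUES (12345678, 'Alice'), (12345678, 'Bob');

-- One statement on the referenced table...
DELETE FROM employee
WHERE ssn = 12345678;

-- ...also removed both referencing tuples, so this count is 0.
SELECT COUNT(*) FROM dependent;
```
- If `dependent` were itself referenced by a third table with cascading turned on, the deletions would continue down the chain.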

@ -0,0 +1,180 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Entity Relationship Models]]
- **Next Topic:** [[Normalisation]]
- **Relevant Slides:** ![SQL Joins and Union Queries class.pdf](../assets/SQL_Joins_and_Union_Queries_class_1665572555489_0.pdf)
-
- # Joins
- What are **Joins**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-19T23:00:00.000Z
card-last-reviewed:: 2022-10-19T08:46:10.068Z
card-last-score:: 1
- ^^**Joins** combine multiple tables into one table.^^
- This new (temporary) table is then queried to return results, so that we can return values from any of the tables that were joined.
- Tables are joined by specifying links (**joins**) across attributes in the tables.
- Joins are carried out on 2 tables at a time, but many tables can be joined in one query.
- For example, a third table could be joined to a table that results from joining two tables.
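- A sketch of a three-table join, where the result of joining the first two tables is then joined to the third. The attributes `dnumber`, `dname`, `lname`, `dno`, and `dependent_name` are assumptions for illustration.
- ```sql
CREATE TABLE department (
    dnumber INT NOT NULL,
    dname   VARCHAR(30),
    PRIMARY KEY (dnumber)
);

CREATE TABLE employee (
    ssn   INT NOT NULL,
    lname VARCHAR(30),
    dno   INT,
    PRIMARY KEY (ssn),
    FOREIGN KEY (dno) REFERENCES department(dnumber)
);

CREATE TABLE dependent (
    essn           INT         NOT NULL,
    dependent_name VARCHAR(30) NOT NULL,
    PRIMARY KEY (essn, dependent_name),
    FOREIGN KEY (essn) REFERENCES employee(ssn)
);

-- department is joined to employee; that result is joined to dependent.
SELECT d.dname, e.lname, dep.dependent_name
FROM department AS d
     INNER JOIN employee  AS e   ON e.dno = d.dnumber
     INNER JOIN dependent AS dep ON dep.essn = e.ssn;
```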
- ## Specifying Joins #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-19T23:00:00.000Z
card-last-reviewed:: 2022-10-19T08:57:28.441Z
card-last-score:: 1
- [1]. In SQL, we must specify *all the tables* that are part of the join in the `FROM` clause.
- [2]. We must then specify the **join condition** - for an inner join, the condition is `foreign_key = primary_key / candidate_key`.
- [3]. The join condition can be specified in the `FROM` or `WHERE` clause.
- ## Different Types of Joins
- **Inner Join** is the default when using an **implicit join**.
- For **explicit joins**, we must explicitly state the join used.
- ![https://www.csestack.org/wp-content/uploads/2020/10/sql-table-joins.png](https://www.csestack.org/wp-content/uploads/2020/10/sql-table-joins.png)
- ### Inner Joins #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-19T23:00:00.000Z
card-last-reviewed:: 2022-10-19T08:56:56.972Z
card-last-score:: 1
- An `INNER JOIN` includes the tuples from the first (left) of the two tables ^^only when they satisfy the join condition^^ and tuples from the second (right) table ^^only when they also satisfy the join condition.^^
- Example:
- ```sql
SELECT *
FROM employee INNER JOIN dependent
ON ssn = essn;
```
- ### Left Joins #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-18T08:43:49.911Z
card-last-reviewed:: 2022-10-18T08:43:49.912Z
card-last-score:: 1
- **Left (outer) joins** include all of the tuples from the first (left) of two tables, regardless of whether or not they satisfy the join condition or if there are matching values in the second (right) table.
- Tuples from the second (right) table are only included when they satisfy the join condition.
- [Essentially the mirror image of a right join: swapping the order of the two tables turns one into the other.]
- Example:
- ```sql
SELECT *
FROM employee LEFT JOIN department ON
employee.ssn = department.mgrssn;
```
- ### Right Joins #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-19T23:00:00.000Z
card-last-reviewed:: 2022-10-19T08:46:25.910Z
card-last-score:: 1
- **Right (outer) joins** include all of the tuples from the second (right) table, regardless of whether or not they satisfy the join condition or if there are matching values in the first (left) table.
- Tuples from the first (left) table are included only if they satisfy the join condition.
- [Essentially the mirror image of a left join: swapping the order of the two tables turns one into the other.]
- Example:
- ```sql
SELECT *
FROM employee RIGHT JOIN department ON
employee.ssn = department.mgrssn;
```
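- The difference between the join types is easiest to see on a few rows. The sketch below uses Python's built-in `sqlite3` module with made-up `employee` & `department` rows (an assumption for illustration). Because older SQLite versions lack `RIGHT JOIN`, the right join is written as a left join with the tables swapped.
- ```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, fname TEXT);
    CREATE TABLE department (dname TEXT, mgrssn INTEGER);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cat');
    INSERT INTO department VALUES ('Research', 1), ('Sales', NULL);
""")

# Inner join: only the employee/department pairs that match.
inner_rows = conn.execute("""
    SELECT fname, dname
    FROM employee INNER JOIN department ON employee.ssn = department.mgrssn
    ORDER BY fname
""").fetchall()

# Left join: every employee, padded with NULL when no department matches.
left_rows = conn.execute("""
    SELECT fname, dname
    FROM employee LEFT JOIN department ON employee.ssn = department.mgrssn
    ORDER BY fname
""").fetchall()

# Right join, expressed as a left join with the table order swapped:
# every department, padded with NULL when it has no manager.
right_rows = conn.execute("""
    SELECT fname, dname
    FROM department LEFT JOIN employee ON employee.ssn = department.mgrssn
    ORDER BY dname
""").fetchall()

print(inner_rows)  # [('Ann', 'Research')]
print(left_rows)   # [('Ann', 'Research'), ('Ben', None), ('Cat', None)]
print(right_rows)  # [('Ann', 'Research'), (None, 'Sales')]
```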
- ## Inner Joining Tables #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-17T23:00:00.000Z
card-last-reviewed:: 2022-10-17T13:27:52.837Z
card-last-score:: 1
- The result of an inner join operation between two tables $R(A_1, A_2, \dots, A_n)$ and $S(B_1, B_2, \dots, B_m)$ is a table $Q(A_1, A_2, \dots, A_n, B_1, B_2, \dots, B_m)$.
- $Q$ has one tuple for each combination of tuples (one from $R$ & one from $S$) ^^whenever the combination satisfies the join condition^^ - the join will retrieve **all** attributes in each table.
- ### Example: Inner Join Condition for the `employee` & `dependent` Tables
- **Join Condition:** `ssn = essn`.
- Full query retrieving all employees & their dependents (*dependants* in non-American English), when they have dependents:
- ```SQL
SELECT *
FROM employee INNER JOIN dependent
ON ssn = essn;
```
- #### Note
- When attributes with the same name from different tables are used in a join query, you need to specify the table name to avoid ambiguity.
- For example:
- `bdate` in `employee` & `dependent`.
- We can refer to these unambiguously as `employee.bdate` & `dependent.bdate`.
-
- ## Implicit & Explicit Joins
- The **join condition** can be specified *implicitly* or *explicitly*.
- What is an **explicit join**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-17T23:00:00.000Z
card-last-reviewed:: 2022-10-17T13:27:41.672Z
card-last-score:: 1
- An **explicit join** is specified in the `FROM` clause where the tables to be joined are listed.
- The keyword `INNER JOIN` is used for inner joins, and the **join condition** is listed using the keyword `ON`.
- Syntax:
- ```SQL
SELECT [DISTINCT] <attribute list>
FROM <table>
[INNER / LEFT / RIGHT] JOIN <table>
ON <join condition>
WHERE <condition>
```
- What is an **implicit join**? #card
id:: 6346a49e-2951-4bdb-ab74-6920aa664c41
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-17T23:00:00.000Z
card-last-reviewed:: 2022-10-17T13:28:05.696Z
card-last-score:: 1
- An **implicit join** is specified in the `WHERE` clause without using the keyword `ON`.
- It is referred to as a **join condition**.
- All the tables must be listed in the `FROM` clause, separated by commas.
- The **join condition** is contained in the `WHERE` clause.
- If there are other conditions, the join condition is appended on with `AND`.
- Other conditions can be specified in the `WHERE` clause as well as the join condition.
- All implicit joins are **inner joins** - rows from both tables are returned only when there is a match between the attributes named in the join condition.
- Syntax:
- ```sql
SELECT [DISTINCT] <attribute list>
FROM <table>, <table>
WHERE <join condition> AND
<condition>
```
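- As a quick check that the two syntaxes are interchangeable for inner joins, the sketch below runs both forms on the same data using Python's built-in `sqlite3` module. The sample rows and the extra filter on `dependent_name` are assumptions for illustration.
- ```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, fname TEXT);
    CREATE TABLE dependent (essn INTEGER, dependent_name TEXT);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO dependent VALUES (1, 'Dee'), (1, 'Eli');
""")

# Explicit join: join condition after ON, other conditions in WHERE.
explicit_rows = conn.execute("""
    SELECT fname, dependent_name
    FROM employee INNER JOIN dependent ON ssn = essn
    WHERE dependent_name <> 'Eli'
    ORDER BY dependent_name
""").fetchall()

# Implicit join: tables separated by commas, join condition ANDed in WHERE.
implicit_rows = conn.execute("""
    SELECT fname, dependent_name
    FROM employee, dependent
    WHERE ssn = essn AND dependent_name <> 'Eli'
    ORDER BY dependent_name
""").fetchall()

print(explicit_rows)  # [('Ann', 'Dee')]
print(implicit_rows)  # [('Ann', 'Dee')]
```
- If the join condition is left out of the implicit form, every employee is paired with every dependent, so it is worth double-checking that the `WHERE` clause contains it.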
- ## Self-Joins & Aliases
- What is a **self-join**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-18T23:00:00.000Z
card-last-reviewed:: 2022-10-18T13:32:33.200Z
card-last-score:: 1
- A **self-join** is a normal SQL join that joins a table to itself.
- This is accomplished by using **aliases** to give each "instance" of the table a separate name using the keyword `AS`.
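- A small self-join sketch using Python's built-in `sqlite3` module. The `superssn` column (each employee's supervisor) and the sample rows are assumptions for illustration; the aliases `e` & `s` stand for the "employee" and "supervisor" instances of the same table.
- ```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        ssn      INTEGER PRIMARY KEY,
        fname    TEXT,
        superssn INTEGER  -- hypothetical: ssn of this employee's supervisor
    );
    INSERT INTO employee VALUES (1, 'Ann', NULL), (2, 'Ben', 1), (3, 'Cat', 1);
""")

# The same table appears twice; AS gives each instance its own name.
pairs = conn.execute("""
    SELECT e.fname, s.fname
    FROM employee AS e INNER JOIN employee AS s ON e.superssn = s.ssn
    ORDER BY e.fname
""").fetchall()

print(pairs)  # [('Ben', 'Ann'), ('Cat', 'Ann')]
```
- Ann has no supervisor, so the inner join leaves her out of the first column.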
- # Sub-Queries VS Joins
- Can sub-queries & joins be used interchangeably? #card
card-last-interval:: 0.98
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-20T07:45:55.095Z
card-last-reviewed:: 2022-10-19T08:45:55.097Z
card-last-score:: 3
- In some cases, you can replace a join with a sub-query.
- But recall:
- Joins are needed when values across multiple tables must be displayed.
- Sub-queries are needed when an existing value from a table needs to be retrieved & used as part of the query solution.
- Sub-queries are needed when an aggregate function needs to be performed & used as part of a query solution.
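- To illustrate the aggregate case, the sketch below finds the employees who earn more than the average salary, using Python's built-in `sqlite3` module. The `salary` column and the sample rows are assumptions for illustration.
- ```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, fname TEXT, salary INTEGER);
    INSERT INTO employee VALUES (1, 'Ann', 30000), (2, 'Ben', 40000),
                                (3, 'Cat', 50000);
""")

# The inner query computes one aggregate value (the average, 40000),
# which the outer query then uses in its WHERE condition.
above_average = conn.execute("""
    SELECT fname
    FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY fname
""").fetchall()

print(above_average)  # [('Cat',)]
```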
- # Union Queries
- The keyword `UNION` is used to combine the results of two or more queries or tables.
- MySQL versions before 8.0.31 do not support minus (`EXCEPT`) or intersection (`INTERSECT`) operators, but the same functionality can be built using joins.
- For union queries, tables must be **union compatible**.
- What does it mean to be **union compatible**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-18T23:00:00.000Z
card-last-reviewed:: 2022-10-18T13:32:11.787Z
card-last-score:: 1
- Two relations are **union compatible** if the schemas of the two relations match.
- i.e., there are the same number of attributes in each relation, and each pair of corresponding attributes have the same **domain**.
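- A sketch of a union query, plus an intersection built from a join, using Python's built-in `sqlite3` module. The two tables and their rows are assumptions for illustration. Both `SELECT` lists have two text attributes, so they are union compatible.
- ```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (fname TEXT, bdate TEXT);
    CREATE TABLE dependent (dependent_name TEXT, bdate TEXT);
    INSERT INTO employee VALUES ('Ann', '1980-01-01'), ('Ben', '1975-05-05');
    INSERT INTO dependent VALUES ('Dee', '2010-02-02'), ('Ann', '1980-01-01');
""")

# UNION combines both result sets and removes duplicate tuples.
union_rows = conn.execute("""
    SELECT fname, bdate FROM employee
    UNION
    SELECT dependent_name, bdate FROM dependent
    ORDER BY 1
""").fetchall()

# Intersection without an INTERSECT operator: inner join on every attribute.
intersect_rows = conn.execute("""
    SELECT DISTINCT e.fname, e.bdate
    FROM employee AS e INNER JOIN dependent AS d
        ON e.fname = d.dependent_name AND e.bdate = d.bdate
""").fetchall()

print(union_rows)
# [('Ann', '1980-01-01'), ('Ben', '1975-05-05'), ('Dee', '2010-02-02')]
print(intersect_rows)  # [('Ann', '1980-01-01')]
```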
-
-

View File

@ -0,0 +1,52 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[First Java Code]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture03.pdf](../assets/Lecture03_1663063871202_0.pdf)
-
- # Composition & Inheritance
- ## Composition
- What is **Composition**? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-21T20:50:47.300Z
card-last-reviewed:: 2022-09-17T20:50:47.301Z
card-last-score:: 5
- **Composition** is a type of "has-a" relationship. One object is **composed** of another and relies upon its services for its own functionality.
- It is one of the fundamental relationships between classes in OOP.
- For example:
- The class `RacingBike` **has-a** `Wheel` - **Composition**.
- How do you represent **Composition** in OOP class diagrams? #card
card-last-interval:: 9.55
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-29T07:30:34.783Z
card-last-reviewed:: 2022-09-19T18:30:34.784Z
card-last-score:: 5
- In OOP class diagrams, a **diamond shape** indicates **composition** or a "has-a" relationship.
- ![image.png](../assets/image_1663271062397_0.png)
- This class diagram tells us that a `Vehicle` object is composed of a single `Engine` object.
- How do you realise **Composition** in Java? #card
card-last-interval:: 4
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-21T19:47:12.777Z
card-last-reviewed:: 2022-09-17T19:47:12.777Z
card-last-score:: 3
- To realise a "has-a" relationship in Java, you must ^^create a link between the **participant classes** using a **reference type variable**.^^
- The reference declaration is in the **owner** class.
-
- # Inheritance
- What is **Inheritance**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:31:39.621Z
card-last-score:: 1
- **Inheritance** is a type of "is-a" relationship.
- It is one of the fundamental relationships between classes in OOP.
- For example:
- A `RacingBike` **is-a** type of `Bicycle` - **Inheritance**.
-
-

View File

@ -0,0 +1,52 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[First Java Code]]
- **Next Topic:** null
- **Relevant Slides:** ![Lecture03.pdf](../assets/Lecture03_1663063871202_0.pdf)
-
- # Composition & Inheritance
- ## Composition
- What is **Composition**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-04T09:28:05.146Z
card-last-reviewed:: 2022-09-30T09:28:05.146Z
card-last-score:: 5
- **Composition** is a type of "has-a" relationship. One object is **composed** of another and relies upon its services for its own functionality.
- It is one of the fundamental relationships between classes in OOP.
- For example:
- The class `RacingBike` **has-a** `Wheel` - **Composition**.
- How do you represent **Composition** in OOP class diagrams? #card
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-15T07:08:01.227Z
card-last-reviewed:: 2022-10-04T12:08:01.227Z
card-last-score:: 5
- In OOP class diagrams, a **diamond shape** indicates **composition** or a "has-a" relationship.
- ![image.png](../assets/image_1663271062397_0.png)
- This class diagram tells us that a `Vehicle` object is composed of a single `Engine` object.
- How do you realise **Composition** in Java? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:28:00.640Z
card-last-score:: 1
- To realise a "has-a" relationship in Java, you must ^^create a link between the **participant classes** using a **reference type variable**.^^
- The reference declaration is in the **owner** class.
-
- # Inheritance
- What is **Inheritance**? #card
card-last-interval:: 2.22
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-10-02T14:27:50.641Z
card-last-reviewed:: 2022-09-30T09:27:50.642Z
card-last-score:: 3
- **Inheritance** is a type of "is-a" relationship.
- It is one of the fundamental relationships between classes in OOP.
- For example:
- A `RacingBike` **is-a** type of `Bicycle` - **Inheritance**.
-
-

View File

@ -0,0 +1,52 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[First Java Code]]
- **Next Topic:** [[Variables & Types]]
- **Relevant Slides:** ![Lecture03.pdf](../assets/Lecture03_1663063871202_0.pdf)
-
- # Composition & Inheritance
- ## Composition
- What is **Composition**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-17T15:22:03.790Z
card-last-reviewed:: 2022-10-07T10:22:03.790Z
card-last-score:: 3
- **Composition** is a type of "has-a" relationship. One object is **composed** of another and relies upon its services for its own functionality.
- It is one of the fundamental relationships between classes in OOP.
- For example:
- The class `RacingBike` **has-a** `Wheel` - **Composition**.
- How do you represent **Composition** in OOP class diagrams? #card
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-15T07:08:01.227Z
card-last-reviewed:: 2022-10-04T12:08:01.227Z
card-last-score:: 5
- In OOP class diagrams, a **diamond shape** indicates **composition** or a "has-a" relationship.
- ![image.png](../assets/image_1663271062397_0.png)
- This class diagram tells us that a `Vehicle` object is composed of a single `Engine` object.
- How do you realise **Composition** in Java? #card
card-last-interval:: 3.58
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-10T23:31:31.282Z
card-last-reviewed:: 2022-10-07T10:31:31.282Z
card-last-score:: 5
- To realise a "has-a" relationship in Java, you must ^^create a link between the **participant classes** using a **reference type variable**.^^
- The reference declaration is in the **owner** class.
-
- # Inheritance
- What is **Inheritance**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-15T14:18:15.221Z
card-last-reviewed:: 2022-10-06T17:18:15.222Z
card-last-score:: 3
- **Inheritance** is a type of "is-a" relationship.
- It is one of the fundamental relationships between classes in OOP.
- For example:
- A `RacingBike` **is-a** type of `Bicycle` - **Inheritance**.
-
-

View File

@ -0,0 +1,227 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Joins & Union Queries]]
- **Next Topic:** [[Query Processing: Relational Algebra]]
- **Relevant Slides:** ![normalisation_2022_part1.pdf](../assets/normalisation_2022_part1_1666177004532_0.pdf) ![normalisation_2022_part2.pdf](../assets/normalisation_2022_part2_1666776016494_0.pdf)
-
- # Normalisation
- What is **normalisation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:26:27.892Z
card-last-score:: 1
- **Normalisation** takes each table through a series of tests to "verify" whether or not it belongs to a certain **normal form**.
- Normal forms to check:
- 1^{st}, 2^{nd}, & 3^{rd} normal forms (**NF**).
- Boyce-Codd normal form - strong 3NF.
- 4^{th} & 5^{th} Normal Forms.
- ### Normalisation Provides:
- 1. A formal framework for analysing relation schemas based on **keys** & **functional dependencies** among attributes.
2. A series of **tests** so that a database can be normalised to any degree (e.g., from 1NF to 5NF).
- However, normalisation does not necessarily provide a good design if considered in isolation from everything else.
- What are **normalisation rules**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:34:44.367Z
card-last-score:: 1
- **Normalisation** rules give us a *formal measure* of why one grouping of attributes in a relation schema may be better than another.
- Why normalise? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:34:00.942Z
card-last-score:: 1
- 1. Redundancy will be reduced or eliminated, reducing storage space as a result.
2. The task of maintaining data integrity is made easier.
- However, with normalisation, tables are usually added to the schema and are linked with foreign keys, which causes queries to become more complex as they often require data from multiple tables (requiring joins or subqueries).
- ## Normalised & Un-Normalised Databases
- Both normalised & un-normalised databases have advantages & disadvantages.
- If a database is **normalised**:
- No (or very little) redundancy.
- No anomalies when inserting, deleting, or modifying data.
- More tables.
- More foreign & primary keys to link tables.
- More complex queries.
- ### Redundancy
- What is **redundancy**? #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-29T06:39:09.263Z
card-last-reviewed:: 2022-10-26T11:39:09.264Z
card-last-score:: 5
- **Redundancy** is the unnecessary duplication of data in a database.
- Consequences of redundancy:
- Space is wasted.
- Data can become inconsistent due to potential problems with update, insert, & delete operations.
- What is **duplication**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:39:40.006Z
card-last-score:: 1
- Duplicated data can naturally be present in a database and is not necessarily redundant.
- For example, an attribute can have two identical values.
- In the company database `ESSN` in `works_on` may be duplicated across many projects.
- Data is **duplicated** rather than **redundant** if information would be lost by deleting it.
-
- ## Alternatives to Normalisation #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-29T06:34:59.505Z
card-last-reviewed:: 2022-10-26T11:34:59.505Z
card-last-score:: 5
- The alternative to normalisation is to retain redundant data and maintain data integrity by means of code consistency checks.
- In some applications, the number of insertions may be very small or non-existent and in such cases, the overhead of normalised tables is generally not required.
- ## De-Normalisation
- What is **de-normalisation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:26:15.641Z
card-last-score:: 1
- **De-normalisation** is a process of making compromises to the normalised tables by ^^introducing intentional redundancy^^ for performance reasons (specifically, querying performance).
- Typically, de-normalisation will improve query times at the expense of data updates (insert, delete, update).
- # Functional Dependencies
- What is **Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:24:09.173Z
card-last-score:: 1
- If $A$ & $B$ are attributes of a relation $R$, then $B$ is **functionally dependent (FD)** on $A$ if each value of $A$ is associated with exactly one value of $B$.
- i.e., values in $B$ are uniquely determined by values of $A$.
- Functional Dependency is one of the main concepts associated with normalisation.
- It describes the ^^relationship between attributes.^^
- $A$ -> $B$:
- FD from $A$ to $B$.
- $B$ is FD on $A$.
- ![image.png](../assets/image_1666178440078_0.png)
- $A$ -> $B$ does not necessarily imply $B$ -> $A$.
- $A$ <-> $B$ denotes $A$ -> $B$ & $B$ -> $A$.
- $A$ -> $\{B,C\}$ denotes $A$ -> $B$ & $A$ -> $C$.
- $\{A,B\}$ -> $C$ denotes that it is the **combination** of $A$ & $B$ that uniquely determines $C$.
- ### Note on FDs
- A functional dependency is a property of a relation schema $R$ and cannot be inferred automatically. Instead, it must be defined explicitly by someone who knows the **semantics** of $R$.
- You will either be explicitly given all FDs, or given enough information about the attributes & the domain to *reasonably* infer the FDs (perhaps having to make assumptions).
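- Although an FD cannot be proven from data, sample tuples can *refute* one. The sketch below is a hypothetical helper (`fd_holds` and the sample `works_on` rows are assumptions, not from the slides) that reports whether any two tuples agree on the left-hand side but differ on the right-hand side.
- ```python
def fd_holds(rows, lhs, rhs):
    """Return False if the sample rows contradict lhs -> rhs, else True."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it,
        # so a later, different value for the same key breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Made-up sample tuples for illustration.
works_on = [
    {"essn": 1, "pno": 10, "hours": 20, "pname": "ProductX"},
    {"essn": 1, "pno": 20, "hours": 15, "pname": "ProductY"},
    {"essn": 2, "pno": 10, "hours": 5, "pname": "ProductX"},
]

pno_to_pname = fd_holds(works_on, ["pno"], ["pname"])
essn_to_hours = fd_holds(works_on, ["essn"], ["hours"])
key_to_hours = fd_holds(works_on, ["essn", "pno"], ["hours"])

print(pno_to_pname)   # True: each pno appears with a single pname
print(essn_to_hours)  # False: essn 1 appears with hours 20 and 15
print(key_to_hours)   # True: the combination determines hours
```
- A result of `True` only means that the sample does not contradict the FD; whether it really holds still depends on the semantics of the relation.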
- ## Types of FDs #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:27:22.772Z
card-last-score:: 1
- ### Full Functional Dependency
- What is a **Full Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-20T23:00:00.000Z
card-last-reviewed:: 2022-10-20T08:24:16.042Z
card-last-score:: 1
- A functional dependency $\{X, Y\}$ -> $Z$ is a **full functional dependency** if removing any attribute (either $X$ or $Y$) from the left-hand side means that the dependency ^^no longer holds.^^
- There may be any number of attributes on the LHS.
- ### Partial Functional Dependency
- What is a **Partial Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:40:00.928Z
card-last-score:: 1
- A functional dependency $\{X, Y\}$ -> $Z$ is a **partial functional dependency** if some attribute (either $X$ or $Y$) can be removed from the LHS and the dependency ^^still holds.^^
- There may be any number of attributes on the LHS.
- ### Transitive Functional Dependency
- What is a **Transitive Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:31:05.365Z
card-last-score:: 1
- A functional dependency $X$ -> $Y$ is a **transitive functional dependency** in the relation $R$ if there is a set of attributes $Z$ that is neither a candidate key nor a subset of any key of $R$, and both $X$ -> $Z$ & $Z$ -> $Y$ hold.
- What is a **Candidate Key (CK)**? #card
card-last-interval:: 2.8
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-10-29T06:42:33.029Z
card-last-reviewed:: 2022-10-26T11:42:33.029Z
card-last-score:: 5
- A **candidate key (CK)** is a minimal set of one or more attributes in a relation with which you can determine all the attributes in the relation.
- Every relation has one or more candidate keys.
- We pick one such candidate key as the primary key of a relation.
- # Inference Rules for FDs
- Typically, the main obvious functional dependencies $F$ are specified for a schema.
- However, many others can be inferred from $F$.
- We call these the **closure** of $F$: $F^+$.
- **1. Reflexive:** Trivially, an attribute, or a set of attributes, always determines itself.
- **2. Augmentation:** If $X$ -> $Y$, we can infer $XZ$ -> $YZ$.
- **3. Transitive:** If $X$ -> $Y$ & $Y$ -> $Z$, we can infer $X$ -> $Z$.
- **4. Decomposition:** If $X$ -> $YZ$, we can infer $X$ -> $Y$ & $X$ -> $Z$.
- **5. Union (additive):** If $X$ -> $Y$ and $X$ -> $Z$, we can infer $X$ -> $YZ$.
- **6. Pseudotransitive:** If $X$ -> $Y$ and $WY$ -> $Z$, we can infer $WX$ -> $Z$.
- Note: Rules 1, 2, & 3 are collectively called **Armstrong's Axioms**.
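- Listing every FD in $F^+$ by hand is rarely practical. A common shortcut (not taken from the slides) is to compute the closure of a set of attributes: all attributes that the set determines under $F$. The sketch below assumes FDs are stored as `(lhs, rhs)` pairs of Python sets; the function name `attribute_closure` is hypothetical.
- ```python
def attribute_closure(attributes, fds):
    """Return every attribute determined by `attributes` under the FDs."""
    closure = set(attributes)  # reflexive rule: a set determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is already determined, so is the RHS
            # (this repeatedly applies the transitive & union rules).
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


# F = { X -> Y, WY -> Z }, as in the pseudotransitive rule.
fds = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]

closure_x = attribute_closure({"X"}, fds)
closure_wx = attribute_closure({"W", "X"}, fds)

print(sorted(closure_x))   # ['X', 'Y']
print(sorted(closure_wx))  # ['W', 'X', 'Y', 'Z']
```
- Since $Z$ is in the closure of $WX$, the FD $WX$ -> $Z$ is in $F^+$, which agrees with the pseudotransitive rule.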
- # Normal Forms
- ## First Normal Form (1NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:32:19.711Z
card-last-score:: 1
- A table is in **1NF** if it does not have any repeating groups, i.e., groups of attributes that occur a variable number of times in each record (non-atomic values).
- To ensure first normal form, choose an appropriate primary key (if one is not already specified) and if required, split the table into two or more tables to remove repeating groups.
- ## Second Normal Form (2NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:42:43.105Z
card-last-score:: 1
- A relation in **2NF** must be in 1NF and be such that where there is a composite primary key, all non-key attributes must be dependent on the *entire* primary key.
- If partial dependencies exist, create new relations to split the attributes such that the partial dependency no longer holds.
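- Checking for 2NF violations can be sketched in code: for each proper subset of the composite primary key, find the non-key attributes that it determines on its own. The helper names, the `ename` & `pname` attributes, and the sample FDs below are assumptions for illustration.
- ```python
from itertools import combinations


def attribute_closure(attributes, fds):
    """Return every attribute determined by `attributes` under the FDs."""
    closure = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def partial_dependencies(primary_key, non_key_attributes, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    key = sorted(primary_key)
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(key, size):
            determined = attribute_closure(subset, fds) & set(non_key_attributes)
            if determined:
                found[subset] = determined
    return found


# Hypothetical relation with composite key {essn, pno}.
fds = [
    ({"essn", "pno"}, {"hours"}),  # full dependency on the whole key
    ({"pno"}, {"pname"}),          # partial: depends on pno alone
    ({"essn"}, {"ename"}),         # partial: depends on essn alone
]
partial = partial_dependencies({"essn", "pno"}, {"hours", "pname", "ename"}, fds)

print(partial)  # {('essn',): {'ename'}, ('pno',): {'pname'}}
```
- Each entry suggests a new relation when normalising to 2NF: here one keyed on `essn` holding `ename`, and one keyed on `pno` holding `pname`, leaving `hours` with the full key.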
- ## Third Normal Form (3NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:44:42.467Z
card-last-score:: 1
- A relation is in **3NF** if it is in 2NF and there are no dependencies between non-key attributes.
- That is, no transitive dependencies exist in the table.
- ### Steps to Normalise to 3NF #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:43:05.713Z
card-last-score:: 1
- 1. Identify an appropriate **Primary Key** if not already given.
- This puts the table into **1NF**.
- 2. Draw a diagram of **Functional Dependencies** from the primary key.
3. Identify if the dependencies are Full, Partial, or Transitive.
4. Using the diagram of the functional dependencies from the previous steps:
- 5. Normalise to **2NF** by ^^removing **partial dependencies**^^ - creating new tables as a result. <ins>Ensure that all new tables have Primary Keys.</ins>
6. Normalise to **3NF** by ^^removing **transitive dependencies**^^ (if they exist), creating new tables as a result. <ins>Ensure that any new tables have Primary Keys and are in 2NF</ins>.
7. Check that all resulting tables are themselves in 1NF, 2NF, and 3NF (in particular, make sure that they all have PKs of their own).
- ## Boyce-Codd Normal Form (BCNF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-10-26T23:00:00.000Z
card-last-reviewed:: 2022-10-26T11:40:29.231Z
card-last-score:: 1
- Only in rare cases does a 3NF table not meet the requirements of **BCNF**.
- These cases are when a table has more than one candidate key.
- Depending on the functional dependencies, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF.
- If a table in 3NF **does not** have multiple overlapping candidate keys, then it is guaranteed to be in **BCNF**.
-
-

View File

@ -0,0 +1,227 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Joins & Union Queries]]
- **Next Topic:** [[Query Processing: Relational Algebra]]
- **Relevant Slides:** ![normalisation_2022_part1.pdf](../assets/normalisation_2022_part1_1666177004532_0.pdf) ![normalisation_2022_part2.pdf](../assets/normalisation_2022_part2_1666776016494_0.pdf)
-
- # Normalisation
- What is **normalisation**? #card
card-last-interval:: 0.95
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-18T17:36:23.333Z
card-last-reviewed:: 2022-11-17T19:36:23.333Z
card-last-score:: 3
- **Normalisation** takes each table through a series of tests to "verify" whether or not it belongs to a certain **normal form**.
- Normal forms to check:
- 1^{st}, 2^{nd}, & 3^{rd} normal forms (**NF**).
- Boyce-Codd normal form - strong 3NF.
- 4^{th} & 5^{th} Normal Forms.
- ### Normalisation Provides:
- 1. A formal framework for analysing relation schemas based on **keys** & **functional dependencies** among attributes.
2. A series of **tests** so that a database can be normalised to any degree (e.g., from 1NF to 5NF).
- However, normalisation does not necessarily provide a good design if considered in isolation from everything else.
- What are **normalisation rules**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:16:28.568Z
card-last-score:: 1
- **Normalisation** rules give us a *formal measure* of why one grouping of attributes in a relation schema may be better than another.
- Why normalise? #card
card-last-interval:: 0.95
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-22T11:06:25.159Z
card-last-reviewed:: 2022-11-21T13:06:25.160Z
card-last-score:: 3
- 1. Redundancy will be reduced or eliminated, reducing storage space as a result.
2. The task of maintaining data integrity is made easier.
- However, with normalisation, tables are usually added to the schema and are linked with foreign keys, which causes queries to become more complex as they often require data from multiple tables (requiring joins or subqueries).
- ## Normalised & Un-Normalised Databases
- Both normalised & un-normalised databases have advantages & disadvantages.
- If a database is **normalised**:
- No (or very little) redundancy.
- No anomalies when inserting, deleting, or modifying data.
- More tables.
- More foreign & primary keys to link tables.
- More complex queries.
- ### Redundancy
- What is **redundancy**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-18T20:19:53.452Z
card-last-reviewed:: 2022-11-14T20:19:53.453Z
card-last-score:: 5
- **Redundancy** is the unnecessary duplication of data in a database.
- Consequences of redundancy:
- Space is wasted.
- Data can become inconsistent due to potential problems with update, insert, & delete operations.
- What is **duplication**? #card
card-last-interval:: 2.97
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T19:17:11.015Z
card-last-reviewed:: 2022-11-14T20:17:11.015Z
card-last-score:: 5
- Duplicated data can naturally be present in a database and is not necessarily redundant.
- For example, an attribute can have two identical values.
- In the company database `ESSN` in `works_on` may be duplicated across many projects.
- Data is **duplicated** rather than **redundant** if information would be lost by deleting it.
-
- ## Alternatives to Normalisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:19:50.503Z
card-last-score:: 1
- The alternative to normalisation is to retain redundant data and maintain data integrity by means of code consistency checks.
- In some applications, the number of insertions may be very small or non-existent and in such cases, the overhead of normalised tables is generally not required.
- ## De-Normalisation
- What is **de-normalisation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:49.533Z
card-last-score:: 1
- **De-normalisation** is a process of making compromises to the normalised tables by ^^introducing intentional redundancy^^ for performance reasons (specifically, querying performance).
- Typically, de-normalisation will improve query times at the expense of data updates (insert, delete, update).
- # Functional Dependencies
- What is **Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-22T00:00:00.000Z
card-last-reviewed:: 2022-11-21T13:06:48.090Z
card-last-score:: 1
- If $A$ & $B$ are attributes of a relation $R$, then $B$ is **functionally dependent (FD)** on $A$ if each value of $A$ is associated with exactly one value of $B$.
- i.e., values in $B$ are uniquely determined by values of $A$.
- Functional Dependency is one of the main concepts associated with normalisation.
- It describes the ^^relationship between attributes.^^
- $A$ -> $B$:
- FD from $A$ to $B$.
- $B$ is FD on $A$.
- ![image.png](../assets/image_1666178440078_0.png)
- $A$ -> $B$ does not necessarily imply $B$ -> $A$.
- $A$ <-> $B$ denotes $A$ -> $B$ & $B$ -> $A$.
- $A$ -> $\{B,C\}$ denotes $A$ -> $B$ & $A$ -> $C$.
- $\{A,B\}$ -> $C$ denotes that it is the **combination** of $A$ & $B$ that uniquely determines $C$.
- ### Note on FDs
- A functional dependency is a property of a relation schema $R$ and cannot be inferred automatically. Instead, it must be defined explicitly by someone who knows the **semantics** of $R$.
- You will either be explicitly given all FDs, or given enough information about the attributes & the domain to *reasonably* infer the FDs (perhaps having to make assumptions).
- ## Types of FDs #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:19:30.192Z
card-last-score:: 1
- ### Full Functional Dependency
- What is a **Full Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:15:37.929Z
card-last-score:: 1
- A functional dependency $\{X, Y\}$ -> $Z$ is a **full functional dependency** if removing any attribute (either $X$ or $Y$) from the left-hand side means that the dependency ^^no longer holds.^^
- There may be any number of attributes on the LHS.
- ### Partial Functional Dependency
- What is a **Partial Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:15:25.672Z
card-last-score:: 1
- A functional dependency $\{X, Y\}$ -> $Z$ is a **partial functional dependency** if some attribute (either $X$ or $Y$) can be removed from the LHS and the dependency ^^still holds.^^
- There may be any number of attributes on the LHS.
- ### Transitive Functional Dependency
- What is a **Transitive Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:16:18.865Z
card-last-score:: 1
- A functional dependency $X$ -> $Y$ is a **transitive functional dependency** in the relation $R$ if there is a set of attributes $Z$ that is neither a candidate key nor a subset of any key of $R$, and both $X$ -> $Z$ & $Z$ -> $Y$ hold.
- What is a **Candidate Key (CK)**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-18T06:20:02.759Z
card-last-reviewed:: 2022-11-14T20:20:02.759Z
card-last-score:: 3
- A **candidate key (CK)** is one or more attribute(s) in a relation with which you can determine all the attributes in the relation.
- Every relation has one or more candidate keys.
- We pick one such candidate key as the primary key of a relation.
- # Inference Rules for FDs
- Typically, the main obvious functional dependencies $F$ are specified for a schema.
- However, many others can be inferred from $F$.
- We call these the **closure** of $F$: $F^+$.
- **1. Reflexive:** Trivially, an attribute, or a set of attributes, always determines itself.
- **2. Augmentation:** If $X$ -> $Y$, we can infer $XZ$ -> $YZ$.
- **3. Transitive:** If $X$ -> $Y$ & $Y$ -> $Z$, we can infer $X$ -> $Z$.
- **4. Decomposition:** If $X$ -> $YZ$, we can infer $X$ -> $Y$ and $X$ -> $Z$.
- **5. Union (additive):** If $X$ -> $Y$ and $X$ -> $Z$, we can infer $X$ -> $YZ$.
- **6. Pseudotransitive:** If $X$ -> $Y$ and $WY$ -> $Z$, we can infer $WX$ -> $Z$.
- Note: Rules 1, 2, & 3 are collectively called **Armstrong's Axioms**.
- # Normal Forms
- ## First Normal Form (1NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:12:58.023Z
card-last-score:: 1
- A table is in **1NF** if the table ==does not have any repeating groups==, i.e. groups of attributes that occur a variable number of times in each record (non-atomic).
- To ensure first normal form, choose an appropriate primary key (if one is not already specified) and if required, split the table into two or more tables to remove repeating groups.
- ## Second Normal Form (2NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:15:09.806Z
card-last-score:: 1
- A relation in **2NF** must be in 1NF and be such that where there is a composite primary key, all non-key attributes must be dependent on the *entire* primary key.
- If partial dependencies exist, create new relations to split the attributes such that the partial dependency no longer holds.
- ## Third Normal Form (3NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:15:39.277Z
card-last-score:: 1
- A relation is in **3NF** if it is in 2NF and there are no dependencies between non-key attributes (attributes that are not part of the primary key).
- That is, no transitive dependencies exist in the table.
- ### Steps to Normalise to 3NF #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:17:36.919Z
card-last-score:: 1
- 1. Identify an appropriate **Primary Key** if not already given.
- This puts the table into **1NF**.
- 2. Draw a diagram of **Functional Dependencies** from the primary key.
3. Identify if the dependencies are Full, Partial, or Transitive.
4. Using the diagram of the functional dependencies from the previous steps:
- 5. Normalise to **2NF** by ^^removing **partial dependencies**^^ - creating new tables as a result. <ins>Ensure that all new tables have Primary Keys.</ins>
6. Normalise to **3NF** by ^^removing **transitive dependencies**^^ (if they exist), creating new tables as a result. <ins>Ensure that any new tables have Primary Keys and are in 2NF</ins>.
7. Check that all resulting tables are themselves in 1NF, 2NF, and 3NF (in particular, make sure that they all have PKs of their own).
- ## Boyce-Codd Normal Form (BCNF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:17:25.144Z
card-last-score:: 1
- Only in rare cases does a 3NF table not meet the requirements of **BCNF**.
- These cases are when a table has more than one candidate key.
- Depending on the functional dependencies, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF.
- If a table in 3NF **does not** have multiple overlapping candidate keys, then it is guaranteed to be in **BCNF**.
-
-

@ -0,0 +1,227 @@
- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[Joins & Union Queries]]
- **Next Topic:** [[Query Processing: Relational Algebra]]
- **Relevant Slides:** ![normalisation_2022_part1.pdf](../assets/normalisation_2022_part1_1666177004532_0.pdf) ![normalisation_2022_part2.pdf](../assets/normalisation_2022_part2_1666776016494_0.pdf)
-
- # Normalisation
- What is **normalisation**? #card
card-last-interval:: 0.95
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-18T17:36:23.333Z
card-last-reviewed:: 2022-11-17T19:36:23.333Z
card-last-score:: 3
- **Normalisation** takes each table through a series of tests to "verify" whether or not it belongs to a certain **normal form**.
- Normal forms to check:
- 1^{st}, 2^{nd}, & 3^{rd} normal forms (**NF**).
- Boyce-Codd normal form - strong 3NF.
- 4^{th} & 5^{th} Normal Forms.
- ### Normalisation Provides:
- 1. A formal framework for analysing relation schemas based on **keys** & **functional dependencies** among attributes.
2. A series of **tests** so that a database can be normalised to any degree (e.g., from 1NF to 5NF).
- However, normalisation does not necessarily provide a good design if considered in isolation from everything else.
- What are **normalisation rules**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:16:28.568Z
card-last-score:: 1
- **Normalisation** rules give us a *formal measure* of why one grouping of attributes in a relation schema may be better than another.
- Why normalise? #card
card-last-interval:: 0.95
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-22T11:06:25.159Z
card-last-reviewed:: 2022-11-21T13:06:25.160Z
card-last-score:: 3
- 1. Redundancy will be reduced or eliminated, reducing storage space as a result.
2. The task of maintaining data integrity is made easier.
- However, with normalisation, tables are usually added to the schema and are linked with foreign keys, which causes queries to become more complex, as they often require data from multiple tables (requiring joins or subqueries).
- ## Normalised & Un-Normalised Databases
- Both normalised & un-normalised databases have advantages & disadvantages.
- If a database is **normalised**:
- No (or very little) redundancy.
- No anomalies when inserting, deleting, or modifying data.
- More tables.
- More foreign & primary keys to link tables.
- More complex queries.
- ### Redundancy
- What is **redundancy**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-11-18T20:19:53.452Z
card-last-reviewed:: 2022-11-14T20:19:53.453Z
card-last-score:: 5
- **Redundancy** is the unnecessary duplication of data in a database.
- Consequences of redundancy:
- Space is wasted.
- Data can become inconsistent due to potential problems with update, insert, & delete operations.
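- A minimal Python sketch of how redundancy can lead to inconsistency (an update anomaly). The `employees`, `departments`, & `staff` data are hypothetical and are not taken from the slides.
```python
# Hypothetical un-normalised rows: the department name is repeated on every employee row.
employees = [
    {"emp": "Ann", "dept_no": 5, "dept_name": "Research"},
    {"emp": "Bob", "dept_no": 5, "dept_name": "Research"},
]

# Update anomaly: renaming the department on only one row leaves the data inconsistent.
employees[0]["dept_name"] = "R&D"
names = {row["dept_name"] for row in employees if row["dept_no"] == 5}
print(names)  # department 5 now has two different names

# Normalised alternative: the name is stored once and referenced by dept_no.
departments = {5: "Research"}
staff = [{"emp": "Ann", "dept_no": 5}, {"emp": "Bob", "dept_no": 5}]
departments[5] = "R&D"  # one update covers every employee in the department
current = {departments[row["dept_no"]] for row in staff}
print(current)
```
- In the second layout, the department name lives in exactly one place, so a rename cannot leave stale copies behind.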
- What is **duplication**? #card
card-last-interval:: 2.97
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-17T19:17:11.015Z
card-last-reviewed:: 2022-11-14T20:17:11.015Z
card-last-score:: 5
- Duplicated data can naturally be present in a database and is not necessarily redundant.
- For example, an attribute can have two identical values.
- In the company database, `ESSN` in `works_on` may be duplicated across many projects.
- Data is **duplicated** rather than **redundant** if information would be lost by deleting it.
-
- ## Alternatives to Normalisation #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:19:50.503Z
card-last-score:: 1
- The alternative to normalisation is to retain redundant data and maintain data integrity by means of code consistency checks.
- In some applications, the number of insertions may be very small or non-existent and in such cases, the overhead of normalised tables is generally not required.
- ## De-Normalisation
- What is **de-normalisation**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:34:49.533Z
card-last-score:: 1
- **De-normalisation** is a process of making compromises to the normalised tables by ^^introducing intentional redundancy^^ for performance reasons (specifically, querying performance).
- Typically, de-normalisation will improve query times at the expense of data updates (insert, delete, update).
- # Functional Dependencies
- What is **Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-22T00:00:00.000Z
card-last-reviewed:: 2022-11-21T13:06:48.090Z
card-last-score:: 1
- If $A$ & $B$ are attributes of a relation $R$, then $B$ is **functionally dependent (FD)** on $A$ if each value of $A$ is associated with exactly one value of $B$.
- i.e., values in $B$ are uniquely determined by values of $A$.
- Functional Dependency is one of the main concepts associated with normalisation.
- It describes the ^^relationship between attributes.^^
- $A$ -> $B$:
- FD from $A$ to $B$.
- $B$ is FD on $A$.
- ![image.png](../assets/image_1666178440078_0.png)
- $A$ -> $B$ does not necessarily imply $B$ -> $A$.
- $A$ <-> $B$ denotes $A$ -> $B$ & $B$ -> $A$.
- $A$ -> $\{B,C\}$ denotes $A$ -> $B$ & $A$ -> $C$.
- $\{A,B\}$ -> $C$ denotes that it is the **combination** of $A$ & $B$ that uniquely determines $C$.
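- A minimal Python sketch of the definition above: it tests whether $A$ -> $B$ holds in a given set of rows. The function `fd_holds` and the `students` data are hypothetical names for illustration only.
```python
def fd_holds(rows, lhs, rhs):
    """Return True if each combination of lhs values appears with only one combination of rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: each student has one name, but may take many modules.
students = [
    {"student_id": 1, "name": "Ann", "module": "CT230"},
    {"student_id": 1, "name": "Ann", "module": "CT213"},
    {"student_id": 2, "name": "Bob", "module": "CT230"},
]

print(fd_holds(students, ["student_id"], ["name"]))    # True
print(fd_holds(students, ["student_id"], ["module"]))  # False
```
- A check like this can only *refute* an FD: sample rows may satisfy a dependency by coincidence, so whether it really holds still depends on the semantics of the relation.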
- ### Note on FDs
- A functional dependency is a property of a relation schema $R$ and cannot be inferred automatically. Instead, it must be defined explicitly by someone who knows the **semantics** of $R$.
- You will either be explicitly given all FDs, or given enough information about the attributes & the domain to *reasonably* infer the FDs (perhaps having to make assumptions).
- ## Types of FDs #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:19:30.192Z
card-last-score:: 1
- ### Full Functional Dependency
- What is a **Full Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-18T00:00:00.000Z
card-last-reviewed:: 2022-11-17T20:15:37.929Z
card-last-score:: 1
- A functional dependency $\{X, Y\}$ -> $Z$ is a **full functional dependency** if removing any attribute (either $X$ or $Y$) from the left-hand side (LHS) means that the dependency ^^no longer holds.^^
- There may be any number of attributes on the LHS.
- ### Partial Functional Dependency
- What is a **Partial Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:15:25.672Z
card-last-score:: 1
- A functional dependency $\{X, Y\}$ -> $Z$ is a **partial functional dependency** if some attribute (either $X$ or $Y$) can be removed from the LHS and the dependency ^^still holds.^^
- There may be any number of attributes on the LHS.
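- A rough Python sketch of the difference between full & partial dependencies: it removes attributes from the LHS and checks whether the dependency still holds in some sample rows. The names `holds` & `classify` and the `works_on` rows are hypothetical.
```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """True if each combination of lhs values appears with only one combination of rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def classify(rows, lhs, rhs):
    """Label lhs -> rhs as 'full', 'partial', or 'not an FD' for the given rows."""
    if not holds(rows, lhs, rhs):
        return "not an FD"
    # Try every non-empty proper subset of the LHS; if one still determines rhs, the FD is partial.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, list(subset), rhs):
                return "partial"
    return "full"


# Hypothetical rows: hours depend on employee AND project,
# whereas the employee's name depends on the employee alone.
works_on = [
    {"essn": "E1", "pno": 1, "hours": 10, "ename": "Ann"},
    {"essn": "E1", "pno": 2, "hours": 20, "ename": "Ann"},
    {"essn": "E2", "pno": 1, "hours": 5, "ename": "Bob"},
]

print(classify(works_on, ["essn", "pno"], ["hours"]))  # full
print(classify(works_on, ["essn", "pno"], ["ename"]))  # partial
```
- As with any data-based check, the result only reflects the sample rows; the real classification comes from the meaning of the attributes.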
- ### Transitive Functional Dependency
- What is a **Transitive Functional Dependency**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:16:18.865Z
card-last-score:: 1
- A functional dependency $X$ -> $Y$ is a **transitive functional dependency** in the relation $R$ if there is a set of attributes $Z$ that is neither a candidate key nor a subset of any key of $R$, and both $X$ -> $Z$ & $Z$ -> $Y$ hold.
- What is a **Candidate Key (CK)**? #card
card-last-interval:: 3.45
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-11-18T06:20:02.759Z
card-last-reviewed:: 2022-11-14T20:20:02.759Z
card-last-score:: 3
- A **candidate key (CK)** is one or more attribute(s) in a relation with which you can determine all the attributes in the relation.
- Every relation has one or more candidate keys.
- We pick one such candidate key as the primary key of a relation.
- # Inference Rules for FDs
- Typically, the main obvious functional dependencies $F$ are specified for a schema.
- However, many others can be inferred from $F$.
- We call these the **closure** of $F$: $F^+$.
- **1. Reflexive:** Trivially, an attribute, or a set of attributes, always determines itself.
- **2. Augmentation:** If $X$ -> $Y$, we can infer $XZ$ -> $YZ$.
- **3. Transitive:** If $X$ -> $Y$ & $Y$ -> $Z$, we can infer $X$ -> $Z$.
- **4. Decomposition:** If $X$ -> $YZ$, we can infer $X$ -> $Y$ and $X$ -> $Z$.
- **5. Union (additive):** If $X$ -> $Y$ and $X$ -> $Z$, we can infer $X$ -> $YZ$.
- **6. Pseudotransitive:** If $X$ -> $Y$ and $WY$ -> $Z$, we can infer $WX$ -> $Z$.
- Note: Rules 1, 2, & 3 are collectively called **Armstrong's Axioms**.
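- Listing all of $F^+$ by hand quickly gets unwieldy. A common shortcut (not covered in these notes, so treat it as an aside) is to compute the *attribute closure* of a set $X$: every attribute that $X$ determines under $F$. Then $X$ -> $Y$ can be inferred exactly when $Y$ lies inside that closure. A minimal Python sketch, where the function `closure` and the FDs over $R(A, B, C, D)$ are hypothetical:
```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If every attribute on the LHS is already known, the RHS follows as well.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical FDs over R(A, B, C, D): A -> B, B -> C, and CD -> A.
# Each attribute is a single letter, so a string stands for a set of attributes.
fds = [("A", "B"), ("B", "C"), ("CD", "A")]

print(sorted(closure("A", fds)))   # ['A', 'B', 'C']
print(sorted(closure("AD", fds)))  # ['A', 'B', 'C', 'D']
```
- Here $A$ -> $C$ follows by the transitive rule, and because $\{A, D\}$ determines every attribute while neither $A$ nor $D$ does so alone, $\{A, D\}$ is a candidate key of this hypothetical relation.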
- # Normal Forms
- ## First Normal Form (1NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:12:58.023Z
card-last-score:: 1
- A table is in **1NF** if the table ==does not have any repeating groups==, i.e. groups of attributes that occur a variable number of times in each record (non-atomic).
- To ensure first normal form, choose an appropriate primary key (if one is not already specified) and if required, split the table into two or more tables to remove repeating groups.
- ## Second Normal Form (2NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:15:09.806Z
card-last-score:: 1
- A relation in **2NF** must be in 1NF and be such that where there is a composite primary key, all non-key attributes must be dependent on the *entire* primary key.
- If partial dependencies exist, create new relations to split the attributes such that the partial dependency no longer holds.
- ## Third Normal Form (3NF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:15:39.277Z
card-last-score:: 1
- A relation is in **3NF** if it is in 2NF and there are no dependencies between non-key attributes (attributes that are not part of the primary key).
- That is, no transitive dependencies exist in the table.
- ### Steps to Normalise to 3NF #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:17:36.919Z
card-last-score:: 1
- 1. Identify an appropriate **Primary Key** if not already given.
- This puts the table into **1NF**.
- 2. Draw a diagram of **Functional Dependencies** from the primary key.
3. Identify if the dependencies are Full, Partial, or Transitive.
4. Using the diagram of the functional dependencies from the previous steps:
- 5. Normalise to **2NF** by ^^removing **partial dependencies**^^ - creating new tables as a result. <ins>Ensure that all new tables have Primary Keys.</ins>
6. Normalise to **3NF** by ^^removing **transitive dependencies**^^ (if they exist), creating new tables as a result. <ins>Ensure that any new tables have Primary Keys and are in 2NF</ins>.
7. Check that all resulting tables are themselves in 1NF, 2NF, and 3NF (in particular, make sure that they all have PKs of their own).
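- Step 3 can be roughed out in Python by comparing the determinant (LHS) of each FD with the primary key. This is a simplified sketch, not a full normalisation algorithm; `classify_fd` and the order-line schema are hypothetical.
```python
def classify_fd(primary_key, determinant):
    """Classify an FD by comparing its determinant (LHS) with the primary key."""
    pk, lhs = set(primary_key), set(determinant)
    if lhs == pk:
        return "full"        # depends on the whole key: stays in the original table
    if lhs < pk:
        return "partial"     # depends on part of the key: move out to reach 2NF
    if not lhs & pk:
        return "transitive"  # depends on a non-key attribute: move out to reach 3NF
    return "other"           # mixed determinant: needs a closer look by hand


# Hypothetical order-line table with composite primary key (order_id, product_id).
pk = ["order_id", "product_id"]
fds = [
    (["order_id", "product_id"], ["quantity"]),
    (["product_id"], ["product_name", "supplier_id"]),
    (["supplier_id"], ["supplier_name"]),
]

for lhs, rhs in fds:
    print(lhs, "->", rhs, ":", classify_fd(pk, lhs))
```
- Under these assumptions, steps 5 & 6 would give three tables: order lines keyed on `(order_id, product_id)`, products keyed on `product_id`, and suppliers keyed on `supplier_id`.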
- ## Boyce-Codd Normal Form (BCNF) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:17:25.144Z
card-last-score:: 1
- Only in rare cases does a 3NF table not meet the requirements of **BCNF**.
- These cases are when a table has more than one candidate key.
- Depending on the functional dependencies, a 3NF table with two or more overlapping candidate keys may or may not be in BCNF.
- If a table in 3NF **does not** have multiple overlapping candidate keys, then it is guaranteed to be in **BCNF**.
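- The usual textbook condition for BCNF, which these notes do not spell out, is that the determinant of every non-trivial FD must contain a candidate key. A minimal Python sketch of that test, using a hypothetical `Tutoring(student, course, tutor)` relation and a hypothetical function `bcnf_violations`:
```python
def bcnf_violations(fds, candidate_keys):
    """Return the non-trivial FDs whose determinant does not contain any candidate key."""
    keys = [set(k) for k in candidate_keys]
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = any(key <= set(lhs) for key in keys)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation Tutoring(student, course, tutor):
#   {student, course} -> tutor   (a student has one tutor per course)
#   {tutor} -> course            (each tutor teaches only one course)
# Its candidate keys {student, course} and {student, tutor} overlap on student.
fds = [
    (["student", "course"], ["tutor"]),
    (["tutor"], ["course"]),
]
candidate_keys = [["student", "course"], ["student", "tutor"]]

print(bcnf_violations(fds, candidate_keys))  # [(['tutor'], ['course'])]
```
- This hypothetical table is in 3NF because `course` is part of a candidate key, yet `tutor` -> `course` breaks BCNF: exactly the overlapping-candidate-key situation described above.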
-
-

@ -0,0 +1,35 @@
- #[[CT2106 - Object-Oriented Programming]]
- **Previous Topic:** [[Variables & Types]]
- **Next Topic:** [[Introduction to Inheritance]]
- **Relevant Slides:** ![Lecture-6__2022.pdf](../assets/Lecture-6_2022_1663835887381_0.pdf) ![Lecture-7__2022.pdf](../assets/Lecture-7_2022_1664439118886_0.pdf) ![Lecture-8__2022.pdf](../assets/Lecture-8_2022_1664528150319_0.pdf)
-
- # Modelling the Problem
- A major part of OOP is modelling the problem.
- The goal is to identify the **principal objects** in the problem domain, which we model as classes, the **responsibility** of each of these objects, and the **collaborations** between objects.
- The objective of OOP Modelling is to produce a simplified **class diagram**.
- **Classes** represent real-world entities.
- **Associations** represent collaborations between the entities.
- **Attributes** represent the data held about these entities.
- **Generalisation** can be used to simplify the structure of the model.
- What are **nouns** in OOP? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-20T02:53:27.410Z
card-last-reviewed:: 2022-10-08T22:53:27.411Z
card-last-score:: 5
- **Nouns** are candidate objects in OOP.
-
- # OOP Principles
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T10:41:16.772Z
card-last-reviewed:: 2022-10-07T10:41:16.772Z
card-last-score:: 5
- Consider the following principles when assigning responsibilities:
- An **Object** is responsible for its own data.
- An Object is responsible for communicating its state.
- **Single Responsibility Principle:** Each **Class** should have a ^^single responsibility.^^
- All its services should be aligned with that responsibility.
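- A small Python sketch of these principles; the same idea carries over to any class-based language. The `BankAccount` class is a hypothetical example, not one from the lectures.
```python
class BankAccount:
    """Single responsibility: keep track of one account's balance."""

    def __init__(self, owner, balance=0):
        self._owner = owner      # leading underscore: internal data, not for outside use
        self._balance = balance

    def deposit(self, amount):
        # The object guards its own data instead of letting callers edit it directly.
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def describe(self):
        # The object communicates its own state.
        return f"{self._owner}: {self._balance}"


account = BankAccount("Ann", 100)
account.deposit(50)
print(account.describe())  # Ann: 150
```
- Anything unrelated to the balance (e.g., printing statements or sending emails) would belong in a separate class, keeping every service of `BankAccount` aligned with its one responsibility.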
-

@ -0,0 +1,541 @@
- #[[CT213 - Computer Systems & Organisation]]
- No previous topic.
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662828507609_0.pdf)
-
- ## Traditional Classes of Computer Systems
- What is a **Personal Computer (PC)**?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-09-23T18:28:00.836Z
card-last-interval:: 4
card-ease-factor:: 2.7
card-last-reviewed:: 2022-09-19T18:28:00.836Z
- A **Personal Computer** is a computer designed for use by an individual, usually incorporating a graphics display, a keyboard, and a mouse.
- What is a **Server**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:51:19.171Z
card-last-reviewed:: 2022-09-18T14:51:19.172Z
card-last-score:: 3
- A **server** is a computer used for running larger programs for multiple users, often simultaneously, and typically accessed only via a network.
- What is a **Supercomputer**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:20:18.972Z
card-last-reviewed:: 2022-09-18T15:20:18.973Z
card-last-score:: 5
- A **supercomputer** is a member of a class of computers with the highest performance (and cost). They are configured as servers and typically cost tens to hundreds of millions of dollars.
- What is an **Embedded Computer**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:21.268Z
card-last-reviewed:: 2022-09-18T15:17:21.269Z
card-last-score:: 5
- An **embedded computer** is a computer inside another device, used for running one predetermined application or collection of software.
- What are **Personal Mobile Devices**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:24.814Z
card-last-reviewed:: 2022-09-18T15:17:24.814Z
card-last-score:: 5
- **Personal Mobile Devices** are small, wireless devices that connect to the internet.
- They rely on batteries for power, and software is installed by downloading apps.
- Conventional examples include smartphones and tablets.
- What is **Cloud Computing**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:02.669Z
card-last-reviewed:: 2022-09-18T15:16:02.669Z
card-last-score:: 5
- **Cloud Computing** refers to large collections of servers that provide services over the internet.
- Some providers rent a dynamically varying number of servers as a utility.
- What is **Software as a Service**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T17:42:30.362Z
card-last-reviewed:: 2022-09-19T17:42:30.363Z
card-last-score:: 5
- **Software as a Service** delivers software & data as a service over the internet, usually via a thin program, such as a browser.
- Examples include web search & email.
-
- ## Computer Systems
collapsed:: true
- ![image.png](../assets/image_1662829382080_0.png){:height 410, :width 414}
- What is **Application Software**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:30.513Z
card-last-reviewed:: 2022-09-18T15:17:30.513Z
card-last-score:: 5
- **Application Software** consists of user-installed applications & programs.
- Application Software provides services to the user that are commonly useful.
- What is the purpose of the **Operating System**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:09.752Z
card-last-reviewed:: 2022-09-18T15:16:09.753Z
card-last-score:: 5
- The **Operating System** interfaces between a user's program and the hardware, provides a variety of services, and performs supervisory functions.
- What is the purpose of the **Hardware**?
- The **Hardware** performs the tasks.
-
- ## Seven Great Ideas in Computer Organisation
- ### 1. Use **Abstraction** to Simplify Design #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:25:18.076Z
card-last-reviewed:: 2022-09-19T18:25:18.076Z
card-last-score:: 5
- A major productivity technique for hardware & software is to use **abstractions** to characterise the design at different levels of representation.
- Lower-level details are hidden to offer a simpler model at higher levels.
- ### 2. Make the **Common Case** Fast #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:44.377Z
card-last-reviewed:: 2022-09-18T15:16:44.378Z
card-last-score:: 5
- Making the **common case fast** will tend to enhance performance better than optimising the rare case.
- The common case is often simpler than the rare case, and hence is usually easier to enhance.
- ### 3. Performance via **Parallelism** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:03:58.567Z
card-last-reviewed:: 2022-09-18T15:03:58.567Z
card-last-score:: 5
- Involves speeding up performance by using designs that compute operations in **parallel**.
- ### 4. Performance via **Pipelining** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-21T20:52:29.657Z
card-last-reviewed:: 2022-09-17T20:52:29.657Z
card-last-score:: 5
- **Performance via Pipelining** is a particular pattern of **parallelism** that is so prevalent in computer architecture that it merits its own name.
- ### 5. Performance via **Prediction** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:23:23.782Z
card-last-reviewed:: 2022-09-19T18:23:23.782Z
card-last-score:: 5
- In some cases, it can be ^^faster on average to guess and start working^^ than to wait until you know for sure (assuming that the mechanism to recover from a misprediction is not too expensive, and your prediction is relatively accurate).
- ### 6. Hierarchy of Memories #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:18:36.340Z
card-last-reviewed:: 2022-09-18T15:18:36.341Z
card-last-score:: 5
- Computer Architects have found that they can address conflicting demands with a **hierarchy of memories**.
- The ^^fastest, smallest, & most expensive memory per bit^^ is at the top of the hierarchy.
- The ^^slowest, largest, & cheapest per bit^^ is at the bottom of the hierarchy.
- ### 7. Dependability via **Redundancy** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:19:01.340Z
card-last-reviewed:: 2022-09-18T15:19:01.341Z
card-last-score:: 5
- Since any physical device can fail, we make systems **dependable** by including ^^redundant components^^ that can take over when a failure occurs *and* help detect failures.
-
- ## Hardware Organisation
- What does basic computer organisation look like? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:27:44.250Z
card-last-reviewed:: 2022-09-19T18:27:44.250Z
card-last-score:: 5
- ![image.png](../assets/image_1662830400492_0.png)
- What is an **integrated circuit**? #card
collapsed:: true
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-30T12:47:21.597Z
card-last-reviewed:: 2022-09-19T17:47:21.597Z
card-last-score:: 5
- An **integrated circuit**, also called a **chip**, is a device combining dozens to millions of transistors.
- ### The CPU
collapsed:: true
- What is a **CPU**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:22:22.644Z
card-last-reviewed:: 2022-09-18T15:22:22.644Z
card-last-score:: 5
- The **Central Processing Unit (CPU)**, also called the **processor**, is the ^^active part of the computer^^, which contains the datapath & control, and which adds numbers, tests numbers, signals I/O devices to activate, and so on.
- The CPU is ^^responsible for executing programs.^^
- What are the steps that the CPU takes to process programs? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T14:58:32.353Z
card-last-reviewed:: 2022-09-18T14:58:32.353Z
card-last-score:: 3
- **1. Fetch:** Retrieve an instruction from ^^program memory.^^
- **2. Decode:** Break down the instruction into parts that have significance to specific sections of the CPU.
- **3. Execute:** Various portions of the CPU are connected to perform the desired operation.
- **4. Write Back:** Simply "writes back" the results of the execute step ^^if necessary.^^
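- The four steps above can be sketched as a toy interpreter loop in Python. The instruction format (`LOAD`, `ADD`) and the registers `R0` & `R1` are made up for illustration and do not correspond to any real instruction set.
```python
def run(program):
    """Run a toy program through a fetch, decode, execute, write-back loop."""
    registers = {"R0": 0, "R1": 0}
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        instruction = program[pc]                    # 1. fetch from program memory
        pc += 1
        opcode, dest, operand = instruction.split()  # 2. decode into its parts
        if opcode == "LOAD":                         # 3. execute
            result = int(operand)
        elif opcode == "ADD":
            result = registers[dest] + registers[operand]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        registers[dest] = result                     # 4. write back
    return registers


# Hypothetical three-instruction program.
program = ["LOAD R0 2", "LOAD R1 3", "ADD R0 R1"]
print(run(program))  # {'R0': 5, 'R1': 3}
```
- A real CPU does this in hardware and typically overlaps the steps of several instructions; the loop only shows the order of the steps.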
- ### CPU Organisation
- What does the **organisation** of the CPU look like? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T15:13:04.465Z
card-last-reviewed:: 2022-09-18T15:13:04.465Z
card-last-score:: 3
- Processors are made up of:
- A **Control Unit**
- **Execution Unit(s)**
- A **Register File**
- ![image.png](../assets/image_1662830846361_0.png){:height 339, :width 418}
-
- #### Control Unit
- What does the **Control Unit** do? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:27:08.868Z
card-last-reviewed:: 2022-09-19T18:27:08.869Z
card-last-score:: 5
- The **Control Unit** ^^controls the execution^^ of the instructions stored in main memory.
- It ^^retrieves & executes^^ them.
- What is the architecture of the control unit? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T20:02:39.216Z
card-last-reviewed:: 2022-09-14T20:02:39.217Z
card-last-score:: 3
- The control unit contains a **fetch unit**, a **decode unit**, and an **execute unit**.
- It also contains two special registers:
- **Program Counter (PC):** keeps the address of the next instruction
- **Instruction Register (IR):** keeps the instruction being executed
- ![image.png](../assets/image_1662837864357_0.png){:height 266, :width 550}
-
- ### The Memory Subsystem
collapsed:: true
- How is the memory divided into storage locations? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:17:42.430Z
card-last-reviewed:: 2022-09-18T15:17:42.430Z
card-last-score:: 5
- Memory is divided into a set of storage locations which can hold data.
- Locations are numbered.
- Addresses are used to tell the memory which location the processor wants to access.
- What are the two hierarchies of memory? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:29:01.554Z
card-last-score:: 1
- **1. Nonvolatile / ROM (Read Only Memory):** Read only memory.
- Used to store the BIOS and / or a *bootstrap* or *bootloader* program.
- **2. Volatile / RAM (Random Access Memory):** Read / Write memory.
- Also called **Primary Memory**.
- Used to hold the programs, operating system, and data required by the computer.
- #### Primary Memory
- How is primary memory connected to the CPU? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:06:00.776Z
card-last-reviewed:: 2022-09-19T18:06:00.777Z
card-last-score:: 5
- **Primary Memory** is directly connected to the Central Processing Unit of the computer.
- It must be present for the CPU to function correctly.
- What are the three types of Primary Storage? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T15:10:32.299Z
card-last-score:: 1
- **1. Processor Registers:**
- Contain information that the CPU needs to carry out the current instruction.
- **2. Cache Memory:**
- A special type of internal memory used by many CPUs to increase their **throughput**.
- **3. Main Memory:**
- Contains the programs that are currently being run and the data that the programs are operating on.
- What is the **address width**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:21:24.382Z
card-last-reviewed:: 2022-09-18T15:21:24.382Z
card-last-score:: 5
- The **address width** is the number of bits used to represent an address in memory.
- The **width** limits the amount of memory that a computer can access.
- Most computers use a **64 bit address**, which means that the maximum number of locations is $$2^{64}$$; with one byte per location, that is 16 exbibytes, or roughly 16 billion gigabytes.
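- A quick Python check of how the address width limits addressable memory, assuming one byte per location. The function name `addressable_bytes` is hypothetical.
```python
def addressable_bytes(address_width):
    """Number of distinct addresses for a given address width in bits."""
    return 2 ** address_width


for width in (16, 32, 64):
    locations = addressable_bytes(width)
    # 2**30 bytes = 1 GiB, assuming each location holds one byte.
    print(f"{width}-bit: {locations} locations = {locations / 2**30:g} GiB")
```
- Each extra address bit doubles the number of locations, which is why a 32 bit address tops out at 4 GiB.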
- What operations does the memory subsystem support? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:11.140Z
card-last-reviewed:: 2022-09-18T15:17:11.140Z
card-last-score:: 5
- The memory subsystem supports two operations:
- **Load** (or read) + the address of the data location to be read.
- **Store** (or write) + the address of the location & the data to be written.
- How many bytes may the memory system read or write at a time? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:57.109Z
card-last-reviewed:: 2022-09-18T15:16:57.110Z
card-last-score:: 3
- Read & Write operations ^^operate at the width of the system's data bus^^, usually 32 bit or 64 bit.
- How is a section of memory addressed? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:12:03.194Z
card-last-reviewed:: 2022-09-18T15:12:03.195Z
card-last-score:: 5
- A memory request ^^specifies only the address of the lowest byte^^, together with the number of bytes to be read, e.g., 4 bytes.
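- The load and store operations described above can be sketched with a toy model. This is only an illustration: the `Memory` class and its method names are hypothetical, and real memory systems transfer data at the width of the data bus rather than byte by byte.

```python
class Memory:
    """Toy byte-addressable memory (hypothetical model, not a real API)."""

    def __init__(self, size: int) -> None:
        self.cells = bytearray(size)  # every location starts at 0

    def store(self, address: int, data: bytes) -> None:
        """Write `data` starting at `address` (the lowest byte written)."""
        if address < 0 or address + len(data) > len(self.cells):
            raise IndexError("store outside of memory")
        self.cells[address:address + len(data)] = data

    def load(self, address: int, count: int) -> bytes:
        """Read `count` bytes starting at the lowest byte `address`."""
        if address < 0 or address + count > len(self.cells):
            raise IndexError("load outside of memory")
        return bytes(self.cells[address:address + count])


memory = Memory(64)
memory.store(16, b"\x01\x02\x03\x04")  # store: address + data to be written
word = memory.load(16, 4)              # load: lowest byte address + byte count
print(word.hex())
```

- Note that both operations take only the lowest address of the section; the length comes from the data (store) or from an explicit byte count (load).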
- #### Memory Alignment & Words of Data
- When the computer's **word size** is 4 bytes, the data to be read should be at a memory address which is ^^some multiple of four.^^
- When this is not the case, e.g., the data starts at address 14 instead of 16, then the computer has to read two or more 4 byte chunks and do some calculation before the requested data has been read, or it may generate ^^an alignment fault.^^
- Even though the previous data structure ends at, for example, address 13, the next data structure should start at address 16. Two **padding bytes** are inserted between the two data structures, at addresses 14 & 15, to align the next data structure at address 16.
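- The padding arithmetic in the example above can be checked with a short sketch. The helper names `align_up` and `padding_bytes` are invented for illustration, and a 4 byte word size is assumed as in the notes.

```python
def align_up(address: int, word_size: int = 4) -> int:
    """Smallest multiple of `word_size` that is >= `address`."""
    return (address + word_size - 1) // word_size * word_size


def padding_bytes(next_free_address: int, word_size: int = 4) -> int:
    """Number of padding bytes needed before the next aligned address."""
    return align_up(next_free_address, word_size) - next_free_address


# The previous structure ends at address 13, so the next free byte is 14.
next_free = 13 + 1
print(align_up(next_free), padding_bytes(next_free))
```

- This prints `16 2`: the next structure starts at address 16, after two padding bytes at addresses 14 and 15.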
- ### The I/O Subsystem
- What are **input devices**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:21:35.278Z
card-last-reviewed:: 2022-09-18T15:21:35.278Z
card-last-score:: 5
- Anything that feeds data into the computer.
- What are **output devices**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:16:34.930Z
card-last-reviewed:: 2022-09-18T15:16:34.931Z
card-last-score:: 5
- Display / transmit information back to the user.
- What does the **I/O Subsystem** contain? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:21:46.984Z
card-last-reviewed:: 2022-09-19T18:21:46.985Z
card-last-score:: 5
- The **I/O Subsystem** contains the devices that the computer uses to communicate with the outside world and to store data.
- How do I/O devices communicate with the processor? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:53:22.144Z
card-last-reviewed:: 2022-09-18T14:53:22.145Z
card-last-score:: 3
- I/O devices usually communicate with the processor using the **I/O Bus**.
- PCs use the **PCI Express (Peripheral Component Interconnect Express)** bus for their I/O bus.
- The Operating System needs a **device driver** to access a given I/O device.
- What is a **device driver**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:11:51.628Z
card-last-reviewed:: 2022-09-18T15:11:51.628Z
card-last-score:: 5
- A **device driver** is a program that allows the OS to control an I/O device.
- #### I/O Read / Write Operations
- The I/O read & write operations are similar to the memory read & write operations.
- How does the processor address I/O devices? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T14:48:48.247Z
card-last-reviewed:: 2022-09-19T17:48:48.248Z
card-last-score:: 3
- A processor may use:
- **Memory-Mapped I/O:** when the address of the I/O device is in the **direct memory space**, and the ^^sequences to read/write data in the device are the same as the memory read/write sequences.^^
- **Isolated I/O:** similar process to Memory-Mapped I/O, but the processor has a ^^second set of control signals to distinguish between a **memory access** and an **I/O access**.^^
- What is **IO/M**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T15:11:25.751Z
card-last-score:: 1
- **IO/M** is a **status signal** in **Isolated I/O** that denotes whether the read/write operation pertains to the memory or to the I/O subsystem.
- When the **signal is low** (IO/M = 0, selecting memory), it denotes a **memory-related operation**.
- When the **signal is high** (IO/M = 1, selecting I/O), it denotes an **I/O operation**.
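- A minimal sketch of isolated I/O, assuming a toy bus with separate memory and I/O port spaces selected by an `io_m` flag. The `Bus` class and its methods are hypothetical; with memory-mapped I/O there would instead be a single address space and no such flag.

```python
class Bus:
    """Toy bus with separate memory and I/O port spaces (isolated I/O)."""

    def __init__(self) -> None:
        self.memory = {}    # address -> value
        self.io_ports = {}  # port number -> value

    def write(self, address: int, value: int, io_m: int) -> None:
        # io_m = 0 selects memory, io_m = 1 selects the I/O subsystem.
        target = self.io_ports if io_m else self.memory
        target[address] = value

    def read(self, address: int, io_m: int) -> int:
        target = self.io_ports if io_m else self.memory
        return target.get(address, 0)  # unwritten locations read as 0


bus = Bus()
bus.write(0x10, 42, io_m=0)  # memory write to address 0x10
bus.write(0x10, 7, io_m=1)   # I/O write to port 0x10: same number, different space
print(bus.read(0x10, io_m=0), bus.read(0x10, io_m=1))
```

- The same numeric address refers to two different locations depending on the signal, which is the point of having the extra control line.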
-
-
-
- ## Programs
- What are programs? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:13:13.990Z
card-last-reviewed:: 2022-09-18T15:13:13.990Z
card-last-score:: 5
- Programs are ^^sequences of instructions^^ that tell the computer what to do.
- To the computer, a program is made out of a ^^sequence of numbers that represent individual operations.^^
- These operations are known as **machine instructions** or just **instructions**.
- A set of instructions that a processor can execute is known as an **instruction set**.
- ### Program Development Tools
collapsed:: true
- What is a **high-level programming language**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:08:53.500Z
card-last-reviewed:: 2022-09-18T15:08:53.500Z
card-last-score:: 5
- A **high-level programming language** is a ^^portable language^^ such as C that is ^^composed of words & algebraic notation^^ that can be translated by a compiler into **assembly language**.
- What is a **compiler**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T17:43:48.914Z
card-last-reviewed:: 2022-09-19T17:43:48.914Z
card-last-score:: 5
- A **compiler** is a program that translates statements in a given high-level language into assembly language statements.
- What is an **assembler**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:10:09.892Z
card-last-reviewed:: 2022-09-18T15:10:09.892Z
card-last-score:: 5
- An **assembler** is a program that translates symbolic, assembly language versions of instructions into the ^^binary version.^^
- What is **Assembly Language**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-09-30T21:48:54.576Z
card-last-reviewed:: 2022-09-19T17:48:54.576Z
card-last-score:: 5
- **Assembly Language** is a ^^symbolic representation^^ of **machine instructions**.
- What is **Machine Language**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:22:03.638Z
card-last-score:: 1
- **Machine Language** is a ^^binary representation^^ of **machine instructions**.
- What is an **instruction**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:19:26.835Z
card-last-reviewed:: 2022-09-18T15:19:26.836Z
card-last-score:: 5
- An **instruction** is a command that the computer hardware understands & obeys.
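- To make the relationship between assembly language and machine language concrete, here is a sketch of a tiny assembler. The instruction set, the mnemonics, and the 8-bit encoding (high 4 bits for the opcode, low 4 bits for the operand) are all invented for this example and do not correspond to any real processor.

```python
# Hypothetical 8-bit instruction format: high 4 bits = opcode, low 4 bits = operand.
OPCODES = {"HALT": 0x0, "LOAD": 0x1, "ADD": 0x2, "STORE": 0x3}


def assemble(source: str) -> list[int]:
    """Translate symbolic instructions into their numeric (machine) form."""
    machine_code = []
    for line in source.strip().splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        mnemonic = parts[0]
        operand = int(parts[1]) if len(parts) > 1 else 0
        if not 0 <= operand <= 0xF:
            raise ValueError(f"operand out of range: {operand}")
        machine_code.append((OPCODES[mnemonic] << 4) | operand)
    return machine_code


program = """
LOAD 5
ADD 6
STORE 7
HALT
"""
for word in assemble(program):
    print(f"{word:08b}")
```

- The output is the binary representation of the same four instructions: `00010101`, `00100110`, `00110111`, and `00000000`.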
-
- ## Operating Systems
- What is an **Operating System**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:25:01.958Z
card-last-reviewed:: 2022-09-19T18:25:01.958Z
card-last-score:: 5
- Possible definition: a program that runs on the computer that ^^knows about all the hardware^^ and usually ^^runs in privileged mode^^, having ^^access to physical resources that user programs can't control^^, and has the ^^ability to start & stop user programs.^^
- The OS is responsible for managing the physical resources of complex systems, such as PCs, workstations, mainframe computers, etc.
- It is also responsible for ^^loading & executing programs^^ and ^^interfacing with the users.^^
- Usually, there is no operating system for **small embedded systems**.
- Computers designed for one specific task.
-
- ### Multiprogramming
- What is **Multiprogramming**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:23:28.557Z
card-last-reviewed:: 2022-09-19T18:23:28.557Z
card-last-score:: 5
- **Multiprogramming** is a technique that allows the system to ^^present the illusion that multiple programs are running on the computer simultaneously.^^
- Many multiprogrammed computers are **multiuser**.
- They allow multiple users to be logged in at a time.
- How is multiprogramming achieved? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:19:07.150Z
card-last-reviewed:: 2022-09-18T15:19:07.151Z
card-last-score:: 5
- Multiprogramming is achieved by ^^switching rapidly between programs.^^
- How does the processor decide which process to execute next? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:20:14.106Z
card-last-reviewed:: 2022-09-18T15:20:14.106Z
card-last-score:: 5
- **FCFS - First Come, First Served:** processes are moved to the CPU in the order in which they arrive.
- **SJN - Shortest Job Next:** looks at all processes in the **ready state** and dispatches the one with the smallest service time.
- **Round Robin:** distributes the processing time equitably among all ready processes.
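- The three policies above can be compared on a small example. This is a simplified sketch: it assumes all processes are already in the ready state at time zero, treats Shortest Job Next as non-preemptive, and uses made-up process names, service times, and a made-up quantum.

```python
from collections import deque

# Hypothetical ready queue: (process name, service time), in arrival order.
JOBS = [("A", 5), ("B", 2), ("C", 3)]


def fcfs(jobs):
    """First Come, First Served: run in arrival order."""
    return [name for name, _ in jobs]


def sjn(jobs):
    """Shortest Job Next: run the smallest service time first."""
    return [name for name, _ in sorted(jobs, key=lambda job: job[1])]


def round_robin(jobs, quantum):
    """Round Robin: each process runs for at most `quantum` units per turn."""
    queue = deque(jobs)
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        timeline.append((name, run))
        if remaining > run:
            queue.append((name, remaining - run))  # back of the queue
    return timeline


print(fcfs(JOBS))
print(sjn(JOBS))
print(round_robin(JOBS, quantum=2))
```

- FCFS gives `A, B, C`, SJN gives `B, C, A`, and Round Robin interleaves the processes in slices of at most 2 time units until each one finishes.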
- ### Context Switching
- What is a **Context Switch**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:10:00.458Z
card-last-reviewed:: 2022-09-18T15:10:00.458Z
card-last-score:: 5
- When a program's timeslice ends, the OS stops it, removes it from the processor, and gives another program control over the processor.
- This is a **context switch**.
- How does the OS go about a Context Switch? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T15:19:47.984Z
card-last-reviewed:: 2022-09-18T15:19:47.984Z
card-last-score:: 3
- Copies the current program's register file into memory.
- Restores the contents of the next program's register file into the processor.
- Starts executing the next program.
- From the program's point of view, ^^no program can tell that a context switch has been performed.^^
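- The save and restore steps can be sketched as follows. The `Processor` class, the register names, and the `saved_contexts` dictionary (standing in for the memory where the OS keeps register files) are all assumptions made for this illustration.

```python
class Processor:
    """Toy processor with a small register file (hypothetical model)."""

    def __init__(self) -> None:
        self.registers = {"pc": 0, "r0": 0, "r1": 0}


def context_switch(cpu, saved_contexts, current, next_program):
    """Save `current`'s registers to memory, then restore `next_program`'s."""
    # 1. Copy the current program's register file into memory.
    saved_contexts[current] = dict(cpu.registers)
    # 2. Restore the next program's register file into the processor.
    cpu.registers = dict(saved_contexts[next_program])
    # 3. The caller now resumes execution at cpu.registers["pc"].
    return next_program


cpu = Processor()
saved_contexts = {"B": {"pc": 200, "r0": 9, "r1": 0}}  # B was paused earlier

cpu.registers.update(pc=104, r0=1, r1=2)  # program A is running
running = context_switch(cpu, saved_contexts, "A", "B")
running = context_switch(cpu, saved_contexts, "B", "A")
print(cpu.registers)
```

- After switching to B and back, A finds its registers exactly as it left them, which is why a program cannot tell that it was interrupted.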
- ### Protection
- Three rules of Protection in multiprogrammed computers: #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T14:45:26.235Z
card-last-reviewed:: 2022-09-19T17:45:26.235Z
card-last-score:: 3
- 1. The result of any program running on the multiprogrammed computer ^^must be the same as if the program were the only program running on the computer.^^
- 2. Programs ^^must not be able to access other programs' data^^ and must be confident that their data will not be modified by other programs (for security and privacy).
- 3. Programs ^^must not interfere with other programs' use of I/O devices.^^
- How is protection achieved? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:26:58.100Z
card-last-score:: 1
- Protection is achieved by the ^^operating system having full control over the resources of the system (processor, memory, and I/O devices)^^ through:
- **Privileged Mode:** the operating system is the only program that can control the physical resources; it executes in privileged mode.
- User programs execute in **user mode**.
- **Virtual Memory:** each program operates as if it were the only program on the computer, occupying a full set of the address space in its virtual space.
- The OS *translates* the memory addresses that the program references into the physical addresses used by the memory system.
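- One common way to perform this translation is paging. The sketch below assumes a paged scheme with a 4096 byte page size, which the notes do not specify; the page tables, program names, and `ProtectionFault` exception are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size in bytes

# Hypothetical per-program page tables: virtual page number -> physical frame.
PAGE_TABLES = {
    "program_a": {0: 5, 1: 9},
    "program_b": {0: 2, 1: 7},
}


class ProtectionFault(Exception):
    """Raised when a program references memory it has no mapping for."""


def translate(program: str, virtual_address: int) -> int:
    """Map a program's virtual address to a physical address."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    table = PAGE_TABLES[program]
    if page not in table:
        raise ProtectionFault(f"{program}: no mapping for page {page}")
    return table[page] * PAGE_SIZE + offset


print(translate("program_a", 0x10))
print(translate("program_b", 0x10))
```

- Both programs use the same virtual address `0x10`, yet they reach different physical addresses (20496 and 8208), so neither can read or modify the other's data.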
-
-
- **Next Topic:** [[Programming Models]]

@ -0,0 +1,541 @@
- #[[CT213 - Computer Systems & Organisation]]
- No previous topic.
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662828507609_0.pdf)
-
- ## Traditional Classes of Computer Systems
- What is a **Personal Computer (PC)**?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-09-23T18:28:00.836Z
card-last-interval:: 4
card-ease-factor:: 2.7
card-last-reviewed:: 2022-09-19T18:28:00.836Z
- A **Personal Computer** is a computer designed for use by an individual, usually incorporating a graphics display, a keyboard, and a mouse.
- What is a **Server**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:51:19.171Z
card-last-reviewed:: 2022-09-18T14:51:19.172Z
card-last-score:: 3
- A **server** is a computer used for running larger programs for multiple users, often simultaneously, and typically accessed only via a network.
- What is a **Supercomputer**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:20:18.972Z
card-last-reviewed:: 2022-09-18T15:20:18.973Z
card-last-score:: 5
- A **supercomputer** is a member of a class of computers with the highest performance (and cost). They are configured as servers and typically cost tens to hundreds of millions of dollars.
- What is an **Embedded Computer**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:21.268Z
card-last-reviewed:: 2022-09-18T15:17:21.269Z
card-last-score:: 5
- An **embedded computer** is a computer inside another device, used for running one predetermined application or collection of software.
- What are **Personal Mobile Devices**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:24.814Z
card-last-reviewed:: 2022-09-18T15:17:24.814Z
card-last-score:: 5
- **Personal Mobile Devices** are small, wireless devices that connect to the internet.
- They rely on batteries for power, and software is installed by downloading apps.
- Conventional examples include smartphones and tablets.
- What is **Cloud Computing**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:02.669Z
card-last-reviewed:: 2022-09-18T15:16:02.669Z
card-last-score:: 5
- **Cloud Computing** refers to large collections of servers that provide services over the internet.
- Some providers rent a dynamically varying number of servers as a utility.
- What is **Software as a Service**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T17:42:30.362Z
card-last-reviewed:: 2022-09-19T17:42:30.363Z
card-last-score:: 5
- **Software as a Service** delivers software & data as a service over the internet, usually via a thin program, such as a browser.
- Examples include web search & email.
-
- ## Computer Systems
collapsed:: true
- ![image.png](../assets/image_1662829382080_0.png){:height 410, :width 414}
- What is **Application Software**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:30.513Z
card-last-reviewed:: 2022-09-18T15:17:30.513Z
card-last-score:: 5
- **Application Software** consists of user-installed applications & programs.
- Application Software provides services to the user that are commonly useful.
- What is the purpose of the **Operating System**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:09.752Z
card-last-reviewed:: 2022-09-18T15:16:09.753Z
card-last-score:: 5
- The **Operating System** interfaces between a user's program and the hardware, provides a variety of services, and performs supervisory functions.
- What is the purpose of the **Hardware**?
- The **Hardware** performs the tasks.
-
- ## Seven Great Ideas in Computer Organisation
- ### 1. Use **Abstraction** to Simplify Design #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:25:18.076Z
card-last-reviewed:: 2022-09-19T18:25:18.076Z
card-last-score:: 5
- A major productivity technique for hardware & software is to use **abstractions** to characterise the design at different levels of representation.
- Lower-level details are hidden to offer a simpler model at higher levels.
- ### 2. Make the **Common Case** Fast #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:44.377Z
card-last-reviewed:: 2022-09-18T15:16:44.378Z
card-last-score:: 5
- Making the **common case fast** will tend to enhance performance better than optimising the rare case.
- The common case is often simpler than the rare case, and hence is usually easier to enhance.
- ### 3. Performance via **Parallelism** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:03:58.567Z
card-last-reviewed:: 2022-09-18T15:03:58.567Z
card-last-score:: 5
- Involves speeding up performance by using designs that compute operations in **parallel**.
- ### 4. Performance via **Pipelining** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-21T20:52:29.657Z
card-last-reviewed:: 2022-09-17T20:52:29.657Z
card-last-score:: 5
- **Pipelining** is a particular pattern of **parallelism** that is so prevalent in computer architecture that it merits its own name.
- ### 5. Performance via **Prediction** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:23:23.782Z
card-last-reviewed:: 2022-09-19T18:23:23.782Z
card-last-score:: 5
- In some cases, it can be ^^faster on average to guess and start working^^ than to wait until you know for sure (assuming that the mechanism to recover from a misprediction is not too expensive, and your prediction is relatively accurate).
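- The trade-off can be put into numbers with a simplified cost model. This sketch assumes that a correct guess loses no cycles, a wrong guess loses a fixed penalty, and waiting always loses a fixed number of cycles; the function names and the figures are invented for illustration.

```python
def expected_stall_with_prediction(accuracy: float, misprediction_penalty: float) -> float:
    """Average cycles lost per guess: only wrong guesses pay the penalty."""
    return (1.0 - accuracy) * misprediction_penalty


def prediction_pays_off(accuracy, misprediction_penalty, wait_cycles) -> bool:
    """True when guessing loses fewer cycles on average than always waiting."""
    return expected_stall_with_prediction(accuracy, misprediction_penalty) < wait_cycles


# Hypothetical numbers: waiting always costs 3 cycles; a wrong guess costs 10.
print(prediction_pays_off(accuracy=0.9, misprediction_penalty=10, wait_cycles=3))
print(prediction_pays_off(accuracy=0.5, misprediction_penalty=10, wait_cycles=3))
```

- With 90% accuracy the guess loses about 1 cycle on average and beats waiting; with 50% accuracy it loses about 5 cycles and does not, which matches the two conditions stated above.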
- ### 6. Hierarchy of Memories #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:18:36.340Z
card-last-reviewed:: 2022-09-18T15:18:36.341Z
card-last-score:: 5
- Computer Architects have found that they can address conflicting demands with a **hierarchy of memories**.
- The ^^fastest, smallest, & most expensive memory per bit^^ is at the top of the hierarchy.
- The ^^slowest, largest, & cheapest per bit^^ is at the bottom of the hierarchy.
- ### 7. Dependability via **Redundancy** #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:19:01.340Z
card-last-reviewed:: 2022-09-18T15:19:01.341Z
card-last-score:: 5
- Since any physical device can fail, we make systems **dependable** by including ^^redundant components^^ that can take over when a failure occurs *and* help detect failures.
-
- ## Hardware Organisation
- What does basic computer organisation look like? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:27:44.250Z
card-last-reviewed:: 2022-09-19T18:27:44.250Z
card-last-score:: 5
- ![image.png](../assets/image_1662830400492_0.png)
- What is an **integrated circuit**? #card
collapsed:: true
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-30T12:47:21.597Z
card-last-reviewed:: 2022-09-19T17:47:21.597Z
card-last-score:: 5
- An **integrated circuit**, also called a **chip**, is a device combining dozens to millions of transistors.
- ### The CPU
collapsed:: true
- What is a **CPU**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:22:22.644Z
card-last-reviewed:: 2022-09-18T15:22:22.644Z
card-last-score:: 5
- The **Central Processing Unit (CPU)**, also called the **processor**, is the ^^active part of the computer^^, which contains the datapath & control, and which adds numbers, tests numbers, signals I/O devices to activate, and so on.
- The CPU is ^^responsible for executing programs.^^
- What are the steps that the CPU takes to process programs? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T14:58:32.353Z
card-last-reviewed:: 2022-09-18T14:58:32.353Z
card-last-score:: 3
- **1. Fetch:** Retrieve an instruction from ^^program memory.^^
- **2. Decode:** Break down the instruction into parts that have significance to specific sections of the CPU.
- **3. Execute:** Various portions of the CPU are connected to perform the desired operation.
- **4. Write Back:** Simply "writes back" the results of the execute step ^^if necessary.^^
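- The four steps above can be sketched as a loop over a toy program. The instruction format (tuples with an opcode followed by operands), the opcodes `LOADI`, `ADD`, and `HALT`, and the register names are all hypothetical; a real CPU works on binary instructions in hardware.

```python
def run(program):
    """Run a toy program; each instruction is a tuple (hypothetical format)."""
    registers = {"r0": 0, "r1": 0, "r2": 0}
    pc = 0  # program counter: index of the next instruction
    while True:
        # 1. Fetch: retrieve the instruction from program memory.
        instruction = program[pc]
        pc += 1
        # 2. Decode: split the instruction into opcode and operands.
        opcode, *operands = instruction
        # 3. Execute: perform the operation.
        if opcode == "HALT":
            return registers
        if opcode == "LOADI":      # LOADI dst, constant
            dst, result = operands
        elif opcode == "ADD":      # ADD dst, src1, src2
            dst, src1, src2 = operands
            result = registers[src1] + registers[src2]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # 4. Write back: store the result in the destination register.
        registers[dst] = result


program = [
    ("LOADI", "r0", 2),
    ("LOADI", "r1", 3),
    ("ADD", "r2", "r0", "r1"),
    ("HALT",),
]
print(run(program))
```

- Running the example leaves `r2` holding 5; `HALT` needs no write back, which illustrates the "if necessary" in step 4.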
- ### CPU Organisation
- What does the **organisation** of the CPU look like? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T15:13:04.465Z
card-last-reviewed:: 2022-09-18T15:13:04.465Z
card-last-score:: 3
- Processors are made up of:
- A **Control Unit**
- **Execution Unit(s)**
- A **Register File**
- ![image.png](../assets/image_1662830846361_0.png){:height 339, :width 418}
-
- #### Control Unit
- What does the **Control Unit** do? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:27:08.868Z
card-last-reviewed:: 2022-09-19T18:27:08.869Z
card-last-score:: 5
- The **Control Unit** ^^controls the execution^^ of the instructions stored in main memory.
- It ^^retrieves & executes^^ them.
- What is the architecture of the control unit? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T20:02:39.216Z
card-last-reviewed:: 2022-09-14T20:02:39.217Z
card-last-score:: 3
- The control unit contains a **fetch unit**, a **decode unit**, and an **execute unit**.
- It also contains two special registers:
- **Program Counter (PC):** keeps the address of the next instruction
- **Instruction Register (IR):** keeps the instruction being executed
- ![image.png](../assets/image_1662837864357_0.png){:height 266, :width 550}
-
- ### The Memory Subsystem
collapsed:: true
- How is the memory divided into storage locations? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:17:42.430Z
card-last-reviewed:: 2022-09-18T15:17:42.430Z
card-last-score:: 5
- Memory is divided into a set of storage locations which can hold data.
- Locations are numbered.
- Addresses are used to tell the memory which location the processor wants to access.
- What are the two hierarchies of memory? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:29:01.554Z
card-last-score:: 1
- **1. Nonvolatile / ROM (Read Only Memory):** Read only memory.
- Used to store the BIOS and / or a *bootstrap* or *bootloader* program.
- **2. Volatile / RAM (Random Access Memory):** Read / Write memory.
- Also called **Primary Memory**.
- Used to hold the programs, operating system, and data required by the computer.
- #### Primary Memory
- How is primary memory connected to the CPU? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:06:00.776Z
card-last-reviewed:: 2022-09-19T18:06:00.777Z
card-last-score:: 5
- **Primary Memory** is directly connected to the Central Processing Unit of the computer.
- It must be present for the CPU to function correctly.
- What are the three types of Primary Storage? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T15:10:32.299Z
card-last-score:: 1
- **1. Processor Registers:**
- Contains information that the CPU needs to carry out the current instruction.
- **2. Cache Memory:**
- A special type of internal memory used by many CPUs to increase their **throughput**.
- **3. Main Memory:**
- Contains the programs that are currently being run and the data that the programs are operating on.
- What is the **address width**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:21:24.382Z
card-last-reviewed:: 2022-09-18T15:21:24.382Z
card-last-score:: 5
- The **address width** is the number of bits used to represent an address in memory.
- The **width** limits the amount of memory that a computer can access.
- Most computers use a **64 bit address**, which means that the maximum number of locations is $$2^{64}$$; at one byte per location, that is 16 exbibytes, or roughly 17 billion gigabytes.
- What operations does the memory subsystem support? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:17:11.140Z
card-last-reviewed:: 2022-09-18T15:17:11.140Z
card-last-score:: 5
- The memory subsystem supports two operations:
- **Load** (or read) + the address of the data location to be read.
- **Store** (or write) + the address of the location & the data to be written.
- How many bytes may the memory system read or write at a time? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:16:57.109Z
card-last-reviewed:: 2022-09-18T15:16:57.110Z
card-last-score:: 3
- Read & Write operations ^^operate at the width of the system's data bus^^, usually 32 bit or 64 bit.
- How is a section of memory addressed? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:12:03.194Z
card-last-reviewed:: 2022-09-18T15:12:03.195Z
card-last-score:: 5
- A memory request ^^specifies only the address of the lowest byte^^, together with the number of bytes to be read, e.g., 4 bytes.
- #### Memory Alignment & Words of Data
- When the computer's **word size** is 4 bytes, the data to be read should be at a memory address which is ^^some multiple of four.^^
- When this is not the case, e.g., the data starts at address 14 instead of 16, then the computer has to read two or more 4 byte chunks and do some calculation before the requested data has been read, or it may generate ^^an alignment fault.^^
- Even though the previous data structure ends at, for example, address 13, the next data structure should start at address 16. Two **padding bytes** are inserted between the two data structures, at addresses 14 & 15, to align the next data structure at address 16.
- ### The I/O Subsystem
- What are **input devices**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:21:35.278Z
card-last-reviewed:: 2022-09-18T15:21:35.278Z
card-last-score:: 5
- Anything that feeds data into the computer.
- What are **output devices**? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:16:34.930Z
card-last-reviewed:: 2022-09-18T15:16:34.931Z
card-last-score:: 5
- Display / transmit information back to the user.
- What does the **I/O Subsystem** contain? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:21:46.984Z
card-last-reviewed:: 2022-09-19T18:21:46.985Z
card-last-score:: 5
- The **I/O Subsystem** contains the devices that the computer uses to communicate with the outside world and to store data.
- How do I/O devices communicate with the processor? #card
collapsed:: true
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-22T14:53:22.144Z
card-last-reviewed:: 2022-09-18T14:53:22.145Z
card-last-score:: 3
- I/O devices usually communicate with the processor using the **I/O Bus**.
- PCs use the **PCI Express (Peripheral Component Interconnect Express)** bus for their I/O bus.
- The Operating System needs a **device driver** to access a given I/O device.
- What is a **device driver**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:11:51.628Z
card-last-reviewed:: 2022-09-18T15:11:51.628Z
card-last-score:: 5
- A **device driver** is a program that allows the OS to control an I/O device.
- #### I/O Read / Write Operations
- The I/O read & write operations are similar to the memory read & write operations.
- How does the processor address I/O devices? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T14:48:48.247Z
card-last-reviewed:: 2022-09-19T17:48:48.248Z
card-last-score:: 3
- A processor may use:
- **Memory-Mapped I/O:** when the address of the I/O device is in the **direct memory space**, and the ^^sequences to read/write data in the device are the same as the memory read/write sequences.^^
- **Isolated I/O:** similar process to Memory-Mapped I/O, but the processor has a ^^second set of control signals to distinguish between a **memory access** and an **I/O access**.^^
- What is **IO/M**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-18T23:00:00.000Z
card-last-reviewed:: 2022-09-18T15:11:25.751Z
card-last-score:: 1
- **IO/M** is a **status signal** in **Isolated I/O** that denotes whether the read/write operation pertains to the memory or to the I/O subsystem.
- When the **signal is low** (IO/M = 0, selecting memory), it denotes a **memory-related operation**.
- When the **signal is high** (IO/M = 1, selecting I/O), it denotes an **I/O operation**.
-
-
-
- ## Programs
- What are programs? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:13:13.990Z
card-last-reviewed:: 2022-09-18T15:13:13.990Z
card-last-score:: 5
- Programs are ^^sequences of instructions^^ that tell the computer what to do.
- To the computer, a program is made out of a ^^sequence of numbers that represent individual operations.^^
- These operations are known as **machine instructions** or just **instructions**.
- A set of instructions that a processor can execute is known as an **instruction set**.
- ### Program Development Tools
collapsed:: true
- What is a **high-level programming language**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:08:53.500Z
card-last-reviewed:: 2022-09-18T15:08:53.500Z
card-last-score:: 5
- A **high-level programming language** is a ^^portable language^^ such as C that is ^^composed of words & algebraic notation^^ that can be translated by a compiler into **assembly language**.
- What is a **compiler**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T17:43:48.914Z
card-last-reviewed:: 2022-09-19T17:43:48.914Z
card-last-score:: 5
- A **compiler** is a program that translates statements in a given high-level language into assembly language statements.
- What is an **assembler**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:10:09.892Z
card-last-reviewed:: 2022-09-18T15:10:09.892Z
card-last-score:: 5
- An **assembler** is a program that translates symbolic, assembly language versions of instructions into the ^^binary version.^^
- What is **Assembly Language**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-09-30T21:48:54.576Z
card-last-reviewed:: 2022-09-19T17:48:54.576Z
card-last-score:: 5
- **Assembly Language** is a ^^symbolic representation^^ of **machine instructions**.
- What is **Machine Language**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:22:03.638Z
card-last-score:: 1
- **Machine Language** is a ^^binary representation^^ of **machine instructions**.
- What is an **instruction**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:19:26.835Z
card-last-reviewed:: 2022-09-18T15:19:26.836Z
card-last-score:: 5
- An **instruction** is a command that the computer hardware understands & obeys.
-
- ## Operating Systems
- What is an **Operating System**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-23T18:25:01.958Z
card-last-reviewed:: 2022-09-19T18:25:01.958Z
card-last-score:: 5
- Possible definition: a program that runs on the computer that ^^knows about all the hardware^^ and usually ^^runs in privileged mode^^, having ^^access to physical resources that user programs can't control^^, and has the ^^ability to start & stop user programs.^^
- The OS is responsible for managing the physical resources of complex systems, such as PCs, workstations, mainframe computers, etc.
- It is also responsible for ^^loading & executing programs^^ and ^^interfacing with the users.^^
- Usually, there is no operating system for **small embedded systems**.
- Computers designed for one specific task.
-
- ### Multiprogramming
- What is **Multiprogramming**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-23T18:23:28.557Z
card-last-reviewed:: 2022-09-19T18:23:28.557Z
card-last-score:: 5
- **Multiprogramming** is a technique that allows the system to ^^present the illusion that multiple programs are running on the computer simultaneously.^^
- Many multiprogrammed computers are **multiuser**.
- They allow multiple users to be logged in at a time.
- How is multiprogramming achieved? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.7
card-next-schedule:: 2022-09-22T15:19:07.150Z
card-last-reviewed:: 2022-09-18T15:19:07.151Z
card-last-score:: 5
- Multiprogramming is achieved by ^^switching rapidly between programs.^^
- How does the processor decide which process to execute next? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:20:14.106Z
card-last-reviewed:: 2022-09-18T15:20:14.106Z
card-last-score:: 5
- **FCFS - First Come, First Served:** processes are moved to the CPU in the order in which they arrive.
- **SJN - Shortest Job Next:** looks at all processes in the **ready state** and dispatches the one with the smallest service time.
- **Round Robin:** distributes the processing time equitably among all ready processes.
- ### Context Switching
- What is a **Context Switch**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-09-22T15:10:00.458Z
card-last-reviewed:: 2022-09-18T15:10:00.458Z
card-last-score:: 5
- When a program's timeslice ends, the OS stops it, removes it, and gives another program control over the processor.
- This is a **context switch**.
- How does the OS go about a Context Switch? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-22T15:19:47.984Z
card-last-reviewed:: 2022-09-18T15:19:47.984Z
card-last-score:: 3
- copies the current program's register file into memory
- restores the contents of the next program's register file into the processor
- starts executing the next program
- From the program's point of view, ^^no program can tell that a context switch has been performed.^^
- ### Protection
- Three rules of Protection in multiprogrammed computers: #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-09-28T14:45:26.235Z
card-last-reviewed:: 2022-09-19T17:45:26.235Z
card-last-score:: 3
- 1. The result of any program running on the multiprogrammed computer ^^must be the same as if the program were the only program running on the computer.^^
- 2. Programs ^^must not be able to access other programs' data^^ and must be confident that their data will not be modified by other programs (for security and privacy).
- 3. Programs ^^must not interfere with other programs' use of I/O devices.^^
- How is protection achieved? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-19T23:00:00.000Z
card-last-reviewed:: 2022-09-19T18:26:58.100Z
card-last-score:: 1
- Protection is achieved by the ^^operating system having full control over the resources of the system (processor, memory, and I/O devices)^^ through:
- **Privileged Mode:** the operating system is the only program that can control the physical resources; it executes in privileged mode.
- User programs execute in **user mode**.
- **Virtual Memory:** each program operates as if it were the only program on the computer, occupying a full set of the address space in its virtual space.
- The OS is *translating* memory addresses that the program references into physical addresses used by the memory system.
-
-
- **Next Topic:** [[Programming Models]]

View File

@ -0,0 +1,541 @@
- #[[CT213 - Computer Systems & Organisation]]
- No previous topic.
- **Relevant Slides:** ![Lecture01.pdf](../assets/Lecture01_1662828507609_0.pdf)
-
- ## Traditional Classes of Computer Systems
- What is a **Personal Computer (PC)**?
card-last-score:: 5
card-repeats:: 2
card-next-schedule:: 2022-09-23T18:28:00.836Z
card-last-interval:: 4
card-ease-factor:: 2.7
card-last-reviewed:: 2022-09-19T18:28:00.836Z
- A **Personal Computer** is a computer designed for use by an individual, usually incorporating a graphics display, a keyboard, and a mouse.
- What is a **Server**? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T05:29:45.315Z
card-last-reviewed:: 2022-09-30T08:29:45.315Z
card-last-score:: 3
- A **server** is a computer used for running larger programs for multiple users, often simultaneously, and typically accessed only via a network.
- What is a **Supercomputer**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:37:52.646Z
card-last-reviewed:: 2022-10-03T11:37:52.647Z
card-last-score:: 5
- A **supercomputer** is a member of a class of computers with the highest performance (and cost). They are configured as servers and typically cost tens to hundreds of millions of dollars.
- What is an **Embedded Computer**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:30:37.659Z
card-last-reviewed:: 2022-10-01T17:30:37.659Z
card-last-score:: 5
- An **embedded computer** is a computer inside another device, used for running one predetermined application or collection of software.
- What are **Personal Mobile Devices**?
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:31:08.211Z
card-last-reviewed:: 2022-10-01T17:31:08.211Z
card-last-score:: 5
- **Personal Mobile Devices** are small, wireless devices that connect to the internet.
- They rely on batteries for power, and software is installed by downloading apps.
- Conventional examples include smartphones and tablets.
- What is **Cloud Computing**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T22:28:27.277Z
card-last-reviewed:: 2022-10-01T17:28:27.278Z
card-last-score:: 5
- **Cloud Computing** refers to large collections of servers that provide services over the internet.
- Some providers rent a dynamically varying number of servers as a utility.
- What is **Software as a Service**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:42:21.001Z
card-last-reviewed:: 2022-10-03T11:42:21.001Z
card-last-score:: 5
- **Software as a Service** delivers software & data as a service over the internet, usually via a thin program, such as a browser.
- Examples include web search & email.
-
- ## Computer Systems
collapsed:: true
- ![image.png](../assets/image_1662829382080_0.png){:height 410, :width 414}
- What is **Application Software**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:31:12.546Z
card-last-reviewed:: 2022-10-01T17:31:12.546Z
card-last-score:: 5
- **Application Software** consists of user-installed applications & programs.
- Application Software provides services to the user that are commonly useful.
- What is the purpose of the **Operating System**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:11:34.929Z
card-last-reviewed:: 2022-09-30T12:11:34.929Z
card-last-score:: 3
- The **Operating System** interfaces between a user's program and the hardware, provides a variety of services, and performs supervisory functions.
- What is the purpose of the **Hardware**?
- The **Hardware** performs the tasks.
-
- ## Seven Great Ideas in Computer Organisation
- ### 1. Use **Abstraction** to Simplify Design #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:28:36.002Z
card-last-reviewed:: 2022-10-03T14:28:36.003Z
card-last-score:: 5
- A major productivity technique for hardware & software is to use **abstractions** to characterise the design at different levels of representation.
- Lower-level details are hidden to offer a simpler model at higher levels.
- ### 2. Make the **Common Case** Fast #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T22:29:14.556Z
card-last-reviewed:: 2022-10-01T17:29:14.557Z
card-last-score:: 5
- Making the **common case fast** will tend to enhance performance better than optimising the rare case.
- The common case is often simpler than the rare case, and hence is usually easier to enhance.
- ### 3. Performance via **Parallelism** #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:16:13.241Z
card-last-reviewed:: 2022-09-30T12:16:13.242Z
card-last-score:: 3
- Involves speeding up performance by using designs that compute operations in **parallel**.
- ### 4. Performance via **Pipelining** #card
card-last-interval:: 10.8
card-repeats:: 3
card-ease-factor:: 2.7
card-next-schedule:: 2022-10-11T04:28:23.981Z
card-last-reviewed:: 2022-09-30T09:28:23.982Z
card-last-score:: 5
- **Pipelining** is a particular pattern of **parallelism** that is so prevalent in computer architecture that it merits its own name.
- ### 5. Performance via **Prediction** #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:27:10.532Z
card-last-reviewed:: 2022-10-03T14:27:10.533Z
card-last-score:: 5
- In some cases, it can be ^^faster on average to guess and start working^^ than to wait until you know for sure (assuming that the mechanism to recover from a misprediction is not too expensive, and your prediction is relatively accurate).
- ### 6. Hierarchy of Memories #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T22:35:13.824Z
card-last-reviewed:: 2022-10-01T17:35:13.824Z
card-last-score:: 3
- Computer Architects have found that they can address conflicting demands with a **hierarchy of memories**.
- The ^^fastest, smallest, & most expensive memory per bit^^ is at the top of the hierarchy.
- The ^^slowest, largest, & cheapest per bit^^ is at the bottom of the hierarchy.
- ### 7. Dependability via **Redundancy** #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T22:35:28.794Z
card-last-reviewed:: 2022-10-01T17:35:28.794Z
card-last-score:: 5
- Since any physical device can fail, we make systems **dependable** by including ^^redundant components^^ that can take over when a failure occurs *and* help detect failures.
-
- ## Hardware Organisation
- What does basic computer organisation look like? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T18:29:58.087Z
card-last-reviewed:: 2022-10-03T14:29:58.087Z
card-last-score:: 5
- ![image.png](../assets/image_1662830400492_0.png)
- What is an **integrated circuit**?
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-11-04T20:17:29.178Z
card-last-reviewed:: 2022-10-04T12:17:29.178Z
card-last-score:: 5
collapsed:: true
- An **integrated circuit**, also called a **chip**, is a device combining dozens to millions of transistors.
- ### The CPU
collapsed:: true
- What is a **CPU**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T16:40:13.962Z
card-last-reviewed:: 2022-10-03T11:40:13.962Z
card-last-score:: 5
- The **Central Processing Unit (CPU)**, also called the **processor**, is the ^^active part of the computer^^, which contains the datapath & control, and which adds numbers, tests numbers, signals I/O devices to activate, and so on.
- The CPU is ^^responsible for executing programs.^^
- What are the steps that the CPU takes to process programs? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-09T18:15:57.207Z
card-last-reviewed:: 2022-09-30T12:15:57.208Z
card-last-score:: 5
- **1. Fetch:** Retrieve an instruction from ^^program memory.^^
- **2. Decode:** Break down the instruction into parts that have significance to specific sections of the CPU.
- **3. Execute:** Various portions of the CPU are connected to perform the desired operation.
- **4. Write Back:** Simply "writes back" the results of the execute step ^^if necessary.^^
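- The four steps above can be sketched as a loop over a toy program. This is only an illustration: the opcodes (`LOADI`, `ADD`, `HALT`), the register names, and the tuple encoding of instructions are hypothetical and do not come from any real instruction set.

```python
def run(program):
    """Run a toy program given as a list of (opcode, dest, a, b) tuples."""
    registers = {"r0": 0, "r1": 0, "r2": 0}
    pc = 0  # Program Counter: address of the next instruction

    while pc < len(program):
        ir = program[pc]            # 1. Fetch the instruction into the IR
        pc += 1
        opcode, dest, a, b = ir     # 2. Decode it into its fields
        if opcode == "LOADI":       # 3. Execute the operation
            result = a              #    (load an immediate value)
        elif opcode == "ADD":
            result = registers[a] + registers[b]
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        registers[dest] = result    # 4. Write the result back

    return registers


toy_program = [
    ("LOADI", "r0", 2, None),
    ("LOADI", "r1", 3, None),
    ("ADD", "r2", "r0", "r1"),
    ("HALT", None, None, None),
]
final_registers = run(toy_program)
print(final_registers)
```

Real processors overlap these steps for different instructions (pipelining); the sketch runs them strictly one after another to keep the roles of the PC and IR visible.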
- ### CPU Organisation
- What does the **organisation** of the CPU look like? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-09T20:25:22.675Z
card-last-reviewed:: 2022-10-01T13:25:22.675Z
card-last-score:: 3
- Processors are made up of:
- A **Control Unit**
- **Execution Unit(s)**
- A **Register File**
- ![image.png](../assets/image_1662830846361_0.png){:height 339, :width 418}
-
- #### Control Unit
- What does the **Control Unit** do? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-13T19:29:08.350Z
card-last-reviewed:: 2022-10-03T14:29:08.350Z
card-last-score:: 3
- The **Control Unit** ^^controls the execution^^ of the instructions stored in main memory.
- It ^^retrieves & executes^^ them.
- What is the architecture of the control unit? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-09T05:29:05.114Z
card-last-reviewed:: 2022-09-30T08:29:05.115Z
card-last-score:: 3
- The control unit contains a **fetch unit**, a **decode unit**, and an **execute unit**.
- It also contains two special registers:
- **Program Counter (PC):** keeps the address of the next instruction
- **Instruction Register (IR):** keeps the instruction being executed
- ![image.png](../assets/image_1662837864357_0.png){:height 266, :width 550}
-
- ### The Memory Subsystem
collapsed:: true
- How is the memory divided into storage locations? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T22:34:12.415Z
card-last-reviewed:: 2022-10-01T17:34:12.416Z
card-last-score:: 5
- Memory is divided into a set of storage locations which can hold data.
- Locations are numbered.
- Addresses are used to tell the memory which location the processor wants to access.
- What are the two hierarchies of memory? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:22:16.754Z
card-last-score:: 1
- **1. Nonvolatile / ROM (Read Only Memory):** Read only memory.
- Used to store the BIOS and / or a *bootstrap* or *bootloader* program.
- **2. Volatile / RAM (Random Access Memory):** Read / Write memory.
- Also called **Primary Memory**.
- Used to hold the programs, operating system, and data required by the computer.
- #### Primary Memory
- How is primary memory connected to the CPU? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:43:34.991Z
card-last-reviewed:: 2022-10-03T11:43:34.992Z
card-last-score:: 5
- **Primary Memory** is directly connected to the Central Processing Unit of the computer.
- It must be present for the CPU to function correctly.
- What are the three types of Primary Storage? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-06T23:00:00.000Z
card-last-reviewed:: 2022-10-06T09:40:50.799Z
card-last-score:: 1
- **1. Processor Register:**
- Contains information that the CPU needs to carry out the current instruction.
- **2. Cache Memory:**
- A special type of internal memory used by many CPUs to increase their **throughput**.
- **3. Main Memory:**
- Contains the programs that are currently being run and the data that the programs are operating on.
- What is the **address width**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T13:30:08.146Z
card-last-reviewed:: 2022-09-30T08:30:08.147Z
card-last-score:: 3
- The **address width** is the number of bits used to represent an address in memory.
- The **width** limits the amount of memory that a computer can access.
- Most computers use a **64 bit address**, which means that the maximum number of locations is $$ 2^{64} $$, which corresponds to roughly 16 billion gigabytes when each location holds one byte.
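- A quick way to check how the address width limits memory is to compute the number of addressable locations directly. The sketch below assumes one byte per location (byte addressing) and uses binary gigabytes (GiB); the function name `addressable_bytes` is made up for this example.

```python
GIB = 2 ** 30  # one binary gigabyte, in bytes


def addressable_bytes(address_bits, bytes_per_location=1):
    """Number of bytes reachable with an address of the given width."""
    return (2 ** address_bits) * bytes_per_location


for bits in (16, 32, 64):
    total = addressable_bytes(bits)
    print(f"{bits}-bit address: {total} bytes = {total / GIB:g} GiB")
```

With a 64 bit address this gives $$2^{34}$$ GiB, i.e., 16 × $$2^{30}$$ GiB, which is where the rough figure of 16 billion gigabytes comes from.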
- What operations does the memory subsystem support? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:30:33.402Z
card-last-reviewed:: 2022-10-01T17:30:33.403Z
card-last-score:: 5
- The memory subsystem supports two operations:
- **Load** (or read) + the address of the data location to be read.
- **Store** (or write) + the address of the location & the data to be written.
- How many bytes may the memory system read or write at a time? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T22:29:45.960Z
card-last-reviewed:: 2022-10-01T17:29:45.960Z
card-last-score:: 5
- Read & Write operations ^^operate at the width of the system's data bus^^, usually 32 bit or 64 bit.
- How is a section of memory addressed? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:24:14.775Z
card-last-reviewed:: 2022-10-01T13:24:14.776Z
card-last-score:: 5
- The address ^^contains only the address of the lowest byte^^, and the number of bytes to be read is specified separately, e.g., 4 bytes.
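- The load & store operations, addressed by the lowest byte plus a length, can be modelled with a small class. This is a sketch under assumptions: the `Memory` class, its method names, and the little-endian byte order used in the usage lines are illustrative choices, not something the notes specify.

```python
class Memory:
    """A toy byte-addressable memory."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def _check(self, address, length):
        if address < 0 or address + length > len(self.cells):
            raise IndexError("access outside of memory")

    def store(self, address, data):
        """Write the bytes in `data`, starting at the lowest address."""
        self._check(address, len(data))
        self.cells[address:address + len(data)] = data

    def load(self, address, length):
        """Read `length` bytes, starting at the lowest address."""
        self._check(address, length)
        return bytes(self.cells[address:address + length])


mem = Memory(64)
mem.store(16, (0x12345678).to_bytes(4, "little"))   # store a 4 byte word
word = int.from_bytes(mem.load(16, 4), "little")    # load it back
print(hex(word))
```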
- #### Memory Alignment & Words of Data
- When the computer's **word size** is 4 bytes, the data to be read should be at a memory address which is ^^some multiple of four.^^
- When this is not the case, e.g., the data starts at address 14 instead of 16, then the computer has to read two or more 4 byte chunks and do some calculation before the requested data has been read, or it may generate ^^an alignment fault.^^
- Even though the previous data structure ends at, for example, address 13, the next data structure should start at address 16. Two **padding bytes** are inserted between the two data structures at addresses 14 & 15 to align the next data structure at address 16.
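- The padding in the example above can be computed with a little modular arithmetic. A minimal sketch, assuming a 4 byte word size; the helper names `padding_needed` and `align_up` are made up for illustration.

```python
def padding_needed(address, alignment=4):
    """Number of padding bytes needed to reach the next aligned address."""
    return (-address) % alignment


def align_up(address, alignment=4):
    """Smallest aligned address that is greater than or equal to `address`."""
    return address + padding_needed(address, alignment)


# The previous structure ends at address 13, so the next free byte is 14.
next_free = 14
print(padding_needed(next_free), align_up(next_free))
```

An address that is already a multiple of the word size needs no padding, so `align_up(16)` stays at 16.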
- ### The I/O Subsystem
- What are **input devices**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T16:11:40.506Z
card-last-reviewed:: 2022-09-30T12:11:40.507Z
card-last-score:: 5
- Anything that feeds data into the computer.
- What are **output devices**? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T16:12:09.769Z
card-last-reviewed:: 2022-09-30T12:12:09.770Z
card-last-score:: 5
- Display / transmit information back to the user.
- What does the **I/O Subsystem** contain? #card
collapsed:: true
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T16:11:38.831Z
card-last-reviewed:: 2022-09-30T12:11:38.832Z
card-last-score:: 5
- The **I/O Subsystem** contains the devices that the computer uses to communicate with the outside world and to store data.
- How do I/O devices communicate with the processor? #card
collapsed:: true
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T12:14:49.335Z
card-last-score:: 1
- I/O devices usually communicate with the processor using the **I/O Bus**.
- PCs use the **PCI Express (Peripheral Component Interconnect Express)** bus for their I/O bus.
- The Operating System needs a **device driver** to access a given I/O device.
- What is a **device driver**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:23:41.787Z
card-last-reviewed:: 2022-10-01T13:23:41.788Z
card-last-score:: 5
- A **device driver** is a program that allows the OS to control an I/O device.
- #### I/O Read / Write Operations
- The I/O read & write operations are similar to the memory read & write operations.
- How does the processor address I/O devices? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:35:03.201Z
card-last-score:: 1
- A processor may use:
- **Memory-Mapped I/O:** when the address of the I/O device is in the **direct memory space**, and the ^^sequences to read/write data in the device are the same as the memory read/write sequences.^^
- **Isolated I/O:** similar process to Memory-Mapped I/O, but the processor has a ^^second set of control signals to distinguish between a **memory access** and an **I/O access**.^^
- What is **IO/M**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.22
card-next-schedule:: 2022-10-04T09:11:16.531Z
card-last-reviewed:: 2022-09-30T09:11:16.532Z
card-last-score:: 3
- **IO/M** is a **status signal** in **Isolated I/O** that denotes whether the read/write operation pertains to the memory or to the I/O subsystem.
- When the **signal is low** (IO/M = 0), i.e., IO/M is `true`, it denotes **memory-related operations**.
- When the **signal is high**, (IO/M = 1), i.e., IO/M is `false`, it denotes an **I/O operation**.
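- The difference between the two addressing schemes can be sketched with two small read functions. Everything here is hypothetical: the `IO_BASE` boundary, the dictionaries standing in for memory and devices, and the function names. The only detail taken from the notes is the IO/M convention (0 selects memory, 1 selects I/O).

```python
IO_BASE = 0xFF00  # hypothetical start of the device region in the address space


def memory_mapped_read(address, memory, devices):
    """Memory-mapped I/O: one address space, devices occupy part of it."""
    if address >= IO_BASE:
        return devices[address - IO_BASE]
    return memory[address]


def isolated_read(io_m, address, memory, ports):
    """Isolated I/O: the IO/M signal picks memory (0) or an I/O port (1)."""
    return ports[address] if io_m == 1 else memory[address]


memory = {0x0010: 0xAA}
devices = {0x00: 0x55}   # device register at offset 0 of the device region
ports = {0x0010: 0x77}   # a port that shares its number with a memory address

print(hex(memory_mapped_read(0x0010, memory, devices)))
print(hex(memory_mapped_read(0xFF00, memory, devices)))
print(hex(isolated_read(0, 0x0010, memory, ports)))
print(hex(isolated_read(1, 0x0010, memory, ports)))
```

With isolated I/O the same numeric address can refer to two different things, which is why the extra control signal is needed.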
-
-
-
- ## Programs
- What are programs? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-11T18:29:21.265Z
card-last-reviewed:: 2022-10-01T13:29:21.266Z
card-last-score:: 5
- Programs are ^^sequences of instructions^^ that tell the computer what to do.
- To the computer, a program is made out of a ^^sequence of numbers that represent individual operations.^^
- These operations are known as **machine instructions** or just **instructions**.
- A set of instructions that a processor can execute is known as an **instruction set**.
- ### Program Development Tools
collapsed:: true
- What is a **high-level programming language**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:17:29.873Z
card-last-reviewed:: 2022-10-01T13:17:29.874Z
card-last-score:: 5
- A **high-level programming language** is a ^^portable language^^ such as C that is ^^composed of words & algebraic notation^^ that can be translated by a compiler into **assembly language**.
- What is a **compiler**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-11T16:12:39.064Z
card-last-reviewed:: 2022-09-30T12:12:39.064Z
card-last-score:: 5
- A **compiler** is a program that translates statements in a given high-level language into assembly language statements.
- What is an **assembler**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T17:19:13.686Z
card-last-reviewed:: 2022-10-01T13:19:13.686Z
card-last-score:: 5
- An **assembler** is a program that translates symbolic, assembly language versions of instructions into the ^^binary version.^^
- What is **Assembly Language**? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-11-07T03:25:47.210Z
card-last-reviewed:: 2022-10-04T12:25:47.210Z
card-last-score:: 5
- **Assembly Language** is a ^^symbolic representation^^ of **machine instructions**.
- What is **Machine Language**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-04T09:26:52.257Z
card-last-reviewed:: 2022-09-30T09:26:52.258Z
card-last-score:: 3
- **Machine Language** is a ^^binary representation^^ of **machine instructions**.
- What is an **instruction**? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-14T15:37:20.118Z
card-last-reviewed:: 2022-10-03T11:37:20.118Z
card-last-score:: 5
- An **instruction** is a command that the computer hardware understands & obeys.
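- To make the relationship between assembly language and machine language concrete, here is a toy assembler. It is a sketch only: the three mnemonics, their opcode values, and the 8 bit format (4 bit opcode followed by a 4 bit operand) are invented for this example and do not match any real instruction set.

```python
# Hypothetical opcode table for an invented 8 bit instruction format.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def assemble(line):
    """Translate one symbolic instruction, e.g. "ADD 3", into its binary form."""
    mnemonic, operand_text = line.split()
    operand = int(operand_text)
    if not 0 <= operand <= 0b1111:
        raise ValueError("operand must fit in 4 bits")
    return (OPCODES[mnemonic] << 4) | operand


source = ["LOAD 5", "ADD 3", "STORE 9"]
machine_code = [assemble(line) for line in source]
for line, word in zip(source, machine_code):
    print(f"{line:<8} -> {word:08b}")
```

The left column is the symbolic (assembly) form and the right column is the binary (machine) form of the same instruction.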
-
- ## Operating Systems
- What is an **Operating System**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.46
card-next-schedule:: 2022-10-03T23:00:00.000Z
card-last-reviewed:: 2022-10-03T14:28:14.754Z
card-last-score:: 1
- Possible definition: a program that runs on the computer that ^^knows about all the hardware^^ and usually ^^runs in privileged mode^^, having ^^access to physical resources that user programs can't control^^, and has the ^^ability to start & stop user programs.^^
- The OS is responsible for managing the physical resources of complex systems, such as PCs, workstations, mainframe computers, etc.
- It is also responsible for ^^loading & executing programs^^ and ^^interfacing with the users.^^
- Usually, there is no operating system for **small embedded systems**.
- Computers designed for one specific task.
-
- ### Multiprogramming
- What is **Multiprogramming**? #card
card-last-interval:: 10.24
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-10-10T17:10:38.026Z
card-last-reviewed:: 2022-09-30T12:10:38.027Z
card-last-score:: 3
- **Multiprogramming** is a technique that allows the system to ^^present the illusion that multiple programs are running on the computer simultaneously.^^
- Many multiprogrammed computers are **multiuser**.
- They allow multiple users to be logged in at a time.
- How is multiprogramming achieved? #card
card-last-interval:: 11.2
card-repeats:: 3
card-ease-factor:: 2.8
card-next-schedule:: 2022-10-12T21:35:33.411Z
card-last-reviewed:: 2022-10-01T17:35:33.411Z
card-last-score:: 5
- Multiprogramming is achieved by ^^switching rapidly between programs.^^
- How does the processor decide which process to execute next? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-12T17:37:46.365Z
card-last-reviewed:: 2022-10-03T11:37:46.366Z
card-last-score:: 3
- **FCFS - First Come, First Served:** processes are moved to the CPU in the order in which they arrive.
- **SJN - Shortest Job Next:** looks at all processes in the **ready state** and dispatches the one with the smallest service time.
- **Round Robin:** distributes the processing time equitably among all ready processes.
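- The three policies above can be compared on a small example. This is a simplified sketch: it assumes that all processes are already in the ready state at time 0, that service times are known in advance, and that the Round Robin time quantum is a fixed number of time units. The function names and the `(name, service_time)` tuples are illustrative.

```python
from collections import deque


def fcfs(processes):
    """First Come, First Served: run in arrival order."""
    return [name for name, _ in processes]


def sjn(processes):
    """Shortest Job Next: run the ready process with the smallest service time."""
    return [name for name, _ in sorted(processes, key=lambda p: p[1])]


def round_robin(processes, quantum):
    """Round Robin: give each ready process at most `quantum` time units in turn."""
    queue = deque(processes)
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        ran = min(quantum, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            queue.append((name, remaining - ran))  # back to the end of the queue
    return timeline


ready = [("A", 5), ("B", 2), ("C", 3)]  # (name, service time), in arrival order
print(fcfs(ready))
print(sjn(ready))
print(round_robin(ready, quantum=2))
```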
- ### Context Switching
- What is a **Context Switch**? #card
card-last-interval:: 9.28
card-repeats:: 3
card-ease-factor:: 2.32
card-next-schedule:: 2022-10-10T19:19:02.676Z
card-last-reviewed:: 2022-10-01T13:19:02.677Z
card-last-score:: 3
- When a program's timeslice ends, the OS stops it, removes it, and gives another program control over the processor.
- This is a **context switch**.
- How does the OS go about a Context Switch? #card
card-last-interval:: 8.32
card-repeats:: 3
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-11T18:37:36.618Z
card-last-reviewed:: 2022-10-03T11:37:36.618Z
card-last-score:: 3
- copies the current program's register file into memory
- restores the contents of the next program's register file into the processor
- starts executing the next program
- From the program's point of view, ^^no program can tell that a context switch has been performed.^^
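- The three steps of a context switch can be sketched as follows. The `Program` and `Processor` classes and the register names are hypothetical; a real OS saves far more state than this toy register file.

```python
from dataclasses import dataclass, field


@dataclass
class Program:
    """A hypothetical user program with a saved copy of its registers."""
    name: str
    saved_registers: dict = field(default_factory=dict)


class Processor:
    """A toy processor whose register file is a plain dict."""

    def __init__(self):
        self.registers = {"pc": 0, "r0": 0, "r1": 0}
        self.current = None

    def context_switch(self, next_program):
        # 1. Copy the current program's register file into memory.
        if self.current is not None:
            self.current.saved_registers = dict(self.registers)
        # 2. Restore the next program's register file into the processor
        #    (a program that has never run starts with zeroed registers).
        if next_program.saved_registers:
            self.registers = dict(next_program.saved_registers)
        else:
            self.registers = {"pc": 0, "r0": 0, "r1": 0}
        # 3. Start executing the next program.
        self.current = next_program


cpu = Processor()
prog_a, prog_b = Program("A"), Program("B")

cpu.context_switch(prog_a)
cpu.registers["r0"] = 42     # A does some work
cpu.registers["pc"] = 8

cpu.context_switch(prog_b)   # A's registers are saved, B starts fresh
cpu.registers["r0"] = 7

cpu.context_switch(prog_a)   # A resumes with the registers it had before
print(cpu.registers)
```

Because A sees exactly the register values it had before being stopped, it cannot tell that B ran in between.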
- ### Protection
- Three rules of Protection in multiprogrammed computers: #card
card-last-interval:: 17.31
card-repeats:: 4
card-ease-factor:: 2.08
card-next-schedule:: 2022-10-20T21:34:44.306Z
card-last-reviewed:: 2022-10-03T14:34:44.306Z
card-last-score:: 3
- 1. The result of any program running on the multiprogrammed computer ^^must be the same as if the program were the only program running on the computer.^^
- 2. Programs ^^must not be able to access other programs' data^^ and must be confident that their data will not be modified by other programs (for security and privacy).
- 3. Programs ^^must not interfere with other programs' use of I/O devices.^^
- How is protection achieved? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.36
card-next-schedule:: 2022-09-30T23:00:00.000Z
card-last-reviewed:: 2022-09-30T09:24:47.114Z
card-last-score:: 1
- Protection is achieved by the ^^operating system having full control over the resources of the system (processor, memory, and I/O devices)^^ through:
- **Privileged Mode:** the operating system is the only program that can control the physical resources; it executes in privileged mode.
- User programs execute in **user mode**.
- **Virtual Memory:** each program operates as if it were the only program on the computer, occupying a full set of the address space in its virtual space.
- The OS is *translating* memory addresses that the program references into physical addresses used by the memory system.
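- The address translation described above can be illustrated with a per-program lookup table. This is a sketch under assumptions: the notes do not say how the translation is done, so the page-table scheme, the 4096 byte page size, and the names `translate`, `table_a`, and `table_b` are all illustrative.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one using a per-program page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"virtual page {page} is not mapped")
    return page_table[page] * PAGE_SIZE + offset


# Two programs use the same virtual address but get different physical frames.
table_a = {0: 5, 1: 9}   # virtual page -> physical frame for program A
table_b = {0: 2, 1: 3}   # virtual page -> physical frame for program B

phys_a = translate(table_a, 4100)   # virtual page 1, offset 4
phys_b = translate(table_b, 4100)
print(phys_a, phys_b)
```

Each program can behave as if it owned the whole address space, while the two tables keep their data in separate physical locations.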
-
-
- **Next Topic:** [[Programming Models]]

Some files were not shown because too many files have changed in this diff