- #CT230 - Database Systems I
- Previous Topic: Normalisation
- Next Topic: Query Processing & Optimisation
- Relevant Slides:
-
Query Processing
- What is Query Processing? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:16:17.961Z
card-last-score:: 1
- Query Processing transforms SQL (a high-level language) into a correct & efficient low-level language representation of relational algebra.
- Each relational algebra operator has code associated with it, which, when run, performs the operation on the data specified, allowing the specified data to be output as the result.
-
Steps Involved in Processing an SQL Query #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:52:53.439Z
card-last-score:: 1
- Process (Parse & Translate) the query and create an internal representation of the query.
- This may be an Operator Tree, Query Tree, or Query Graph (for more complicated queries).
-
- Optimise.
- Execute / Evaluate returning results.
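- As a rough illustration of an internal representation and its evaluation, the sketch below builds a tiny operator tree from nested tuples and evaluates it recursively. The tuple layout, the `evaluate` function, and the sample `employee` rows are all hypothetical, and the optimisation step is skipped entirely.

```python
# Toy operator tree built from nested tuples (a hypothetical layout, for illustration only).

def evaluate(node, database):
    """Recursively evaluate an operator tree against a dict of named relations."""
    op = node[0]
    if op == "relation":              # leaf: look up a stored table by name
        return database[node[1]]
    if op == "select":                # ("select", predicate, child)
        _, predicate, child = node
        return [row for row in evaluate(child, database) if predicate(row)]
    if op == "project":               # ("project", attribute_names, child)
        _, attributes, child = node
        seen, result = set(), []
        for row in evaluate(child, database):
            key = tuple(row[a] for a in attributes)
            if key not in seen:       # relations are sets, so drop duplicates
                seen.add(key)
                result.append(dict(zip(attributes, key)))
        return result
    raise ValueError(f"unknown operator: {op}")

# Hypothetical sample data.
database = {
    "employee": [
        {"fname": "John", "dno": 5},
        {"fname": "Alicia", "dno": 4},
        {"fname": "Franklin", "dno": 5},
    ]
}

# Tree for: SELECT DISTINCT fname FROM employee WHERE dno = 5
tree = ("project", ["fname"],
        ("select", lambda row: row["dno"] == 5,
         ("relation", "employee")))

result = evaluate(tree, database)
print(result)
```

- Each branch of `evaluate` plays the role of the code associated with one relational algebra operator; a real system would choose between several implementations of each operator during optimisation.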
-
- What do you need to translate SQL to Relational Algebra? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:21:00.003Z
card-last-score:: 1
- To translate SQL to Relational Algebra, you must have a meaningful set of relational algebra operators, and a mapping (translation) between SQL code & relational algebra expressions.
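- As a rough illustration of such a mapping, consider the hypothetical query `SELECT DISTINCT fname FROM employee WHERE dno = 5` (assuming `employee` has `fname` and `dno` attributes). The `WHERE` clause maps to a selection, the select list maps to a projection, and the `FROM` clause names the input relation:

```latex
\pi_{\text{fname}}\left(\sigma_{\text{dno = 5}}(\text{employee})\right)
```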
-
Relational Algebra
- Two formal languages exist for the relational model:
- Relational Algebra (procedural).
- Relational Calculus (non-procedural).
- Both are logically equivalent.
-
Relational Algebra Operations
- A basic set of operations exists for the relational model.
- These allow for the specification of basic retrieval requests.
- A sequence of Relational Algebra (RA) operations forms a relational algebra expression.
- RA operations are divided into two groups:
- Operations based on mathematical set theory (e.g., union, product, etc.).
- Specific relational database operations.
-
Relational Algebra vs SQL #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:53:39.774Z
card-last-score:: 1
- The core operations & functions (i.e., programs) in the internal modules of most relational database systems are based on relational algebra.
- SQL is a declarative language - It allows you to specify the results that you require, not the order of the operations to retrieve these results.
- Relational Algebra is procedural - We must specify exactly how to retrieve results when using relational algebra.
-
Relational Algebra Expressions
- A valid relational algebra expression is built by connecting tables or expressions with defined unary & binary operators & their arguments (if applicable).
- Temporary relations resulting from a relational algebra expression can be used as input to a new relational algebra expression.
- Expressions in brackets are evaluated first.
- Relational Algebra operators are either unary or binary.
-
Working With the RelaX Calculator
- There is no standard language for relational algebra like there is for SQL.
- One university group has developed a calculator that supports a fairly common standard.
- Note that it is Case Sensitive.
- The RelaX calculator provides a number of datasets with the option of also using your own dataset.
-
Loading a Dataset
-
- Go to the "Group Editor" tab.
- Copy text from the file on Blackboard and add.
- Then choose "Preview".
- Then choose "Use group in Editor".
- Note: Only stored temporarily.
-
-
Note on Degrees
- The degree of the relation resulting from a selection on a table $R$ is the same as the degree of $R$, i.e., they have the same number of attributes (columns).
- The selection operation is commutative, i.e., a sequence of selects can be applied in any order.
- E.g.:
-
$$\sigma_{\text{hours < 20 and pno = 10}}\text{works\_on}$$
-
$$\sigma_{\text{pno = 10 and hours < 20}}\text{works\_on}$$
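- Written as a sequence of selects, the same property can be stated as follows, where $p_1$ and $p_2$ are placeholder predicates (symbols introduced here for illustration, e.g. `hours < 20` and `pno = 10`):

```latex
\sigma_{p_1}\left(\sigma_{p_2}(R)\right) = \sigma_{p_2}\left(\sigma_{p_1}(R)\right) = \sigma_{p_1 \text{ and } p_2}(R)
```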
-
-
Relational Algebra: Unary Operators
- Each operation takes one relation or expression as input and gives a new relation as a result.
-
Selection Operator ($\sigma$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:16:42.201Z
card-last-score:: 1
- Used to select certain tuples (rows) from a relation $R$.
- Notation: $\sigma_p R$, where $p$ is the selection predicate (i.e., a condition) and $R$ is a relation / table name.
- Note: The Selection ($\sigma$) operator in relational algebra is not the same as the `SELECT` clause in an SQL query.
  - An SQL `SELECT` query could be equivalent to a combination of relational algebra operators ($\sigma$, $\pi$, or `JOIN`).
-
Example (Using Company Schema)
- Find the projects with pno = 10 and hours worked < 20.
background-color:: green
-
$$\sigma_{\text{hours < 20 AND pno = 10}}\text{works\_on}$$
- Returns the set:
- {(333445555, 10, 10.0), (999887777, 10, 10.0)}
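- A minimal sketch of how a selection could be implemented, assuming a relation is held as a list of dictionaries. The `select` function, the attribute name `essn`, and the sample rows are assumptions made for illustration; only the two rows with under 20 hours on project 10 come from the result given in these notes.

```python
# Hypothetical sample of works_on rows, keyed by attribute name.
works_on = [
    {"essn": 333445555, "pno": 10, "hours": 10.0},
    {"essn": 999887777, "pno": 10, "hours": 10.0},
    {"essn": 987987987, "pno": 10, "hours": 35.0},
    {"essn": 123456789, "pno": 1, "hours": 32.5},
]

def select(predicate, relation):
    """sigma_predicate(relation): keep the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]

# One selection with a compound condition.
combined = select(lambda r: r["hours"] < 20 and r["pno"] == 10, works_on)

# The same result from two selections applied one after the other.
cascaded = select(lambda r: r["hours"] < 20,
                  select(lambda r: r["pno"] == 10, works_on))

print(combined)
print(combined == cascaded)
```

- Every row in the result keeps all three attributes, so the degree of the output matches the degree of `works_on`.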
-
-
Projection Operator ($\pi$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:20:37.894Z
card-last-score:: 1
- Used to return certain attributes / columns.
- Notation:
  - $\pi_{A_1, A_2, \cdots, A_k}(R)$
  - Where $A_1, \cdots, A_k$ are attribute names and $R$ is a relation name.
- The result is a relation with the $k$ attributes listed, in the same order as they appear in the list.
  - Duplicate tuples are removed from the result.
- Note: Commutativity does not hold.
-
Example (Using Company Schema)
- List all the department numbers where employees work.
background-color:: green
-
$$\pi_{\text{dno}}\text{employee}$$
- Returns: {5,4,1}.
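- A minimal sketch of projection with duplicate removal, assuming rows are dictionaries. The `project` function and the sample `employee` rows (including the `fname` attribute) are hypothetical; they are chosen so that the departments appear in the order 5, 4, 1.

```python
# Hypothetical employee rows; only fname and dno are modelled here.
employee = [
    {"fname": "John", "dno": 5},
    {"fname": "Franklin", "dno": 5},
    {"fname": "Alicia", "dno": 4},
    {"fname": "Jennifer", "dno": 4},
    {"fname": "James", "dno": 1},
]

def project(attributes, relation):
    """pi_attributes(relation): keep the listed columns, in the listed order, without duplicates."""
    seen = set()
    result = []
    for row in relation:
        projected = tuple(row[a] for a in attributes)
        if projected not in seen:     # a relation is a set, so repeated tuples are dropped
            seen.add(projected)
            result.append(projected)
    return result

departments = project(["dno"], employee)
print(departments)
```

- Five input rows collapse to three output tuples because the repeated department numbers are removed.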
-
-
Rename Operators ($\rho$ & $\leftarrow$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:23:44.470Z
card-last-score:: 1
- Rename Operation ($\rho$).
  - Notation: $\rho_x(E)$, where the result of the expression $E$ is saved with the name $x$.
-
Order Operator ($\tau$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-22T00:00:00.000Z
card-last-reviewed:: 2022-11-21T13:09:00.157Z
card-last-score:: 1
- Used to order by certain columns from a relation $R$.
- Notation: $\tau_{A_1, A_2, \cdots, A_k}R$, where $A_1, A_2, \cdots, A_k$ are attributes with either ASC or DESC.
-
Group By Operator ($\gamma$)
- Used to group by certain columns from a relation $R$.
-
Aggregate Functions Supported by RelaX
- (Not part of Relational Algebra.)
- `COUNT(*)`.
- `COUNT(column)`.
- `MIN(column)`.
- `MAX(column)`.
- `SUM(column)`.
- `AVG(column)`.
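- To make ordering, grouping, and the aggregate functions concrete, here is a small sketch in plain Python. It does not use RelaX syntax; the `group_by` and `order_by` helpers and the sample `works_on` rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical works_on rows as (essn, pno, hours) tuples.
works_on = [
    (333445555, 10, 10.0),
    (999887777, 10, 10.0),
    (987987987, 10, 35.0),
    (123456789, 1, 32.5),
    (123456789, 2, 7.5),
]

def group_by(key_index, value_index, relation):
    """Group rows on one column and compute COUNT(*), SUM, MIN, MAX and AVG of another."""
    groups = defaultdict(list)
    for row in relation:
        groups[row[key_index]].append(row[value_index])
    return [
        {
            "key": key,
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for key, values in groups.items()
    ]

def order_by(column, relation, descending=False):
    """Sort rows on one column, ascending by default (the role of ASC / DESC)."""
    return sorted(relation, key=lambda row: row[column], reverse=descending)

# Hours per project, ordered by project number.
hours_per_project = order_by("key", group_by(1, 2, works_on))
for group in hours_per_project:
    print(group)
```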
-
Binary Operators
- General syntax: `(child_expression) function argument (child_argument)`.
-
Union Operator ($\cup$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:54:46.077Z
card-last-score:: 1
- Notation: $(R) \cup (S)$, where $R$ & $S$ are relations.
- Returns all tuples from $R$ and all tuples from $S$.
- Note: No duplicates will be returned.
-
Intersection Operator ($\cap$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:25:05.543Z
card-last-score:: 1
- Notation: $(R) \cap (S)$, where $R$ & $S$ are relations.
- Returns all tuples from $R$ that are also in $S$.
-
Set Difference ($-$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T20:26:25.449Z
card-last-score:: 1
- Notation: $(R) - (S)$, where $R$ & $S$ are relations.
- Returns tuples that are in relation $R$ but not in $S$.
- Note: $(R) - (S)$ and $(S) - (R)$ are not the same.
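- The three set-based operators map directly onto Python's built-in `set` type, which gives a quick way to check the behaviour described above. The relations `R` and `S` below are hypothetical single-attribute relations.

```python
# Hypothetical union-compatible relations: each is a set of (essn,) tuples.
R = {(123456789,), (333445555,), (999887777,)}
S = {(333445555,), (888665555,)}

union = R | S            # every tuple from R and from S, without duplicates
intersection = R & S     # tuples of R that are also in S
r_minus_s = R - S        # tuples in R but not in S
s_minus_r = S - R        # tuples in S but not in R

print(len(union))
print(intersection)
print(r_minus_s != s_minus_r)
```

- The tuple `(333445555,)` appears in both inputs but only once in the union, and the two differences contain different tuples.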
-
Union Compatibility
- What is Union Compatibility? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T15:50:22.644Z
card-last-score:: 1
- For union, intersection, & minus, relations must be union compatible.
- That is, the schemas of the relations must match, i.e., they have the same number of attributes and each pair of corresponding attributes has the same domain.
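- A check along these lines could be sketched as follows, assuming a schema is written as a list of `(attribute_name, domain)` pairs with Python types standing in for domains. The `union_compatible` function and the example schemas are hypothetical, and the sketch compares only the degree and the domains, as in the definition above.

```python
def union_compatible(schema_r, schema_s):
    """Return True when both schemas have the same degree and matching domains, position by position."""
    if len(schema_r) != len(schema_s):
        return False
    return all(dom_r == dom_s
               for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s))

# Hypothetical schemas: lists of (attribute_name, domain) pairs.
employee_ids = [("ssn", int)]
supervisor_ids = [("super_ssn", int)]
project_names = [("pname", str)]
works_on = [("essn", int), ("pno", int), ("hours", float)]

print(union_compatible(employee_ids, supervisor_ids))
print(union_compatible(employee_ids, project_names))
print(union_compatible(employee_ids, works_on))
```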
-
Cartesian Product Operator ($\times$) (Cross-Join) #card
id:: 636a860b-5170-4227-befa-68876a53c856
card-last-interval:: 0.98
card-repeats:: 2
card-ease-factor:: 2.36
card-next-schedule:: 2022-11-22T12:05:57.590Z
card-last-reviewed:: 2022-11-21T13:05:57.590Z
card-last-score:: 3
- Notation: $(R) \times (S)$, where $R$ & $S$ are relations / tables.
- Returns: Tuples comprising the concatenation (combination) of every tuple in $R$ with every tuple in $S$.
- Note: No condition specified.
-
Cartesian Product vs Join #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:20:31.096Z
card-last-score:: 1
- The main difference between a Cartesian product operator and a Join operator is that, with a Join, only tuples satisfying a condition appear in the result, while in a Cartesian product operator, all combinations of tuples are included in the result.
-
Join Operator ($\Join$) #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:23:36.847Z
card-last-score:: 1
- The Join Operator is a hybrid operator - it is a combination of the Cartesian Product operator ($\times$) & a Select operator ($\sigma$).
- Tables are joined together based on the condition specified.
-
Equi & Theta Joins #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.5
card-next-schedule:: 2022-11-19T00:00:00.000Z
card-last-reviewed:: 2022-11-18T18:37:52.403Z
card-last-score:: 1
- Notation: $(R_1) \Join p (R_2)$, where $p$ is the join condition and $R_1$ & $R_2$ are relations.
- Result: The `JOIN` operation returns all combinations of tuples from relation $R_1$ & relation $R_2$ satisfying the join condition $p$.
- Note: EQUI JOINS use only equality comparisons (`=`) in the join condition $p$.
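- The relationship between a Cartesian product and a join can be sketched as a selection applied to a product. The helper names, the `dnumber` / `dname` attributes, and the sample rows below are hypothetical, and a real system would normally avoid materialising the full product. The second join uses a comparison other than `=`, which is the usual sense of a theta join.

```python
from itertools import product

# Hypothetical rows: employee as (fname, dno), department as (dnumber, dname).
employee = [("John", 5), ("Alicia", 4), ("James", 1)]
department = [(5, "Research"), (4, "Administration")]

def cartesian_product(r, s):
    """R x S: concatenate every tuple of R with every tuple of S."""
    return [a + b for a, b in product(r, s)]

def theta_join(r, s, condition):
    """Join with condition p: a selection with p applied to R x S."""
    return [row for row in cartesian_product(r, s) if condition(row)]

pairs = cartesian_product(employee, department)

# Equi join: dno = dnumber (positions 1 and 2 of the concatenated tuple).
equi = theta_join(employee, department, lambda row: row[1] == row[2])

# Theta join with a non-equality comparison: dno > dnumber.
theta = theta_join(employee, department, lambda row: row[1] > row[2])

print(len(pairs))
print(equi)
print(theta)
```

- The product contains every combination of the three employees and two departments, while each join keeps only the combinations that satisfy its condition.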