Files
uni/year2/semester1/logseq-stuff/pages/Introduction to SQL & DDL.md

364 lines
17 KiB
Markdown

- #[[CT230 - Database Systems I]]
- **Previous Topic:** [[The Relational Model]]
- **Next Topic:** [[SQL DML Statement]]
- **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf)
-
- # SQL
- What is **SQL**? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-18T17:47:35.469Z
card-last-reviewed:: 2022-11-17T09:47:35.470Z
card-last-score:: 5
- **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems.
- ### Features of SQL
- SQL is based on *relational algebra*.
- All relational, set, and hybrid operators are supported.
- SQL also has additional operators to allow easier query development.
- SQL has been *standardised* since 1987.
- The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part.
- Recent standards include XML-related features in addition to many others, including JSON data types.
- ### ANSI/ISO SQL
- Despite standards, there can be a lack of portability between database systems due to:
- Complexity & size of standards (not all vendors will implement all of the standard).
- The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base.
- The vendor may want to maintain backward compatibility.
- The vendor may want to maintain "Vendor lock-in".
- What is the **standardised SQL syntax** comprised of? #card
card-last-interval:: 10.6
card-repeats:: 3
card-ease-factor:: 2.56
card-next-schedule:: 2022-11-25T06:29:02.836Z
card-last-reviewed:: 2022-11-14T16:29:02.836Z
card-last-score:: 5
- The **standardised SQL syntax** comprises 3 components:
- **DDL -** Data Definition Language
- **DCL -** Data Control Language
- **DML -** Data Manipulation Language
- ### DCL: Data Control Language
- What is **DCL** used for? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-16T04:21:50.434Z
card-last-reviewed:: 2022-11-14T20:21:50.434Z
card-last-score:: 5
- **Data Control Language** is used to control access to the database & to database relations.
- It is the role of the **database administrator**.
- Very important in multi-user systems.
- What are the typical **DCL** commands? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-26T06:18:39.799Z
card-last-reviewed:: 2022-11-14T20:18:39.800Z
card-last-score:: 5
- ```sql
GRANT
REVOKE
```
- These can be used to:
- grant / revoke access to the database.
- grant / revoke access to individual relations.
-
- ### DDL: Data Definition Language
- What is **DDL**? #card
card-last-interval:: 86.42
card-repeats:: 5
card-ease-factor:: 2.66
card-next-schedule:: 2023-02-09T06:22:06.927Z
card-last-reviewed:: 2022-11-14T20:22:06.927Z
card-last-score:: 5
- **Data Definition Language** is a standardised language to ^^define the schema of a database.^^
- It's the back-end of "design" options on the Interface (e.g., Create options).
- What are the typical **DDL** commands? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:36:00.957Z
card-last-score:: 1
- The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes.
- Common DDL keywords include:
- ```sql
CREATE
ALTER
DROP
ADD
CONSTRAINT
```
- #### Create a table, its indexes, & constraints
- Steps:
- 1. Specify **table** (relation) name.
2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**.
3. Specify the **Primary Key** of the table: choose one or more attributes.
4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step).
- Steps 1-3 ^^must be completed for all tables.^^
- #### Data Types
- The main data types are **strings**, **numeric**, and **date/time**.
-
- **Strings**
- What can **strings** contain? #card
card-last-interval:: 33.64
card-repeats:: 4
card-ease-factor:: 2.9
card-next-schedule:: 2022-12-18T07:37:34.017Z
card-last-reviewed:: 2022-11-14T16:37:34.018Z
card-last-score:: 5
- **Strings** can contain ^^letters, numbers, & special characters.^^
- Types of string: #card
card-last-interval:: 23.43
card-repeats:: 4
card-ease-factor:: 2.42
card-next-schedule:: 2022-12-08T06:03:14.822Z
card-last-reviewed:: 2022-11-14T20:03:14.822Z
card-last-score:: 3
- `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1.
- `VARCHAR(size)` is a string of **variable length**. `size` can be from 0 to 65,535.
- if `size` is not specified, it is unlimited
- `TEXT` is the same thing as `VARCHAR` except it is unlimited by default, and takes no argument `size`.
-
- **Date/Time**
- Types of date/time: #card
card-last-interval:: 47.41
card-repeats:: 5
card-ease-factor:: 2.28
card-next-schedule:: 2023-01-01T05:18:44.919Z
card-last-reviewed:: 2022-11-14T20:18:44.919Z
card-last-score:: 3
- `DATE` Format: YYYY-MM_DD
- `TIME` Format: hh:mm:ss
- `DATETIME` Format: YYYY-MM-DD hh:mm:ss
- `YEAR` A year in four-digit format
-
- **Numeric**
- The maximum `size` value is 255.
- MySQL supports **unsigned** numeric types but not all DBMS do.
- Types of numerics: #card
card-last-interval:: 19.01
card-repeats:: 4
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-03T16:48:07.382Z
card-last-reviewed:: 2022-11-14T16:48:07.383Z
card-last-score:: 3
- `INTEGERS` - *See next block*.
- `BOOL / BOOLEAN` - 0 is False; non-zero is True.
- `FLOAT` - A floating-point number. 4 bytes, single precision.
- `DOUBLE` - A floating-point number. 8 bytes, double precision.
- `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number.
- `size` = total number of digits (max 65, default 10)
- `d` = number of digits after the decimal point (max 30, default 0)
-
- **Integers**
- Types of integers:
- | **Type** | **Bytes** | **Range** |
| `TINYINT` | 1 | -128 to 127 |
| `SMALLINT` | 2 | -32,768 to 32,767 |
| `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 |
| `INT` | 4| -2,147,483,648 to 2,147,483,647 |
| `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
- ^^Note:^^ the number in brackets next to integers only refers to the number of digits to display, not size.
-
- **Others**
- Unicode char/string
- Binary
- Blob, Json, etc.
- ### DML: Data Manipulation Language
- What is **DML**? #card
card-last-interval:: 25.4
card-repeats:: 4
card-ease-factor:: 2.52
card-next-schedule:: 2022-12-12T18:47:31.780Z
card-last-reviewed:: 2022-11-17T09:47:31.780Z
card-last-score:: 5
- **Data Manipulation Language** is a standardised language used for ^^adding, deleting, & modifying data in a database.^^
- What are the typical **DML** commands? #card
card-last-interval:: 15.05
card-repeats:: 4
card-ease-factor:: 1.94
card-next-schedule:: 2022-11-29T17:40:29.100Z
card-last-reviewed:: 2022-11-14T16:40:29.100Z
card-last-score:: 3
- ```sql
INSERT -- insert data
SELECT -- query (select) data
UPDATE -- update data
DELETE -- delete data
```
-
- # Autonumber
- What does `AUTO_INCREMENT` do in MySQL? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-12-13T03:03:50.803Z
card-last-reviewed:: 2022-11-14T20:03:50.803Z
card-last-score:: 5
- Specifying an attribute to `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^
- Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored.
- Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted.
- By default, `AUTO_INCREMENT` is **1**, and is incremented by 1 for each new tuple inserted.
-
- # Constraints
- ## Types of Constraints
- ### Foreign Keys
- What is the syntax for specifying **Foreign Keys**? #card
card-last-interval:: 41.44
card-repeats:: 5
card-ease-factor:: 2.18
card-next-schedule:: 2022-12-26T06:21:18.467Z
card-last-reviewed:: 2022-11-14T20:21:18.468Z
card-last-score:: 3
- ```sql
FOREIGN KEY (attributename) REFERENCES tablename(attributename)
```
- You need to specify:
- The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint.
- The attribute name(s) that will identify the foreign keys in the current table.
- If there is more than one attribute, they should be separated by commas.
- Attribute names should be enclosed in brackets.
- The keyword `REFERENCES` to specify the attribute that the foreign key references.
- The table name and the attribute name of the attribute being referenced by the foreign key.
- Again, the attribute name(s)should be in brackets.
- The table name should be **outside** the brackets.
- ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^
- ### Using `ALTER` to Modify Design
- **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists.
- If you want to create everything but the foreign keys initially, you can add a foreign key later using the `ALTER TABLE` command
-
- #### Syntax for `ALTER` Command
- To add a constraint:
- ```SQL
ALTER TABLE tablename
ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attribute name);
```
- To add an attribute (column) constraint:
- ```SQL
ALTER TABLE tablename
ADD attributename DATATYPE;
```
- ### Domain Constraints
id:: 6321ba81-0b92-447e-9c6f-1953528d51a8
- What is the **Domain Constraint**? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:38:56.391Z
card-last-score:: 1
- The value of each attribute A must be an **atomic** value from the **domain** dom(A).
- Essentially: ^^the data types & formats must match to that specified.^^
- ### Entity Integrity Constraints (Primary Key Constraints)
id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba
- What is the **Primary Key / Entity Integrity Constraint**? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-16T00:47:09.544Z
card-last-reviewed:: 2022-11-14T16:47:09.544Z
card-last-score:: 5
- The **primary key** should uniquely identify each tuple in a relation.
- This means:
- No duplicate values allowed for the primary key
- No `NULL`values allowed for the primary key
- **Note:** `NULL` values may possibly also not be permitted for other attributes.
- We often see this constraint when filling out forms online ("*required") and the constraint is often necessary for non-key attributes.
- However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary.
-
- ### Referential Integrity Constraints
- What are **Referential Integrity Constraints**? #card
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.56
card-last-reviewed:: 2022-11-14T20:19:00.925Z
- **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database must **not** contain any unmatched foreign keys.^^
- Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^
- In addition, a value for a foreign key attribute **must** exist already as a candidate key value.
- Essentially: "no unmatched foreign keys".
-
- ### Semantic Integrity Constraints
- What are **Semantic Integrity Constraints**? #card
card-last-interval:: 56.69
card-repeats:: 5
card-ease-factor:: 2.42
card-next-schedule:: 2023-01-07T03:37:12.551Z
card-last-reviewed:: 2022-11-11T11:37:12.551Z
card-last-score:: 5
- **Semantic Integrity Constraints** ensure that the data entered into a row reflects an allowable value for that row. The value must be within the *domain*, or allowable set of values, for that column.
- How are **Semantic Integrity Constraints** specified? #card
card-last-interval:: -1
card-repeats:: 1
card-ease-factor:: 2.6
card-next-schedule:: 2022-11-15T00:00:00.000Z
card-last-reviewed:: 2022-11-14T16:42:45.174Z
card-last-score:: 1
- **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*.
- What are the two types of **Semantic Integrity Constraints**? #card
card-last-interval:: 4
card-repeats:: 2
card-ease-factor:: 2.18
card-next-schedule:: 2022-11-25T13:05:15.346Z
card-last-reviewed:: 2022-11-21T13:05:15.347Z
card-last-score:: 3
- **State Constraints:** Constrain an entity to being in certain states.
- **Transition Constraints:** Constrain an entity to only being updated in certain ways.
- ## Setting Constraints
- **Domain Constraints** are set automatically once the data type is chosen.
- **Entity Constraints** are also set automatically once a primary key has been chosen.
- Usually default constraints are set for foreign keys, but these can be changed.
-
- ## Update Operations & Constraint Violations
- The DBMS must check that the constraints are not violated whenever **update operations** are applied.
-
- ### Insert Operation
- What does the **Insert Operation** do? #card
card-last-interval:: 31.36
card-repeats:: 4
card-ease-factor:: 2.8
card-next-schedule:: 2022-12-16T00:47:14.600Z
card-last-reviewed:: 2022-11-14T16:47:14.600Z
card-last-score:: 5
- The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$.
- This can happen directly via the interface or via the query.
- If a constraint is violated, the DBMS will reject the insertion - usually with an explanation.
- ### Delete Operation
- How can a **Delete Operation** violate constraints? #card
card-last-interval:: 28.3
card-repeats:: 4
card-ease-factor:: 2.66
card-next-schedule:: 2022-12-13T03:03:33.858Z
card-last-reviewed:: 2022-11-14T20:03:33.859Z
card-last-score:: 5
- A **delete operation** can only violate **integrity constraints**, i.e., if the tuple being deleted is referenced by the foreign key from other tuples.
- The DBMS can:
- reject deletion, usually within an explanation.
- attempt to *cascade* deletion.
- modify referencing attribute.
- #### Update Operation
card-last-score:: 1
card-repeats:: 1
card-next-schedule:: 2022-10-07T23:00:00.000Z
card-last-interval:: -1
card-ease-factor:: 2.5
card-last-reviewed:: 2022-10-07T10:32:05.858Z
- What is an **Update** Operation? #card
card-last-interval:: 8.88
card-repeats:: 3
card-ease-factor:: 2.22
card-next-schedule:: 2022-11-20T08:30:42.712Z
card-last-reviewed:: 2022-11-11T11:30:42.713Z
card-last-score:: 3
- An **update** operation is used to change the values of one or more attributes in a tuple of a table.
- Issues already discussed with insert & delete could arise with this operation, specifically:
- If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place.
- If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the reference relation.
- ### Cascade Update & Delete
- Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well.
- Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as:
- ```sql
DELETE FROM employee
WHERE ssn = 12345678;
```