- #[[CT230 - Database Systems I]] - **Previous Topic:** [[The Relational Model]] - **Next Topic:** [[SQL DML Statement]] - **Relevant Slides:** ![Lecture02.pdf](../assets/Lecture02_1663148803122_0.pdf) - - # SQL - What is **SQL**? #card card-last-interval:: 31.36 card-repeats:: 4 card-ease-factor:: 2.8 card-next-schedule:: 2022-12-18T17:47:35.469Z card-last-reviewed:: 2022-11-17T09:47:35.470Z card-last-score:: 5 - **Structured Query Language (SQL)** is a special-purpose **programming language** for relational database systems. - ### Features of SQL - SQL is based on *relational algebra*. - All relational, set, and hybrid operators are supported. - SQL also has additional operators to allow easier query development. - SQL has been *standardised* since 1987. - The American National Standards Institute (ANSI) and International Organization for Standardization (ISO) form SQL standard committees. Many vendors also take part. - Recent standards include XML-related features in addition to many others, including JSON data types. - ### ANSI/ISO SQL - Despite standards, there can be a lack of portability between database systems due to: - Complexity & size of standards (not all vendors will implement all of the standard). - The vendor may want to keep the syntax consistent with their other software products / OS or develop features to support their user base. - The vendor may want to maintain backward compatibility. - The vendor may want to maintain "Vendor lock-in". - What is the **standardised SQL syntax** comprised of? #card card-last-interval:: 10.6 card-repeats:: 3 card-ease-factor:: 2.56 card-next-schedule:: 2022-11-25T06:29:02.836Z card-last-reviewed:: 2022-11-14T16:29:02.836Z card-last-score:: 5 - The **standardised SQL syntax** comprises 3 components: - **DDL -** Data Definition Language - **DCL -** Data Control Language - **DML -** Data Manipulation Language - ### DCL: Data Control Language - What is **DCL** used for? #card card-last-interval:: 31.36 card-repeats:: 4 card-ease-factor:: 2.8 card-next-schedule:: 2022-12-16T04:21:50.434Z card-last-reviewed:: 2022-11-14T20:21:50.434Z card-last-score:: 5 - **Data Control Language** is used to control access to the database & to database relations. - It is the role of the **database administrator**. - Very important in multi-user systems. - What are the typical **DCL** commands? #card card-last-interval:: 41.44 card-repeats:: 5 card-ease-factor:: 2.18 card-next-schedule:: 2022-12-26T06:18:39.799Z card-last-reviewed:: 2022-11-14T20:18:39.800Z card-last-score:: 5 - ```sql GRANT REVOKE ``` - These can be used to: - grant / revoke access to the database. - grant / revoke access to individual relations. - - ### DDL: Data Definition Language - What is **DDL**? #card card-last-interval:: 86.42 card-repeats:: 5 card-ease-factor:: 2.66 card-next-schedule:: 2023-02-09T06:22:06.927Z card-last-reviewed:: 2022-11-14T20:22:06.927Z card-last-score:: 5 - **Data Definition Language** is a standardised language to ^^define the schema of a database.^^ - It's the back-end of "design" options on the Interface (e.g., Create options). - What are the typical **DDL** commands? #card card-last-interval:: -1 card-repeats:: 1 card-ease-factor:: 2.22 card-next-schedule:: 2022-11-15T00:00:00.000Z card-last-reviewed:: 2022-11-14T16:36:00.957Z card-last-score:: 1 - The typical DDL tasks include creating, altering, and removing **database objects** such as tables & indexes. - Common DDL keywords include: - ```sql CREATE ALTER DROP ADD CONSTRAINT ``` - #### Create a table, its indexes, & constraints - Steps: - 1. Specify **table** (relation) name. 2. For each attribute in the table, specify **Attribute Name**, **Data Type**, and any **constraints**. 3. Specify the **Primary Key** of the table: choose one or more attributes. 4. Specify **Foreign Keys** *if they exist* and assuming that the attributes & table you are referencing exist (you may have to return to this step). - Steps 1-3 ^^must be completed for all tables.^^ - #### Data Types - The main data types are **strings**, **numeric**, and **date/time**. - - **Strings** - What can **strings** contain? #card card-last-interval:: 33.64 card-repeats:: 4 card-ease-factor:: 2.9 card-next-schedule:: 2022-12-18T07:37:34.017Z card-last-reviewed:: 2022-11-14T16:37:34.018Z card-last-score:: 5 - **Strings** can contain ^^letters, numbers, & special characters.^^ - Types of string: #card card-last-interval:: 23.43 card-repeats:: 4 card-ease-factor:: 2.42 card-next-schedule:: 2022-12-08T06:03:14.822Z card-last-reviewed:: 2022-11-14T20:03:14.822Z card-last-score:: 3 - `CHAR(size)` is a string of **fixed length**. `size` can be from 0 to 255 - the default is 1. - `VARCHAR(size)` is a string of **variable length**. `size` can be from 0 to 65,535. - if `size` is not specified, it is unlimited - `TEXT` is the same thing as `VARCHAR` except it is unlimited by default, and takes no argument `size`. - - **Date/Time** - Types of date/time: #card card-last-interval:: 47.41 card-repeats:: 5 card-ease-factor:: 2.28 card-next-schedule:: 2023-01-01T05:18:44.919Z card-last-reviewed:: 2022-11-14T20:18:44.919Z card-last-score:: 3 - `DATE` Format: YYYY-MM_DD - `TIME` Format: hh:mm:ss - `DATETIME` Format: YYYY-MM-DD hh:mm:ss - `YEAR` A year in four-digit format - - **Numeric** - The maximum `size` value is 255. - MySQL supports **unsigned** numeric types but not all DBMS do. - Types of numerics: #card card-last-interval:: 19.01 card-repeats:: 4 card-ease-factor:: 2.18 card-next-schedule:: 2022-12-03T16:48:07.382Z card-last-reviewed:: 2022-11-14T16:48:07.383Z card-last-score:: 3 - `INTEGERS` - *See next block*. - `BOOL / BOOLEAN` - 0 is False; non-zero is True. - `FLOAT` - A floating-point number. 4 bytes, single precision. - `DOUBLE` - A floating-point number. 8 bytes, double precision. - `DECIMAL(size, d) / DEC(size,d)` - An exact, fixed-point number. - `size` = total number of digits (max 65, default 10) - `d` = number of digits after the decimal point (max 30, default 0) - - **Integers** - Types of integers: - | **Type** | **Bytes** | **Range** | | `TINYINT` | 1 | -128 to 127 | | `SMALLINT` | 2 | -32,768 to 32,767 | | `MEDIUMINT` | 3 | -8,388,608 to 8,388,607 | | `INT` | 4| -2,147,483,648 to 2,147,483,647 | | `BIGINT` | 8 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | - ^^Note:^^ the number in brackets next to integers only refers to the number of digits to display, not size. - - **Others** - Unicode char/string - Binary - Blob, Json, etc. - ### DML: Data Manipulation Language - What is **DML**? #card card-last-interval:: 25.4 card-repeats:: 4 card-ease-factor:: 2.52 card-next-schedule:: 2022-12-12T18:47:31.780Z card-last-reviewed:: 2022-11-17T09:47:31.780Z card-last-score:: 5 - **Data Manipulation Language** is a standardised language used for ^^adding, deleting, & modifying data in a database.^^ - What are the typical **DML** commands? #card card-last-interval:: 15.05 card-repeats:: 4 card-ease-factor:: 1.94 card-next-schedule:: 2022-11-29T17:40:29.100Z card-last-reviewed:: 2022-11-14T16:40:29.100Z card-last-score:: 3 - ```sql INSERT -- insert data SELECT -- query (select) data UPDATE -- update data DELETE -- delete data ``` - - # Autonumber - What does `AUTO_INCREMENT` do in MySQL? #card card-last-interval:: 28.3 card-repeats:: 4 card-ease-factor:: 2.66 card-next-schedule:: 2022-12-13T03:03:50.803Z card-last-reviewed:: 2022-11-14T20:03:50.803Z card-last-score:: 5 - Specifying an attribute to `AUTO_INCREMENT` tells the DBMS to ^^generate a number automatically when a new tuple is inserted into a table.^^ - Often, this is used for an "artificial" **primary key** value which is needed to ensure that we have a primary key, but has no meaning for the data being stored. - Using `AUTO_INCREMENT` means that the DBMS takes care of inserting a unique value automatically every time a new tuple is inserted. - By default, `AUTO_INCREMENT` is **1**, and is incremented by 1 for each new tuple inserted. - - # Constraints - ## Types of Constraints - ### Foreign Keys - What is the syntax for specifying **Foreign Keys**? #card card-last-interval:: 41.44 card-repeats:: 5 card-ease-factor:: 2.18 card-next-schedule:: 2022-12-26T06:21:18.467Z card-last-reviewed:: 2022-11-14T20:21:18.468Z card-last-score:: 3 - ```sql FOREIGN KEY (attributename) REFERENCES tablename(attributename) ``` - You need to specify: - The keyword `FOREIGN KEY` to indicate that it is a foreign key constraint. - The attribute name(s) that will identify the foreign keys in the current table. - If there is more than one attribute, they should be separated by commas. - Attribute names should be enclosed in brackets. - The keyword `REFERENCES` to specify the attribute that the foreign key references. - The table name and the attribute name of the attribute being referenced by the foreign key. - Again, the attribute name(s)should be in brackets. - The table name should be **outside** the brackets. - ^^You cannot create a foreign key link unless the attribute that it is referencing exists.^^ - ### Using `ALTER` to Modify Design - **Remember:** You cannot create a foreign key link *unless* the attribute it's referencing already exists. - If you want to create everything but the foreign keys initially, you can add a foreign key later using the `ALTER TABLE` command - - #### Syntax for `ALTER` Command - To add a constraint: - ```SQL ALTER TABLE tablename ADD CONSTRAINT constraintname FOREIGN KEY (attributename) REFERENCES tablename(attribute name); ``` - To add an attribute (column) constraint: - ```SQL ALTER TABLE tablename ADD attributename DATATYPE; ``` - ### Domain Constraints id:: 6321ba81-0b92-447e-9c6f-1953528d51a8 - What is the **Domain Constraint**? #card card-last-interval:: -1 card-repeats:: 1 card-ease-factor:: 2.6 card-next-schedule:: 2022-11-15T00:00:00.000Z card-last-reviewed:: 2022-11-14T16:38:56.391Z card-last-score:: 1 - The value of each attribute A must be an **atomic** value from the **domain** dom(A). - Essentially: ^^the data types & formats must match to that specified.^^ - ### Entity Integrity Constraints (Primary Key Constraints) id:: 6321bafc-6bfc-42da-96a9-f05bcfdff9ba - What is the **Primary Key / Entity Integrity Constraint**? #card card-last-interval:: 31.36 card-repeats:: 4 card-ease-factor:: 2.8 card-next-schedule:: 2022-12-16T00:47:09.544Z card-last-reviewed:: 2022-11-14T16:47:09.544Z card-last-score:: 5 - The **primary key** should uniquely identify each tuple in a relation. - This means: - No duplicate values allowed for the primary key - No `NULL`values allowed for the primary key - **Note:** `NULL` values may possibly also not be permitted for other attributes. - We often see this constraint when filling out forms online ("*required") and the constraint is often necessary for non-key attributes. - However, we should be careful to only add `NOT NULL` constraints in the databases when they are really necessary. - - ### Referential Integrity Constraints - What are **Referential Integrity Constraints**? #card card-last-score:: 1 card-repeats:: 1 card-next-schedule:: 2022-11-15T00:00:00.000Z card-last-interval:: -1 card-ease-factor:: 2.56 card-last-reviewed:: 2022-11-14T20:19:00.925Z - **Referential Integrity Constraints** are specified between two relations and require the concept of a **foreign key**. The constraint ensures that ^^the database must **not** contain any unmatched foreign keys.^^ - Therefore, a relationship involving foreign keys **must** be between attributes of the ^^same type & size.^^ - In addition, a value for a foreign key attribute **must** exist already as a candidate key value. - Essentially: "no unmatched foreign keys". - - ### Semantic Integrity Constraints - What are **Semantic Integrity Constraints**? #card card-last-interval:: 56.69 card-repeats:: 5 card-ease-factor:: 2.42 card-next-schedule:: 2023-01-07T03:37:12.551Z card-last-reviewed:: 2022-11-11T11:37:12.551Z card-last-score:: 5 - **Semantic Integrity Constraints** ensure that the data entered into a row reflects an allowable value for that row. The value must be within the *domain*, or allowable set of values, for that column. - How are **Semantic Integrity Constraints** specified? #card card-last-interval:: -1 card-repeats:: 1 card-ease-factor:: 2.6 card-next-schedule:: 2022-11-15T00:00:00.000Z card-last-reviewed:: 2022-11-14T16:42:45.174Z card-last-score:: 1 - **Semantic Integrity Constraints** are specified & enforced using a *constraint specification language*. - What are the two types of **Semantic Integrity Constraints**? #card card-last-interval:: 4 card-repeats:: 2 card-ease-factor:: 2.18 card-next-schedule:: 2022-11-25T13:05:15.346Z card-last-reviewed:: 2022-11-21T13:05:15.347Z card-last-score:: 3 - **State Constraints:** Constrain an entity to being in certain states. - **Transition Constraints:** Constrain an entity to only being updated in certain ways. - ## Setting Constraints - **Domain Constraints** are set automatically once the data type is chosen. - **Entity Constraints** are also set automatically once a primary key has been chosen. - Usually default constraints are set for foreign keys, but these can be changed. - - ## Update Operations & Constraint Violations - The DBMS must check that the constraints are not violated whenever **update operations** are applied. - - ### Insert Operation - What does the **Insert Operation** do? #card card-last-interval:: 31.36 card-repeats:: 4 card-ease-factor:: 2.8 card-next-schedule:: 2022-12-16T00:47:14.600Z card-last-reviewed:: 2022-11-14T16:47:14.600Z card-last-score:: 5 - The **Insert Operation** provides a list of attribute values for a new tuple $t$ that is to be inserted into a relation $R$. - This can happen directly via the interface or via the query. - If a constraint is violated, the DBMS will reject the insertion - usually with an explanation. - ### Delete Operation - How can a **Delete Operation** violate constraints? #card card-last-interval:: 28.3 card-repeats:: 4 card-ease-factor:: 2.66 card-next-schedule:: 2022-12-13T03:03:33.858Z card-last-reviewed:: 2022-11-14T20:03:33.859Z card-last-score:: 5 - A **delete operation** can only violate **integrity constraints**, i.e., if the tuple being deleted is referenced by the foreign key from other tuples. - The DBMS can: - reject deletion, usually within an explanation. - attempt to *cascade* deletion. - modify referencing attribute. - #### Update Operation card-last-score:: 1 card-repeats:: 1 card-next-schedule:: 2022-10-07T23:00:00.000Z card-last-interval:: -1 card-ease-factor:: 2.5 card-last-reviewed:: 2022-10-07T10:32:05.858Z - What is an **Update** Operation? #card card-last-interval:: 8.88 card-repeats:: 3 card-ease-factor:: 2.22 card-next-schedule:: 2022-11-20T08:30:42.712Z card-last-reviewed:: 2022-11-11T11:30:42.713Z card-last-score:: 3 - An **update** operation is used to change the values of one or more attributes in a tuple of a table. - Issues already discussed with insert & delete could arise with this operation, specifically: - If a primary key is modified, that's essentially the same as deleting one tuple and inserting another tuple in its place. - If a foreign key is modified, the DBMS must ensure that the new value refers to an existing tuple in the reference relation. - ### Cascade Update & Delete - Whenever tuples in the **referenced** (master) table are deleted or updated, the respective tuples of the **referencing** (child) table with a matching foreign key column will be deleted or updated as well. - Note that if cascading `DELETE` is turned on, there could be many deletions performed with a single query such as: - ```sql DELETE FROM employee WHERE ssn = 12345678; ```