LectureFour - Relational Database Queries
LectureFour - Relational Database Queries
In SQL, the ORDER BY operator is used to sort the results of a query in either ascending or descending order based on one or more columns. The syntax is SELECT column_names FROM table_name ORDER BY column_name ASC|DESC. The AND operator is used to ensure that multiple conditions in a WHERE clause are all true for the records to be selected, meaning it filters results by applying a series of conditions that must all match. Conversely, the OR operator allows records to be selected if any one of the conditions applied is true, broadening the filter scope. These operators combined enable fine-tuning and complex dictation of the criteria for query results to match specific needs .
Relying on the Cartesian product in SQL can result in significant inefficiencies and performance issues, particularly with large datasets. The Cartesian product operates by concatenating every tuple from one table with every tuple from another, leading to exponential growth in the number of resulting tuples. This means that with large tables, such as a table P with 2500 tuples and table PP with 5000 tuples, their Cartesian product will produce 12,500,000 tuples. This scale can severely strain database resources, impacting query performance and system responsiveness. Consequently, it is crucial to be cautious when using Cartesian products, especially in queries involving multiple relations .
The WHERE clause in SQL is used to specify a selection criterion, allowing for the conditional selection of data from a table. It is an integral part of the SELECT statement, enabling the retrieval of specific data by applying logical conditions. Operators that can be used within a WHERE clause include equal (=), not equal (<>), greater than (>), less than (<), among others such as BETWEEN for specifying a range and LIKE for pattern searching. For instance, to select property owners living in 'Mabvuku', one would use: SELECT * FROM Parcel WHERE Location='Mabvuku' .
The SQL LIKE condition is used for pattern matching in a column. It allows for the retrieval of rows that match a specified pattern, thereby facilitating searches for textual data where exact matches are not feasible. Wildcards such as the asterisk (*) can be used to represent missing characters or strings within the pattern. For example, to return records with a surname starting with 'R', the SQL statement would be: SELECT * FROM Ownership WHERE surname LIKE 'R*'. This allows the query to match any surname beginning with the letter 'R' .
The two basic SQL data sublanguages are the SQL Data Manipulation Language (DML) and the SQL Data Definition Language (DDL). DML is used for executing queries and includes syntax for updating, inserting, and deleting records. It consists of commands like SELECT to extract data, UPDATE to modify data, DELETE to remove data, and INSERT INTO to add new data into database tables. The DDL permits the creation or deletion of database tables, defining indexes (keys), specifying links between tables, and imposing constraints. Some of its main statements are CREATE TABLE for creating tables, ALTER TABLE for changing tables, DROP TABLE for deleting tables, CREATE INDEX for creating an index, and DROP INDEX for deleting an index .
In SQL, single quotes are used to enclose text values as part of standard practice across most database systems. This is because SQL interprets text within single quotes as string literals. Although many databases also accept double quotes, the primary SQL standard promotes single quotes for text, ensuring consistency and avoiding syntax errors that may arise from database-specific configurations. For instance, the query SELECT * FROM Parcel WHERE Location='Mabvuku' uses single quotes around the text value 'Mabvuku' .
The SQL BETWEEN ... AND operator is used to select a range of data between two values inclusively. This operator is beneficial for filtering results according to specified numerical, textual, or date ranges, thereby simplifying queries that require results within specific intervals. The syntax is SELECT column_name FROM table_name WHERE column_name BETWEEN value1 AND value2. It is versatile and applicable to numbers, text, and date values, facilitating targeted data retrieval in queries such as filtering records within a date range or selecting numeric values within a certain threshold .
The SQL SELECT DISTINCT statement is used to return only distinct (different) values from the specified columns of a database table. This statement ensures that duplicate records are not included in the result set of a query. It is particularly useful when the goal is to list unique values or when data redundancy may lead to incorrect analyses. For instance, to select distinct locations from the Parcel table, the following query can be used: SELECT DISTINCT Location FROM Parcel .
The SQL INSERT INTO statement is used to add new rows into a database table. It supports two main formats for inserting data. The first format is for inserting values directly into all columns, using the syntax: INSERT INTO table_name VALUES (value1, value2, ...). The second format allows specifying specific columns for insertion, using the syntax: INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...). This flexibility enables the insertion of partial data when only certain fields need updating .
SQL manages multi-statement execution by requiring a separation between consecutive SQL statements, which is typically done using a semicolon. The semicolon acts as a statement terminator, enabling multiple statements to be sent to the server in a single request. While some database systems, like MS Access, treat the semicolon as optional, others mandate its use to ensure clarity and correctness in SQL execution order. As a best practice, it is advised to use the semicolon to avoid any ambiguity and potential errors in environments that require explicit statement separation .