SQL Notes
SQL Notes
DDL (Data Definition Language) and DML (Data Manipulation Language) serve different purposes in SQL. DDL is used to define and modify the database structure, including creating, altering, and dropping tables. An example is the CREATE TABLE command used to create a new table with specified columns and data types . On the other hand, DML is used for manipulating data within these structures. It includes commands like INSERT, UPDATE, and DELETE which allow for data operations such as adding new records, modifying existing records, or removing records. For instance, the INSERT INTO command adds a new record to a table .
Views in SQL are virtual tables that provide a way to present data retrieved from one or more tables in a simplified format without storing the data physically. They enhance database security by allowing users to access specific portions of data without giving them direct access to the base tables, thereby protecting sensitive information . Additionally, views simplify complex queries by encapsulating the logic in a defined view, which can be used as a simple table reference. This helps maintain consistency and reduces query complexity for end-users when accessing multi-table data .
The GROUP BY clause in SQL is used to arrange identical data into groups. This is particularly useful for aggregate functions such as COUNT, SUM, AVG, etc., where you need to perform calculations on groups of data rather than on the entire table . GROUP BY divides the result set into summary rows, typically with each group containing distinct values in specified columns. The DISTINCT clause, on the other hand, is used to remove duplicate rows from the result set, ensuring that only unique values are returned. While DISTINCT filters out duplicate data in selected columns, GROUP BY organizes data into groups and allows operations on those groups .
Aggregate functions in SQL, such as AVG, SUM, MAX, MIN, and COUNT, enhance data analysis capabilities by allowing users to perform calculations on multiple rows of data to return a single value. For example, AVG can be used to find the average salary of employees, SUM can total sales figures for a financial period, MAX can identify the highest score in a student database, and COUNT can tally the number of entries in a table. These functions enable comprehensive analysis of datasets, aiding in summarizing data points for better insights and informed decision-making .
A primary key in a relational database is a field or set of fields that uniquely identifies each row in a table. It enforces uniqueness and does not allow null values, ensuring that there are no duplicate records in the table . A candidate key is any column or combination of columns that can qualify as a unique key in a database. It has the potential to become the primary key and must also maintain uniqueness. However, only one candidate key is defined as the primary key, while the others remain as alternate keys .
Constraints in SQL are used to enforce rules on data and ensure data integrity within a database. They prevent invalid data entry into tables by providing regulations such as ensuring a column cannot have null values (NOT NULL constraint) or must have unique values (UNIQUE constraint). The PRIMARY KEY constraint enforces uniqueness across the table and ensures that each row can be uniquely identified. The CHECK constraint allows specifying a condition each row must satisfy, while the DEFAULT constraint assigns a default value to a column when no value is provided. Additionally, table-level constraints can enforce rules on multiple columns. All these constraints collectively ensure that the data remains accurate and reliable .
Normalization is critical in database design as it organizes data efficiently, minimizing redundancy and dependency. By structuring a database in a way that reduces data repetition, normalization addresses issues like data anomalies during insertions, updates, and deletions. It helps in maintaining data integrity by ensuring that data is logically stored, promoting consistency across the database . Furthermore, normalization aids in optimizing queries and improving database performance by reducing the data footprint and aligning with ideal storage structures .
Logical operators in SQL, including AND, OR, and NOT, play a crucial role in query formulation by allowing for complex conditional logic within queries. AND allows combining multiple conditions that must all be true for a row to be included in the result set, while OR permits any of the specified conditions to be true . NOT negates a condition, allowing for exclusion criteria. These operators enhance query formulation by enabling sophisticated filtering and retrieval of data based on multiple criteria, thus refining results and extracting precisely needed information from a database .
The ORDER BY clause in SQL is used to sort the result set of a query by one or more columns, either in ascending or descending order. This is significant because it allows users to organize the output data, making it easier to analyze and understand. By default, data is sorted in ascending order, but the DESC keyword can be used for descending order . Properly ordered data helps in better visualization and decision-making, especially when dealing with large datasets, as it improves the readability and relevance of the data presented .
The ALTER TABLE command in SQL significantly impacts the database schema as it allows modifications to the table structure without needing to recreate it. Specific operations that can be performed with ALTER TABLE include adding a new column using the ADD clause, modifying the definition of an existing column with the MODIFY clause, and dropping a constraint or column . These capabilities provide flexibility to adjust database structures as requirements evolve, making it a powerful tool for managing database schemas and ensuring they are up-to-date with usage needs .