r23 Dbms Unit 1 - Relational Model
r23 Dbms Unit 1 - Relational Model
PURPOSE OF DBMS:
Data Management and Organization: A DBMS provides a systematic way to store, retrieve,
and manage data in an organized manner, making it easy for users to access and manipulate large
amounts of data efficiently.
Data Integrity and Security: It enforces data integrity constraints to ensure data accuracy and
consistency. It also provides security mechanisms to control data access and protect sensitive
information from unauthorized users.
Data Sharing and Multi-user Support: DBMS allows multiple users to access and work with
the data concurrently while maintaining data consistency and isolation, which is essential for
collaborative environments.
Data Recovery and Backup: It includes features for data recovery and backup, ensuring that
data can be restored in case of accidental loss, hardware failure, or other issues, which maintains
the availability and reliability of data.
Efficient Query Processing: DBMS enables efficient data retrieval using structured query
languages (like SQL), allowing users to extract and analyze data easily, supporting decision-
making and reporting tasks.
VIEWS OF DATA:
In DBMS, the concept of views of data refers to the different levels of abstraction that allow
users to interact with data in various ways, based on their needs and roles. This concept is
foundational in the architecture of a DBMS and is structured into three levels.
oThis level represents how data is actually stored in the database, focusing on
storage, data structures, and file organization. It deals with physical storage details
and performance tuning but hides these complexities from users.
o Example: How data is arranged on disks and how index structures are
implemented for efficient retrieval.
2. Logical Level (Conceptual Level):
o The logical level provides an abstract view of the entire database, capturing the
relationships among data without specifying how data is stored. It focuses on the
structure and meaning of the data, rather than the technical details of storage.
o Example: Tables, attributes, and relationships among tables in an entity-
relationship model, showing how customer, order, and product data relate.
3. View Level (External Level):
o This level is concerned with how individual users interact with the database.
Different users may have different views depending on their role and need for
data. It presents only a subset of the database relevant to a specific user group,
offering data security and simplicity.
o Example: A sales team might only see customer names, orders, and prices, while
an HR team might see employee information without access to sales data.
DATA MODELS:
Data models in DBMS provide a framework for organizing, storing, and defining data
relationships. They are essential for shaping how data is stored, retrieved, and managed within a
database.
1. Two-Tier Architecture
2. Three-Tier Architecture
3. N-Tier Architecture
Relational databases are a type of database based on the relational model proposed by E. F. Codd
in 1970. This model organizes data into structured tables, or "relations," which makes it easy to
store, manage, and query data. Relational databases are widely used in applications because of
their simplicity, flexibility, and ability to handle large volumes of structured data. Here’s an
introduction to the core concepts of relational databases, as discussed in the works of
Ramakrishna, Elmasri, and Silberschatz:
1. Tables (Relations):
o In relational databases, data is stored in tables, each representing an entity or
concept (e.g., Customers, Orders, Products).
o A table has rows (records or tuples) representing individual instances of the entity
and columns (attributes) representing the properties or fields of that entity.
2. Rows (Tuples):
o Each row in a table represents a single, unique entry, such as a specific customer,
order, or product.
o A tuple consists of attribute values that collectively describe that particular
instance.
3. Columns (Attributes):
o Each column in a table represents an attribute, which is a characteristic of the
entity. For example, in a Customer table, columns might include CustomerID,
Name, Email, and Address.
o Attributes define the type of data that can be stored in a particular column, such as
integer, string, or date.
4. Primary Key:
o Each table typically has a primary key, a unique identifier for each record in the
table. This key ensures that each row is uniquely identifiable, preventing duplicate
entries.
o Example: In a Customer table, CustomerID could be the primary key.
5. Foreign Key:
o A foreign key is an attribute in one table that links to the primary key of another
table, establishing a relationship between the two tables.
o Example: An Order table may have a CustomerID foreign key that references the
CustomerID primary key in the Customer table, associating each order with a
specific customer.
6. Relationships:
o Relational databases allow relationships between tables, which can be:
One-to-One: Each row in one table relates to only one row in another.
One-to-Many: One row in a table can relate to multiple rows in another
(e.g., a customer may place multiple orders).
Many-to-Many: Rows in one table can relate to multiple rows in another
(e.g., students and courses).
7. Structured Query Language (SQL):
o SQL is the standard language used for interacting with relational databases. It
enables users to create, read, update, and delete data (CRUD operations) and
perform complex queries.
o SQL operations include SELECT for retrieving data, INSERT for adding new data,
UPDATE for modifying data, and DELETE for removing data.
Data Integrity: With rules such as primary keys, foreign keys, and constraints, relational
databases enforce data accuracy and integrity.
Flexibility: Relational databases can handle large volumes of data and complex queries,
making them ideal for diverse applications.
Data Consistency: The use of ACID (Atomicity, Consistency, Isolation, Durability)
properties ensures transactions are reliably processed, even in multi-user environments.
Data Security: Access control can be implemented at various levels, allowing for secure,
role-based data access.
Applications
Relational databases are widely used in domains requiring structured, consistent data, such as
banking, e-commerce, healthcare, and enterprise resource planning (ERP) systems. Their ability
to manage complex data relationships and ensure data integrity has made them the backbone of
many enterprise and business applications.
KEYS IN DBMS:
In relational databases, keys are essential elements that help in identifying, organizing, and
maintaining relationships between records in tables. Keys enforce uniqueness and facilitate
efficient data retrieval and linking across tables.
In the context of relational databases, keys are critical elements that uniquely identify records,
establish relationships, and ensure data integrity. The concept of keys is central to relational
database design, as they help organize and maintain consistency across tables.
1. Primary Key
Definition: A primary key is a unique identifier for each record (row) in a table. No two
rows can have the same value for the primary key.
Characteristics:
o Uniqueness: Each record must have a unique primary key.
o Non-null: A primary key cannot contain a NULL value.
o Immutability: The value of the primary key should not change once assigned.
Example: In a Student table, StudentID can be the primary key, ensuring each student has
a unique ID.
2. Foreign Key
Definition: A foreign key is an attribute (or a combination of attributes) in one table that
links to the primary key in another table. It establishes a relationship between the two
tables.
Characteristics:
o A foreign key may have duplicate values, meaning multiple rows in the
referencing table can refer to the same row in the referenced table.
o A foreign key can contain NULL values if the relationship allows for optional
associations.
Example: In an Orders table, the CustomerID field could be a foreign key that refers to
the CustomerID primary key in a Customers table, linking each order to a specific
customer.
3. Candidate Key
Definition: A candidate key is a set of one or more attributes that can uniquely identify
each record in a table. A table may have multiple candidate keys, and one of them is
chosen as the primary key.
Characteristics:
o Any candidate key can potentially be used as the primary key.
o Candidate keys must be unique and minimal, meaning they cannot contain any
unnecessary attributes.
Example: In an Employee table, both EmployeeID and Email might be candidate keys, as
each can uniquely identify an employee.
4. Alternate Key
Definition: An alternate key is any candidate key that is not chosen as the primary key. It
provides an alternative way of identifying records uniquely.
Characteristics:
o An alternate key must also be unique and non-null.
Example: If EmployeeID is chosen as the primary key, then Email might be considered
an alternate key for the Employee table.
5. Composite Key
Definition: A composite key is a primary key that consists of more than one attribute. It
is used when a single attribute is not sufficient to uniquely identify a record.
Characteristics:
o The combination of values in the attributes that form the composite key must be
unique for each row.
Example: In a CourseRegistration table, a composite key might consist of StudentID and
CourseID, where each student can register for multiple courses, but the combination of
student and course must be unique.
6. Superkey
Definition: A superkey is a set of attributes that uniquely identifies each record in a table.
A superkey may contain additional attributes beyond those necessary for uniqueness.
Characteristics:
o All primary keys are superkeys, but not all superkeys are primary keys. A
superkey can contain extra attributes that are not required to ensure uniqueness.
Example: If StudentID is the primary key for a Student table, then the combination of
StudentID and Name can be a superkey, even though Name is not necessary for
uniqueness.
7. Unique Key
Definition: A unique key ensures that all values in a column (or a set of columns) are
distinct, but unlike the primary key, it allows NULL values.
Characteristics:
o Unique keys ensure that the values in the columns are unique, but they can still
contain NULLs (depending on the DBMS).
o A table can have multiple unique keys.
Example: In a User table, the Username field might have a unique constraint, ensuring
that each username is distinct, but it can allow a NULL value.
8. Secondary Key
Definition: A secondary key is an attribute or set of attributes used for data retrieval
purposes, but it is not a part of the primary key.
Characteristics:
o Secondary keys help improve query performance, especially for searching or
sorting data based on non-primary key attributes.
Example: In a Product table, a secondary key might be the ProductName attribute,
which can be used for quick searching but is not necessarily unique.
Integrity constraints in relational databases are rules that ensure the accuracy, consistency, and
validity of the data stored in a database. These constraints enforce data integrity by limiting the
types of data that can be inserted into tables and ensuring that the relationships between tables
remain valid. Integrity constraints are crucial in maintaining the quality and reliability of the
database. The main types of integrity constraints are as follows:
Definition: These constraints ensure that the values entered into a column are of the
correct data type, fall within an acceptable range, and adhere to predefined formats or
standards.
Examples:
o If a column is defined to store integer values, only integers can be inserted.
o If a column is defined to store dates, only valid date values are allowed.
o If a column is defined with a range, such as an age column between 18 and 100,
values outside this range would be disallowed.
Definition: These constraints ensure that each row (record) in a table can be uniquely
identified, and that there is no duplication of records.
Primary Key Constraint: The primary key must contain unique, non-null values for
each record in the table.
o Uniqueness: Ensures that no two records in the table have the same value for the
primary key.
o Not-null: Ensures that the primary key cannot have NULL values, as it is used to
uniquely identify rows.
Example: The EmployeeID column in the Employee table is defined as the primary key
to guarantee that each employee has a unique identifier.
Definition: These constraints ensure that foreign key relationships between tables are
valid. A foreign key in one table must match a primary key in another table, or it must be
NULL (if allowed).
Foreign Key Constraint: This rule ensures that data in the foreign key column(s)
corresponds to valid entries in the primary key column(s) of another table.
o Example: A foreign key in the Orders table might reference the CustomerID
primary key in the Customers table.
o Cascading Actions: Referential integrity can include actions like "CASCADE"
(where changes in the referenced table propagate to the referencing table) or "SET
NULL" (where a foreign key is set to NULL if the referenced record is deleted).
CustomerID INT,
);
Referential actions:
Definition: These are custom rules that are defined by users to enforce business-specific
conditions that cannot be captured by the built-in domain, entity, or referential
constraints.
Example: If a business rule requires that an employee's salary must always be greater
than the minimum wage, a user-defined constraint can enforce this rule.
Definition: A check constraint allows the database to enforce a condition that must be
true for each row. It is used to limit the range of values that can be stored in a column.
Example: Enforcing that the Age of an employee is between 18 and 100 years.
Domain Integrity: Ensures data is valid based on its type and range.
Entity Integrity: Ensures that each record is uniquely identifiable and no nulls in
primary keys.
Referential Integrity: Ensures that relationships between tables are valid, with foreign
keys properly linking to primary keys.
User-Defined Integrity: Custom rules set by the user to enforce business logic.
NULL Integrity: Specifies whether columns can have NULL values.
Check Constraints: Enforces specific conditions on the data, like ranges or values.
RELATIONAL ALGEBRA:
1. Select (σ):
o The Select operation retrieves rows from a relation that satisfy a specified
condition.
o Syntax: σ_condition(R)
o Example: σ_age > 25(Employees) selects all rows from the Employees relation
where the age attribute is greater than 25.
2. Project (π):
o The Project operation retrieves specific columns (attributes) from a relation,
effectively reducing the number of attributes.
o Syntax: π_attribute1, attribute2,...(R)
o Example: π_name, age(Employees) projects the name and age columns from the
Employees relation.
3. Union (∪):
o The Union operation combines the tuples of two relations (which must have the
same number and type of attributes) and returns all distinct tuples from both
relations.
o Syntax: R ∪ S
o Example: Employees ∪ Contractors gives a relation containing all unique tuples
from both Employees and Contractors.
4. Difference (−):
o The Difference operation retrieves the tuples that are in the first relation but not
in the second.
o Syntax: R − S
o Example: Employees − Contractors gives the tuples that are in the Employees
relation but not in the Contractors relation.
5. Cartesian Product (×):
o The Cartesian Product operation combines each tuple of one relation with each
tuple of another relation, resulting in a new relation with all possible pairs of
tuples.
o Syntax: R × S
o Example: Employees × Departments results in a relation where each employee is
paired with every department.
6. Rename (ρ):
o The Rename operation changes the name of a relation or its attributes.
o Syntax: ρ_new_relation_name(R)
o Example: ρ(Emp, Employees) renames the Employees relation to Emp.
7. Join (⨝):
o The Join operation combines tuples from two relations based on a common
attribute. There are different types of joins, such as Inner Join, Natural Join, and
Theta Join.
o Syntax (for Natural Join): R ⨝ S
o Example: Employees ⨝ Departments performs a natural join between Employees
and Departments, where the join is done on common attributes (e.g.,
department_id).
Derived Operations:
In addition to these basic operations, some other operations can be derived from combinations of
the above operations, such as:
To find the names of employees working in a specific department, we can use the following
relational algebra operations:
This query first selects the department "Sales," then joins the resulting relation with the
Employees relation to obtain the names of employees in the Sales department.
Foundation for SQL: Relational Algebra forms the theoretical basis for SQL (Structured
Query Language). SQL operations like SELECT, JOIN, and WHERE can be mapped to
relational algebra operations.
Query Optimization: Understanding relational algebra helps in designing and
optimizing queries by choosing the most efficient execution plans.
Formal Query Language: It provides a formal way to express queries, which is helpful
in optimizing database queries.
Set-Based Operations: Relational algebra is based on set theory, which allows for
efficient handling of data in relational databases.
These operations are building blocks for formulating queries and understanding how data can be
retrieved and manipulated in relational databases.
Relations:
4. Find the names of employees who are in departments with less than 10 employees
Query: List names of employees who work in a department that has fewer than 10
employees.
Relational Algebra: π_Name(Employees ⨝ (π_Dept_ID(σ_count(Emp_ID) <
10(Employees))))
Explanation: This query uses a Join and aggregates the employees based on the
department.
Query: List each department along with the number of employees in that department.
Relational Algebra: π_Dept_Name, count(Emp_ID)(Employees ⨝ Departments)
6. Find all employees who work in both the "HR" and "IT" departments
Query: Find employees who work in both the "HR" and "IT" departments (assuming each
department has its own row in the Employees table).
Relational Algebra: π_Name(σ_Dept_Name = 'HR'(Departments) ⨝ Employees) ∩
π_Name(σ_Dept_Name = 'IT'(Departments) ⨝ Employees)
Explanation: The intersection (∩) operator returns employees who work in both "HR"
and "IT."
Query: Find all employees working in the same department as "John Doe."
Relational Algebra: π_Name(σ_Dept_ID = (π_Dept_ID(σ_Name = 'John
Doe'(Employees)))(Employees))
Explanation: First, we select the Dept_ID for "John Doe" and use it to filter all employees
in the same department.
11. Find the employees with the highest salary in each department
Query: List the employees who earn the highest salary in their respective departments.
Relational Algebra: π_Emp_ID, Name, Salary(σ_Salary = max(Salary)(Employees))
Explanation: This uses a Select (σ) to filter employees who earn the maximum salary in
each department.
12. Find the department names that have employees with an age greater than 40
Query: List the departments where employees are older than 40.
Relational Algebra: π_Dept_Name(σ_Age > 40(Employees) ⨝ Departments)
σ (Select): Filters rows based on a condition (like Age > 30 or Dept_Name = 'Sales').
π (Project): Chooses specific columns (like Name or Dept_Name).
⨝ (Join): Combines two relations based on a common attribute (like Dept_ID).
∪ (Union): Combines two relations and removes duplicates.
− (Difference): Returns the rows in the first relation that are not in the second.
∩ (Intersection): Returns the common rows between two relations.