Assignment-1: Fundamental Of Data Science - G24AI2036
Q1: Normalize the following unnormalized table "Orders" to the second normal form (2NF). Identify
the functional dependencies and primary key.
Unnormalized Table: Orders
Order_id Cust_Name Cust_City Product_id Product_Na Quantity
me
1 John Smith New York 100 Laptop 2
2 John Smith New York 101 Printer 1
3 Mary Los Angeles 102 Smartphone 3
4 Mary Los Angeles 100 Laptop 1
Functional Dependencies:
1. Order_id -> Cust_Name, Cust_City
2. Product_id -> Product_Name
3. Order_id, Product_id -> Quantity
Primary Key: (Order_id, Product_id)
First Normal Form (1NF):
1. A table is in 1NF if each cell contains a single value, and there are no repeating groups of columns.
2. In the "Orders" table, each cell holds only one value (e.g., a single Order_id, a single Cust_Name, a
single Quantity).
3. There are no repeating groups of columns.
4. Therefore, the "Orders" table is already in 1NF.
Second Normal Form (2NF):
1. A table is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the
entire primary key.
2. In the "Orders" table, the primary key is the composite key (Order_id, Product_id).
3. We need to check for partial dependencies:
4. Cust_Name and Cust_City are fully functionally dependent on Order_id.
5. Product_Name is fully functionally dependent on Product_id.
6. Quantity is fully functionally dependent on the combination of Order_id and Product_id.
7. To achieve 2NF, we decompose the table into three tables to eliminate the partial dependencies:
2NF Tables:
Orders (Order_id, Cust_Name, Cust_City)
Primary Key: Order_id
Products (Product_id, Product_Name)
Primary Key: Product_id
OrderDetails (Order_id, Product_id, Quantity)
Primary Key: (Order_id, Product_id)
Q2: Consider two tables, "students" and "courses," with a common column "student_id." Write an
SQL query to retrieve the names of students who have not chosen courses.
SQL Query:
SELECT s.student_name
FROM students s
LEFT JOIN courses c ON s.student_id = c.student_id
WHERE c.course_id IS NULL;
Q3: Normalize the following unnormalized table "Employees" with columns: employee_id (primary
key), employee_name, department, and department_location. Normalize the table to the third
normal form (3NF).
Unnormalized Table: Employees
Emp_id Emp_name Dept Dept_location
1 John HR New York
2 Jane HR New York
3 Mark IT San Francisco
4 Sarah IT San Francisco
5 David IT San Francisco
Functional Dependencies:
1. Emp_id -> Emp_name, Dept, Dept_location
2. Dept -> Dept_location
Primary Key: Emp_id
First Normal Form (1NF):
A table is in 1NF if each cell contains a single value, and there are no repeating groups of columns.
In the "Employees" table, each cell holds only one value.
There are no repeating groups of columns.
Therefore, the "Employees" table is already in 1NF.
Second Normal Form (2NF):
A table is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the
entire primary key.
In the "Employees" table, the primary key is Emp_id. Since Emp_id is a single attribute, there are no
partial dependencies.
Therefore, the "Employees" table is already in 2NF.
Third Normal Form (3NF):
A table is in 3NF if it is in 2NF and no non-key attribute is transitively dependent on the primary key.
A transitive dependency exists when a non-key attribute depends on another non-key attribute.
In the "Employees" table, we have the following dependencies:
Emp_id -> Emp_name, Dept, Dept_location
Dept -> Dept_location
Here, Dept_location is dependent on Dept, and Dept is dependent on Emp_id. Thus, Dept_location
is transitively dependent on Emp_id through Dept.
To achieve 3NF, we remove the transitive dependency by creating a new table:
3NF Tables:
1. Employees (Emp_id, Emp_name, Dept)
Primary Key: Emp_id
2. Departments (Dept, Dept_location)
Primary Key: Dept
Q4: If we have two tables, Student1 and Student2. Perform union, intersection, and difference
operations on Student1 and Student2.
Student1:
SID SNAME
S01 XYZ
S02 PQR
S03 ABC
Student2:
SID SNAME
S01 XYZ
S05 LMN
S06 DEF
Union (Student1 ∪ Student2):
SID SNAME
S01 XYZ
S02 PQR
S03 ABC
S05 LMN
S06 DEF
Intersection (Student1 ∩ Student2):
SID SNAME
S01 XYZ
Difference (Student1 - Student2):
SID SNAME
S02 PQR
S03 ABC
Q5: Solve the following query using relational algebra and SQL. Select the SID, SNAME from the
StudentXYZ table whose marks are greater than or equal to 90. Write the query and result.
StudentXYZ:
SID SNAME Marks
S01 XYZ 99
S02 PQR 95
S03 ABC 89
S04 LMN 88
S05 AZQ 90
Relational Algebra:
Marks >= 90 (StudentXYZ)
SQL Query:
SELECT SID, SNAME
FROM StudentXYZ
WHERE Marks >= 90;
Result:
SID SNAME
S01 XYZ
S02 PQR
S05 AZQ
Q6: Solve the following query using SQL. Select the SID, SNAME from the StudentXYZ table whose
marks are greater than or equal to 90, SID must be in ascending order, and SNAME must be in
descending order. Write the query and result.
StudentXYZ:
SID SNAME Marks
S01 XYZ 99
S01 ABC 85
S02 PQR 95
S03 ABC 89
S04 LMN 88
S02 ADC 84
S05 AZQ 90
SQL Query:
SELECT SID, SNAME
FROM StudentXYZ
WHERE Marks >= 90
ORDER BY SID ASC, SNAME DESC;
Result:
SID SNAME
S01 XYZ
S02 PQR
S05 AZQ