0% found this document useful (0 votes)
6 views14 pages

MM 3

The document outlines informal design guidelines for relation schema design, emphasizing the importance of clear semantics, minimizing redundancy, reducing NULL values, and avoiding spurious tuples. It discusses various types of anomalies that can occur in poorly structured tables, such as insertion, deletion, and modification anomalies, and highlights the role of normalization in mitigating these issues. Additionally, it covers SQL commands for data manipulation and the use of cursors for row-by-row processing.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views14 pages

MM 3

The document outlines informal design guidelines for relation schema design, emphasizing the importance of clear semantics, minimizing redundancy, reducing NULL values, and avoiding spurious tuples. It discusses various types of anomalies that can occur in poorly structured tables, such as insertion, deletion, and modification anomalies, and highlights the role of normalization in mitigating these issues. Additionally, it covers SQL commands for data manipulation and the use of cursors for row-by-row processing.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 14

Informal Design Guidelines for Relation Schema Design

Good relation schema design ensures that the structure of the database supports efficiency, accuracy,
and maintainability.

Four key informal guidelines are used to evaluate relation schemas:

1. Making sure that the semantics of the attributes is clear in the schema
2. Reducing the redundant information in tuples
3. Reducing the NULL values in tuples
4. Disallowing the possibility of generating spurious tuples

1. Clear Semantics of Attributes (Guideline 1)

Definition:
Each relation schema should represent a single, well-defined concept—typically an entity or a
relationship.

Importance:

 Makes the schema easier to understand and explain.


 Reduces confusion about what each attribute represents.

Example:

 EMPLOYEE(Essn, Ename, Dno)clearly describes employees and shows a link to their department
via Dno.
 Avoid combining unrelated data like employee and department info in one table (EMP_DEPT),
which causes semantic ambiguity.

Guideline:
Design each relation to match only one real-world entityor relationship type.

2. Redundant Information and Update Anomalies (Guideline 2)

Redundancy:

 Occurs when the same information is stored repeatedly in multiple tuples.


 Wastes storage and increases chances of inconsistency.

Update Anomalies:

 Insertion Anomaly: Difficulty inserting data due to dependence on unrelated attribute values.
 Deletion Anomaly: Removing a tuple may lead to unintentional loss of important information.
 Modification Anomaly: A single logical update requires changes in multiple places, increasing the
risk of inconsistency.

Example:
In a schema like EMP_DEPT(Essn, Ename, Dno, Dname, Dmgr_ssn), department data is repeated for
every employee.
Guideline:
Structure schemas to minimize redundancyand avoid update anomaliesby using proper normalization
and foreign key relationships.

3. Reducing NULL Values in Tuples (Guideline 3)

Problem:

 NULLs can waste space and complicate query processing.


 NULLs may have multiple meanings: unknown, not applicable, or unrecorded.
 Can lead to errors in JOIN, COUNT, or SUM operations.

Example:
If only 10–15% of employees have Office_number, it is better to move that attribute to a separate table
like EMP_OFFICE(Essn, Office_number).

Guideline:
Avoid placing attributes in base tables that are frequently NULL. Use separate tables for optional or
rarely-used attributes.

4. Avoiding Spurious Tuples (Guideline 4)

Concern:
When relations are improperly decomposed, joining them back can produce invalid tuples, called
spurious tuples.

Example:

 If EMP_PROJis decomposed into EMP_LOCS(Ename, Plocation)and EMP_PROJ1(Ssn, Pname,


Plocation), and joined on Plocation, it may produce combinations of employees and projects that
never existed.

Guideline:
Always join relations using foreign key–primary keypairs to avoid generating incorrect data during joins.
UPDATE anomalies

insertion, deletion, and modification anomalies. Why are they considered bad? Illustrate with
examples.

Introduction:

In relational databases, anomaliesrefer to unexpected problems that arise during insertion, deletion, or
modification of datain non-normalized(poorly structured) tables. These anomalies occur due to
redundant or improper data storage, and they affect data integrity, consistency, and efficiency.

There are three major typesof anomalies:

1. Insertion Anomaly:

An insertion anomalyoccurs when certain data cannot be insertedinto the database without the
presence of additional, unrelated data.

Example:

Consider a relation EMP_DEPTthat stores both employee and department information:

EmpID EmpName DeptID DeptName

101 John 10 HR

102 Alice 20 Finance

Suppose we want to add a new department(e.g., "Marketing") that currently has no employees.
We cannot insertthis department alone unless we also insert dummy employee detailsor use NULLsfor
EmpID and EmpName. But if EmpID is a primary key, NULLs are not allowed, violating entity integrity.

Thus, the design prevents insertion of valid data(the department) unless irrelevant or incomplete
data(employee) is also provided.

2. Deletion Anomaly:

A deletion anomalyarises when deleting data about one entityalso causes the loss of valuable
information about another entity.

Example:

From the same EMP_DEPTtable:

EmpID EmpName DeptID DeptName

101 John 10 HR

102 Alice 20 Finance


If Alice is the last employeein the Financedepartment and we delete her record, the Finance
department's informationis also deleted.
This is a problem because the department may still exist, even if it currently has no employees.

Hence, deletion of employee data leads to loss of department data, which should have been preserved.

3. Modification Anomaly:

A modification anomalyoccurs when the same data is stored redundantlyand must be updated in
multiple places. If not updated consistently, it leads to data inconsistency.

Example:

Suppose DeptName "HR"needs to be updated to "Human Resources":

EmpID EmpName DeptID DeptName

101 John 10 HR

103 Susan 10 HR

You must update all rows where DeptID = 10.


If one row is updated and another is missed, the database ends up with inconsistent department
namesfor the same DeptID (both "HR"and "Human Resources").

Why Are They Bad? :

 Redundancy:Wastes space and complicates updates.


 Inconsistency:Risk of conflicting data.
 Data Loss:Important information may be accidentally removed.
 Complexity:Operations become error-prone.

Solution – Normalization:

These anomalies can be avoided by applying normalization, especially up to Third Normal Form (3NF).
This process involves:

 Dividing data into separate relations(e.g., EMPLOYEE and DEPARTMENT)


 Linking them via foreign keys
 Ensuring that each table stores only one type of entity

This structure removes redundancy, preserves integrity, and allows safe insertions, deletions, and
modifications.
Normalization:

Normalization is the process of organizing data in a database to remove redundancy and avoid
problems like insertion, deletion, and update anomalies. It involves dividing a large table into smaller
related tables and ensuring that the data is stored logically and efficiently.

1NF (First Normal Form)

A relation is in 1NF if:

 It has only atomic (indivisible) values.


 There are no repeating or multivalued attributes.

Example (Not in 1NF):

StudentID Name Subjects


1 Alice Math, Science
2 Bob English
Here, Subjectshas multiple values → violates 1NF.

Converted to 1NF:

StudentID Name Subject


1 Alice Math
1 Alice Science
2 Bob English

2NF (Second Normal Form)

A relation is in 2NF if:

 It is in 1NF.
 There is no partial dependencyon a composite primary key.

Example (Not in 2NF):

StudentID CourseID StudentName CourseName


1 C1 Alice DBMS
2 C2 Bob OS
Here, StudentNamedepends only on StudentID, not on the whole key (StudentID, CourseID).

Converted to 2NF:

Student Table:

StudentID StudentName
1 Alice
2 Bob
Course Table:

CourseID CourseName
C1 DBMS
C2 OS
Enrollment Table:

StudentID CourseID
1 C1
2 C2

3NF (Third Normal Form)

A relation is in 3NF if:

 It is in 2NF.
 There is no transitive dependency(i.e., non-key attributes should not depend on other non-key
attributes).

Example (Not in 3NF):

EmployeeID Name DepartmentID DepartmentName


1 Alice 10 HR
2 Bob 20 IT
Here, DepartmentNamedepends on DepartmentIDwhich is not a key → violates 3NF.

Converted to 3NF:

Employee Table:

EmployeeID Name DepartmentID


1 Alice 10
2 Bob 20
Department Table:

DepartmentID DepartmentName
10 HR
20 IT
BCNF (Boyce-Codd Normal Form)

A relation is in BCNF if:

 It is in 3NF.
 For every functional dependency X → Y, X is a superkey.

Example (Not in BCNF):

Student Course Instructor


Alice DBMS Prof. A
Alice OS Prof. A
FDs:


 ❌
{Student, Course} → Instructor
Instructor → Course

(Instructor is not a superkey)

Converted to BCNF:

Instructor Table:

Instructor Course
Prof. A DBMS
Prof. A OS
StudentInstructor Table:

Student Instructor
Alice Prof. A

4NF (Fourth Normal Form)

A relation is in 4NF if:

 It is in BCNF.
 No non-trivial multivalued dependenciesexist (except when LHS is a superkey).

Example (Not in 4NF):

Employee Project Dependent


Alice Alpha John
Alice Alpha Mary
Alice Beta John
Alice Beta Mary
→ Two independent MVDs:

 Employee →→ Project
 Employee →→ Dependent

Converted to 4NF:

EmployeeProject Table:

Employee Project
Alice Alpha
Alice Beta
EmployeeDependent Table:

Employee Dependent
Alice John
Alice Mary

5NF (Fifth Normal Form / PJNF)

A relation is in 5NF if:

 It is in 4NF.
 It has no join dependency violationsthat can cause lossless joins.

Example (Not in 5NF):

Supplier Part Project


S1 P1 J1
S1 P1 J2
S1 P2 J1
S1 P2 J2
This can be losslessly split into 3 tables.

Converted to 5NF:

SupplierPart:

Supplier Part
S1 P1
S1 P2
SupplierProject:

Supplier Project
S1 J1
S1 J2
PartProject:

Part Project
P1 J1
P1 J2
P2 J1
P2 J2

Data Types and Domains in SQL

In SQL, data typesdefine the kind of values a column can hold. They form the domainof each attribute.

1. INT (Integer)

 Used to store whole numbers.


 Example: age INTcan store values like 25, 30.

2. VARCHAR(n)

 Stores variable-length character strings up to ncharacters.


 Example: name VARCHAR(50)can store names up to 50 characters.
 Strings are case-sensitive and enclosed in single quotes.

3. DATE

 Stores calendar dates in YYYY-MM-DDformat.


 Validates real dates only.
 Example: birthdate DATE→ '2025-07-06'.

4. BOOLEAN

 Stores logical values: TRUE, FALSE, or UNKNOWN(due to NULL).


 Example: is_active BOOLEAN.
Substring Pattern Matching in SQL

SQL allows pattern matchingwithin string data using the LIKEoperator and wildcards.

Wildcards used with LIKE:

 %→ Matches zero or more characters


 _→ Matches exactly one character

Syntax:
SELECT column_name
FROM table_name
WHERE column_name LIKE 'pattern';

Examples:

1. Find names that start with 'A':


SELECT Name
FROM Students
WHERE Name LIKE 'A%';

 Matches: Alice, Andrew, Amit...

2. Find names that end with 'son':


SELECT Name
FROM Employees
WHERE Name LIKE '%son';

 Matches: Jackson, Thomson...

3. Find names where the second letter is 'a':


SELECT Name
FROM Employees
WHERE Name LIKE '_a%';

 Matches: Rahul, Ramesh, David...

4. Find names that contain 'dat':


SELECT Name
FROM Employees
WHERE Name LIKE '%dat%';

 Matches: Pradatta, Sudath...

Common Uses of %:

Pattern Meaning
'A%' Starts with A
'%son' Ends with son
'%dat%' Contains datanywhere
'%' Matches anystring (including empty)
Summary Points:

 SQL LIKEis used for substring/pattern matching.


 %= any number of characters; _= single character.
 You can match beginning, middle, or end parts of strings.
 Very useful in searching and filteringstring data.

The INSERTCommand

Definition:

The INSERTcommand is used to add one or more new rows (tuples) to a table.

Syntax (Types):

1. Basic Form (All Columns):

INSERT INTO table_name


VALUES (value1, value2, ..., valueN);

2.
With Column Names:

INSERT INTO table_name (column1, column2, ..., columnN)


VALUES (value1, value2, ..., valueN);

Examples:

Example 1 – Insert a single row:

INSERT INTO EMPLOYEE


VALUES ('John', 'Doe', 5, '123456789');

Example 2 – Insert with specific columns:

INSERT INTO EMPLOYEE (Fname, Lname, Dno, Ssn)


VALUES ('Alice', 'Smith', 3, '987654321');

The DELETECommand

Definition:

The DELETEcommand is used to remove one or more rows (tuples) from a table based on a condition.

Syntax:
DELETE FROM table_name
WHERE condition;

Note: Without a WHEREclause, all rows in the table will be deleted.

Examples:

Example 1 – Delete specific rows:

DELETE FROM EMPLOYEE


WHERE Dno = 4;

Example 2 – Delete all rows in the table:

DELETE FROM EMPLOYEE;

The UPDATECommand

Definition:

The UPDATEcommand is used to modify existing values in one or more rows of a table.

Syntax:
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;
Examples:

Example 1 – Update project details:

UPDATE PROJECT
SET Plocation = 'Bangalore', Dnum = 2
WHERE Pnumber = 10;

Example 2 – Give 10% raise to department 5 employees:


UPDATE EMPLOYEE
SET Salary = Salary * 1.1
WHERE Dno = 5;

The ALTERCommand

Definition:

The ALTERcommand is used to change the structure of an existing table (such as adding, modifying, or
removing columns).

Syntax:

1. Add a column:

ALTER TABLE table_name


ADD column_name datatype;

2.
Modify a column's datatype:

ALTER TABLE table_name


MODIFY column_name new_datatype;

3.
Drop a column:

ALTER TABLE table_name


DROP COLUMN column_name;

Examples:

Add a column:

ALTER TABLE EMPLOYEE


ADD Email VARCHAR(50);

Modify a column:

ALTER TABLE EMPLOYEE


MODIFY Salary DECIMAL(10,2);

Drop a column:

ALTER TABLE EMPLOYEE


DROP COLUMN Email;

The DROPCommand

Definition:

The DROPcommand is used to permanently delete an entire table or other database objects.

Syntax:
DROP TABLE table_name;

Example:
DROP TABLE EMPLOYEE;

Note: This removes the entire table and its data permanently.
Cursor in SQL

A cursorallows row-by-rowprocessing of query results in SQL, useful when you can't process all rows at
once.

Steps to Use Cursor

1. DECLARE– define the cursor with a query


2. OPEN– execute the query
3. FETCH– get one row at a time
4. PROCESS– perform logic
5. CLOSE– finish and release resources
6. DEALLOCATE– remove cursor

Main Properties
Property Meaning
FORWARD_ONLY Only move forward
SCROLL Move in any direction
READ ONLY Cannot update rows
UPDATEABLE Can update rows via cursor
STATIC Snapshot of data
DYNAMIC Reflects real-time table changes
KEYSET Shows updates, but not new/deleted rows

Use When

 You need row-wise logic


 Loops or conditional row handling is required

Functional Dependencies ?

You might also like