Normalization

The document provides a comprehensive guide to database normalization in SQL, explaining its importance in organizing data to reduce redundancy, improve query performance, and enhance data integrity. It outlines the various normalization forms from First Normal Form (1NF) to Fifth Normal Form (5NF), detailing their characteristics and how they can be applied to real-world scenarios. The tutorial is aimed at beginners in the data industry, particularly those aspiring to become data scientists or engineers.

Uploaded by

Mohit Subramaniam

Normalization in SQL (1NF - 5NF): A Beginner’s Guide

Database normalization is an important process used to organize and
structure relational databases. This process ensures that data is stored in a
way that minimizes redundancy, simplifies querying, and improves data
integrity.

May 28, 2024 · 9 min read

Samuel Shaibu
Data Scientist | Microsoft Certified Data Analyst Associate | Technical Writer

Topics: SQL, Data Analysis

In this article, we will be exploring the basic concepts you need to know about
normalization, its importance, and the various techniques that are involved. This article is for,
but not limited to, those looking to break into the data industry and become data scientists
or data engineers.


What is Normalization in SQL?

Normalization, in this context, is the process of organizing data within a relational
database to reduce redundancy and prevent the data anomalies that redundancy causes.

In simpler terms, it involves breaking down a large, complex table into smaller and simpler
tables while maintaining data relationships.

Normalization is commonly used when dealing with large datasets.

Let’s take a brief look at some scenarios where normalization is often used.

Data integrity
Imagine a database that contains customer information. Without normalization, if a
customer changes their age, we would need to update it in multiple places, which would
increase the risk of inconsistencies. By normalizing the data, we can have separate tables
linked by a unique identifier that will ensure that the data remains accurate and consistent.
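
The scenario above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a recommended schema: the table names (orders_flat, customers, orders), the customer "Ada", and the items are all hypothetical and do not come from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized (hypothetical) table: the customer's age is repeated on every order row.
conn.execute(
    "CREATE TABLE orders_flat ("
    "order_id INTEGER PRIMARY KEY, customer_name TEXT, customer_age INTEGER, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [(1, "Ada", 30, "Keyboard"), (2, "Ada", 30, "Monitor")],
)
# An update that misses one copy leaves two different ages for the same customer.
conn.execute("UPDATE orders_flat SET customer_age = 31 WHERE order_id = 1")
flat_ages = sorted(
    {row[0] for row in conn.execute(
        "SELECT customer_age FROM orders_flat WHERE customer_name = 'Ada'")}
)

# Normalized version: the age is stored once and orders point to it by customer_id.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id), item TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 30)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, "Keyboard"), (2, 1, "Monitor")])
conn.execute("UPDATE customers SET age = 31 WHERE customer_id = 1")
normalized_ages = sorted(
    {row[0] for row in conn.execute(
        "SELECT c.age FROM orders o JOIN customers c ON c.customer_id = o.customer_id")}
)

print(flat_ages)        # two conflicting ages for one customer
print(normalized_ages)  # a single, consistent age
```

In the flat table a single missed row is enough to create a contradiction, while the normalized version has only one place where the age can change.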

Efficient querying
Let’s consider a complex database with multiple related tables that store redundant
information. In this scenario, queries become more complicated and resource-intensive
because the same fact may have to be read, filtered, or reconciled in several places.
Normalization helps simplify querying by breaking data down into smaller tables, each
containing only relevant information, so that every fact is looked up in one place and
joins follow clear key relationships.

Storage optimization
A major problem with redundant data is that it occupies unnecessary storage space. For
instance, if we store the same product details in every order record, it leads to duplication.
With normalization, you can eliminate redundancy by splitting data into separate tables.

Why is Normalization in SQL Important?

Normalization plays a crucial role in database design. Here are several reasons why it’s
essential:

Reduces redundancy: Redundancy is when the same information is stored multiple
times, and a good way of avoiding this is by splitting data into smaller tables.

Improves query performance: Queries that read or change a single fact can run faster on
smaller, normalized tables, because there is less duplicated data to scan and update.

Minimizes update anomalies: With normalized tables, you can easily update data
without affecting other records.

Enhances data integrity: It ensures that data remains consistent and accurate.

What Causes the Need for Normalization?

If a table is not properly normalized and has data redundancy, it will not only take up extra
data storage space but also make it difficult to handle and update the database.

There are several factors that drive the need for normalization, from data redundancy (as
covered above) to difficulty managing relationships. Let’s get right into it:

Insertion, deletion, and update anomalies: Any form of change in a table can lead to
errors or inconsistencies in other tables if not handled carefully. These changes can
either be adding new data to a database, updating the data, or deleting records, which
can lead to unintended loss of data.

Difficulty in managing relationships: It becomes more challenging to maintain complex
relationships in an unnormalized structure.

Other factors that drive the need for normalization are partial dependencies and
transitive dependencies, in which partial dependencies can lead to data redundancy
and update anomalies, and transitive dependencies can lead to data anomalies. We will
be looking at how these dependencies can be dealt with to ensure database
normalization in the coming sections.

Different Types of Database Normalization

So far, we have looked at what normalization in SQL is, why normalization in SQL is
important, and what causes the need for normalization. Database normalization comes in
different forms, each with increasing levels of data organization.

In this section, we will briefly discuss the different normalization levels and then explore
them deeper in the next section.

First Normal Form (1NF)
This normalization level ensures that each column in your data contains only atomic values.
Atomic values in this context means that each entry in a column is indivisible. It is like saying
that each cell in a spreadsheet should hold just one piece of information. 1NF also requires
each column to have a unique name.

Second Normal Form (2NF)

Eliminates partial dependencies by ensuring that every non-key attribute depends on the
whole primary key rather than on only part of it. In essence, when the primary key is made up
of several columns, each remaining column should describe the full combination and not just
one of the key columns.

Third Normal Form (3NF)

Removes transitive dependencies by ensuring that non-key attributes depend directly on the
primary key and not on other non-key attributes. This level of normalization builds on 2NF.

Boyce-Codd Normal Form (BCNF)

This is a stricter version of 3NF that addresses additional anomalies. At this
normalization level, every determinant is a candidate key.

Fourth Normal Form (4NF)

This is a normalization level that builds on BCNF by dealing with multi-valued dependencies.

Fifth Normal Form (5NF)

5NF is the highest normalization level covered in this guide, and it addresses join
dependencies. It is used in specific scenarios to further minimize redundancy by breaking a
table into smaller tables.

Database Normalization With Real-World Examples

We have already highlighted all the data normalization levels. Let’s further explore each of
them in more depth with examples and explanations.

First Normal Form (1NF) Normalization

1NF ensures that each column cell contains only atomic values. Imagine a library database
with a table storing book information (title, author, genre, and borrowed_by). If the table is
not normalized, borrowed_by could contain a list of borrower names separated by commas.
This violates 1NF, as a single cell holds multiple values. The table below is a good
representation of a table that violates 1NF, as described earlier.

title author genre borrowed_by

To Kill a Mockingbird Harper Lee Fiction John Doe, Jane Doe, James Brown

The Lord of the Rings J. R. R. Tolkien Fantasy Emily Garcia, David Lee

Harry Potter and the Sorcerer’s Stone J.K. Rowling Fantasy Michael Chen

The solution?

In 1NF, we create a separate table for borrowers and link it to the books table. The
tables can be linked either with a foreign key in the borrowers table or with a separate
linking table. The first approach adds a foreign key column to the borrowers table that
references the primary key of the books table. This enforces a relationship between the
tables, ensuring data consistency.

You can find a representation of this below:

Books table
book_id (PK) title author genre

1 To Kill a Mockingbird Harper Lee Fiction

2 The Lord of the Rings J. R. R. Tolkien Fantasy

3 Harry Potter and the Sorcerer’s Stone J.K. Rowling Fantasy

Borrowers table

borrower_id (PK) name book_id (FK)

1 John Doe 1

2 Jane Doe 1

3 James Brown 1

4 Emily Garcia 2

5 David Lee 2

6 Michael Chen 3
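
The split into the two tables above can be sketched with Python's built-in sqlite3 module. This is a minimal sketch, assuming SQLite as the engine and the column types shown; the variable names (raw_rows, borrowers) are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT, author TEXT, genre TEXT)"
)
conn.execute(
    "CREATE TABLE borrowers ("
    "borrower_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "book_id INTEGER NOT NULL REFERENCES books(book_id))"
)

# Rows as they appear in the table that violates 1NF: borrowed_by holds a list.
raw_rows = [
    (1, "To Kill a Mockingbird", "Harper Lee", "Fiction", "John Doe, Jane Doe, James Brown"),
    (2, "The Lord of the Rings", "J. R. R. Tolkien", "Fantasy", "Emily Garcia, David Lee"),
    (3, "Harry Potter and the Sorcerer's Stone", "J.K. Rowling", "Fantasy", "Michael Chen"),
]

for book_id, title, author, genre, borrowed_by in raw_rows:
    conn.execute("INSERT INTO books VALUES (?, ?, ?, ?)", (book_id, title, author, genre))
    # One row per borrower, so every cell holds a single (atomic) value.
    for name in borrowed_by.split(","):
        conn.execute(
            "INSERT INTO borrowers (name, book_id) VALUES (?, ?)", (name.strip(), book_id)
        )

borrowers = conn.execute(
    "SELECT borrower_id, name, book_id FROM borrowers ORDER BY borrower_id"
).fetchall()
for row in borrowers:
    print(row)
```

Because borrower_id is an INTEGER PRIMARY KEY, SQLite assigns it automatically in insertion order, which reproduces the ids 1 to 6 shown in the Borrowers table.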

Second Normal Form (2NF)

This level of normalization, as already described, builds upon 1NF by ensuring there are no
partial dependencies on the primary key. In simpler terms, all non-key attributes must
depend on the entire primary key and not just part of it.

From the 1NF that was implemented, we already have two separate tables (you can check
the 1NF section).

Now, let’s say we want to link these tables to record borrowings. The initial approach might
be to simply add a borrower_id column to the books table, as shown below:

book_id (PK) title author genre borrower_id (FK)

1 To Kill a Mockingbird Harper Lee Fiction 1

2 The Lord of the Rings J. R. R. Tolkien Fantasy NULL

3 Harry Potter and the Sorcerer’s Stone J.K. Rowling Fantasy 6

This might look like a solution, but it breaks down quickly. A book can have multiple
borrowers, yet this structure can record only one borrower_id per book. The borrower_id
describes a borrowing (one book paired with one borrower), not the book itself, so it does
not belong in a table keyed by book_id alone. This is the same problem a partial dependency
creates: a column stored against a key that does not fully identify it.

The solution?
We need to model the many-to-many relationship between books and borrowers to
achieve 2NF. This can be done by introducing a separate table:

Book_borrowings table

borrowing_id (PK) book_id (FK) borrower_id (FK) borrowed_date

1 1 1 2024-05-04

2 2 4 2024-05-04

3 3 6 2024-05-04

This table establishes a clear relationship between books and borrowers. The book_id and
borrower_id act as foreign keys, referencing the primary keys in their respective tables. Each
row now describes exactly one borrowing, and borrowed_date depends on the whole key of
that row (borrowing_id), complying with 2NF.
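
A linking table like this one can be sketched in SQLite as follows. This is a sketch under assumptions: the borrowers table is reduced to borrower_id and name for brevity, and the third borrowing row is a hypothetical addition that shows one book lent to a second borrower.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE borrowers (borrower_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book_borrowings (
    borrowing_id  INTEGER PRIMARY KEY,
    book_id       INTEGER NOT NULL REFERENCES books(book_id),
    borrower_id   INTEGER NOT NULL REFERENCES borrowers(borrower_id),
    borrowed_date TEXT NOT NULL
);
INSERT INTO books VALUES (1, 'To Kill a Mockingbird'), (2, 'The Lord of the Rings');
INSERT INTO borrowers VALUES (1, 'John Doe'), (4, 'Emily Garcia');
INSERT INTO book_borrowings VALUES (1, 1, 1, '2024-05-04'), (2, 2, 4, '2024-05-04');
-- Hypothetical extra row: the same book lent to a second borrower.
INSERT INTO book_borrowings VALUES (3, 1, 4, '2024-05-11');
""")

# Everyone who borrowed book 1, in borrowing order.
borrowers_of_book_1 = [row[0] for row in conn.execute("""
    SELECT b.name
    FROM book_borrowings bb
    JOIN borrowers b ON b.borrower_id = bb.borrower_id
    WHERE bb.book_id = 1
    ORDER BY bb.borrowing_id
""")]

# Every book borrowed by borrower 4, in borrowing order.
books_of_borrower_4 = [row[0] for row in conn.execute("""
    SELECT k.title
    FROM book_borrowings bb
    JOIN books k ON k.book_id = bb.book_id
    WHERE bb.borrower_id = 4
    ORDER BY bb.borrowing_id
""")]

print(borrowers_of_book_1)
print(books_of_borrower_4)
```

Both directions of the relationship (one book with many borrowers, one borrower with many books) are answered from the same table, which a single borrower_id column on books cannot do.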

Third Normal Form (3NF)

3NF builds on 2NF by eliminating transitive dependencies. A transitive dependency occurs
when a non-key attribute depends on another non-key attribute, which in turn depends on
the primary key. The name comes from the transitive law: if A determines B and B
determines C, then A determines C.

From the 2NF we already implemented, there are three tables in our library database:

Books table

book_id (PK) title author genre

1 To Kill a Mockingbird Harper Lee Fiction

2 The Lord of the Rings J. R. R. Tolkien Fantasy

3 Harry Potter and the Sorcerer’s Stone J.K. Rowling Fantasy

Borrowers table

borrower_id (PK) name book_id (FK)

1 John Doe 1

2 Jane Doe 1

3 James Brown 1

4 Emily Garcia 2

5 David Lee 2

6 Michael Chen 3

Book_borrowings table

borrowing_id (PK) book_id (FK) borrower_id (FK) borrowed_date

1 1 1 2024-05-04

2 2 4 2024-05-04

3 3 6 2024-05-04

The 2NF structure looks efficient, but there might be a hidden dependency. Imagine we add
a due_date column to the books table. This might seem logical at first sight, but it’s going to
create a transitive dependency where:

The due_date is a fact about one borrowing, identified by borrowing_id in the
book_borrowings table, and borrowing_id is not a key of the books table.

Each borrowing is in turn linked to a book through book_id (the primary key of the books
table).

The implication of this is that due_date relies on an intermediate attribute
(borrowing_id) instead of directly depending on the primary key (book_id). This violates 3NF.

The solution?

We can move the due_date column to the most appropriate table by updating the
book_borrowings table to include it. A returned_date column, if needed, would belong in the
same table.

Below is the updated table:

borrowing_id (PK) book_id (FK) borrower_id (FK) borrowed_date due_date

1 1 1 2024-05-04 2024-05-20

2 2 4 2024-05-04 2024-05-18

3 3 6 2024-05-04 2024-05-10

By placing the due_date column in the book_borrowings table, we have successfully
eliminated the transitive dependency.

What this means is that due_date now depends directly on borrowing_id, the primary key of
the book_borrowings table. Each borrowing_id identifies one pairing of book_id and
borrower_id, which are foreign keys pointing to the books and borrowers tables.
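
Why due_date belongs with the borrowing rather than with the book can be seen in a short SQLite sketch. The second loan (borrowing_id 4) is a hypothetical row added for illustration and is not part of the article's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE book_borrowings (
    borrowing_id  INTEGER PRIMARY KEY,
    book_id       INTEGER NOT NULL REFERENCES books(book_id),
    borrower_id   INTEGER NOT NULL,
    borrowed_date TEXT NOT NULL,
    due_date      TEXT NOT NULL
);
INSERT INTO books VALUES (1, 'To Kill a Mockingbird');
INSERT INTO book_borrowings VALUES (1, 1, 1, '2024-05-04', '2024-05-20');
-- Hypothetical second loan of the same book, with its own due date.
INSERT INTO book_borrowings VALUES (4, 1, 2, '2024-05-21', '2024-06-04');
""")

# Extending one loan touches exactly one row and leaves the book row alone.
conn.execute("UPDATE book_borrowings SET due_date = '2024-05-27' WHERE borrowing_id = 1")

due_dates = conn.execute(
    "SELECT borrowing_id, due_date FROM book_borrowings "
    "WHERE book_id = 1 ORDER BY borrowing_id"
).fetchall()
print(due_dates)
```

A single due_date column on the books table could hold only one of these two dates, so the fact has to live in the table whose key identifies the loan.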

Boyce-Codd Normal Form (BCNF)

BCNF is based on functional dependencies that consider all candidate keys in a
relation.

Functional dependencies (FD) define relationships between attributes within a relational
database. An FD states that the value of one column (or set of columns) determines the value
of another column. FDs are very important because they guide the process of normalization by
identifying dependencies and ensuring data is appropriately distributed across tables.

BCNF is a stricter version of 3NF. It ensures that every determinant (a set of attributes whose
values determine the value of another attribute) in a table is a candidate key (a minimal set
of attributes that uniquely identifies a row). The whole essence of this is that all
determinants should be able to serve as primary keys.

Put differently, every non-trivial functional dependency (FD) must have a superkey as its
determinant. If X -> Y (X determines Y) holds, X must be a superkey of the relation, that is, a
set of columns that contains a candidate key. Please note that X and Y are columns, or sets of
columns, in a data table.
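
A functional dependency can be tested against sample rows with a few lines of Python. This is an illustrative sketch: the helper names fd_holds and is_superkey are made up here, and checking sample data can only disprove a dependency, never prove that it holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, equal values in lhs always come with equal values in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_superkey(rows, columns):
    """A superkey determines every column of the table (checked on these rows only)."""
    return fd_holds(rows, columns, list(rows[0].keys()))


# The book_borrowings rows used in this article.
book_borrowings = [
    {"borrowing_id": 1, "book_id": 1, "borrower_id": 1,
     "borrowed_date": "2024-05-04", "due_date": "2024-05-20"},
    {"borrowing_id": 2, "book_id": 2, "borrower_id": 4,
     "borrowed_date": "2024-05-04", "due_date": "2024-05-18"},
    {"borrowing_id": 3, "book_id": 3, "borrower_id": 6,
     "borrowed_date": "2024-05-04", "due_date": "2024-05-10"},
]

pair_determines_due = fd_holds(book_borrowings, ["book_id", "borrower_id"], ["due_date"])
date_determines_due = fd_holds(book_borrowings, ["borrowed_date"], ["due_date"])
id_is_superkey = is_superkey(book_borrowings, ["borrowing_id"])
date_is_superkey = is_superkey(book_borrowings, ["borrowed_date"])

print(pair_determines_due, date_determines_due, id_is_superkey, date_is_superkey)
```

Here borrowed_date fails as a determinant because the same date appears with three different due dates, while borrowing_id determines every other column.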

As a build-up from the 3NF, we have three tables:

Books table
book_id (PK) title author genre

1 To Kill a Mockingbird Harper Lee Fiction

2 The Lord of the Rings J. R. R. Tolkien Fantasy

3 Harry Potter and the Sorcerer’s Stone J.K. Rowling Fantasy

Borrowers table

borrower_id (PK) name book_id (FK)

1 John Doe 1

2 Jane Doe 1

3 James Brown 1

4 Emily Garcia 2

5 David Lee 2

6 Michael Chen 3

Book_borrowings table

borrowing_id (PK) book_id (FK) borrower_id (FK) borrowed_date due_date

1 1 1 2024-05-04 2024-05-20

2 2 4 2024-05-04 2024-05-18

3 3 6 2024-05-04 2024-05-10

While the 3NF structure is good, there might be a hidden determinant in the
book_borrowings table. Assuming one borrower cannot borrow the same book twice
simultaneously, the combination of book_id and borrower_id together uniquely identifies a
borrowing record.

BCNF requires every determinant to be a candidate key. The combined set (book_id and
borrower_id) determines the rest of the row, yet the table declares only borrowing_id as its
key, so nothing in the design treats the pair as a key or enforces its uniqueness.

The solution?

To achieve BCNF, we can either decompose the book_borrowings table into two separate
tables or make the combined attribute set the primary key.

1. Approach 1 (decompose the table): In this approach, we will be decomposing the
book_borrowings table into separate tables:

A table with borrowing_id as the primary key, borrowed_date, due_date, and
returned_date.

Another separate table to link books and borrowers, with book_id as a foreign key,
borrower_id as a foreign key, and potentially additional attributes specific to the
borrowing event.

2. Approach 2 (make the combined attribute set the primary key): We can consider
making book_id and borrower_id a composite primary key for uniquely identifying
borrowing records. The problem with this approach is that it won’t serve its purpose if a
borrower can borrow the same book multiple times.

In the end, your choice between these options depends on your specific data needs and
how you want to model borrowing relationships.
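
Approach 2 and its limitation can be sketched in SQLite. This is a minimal sketch, assuming the composite key is declared with a table-level PRIMARY KEY clause; the second insert is a hypothetical repeat loan used to trigger the constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE book_borrowings (
    book_id       INTEGER NOT NULL,
    borrower_id   INTEGER NOT NULL,
    borrowed_date TEXT NOT NULL,
    due_date      TEXT NOT NULL,
    PRIMARY KEY (book_id, borrower_id)  -- the pair is now the declared key
)
""")
conn.execute("INSERT INTO book_borrowings VALUES (1, 1, '2024-05-04', '2024-05-20')")

# Hypothetical repeat loan: the same borrower takes the same book again later.
try:
    conn.execute("INSERT INTO book_borrowings VALUES (1, 1, '2024-06-01', '2024-06-15')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The composite key forbids a second row for the same (book_id, borrower_id) pair.
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM book_borrowings").fetchone()[0]
print(duplicate_rejected, row_count)
```

The rejected insert is exactly the drawback described above: the key enforces uniqueness of the pair, so repeat loans need either a different key (for example, one that includes borrowed_date) or the surrogate borrowing_id.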

Fourth Normal Form (4NF)

4NF deals with multi-valued dependencies. A multi-valued dependency exists when one
attribute determines a set of values of another attribute, and that set is independent of the
other attributes in the table. It’s quite complex, but we will be exploring it deeper using
an example.

The library example we’ve been using throughout these explanations is not applicable at
this normalization level. 4NF typically applies to situations where a single key value is
associated with several values of another attribute that have nothing to do with the rest of
the row.

Let’s use another scenario. Imagine a database that stores information about publications.
We will be considering a “Publications” table with columns, title, author, publication_year,
and keywords.

publication_id (PK) title author publication_year keywords

1 To Kill a Mockingbird Harper Lee 1960 Coming-of-Age, Legal

2 The Lord of the Rings J. R. R. Tolkien 1954 Fantasy, Epic, Adventure

3 Pride and Prejudice Jane Austen 1813 Romance, Social Commentary

The table structure above is violating 4NF because:

The keywords column has a multi-valued dependency on the primary key
publication_id. What this means is that a publication can have multiple keywords, and
these keywords are independent of the publication’s other attributes, such as author and
publication_year.

The solution?

We can create a separate table.

Publication_keywords table

publication_id (FK) keyword

1 Coming-of-Age

1 Legal

2 Fantasy

2 Epic
2 Adventure

3 Romance

3 Social Commentary

The newly created table (Publication_keywords) establishes a many-to-many relationship
between publications and keywords. Each publication can have multiple keywords linked
through the publication_id, which is a foreign key, and each keyword can be associated with
multiple publications.

With this, we have successfully eliminated the multi-valued dependency and achieved 4NF.
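
The move from a keywords list to one row per keyword can be sketched in SQLite. This is a sketch under assumptions: the composite primary key on (publication_id, keyword) is an addition that prevents duplicate tags and is not stated in the article, and flat_rows is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE publications (
    publication_id   INTEGER PRIMARY KEY,
    title            TEXT,
    author           TEXT,
    publication_year INTEGER
);
CREATE TABLE publication_keywords (
    publication_id INTEGER NOT NULL REFERENCES publications(publication_id),
    keyword        TEXT NOT NULL,
    PRIMARY KEY (publication_id, keyword)  -- assumption: no duplicate tags per publication
);
""")

# Rows from the Publications table that violates 4NF.
flat_rows = [
    (1, "To Kill a Mockingbird", "Harper Lee", 1960, "Coming-of-Age, Legal"),
    (2, "The Lord of the Rings", "J. R. R. Tolkien", 1954, "Fantasy, Epic, Adventure"),
    (3, "Pride and Prejudice", "Jane Austen", 1813, "Romance, Social Commentary"),
]

for pub_id, title, author, year, keywords in flat_rows:
    conn.execute("INSERT INTO publications VALUES (?, ?, ?, ?)", (pub_id, title, author, year))
    # Each keyword becomes its own row, keyed by publication.
    conn.executemany(
        "INSERT INTO publication_keywords VALUES (?, ?)",
        [(pub_id, kw.strip()) for kw in keywords.split(",")],
    )

keyword_rows = conn.execute(
    "SELECT publication_id, keyword FROM publication_keywords "
    "ORDER BY publication_id, rowid"
).fetchall()
for row in keyword_rows:
    print(row)
```

Adding or removing a keyword is now a single-row insert or delete, with no need to parse and rewrite a comma-separated string.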

Fifth Normal Form (5NF)

5NF is the most complex form of normalization, and it deals with join dependencies. A join
dependency exists when a table can be split into several smaller tables and then rebuilt
exactly by joining those tables back together, even when the table is already in 4NF.

In simpler terms, a table is in 5NF when the only lossless ways of splitting it are the ones
already implied by its candidate keys, so splitting it further would not remove any
redundancy.

Join dependencies are less likely to occur when tables are already normalized (in 3NF or
4NF), hence the difficulty in creating a clear and straightforward example for 5NF.
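
One common way to state a join dependency is that a table can be split into several projections and rebuilt exactly by joining them. The Python sketch below tests that property on sample rows. Everything in it is hypothetical: the (lecturer, course, textbook) rows are invented for illustration, and a check on sample data can only show that a dependency fails, not that it holds in general.

```python
def project(rows, indexes):
    """Keep only the given column positions, dropping duplicate rows."""
    return {tuple(row[i] for i in indexes) for row in rows}


def rejoin(rows):
    """Split 3-column rows into their three 2-column projections and join them back."""
    lecturer_course = project(rows, (0, 1))
    course_textbook = project(rows, (1, 2))
    lecturer_textbook = project(rows, (0, 2))
    return {
        (lecturer, course, textbook)
        for (lecturer, course) in lecturer_course
        for (course2, textbook) in course_textbook
        if course2 == course and (lecturer, textbook) in lecturer_textbook
    }


def join_dependency_holds(rows):
    """True if joining the projections returns exactly the original rows."""
    return rejoin(rows) == set(rows)


# Hypothetical data: splitting and rejoining reproduces these rows exactly.
teaching = [
    ("Smith", "Databases", "Book A"),
    ("Smith", "Databases", "Book B"),
    ("Smith", "Algorithms", "Book A"),
    ("Jones", "Databases", "Book A"),
]

# Hypothetical data: the same rows minus one, where rejoining invents a row.
partial = [
    ("Smith", "Databases", "Book B"),
    ("Smith", "Algorithms", "Book A"),
    ("Jones", "Databases", "Book A"),
]

print(join_dependency_holds(teaching))
print(join_dependency_holds(partial))
print(rejoin(partial) - set(partial))  # the spurious row created by the join
```

For the first data set the three small tables carry the same information as the original, so storing them separately loses nothing. For the second, the join produces a row that was never recorded, which means that table cannot be safely split this way.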

However, let’s take a look at this scenario where 5NF might be relevant:

Imagine a university database with normalized tables for “Courses” and “Enrollments.”

Courses table

course_id (PK) course_name department

101 Introduction to Programming Computer Science

202 Data Structures and Algorithms Computer Science

301 Web Development I Computer Science

401 Artificial Intelligence Computer Science

Enrollments table

enrollment_id (PK) student_id (FK) course_id (FK) grade

1 12345 101 A

2 12345 202 B

3 56789 301 A-

4 56789 401 B+

Assuming these tables are already in 3NF or 4NF, a join dependency might exist depending
on how data is stored. For instance, a course has a prerequisite requirement stored within
the “Courses” table as the “prerequisite_course_id” column.
This might seem efficient at first glance. However, consider a query that needs to retrieve a
student’s enrolled courses and their respective prerequisites. In this scenario, you would
need to join the “Courses” and “Enrollments” tables, then potentially join the “Courses” table
to retrieve prerequisite information.

The solution?

To potentially eliminate the join dependency and achieve 5NF, we could introduce a
separate “Course_prerequisites” table:

Course_prerequisites table

course_id (FK) prerequisite_course_id (FK)

202 101

301 NULL

401 202

This approach separates prerequisite information and allows efficient retrieval of enrolled
courses and their prerequisites in a single join between the “Enrollments” and
“Course_prerequisites” tables.

Note: We are assuming a course can only have one prerequisite.
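
The retrieval described above can be sketched in SQLite using the article's sample rows. This is a sketch under assumptions: a LEFT JOIN is used so that courses without a prerequisite row (such as 101) still appear, and course_id is declared as the primary key of course_prerequisites to reflect the one-prerequisite-per-course assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_name TEXT, department TEXT);
CREATE TABLE enrollments (
    enrollment_id INTEGER PRIMARY KEY,
    student_id    INTEGER,
    course_id     INTEGER REFERENCES courses(course_id),
    grade         TEXT
);
CREATE TABLE course_prerequisites (
    course_id              INTEGER PRIMARY KEY REFERENCES courses(course_id),
    prerequisite_course_id INTEGER REFERENCES courses(course_id)
);
INSERT INTO courses VALUES
    (101, 'Introduction to Programming', 'Computer Science'),
    (202, 'Data Structures and Algorithms', 'Computer Science'),
    (301, 'Web Development I', 'Computer Science'),
    (401, 'Artificial Intelligence', 'Computer Science');
INSERT INTO enrollments VALUES
    (1, 12345, 101, 'A'), (2, 12345, 202, 'B'), (3, 56789, 301, 'A-'), (4, 56789, 401, 'B+');
INSERT INTO course_prerequisites VALUES (202, 101), (301, NULL), (401, 202);
""")

# Each enrolled course with its prerequisite, if any, in a single join.
enrolled_with_prereq = conn.execute("""
    SELECT e.student_id, e.course_id, p.prerequisite_course_id
    FROM enrollments e
    LEFT JOIN course_prerequisites p ON p.course_id = e.course_id
    ORDER BY e.enrollment_id
""").fetchall()
for row in enrolled_with_prereq:
    print(row)
```

Courses with no prerequisite come back with None in the last column, whether the row is missing from course_prerequisites (101) or stored with NULL (301).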

5NF is a very complex and rare type of normalization, so as someone just starting their
learning journey in data, you might not find an application. However, it’s going to be added
knowledge and will make you prepared when you stumble on complex databases.

Build Your SQL Skills

If you are reading this, then congratulations to you for sticking around to the end. It has
been a great ride exploring what normalization in SQL is, why normalization in SQL is
important, what causes the need for normalization, and the different types of database
normalization. The scenarios used in explaining the different types of normalization are so
you can fully understand and also be able to apply this knowledge in your learning journey.

Normalization is a fundamental skill for anybody starting out on a data-related
career path. By understanding these principles, you are now ready to build efficient and
well-organized databases.

Learning is very important in the data space, and for you to enhance your SQL skills, we
have some resources for you.

Database Normalization Course (Importance of Data Normalization) with PostgreSQL

Database Normalization Course (Walking through the normalization process) with PostgreSQL

Data Modeling in SQL Code Along

SQL Query Examples and Tutorial

FAQs

What is normalization in DBMS?

Database normalization is a technique that optimally designs the schema of a
relational database. It involves dividing tables into smaller subtables and storing
pointers to data rather than replicating it.

Why is normalization important?
