First Normal Form
First Normal Form (1NF) is a property of relational database design that
ensures that the data is organized in a structured manner, with no repeating
groups, and with each attribute containing only atomic (indivisible) values. In
other words, 1NF requires that each cell in a table should contain a single value,
and each column should have a unique name.
To illustrate 1NF, let's consider an example of a table that violates 1NF and then
transform it into a 1NF-compliant table.
Example of a table that violates 1NF:
Suppose we have a table to store information about students and their
respective courses and grades:
Student Student Course Course
ID Name 1 2 Course 3
John
1 Doe Math Science History
Jane
2 Smith English Math Geography
In this table, the "Course 1," "Course 2," and "Course 3" columns contain
repeating groups of course names. This violates the 1NF rule.
Transforming into 1NF:
To bring this table into 1NF, we need to remove the repeating groups by
creating a separate table for courses and linking it to the students through a
unique identifier.
Page 1 of 13
Students Table:
Student ID Student Name
1 John Doe
2 Jane Smith
Courses Table:
Course Course
ID Name
1 Math
2 Science
3 History
Page 2 of 13
Course Course
ID Name
4 English
5 Geography
Grades Table:
Student Course
ID ID Grade
1 1 A
1 2 B
1 3 C
2 4 A
Page 3 of 13
Student Course
ID ID Grade
2 1 B
2 5 A-
In this transformed design, we have split the original table into three separate
tables: Students, Courses, and Grades. The Students table contains unique
student information, the Courses table contains unique course information, and
the Grades table links students and courses with their respective grades.
Now, the data is organized in a structured manner, without any repeating
groups, and each attribute contains only atomic values, satisfying the
requirements of First Normal Form (1NF).
Second Normal Form (2NF) is a database normalization concept that
builds upon the First Normal Form (1NF). To be in 2NF, a table must meet the
following conditions:
1. It must be in 1NF.
2. It should not have partial dependencies, meaning that each non-key column in
the table is fully dependent on the entire primary key.
In simpler terms, 2NF ensures that non-key attributes (columns) depend on the
entire primary key, not just a part of it. This helps eliminate redundancy and
ensures better data organization.
Let's illustrate 2NF with an example. Suppose we have a table that stores
information about books and their authors, where each author can write
multiple books. The table looks like this:
Page 4 of 13
| Author ID | Author Name | Book ID | Book Title | Genre |
|-----------|---------------|---------|-----------------|------------|
| 101 | John Doe |1 | Book A | Fiction |
| 101 | John Doe |2 | Book B | Mystery |
| 102 | Jane Smith | 3 | Book C | Fiction |
| 103 | Bob Johnson | 4 | Book D | Romance |
| 103 | Bob Johnson | 5 | Book E | Mystery |
In this table, we have a composite primary key (Author ID, Book ID), and we
store the author's name, book ID, book title, and genre. The Author ID and Book
ID together uniquely identify each row.
However, there is a partial dependency present. The "Author Name" column
depends only on the "Author ID," which is part of the composite key, and not
on the "Book ID." This violates the second normal form.
To bring the table into 2NF, we need to split it into two separate tables: one for
Authors and another for Books:
Table: Authors
| Author ID | Author Name |
|-----------|---------------|
| 101 | John Doe |
| 102 | Jane Smith |
| 103 | Bob Johnson |
Table: Books
| Book ID | Author ID | Book Title | Genre |
|---------|-----------|-----------------|------------|
|1 | 101 | Book A | Fiction |
|2 | 101 | Book B | Mystery |
|3 | 102 | Book C | Fiction |
Page 5 of 13
|4 | 103 | Book D | Romance |
|5 | 103 | Book E | Mystery |
Now, the "Author Name" information is stored only once in the "Authors" table
and is fully dependent on the "Author ID" (which is the primary key in the
"Authors" table). This restructuring eliminates partial dependencies and brings
the table into 2NF.
Third Normal Form (3NF) is a database normalization concept that builds
upon the First Normal Form (1NF) and Second Normal Form (2NF). To be in 3NF,
a table must satisfy the following conditions:
1. It must be in 2NF.
2. It should not have transitive dependencies, meaning that there should be no
non-key attribute that depends on another non-key attribute.
In simpler terms, 3NF ensures that non-key attributes depend only on the
primary key and not on other non-key attributes. This further reduces
redundancy and improves data integrity.
Let's illustrate 3NF with an example. Consider a table that stores information
about students and their respective courses along with the instructors who
teach those courses:
| Student ID | Student Name | Course Name | Instructor | Instructor Office |
|------------|----------------|----------------|---------------|-------------------|
| 101 | John Doe | Math | Professor X | Room A101 |
| 101 | John Doe | Physics | Professor Y | Room B205 |
| 102 | Jane Smith | Physics | Professor X | Room A101 |
| 103 | Bob Johnson | Chemistry | Professor Z | Room C303 |
| 103 | Bob Johnson | Math | Professor X | Room A101 |
In this table, we have a composite primary key (Student ID, Course Name) that
uniquely identifies each row. We also store the student's name, course
instructor, and instructor's office.
Page 6 of 13
However, there is a transitive dependency present. The "Instructor Office"
column depends on the "Instructor" column, which is not part of the primary
key. Instead, "Instructor" depends on "Course Name," which is part of the
primary key. This violates the third normal form.
To bring the table into 3NF, we need to split it into three separate tables: one
for Students, one for Courses, and one for Instructors:
Table: Students
| Student ID | Student Name |
|------------|----------------|
| 101 | John Doe |
| 102 | Jane Smith |
| 103 | Bob Johnson |
Table: Courses
| Course Name | Instructor |
|-------------|---------------|
| Math | Professor X |
| Physics | Professor Y |
| Chemistry | Professor Z |
Table: Instructors
| Instructor | Instructor Office |
|--------------|-------------------|
| Professor X | Room A101 |
| Professor Y | Room B205 |
| Professor Z | Room C303 |
Now, the "Instructor Office" information is stored in the "Instructors" table and
depends only on the "Instructor" (which is the primary key in the "Instructors"
table). There are no transitive dependencies, and the table is in 3NF, eliminating
redundancy and ensuring data integrity.
Page 7 of 13
Fourth Normal Form (4NF) is a level of database normalization that builds
upon the previous levels, including the First Normal Form (1NF), Second Normal
Form (2NF), and Third Normal Form (3NF). To be in 4NF, a table must meet the
following conditions:
1. It must be in 3NF.
2. It should not have multi-valued dependencies, meaning that there should be no
set of non-key attributes (columns) that are functionally dependent on the
primary key but independent of each other.
In simpler terms, 4NF ensures that there are no non-key attributes that
represent multiple independent facts about the primary key. This helps
eliminate redundancy and anomalies that might occur when handling multi-
valued attributes.
Let's illustrate 4NF with an example. Suppose we have a table that stores
information about employees and the projects they are involved in:
| Employee ID | Employee Name | Project ID | Project Name | Project Location |
| 001 | John Doe | P001 | Project A | New York |
| 001 | John Doe | P002 | Project B | Los Angeles |
| 002 | Jane Smith | P001 | Project A | New York |
| 002 | Jane Smith | P003 | Project C | Chicago |
| 003 | Bob Johnson | P002 | Project B | Los Angeles |
| 003 | Bob Johnson | P003 | Project C | Chicago |
In this table, we have a composite primary key (Employee ID, Project ID), and we
store the employee's name, project ID, project name, and project location. The
Employee ID and Project ID together uniquely identify each row.
However, there is a multi-valued dependency present. The "Project Name" and
"Project Location" columns depend on the "Project ID," which is part of the
primary key, but they are also independent of each other. This violates the fourth
normal form.
Page 8 of 13
To bring the table into 4NF, we need to split it into two separate tables: one for
Employees and another for Projects:
Table: Employees
| Employee ID | Employee Name |
|-------------|----------------|
| 001 | John Doe |
| 002 | Jane Smith |
| 003 | Bob Johnson |
Table: Projects
| Project ID | Project Name | Project Location |
|------------|------------------|-------------------|
| P001 | Project A | New York |
| P002 | Project B | Los Angeles |
| P003 | Project C | Chicago |
Now, the multi-valued dependencies have been eliminated. The "Project Name"
and "Project Location" information is stored only once in the "Projects" table
and is fully dependent on the "Project ID" (which is the primary key in the
"Projects" table). This restructuring brings the table into 4NF, ensuring better
data organization and avoiding redundancy..
Boyce-Codd Normal Form (BCNF) is a level of database normalization that
is more restrictive than the Third Normal Form (3NF) and Fourth Normal Form
(4NF). It was introduced by Raymond F. Boyce and Edgar F. Codd to address
certain anomalies that might arise due to functional dependencies in a
database.
To be in BCNF, a database schema must satisfy the following conditions:
1. It must be in 3NF.
Page 9 of 13
2. For every non-trivial functional dependency (X → Y) in the table, X must be a
superkey (a set of attributes that uniquely identifies each row in the table).
In simpler terms, BCNF ensures that each non-trivial functional dependency is
based on a candidate key, meaning that there are no partial dependencies or
transitive dependencies in the table.
Let's illustrate BCNF with an example. Consider a table that stores information
about employees and their respective projects:
| Employee ID | Employee Name | Project ID | Project Name | Department |
|-------------|----------------|------------|------------------|-------------|
| 001 | John Doe | P001 | Project A | IT |
| 001 | John Doe | P002 | Project B | HR |
| 002 | Jane Smith | P001 | Project A | IT |
| 002 | Jane Smith | P003 | Project C | Finance |
| 003 | Bob Johnson | P002 | Project B | HR |
| 003 | Bob Johnson | P003 | Project C | Finance |
In this table, we have a composite primary key (Employee ID, Project ID), and we
store the employee's name, project ID, project name, and department. The
Employee ID and Project ID together uniquely identify each row.
However, there is a non-trivial functional dependency "Employee ID,
Department → Employee Name." This means that given an "Employee ID" and
"Department," we can determine the "Employee Name." However, the
"Employee ID" alone is sufficient to uniquely determine the "Employee Name,"
making "Department" unnecessary.
To bring the table into BCNF, we need to split it into two separate tables: one
for Employees and another for Projects:
Table: Employees
| Employee ID | Employee Name | Department |
Page 10 of 13
Sure, let's walk through an example to demonstrate Boyce-Codd Normal Form
(BCNF) with a sample table.
Consider a table that stores information about students, their courses, and the
instructors who teach those courses:
| Student ID | Student Name | Course ID | Course Name | Instructor ID | Instructor
Name |
|------------|--------------|-----------|-------------|---------------|-----------------|
| 101 | John Doe | C001 | Math | I001 | Prof. Smith |
| 101 | John Doe | C002 | Physics | I002 | Prof. Johnson |
| 102 | Jane Smith | C002 | Physics | I002 | Prof. Johnson |
| 102 | Jane Smith | C003 | History | I003 | Prof. Anderson |
| 103 | Bob Johnson | C001 | Math | I001 | Prof. Smith |
| 103 | Bob Johnson | C003 | History | I003 | Prof. Anderson |
Sure, let's walk through an example to demonstrate Boyce-Codd Normal Form
(BCNF) with a sample table.
Consider a table that stores information about students, their courses, and the
instructors who teach those courses:
| Student ID | Student Name | Course ID | Course Name | Instructor ID | Instructor Name
| 101 | John Doe | C001 | Math | I001 | Prof. Smith |
| 101 | John Doe | C002 | Physics | I002 | Prof. Johnson |
| 102 | Jane Smith | C002 | Physics | I002 | Prof. Johnson |
| 102 | Jane Smith | C003 | History | I003 | Prof. Anderson |
| 103 | Bob Johnson | C001 | Math | I001 | Prof. Smith |
| 103 | Bob Johnson | C003 | History | I003 | Prof. Anderson |
In this table, we have a composite primary key (Student ID, Course ID), and we
store the student's name, course name, instructor's ID, and instructor's name.
Page 11 of 13
The combination of Student ID and Course ID together uniquely identifies each
row.
Now, let's analyze whether the table is in BCNF:
1. It is already in 1NF, as each cell contains only atomic (indivisible) values.
2. It is already in 2NF, as it doesn't have partial dependencies. Each non-key
attribute depends on the entire composite key (Student ID, Course ID).
3. It is already in 3NF, as it doesn't have transitive dependencies. There are no non-
key attributes that depend on other non-key attributes.
Now, let's check for BCNF. We have a non-trivial functional dependency:
• Student ID, Course ID → Instructor Name
The combination of Student ID and Course ID determines the Instructor Name.
However, both Student ID and Course ID are part of the composite key. So, the
table violates BCNF.
To bring the table into BCNF, we need to decompose it into two separate tables:
one for student-course relationships and another for instructor-course
relationships:
Table: StudentCourses
| Student ID | Course ID | Student Name | Course Name |
|------------|-----------|--------------|-------------|
| 101 | C001 | John Doe | Math |
| 101 | C002 | John Doe | Physics |
| 102 | C002 | Jane Smith | Physics |
| 102 | C003 | Jane Smith | History |
| 103 | C001 | Bob Johnson | Math |
| 103 | C003 | Bob Johnson | History |
Page 12 of 13
Table: InstructorCourses
| Course ID | Instructor ID | Instructor Name |
|-----------|---------------|-----------------|
| C001 | I001 | Prof. Smith |
| C002 | I002 | Prof. Johnson |
| C003 | I003 | Prof. Anderson |
Now, the tables represent individual relations without violating BCNF. The
original table's non-trivial functional dependency has been eliminated. Each
table contains data relevant to a specific entity, and we can join them back using
the common attribute "Course ID" to retrieve the original information.
This decomposition satisfies BCNF and ensures better data organization,
avoiding redundancies and anomalies due to functional dependencies.
Page 13 of 13