Normalization and Concepts of SQL
Unit 4
- Vandana B Prajapati
Normalization and Concepts of SQL
Normalization
●Normalization is the process of organizing
data in a database.
●This includes creating tables and
establishing relationships between those
tables according to rules designed both to
protect the data and to make the database
more flexible by eliminating redundancy and
inconsistent dependency.
●If a database design is not perfect, it may
contain anomalies, which are like a bad
dream for any database administrator.
Managing a database with anomalies is next
to impossible.
Normalization
●Update anomalies − If data items are scattered and are not linked to each other
properly, then it could lead to strange situations. For example, when we try to
update one data item having its copies scattered over several places, a few
instances get updated properly while a few others are left with old values. Such
instances leave the database in an inconsistent state.
●Deletion anomalies − We tried to delete a record, but parts of it was left
undeleted because of unawareness, the data is also saved somewhere else.
●Insert anomalies − We tried to insert data in a record that does not exist at all.
●Normalization is a method to remove all these anomalies and bring the
database to a consistent state.
Normalization Rules
Functional Dependency: -
●Functional dependency (FD) is a set of constraints between two attributes in a
relation.
●Functional dependency says that if two tuples have same values for attributes
A1, A2,..., An, then those two tuples must have to have same values for
attributes B1, B2, ..., Bn.
●Functional dependency is represented by an arrow sign ( ) that is, X Y, where X
functionally determines Y.
●The left- hand side attributes determine the values of attributes on the right-
hand side.
Normalization Rules
Armstrong's Axioms: -
If F is a set of functional dependencies then the closure of F, denoted as F+, is the set
of all functional dependencies logically implied by F. Armstrong's Axioms are a set of
rules, that when applied repeatedly, generates a closure of functional dependencies.
●Reflexive rule − If alpha is a set of attributes and beta is_ subset_ of alpha, then
alpha holds beta.
●Augmentation rule − If a b holds and y is attribute set, then ay by also holds.
That is adding attributes in dependencies, does not change the basic
dependencies.
●Transitivity rule − Same as transitive rule in algebra, if a b holds and b c holds,
then a c also holds. a b is called as a functionally that determines b.
Normalization Rules
Trivial Functional Dependency: -
Trivial − If a functional dependency (FD) X Y holds, where Y is a subset of X, then it
is called a trivial FD. Trivial FDs always hold.
Non- trivial − If an FD X Y holds, where Y is not a subset of X, then it is called a non-
trivial FD.
Completely non- trivial − If an FD X Y holds, where x intersect Y = Φ, it is said to be
a completely non- trivial FD.
First Normal Form
For a table to be in the first normal form, it must
meet the following criteria:
● a single cell must not hold more than one value
(atomicity)
● there must be a primary key for identification
● no duplicated rows or columns
● each column must have only one value for each
row in the table
Example:
Solved
First Normal Form
First Normal Form is defined in the definition of relations (tables) itself. This rule
defines that all the attributes in a relation must have atomic domains. The values in
an atomic domain are indivisible units.
We re- arrange the relation (table) as below, to convert it to First Normal Form.
Each attribute must contain only a single value from its pre- defined domain.
Second Normal Form
Before we learn about the second normal form, we need to understand the
following −
Prime attribute − An attribute, which is a part of the candidate- key, is known as a
prime attribute.
Non- prime attribute − An attribute, which is not a part of the prime- key, is said to be
a non- prime attribute.
If we follow second normal form, then every non- prime attribute should be fully
functionally dependent on prime key attribute. That is, if X A holds, then there
should not be any proper subset Y of X, for which Y A also holds true.
Second Normal Form
The 1NF only eliminates repeating groups, not
redundancy. That’s why there is 2NF.
● A table is said to be in 2NF if it meets the following
criteria:
● it’s already in 1NF
● has no partial dependency. That is, all non- key
attributes are fully dependent on a primary key.
Example
●Together, student_ id + subject_ id forms a Candidate Key(learn about Database
Keys) for this table, which can be the Primary key.
Solved
Second Normal Form
We see here in Student_ Project relation that the prime key attributes are Stu_ ID
and Proj_ ID. According to the rule, non- key attributes, i.e. Stu_ Name and
Proj_ Name must be dependent upon both and not on any of the prime key attribute
individually.
But we find that Stu_ Name can be identified by Stu_ ID and Proj_ Name can be
identified by Proj_ ID independently. This is called partial dependency, which is not
allowed in Second Normal Form.
Second Normal Form
We broke the relation in two as depicted in the above picture. So there exists no
partial dependency.
Third Normal Form
When a table is in 2NF, it eliminates repeating groups and
redundancy, but it does not eliminate transitive partial
dependency.
● This means a non- prime attribute (an attribute that is not
part of the candidate’s key) is dependent on another non-
prime attribute. This is what the third normal form (3NF)
eliminates.
● So, for a table to be in 3NF, it must:
● be in 2NF
● have no transitive partial dependency.
Example
Solved
Third Normal Form
For a relation to be in Third Normal Form, it must be in Second Normal form and the
following must satisfy −
●No non- prime attribute is transitively dependent on prime key attribute.
●For any non- trivial functional dependency, X A, then either −
●X is a superkey or,
●A is prime attribute.
Third Normal Form
We find that in the above Student_ detail relation, Stu_ ID is the key and only prime
key attribute. We find that City can be identified by Stu_ ID as well as Zip itself.
Neither Zip is a superkey nor is City a prime attribute. Additionally, Stu_ ID Zip
City, so there exists transitive dependency.
To bring this relation into third normal form, we break the relation into two relations
as follows −
Boyce-Codd Normal Form
Boyce- Codd Normal Form (BCNF) is an extension of Third Normal Form on strict
terms. BCNF states that −
●For any non- trivial functional dependency, X A, X must be a super- key.
In the above image, Stu_ ID is the super- key in the relation Student_ Detail and Zip is
the super- key in the relation Zip Codes. So, Stu_ ID Stu_ Name, Zip and Zip City
Which confirms that both the relations are in BCNF.
Example:
SQL
●SQL stands for ― Structured Query Language. SQL is used to communicate with a
database.
●According to ANSI (American National Standards Institute), it is the standard
language for database management system.
●SQL statements are used to perform tasks such as update on a database, or
retrieve data from a database.
●The standard SQL commands such as Select, Insert, Update, Delete, Create and
Drop can be used to accomplish almost everything that one needs to do with a
database. Query is statement requesting retrieval of information.
●SQL is a standard language that works with database programs like MS Access,
DB2, MS SQL Server, Oracle, MySQL, and other database systems, although most
of them also have their own additional proprietary extensions that are usually
only used on their system.
SQL
In short, SQL is: -
●Stands for ―Structured Query Language.
●Allows you to access databases.
●Is an ANSI (American National Standards Institute) standard computer language.
●Can execute queries against a database.
●Can retrieve data from a database.
●Can insert new records in a database.
●Can delete records from a database.
●Can update records in a database.
Basic Structure of SQL
●SQL includes Data Definition Language (DDL) statements and Data Manipulation
Language (DML) statements.
●DDL statements, such as CREATE, ALTER, and DROP, modify the schema of a
database.
●DML statements, such as SELECT, INSERT, UPDATE, and DELETE, manipulate
data in tables.
●The basic structure of an SQL expression consists of three clauses: -
● Select
● From
● Where
Basic Structure of SQL
●The select clause corresponds to the projection operation of the relational
algebra. It is used to list the attributes desired in the result of a query.
●The from clause corresponds to the Cartesian- product operation of the
relational algebra. It list the relations to be scanned in the evaluation of the
expression.
●The where clause corresponds to the selection predicate of the relational
algebra. It consists of a predicate involving attributes of the relations that
appear in the from clause.
SQL Data types
●Data types are used to represent the nature of the data that can be stored in the
database table.
●For example, in a particular column of a table, if we want to store a string type of
data then we will have to declare a string data type of this column.
●Data types mainly classified into three categories for every database: -
● String Data types
● Numeric Data types
● Date and time Data types
Oracle Data types - String Data types
Data Types Descriptions
CHAR(size) It is used to store character data within the predefined length. It can be stored up to
2000 bytes.
NCHAR(size) It is used to store national character data within the predefined length. It can be stored
up to 2000 bytes.
VARCHAR2(size) It is used to store variable string data within the predefined length. It can be stored up
to 4000 byte.
VARCHAR(SIZE) It is the same as VARCHAR2(size). You can also use VARCHAR(size), but it is suggested
to use VARCHAR2(size)
NVARCHAR2(size) It is used to store Unicode string data within the predefined length. We have to must
specify the size of NVARCHAR2 data type. It can be stored up to 4000 bytes.
Oracle Data types - Numeric Data Types
Data Types Descriptions
NUMBER(p, s) It contains precision p and scale s. The precision p can range from 1 to 38,
and the scale s can range from -84 to 127.
FLOAT(p) It is a subtype of the NUMBER data type. The precision p can range from
1 to 126.
BINARY_ FLOAT It is used for binary precision( 32-bit). It requires 5 bytes, including
length byte.
BINARY_ DOUBLE It is used for double binary precision (64-bit). It requires 9 bytes,
including length byte.
Oracle Data types - Date and Time Data Types
Data Types Descriptions
DATE It is used to store a valid date-time format with a fixed length. Its range varies
TIMESTAMP from
It January
is used 1, 4712
to store BC todate
the valid December 31, 9999 AD.
in YYYY-MM-DD with time hh:mm:ss format.
SQL Queries
Query: -
A query is statement requesting the retrieval of information. We have Data
Definition Languages(DDL) to specify database schemas and Data Manipulation
Language (DML) to express database updates and queries. In practice, these are
not to separate languages but are part of a single database language, like SQL.
SQL Statements: -
Most of the actions you need to perform on a database are done with SQL
statements.
SELECT * FROM Customers;
SQL Queries
Semicolon after SQL Statements?
●Some database systems require a semicolon at the end of each SQL statement.
●Semicolon is the standard way to separate each SQL statement in database
systems that allow more than one SQL statement to be executed in the same
call to the server.
●If using MS Access so, we do not have to put a semicolon after each SQL
statement, but some database programs force you to use it.
SQL Queries
Some of The Most Important SQL Commands: -
●SELECT - extracts data from a database
●UPDATE - updates data in a database
●DELETE - deletes data from a database
●INSERT INTO - inserts new data into a database
●CREATE DATABASE - creates a new database
●ALTER DATABASE - modifies a database
●CREATE TABLE - creates a new table
●ALTER TABLE - modifies a table
●DROP TABLE - deletes a table
●CREATE INDEX - creates an index (search key)
●DROP INDEX - deletes an index
SQL Queries
The SQL CREATE DATABASE Statement: -
The CREATE DATABASE statement is used to create a new SQL database.
Syntax: - CREATE DATABASE databasename;
Example: - CREATE DATABASE testDB;
Once a database is created, you can check it in the list of databases with the
following SQL command: SHOW DATABASES
SQL Queries
The SQL CREATE TABLE Statement: -
The CREATE TABLE statement is used to create a new SQL database.
Syntax: - CREATE TABLE tablename;
Example: - CREATE TABLE testtable;
Once a database is created, you can check it in the list of databases with the
following SQL command: describe tablename
SQL Queries
The SQL ALTER TABLE Statement: -
ALTER TABLE statement specifies how to add, modify, drop or delete columns in a
table. It is also used to rename a table.
Syntax: - ALTER TABLE table_ name
ADD column_ name column- definition;
Example: - ALTER TABLE customers
ADD customer_ age varchar2(50);
How to modify column of a table?
Syntax: - ALTER TABLE table_ name
MODIFY column_ name column_ type;
Example: - ALTER TABLE customers
MODIFY customer_ name varchar2(100) not null;
How to drop column of a table?
Syntax: - ALTER TABLE table_ name
DROP COLUMN column_ name;
Example: - ALTER TABLE customers
DROP COLUMN customer_ name;
How to rename column of a table?
Syntax: - ALTER TABLE table_ name
RENAME COLUMN old_ name to new_ name;
Example: - ALTER TABLE customers
RENAME COLUMN customer_ name to cname;
How to rename table?
Syntax: - ALTER TABLE table_ name
RENAME TO new_ table_ name;
Example: - ALTER TABLE customers
RENAME TO retailers;
How to Drop table?
DROP TABLE statement is used to remove or delete a table from the Oracle
database.
Syntax: - DROP TABLE table_ name
Example: - DROP TABLE customers;
CREATE TABLE AS Statement
The CREATE TABLE AS statement is used to create a table from an existing table by
copying the columns of existing table.
Note: - If you create the table in this way, the new table will contain records from
the existing table.
Syntax: - CREATE TABLE new_ table
AS (SELECT * FROM old_ table);
Example: - CREATE TABLE newcustomers
AS (SELECT * FROM customers)
Insert Query
INSERT statement is used to add a single record or multiple records into the table.
Syntax (Inserting a single record using the Values keyword): -
INSERT INTO table (column1, column2, ... column_ n ) VALUES (expression1,
expression2, ... expression_ n );
Syntax (Inserting multiple records using a SELECT statement): -
INSERT INTO table (column1, column2, ... column_ n ) SELECT expression1,
expression2, ... expression_ n FROM source_ table WHERE conditions;
Insert Query
Parameters: -
1) table: - The table to insert the records into.
2) column1, column2, ... column_ n: - The columns in the table to insert values.
3) expression1, expression2, ... expression_ n: - The values to assign to the columns
in the table. So column1 would be assigned the value of expression1, column2
would be assigned the value of expression2, and so on.
4) source_ table: - The source table when inserting data from another table.
5) conditions: - The conditions that must be met for the records to be inserted.
Insert Query
Insert Example: By VALUE keyword
It is the simplest way to insert elements to a database by using VALUE keyword.
Example: - INSERT INTO suppliers (supplier_ id, supplier_ name) VALUES (50,
'Flipkart');
Insert Example: By SELECT statement
This method is used for more complicated cases of insertion. In this method
insertion is done by SELECT statement. This method is used to insert multiple
elements.
In this method, we insert values to the "suppliers" table from "customers" table.
Both tables are already created with their respective columns.
Insert Query
Example: - INSERT INTO suppliers
(supplier_ id, supplier_ name)
SELECT age, address
FROM customers
WHERE age > 20;
Insert All Query
The INSERT ALL statement is used to insert multiple rows with a single INSERT
statement. You can insert the rows into one table or multiple tables by using only
one SQL command.
Syntax: - INSERT
INTO table_ name (column1, column2, column_ n) VALUES (expr1, expr2, expr_ n)
INTO table_ name(column1, column2, column_ n) VALUES (expr1, expr2, expr_ n)
INTO table_ name (column1, column2, column_ n) VALUES (expr1, expr2, expr_ n)
SELECT * FROM dual;
Insert All Query
Examples: - This example specifies how to insert multiple records in one table.
Here we insert three rows into the "suppliers" table.
INSERT
INTO suppliers (supplier_ id, supplier_ name) VALUES (20, 'Google')
INTO suppliers (supplier_ id, supplier_ name) VALUES (21, 'Microsoft')
INTO suppliers (supplier_ id, supplier_ name) VALUES (22, 'Apple')
SELECT * FROM dual;
Update Query
UPDATE statement is used to update the existing records in a table.
Traditional Update table method
Syntax: -
UPDATE table
SET column1 = expression1,
column2 = expression2,
...
column_ n = expression_ n
WHERE conditions;
Update Query
Example: (Update single column)
UPDATE suppliers
SET supplier_ name = 'Kingfisher'
WHERE supplier_ id = 2;
Example: (Update multiple columns)
UPDATE suppliers
SET supplier_ address = 'Agra',
supplier_ name = 'Bata shoes'
WHERE supplier_ id = 1;
Delete Query
DELETE statement is used to remove or delete a single record or multiple records
from a table.
Syntax: -
DELETE FROM table_ name
WHERE conditions;
Example: - (On one condition)
DELETE FROM customers
WHERE name = 'Sohan';
Delete Query
Example: - (On multiple conditions)
DELETE FROM customers
WHERE last_ name = 'Maurya'
AND customer_ id > 2;
Truncate Table Query
TRUNCATE TABLE statement is used to remove all records from a table. It works
same as DELETE statement but without specifying a WHERE clause. It is generally
used when you don't have to worry about rolling bac
Once a table is truncated, it can't be rolled back. The TRUNCATE TABLE statement
does not affect any of the table's indexes, triggers or dependencies.
Syntax: -
TRUNCATE TABLE table_ name
Truncate Table Query
Example: -
TRUNCATE TABLE customers;
TRUNCATE TABLE vs DELETE TABLE
Both the statements will remove the data from the "customers" table but the main
difference is that you can roll back the DELETE statement whereas you can't roll
back the TRUNCATE TABLE statement.
Select Query
The SELECT statement is used to retrieve data from one or more than one tables,
object tables, views, object views etc.
Note: - SELECT statement is very complex that consists of many clauses such as
ORDER BY, GROUP BY, HAVING, JOIN
Syntax: - SELECT expressions FROM tables WHERE conditions;
Examples (select all fields): - SELECT * FROM customers;
Examples (select specific fields): - SELECT age, address, salary FROM customers
WHERE age < 25 AND salary > '20000'
Change Password Query
We can change the password using alter command.
Syntax: - ALTER USER user_ name IDENTIFIED BY new_ password;
Example: -
Thank You