DBSC Study Notes
DBSC Study Notes
1. End Users
• The database is designed, created and maintained to serve the
information needs of end users, who use the data and information to
perform their jobs.
• Regardless of the elegance of the database design or the sophistication of
the hardware and software used. If the database does not provide
adequate information to users, it is failure. Users can be categorized
according to the way they access data.
2. DBMS
• The DBMS is a software system that receives and satisfies all requests for
data. Normally the DBMS provides concurrent access to multiple database
users.
• The commonly used DBMS are My SQL, MS Access, MS SQL Server, Oracle
etc.
3. Database
The “database more specifically in database system environment” is the
physical repository of all user’s data.
For example, student information like registration, exam, finance and library are
stored within a university database.
5. Application programmer
• This group includes programmers who write application programs for the
users.
• Their application programs may be written in a variety of hosts
programming languages such as C#, VB and JAVA.
• Each Application Program to access the database contents; calls the
database management system (DBMS) to perform the operations over the
database. For example, update or retrieval.
Data
• Structured data?
• Unstructured data or as multimedia data?
• Information
• For Example.
• Consider the following facts:
Ashraf 74125
Rashid 45689
Khalid 78541
• These facts satisfy our definition of Data, but data are useless in its
present form.
Information(cont..)
• By adding few additional data items and providing some structure, we
recognize a class BS 3rd semester for a particular course. Now this is some
useful information to some users like course instructor or Examination
Department.
Information(cont..)
• Another way to convert the data into information is to summarize it or
otherwise process and present it for human interpretation (to understand
it easily).
GPA
4
3.5
3
2.5
2 GPA
1.5
1
0.5
0
MGT
MKT
PRM
MGT
IS
ACCT
Data VS Information
DATA INFORMATION
Can’t used directly for decision making Can be used for decision making di
Levels of Data
• Real World Data
• Meta Data
• Existence Data
Metadata
• Data that describes the properties of data.
• Some of the properties of the data are data names, definitions, length or
size, and domain values.
• Also known as data about data.
Inconsistency
• One other important problem is inconsistency of data.
• The change in the record in one system is not reflected in the same record
stored in other system.
• So different systems in organization; store different facts about same
thing. This is inconsistency as is shown in figure below
Inconsistency(cont…)
Advantages of Database Approach(cont…)
Redundancy
Data Independence
Better Data Integrity
ER Data Model
Symbols
Attribute
Types of Attributes
Primary Key
Foreign Key
Relationships
Relationships types
1. Unary Relationship
2. Binary Relationship
3. Ternary Relationships
ER Data Model (or E-R Model)
The ER model defines the conceptual view of a database. It works around real-world
entities and the associations among them.
At view and engineering level, the ER model is considered as a good option for designing
databases.
E-R Diagram:
A graphical representation of an entity-relationship model.
Entity
Term used to mean three different meanings
Entity type
Entity instance
Entity set
Entity Type
A name/label assigned to items/objects that exist in an environment and
that have similar properties
It could be person, place, event or even concept
Entity Type
Distinguishable from other entity types on the basis of properties
Identified through abstraction (thinking closer) process
Different from External Entity (Not from un-related terminology)
Symbols
Reg Entity Type NAME
BOOK BOOK_COPY
Attribute
An attribute of an entity type is a defining property or quality of the instances of that entity type. Entity
instances of same entity type have the same attributes. (e.g. Student Identification, Student Name)
Every attribute has a domain
-Set of possible values for an attribute
1. Simple
Attributes-
2. Composite
f
3. Single Valued
Attributes-
4. Multi Valued
Attributes-
5. Derived
6. Key
Attributes-
Symbols for
Each represented as an oval, linked
with an ET symbol
Types of Attributes
Simple vs. composite
Single valued vs. multi-valued
Stored vs. derived
EMPLOY address
EE
Field (Column)
Customers
a field
Record (Row)
Customers
a record
Primary
Customers
primary key
field
y key is a unique identifier of records in a table.
y key values may be generated manually or automatically.
Primary
Key Roles (Performances)
foreign key
field
Relationships
Relationships are the connections and interactions between the entities
instances, e.g., Program and Student ETs are linked
Naming Relationships
• Up to you
• If there is no proper name of the association in the system then
participants’ names of abbreviations are used
STUDENT and CLASS have ENROLL relationship
However, it can also be named as STD_CLS
Symbol for
Relationships
DEPT EMP
STD BOOK
Relationships types
Relationship type can be identified like an entity type
Entities involved in a relationship are called its participants
Types of the relationships can be established on the basis of participant
ETs
One-to-One Relationships
One-to-Many Relationships
Many to One Relationships
Many-to-Many Relationships
1.One-to-one:
One instance from entity set X can be associated with at most one
instance of entity set Y and vice versa.
Example: One student can register for numerous courses. However, all
those courses have a single line back to that one student. But one at a
time.
One to one
Deptt HoD
2.One-to-many:
One instance from entity set X can be associated with multiple instances of
entity set Y, but an entity from entity set Y can be associated with at least
one entity.
For example, one class is consisting of multiple students.
3. Many to One
More than one instances from entity set X can be associated with at most
one instance of entity set Y. However, an instance from entity set Y may or
may not be associated with more than one instance from entity set X.
For example, many students belong to the same class.
4. Many to Many:
One instance from Entity X can be associated with more than one
Instances from Y and vice versa.
For example, Students as a group are associated with multiple faculty
members, and faculty members can be associated with multiple students.
Sample ERD
Relationship
A meaningful association among entity types.
Relationship is the association among the instances of one or more
entities.
Relationship is the glue that holds together the various component of E-R
model.
Example
STUDENT Completes COURSE
Relationship
Graphical Representation
Denoted by Line labeled with the name of relationship
Completes
STUDENT COURSE
Relationship
Associative Entity
An entity that associates (links) the instances of one or more entities
and contains attributes that belongs to the relationship of
participating entities.
Relationship
Associative Entity
When should a relationship with attributes can be an associative entity?
Associative entity exist only when the participating entity type has
many to many relationship.
Associative entity may have its own identifier and other attributes
Associative entity get primary key of participating entities with other
attributes belongs to the relationship.
Relationship
Associative Entity
Graphical Representation
Represented by diamond symbol enclosed in a rectangle
Date_Complete
Certificate_No d
Roll_No Title
Name Course_ID
Relationship
Degree of Relationship
The number of entity types that participate in a relationship is called
degree of the relationship.
The most commonly used relationship in E-R Model are Unary, Binary and
Ternary relationship.
Relationship
Unary Relationship
A relationship among the instances of single entity type
Unary relationship are also called recursive relationship
Relationship
Unary Relationship
Manages
EMPLOYEE
Relationship
Binary Relationship
A relationship among the instances of two entity types
STUDENT DEPARTMENT
STUDENT COURSE
Relationship
Ternary Relationship
A relationship among the instances of three entity types
Relational Keys
Candidate Key
Primary Key
Composite Primary Key
Alternate Key
Secondary Key
Foreign Key
Alternative terminologies
Introduction
The relational data model represents data in the form of tables.
The relational data model consists of the following three components:
Data structure
Data Manipulation
Data integrity
Introduction
Data structure
Data are organized in the form of tables with rows and columns.
Data Manipulation
Powerful operations (using the SQL language) are used to manipulate
(control and operate on) data stored in the relations.
Data integrity
Facilities are included to specify business rules that maintain the integrity
of data when they are manipulated.
Relational Keys
We must be able to store in and retrieve data from a table, based on some well-
known attribute’s values.
for achieving this goal relations (tables) contains some keys.
Keys
The set of attributes that can be used to identify or access records of a
table. The concept of key is very important.
Why and how?
A table may have many instances (records), from a few to several
thousands and even more.
Now out of many instances, when we want to retrieve a particular record,
and many times we do need it, then key is the only solution.
Keys
For example
Think of whole population of Afghanistan, the data of all citizen stored at
one place, let say with NIC (National Identity Card) Department.
Now if at some time we need to identify a particular person out of all
stored data, how can we do that?
Can we use name for that, well, think of any name, like Muhammad
Aslam, now we may find many people with this name in Afghanistan.
Another option is the combination of name and father name, then again,
Muhammad Shoaib s/o Muhammad Karim, there could be so many such
pairs. There could be many such examples. Another point for this is that
the key comes lengthy.
Keys
For example
However, if you think about National ID Card number, then no matter
whatever is the population of Afghanistan, you will be able always to pick
a single person. That is the key.
While defining an entity type we also generally define the key of that
entity type.
How do we select the key?
From the study of the real-world system; key attribute(s) already exist
there, sometimes if they don’t then the designer has to define one, like
sear number as auto number.
A key can be simple, that is, consisting of single attribute, or it could be
composite which consists of two or more attributes.
Types of Keys
Following are the major types of keys:
Candidate Key
Primary Key
Composite Primary Key or Composite Key
Alternate Key
Secondary Key
Foreign Key
Candidate key
One or more keys that are candidate for becoming the primary key are
called Candidate Keys
Candidate Key
Candidate Key
Candidate Key
STUDENT
Primary key
A candidate key chosen by the database designer to act as unique key is
the primary key. It is an attribute (or combination of attributes) that
uniquely identifies each record in a relation.
An entity type may have more than one candidate keys; in that case the
database designer has to designate one of them as primary key.
Since there is always only a single primary key in a table. If there is just one
candidate key, then obviously the same will be declared as primary key.
The primary key can also be defined as the successful candidate key.
Primary key
Primary key
Primary Key Alternate
Key
Primary key
There are certain things that are generally considered while making this
decision that which candidate key should be primary key
The key that is shorter, easy to remember, to type and more meaningful to
select as a primary key.
Primary key
A major characteristic of the PK is that it cannot have the NULL value.
If PK is a composite, then none of the attributes included in the PK can
have the NULL
Alternate Keys
Candidate keys which are not chosen as the primary key are known as
alternate keys.
For example
We have two candidate keys of table STUDENT in figure of a previous
slide, Reg_No and NIC_Number,
If we select Reg_No as PK then the NIC_Number will be alternate key.
Secondary Key
Many times we need to access certain instances i.e. group of records from
an entity type using the attribute’s value other than the PK.
The difference in accessing instances using the value of a key and non-key
attribute is that the search on the value of PK will always return a single
record (if it exists). While we can get more records in case of non-key
attribute.
Secondary Key
For example
We want to see how many of our students belong to Kabul
In that case we will access those instances of the STUDENT entity that
contain “Kabul” as their address.
so address is called secondary key, since we are accessing records on the
basis of address.
Foreign key
It is an attribute in a relation of a database that serves as the primary key
of another relation in the same database.
For example
Consider the relations EMPLOYEE1 and DEPARTMENT:
EMPLOYEE (Emp ID, Name, DeptNo, Salary)
DEPARTMENT(DeptNo, Dept Name, Location)
The attribute DeptNo is a foreign key in EMPLOYEE.
It allows a user to associate any employee with the department assigned.
An attribute can be presented as foreign key by a dashed underline.
( DeptNo )
Properties of Relation
Integrity Constraints
Domain Constraints
Entity Integrity
Referential integrity
Database System Concepts
Relation Properties and Constraints
Properties of Relation
We have defined relations as two-dimensional tables
of data.
However, not all tables are relations (Tables of
database while relationship means link or association).
Relations have several properties that distinguish them
from non-relational tables.
We summarize these properties on coming slides.
Properties of Relation
1. Each relation (or table) in a database has a unique name
It means that in one relational schema there should be
no other table having the same name.
Properties of Relation
2. An entry at the intersection of each row and column is atomic (or single
value).
A cell is the intersection of a row and a column, so it represents a value of
an attribute in a particular row.
The property means that the value stored in a single cell is considered as a
single value.
Properties of Relation
Properties of Relation
3. Each row is unique; no two rows in a relation are identical.
Two rows of a table cannot be same. The value of at least a single attribute
has to be different that makes the entire row distinct.
Properties of Relation
4. Each attribute (or column) within a table has a unique name.
It has to be unique, a table cannot have duplicated column/attribute
names.
In the table figure we seen, the bold items in the first row represent the
column/attribute names.
Properties of Relation
5. The sequence of columns (left to right) is not necessary. The columns of a
relation can be interchanged without changing the meaning or use of the
relation.
6. The sequence of rows (top to bottom) is not necessary. As with columns,
the rows of a relation may be inter changed or stored in any sequence.
Properties of Relation
7. The values of the attributes come from the same domain
Each attribute is assigned a domain when it is defined.
The domain represents the set of possible values that an attribute can
have.
Domain type may be numeric or text or some other valid data types.
Integrity Constraints
The relational data model includes several types of constraints, or
business rules. The main purpose is to maintain the accuracy and integrity
of data in the database.
The major types of integrity constraints are
Domain Constraints
Entity Integrity
Referential integrity
Domain Constraints
All the values that appear in a column of a relation must be taken from the
same domain.
A domain is the set of values that may be assigned to an attribute.
A domain definition usually consists of the following components:
Domain name
Data type
Size (or length).
Allowable values.
Entity Integrity
The entity integrity rule is designed to ensure that every relation has a
primary key (highly recommended), and that values for that primary key
are all valid.
It guarantees that every primary key attribute is non-null.
It guarantees that primary key value must not be duplicated.
Entity Integrity
A null is a value that may be assigned to an attribute when no other value
applies or when the value is unknown.
In reality, a null is not a value but rather the absence of a value.
Referential Integrity
Referential integrity constraint is a rule that maintains consistency among
the rows of two relations.
The rule states that if there is a Foreign Key in one relation, then either
each foreign key value must match a primary key value in another
relation, or the foreign key value will be null.
Normalization
• Main objective in developing a logical data model for relational database
systems is to create an accurate representation of the data, its
relationships, and constraints.
• To achieve this objective, we must identify a suitable set of relations.
Normalization
• Four most commonly used normal forms are first (1NF), second (2NF) and
third (3NF) normal forms, and Boyce–Codd normal form (BCNF).
• Based on functional dependencies among the attributes of a
relation/Table.
• A relation can be normalized to a specific form to prevent possible
occurrence of update anomalies.
Normalization
• Normalization is the process for assigning attributes to entities
• Reduces data redundancies
• Helps eliminate data anomalies
• Produces controlled redundancies to link tables
• Normalization stages
• 1NF - First normal form
• 2NF - Second normal form
• 3NF - Third normal form
• 4NF - Fourth normal form
Data Redundancy
• Major aim of relational database design is to group attributes into
relations to minimize data redundancy and reduce file storage space
required by base relations.
• Problems associated with data redundancy are illustrated by comparing
the following Staff and Branch relations with the StaffBranch relation.
Data Redundancy
Data Redundancy
Data Redundancy
• StaffBranch relation has redundant data: details of a branch are repeated
for every member of staff.
• In contrast, branch information appears only once for each branch in
Branch relation and only branchNo is repeated in Staff relation, to
represent where each member of staff works.
Update Anomalies
• Relations that contain redundant information may potentially suffer from
update anomalies.
• Types of update anomalies include:
• Insertion,
• Deletion,
• Modification.
Insertion anomalies
At the time of record insertion in to a table which is not suitably
normalized. It is often become time consuming, because some part of
stored record needs to be re-typed.
And Some spelling mistake may lead to data inconsistency.
While no mistake also leads to data redundancy.
For example entering new employee in the same branch using the
StaffBranch relation.
Deletion anomalies
At the time of record Deletion from a table which is not suitably
normalized. It is often become lost of necessary information, because
some part of stored record needs to maintained.
As the record can’t be partially deleted means it is atomic.
For example deletion of a staff member may completely delete a branch
from the system based on StaffBranch relation.
Modification anomalies
At the time of record Modification in a table which is not suitably
normalized. It is often leads the database to inconsistent state, because
some part of stored record needs to maintained coherent all the times.
As on a record referential integrity can’t be implement unless and until it is
not in primary and foreign key relationship .
For example modifying the staff member’s name or a branch’s address may left
the record set inconsistent or the modification may become quite hard in
consideration of StaffBranch relation, as the same modification may be required
in multiple records.
UNF to 1NF
Nominate an attribute or group of attributes to act as the key for the table.
All key attributes defined
No repeating rows in table
All attributes dependent on primary key
Second Normal Form (2NF)
Based on concept of full functional dependency:
A and B are attributes of a relation,
B is fully dependent on A if B is functionally dependent on A but not on
any proper subset of A.
2NF - A relation that is in 1NF and every non-primary-key attribute is fully
functionally dependent on the primary key (no partial dependency)
1NF to 2NF
Identify primary key for the 1NF relation.
Identify functional dependencies in the relation.
If partial dependencies exist on the primary key remove them by placing
them in a new relation along with copy of their determinant.
2NF to 3NF
Identify the primary key in the 2NF relation.
Identify functional dependencies in the relation.
Figure 4.7
Decomposition of
Table
Structure to Meet
Database System Concepts
SQL
Structured Query Language
Overview
SQL
DML-(Data Manipulation Language)
-SELECT (show data)
-INSERT (save data)
-UPDATE (change data)
-DELETE (remove data)
DDL-(Data Definition Language)
-CREATE
-ALTER
-DROP
SQL is a Standard
SQL is an ANSI (American National Standards Institute) standard computer
language for accessing and manipulating database systems. SQL statements are
used to retrieve and update data in a database. SQL works with database
programs like MS Access, MS SQL Server, Oracle, etc.
There are many different versions of the SQL language, but to be in compliance
with the ANSI standard, they must support the same major keywords in a
similar manner (such as SELECT, UPDATE, DELETE, INSERT, WHERE, and so on).
Note: Most of the SQL database programs also have their own proprietary
extensions in addition to the SQL standard!
A database most often contains one or more tables. Each table is
SQL Database
by a name (e.g. "Customers" or "Orders"). Tables contain records
with data.
Tables
Below is an example of a table called "Persons":
A database most often contains one or more tables. Each
LastName FirstName Address City
table
Hansen is identified by
Ola a name (e.g. "Customers"
Timoteivn 10 or "Orders").
Sandnes
Tables
Svendson
contain records
Tove
(rows) with data.
Borgvn 23 Sandnes
Below
Pettersen
is an example
Kari
of a table called "Persons":
Storgt 20 Stavanger
SQL Queries
With SQL, we can query a database and have a result set returned.
A query like this:
SELECT LastName FROM Persons
Gives a result set like this:
LastName
Hansen
Svendson
Pettersen
SQL The SELECT Statement
The SELECT statement is used to select data from a table. The tabular result is
stored in a result table (called the result-set).
Syntax
SELECT column_name(s)
FROM table_name
To select the columns named "LastName" and "FirstName", use a SELECT
statement like this:
SELECT LastName, FirstName FROM Persons
Persons
Result Set
LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
Company
Sega
W3Schools
Orders
W3Schools 6798
Syntax
SELECT column FROM table
WHERE column operator value
With the WHERE clause, the following operators can be used:
Operator Description
= Equal
<> Not equal
> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
BETWEEN Between an inclusive range
LIKE Search for a pattern
WHERE City='Sandnes'
LastName FirstName Address City Year
Hansen Ola Timoteivn 10 Sandnes 1951
Svendson Tove Borgvn 23 Sandnes 1978
Svendson Stale Kaivn 18 Sandnes 1980
Using Quotes
Note that we have used single quotes around the conditional values in the
examples.
SQL uses single quotes around text values (most database systems will also
accept double quotes). Numeric values should not be enclosed in quotes.
This is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong:
SELECT * FROM Persons WHERE FirstName=Tove
Using LIKE
The following SQL statement will return persons with first names that start with
an 'O':
SELECT * FROM Persons
WHERE FirstName LIKE 'O%'
The following SQL statement will return persons with first names that end with
an 'a':
SELECT * FROM Persons
WHERE FirstName LIKE '%a'
Using LIKE
The following SQL statement will return persons with first names that contain
the pattern 'la':
SELECT * FROM Persons
WHERE FirstName LIKE '%la%'
Inserting data in to a table
Syntax:
WHERE Clause
WHERE Clause is used to specify the condition.
Operators are used to specify the condition.
1)Relational Operators: =, >,<,<=,>=,<>,!=
2)Logical Operators: AND,OR, NOT
3)Special Operators: IN, BETWEEN, LIKE,IS
Syntax:
select * from table_name;
select col1_name,.. Coln_name from table_name;
WHERE <condition>
e.g.
select * from batch_detail
WHERE Trainer =‘Rajiv Arora’ ;
Result:
1 testing_01 Rajiv Arora 3
ORDER BY Clause
ORDER BY clause is used to impose an order on the result of a query.
It is used with SELECT stmt.
One or more column can be specified in ORDER BY clause.
Syntax:
SELECT <column_list>
FROM <table>
WHERE <condition>
ORDER BY<columns> [ASC|DESC] ;
GROUP BY Clause
GROUP BY clause is used to divide the rows in a table into smaller groups.
SQL groups the query after it retrieves the rows from a table.
Conditional Retrieval of rows from a grouped result is possible with the
HAVING clause
DDL Queries
CREATE TABLE
tblCust
(
cid NUMERIC(10,0) PRIMARY KEY,
cname VARCHAR(10),
email VARCHAR(20) UNIQUE);
TRUNCATE
The TRUNCATE action query removes all rows and reset the identity
column.
TRUNCATE TABLE tablename;
Drop Table
DROP TABLE tblName;
ALTER TABLE table_name ALTER COLUMN column_name column_type;
Sample queries
SELECT DEPT,COUNT(*) AS TOTAL_STD FROM TBL_Std
GROUP BY DEPT
SELECT * FROM TBL_STD ORDER BY SID DESC;
INSERT INTO TBL_Std(SNAME,AvgMarks) VALUES('Naveed',91);
UPDATE TBL_Std SET SPHONE='0738877878',DEPT='BCE'
WHERE SID=10
DELETE FROM TBL_Std WHERE DEPT IS NULL
CREATE TABLE TBL1
(
CID NUMERIC(10,0) PRIMARY KEY,
CNAME VARCHAR(20)
)
ALTER TABLE TBL1 ADD CPHONE VARCHAR(13)
DROP TABLE TBL1