Module Name: Database
Management Systems
Module Code: IS200
DCY1A – Edit
Data v/s Information
What is Data?
Data is the complete list of facts and details like text,
observations, figures, symbols and description of things. It is
the raw list of facts that are processed to gain information.
The basic concept of data is associated with scientific
research collected by different research organisations.
What is Information?
Information is the processed, organised and structured data.
It provides context for data. However, both the terms are used
together, information can be easily understood than data.
> CHANGE, EVERYDAY
Data v/s Information
> CHANGE, EVERYDAY
What is a Database?
Database is a collection of related data and data is a
collection of facts and figures that can be processed
to produce information.
Mostly data represents recordable facts. Data aids in
producing information, which is based on facts. For
example, if we have data about marks obtained by all
students, we can then conclude about toppers and
average marks.
> CHANGE, EVERYDAY
History of Database
Before the arrival of Databases, File Systems were
used.
Some problems with the File Systems:
Data dependence
Data separation and isolation leading to multiple
incompatible file formats
Data duplication and resulting data redundancy and
inconsistency
Difficulty in accessing and querying data since a new
program has to be written to carry out each new task
Integrity constraints have to be expressed in the
program code, and it is difficult to add new constraints
> CHANGE, EVERYDAY
or change existing ones
Why Database design is important?
Database Design is a collection of processes that facilitate
the designing, development, implementation and
maintenance of enterprise data management systems.
Properly designed database are easy to maintain, improves
data consistency and are cost effective in terms of disk
storage space. The database designer decides how the data
elements correlate and what data must be stored.
The main objectives of database designing are to produce
logical and physical designs models of the proposed database
system.
> CHANGE, EVERYDAY
Why Database design is important?
The logical model concentrates on the data requirements and
the data to be stored independent of physical considerations.
It does not concern itself with how the data will be stored or
where it will be stored physically.
The physical data design model involves translating the
logical design of the database onto physical media using
hardware resources and software systems such as database
management systems (DBMS).
> CHANGE, EVERYDAY
Database Management System
(DBMS)
Most users do not access a database directly. Instead, they use a
database management system (DBMS) to access it indirectly. A
DBMS is a collection of programs that enables you to enter, organize,
and select data in a database. For example, a ticket agent may run a
ticket system program on his or her desk computer that in turn
accesses a ticketing database.
a DBMS is a complex software application. While all database
management systems may not provide all of these features, these are
the general characteristics of today’s DBMSs. Using a database
requires considerable expertise and knowledge about the specific
DBMS being used. Some of the more popular DBMS's today are
MySQL, Microsoft SQL Server, Oracle, PostgreSQL, Microsoft Access,
> CHANGE, EVERYDAY
and IBM's DB2.
Database Management System
(DBMS)
The DBMS is expected to:
Allow users to create new databases and specify their schemas (logical structure of
data)
Give users the ability of query the data and modify the data
Support the storage of very large amounts of data, allowing efficient access to data for
queries and database modifications,
Enable durability, the recovery of the database in the case of failures
Control access to data from many users at once in isolation and ensure the actions on
data to be performed completely.
> CHANGE, EVERYDAY
Types of Databases
> CHANGE, EVERYDAY
Types of Databases
Flat File Database
Flat-type databases are considered “flat” because they are two-
dimensional tables consisting of rows and columns. Each column can be
referred to as a field (such as a person’s last name or a product’s ID
number), and each row can be referred to as a record (such as a person’s
or product’s information).
The following is an example of a simple flat-type database in which a
supply company has matched each customer with what he or she
consistently orders for easy retrieval and reordering purposes:
id customer order
1 allen notebook
2 smith paper
3 dennis pens
4 alex ink cartridges
5 sloan printer
> CHANGE, EVERYDAY
Types of Databases
hierarchical database
A hierarchical database is similar to a tree structure (such as a
family tree). In this database, each “parent” table can have multiple
“children,” but each child can have only one parent.
The Flat File brought in a lot of redundant data at every entry. For
instance, if I want to make a single data-set with all products
purchased at a grocery store with all information of the customer
and product, we will have every single row consisting of all
customer and product information. Wherever we have a repeat
product or customer, we have repeat data. People thought of storing
this as different tables and define a hierarchy to access all the data,
which will be called as hierarchical database.
> CHANGE, EVERYDAY
Types of Databases
Hierarchical Database is very similar to your folder
structure on the laptop. Every folder can contain sub-
folder and each sub-folder can still hold more sub-folders.
Finally in some folders we will store files. However, every
child node (sub-folder) will have a single parent (folder or
sub-folder). Finally, we can create a hierarchy of the
dataset :
> CHANGE, EVERYDAY
Types of Databases
Relational Database
Hierarchical databases can solve many purposes, its
applications are restricted to one-to-one mapping data
structures. For example, it can be used to show job profile
hierarchy in a corporate. But the structure will fail if the
reporting becomes slightly more complicated and a single
employee reports to many managers. Hence, people thought
of database structures which can have different kinds of
relations. This type of structure should allow one-to-many
mapping. Such table came to be known as Relational
database management system (RDBMS).
> CHANGE, EVERYDAY
Types of Databases
Relational Database
Following is an example RDBMS data structure :
> CHANGE, EVERYDAY
Types of Databases
NOSQL
NoSQL is often known as “Not Only SQL”. When people
realized that unstructured text carry tonnes of
information which they are unable to mine using RDBMS,
they started exploring ways to store such datasets.
Anything which is not RDBMS today is loosely known as
NoSQL. After social networks gained importance in the
market, such database became common in the industry.
Following is an example where it will become very
difficult to store the data on RDBMS :
> CHANGE, EVERYDAY
Types of Databases
NOSQL
Facebook stores terabytes of additional data every day.
Let’s try to imagine the structure in which this data can
be structured :
> CHANGE, EVERYDAY
Types of Databases
NOSQL
In the above diagram, same colour box fall into same
category object. For example the user, user’s friends, who
liked and Author of comments all are FB users. Now if we try
to store the entire data in RDBMS, for executing a single
query which can be just a response of opening home page,
we need to join multiple tables with trillions of row together
to find a combined table and then run algorithms to find the
most relevant information for the user. This does not look to
be a seconds job for sure. Hence we need to move from
tabular understanding of data to a more flow (graph) based
data structure. This is what brought NoSQL structures
> CHANGE, EVERYDAY
Relational Database
A relational database is a collection of data items with pre-
defined relationships between them. These items are organized
as a set of tables with columns and rows. Tables are used to
hold information about the objects to be represented in the
database. Each column in a table holds a certain kind of data
and a field stores the actual value of an attribute. The rows in
the table represent a collection of related values of one object
or entity. Each row in a table could be marked with a unique
identifier called a primary key, and rows among multiple tables
can be made related using foreign keys. This data can be
accessed in many different ways without reorganizing the
database tables themselves. > CHANGE, EVERYDAY
What is a table?
The data in an RDBMS is stored in database objects which
are called as tables. This table is basically a collection of
related data entries and it consists of numerous columns
and rows.
Remember, a table is the most common and simplest form
of data storage in a relational database.
> CHANGE, EVERYDAY
What is a field?
Every table is broken up into smaller entities called fields. The fields in the CUSTOMERS
table consist of ID, NAME, AGE, ADDRESS and SALARY.
A field is a column in a table that is designed to maintain specific information about every
record in the table.
What is a Record or a Row?
A record is also called as a row of data is each individual entry that exists in a table. For
example, there are 7 records in the above CUSTOMERS table. Following is a single row of
data or record in the CUSTOMERS table
record is a horizontal entity in a table.
What is a column?
A column is a vertical entity in a table that contains all information associated with a specific
field in a table.
For example, a column in the CUSTOMERS table is ADDRESS, which represents location
description > CHANGE, EVERYDAY
SQL Constraints
Constraints are the rules enforced on data columns on a table. These are used to limit
the type of data that can go into a table. This ensures the accuracy and reliability of the
data in the database.
Constraints can either be column level or table level. Column level constraints are
applied only to one column whereas, table level constraints are applied to the entire
table.
Following are some of the most commonly used constraints available in SQL −
NOT NULL Constraint − Ensures that a column cannot have a NULL value.
DEFAULT Constraint − Provides a default value for a column when none is specified.
UNIQUE Constraint − Ensures that all the values in a column are different.
PRIMARY Key − Uniquely identifies each row/record in a database table.
FOREIGN Key − Uniquely identifies a row/record in any another database table.
CHECK Constraint − The CHECK constraint ensures that all values in a column satisfy
certain conditions.
INDEX − Used to create and retrieve data from the database very quickly.
> CHANGE, EVERYDAY
Data Integrity
The following categories of data integrity exist with each RDBMS −
•Entity Integrity − There are no duplicate rows in a table.
•Domain Integrity − Enforces valid entries for a given column by restricting the
type, the format, or the range of values.
•Referential integrity − Rows cannot be deleted, which are used by other
records.
•User-Defined Integrity − Enforces some specific business rules that do not fall
into entity, domain or referential integrity.
> CHANGE, EVERYDAY
Characteristics of Relational Databases
Ease of Data Retrieval
Under the relational model, accessing data in a database does not
require navigating a rigid pathway through a tree or hierarchy. Users
can query any table in the database, and combine related tables
using special join functions to include relevant data contained in
other tables in the results. Results can be filtered based on the
content of any column, and on any number of columns, allowing
users to easily retrieve meaningful results. Users can choose which
columns to include in the results so that only relevant data are
displayed.
In File System model , accessing file require transfer of whole file to
main memory and may be susceptible to file crash .
> CHANGE, EVERYDAY
Advantages of Relational Databases
Data Integrity
Data integrity is an essential feature of the relational
model. Strong data typing and validity checks ensure
data fall within acceptable ranges, and required data
are present. Referential integrity among tables prevents
records from becoming incomplete or orphaned. Data
integrity helps to ensure accuracy and consistency of
the data.
> CHANGE, EVERYDAY
Advantages of Relational Databases
Flexibility
The relational database model is naturally scalable and extensible,
providing a flexible structure to meet changing requirements and
increasing amounts of data. The relational model permits changes to a
database structure to be implemented easily without impacting the
data or the rest of the database. The database analyst can quickly
and easily add, remove, and modify tables and columns in an existing
database to meet business requirements. There is theoretically no
limit on the number of rows, columns or tables. In reality, growth and
change are limited by the relational database management system
and physical computing hardware, and changes may impact external
applications designed for a specific database structure.
> CHANGE, EVERYDAY
Advantages of Relational Databases
Less redundancy − DBMS follows the rules of
normalization, which splits a relation when any of its
attributes is having redundancy in values. Normalization is a
mathematically rich and scientific process that reduces data
redundancy.
Query Language − DBMS is equipped with query language,
which makes it more efficient to retrieve and manipulate
data. A user can apply as many and as different filtering
options as required to retrieve a set of data. Traditionally it
was not possible where file-processing system was used.
> CHANGE, EVERYDAY
Advantages of Relational Databases
Normalization
A systematic methodology exists for ensuring a relational
database design is free of anomalies that may impact the
integrity and accuracy of the database. "Database
normalization" provides a set of rules, qualities and objectives for
the design and review of a database structure. Normalization
objectives are described in levels called "normal forms." Each
level of normalization must be completed before progressing to
the next level. A database design is generally considered
normalized when it meets the requirements of the third normal
form. Normalization provides designers with confidence the
database design is robust and dependable.
> CHANGE, EVERYDAY
Advantages of Relational Databases
Multiuser and Concurrent Access − DBMS supports multi-user
environment and allows them to access and manipulate data in parallel.
Though there are restrictions on transactions when users attempt to
handle the same data item, but users are always unaware of them.
Security − Features like multiple views offer security to some extent
where users are unable to access data of other users and departments.
DBMS offers methods to impose constraints while entering data into the
database and retrieving the same at a later stage. DBMS offers many
different levels of security features, which enables multiple users to have
different views with different features. For example, a user in the Sales
department cannot see the data that belongs to the Purchase
department. Additionally, it can also be managed how much data of the
Sales department should be displayed to the user. Since a DBMS is not
saved on the disk as traditional file systems, it is very hard for miscreants
to break the code.
> CHANGE, EVERYDAY
What is a Query
Because a database can store thousands of records, it
would be a chore if you had to open a table and go
through each record one at a time until you found the
record you needed. Of course, the process would be even
more difficult if you had to retrieve multiple records.
Thankfully, you don’t have to go through database records
in this way. Rather, to retrieve data within a database, you
run a database query, which is an inquiry into the
database that returns information back from the database.
In other words, a query is used to ask for information from
a database. > CHANGE, EVERYDAY
What is a Index
If a database contains thousands of records with many fields
per record, it may take even a fast computer a significant
amount of time to search through a table and retrieve the
requested data. This is where a database index comes in
handy. An index is a data structure that improves the speed of
data retrieval operations on a database table. The
disadvantage of indexes is that they need to be created and
updated, which requires processing resources and takes up
disk space.
> CHANGE, EVERYDAY
Database Server
Databases are often found on database servers so that they can
be accessed by multiple users and provide a high level of
performance. One popular database server is Microsoft SQL Server.
Database servers like SQL Server do not actually house graphical
programs, word-processing applications, or any other type of
applications. Instead, these servers are entirely optimized to serve
only the purposes of the database itself, usually using advanced
hardware that can handle the high processing needs of the
database. It is also important to note that these servers do not act
as workstations; they generally are mounted on racks located in a
central data center and can be accessed only through an
administrator’s desktop system.
> CHANGE, EVERYDAY