Chapter 2
Database System Concepts and Architecture
1
Outlines
Data Models, Schema and Instances
DBMS Architecture and Data Independence
Database Language and Interface
The Database System Environment
Classification of DBMS
2
Data models, Schema and instances
Data Models: a collection of concepts that can be used to describe the
structure of a database.
Most of the data models allows as to specify basic operations (insert,
delete, modify or retrieve) as well as user defined operations to
represent the dynamic aspect or behavior of a database application.
Categories of Data Models
High-level or conceptual data models provide concepts that are close
to the way many users perceive data
low-level or physical data models provide concepts that describe the
details of how data is stored on the computer storage media, typically
magnetic disks.
Representational (or implementation) data models: provide
concepts that may be easily understood by end users but that are not
too far removed from the way data is organized in computer storage.
3
Schemas, Instances and Database State
Schema
A schema is a description of a particular collection of data, using a
given data model
It is a definition of database
E.g. In relational data model we have the following rule to define
schema
1. Table name outside the parenthesis
2. Column name inside the parenthesis
3. Primary key underlined
Student (SID, Name, Age, Sex)
4
Schemas, Instances, and Database State …
Cont’d
Schema is specified during database design and is not expected to
change frequently.
A displayed schema is called a schema diagram
Even if it is not common to change the database schema. Some times
there is a need of change of data base state. It is called schema
evolution.
We have three Schemas
1. Internal Schema:- To describe Physical storage structures and access
path, typically uses a physical data model
2. Conceptual Schema: - To describe the structure and constraints for the
whole data base for a community of users uses a conceptual or an
implementation data model.
3. External Schema:- To describe the various user views.
5
Schemas, Instances, and Database State …
Cont’d
Database State
The data in the database at a particular moment in time is called a
database state or snapshot.
It is also called the current set of occurrences or instances in the
database.
In a given database state, each schema construct has its own current set
of instances: for example, the STUDENT construct will contain the set
of individual student entities (records) as its instances.
Every time we insert or delete a record, or change the value of a data
item in a record, we change one state of the database into another state.
6
Schemas, Instances, and Database State …
Cont’d
When we define a new database, we specify its database schema only to
the DBMS. At this point, the corresponding database state is the empty
state with no data.
We get the initial state of the database when the database is first
populated or loaded with the initial data.
From then on, every time an update operation is applied to the database,
we get another database state. At any point in time, the database has a
current state
The DBMS is partly responsible for ensuring that every state of the
database is a valid state i.e, a state that satisfies the structure and
constraints specified in the schema.
7
DBMS
What is DBMS?
DBMS is a software package used for providing efficient, convenient
and safe multi-user storage of and access to massive amounts of
persistent data.
A DBMS also provides a systematic method for creating, updating,
storing, retrieving data in a database.
DBMS also provides the service of controlling data access, enforcing
data integrity, managing concurrency control, and recovery.
A full scale DBMS should at least have the following services to
provide to the user.
Data storage, retrieval and update in the database
A user accessible catalogue
Transaction support service
8
DBMS … Cont’d
Concurrency Control Services: access of database by different
users simultaneously
Recovery Services: a mechanism for recovering from failure
Authorization Services (Security): support the access authorization
Support for Data Communication: support data transfer
Integrity Services: rules about data and the change that took place
on the data, correctness and consistency of stored data
Services to promote data independency between the data and the
application
Utility services: sets of utility service facilities like
Importing data
Statistical analysis support
Index reorganization
Garbage collection
9
DBMS Language
1. Data Definition Language (DDL)
Language used to define each data element required by the
organization
Commands for setting up schema of the database
Used to set up a database, create, delete and alter table with the
facility of handling constraints
Is used to define the internal and external schema
2. Data Manipulation Language (DML)
Used for data manipulation
Typical manipulations include retrieval, insertion, deletion, and
modification of the data.
Since the required data or query by the user will be extracted using this
type of language, it is also called “Query Language”
10
DBMS Language … Cont’d
We have two types of DMLs:-
Non-Procedural Manipulation Languages
That allows the user to state what data is needed rather than how
it is to be retrieved.
E.g. SQL
Procedural Data Manipulation Languages
That allows the user to tell the system what data is needed and
exactly how to retrieve the data;
11
DBMS Language … Cont’d
How the Programmer Sees the DBMS
Start with DDL to create tables
CREATE TABLE Students (
Name CHAR (30)
ID CHAR (9) PRIMARY KEY NOT NULL,
Category CHAR (20)) . . .
Continue with DML to populate tables:
INSERT INTO Students
VALUES (‘Rahel’, ‘ICT 123’, ‘undergraduate’)
3. Data dictionary
The data dictionary contains definitions of objects in the system such
as tables and table relationships and rules defined on objects.
12
DBMS Interfaces
Menu-based Interfaces for Web Clients or Browsing
Apps for Mobile Devices
Forms-based Interfaces
Graphical User Interfaces
Natural Language Interfaces
Keyword-based Database Search
Speech Input and Output
Interfaces for Parametric Users
Interfaces for the DBA.
13
Database Development Life Cycle
The major steps in database design are;
1. Planning: That is identifying information gap in an organization and
propose a database solution to solve the problem.
2. Analysis: That concentrates more on fact finding about the problem
or the opportunity.
Feasibility analysis, requirement determination and structuring, and
selection of best design method are also performed at this phase.
3. Design: in database designing more emphasis is given to this phase.
The phase is further divided into three sub-phases.
1. Conceptual Design: concise description of the data, data
type, relationship between data and constraints on the data.
Used to elicit and structure all information requirements
14
Database Development Life Cycle … Cont’d
2. Logical Design: a higher level conceptual abstraction with selected
specific data model to implement the data structure.
It is particular DBMS independent and with no other physical
considerations.
3. Physical Design: physical implementation of the upper level design of
the database with respect to internal storage and file structure to
develop all technology and organizational specification.
4. Implementation: coding, testing and deployment of the designed
database for use.
5. Operation and Support: administering and maintaining the operation
of the database system and providing support to users.
15
DBMS Architecture and Data Independence
A major aim of a database system is to provide users with an abstract
view of data, hiding certain details of how data is stored and
manipulated.
Since a database is a shared resource, each user may require a different
view of the data held in the database. Accordingly there are several
types of architectures of database systems.
The American National Standards Institute/Standards Planning and
Requirements Committee (ANSI-SPARC) also introduced the three
level architecture of the database based on their degree of abstraction.
The architecture consists of the three levels: internal level, conceptual
level and external level.
In this architecture, schemas can be defined at three levels
The goal of the three-schema architecture is to separate the user
applications and the physical database.
16
DBMS Architecture … Cont’d
1. The Internal level:
Has an internal schema, which describes the physical storage structure of
the database.
The internal schema uses a physical data model and describes the
complete details of data storage and access paths for the database.
It describes the physical representation of the database on the computer.
This level describes how the data is stored in the database.
The way the DBMS and OS perceive the data
The internal level is concerned with such things as:
Storage space allocation for data
Record description for storage
Record placement
17
DBMS Architecture ... Cont’d
2. The conceptual level
Has a conceptual schema, which describes the structure of the whole
database for a community of users.
The conceptual schema hides the details of physical storage structures and
concentrates on describing entities, data types, relationships, user
operations, constraints, security and integrity information.
A high-level data model or an implementation data model can be used at
this level.
The community view of the database.
This level describes what data is stored in the database and the
relationships among the data.
It is a complete view of the data requirements of the organization.
18
DBMS Architecture … Cont’d
3. The External Level
Includes a number of external schemas or user views.
Each external schema describes the part of the database that a particular
user group is interested in and hides the rest of the database from that
user group.
The users’ view of the database.
The way users perceive the data
Describe part of the database that is relevant to each user.
Each user has a view of the real world in different way. For example:
dates may be viewed in (day, month, year) or (year, month, day)
Entities, attributes or relationships that are not of interest to the users
may still be represented in the database, but the users will be unaware of
them.
19
DBMS Architecture … Cont’d
ANSI-SPARC Architecture and Database Design Phases
20
DBMS Architecture … Cont’d
Sno. fname lname age Sal. Staff_no lname [Link].
External
level
Staff_no. fname lname DOB Sal. Br. No.
Conceptual level
Struct STAFF
{
Int staff_no;
Char fname[15];
Internal level Char lname[15];
Struct date date_of_birth;
Flooat sal;
Struct staff *next
};
21
DBMS Architecture … Cont’d
Example of data abstraction
A view when posting the grades to all students:
22
Data Independence
The three-schema architecture can be used to explain the concept of data
independence.
Data independence is defined as the capacity to change the schema at
one level of a database system without having to change the schema at
the next higher level.
We can define two types of data independence:
Physical data independence
Logical data independence
23
Data Independence … Cont’d
1. Logical data independence
Is the capacity to change the conceptual schema without having to
change external schemas or application programs.
We may change the conceptual schema to expand the database (by
adding a record type or data item), or to reduce the database (by
removing a record type or data item).
Only the view definition and the mappings need be changed in a
DBMS that supports logical data independence.
Modifications at the logical level are necessary whenever the logical
structure of the database is altered (for example, when money
market accounts are added to a banking system).
24
Data Independence … Cont’d
2. Physical data independence
Is the capacity to change the internal schema without having to change
the conceptual (or external) schemas.
Changes to the internal schema may be needed because some physical
files had to be reorganized.
Modifications at the physical level are occasionally necessary to
improve performance.
25
The Database System Environment
1. DBMS Component Modules
26
The Database System Environment
(cont.)
2. Database System Utilities
help the DBA manage the database system. Common utilities have
functions such as: Loading, Backup, Database storage reorganization
and Performance monitoring.
Other utilities may be available for sorting files, handling data
compression, monitoring access by users, interfacing with the network,
and performing other functions.
3. Tools, Application Environments and Communications Facilities
Data dictionary (or data repository)
Application development environments, such as PowerBuilder (Sybase)
or JBuilder (Borland)
communications software
27
Classification of Database Management
Systems
1. Based on the data model used
Relational data model: most of the current commercial DBMSs use this
model. The systems based on this model is called SQL systems.
Object data model: implemented in some commercial systems but has
not had widespread use.
Big data systems or NOSQL systems use various data models such as
document-based, graph-based, column-based, and key-value data
models.
Object-relational data model
hierarchical and network data models.
Some experimental DBMSs are based on the XML (eXtended Markup
Language) model, which is a tree-structured data model.
28
Classification of Database Management
Systems (cont..)
2. Based on number of users supported by the system.
Single-user systems support only one user at a time and are mostly used
with PCs.
Multiuser systems, which include the majority of DBMSs, support
concurrent multiple users.
3. Based on the number of sites over which the database is distributed.
A centralized DBMS: can support multiple users, but the DBMS and the
database reside totally at a single computer site.
A distributed DBMS (DDBMS) can have the actual database and
DBMS software distributed over many sites connected by a computer
network.
29
Classification of Database Management
Systems (cont..)
4. Based on Cost.
Open source (free) DBMS products: i.e. MySQL and PostgreSQL.
Licensed DBMSs.
5. Based on types of access path options for storing files.
One well-known family of DBMSs is based on inverted file structures.
6. Based on the purpose
General purpose
Specific purpose DBMS: built for a specific application when performance
of the system is major concern.
support a large number of concurrent transactions without imposing
excessive delays. E.g. airline reservations and telephone directory systems.
30
Centralized and Client/ Server architecture
for DBMSs
A Centralized DBMS Architecture is an architecture in which all
the DBMS functionality, application program executions and user
interface processing were carried out on an single machine.
Where as, the Client / Server architecture was developed to deal with
computing environments in which large number of PCs, workstations,
file servers, printers, Db servers, email servers, web servers, other
software and equipment are connected via a network.
The idea is to define specialized servers with specific functionalities.
Client: is the user machine that provides user interface capability and
local processing.
Server: is the machine containing both software and hardware to provide
service for the clients.
31
Centralized and Client/ Server architecture
for DBMSs cont.
Two-Tier Client/ Server architecture
The S/w components are distributed over two systems: the client
and server.
On the client side user interface and application programs can run.
When DBMS access is required the program establishes a
connection to the DBMS (which is on the server side); once the
connection is created, the client program can communicate with the
DBMS.
A standard called Open Database Connectivity (ODBC) provides
an application programming interface (API), which allows
client-side programs to call the DBMS and connect.
A related standard to support java programs connectivity is JDBC.
Advantage: simplicity and seamless compatibility.
32
Centralized and Client/ Server architecture
for DBMSs cont.
Three-Tier architecture for web browsers
Most web applications use this architecture.
It adds an intermediate layer between the client and data base server.
This middle tier is called the application / web server.
The main role of this tier is to run application programs and store
business rule (procedure and constraints) that are used to access the
data from the database.
Advantage: high data security.
33