0% found this document useful (0 votes)
31 views45 pages

Understanding Database Systems Basics

Uploaded by

Fedawak Hailu
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
31 views45 pages

Understanding Database Systems Basics

Uploaded by

Fedawak Hailu
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd

Database Systems

Instructor: Isayas E.
1
Chapter 1
Introduction

2
Outline

• The Need for Databases


• Components and functionality of DBS
• Data Models
• Database Design
• History of Database system
3
Definitions
• Data: Is raw, unorganized facts that need to be processed.
• Simply facts or figures
• Information: When data are processed, integrated, organized, structured or
presented so as to make meaningful or useful they are called information.
• E.g.: Data: each student’s test score
Information: average score of class
• Database :A collection of related data ,E.g. various students' name
Analogy: Filing cabinet-DB
• Drawers- tables
• Files in drawers-records
• Act of pulling out info-query
• Database Management System, DBMS : A software program which is used to
store & manipulate the database.
4
The Need for Databases
Database Management System (DBMS)
• DBMS contains information about a particular enterprise
• Collection of interrelated data
• Set of programs to access the data
• An environment that is both convenient and efficient to use
• Database Applications:
• Banking: transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions 5

• Databases can be very large.


University Database Example

• Application program that manipulate the files, including programs to:


• Add new students, instructors, and courses
• Register students for courses, and generate class rosters
• Assign grades to students, compute grade point averages (GPA) and generate
transcripts
• The university may have to write new application programs to deal with rules
specific to the new major.
• This typical file-processing system is supported by a conventional operating
system.

6
7
8
9
10
11
12
13
14
Drawbacks of using file systems to store data
Normal File Systems have the following problems, but it is an advantage for the
DBMS.
•Data redundancy and inconsistency:
Same information may be duplicated in several places.
All copies may not be updated properly.

•Difficulty in accessing data


May have to write a new application program to satisfy an unusual request
E.g. find all customers with the same postal code.
Could generate this data manually, but a long job...

•Data isolation
Data in different files
Data in different formats
•Difficult to write new application programs

15
Multiple users
Want concurrency for faster response time.
Need protection for concurrent updates.
E.g. two customers withdrawing funds from the same account at the same
time| account has $500 in it, and they withdraw $100 and $50. The result
could be $350, $400 or $450 if no protection.
Security problems
Every user of the system should be able to access only the data they are
permitted to see.
E.g. payroll people only handle employee records, and cannot see
customer accounts; tellers only access account data and cannot see payroll
data.
Difficult to enforce this with application programs
Integrity problems
Data may be required to satisfy constraints.
E.g. no account balance below $25.00
Again, difficult to enforce or to change constraints with the file-
processing approach
** These problems and others led to the development of database management
systems. In addition to its above advantages the DBMS application provides Data16
Abstraction.
Data Abstraction: The major purpose of a database system is to provide users with an
abstract view of the system.
-The system hides certain details of how data is stored and maintained.
-The purpose of the three level of architecture is to separate the way the database is
physically represented from the way user think about it.
E.g.: Students need not see information about instructors.
There are three levels of data abstraction levels or database architecture.
A)Physical/ Internal Level
How the data are stored
Eg. Index, B-tree, hashing
Lowest level of abstraction
Complex low-level structures described in detail
B) Conceptual Level
Next highest level of abstraction
Describes what data are stored
Describes the relationship among the data
Database administrator level
B)View/External Level
Highest level
Describes part of the data base for a particular group of users
Can be many different views of a database 17

Eg. Tellers in a bank get a view of customer accounts, but not of payroll data
.

pl e : In ca se of st or in g cu st omer da ta ,
Exam

si ca l le ve l – it will co nt ai ns bl oc k of st orages
Phy
(bytes,GB,TB,etc)

ca l le ve l – it will co nt ai n th e fie ld s an d the


Logi
attributes of data.

ve l – it wor ks with. CLI or GUI ac cess of


View le
database

18
View of Data
An architecture for a database system

Programmers and Database type instructor = record


administrators work at this ID : char (5);
level of abstraction name : char (20);
dept name : char (20);
salary : numeric (8,2);
end;

This code defines a new record type


called instructor with four fields. Each
field has a name and a type 19
associated with it.
Components and Functionality of Database Systems
The database systems include many components.
A)Data independence and efficient access
The importance of data abstraction is to achieve data independence.
•A major objective for the three-level architecture is to provide data independence
i.e upper levels are unaffected by changes to lower levels.
There are two kinds of independence:
1. Physical data independences
•It works between the physical schema
•Changes to the internal schema, such as using different file organizations or
storage structures, using different storage devices should be possible without
having to change the conceptual or external schema. From the user point of view,
the only effect that may be noticed is a change in performance
•Example : if we take of one data from one place to another place
2. Logical data independences
•Changes to the conceptual schemas such as the addition or removal of new
entities, attributes or relationships, should be possible without changing the
application programs (external schema). Clearly, the users for whom the changes
have been made need to be aware of them, but what is important is that other 20
users should not be.
B) Reduced application development time
•with the high-level interface to the data,
• facilitates quick development of applications.
•Such applications are also likely to be more robust than applications developed from scratch
because many important tasks are handled by the DBMS instead of being implemented by
the application.
C) Data integrity and security
•If data is always accessed through the DBMS, the DBMS can enforce integrity constraints on
the data.
•For example, before inserting salary information for an employee, the DBMS can check that
the department budget is not exceeded. Also, the DBMS can enforce access controls that
govern what data is visible to different classes of users.

D) Uniform data administration


•When several users share the data, centralizing the administration of data can offer
significant improvements.
•Experienced professionals who understand the nature of the data being managed, and
how different groups of users use it, can be responsible for organizing the data
representation to minimize redundancy and for fine-tuning the storage of the data to
make retrieval efficient.
E) Concurrent access, recovery from crashes
•data as being accessed by only one user at a time. Further, the DBMS protects users from
21

the effects of system failures.


Basic Database Terminologies
Table/ Relation: This is the structure defined to store data under it.
It is also called as an ENTITY.

Entity: Person, place, thing, or event (property of an entity); an


”object” in the real world that we are interested in: Eg. The object
student is an entity
Fields: It is also known as attributes in database terminology. It
defines the property of an entity. In other words it is called as
column in a table.

Attribute: A character or group of characters (alphabetic or


numeric), that has a specific meaning.
E.g. Name, age, telephone, grade, sex, etc
22
Record:
Candidate Key (Key Attribute): An attribute or set of attributes that uniquely
identifies individual occurrences of an entity type.
E.g. Id No of a student
Key attributes are very important in database
Primary Key: An entity that may have one or more possible candidate keys, one
of which is used to be the primary key.
Foreign Key: It is defined as a column or set of columns in a child table to
declare referential integrity constraint.
Data Value: The value of the attribute
E.g. Age = 26
Telephone = 111213
Salary = 580.56
Fname = Kebede

23
Domain: Each attribute has values taken from a domain. For example,
the domain of Name is String and that of salary is real

Record: A logically connected set of one or more attributes


that describe a person, place or thing. Used to identify each
row in a table, we use key attribute.

File: A collection of related records. For example, a file might


contain data about customers; or a file might contain the
records for the students currently enrolled at the department of
Electrical and Computer Engineering in Ambo University.

Database: Collection of Files – A database (DB) is a collection


of related data that exists over a long period of time

24
Instances and Schemas
• The collection of information stored in the database at a particular moment is called an
instance of the database.
• The overall design of the database is called the database schema.
• Similar to types and variables in programming languages
• Database schema defines the variable declarations in tables that belong to a
particular database; the value of these variables at a moment of time is called the
instance of that database.
• Database systems have several schemas, partitioned according to the levels
of abstraction.
• Physical schema – The design of a database at physical level
• Logical Schema – The design of database at logical level
• Example: The database consists of information about a set of customers and
accounts in a bank and the relationship between them
Analogous to type information of a variable in a program
• Instance – the actual content of the database at a particular point in time
• Analogous to the value of a variable 25
Data Models

is a collection of concepts for describing data and their operations.


is a collection of high-level data description that hides many low-level storage
details
is a formal language that allows us to define the type of data and data processing
that occurs in a database application.

It supports:
o Definition of data structures for storage.
o Definition of rules that the stored data have to obey.
o Definition of operations on the data, such that changes to represented
things in the real world can be reported for.
A DBMS allows a user to define the data to be stored in terms of a data model.
Most database management systems today are based on the relational data model,
which we will focus on this course.

26
Data Models
• A data model provides a way to describe the design of a database at the
physical(how data stored), logical(what data stored), and view/external levels.
• Classified into four different categories:
• Relational model
• Entity-Relationship data model (mainly for database design)
• Object-based data models (Object-oriented and Object-relational)
• Semi structured data model (XML)
• Other older models:
• Network model
• Hierarchical model

27
Relational Model
• Uses a collection of tables to represent both data and the relationships among those data.
• Each table has multiple columns, and each column has a unique name
• All the data is stored in various tables.
• Tables are also known as relations
• Example of tabular data in the relational model attributes
(or columns)

tuples
(or rows)

28

A Sample Relational Database


Entity-Relationship Model
• is entity-relationship (E-R) data model
• uses a collection of basic objects, called entities, and relationships among these objects
• Entities represent objects or a real world things
• Physical things like students, lecturers, employees, products
• More abstract things like modules, orders, courses, projects

Lecturer ID
• Entities have
• A general type or class, such as Lecturer or Module
Name Course • Instances of that particular type, such as
Steve Mills, Natasha Alechina are
instances of Lecturer
Tutors Student • Attributes (such as name, email address)

Module Studies
29
In an E/R Diagram, an entity is usually drawn as a box with rounded corners
In this case, Lecturer, Student, and Module are all entities
Database Design
The process of designing the general structure of the database:
• Why do we need it?
• Agree on structure of the database before deciding on a particular implementation.
• Consider issues such as:
• What entities to model
• How entities are related
• What constraints exist in the domain
• How to achieve good designs
•Logical Design – Deciding on the database schema. Database design requires that we find
a “good” collection of relation schemas.

•Physical Design – Deciding on the physical layout of the database

32
1. Steps of Database Design
Why is database design important?
 A well-designed database facilitates data management and becomes a
valuable information generator.
 A poorly designed database is a breeding ground for uncontrolled data
redundancies.
 A poorly designed database generates errors that lead to bad decisions.
Database design consists of several tasks:
 requirements analysis
 conceptual design
 logical design
 schema refinement
 physical design and tuning (phase in which all six kinds of design steps
are interleaved and repeated until the design is satisfactory).
In developing a good design, one should ask – what are the important queries
33
and updates? What attributes/ relations are involved?
1) Requirement Analysis
The very first step in designing a database application is to
understand
 what data is to be stored in the database
 what applications must be built on top of it, and
 what operations are most frequent and subject to
performance requirements.
 In other words, we must find out what the users
want from the database.
The study of
the current operating environment and
how it is expected to change
Analysis of any available documentation and applications that
are expected to be replaced by the database application.
34
2) Conceptual database design

The information gathered in requirement analysis step is used to develop a


high level description of data to be stored in the database, along with constraints
that are known to hold over this data
The process of constructing a model of the information stored in an enterprise,
independent of all physical considerations.
It is the source of information for the logical design phase.
This step is usually carried out using ER diagrams.

Completion of Conceptual Design


Refine module definition
 Entity
 Attributes
 Relationship

35
3) Logical Database Design
The process of construction a model of the information used in an enterprise based
on a specific data model (e.g. relational), but independent of a particular DBMS and
other physical consideration.
We will only consider relational DBMSs, and therefore, the task in the logical design
step is to convert an ER schema into a relational database schema.

4) Schema Refinement:
The fourth step in database design is to analyze the collection of relations in our
relational database schema to identify potential problems, and to refine it. schema
refinement can be guided by some well-designed and powerful theory. We discuss
the theory of normalizing relations restructuring them to ensure some desirable
properties.
Normalization process
 Discover New entities
 Revise attributes

36
5) Physical Database Design
The process of producing a description of the implementation of the database on
secondary storage.
Secondary storage: defines specific storage or access methods used by database:
Describes the storage structure and access methods used to achieve efficient access
to the data
Tailored to specific DBMS systems – Characteristics are function of DBMS and
operating systems.
Includes estimate of storage space

6) Tuning :
In w/ch all six kinds of design steps are interleaved & repeated until the design is satisfactory.

7) Implementation in DBMS
Two things should be given an attention during the implementation of the specific DBMS
implementation.

37
Database Languages
A database system provides a data-definition language to specify the database
schema and a data-manipulation language to express database queries and
updates.
In practice, the data-definition and data-manipulation languages are not two separate
languages; instead they simply form parts of a single database language, such as the
widely used SQL language.
A)Data Definition Language (DDL): DDL statements are used to define and modify
the database structure of your tables or schema. When you execute a DDL
statement, it takes effect immediately.
Some commands of DDL are:
 CREATE - to create table (objects) in the database
 ALTER - alters the structure of the database
 DROP - delete table from the database
 TRUNCATE - remove all records from a table, including all spaces allocated for the
records are removed
 COMMENT - add comments to the data dictionary
 RENAME - rename a table
38
Data Definition Language (DDL)
• Specification notation for defining the database schema
Example: create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2))
primary key (ID)
• DDL compiler generates a set of table templates
stored in a data dictionary Table: Data dictionary example

• Data dictionary contains metadata (i.e., data about data)

• DDL provides facilities to specify:


• Database schema
• Integrity constraints
• Primary key (ID uniquely identifies instructors)
• E.g.: suppose the university requires that the account balance of a department must
never be negative.
• Authorization
39
• Who can access what
B) Data Manipulation Language (DML): Data Manipulation Language
(DML) statements are used for managing data within tables.

Some commands of DML are:


SELECT - retrieve data from the a database
INSERT - insert data into a table
UPDATE - updates existing data within a table
DELETE - deletes all records from a table, the space for the records remain
MERGE - UPSERT operation (insert or update)
CALL - call a PL/SQL or Java subprogram
LOCK TABLE - control concurrency

40
Data Manipulation Language (DML)
• Language for accessing and manipulating the data organized by the appropriate data
model
• A query is a statement requesting the retrieval of information.
• The portion of a DML that involves information retrieval is called a query
language.
• DML also known as query language
• The most widely used query language is SQL(Structured Query Language )

• The types of access are:


• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
41
Data Manipulation Language (DML)..cont’d
There are basically two types:
•Procedural DMLs: require a user to specify what data are needed and how to get
those data.
• Easier to learn
•Declarative DMLs (also referred to as nonprocedural DMLs) require a user to
specify what data are needed without specifying how to get those data.
•The SQL query language is nonprocedural
Example: SQL query that finds the names of all instructors in the History
department
select instructor.name
from instructor
where instructor.dept name = ’History’;

42
ADVANTAGES OF A DBMS
• Data Independence: Application programs should not, ideally, be exposed to
details of data representation and storage, The DBMS provides an abstract view of
the data that hides such details.
• Efficient Data Access: A DBMS utilizes a variety of sophisticated techniques to
store and retrieve data efficiently. This feature is especially important if the data is
stored on external storage devices.
• Data Integrity and Security: If data is always accessed through the DBMS, the
DBMS can enforce integrity constraints. For example, before inserting salary
information for an employee, the DBMS can check that the department budget is
not exceeded. Also, it can enforce access controls that govern what data is visible
to different classes of users.
• Data Administration: When several users share the data, centralizing the
administration of data can offer significant improvements. Experienced
professionals who understand the nature of the data being managed, and how
different groups of users use it, can be responsible for organizing the data
representation to minimize redundancy and for fine-tuning the storage of the43 data
to make retrieval efficient.
ADVANTAGES OF A DBMS
• Concurrent Access and Crash Recovery: A DBMS schedules concurrent
accesses to the data in such a manner that users can think of the data as being
accessed by only one user at a time. Further, the DBMS protects users from the
effects of system failures.
• Reduced Application Development Time: Clearly, the DBMS supports important
functions that are common to many applications accessing data in the DBMS. This,
in conjunction with the high-level interface to the data, facilitates quick application
development. DBMS applications are also likely to be more robust than similar
stand-alone applications because many important tasks are handled by the DBMS
(and do not have to be debugged and tested in the application).

• Reading Assignment
Disadvantages of DBMS

44
History of Database Systems
• 1950s and early 1960s:
• Data processing using magnetic tapes for storage
• Tapes provided only sequential access
• Punched cards for input
• Late 1960s and 1970s:
• Hard disks allowed direct access to data
• Network and hierarchical data models in widespread use
• Ted Codd defines the relational data model
• Would win the ACM Turing Award for this work
• IBM Research begins System R prototype
• UC Berkeley begins Ingres prototype
• High-performance (for the era) transaction processing
45
History (cont.)
• 1980s:
• Research relational prototypes evolve into commercial systems
• SQL becomes industrial standard
• Parallel and distributed database systems
• Object-oriented database systems
• 1990s:
• Large decision support and data-mining applications
• Large multi-terabyte data warehouses
• Emergence of Web commerce
• Early 2000s:
• XML and XQuery standards
• Automated database administration
• Later 2000s:
• Giant data storage systems 46

• Google BigTable, Yahoo PNuts, Amazon, ..


47

You might also like