0% found this document useful (0 votes)
46 views34 pages

Introduction To Database Concepts-1

This document serves as an introduction to database concepts, covering the definition, organization, and management of data within database systems. It outlines the objectives of the course, the evolution of data management, and the advantages and disadvantages of using Database Management Systems (DBMS). Key topics include data organization, problems solved by databases, and the importance of DBMS in ensuring data integrity, security, and efficient access.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
46 views34 pages

Introduction To Database Concepts-1

This document serves as an introduction to database concepts, covering the definition, organization, and management of data within database systems. It outlines the objectives of the course, the evolution of data management, and the advantages and disadvantages of using Database Management Systems (DBMS). Key topics include data organization, problems solved by databases, and the importance of DBMS in ensuring data integrity, security, and efficient access.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOC, PDF, TXT or read online on Scribd

Introduction to Database Concepts

INTRODUCTION
In this course we will study the concept of data, database and other related terms
used in Database Management.

THE OBJECTIVES

The objective of this course include;

• To understand data organization.


• To understand problems solved by database.
• To understand objective of database (data-related goals).
• To understand database modelling.

INTRODUCTION TO DATABASE SYSTEM

The database management system consists of two parts. They are:

1. Database
2. Management System

What is a Database?

To understand the concept of database we must first look at the fundamental principles
of database.

Fundamental principles and concepts of the database systems

Here we consider data which is the building block of every database and other
terms circled around data.

Data

Data is any raw or unprocessed fact; it is the building block of any database. The word data

covers the collection of facts stored in a database, it is distinct pieces of information that
may not make sense independently, example; 1, Tom, Computer science, 18, which
are usually formatted in a special way.

All software is divided into two general categories: data and programs. Programs
are collections of instructions for manipulating data.

Data is not only ubiquitous and pervasive, but also essential for organizations to survive
and prosper (By ubiquitous and pervasive we are highlighting that data is present
everywhere and impacts nearly all aspects of life, data has become a fundamental part
of our daily life, business operations, personal routines and technology etc.). Imaging
trying to operate a business without knowing who your customers are, what products
you are selling, who is working for you, who owes you money, and to whom you owe
money. All businesses must keep this type of data and much more. Just as important
they must have that data available to decision makers when necessary.

1. Data can exist in a variety of forms -- as numbers (0-9) or text in a piece of paper, as bits (0,
1) and bytes (10001100) stored in electronic memory, or as facts stored in a person’s mind.
Strictly speaking, data is the plural of datum, a single piece of information. In practice,
however, people use data as both the singular and plural form of the word.

2. The term data is often used to distinguish binary machine-readable information


from textual human-readable information. For example, some applications make a
distinction between data files (files that contain binary data) and text files (files
that contain ASCII data).
3. In database management systems, data files are the files that store the database
information, whereas other files, such as index files and data dictionaries, store
administrative information, known as metadata (data about data).

➢ While data is any raw or unprocessed fact,


➢ Information is the result of processing raw data to reveal its meaning.
➢ Metadata: Data about data, through which the end-user is integrated and
managed. The metadata describes the data characteristics and the set of
relationships that links the data found within the database.
Evolution of data / data management

The evolution of data can be summarized through distinct phases:

1. Manual record Keeping: Early data management involved physical records like
books, ledgers and files. Examples like census records, financial accounts etc.
Manual systems were slow, prone to error and inefficient for large data sets.

Retrieving or updating the records often required significant effort.

2. File-Based systems: As computers emerged, data was stored in flat files using
basic storage methods e.g. magnetic tapes or disks. Though an improvement
from the previous method, this system leads to redundancy, inconsistency and
difficulty in accessing related data.

3. Database Systems (First-Generation): The advent of DBMS introduced


structured data storage, reduced redundancy, and enabling easier data retrieval
and manipulation. Early DBMS like hierarchical and network models paved the
way for relational databases.

4. Relational Databases: Relational model revolutionized data management


by introducing:
• Data organized into tables with rows and columns.
• Use of SQL (Structured Query Language) for querying and manipulation.

Key advantages

• Reduced redundancy.
• Easier to access data.
• Easier to retrieve data.

Examples of relational database: MySQL, Microsoft SQL Server.

The characteristics that differentiate a DBMS from a Relational Database


Management System (RDBMS) is that the RDBMS provides a set-oriented
database language. The set-oriented database language means that SQL
processes sets of data into groups.

5. Big Data and NoSQL: With the rise of the internet, social media and IOT
(Internet of Things), unstructured and semi-structured data gained prominence.
E.g. videos, images, social media post, XML, JSON etc. NoSQL databases like
MongoDB and Casandra were developed to handle these datatypes.
6. Cloud database: A cloud database is a database that runs on a cloud platform
(e.g. AWS, Azure, Google cloud). Instead of being stored on your own hardware,
the database is managed and accessed over the internet.
Example: Google drive storing your photos is like a cloud database for your
files. Benefits:
• Easily scalable to handle more data.
• Accessible from anywhere
• Managed by cloud providers so you don’t worry about maintenance.

Data Organization

Data organization is the way data is structured, stored and managed to make it easy
to access, analyze and use effectively. In data organization, bits, bytes, fields,
records, files and databases are hierarchical building blocks that define how data is
stored and managed.
Bit: A bit is the smallest unit of data in computing. This term is curled from Binary digit.
Examples include 0 or 1, T or F, High or Low, On or Off. In database, this data type is
known as Boolean.

Bytes: A byte is a unit of data which can be 8 bits, 16bits, 32bits or 64 bits based on
the system. The term is curled from the word (By)eith(te). It can represent characters
such as letters, numbers or symbols.

Field: Field is a group of related characters or bytes. A space allocated for a particular
item of information. It is a single piece of related information in a database. A student
form for example, contains several fields: S/n, Name, Department, Age and so on

Sn Name Department Age


1 Tom Computer Science 20
Record
Fields

In database systems, fields, also known as Attributes or Domain are the smallest units
of information you can access. Most fields have certain properties associated with
them. For example, some fields are numeric whereas others are textual, some are
long, while others are short. In addition, every field has a name, called the field name.

In database management systems, a field can be required, optional, or calculated. A


required field is one in which you must enter data, while an optional field is one you
may leave blank. A calculated field is one whose value is derived from some formula
involving other fields. You do not enter data into a calculated field; the system
automatically determines the correct value. A collection of fields is called a Record.

Record

Record is a complete set of information gotten from collection of related terms. It is


composed of fields, each of which contains one item of information. A set of records
constitutes a file. For example, a student file might contain records that have three fields:
Sn field, a Name field, a department field and an Age field as is shown on the above table.
In relational database management systems, records are called tuples.

File

A collection of data records or information that has a name, called the filename. Almost
all information stored in a computer must be in a file. There are many different types of
files: data files, text files, program files, directory files, and so on. Different types of files
store different types of information. For example, program files store programs, whereas
text files store text. In database, Tables or Relations which comprise of Fields and
Records created are also regarded as files.

Database

Database can simply be defined as any collection of related information, or an


organized collection of data stored and accessed electronically using a database
management system. It is a collection of interrelated data or information organized in
such a way that a computer program can quickly select desired pieces of data or
simply put collection of related tables. You can think of a database as an electronic
filing system. Traditional databases are organized by fields, records, and files.

Any collection pf information:

• A phonebook
• Shopping list
• Facebook user base
• Students in a department (Details).

Database can be stored in different ways:

• On piece of paper
• On your mind
• On a computer

Typical Applications of Database

➢ Student Records
➢ Hotel Booking
➢ Library
➢ Maintenance Information System
➢ Banking System
➢ Sales Records

Data Organization Summary

Database File Record Field Byte Bit

Figure 1: Data organization Summary

• A bit is the smallest unit of data used to build bytes.


• Bytes form fields which represent pieces of information.
• Fields combine to make records.
• Records are stored in files and files are organized into a database.

This structure helps manage data in a logical, easy to access way.

Problems Solved by Databases

Databases are designed to efficiently solve a wide range of problems related to


data storage, organization, retrieval, and management. Here are some key problems
that databases address:

1. Data Redundancy and Inconsistency: Databases reduce data duplication by

storing data in a centralized or distributed system that multiple users can access. By
having a single source of truth, databases minimize inconsistencies across
different data copies.
2. Data Storage and Organization: Databases provide structured storage
solutions, allowing vast amounts of data to be stored in an organized way,
making it easier to access, sort, and manage.
3. Efficient Data Retrieval: With indexing, caching, and querying capabilities,
databases allow for fast and efficient retrieval of data, even when handling
millions of records.
4. Data Integrity and Validation: Databases enforce data integrity constraints, such
as primary keys, foreign keys, and unique constraints, ensuring data accuracy and
validity. This prevents invalid data from being stored in the system.
5. Concurrent Access and Multi-user Support: Databases allow multiple
users to access and modify data simultaneously without conflicts, using
transaction management and concurrency controls like locking mechanisms.
6. Security and Access Control: Databases provide robust security measures,
allowing organizations to control who can access, modify, or view certain data.
This is critical for protecting sensitive information.
7. Backup and Recovery: Databases provide backup and recovery solutions to
prevent data loss in case of system failures, ensuring that data can be restored
to a previous state if needed.
8. Scalability: Databases can be scaled to handle growing data needs, whether
through vertical scaling (adding more power to a single server) or horizontal
scaling (adding more servers). This is especially relevant for applications that
need to handle large volumes of data and high traffic.
9. Data Sharing and Collaboration: Databases enable data sharing between
different systems and users, facilitating collaboration and interoperability across
departments and applications.
10. Data Analytics and Decision-making: Many databases, especially analytical

databases, support complex data analysis and reporting. This helps businesses
make data-driven decisions by providing insights into trends, patterns, and
business metrics.
11. Data Compliance and Auditing: Databases help organizations meet
regulatory requirements by storing logs and audit trails of data access and
modifications, which is important for compliance and auditing purposes.

Overall, databases address the needs for structured data management, secure
access, data integrity, and operational efficiency

What is Management System?

Database management system (DBMS) is a special software that helps users create
and maintain a database.

The primary goal of a DBMS is to offer a more convenient and effective way of storing and
retrieving database information. The management system is important because it provides
the necessary rules and regulations without which the maintenance of database cannot be
possible. Issues like selecting the attributes for a particular table; the common attributes to
create relationship between two tables; handling of insertion and deletion of records from a
table must be guided by rule in order to maintain the integrity of the database.

Advantages of DBMS:

I. Controlling of Redundancy: the centralization of database and its control reduces


data redundancy (i.e. storing same data multiple times). It also eliminates the extra
time for processing the large volume of data thereby saving the storage space.
II. Data Consistency: Eliminating data redundancy improves data consistency. If
students contact address is stored only once, the issue of disparity on stored
values will not arise. Also updating data values is greatly simplified when each
value is stored in one place only.
III. Improved Data Sharing: DBMS allows a user to share the data in any number of
application programs.
IV. Data Integrity: Integrity is concerned with accuracy of data in database.
Centralized control of the data allows the administrator to define integrity
constraints to the data in the database. For example: in Students’ databases, we
can enforce an integrity that it must not accept null student’s Mat_no.
V. Security: Having complete authority over the operational data, enables the DBA in
ensuring that the only means of access to the database is through proper channels.
The DBA can define authorization checks to be carried out whenever
access to sensitive data is attempted.
VI. Efficient Data Access: In a database system, the data is managed by the
DBMS and all access to the data is through the DBMS providing a key to
effective data processing.
VII. Enforcements of Standards: With the centralized of data, DBA can establish
and enforce the data standards which may include the naming conventions, data
quality standards etc.
VIII. Data Independence: In a database system, the database management system
provides the interface between the application programs and the data. When
changes are made to the data representation, the meta data obtained by the
DBMS is changed but the DBMS continues to provide the data to application
program in the previously used way. The DBMS handles the task of
transformation of data wherever necessary.
IX. Reduced Application Development and Maintenance Time: DBMS
supports many important functions that are common to many applications,
accessing data stored in the database, which facilitates the quick development
X. Data Sharing: DBMS supports multi-user environments where several users
can access and modify data concurrently. It also provides controlled access
through privileges.
XI. Scalability: The DBMS handles growing amounts of data efficiently without
compromising performance. It also adapts to the needs of both small-scale
and large-scale applications.
XII. Improved Decision-Making: DBMS facilitates better analysis and insights
through structured and accessible data. It also integrates well with business
intelligence tools for informed decisions making.

Disadvantages of DBMS

Despite its numerous advantages, a Database Management System (DBMS) also has
some disadvantages, especially in specific scenarios. Here are the key drawbacks:
1. High Initial Cost:
• Hardware and Software Costs: Setting up a DBMS requires robust
hardware and software, which can be expensive.
• Licensing Fees: Commercial DBMS software often comes with
significant licensing costs.
2. Complexity:
• DBMSs are complex systems that require skilled personnel to
manage, maintain, and optimize.
• Learning and mastering a DBMS can be time-consuming.
3. Performance Issues:
• In systems with large volumes of transactions or complex queries, the
DBMS may experience performance bottlenecks.
• Indexing and optimization are required to maintain performance, which
can be resource intensive.

4. High Maintenance Requirements:


• Regular updates, backups, and system tuning are necessary to
ensure reliability.
• Requires ongoing technical support and skilled database
administrators (DBAs).
5. Potential for Data Breach:
• Centralized storage can become a target for cyberattacks if
security measures are not adequately implemented.
• Unauthorized access to sensitive data can lead to serious consequences.

6. Overhead:
• A DBMS consumes considerable system resources, such as memory,
CPU, and storage.
• Resource-intensive applications may slow down overall
system performance.

7. Risk of System Failure:


• If the DBMS crashes or becomes corrupted, it can affect all users
and systems relying on the database.
• Recovery from a failure may be time-consuming and costly.

8. Scalability Challenges:
• Scaling a DBMS to handle very large databases or high user concurrency
can be challenging without significant upgrades.
• Distributed DBMSs can address scalability but add complexity.

9. Dependency on DBMS Vendor:


• Many organizations become reliant on specific DBMS vendors,
leading to vendor lock-in.
• Switching to another DBMS can be difficult and costly.

10. Increased Complexity for Small Applications:


• For smaller projects, the overhead of using a DBMS might
outweigh its benefits.
• Simpler file-based systems may be more suitable in such cases.
Database Career Opportunities
JOBTITLE DESCRIPTION SAMPLE SKILLS REQUIRED

Database Developer Create and maintain database-based Programming, database


applications fundamentals, SQL.
Database Designer Design and maintain databases Systems design, database design
SQL.

Database Administrator Manage and maintain DBMS and Database fundamentals, SQL,
databases vendor courses.
Database Analyst Develop databases for decision SQL, query optimization, data
support reporting warehouses.
Database Architect Design and implementation of DBMS fundamentals, data
database environments (conceptual, modelling, SQL, hardware
logical and physical) knowledge, etc.
Database Consultant Help companies leverage database Database fundamentals, data
technologies to improve business modelling, database design, SQL,
process and achieve specific goals DBMS, hardware, vendor-specific
technologies etc.
Database Security Officer Implement security policies for data DBMS fundamentals, database
administration administration, SQL data security
technologies, etc.

Cloud Computing Data Design and implement the Internet technologies, cloud
Architect infrastructure for next-generation storage technologies, data
cloud database systems security, performance tuning,
large databases, etc.
NORMALIZATION
Normalization in database is a systematic approach to organizing data to
improve its efficiency, integrity and scalability.
It involves decomposing large tables into smaller and manageable ones and
defining relationships among them.
The normalization process involves dividing large tables into smaller tables and
defining relationships between them. This is typically done through a series of normal
forms, which indicate the level of normalization achieved by a database. There are
several normal forms including: First Normal Form (1NF), Second Normal Form (2NF),
Third Normal Form (3NF), and higher forms: Boyce Codd Normal Form (BCNF),
Fourth Normal Form (4NF), Fifth Normal Form (5NF).
Here’s a simple example
Imagine you run a small school, and you are keeping records of students and their
courses in one table. It might look like this:
Student ID Name Course 1 Course 2 Course 3
101 Alice Math Science English
102 Bob Math English Null

This setup works, but there are problems:

1. Repetition: The same course names (like “math”) are repeated for different studets.
2. Empty space: If a student takes fewer than 3 courses, you will have empty
(null) fields.
3. Hard to update: If you change “math” to “Further math”, you will have to find
and update it everywhere it appears.

What Normalization Does

Normalization solves these problems by breaking the data into small, related tables.
Let’s fix the table:

1. Create a student table:


Store each student’s personal data
Student ID Name

101 Alice

102 Bob

2. Create a course table:


Store details of all courses
CourseID CourseName

1 Math

2 Science

3 English

3. Create a student-course table:


Links students to their courses without repetition.
StudentID CourseID

101 1

101 2

101 3

102 1

102 3

Benefits

• No Duplication: Course names are stored once in the course table.


• Flexibility: You can add as many courses as you want for a student without
worrying about empty spaces.
• Easier Updates: Changing “math” to “Further math” in the course
table automatically updates it everywhere.
Why Normalize a Database

1. Redundancy Reduction: When data is repeated across tables, updates must be


applied everywhere. If one data is missed, data inconsistencies arise.
a. Example: In a table where a customer’s address is repeated for every
order, updating the address in one row but not others cause mismatches.

2. Prevention of Anomalies: In the context of databases, anomalies are issues


or problems that arise when performing operations like (Inserting, updating
and deleting) on a poorly designed table. These problems occur due to
improper data organization due to redundancy.

Normalization addresses three major anomalies:

i. Insertion Anomalies: Inability to add data without already existing related data.
a. Example 1: You cannot add a new student if they are not enrolled
on a course.
b. Example 2: You can add a new product to a database without assigning
it to an existing supplier.

ii. Update Anomalies: Changing data in one place doesn’t automatically


update it everywhere else, leading to inconsistencies.
a. Example: Updating a department or course in one row but not others
cause mismatches or inconsistency.

iii. Deletion Anomalies: Deleting data unintentionally or deleting one piece of


data removes other important information.
a. Example: Deleting a student record may also remove the only record
of a course if both are on the same table.

Normalization Process/Rules
Normalization is achieved through a series of stages called Normal Forms (NF’s).
Each normal form builds on the previous one.
1) First Normal Form (1NF):
Goals/Requirements:
• To make sure each column contains only one value.
• Ensure each row must be unique and identified by a primary key.
• Ensures each cell in a table contains a single value.

Example (Bad Table):

ID Name Hobbies
1 David Painting/Travelling

This table violates the 1NF because, “Hobbies” contains multiple values

Fixed Table:

ID Name Hobbies
1 David Painting
2 David Travelling

2) Second Normal Form (2NF):


Goals/Requirements:
• Ensure that the table is in 1NF.
• Removes partial dependencies (when a column depends on part
of a composite/primary key).
• All non-key columns should depend on the entire primary key, not
just part of it.

Example (Bad Table):

StudentID CourseID StudentName CourseName


1 101 Jon Doe English
2 102 Jon Smith Science

Here “StudentName” depends only on “StudentID” and “CourseName”


depends only on “CourseID”.

To normalize:

• Create one table for students:


StudentID StudentName
1 Jon Doe
2 Jon Smith

• Create another table for course


CourseIDCourseName
101 Math
102 Science

• Maintain a relationship table


StudentID CourseID
1 101
2 102

3) Third Normal Form (3NF):


Goals/Requirements:
• It must satisfy 2NF.
• Non-key columns should only depend on the primary key.
• Removes transitive dependencies (when a non-key column
depends on another non-key column). Further eliminating
redundancy and dependency chains.

Example:

EmployeeID DepartmentID DepartmentName

1 101 HR

2 102 IT

Here “DepartmentName” depends on “DepartmentID” and not directly


on “EmployeeID”. To Normalize:

• Create a department table:


DepartmentID DepartmentName
101 HR
102 IT

• Maintain an employee table:


EmployeeID DepartmentID

1 101

2 102

Beyond the 3NF (optional)

4) Boyce-Codd Normal Form (BCNF): Handles rare cases where even the NF
is not enough.
5) Fourth Normal Form (4NF) and Fifth Normal Form (5NF): Handles multivalued
and complex dependencies.

Advantages of Normalization

1. Minimized Redundancy: Normalization saves storage and reduces errors


from duplicated data.
2. Improved Data Integrity: Normalization ensures consistent and accurate data.
3. Flexibility: Easier to extend and modify the database design.

Disadvantages of Normalization

1. Normalized databases often require joins to retrieve data, which can


slow performance.
2. Can increase the number of tables, making the queries complex.
3. For small datasets, normalization can be unnecessary and cumbersome.

Example Use Case

In an e-commerce system, normalization ensures that:

- Customer information is stored in one table to avoid repeating customer


details with every order.
- Product details are stored in a separate table to avoid duplicating product
data across multiple orders.

This approach streamlines the database, reduces redundancy and ensures


efficient updates.
Understanding and applying normalization helps bring robust and scalable
databases that perform efficiently while maintaining data integrity.

Summary

• 1NF: No duplicate rows, each cell must contain a single value.


• 2NF: No partial dependency on composite/primary key.
• 3NF: No non-key column should depend on another non-key column.
• Primary key: A primary key is a column that uniquely identifies
each individual row or record in a table.
• Composite key: A composite key is simply a primary key made up of
two or more columns together to uniquely identify rows.
• Non-Key Columns: These are columns in a table that are not part of
the primary key. They store additional information.
Database Models

A database model or data model is a logical construct used to define or represent the
data structure and data relationships found within the database. Data model is not just
a way of structuring data, it also defines or specifies a set of operations that can be
performed on the data.

Categories of Database Model

There are two categories of database model namely:

1) Conceptual Database Model Category


2) Implementation Database Model Category

Conceptual Database Model Category:

This is concerned with what is represented in the database, it includes: Entity


Relationship Model and Object-Oriented Model.

• Entity Relationship (ER) Model:


The ER views the real world as a construct of entities (object or things)
and association between entities. ER represents the real world using:
Entities, Attributes, and Relationships.
i. Entities: Entities are the things or objects you want to store information
about in a database. They can be People (e.g. student, customer), Things
(e.g. book, car), Events (e.g. order, loan).
Each entity has attributes (details about it) for example: A student has
name and ID. So, think of entities as nouns (people, places or things) in
your database.
An entity can be dependent or independent.
▪Afordependent entity is one that relies on another entity
identification.
▪An independent entity is one that does not rely on another entity
for identification.

In the context of databases an entity is any object, concept or thing in the


real world that can be identified and has attributes associated with it.
Entities are fundamental components in database design particularly in
ER models, where they represent nouns of the data structure such as
people, places, objects or events.
Special Entity Types

Associative entities (also known as intersection entities) are used to


associate two or more entities to reconcile a many-to-many relationship.

Subtype entities are used in generalization hierarchies to represent a


subset of instances of their parent entity, called the supertype, but which
have relationships or attributes that apply only to the subset.
ii. Attributes: Attributes describes the entity of which they are associated.
Attributes are properties of entities e.g. Name and ID. Attributes can be
classified as identifiers or descriptors. Identifiers, more commonly called keys
(primary, composite or foreign), uniquely identify an instance of an entity. A
descriptor describes a non-unique characteristic of an entity instance.

iii. Relationships: A relationship represents an association between two or


more entities. An example of relationships would be:
▪ Student taking a particular Course.
▪ Employees work in Departments.

Relationships are classified in terms of degree, connectivity and


cardinality, direction, type and existence.

Classifying Relationships

Classifying Relationships refers to understanding how entities in a database relate to


one another. These relationships also help define how data is stored and managed
within a database. Relationships can be classified by their: Degree, Direction,
Connectivity and Cardinality, Type and Existence.

1. Degree of a relationship:
The degree of relationship in a database refers to the number of entities (or
tables) involved in the relationship. It describes the participation of entities in
the relationship and is a fundamental concept in the Entity-Relationship (ER)
Model used for database design. Types of degree of a relationship:
i. Unary Relationship (Degree 1): Involves only a single entity type
participating in the relationship. This is also called a recursive
relationship because an entity is related to itself.
ii. Binary Relationship (Degree 2): Involves two different entity
types participating in the relationship. This is the most common
type of relationship in database design.
iii. Ternary Relationship (Degree 3): Involves three different entity
types participating in the relationship simultaneously.
iv. N-ary Relationship (Degree N): Involves more than three entities. Such
relationships are rare but are sometimes used when more than three
entities must participate simultaneously in a meaningful way.

2. Connectivity and Cardinality:


Connectivity refers to how entities (things) are connected to each other
in a relationship. Think of it as a link between them. For example:
a. A student connects to a school because the enroll there.
b. A customer connects to a product because they buy it.
Cardinality describes the number of connections between entities in a relationship.
There are three main types of cardinalities:
i. One-to-one (1:1): A one-to-one relationship exists when one record
in an entity corresponds to exactly one record in another entity.
Example: A person entity has one-to-one relationship with a
passport entity since each person can have only one passport and
each passport belongs to one person.

ii. One-to-many (1: N): A one-to-many relationship exists when one


record in an entity corresponds to multiple records in another
entity. Example: A customer entity has one-to-many relationship
with an order entity because a customer can place multiple orders,
but each order belongs to only one customer.

iii. Many-to-many relationships (M: N): A many-to-many relationship


exists when multiple records in one entity correspond to multiple
records in another entity.
Example: A student entity has a many-to-many relationship with
a course entity because a student can enroll in multiple courses
and each course can have multiple students.
3. Direction:
The direction of a relationship indicates the originating entity of a binary
relationship. The entity from which a relationship originates is the parent entity,
the entity where the relationship ends/terminates is the child entity.
The direction of a relationship is determined by its connectivity. In a one-to-one
relationship the direction is from the independent entity to a dependent entity. If
both entities are independent, the direction is arbitrary. With one-to-many
relationships, the entity occurring once is the parent. The direction of many-to-
many relationships is arbitrary.

4. Type:
An identifying relationship is one in which one of the child entities is also a
dependent entity. A non-identifying relationship is one in which both entities
are independent.

5. Existence:
Existence denotes whether to existence of an entity instance is dependent upon
the existence of another, related, entity instance.

Generalization Hierarchies
A generalization hierarchy is a form of abstraction that specifies that two or more entities that
share common attributes can be generalized into a higher level entity type called a supertype
or generic entity. The lower level of entities become the subtype, or categories, to the
supertype. Subtypes are dependent entities. Generalization occurs when two or more entities
represent categories of the same real-world object. For example, Wages_Employees and
Classified_Employees represent categories of the same entity, Employees. In this example,
Employees would be the supertype; Wages_Employees and Classified_Employees would be
the subtypes. Subtypes can be either mutually exclusive (disjoint) or overlapping (inclusive). A
mutually exclusive category is when an entity instance can be in only one category. The
above example is a mutually exclusive category. An employee can either be wages or
classified but not both. An overlapping category is when an entity instance may be in two or
more subtypes. An example would be a person who works for a university could also be a
student at that same university. The completeness constraint requires that all instances of the
subtype be represented in the supertype. Generalization hierarchies can be nested. That is, a
subtype of one hierarchy
can be a supertype of another. The level of nesting is limited only by the constraint of
simplicity. Subtype entities may be the parent entity in a relationship but not the child.

ER Notation
ER notation is a way of showing how things (entities) are related to each other
in a database. It is like a diagram to plan or explain the structure of data.

ER Model: The ER model is a conceptual data model that views the real world as
consisting of entities and relationships. It is a way of showing how things (entities) are
related to each other in a database. Here’s a simple breakdown:

a. Entities: Entities are the things or objects you want to store information
about in a database (e.g. Student, Company). Represented as rectangles.

b. Attributes: These are details about the entities (e.g. a student’s


name, student’s course). Represented as ovals.

c. Relationships: This shows how entities are connected (e.g. students


enroll for a course). Represented by a diamond shape.
Entity Notations

SYMBOLS REPRESENTS REMARK

Entity can be a person, place, object,


Entities or String entities event or concept about which data is
to be maintained. E.g. Student,
Course.
This represents a printout, such as a
Weak Entities document or a report.

Properties or characteristic of an
Attribute entity. E.g. Name of Student, Course
Code etc.
Phone number of a student,
Multivalued Attribute Students offering a course.

Association between entities.


Relationship Relationship is identified with verb or
verb phrase.
A weak entity is a type of entity which
Weak relationship doesn’t have its key attribute. It can
be identified uniquely by considering
the primary key of another entity.
Links attributes to entity types and
Link entity types to other relationship
types.

The diagram helps to visualize how data will be stored and linked.

Example: A rectangle for student entity might connect to another rectangle for
course entity via a diamond labelled enroll.

Student Enroll Course


Implementation Model Category:

This is concerned with how data are represented in the database. It refers to how
the database is physically structured and managed on a computer system, it focuses
on the technical details of storing, retrieving and updating data efficiently.

In this category we have three models namely:

- Hierarchical Model
- Network Model
- Relational Model

- Hierarchical Model: Organizes data in a tree-like structure with parent-child relationship,


there is a hierarchy of parent and child segments. This structure stores data as records
which are connected to each other through links. A record is collection of fields with each
field containing only one value. The field in a record is determined by the type of record. A
record can have repeating information, generally in the child data segments. The record
types are the equivalent of tables in the relational model, and with the individual records
being the equivalent of rows. To create links between these record types, the hierarchical
model uses Parent Child Relationships. These are a 1: N mapping between record types.
This is done by using trees, like set theory used in the relational model, "borrowed" from
Math’s. For example, an organization might store information about an employee, such as
name, employee number, department, salary. The organization might also store
information about an employee's children, such as name and date of birth. The employee
and children data forms a hierarchy, where the employee data represents the parent
segment and the children data represents the child segment. If an employee has three
children, then there would be three child segments associated with one employee
segment. In a hierarchical database the parentchild relationship is one to many. This
restricts a child segment to having only one parent segment. Hierarchical DBMSs were
popular from the late 1960s, with the introduction of IBM's Information Management
System (IMS) DBMS, through the 1970s.

Example.

Parent: Company

Child: Employee, CustomerName etc.


Hierarchical Structure

A Root-segment/Parent Segment

B
C D
Level 1

D1 D2 Level 2
B1 B2 C1 C2

This structure stores data as records which are connected through links. A record
is a collection of fields with each field containing only one value.

Typical example using a department database

Root-segment
Department

Level 1
Name Course
MatNo

Level 2
SLT Chemistry
FOS/…. FOS/… Alice Bob
.

Parent: Root-segment (Department)

Child: Level 1 and Level 2

Data flows in one direction and searching requires moving down the tree.
Advantages

a. It promotes data security


b. It promotes data independence
c. It promotes data integrity
d. It is useful for large databases
e. It is useful when users require a lot of transactions
f. It is suitable for large storage media.

Disadvantages
a. Requires knowledge of the physical level of data storage
b. Rigid structure, changes in relationships require a redesign
c. Difficult to query without predefined paths

-Network Model: Data is stored as nodes (entities) with many-to-many relationships


using link. Think of it like a web.
The popularity of the network data model coincided with the popularity of the hierarchical
data model. Some data were more naturally modelled with more than one parent per child.
So, the network model permitted the modelling of many-to-many relationships in data. In
1971, the Conference on Data Systems Languages (CODASYL) formally defined the
network model. The basic data modelling construct in the network model is the set
construct. A set consists of an owner record type, a set name, and a member record type. A
member record type can have that role in more than one set, hence the multi-parent
concept is supported. An owner record type can also be a member or owner in another set.
The data model is a simple network, and link and intersection record types (called junction
records by IDMS) may exist, as well as sets between them. Thus, the complete network of
relationships is represented by several pairwise sets; in each set some (one) record type is
owner (at the tail of the network arrow) and one or more record types are members (at the
head of the relationship arrow). Usually, a set defines a 1:M relationship, although 1:1 is
permitted. The CODASYL network model is based on mathematical set theory.

Example: An employee can work on multiple projects, and each project can have
multiple employees.
Network Model Examples

- A Sales Rep may have written many invoices. Each invoice is written by a
single Sales Rep.
- A customer may have made purchases in many occasions. Each
occasion corresponds to one invoice.
- An invoice may have many invoice lines. Each invoice line is found in an
invoice ticket.
- A product may appear on several invoice lines. Each invoice line contains
only a single product.

SalesRep Customer

Product Payment
Invoice

Invoice Line
Network Model Example

Network Model is like Hierarchical model only that a member (child) can have more
than one owner (Parent).

Advantages

a. Improves on the Hierarchical model.


b. It handles complex relationships (many-to-many) effectively.
c. More flexible than the hierarchical method.
d. Data access is fast and efficient.

Disadvantages

a. Complex structure: with many pointers (links) can make it hard to manage.
b. Requires skilled users to navigate and query.
c. Less popular compared to relational models reducing support.

-Relational Model: Data is stored and organized in tables (rows and columns). It is
flexible and easy to query using SQL.
Properties of Relational Model:
- Values are atomic: Values cannot be broken down further into smaller
meaningful components. INDIVISIBLE values.
- Each row is unique.
- Column values are of the same kind.
- Each column has a unique name.
- The sequence of columns is insignificant.
- The sequence of rows is insignificant.

Advantages

a. Easy to use and understand (tables and columns).


b. Highly flexible, supports dynamic queries using SQL.

Disadvantages

a. Slower for very large databases compared to the hierarchical and network model.
b. Requires more storage due to the table structures.
c. Ease of use can be a liability i.e. there could be possible misuse.

In summary: Hierarchical model is simple and rigid, Network model is flexible but
complex and Relational model is user-friendly but may struggle with massive datasets.

Dependencies in DBMS

In a DBMS, dependencies are relationships between data that determine how data in
one part of the database is connected to another part of the database. Basically,
dependency is the relationship between two or more attributes in a database. They are
crucial for organizing, managing and ensuring the database operates correctly.

Here’s a simple explanation on what dependencies are:

• Think of data in a database like different sections on a puzzle. Dependencies are


the rules that say, this piece connects to that piece.
• These rules help keep the data consistent and meaningful.

Types of Dependencies

There are different kinds of dependencies such as:

- Functional Dependency
- Transitive Dependency
- Multivalued Dependency
- Partial Dependency
- Fully Functional Dependency

1. Functional Dependency: A functional dependency exists when one attribute


determines another. One column (or set of columns) uniquely determines
another column.

Functional dependency is represented by an arrow sign ( ) that is, A B,


where A functionally determines B. The left-hand side attributes determine the
value on the right-hand side.
A B: A functionally determines B.

Example: Student MatNumber determines a student’s name.

MatNumber Name Course

001 John Science

002 Grace Art

MatNumber Name

In the student table above, Name is functional dependent on Matric.No


because you can only have one value for Name for a given Matric No.

• If you know the MatNumber (001), then you can find the name (John).

2. Transitive Dependency: When an indirect relationship causes functional


dependency, it is called transitive dependency. So basically, one piece of data
depends on another indirectly through a third piece. It occurs when a non-prime
key attribute depends on another non-key prime attribute.
If A B and B C, then A C is a transitive dependency.

Example:

MatNo CourseID CourseName


001 101 Intro. to computer
Science
• Knowing the MatNo, can eventually lead you to the course name.
• MatNo CourseID CourseName
• Knowing the student MatNo can eventually lead you to the CourseName.

3. Multi-valued Dependency: This occurs when one or more rows in a table


implies one or more rows in the same table. Basically, one piece of data
determines a set of values for another, independently of other data.
Example:
• A book with multiple authors.
• Book ID Author 1, Author 2, Author 3 etc.

4. Partial Dependency: Partial Dependency occurs when non-prime key


attributes (non-key columns) depend on part of a composite key.

5. Fully Functional Dependency: An attribute is said to be fully dependent on


another attribute if it is functionally dependent on that attribute and not on any of
its proper subset. For instance, an attribute A is fully functional dependent on
attribute B if it is functionally dependent on B and not on any proper subset of B.
<Courses>
CcodeCunit
Csc2013
Csc2003
Mth1012

<Student score>
MatNo Ccode Score
001 Csc203 70
002 Csc206 45

The above tables state that Matno, Ccode, Cunit → Score. However, this is not
fully functional dependent. Whereas the subset {Matno, Ccode} can easily
determine the score of a student on a course. Thus {Matno, Ccode →(score)
gives fully functional dependency.

You might also like