Chapter 2
Data Models
Learning Objectives
In this chapter, you will learn:
About data modeling and why data models are
important
About the basic data-modeling building blocks
What business rules are and how they influence
database design
How the major data models evolved
About emerging alternative data models and the
need they fulfill
How data models can be classified by their level of
abstraction
Data Modeling and Data Models
• Data modeling: Iterative and progressive process of creating a specific data model for a determined problem domain
• Data models: Simple representations of complex real-world data structures
  - Useful for supporting a specific problem domain
• Model: Abstraction of a real-world object or event
Importance of Data Models
• Are a communication tool
• Give an overall view of the database
• Organize data for various users
• Are an abstraction for the creation of a good database
Data Model Basic Building Blocks
• Entity: Unique and distinct object used to collect and store data
• Attribute: Characteristic of an entity
• Relationship: Describes an association among entities
  - One-to-many (1:M)
  - Many-to-many (M:N or M:M)
  - One-to-one (1:1)
• Constraint: Set of rules to ensure data integrity
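As a minimal sketch of how the four building blocks might look in practice, the following example uses Python's standard-library sqlite3 module. The painter and painting tables, their columns, and the helper names build_demo and violates are hypothetical illustrations, not part of the chapter. Each table stands for an entity, each column for an attribute, the foreign key for a 1:M relationship, and the NOT NULL and CHECK clauses for constraints.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create a tiny in-memory database with two hypothetical entities."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Entity PAINTER with two attributes
        CREATE TABLE painter (
            painter_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        -- Entity PAINTING; painter_id carries the 1:M relationship
        -- (one painter paints many paintings)
        CREATE TABLE painting (
            painting_id INTEGER PRIMARY KEY,
            title       TEXT NOT NULL,
            price       REAL CHECK (price >= 0),
            painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
        );
        """
    )
    return conn


def violates(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> bool:
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


if __name__ == "__main__":
    conn = build_demo()
    conn.execute("INSERT INTO painter VALUES (?, ?)", (1, "Ana"))
    conn.execute("INSERT INTO painting VALUES (?, ?, ?, ?)", (10, "Dawn", 250.0, 1))
    # A painting that points at a painter who does not exist is rejected.
    print(violates(conn, "INSERT INTO painting VALUES (?, ?, ?, ?)", (11, "Dusk", 90.0, 99)))
```

In this sketch the constraints live in the schema, so every program that writes to the database is subject to the same rules.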
Business Rules
Brief, precise, and unambiguous description
of a policy, procedure, or principle
Enable defining the basic building blocks
Describe main and distinguishing
characteristics of the data
Sources of Business Rules
• Company managers
• Policy makers
• Department managers
• Written documentation
• Direct interviews with end users
Reasons for Identifying and
Documenting Business Rules
• Help standardize the company’s view of data
• Serve as a communications tool between users and designers
• Allow the designer to:
  - Understand the nature, role, scope of data, and business processes
  - Develop appropriate relationship participation rules and constraints
  - Create an accurate data model
Translating Business Rules into Data
Model Components
Nouns translate into entities
Verbs translate into relationships among entities
Relationships are bidirectional
Questions to identify the relationship type
How many instances of B are related to one instance of A?
How many instances of A are related to one instance of B?
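The two questions above can be turned into a small decision rule. The sketch below is only an illustration; the function name classify_relationship and its boolean parameters are assumptions introduced here, and the customer/invoice rule in the usage line is a made-up example.

```python
def classify_relationship(many_b_per_a: bool, many_a_per_b: bool) -> str:
    """Classify the relationship between entities A and B.

    many_b_per_a: can one instance of A be related to many instances of B?
    many_a_per_b: can one instance of B be related to many instances of A?
    """
    if many_b_per_a and many_a_per_b:
        return "M:N"
    if many_b_per_a:
        return "1:M"  # one A, many B
    if many_a_per_b:
        return "M:1"  # the same 1:M relationship, read from B's side
    return "1:1"


if __name__ == "__main__":
    # Hypothetical rule: "A customer may generate many invoices;
    # each invoice is generated by one customer."
    print(classify_relationship(many_b_per_a=True, many_a_per_b=False))
```

Because relationships are bidirectional, both answers are needed; answering only one question cannot distinguish 1:M from M:N.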
Naming Conventions
Entity names - Required to:
• Be descriptive of the objects in the business environment
• Use terminology that is familiar to the users
Attribute names - Required to:
• Be descriptive of the data represented by the attribute
Proper naming:
• Facilitates communication between parties
• Promotes self-documentation
Hierarchical Data Models
• Manage large amounts of data for complex manufacturing projects
• Represented by an upside-down tree which contains segments
  - Segments: Equivalent of a file system’s record type
• Depict a set of one-to-many (1:M) relationships
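To make the upside-down tree concrete, here is a rough Python sketch of segments linked by parent/child (1:M) relationships, with access that must follow the hierarchical path from the root. The Segment class, the find_by_path function, and the car/engine/piston names are hypothetical and do not describe any particular hierarchical DBMS.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Segment:
    """One node of the tree; each segment has at most one parent."""
    name: str
    parent: Optional["Segment"] = None
    children: list["Segment"] = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        # A parent may have many children (1:M); a child has exactly one parent.
        child = Segment(name, parent=self)
        self.children.append(child)
        return child


def find_by_path(root: Segment, path: list[str]) -> Optional[Segment]:
    """Walk from the root along the given segment names; None if the path is wrong."""
    if not path or path[0] != root.name:
        return None
    node = root
    for name in path[1:]:
        node = next((c for c in node.children if c.name == name), None)
        if node is None:
            return None
    return node


if __name__ == "__main__":
    car = Segment("Car")
    engine = car.add_child("Engine")
    engine.add_child("Piston")
    print(find_by_path(car, ["Car", "Engine", "Piston"]).name)
    # Skipping a level fails: the caller must know the full hierarchical path.
    print(find_by_path(car, ["Car", "Piston"]))
```

The second lookup illustrates why navigational access requires knowledge of the hierarchical path.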
Hierarchical Model
Advantages
• Promotes data sharing
• Parent/child relationship promotes conceptual simplicity and data integrity
• Database security is provided and enforced by DBMS
• Efficient with 1:M relationships
Disadvantages
• Requires knowledge of physical data storage characteristics
• Navigational system requires knowledge of hierarchical path
• Changes in structure require changes in all application programs
• Implementation limitations
• No data definition
• Lack of standards
Network Data Models
• Represent complex data relationships
• Improve database performance and impose a database standard
• Depict both one-to-many (1:M) and many-to-many (M:N) relationships
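The main structural difference from the hierarchical model is that a record may take part in more than one relationship as a member, so it can have several owners. The following is a loose sketch under that assumption; the NetworkDB class, its methods, and the SALES/COMMISSION set names are invented for illustration and are not the API of any real network DBMS.

```python
from collections import defaultdict


class NetworkDB:
    """Toy store of named sets, each linking an owner record to member records."""

    def __init__(self) -> None:
        # set name -> owner record -> list of member records
        self._sets: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))

    def connect(self, set_name: str, owner: str, member: str) -> None:
        """Attach a member record to an owner within a named set (1:M per set)."""
        self._sets[set_name][owner].append(member)

    def members(self, set_name: str, owner: str) -> list[str]:
        """Return the members owned by one record in one set."""
        return list(self._sets.get(set_name, {}).get(owner, []))

    def owners_of(self, member: str) -> list[tuple[str, str]]:
        """Return every (set name, owner) pair the member belongs to."""
        return sorted(
            (set_name, owner)
            for set_name, by_owner in self._sets.items()
            for owner, owned in by_owner.items()
            if member in owned
        )


if __name__ == "__main__":
    db = NetworkDB()
    # The same invoice is a member of two sets, so it has two owners.
    db.connect("SALES", "customer:1", "invoice:100")
    db.connect("COMMISSION", "salesrep:7", "invoice:100")
    print(db.owners_of("invoice:100"))
```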
Network Model
Advantages
• Conceptual simplicity
• Handles more relationship types
• Data access is flexible
• Data owner/member relationship promotes data integrity
• Conformance to standards
• Includes data definition language (DDL) and data manipulation language (DML)
Disadvantages
• System complexity limits efficiency
• Navigational system yields complex implementation, application development, and management
• Structural changes require changes in all application programs
Standard Database Concepts
• Schema: Conceptual organization of the entire database as viewed by the database administrator
• Subschema: Portion of the database seen by the application programs that produce the desired information from the data within the database
• Data manipulation language (DML): Environment in which data can be managed and is used to work with the data in the database
• Schema data definition language (DDL): Enables the database administrator to define the schema components
The Relational Model
• Produced an "automatic transmission" database that replaced "standard transmission" databases
• Based on a relation
  - Relation or table: Matrix composed of intersecting tuples and attributes
  - Tuple: Row
  - Attribute: Column
• Describes a precise set of data manipulation constructs
Relational Model
Advantages
• Structural independence is promoted using independent tables
• Tabular view improves conceptual simplicity
• Ad hoc query capability is based on SQL
• Isolates the end user from physical-level details
• Improves implementation and management simplicity
Disadvantages
• Requires substantial hardware and system software overhead
• Conceptual simplicity gives untrained people the tools to use a good system poorly
• May promote islands-of-information problems
Relational Database Management System (RDBMS)
• Performs basic functions provided by the hierarchical and network DBMS systems
• Makes the relational data model easier to understand and implement
• Hides the complexities of the relational model from the user
A Relational Diagram
SQL-Based Relational Database
Application
• End-user interface
  - Allows end user to interact with the data
• Collection of tables stored in the database
  - Each table is independent from another
  - Rows in different tables are related based on common values in common attributes
• SQL engine
  - Executes all queries
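As a small illustration of these three parts, the sketch below uses Python's standard-library sqlite3 module: the Python function plays the role of the end-user interface, two independent tables hold the data, and SQLite acts as the SQL engine. The agent and customer tables, their rows, and the function name customers_with_agents are hypothetical sample data chosen for this example.

```python
import sqlite3


def customers_with_agents() -> list[tuple[str, str]]:
    """Relate rows of two independent tables through a common attribute."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE agent (
            agent_code INTEGER PRIMARY KEY,
            agent_name TEXT
        );
        CREATE TABLE customer (
            cus_code   INTEGER PRIMARY KEY,
            cus_name   TEXT,
            agent_code INTEGER  -- common attribute shared with AGENT
        );
        INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn');
        INSERT INTO customer VALUES
            (10010, 'Ramas', 502),
            (10011, 'Dunne', 501),
            (10012, 'Smith', 502);
        """
    )
    # The SQL engine matches rows whose agent_code values are equal.
    rows = conn.execute(
        """
        SELECT c.cus_name, a.agent_name
        FROM customer AS c
        JOIN agent AS a ON c.agent_code = a.agent_code
        ORDER BY c.cus_code
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for cus_name, agent_name in customers_with_agents():
        print(cus_name, "is served by", agent_name)
```

The query states what is wanted, not how to reach it; no path through the data has to be known in advance.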
The Entity Relationship Model
• Graphical representation of entities and their relationships in a database structure
• Entity relationship diagram (ERD): Uses graphic representations to model database components
• Entity instance or entity occurrence: Row in the relational table
• Connectivity: Term used to label the relationship types
Entity Relationship Model
Advantages
• Visual modeling yields conceptual simplicity
• Visual representation makes it an effective communication tool
• Is integrated with the dominant relational model
Disadvantages
• Limited constraint representation
• Limited relationship representation
• No data manipulation language
• Loss of information content occurs when attributes are removed from entities to avoid crowded displays
The ER Model Notations
The Object-Oriented Data Model (OODM) or Semantic Data Model
• Object-oriented database management system (OODBMS): Based on OODM
• Object: Contains data and their relationships with operations that are performed on it
  - Basic building block for autonomous structures
  - Abstraction of real-world entity
• Attributes: Describe the properties of an object
• Class: Collection of similar objects with shared structure and behavior organized in a class hierarchy
• Class hierarchy: Resembles an upside-down tree in which each class has only one parent
• Inheritance: Object inherits methods and attributes of parent class
• Unified Modeling Language (UML): Describes sets of diagrams and symbols to graphically model a system
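The ideas of object, class, class hierarchy, and inheritance can be sketched in ordinary Python classes. This is an analogy in a programming language rather than an OODBMS; the Person, Student, and Employee classes and their attributes are hypothetical.

```python
class Person:
    """Parent class: holds data (attributes) and operations (methods) together."""

    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> str:
        return f"{type(self).__name__}: {self.name}"


class Student(Person):
    """Child class with a single parent; inherits name and describe()."""

    def __init__(self, name: str, gpa: float) -> None:
        super().__init__(name)
        self.gpa = gpa  # attribute specific to students


class Employee(Person):
    """A second child of Person, forming an upside-down tree."""

    def __init__(self, name: str, salary: float) -> None:
        super().__init__(name)
        self.salary = salary


if __name__ == "__main__":
    ana = Student("Ana", 3.6)
    # describe() is not defined in Student; it is inherited from Person.
    print(ana.describe())
```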
Object-Oriented Model
Advantages
• Semantic content is added
• Visual representation includes semantic content
• Inheritance promotes data integrity
Disadvantages
• Slow development of standards caused vendors to supply their own enhancements, compromising a widely accepted standard
• Complex navigational system
• Learning curve is steep
• High system overhead slows transactions
A Comparison of OO, UML, and ER Models
Object/Relational and XML
• Extended relational data model (ERDM): Supports OO features and complex data representation
• Object/Relational Database Management System (O/R DBMS): Based on ERDM, focuses on better data management
• Extensible Markup Language (XML): Manages unstructured data for efficient and effective exchange of all data types
Big Data
Aims to:
• Find new and better ways to manage large amounts of web and sensor-generated data
• Provide high performance and scalability at a reasonable cost
Characteristics:
• Volume
• Velocity
• Variety
• Veracity
• Value
Big Data Challenges
• Volume does not allow the usage of conventional structures
• Expensive
• OLAP tools proved inconsistent dealing with unstructured data
Big Data New Technologies
• Hadoop
• Hadoop Distributed File System (HDFS)
• MapReduce
• NoSQL
NoSQL Databases
• Not based on the relational model
• Support distributed database architectures
• Provide high scalability, high availability, and fault tolerance
• Support large amounts of sparse data
• Geared toward performance rather than transaction consistency
• Store data in key-value stores
NoSQL
Advantages
• High scalability, availability, and fault tolerance are provided
• Uses low-cost commodity hardware
• Supports Big Data
• Key-value model improves storage efficiency
Disadvantages
• Complex programming is required
• There is no relationship support
• There is no transaction integrity support
A Simple Key-value Representation
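The original figure is not reproduced in this text. As a rough stand-in, the sketch below models a key-value store as a mapping from a key to a set of attribute/value pairs, which lets sparse data be stored without empty columns. The KeyValueStore class, its methods, and the driver records are hypothetical and are not meant to match the figure or any specific NoSQL product.

```python
from typing import Any, Optional


class KeyValueStore:
    """Toy key-value store: each key maps to only the attributes it actually has."""

    def __init__(self) -> None:
        self._data: dict[str, dict[str, Any]] = {}

    def put(self, key: str, attribute: str, value: Any) -> None:
        """Store one attribute/value pair under a key."""
        self._data.setdefault(key, {})[attribute] = value

    def get(self, key: str, attribute: str) -> Optional[Any]:
        """Return the value, or None when the key or attribute is absent."""
        return self._data.get(key, {}).get(attribute)

    def attributes(self, key: str) -> list[str]:
        """List the attribute names stored for a key."""
        return sorted(self._data.get(key, {}))


if __name__ == "__main__":
    store = KeyValueStore()
    store.put("driver:1", "lname", "Ramas")
    store.put("driver:1", "license", "A")
    store.put("driver:2", "lname", "Chua")  # no license recorded: nothing is stored
    print(store.attributes("driver:1"), store.attributes("driver:2"))
```

Note that the store itself knows nothing about relationships between keys; any such logic would have to live in application code, which matches the disadvantages listed for NoSQL.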
The Evolution of Data Models
Data Model Basic Terminology Comparison
Data Abstraction Levels
The External Model
• End users’ view of the data environment
• ER diagrams are used to represent the external views
• External schema: Specific representation of an external view
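In a relational DBMS, one common way to give each user group its own external view is an SQL view defined over the shared tables. The sketch below assumes that approach and uses Python's standard-library sqlite3 module; the student and enrollment tables, the two view names, and the function name build_external_views are hypothetical.

```python
import sqlite3


def build_external_views() -> sqlite3.Connection:
    """Create shared tables plus two views, one per hypothetical user group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE student (
            stu_id   INTEGER PRIMARY KEY,
            stu_name TEXT,
            stu_gpa  REAL
        );
        CREATE TABLE enrollment (
            stu_id     INTEGER,
            class_code TEXT
        );
        INSERT INTO student VALUES (1, 'Ana', 3.6), (2, 'Ben', 2.9);
        INSERT INTO enrollment VALUES (1, 'DB101'), (2, 'DB101'), (1, 'MATH200');

        -- View for registration staff: who is in which class, without GPAs
        CREATE VIEW registration_view AS
            SELECT s.stu_name, e.class_code
            FROM student AS s
            JOIN enrollment AS e ON s.stu_id = e.stu_id;

        -- View for advisors: names and GPAs only
        CREATE VIEW advising_view AS
            SELECT stu_name, stu_gpa FROM student;
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_external_views()
    print(conn.execute("SELECT * FROM advising_view ORDER BY stu_name").fetchall())
```

Each group queries only its own view, so it sees just the subset of the data relevant to its work.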
External Models for Tiny College
The Conceptual Model
• Represents a global view of the entire database by the entire organization
• Conceptual schema: Basis for the identification and high-level description of the main data objects
• Has a macro-level view of data environment
• Is software and hardware independent
• Logical design: Task of creating a conceptual data model
Conceptual Model for Tiny College
The Internal Model
• Represents the database as seen by the DBMS, mapping the conceptual model to the DBMS
• Internal schema: Specific representation of an internal model
  - Uses the database constructs supported by the chosen database
• Is software dependent and hardware independent
• Logical independence: Changing the internal model without affecting the conceptual model
Internal Model for Tiny College
The Physical Model
• Operates at the lowest level of abstraction
• Describes the way data are saved on storage media such as disks or tapes
• Requires the definition of physical storage and data access methods
• Relational model is aimed at the logical level
  - Does not require physical-level details
• Physical independence: Changes in the physical model do not affect the internal model
Levels of Data Abstraction