Data Models

The document discusses data models and their importance in database design. It describes the basic building blocks of data models including entities, attributes, relationships, and constraints. It also discusses business rules and how they are used to define the components of a data model.

Uploaded by

edwinmwenda202
Copyright
© All Rights Reserved

DATA MODELS

A data model is a relatively simple representation, usually graphical, of more complex real-world data
structures. Within the database environment, a data model represents data structures and their
characteristics, relations, constraints, transformations, and other constructs with the purpose of supporting
a specific problem domain.

Data modeling is an iterative, progressive process. You start with a simple understanding of the problem
domain, and as that understanding grows, so does the level of detail of the data
model. Done properly, the final data model is in effect a “blueprint” containing all the instructions to
build a database that will meet all end-user requirements. This blueprint is narrative and graphical in
nature, meaning that it contains both text descriptions in plain, unambiguous language and clear, useful
diagrams depicting the main data elements.

THE IMPORTANCE OF DATA MODELS


Data models can facilitate interaction among the designer, the applications programmer, and the end user.
A well-developed data model can even foster improved understanding of the organization for which the
database design is developed. In short, data models are a communication tool.

The importance of data modeling cannot be overstated. Data constitute the most basic information units
employed by a system. Applications are created to manage data and to help transform data into
information. But data are viewed in different ways by different people. For example, contrast the (data)
view of a company manager with that of a company clerk. Although the manager and the clerk both work
for the same company, the manager is more likely to have an enterprise-wide view of company data than
the clerk.

The different users and producers of data and information often reflect the “blind people and the
elephant” analogy: the blind person who felt the elephant’s trunk had quite a different view of the
elephant from the one who felt the elephant’s leg or tail. What is needed is a view of the whole elephant.
Similarly, a house is not a random collection of rooms; if someone is going to build a house, he or she
should first have the overall view that is provided by blueprints. Likewise, a sound data environment
requires an overall database blueprint based on an appropriate data model.

When a good database blueprint is available, it does not matter that an applications programmer’s view of
the data is different from that of the manager and/or the end user. Conversely, when a good database
blueprint is not available, problems are likely to ensue. For instance, an inventory management program
and an order entry system may use conflicting product-numbering schemes, thereby costing the company
thousands (or even millions) of dollars.

DATA MODEL BASIC BUILDING BLOCKS


The basic building blocks of all data models are entities, attributes, relationships, and constraints. An
entity is anything (a person, a place, a thing, or an event) about which data are to be collected and stored.
An entity represents a particular type of object in the real world. Because an entity represents a particular
type of object, entities are “distinguishable”—that is, each entity occurrence is unique and distinct. For
example, a CUSTOMER entity would have many distinguishable customer occurrences, such as John
Smith, Pedro Dinamita, Tom Strickland, etc. Entities may be physical objects, such as customers or
products, but entities may also be abstractions, such as flight routes or musical concerts.

An attribute is a characteristic of an entity. For example, a CUSTOMER entity would be described by
attributes such as customer last name, customer first name, customer phone, customer address, and
customer credit limit. Attributes are the equivalent of fields in file systems.

A relationship describes an association among entities. For example, a relationship exists between
customers and agents that can be described as follows: an agent can serve many customers, and each
customer may be served by one agent. Data models use three types of relationships: one-to-many, many-
to-many, and one-to-one. Database designers usually use the shorthand notations 1:M or 1..*, M:N or
*..*, and 1:1 or 1..1, respectively. (Although the M:N notation is a standard label for the many-to-many
relationship, the label M:M may also be used.)
• One-to-many (1:M or 1..*) relationship. A painter paints many different paintings, but each one
of them is painted by only one painter. Thus, the painter (the “one”) is related to the paintings (the
“many”). Therefore, database designers label the relationship “PAINTER paints PAINTING” as
1:M. (Note that entity names are often capitalized as a convention, so they are easily identified.)
Similarly, a customer (the “one”) may generate many invoices, but each invoice (the “many”) is
generated by only a single customer. The “CUSTOMER generates INVOICE” relationship would
also be labeled 1:M.
• Many-to-many (M:N or *..*) relationship. An employee may learn many job skills, and each
job skill may be learned by many employees. Database designers label the relationship
“EMPLOYEE learns SKILL” as M:N. Similarly, a student can take many classes and each class
can be taken by many students, thus yielding the M:N relationship label for the relationship
expressed by “STUDENT takes CLASS.”
• One-to-one (1:1 or 1..1) relationship. A retail company’s management structure may require
that each of its stores be managed by a single employee. In turn, each store manager, who is an
employee, manages only a single store. Therefore, the relationship “EMPLOYEE manages
STORE” is labeled 1:1.
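
The three relationship types can also be sketched in code. The following is only an illustration, not part of the original text: it uses Python dataclasses, and every class name, field name, and sample value (Painter, painter_id, "store-1", and so on) is an assumption chosen to mirror the examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Painter:
    """The "one" side of PAINTER paints PAINTING."""
    painter_id: int
    painter_name: str


@dataclass(frozen=True)
class Painting:
    """The "many" side: each painting refers to exactly one painter."""
    painting_id: int
    painting_title: str
    painter_id: int


# 1:M. Several paintings may share one painter_id. All values are made up.
painters = [Painter(1, "Painter A")]
paintings = [Painting(10, "Work X", 1), Painting(11, "Work Y", 1)]

# M:N. EMPLOYEE learns SKILL, kept as (employee_id, skill) pairs.
employee_skills = {(100, "welding"), (100, "wiring"), (101, "welding")}

# 1:1. EMPLOYEE manages STORE: one manager per store, one store per manager.
store_manager = {"store-1": 100, "store-2": 101}


def paintings_by(painter_id):
    """All paintings on the "many" side for one painter."""
    return [p for p in paintings if p.painter_id == painter_id]


def skills_of(employee_id):
    """Skills learned by one employee (one direction of the M:N)."""
    return {skill for emp, skill in employee_skills if emp == employee_id}


def employees_with(skill):
    """Employees who learned one skill (the other direction of the M:N)."""
    return {emp for emp, s in employee_skills if s == skill}


def is_one_to_one(mapping):
    """True when no value repeats, i.e. no employee manages two stores."""
    return len(set(mapping.values())) == len(mapping)
```

Storing the M:N relationship as a set of pairs is one common choice; it lets the same data answer the question from either side.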

The preceding discussion identified each relationship in both directions; that is, relationships are
bidirectional:
• One CUSTOMER can generate many INVOICEs.
• Each of the many INVOICEs is generated by only one CUSTOMER.

A constraint is a restriction placed on the data. Constraints are important because they help to ensure data
integrity. Constraints are normally expressed in the form of rules. For example:
• An employee’s salary must have values that are between 6,000 and 350,000.
• A student’s GPA must be between 0.00 and 4.00.
• Each class must have one and only one teacher.
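
As a rough sketch, the three example constraints could be written as simple checks. The function names are hypothetical, and treating the stated limits as inclusive is an assumption, since the rules above do not say whether the endpoints are allowed.

```python
def salary_is_valid(salary):
    """Salary must be between 6,000 and 350,000 (endpoints assumed allowed)."""
    return 6_000 <= salary <= 350_000


def gpa_is_valid(gpa):
    """GPA must be between 0.00 and 4.00 (endpoints assumed allowed)."""
    return 0.00 <= gpa <= 4.00


def class_has_one_teacher(teacher_ids):
    """Each class must have one and only one teacher."""
    return len(teacher_ids) == 1
```

In a real database, rules like these would more likely be declared as part of the schema so that the DBMS rejects invalid data itself; the functions only show what each rule tests.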

How do you properly identify entities, attributes, relationships, and constraints? The first step is to clearly
identify the business rules for the problem domain you are modeling.

BUSINESS RULES

From a database point of view, the collection of data becomes meaningful only when it reflects properly
defined business rules. A business rule is a brief, precise, and unambiguous description of a policy,
procedure, or principle within a specific organization. In a sense, business rules are misnamed: they apply
to any organization, large or small—a business, a government unit, a religious group, or a research
laboratory—that stores and uses data to generate information.

Business rules, derived from a detailed description of an organization’s operations, help to create and
enforce actions within that organization’s environment. Business rules must be rendered in writing and
updated to reflect any change in the organization’s operational environment.

Properly written business rules are used to define entities, attributes, relationships, and constraints. Any
time you see relationship statements such as “an agent can serve many customers, and each customer can
be served by only one agent,” you are seeing business rules at work.

Business rules describe, in simple language, the main and distinguishing characteristics of the data as
viewed by the company. Examples of business rules are as follows:
• A customer may generate many invoices.
• An invoice is generated by only one customer.
• A training session cannot be scheduled for fewer than 10 employees or for more than 30
employees.

Note that those business rules establish entities, relationships, and constraints. For example, the first two
business rules establish two entities (CUSTOMER and INVOICE) and a 1:M relationship between those
two entities. The third business rule establishes a constraint (no fewer than 10 people and no more than 30
people), two entities (EMPLOYEE and TRAINING), and a relationship between EMPLOYEE and
TRAINING.

Discovering Business Rules


Interviewing end users is a fast, direct way to discover business rules. Unfortunately,
because perceptions differ, end users are sometimes a less reliable source when it comes to specifying
business rules. For example, a maintenance department mechanic might believe that any mechanic can
initiate a maintenance procedure, when actually only mechanics with inspection authorization can
perform such a task. Such a distinction might seem trivial, but it can have major legal consequences.

The database designer’s job is to reconcile any such differences in perception and verify the results of
the reconciliation to ensure that the business rules are appropriate and accurate.

Identifying and documenting business rules is essential to database design. Business rules serve several
purposes:
• They help to standardize the company’s view of data.
• They can be a communications tool between users and designers.
• They allow the designer to understand the nature, role, and scope of the data.
• They allow the designer to understand business processes.
• They allow the designer to develop appropriate relationship participation rules and constraints
and to create an accurate data model.

Not all business rules can be modeled. For example, a business rule that specifies that “no pilot can fly
more than 10 hours within any 24-hour period” cannot be modeled. However, such a business rule can be
enforced by application software.
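
To make "enforced by application software" concrete, here is one possible sketch of such a check. It is not taken from the original text: the function names are hypothetical, flights are assumed to be (start, end) pairs of datetime values, and one pilot's flights are assumed not to overlap each other.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)
MAX_FLYING = timedelta(hours=10)


def flown_within(flights, window_start):
    """Total flying time that falls inside the 24 hours after window_start."""
    window_end = window_start + WINDOW
    total = timedelta()
    for start, end in flights:
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > timedelta():
            total += overlap
    return total


def violates_pilot_rule(flights):
    """True if some 24-hour window holds more than 10 hours of flying."""
    # The busiest window either begins at a flight start or ends at a flight
    # end, so only those candidate windows need to be examined.
    candidates = [start for start, _ in flights]
    candidates += [end - WINDOW for _, end in flights]
    return any(flown_within(flights, w) > MAX_FLYING for w in candidates)
```

An application would run a check like this before accepting a new flight assignment. The rule spans many rows and a sliding time window, which is why it is handled in code rather than in the data model.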

As a general rule, a noun in a business rule will translate into an entity in the model, and a verb (active or
passive) associating nouns will translate into a relationship among the entities. For example, the business
rule “a customer may generate many invoices” contains two nouns (customer and invoices) and a verb
(generate) that associates the nouns. From this business rule, you could deduce that:
• Customer and invoice are objects of interest for the environment and should be represented by
their respective entities.
• There is a “generate” relationship between customer and invoice.

To properly identify the type of relationship, you should consider that relationships are bidirectional; that
is, they go both ways. For example, the business rule “a customer may generate many invoices” is
complemented by the business rule “an invoice is generated by only one customer.” In that case, the
relationship is one-to-many (1:M). Customer is the “1” side, and invoice is the “many” side.

As a general rule, to properly identify the relationship type, you should ask two questions:
• How many instances of B are related to one instance of A?
• How many instances of A are related to one instance of B?

For example, you can assess the relationship between student and class by asking two questions:
• In how many classes can one student enroll? Answer: many classes.
• How many students can enroll in one class? Answer: many students.
Therefore, the relationship between student and class is many-to-many (M:N).
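
The two-question test can be written down as a small lookup. This is only a sketch under stated assumptions: the function name and the "one"/"many" answer strings are hypothetical, and the second value returned says which entity is the "1" side when the relationship is 1:M.

```python
def relationship_type(b_per_a, a_per_b):
    """Classify a relationship between entities A and B.

    b_per_a: answer to "how many instances of B relate to one A?"
    a_per_b: answer to "how many instances of A relate to one B?"
    Each answer must be the string "one" or "many".
    Returns (label, one_side); one_side is "A", "B", or None.
    """
    table = {
        ("one", "one"): ("1:1", None),
        ("many", "one"): ("1:M", "A"),   # one A has many Bs
        ("one", "many"): ("1:M", "B"),   # one B has many As
        ("many", "many"): ("M:N", None),
    }
    try:
        return table[(b_per_a, a_per_b)]
    except KeyError:
        raise ValueError('each answer must be "one" or "many"')
```

For student (A) and class (B), both answers are "many", so the function returns the M:N label. For customer (A) and invoice (B), the answers are "many" and "one", giving 1:M with the customer as the "1" side.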

NAMING CONVENTIONS

Entity names should be descriptive of the objects in the business environment, and use terminology that is
familiar to the users. An attribute name should also be descriptive of the data represented by that attribute.
It is also a good practice to prefix the name of an attribute with the name of the entity (or an abbreviation
of the entity name) in which it occurs. For example, in the CUSTOMER entity, the customer’s credit limit
may be called CUS_CREDIT_LIMIT. The CUS indicates that the attribute is descriptive of the
CUSTOMER entity, while CREDIT_LIMIT makes it easy to recognize the data that will be contained in
the attribute.
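
The prefixing convention is mechanical enough to sketch as a helper. The function name is hypothetical, and joining words with underscores in upper case is an assumption based on the single CUS_CREDIT_LIMIT example above.

```python
def attribute_name(entity_prefix, description):
    """Build an attribute name such as CUS_CREDIT_LIMIT.

    entity_prefix: the entity name or its abbreviation, e.g. "CUS"
    description:   plain words describing the data, e.g. "credit limit"
    """
    words = description.upper().split()
    return "_".join([entity_prefix.upper()] + words)
```

For example, attribute_name("CUS", "credit limit") yields CUS_CREDIT_LIMIT.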

THE EVOLUTION OF DATA MODELS


The quest for better data management has led to several models that attempt to resolve the file system’s
critical shortcomings. These models represent schools of thought as to what a database is, what it should
do, the types of structures that it should employ, and the technology that would be used to implement
these structures.

Hierarchical and Network Models


The hierarchical model was developed in the 1960s to manage large amounts of data for complex
manufacturing projects such as the Apollo program, which landed astronauts on the moon in 1969. Its basic logical
structure is represented by an upside-down tree. The hierarchical structure contains levels, or segments. A
segment is the equivalent of a file system’s record type. Within the hierarchy, a higher layer is perceived
as the parent of the segment directly beneath it, which is called the child. The hierarchical model depicts a
set of one-to-many (1:M) relationships between a parent and its child segments. (Each parent can have
many children, but each child has only one parent.)
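
The parent and child structure can be pictured as a tree in which every segment holds a single parent reference. The sketch below is illustrative only: the Segment class and the segment names used with it are hypothetical and are not drawn from any actual hierarchical DBMS.

```python
class Segment:
    """One node of the upside-down tree: many children, at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a single reference, so only one parent
        self.children = []        # a parent may have many children (1:M)
        if parent is not None:
            parent.children.append(self)

    def path_from_root(self):
        """Names of the segments from the root down to this segment."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return list(reversed(path))


# Hypothetical segments for a manufacturing project.
root = Segment("FINAL_ASSEMBLY")
component = Segment("COMPONENT_A", parent=root)
part = Segment("PART_1", parent=component)
```

Because each segment stores exactly one parent, a child can be reached only through its one chain of parents, which is the 1:M restriction described above.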

The network model was created to represent complex data relationships more effectively than the
hierarchical model, to improve database performance, and to impose a database standard. In the network
model, the user perceives the network database as a collection of records in 1:M relationships. However,
unlike the hierarchical model, the network model allows a record to have more than one parent. While the
network database model is generally not used today, the definitions of standard database concepts that
emerged with the network model are still used by modern data models. Some important concepts that
were defined at this time are:
• The schema, which is the conceptual organization of the entire database as viewed by the
database administrator.
• The subschema, which defines the portion of the database “seen” by the application programs
that actually produce the desired information from the data contained within the database.
• A data management language (DML), which defines the environment in which data can be
managed and which is used to work with the data in the database.
• A schema data definition language (DDL), which enables the database administrator to define
the schema components.

As information needs grew and as more sophisticated databases and applications were required, the
network model became too cumbersome. The lack of ad hoc query capability put heavy pressure on
programmers to generate the code required to produce even the simplest reports. And although the
existing databases provided limited data independence, any structural change in the database could still
produce havoc in all application programs that drew data from the database. Because of the disadvantages
of the hierarchical and network models, they were largely replaced by the relational data model in the
1980s.
