Topic 8: Database Architecture
Database architecture focuses on the design, development, implementation and maintenance
of database
A database consists of two parts: (1) the database schema describes the structure of the database
and (2) the database instance is the actual content of one particular database. A database schema
is the same irrespective of the contents of the database. There is only one schema for a database.
A database instance changes depending on the data in the database. For example, a database
instance changes as new data is added to the database.
In designing a database, the database designer constructs a database schema by selecting the data
items to be stored in the database.
Complex databases can consist of thousands of data items and be accessed by thousands of users.
To manage this structure database designers divide the users into user groups and provide each
user group with access to their own part of the database.
The way the database designer choses to organize the data into user groups and then convert these
groups into a database management system is called the database architecture.
The commonest database architecture is the three-schema architecture, often called the ANSI-
SPARC model.
The three main components of the three-schema architecture describe three levels of abstraction
within the database system: (1) external, (2) conceptual and (3) internal.
The purpose of these abstraction levels is to provide data independence. That is, each level is
separated from the level above it and the level below it.
Three Schema Architecture
User User User External
View 1 View 2 View 3
Level
Conceptual Conceptual
Schema
Level
Internal Internal
Schema
Level
Database
1
The three levels within the database architecture represent different views of the data.
The data is not held in three different locations; it is held in only one place, that is, the database.
The three levels (external, conceptual and internal) are only descriptions of data.
When a user requests data by writing a database query, they use the structures contained in their
view of the database (their external schema).
The user’s request must be converted from one of the external views to the conceptual view of the
database and then to the internal view of the database. Therefore, it must be possible to convert
between the different levels in the architecture.
Each level is defined in terms of the level below it. For example, the contents of the external level
are described using the information contained in the conceptual schema.
This process of expressing one level in terms of another is called a mapping. A mapping converts
the contents of one level into another, for example, a mapping between the conceptual level and
the external level converts the contents of the conceptual level into one or more external level
views.
When the user requests data from the database using an external schema description it is mapped
to the conceptual schema and then to the internal schema.
The main purpose of the three schema architecture is to separate each user’s view of the database
from other users.
Connelly et al give the following reasons why separating the user views is desirable:
1.Users should be able to view the data in the form that best suits their needs. The same
information may be viewed in more than one way.
2.Users are not concerned with how the database has been physically implemented but instead are
interested only in the content of the database.
3.The physical implementation of the database can be changed by the database administrator
without affecting how the users view the data.
4.Changing the physical structures used to implement the database should not affect the logical
structures in the database, for example, the tables or relations.
5.The global view of the database can be changed without affecting the users. Hence, new data
may be added to the database without affecting existing users.
EXTERNAL SCHEMA
Elmasri et al defines the external schema as “the part of the database that a particular user group
is interested in”. The external schema also “hides the rest of the database from that user group”.
Connolly et al defines the external schema as “the user’s view of the database”. The external level
describes the part of the database that is relevant to particular groups of users.
In an external view, a user will only see the entities, relationships and attributes that are of
particular interest to them. For example, they may see an employee’s address but not an
employee’s salary.
External views may represent the same data in different formats. For example, in a multi-national
company dates may be represented differently depending on the conventions of the country using
the external schemas.
2
The external schema may also contain data that is calculated from the database but not actually
stored in the database.
CONCEPTUAL SCHEMA
The conceptual schema is a description of what is in the database.
Elmasri et al defines the conceptual schema as “the structure of the whole database for a
community of users”.
Connolly et al defines the conceptual schema as “the community view of the database”.
The conceptual schema describes the content of the whole database. Its purpose is to hide the
details of how the database has been implemented in a particular database management system.
The conceptual schema presents a high level view of the database using entities, relationships and
attributes. It will also describe the security and integrity constraints of the whole database.
The external schemas are extracted from the conceptual schema by selecting which entities will be
visible in an external schema.
INTERNAL SCHEMA
The internal schema describes how the database is implemented.
Elmasri et al defines the internal schema as “describing the physical storage structure of the
database”.
Connolly et al defines the internal schema as “the physical representation of the database on the
computer”.
The internal schema will describe the set of operating files and index structures that are used to
store the data.
The purpose of the internal schema is to allow the database administrator to achieve the optimal
implementation of the database, for example, the maximum query performance or the minimum
storage space usage.
The internal schema will describe the mapping between database relations and files, the types of
indexes used, the allocation of disc blocks to the database, the method of record placement for
each relation and any data compression or encryption techniques to be used.
There must be a mapping between the internal schema and the conceptual schema. This mapping
will describe the relationship between operating system file structures and the entities and
relationships in the conceptual schema.
DATA INDEPENDENCE
Elmasri et al define data independence as “the capacity to change the schema at one level of a
database system without having to change the schema at the next higher level”.
For example, when the internal schema of the database is changed (by adding a new index, for
instance) it is not necessary to change the entity-relationship description of the conceptual model.
Sometimes it is necessary to change the mapping between the internal schema and the conceptual
schema. For example, when a file is split in two and an entity is split between two files. However,
3
in this example it is not necessary to change the definition of an entity only the mapping between
the internal level and the conceptual level.
When the internal schema changes it is not necessary to change application programs that only
access the external schemas.
There are two types of data independence: (1) logical independence and (2) physical
independence.
Logical data independence is provided by the mapping between the conceptual schema and the
external schema.
Elmasri et al defines logical data independence as “the capacity to change the conceptual schema
without having to change the external schema or application programs”.
Connolly et al defines logical data independence as “the immunity of the external schema to
changes in the conceptual schema”.
When new data item is added to the conceptual schema it is not necessary to change the external
schema or application programs because the data they use is unaffected by the new data.
For example, adding a new attribute salary to the entity employee will not affect existing external
schemas. However, the external schemas will not know that the salary attribute exists. If the
external schema is to show the new data, or the application programs are to use the new data, then
they must be changed.
When a data item is deleted from the conceptual schema then the external schemas and application
programs that use the data item must be changed. This is because the data item is no longer
available.
Physical data independence is provided by the mapping between the internal schema and the
conceptual schema.
Elmasri et al defines physical data independence as “the capacity to change the internal schema
without having to change the conceptual (or external) schema”.
Connolly et al defines physical data independence as “the immunity of the conceptual schema to
changes in the internal schema”.
When the internal schema changes it is not necessary to change the conceptual schema. For
example, adding a new index should have no effect on the structure of the entities and relationships
in the conceptual schema.
Changing the internal schema will frequently affect the performance of the database. For example,
removing an index may slow down the speed of some queries.
If information is actually removed from the internal schema, for example, a file is deleted, then
this will affect the conceptual schema because the data belonging to an entity will be missing.