DBMS - July2021 Answer Key
B.Tech Degree Examination July 2021 (2019 Scheme)
Course Code: CST204
Course Name: DATABASE MANAGEMENT SYSTEMS
1. List any three categories of database users, highlighting any one important characteristic of
each category.
a. Database Administrators
The person who has central control over the system is called the database
administrator (DBA). In a database environment, the primary resource is the
database itself; the secondary resources are the DBMS and related software.
Administering these resources is the responsibility of the DBA.
b. Database Designers
Database designers are responsible for identifying the data to be stored in the database
and for choosing appropriate structures to represent and store this data.
c. System Analyst and Application Programmers
System analysts determine the requirements of end users, especially naive and
parametric end users, and develop specifications for standard canned transactions that
meet these requirements. Application programmers implement these specifications
as programs. They then test, debug, document, and maintain these canned
transactions.
2. What are the major differences between structured, unstructured and semi-structured data?
Structured data conforms to a fixed, predefined schema (e.g., relational tables), so every record has the same format.
Semi-structured data has some organizational structure that is embedded with the data itself rather than fixed in advance (e.g., XML or JSON documents).
Unstructured data has no predefined schema or data model (e.g., plain text documents, images, videos).
3. What is entity integrity? Why is it important?
The entity integrity constraint states that no primary key value can be NULL.
This is because the primary key value is used to identify individual tuples in a relation;
if the primary key had a NULL value, we could not identify those tuples.
4. Distinguish between Super key, Candidate key, and Primary key using a real convincing
example.
Super Key
A super key is any set of one or more columns whose values uniquely identify a row within
a table; it may contain extra columns that are not needed for uniqueness.
Candidate Key
A candidate key is a minimal super key: a set of one or more columns that can uniquely
identify a row within a table, with no redundant columns. A relation may have more than
one such key; each of them is called a candidate key. A table can have multiple candidate
keys, but only one primary key.
Primary Key
The primary key is the candidate key chosen by the database designer to uniquely identify
every row in a table of the relational database.
Example: Consider a relational schema student(Roll_no, Name, Registration_no).
Here {Roll_no}, {Registration_no}, and {Roll_no, Name} are all super keys; the minimal
ones, (Roll_no) and (Registration_no), are the two candidate keys.
Primary Key: any candidate key can be chosen as the primary key (Roll_no or Registration_no).
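As a sketch in SQL (column types assumed for illustration), both candidate keys can be declared, with one of them chosen as the primary key:
eg: CREATE TABLE student (
        roll_no NUMBER(5),
        registration_no VARCHAR2(10) UNIQUE, -- remaining candidate key (alternate key)
        name VARCHAR2(50),
        PRIMARY KEY (roll_no)                -- candidate key chosen as primary key
    );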
For example,
• a trigger can be invoked when a row is inserted into a specified table or
when certain table columns are being updated.
• it may be useful to specify a condition that, if violated, causes some user to
be informed of the violation.
• When an event occurs, the database trigger is fired and a predefined PL/SQL
block performs the necessary action.
Create a row-level trigger for the customers table that would fire for INSERT or UPDATE
or DELETE operations performed on the CUSTOMERS table. This trigger will display the
salary difference between the old values and new values.
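A sketch of such a trigger in PL/SQL (assuming the CUSTOMERS table has ID and SALARY columns; the names are illustrative):
CREATE OR REPLACE TRIGGER display_salary_changes
BEFORE INSERT OR UPDATE OR DELETE ON customers
FOR EACH ROW
DECLARE
   sal_diff NUMBER;
BEGIN
   -- :OLD.salary is NULL on INSERT and :NEW.salary is NULL on DELETE,
   -- so NVL treats the missing value as 0
   sal_diff := NVL(:NEW.salary, 0) - NVL(:OLD.salary, 0);
   DBMS_OUTPUT.PUT_LINE('Old salary : ' || :OLD.salary);
   DBMS_OUTPUT.PUT_LINE('New salary : ' || :NEW.salary);
   DBMS_OUTPUT.PUT_LINE('Salary difference: ' || sal_diff);
END;
/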
DDL is a set of SQL commands used to create, modify, and delete database structures but not
data.
The common DDL commands are CREATE TABLE, ALTER TABLE, DROP TABLE, TRUNCATE, and RENAME.
eg: CREATE TABLE <table name> (column1 datatype(size), column2 datatype(size),
column3 datatype(size), ..., PRIMARY KEY(column name));
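For instance (table and column names assumed for illustration):
eg: CREATE TABLE student (roll_no NUMBER(5), name VARCHAR2(50), dept VARCHAR2(20), PRIMARY KEY(roll_no));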
DML is short for Data Manipulation Language. It deals with data manipulation and
includes the most common SQL statements such as SELECT, INSERT, UPDATE, and DELETE;
it is used to store, modify, retrieve, and delete data in a database.
The DELETE Statement in SQL is used to delete existing records from a table.
We can delete a single record or multiple records depending on the condition we specify in
the WHERE clause.
Eg: DELETE FROM <table_name> WHERE <condition>;
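For instance, assuming a student table, the following deletes only the record whose roll_no is 101:
Eg: DELETE FROM student WHERE roll_no = 101;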
Insertion anomalies
Deletion anomalies
Modification (updation) anomalies
Insertion Anomalies
A new project that has no employees assigned to it yet cannot be inserted into the
combined relation without using NULL values for the employee attributes.
Deletion Anomalies
When a project is deleted, it will result in deleting all the employees who work on that
project.
Alternately, if an employee is the sole employee on a project, deleting that employee
would result in deleting the corresponding project.
Modification (Updation) Anomalies
• In EMP_DEPT, if we change the value of one of the attributes of a particular department,
say the manager of department 5, we must update the tuples of all employees who work in
that department; otherwise, the database will become inconsistent.
• If we fail to update some tuples, the same department will be shown to have two
different values for manager in different employee tuples, which would be wrong
8. How can we conclude that two sets of functional dependencies, FD1 and FD2, are equivalent?
1. If all FDs of FD1 can be derived from the FDs present in FD2, we say that FD2 covers
FD1 (FD2 ⊇ FD1).
2. If all FDs of FD2 can be derived from the FDs present in FD1, we say that FD1 covers
FD2 (FD1 ⊇ FD2).
3. If both 1 and 2 hold, FD1 = FD2 (the two sets are equivalent).
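For example, consider FD1 = {A→B, B→C} and FD2 = {A→B, B→C, A→C}. Every FD in FD2 can be derived from FD1 (A→C follows from A→B and B→C by transitivity), and every FD in FD1 is already present in FD2. Hence FD1 and FD2 are equivalent.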
Transactions T1 and T2 (schedules shown as figures in the original).
Changing the mode of a lock that is already held is called lock conversion.
Lock conversion occurs when a process accesses a data object on which it already holds a lock,
and the access mode requires a more restrictive lock than the one already held. A process can
hold only one lock on a data object at any given time, although it can request a lock on the
same data object many times indirectly through a query.
If lock conversion is allowed, then upgrading a lock (from S(a) to X(a)) is allowed only
in the growing phase, and downgrading a lock (from X(a) to S(a)) must be done only in the
shrinking phase.
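For example, a transaction obeying two-phase locking may proceed as follows (an illustrative schedule):
Growing phase: lock-S(a); read(a); upgrade lock on a from S to X; write(a);
Shrinking phase: downgrade lock on a from X to S; unlock(a).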
PART B
11. a A company has the following scenario: There are a set of salespersons. Some of them
manage other salespersons. However, a salesperson cannot have more than one manager. A
salesperson can be an agent for many customers. A customer is managed by exactly one
salesperson. A customer can place any number of orders. An order can be placed by exactly
one customer. Each order lists one or more items. An item may be listed in many orders. An
item is assembled from different parts and parts can be common for many items. One or more
employees assemble an item from parts. A supplier can supply different parts in certain
quantities. A part may be supplied by different suppliers.
• Identify and list entities, suitable attributes, primary keys, and relationships
to represent the scenario.
Entities & attributes
Sales_person - sp_id, name, age, address; PK - sp_id
Customer - cust_id, name, address, phone; PK - cust_id
Part - p_id, pname, ptype; PK - p_id
Item - i_code, iname; PK - i_code
Order - order_id, order_date; PK - order_id
Supplier - s_code, s_name, s_address, s_phone; PK - s_code
Employee - emp_id, e_name, designation, salary; PK - emp_id
Relationships
manages (Sales_person to Sales_person, 1:N), is_agent_for (Sales_person to Customer, 1:N),
places (Customer to Order, 1:N), lists (Order to Item, M:N), assembled_from (Item to Part, M:N),
assembles (Employee to Item, M:N), supplies (Supplier to Part, M:N)
• Internal Level
• The internal level has an internal schema which describes the physical storage
structure of the database.
• The internal schema is also known as a physical schema.
• It uses the physical data model and defines how the data will be stored in blocks.
• The physical level is used to describe complex low-level data structures in detail.
Conceptual Level
• The conceptual schema describes the design of a database at the conceptual level.
The conceptual level is also known as the logical level.
• The conceptual schema describes the structure of the whole database.
• The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
• In the conceptual level, internal details such as an implementation of the data structure
are hidden.
• Programmers and database administrators work at this level.
External Level
• The external level has several external schemas or user views. Each external schema
describes the part of the database that a particular user group is interested in and
hides the rest of the database from that group.
• In a DBMS based on the three-schema architecture, each user group refers only to its
own external schema.
• Hence, the DBMS must transform a request specified on an external schema into a
request against the conceptual schema, and then into a request on the internal schema
for processing over the stored database.
If the request is a database retrieval, the data extracted from the stored database must
be reformatted to match the user’s external view
The Conceptual/ Internal Mapping lies between the conceptual level and the internal level. Its
role is to define the correspondence between the records and fields of the conceptual level and
files and data structures of the internal level.
⚫ The structure of data files is stored in the DBMS catalog separately from the access
programs. This property is called program-data independence.
⚫ An operation (also called a function or method) is specified in two parts.
Interface
The interface (or signature) of an operation includes the operation name and the data types of its arguments (or
parameters).
Implementation
The implementation (or method) of the operation is specified separately and can be changed
without affecting the interface. This property is called program-operation independence.
• A database has many users, each user may require a different perspective or view of
the database.
• A view may be a subset of the database, or it may contain virtual data that is derived
from the database files but is not explicitly stored.
▫ Isolation property
⚫ ensures that each transaction appears to execute in isolation from other transactions,
⚫ even though hundreds of transactions may be executing concurrently.
▫ Atomicity property
⚫ ensures that either all the database operations in a transaction are executed or none are.
⚫ Any mechanical or electrical device is subject to failure, and so is the computer system.
In this case we have to ensure that the data is restored to a consistent state.
⚫ For example, an amount of Rs 50 has to be transferred from Account A to Account B.
⚫ Suppose the amount has been debited from Account A but not yet credited to Account B,
and in the meantime some failure occurs; this will lead to an inconsistent state. So we
have to adopt a mechanism which ensures that either the full transaction is executed or
no transaction is executed, i.e., the fund transfer should be atomic.
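A sketch of this transfer as a single atomic transaction in SQL (table and column names are assumed; transaction syntax varies slightly across DBMSs):
START TRANSACTION;
UPDATE account SET balance = balance - 50 WHERE acc_id = 'A'; -- debit A
UPDATE account SET balance = balance + 50 WHERE acc_id = 'B'; -- credit B
COMMIT;
If any failure occurs before COMMIT, the DBMS rolls back both updates, so the transfer is all-or-nothing.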
13 a. Study the tables given below and write relational algebra expressions for the queries that follow.
PROFESSOR(PROFID,PNAME, PHONE)
Primary keys are underlined. ADVISOR is a foreign key referring to PROFESSOR table.
ROLLNO and COURSEID in ENROLLMENT are also foreign keys referring to the primary keys of
the STUDENT and COURSE tables, respectively.
• Roll Number and name of students who have not enrolled for any course.
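One possible expression, assuming STUDENT(ROLLNO, NAME, ADVISOR) and ENROLLMENT(ROLLNO, COURSEID):
π ROLLNO, NAME (STUDENT) − π ROLLNO, NAME (STUDENT ⋈ ENROLLMENT)
The second term lists students who appear in at least one enrollment; subtracting it from all students leaves those who have not enrolled for any course.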
13.b Explain the left outer join, right outer join, full outer join operations with examples
Left Outer Join (⟕)
Left outer join returns all the tuples that match on the common attribute, together with
all the remaining tuples of the left relation that have no match, padded with NULL values
for the attributes of the right relation. Notation: R1 ⟕ R2.
Right Outer Join (⟖)
Right outer join similarly keeps all the tuples of the right relation; tuples of R2 with
no match in R1 are padded with NULL values. Notation: R1 ⟖ R2.
Full Outer Join (⟗)
Full outer join returns all the tuples from the left and right relations that match on the
common attribute, together with all the remaining non-matching tuples from both relations,
padded with NULLs. Notation: R1 ⟗ R2, where R1 and R2 are relations.
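For example, let R1(Name, Dept) contain (Ram, D1) and (Asha, D2), and let R2(Dept, Mgr) contain (D1, Mohan) and (D3, Leela), joined on Dept:
R1 ⟕ R2 = {(Ram, D1, Mohan), (Asha, D2, NULL)}
R1 ⟖ R2 = {(Ram, D1, Mohan), (NULL, D3, Leela)}
R1 ⟗ R2 = {(Ram, D1, Mohan), (Asha, D2, NULL), (NULL, D3, Leela)}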
14 a. Consider the following relations for a database that keeps track of business trips of salespersons
in a sales office:
• A trip can be charged to one or more accounts. Specify the foreign keys for this schema,
stating any assumptions you make.
• Write a relational algebra expression to get the details of salespersons who have travelled
between Mumbai and Delhi and the travel expense is greater than Rs.50000.
• Write a relational algebra expression to get the details of the salesperson who incurred the
greatest travel expenses among all travels made.
Foreign Key
Ssn in the TRIP relation is a foreign key referring to the SALESPERSON relation.
TripId in EXPENSE is a foreign key referring to the TRIP relation.
Write a relational algebra expression to get the details of salespersons who have travelled between Mumbai
and Delhi and the travel expense is greater than Rs.50000.
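One possible expression, assuming SALESPERSON(Ssn, Name, Start_year, Dept_no), TRIP(Trip_id, Ssn, From_city, To_city) and EXPENSE(Trip_id, Account_no, Amount); ℑ denotes the aggregate-function operation, grouping EXPENSE by Trip_id:
MD_TRIPS ← σ (From_city='Mumbai' ∧ To_city='Delhi') ∨ (From_city='Delhi' ∧ To_city='Mumbai') (TRIP)
TOTALS(Trip_id, Total) ← Trip_id ℑ SUM(Amount) (EXPENSE)
RESULT ← SALESPERSON ⋈ π Ssn (MD_TRIPS ⋈ σ Total > 50000 (TOTALS))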
Write a relational algebra expression to get the details of the salesperson who incurred the greatest travel
expenses among all travels made.
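Using the same assumed schemas:
TOTALS(Trip_id, Total) ← Trip_id ℑ SUM(Amount) (EXPENSE)
MAX_TOTAL(M) ← ℑ MAX(Total) (TOTALS)
RESULT ← SALESPERSON ⋈ π Ssn (TRIP ⋈ σ Total = M (TOTALS × MAX_TOTAL))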
e) DATE - Stores year, month, day, hour, minute and second values
f) BLOB - Binary Large Object (the data type in Oracle for storing binary files like executables, images
etc.)
15 a. Illustrate structure of B-Tree and B+ Tree and explain how they are different?
The B-tree has additional constraints that ensure that the tree is always balanced.
A B-tree of order p, when used as an access structure on a key field to search for records
in a data file, can be defined as follows:
1. Each internal node in the B-tree is of the form
<P1, <K1, Pr1>, P2, <K2, Pr2>, ..., <Kq–1, Prq–1>,
Pq> where q ≤ p. Each Pi is a tree pointer (a pointer to another node in the B-tree).
Each Pri is a data pointer (a pointer to the record whose search key field value is
equal to Ki).
2. Within each node, K1 < K2 < ... < Kq−1.
3. For all search key field values X in the subtree pointed at by Pi, we have: Ki–1 < X <
Ki for 1 < i < q; X < Ki for i= 1; and Ki–1 < X for i = q.
4. Each node has at most p tree pointers.
5. Each node, except the root and leaf nodes, has at least ⌈p/2⌉ tree pointers. The root
node has at least two tree pointers unless it is the only node in the tree.
6. A node with q tree pointers, q ≤ p, has q – 1 search key field values (and hence has q –
1 data pointers).
7. All leaf nodes are at the same level. Leaf nodes have the same structure as internal nodes
except that all of their tree pointers Pi are NULL
B+tree
• The structure of the internal nodes of a B+ tree of order p is as follows:
Each internal node is of the form <P1, K1, P2, K2, ..., Pq–1, Kq–1, Pq> where q ≤ p and
each Pi is a tree pointer.
Difference between B-tree and B+ tree
• In B+ trees, search keys can be repeated, but this is not the case for B-trees
• B+ trees allow satellite data to be stored in leaf nodes only, whereas B-trees store data in
both leaf and internal nodes
• In B+ trees, storing data only at the leaf nodes makes the search more efficient, since we
can store more keys in the internal nodes
• this means we need to access fewer nodes
• Deleting data from a B+tree is easier and less time consuming because we only need to
remove data from leaf nodes
• Leaf nodes in a B+tree are linked together making range search operations efficient and
quick
• Finally, although B-trees are useful, B+ trees are more popular; in practice, the vast
majority of database management systems use B+ trees for indexing.
• This is because the B+tree holds no data in the internal nodes.
• This maximizes the number of keys stored in a node thereby minimizing the number
of levels needed in a tree.
• Smaller tree depth invariably means faster search.
15.b What are the different types of single-level ordered indices? Explain.
Primary Indexes
• Primary index is defined on an ordered data file whose records are of fixed
length with two fields.
• The first field is of the same data type as the ordering key field(primary key) of
the data file.
• The second field is a pointer to a disk block ( a block address).
• There is one index entry (or index record) in the index file for each block in the
data file.
• The two field values of index entry i are referred as
<K(i),P(i)>
• Primary index is a nondense (sparse) index, since it includes one entry for each disk
block of the data file, keyed on the block anchor (the first record of the block).
• Thus the index file for a primary index occupies much less space than the data file,
for two reasons (a worked illustration follows this list):
1. There are fewer index entries than the records in the data file.
2. Each index entry is smaller in size than a data record because it has only two
fields. So more index entries can fit in one block than data records.
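As a worked illustration (numbers assumed): suppose a data file has 30,000 records stored 10 per block, i.e., 3,000 blocks. A binary search on the data file itself costs about ⌈log2 3000⌉ = 12 block accesses. The primary index holds one entry per block (3,000 entries); if 100 entries fit in one index block, the index occupies only 30 blocks, so a binary search on the index costs ⌈log2 30⌉ = 5 accesses, plus one more to fetch the data block: 6 accesses in total.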
Clustering Index
• A clustering index is defined on a data file that is ordered on a non-key field, unlike
a primary index, which requires that the ordering field of the data file have a distinct
value for each record.
• Includes one index entry for each distinct value of the field;
• The index entry points to the first data block that contains records with that
field value.
Secondary Index
• A secondary index provides a secondary means of accessing a file for which some
primary access already exists.
• The secondary index may be on a field which is a candidate key and has a unique
value in every record, or a non-key with duplicate values.
• The index is an ordered file with two fields.
• The first field is of the same data type as some non-ordering field of the data file
that is an indexing field.
• The second field is either a block pointer or a record pointer.
• There can be many secondary indexes (and hence, indexing fields) for the same file.
• Includes one entry for each record in the data file; hence, it is a dense index.
Differentiate between static hashing and dynamic hashing.
In static hashing, the number of buckets is fixed: the hash function always maps a search
key into the same fixed set of bucket addresses, and the buckets do not grow or shrink with
the size of the data.
Linear Probing − When the hash function generates an address at which data is already stored,
the next free bucket is allocated to it. This collision-resolution mechanism is called open
addressing.
The problem with static hashing is that it does not expand or shrink dynamically as the size of
the database grows or shrinks. Dynamic hashing provides a mechanism in which data buckets
are added and removed dynamically and on-demand. Dynamic hashing is also known
as Extendible Hashing.
Main features of Extendible Hashing: The main features in this hashing technique are:
Directories: The directories store addresses of the buckets in pointers. An id is assigned to
each directory which may change each time when Directory Expansion takes place.
Buckets: They store the hashed keys. Directories point to buckets. A bucket may have more
than one pointer to it if its local depth is less than the global depth.
Global Depth: It is associated with the directories. It denotes the number of bits used by
the hash function to categorize the keys. Global Depth = number of bits in the directory
id.
Local Depth: It is the same kind of quantity as the Global Depth, except that it is
associated with the buckets rather than the directories. The local depth, in combination
with the global depth, is used to decide the action to be performed in case an overflow
occurs. Local Depth is always less than or equal to the Global Depth.
Bucket Splitting: When the number of elements in a bucket exceeds a particular size, then the
bucket is split into two parts.
Directory Expansion: Directory Expansion Takes place when a bucket overflows. Directory
Expansion is performed when the local depth of the overflowing bucket is equal to the global
depth.
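For example, with a global depth of 2 the directory has 2² = 4 entries (00, 01, 10, 11). If a bucket whose local depth is also 2 overflows, the directory doubles (global depth becomes 3, giving 8 entries) and the overflowing bucket is split; if its local depth had been only 1, just the bucket would be split and its local depth incremented, with no directory expansion.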
The term Armstrong axioms refer to the sound and complete set of inference rules or axioms,
introduced by William W. Armstrong, that is used to test the logical implication of functional
dependencies
If F is a set of functional dependencies, then the closure of F, denoted as F+, is the set
of all functional dependencies logically implied by F.
Armstrong’s Axioms are a set of rules, that when applied repeatedly, generates a closure of
functional dependencies.
• Axiom of reflexivity
If A is a set of attributes and B is a subset of A (B ⊆ A), then A → B holds.
Such functional dependencies are called trivial.
• Axiom of augmentation
If A → B holds and Y is any set of attributes, then AY → BY also holds.
• Axiom of transitivity
Same as the transitive rule in algebra, if A→B holds and B→C holds, then A→C also
holds.
ii. Write an algorithm to compute the attribute closure of a set of attributes (X) under a
set of functional dependencies (F).
Input: A set F of FDs on a relation schema R, and a set of attributes X, which is a subset of R.
Algorithm (the standard attribute-closure algorithm):
X+ := X;
repeat
    old_X+ := X+;
    for each functional dependency Y → Z in F do
        if Y ⊆ X+ then X+ := X+ ∪ Z;
until (X+ = old_X+);
Uses of the attribute closure:
• To check whether an attribute (or set of attributes) forms a key of the relation.
• To find all the functional dependencies that hold in a relation.
• As an alternative way to find the closure of a set of FDs (F+).
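For example, given F = {A→B, B→C} on R(A, B, C): X = {A} gives X+ = {A}, then A→B adds B, then B→C adds C, so A+ = {A, B, C}. Since A+ contains all attributes of R, A is a key of R.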
17.b Explain the difference between BCNF and 3NF with an example
A relation is in the third normal form if there is no transitive dependency for non-prime
attributes as well as it is in the second normal form.
• It should be in 2NF
• No transitive Dependency
In other words 3NF can be explained like this: A table is in 3NF if it is in 2NF and for each
functional dependency X-> Y at least one of the following conditions hold:
▫ X is a super key of table
▫ Y is a prime attribute of table
• A table complies with BCNF if it is in 3NF and for every functional dependency X->Y,
X should be the super key of the table.
• That is, every relation in BCNF is also in 3NF; however, a relation in 3NF is not
necessarily in BCNF
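For example, consider R(Student, Course, Instructor) with the FDs {Student, Course} → Instructor and Instructor → Course. R is in 3NF: in Instructor → Course, the right-hand side Course is a prime attribute (it is part of the candidate key {Student, Course}). But R is not in BCNF, because Instructor is not a super key of R.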
18. a Consider the relation R = {A, B, C, D, E, F, G, H} and the set of functional dependencies
F = {A→DE, B→F, AB→C, C→GH, G→H}. What is the key for R? Decompose R into 2NF and
then 3NF relations.
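A worked solution (sketch):
Key: A+ = {A, D, E} and B+ = {B, F}, so neither A nor B alone is a key; AB+ = {A, B, C, D, E, F, G, H} = R, so AB is the key of R.
2NF: A→DE and B→F are partial dependencies on the key AB, so they are split out:
R1(A, D, E) with A→DE; R2(B, F) with B→F; R3(A, B, C, G, H) with AB→C, C→GH, G→H.
3NF: in R3, C→GH and G→H are transitive dependencies (neither C nor G is a super key of R3), so R3 is split further:
R31(A, B, C) with AB→C; R32(C, G) with C→G; R33(G, H) with G→H.
Final 3NF relations: R1(A, D, E), R2(B, F), R31(A, B, C), R32(C, G), R33(G, H).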
18.b What is the lossless join property of decomposition? Why is it important?
A decomposition D = {R1, R2, ..., Rm} of R has the lossless(nonadditive) join property
with respect to the set of dependencies F on R if, for every relation state r of R that
satisfies F, the following holds, where * is the NATURAL JOIN of all the relations in
D: *(πR1(r), ..., πRm(r)) = r.
The nonadditive join (lossless join) property guarantees that the spurious tuple
generation problem does not occur with respect to the relation schemas created after
decomposition.
After PROJECT (π) and NATURAL JOIN (*) operations are applied on R1 and R2, no
spurious tuples are there in the result.
Recovery from transaction failures usually means that the database is restored to the
most recent consistent state just before the time of failure.
To do this, the system must keep information about the changes that were applied to
data items by the various transactions.
This information is typically kept in the system log.
The log is a sequence of records which maintains a record of the actions performed by the
transactions.
It is important that the logs are written prior to the actual modification and stored on
stable storage media, which is failsafe.
When a transaction enters the system and starts execution, it writes a log about it
as <Tn, Start>.
When the transaction modifies an item X, it writes a log record as follows:
<Tn, X, V1, V2>
It reads: Tn has changed the value of X from V1 to V2.
After a system crash has occurred, the system consults the log to determine which
transactions need to be redone, and which need to be undone so as to ensure atomicity.
Transaction Ti needs to be undone if the log contains the record <Ti,start>, but
does not contain either the record <Ti, commit> or the record <Ti,abort>.
Transaction Ti needs to be redone if the log contains both the record <Ti, start> and
the record <Ti, commit>.
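For example, suppose the log at the time of a crash contains <T1, start>, <T1, X, 100, 200>, <T1, commit>, <T2, start>, <T2, Y, 50, 70>. Applying the rules above, T1 is redone (its commit record is present), while T2 is undone (it started but neither committed nor aborted), restoring Y to its old value 50.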
Deferred database modification – All logs are written on to the stable storage and the
database is updated when a transaction commits.
The deferred modification techniques do not physically update the database on disk
until after a transaction reaches its commit point ; then the updates are recorded in the
database.
Recovery mechanisms can help roll back or undo the effects of such failed transactions to
ensure data consistency
Serial Schedule
The serial schedule is a type of schedule where one transaction is executed completely before
starting another transaction. In the serial schedule, when the first transaction completes its
cycle, then the next transaction is executed.
Concurrent Schedule
If interleaving of operations is allowed, the result is a non-serial (concurrent) schedule.
It contains many possible orders in which the system can execute the individual operations of
the transactions.
Conflict Serializable
A schedule is conflict serializable if it can be transformed into a serial schedule by
swapping adjacent non-conflicting operations; equivalently, its precedence graph is acyclic.
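For example, the schedule S = r1(A) w1(A) r2(A) w2(A) r1(B) w1(B) r2(B) w2(B) is conflict serializable: in every conflicting pair (on A and on B), T1's operation precedes T2's, so the precedence graph has the single edge T1 → T2 and S is equivalent to the serial schedule T1 followed by T2.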
Transactions should possess several properties, often called the ACID properties
They should be enforced by the concurrency control and recovery methods of the
DBMS.
The following are the ACID properties:
1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity: A transaction is an atomic unit of processing; it should either be
performed in its entirety or not performed at all.
All or Nothing
Consistency preservation: A transaction should be consistency preserving, meaning
that if it is completely executed from beginning to end without interference from other
transactions, it should take the database from one consistent state to another.
Isolation: A transaction should appear as though it is being executed in isolation from
other transactions, even though many transactions are executing concurrently.
That is, the execution of a transaction should not be interfered with by any
other transactions executing concurrently.
Durability or permanency: The changes applied to the database by a committed
transaction must persist in the database.
These changes must not be lost because of any failure.