Assosa University
College of Computing and Informatics
Department of Computer Science
ADVANCED DATABASE SYSTEMS
NOTES, MULTIPLE CHOICE QUESTIONS AND ANSWER (MCQA)
Prepared by: Gebreigziabher Abadi (M.Sc in Computer Science)
[email protected]Advanced Database Systems Notes, MCQs and Answers with Explanation, 16 Mar, 2025
MARCH 20, 2025
Advanced Database Systems Notes, MCQs & Answers with Explanation
• Chapter One: Object-Oriented Database Overview
1. Database Definition:
o A database is a collection of data organized in a way that allows for efficient storage, access,
and management.
o It ensures data integrity and protects data from harm.
2. Database Management Systems (DBMS):
o DBMS is a system designed to simplify the management of databases.
3. Object-Oriented Databases (OODBs):
o OODBs are a type of DBMS that stores and manages data using object-oriented principles.
o Unlike traditional relational databases (which use tables and rows), OODBs organize data
as objects, similar to how objects are structured in object-oriented programming (OOP).
4. Advantages of OODBs:
o Enhanced support for complex data modeling, inheritance, and encapsulation.
o Seamless integration between the application's data model and the database structure.
Differences Between OODBs and Relational Databases
1. Data Organization:
o OODBs: Data is stored as objects, which encapsulate both data and behavior.
o Relational Databases: Data is stored in tables with rows and columns, following a structured
schema.
2. Data Modeling:
o OODBs: Support complex data modeling, including relationships, inheritance,
and polymorphism. Objects can be linked directly, reflecting real-world relationships.
o Relational Databases: Data is modeled using tables and defined relationships (e.g., primary
and foreign keys).
3. Query Language:
o OODBs: Use object-oriented query languages (e.g., OQL) or extensions to SQL for querying
and manipulating objects directly.
o Relational Databases: Use SQL for querying and manipulating structured data in tables.
4. Flexibility:
o OODBs: Offer flexibility in data representation and evolving data models. Support dynamic
changes to object structure and behavior.
o Relational Databases: Have a fixed schema, requiring schema modifications for changes in
the data model.
5. Object Persistence:
o OODBs: Provide built-in object persistence, allowing objects to exist beyond the scope of a
program. Objects can be stored, retrieved, and modified without losing their state.
1
o Relational Databases: Rely on Object-Relational Mapping (ORM) frameworks to map
objects to relational tables for persistence.
6. Application Focus:
o OODBs: Well-suited for applications with complex and evolving data models, such
as CAD, multimedia, and scientific applications.
o Relational Databases: Traditionally used in applications with structured and well-defined
data, such as financial systems and business applications.
Features of Object-Oriented Databases
1. Encapsulation and Abstraction:
o OODBs encapsulate data and behavior within objects, adhering to the principles
of encapsulation and abstraction in OOP.
o Objects encapsulate both data and related methods, making it easier to maintain and
manipulate data.
2. Object Persistence:
o OODBs offer built-in object persistence, ensuring that objects can exist beyond the lifetime
of the application.
o Objects can be stored, retrieved, and modified, preserving their state and relationships over
time.
3. Seamless Integration with OOP:
o OODBs align closely with object-oriented programming (OOP) concepts, enabling seamless
integration between the application's data model and the database structure.
o This reduces the impedance mismatch between the application code and the database,
improving development productivity.
4. Support for Complex Queries:
o OODBs support querying complex object structures, allowing queries that traverse and
retrieve objects based on their relationships and attributes.
o Object-oriented query languages, such as OQL, enable expressive and powerful queries on
object-oriented data.
Object-Oriented Concepts
1. Main Claim:
o OO databases aim to maintain a direct correspondence between real-world objects and
database objects, ensuring that objects do not lose their integrity and identity.
2. Object Definition:
o An object has two components:
▪ State (value): Represents the current value of the object.
▪ Behavior (operations): Represents the operations that can be performed on the object.
2
o Similar to a program variable in a programming language, but with a complex data
structure and specific operations defined by the programmer.
3. Object Identity, Structure, and Type Constructors:
o Unique Identity:
▪ Each object in an OO database has a unique identity, typically implemented via a system-
generated object identifier (OID).
▪ The OID is immutable (cannot change), preserving the identity of the real-world object
being represented.
o Object Structure:
▪ The state of a complex object is constructed using type constructors.
▪ An object can be viewed as a triple (i, c, v):
▪ i: Unique object identifier (OID).
▪ c: Type constructor (indicates how the object state is constructed).
▪ v: Object state (current value).
4. Type Constructors:
o Basic Constructors:
▪ Atom: Represents atomic values (e.g., integer, real number, Boolean, string).
▪ Tuple: Represents a tuple form of state value.
▪ Set: Represents a set of object identifiers.
o Commonly Used Constructors:
▪ List: Represents an ordered list of OIDs of objects of the same type.
▪ Array: Represents a single-dimensional array of object identifiers.
▪ Bag: Similar to a set but allows duplicate elements.
o Collection Types:
▪ Set, List, Array, and Bag are called collection types (or bulk types).
▪ The state of the object is a collection of objects, which may be unordered (e.g., set, bag)
or ordered (e.g., list, array).
Summary of OODBMS Features
1. High-Level Query Language:
o OODBMS provides a high-level query language with query optimization capabilities.
2. Support for Persistence, Transactions, and Concurrency:
o OODBMS supports persistence, atomic transactions, concurrency control, and recovery
control.
3. Complex Object Storage and Retrieval:
o OODBMS supports complex object storage, indexes, and access methods for fast and
efficient retrieval.
4. OODBMS Definition:
o OODBMS = object-oriented system + (1) high-level query language + (2) support for
persistence, transactions, and concurrency + (3) support for complex object storage and
retrieval.
3
• Chapter 2: Query Processing & Optimization
1. Overview:
o In declarative languages like SQL, users specify what data is required rather than how it is
retrieved.
o This relieves users from determining the best execution strategy, making the language more
universally usable.
2. Query Processing:
o Query processing involves parsing, validating, optimizing, and executing a query.
o The goal is to transform a high-level SQL query into a low-level execution
strategy (implementing relational algebra) and execute it to retrieve the required data.
3. Query Optimization:
o Query optimization is the process of choosing the most efficient execution strategy for
processing a query.
o The aim is to minimize resource usage and reduce total execution time.
o Optimization depends on database statistics (e.g., relations, attributes, indexes) to evaluate
different execution options.
Phases of Query Processing
1. Dynamic vs. Static Optimization:
o Dynamic Optimization:
▪ Query is parsed, validated, and optimized every time it is run.
▪ Advantage: Uses up-to-date information for optimization.
▪ Disadvantage: Performance is affected due to repeated parsing and optimization.
o Static Optimization:
▪ Query is parsed, validated, and optimized once.
▪ Advantage: Removes runtime overhead and allows more time for evaluating execution
strategies.
▪ Disadvantage: The chosen execution strategy may no longer be optimal when the query is
run.
Query Decomposition
1. Query Decomposition:
o The first phase of query processing.
o Transforms a high-level query into a relational algebra query.
o Ensures the query is syntactically and semantically correct.
o Stages include:
▪ Analysis
▪ Normalization
4
▪ Semantic Analysis
▪ Simplification
▪ Query Restructuring
2. Analysis:
o The query is lexically and syntactically analyzed.
o Verifies that relations and attributes specified in the query are defined in the system catalog.
o Example: A query with an undefined attribute or incompatible data type will be rejected.
3. Normalization:
o Converts the query into a normalized form for easier manipulation.
o Two common forms:
▪Conjunctive Normal Form (CNF): A sequence of conjuncts connected by AND operators.
▪ Disjunctive Normal Form (DNF): A sequence of disjuncts connected by OR operators.
4. Semantic Analysis:
o Rejects queries that are incorrectly formulated or contradictory.
o Example: A query with a predicate like (position = 'Manager' AND position = 'Assistant') is
contradictory.
5. Simplification:
o Detects redundant qualifications, eliminates common subexpressions, and transforms the
query into a more efficient form.
o Considers access restrictions, view definitions, and integrity constraints.
6. Query Restructuring:
o The final stage of query decomposition.
o Restructures the query to provide a more efficient implementation using transformation
rules.
Query Optimization
1. Basic Issues in Query Optimization:
o How to use available indexes.
o How to use memory for sorting and accumulating information.
o How to determine the order of joins.
2. Heuristic Query Optimization:
o Uses heuristic rules to modify the internal representation of a query (e.g., query tree or query
graph).
o Key heuristic: Apply SELECT and PROJECT operations before JOIN or other binary operations
to reduce file size.
3. Transformation Rules for Relational Algebra:
o Conjunctive Selection: Cascades into individual selection operations.
o Commutative Selection: The order of selection operations can be swapped.
o Projection: Only the last projection in a sequence is required.
o Commutative Selection and Projection: If the predicate involves only attributes in the
projection list, selection and projection can commute.
5
o Commutative Selection and Join: If the selection predicate involves only attributes of one
relation, selection and join can commute.
o Associative Join: Join operations are associative.
Cost Estimation for Relational Algebra Operations
1. Database Statistics:
o Statistics include:
▪ Number of tuples (nTuples).
▪ Block factor (bFactor).
▪ Number of blocks (nBlocks).
▪ Number of distinct values for each attribute (nDistinctA).
▪ Selection cardinality (SCA).
2. Selection Operation:
o Different strategies for selection depend on the file structure and indexing:
▪ Linear Search: Full table scan; cost depends on the number of blocks.
▪ Binary Search: For ordered files; cost is logarithmic.
▪ Equality on Hash Key: Cost is 1 if no overflow.
▪ Equality on Primary Key: Uses primary index; cost is low.
▪ Inequality on Primary Key: Uses index to locate tuples.
▪ Equality on Clustering Index: Uses clustering index; cost depends on the number of
matching tuples.
▪ Equality on Non-Clustering Index: Uses non-clustering index; cost depends on the
number of matching tuples.
▪ Inequality on B+ Tree Index: Uses B+ tree index; cost depends on the range of values.
3. Join Operation:
o Join is the most time-consuming operation.
o Strategies include:
▪ Block Nested Loop Join: Iterates over tuples in one relation and matches them with
tuples in another relation.
▪ Indexed Nested Loop Join: Uses an index on the join attribute of the inner relation.
▪ Sort-Merge Join: Sorts both relations and merges them.
▪ Hash Join: Partitions relations using a hash function and joins matching partitions.
4. Cost Estimation for Join:
o Block Nested Loop Join: Cost depends on the number of blocks in both relations.
o Indexed Nested Loop Join: Cost depends on the number of tuples in the outer relation and
the index lookup cost.
o Sort-Merge Join: Cost includes sorting and merging both relations.
o Hash Join: Cost includes partitioning and joining the relations.
6
Summary of Query Optimization
1. Heuristic Rules:
o Perform selection operations as early as possible.
o Combine Cartesian product with a subsequent selection into a join.
o Use associativity of binary operations to rearrange leaf nodes.
o Perform projection operations as early as possible.
o Compute common expressions once.
2. Cost Estimation:
o The success of query optimization depends on accurate and up-to-date database statistics.
o The dominant cost in query processing is disk access, including:
▪ Searching for data blocks.
▪ Reading and writing data blocks.
▪ Storing intermediate files.
▪ CPU usage and buffering.
• MCQs with Step-by-Step Query Optimization Calculations - Staff
o MCQs that cover Selection, Projection, and Join Operations using the formulas and strategies
for heuristic cost estimation.
Selection Operation MCQs
1. Linear Search on Unordered File
Question:
Given a relation Staff with nTuples(Staff) = 3000 and nBlocks(Staff) = 100, what is the cost of a linear
search for a non-key attribute?
• A) 50 blocks
• B) 100 blocks
• C) 150 blocks
• D) 200 blocks
Answer: B) 100 blocks
Explanation:
For a linear search on an unordered file with no index, the entire file must be scanned. The cost is
equal to the number of blocks in the file.
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑆𝑡𝑎𝑓𝑓) = 100 blocks
2. Binary Search on Ordered File
Question:
Given a relation Staff with nTuples(Staff) = 3000 and nBlocks(Staff) = 100, ordered on the primary
key staffNo, what is the cost of a binary search for the predicate staffNo = 'SG5'?
• A) 7 blocks
• B) 10 blocks
7
• C) 50 blocks
• D) 100 blocks
Answer: A) 7 blocks
Explanation:
For a binary search on an ordered file, the cost is logarithmic to the number of blocks.
Cost = ⌈log2 (𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑆𝑡𝑎𝑓𝑓))⌉ = ⌈log2 (100)⌉ = 7 blocks
3. Equality Condition on Hash Key
Question:
Given a relation Staff with a hash index on staffNo (primary key) and no overflow, what is the cost
of a Selection operation with the predicate staffNo = 'SG5'?
• A) 1 block
• B) 2 blocks
• C) 50 blocks
• D) 100 blocks
Answer: A) 1 block
Explanation:
For a hash index with no overflow, the cost is 1 block because the hash function directly points to
the block containing the tuple.
Cost = 1 block
4. Equality Condition on Clustering Index
Question:
Given a relation Staff with nTuples(Staff) = 3000, nBlocks(Staff) = 100, and a clustering index
on branchNo with 2 levels and SCbranchNo(Staff) = 6, what is the cost of a selection operation with
the predicate branchNo = 'B003'?
• A) 2 blocks
• B) 3 blocks
• C) 6 blocks
• D) 10 blocks
Answer: B) 3 blocks
Explanation:
For a clustering index, the cost is calculated as:
𝑆𝐶𝑏𝑟𝑎𝑛𝑐ℎ𝑁𝑜(𝑆𝑡𝑎𝑓𝑓) 6
Cost = nLevels(I) + ⌈ ⌉ = 2 + ⌈ ⌉ = 3 blocks
𝑏𝐹𝑎𝑐𝑡𝑜𝑟(𝑆𝑡𝑎𝑓𝑓) 30
8
5. Inequality Condition on B+ Tree Index
Question:
Given a relation Staff with nTuples(Staff) = 3000, nBlocks(Staff) = 100, and a B+ tree index
on salary with 2 levels and 50 leaf blocks, what is the cost of a selection operation with the
predicate salary > 20000?
• A) 2 blocks
• B) 50 blocks
• C) 1527 blocks
• D) 3000 blocks
Answer: C) 1527 blocks
Explanation:
For a B+ tree index, the cost of an inequality condition is calculated as:
nLfBlocks(I) 𝑛𝑇𝑢𝑝𝑙𝑒𝑠(𝑆𝑡𝑎𝑓𝑓)
Cost = nLevels(I) + ⌈ ⌉+⌈ ⌉ = 2 + 25 + 1500 = 1527 blocks
2 2
Projection Operation MCQs
6. Projection on Non-Indexed Attribute
Question:
Given a relation Staff with nTuples(Staff) = 3000 and nBlocks(Staff) = 100, what is the cost of a
projection operation on FName?
• A) 50 blocks
• B) 100 blocks
• C) 150 blocks
• D) 200 blocks
Answer: B) 100 blocks
Explanation:
The cost of a projection operation is equal to the number of blocks in the relation, as the entire
relation must be scanned.
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑆𝑡𝑎𝑓𝑓) = 100 blocks
Join Operation MCQs
7. Block Nested Loop Join (Join 1: Staff ∞ staffNo PropertyForRent)
Question:
Given two relations Staff with nBlocks(Staff) = 200 and PropertyForRent with
nBlocks(PropertyForRent) = 2000, and 100 buffer blocks, what is the cost of a Block Nested Loop Join
for Staff ∞ staffNo PropertyForRent?
• A) 200 blocks
• B) 2000 blocks
• C) 4000 blocks
• D) 4282 blocks
9
Answer: D) 4282 blocks
Explanation:
The cost of a Block Nested Loop Join is calculated as:
𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑆𝑡𝑎𝑓𝑓)
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑆𝑡𝑎𝑓𝑓) + (𝑛𝐵𝑙 𝑜𝑐 𝑘𝑠(𝑃 𝑟𝑜𝑝𝑒𝑟𝑡𝑦𝐹 𝑜𝑟𝑅 𝑒𝑛𝑡)×⌈ ⌉)
𝑛𝐵𝑢𝑓𝑓𝑒𝑟 − 2
200
Cost = 200 + (2000×⌈ 98 ⌉) = 4282 blocks
8. Block Nested Loop Join (Join 2: Branch ∞ branchNo PropertyForRent)
Question:
Given two relations Branch with nBlocks(Branch) = 10 and PropertyForRent with nBlocks(PropertyForRent) =
2000, and 100 buffer blocks, what is the initial cost of a Block Nested Loop Join for Branch ∞ branchNo
PropertyForRent?
• A) 10 blocks
• B) 2000 blocks
• C) 2010 blocks
• D) 20,010 blocks
Answer: D) 20,010 blocks
Explanation:
The initial cost of a Block Nested Loop Join is calculated as:
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝐵𝑟𝑎𝑛𝑐ℎ) + (𝑛𝐵𝑙 𝑜𝑐 𝑘 𝑠(𝐵 𝑟𝑎 𝑛𝑐 ℎ)×𝑛𝐵𝑙 𝑜𝑐 𝑘 𝑠(𝑃𝑟𝑜𝑝𝑒𝑟𝑡𝑦𝐹 𝑜𝑟𝑅𝑒𝑛𝑡))
Cost = 10 + (10 × 2000) = 20,010 blocks
9. Block Nested Loop Join (Join 2: Branch ∞ branchNo PropertyForRent) - Optimized Cost
Question:
Given two relations Branch with nBlocks(Branch) = 10 and PropertyForRent with nBlocks(PropertyForRent)
= 2000, and 100 buffer blocks, what is the optimized cost of a Block Nested Loop Join for Branch ∞
branchNo PropertyForRent if all blocks of Branch fit into the buffer?
• A) 10 blocks
• B) 2000 blocks
• C) 2010 blocks
• D) 20,010 blocks
Answer: C) 2010 blocks
Explanation:
If all blocks of Branch fit into the buffer, the optimized cost is:
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝐵𝑟𝑎𝑛𝑐ℎ) + 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑃𝑟𝑜𝑝𝑒𝑟𝑡𝑦𝐹𝑜𝑟𝑅𝑒𝑛𝑡)
Cost = 10 + 2000 = 2010 blocks
10
10. Indexed Nested Loop Join (Join 1: Staff ∞ staffNo PropertyForRent)
Question:
Given two relations Staff with nBlocks(Staff) = 200 and PropertyForRent with
nBlocks(PropertyForRent) = 2000, and an index on staffNo (primary key) in Staff, what is the cost of
an Indexed Nested Loop Join for Staff ∞ staffNo PropertyForRent?
• A) 200 blocks
• B) 2000 blocks
• C) 6200 blocks
• D) 8200 blocks
Answer: C) 6200 blocks
Explanation:
The cost of an Indexed Nested Loop Join is calculated as:
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝑆𝑡𝑎𝑓𝑓) + 𝑛𝑇𝑢𝑝𝑙𝑒𝑠(𝑆𝑡𝑎𝑓𝑓)
Cost = 200 + 6000 = 6200 blocks
11. Indexed Nested Loop Join (Join 2: Branch ∞ branchNo PropertyForRent)
Question:
Given two relations Branch with nBlocks(Branch) = 10 and PropertyForRent with
nBlocks(PropertyForRent) = 2000, and an index on branchNo (primary key) in Branch, what is the
cost of an Indexed Nested Loop Join for Branch ∞ branchNo PropertyForRent?
• A) 10 blocks
• B) 500 blocks
• C) 510 blocks
• D) 2000 blocks
Answer: C) 510 blocks
Explanation:
The cost of an Indexed Nested Loop Join is calculated as:
Cost = 𝑛𝐵𝑙𝑜𝑐𝑘𝑠(𝐵𝑟𝑎𝑛𝑐ℎ) + 𝑛𝑇𝑢𝑝𝑙𝑒𝑠(𝐵𝑟𝑎𝑛𝑐ℎ)
Cost = 10 + 500 = 510 blocks
Key Points
• Linear Search and Binary Search are used for selection operations.
• Clustering Index and B+ Tree Index are used for indexed selection.
• Block Nested Loop Join, Indexed Nested Loop Join, Sort-Merge Join, and Hash Join are used
for join operations.
11