0% found this document useful (0 votes)
64 views15 pages

Database Design For Ordbms

Uploaded by

Sruthi Rajendran
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
64 views15 pages

Database Design For Ordbms

Uploaded by

Sruthi Rajendran
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 15

DATABASE DESIGN FOR ORDBMS

Consider a scenario where several space probes continuously record videos, and the probe’s
location is periodically recorded along with the video stream. The data associated with each
probe includes:

1. A unique probe ID.


2. A video stream.
3. A sequence of (time, location) pairs representing the probe’s location over time.

RDBMS Database Design

In an RDBMS, the design must accommodate the limitations of relational tables and
primitive data types. A possible schema might look like this:

CREATE TABLE Probes_Loc ( pid INTEGER, time TIMESTAMP,lat REAL, long REAL,

PRIMARY KEY (pid, time));

CREATE TABLE Probes_Video ( pid INTEGER PRIMARY KEY,camera STRING,video


BLOB);

Drawbacks:

1. Video as BLOB: Videos are stored as BLOBs, requiring external application code to
manipulate video data.
2. Dispersed Location Information: Location data is spread across multiple tuples,
making it cumbersome to query sequences.
3. Separate Tables for Video and Location: Queries requiring both video and location
data necessitate joins, which can be inefficient.

-- Display video segment recorded between specified times for probe 10

SELECT P.videoFROM Probes_Video PWHERE P.pid = 10; (Additional application code


needed to extract the required segment)

Find earliest recording time for each probeSELECT L.pid, MIN(L.time)FROM Probes_Loc
L
GROUP BY L.pid;

ORDBMS Database Design

An ORDBMS allows more natural and efficient storage using ADTs and collection types.

Schema:
Note

Advantages:

1. Video as ADT: The video can be stored as an ADT (mpeg_stream), allowing the
definition of methods like display(), which can efficiently extract and display
segments.
2. Unified Location Sequence: Location sequences are stored as a single collection
(location_seq), making it easier to query and manipulate.
3. Elimination of Joins: Queries involving both video and location data do not require
joins, improving performance.

Object Identity in Database Design: Structured Types vs. Reference Types

For example, consider a structured typemy_theater tuple(ina integer, nametext, address text,
phone text) and the reference typetheater ref(theater_t)

Object Identity (OID) is crucial in database design, especially for managing large or
complex objects.

Deletion: Structured types are self-contained objects that are not affected by the deletion of
other objects. For example, an object of type my_theater with attributes like id, name,
address, and phone remains unchanged even if other objects or tables are deleted. On the
other hand, reference types depend on other objects. If an object referenced by a theater
reference type is deleted, the reference might become null or invalid. Therefore, while
structured types are independent and unaffected by deletions elsewhere, reference types are
dependent and can be impacted by the deletion of the objects they reference.

Updating: objects of structured types only change value if they are directly updated; they do
not automatically reflect changes made to other objects Updating the referenced object
propagates the change to all referencing objects. Therefore, while structured types are
independent and require direct updates, reference types are dependent and automatically
reflect updates made to their referenced objects.

 Sharing vs. Copying:

 Structured Types: Multiple independent copies; each must be updated separately.


 Reference Types: Single shared object; updates are reflected everywhere.

 Storage Overhead:

 Structured Types: Higher storage cost due to multiple copies of large objects.
 Reference Types: Lower storage cost; single instance referenced multiple times.

 Clustering:

 Structured Types: Parts stored together; efficient if accessed as a whole.


 Reference Types: Parts may be scattered; could be less efficient to assemble.

Object Identity versus Foreign Key

In databases, an Object Identifier (OID) is a unique way to identify any object stored in the
database. It can point to objects across different tables or within complex data structures. On
the other hand, Foreign Keys (FKs) are specific constraints that link data in one table to data
in another table, ensuring that relationships between tables are maintained correctly.

 OIDs: They offer flexibility because they can reference objects anywhere in the
database. However, they don't come with built-in checks to ensure data relationships
remain valid. It's up to the user to manage and update OID references correctly.
 In databases, Foreign Keys (FKs) are like rules that ensure data connections between
tables are correct. They link one table's column to another table's key (usually its
primary key). This linkage makes sure that values in one table's column match valid
values in another table's key column. For instance, if Table A has a FK pointing to
Table B's primary key, you can't delete a row from Table B if Table A still refers to it.
This prevents mistakes that could disrupt how data relates, keeping everything
organized and reliable in the database.
Extending the ER Model for ORDBMS Design

The traditional ER model, as discussed in Chapter 2, falls short when designing Object-
Relational Database Management Systems (ORDBMS). To address this, we need an
extended ER model that supports advanced features such as structured attributes (like sets,
lists, and arrays), object identifiers (OIDs) to uniquely identify entities, and the ability to
model entities with methods. Here’s how we can illustrate these enhancements using an
extended ER diagram:

1. Structured Attributes: In our extended ER diagram, entities can have attributes that
are structured types, such as lists containing tuples with multiple fields. For instance,
a Probes entity might have a structured attribute like location_sequence represented
as a list of tuples containing time, latitude, and longitude.
2. Abstract Data Types (ADTs) and Methods: Entities in an ORDBMS can also have
attributes that are abstract data types, indicated by dark ovals in our diagram. These
attributes can have their own methods associated with them. For example, the video
attribute of the Probes entity could be an ADT object with methods like play() or
analyze(), allowing manipulation of video data directly within the database.
3. Entity-Relationships with Object Identifiers (OIDs): To maintain relationships
between entities effectively, OIDs are used as unique identifiers for objects within the
database. These OIDs ensure each object can be referenced accurately, supporting
complex relationships and data structures.
4. Modeling Relationships: Alternatively, videos collected by probes can be modeled
as separate entities (Videos). The relationship between Probes and Videos can then
be defined using a relationship set, capturing the fact that each video belongs to one
probe and each probe may have multiple videos.
5. Weak Entity Sets: By making Videos a weak entity set dependent on Probes, we can
enforce referential integrity constraints. This setup ensures that when a Probes entity
is deleted, all associated Videos entities are also deleted automatically, maintaining
data consistency.
6. Nested Collections: Our extended ER model must also support nested collections.
For example, if we want to model a scenario where each probe has a set of location
sequences (entities containing time, latitude, and longitude data), our ER diagram
should allow us to define this relationship clearly.
Nested Collections

Nested collections allow storing multiple related values or records within a single attribute of
an entity in a database. Here’s a straightforward explanation:

 Definition: Nested collections are like containers within a database where you can
store several pieces of related information together in one place.
 Example: Suppose you have a Probes table that tracks different probes and their
location sequences over time

CREATE TABLE Probes ( pid INTEGER PRIMARY KEY, locseq JSONB --


JSONB type allows nested structures);

 In this example, locseq can hold JSON objects representing time-stamped location data
for each probe. This means each probe's location history is stored as a nested collection
within the database.

 Usage: Nested collections are handy when you need to manage complex data structures
within a single table, making it easier to organize and query related data efficiently.

Examples

Example 1: Probes and Location Sequences

Consider a scenario where we want to track the location sequences of probes over time:

Probes (pid: integer, locseq: listof (time: timestamp, lat: real, long: real))

Explanation:

 Probes is a table storing information about probes.


 locseq is a nested collection (a list) that contains tuples of time, lat (latitude),
and long (longitude).
 Each pid (probe ID) can have multiple entries in locseq, representing different
times and locations recorded by that probe.

 Teaching Assignments
 Imagine a database for managing teaching assignments where courses can be taught
by multiple teachers:

TeachingAssignments (course_id: integer, teachers: setof (teacher_id: integer, salary:


decimal))

Explanation:
 TeachingAssignments stores information about which teachers are assigned to
teach which courses.
 teachers is a nested collection (a set) that includes teacher_id and salary for each

teacher assigned to a course.


 This structure allows flexibility in assigning multiple teachers to a course with
different salaries.

  Flexibility: Nested collections are flexible because they allow entities to have a
variable number of related items (like multiple locations for a probe or multiple
teachers for a course).
  Complexity: Querying nested collections can be more complex compared to flat
structures. Retrieving specific elements from nested collections may require more
advanced SQL queries or additional processing in the application.
  Data Integrity: Managing updates and deletions within nested collections requires
careful handling to avoid inconsistencies or orphaned data. For example, when
deleting a probe, you must ensure that all related location sequences ( locseq) are also
properly managed or deleted.
 ORDBMS (Object-Relational Database Management Systems) offer expanded
functionality over traditional RDBMS by supporting complex data types and
enhanced storage capabilities. However, implementing these features presents several
significant challenges, which we explore below.

 ORDBMSIMPLEMENTATIONCHALLENGES

ORDBMS (Object-Relational Database Management Systems) offer expanded


functionality over traditional RDBMS by supporting complex data types and
enhanced storage capabilities. However, implementing these features presents several
significant challenges, which we explore below.

Storage and Access Methods

Storing Large ADT and Structured Type Objects

Object-relational databases must efficiently manage storage for various data types, including
user-defined Abstract Data Types (ADTs) and structured objects:

In ORDBMSs, large ADT (Abstract Data Type) objects such as BLOBs (Binary Large
Objects) often exceed the size of a single disk page. To manage these effectively:
1. Special Storage Areas: ORDBMSs designate separate storage areas specifically for
large objects like BLOBs. These areas are distinct from where regular data tuples are
stored.
2. Use of Pointers: Instead of storing large objects directly within tuples, ORDBMSs
use pointers that reference the location of the large object in its dedicated storage area.
This method ensures efficient management and retrieval of large objects without
affecting the performance of smaller, structured data.
3. Advantages: By segregating large objects into specialized storage and using pointers,
ORDBMSs optimize storage space and maintain fast access times.

Structured objects in ORDBMSs, like the 'stars' attribute in a movie database, can change in
size over time. For instance, as new actors gain fame, they need to be added to the 'stars' list.
This dynamic growth requires the database to be flexible in how it stores and manages these
objects on disk.

To handle this:

1. Dynamic Size Changes: Imagine the 'stars' list starts small but expands as new actors
are added. ORDBMSs use flexible storage methods to accommodate these changes
without slowing down the database.
2. Flexible Storage Mechanisms: These mechanisms ensure that as the 'stars' list
grows, the database can adjust where and how it stores this information on disk. This
flexibility helps maintain performance and efficiency.
3. Optimizing Performance: By using flexible storage, ORDBMSs optimize how they
store and retrieve data, ensuring that even as structured objects grow, the database
remains fast and responsive.

Indexing New Types


Indexes in databases help retrieve data quickly. Traditional databases use indexes like B+
trees and hash indexes, which handle basic search conditions well. However, object-relational
databases (ORDBMSs) need to support more complex data types and operations efficiently.

Different applications, like maps and multimedia, require specialized indexes for efficient
searching. ORDBMSs allow users to create custom indexes tailored to these specific needs.
This flexibility means experts can design indexes that match their data and queries perfectly.

One way to do this is by letting users implement their own indexes outside the database
system. This approach integrates external tools like web search engines but lacks built-in
protections for data integrity and recovery.

Another approach is to provide a flexible template called the Generalized Search Tree (GIST)
within the database. GIST is based on B+ trees and can support most custom indexes with
minimal extra coding. This internal method ensures the database handles multiple users
accessing data at once and can recover from failures reliably.

In essence, ORDBMSs balance the need for diverse indexing capabilities with robust data
management. They offer options for creating custom indexes for specific tasks while
maintaining reliability in database operations.
Query processing in the context of databases refers to the process of transforming a high-
level query language (such as SQL) into a series of low-level instructions that the database
management system (DBMS) can execute to retrieve the requested data efficiently.

Here's a clearer breakdown of what query processing involves:

1. Parsing and Analysis: The first step is to parse the SQL query to ensure its syntax is
correct and to create a query tree (parse tree). The DBMS analyzes the query to
understand its structure and semantics.
2. Query Optimization: Once the query is parsed and analyzed, the DBMS optimizer
determines the most efficient way to execute the query. Optimization aims to
minimize the query's response time and resource usage by considering factors like
available indexes, join methods, and access paths.
3. Plan Generation: Based on the optimization phase, the DBMS generates an
execution plan (query plan) that outlines how the query will be executed. This plan
typically includes the order of table accesses, join algorithms (e.g., nested loops, hash
join), and access methods (e.g., full table scan, index scan).
4. Query Execution: With the execution plan in place, the DBMS executes the query. It
retrieves data from the underlying tables and applies any filtering, sorting, or
aggregation specified in the query.
5. Result Formation: As the query executes, the DBMS formats the retrieved data into
the result format specified by the query (e.g., rows and columns for a SELECT
query).
6. Query Completion: Finally, once the query has been executed and the result is
formed, the DBMS returns the result to the application or user who initiated the query.

User-Defined Aggregation Functions

In ORDBMS (Object-Relational Database Management Systems), users can define new


methods for their custom data types (ADTs). Alongside these methods, they may want to
create custom aggregation functions tailored to their specific data analysis needs. For
example, standard SQL aggregates like COUNT, SUM, and AVG are not always suitable for
more complex data types such as images or spatial data.

Most ORDBMSs allow users to register new aggregation functions with the system. To do
this, users typically need to implement three core methods:

 Initialize: This method sets up the initial internal state needed for the aggregation
operation.
 Iterate: During query processing, this method updates the aggregation state for each
tuple (row) processed.
 Terminate: Once all tuples have been processed, this method finalizes the
aggregation computation and prepares the result for output.
For example, imagine creating an aggregation function to find the second-highest value in a
field. The initialization would set up storage for tracking the top two values, the iteration
would update these values as new data is processed, and the termination would yield the
second-highest value once all data has been examined.

MethodSecurity

When databases allow users to add their own code through ADTs (Abstract Data Types),
there's a risk of introducing bugs or malicious actions that could crash the database or corrupt
data. To handle this:

1. Interpreted Languages: Some databases interpret user-written code instead of


directly converting it to machine language. This allows the database to check each
step of the code for safety before running it. Popular languages for this are Java and
specialized parts of SQL. By doing this, the database ensures that any mistakes in user
code won't cause major problems.
2. Separate Process Execution: Alternatively, users can write their code in languages
like C++ and compile it. However, this code runs in a separate area from the main
database system. Communication between this code and the main database happens
through a controlled method called Interprocess Communication (IPC). This setup
prevents user code from directly changing or viewing sensitive parts of the database.
It adds an extra layer of security to protect against both accidental errors and
intentional attacks.

Method caching in databases means saving the results of expensive operations, like
user-defined methods, so they can be reused later. When a method is called with
certain inputs, the system checks if it has already computed the results for those
inputs. If it has, it retrieves the stored results instead of recalculating them. This
speeds up query processing by avoiding repetitive computations, making database
operations more efficient overall.

There are two main ways to implement method caching:

1. In-memory Caching: Here, the database keeps recently computed method results in
its memory. This prevents the need to recalculate the method for identical input values
during the same query, speeding up execution.

2. Persistent Cache: In this approach, the database keeps a dedicated table, known as a
persistent cache, to store input-output pairs of methods. When a method is called with
specific inputs, instead of recalculating the result, the database fetches the
precomputed value directly from this cache table. This method speeds up query
processing by avoiding redundant computations, enhancing overall database
performance.

Pointer Swizzling:
3. In some applications, objects are stored in memory and accessed using unique
identifiers (oids). To quickly access these objects, systems maintain a table that maps
these identifiers to their actual memory locations. When an object is loaded into
memory, its identifier is replaced with a direct pointer to where the object is stored in
memory. This technique, known as pointer swizzling, speeds up access to objects in
memory. However, if an object is removed from memory (paged out), the pointers
must be converted back to identifiers to manage memory efficiently.

Eg

Library System: Imagine a library database where each book has a unique identifier
(BookID). These books are stored on disk to save memory.

 Searching for a Book: A user searches for a specific book using its BookID, say
BookID 101.

Loading into Memory: The library system fetches the details of BookID 101 from disk
into memory for quick access.

 Assigning an OID: In memory, the library assigns an Object Identifier (OID) to track
BookID 101 while it's in memory. Let's say it assigns OID 1001.

 Swizzling Pointers: To speed up access to BookID 101, the library system replaces all
references to OID 1001 with a direct memory pointer. This pointer points directly to the
location in memory where BookID 101's details are stored.

1. Extended Data Types

SQL3 supports more than just standard data types like integers and strings. It includes types
for large text, images, audio, and video.

Example:

sql
Copy code
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(100),
Resume CLOB, -- Large text data
ProfilePicture BLOB -- Binary data like images
);

Explanation: This table can store employees' IDs, names, resumes (large text), and profile
pictures (binary data).

2. Row Types
Row types let you define complex data structures to group related information together.

Example:

sql
Copy code
CREATE TYPE AddressType AS (
Street VARCHAR(100),
City VARCHAR(50),
ZipCode VARCHAR(10)
);

CREATE TABLE Customers (


CustomerID INT,
Name VARCHAR(100),
Address AddressType
);

Explanation: Here, AddressType groups street, city, and zip code together. The Customers
table uses this type to store customers' addresses.

3. User-Defined Types (UDTs)

UDTs allow you to create custom data types with their own structure and methods.

Example:

sql
Copy code
CREATE TYPE EmployeeType AS OBJECT (
EmployeeID INT,
Name VARCHAR(100),
PROCEDURE display_details()
);

Explanation: EmployeeType is a custom type that includes an employee's ID, name, and a
method to display details.

4. User-Defined Routines (UDRs)

UDRs are custom functions and procedures you can create to reuse logic.

Example:

sql
Copy code
CREATE FUNCTION CalculateAge(birthdate DATE) RETURNS INT AS
BEGIN
RETURN YEAR(CURRENT_DATE) - YEAR(birthdate);
END;

CREATE PROCEDURE UpdateSalary(employeeID INT, newSalary DECIMAL)


BEGIN
UPDATE Employees SET Salary = newSalary WHERE EmployeeID = employeeID;
END;
Explanation: CalculateAge is a function to calculate age from birthdate, and
UpdateSalary is a procedure to update an employee's salary.

5. Subtype and Supertype

Subtypes and supertypes allow for inheritance, where a subtype inherits attributes and
methods from a supertype.

Example:

sql
Copy code
CREATE TYPE PersonType AS OBJECT (
PersonID INT,
Name VARCHAR(100)
) NOT FINAL;

CREATE TYPE EmployeeType UNDER PersonType (


Salary DECIMAL
);

Explanation: PersonType is a base type, and EmployeeType inherits from it, adding a salary
attribute.

6. Subtable and Supertable

Subtables inherit the structure and constraints from their supertables.

Example:

sql
Copy code
CREATE TABLE Person (
PersonID INT,
Name VARCHAR(100)
) UNDER PersonType;

CREATE TABLE Employee (


EmployeeID INT,
Salary DECIMAL
) UNDER Person;

Explanation: Employee table inherits columns from the Person table, adding its own
Salary column.

Sure, I'll simplify and explain each feature of SQL3 with examples to make them easier to
understand.

1. Extended Data Types

SQL3 supports more than just standard data types like integers and strings. It includes types
for large text, images, audio, and video.
Example:

sql
Copy code
CREATE TABLE Employees (
EmployeeID INT,
Name VARCHAR(100),
Resume CLOB, -- Large text data
ProfilePicture BLOB -- Binary data like images
);

Explanation: This table can store employees' IDs, names, resumes (large text), and profile
pictures (binary data).

2. Row Types

Row types let you define complex data structures to group related information together.

Example:

sql
Copy code
CREATE TYPE AddressType AS (
Street VARCHAR(100),
City VARCHAR(50),
ZipCode VARCHAR(10)
);

CREATE TABLE Customers (


CustomerID INT,
Name VARCHAR(100),
Address AddressType
);

Explanation: Here, AddressType groups street, city, and zip code together. The Customers
table uses this type to store customers' addresses.

3. User-Defined Types (UDTs)

UDTs allow you to create custom data types with their own structure and methods.

Example:

sql
Copy code
CREATE TYPE EmployeeType AS OBJECT (
EmployeeID INT,
Name VARCHAR(100),
PROCEDURE display_details()
);

Explanation: EmployeeType is a custom type that includes an employee's ID, name, and a
method to display details.
4. User-Defined Routines (UDRs)

UDRs are custom functions and procedures you can create to reuse logic.

Example:

sql
Copy code
CREATE FUNCTION CalculateAge(birthdate DATE) RETURNS INT AS
BEGIN
RETURN YEAR(CURRENT_DATE) - YEAR(birthdate);
END;

CREATE PROCEDURE UpdateSalary(employeeID INT, newSalary DECIMAL)


BEGIN
UPDATE Employees SET Salary = newSalary WHERE EmployeeID = employeeID;
END;

Explanation: CalculateAge is a function to calculate age from birthdate, and


UpdateSalary is a procedure to update an employee's salary.

5. Subtype and Supertype

Subtypes and supertypes allow for inheritance, where a subtype inherits attributes and
methods from a supertype.

Example:

sql
Copy code
CREATE TYPE PersonType AS OBJECT (
PersonID INT,
Name VARCHAR(100)
) NOT FINAL;

CREATE TYPE EmployeeType UNDER PersonType (


Salary DECIMAL
);

Explanation: PersonType is a base type, and EmployeeType inherits from it, adding a salary
attribute.

6. Subtable and Supertable

Subtables inherit the structure and constraints from their supertables.

Example:

sql
Copy code
CREATE TABLE Person (
PersonID INT,
Name VARCHAR(100)
) UNDER PersonType;
CREATE TABLE Employee (
EmployeeID INT,
Salary DECIMAL
) UNDER Person;

Explanation: Employee table inherits columns from the Person table, adding its own
Salary column.

7. Reference Type and Object Identity

Reference types and object identity allow for creating relationships between objects using
references (like pointers).

8. Collection Type

Collection types let you store multiple values in a single attribute, like arrays or sets.

You might also like