Unit2 - ORDBMS

The document provides an overview of Relational Database Management Systems (RDBMS) and Object-Relational Database Management Systems (ORDBMS), detailing their definitions, features, and differences. It discusses the challenges of implementing ORDBMS, including complexity, performance issues, and integration with existing systems. Additionally, it covers parallel databases, their architectures, and various parallel query mechanisms to enhance performance in handling large datasets.

Uploaded by

garas47896

Unit 2

ORDBMS
What is RDBMS?
RDBMS stands for Relational Database Management System. In an RDBMS, the data is organized
into related tables, and the database is accessed using Structured Query Language (SQL).
This model is based on the mathematical theory of relational algebra and calculus. The original
concept for the model was proposed by Dr. E.F. Codd in 1970. The model was later
formalized through twelve rules, which are known as Codd’s rules.
These 12 Codd’s rules are as follows :
• Information Rule
• Guaranteed Access Rule
• Systematic Treatment Of Null Values
• Database Description Rule
• Comprehensive Data Sub-Language Rule
• View Updating Rule
• High-Level Insert, Update, and Delete
• Physical Data Independence
• Logical Data Independence
• Integrity Independence
• Distribution Independence
• Non-Subversion Rule
What is ORDBMS?
ORDBMS stands for Object-Relational Database Management System. It provides all the
facilities of an RDBMS with additional support for object-oriented concepts.
The concepts of classes, objects and inheritance are supported in this database. It occupies
the middle ground between RDBMS and OODBMS. Data is manipulated through a query
language, typically SQL with object extensions.
It is complex because it has to take care of both relational database concepts and
object-oriented concepts. Some of the object-relational DBMSs available in the market are as
follows :
• IBM’s DB2 Universal Database system
• Informix’s Universal Server
ORDBMS Implementation and Challenges -
Object-Relational Database Management Systems (ORDBMS) combine features of both
object-oriented programming and relational database systems, aiming to provide a more
flexible and complex data model. However, implementing ORDBMS comes with several
challenges:
1. Complexity of Design:
• Data Modeling: Designing an object-relational schema that effectively represents
complex data relationships can be challenging. Developers must balance the advantages
of object-oriented design with the constraints of relational databases.
• Inheritance and Polymorphism: Implementing inheritance and polymorphism in a way
that integrates seamlessly with SQL can lead to complicated design decisions.
2. Performance Issues:
• Query Performance: ORDBMS may have slower performance for certain queries,
especially if they involve complex object types or functions. Optimizing these queries
requires advanced understanding of both SQL and the object model.
• Storage Overhead: The additional features of ORDBMS (like storing objects) can lead to
increased storage requirements and potential performance bottlenecks.
3. Standardization and Compatibility:
• Lack of Standards: While SQL is a widely accepted standard, the object-oriented
extensions can vary significantly between different ORDBMS platforms, creating
challenges in portability and interoperability.
• Migration Issues: Transitioning from a traditional relational database to an ORDBMS can
be complex, requiring significant changes to applications and data structures.

4. Integration with Existing Systems:
• Legacy Systems: Integrating ORDBMS with existing systems, especially those built on
older relational databases, can be difficult and time-consuming.
• Data Migration: Moving data from existing databases to an ORDBMS while maintaining
data integrity and consistency can pose significant challenges.
5. Skill Gap:
• Need for Specialized Knowledge: Developers and database administrators may require
additional training to effectively use ORDBMS features, leading to a potential skill gap in
teams accustomed to traditional relational databases.
6. Maintenance and Support:
• Complex Maintenance: The complexity of object-oriented features may lead to more
complicated maintenance tasks, including debugging and optimizing performance.
• Vendor Support: Depending on the ORDBMS chosen, the level of support and community
resources may vary, impacting the ease of troubleshooting and problem resolution.
Difference Between RDBMS and ORDBMS -
RDBMS | ORDBMS
RDBMS is a Relational Database Management System based on the relational model of data. | ORDBMS is an Object-Relational Database Management System based on both the relational and the object-oriented database models.
It follows a table structure; it is simple to use and easy to understand. | It is the same as RDBMS but adds extensions for object-oriented concepts, which can make it harder to understand.
It has no extensibility for new data types. | Its extensibility is limited to new data types.
Since RDBMS is old, it is very mature. | It is still developing, so it is less mature.
It supports Structured Query Language (SQL). | It supports SQL with object-oriented extensions (SQL:1999 and later); Object Query Language (OQL) belongs to pure object databases.
It is capable of handling only simple data. | It is also capable of handling complex data.
MS SQL Server, MySQL, SQLite and MariaDB are examples of RDBMS. | PostgreSQL is an example of ORDBMS.
Database Design for ORDBMS -
The object-oriented part of the design follows the OODBMS model, which is based on three
major components, namely: Object structure, Object classes, and Object identity. Object
structure is explained below.
1. Object Structure -
An object is a real-world entity with certain attributes that make up the object structure. Also,
an object encapsulates data and code into a single unit, which in turn provides data abstraction
by hiding the implementation details from the user.
The object structure is further composed of three types of components: Messages, Methods,
and Variables.
Messages –
A message provides an interface or acts as a communication medium between an object and
the outside world. A message can be of two types:
• Read-only message: If the invoked method does not change the value of a variable, then the
invoking message is said to be a read-only message.
• Update message: If the invoked method changes the value of a variable, then the invoking
message is said to be an update message.
Methods –
When a message is passed, the body of code that is executed is known as a method.
Whenever a method is executed, it returns a value as output. A method can be of two types:
• Read-only method: When the value of a variable is not affected by a method, it is
known as a read-only method.
• Update method: When the value of a variable is changed by a method, it is known as an
update method.
Variables –
Variables store the data of an object. The data stored in the variables makes objects
distinguishable from one another.
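The relationship between messages, methods, and variables can be sketched with a small Python class. This is only an illustration: the `Account` class, its attribute names, and the amounts are hypothetical and do not come from any particular ORDBMS.

```python
class Account:
    """A toy object: variables hold state, methods run when a message arrives."""

    def __init__(self, owner, balance):
        # Variables: the data that distinguishes one object from another.
        self._owner = owner
        self._balance = balance

    def get_balance(self):
        # Read-only method: returns a value without changing any variable.
        return self._balance

    def deposit(self, amount):
        # Update method: changes the value of a variable.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount
        return self._balance


account = Account("alice", 100)
before = account.get_balance()  # sending a read-only message
after = account.deposit(50)     # sending an update message
```

Calling `account.get_balance()` plays the role of a read-only message, while `account.deposit(50)` is an update message because the method it invokes modifies `_balance`.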
Parallel Databases -

Nowadays organizations need to handle huge amounts of data at high transfer rates. For
such requirements, a client-server or centralized system is not efficient.

To improve the efficiency of the system, the concept of the parallel database
comes into the picture. A parallel database system seeks to improve performance
through parallelization.

Multiple resources like CPUs and disks are used in parallel. The operations are performed
simultaneously, as opposed to serial processing. A parallel server can allow access to a single
database by users on multiple machines.

Using parallel databases improves performance, makes better use of the available resources,
and increases the reliability of the database.
Working of parallel databases
A parallel database system seeks to improve performance through parallelization of various
operations like loading data, building indexes and evaluating queries. Parallel systems improve
processing and I/O speeds by using multiple CPUs and disks in parallel.

Step 1 − Parallel processing divides a large task into many smaller tasks and executes the
smaller tasks concurrently on several CPUs, so the whole task completes more quickly.

Step 2 − The driving force behind parallel database systems is the demand of applications that
have to query extremely large databases, of the order of terabytes, or that have to process a
large number of transactions per second.

Step 3 − In parallel processing, many operations are performed simultaneously, as opposed to
serial processing, in which the computational steps are performed sequentially.
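The idea in Step 1 can be sketched in Python as follows. This is a minimal illustration, not how a real parallel DBMS is built: the helper names `split`, `count_matching`, and `parallel_count` are made up for the example, and worker threads stand in for CPUs (in CPython, threads show the structure of the approach rather than a true CPU speed-up).

```python
from concurrent.futures import ThreadPoolExecutor


def split(rows, parts):
    """Divide a large list of rows into at most `parts` roughly equal chunks."""
    size = max(1, (len(rows) + parts - 1) // parts)
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def count_matching(chunk):
    """The small task: count the even values in one chunk."""
    return sum(1 for value in chunk if value % 2 == 0)


def parallel_count(rows, workers=4):
    """Run the small task on every chunk concurrently, then combine results."""
    chunks = split(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(count_matching, chunks))
    return sum(partial_counts)


rows = list(range(1000))
total = parallel_count(rows)  # same answer as a serial scan of all rows
```

The serial equivalent would call `count_matching(rows)` once over the whole list; the parallel version produces the same result by summing the partial counts.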
Architectures for parallel databases
A parallel DBMS is a DBMS that runs across multiple processors or CPUs and is mainly designed
to execute query operations in parallel, wherever possible.
A parallel DBMS links a number of smaller machines to achieve the same throughput as
expected from a single large machine.
There are three main architectural designs for a parallel DBMS. They are -
1. Shared Memory Architecture
In Shared Memory Architecture, there are multiple CPUs that are attached to an
interconnection network.
In this architecture, a single copy of a multithreaded operating system and a multithreaded
DBMS can support these multiple CPUs. Shared memory is a tightly coupled
architecture in which multiple CPUs share their memory. This is also called symmetric
multiprocessing (SMP).
Shared-memory platforms cover a wide range, starting from personal workstations that support a
few microprocessors in parallel and extending to large RISC-based machines.
Advantages :
• It has high-speed data access for a limited number of processors.
• The communication is efficient.
Disadvantages :
• It cannot scale beyond about 80 to 100 CPUs in parallel.
• The bus or the interconnection network becomes a bottleneck as the number of CPUs
grows.
2. Shared Disk Architecture
In Shared Disk Architecture, various CPUs are attached to an interconnection network. In this,
each CPU has its own memory and all of them have access to the same disk.
Also, note that here the memory is not shared among CPUs therefore each node has its own
copy of the operating system and DBMS.
Shared disk architecture is a loosely coupled architecture optimized for applications that are
inherently centralized. Shared-disk systems are also known as clusters.
Advantages :
• The interconnection network is no longer a bottleneck because each CPU has its own memory.
• Load-balancing is easier in shared disk architecture.
• There is better fault tolerance.
Disadvantages :
• If the number of CPUs increases, the problems of interference and memory contention
also increase.
• There is also a scalability problem.
3. Shared Nothing Architecture
Shared Nothing Architecture is a multiprocessor architecture in which each processor has its
own memory and disk storage. In this, multiple CPUs are attached to an interconnection
network through a node.
Also, note that no two CPUs can access the same disk area. In this architecture, no sharing of
memory or disk resources is done. It is also known as massively parallel processing (MPP).
Advantages :
• It has better scalability as no sharing of resources is done.
• Multiple CPUs can be added.
Disadvantages :
• The cost of communication is higher, as it involves sending data and software
interaction at both ends.
• The cost of non-local disk access is higher than in shared disk architectures.

This technology is typically used for very large databases, of the order of 10^12 bytes
(1 TB), or for systems that process thousands of transactions per second.
Parallel Query Mechanism -
Query parallelism allows multiple queries to be executed in parallel by decomposing them
into parts that work in parallel.
Parallelism is also used to speed up query execution as more and more
resources like processors and disks are provided.
Parallelism in a query can be achieved by the following methods :
1. I/O parallelism :
• It is a form of parallelism in which the relations are partitioned across multiple disks.
• Its motive is to reduce the retrieval time of relations from the disk. The input data
is partitioned, and each partition is then processed in parallel.
• The results are merged after all the partitioned data has been processed. This technique
is known as data partitioning.
• Partitioning is useful for sequential scans of an entire table spread over n
disks: the time taken to scan the relation is approximately 1/n of the time required
to scan the table on a single-disk system.
The different types of I/O parallelism -
Hash partitioning –
A hash function is a fast mathematical function. Each row of the original relation is
hashed on the partitioning attributes. For example, assume that there are 4 disks, disk1, disk2,
disk3, and disk4, across which the data is to be partitioned. If the function returns 3 for a
row, then that row is placed on disk3.
Range partitioning –
Range partitioning assigns a contiguous range of attribute values to each disk. For example,
with 3 disks numbered 0, 1, and 2, we may assign tuples with a value
less than 5 to disk0, values between 5 and 40 to disk1, and values greater than 40
to disk2.
Round-robin partitioning –
In round-robin partitioning, the relation is scanned in any order. The ith tuple is sent to
disk number (i % n), where n is the number of disks. So, disks take turns receiving new rows
of data. This technique ensures an even distribution of tuples across disks and is ideally
suited for applications that wish to read the entire relation sequentially for each query.
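The three partitioning techniques can be sketched as small Python functions. This is a simplified illustration under stated assumptions: the function names are hypothetical, disks are numbered from 0 (unlike disk1 to disk4 in the hash example above), the "hash function" is just a modulo on an integer key, and the range boundaries 5 and 40 are taken from the example in the text.

```python
def hash_partition(key, n_disks):
    """Toy hash on an integer key: returns a disk number in 0..n_disks-1."""
    return key % n_disks


def range_partition(value, low=5, high=40):
    """Values below `low` go to disk 0, low..high to disk 1, above `high` to disk 2."""
    if value < low:
        return 0
    if value <= high:
        return 1
    return 2


def round_robin_partition(rows, n_disks):
    """Send the i-th tuple to disk (i % n_disks); returns one list per disk."""
    disks = [[] for _ in range(n_disks)]
    for i, row in enumerate(rows):
        disks[i % n_disks].append(row)
    return disks


hash_disk = hash_partition(7, 4)                       # 7 % 4
range_disks = [range_partition(v) for v in (3, 5, 40, 41)]
rr_disks = round_robin_partition(["a", "b", "c", "d", "e"], 3)
```

Hash and range partitioning decide the disk from the attribute value, so a lookup on that attribute can go straight to one disk. Round-robin ignores the value entirely, which is why it balances load well but suits only full sequential scans.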
2. Intra-query parallelism :
Intra-query parallelism refers to the execution of a single query in parallel on
different CPUs, using a shared-nothing parallel architecture. There are two
approaches:
First approach –
In this approach, each CPU executes the same task against a different portion of the data.
Second approach –
In this approach, the task is divided into subtasks, with each CPU executing a
distinct subtask.
3. Inter-query parallelism :
In inter-query parallelism, multiple queries or transactions execute in parallel, each on its
own CPU. It is also called parallel transaction processing.
In this method, each individual query still runs sequentially, so throughput increases but
long queries do not run any faster. The DBMS must also keep track of the locks held by
different transactions running on different processes.
4. Intra-operation parallelism :
Intra-operation parallelism is a form of parallelism in which we parallelize the execution of
each individual operation of a query, such as sorting, joins, projections, and so on.
The degree of parallelism is very high in intra-operation parallelism. This type of parallelism is
natural in database systems.
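As an illustration of intra-operation parallelism, a single sort operation can be parallelized by partitioning the input, sorting each partition on its own worker, and merging the sorted runs. The sketch below is an assumption-laden toy: `parallel_sort` is a hypothetical helper, the data is split round-robin, and threads stand in for CPUs.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(values, workers=4):
    """Sort one list by sorting partitions concurrently and merging the runs."""
    # Partition the input round-robin: worker i gets elements i, i+workers, ...
    partitions = [values[i::workers] for i in range(workers)]
    # Each worker sorts its own partition independently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_runs = list(pool.map(sorted, partitions))
    # Merge the already-sorted runs into one sorted result.
    return list(heapq.merge(*sorted_runs))


result = parallel_sort([5, 3, 9, 1, 7, 2, 8])
```

The same partition, process, and merge pattern applies to other operations named above, such as joins and projections, although the merge step differs for each.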
5. Inter-operation parallelism :
When different operations in a query expression are executed in parallel, it is called
inter-operation parallelism. It is of two types –
• Pipelined parallelism – the output rows of one operation are consumed by the second
operation even before the first operation has produced the entire set of rows in its output.
It is useful with a small number of CPUs and avoids writing intermediate results to disk.
• Independent parallelism – the operations in a query expression that do not depend on
each other can be executed in parallel. This form of parallelism is mainly useful when the
degree of parallelism is low.
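Pipelined parallelism can be sketched with two threads connected by bounded queues: a scan operation feeds rows to a selection operation, which starts consuming them before the scan has finished. The names `scan`, `select_even`, and `run_pipeline` are hypothetical, and the small queue size is chosen only to make the overlap between stages visible.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a row stream


def scan(rows, out_q):
    """First operation: emit rows one at a time."""
    for row in rows:
        out_q.put(row)
    out_q.put(DONE)


def select_even(in_q, out_q):
    """Second operation: consume rows as they arrive and keep the even ones."""
    while True:
        row = in_q.get()
        if row is DONE:
            out_q.put(DONE)
            return
        if row % 2 == 0:
            out_q.put(row)


def run_pipeline(rows):
    # Bounded queues hold at most two rows, so no intermediate result is
    # ever materialized in full between the two operations.
    q1, q2 = queue.Queue(maxsize=2), queue.Queue(maxsize=2)
    producer = threading.Thread(target=scan, args=(rows, q1))
    consumer = threading.Thread(target=select_even, args=(q1, q2))
    producer.start()
    consumer.start()
    result = []
    while True:
        row = q2.get()
        if row is DONE:
            break
        result.append(row)
    producer.join()
    consumer.join()
    return result


evens = run_pipeline(range(10))
```

Because each queue holds at most two rows, the scan cannot run far ahead of the selection, which mirrors how pipelining avoids writing intermediate results to disk.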
