0% found this document useful (0 votes)
13 views104 pages

Module 2

Uploaded by

samaymistry105
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
13 views104 pages

Module 2

Uploaded by

samaymistry105
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

MM-MCA-14

Advanced Database system


and Data Mining Techniques
Reference Books XP

Reference Reference Name


No
1 Ponniah, Paulraj, Data warehousing fundamentals: a comprehensive guide for IT
professionals, John Wiley & Sons, 2004.
2 Dunham, Margaret H, Data mining: Introductory and advanced topics, Pearson Education
India, 2006.
3 Gupta, Gopal K, Introduction to data mining with case studies, PHI Learning Pvt. Ltd.,
2014.
4 Han, Jiawei, Jian Pei, and Micheline Kamber, Data mining: concepts and techniques,
Second Edition, Elsevier, Morgan Kaufmann, 2011.
5 Ramakrishnan, Raghu, Johannes Gehrke, and Johannes Gehrke, Database management
systems, Vol. 3, McGraw-Hill, 2003.
6 Elmasri, Ramez, and Shamkant B. Navathe, Fundamentals of Database Systems, Pearson
Education, 2008, (2015).
7 Silberschatz, Abraham, Henry F. Korth, and Shashank Sudarshan,Database system
concepts, Vol. 5,McGraw-Hill, 1997.
XP
Web References:

Reference No Reference Name

1 [Link]

2 [Link]

3 [Link]

4 [Link]

mining-guide/

3
Unit 2: Parallel
Database Distributed
Database
Module 2: Architecture and Data Distribution in Distributed and Parallel Databases XP

[Link]. Topics 05

Architecture for Parallel Databases, Types of 1


1
Distributed Databases,
Distributed DBMS Architecture 1
2
Storing Data in a Distributed DBMS. 1
3

ORDBMS: Structured Data Types, Operations on 1


4
Structured Data,

Objects, OIDs and Reference Types, Object oriented 1


5
versus Object relational database.
Self Learning Topics: Mapping OODBMS to ORDBMS
6
XP
Distributed Databases
• Distributed database system is a system where data is stored
across several sites, and each site is typically managed by a
DBMS that can run independent of the other sites.
• A distributed database appears to a user as a single database
but is, in fact, a set of databases stored on multiple computers.
• Computers in such systems are called sites or nodes.
• Nodes are connected through some communication medium
like high-speed network or telephone lines.
• A distributed database management system (DDBMS) is a
centralized software system that manages a distributed
database in a manner as if it were all stored in a single location.
Distributed Databases XP
Distributed Database – User View XP
Distributed DBMS - Reality XP
Transparent Access XP
Features of Distributed Databases XP

• Sharing data:- Users at one site may be able to access data


residing at other sites.
• Availability:- If one site fails in a distributed system, the
remaining sites may be able to continue operating.
• Autonomy:- Each site is able to retain a degree of control over
data that are stored locally.
• Data is physically stored across multiple sites. Data in each site
can be managed by a DBMS independent of the other sites.
• Databases in the collection are logically interrelated with each
other. Often they represent a single logical database.
XP
Features of Distributed Databases

• Various sites co-operate in transparent manner.


– Transparent implies that each user within the system may access all
of the data within the system as if they were a single database.

• System possesses ‘location independence’ i.e. the user is


unaware of where the data is located, it is possible to move the
data from one physical location to another without affecting
the user.
Factors Encouraging DDBMS XP

• Distributed Nature of Organizational Units


• Need for Sharing of Data
• Support for Both OLTP and OLAP
• Database Recovery - Replication of data across different sites
automatically helps in data recovery if database in any site is
damaged.
Advantages of Distributed Databases XP

• Scalability− Expansion of system (to new locations or new units) simply


requires adding new computers and local data to the new site and
finally connecting them to the distributed system, with no interruption
in current functions.
• Improved Processing Power - Instead of one server handling the full
database, we now have a collection of machines handling the same
database.
• More Reliable − In distributed systems, when a component fails, the
functioning of the system continues may be at a reduced performance.
Removal of reliance on a central site.
• Better Response − If data is distributed in an efficient manner, then user
requests can be met from local data itself, thus providing faster
response.
• Lower Communication Cost − In distributed database systems, if data is
located locally where it is mostly used, then the communication costs
for data manipulation can be minimized.
Disadvantages of Distributed Databases XP

• Need for complex and expensive software − DDBMS demands


complex and often expensive software to provide data
transparency and co-ordination across the several sites.
• Processing overhead − Even simple operations may require a
large number of communications and additional calculations to
provide uniformity in data across the sites.
• Data integrity − The need for updating data in multiple sites pose
problems of data integrity.
• Overheads for improper data distribution − Responsiveness of
queries is largely dependent upon proper data distribution.
XP
Types of DDBMS
• Homogeneous distributed database
• Heterogeneous distributed database
Types of DDBMS XP
Homogeneous Distributed Database XP

• The sites use very similar software.


• The sites use identical DBMS or DBMS from the same vendor.
• Each site is aware of all other sites and cooperates with other
sites to process user requests.
• The database is accessed through a single interface as if it is a
single database.
Heterogeneous Distributed Database XP

• Different sites use dissimilar schemas and software.


• The system may be composed of a variety of DBMSs like
relational, traditional file system or different versions of
DBMSs.
• Query processing is complex due to dissimilar schemas.
• Transaction processing is complex due to dissimilar software.
• A site may not be aware of other sites and so there is limited
co-operation in processing user requests.
Architecture of DDBMS XP

• Client-server Systems
• Collaborating server Systems
• Middleware Systems
Client-Server Architecture XP

• A client-server system has one or more client processes and


one or more server processes.
• A client process can send a query to any one server process.
• Clients are responsible for user-interface issues and servers
manage data and execute transaction.
• A client process could run on a personal computer and send
queries to a server running on mainframe.
Advantages of Client Server Architecture XP

• It is relatively simple to implement due to its clean separation


of functionality and because the server is centralized.
• Expensive Server machines are not underutilized by dealing
with mundane user-interactions.
• Users can run a graphical user interface that they are familiar
with, rather than the (possibly unfamiliar and unfriendly) user
interface on the server.
Disadvantages of Client Server Architecture XP

• It does not allow a single query to span multiple servers


because the client process would have to be capable of
breaking such a query into appropriate sub queries to be
executed at different sites and then piecing together the
answers to the sub queries.
• The client process would therefore be quite complex, and its
capabilities would begin to overlap with the server;
distinguishing between clients and servers become harder.
Collaborating Server XP

• Designed to run a single query on multiple servers.


• Servers break single query into multiple small queries and the
result is sent to the client.
• Collaborating server architecture has a collection of database
servers. Each server is capable for executing the current
transactions across the databases.
Middleware XP

• The middleware architecture is designed to allow a single query


to span multiple servers, without requiring all database servers to
be capable of managing such multi-site execution strategies.
• Here we need just one database server capable of managing
queries and transactions that spans over multiple servers; the
remaining servers need to handle only local queries and
transaction.
• This special server is a layer of software that coordinates the
execution of queries and transaction across one or more
independent database servers.
• Such software is known as middleware.
• Especially useful when trying to integrate several legacy systems
Middleware XP

• The middleware layer is capable of executing joins & other


relational operations on data obtained from the other servers
but, does not itself maintain any data.
Storing Data in a Distributed DBMS XP

• Fragmentation
▪ Relation is partitioned into several fragments stored in distinct sites
• Replication
▪ System maintains multiple copies of data, stored in different sites, for faster
retrieval and fault tolerance.
• Replication and fragmentation can be combined
▪ Relation is partitioned into several fragments: system maintains several identical
replicas of each such fragment.
Data Fragmentation XP

• The process of dividing the database into a smaller multiple


parts is called as fragmentation.
• These fragments may be stored at different locations.
• The data fragmentation process should be carried out in such
a way that the reconstruction of original database from the
fragments is possible.
• Types of Fragmentation
–Horizontal Fragmentation
–Vertical Fragmentation
–Mixed Fragmentation
XP

Vertical
Fragmentation

Horizontal
Fragmentation
Horizontal Fragmentation XP

• Horizontal fragmentation divides a relation(table)


horizontally into the group of rows to create subsets
of tables.
• The union of the horizontal fragments must be equal
to the original relation.
Vertical Fragmentation XP

• Divides a relation vertically into groups of columns to create


subsets of relations.
• Column wise : the schema for relation r is split into several
smaller schemas
▪ The collection of vertical fragments should be a lossless-join
decomposition.
▪ To ensure this property all schemas must contain a common
candidate key.
▪ A special attribute, the tuple-id attribute may be added to each
schema to serve as a candidate key.
• Example : relation account with following schema employee_info
= (customer_name, branch_name, account_number, balance )
Advantages of Fragmentation XP

• Allows a relation to be split so that tuples are located


where they are most frequently accessed to minimize
data transfer
• Since data is stored close to the site of usage, efficiency
of the database system is increased.
• Local query optimization techniques are sufficient for
most queries since data is locally available.
• Allows parallel processing on fragments of a relation
Data Replication XP

• A relation or fragment of a relation is said to be


replicated if it is stored redundantly in two or more
sites.
• Full replication of a relation is the case where the
relation is stored at all sites.
• Fully redundant databases are those in which every site
contains a copy of the entire database.
• Replication Schemes
– Full replication
– No replication
– Partial replication
Types of Data Replication XP

• Synchronous replication
– In synchronous replication, the replica will be modified immediately
after some changes are made in the relation table. So there is no
difference between original data and replica.

• Asynchronous replication
– In asynchronous replication, the replica will be modified after
commit is fired on to the database.
Advantages of Replication XP

▪ Availability: failure of site containing relation r does


not result in unavailability of r if replicas exist.
▪ Parallelism: queries on r may be processed by several
nodes in parallel.
▪ Reduced data transfer: relation r is available locally at
each site containing a replica of r. In turn it leads to
faster query evaluation.
Disadvantages of Replication XP

▪ Increased cost of updates: each replica of relation r


must be updated.
▪ Increased complexity of concurrency control:
concurrent updates to distinct replicas may lead to
inconsistent data unless special concurrency control
mechanisms are implemented.
▪ Additional requirement of storage space
Parallel vs. Distributed Databases XP

Parallel Systems Distributed Systems


Memory Tightly coupled shared Distributed memory (shared
memory (shared memory, nothing)
shared disk)
Control Global clock present No global clock. Synchronization
more difficult.
Processor Order of Tbps Order of Gbps
interconnection (Bus, mesh, tree, hypercube) (Ethernet, token ring)
Purpose Performance, Scientific Performance (cost & scalability),
computing reliability, availability,
information/ resource sharing
Geographical Nodes located at same location Nodes located at geographically
Location different location
XP

Parallel
Databases
Centralized Database Systems XP

• All the data is maintained at a single site


• Assumed that the processing of individual transactions is
essentially sequential
• Databases more prevalent in organizations
– Parallel databases
– Distributed databases
Parallel Database System XP

• Parallel databases improve processing and input/output speeds


by using multiple CPUs and disks in parallel.
• Improves performance through parallelization of various
operations, such as loading data, building indexes, and
evaluating queries.
• Can have parallel processing on a single
machine(multiprocessor computer having multiple disks
connected by a fast interconnection network)
• Distributed processing usually imply parallel processing (not
vise versa).
Why Parallel Processing? XP
Parallel Systems XP

• A coarse-grain parallel machine consists of a


small number of powerful processors.
• A massively parallel or fine grain parallel
machine utilizes thousands of smaller
processors.
Performance Measures of Database Systems XP

▪ Throughput - the number of tasks that can be completed in a


given time interval.
▪ Response time - the amount of time it takes to complete a single
task from the time it is submitted.

• Goals of parallel processing - Linear Scaleup & Linear


Speedup.
Speed-Up XP

• Running a given task in less time by increasing the degree


of parallelism is called speedup.
OR
• By increasing number of resources for a constant database
size, performance should increase linearly.
▪ Speedup = small system elapsed time (TS)/ large system elapsed time
(TL)
(Speedup = time_single_machine/time_parallel)
▪ Linear speedup is desired.
▪ Elapsed time is the amount of time that passes from the start of an
event to its finish
Speedup XP

Speedup
Scale-Up XP

• Scaleup is the ability to keep the same performance


levels (response time) when both workload
(transactions) and resources (CPU, memory) increase
proportionally.
• Factor m that expresses how much more work can be
done in the same time period by a system n times
larger.
• scaleup = small system small problem elapsed time/ big
system big problem elapsed time
• Scale up is linear if equation equals 1.
Scaleup XP

Scaleup
Factors Limiting Speedup and Scaleup XP

• Speedup and scaleup are often sublinear due to:


– Startup costs: Cost of starting up multiple processes may
dominate computation time, if the degree of parallelism is
high.
– Interference: Processes accessing shared resources
(e.g.,system bus, disks, or locks) compete with each other,
thus spending time waiting on other processes, rather than
performing useful work.
– Skew: Increasing the degree of parallelism increases the
variance in service times of parallelly executing tasks. Overall
execution time determined by slowest of parallelly executing
tasks.
XP
Physical Architectures for Parallel Database Systems
• Shared memory
• Shared disk
• Shared nothing
Shared Memory XP

• Multiple CPUs are attached to an interconnection network and


can access a common region of main memory
• Processors have large memory caches to avoid memory
references. If data is not in cache, shared memory references are
required.
Shared Memory XP

• Advantages
– Data is easily accessible to any processor.
– One processor can send message to other efficiently via memory writes.
• Disadvantages
– Waiting time of processors is increased due to more number of processors.
–System needs to maintain cache coherency, which
becomes overhead with increasing no of processors.
– Not scalable since the interconnection network or memory contention
becomes a bottleneck (Bandwidth problem. memory interference)
• Example - XPRS
[ Cache coherence refers to the problem of keeping the data in
these caches consistent.]
Shared Disk XP

• Each CPU has its own private memory


but can directly access all disks through
an interconnection network
• As each processor has its own memory
bus is not a bottleneck.
• The system built around this system are
called as clusters.
Shared Disk XP

• Advantages
– Fault tolerance - If a processor (or its memory) fails, the
other processors can take over its tasks, since the database
is resident on disks that are accessible from all processors.
• Disadvantage
– Interconnection channel becomes bottleneck; Limited
scalability though more than shared memory
– Communication across processors is slower, since it has to
go through a communication network
Shared Nothing XP

• Each processor has its own local


memory and local disk(s).
• Processors can communicate with
each other through
intercommunication channel (high
speed network and switches).
• A node functions as the server for the
data on the disk or disks that the node
owns.
Shared Nothing XP

• Advantages
– Only queries that access nonlocal disks, have to pass
through the network, so less I/O has to go through a single
interconnection network
– Can be scaled up to thousands of processors without
interference.
– More scalable as the interconnection networks for shared-
nothing systems are usually designed such that their
transmission capacity increases as more nodes are added.
• Disadvantages
– Cost of communication and accessing non-local disk is much
higher.
Hierarchical System XP

• Combines the characteristics of shared-memory, shared-disk, and


shared-nothing architectures. Also called nonuniform memory
architecture (NUMA).
• At the top level, the system consists of nodes that are connected
by an interconnection network and do not share disks or memory
with one another (shared nothing).
• Hierarchically below, each node of the system could be a system
with a few processors of type shared-memory/ shared-disk.
XP
Comparison - Architectures for Parallel Database Systems

• The basic problem with the shared-memory &


shared-disk architectures is interference.
• As more CPUs are added, existing CPUs are slowed
down because of the increased contention for
memory access and network bandwidth.
• Shared-nothing is the best architecture for large
systems.
Data Partitioning XP

• Distributes data over a number of processing elements


• Each processing element is then executed
simultaneously with other processing elements,
thereby creating parallelism
• Can be physical or logical data partitioning
Types of Partitioning XP

Round-Robin Range Partitioning Hash Partitioning


Partitioning
Comparison of Partitioning Techniques XP

• Evaluate how well partitioning techniques


support the following types of data access:
–Scanning the entire relation.
–Locating a tuple associatively – point queries. E.g., r.A
= 25.
–Locating all tuples such that the value of a given
attribute lies within a specified range – range queries.
E.g., 10  r.A < 25.
Round-Robin Partitioning XP

▪ Each record in turn is allocated to a processing element in a clockwise


manner
▪ If there are n processors, the ith tuple is assigned to processor i mod n in
round-robin partitioning.
▪ “Equal partitioning” or “Random-equal partitioning”
▪ Data evenly distributed, hence supports load balance
▪ But data is not grouped semantically
▪ Best suited for sequential scan of entire relation on each query
Hash Partitioning XP

▪ A hash function is used to partition the data

▪ Data is grouped semantically, that is data on the same group share the
same hash value

▪ Good for point queries on partitioning attribute


Range Partitioning XP

▪ Spreads the records based on a given range of the partitioning attribute


▪ Processing records on a specific range can be directed to certain
processors only
▪ Well suited for point and range queries
Comparison between Partitioning Techniques XP

• Round-robin partitioning is efficient for queries that access the entire


set of data.
• If only a subset of the tuples is required, hash partitioning and range
partitioning are better than round-robin partitioning.
• They enables the user to access only those disks that contain
matching tuples.
• Hash partitioning scheme is best suited for point queries based on
the partitioning attribute.
• Range partitioning is well suited for point and range queries.

• [Point queries. • Consider a selection query with a single equality in


the condition:]
Object Database
Systems
XP
Need for Complex Data Types
▪ Relational DBMSs are not good enough for today’s advanced
database applications
▪ RDBMS support a small, fixed collection of data types (e.g. int,
date, string), which are adequate for traditional application
domains such as administrative data processing.
▪ In many application domains, more complex kind of data must be
handled. Such complex data has to be stored in OS files or
specialized data structures, rather than in a DBMS. Examples of
domains with complex data are CAD/CAM, multimedia
repositories, document management, etc
▪ To support such applications, a DBMS must support complex data
types.
XP
Need for Complex Data Types
▪ E.g. addresses, an entire address can be viewed as an
atomic data item of string type, but this will hide details
such as street, city, zip code which can be of interest to
queries. If they are represented by breaking into parts then
queries will be complicated. Better alternative is to allow
structured data type, allow a type address with subparts
city, street etc.
▪ Multivalued attributes: solution to it is creating a new
relation but it is expensive
XP
Advanced Database Applications
▪ Computer-Aided Design (CAD)
▪ Computer-Aided Manufacturing (CAM)
▪ Computer-Aided Software Engineering (CASE)
▪ Multimedia applications: audio, graphic, video data need to handle.
▪ Real time control systems
▪ Network Management Systems
▪ Geographic Information Systems (GIS)
▪ Office Information Systems (OIS) and Multimedia Systems
▪ Digital Publishing
▪ Interactive and Dynamic Web sites
▪ Other applications with complex and interrelated objects and procedural data.
XP
Weaknesses of RDBMSs
▪ Poor Representation of "Real World" Entities
▪ Limitation in Encapsulating Data (Structure) with Operations
(Behavior)
▪ Limitation in Dealing with Composition, Aggregation,
Generalization-Specialization.
▪ Homogeneous data structure and Limited operations.
▪ Normalization sometimes lead to relations which do not exist, or
correspond, to entities in the real world. This compounds on the
‘join’ feature of query processing.
▪ The many to many relationship is difficult to express.
XP
Object Database Systems
▪ Object-oriented concepts have been evolved: to overcome
the limitations of RDBMS and to handle complex types of
applications which led to the development of object-
database systems.

▪ Object database systems have developed along 2 distinct


paths:
▪ Object-oriented database systems
▪ Object-relational database systems
XP
Evolution of Database Management Systems
▪ 1st Generation
▪ Hierarchical
▪ Network

▪ 2nd Generation
▪ Relational Database

▪ 3rd Generation
▪ Object Oriented
Database
▪ Object-Relational
Database
ORDBMS
RDBMS - ORDBMS XP
XP
ORDBMS
▪ An object-relational database is a database management system
similar to a relational database, but with an object-oriented
database model: objects, classes and inheritance are directly
supported in database schemas as well as within the query
language. Ex. Oracle, PostgreSQL
▪ Features of ORDBMS –
▪ User Data Types (Abstract Data Type)
▪ Inheritance
▪ Encapsulation
XP
User Data Types (Abstract/ Structured Data Type)
• ORDBMSs allow users to define datatypes, functions and operators.
• Object Type

create table rectangle (


length side;
breadth side;
);
XP
Structured Data Types
• use type constructors to generate new types
• Collection types
– set(T): multiset – array(T), T[][] – list(T)
• Row types (composite type)
– row (Col1 T1, ..., Colk Tk) – Named row type
• Reference Types – Ref(T)
• All first-class types!
Object Type XP

• Object types are abstractions of the real-world entities—for example -


customers, purchase orders—that application programs deal with. An
object type is a schema object with three kinds of components:
– A name, which serves to identify the object type uniquely within that
schema
– Attributes, which model the structure and state of the real-world
entity. Attributes are built-in types or other user-defined types.
– Methods, which are functions or procedures written in PL/SQL or Java
and stored in the database, or written in a language such as C and
stored externally. Methods implement operations the application can
perform on the real-world entity.
• An object type is a template(Can’t store data in it) . A structured data unit
that matches the template is called an object.
XP
Abstract Data Type
▪ Combination of an atomic data type & its associated methods is
called an abstract data type ADT.
▪ Traditional SQL has Built-in ADTs like int, float, string, etc.
▪ These types have simple methods associated with them
(arithmetic, comparison, equality, LIKE, etc.)
▪ ORDBMS allows the user to create their own ADTs if a type
cannot be naturally defined in terms of the built-in types.
▪ The label abstract is applied to these types because the DBMS
doesn’t need to know how an ADTs data is stored or how
methods work.
▪ It merely needs to know what methods are available & I/P & O/P
types for the methods.
Abstract Data Type XP

• Underneath the object layer, data is still stored in


columns and tables, but you can work with the data in
terms of the real-world entities. Ex – customer,
purchase order
• A client-side application can request a purchase order
from the server and receive all the relevant data in a
single transmission. The application can then, without
knowing storage locations or implementation details,
navigate among related data items without further
transmissions from the server.
XP
Encapsulation XP

• Hiding ADT internals is called as encapsulation (bind function + data)


• Methods are procedures or functions can be encapsulated within the
defined object so that applications can use to perform operations on
the attributes of the object type.
• Methods are optional. They define the behavior of objects of that type.
Inheritance XP

• In all ORDBMS, inheritance of user defined types is supported to


derive new sub-types or sub-classes which would therefore inherit
all attributes and methods from the parent type.

shape_tp

eshape_tp
Inheritance XP

▪ In object-database systems, inheritance is supported directly and allows


type definitions to be reused and refined very easily.
▪ It can be very helpful when defining similar but slightly different classes.
▪ Inheritance allows us to capture ‘specialization’ explicitly in the database
design.
▪ Inheritance creates explicit relationship in the database between the
sub-type and the super-type
▪ CREATE TYPE sub_type UNDER super_type ;
▪ Sub-type inherits attributes & methods of super-type
▪ Methods defined on super-type apply to objects of type sub-type, but
not vice versa.
Inheritance XP

▪ An object of the subtype is also considered to be an object of the


supertype.
▪ This means that any operations that apply to the supertype also apply
to the subtype. This is expressed in the following principle:
▪ The substitution Principle : Given a supertype A and a subtype B, it is
always possible to substitute an object of type B into a legal
expression written for objects of type A, without producing type
errors.
▪ This enables the code reuse; as queries & methods written for the
supertype can be applied to the subtype without modification.
Binding Methods XP

▪ Registering a new method with the same name as an old


method is called overloading the method name.
▪ Because of overloading, the system must understand which
method is intended in a particular expression.
▪ The process of deciding which method to invoke is called binding
the method to the object.
▪ Early Binding: Binding is done when an expression is parsed
▪ Late Binding: Binding is done at runtime
Object Identifier (OID) XP

• System generated identifier which is assigned, when a new


object is created.
• Every object has unique identity. In an object oriented system,
when object is created OID is assigned to it.
• Properties of OID
– Uniqueness: OID cannot be same to every object in the
system and it is generated automatically by the system.
– Invariant: OID cannot be changed throughout its entire
lifetime.
– Invisible: OID is not visible to user.
Object Identifier (OID) XP

• Oracle assigns to each row object a unique system-generated OID, 16


bytes in length, that is automatically indexed for efficient OID-based
lookups.
• The OID column is the equivalent of having an extra 16-byte primary
key column.
• The object identifier column is a hidden column that Oracle uses to
construct references to the row objects.
▪ OID value should be used only once, that is even the object is
removed from the database its OID should not be assigned to another
object.
▪ OID cannot be modified by the user.
▪ OID s are not dependent on physical location of data.
URLs vs OIDs XP

• OIDs uniquely identify a single object over all time, whereas


web resource pointed at by URL can change over time
• OIDs are simply identifiers & carry no physical information
about the objects they identify-storage location of object can be
changed without changing pointers to object. URLs include
network addresses & file system names, hence resource can’t
be moved.
• OIDs are automatically generated by DBMS for each object.
URLs are user generated.
• Deletion for URLs create problems- ‘404 page not found error’.
OID deletion handled by system (as per referential integrity)
XP

REFs
▪ Built-in data type that stores value of OID.
▪ A REF is a logical “pointer” to a row object which is a system built-
in datatype.
▪ You can use a REF to examine or update the object it refers to.
▪ The REF function in a SQL statement takes as an argument a
correlation name for an object table or view and returns a
reference (a REF) to an object instance from that table or view.
SELECT REF(a)
FROM address_tab a;
REF(A)
000028020928C64A64E9C54BB1BB42DB7BB1431B893E2BEE81A8084FC393AC13EA8856534E02
C01AA20000
0000280209B2B5E415A13F4EF8A3352F686E1C4DCE3E2BEE81A8084FC393AC13EA8856534E02
C01AA20001
REFs XP

▪ In the relational model, foreign keys express many-to-one


relationships.
▪ ORDBMS allows relationships between two entities to be modeled
in the database via the concept of a reference.
▪ Oracle provides REF to encapsulate references to row objects of a
specified object type. From a modelling perspective, REFs provide
the ability to capture an association between two row objects.
Dangling REFs XP

▪ It is possible for the object identified by a REF to become


unavailable through either deletion of the object or a change in
privileges. Such a REF is called dangling.
▪ Oracle SQL provides a predicate (called IS DANGLING) to allow
testing REFs for this condition.
Dereference REFs XP

▪ Accessing the object referred to by a REF is called


dereferencing the REF. Oracle provides the DEREF operator to
do this. Dereferencing a dangling REF results in a null object.
▪ The DEREF function in a SQL statement returns the object
instance corresponding to a REF.

SELECT deref(REF(a))
FROM address_tab a
DEREF(REF(A))(ADDRESS1, ADDRESS2, CITY, PINCODE, STATE)
ADDRESS_TY('Sea Woods Apt', 'Malad', 'Mumbai', 400101, 'Maharashtra')
ADDRESS_TY('river Woods Apt', 'Malad', 'Mumbai', 400101, 'Rajasthan')
XP
Advantages of ORDBMSs

▪ Enriched modelling capabilities


▪ Extensibility
▪ Reuse and Sharing
▪ Support for schema evolution
▪ Applicable for advanced database applications
▪ Improved performance
Disadvantages of ORDBMSs XP

▪ Complexity
▪ Increased costs
▪ Supporters of relational approach believe simplicity
and purity of relational model are lost
▪ Some believe RDBMS is being extended for what will
be a minority of applications
▪ SQL now extremely complex
XP
RDBMS Vs ORDBMS
• In object oriented model, the object has both state and behavior,
while with the relational model only the state is evidenced.
• Relational database is made up of relations, while an object-
oriented database is made up of classes
• In ORDBMS class hierarchy includes inheritance, while RDBMS
includes external keys
• In the relational model there is a primary key that refer the related
data, while object oriented model there is an object identifier
(OID) that refer the inherited data.
• RDBMS is easy to use as it has fewer features
• RDBMS has simple data model which makes it easier to optimize
queries for efficient execution.
• RDBMS is less versatile than an ORDBMS
End of Unit 1
Data is a precious thing and
will last longer
than the systems themselve
- Tim Berners-Lee
Practical (Defining Object Type ) XP

CREATE TYPE<object_type_name> AS CREATE TYPE address_ty AS OBJECT (


OBJECT street VARCHAR2(30),
( city VARCHAR2(20),
<attribute_l><datatype>, state CHAR(2),
. postal_code VARCHAR2(6) );
);
Practical (Creating Table) XP

CREATE TYPE address_ty AS OBJECT ( CREATE TABLE employee (


street VARCHAR2(30), employee_id NUMBER(6),
city VARCHAR2(20), person person_ty;
state CHAR(2),
);
postal_code VARCHAR2(6) );

CREATE TYPE person_ty AS OBJECT(


name VARCHAR2(20)
address address_ty) );

▪ Data cannot be inserted in addess_ty or peson_ty as they


Practical (Inserting Values) XP

• Insert incorporate calls to the constructors for object


types, to create instances of the types.

insert into employee values(101,person_ty('Rita,


address_ty(‘Thakur Marg', 'Mumbai', 'Maharashtra’,
400092));
Practical (Querying) XP

• Select * from employees;


Will display in format
employee person_ty (name, address_ty (street, city , state,
_id postal_code)
101 person_ty('Rita, address_ty(‘Thakur Marg', 'Mumbai',
'Maharashtra’, 400092))
Practical (Querying specific columns) XP

select e.employee_id “Employee Id" ,[Link]


“Employee Name"
from employee e
where [Link]='Mumbai’
Employe Employee
e Id Name
101 Rita
• Without column alias
select e.employee_id,[Link]
from employee e Employee_id [Link]
where [Link]='Mumbai’
101 Rita
Practical (Describe – set depth) XP

• SET DESCRIBE [DEPTH n | ALL]


• Sets the depth of the level to which you can recursively describe
an object.

• Describe employee Name Null? Type


EMPLOYEE_ID NUMBER
PERSON PERSON_TY

• set describe depth 3; Name Null? Type


EMPLOYEE_ID NUMBER
describe employee PERSON PERSON_TY NAME VARCHAR2(25)
ADDRESS ADDRESS_TY STREET VARCHAR2(30)
CITY VARCHAR2(20)
PINCODE NUMBER(8)
STATE VARCHAR2(15)
Practical XP

• To show the details of column_name, data_type for the table


you created.
SELECT COLUMN_NAME, DATA_TYPE FROM
USER_TAB_COLUMNS WHERE TABLE_NAME= ‘employee’;

• To show the details (attr_name,length,attr_type) of the


attributes for the ADT you have created.
SELECT ATTR_NAME, LENGTH, ATTR_TYPE_NAME FROM
USER_TYPE_ATTRS WHERE TYPE_NAME = ‘person_ty'
Sample questions XP

– Distributed & centralized database


– Parallel & distributed databases
– Synchronous and asynchronous replication
– Homogeneous and heterogeneous DDBMS
– Explain architecture of parallel DBMS. With real time example based on it. ---2016-KT, 2015-Rev
– Define speed up and scale up? (2 important issues related to parallel machines)
– Why shared memory and shared disk approaches suffer from the problem of interference?
Discuss speed up and scale up of shared nothing architecture.
– What are advantages and disadvantages of replication and fragmentation?
– RDBMS Vs ORDBMS
– OODBMS Vs ORDBMS
– What are various complex data types available in ORDBMS? Explain with suitable example
– Types of Data partitioning or Data Partitioned parallelism

You might also like