Algorithm of Data Integration

The document outlines an algorithm for data integration using horizontal and vertical integration techniques across multiple relations. It describes the creation of Relation Attribute Matrices (RAM) of different orders to evaluate the presence of target attributes in various combinations of relations, ultimately identifying candidate sets for integration. The algorithm iteratively refines these sets to derive integration rules for query processing, culminating in a final set of horizontal and vertical integrations.

Uploaded by

Subrata Bose

Example of Data Integration

Given Relations: A(a, b, u), B(a, b, c, v), C(a, c, d, w), D(c, d, e, x), E(c, d, y), F(c, e, z), G(a, b, c, d, e, m), H(x, y, z).
Target attribute list: {a, b, c, d, e}.

Notations:
• Horizontal Integration on a set of relations: HI{., ., .}

• Vertical Integration on a set of relations over a set of join attributes: VI{., ., .}(attributes)

  ◦ Example: HI{ {A}, VI{B, C}(a, c) } represents the horizontal integration (set union) of relation A with the relation formed by vertical integration (join) of relation B with C on their common attribute set (a, c).

• Relation Attribute Matrix of Order 1 (RAM1) is a matrix of relations versus target attributes in which each row represents one relation and each column represents one attribute from the set of target attributes. Each cell is 1 if the attribute is present in the relation and 0 if it is absent.

• Relation Attribute Matrix of Order k (RAMk), k ≥ 2, is similar to RAM1 except that each row represents a combination of k individual relations, formed by joining a relation R1 of order (k-1) with another relation R2 of order 1. Each cell is formed by ORing the corresponding cell values in RAMk-1 and RAM1, which indicates whether the attribute is present in the joined relation. RAMk has three further columns:
  ◦ CA: the set of common attributes, if any, for joining R1 with R2. An attribute counts as common if the corresponding cell value is 1 in both RAMk-1 and RAM1.
  ◦ SS (Super Set, Yes or No): whether the attribute set of the joined relation has at least one more attribute than R1 and at least one more than R2.
  ◦ Status: Full (Super Set = Yes, at least one common attribute, and all the attributes of the target list present), Part (Super Set = Yes, at least one common attribute, and some but not all attributes of the target list present), or None (neither Part nor Full).

• Candidate Set CSk is the set of possible candidate relations of order k, each formed by joining a relation R1 of order (k-1) with another relation R2 of order 1.
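
The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `SCHEMAS`, `TARGET`, `build_ram1` and `join_row` are hypothetical and do not appear in the original, and each RAM row is stored as a plain list of 0/1 flags. `join_row` applies the CA, SS and Status rules literally as they are worded in the notation list.

```python
# Hypothetical names; the relation schemas are the ones given in the example.
TARGET = ["a", "b", "c", "d", "e"]
SCHEMAS = {
    "A": {"a", "b", "u"},
    "B": {"a", "b", "c", "v"},
    "C": {"a", "c", "d", "w"},
    "D": {"c", "d", "e", "x"},
    "E": {"c", "d", "y"},
    "F": {"c", "e", "z"},
    "G": {"a", "b", "c", "d", "e", "m"},
    "H": {"x", "y", "z"},
}


def build_ram1(schemas, target):
    """RAM1: one row per relation, 1 where the target attribute is present."""
    return {name: [int(a in attrs) for a in target]
            for name, attrs in schemas.items()}


def join_row(bits1, bits2, target):
    """One RAMk row for joining R1 (bits1) with R2 (bits2)."""
    merged = [x | y for x, y in zip(bits1, bits2)]            # cell-wise OR
    common = [a for a, x, y in zip(target, bits1, bits2) if x and y]  # CA
    # Super Set: strictly more target attributes than either input
    superset = sum(merged) > max(sum(bits1), sum(bits2))
    if common and superset:
        status = "Full" if all(merged) else "Part"
    else:
        status = "None"
    return merged, common, superset, status


ram1 = build_ram1(SCHEMAS, TARGET)
print(join_row(ram1["A"], ram1["C"], TARGET))
print(join_row(ram1["B"], ram1["D"], TARGET))
```

Keeping the rows as flag lists mirrors the matrix layout of the text; a set-based representation would work equally well.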

1. Create CS1 = { {A}, {B}, {C}, {D}, {E}, {F}, {G}, {H} }. Create the Relation Attribute Matrix of Order 1 (RAM1).

                Attributes
Relation   a  b  c  d  e
A          1  1  0  0  0
B          1  1  1  0  0
C          1  0  1  1  0
D          0  0  1  1  1
E          0  0  1  1  0
F          0  0  1  0  1
G          1  1  1  1  1
H          0  0  0  0  0

Set HI = ∅, VI = ∅ and CS1 = ∅, then examine RAM1 row by row. If all the columns of a row are 1, add the relation to HI. If some but not all columns are 1, add the relation to CS1. Ignore the relation (row) if all columns contain 0. Thus, in this example, HI = { {G} } and CS1 = { {A}, {B}, {C}, {D}, {E}, {F} }.
2. Create CS2 by joining each element of CS1 with a different element of CS1 so that there is no repetition. Thus, in this example,
CS2 = { {A, B}, {A, C}, {A, D}, {A, E}, {A, F}, {B, C}, {B, D}, {B, E}, {B, F}, {C, D}, {C, E}, {C, F}, {D, E}, {D, F}, {E, F} }.
Create RAM2 using CS2 and RAM1.

Relation   a  b  c  d  e   CA     SS    Status
{A, B}     1  1  1  0  0   a, b   No    None
{A, C}     1  1  1  1  0   a      Yes   Part
{A, D}     1  1  1  1  1   -      Yes   None
{A, E}     1  1  1  1  0   -      Yes   None
{A, F}     1  1  1  0  1   -      Yes   None
{B, C}     1  1  1  1  0   a, c   Yes   Part
{B, D}     1  1  1  1  1   c      Yes   Full
{B, E}     1  1  1  1  0   c      Yes   None
{B, F}     1  1  1  0  1   c      Yes   None
{C, D}     1  0  1  1  1   c, d   Yes   Part
{C, E}     1  0  1  1  0   c, d   No    None
{C, F}     1  0  1  1  1   c      Yes   Part
{D, E}     0  0  1  1  1   c, d   No    None
{D, F}     0  0  1  1  1   c, e   No    None
{E, F}     0  0  1  1  1   c      Yes   Part

Examine RAM2 row by row. If a row has status "Full", add it to HI and remove it from CS2. If a row has status "Part", add it to VI together with the attribute(s) used for joining. If a row has status "None", remove it from CS2.

Thus, in this example,
HI = { {G}, VI{B, D}(c) }
VI = { {A, C}(a), {B, C}(a, c), {C, D}(c, d), {C, F}(c), {E, F}(c) }
CS2 = { {A, C}, {B, C}, {C, D}, {C, F}, {E, F} }
3. Create CS3 by joining each element of CS2 with an element of CS1.

CS3 = { {{A, C}, B}, {{A, C}, D}, {{A, C}, E}, {{A, C}, F},
        {{B, C}, A}, {{B, C}, D}, {{B, C}, E}, {{B, C}, F},
        {{C, D}, A}, {{C, D}, B}, {{C, D}, E}, {{C, D}, F},
        {{C, F}, A}, {{C, F}, B}, {{C, F}, D}, {{C, F}, E},
        {{E, F}, A}, {{E, F}, B}, {{E, F}, C}, {{E, F}, D} }

Create RAM3 using CS3, RAM1 and RAM2. While creating RAM3, a repeated combination of relations is tried only if the earlier combination resulted in None status; otherwise no repetition is tried. For example, since {{A, C}, B} resulted in None, {{B, C}, A} is tried as well. But as {{A, C}, D} resulted in Full status, there is no reason to try {{C, D}, A}.
Relation      a  b  c  d  e   CA        SS    Status
{{A, C}, B}   1  1  1  1  0   a, b, c   No    None
{{A, C}, D}   1  1  1  1  1   c, d      Yes   Full
{{A, C}, E}   1  1  1  1  0   c, d      No    None
{{A, C}, F}   1  1  1  1  1   c         Yes   Full
{{B, C}, A}   1  1  1  1  0   a, b      No    None
{{B, C}, D}   1  1  1  1  1   c, d      Yes   Full
{{B, C}, E}   1  1  1  1  0   c, d      No    None
{{B, C}, F}   1  1  1  1  1   c         Yes   Full
{{C, D}, A}   1  1  1  1  1   a         Yes   Full (same as ACD)
{{C, D}, B}   1  1  1  1  1   a, c      Yes   Full (same as BCD)
{{C, D}, E}   1  0  1  1  1   c, d      No    None
{{C, D}, F}   1  0  1  1  1   c, e      No    None
{{C, F}, A}   1  1  1  1  1   a         Yes   Full (same as ACF)
{{C, F}, B}   1  1  1  1  1   a, c      Yes   Full (same as BCF)
{{C, F}, D}   1  0  1  1  1   c, d, e   No    None
{{C, F}, E}   1  0  1  1  1   c, d      No    None
{{E, F}, A}   1  1  1  1  1   -         Yes   None
{{E, F}, B}   1  1  1  1  1   c         Yes   Full
{{E, F}, C}   1  0  1  1  1   c, d      Yes   Part (CFE is None)
{{E, F}, D}   0  0  1  1  1   c, d, e   No    None

Examine RAM3 row by row. If a row has status "Full", add it to HI and remove it from CS3. Initialize VI to ∅. If a row has status "Part", add it to VI in place of the lower-order relation it extends. If a row has status "None", remove it from CS3. Thus HI becomes
HI = { {G}, VI{B, D}(c), VI{VI{A, C}(a), D}(c, d), VI{VI{A, C}(a), F}(c), VI{VI{B, C}(a, c), D}(c, d), VI{VI{B, C}(a, c), F}(c), VI{VI{E, F}(c), B}(c) }
VI = { VI{VI{E, F}(c), C}(c, d) }
CS3 = { {{E, F}, C} }

4. Create CS4 by joining each element of CS3 with an element of CS1.

CS4 = { {{{E, F}, C}, A}, {{{E, F}, C}, B}, {{{E, F}, C}, D} }

Create RAM4 from CS4, RAM3 and RAM1.

Relation           a  b  c  d  e   CA        SS    Status
{{{E, F}, C}, A}   1  1  1  1  1   a         Yes   Full
{{{E, F}, C}, B}   1  1  1  1  1   a, c      Yes   Full
{{{E, F}, C}, D}   1  0  1  1  1   c, d, e   No    None

The first two rows are Full, so their rules are added to HI; the third row is None, so CS4 becomes the empty set. Hence the algorithm stops here.

Thus the final rules of all possible integrations are

HI {
     {G},
     VI{B, D}(c),
     VI{VI{A, C}(a), D}(c, d),
     VI{VI{A, C}(a), F}(c),
     VI{VI{B, C}(a, c), D}(c, d),
     VI{VI{B, C}(a, c), F}(c),
     VI{VI{E, F}(c), B}(c),
     VI{VI{VI{E, F}(c), C}(c, d), A}(a),
     VI{VI{VI{E, F}(c), C}(c, d), B}(a, c)
}
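
To make the meaning of such a rule concrete, the sketch below evaluates HI{ {G}, VI{B, D}(c) } on a handful of tuples. It assumes, as stated in the notation, that HI is a set union and VI is a join on the listed attributes; the extra assumption made here is that each operand is first projected onto the target attributes. The function names and the sample tuples are invented purely for illustration and are not data from the original.

```python
TARGET = ["a", "b", "c", "d", "e"]


def vertical(left, right, on):
    """VI: join two lists of row dicts on the attributes in `on`."""
    joined = []
    for l_row in left:
        for r_row in right:
            if all(l_row[a] == r_row[a] for a in on):
                joined.append({**r_row, **l_row})  # merge matching rows
    return joined


def horizontal(parts, target):
    """HI: union of all parts after projecting onto the target attributes."""
    result = set()
    for rows in parts:
        result |= {tuple(row[a] for a in target) for row in rows}
    return result


# Made-up sample tuples for relations G, B and D of the example schema.
G = [{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "m": 0}]
B = [{"a": 7, "b": 8, "c": 9, "v": 0}]
D = [{"c": 9, "d": 10, "e": 11, "x": 0},
     {"c": 3, "d": 4, "e": 5, "x": 1}]

# HI{ {G}, VI{B, D}(c) }
answer = horizontal([G, vertical(B, D, on=["c"])], TARGET)
print(sorted(answer))  # [(1, 2, 3, 4, 5), (7, 8, 9, 10, 11)]
```

The nested rules of higher order can be evaluated the same way by passing the result of one `vertical` call as the left operand of the next.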

Algorithm to find Integration Rules for Query Processing

Input:  1) Number of relations (m); relation names [rel_name(1), rel_name(2), …, rel_name(m)]
        2) Number of attributes in the target list (n); attribute names [attr(1), attr(2), …, attr(n)]
        3) Maximum order of integration desired (p) // optional
        4) Relation Attribute Matrix of Order 1 (RAM1) of size m x n

Output: Integration rules consisting of the possible Horizontal and Vertical Integrations

Step 1: Initialization
1. Initialize the set of Horizontal Integration HI = ∅.
2. Create the candidate set of order 1, CS1 = ∅. // set of sets of size 1, i.e. singleton elements
3. For each row r in RAM1
      c1 := number of columns of row r containing 1
      If c1 = n then
         add the relation {rel_name(r)} to HI // the relation has all the attributes of the target list: case of HI
      ElseIf c1 > 0 then
         add the relation {rel_name(r)} to CS1 // the relation has some but not all the attributes of the target list
      EndIf
   Next
4. If CS1 is empty or has a single relation then output HI and exit // no new relation to add
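
Step 1 translates almost line for line into Python. The snippet is a minimal sketch, assuming RAM1 is held as a dict from relation name to a list of 0/1 flags; the name `classify_order1` is hypothetical.

```python
# RAM1 of the worked example, one 0/1 flag per target attribute a..e.
RAM1 = {
    "A": [1, 1, 0, 0, 0],
    "B": [1, 1, 1, 0, 0],
    "C": [1, 0, 1, 1, 0],
    "D": [0, 0, 1, 1, 1],
    "E": [0, 0, 1, 1, 0],
    "F": [0, 0, 1, 0, 1],
    "G": [1, 1, 1, 1, 1],
    "H": [0, 0, 0, 0, 0],
}


def classify_order1(ram1, n):
    """Split the relations into HI (all n attributes) and CS1 (some of them)."""
    hi, cs1 = [], []
    for name, bits in ram1.items():
        ones = sum(bits)          # c1 in the pseudocode
        if ones == n:
            hi.append(name)       # complete on its own: horizontal integration
        elif ones > 0:
            cs1.append(name)      # partial: candidate for vertical integration
        # rows with no target attribute (here H) are ignored
    return hi, cs1


hi, cs1 = classify_order1(RAM1, n=5)
print(hi, cs1)  # ['G'] ['A', 'B', 'C', 'D', 'E', 'F']
```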

Step 2: Iterate to find Horizontal and Vertical Integrations

1. Set k = 2.
2. Loop
   a. Initialize the set of Vertical Integration VI = ∅.
   b. Create CSk by taking the cross product of CSk-1 with CS1 such that no individual relation is repeated within an element of CSk.
      // CSk is the candidate set of order k, i.e. each element of CSk contains k relations.
      // For row r, R1 denotes the element taken from CSk-1 and R2 the element taken from CS1.
   c. Create RAMk of size |CSk| x (n+1). // column n+1 holds the Status
   d. For each row r in RAMk
         If the relations in CSk(r) have already appeared together in one of the rows CSk(1) … CSk(r-1) then
            // Check the Status of that earlier row: if it is Full or Part ignore this row, if None proceed
            If the Status of that earlier row <> None then skip to the next row
         EndIf
         attribute_list := ""; c1 := 0; c2 := 0; c3 := 0
         For each column c in (1 … n)
            RAMk(r, c) := RAMk-1(R1, c) OR RAM1(R2, c) // logical OR
            If RAMk-1(R1, c) = 1 and RAM1(R2, c) = 1 then
               attribute_list := attribute_list + attr(c) // common attribute
            If RAM1(R2, c) = 1 then
               c1 := c1 + 1 // c1: number of columns having 1 in RAM1
            If RAMk-1(R1, c) = 1 then
               c2 := c2 + 1 // c2: number of columns having 1 in RAMk-1
            If RAMk(r, c) = 1 then
               c3 := c3 + 1 // c3: number of columns having 1 in RAMk
         Next
         If attribute_list ≠ "" and (c3 > c1 and c3 > c2) then
            // The new relation has common attribute(s), i.e. it can be joined, and it generates a super set
            If c3 = n then
               RAMk(r, n+1) := Full // update the status of the relation to Full
               add the relation CSk(r) to HI in the form VI{R1, R2}(attribute_list)
               // full set of attributes obtained by vertical integration of R1 and R2 on the common attributes
            Else
               RAMk(r, n+1) := Part // update the status of the relation to Part
               add the relation CSk(r) to VI in the form VI{R1, R2}(attribute_list)
            EndIf
         Else
            RAMk(r, n+1) := None // update the status of the relation to None
         EndIf
      Next
   e. Delete all the rows from CSk whose corresponding entry in RAMk is either Full or None.
   f. If CSk is empty or k = p then report HI and exit.
   g. Set k = k + 1.
   End Loop
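
The two steps can be combined into one short Python function. This is a sketch under stated assumptions rather than a definitive implementation: the function name `integration_rules`, the dict-based RAM1 and the textual rule format are all choices made here, and the loop is assumed to stop once no Part candidates remain or the optional maximum order is reached. The Status test is applied literally, so {B, E} and {B, F} come out as Part here, whereas the RAM2 table of the worked example lists them as None; the returned rule list therefore need not match the worked example rule for rule.

```python
def integration_rules(ram1, attrs, max_order=None):
    """Return the HI rule list; ram1 maps relation name -> 0/1 flags per attribute."""
    n = len(attrs)
    hi, cs1 = [], {}
    for name, bits in ram1.items():            # Step 1
        if sum(bits) == n:
            hi.append(name)
        elif sum(bits) > 0:
            cs1[name] = tuple(bits)

    # A candidate is (rule text, set of member relations, attribute flags).
    prev = [(name, frozenset([name]), bits) for name, bits in cs1.items()]
    k = 2
    while prev and len(cs1) > 1 and (max_order is None or k <= max_order):
        seen = {}                              # member set -> status at order k
        part = []                              # Part rows feed order k + 1
        for rule1, members1, bits1 in prev:
            for name2, bits2 in cs1.items():
                if name2 in members1:
                    continue                   # no relation repeated in a candidate
                members = members1 | {name2}
                if seen.get(members) in ("Full", "Part"):
                    continue                   # retry only after an earlier None
                merged = tuple(x | y for x, y in zip(bits1, bits2))
                common = [a for a, x, y in zip(attrs, bits1, bits2) if x and y]
                superset = sum(merged) > max(sum(bits1), sum(bits2))
                if common and superset:
                    status = "Full" if sum(merged) == n else "Part"
                else:
                    status = "None"
                seen[members] = status
                rule = f"VI{{{rule1}, {name2}}}({', '.join(common)})"
                if status == "Full":
                    hi.append(rule)
                elif status == "Part":
                    part.append((rule, members, merged))
        prev = part
        k += 1
    return hi


ATTRS = ["a", "b", "c", "d", "e"]
RAM1 = {"A": [1, 1, 0, 0, 0], "B": [1, 1, 1, 0, 0], "C": [1, 0, 1, 1, 0],
        "D": [0, 0, 1, 1, 1], "E": [0, 0, 1, 1, 0], "F": [0, 0, 1, 0, 1],
        "G": [1, 1, 1, 1, 1], "H": [0, 0, 0, 0, 0]}

for rule in integration_rules(RAM1, ATTRS):
    print(rule)
```

Tracking each candidate by its set of member relations is one way to implement the "already appeared" check of step d; the pseudocode does not prescribe a particular data structure.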
