Algorithm of Data Integration

The document outlines an algorithm for data integration using horizontal and vertical integration techniques on a set of relations with specified attributes. It describes the process of creating Relation Attribute Matrices (RAM) of different orders to identify candidate sets and integration rules, ultimately leading to the identification of horizontal and vertical integrations. The algorithm iteratively combines relations to form new relations while checking for common attributes and the completeness of the target attribute list.

Uploaded by

Subrata Bose

Example of Data Integration

Given relations: A(a, b, u), B(a, b, c, v), C(a, c, d, w), D(c, d, e, x), E(c, d, y), F(c, e, z), G(a, b, c, d, e, m), H(x, y, z).
Target attribute list: {a, b, c, d, e}.

Notations:
• Horizontal Integration on a set of relations: HI{., ., .}

• Vertical Integration on a set of relations and attributes: VI{., ., .}attributes

o Example: HI{{A}, VI{B, C}(a, c)} represents the horizontal integration (set union) of relation A with the relation formed by the vertical integration (join) of relation B with C on their common attribute set (a, c).

• Relation Attribute Matrix of Order 1 (RAM1) is a matrix of relations versus target attributes, where each row represents one relation and each column represents one attribute from the set of target attributes. Each cell is 1 if the attribute is present in the relation and 0 if it is absent.
• Relation Attribute Matrix of Order k (RAMk), k ≥ 2, is similar to RAM1 except that each row is a relation consisting of k individual relations, formed by joining a relation R1 of order (k-1) with another relation R2 of order 1. Each cell is formed by ORing the corresponding cell values in RAM(k-1) and RAM1, which indicates whether the attribute is present in the joined relation. RAMk has three additional columns:
o CA: the set of common attributes, if any, for joining R1 with R2. An attribute is counted as common if its cell value is 1 in both RAM(k-1) and RAM1.
o Super Set (SS): Yes if the attribute set of the joined relation has at least one more attribute than both R1 and R2, No otherwise.
o Status: Full (Super Set = Yes, the joined relation has all the attributes of the target list, and there is at least one common attribute), Part (Super Set = Yes, the joined relation has some but not all of the attributes of the target list, and there is at least one common attribute), or None (neither Part nor Full).

• Candidate Set CSk is the set of possible candidate relations of order k formed by joining a relation R1 of order (k-1) with another relation R2 of order 1.
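
As a minimal sketch of the RAM1 definition above, the following Python builds the order-1 matrix for the example relations. The names `RELATIONS`, `TARGET` and `build_ram1` are illustrative assumptions, not part of the original notes.

```python
# Relation schemas from the example. Attributes outside the target list
# (u, v, w, ...) are kept so the sketch shows that they are simply ignored.
RELATIONS = {
    "A": {"a", "b", "u"},
    "B": {"a", "b", "c", "v"},
    "C": {"a", "c", "d", "w"},
    "D": {"c", "d", "e", "x"},
    "E": {"c", "d", "y"},
    "F": {"c", "e", "z"},
    "G": {"a", "b", "c", "d", "e", "m"},
    "H": {"x", "y", "z"},
}
TARGET = ["a", "b", "c", "d", "e"]


def build_ram1(relations, target):
    """Return {relation name: list of 0/1 flags, one per target attribute}."""
    return {
        name: [1 if attr in attrs else 0 for attr in target]
        for name, attrs in relations.items()
    }


ram1 = build_ram1(RELATIONS, TARGET)
for name, row in ram1.items():
    print(name, *row)
```

Each printed line is one row of the matrix, with columns in the order of `TARGET`.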

1. Create CS1 = {{A}, {B}, {C}, {D}, {E}, {F}, {G}, {H}}. Create the Relation Attribute Matrix of Order 1 (RAM1).

Relation   a  b  c  d  e
A          1  1  0  0  0
B          1  1  1  0  0
C          1  0  1  1  0
D          0  0  1  1  1
E          0  0  1  1  0
F          0  0  1  0  1
G          1  1  1  1  1
H          0  0  0  0  0

Now set HI = ∅, VI = ∅ and reset CS1 = ∅. Examine RAM1 row by row. If all the columns of a row are 1, add the relation to HI. If some but not all columns are 1, add the relation to CS1. Ignore the relation (row) if all columns contain 0. Thus, in this example, HI = {{G}} and CS1 = {{A}, {B}, {C}, {D}, {E}, {F}}.
2. Create CS2 by joining each element of CS1 with a different element of CS1 so that there is no repetition. Thus, in this example,
CS2 = {{A, B}, {A, C}, {A, D}, {A, E}, {A, F}, {B, C}, {B, D}, {B, E}, {B, F}, {C, D}, {C, E}, {C, F}, {D, E}, {D, F}, {E, F}}.
Create RAM2 using CS2 and RAM1.
Relation   a  b  c  d  e   CA     SS    Status
{A, B}     1  1  1  0  0   a, b   No    None
{A, C}     1  1  1  1  0   a      Yes   Part
{A, D}     1  1  1  1  1   -      Yes   None
{A, E}     1  1  1  1  0   -      Yes   None
{A, F}     1  1  1  0  1   -      Yes   None
{B, C}     1  1  1  1  0   a, c   Yes   Part
{B, D}     1  1  1  1  1   c      Yes   Full
{B, E}     1  1  1  1  0   c      Yes   None
{B, F}     1  1  1  0  1   c      Yes   None
{C, D}     1  0  1  1  1   c, d   Yes   Part
{C, E}     1  0  1  1  0   c, d   No    None
{C, F}     1  0  1  1  1   c      Yes   Part
{D, E}     0  0  1  1  1   c, d   No    None
{D, F}     0  0  1  1  1   c, e   No    None
{E, F}     0  0  1  1  1   c      Yes   Part

Examine RAM2 row by row. If a row has status “Full”, add it to HI and remove it from CS2. If a row has status “Part”, add it to VI together with the attribute(s) used for joining. If a row has status “None”, remove it from CS2.

Thus, in this example, HI = {{G}, VI{B, D}c} and VI = {{A, C}a, {B, C}(a, c), {C, D}(c, d), {C, F}c, {E, F}c}.

CS2 = {{A, C}, {B, C}, {C, D}, {C, F}, {E, F}}
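
The derivation of a single RAMk row can be sketched as follows. This is one possible reading of the CA, Super Set and Status definitions in the notation section, not the author's implementation; the function name `join_row` and the row variables are assumptions made for illustration. It is checked here only against four rows of the RAM2 table.

```python
TARGET = ["a", "b", "c", "d", "e"]


def join_row(row1, row2, target=TARGET):
    """Combine two RAM rows (lists of 0/1) into the row of their join.

    Returns (bits, common_attributes, is_superset, status).
    """
    # A target attribute is present in the join if either input has it.
    bits = [x | y for x, y in zip(row1, row2)]
    # Common attributes: cell value 1 in both inputs.
    common = [t for t, x, y in zip(target, row1, row2) if x and y]
    # Super Set: the join has at least one attribute more than each input.
    superset = sum(bits) > sum(row1) and sum(bits) > sum(row2)
    if superset and common:
        status = "Full" if all(bits) else "Part"
    else:
        status = "None"
    return bits, common, superset, status


# RAM1 rows of the example relations used below.
A = [1, 1, 0, 0, 0]
B = [1, 1, 1, 0, 0]
C = [1, 0, 1, 1, 0]
D = [0, 0, 1, 1, 1]

print(join_row(A, C))  # → ([1, 1, 1, 1, 0], ['a'], True, 'Part')
print(join_row(B, D))  # → ([1, 1, 1, 1, 1], ['c'], True, 'Full')
print(join_row(A, B))  # → ([1, 1, 1, 0, 0], ['a', 'b'], False, 'None')
print(join_row(A, D))  # → ([1, 1, 1, 1, 1], [], True, 'None')
```

The same function can be applied to a row of RAM(k-1) and a row of RAM1 to obtain a row of RAMk, since both are plain 0/1 lists over the target attributes.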
3. Create CS3 by joining each element of CS2 with an element of CS1.

CS3 = {{{A, C}, B}, {{A, C}, D}, {{A, C}, E}, {{A, C}, F}, {{B, C}, A}, {{B, C}, D}, {{B, C}, E}, {{B, C}, F}, {{C, D}, A}, {{C, D}, B}, {{C, D}, E}, {{C, D}, F}, {{C, F}, A}, {{C, F}, B}, {{C, F}, D}, {{C, F}, E}, {{E, F}, A}, {{E, F}, B}, {{E, F}, C}, {{E, F}, D}}

Create RAM3 using CS3, RAM1 and RAM2. While creating RAM3, we allow a repetition only if the earlier combination resulted in None status; otherwise we do not try it. For example, since {{A, C}, B} resulted in None, we tried {{B, C}, A}. But as {{A, C}, D} resulted in Full status, there is no reason to try {{C, D}, A}.
Relation      a  b  c  d  e   CA        SS    Status
{{A, C}, B}   1  1  1  1  0   a, b, c   No    None
{{A, C}, D}   1  1  1  1  1   c, d      Yes   Full
{{A, C}, E}   1  1  1  1  0   c, d      No    None
{{A, C}, F}   1  1  1  1  1   c         Yes   Full
{{B, C}, A}   1  1  1  1  0   a, b      No    None
{{B, C}, D}   1  1  1  1  1   c, d      Yes   Full
{{B, C}, E}   1  1  1  1  0   c, d      No    None
{{B, C}, F}   1  1  1  1  1   c         Yes   Full
{{C, D}, A}   1  1  1  1  1   a         Yes   Full (same as ACD)
{{C, D}, B}   1  1  1  1  1   a, c      Yes   Full (same as BCD)
{{C, D}, E}   1  0  1  1  1   c, d      No    None
{{C, D}, F}   1  0  1  1  1   c, e      No    None
{{C, F}, A}   1  1  1  1  1   a         Yes   Full (same as ACF)
{{C, F}, B}   1  1  1  1  1   a, c      Yes   Full (same as BCF)
{{C, F}, D}   1  0  1  1  1   c, d, e   No    None
{{C, F}, E}   1  0  1  1  1   c, d      No    None
{{E, F}, A}   1  1  1  1  1   -         Yes   None
{{E, F}, B}   1  1  1  1  1   c         Yes   Full
{{E, F}, C}   1  0  1  1  1   c, d      Yes   Part (CFE is None)
{{E, F}, D}   0  0  1  1  1   c, d, e   No    None

Examine RAM3 row by row. If a row has status “Full”, add it to HI and remove it from CS3. Initialize VI to ∅. If a row has status “Part”, add it to VI after removing the corresponding lower-order relation from VI. If a row has status “None”, remove it from CS3. Thus HI becomes

HI = {{G}, VI{B, D}c, VI{VI{A, C}a, D}(c, d), VI{VI{A, C}a, F}c, VI{VI{B, C}(a, c), D}(c, d), VI{VI{B, C}(a, c), F}c, VI{VI{E, F}c, B}c}
VI = {{{E, F}c, C}(c, d)}
CS3 = {{{E, F}, C}}

4. Create CS4 by joining each element of CS3 with an element of CS1:
CS4 = {{{{E, F}, C}, A}, {{{E, F}, C}, B}, {{{E, F}, C}, D}}. Create RAM4 from CS4, RAM3 and RAM1.

Relation           a  b  c  d  e   CA        SS    Status
{{{E, F}, C}, A}   1  1  1  1  1   a         Yes   Full
{{{E, F}, C}, B}   1  1  1  1  1   a, c      Yes   Full
{{{E, F}, C}, D}   1  0  1  1  1   c, d, e   No    None

Now the rules from the first two rows are added to HI, and CS4 becomes the null set. Hence the algorithm stops here.

Thus the final rules of all possible integrations are

HI {
  {G},
  VI{B, D}c,
  VI{VI{A, C}a, D}(c, d),
  VI{VI{A, C}a, F}c,
  VI{VI{B, C}(a, c), D}(c, d),
  VI{VI{B, C}(a, c), F}c,
  VI{VI{E, F}c, B}c,
  VI{VI{VI{E, F}c, C}(c, d), A}a,
  VI{VI{VI{E, F}c, C}(c, d), B}(a, c)
}
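
To make the meaning of such rules concrete, the sketch below evaluates the small rule HI{{G}, VI{B, D}c} on toy data. It assumes relations are lists of dictionaries, and it assumes that the union is taken after projecting every operand onto the target attribute list, which the notes do not state explicitly. The helper names `vertical` and `horizontal` and the sample tuples are invented for illustration.

```python
TARGET = ["a", "b", "c", "d", "e"]


def vertical(r1, r2, on):
    """VI: join two relations (lists of dicts) on the attributes in `on`."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[k] == t2[k] for k in on):
                joined.append({**t1, **t2})
    return joined


def horizontal(relations, target=TARGET):
    """HI: set union of relations, each projected onto the target attributes."""
    seen = set()
    result = []
    for rel in relations:
        for t in rel:
            key = tuple(t[attr] for attr in target)
            if key not in seen:  # set semantics: drop duplicate tuples
                seen.add(key)
                result.append(dict(zip(target, key)))
    return result


# Toy instances (made-up values) of G(a, b, c, d, e, m), B(a, b, c, v), D(c, d, e, x).
G = [{"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "m": 0}]
B = [{"a": 7, "b": 8, "c": 9, "v": 0}]
D = [{"c": 9, "d": 1, "e": 2, "x": 0}, {"c": 3, "d": 4, "e": 5, "x": 1}]

bd = vertical(B, D, on=["c"])   # VI{B, D}c
result = horizontal([G, bd])    # HI{{G}, VI{B, D}c}
print(result)
```

Nested rules such as VI{VI{A, C}a, D}(c, d) correspond to nested calls, for example `vertical(vertical(A, C, on=["a"]), D, on=["c", "d"])`.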
Algorithm to find Integration Rules for Query Processing

Integration_Rule

Input: 1) R: set of relations with their attribute lists, 2) Target_List: list of target attributes, 3) p: order of integration desired (optional)
Output: Integration Rules consisting of Horizontal and Vertical Integration

Set HI = ∅. Create the candidate set of order 1, CS1 = ∅.

Step 1: Initialize the Horizontal Integration set
Create the Relation Attribute Matrix of Order 1 (RAM1) of size m x n, where m is the number of relations and n is the number of attributes in the target list. Fill cell (i, j) with 1 if the i-th relation contains attribute j, and with 0 otherwise.
for each row in RAM1
    if all the columns are 1 (the relation has all the attributes of the target list) then
        add the relation to HI
    else if some columns are 1 (the relation has some but not all the attributes of the target list) then
        add the relation to CS1
next
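
Step 1 can be sketched in Python as below, assuming RAM1 is held as a dictionary from relation name to a list of 0/1 flags. The function name `classify_order1` is a hypothetical helper, and the matrix is the one from the worked example.

```python
def classify_order1(ram1):
    """Split RAM1 rows into HI (all ones) and CS1 (some but not all ones)."""
    hi, cs1 = [], []
    for name, row in ram1.items():
        if all(row):
            hi.append(name)    # relation covers the whole target list
        elif any(row):
            cs1.append(name)   # relation covers part of the target list
        # rows containing only zeros are ignored
    return hi, cs1


ram1 = {
    "A": [1, 1, 0, 0, 0],
    "B": [1, 1, 1, 0, 0],
    "C": [1, 0, 1, 1, 0],
    "D": [0, 0, 1, 1, 1],
    "E": [0, 0, 1, 1, 0],
    "F": [0, 0, 1, 0, 1],
    "G": [1, 1, 1, 1, 1],
    "H": [0, 0, 0, 0, 0],
}

hi, cs1 = classify_order1(ram1)
print(hi)   # → ['G']
print(cs1)  # → ['A', 'B', 'C', 'D', 'E', 'F']
```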

Step 2: Iterate to find Horizontal and Vertical Integration

For k = 2 to n - 1
    Create CSk-List by taking the cross product of each element x ∈ CS(k-1)-List with each y ∈ CS1-List, where y does not belong to x. If the xy so formed already belongs to CSk-List, ignore it; otherwise add xy to CSk-List.
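
The candidate-generation step can be sketched as follows. Candidates are represented here as flat tuples of relation names in join order, which is an assumption of this sketch; the `unordered` flag is also an assumption, added because the worked example treats order 2 as unordered pairs but keeps reordered combinations at order 3.

```python
def next_candidates(cs_prev, cs1, unordered=False):
    """Extend every x in CS(k-1) with every y in CS1 that is not already in x.

    With unordered=True, candidates containing the same relations in a
    different order are treated as repetitions and skipped.
    """
    seen = set()
    candidates = []
    for x in cs_prev:
        for y in cs1:
            if y in x:
                continue  # y already belongs to x
            cand = x + (y,)
            key = frozenset(cand) if unordered else cand
            if key not in seen:
                seen.add(key)
                candidates.append(cand)
    return candidates


cs1 = ["A", "B", "C", "D", "E", "F"]

# Order 2: all unordered pairs of CS1 elements.
cs2_all = next_candidates([(r,) for r in cs1], cs1, unordered=True)
print(len(cs2_all))  # → 15

# Order 3: extend the CS2 that survives the RAM2 examination in the example.
cs2 = [("A", "C"), ("B", "C"), ("C", "D"), ("C", "F"), ("E", "F")]
cs3 = next_candidates(cs2, cs1)
print(len(cs3))      # → 20
```

Here the tuple `("A", "C", "B")` stands for the candidate {{A, C}, B}; pruning by Status would happen in a separate step once the corresponding RAMk rows are computed.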

Create all possible pairs of relations from the Relation Matrix by ORing the corresponding numbers. If both numbers are 1, count the attribute as a common attribute (CA). The attribute list so created for a pair is then checked to see whether integrating the relations adds any extra attribute. For example, in the pair table, AC has extra attributes compared to both A and C, so it is said to cover both relations. If the newly formed relation has at least one CA and covers both relations, it is accepted for the next step; otherwise it is not considered further.

In the example, CS2-List = {{A, B}, {A, C}, {A, D}, …

We create the following vertical integration list VI-List = {{A,C}, {B, C}, {C, D}, {C, F}, {E, F}}

EFC, A   1 1 1 1 1 1   Both   Yes
EFC, B   1 1 1 1 1 1   Both   Yes

HI = { VI(VI(B, C), D), VI(VI(B, C), F),
       VI(VI(VI(E, F), C), A), VI(VI(VI(E, F), C), B) }
