0% found this document useful (0 votes)
62 views13 pages

13 - Data Structures - B Tree B+ Tree

The document provides an overview of general trees and B-trees, explaining their structures, properties, and operations such as insertion, deletion, and searching. A general tree consists of nodes with a unique root and can have multiple children, while a B-tree is a self-balancing tree optimized for large data blocks, allowing nodes to have multiple keys and children. The document also details the transformation of general trees into binary trees and the specific operations involved in managing B-trees.

Uploaded by

Kamal joshi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
62 views13 pages

13 - Data Structures - B Tree B+ Tree

The document provides an overview of general trees and B-trees, explaining their structures, properties, and operations such as insertion, deletion, and searching. A general tree consists of nodes with a unique root and can have multiple children, while a B-tree is a self-balancing tree optimized for large data blocks, allowing nodes to have multiple keys and children. The document also details the transformation of general trees into binary trees and the specific operations involved in managing B-trees.

Uploaded by

Kamal joshi
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 13

GENERAL TREE

A tree is a finite set of one or more nodes such as there is a unique node called root and the
remaining nodes are partitioned into n>=0 disjoint sets T1, T2, …………Tn. The sets T1, T2,
…………Tn are subtrees of the root. The nodes at the equal level having the same parent are
said to be siblings.
A

B C D

E F G H I

J K

The difference between a binary tree and a general tree is that a binary tree can be empty but
the general tree should have at least one node. Moreover, nodes of the general tree may have
more than two children called siblings.

Linked Representation of General Tree


In a general tree, the left pointer keeps the address of the first child, or the leftmost child of that
node and the right pointer keeps the address of the next sibling of the node.
A

B C D

E F

B C D

E F
Linked Representation of a Tree

A forest is a set of n>=0 disjoint trees which have no common node. In other words, a forest
is a disjoint union of trees. Figure shows forest consisting of three trees T1, T2 and T3.

P A U

Q R B C D V W

E F X

T1 T2 T3
Forest F
Transforming General Tree into Binary Tree
A general tree can be converted into a binary tree as follows:
• The root of the general tree is taken as the root of a binary tree.
• The first child of a node in general tree becomes the left child of the node.
• The next sibling of the node becomes the right child of that node in a binary tree.
Consider the following general tree:
A

B C D

E F G H I J
General Tree
The binary tree has a root A. Node B becomes the left child of A. The next sibling of node B
is node C, therefore it becomes the right child of node B. The next sibling of node C is node D,
therefore it becomes the right child of node C. Similarly, node E becomes the left child of node
B. The next sibling of node E is node F, therefore it becomes the right child of node E.
Similarly, the other nodes are connected.
A

E C

F H D

G I

Binary Tree

B-TREE
A B-tree is a self-balancing tree that stores the data in a sorted form and permits various
operations to perform such as searches, sequential access, insertions and deletions in
logarithmic time. The B-tree is a generalization of a binary search tree in which a node can
have more than two children. Unlike self-balancing binary search trees, the B-tree is optimized
for systems that read and write large blocks of data. B-tree is a good example of a data structure
used for storage in auxiliary memory, therefore, is commonly used in databases and file
systems.
In a binary search tree and AVL Tree, every node can have only one value and maximum of
two children but in B-Tree a node can store more than one value (key) and it can have more
than two children. B-Tree is also known as height balanced m-way Search Tree.

B-Tree of order n has the following properties:


• All nodes except the root node must have at least [n/2] keys and maximum of n-1 keys.
• If the root node is a non-leaf node, then it must have at least 2 children.
• All non-leaf nodes except the root node must have at least n/2 children.
• A non-leaf node with m-1 keys must have m number of children.
• All the key values must be in ascending order.
• All the leaf nodes must be at the same level.

Consider a B-tree of order 5:


60

35, 46 86,100

5,7 40, 42 50, 55,58 66,70,75 90, 96 120,136,150


B Tree of Order 5

In a B-tree of order n=5, figure 7.35 following observations are made:


• All the nodes except the root node must have at least [5/2] keys i.e., minimum 2 keys and
maximum of 5-1 i.e., 4 keys.
• The root node is a non-leaf node, therefore, it has at least 2 children.
• All non-leaf nodes except the root node have at least [5/2] children, i.e., 2 children.
• A non-leaf node with m-1 keys has m number of children. Here, the root with key 60 has
two children. Similarly, a non-leaf node with two keys (35, 46) has three children (5 7),
(40, 42) and (50, 55, 58) and a non-leaf node with two keys (86, 100) has three children
(66, 70, 75), (90, 96) and (120, 136, 150).
• All the key values are in ascending order.
• All the leaf nodes are at the same level.

Insertion in a B-Tree
In a B tree, the new key value must be inserted only at the leaf node. The insertion operation is
performed as follows:
a) Check whether the tree is empty or not.
b) If the tree is empty, then create a new node with a new key value and insert it into the tree
as a root node.
c) If the tree is non-empty, then find a leaf node to which the new key value can be inserted
as with the case of insertion in a binary search tree.
d) If that leaf node has an empty position, then add the new key value to that leaf node by
maintaining ascending order of the key value within the node.
e) If that leaf node is already full, then split that leaf node by moving the middle key to its
parent node. Repeat the same process until the key value is fixed into a node.
f) If splitting occurs to the root node, then the middle value becomes the new root node for
the tree and the height of the tree is increased by one.
Let us create a B-tree of order 5 with the following key values:
50, 30, 40, 65, 80, 100, 45, 70, 85, 120, 10, 35, 55, 90, 95, 130, 150, 160

1. Insert 50 50

2. Insert 30 30 50

3. Insert 40 30 40 50

4. Insert 65 30 40 50 65

5. Insert 80
B-tree of order 5 can have at the most 4 keys. If the fifth key 80 is inserted then find the middle
key amongst 30, 40, 50, 65, 80 which comes out to be 50. Key 50 will become the root with
two children (30, 40) and (65, 80)
50

30 40 65 80

6. Insert 100, 45, 70


50

30 40 45 65 70 80 100

7. Insert 85
Node (65, 70, 80, 100) is already full. Take middle element 80, move it to the root and split the
node into two nodes (65, 70) and (85, 100)
50 80

30 40 45 65 70 85 100

8. Insert 120, 10
50 80

10 30 40 45 65 70 85 100 120

9. Insert 35
Node (10, 30, 40, 45) is already full. Take middle element 35, move it to the root and split the
node into two nodes (10, 30) and (40, 45)

35 50 80

10 30 40 45 65 70 85 100 120
10. Insert 55, 90
35 50 80

10 30 40 45 55 65 70 85 90 100 120

11. Insert 95
Node (85, 90, 100, 120) is already full. Take the middle element 95, move it to the root and
split the node into two nodes (85, 90) and (100, 120)
35 50 80 95

10 30 40 45 55 65 70 85 90 100 120

12. Insert 130, 150


35 50 80 95

10 30 40 45 55 65 70 85 90 100 120 130 150

13. Insert 160


Node (100, 120, 130, 150) is already full. Take the middle element 130, move it to the root and
split the node into two nodes (100, 120) and (150, 160). But the root is already full, therefore,
split it into two nodes (35, 50) and (95, 130) then move the middle key 80 to the root. Thus,
the level of the tree is incremented by one.
80

35 50 95 130

10 30 40 45 55 60 70 85 90 100 120 150 160

Search in a B-Tree
The search operation in a B-tree is similar to that of a Binary Search Tree. In a Binary search
tree, the search process starts by accessing the root node first and then moving either the left
subtree or the right subtree. In B-tree also the search process starts from the root node but every
time we make n-way decision where n is the total number of children. In a B-tree, the search
operation is performed with O(log n) time complexity. The search operation is performed as
follows:
i. Read the item to be searched from the user.
ii. Compare the item with the first key value of the root node in the tree.
iii. If both match, then the item is found then terminate the function.
iv. If both are not matching then check whether the item to be searched is smaller or larger
than that key value.
v. If the item to be searched is smaller than the key value then continue the search process in
the left subtree.
vi. If the item to be searched is larger than the key value then compare it with the next key
value in the same node and repeat the steps iii to vi until the exact match is found with last
key value in a leaf node.
vii. If the last key value in a leaf node also does not match then the item is not found, terminate
the function.

Deletion in a B-Tree
Deletion operation requires the traversal into the B tree. If the location of the key (item) to be
deleted is found, any of the three cases may arise:
Case 1: If the key k to be deleted is in the leaf node x and the node x has more than the minimum
number of keys, in that case the key k can be easily deleted.

Case 2: If the key k is in the internal node x, then there are three cases:
Case 2(a):
If the key y that precedes k in the node x has at least t keys (more than the minimum), then find
the predecessor key pk in the subtree rooted at y. Delete pk and replace k with pk in x

Case 2(b):
Symmetrically, if the child z that follows k in the node x has at least t keys, find the successor
k' and delete and replace as before. Note that finding k' and deleting it can be performed in a
single downward pass.

Case 2(c):
Otherwise, if both y and z have only t−1 (minimum number) keys, merge k and all of z into y,
so that both k and the pointer to z are removed from x. y now contains 2t − 1 keys, and
subsequently k is deleted.

Case 3:
If key k is not present in an internal node x, determine the root of the appropriate subtree that
must contain k. If the root has only t − 1 keys, execute either of the following two cases to
ensure that we descend to a node containing at least t keys.

Case 3(a):
If the root has only t−1 keys but has a sibling with t keys, give the root an extra key by moving
a key from x to the root, moving a key from the roots immediate left or right sibling up into x,
and moving the appropriate child from the sibling to x.
Case 3(b)
If the root and all of its siblings have t−1 keys, merge the root with one sibling. This involves
moving a key down from x into the new merged node to become the median key for that
node.

Consider the B tree


80

35 50 95 130

10 30 40 45 55 65 70 85 90 100 120 150 160 170 175


B Tree

Delete Key 65 from B tree:


Since key 65 is in a leaf and the leaf has more than the minimum number of keys, we move the
key 55 where the key 65 had been.
80

35 50 95 130

10 30 40 45 55 70 85 90 100 120 150 160 170 175

Delete Key 130:


Since 130 is in non-leaf node, find its successor (the next item in ascending order), which
happens to be 150, and move the key 150 up to replace the key 130. Actually, the key 150 is
deleted in the process, since the leaf node containing 150 has extra keys, there is no problem.
80

35 50 95 150

10 30 40 45 55 70 85 90 100 120 160 170 175

Delete Key 120:


Though the key 120 is in a leaf node, this leaf does not have an extra key; the deletion results
in a node with only one key. But it should have at least two keys for a B tree of order 5. If the
sibling node to the immediate left or right has an extra key, it is borrowed from the parent and
a key moves up from the sibling. In this case, the sibling to the right has an extra key. So, the
successor key 150 of the key 120 (the last key in the node where the deletion occurred), is
moved down from the parent, and the key 160 is moved up.
80

35 50 95 160

10 30 40 45 55 70 85 90 100 150 170 175

Delete Key 40:


Step 1: Although key 40 is in a leaf node, the leaf has no extra keys, nor do the siblings to the
immediate right or left. In such a case the leaf has to be combined with one of these two siblings.
The parent’s key that was between those of these two leaves will move down. The key 35 will
move down. Then the leaf containing the keys 10 and 30 with the leaf containing 45 will be
combined.

80

50 95 160

10 30 35 45 55 70 85 90 100 150 170 175

Step 2: As the parent node now contains only one key 50. This is not acceptable. If this node
had a sibling to its immediate left or right that had a spare key, then we would have again
borrowed a key. As we have no way to borrow a key from a sibling, we must again combine
with the sibling, and move down the key 80 from the parent. In this result in shrinking down
the B tree in height by one.

50 80 95 160

10 30 35 45 55 70 85 90 100 150 170 175


B+ TREE

A B+ tree is a self-balancing tree in which every path from the root to the leaf node is of the
same length with each non-leaf node has between n/2 and n-1 children, where n is the
degree/order of the tree. A B+ tree has two types of nodes. One is the index and the other is
data. Index nodes are internal nodes, and the data nodes are external nodes (leaf nodes). Internal
nodes hold the key and the pointers to the internal nodes, the external nodes hold the keys and
pointers to the data.

A B+-tree of order m maintains the following properties:


• All leaves are at the same level.
• If a B+ tree consists of a single node, then the node is both a root and a leaf.
• Each leaf node (unless it’s a root) must contain pairs of <key, pointer> between floor (m/2)
and floor (m).
• Each internal node other than the root has between floor ((m+1)/2) and floor (m+1)
children, where m≥2.
• Equivalently, each internal node other than the root contains between floor (m/2) and floor
(m) search keys: x1<x2< ...<xk where m/2 ≤ k ≤ m. The internal node’s ith child has keys in
the range [xi−1, xi) for i= 1, 2, ..., k+1 (where x0 and xk+1 are arbitrarily small and large
values, respectively).
• The root node contains between 1 and m search keys (or between 1 and m <key, pointer>
pairs if the root is a leaf).
• Each leaf node contains pointers to its adjacent siblings.

The leaves of the B+ tree are generally linked to one another in a linked list which makes
traversing more efficient as sequential traversal is possible in the leaf nodes which is not the
case with B tree which supports only the non-sequential traversing.

Insertion in a B+ Tree
Search the location where the key can be stored.
1. If the node has an empty space, insert the key/reference pair into the node.
2. If the node is already full, split it into two nodes, distributing the keys between the two
nodes. If the node is a leaf, find the median of the keys of that node and repeat the insertion
algorithm to insert it into the parent node. If the node is a non-leaf, exclude the middle
value during the split and repeat this insertion algorithm to insert this excluded value into
the parent node.

Let us create a B+ tree of order 3 with the following keys:


15, 10, 25, 20, 13, 30, 28
Insert 15:
Key 15 is the first element to be inserted into the leaf node.
Insert 10:
As two keys at the most can be inserted in a node in a B+ tree of order 3 and the node having
key 15 is half full, insert 10.

Insert 25:
A node cannot accommodate more than two keys in a B+ tree of order 3; therefore, create
another node to insert it.

Find the median of the keys 10, 15 and 25 which comes out to be 15. Take the key 15 to the
root node.

Insert 20:
Key 20 can be inserted in a node before 25 as the node consisting of 25 is half full.
Insert 13:
Key 13 cannot be accommodated in a node consisting of the keys 10 and 15 as node is full,
break the node into two parts, insert 13.

Take the median of the keys 10, 13, 15 which comes out to be 13. Take the key 13 to the root.

Insert 30:
As key 30 cannot be accommodated in a node consisting of the keys 20 and 25, break it into
two nodes. Insert 30. Find the median of the keys 20, 25 and 30 which comes out to be 25.
Take 25 to the parent of the nodes (20, 25) and 30.

Since the node in B+ tree of order 3 cannot have more than two keys, clubbing the keys 13, 15
and 25 will violate the rule. Therefore, find the median of 13, 15 and 25 which is 15, take it in
a root. It will cause an increase in the height of the tree by 1.
Insert 28:
The node consisting of the key 30 is half full, therefore, space is available. Insert 28.

B+ Tree

Deletion in a B+ Tree
1. Search the leaf containing (key, pointer) entry to delete.
2. Remove the entry from the leaf. If the leaf satisfies the “half full” criterion, then stop.
3. Otherwise, the leaf has few data entries. Then do the following:
i. If the right sibling of the leaf can spare an entry, then move the smallest entry in the
right sibling to the leaf.
ii. Else, if the left sibling of the leaf can spare an entry then move the largest entry in the
left sibling to the leaf.
iii. Else, merge L and a sibling. If merging, then recursively delete the entry (pointing to L
or sibling) from the parent. Merge could propagate to the root, decreasing the height of
a tree.

Delete 30:
Since the node consisting of the keys 28 and 30 is full and deleting 30 will lead to the node to
be half full which is acceptable as the node in B+ tree of order 3 should have at least one key,
therefore simply delete 30.
Delete 15:

Deleting 15 from the leaf node causes the node to underflow as it becomes less than half full.
There are two ways to overcome this situation:
i. Remove the empty node from the tree.
ii. The left sibling of the empty node is full; therefore half of the key values can be given to
the empty node.

You might also like