Lab Manual: B+ Tree Implementation
for Database Indexing
Objective
To implement a B+ Tree data structure.
To use the B+ Tree for database indexing.
To analyze its efficiency for search and range queries.
Theory
B+ Tree Overview:
- A balanced tree data structure commonly used in databases and file systems for indexing.
- All values are stored in leaf nodes; internal nodes store keys to guide search.
- Leaf nodes are linked for efficient range queries.
- Ensures O(log n) insert, search, and delete operations.
Use Case: Efficiently store and search records in a database based on key values (e.g.,
employee IDs, product codes).
Algorithm / Pseudocode
1. Start with an empty root node.
2. Insert key into appropriate leaf node.
3. If node overflows, split node and promote middle key to parent.
4. Repeat until tree is balanced.
5. For search, traverse from root to leaf following key comparisons.
6. For range query, traverse linked leaf nodes.
Python Implementation
# Python B+ Tree Implementation (simplified)
class BPlusTreeNode:
def __init__(self, t, leaf=False):
self.t = t # Minimum degree
self.leaf = leaf
self.keys = []
self.children = []
self.next = None # Pointer for leaf nodes
class BPlusTree:
def __init__(self, t):
self.root = BPlusTreeNode(t, leaf=True)
self.t = t
def insert(self, key):
root = self.root
if len(root.keys) == (2 * self.t) - 1:
new_root = BPlusTreeNode(self.t)
new_root.children.append(self.root)
self._split_child(new_root, 0)
self.root = new_root
self._insert_non_full(self.root, key)
def _insert_non_full(self, node, key):
if node.leaf:
node.keys.append(key)
node.keys.sort()
else:
i = len(node.keys) - 1
while i >= 0 and key < node.keys[i]:
i -= 1
i += 1
if len(node.children[i].keys) == (2 * self.t) - 1:
self._split_child(node, i)
if key > node.keys[i]:
i += 1
self._insert_non_full(node.children[i], key)
def _split_child(self, parent, index):
t = self.t
node = parent.children[index]
new_node = BPlusTreeNode(t, leaf=node.leaf)
mid_key = node.keys[t - 1]
new_node.keys = node.keys[t:]
node.keys = node.keys[:t - 1]
if not node.leaf:
new_node.children = node.children[t:]
node.children = node.children[:t]
parent.keys.insert(index, mid_key)
parent.children.insert(index + 1, new_node)
if node.leaf:
new_node.next = node.next
node.next = new_node
def search(self, key, node=None):
if node is None:
node = self.root
if node.leaf:
return key in node.keys
else:
i=0
while i < len(node.keys) and key > node.keys[i]:
i += 1
return self.search(key, node.children[i])
def traverse(self, node=None):
if node is None:
node = self.root
if node.leaf:
print(node.keys, end=' -> ' )
else:
for i in range(len(node.keys)):
self.traverse(node.children[i])
print(node.keys[i], end=' ' )
self.traverse(node.children[-1])
# Example Usage
bpt = BPlusTree(t=3)
data = [10, 20, 5, 6, 12, 30, 7, 17]
for key in data:
bpt.insert(key)
bpt.traverse()
print("\nSearch 12:", bpt.search(12))
print("Search 15:", bpt.search(15))
Sample Use Case
- Database of employees with EmployeeID as key.
- Use B+ Tree to index EmployeeID for fast search and range queries.
- Example Queries:
* Find employee with ID 12 -> bpt.search(12)
* Range query: List employees with IDs 5–20 by traversing leaf nodes.
Result & Conclusion
- B+ Tree provides efficient O(log n) search and insertion.
- Linked leaf nodes enable fast range queries.
- Ideal for database indexing and large datasets.