0
0
DbmsConceptBeginner · 3 min read

What is B+ Tree Index: Explanation, Example, and Use Cases

A B+ tree index is a type of data structure used in databases to organize and speed up data retrieval. It stores keys in a sorted order with all actual data pointers in the leaf nodes, allowing efficient range queries and fast lookups.
⚙️

How It Works

A B+ tree index works like a well-organized library catalog. Imagine a big book collection where you want to find a book quickly. Instead of searching every book, you use a sorted index that points you to the exact shelf and position.

In a B+ tree, internal nodes hold keys that guide the search path, but only the leaf nodes contain the actual data pointers. This means all data is stored at the bottom level, linked in order, making it easy to scan ranges of data efficiently.

When you search for a value, you start at the root and follow the keys down the tree until you reach the leaf node that contains the data or pointer you need. This structure keeps the tree balanced, so the search time stays fast even as data grows.

💻

Example

This example shows a simple B+ tree structure with keys and pointers. The leaf nodes contain the actual data references, and internal nodes guide the search.

python
class BPlusTreeNode:
    def __init__(self, leaf=False):
        self.leaf = leaf
        self.keys = []
        self.children = []

class BPlusTree:
    def __init__(self, order=3):
        self.root = BPlusTreeNode(leaf=True)
        self.order = order

    def insert(self, key):
        root = self.root
        if len(root.keys) == (self.order - 1):
            new_root = BPlusTreeNode()
            new_root.children.append(root)
            self._split_child(new_root, 0)
            self.root = new_root
        self._insert_non_full(self.root, key)

    def _split_child(self, parent, index):
        order = self.order
        node = parent.children[index]
        new_node = BPlusTreeNode(leaf=node.leaf)
        mid = order // 2
        parent.keys.insert(index, node.keys[mid])
        parent.children.insert(index + 1, new_node)
        new_node.keys = node.keys[mid + 1:]
        node.keys = node.keys[:mid]
        if not node.leaf:
            new_node.children = node.children[mid + 1:]
            node.children = node.children[:mid + 1]

    def _insert_non_full(self, node, key):
        if node.leaf:
            node.keys.append(key)
            node.keys.sort()
        else:
            i = len(node.keys) - 1
            while i >= 0 and key < node.keys[i]:
                i -= 1
            i += 1
            child = node.children[i]
            if len(child.keys) == (self.order - 1):
                self._split_child(node, i)
                if key > node.keys[i]:
                    i += 1
            self._insert_non_full(node.children[i], key)

    def search(self, node, key):
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and key == node.keys[i]:
            if node.leaf:
                return True
            else:
                return self.search(node.children[i + 1], key)
        elif node.leaf:
            return False
        else:
            return self.search(node.children[i], key)

# Create B+ tree and insert keys
bpt = BPlusTree(order=3)
for k in [10, 20, 5, 6, 12, 30, 7, 17]:
    bpt.insert(k)

# Search for keys
print(bpt.search(bpt.root, 6))  # True
print(bpt.search(bpt.root, 15)) # False
Output
True False
🎯

When to Use

Use a B+ tree index when you need fast data retrieval in databases, especially for range queries like finding all records between two values. It is ideal for large datasets because it keeps data balanced and minimizes disk reads.

Common real-world uses include indexing in relational databases, file systems, and any system that requires sorted data access with quick insertions, deletions, and lookups.

Key Points

  • All actual data pointers are stored in leaf nodes, linked in order.
  • Internal nodes only store keys to guide searches.
  • Balanced tree structure ensures fast search, insert, and delete operations.
  • Efficient for range queries and large datasets.
  • Widely used in database indexing and file systems.

Key Takeaways

A B+ tree index stores data pointers only in leaf nodes, making range queries efficient.
It keeps the tree balanced to ensure fast search, insert, and delete operations.
Ideal for large databases where quick sorted data access is needed.
Internal nodes guide the search without storing actual data.
Commonly used in relational databases and file systems for indexing.