What is B Tree: Definition, How It Works, and Use Cases
B tree is a self-balancing tree data structure that keeps data sorted and allows fast searches, insertions, and deletions. It is designed to work well on systems that read and write large blocks of data, like databases and file systems.How It Works
A B tree organizes data in a way similar to a family tree but with multiple children per node instead of just two. Each node can hold several keys (values) and has multiple child nodes. This structure keeps the tree balanced, meaning all leaf nodes are at the same depth, so operations take predictable time.
Think of it like a library index where each page lists several book titles and points you to different shelves. Instead of checking one book at a time, you jump between shelves quickly by following the right pointers. This reduces the number of steps needed to find or add a book.
Because nodes can hold many keys, the tree stays shallow even with lots of data. This is important for storage devices where reading fewer blocks is faster. The tree adjusts itself by splitting or merging nodes when keys are added or removed, keeping balance and efficiency.
Example
This example shows a simple B tree insertion and search using Python. It demonstrates how keys are added and how the tree remains balanced.
class BTreeNode: def __init__(self, t, leaf=False): self.t = t # Minimum degree (defines the range for number of keys) self.leaf = leaf # True if node is leaf self.keys = [] # List of keys self.children = [] # List of child pointers def insert_non_full(self, key): i = len(self.keys) - 1 if self.leaf: self.keys.append(None) while i >= 0 and key < self.keys[i]: self.keys[i + 1] = self.keys[i] i -= 1 self.keys[i + 1] = key else: while i >= 0 and key < self.keys[i]: i -= 1 i += 1 if len(self.children[i].keys) == 2 * self.t - 1: self.split_child(i) if key > self.keys[i]: i += 1 self.children[i].insert_non_full(key) def split_child(self, i): t = self.t y = self.children[i] z = BTreeNode(t, y.leaf) z.keys = y.keys[t:] # Right half keys mid_key = y.keys[t - 1] # Middle key to move up y.keys = y.keys[:t - 1] # Left half keys if not y.leaf: z.children = y.children[t:] y.children = y.children[:t] self.children.insert(i + 1, z) self.keys.insert(i, mid_key) def search(self, key): i = 0 while i < len(self.keys) and key > self.keys[i]: i += 1 if i < len(self.keys) and key == self.keys[i]: return True if self.leaf: return False return self.children[i].search(key) class BTree: def __init__(self, t): self.root = BTreeNode(t, True) self.t = t def insert(self, key): root = self.root if len(root.keys) == 2 * self.t - 1: s = BTreeNode(self.t) s.children.insert(0, root) s.leaf = False s.split_child(0) i = 0 if s.keys[0] < key: i += 1 s.children[i].insert_non_full(key) self.root = s else: root.insert_non_full(key) def search(self, key): return self.root.search(key) # Example usage btree = BTree(2) # Minimum degree 2 for key in [10, 20, 5, 6, 12, 30, 7, 17]: btree.insert(key) print(btree.search(6)) # True print(btree.search(15)) # False
When to Use
B trees are ideal when you need to store large amounts of data that cannot fit entirely in memory and must be read from disk or other slow storage. They minimize the number of disk reads by keeping the tree shallow and nodes wide.
Common uses include:
- Database indexing to quickly find records
- File systems to organize files and directories
- Any system requiring balanced search trees with efficient insertions and deletions on large datasets
If you work with in-memory data and small datasets, simpler trees like binary search trees might be enough. But for large-scale storage, B trees shine.
Key Points
- A
B treeis a balanced tree with multiple keys per node. - It keeps all leaf nodes at the same depth for consistent performance.
- Nodes split or merge to maintain balance during insertions and deletions.
- It is optimized for systems that read/write large blocks of data like disks.
- Widely used in databases and file systems for efficient data access.