LSM trees are designed to optimize write-heavy workloads. What is the main mechanism they use to achieve faster writes compared to traditional B-trees?
Think about how batching writes, rather than issuing them one at a time, affects disk throughput.
LSM trees buffer writes in an in-memory structure (the memtable) and then flush them to disk as sorted files in large, sequential chunks. This replaces many slow random disk writes with a few sequential ones, which improves write throughput.
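The buffering step can be sketched in a few lines. This is a minimal illustration, not any real engine's API: the `MemTable` class, its `flush_threshold` parameter, and the in-memory `sstables` list that stands in for files on disk are all hypothetical.

```python
class MemTable:
    """Hypothetical in-memory write buffer for illustration only."""

    def __init__(self, flush_threshold=3):
        self.entries = {}                 # pending writes, key -> value
        self.flush_threshold = flush_threshold
        self.sstables = []                # flushed sorted runs, oldest first

    def put(self, key, value):
        # A write only touches memory until the buffer fills up.
        self.entries[key] = value
        if len(self.entries) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sorting once and appending one run stands in for a single
        # large sequential write to disk.
        run = sorted(self.entries.items())
        self.sstables.append(run)
        self.entries = {}
```

Writing four keys with a threshold of three produces one sorted run and leaves the fourth key buffered in memory, so no individual write has triggered its own disk access.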
While LSM trees improve write speed, they have a known downside. What is it?
Consider how data is stored on disk in LSM trees and how that affects reading.
Because data is spread across multiple sorted files (SSTables), a read may need to check several of them before it finds a key, which can make reads slower than in a B-tree.
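A rough sketch of that read path shows where the extra work comes from. The function names `search_run` and `lsm_get` are hypothetical, each run is modeled as a sorted list of `(key, value)` pairs, and the sketch assumes no Bloom filters or deletion markers.

```python
from bisect import bisect_left

def search_run(run, key):
    """Binary search in one sorted run of (key, value) pairs."""
    # (key,) sorts just before any (key, value) pair with the same key,
    # so bisect_left lands on the matching entry if it exists.
    i = bisect_left(run, (key,))
    if i < len(run) and run[i][0] == key:
        return run[i][1]
    return None

def lsm_get(memtable, runs, key):
    """Return (value, runs_checked); runs are ordered oldest to newest."""
    if key in memtable:
        return memtable[key], 0
    checked = 0
    for run in reversed(runs):      # newest run first, so fresh values win
        checked += 1
        value = search_run(run, key)
        if value is not None:
            return value, checked
    return None, checked            # an absent key costs a check of every run
```

A key that lives only in the oldest run, or that does not exist at all, forces a search of every run, whereas a B-tree lookup follows a single root-to-leaf path.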
Compaction is a process in LSM trees that merges multiple files into fewer, larger ones. What is the main benefit of compaction?
Think about how merging files affects the number of files to search during reads.
Compaction merges multiple sorted files into fewer larger files, reducing the number of files to check during reads and reclaiming space from deleted or overwritten data.
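The merge can be sketched as follows. This is a simplified illustration: `compact` and the `TOMBSTONE` marker are hypothetical names, the whole merge happens in memory (a real engine would typically stream a k-way merge), and dropping deletion markers is only safe here because every older run takes part in the merge.

```python
TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"

def compact(runs):
    """Merge sorted runs (ordered oldest to newest) into one sorted run."""
    merged = {}
    for run in runs:
        for key, value in run:
            merged[key] = value     # a newer run overwrites an older value
    # Keep only live keys; overwritten and deleted entries are discarded,
    # which is where the space is reclaimed.
    return sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)
```

For example, merging an old run holding `a`, `b`, `c` with a newer run that overwrites `b` and deletes `c` leaves a single run with two entries, so a later read has one file to check instead of two.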
Which statement best explains why LSM trees are preferred over B-trees for write-heavy workloads?
Focus on how disk write patterns differ between the two structures.
LSM trees batch writes and flush them sequentially, which is faster on disk. B-trees update nodes in place, causing many random writes, which are slower on disk.
In a system using LSM trees, what is the expected effect on read latency immediately after a large burst of writes before compaction runs?
Consider how many files the system must search before compaction merges them.
A large burst of writes triggers many flushes, each producing a new file. Until compaction merges those files, a read must check more of them, so read latency increases.
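A small simulation makes the effect concrete. It is a sketch under simplifying assumptions: the helper names `flush_burst` and `runs_checked` are hypothetical, keys are distinct integers, each flush holds exactly `flush_threshold` keys, and no Bloom filters or indexes shortcut the search.

```python
def flush_burst(keys, flush_threshold):
    """Split a burst of writes into sorted runs, one per simulated flush."""
    return [sorted(keys[i:i + flush_threshold])
            for i in range(0, len(keys), flush_threshold)]

def runs_checked(runs, key):
    """Count the runs examined, newest first, until the key is found."""
    checked = 0
    for run in reversed(runs):
        checked += 1
        if key in run:              # linear scan; fine for a small demo
            break
    return checked

burst = list(range(1000))
runs = flush_burst(burst, flush_threshold=100)          # 10 separate runs
before = runs_checked(runs, 0)                          # key 0 is in the oldest run
compacted = [sorted(k for run in runs for k in run)]    # one merged run
after = runs_checked(compacted, 0)
print(before, after)  # prints "10 1"
```

Under these assumptions, a lookup for the oldest key touches all ten runs before compaction and only one afterwards.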