Which statement correctly describes the difference between horizontal and vertical sharding?
Think about whether data is divided by rows or columns.
Horizontal sharding splits data by rows (e.g., by user ID), so each shard holds the same columns for a different subset of rows. Vertical sharding splits data by columns or tables (e.g., user info in one shard, orders in another).
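A minimal sketch of the two splits, using a hypothetical in-memory user table. The column names, the modulo placement rule, and both function names are illustrative assumptions, not part of any particular database:

```python
# Hypothetical rows; the columns are made up for illustration.
users = [
    {"user_id": 1, "name": "Ana", "email": "ana@example.com", "bio": "hiker"},
    {"user_id": 2, "name": "Bo", "email": "bo@example.com", "bio": "chef"},
    {"user_id": 3, "name": "Cy", "email": "cy@example.com", "bio": "pilot"},
    {"user_id": 4, "name": "Di", "email": "di@example.com", "bio": "nurse"},
]


def horizontal_shards(rows, num_shards):
    """Split by rows: each shard keeps every column for a subset of users."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        shards[row["user_id"] % num_shards].append(row)
    return shards


def vertical_shards(rows, column_groups):
    """Split by columns: each shard keeps every user but only some columns.

    The key column (user_id) is repeated in each shard so rows can be rejoined.
    """
    return [
        [{col: row[col] for col in ("user_id", *cols)} for row in rows]
        for cols in column_groups
    ]


by_rows = horizontal_shards(users, 2)
by_cols = vertical_shards(users, [("name", "email"), ("bio",)])
print([[r["user_id"] for r in shard] for shard in by_rows])
print([sorted(shard[0].keys()) for shard in by_cols])
```

In the horizontal split each shard has fewer rows but the full schema. In the vertical split each shard has all four users but a narrower schema.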
You are designing a sharded database for a social media app. Which sharding key would best distribute user data evenly across shards?
Consider which key is unique and evenly distributed.
User ID is unique and usually evenly distributed, making it a good sharding key to balance load. Geographic location or timestamps can cause uneven shard sizes, because users tend to cluster in a few regions and in recent time periods.
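The difference can be illustrated by counting how many users each candidate key sends to each shard. This is a sketch under assumed data: the sequential IDs, the 70/20/10 country split, and the `country_of` helper are all hypothetical.

```python
from collections import Counter

NUM_SHARDS = 4  # assumed shard count
user_ids = range(1, 1001)  # hypothetical sequential user IDs


def country_of(user_id: int) -> str:
    """Hypothetical skewed geography: 70% US, 20% IN, 10% BR."""
    bucket = user_id % 10
    if bucket < 7:
        return "US"
    if bucket < 9:
        return "IN"
    return "BR"


# Sharding on user ID: one shard per residue class.
by_user_id = Counter(uid % NUM_SHARDS for uid in user_ids)

# Sharding on geography: one shard per country.
by_country = Counter(country_of(uid) for uid in user_ids)

print(sorted(by_user_id.items()))  # [(0, 250), (1, 250), (2, 250), (3, 250)]
print(sorted(by_country.items()))  # [('BR', 100), ('IN', 200), ('US', 700)]
```

With these assumptions the user-ID key fills every shard equally, while the country key leaves one shard holding seven times as many users as another.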
Your sharded database has a hotspot where one shard receives most of the traffic due to popular users. Which approach best mitigates this issue?
Think about redistributing data to balance load.
Hash-based sharding spreads users evenly across shards, so popular users are less likely to cluster on one shard, which reduces hotspots. Simply upgrading hardware or isolating popular users does not effectively fix the underlying uneven load distribution.
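A rough sketch of why hashing helps when popular users sit close together in the key space. The workload is an assumption: the 100 oldest accounts (IDs 1 to 100) are treated as the popular ones, and `range_shard` and `hash_shard` are hypothetical helpers. SHA-256 is used only because it is stable across processes, unlike Python's built-in `hash()` for strings.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count


def range_shard(user_id: int, users_per_shard: int = 250) -> int:
    """Place a user by contiguous ID range (IDs 1-250 on shard 0, and so on)."""
    return (user_id - 1) // users_per_shard


def hash_shard(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Place a user by a stable hash of the ID, reduced modulo the shard count."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical hot set: the 100 oldest accounts.
popular_users = range(1, 101)

range_load = [0] * NUM_SHARDS
hash_load = [0] * NUM_SHARDS
for uid in popular_users:
    range_load[range_shard(uid)] += 1
    hash_load[hash_shard(uid)] += 1

print("range placement:", range_load)  # every popular user lands on shard 0
print("hash placement: ", hash_load)   # popular users are spread across shards
```

Note that hashing spreads a group of hot users apart; a single extremely popular user still maps to exactly one shard.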
Which statement best describes a key tradeoff between range-based and hash-based sharding?
Consider query efficiency versus data balance.
Range-based sharding groups adjacent keys together, which makes range queries fast but can cause uneven shard sizes. Hash-based sharding distributes data evenly but scatters related keys, which makes range queries harder because they must visit many shards.
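The query-side cost can be sketched by counting how many shards one range query has to visit under each scheme. The shard count, the range width, and the helper names are assumptions made for illustration.

```python
import hashlib

NUM_SHARDS = 4         # assumed shard count
KEYS_PER_SHARD = 1000  # assumed width of each range shard


def range_shard(key: int) -> int:
    """Contiguous ranges: keys 0-999 on shard 0, 1000-1999 on shard 1, etc."""
    return min(key // KEYS_PER_SHARD, NUM_SHARDS - 1)


def hash_shard(key: int) -> int:
    """Stable hash of the key, reduced modulo the shard count."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def shards_touched(locate, lo: int, hi: int) -> set:
    """Return the set of shards that hold at least one key in lo..hi inclusive."""
    return {locate(key) for key in range(lo, hi + 1)}


# A range query over 100 adjacent keys.
range_hits = shards_touched(range_shard, 1200, 1299)
hash_hits = shards_touched(hash_shard, 1200, 1299)

print("range-based shards visited:", sorted(range_hits))  # a single shard
print("hash-based shards visited: ", sorted(hash_hits))   # several shards
```

Under range placement the query is answered by one shard. Under hash placement the same keys are scattered, so the query fans out.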
Your application currently has 10 million users stored in 5 shards. Each shard can handle up to 3 million users efficiently. You expect user growth to 50 million in 2 years. How many shards should you plan for to maintain performance?
Divide expected users by max users per shard and round up.
50 million users / 3 million users per shard ≈ 16.67, which rounds up to 17, so at least 17 shards are needed to keep each shard under capacity.
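The same arithmetic as a small helper, using integer ceiling division to avoid floating-point rounding. The function name `shards_needed` is hypothetical.

```python
def shards_needed(expected_users: int, max_users_per_shard: int) -> int:
    """Smallest shard count that keeps every shard at or below its capacity."""
    # Integer ceiling division: rounds up without going through floats.
    return (expected_users + max_users_per_shard - 1) // max_users_per_shard


planned = shards_needed(50_000_000, 3_000_000)
print(planned)                # 17
print(50_000_000 / planned)   # average users per shard, just under 3 million
```

With 16 shards the average would be 3.125 million users per shard, which is over the stated limit; 17 is the first count that fits.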