Database sharding strategies in HLD - System Design Exercise

Design: Database Sharding Strategies
This design covers sharding strategies and architecture for relational and NoSQL databases. Out of scope: detailed application logic, vendor-specific database features, and network infrastructure.
Functional Requirements
FR1: Distribute data across multiple database instances to improve scalability
FR2: Support horizontal scaling to handle increasing data volume and traffic
FR3: Ensure data availability and fault tolerance
FR4: Keep query latency reasonable as data grows (target: p99 < 200ms, see NFR4)
FR5: Support both read and write operations efficiently
FR6: Allow easy addition of new shards without downtime
Non-Functional Requirements
NFR1: Handle up to 1 billion records distributed across shards
NFR2: Support 10,000 concurrent read/write requests
NFR3: Availability target of 99.9% uptime
NFR4: Latency target: p99 API response time under 200ms
NFR5: Eventual consistency is acceptable for some use cases, but critical data must be served with strong consistency
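A back-of-the-envelope capacity estimate helps turn NFR1 and NFR2 into a shard count. The sketch below is illustrative only: the average record size (1 KB), the per-shard capacity (250 GB), and the 50% fill target are assumptions that do not come from the requirements, and the function name is hypothetical.

```python
import math

def estimate_shard_count(total_records, avg_record_bytes, max_shard_bytes, fill_target=0.5):
    """Number of shards needed if each shard is filled to at most `fill_target` of its capacity."""
    total_bytes = total_records * avg_record_bytes
    usable_bytes_per_shard = max_shard_bytes * fill_target
    return math.ceil(total_bytes / usable_bytes_per_shard)

shard_count = estimate_shard_count(
    total_records=1_000_000_000,    # NFR1
    avg_record_bytes=1_000,         # assumption: about 1 KB per record
    max_shard_bytes=250 * 10**9,    # assumption: 250 GB per database instance
    fill_target=0.5,                # assumption: leave half the capacity for growth
)

# NFR2 spread evenly over the shards (assumes a well-balanced shard key).
requests_per_shard = 10_000 / shard_count
print(shard_count, requests_per_shard)
```

Under these assumptions the data is about 1 TB, so the estimate is driven mainly by how much headroom each shard keeps for growth and re-sharding.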
Think Before You Design
Key Components
Shard key selection mechanism
Shard mapping and routing layer
Database instances or clusters for each shard
Metadata service to track shard locations
Load balancer or proxy for query routing
Backup and recovery tools for shards
Design Patterns
Horizontal partitioning (range-based, hash-based, directory-based)
Consistent hashing
Lookup tables for shard mapping
Replication strategies per shard
Re-sharding and data migration approaches
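The three horizontal partitioning patterns listed above differ mainly in how a shard key is mapped to a shard. The following is a minimal sketch of each mapping; the boundary values, the directory entries, and all function names are hypothetical and chosen only for illustration.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    """Hash that is stable across processes (Python's built-in hash() of str is not)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

# Hash-based: even spread, but range scans must touch every shard.
def hash_shard(key: str, num_shards: int) -> int:
    return stable_hash(key) % num_shards

# Range-based: sorted boundaries split the key space.
# shard 0: key < "h", shard 1: "h" <= key < "p", shard 2: key >= "p"
RANGE_BOUNDARIES = ["h", "p"]

def range_shard(key: str) -> int:
    return bisect.bisect_right(RANGE_BOUNDARIES, key)

# Directory-based: an explicit lookup table decides placement per key.
DIRECTORY = {"tenant-a": 0, "tenant-b": 2}

def directory_shard(key: str) -> int:
    return DIRECTORY[key]  # raises KeyError for keys with no assigned shard

print(range_shard("alice"), range_shard("harry"), range_shard("zoe"))
print(hash_shard("user-1", 4), directory_shard("tenant-b"))
```

Range-based mapping keeps neighbouring keys together, which suits range scans but can create hotspots; hash-based mapping trades that locality for a more even spread; the directory gives full control at the cost of maintaining the lookup table.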
Reference Architecture
Client
  |
  v
Shard Router / Proxy Layer
  |
  +-----------------------------+
  |                             |
Shard 1                     Shard 2 ... Shard N
(Database Instance)         (Database Instance)

Metadata Service tracks shard keys and locations

Backup & Monitoring systems connected to each shard
Components
Shard Router / Proxy Layer
Technology: Custom routing service or middleware
Role: Routes client queries to the correct shard based on shard key
Database Shards
Technology: Relational DB (e.g., PostgreSQL) or NoSQL DB (e.g., MongoDB)
Role: Store partitioned data subsets independently
Metadata Service
Technology: Distributed key-value store or config service (e.g., ZooKeeper, etcd)
Role: Maintains shard key ranges and mapping info
Backup and Monitoring
Technology: Backup tools and monitoring dashboards
Role: Ensure data durability and system health
Request Flow
1. Client sends request with data or query including shard key
2. Shard Router extracts shard key and consults Metadata Service
3. Router forwards request to the appropriate shard database instance
4. Shard processes query and returns result to Router
5. Router sends response back to client
6. Outside the request path, the Metadata Service is updated whenever shards are added or data is re-sharded, so later requests route to the new locations
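The request flow above can be sketched in a few lines. This is a simplified, in-memory illustration rather than a real router: `MetadataService` and `ShardRouter` are hypothetical classes, plain dictionaries stand in for database instances, and a hash-based mapping is assumed.

```python
import hashlib

class MetadataService:
    """In-memory stand-in for a shard map kept in a store such as etcd or ZooKeeper."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)

    def shard_for(self, shard_key: str) -> str:
        digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]

class ShardRouter:
    def __init__(self, metadata, shards):
        self.metadata = metadata
        self.shards = shards  # shard name -> dict standing in for a database instance

    def put(self, shard_key, value):
        shard_name = self.metadata.shard_for(shard_key)  # step 2: consult metadata
        self.shards[shard_name][shard_key] = value       # steps 3-4: forward and execute
        return shard_name                                # step 5: respond to the client

    def get(self, shard_key):
        shard_name = self.metadata.shard_for(shard_key)
        return self.shards[shard_name].get(shard_key)

shards = {"shard-1": {}, "shard-2": {}, "shard-3": {}}
router = ShardRouter(MetadataService(shards), shards)
router.put("user:42", {"name": "Ada"})   # step 1: request carries the shard key
print(router.get("user:42"))
```

In a real deployment the router would typically cache the shard map instead of calling the Metadata Service on every request, and refresh the cache when the map changes.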
Database Schema
Entities are partitioned by shard key. Each shard holds a subset of data with the same schema. Relationships that span shards require application-level joins or denormalization. Metadata Service stores shard key ranges or hash mappings.
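To illustrate what an application-level join involves, the sketch below uses hypothetical sample data: users are sharded by `user_id`, while orders are sharded by `order_id`, so one user's orders may sit on any order shard. Lists and dictionaries stand in for database tables.

```python
# Hypothetical sample data: users sharded by user_id % 2, orders by order_id % 2.
USER_SHARDS = [
    {2: {"user_id": 2, "name": "Bo"}},
    {1: {"user_id": 1, "name": "Ada"}},
]
ORDER_SHARDS = [
    [{"order_id": 10, "user_id": 1, "total": 30},
     {"order_id": 12, "user_id": 2, "total": 5}],
    [{"order_id": 11, "user_id": 1, "total": 12}],
]

def find_user(user_id):
    """Single-shard lookup: the shard key (user_id) identifies the shard."""
    return USER_SHARDS[user_id % len(USER_SHARDS)].get(user_id)

def find_orders_for_user(user_id):
    """Scatter-gather: orders are not sharded by user_id, so every shard is queried."""
    matches = []
    for shard in ORDER_SHARDS:
        matches.extend(order for order in shard if order["user_id"] == user_id)
    return sorted(matches, key=lambda order: order["order_id"])

def orders_with_user_name(user_id):
    """Application-level join of the two result sets."""
    user = find_user(user_id)
    if user is None:
        return []
    return [
        {"order_id": order["order_id"], "name": user["name"], "total": order["total"]}
        for order in find_orders_for_user(user_id)
    ]

print(orders_with_user_name(1))
```

Sharding orders by `user_id` instead, or copying the user name into each order (denormalization), would let this query hit a single shard.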
Scaling Discussion
Bottlenecks
Shard Router becomes a single point of failure or bottleneck
Uneven data distribution causing hotspot shards
Cross-shard queries causing high latency
Complexity in re-sharding and data migration
Metadata Service availability impacting routing
Solutions
Use multiple router instances with load balancing and failover
Choose shard keys carefully or use consistent hashing to balance load
Design queries to minimize cross-shard operations or use caching
Implement online re-sharding: copy data to the new shard while still serving traffic, keep both copies in sync, then switch routing with minimal downtime
Deploy Metadata Service in a highly available cluster with consensus
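Consistent hashing, mentioned above as a way to balance load, also limits how much data moves when a shard is added. The sketch below is one simple way to build a hash ring with virtual nodes; the class name, the shard names, and the choice of 100 virtual nodes per shard are assumptions made for illustration.

```python
import bisect
import hashlib

def stable_hash(value: str) -> int:
    """Hash that is stable across processes (unlike Python's built-in str hash)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")

class ConsistentHashRing:
    def __init__(self, shard_names, vnodes=100):
        self.vnodes = vnodes
        self._ring = []        # sorted list of (position, shard_name)
        self._positions = []   # positions only, for binary search
        for name in shard_names:
            self.add_shard(name)

    def add_shard(self, name):
        # Each shard owns many points on the ring to smooth out the distribution.
        for i in range(self.vnodes):
            self._ring.append((stable_hash(f"{name}#{i}"), name))
        self._ring.sort()
        self._positions = [position for position, _ in self._ring]

    def shard_for(self, key):
        # A key belongs to the first virtual node clockwise from its position.
        index = bisect.bisect(self._positions, stable_hash(key)) % len(self._ring)
        return self._ring[index][1]

keys = [f"user:{i}" for i in range(1000)]
ring = ConsistentHashRing(["shard-1", "shard-2", "shard-3"])
before = {key: ring.shard_for(key) for key in keys}

ring.add_shard("shard-4")
moved = [key for key in keys if ring.shard_for(key) != before[key]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new shard")
```

Growing from three to four shards moves about a quarter of the keys in expectation, and only onto the new shard. With plain `hash % N` placement, the same change relocates roughly three quarters of the keys.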
Interview Tips
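Time: For a 45-minute interview, spend 10 minutes clarifying requirements and constraints, 20 minutes designing the sharding strategy and architecture, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.
Explain the importance of shard key selection and its impact on load balancing
Discuss trade-offs between range-based and hash-based sharding (range-based keeps range scans efficient but risks hotspots; hash-based spreads load evenly but turns range queries into cross-shard queries)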
Highlight how routing and metadata services work together
Address how to handle cross-shard queries and consistency
Show awareness of operational challenges like re-sharding and monitoring