0
0
HLDsystem_design~12 mins

Cross-shard queries in HLD - Architecture Diagram

Choose your learning style9 modes available
System Overview - Cross-shard queries

This system handles queries that need data from multiple database shards. Each shard stores a portion of the data to improve scalability and performance. The system must combine results from different shards and return a unified response quickly and reliably.

Architecture Diagram
User
  |
  v
Load Balancer
  |
  v
API Gateway
  |
  v
Query Coordinator
 /      |       \
Shard 1 Shard 2  Shard N
  |       |        |
Cache   Cache    Cache
  |       |        |
DB 1    DB 2     DB N
Components
User
client
Sends query requests to the system
Load Balancer
load_balancer
Distributes incoming requests evenly to API Gateway instances
API Gateway
api_gateway
Receives requests and forwards them to the Query Coordinator
Query Coordinator
service
Splits cross-shard queries, sends sub-queries to shards, aggregates results
Shard 1
database_shard
Stores a subset of data and processes queries for that subset
Shard 2
database_shard
Stores a subset of data and processes queries for that subset
Shard N
database_shard
Stores a subset of data and processes queries for that subset
Cache
cache
Stores frequently accessed query results to reduce latency
DB 1
database
Persistent storage for Shard 1
DB 2
database
Persistent storage for Shard 2
DB N
database
Persistent storage for Shard N
Request Flow - 29 Hops
UserLoad Balancer
Load BalancerAPI Gateway
API GatewayQuery Coordinator
Query CoordinatorCache
CacheQuery Coordinator
Query CoordinatorShard 1
Query CoordinatorShard 2
Query CoordinatorShard N
Shard 1Cache
CacheShard 1
Shard 1DB 1
DB 1Shard 1
Shard 1Cache
Shard 1Query Coordinator
Shard 2Cache
CacheShard 2
Shard 2DB 2
DB 2Shard 2
Shard 2Cache
Shard 2Query Coordinator
Shard NCache
CacheShard N
Shard NDB N
DB NShard N
Shard NCache
Shard NQuery Coordinator
Query CoordinatorAPI Gateway
API GatewayLoad Balancer
Load BalancerUser
Failure Scenario
Component Fails:Shard 2
Impact:Queries involving Shard 2 data fail or return incomplete results. Overall query response is delayed or partial.
Mitigation:Use replication for Shard 2 to failover to a replica. Query Coordinator retries on replica. Cache may serve stale data if DB is down.
Architecture Quiz - 3 Questions
Test your understanding
Which component is responsible for splitting a cross-shard query into sub-queries?
AQuery Coordinator
BLoad Balancer
CAPI Gateway
DCache
Design Principle
This design shows how to handle queries that span multiple database shards by splitting the query, querying shards in parallel, caching results to reduce latency, and aggregating responses. It balances scalability and performance while handling failures with replication.