Cross-shard queries in HLD - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a shard in a distributed database system?
A shard is a horizontal partition of data in a database. Each shard holds a subset of the total data, allowing the system to scale by distributing data across multiple servers.
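A minimal sketch of how rows might be assigned to shards, assuming hash-based partitioning on a string key. The shard count, the `shard_for`, `put`, and `get` helpers, and the in-memory dicts standing in for servers are all illustrative assumptions, not part of any particular database.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each shard is modelled as a plain dict standing in for one server.
shards = [dict() for _ in range(NUM_SHARDS)]

def put(key: str, value: dict) -> None:
    """Write the row to the single shard that owns this key."""
    shards[shard_for(key)][key] = value

def get(key: str):
    """Read the row from the single shard that owns this key."""
    return shards[shard_for(key)].get(key)

for user_id in ("user:1", "user:2", "user:3", "user:4", "user:5"):
    put(user_id, {"id": user_id})
```

A lookup by key touches exactly one shard here, which is why single-shard queries stay cheap as the number of shards grows.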
intermediate
Why do cross-shard queries pose challenges in distributed systems?
Because data is split across multiple shards, a query that spans shards must be coordinated across them. That coordination adds latency and complexity, and it raises the risk of inconsistent results when shards are read at slightly different points in time.
intermediate
Name two common strategies to handle cross-shard queries.
1. Scatter-gather: Query all relevant shards and combine results. 2. Global indexes: Maintain an index that points to data across shards to quickly locate data without querying all shards.
beginner
What is the scatter-gather approach in cross-shard queries?
It is a method where the query is sent to all shards that might contain relevant data. Each shard processes the query locally and returns results. The system then merges these results to form the final answer.
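The scatter, local-processing, and merge steps can be sketched as follows. This is a toy model under stated assumptions: the shards are in-memory lists, and `SHARDS`, `query_shard`, and `scatter_gather` are hypothetical names. A real system would issue network calls and handle timeouts and partial failures.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical in-memory shards: each holds a subset of the order rows.
SHARDS = [
    [{"order_id": 1, "amount": 40}, {"order_id": 4, "amount": 95}],
    [{"order_id": 2, "amount": 70}, {"order_id": 5, "amount": 10}],
    [{"order_id": 3, "amount": 55}],
]

def query_shard(rows, min_amount, limit):
    """Local step: filter and sort on one shard, return at most `limit` rows."""
    matches = [r for r in rows if r["amount"] >= min_amount]
    matches.sort(key=lambda r: r["amount"], reverse=True)
    return matches[:limit]

def scatter_gather(min_amount, limit):
    """Scatter the query to every shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(
            pool.map(lambda rows: query_shard(rows, min_amount, limit), SHARDS)
        )
    # Each partial list is already sorted descending, so a k-way merge is enough.
    merged = heapq.merge(*partials, key=lambda r: r["amount"], reverse=True)
    return list(merged)[:limit]

top = scatter_gather(min_amount=40, limit=3)
print([r["order_id"] for r in top])  # prints [4, 2, 3]
```

Each shard returns only its own top `limit` rows, so the coordinator merges at most `limit` rows per shard instead of every matching row.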
intermediate
How can global indexes improve cross-shard query performance?
A global index maps an indexed value to the shard that holds the matching data, so the system can send a query only to that shard. This reduces the need to query all shards, lowering latency and resource use.
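A minimal sketch of the idea, assuming rows are sharded by `user_id` while queries arrive by `email`. The `email_index` dict, the helper names, and the `shards_queried` log are illustrative assumptions. A production global index would itself need to be stored durably and kept consistent with the shards.

```python
NUM_SHARDS = 3
shards = [dict() for _ in range(NUM_SHARDS)]  # user_id -> row, one dict per shard
email_index = {}                              # global index: email -> shard number
shards_queried = []                           # log of shards touched by lookups

def insert_user(user_id: int, email: str) -> None:
    shard_no = user_id % NUM_SHARDS           # rows are sharded by user_id
    shards[shard_no][user_id] = {"user_id": user_id, "email": email}
    email_index[email] = shard_no             # extra write: the index's upkeep cost

def find_by_email(email: str):
    """Look up the owning shard in the index, then query only that shard."""
    shard_no = email_index.get(email)
    if shard_no is None:
        return None                           # unknown email: no shard is queried
    shards_queried.append(shard_no)
    for row in shards[shard_no].values():
        if row["email"] == email:
            return row
    return None

insert_user(10, "ana@example.com")
insert_user(11, "bo@example.com")
insert_user(12, "cy@example.com")

row = find_by_email("bo@example.com")
```

Without the index, a lookup by email would have to scatter to all three shards. With it, one shard is queried, at the price of an extra write on every insert and extra storage for the index.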
What is the main reason cross-shard queries are slower than single-shard queries?
A. They use outdated database engines
B. They require coordination across multiple shards
C. They only query one shard
D. They do not use indexes
Which approach involves querying all shards and combining results?
A. Sharding
B. Global index lookup
C. Caching
D. Scatter-gather
What is a drawback of maintaining global indexes for cross-shard queries?
A. They prevent data partitioning
B. They increase query latency
C. They require extra storage and maintenance
D. They eliminate scalability
Which of the following is NOT a typical challenge of cross-shard queries?
A. Simplified query logic
B. Data inconsistency risks
C. Increased latency
D. Higher resource consumption
What does sharding primarily help with in databases?
A. Scaling by distributing data
B. Increasing data redundancy
C. Reducing data backups
D. Improving security
Explain what cross-shard queries are and why they are challenging in distributed databases.
Think about how data is split and how queries must gather data from multiple places.
Describe two strategies to handle cross-shard queries and their trade-offs.
Consider how queries find data and how results are combined.