
Reservation and hold system in LLD - Scalability & System Analysis

Scalability Analysis - Reservation and hold system
Growth Table: Reservation and Hold System
| Users | Requests per Second | Database Load | Cache Usage | Network Traffic | System Behavior |
|---|---|---|---|---|---|
| 100 | ~10-50 RPS | Single DB instance handles writes and reads | Minimal caching needed | Low bandwidth | System runs smoothly on a single server |
| 10,000 | ~1,000 RPS | DB under moderate load, some read replicas needed | Cache frequently accessed data (holds, availability) | Moderate bandwidth, load balancer introduced | Latency may increase, need for caching and replicas |
| 1,000,000 | ~50,000 RPS | DB write bottleneck, sharding required | Heavy caching, distributed cache cluster | High bandwidth, CDN for static content | Complex coordination for holds, distributed locking |
| 100,000,000 | ~5,000,000 RPS | Multiple DB clusters, global sharding | Multi-level caching, edge caches | Very high bandwidth, global CDN, message queues | Eventual consistency, asynchronous processing |
First Bottleneck

The database is the first bottleneck because reservation and hold systems require strong consistency for writes to avoid double booking. As request volume grows, the database's write throughput limits how many holds and reservations the system can process in real time.
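To make the consistency requirement concrete, here is a minimal sketch of a conditional write that books a seat only if it is still free. It uses Python's built-in `sqlite3` purely for illustration; the `seats` table, its columns, and the function names are assumptions, not part of the original design.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with one row per bookable seat (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seats (seat_id TEXT PRIMARY KEY, booked_by TEXT)")
    conn.execute("INSERT INTO seats VALUES ('A1', NULL)")
    conn.commit()
    return conn


def try_book(conn: sqlite3.Connection, seat_id: str, user: str) -> bool:
    """Book the seat only if it is still free.

    The availability check and the write happen in one UPDATE statement,
    so two callers cannot both see the seat as free and both succeed.
    """
    cur = conn.execute(
        "UPDATE seats SET booked_by = ? WHERE seat_id = ? AND booked_by IS NULL",
        (user, seat_id),
    )
    conn.commit()
    # rowcount is 0 when the seat was already booked by someone else
    return cur.rowcount == 1
```

Every booking has to pass through a write like this on the authoritative copy of the data, which is why write throughput, rather than read throughput, tends to saturate first.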

Scaling Solutions
  • Read Replicas: Offload read queries like availability checks to replicas to reduce DB load.
  • Caching: Use distributed caches (e.g., Redis) for frequently accessed data such as seat availability and hold status.
  • Sharding: Partition the database by resource (e.g., venue, event) to spread write load across multiple DB instances.
  • Horizontal Scaling: Add more application servers behind load balancers to handle increased traffic.
  • Distributed Locking: Implement distributed locks or consensus protocols to prevent double booking in a distributed environment.
  • Asynchronous Processing: Use message queues for non-critical updates to improve responsiveness.
  • CDN: Use CDNs for static content and possibly for caching availability snapshots to reduce load.
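As an illustration of the sharding item above, the following sketch routes each event to a shard by hashing its ID. The shard names and the `shard_for` helper are hypothetical, and simple modulo hashing is only one option.

```python
import hashlib
from typing import Sequence

# Hypothetical shard identifiers; a real deployment would map these to connections.
SHARDS = ("db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3")


def shard_for(event_id: str, shards: Sequence[str] = SHARDS) -> str:
    """Pick the shard that owns all holds and reservations for one event.

    A stable hash (not Python's randomized built-in hash()) keeps the mapping
    identical across processes and restarts.
    """
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Keeping every write for one event on one shard means the double-booking check stays local to a single database. The trade-offs are that modulo hashing remaps most keys when the shard count changes (consistent hashing is a common alternative) and that one very popular event can still overload its shard.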
Back-of-Envelope Cost Analysis
  • At 10,000 users (~1,000 RPS), a single DB instance (~5,000 QPS capacity) can handle writes and reads with caching.
  • At 1,000,000 users (~50,000 RPS), load is roughly 10x that single-instance capacity, so DB write capacity is exceeded; sharding and multiple DB clusters are needed.
  • Storage: each reservation record is ~1 KB, so 1M reservations = ~1 GB, manageable with modern DBs.
  • Network bandwidth: 1,000 RPS with ~1 KB payload = ~1 MB/s; scales linearly with request rate.
  • Cache memory: a Redis cluster with tens of GB of RAM to hold hot data for fast access.
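The arithmetic behind these estimates can be reproduced with a few helper functions. This is a rough sketch: the function names are made up, 1 KB is taken as 1,000 bytes, and the shard estimate pessimistically assumes every request reaches the database.

```python
import math


def shards_needed(peak_qps: int, per_instance_qps: int) -> int:
    """Minimum number of DB instances if load is spread evenly across them."""
    return math.ceil(peak_qps / per_instance_qps)


def storage_gb(records: int, bytes_per_record: int = 1_000) -> float:
    """Raw storage for reservation records, ignoring indexes and replication."""
    return records * bytes_per_record / 1e9


def bandwidth_mb_per_s(rps: int, payload_bytes: int = 1_000) -> float:
    """Payload bandwidth at a given request rate, ignoring protocol overhead."""
    return rps * payload_bytes / 1e6


print(shards_needed(50_000, 5_000))  # 10
print(storage_gb(1_000_000))         # 1.0
print(bandwidth_mb_per_s(1_000))     # 1.0
```

Caching and read replicas lower the share of traffic that reaches the primary, so the real shard count depends on the write fraction rather than on total RPS.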
Interview Tip

Start by clarifying system requirements and scale. Identify the critical consistency needs for reservations. Discuss the database as the first bottleneck and propose caching and sharding. Explain how distributed locking prevents double booking. Finally, mention asynchronous processing and CDN use for scalability and performance.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Introduce read replicas and caching to offload read queries, then consider sharding the database to distribute write load and prevent bottlenecks.
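The caching half of that answer can be sketched as a small cache-aside wrapper around availability reads. The class name, the `load_from_db` callback, and the TTL value are assumptions for illustration; a shared cache such as Redis would normally replace the in-process dictionary.

```python
import time
from typing import Callable, Dict, Tuple


class AvailabilityCache:
    """Cache-aside reads: serve from memory, fall back to the database on a miss."""

    def __init__(
        self,
        load_from_db: Callable[[str], int],  # hypothetical loader: event_id -> seats left
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[int, float]] = {}  # event_id -> (value, expires_at)

    def get(self, event_id: str) -> int:
        now = self._clock()
        hit = self._entries.get(event_id)
        if hit is not None and hit[1] >= now:
            return hit[0]  # fresh entry: no database query
        value = self._load(event_id)
        self._entries[event_id] = (value, now + self._ttl)
        return value

    def invalidate(self, event_id: str) -> None:
        """Drop the entry after a write so the next read reloads it."""
        self._entries.pop(event_id, None)
```

Cached availability can be stale for up to the TTL, so it suits browsing only. The booking write itself still has to be checked against the database.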

Key Result
The database write capacity is the first bottleneck in a reservation and hold system. Scaling requires caching, read replicas, and sharding combined with distributed locking to maintain consistency and prevent double booking.

Practice

1. What is the primary purpose of a hold in a reservation and hold system?
easy
A. To delete all reservations from the system
B. To permanently reserve a resource without expiration
C. To cancel a confirmed reservation immediately
D. To temporarily block a resource before final booking

Solution

  1. Step 1: Understand the role of a hold

    A hold temporarily blocks a resource to prevent others from booking it while the user decides.
  2. Step 2: Differentiate hold from reservation

    A reservation is permanent until canceled, while a hold expires if not confirmed.
  3. Final Answer:

    To temporarily block a resource before final booking -> Option D
  4. Quick Check:

    Hold = Temporary block
Hint: Holds are temporary blocks, not permanent reservations
Common Mistakes:
  • Confusing hold with permanent reservation
  • Thinking holds never expire
  • Assuming holds cancel reservations
2. Which data structure is best suited to track holds with expiration times efficiently?
easy
A. Simple array without ordering
B. Linked list without timestamps
C. Hash map with timestamps and a priority queue for expirations
D. Stack data structure

Solution

  1. Step 1: Identify requirements for hold tracking

    We need fast lookup by hold ID and efficient expiration handling.
  2. Step 2: Choose data structures

    A hash map allows quick hold lookup; a priority queue orders holds by expiration for timely removal.
  3. Final Answer:

    Hash map with timestamps and a priority queue for expirations -> Option C
  4. Quick Check:

    Hash map + priority queue = efficient hold tracking
Hint: Use hash map for lookup and priority queue for expirations
Common Mistakes:
  • Using unordered arrays causing slow expiration checks
  • Choosing stack which is LIFO, not suitable for expirations
  • Ignoring timestamps in data structure
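One way to combine the two structures is sketched below, using a dictionary for lookup and `heapq` as the priority queue. The `Hold` and `HoldStore` names and fields are illustrative assumptions. Removed holds are deleted lazily: their heap entries stay behind and are skipped when they surface.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Hold:
    hold_id: str
    resource: str
    expires_at: float


class HoldStore:
    def __init__(self) -> None:
        self._by_id: Dict[str, Hold] = {}          # O(1) lookup by hold ID
        self._heap: List[Tuple[float, str]] = []   # (expires_at, hold_id), earliest first

    def add(self, hold: Hold) -> None:
        self._by_id[hold.hold_id] = hold
        heapq.heappush(self._heap, (hold.expires_at, hold.hold_id))

    def get(self, hold_id: str) -> Optional[Hold]:
        return self._by_id.get(hold_id)

    def remove(self, hold_id: str) -> None:
        # Lazy deletion: the stale heap entry is discarded later in expire().
        self._by_id.pop(hold_id, None)

    def expire(self, now: float) -> List[str]:
        """Remove every hold whose expiry is before `now`; return their IDs."""
        expired: List[str] = []
        while self._heap and self._heap[0][0] < now:
            expires_at, hold_id = heapq.heappop(self._heap)
            hold = self._by_id.get(hold_id)
            # Skip entries for holds that were already removed or re-added.
            if hold is not None and hold.expires_at == expires_at:
                del self._by_id[hold_id]
                expired.append(hold_id)
        return expired
```

Each expiry sweep only touches holds that are actually due, instead of scanning every active hold.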
3. Consider this pseudo-code for confirming a hold:
if hold.exists(hold_id) and not hold.is_expired(hold_id):
    reservation.create(hold.resource)
    hold.remove(hold_id)
    return "Confirmed"
else:
    return "Failed"
What will be the output if the hold has expired?
medium
A. "Failed"
B. "Confirmed"
C. Error due to missing hold
D. "Confirmed" but resource is double booked

Solution

  1. Step 1: Check hold existence and expiration

    The code confirms only if hold exists and is not expired.
  2. Step 2: Analyze expired hold case

    If hold is expired, condition fails and returns "Failed" without creating reservation.
  3. Final Answer:

    "Failed" -> Option A
  4. Quick Check:

    Expired hold = "Failed" confirmation
Hint: Expired holds cause confirmation to fail
Common Mistakes:
  • Assuming expired holds confirm successfully
  • Expecting errors instead of failure message
  • Ignoring hold expiration check
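For readers who want to run the logic, here is a small self-contained version of the pseudo-code using plain dictionaries and an explicit `now` argument. The data layout (hold ID mapped to a resource and expiry time) is an assumption made for the sketch.

```python
from typing import Dict, Set, Tuple


def confirm_hold(
    holds: Dict[str, Tuple[str, float]],  # hold_id -> (resource, expiration_time)
    reservations: Set[str],               # resources that are already reserved
    hold_id: str,
    now: float,
) -> str:
    """Turn a live hold into a reservation; fail if it is missing or expired."""
    entry = holds.get(hold_id)
    if entry is None:
        return "Failed"
    resource, expiration_time = entry
    if expiration_time < now:
        return "Failed"  # expired: no reservation is created
    reservations.add(resource)
    del holds[hold_id]
    return "Confirmed"
```

In a multi-threaded or multi-server setting the check and the write would also need to happen atomically; this single-threaded sketch does not address that.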
4. A developer wrote this code to release expired holds:
for hold in holds:
    if hold.expiration_time < current_time:
        holds.remove(hold)
What is the main issue with this code?
medium
A. Holds should not be removed, only marked expired
B. Modifying a list while iterating causes skipped elements or errors
C. Expiration time comparison is incorrect
D. Loop should use while instead of for

Solution

  1. Step 1: Understand iteration and modification

    Removing items from a list while iterating over it causes skipping or runtime errors.
  2. Step 2: Identify correct approach

    Use a separate list to collect expired holds or iterate over a copy to safely remove.
  3. Final Answer:

    Modifying a list while iterating causes skipped elements or errors -> Option B
  4. Quick Check:

    Remove during iteration = skipped elements
Hint: Never remove items from a list while looping over it
Common Mistakes:
  • Ignoring iteration modification side effects
  • Assuming expiration comparison is wrong
  • Thinking loop type causes the issue
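A safe variant of the cleanup, assuming holds are simple objects with an `expiration_time` field (the `Hold` dataclass here is hypothetical), builds the filtered lists first and then replaces the list contents in a single step:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hold:
    hold_id: str
    expiration_time: float


def release_expired(holds: List[Hold], current_time: float) -> List[Hold]:
    """Remove expired holds from `holds` in place and return the ones removed."""
    expired = [h for h in holds if h.expiration_time < current_time]
    # Slice assignment swaps the contents at once, so the list is never
    # mutated while it is being iterated.
    holds[:] = [h for h in holds if h.expiration_time >= current_time]
    return expired
```

With the original loop in Python, two expired holds in a row are enough to trigger the bug: removing the first shifts the second into the slot the iterator has already passed, so it is never checked.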
5. You need to design a scalable reservation and hold system for a popular event with thousands of simultaneous users. Which approach best ensures no double booking and timely hold expiration?
hard
A. Use distributed locks on resources, store holds with TTL in a distributed cache, and confirm with atomic transactions
B. Store all holds in a single database table without expiration, confirm by updating status
C. Allow multiple holds per resource and resolve conflicts manually later
D. Use client-side timers to expire holds and update server asynchronously

Solution

  1. Step 1: Prevent double booking with distributed locks

    Distributed locks ensure only one user can hold a resource at a time across servers.
  2. Step 2: Use TTL in distributed cache for hold expiration

    TTL automatically expires holds after timeout, preventing indefinite blocking.
  3. Step 3: Confirm holds atomically

    Atomic transactions guarantee reservation creation without race conditions.
  4. Final Answer:

    Use distributed locks on resources, store holds with TTL in a distributed cache, and confirm with atomic transactions -> Option A
  5. Quick Check:

    Distributed locks + TTL + atomic confirm = scalable, safe system
Hint: Combine distributed locks, TTL cache, and atomic confirm for scale
Common Mistakes:
  • Ignoring concurrency causing double booking
  • Relying on client-side expiration only
  • Not using atomic operations for confirmation
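The three ingredients of Option A can be seen together in a single-process sketch. Here a `threading.Lock` stands in for the distributed lock and a dictionary with expiry timestamps stands in for the TTL cache; in a real deployment these roles might be played by something like Redis `SET` with the `NX` and `EX` options plus a database transaction. All class and method names are assumptions.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class HoldService:
    """In-process stand-in for locks + TTL holds + atomic confirmation."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()                    # stand-in for a distributed lock
        self._holds: Dict[str, Tuple[str, float]] = {}   # resource -> (user, expires_at)
        self._reserved: Dict[str, str] = {}              # resource -> user

    def place_hold(self, resource: str, user: str) -> bool:
        with self._lock:
            now = self._clock()
            if resource in self._reserved:
                return False
            current = self._holds.get(resource)
            if current is not None and current[1] >= now:
                return False  # someone else holds it and the hold is still live
            self._holds[resource] = (user, now + self._ttl)
            return True

    def confirm(self, resource: str, user: str) -> bool:
        # Check and write happen under the same lock, so they act as one step.
        with self._lock:
            current = self._holds.get(resource)
            if current is None or current[0] != user or current[1] < self._clock():
                return False
            del self._holds[resource]
            self._reserved[resource] = user
            return True
```

Expiry is enforced on the server side by comparing against the server's clock, which is why client-side timers (Option D) are not needed for correctness. This sketch uses one global lock for simplicity; a per-resource lock would reduce contention.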