
Transaction history in LLD - Scalability & System Analysis

Scalability Analysis - Transaction history
Growth Table: Transaction History System
Users | Transactions/day | Data Size (per day) | System Changes
--- | --- | --- | ---
100 | 1,000 | ~10 MB | Single server, simple DB, no caching needed
10,000 | 100,000 | ~1 GB | DB indexing, read replicas, basic caching
1,000,000 | 10,000,000 | ~100 GB | Sharded DB, distributed cache, horizontal app scaling
100,000,000 | 1,000,000,000 | ~10 TB | Multi-region sharding, archival storage, CDN for UI data
First Bottleneck

The database is the first bottleneck as transaction volume grows. It struggles with write throughput and query latency because transaction history requires frequent writes and complex queries for user statements.

Scaling Solutions
  • Read Replicas: Offload read queries to replicas to reduce load on primary DB.
  • Caching: Use in-memory caches (e.g., Redis) for recent or frequent queries.
  • Sharding: Split data by user ID or time range to distribute load across multiple DB instances.
  • Horizontal Scaling: Add more application servers behind load balancers to handle increased traffic.
  • Archival Storage: Move old transactions to cheaper, slower storage to keep main DB performant.
  • CDN: Use for static UI assets and possibly precomputed reports to reduce server load.
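
The sharding bullet above can be sketched as a small router that maps a user ID to a shard. This is a minimal illustration rather than a production design: the shard count, the `shard_for_user` helper, and the choice of SHA-256 as a stable hash are all assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a digest is used to keep the mapping stable across restarts.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# All transactions for one user land on the same shard, so a
# per-user history query touches a single DB instance.
for uid in ["user-1", "user-2", "user-3"]:
    print(uid, "-> shard", shard_for_user(uid))
```

Plain modulo hashing remaps most users when the shard count changes; consistent hashing or a lookup directory are common ways to reduce that movement.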
Back-of-Envelope Cost Analysis
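  • At 1M users with 10 transactions per user per day: 10M writes/day ≈ 116 writes/sec on average.
  • Database must handle ~200 QPS on average (including reads); peak load can be several times higher.
  • Storage: ~100 GB of new transaction data per day (assuming 10 KB per transaction).
  • Network bandwidth: ~10 MB/s for data transfer (reads + writes).
  • Assuming one DB instance can handle ~5,000 QPS, a single primary covers the write load with headroom; read replicas are added to absorb read-heavy peaks and provide redundancy, while storage growth rather than QPS is what eventually forces sharding and archival.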
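
The arithmetic behind these estimates can be reproduced with a few lines of Python. This is a rough sketch: the `estimate_load` helper is hypothetical, and 10 KB is taken as 10,000 bytes for simplicity.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate_load(users: int, tx_per_user_per_day: int, bytes_per_tx: int) -> dict:
    """Return the average write rate and daily storage growth."""
    writes_per_day = users * tx_per_user_per_day
    return {
        "writes_per_day": writes_per_day,
        "writes_per_sec": writes_per_day / SECONDS_PER_DAY,
        "storage_per_day_gb": writes_per_day * bytes_per_tx / 1e9,
    }


est = estimate_load(users=1_000_000, tx_per_user_per_day=10, bytes_per_tx=10_000)
print(f"{est['writes_per_day']:,} writes/day")       # 10,000,000 writes/day
print(f"{est['writes_per_sec']:.2f} writes/sec")     # 115.74 writes/sec
print(f"{est['storage_per_day_gb']:.0f} GB/day")     # 100 GB/day
```

These are averages; real traffic is bursty, so capacity is usually planned against a peak multiplier rather than the mean.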
Interview Tip

Start by estimating user and transaction volume. Identify the bottleneck (usually DB). Discuss scaling steps in order: caching, read replicas, sharding, horizontal app scaling. Mention trade-offs like consistency and latency. Use real numbers to justify choices.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas and implement caching to reduce load on the primary database before considering sharding or adding more app servers.
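
The caching half of this answer is often implemented as a cache-aside read. The sketch below uses a tiny in-process TTL cache as a stand-in for something like Redis; `TTLCache`, `get_recent_transactions`, and the `fetch_from_replica` callback are hypothetical names for illustration, not part of any library.

```python
import time


class TTLCache:
    """Tiny in-process stand-in for a cache such as Redis (illustration only)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)


def get_recent_transactions(user_id: str, cache: TTLCache, fetch_from_replica):
    """Cache-aside read: try the cache first, then fall back to a read replica."""
    key = f"recent_tx:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    rows = fetch_from_replica(user_id)  # hypothetical replica query
    cache.set(key, rows)
    return rows


# Usage with a fake replica that counts how often it is queried.
calls = {"n": 0}


def fake_replica(user_id: str) -> list:
    calls["n"] += 1
    return [{"id": "t1", "user_id": user_id}]


cache = TTLCache(ttl_seconds=30)
get_recent_transactions("u1", cache, fake_replica)
get_recent_transactions("u1", cache, fake_replica)
print(calls["n"])  # 1: the second read was served from the cache
```

The trade-off is staleness: a newly written transaction may not appear until the cached entry expires or is invalidated on write.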

Key Result
The database is the first bottleneck as transaction volume grows; scaling requires read replicas, caching, and sharding to maintain performance.

Practice

1. What is the main purpose of a transaction history in a system?
easy
A. To record all important actions with details for tracking
B. To speed up the system by caching data
C. To delete old data automatically
D. To encrypt user passwords

Solution

  1. Step 1: Understand the role of transaction history

    Transaction history stores records of actions with details like timestamps and IDs.
  2. Step 2: Identify the correct purpose

    This helps users and systems track past events clearly and reliably.
  3. Final Answer:

    To record all important actions with details for tracking -> Option A
  4. Quick Check:

    Transaction history purpose = record actions [OK]
Hint: Transaction history = record actions with details [OK]
Common Mistakes:
  • Confusing transaction history with caching
  • Thinking it deletes data automatically
  • Mixing it with security features like encryption
2. Which of the following is the correct way to uniquely identify each transaction in a history system?
easy
A. Using a timestamp only
B. Using a unique transaction ID
C. Using the user's name
D. Using the transaction amount

Solution

  1. Step 1: Identify unique identifiers in transaction history

    Unique transaction IDs ensure each record is distinct and traceable.
  2. Step 2: Compare options

    Timestamps alone can repeat; user names and amounts are not unique identifiers.
  3. Final Answer:

    Using a unique transaction ID -> Option B
  4. Quick Check:

    Unique ID = unique transaction record [OK]
Hint: Unique transaction ID ensures distinct records [OK]
Common Mistakes:
  • Assuming timestamp alone is unique
  • Using user name as unique key
  • Using transaction amount as identifier
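
One common way to obtain such an ID is to generate a UUID when the record is created. The sketch below is only an illustration: the `Transaction` class and its field names are assumptions, not a schema from this lesson.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Transaction:
    user_id: str
    amount_cents: int
    # uuid4 yields a random 128-bit ID; collisions are negligible in practice.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Same user, same amount, created moments apart: the IDs still differ.
a = Transaction(user_id="u1", amount_cents=500)
b = Transaction(user_id="u1", amount_cents=500)
print(a.id != b.id)  # True
```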
3. Given this simplified transaction record list:
transactions = [
  {"id": "t1", "time": "2024-01-01T10:00:00Z"},
  {"id": "t2", "time": "2024-01-01T09:00:00Z"},
  {"id": "t3", "time": "2024-01-01T11:00:00Z"}
]

What is the correct order of transaction IDs if sorted by time ascending?
medium
A. ["t1", "t2", "t3"]
B. ["t2", "t3", "t1"]
C. ["t3", "t1", "t2"]
D. ["t2", "t1", "t3"]

Solution

  1. Step 1: Analyze timestamps for each transaction

    t2 = 09:00, t1 = 10:00, t3 = 11:00 (all UTC, as indicated by the trailing Z).
  2. Step 2: Sort transactions by ascending time

    Order is t2 (earliest), then t1, then t3 (latest).
  3. Final Answer:

    ["t2", "t1", "t3"] -> Option D
  4. Quick Check:

    Sorted by time ascending = [t2, t1, t3] [OK]
Hint: Sort by timestamp ascending for correct order [OK]
Common Mistakes:
  • Sorting by ID instead of time
  • Confusing ascending with descending order
  • Ignoring timestamp format
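
The ordering in question 3 can be checked in code. This sketch parses the timestamps before sorting; the `parse_time` helper is a hypothetical name introduced for the example.

```python
from datetime import datetime

transactions = [
    {"id": "t1", "time": "2024-01-01T10:00:00Z"},
    {"id": "t2", "time": "2024-01-01T09:00:00Z"},
    {"id": "t3", "time": "2024-01-01T11:00:00Z"},
]


def parse_time(ts: str) -> datetime:
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


ordered = sorted(transactions, key=lambda t: parse_time(t["time"]))
print([t["id"] for t in ordered])  # ['t2', 't1', 't3']
```

Parsing first avoids relying on string comparison, which only gives the right order when every timestamp uses the same format and time zone.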
4. You have this code snippet to add a transaction record:
def add_transaction(history, transaction):
    if transaction['id'] not in [t['id'] for t in history]:
        history.append(transaction)
    else:
        print("Duplicate transaction")

history = [{"id": "t1"}]
add_transaction(history, {"id": "t1"})

What is the output when running this code?
medium
A. Duplicate transaction
B. KeyError exception
C. No output, transaction added
D. TypeError exception

Solution

  1. Step 1: Check if transaction ID exists in history

    The code checks if 't1' is already in the list of IDs in history.
  2. Step 2: Since 't1' exists, print duplicate message

    The else branch runs and prints "Duplicate transaction".
  3. Final Answer:

    Duplicate transaction -> Option A
  4. Quick Check:

    Duplicate ID detected = print message [OK]
Hint: Check for existing ID before adding to avoid duplicates [OK]
Common Mistakes:
  • Assuming transaction is added anyway
  • Expecting an exception instead of print
  • Confusing list comprehension syntax
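
The snippet in question 4 rebuilds the full list of IDs on every call, which costs O(n) per insert. A possible variant keeps a set of seen IDs alongside the records so the duplicate check is O(1) on average. The `TransactionHistory` class below is a hypothetical sketch, not the lesson's reference design.

```python
class TransactionHistory:
    """Append-only history with O(1) average duplicate detection."""

    def __init__(self) -> None:
        self._records = []
        self._seen_ids = set()

    def add(self, transaction: dict) -> bool:
        """Add a transaction; return False if its ID was already recorded."""
        tx_id = transaction["id"]
        if tx_id in self._seen_ids:
            return False
        self._seen_ids.add(tx_id)
        self._records.append(transaction)
        return True

    def __len__(self) -> int:
        return len(self._records)


history = TransactionHistory()
print(history.add({"id": "t1"}))  # True
print(history.add({"id": "t1"}))  # False (duplicate)
print(len(history))               # 1
```

In a database-backed system the same guarantee usually comes from a unique constraint on the transaction ID rather than an in-memory set.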
5. You want to design a scalable transaction history system for millions of users. Which approach best ensures fast retrieval of a user's transactions sorted by time?
hard
A. Store transactions in separate files per day without indexing
B. Store all transactions in one big list and scan it every time
C. Use a database with an index on user ID and timestamp
D. Keep transactions only in memory without persistence

Solution

  1. Step 1: Consider scalability and retrieval speed

    Scanning one big list or files without index is slow for millions of users.
  2. Step 2: Use database indexing on user ID and timestamp

    This allows fast queries to get transactions per user sorted by time efficiently.
  3. Step 3: Avoid in-memory only storage for persistence and scale

    Memory-only storage risks data loss and limits scale.
  4. Final Answer:

    Use a database with an index on user ID and timestamp -> Option C
  5. Quick Check:

    Indexing = fast retrieval at scale [OK]
Hint: Index on user ID and timestamp for fast queries [OK]
Common Mistakes:
  • Scanning large lists for each query
  • Ignoring indexing benefits
  • Relying on memory-only storage
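
The indexing approach from question 5 can be tried locally with Python's built-in `sqlite3` module. This is a small sketch under assumed names: the table, column, and index names are illustrative, and SQLite stands in for whatever database a real system would use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE transactions (
        id           TEXT PRIMARY KEY,
        user_id      TEXT NOT NULL,
        created_at   TEXT NOT NULL,  -- ISO 8601 UTC, sorts lexicographically
        amount_cents INTEGER NOT NULL
    )
    """
)
# Composite index: equality match on user_id, then ordered by created_at.
conn.execute(
    "CREATE INDEX idx_tx_user_time ON transactions (user_id, created_at)"
)

rows = [
    ("t1", "u1", "2024-01-01T10:00:00Z", 500),
    ("t2", "u1", "2024-01-01T09:00:00Z", 250),
    ("t3", "u2", "2024-01-01T11:00:00Z", 900),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

query = """
    SELECT id FROM transactions
    WHERE user_id = ?
    ORDER BY created_at DESC
    LIMIT 20
"""
print([r[0] for r in conn.execute(query, ("u1",))])  # ['t1', 't2']

# Ask SQLite how it plans to run the query; the plan text varies by version.
for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("u1",)):
    print(row)
```

Putting `user_id` first in the index matters: the query filters on the user and then reads rows in timestamp order, so the database can avoid a separate sort step.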