| Scale | Users | Groups | Expenses | Key Changes |
|---|---|---|---|---|
| Small | 100 | 20 | 500 | Simple in-memory or single database instance, no caching needed |
| Medium | 10,000 | 2,000 | 50,000 | Database load increases, need read replicas and caching for frequent queries |
| Large | 1,000,000 | 200,000 | 5,000,000 | Database sharding, horizontal scaling of app servers, caching layers, async processing |
| Very Large | 100,000,000 | 20,000,000 | 500,000,000 | Multi-region deployment, advanced sharding, CDN for static data, event-driven architecture |
User, Group, Expense classes in LLD - Scalability & System Analysis
As the system grows beyond small scale, the database becomes the first bottleneck because it handles all user, group, and expense data. As users and expenses grow, the query load and write volume exceed what a single instance can serve.
- Database: Add read replicas to handle read-heavy queries like fetching user groups and expenses.
- Caching: Use in-memory caches (e.g., Redis) for frequently accessed data like user profiles and group memberships.
- Sharding: Partition the database by user ID or group ID to distribute write and read load across multiple servers.
- Horizontal Scaling: Add more application servers behind a load balancer to handle increased traffic.
- Async Processing: Use message queues for expense processing to reduce synchronous load.
- CDN: For static assets related to users or groups, use CDN to reduce bandwidth and latency.
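
To make the sharding bullet above concrete, here is a minimal sketch of routing a group to a shard with a stable hash. The shard count, the `shard_for` helper, and the ID format are assumptions for illustration, not part of the original design.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(group_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a group ID to a shard index using a stable hash (hypothetical helper)."""
    # hashlib produces the same digest in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for gid in ["group-1", "group-2", "group-3"]:
    print(gid, "-> shard", shard_for(gid))
```

Sharding by group ID keeps all expenses of one group on the same shard, so a group's expense list can be read without cross-shard queries. A plain modulo scheme like this remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added often.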
- At 1M users, assuming each user generates 5 expense requests per day, total requests per second ~ 60 (5M requests/day ÷ 86400 seconds).
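- Assuming each request fans out into one or two queries, the database handles roughly 100 QPS of reads and writes combined on average. Peak load can run several times higher, which is what motivates read replicas and, as write volume grows, sharding.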
- Storage: Each expense record ~1 KB, so 5M expenses ~5 GB storage, manageable but grows linearly.
- Network bandwidth: Assuming 1 KB per request, 60 QPS ~ 60 KB/s, low but grows with user base.
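
The arithmetic behind these estimates can be checked with a short script. This is a sketch using only the numbers stated above; the `estimate` function and its parameter names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def estimate(users: int, requests_per_user_per_day: int,
             bytes_per_request: int, stored_expenses: int,
             bytes_per_expense: int) -> dict:
    """Back-of-envelope averages; real peak traffic will be higher."""
    requests_per_day = users * requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    return {
        "avg_qps": avg_qps,
        "bandwidth_kb_per_s": avg_qps * bytes_per_request / 1_000,
        "storage_gb": stored_expenses * bytes_per_expense / 1_000_000_000,
    }


result = estimate(
    users=1_000_000,
    requests_per_user_per_day=5,
    bytes_per_request=1_000,     # ~1 KB per request
    stored_expenses=5_000_000,
    bytes_per_expense=1_000,     # ~1 KB per expense record
)
print(f"average QPS: {result['avg_qps']:.1f}")               # average QPS: 57.9
print(f"bandwidth: {result['bandwidth_kb_per_s']:.1f} KB/s")  # bandwidth: 57.9 KB/s
print(f"storage: {result['storage_gb']:.1f} GB")              # storage: 5.0 GB
```

These are daily averages. Traffic is rarely uniform, so capacity is usually planned for a multiple of the average.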
Start by identifying key entities (User, Group, Expense) and their relationships. Discuss expected load and data growth. Identify the first bottleneck (usually database). Propose scaling solutions step-by-step: caching, read replicas, sharding, horizontal scaling. Mention trade-offs and monitoring.
Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas to distribute read load and implement caching for frequent queries. Then consider sharding the database to handle increased write volume.
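
The caching part of this answer is often implemented as a cache-aside read path. The sketch below uses an in-process dictionary as a stand-in for Redis; `TTLCache`, `load_groups_from_db`, and the key format are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny cache-aside helper with per-entry expiry (stand-in for Redis)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # cache hit: skip the database
        value = loader()                 # cache miss or expired: read the database
        self._store[key] = (now + self.ttl_seconds, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so the next read reloads fresh data."""
        self._store.pop(key, None)


calls = {"db": 0}


def load_groups_from_db(user_id: str) -> list:
    """Hypothetical database query; counts calls to show the cache effect."""
    calls["db"] += 1
    return ["trip", "flat"]


cache = TTLCache(ttl_seconds=60)
for _ in range(3):
    groups = cache.get_or_load("groups:u1", lambda: load_groups_from_db("u1"))
print(groups, calls["db"])  # ['trip', 'flat'] 1
```

Three reads cost one database query here. The trade-off is staleness: when a user joins or leaves a group, the write path should call `invalidate` or accept data up to one TTL old.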
Practice
Solution
Step 1: Understand the role of each class
User class represents individual people, Group holds multiple users, Expense tracks costs.
Step 2: Identify which class stores individual info
User class stores personal details like name and ID for each person.
Final Answer:
User -> Option B
Quick Check:
User = Individual person [OK]
- Confusing Group with User
- Thinking Expense stores user details
- Assuming Payment is a class here
Solution
Step 1: Identify correct attribute for multiple users
A list is suitable to hold multiple User objects, so self.users = [] is correct.
Step 2: Check other options for correctness
The option with self.user = {} uses a dict under the singular name user, which is not the typical way to hold a collection of users. The option with self.expenses = [] stores expenses, not users. The option with self.members = None sets members to None, which is not a collection.
Final Answer:
The Group class whose __init__ sets self.users = [] -> Option A
Quick Check:
Group holds list of users = self.users = [] [OK]
- Using dict instead of list for users
- Confusing expenses with users
- Initializing members as None instead of a list

class Expense:
    def __init__(self, amount, paid_by, split_between):
        self.amount = amount
        self.paid_by = paid_by
        self.split_between = split_between

    def split_amount(self):
        return self.amount / len(self.split_between)

expense = Expense(120, 'Alice', ['Alice', 'Bob', 'Charlie'])
print(expense.split_amount())

Solution
Step 1: Understand the split_amount method
It divides the total amount by the number of people in the split_between list.
Step 2: Calculate the split
Amount = 120, split_between has 3 people, so 120 / 3 = 40.0.
Final Answer:
40.0 -> Option D
Quick Check:
120 divided by 3 = 40.0 [OK]
- Forgetting to divide by number of people
- Dividing by 2 instead of 3
- Assuming paid_by affects split amount

class Expense:
    def __init__(self, amount, paid_by, split_between):
        self.amount = amount
        self.paid_by = paid_by
        self.split_between = split_between

    def split_amount(self):
        return self.amount // len(self.split_between)

Solution
Step 1: Analyze the division operator used
The method uses integer division (//) which truncates decimals.
Step 2: Understand impact on money split
Using // can lose fractional cents, causing inaccurate splits.
Final Answer:
Using integer division (//) may lose cents in split -> Option A
Quick Check:
Integer division truncates decimals, causing loss [OK]
- Ignoring decimal loss from integer division
- Confusing data types for paid_by or split_between
- Thinking amount should be string
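
One common way to avoid losing cents is to store amounts as integer cents and hand out the remainder explicitly. The sketch below is one possible fix, not the quiz's official answer; `split_cents` is a hypothetical helper name.

```python
from typing import List


def split_cents(total_cents: int, num_people: int) -> List[int]:
    """Split an amount in integer cents so the shares always sum to the total."""
    if num_people <= 0:
        raise ValueError("num_people must be positive")
    base, remainder = divmod(total_cents, num_people)
    # The first `remainder` people pay one extra cent each.
    return [base + 1 if i < remainder else base for i in range(num_people)]


shares = split_cents(10_000, 3)  # 100.00 split three ways
print(shares)       # [3334, 3333, 3333]
print(sum(shares))  # 10000
```

Here integer division is used deliberately, and the leftover cent is assigned to someone instead of silently disappearing.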
Solution
Step 1: Identify responsibilities for each class
User holds personal info and balances, Group manages users and expenses, Expense tracks costs and splits.
Step 2: Evaluate design for scalability and clarity
The design that creates User, Group, and Expense classes, where Group manages users and expenses, Expense tracks the amount and split, and User tracks an individual balance updated by Group, cleanly separates concerns. That makes it easier to maintain and scale.
Final Answer:
Create User, Group, and Expense classes where Group manages users and expenses; Expense tracks amount and split; User tracks individual balances updated by Group -> Option C
Quick Check:
Clear class roles = scalable design [OK]
- Putting all logic in one class
- Ignoring separation of concerns
- Using flat files for complex data
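
The separation of responsibilities described in this solution might look like the following minimal sketch. Method names such as `add_user` and `add_expense`, and the sign convention for `balance`, are assumptions for illustration. Amounts are floats only to match the quiz code above; integer cents would be safer for real money.

```python
from typing import List


class User:
    """Holds personal info and a running balance."""

    def __init__(self, user_id: str, name: str) -> None:
        self.user_id = user_id
        self.name = name
        self.balance = 0.0  # positive: is owed money; negative: owes money


class Expense:
    """Tracks the amount, who paid, and who shares the cost."""

    def __init__(self, amount: float, paid_by: User, split_between: List[User]) -> None:
        self.amount = amount
        self.paid_by = paid_by
        self.split_between = split_between

    def split_amount(self) -> float:
        return self.amount / len(self.split_between)


class Group:
    """Manages members and expenses, and updates member balances."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.users: List[User] = []
        self.expenses: List[Expense] = []

    def add_user(self, user: User) -> None:
        self.users.append(user)

    def add_expense(self, expense: Expense) -> None:
        self.expenses.append(expense)
        share = expense.split_amount()
        expense.paid_by.balance += expense.amount  # payer fronted the full amount
        for user in expense.split_between:
            user.balance -= share                  # each participant owes a share


alice, bob, charlie = User("u1", "Alice"), User("u2", "Bob"), User("u3", "Charlie")
trip = Group("Trip")
for member in (alice, bob, charlie):
    trip.add_user(member)
trip.add_expense(Expense(120.0, alice, [alice, bob, charlie]))
print(alice.balance, bob.balance, charlie.balance)  # 80.0 -40.0 -40.0
```

Because only Group touches balances, the rules for settling up live in one place, and User and Expense stay simple data holders.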
