| Scale | Requests per Second (RPS) | ID Generation Method | Storage Needs | Latency | Notes |
|---|---|---|---|---|---|
| 100 users | ~10 RPS | Simple timestamp + counter | Minimal (few KB) | <1 ms | Single server, no concurrency issues |
| 10,000 users | ~1,000 RPS | Timestamp + machine ID + sequence number | Small (MB) | <5 ms | Single server with concurrency control or small cluster |
| 1,000,000 users | ~100,000 RPS | Distributed ID generator (e.g., Snowflake) | Moderate (GB logs) | <10 ms | Multiple servers, coordination needed |
| 100,000,000 users | ~10,000,000 RPS | Highly distributed, sharded generators + caching | Large (TB logs) | <20 ms | Global distribution, fault tolerance critical |
Design a unique ID generator in HLD - Scalability & System Analysis
The first bottleneck is the central coordination or state management that ensures uniqueness. At low scale, a single server handles ID generation easily. As traffic grows, concurrency and synchronization overhead around that shared state exhaust the server's CPU and memory. Distributed setups add two further problems: network latency and clock synchronization.
- Horizontal Scaling: Add more ID generator nodes with unique machine IDs to distribute load.
- Sharding: Partition ID space by machine or region to avoid collisions.
- Caching: Pre-generate ID blocks to reduce coordination calls.
- Time-based IDs: Put a timestamp in the high bits of each ID so that nodes need no per-ID coordination, only a unique machine ID.
- Coordination Services: Use lightweight consensus or coordination (e.g., ZooKeeper) carefully to avoid bottlenecks.
- Fault Tolerance: Design for node failures without ID collisions.
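As an illustration of how the time-based, machine-ID, and fault-tolerance points fit together, here is a minimal Snowflake-style sketch. The bit layout (timestamp, then 10 machine bits, then 12 sequence bits), the custom epoch, and the names `SnowflakeGenerator` and `next_id` are assumptions chosen for this example, not a reference implementation. It also assumes each node has already been given a unique `machine_id`, for example by a coordination service at startup.

```python
import threading
import time

# Assumed layout: [timestamp ms | 10-bit machine ID | 12-bit sequence]
EPOCH_MS = 1_577_836_800_000  # 2020-01-01T00:00:00Z, an arbitrary custom epoch
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Generates IDs that are unique as long as machine_id is unique per node."""

    def __init__(self, machine_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._clock = clock          # injectable for testing
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Refuse to issue IDs rather than risk a collision.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted in this millisecond: wait for the next one.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )
```

With this layout a single node can issue up to 4,096 IDs per millisecond without any network call, and IDs from one node sort by creation time. The trade-off is a dependence on the local clock, which is why the sketch fails loudly when the clock moves backwards.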
- At 1M users generating 100K IDs/sec, each ID ~8 bytes -> 800 KB/sec storage if logged.
- Network bandwidth: 100K RPS × 8-byte IDs ≈ 0.8 MB/sec of ID payload (about 6.4 Mbps, excluding protocol overhead), easily handled by a 1 Gbps network.
- CPU: If each server sustains ~5K ID requests per second, 100K RPS needs ~20 servers.
- Storage: Logged IDs grow ~70 GB/day at 100K RPS (800 KB/sec × 86,400 sec ≈ 69 GB); backups add to that.
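The estimates above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch: the per-server throughput of 5,000 requests per second and the 1 Gbps link are the assumptions stated in the list, the variable names are illustrative, and the bandwidth figure counts ID payload only.

```python
import math

ID_SIZE_BYTES = 8          # one 64-bit ID
PEAK_RPS = 100_000         # ID requests per second at 1M users
PER_SERVER_RPS = 5_000     # assumed sustainable throughput per server
SECONDS_PER_DAY = 86_400
LINK_BITS_PER_SEC = 1e9    # 1 Gbps network

# Bytes of ID payload produced (and logged) each second.
bytes_per_sec = PEAK_RPS * ID_SIZE_BYTES

# Servers needed to absorb the peak load, rounded up.
servers = math.ceil(PEAK_RPS / PER_SERVER_RPS)

# Daily log growth in decimal gigabytes.
gb_per_day = bytes_per_sec * SECONDS_PER_DAY / 1e9

# Fraction of the link used by ID payload alone.
link_utilization = bytes_per_sec * 8 / LINK_BITS_PER_SEC

print(f"{bytes_per_sec / 1000:.0f} KB/sec, {servers} servers, "
      f"{gb_per_day:.2f} GB/day, {link_utilization:.2%} of link")
```

Changing `PEAK_RPS` or `PER_SERVER_RPS` shows quickly how the server count and log volume scale for the other rows of the table.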
Start by clarifying requirements: ID length, uniqueness scope (global or per service), latency needs, and failure tolerance. Then discuss simple solutions for low scale and identify bottlenecks as scale grows. Propose incremental scaling strategies and justify choices with trade-offs. Always mention fault tolerance and collision avoidance.
Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Since the database is the bottleneck, first add read replicas or caching to reduce its read load. For ID generation, move from a single centralized generator to a distributed approach with machine IDs and sequence numbers, so that issuing an ID no longer requires a database round trip.
