
Design a unique ID generator in HLD - Scalability & System Analysis

Scalability Analysis - Design a unique ID generator
Growth Table: Unique ID Generator Scaling
| Scale | Requests per Second (RPS) | ID Generation Method | Storage Needs | Latency | Notes |
|---|---|---|---|---|---|
| 100 users | ~10 RPS | Simple timestamp + counter | Minimal (few KB) | <1 ms | Single server, no concurrency issues |
| 10,000 users | ~1,000 RPS | Timestamp + machine ID + sequence number | Small (MB) | <5 ms | Single server with concurrency control or small cluster |
| 1,000,000 users | ~100,000 RPS | Distributed ID generator (e.g., Snowflake) | Moderate (GB logs) | <10 ms | Multiple servers, coordination needed |
| 100,000,000 users | ~10,000,000 RPS | Highly distributed, sharded generators + caching | Large (TB logs) | <20 ms | Global distribution, fault tolerance critical |
First Bottleneck

The first bottleneck is the central coordination or state management that guarantees uniqueness. At low scale, a single server handles ID generation easily. As traffic grows, every request contends for the same shared counter or lock, so concurrency and synchronization overhead push that server to its CPU and memory limits. Distributed setups add two further problems: network latency to any shared coordinator, and clock synchronization between nodes.
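
To make the coordination cost concrete, here is a minimal sketch of a centralized allocator. The names `CentralCounter`, `reserve`, and `generate` are hypothetical and exist only for this illustration; the lock stands in for whatever shared state a real deployment would use (a database row, a coordination service, and so on).

```python
import threading


class CentralCounter:
    """Hypothetical single-node allocator: every caller serializes on one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 0
        self.calls = 0  # how many times callers had to coordinate

    def reserve(self, count=1):
        """Reserve `count` consecutive IDs and return them as a range."""
        with self._lock:
            start = self._next
            self._next += count
            self.calls += 1
            return range(start, start + count)


def generate(counter, total, block_size):
    """Produce `total` IDs, asking the counter for `block_size` IDs per call."""
    ids = []
    while len(ids) < total:
        ids.extend(counter.reserve(min(block_size, total - len(ids))))
    return ids


per_id = CentralCounter()
generate(per_id, 10_000, block_size=1)      # one coordination call per ID

per_block = CentralCounter()
generate(per_block, 10_000, block_size=1_000)  # one call per 1,000 IDs
print(per_id.calls, per_block.calls)
```

Requesting one ID per call means the number of trips through the lock grows with traffic, which is the contention described above. Reserving ranges keeps uniqueness while cutting those trips, at the cost of gaps if a caller crashes with unused IDs.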

Scaling Solutions
  • Horizontal Scaling: Add more ID generator nodes with unique machine IDs to distribute load.
  • Sharding: Partition ID space by machine or region to avoid collisions.
  • Caching: Pre-generate ID blocks to reduce coordination calls.
  • Use of Time-based IDs: Incorporate timestamps to reduce coordination.
  • Coordination Services: Use lightweight consensus or coordination (e.g., ZooKeeper) carefully to avoid bottlenecks.
  • Fault Tolerance: Design for node failures without ID collisions.
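
Several of the ideas above (unique machine IDs, time-based IDs, no per-request coordination) combine in a Snowflake-style generator. The following is a sketch under stated assumptions, not the layout of any particular production system: it assumes 41 bits of milliseconds since a custom epoch, 10 bits of machine ID, and 12 bits of per-millisecond sequence. `EPOCH_MS`, `SnowflakeLikeGenerator`, and the injectable `clock` parameter are hypothetical names.

```python
import threading
import time

# Assumed layout: [41 bits timestamp ms][10 bits machine ID][12 bits sequence]
EPOCH_MS = 1_600_000_000_000  # hypothetical custom epoch (ms since Unix epoch)
MACHINE_BITS = 10
SEQUENCE_BITS = 12
MAX_MACHINE_ID = (1 << MACHINE_BITS) - 1
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeLikeGenerator:
    def __init__(self, machine_id, clock=lambda: time.time_ns() // 1_000_000):
        if not 0 <= machine_id <= MAX_MACHINE_ID:
            raise ValueError("machine_id out of range")
        self.machine_id = machine_id
        self._clock = clock  # returns current time in milliseconds
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                # Issuing IDs here could repeat an earlier timestamp.
                raise RuntimeError("clock moved backwards; refusing to issue IDs")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (self.machine_id << SEQUENCE_BITS)
                | self._sequence
            )


node_a = SnowflakeLikeGenerator(machine_id=1)
node_b = SnowflakeLikeGenerator(machine_id=2)
print(node_a.next_id(), node_b.next_id())
```

Two nodes never collide as long as their machine IDs differ, so the only coordination needed is assigning those machine IDs (for example at startup via a coordination service). The remaining risk is clock regression on a single node, which this sketch handles by failing loudly rather than issuing a possibly duplicate ID.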
Back-of-Envelope Cost Analysis
  • At 1M users generating 100K IDs/sec, each ID ~8 bytes -> 800 KB/sec storage if logged.
  • Network bandwidth for 100K RPS with 8-byte IDs ≈ 0.8 MB/sec, easily handled by 1 Gbps network.
  • CPU: Assuming each server sustains ~5K ID requests/sec, 100K RPS needs ~20 servers.
  • Storage: Logging the raw 8-byte IDs alone grows ~70 GB/day at 100K RPS (800 KB/sec × 86,400 sec); log metadata and backups add to this.
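
The estimates above can be reproduced with a few lines of arithmetic. The inputs come from the bullets; the per-server throughput of 5,000 requests/sec is the same rough assumption used there, not a measured figure.

```python
import math

ids_per_sec = 100_000            # load at ~1M users
id_bytes = 8                     # one 64-bit ID
per_server_rps = 5_000           # assumed sustainable rate per server
link_bits_per_sec = 1_000_000_000  # 1 Gbps network link

bytes_per_sec = ids_per_sec * id_bytes                    # raw ID payload only
gb_per_day = bytes_per_sec * 86_400 / 1e9                 # decimal gigabytes
link_utilization = bytes_per_sec * 8 / link_bits_per_sec  # fraction of the link
servers = math.ceil(ids_per_sec / per_server_rps)

print(f"{bytes_per_sec / 1e3:.0f} KB/sec")
print(f"{gb_per_day:.2f} GB/day")
print(f"{link_utilization:.2%} of a 1 Gbps link")
print(f"{servers} servers")
```

These figures count only the ID payload. Request framing, log metadata, and replication would raise the bandwidth and storage numbers.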
Interview Tip

Start by clarifying requirements: ID length, uniqueness scope (global or per service), latency needs, and failure tolerance. Then discuss simple solutions for low scale and identify bottlenecks as scale grows. Propose incremental scaling strategies and justify choices with trade-offs. Always mention fault tolerance and collision avoidance.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Since the database is the bottleneck, first add read replicas or caching to reduce load. For ID generation, move from a single centralized generator to a distributed approach with machine IDs and sequence numbers to avoid database contention.

Key Result
A unique ID generator scales by moving from a simple centralized approach to a distributed system with machine IDs and sequence numbers, addressing coordination bottlenecks and ensuring fault tolerance as traffic grows from tens to millions of requests per second.