| Users | Requests per Second | Flag Evaluations per Second | Storage Size | Latency Requirement | Architecture / Complexity |
|---|---|---|---|---|---|
| 100 users | ~10-50 RPS | ~10-50 | KBs (few flags) | Negligible | Simple flag storage in memory or DB |
| 10,000 users | ~1,000-5,000 RPS | ~1,000-5,000 | MBs (hundreds of flags) | Low latency needed | Use caching, distributed config store |
| 1,000,000 users | ~100,000 RPS | ~100,000 | GBs (thousands of flags, segments) | Must be <10ms per eval | Use CDN, caching, distributed flag evaluation service |
| 100,000,000 users | ~10,000,000 RPS | ~10,000,000 | TBs (complex targeting, analytics) | Highly optimized, near real-time | Global distributed system, sharding, edge caching |
## Feature Flags in Microservices: Scalability & System Analysis
The first bottleneck is the flag evaluation service and its data store. As user count and request volume grow, the system must evaluate flags quickly on every request. The database or config store can become overwhelmed by read traffic or by complex targeting rules, causing latency spikes. Common mitigations:
- Caching: Use in-memory caches (e.g., Redis, local caches) to store flag data and evaluation results to reduce DB load.
- Horizontal Scaling: Add more instances of the flag evaluation service behind a load balancer to handle more concurrent requests.
- Read Replicas: Use database read replicas to distribute read traffic for flag configurations.
- Sharding: Partition flag data by user segments or regions to reduce data size per node.
- CDN and Edge Caching: Cache flag data closer to users to reduce latency and network load.
- Asynchronous Updates: Push flag changes asynchronously to services to avoid blocking requests.
- Feature Flag Evaluation SDKs: Use client-side evaluation where possible to reduce server load.
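Client-side (or in-process) evaluation works because a flag decision can be computed deterministically from the flag key and the user ID, with no per-request server call or stored per-user state. A minimal sketch of a percentage rollout using hash bucketing; the function name and flag keys are illustrative, not from any specific SDK:

```python
import hashlib

def evaluate_flag(flag_key: str, user_id: str, rollout_percent: int) -> bool:
    """Deterministic percentage rollout: hash (flag, user) into a 0-99 bucket.

    The same user always lands in the same bucket for a given flag, so the
    decision is stable across requests and across service instances without
    any shared state on the request path.
    """
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < rollout_percent

# A 30% rollout: roughly 30% of users see the feature, consistently.
enabled = [evaluate_flag("new-checkout", f"user-{i}", 30) for i in range(1000)]
```

Because the hash is stable, ramping the rollout from 30% to 50% only adds users; no one who already had the feature loses it.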
- At 1M users with 100K RPS, assuming each flag evaluation reads ~1 KB of data, total bandwidth is ~100 MB/s.
- Storage for flags and targeting rules grows with complexity; expect GBs at million-user scale.
- CPU usage depends on evaluation complexity; caching reduces CPU by avoiding repeated computations.
- Network bandwidth and latency critical; edge caching reduces cross-region traffic.
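The bandwidth figure above is a straightforward back-of-envelope product; spelled out, with the assumed inputs labeled:

```python
# Back-of-envelope check for the 1M-user tier (both inputs are assumptions
# from the estimate above, not measured values).
rps = 100_000            # flag evaluations per second
bytes_per_eval = 1_024   # ~1 KB of flag/targeting data read per evaluation

bandwidth_mb_s = rps * bytes_per_eval / 1_000_000  # ~102 MB/s
```

Caching changes the picture: if 99% of evaluations hit a local cache, only ~1 MB/s reaches the data store.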
Start by explaining what feature flags are and why they matter. Then discuss how load grows with users and requests. Identify the bottleneck clearly (flag evaluation and data store). Propose concrete scaling solutions like caching, horizontal scaling, and sharding. Mention trade-offs like consistency vs latency. Use real numbers to show understanding.
Your database handles 1000 QPS for flag data reads. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add caching layer (e.g., Redis or in-memory cache) to reduce direct DB reads and improve response time before scaling the database or services.
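The answer is the cache-aside pattern: serve reads from memory, fall back to the database only on a miss, and expire entries so flag changes still propagate. A minimal in-process sketch, with a counter standing in for the real database (the class and function names are illustrative):

```python
import time

def db_lookup(flag_key):
    """Stand-in for the slow path; a real service would query the flag DB here."""
    db_lookup.calls += 1
    return {"key": flag_key, "enabled": True}
db_lookup.calls = 0

class FlagCache:
    """Cache-aside for flag configs with a TTL, so the DB sees only misses."""

    def __init__(self, fetch, ttl_seconds=30):
        self._fetch = fetch        # slow path: flag_key -> config
        self._ttl = ttl_seconds    # staleness bound: flag changes take effect within TTL
        self._entries = {}         # flag_key -> (config, expires_at)

    def get(self, flag_key):
        entry = self._entries.get(flag_key)
        now = time.monotonic()
        if entry and entry[1] > now:
            return entry[0]                      # cache hit: no DB read
        config = self._fetch(flag_key)           # cache miss: one DB read
        self._entries[flag_key] = (config, now + self._ttl)
        return config

cache = FlagCache(db_lookup, ttl_seconds=30)
first = cache.get("new-checkout")   # miss: goes to the DB
second = cache.get("new-checkout")  # hit: served from memory
```

The TTL is the trade-off knob mentioned earlier: a shorter TTL means fresher flags but more DB reads; a longer TTL absorbs more of the 10x traffic at the cost of slower flag propagation.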