
Rate limiting in Microservices - System Design Exercise

Design: Rate Limiting System for Microservices
Design the rate limiting mechanism and its integration with microservice APIs. Out of scope: detailed API business logic and user authentication mechanisms.
Functional Requirements
FR1: Limit the number of requests a user or client can make to an API within a given time window
FR2: Support different rate limits for different users or API keys
FR3: Provide real-time feedback when limits are exceeded
FR4: Ensure rate limiting works correctly in a distributed microservices environment
FR5: Allow configuration changes without downtime
Non-Functional Requirements
NFR1: Handle up to 100,000 requests per second across all services
NFR2: Enforce limits with p99 latency under 50ms
NFR3: Achieve 99.9% availability
NFR4: Support horizontal scaling of microservices
NFR5: Avoid single points of failure
Key Components
API Gateway or Edge Proxy
Distributed Cache or In-memory Store (e.g., Redis)
Rate Limiter Service or Middleware
Configuration Management Service
Monitoring and Alerting System
Design Patterns
Token Bucket or Leaky Bucket algorithms
Fixed Window vs Sliding Window counters
Centralized vs Distributed rate limiting
Client-side vs Server-side enforcement
Circuit Breaker pattern for overload protection
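The token bucket algorithm listed above can be sketched in a few lines. This is a minimal single-process illustration, not a distributed implementation; the class and parameter names are assumptions for this sketch:

```python
import time

class TokenBucket:
    """Token bucket sketch: tokens refill at a fixed rate up to a cap.

    Each request spends one token; bursts up to `capacity` are allowed,
    while the sustained rate converges to `refill_per_sec`.
    """

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A leaky bucket is the mirror image: requests drain from a queue at a fixed rate, smoothing bursts instead of permitting them.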
Reference Architecture
Client
  |
  v
API Gateway / Edge Proxy (with Rate Limiter Middleware)
  |            |
  |            +--> Distributed Cache (Redis Cluster)   [counters]
  |            +--> Configuration Service               [rate-limit rules]
  v
Microservices

Monitoring & Alerting (collects metrics from the gateway, cache, and services)
Components
API Gateway / Edge Proxy
Envoy, NGINX, or Kong
Intercepts incoming requests and enforces rate limits before forwarding them to microservices
Rate Limiter Middleware
Custom middleware or an Envoy filter
Checks and updates per-user/per-key request counts in the distributed cache
Distributed Cache
Redis Cluster
Stores rate-limiting counters and timestamps with low latency
Configuration Service
Central config store (e.g., Consul, etcd)
Manages rate limit rules and allows dynamic updates without downtime
Monitoring & Alerting
Prometheus + Grafana
Tracks rate limit usage, errors, and overall system health
Request Flow
1. Client sends request to API Gateway
2. API Gateway extracts user identity or API key
3. Rate Limiter Middleware atomically increments the counter for that user/key in Redis (e.g., INCR, setting a TTL on first use) — a separate read-then-write would race under concurrent requests
4. If the resulting count is within the limit, the request is forwarded to the target microservice
5. If the count exceeds the limit, the gateway responds with HTTP 429 Too Many Requests
6. Configuration Service provides rate limit rules to middleware dynamically
7. Monitoring system collects metrics on rate limiting events and system performance
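The check-and-increment at the heart of this flow can be sketched with a fixed-window counter. A plain dict stands in for Redis here so the example is runnable; in production the counter would be a Redis key incremented with INCR and expired with a TTL. Class and parameter names are illustrative:

```python
import time

class FixedWindowLimiter:
    """Fixed-window counter sketch of the gateway's rate-limit check."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.counters = {}  # stand-in for Redis: key -> request count

    def check(self, user_id, now=None):
        """Return an (HTTP status, reason) pair for one incoming request."""
        now = time.time() if now is None else now
        # Align the key to the start of the current window, matching the
        # "rate_limit:{user_id}:{window_start_timestamp}" key format.
        window_start = int(now // self.window_seconds) * self.window_seconds
        key = f"rate_limit:{user_id}:{window_start}"
        count = self.counters.get(key, 0) + 1  # Redis: INCR key
        self.counters[key] = count
        if count > self.max_requests:
            return 429, "Too Many Requests"
        return 200, "OK"
```

Note that fixed windows allow up to double the limit across a window boundary; a sliding-window counter smooths this at the cost of slightly more bookkeeping.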
Database Schema
Entities:
- User or API Key: id, rate_limit_policy_id
- Rate Limit Policy: id, max_requests, window_size_seconds
- Counters stored in Redis as keys "rate_limit:{user_id}:{window_start_timestamp}" with an integer count
Relationships:
- Each User/API Key references one Rate Limit Policy
- Counters are ephemeral and reset after the window expires
Scaling Discussion
Bottlenecks
Redis becoming a single point of failure or performance bottleneck
API Gateway overload due to high request volume
Latency increase due to network calls to distributed cache
Configuration updates causing inconsistent rate limits across nodes
Solutions
Use Redis Cluster with sharding and replication for high availability and throughput
Deploy multiple API Gateway instances behind a load balancer
Use local caches with short TTLs to reduce Redis calls, accepting slight eventual consistency
Implement versioned configuration with atomic updates and cache invalidation
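The local-cache mitigation above can be sketched as a short-TTL decision cache in front of the shared store: recent verdicts are served locally, trading strict accuracy for fewer network round trips. The class and names are assumptions for this sketch:

```python
import time

class CachedDecision:
    """Short-TTL local cache of rate-limit verdicts.

    Serving a cached verdict avoids a Redis round trip per request, at the
    cost of slight overshoot or undershoot near the limit (eventual
    consistency across gateway instances).
    """

    def __init__(self, ttl_seconds: float = 0.5):
        self.ttl = ttl_seconds
        self.cache = {}  # user_id -> (allowed, expires_at)

    def get(self, user_id):
        """Return the cached verdict, or None if missing or expired."""
        entry = self.cache.get(user_id)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        return None

    def put(self, user_id, allowed: bool):
        self.cache[user_id] = (allowed, time.monotonic() + self.ttl)
```

A gateway instance would consult this cache first and fall through to Redis only on a miss, keeping p99 latency well under the 50 ms budget in NFR2.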
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Explain different rate limiting algorithms and why you chose one
Discuss trade-offs between strict consistency and performance
Highlight how the design handles a distributed microservices environment
Mention how dynamic configuration and monitoring improve operability
Address potential bottlenecks and scaling strategies