
Rate limiting in FastAPI - Performance & Optimization


Performance: Rate limiting

MEDIUM IMPACT

Rate limiting affects server response time and user interaction speed by controlling request frequency to avoid overload.

Controlling API request frequency to prevent server overload

FastAPI

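import time

from fastapi import FastAPI, Request
from starlette.responses import JSONResponse

app = FastAPI()

WINDOW_SECONDS = 60   # length of one fixed window
MAX_REQUESTS = 100    # requests allowed per client per window

# Maps "ip:window" to a request count. In-memory, so each worker process has its own copy.
rate_limits = {}

@app.middleware("http")
async def fixed_window_rate_limit(request: Request, call_next):
    # request.client can be None, so fall back to a shared placeholder key
    client_ip = request.client.host if request.client else "unknown"
    window = int(time.time()) // WINDOW_SECONDS  # index of the current 1-minute window
    key = f"{client_ip}:{window}"
    count = rate_limits.get(key, 0)
    if count >= MAX_REQUESTS:
        return JSONResponse(status_code=429, content={"detail": "Too many requests"})
    rate_limits[key] = count + 1
    return await call_next(request)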

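Counts are kept per client and per one-minute window, so a client's count effectively expires when the next window starts. This limits requests fairly and avoids permanent blocks. Keys for past windows are never deleted from rate_limits, however, so the dictionary grows for as long as the process runs.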

📈 Performance Gain: Reduces unnecessary blocking and keeps the server responsive, which improves INP. A burst from one client only uses up that client's quota for the current minute.
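
Because the counts are keyed by window, entries for past windows stay in memory unless something removes them. Below is a minimal sketch of a fixed-window counter that discards stale counts, written as plain Python so it can be tried without a server. The class name `FixedWindowLimiter` and its `allow` and `retry_after` methods are hypothetical names chosen for illustration, not part of FastAPI.

```python
import time


class FixedWindowLimiter:
    """Per-client fixed-window counter that drops counts from past windows."""

    def __init__(self, max_requests=100, window_seconds=60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.current_window = None  # index of the window the counts belong to
        self.counts = {}            # client id -> requests seen in current_window

    def allow(self, client_id, now=None):
        """Return True if the request is within the limit, False otherwise."""
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        if window != self.current_window:
            # A new window started: discard every count from the old one.
            self.current_window = window
            self.counts = {}
        count = self.counts.get(client_id, 0)
        if count >= self.max_requests:
            return False
        self.counts[client_id] = count + 1
        return True

    def retry_after(self, now=None):
        """Whole seconds until the current window ends (at least 1)."""
        now = time.time() if now is None else now
        return max(1, int(self.window_seconds - now % self.window_seconds))
```

In the middleware, a call such as `limiter.allow(client_ip)` could stand in for the dictionary lookup, and `retry_after()` could supply the value of a `Retry-After` header on the 429 response.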

Controlling API request frequency to prevent server overload

FastAPI

from fastapi import FastAPI, Request
from starlette.responses import JSONResponse

app = FastAPI()

@app.middleware("http")
async def naive_rate_limit(request: Request, call_next):
    # Simple in-memory counter without expiration
    if not hasattr(app.state, 'counter'):
        app.state.counter = 0
    app.state.counter += 1
    if app.state.counter > 100:
        return JSONResponse(status_code=429, content={"detail": "Too many requests"})
    response = await call_next(request)
    return response

This naive approach uses a single global counter that never resets. Once the application has served 100 requests in total, every later request from any client receives a 429 until the process restarts, and there is no per-user control.

📉 Performance Cost: Blocks requests unnecessarily, causing poor INP and potentially wasting server resources
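
The effect of the missing reset can be seen by simulating only the counting logic, with no server involved. The sketch below contrasts the two strategies for a slow, steady trickle of requests. The function names `naive_allowed` and `windowed_allowed` and the limit of 3 are made up for the demonstration.

```python
def naive_allowed(timestamps, limit):
    """Global counter that never resets: one decision per request."""
    counter = 0
    decisions = []
    for _ in timestamps:
        counter += 1
        decisions.append(counter <= limit)
    return decisions


def windowed_allowed(timestamps, limit, window_seconds=60):
    """Counter that restarts whenever a new fixed window begins."""
    counts = {}
    decisions = []
    for ts in timestamps:
        window = ts // window_seconds
        count = counts.get(window, 0)
        decisions.append(count < limit)
        if count < limit:
            counts[window] = count + 1
    return decisions


# One request every 30 seconds for three minutes, with a limit of 3.
timestamps = [0, 30, 60, 90, 120, 150]
print(naive_allowed(timestamps, limit=3))
print(windowed_allowed(timestamps, limit=3))
```

The naive version rejects everything after the third request, even though no minute ever contains more than two requests. The windowed version accepts all six.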

Performance Comparison

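Pattern	Counter scope	Counter reset	Client-side rendering cost	Verdict
Naive global counter	One counter shared by all clients	Never (until restart)	None (server-side only)	[X] Bad
Per-client fixed window	One counter per client IP	Every 60 seconds	None (server-side only)	[OK] Good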

Rendering Pipeline

Rate limiting runs on the server before response generation, affecting how quickly the server can send responses back to the browser.

Server Processing → Network Response

⚠️ Bottleneck: Server Processing, when the rate limiting logic is inefficient or blocks too many requests

Core Web Vital Affected

INP

Rate limiting affects server response time and user interaction speed by controlling request frequency to avoid overload.

Optimization Tips

1. Avoid global counters without expiration to prevent permanent blocking.

2. Implement per-client rate limits with time windows to balance fairness and performance.

3. Monitor 429 responses and server response times to ensure rate limiting does not degrade user experience.
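
The monitoring tip can be approximated with a small tally of response status codes. This is a minimal sketch using only the standard library. `StatusTally` and its methods are hypothetical helpers, and the assumption is that `record` would be called from a middleware once `call_next` has returned a response.

```python
from collections import Counter


class StatusTally:
    """Counts responses by status code and reports the share of 429s."""

    def __init__(self):
        self.by_status = Counter()

    def record(self, status_code):
        self.by_status[status_code] += 1

    def rejected_ratio(self):
        """Fraction of recorded responses that were 429, or 0.0 if none recorded."""
        total = sum(self.by_status.values())
        return self.by_status[429] / total if total else 0.0
```

A ratio that stays high under ordinary traffic may indicate that the limit is too strict for real usage, though the right threshold depends on the application.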

Performance Quiz - 3 Questions

Test your performance knowledge

What is the main performance risk of a naive global counter for rate limiting in FastAPI?

A. It causes excessive CSS recalculations

B. It increases DOM reflows on the client

C. It can permanently block all users after the limit is reached

D. It reduces network bandwidth

DevTools: Network

How to check: Open DevTools, go to the Network tab, make rapid repeated requests to the API endpoint, and observe the response status codes and timing.

What to look for: 429 status codes show that rate limiting is being applied. Response times for accepted requests should stay low, which shows the limiter itself is not adding delay.
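
The same check can be scripted instead of clicking through DevTools. The sketch below sends requests one after another and tallies the status codes using only the standard library. The helper names `fetch_status` and `burst` are hypothetical, and the URL in the usage note assumes the app is running locally on port 8000.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def fetch_status(url):
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is on the exception.
        return err.code


def burst(url, total, fetch=fetch_status):
    """Send `total` sequential requests; return status counts and the slowest time."""
    statuses = Counter()
    slowest = 0.0
    for _ in range(total):
        started = time.perf_counter()
        statuses[fetch(url)] += 1
        slowest = max(slowest, time.perf_counter() - started)
    return statuses, slowest
```

Against a locally running app, a call such as `burst("http://127.0.0.1:8000/", total=120)` would be expected to report 429 for requests beyond the 100th within the same minute, with the exact split depending on where the window boundary falls.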