Request timing middleware in FastAPI - Deep Dive

Overview - Request timing middleware

What is it?

Request timing middleware is code that runs on every web request and measures how long the request takes to process. It sits between the server and your route handlers, recording a start time before the request is handled and an end time after. This shows developers how fast the application responds and where the slow parts are. Because it is middleware, it runs automatically for every request without any changes to the route handlers.

Why it matters

Without request timing middleware, developers would struggle to know which parts of their web app are slow or causing delays. This can lead to poor user experience because slow responses frustrate users. By measuring request times, teams can improve performance, fix bottlenecks, and ensure the app runs smoothly. It also helps in monitoring and alerting when something goes wrong.

Where it fits

Before learning request timing middleware, you should understand basic FastAPI app structure and how requests and responses flow. After this, you can learn about advanced middleware features, logging, monitoring tools, and performance optimization techniques.

Mental Model

Core Idea

Request timing middleware acts like a stopwatch that starts before handling a request and stops after, measuring the total time taken automatically for every web request.

Think of it like...

It's like a cashier timing how long each customer takes at checkout to find out if the line is moving fast or slow.

┌───────────────┐
│ Client sends  │
│ request       │
└──────┬────────┘
       │
┌──────▼────────┐
│ Middleware    │
│ starts timer  │
└──────┬────────┘
       │
┌──────▼────────┐
│ Request       │
│ processing    │
└──────┬────────┘
       │
┌──────▼────────┐
│ Middleware    │
│ stops timer   │
│ logs duration │
└──────┬────────┘
       │
┌──────▼────────┐
│ Response sent │
│ to client     │
└───────────────┘

Build-Up - 7 Steps

Foundation: Understanding Middleware Basics

Concept: Middleware is code that runs before and after each request in a web app.

In FastAPI, middleware wraps around the request handling process. It can modify requests, responses, or perform actions like logging. Middleware is added to the app and runs automatically for every request.

Result

You know middleware runs on every request and can do things before and after the main handler.

Understanding middleware as a wrapper around requests is key to grasping how timing can be measured without changing the main code.
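
To make the wrapper idea concrete, here is a minimal sketch that uses plain asyncio instead of a real FastAPI app. The names `handler`, `middleware`, and `events` are illustrative only; the one thing borrowed from FastAPI is the `(request, call_next)` shape that HTTP middleware receives.

```python
import asyncio

events = []  # records the order in which each piece of code runs


async def handler(request):
    # Stand-in for a route handler (hypothetical, not a FastAPI route).
    events.append("handler")
    return {"status": 200}


async def middleware(request, call_next):
    events.append("before")              # runs before the handler
    response = await call_next(request)  # hands the request to the handler
    events.append("after")               # runs after the handler has returned
    return response


response = asyncio.run(middleware({"path": "/"}, handler))
print(events)
```

The handler itself is untouched: everything the middleware does happens around the `await call_next(request)` line, which is where a timer can be started and stopped.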

Foundation: Measuring Time in Python
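
As a rough illustration of this step, the sketch below compares the two standard-library clocks that usually come up for request timing, then measures a small piece of work with `time.perf_counter()`. The summed range is just a placeholder workload, and the reported resolution depends on the platform.

```python
import time

# Inspect the properties of both clocks on this platform.
for name in ("time", "perf_counter"):
    info = time.get_clock_info(name)
    print(f"{name}: monotonic={info.monotonic}, resolution={info.resolution}")

# Measure an elapsed interval: take a reading before and after the work.
start = time.perf_counter()
total = sum(range(100_000))  # placeholder workload
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.6f} s")
```

Only the difference between two `perf_counter()` readings is meaningful; a single reading has no defined reference point.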

Intermediate: Creating Basic Timing Middleware

Intermediate: Adding Timing Info to Response Headers

Intermediate: Logging Request Duration for Monitoring

Advanced: Handling Exceptions in Timing Middleware

Expert: Async Performance and Middleware Overhead

Under the Hood

HTTP middleware registered with @app.middleware("http") is an async function that intercepts each request. It receives the request and a call_next function that passes the request on to the rest of the application. The middleware records the current time before calling call_next, then awaits the response. Once the response comes back, it records the end time and computes the difference, which is then logged or added to the response. Because the middleware awaits call_next rather than blocking, the event loop stays free to serve other requests while one is in progress.
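
The flow described above could be sketched as follows. The function is written as a plain coroutine and exercised with stand-in objects so it runs without a server. `fake_call_next` and the `SimpleNamespace` request and response are hypothetical test doubles, and the header name `X-Process-Time` is an assumed choice, not a requirement.

```python
import asyncio
import time
from types import SimpleNamespace


async def timing_middleware(request, call_next):
    start = time.perf_counter()              # start the stopwatch
    response = await call_next(request)      # run the rest of the application
    duration = time.perf_counter() - start   # stop the stopwatch
    response.headers["X-Process-Time"] = f"{duration:.6f}"
    return response

# In a real app this would be registered with:
#   app.middleware("http")(timing_middleware)   # assuming app = FastAPI()


async def fake_call_next(request):
    # Stand-in for the downstream application (hypothetical).
    await asyncio.sleep(0.05)
    return SimpleNamespace(headers={})


response = asyncio.run(timing_middleware(SimpleNamespace(), fake_call_next))
print(response.headers["X-Process-Time"])
```

Keeping the middleware as an ordinary coroutine also makes it easy to unit test, since any awaitable can play the role of `call_next`.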

Why designed this way?

Middleware was designed to be a reusable, centralized way to add cross-cutting concerns like logging, timing, or authentication without changing each route handler. The async design fits FastAPI's goal of high concurrency and performance. Alternatives like decorating every route would be repetitive and error-prone. Middleware provides a clean, consistent hook for request lifecycle events.

┌───────────────┐
│ Incoming      │
│ HTTP Request  │
└──────┬────────┘
       │
┌──────▼────────┐
│ Middleware    │
│ start timer   │
│await call_next│
│ end timer     │
│ add header/log│
└──────┬────────┘
       │
┌──────▼────────┐
│ Route Handler │
│ processes req │
│ returns resp  │
└──────┬────────┘
       │
┌──────▼────────┐
│ Response sent │
│ to client     │
└───────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Does timing middleware measure only the route handler time or the entire request lifecycle? Commit to your answer.

Common Belief: Timing middleware only measures the time spent inside the route handler function.

Quick: Can timing middleware slow down your app significantly? Commit to yes or no.

Common Belief: Adding timing middleware will noticeably slow down the app because it adds extra work.

Quick: Does timing middleware automatically fix slow requests? Commit to yes or no.

Common Belief: Once you add timing middleware, your app will run faster because it tracks performance.

Quick: Does timing middleware capture time if the request handler crashes? Commit to yes or no.

Common Belief: If the handler raises an error, timing middleware won't record the duration.

Expert Zone

Middleware order matters: register timing middleware as the outermost layer so that it captures the full request time, including time spent in other middleware.

time.perf_counter() is a monotonic, high-resolution clock and is better suited to measuring short durations than time.time(), which follows the system clock and can jump when that clock is adjusted.

Adding timing info to response headers can expose internal performance data; consider security and privacy implications.
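
The ordering point can be demonstrated without FastAPI by chaining middleware-shaped coroutines by hand. In this sketch, `chain`, `make_timer`, `slow_middleware`, and `handler` are all hypothetical helpers; the first function passed to `chain` is treated as the outermost layer.

```python
import asyncio
import functools
import time

durations = {}  # timer name -> measured seconds


def make_timer(name):
    async def timer(request, call_next):
        start = time.perf_counter()
        try:
            return await call_next(request)
        finally:
            durations[name] = time.perf_counter() - start
    return timer


async def slow_middleware(request, call_next):
    await asyncio.sleep(0.05)  # simulated work done by some other middleware
    return await call_next(request)


async def handler(request):
    await asyncio.sleep(0.01)  # simulated route handler work
    return "ok"


def chain(*middlewares, endpoint):
    # Wrap from the inside out so the first middleware listed ends up outermost.
    call_next = endpoint
    for mw in reversed(middlewares):
        call_next = functools.partial(mw, call_next=call_next)
    return call_next


app = chain(make_timer("outer"), slow_middleware, make_timer("inner"), endpoint=handler)
asyncio.run(app("request"))
print(durations)
```

The inner timer misses the time spent in `slow_middleware`, while the outer one includes it. FastAPI's documentation describes the middleware added last as the outermost one, so a timing middleware would normally be registered after the others; it is worth confirming this against the version in use.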

When NOT to use

Timing middleware is too coarse if you need per-function or per-database-query timing; use profiling tools or specialized instrumentation instead. For detailed distributed tracing, use a dedicated tracing library such as OpenTelemetry.
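
For finer-grained measurements, one lightweight option is a small context manager placed around the specific call of interest. This is only a sketch: `timed`, `timings`, and `load_user` are made-up names, and the loop inside `load_user` stands in for a real database query.

```python
import time
from contextlib import contextmanager

timings = {}  # label -> measured seconds


@contextmanager
def timed(label):
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the wrapped block raises.
        timings[label] = time.perf_counter() - start


def load_user():
    # Placeholder for a database query (hypothetical).
    sum(range(200_000))
    return {"id": 1}


with timed("load_user"):
    user = load_user()

print(timings)
```

This gives one number per labelled block. A profiler or tracing library remains the better fit once many such measurements need to be collected and correlated.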

Production Patterns

In production, timing middleware is combined with structured logging and monitoring systems. Logs are sent to centralized platforms for alerting on slow requests. Timing data is often correlated with user sessions and error logs to diagnose issues quickly.
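
One possible shape for the structured-logging part is sketched below, using only the standard `logging` and `json` modules. The field names (`event`, `duration_ms`, and so on), the logger name, and the `ListHandler` used to capture output are assumptions for illustration. The request and response are stand-ins exposing the same `method`, `url.path`, and `status_code` attributes that Starlette objects provide.

```python
import asyncio
import json
import logging
import time
from types import SimpleNamespace

logger = logging.getLogger("request_timing")
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Collects log messages in memory so the example can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


capture = ListHandler()
logger.addHandler(capture)


async def logging_timing_middleware(request, call_next):
    start = time.perf_counter()
    status = 500  # assumed value to report if the handler raises
    try:
        response = await call_next(request)
        status = response.status_code
        return response
    finally:
        logger.info(json.dumps({
            "event": "request_finished",
            "method": request.method,
            "path": request.url.path,
            "status": status,
            "duration_ms": round((time.perf_counter() - start) * 1000, 2),
        }))


async def fake_call_next(request):
    # Stand-in for the downstream application (hypothetical).
    return SimpleNamespace(status_code=200)


request = SimpleNamespace(method="GET", url=SimpleNamespace(path="/items"))
asyncio.run(logging_timing_middleware(request, fake_call_next))
print(capture.records[0])
```

Emitting one JSON object per request keeps the log line machine-readable, so a log platform can filter or alert on `duration_ms` without parsing free text.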

Connections

Profiling

Builds-on

Request timing middleware provides coarse-grained timing that complements detailed profiling, helping focus profiling efforts on slow requests.

Observability

Part of

Timing middleware is a key observability tool that helps understand system behavior and performance in real time.

Stopwatch in Sports

Similar pattern

Both measure elapsed time to evaluate performance, showing how timing is a universal concept across domains.

Common Pitfalls

#1 Not measuring time if an exception occurs in the handler.

Wrong approach:

    import time

    async def middleware(request, call_next):
        start = time.time()
        response = await call_next(request)  # an exception here skips the lines below
        end = time.time()
        print(f"Time: {end - start}")
        return response

Correct approach:

    import time

    async def middleware(request, call_next):
        start = time.time()
        try:
            response = await call_next(request)
        finally:
            # Runs whether call_next returned or raised.
            end = time.time()
            print(f"Time: {end - start}")
        return response

Root cause:Not using try-finally means timing code is skipped if an error interrupts normal flow.
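
A quick way to check the try-finally behaviour is to run the middleware against a handler that always fails. In this sketch, `failing_handler` and the `recorded` list are hypothetical, and the middleware is called directly and not through a FastAPI app.

```python
import asyncio
import time

recorded = []  # durations captured by the middleware


async def timing_middleware(request, call_next):
    start = time.perf_counter()
    try:
        return await call_next(request)
    finally:
        # Executed on both the success path and the error path.
        recorded.append(time.perf_counter() - start)


async def failing_handler(request):
    await asyncio.sleep(0.01)
    raise RuntimeError("handler crashed")


try:
    asyncio.run(timing_middleware("request", failing_handler))
except RuntimeError as exc:
    print(f"error propagated: {exc}")

print(f"durations recorded: {len(recorded)}")
```

The exception still reaches the caller, so error handling elsewhere is unaffected; the middleware only observes how long the failed request took.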

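#2 Adding timing header before calling the next handler.

Wrong approach:

    from fastapi import Response

    async def middleware(request, call_next):
        response = Response()
        response.headers['X-Time'] = '0'
        response = await call_next(request)  # replaces the Response created above
        return response

Correct approach:

    import time

    async def middleware(request, call_next):
        start = time.time()
        response = await call_next(request)
        duration = time.time() - start
        response.headers['X-Time'] = str(duration)  # set on the real response
        return response

Root cause: The placeholder Response created before call_next is discarded when call_next returns the real response, so the header set on it is lost. The duration is also unknown until the handler has finished, so the header can only be set afterwards.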

#3 Using blocking time.sleep() inside async middleware.

Wrong approach:

    import time

    async def middleware(request, call_next):
        start = time.time()
        time.sleep(1)  # blocks event loop
        response = await call_next(request)
        end = time.time()
        print(f"Time: {end - start}")
        return response

Correct approach:

    import asyncio
    import time

    async def middleware(request, call_next):
        start = time.time()
        await asyncio.sleep(1)  # non-blocking
        response = await call_next(request)
        end = time.time()
        print(f"Time: {end - start}")
        return response

Root cause:Blocking calls freeze the async event loop, causing delays and inaccurate timing.
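
The effect of a blocking call on concurrent requests can be seen in a small experiment. The sketch below runs two simulated requests at once, first with `time.sleep` and then with `asyncio.sleep`; `blocking_request`, `non_blocking_request`, and `run_pair` are illustrative names, and exact timings will vary by machine.

```python
import asyncio
import time


async def blocking_request():
    time.sleep(0.1)           # holds the event loop; nothing else can run


async def non_blocking_request():
    await asyncio.sleep(0.1)  # yields to the event loop while waiting


async def run_pair(request_factory):
    # Run two simulated requests concurrently and time the pair.
    start = time.perf_counter()
    await asyncio.gather(request_factory(), request_factory())
    return time.perf_counter() - start


blocking_total = asyncio.run(run_pair(blocking_request))
non_blocking_total = asyncio.run(run_pair(non_blocking_request))

print(f"blocking:     {blocking_total:.3f} s")
print(f"non-blocking: {non_blocking_total:.3f} s")
```

With the blocking version the two requests run one after the other, so the second request's measured duration includes time it spent waiting for the first. That is why blocking calls distort timing data as well as slowing the app.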

Key Takeaways

Request timing middleware measures how long each web request takes automatically by wrapping the request handling process.

It uses Python's time functions and async middleware patterns to record start and end times without blocking the app.

Adding timing info to response headers and logs helps monitor app performance and detect slow requests.

Proper error handling in middleware ensures timing data is captured even when requests fail.

Understanding middleware internals and async behavior helps build efficient, accurate timing tools for production use.

Practice

1. What is the main purpose of a request timing middleware in FastAPI?

easy

A. To convert JSON data to Python objects

B. To handle user authentication automatically

C. To serve static files faster

D. To measure how long each HTTP request takes to process

5. You want to create a request timing middleware that logs the duration only if it exceeds 0.5 seconds. Which code snippet correctly implements this behavior?

hard

A.

    @app.middleware('http')
    async def timing_middleware(request, call_next):
        start = time.time()
        response = await call_next(request)
        duration = time.time() - start
        if duration > 0.5:
            print(f'Request took {duration:.3f} seconds')
        return response

B.

    @app.middleware('http')
    def timing_middleware(request, call_next):
        start = time.time()
        response = call_next(request)
        duration = time.time() - start
        if duration > 0.5:
            print('Slow request')
        return response

C.

    @app.middleware('http')
    async def timing_middleware(request, call_next):
        response = await call_next(request)
        duration = time.time()
        if duration > 0.5:
            print('Request slow')
        return response

D.

    @app.middleware('http')
    async def timing_middleware(request, call_next):
        start = time.time()
        response = await call_next(request)
        duration = start - time.time()
        if duration > 0.5:
            print('Request slow')
        return response
