Nginx · DevOps · ~15 min read

Upstream blocks in Nginx - Deep Dive

Overview - Upstream blocks
What is it?
An upstream block in nginx is a configuration section that defines a group of backend servers. These servers handle requests forwarded by nginx, often for load balancing or failover. It lets nginx distribute incoming traffic efficiently to multiple servers behind the scenes. This helps improve performance and reliability of web services.
Why it matters
Without upstream blocks, nginx would have to send all requests to a single backend server, creating a bottleneck and a single point of failure. Upstream blocks solve this by allowing nginx to balance load and handle server failures gracefully. This means websites stay fast and available even if some servers go down or get busy.
Where it fits
Before learning upstream blocks, you should understand basic nginx configuration and proxy_pass directive. After mastering upstream blocks, you can explore advanced load balancing methods, health checks, and dynamic server management in nginx.
Mental Model
Core Idea
An upstream block is like a traffic controller that directs incoming requests to a group of backend servers to share the load and increase reliability.
Think of it like...
Imagine a restaurant host who seats guests at different tables to keep the restaurant running smoothly. The upstream block is the host, and the backend servers are the tables where guests (requests) are served.
┌───────────────┐
│    Client     │
└───────┬───────┘
        │ Request
        ▼
┌───────────────┐
│     nginx     │
│    (Load      │
│   Balancer)   │
└───────┬───────┘
        │
        ▼
┌───────────────┬───────────────┬───────────────┐
│   Backend 1   │   Backend 2   │   Backend 3   │
│   (Server)    │   (Server)    │   (Server)    │
└───────────────┴───────────────┴───────────────┘
Build-Up - 7 Steps
1
Foundation · Understanding nginx proxy basics
Concept: Learn how nginx forwards requests to a single backend server using proxy_pass.
In nginx, proxy_pass is used inside a location block to send client requests to another server. For example:

    location / {
        proxy_pass http://backend_server;
    }

This sends all requests matching the location to one backend server.
Result
Requests are forwarded to a single backend server defined by proxy_pass.
Knowing how nginx proxies requests to one server sets the stage for understanding how to manage multiple servers.
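A minimal but complete server block for this step might look like the following sketch; the listen port, server_name, and backend address are illustrative placeholders:

```nginx
# Minimal reverse proxy: forward everything to one backend.
server {
    listen 80;
    server_name example.com;

    location / {
        # All requests matching "/" go to this single backend.
        proxy_pass http://127.0.0.1:8080;
        # Preserve the original Host header and client IP for the backend.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

The proxy_set_header lines are optional but common: without them the backend sees nginx's own Host header and address rather than the client's.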
2
Foundation · Introducing upstream blocks
Concept: Upstream blocks define a group of backend servers under one name for load balancing.
An upstream block looks like this:

    upstream backend_group {
        server backend1.example.com;
        server backend2.example.com;
    }

Then proxy_pass can use this group:

    location / {
        proxy_pass http://backend_group;
    }

This tells nginx to send requests to any server in backend_group.
Result
nginx knows about multiple backend servers grouped under one name.
Grouping servers allows nginx to distribute requests instead of sending all to one server.
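Putting the two pieces together, a complete configuration for this step could look like this sketch (hostnames are placeholders):

```nginx
# The upstream block names a group of backends...
upstream backend_group {
    server backend1.example.com;
    server backend2.example.com;
}

# ...and proxy_pass refers to the group by name.
server {
    listen 80;

    location / {
        proxy_pass http://backend_group;
    }
}
```

Note that upstream blocks live at the http level of the configuration, alongside server blocks, not inside them.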
3
Intermediate · Load balancing methods in upstream
🤔 Before reading on: do you think nginx sends requests to backend servers randomly or in a fixed order? Commit to your answer.
Concept: nginx supports different ways to choose which backend server handles each request.
By default, nginx uses a round-robin method, sending requests one by one to each server in order. You can also configure:
- least_conn: sends to the server with the fewest active connections
- ip_hash: sends requests from the same client IP to the same server

Example:

    upstream backend_group {
        least_conn;
        server backend1.example.com;
        server backend2.example.com;
    }
Result
Requests are distributed according to the chosen load balancing method.
Understanding load balancing methods helps optimize traffic distribution based on your application's needs.
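The weight parameter can be combined with any method to skew distribution toward beefier servers. A sketch (hostnames and weights are illustrative):

```nginx
upstream backend_group {
    least_conn;                            # prefer the least-busy server
    server backend1.example.com weight=2;  # receives roughly 2x backend2's share
    server backend2.example.com;           # default weight is 1
}
```

With plain round-robin, weight=2 simply means backend1 appears twice per rotation; with least_conn, weights act as a tiebreaker on connection counts.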
4
Intermediate · Handling server failures with upstream
🤔 Before reading on: do you think nginx automatically skips backend servers that are down? Commit to your answer.
Concept: nginx can detect failed backend servers and stop sending requests to them temporarily.
You can configure parameters like max_fails and fail_timeout:

    upstream backend_group {
        server backend1.example.com max_fails=3 fail_timeout=30s;
        server backend2.example.com;
    }

If backend1 fails 3 times within 30 seconds, nginx stops sending requests to it for 30 seconds.
Result
nginx avoids sending requests to servers that are likely down, improving reliability.
Knowing how nginx handles failures prevents downtime and improves user experience.
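A related safety net is the backup parameter: a backup server receives traffic only when the primary servers are unavailable. A sketch combining both (hostnames are placeholders):

```nginx
upstream backend_group {
    server backend1.example.com max_fails=3 fail_timeout=30s;
    server backend2.example.com max_fails=3 fail_timeout=30s;
    # Only used when backend1 and backend2 are both marked down.
    server backup1.example.com backup;
}
```

This pattern keeps a cold-standby machine out of the normal rotation while still covering a total failure of the primaries.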
5
Intermediate · Using upstream with health checks
Concept: Active health checks can monitor backend servers and update their status dynamically.
With nginx Plus or third-party modules, you can configure health checks that periodically test backend servers. If a server fails a health check, it is marked down and skipped until it recovers. Example (nginx Plus; note that the health_check directive goes in the location block, not inside the upstream block):

    upstream backend_group {
        zone backend_zone 64k;
        server backend1.example.com;
        server backend2.example.com;
    }
    location / {
        proxy_pass http://backend_group;
        health_check;
    }
Result
Backend server status is updated automatically, improving load balancing accuracy.
Active health checks provide better fault tolerance than passive failure detection.
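In nginx Plus, health_check accepts tuning parameters; the values below are illustrative, and this sketch will not load on open-source nginx:

```nginx
# nginx Plus only: active health checks (not available in open-source nginx).
upstream backend_group {
    zone backend_zone 64k;   # shared memory so all workers see health state
    server backend1.example.com;
    server backend2.example.com;
}
server {
    location / {
        proxy_pass http://backend_group;
        # Probe every 5s; 2 consecutive failures mark a server down,
        # 2 consecutive passes bring it back.
        health_check interval=5s fails=2 passes=2;
    }
}
```

On open-source nginx, the practical equivalents are the passive max_fails/fail_timeout mechanism from step 4 or an external checker that rewrites the configuration and reloads nginx.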
6
Advanced · Sticky sessions with ip_hash in upstream
🤔 Before reading on: do you think ip_hash guarantees a client always reaches the same backend server? Commit to your answer.
Concept: ip_hash load balancing sends requests from the same client IP to the same backend server to maintain session state.
Configure ip_hash like this:

    upstream backend_group {
        ip_hash;
        server backend1.example.com;
        server backend2.example.com;
    }

This helps when backend servers store session data locally and clients need consistent routing.
Result
Clients maintain session affinity, improving user experience for stateful applications.
Understanding session affinity is key for applications that do not share session data across servers.
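Open-source nginx also offers the more general hash directive, which keys on any variable; with the consistent flag it uses ketama consistent hashing, so adding or removing a server remaps only a fraction of clients instead of reshuffling everyone. A sketch:

```nginx
upstream backend_group {
    # Hash on the client address; "consistent" limits remapping
    # when the server list changes.
    hash $remote_addr consistent;
    server backend1.example.com;
    server backend2.example.com;
}
```

This directly addresses the weakness of plain ip_hash explored in the myth section: with consistent hashing, a server change disturbs far fewer existing client-to-server mappings.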
7
Expert · Dynamic upstream configuration and DNS resolution
🤔 Before reading on: do you think nginx automatically updates upstream server IPs if DNS changes? Commit to your answer.
Concept: By default, nginx resolves backend server IPs only once at startup, but dynamic DNS resolution can be configured for changing environments.
nginx caches DNS results for upstream servers at startup. If backend IPs change, nginx won't notice unless reloaded. To handle dynamic IPs, use the resolver directive together with a variable in proxy_pass, which forces nginx to resolve the name at request time:

    resolver 8.8.8.8;
    location / {
        set $backend backend.example.com;
        proxy_pass http://$backend;
    }

Note that a variable in proxy_pass bypasses the upstream block entirely, so this trades load balancing features for fresh DNS. Alternatively, third-party modules or nginx Plus (the resolve parameter on server entries) offer dynamic upstream updates.
Result
nginx can adapt to backend IP changes without restart, useful in cloud or container environments.
Knowing nginx's DNS caching behavior prevents unexpected downtime in dynamic infrastructures.
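A fuller sketch of the variable-based approach; the resolver address and backend hostname are placeholders, and valid= is optional tuning that caps how long nginx caches DNS answers:

```nginx
# Request-time DNS resolution via a variable in proxy_pass.
# Note: this bypasses upstream-block load balancing.
resolver 8.8.8.8 valid=30s;   # re-resolve at most every 30 seconds

server {
    listen 80;

    location / {
        set $backend backend.example.com;  # hypothetical backend hostname
        proxy_pass http://$backend;
    }
}
```

In container or cloud environments where the hostname fronts a changing set of IPs, this keeps nginx current without reloads, at the cost of a DNS lookup dependency in the request path.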
Under the Hood
nginx maintains an internal list of backend servers defined in the upstream block. When a request arrives, nginx uses the configured load balancing method to select a backend server. It then opens a connection to that server and forwards the request. nginx tracks server health passively by counting failed connections and can mark servers as down temporarily. DNS resolution for upstream servers happens at startup unless configured otherwise. The upstream block acts as a shared resource accessed by worker processes to distribute load efficiently.
Why designed this way?
Upstream blocks were designed to separate backend server definitions from request handling logic, making configurations cleaner and more flexible. Early nginx versions resolved backend IPs once to improve performance, trading off dynamic updates. Load balancing methods were added to support common traffic distribution needs without external tools. This design balances simplicity, performance, and reliability.
┌───────────────┐
│    Client     │
└───────┬───────┘
        │ Request
        ▼
┌───────────────┐
│ nginx Master  │
│   Process     │
└───────┬───────┘
        │ Shares upstream config
        ▼
┌───────────────┐
│ nginx Worker  │
│  Processes    │
└───────┬───────┘
        │ Select backend server
        ▼
┌───────────────┬───────────────┬───────────────┐
│   Backend 1   │   Backend 2   │   Backend 3   │
│   (Server)    │   (Server)    │   (Server)    │
└───────────────┴───────────────┴───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does nginx automatically detect and remove failed backend servers from upstream? Commit yes or no.
Common Belief: nginx automatically detects failed backend servers and stops sending requests to them without any configuration.
Reality: nginx only passively detects failures from connection errors. The defaults (max_fails=1, fail_timeout=10s) provide minimal protection; tuning max_fails and fail_timeout is needed for behavior that matches your traffic.
Why it matters: Assuming robust automatic failure detection can lead to downtime if nginx keeps sending requests to unreachable servers.
Quick: Does ip_hash guarantee perfect session stickiness even if backend servers change? Commit yes or no.
Common Belief: ip_hash always sends the same client to the same backend server regardless of server changes.
Reality: ip_hash depends on the list of servers; if servers are added or removed, client routing can change, breaking session stickiness.
Why it matters: Misunderstanding this can cause unexpected session loss and user frustration.
Quick: Does nginx resolve backend server IPs on every request by default? Commit yes or no.
Common Belief: nginx resolves backend server IPs dynamically on every request to handle IP changes automatically.
Reality: By default, nginx resolves backend IPs only once at startup and caches them, requiring a reload to update.
Why it matters: Believing in dynamic DNS resolution can cause outages when backend IPs change without an nginx reload.
Quick: Is the upstream block only for load balancing? Commit yes or no.
Common Belief: Upstream blocks are only used for load balancing multiple backend servers.
Reality: Upstream blocks also manage failover, session affinity, and can be used even with a single backend for future scalability.
Why it matters: Limiting upstream blocks to load balancing misses their full potential in managing backend server groups.
Expert Zone
1
When a shared memory zone is configured with the zone directive, upstream state such as server health and connection counts is shared across worker processes; without a zone, each worker keeps its own independent copy.
2
The order of server declarations affects load balancing behavior, especially with weight parameters and failover configurations.
3
Using variables in proxy_pass disables some optimizations and requires explicit DNS resolver configuration to avoid stale backend IPs.
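The shared memory zone from point 1 is a one-line addition; the zone name and size below are illustrative:

```nginx
upstream backend_group {
    # Without this, each worker tracks failures independently, so a server
    # may be "down" for one worker and "up" for another.
    zone backend_zone 64k;
    server backend1.example.com max_fails=2 fail_timeout=10s;
    server backend2.example.com max_fails=2 fail_timeout=10s;
}
```

This matters most with least_conn and failure tracking, where per-worker state would otherwise make balancing decisions on incomplete information.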
When NOT to use
Upstream blocks are not suitable when backend servers require complex session replication or state sharing; in such cases, dedicated session stores or service meshes are better. Also, for very dynamic environments, consider service discovery tools or nginx Plus for automatic backend updates.
Production Patterns
In production, upstream blocks are combined with health checks, weighted load balancing, and failover settings. They are often integrated with service discovery systems via scripts or dynamic DNS. Sticky sessions with ip_hash are used for stateful apps, while least_conn is preferred for stateless services. Multi-stage upstreams can route traffic based on request type or priority.
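A production-style sketch combining these patterns; hostnames, ports, and tuning values are illustrative, not prescriptive:

```nginx
upstream api_backend {
    least_conn;                                   # stateless API traffic
    server api1.internal:8080 weight=3 max_fails=2 fail_timeout=15s;
    server api2.internal:8080 weight=1 max_fails=2 fail_timeout=15s;
    server api-backup.internal:8080 backup;       # used only on total failure
    keepalive 32;                                 # reuse upstream connections
}

server {
    listen 80;

    location /api/ {
        proxy_pass http://api_backend;
        proxy_http_version 1.1;                   # required for keepalive
        proxy_set_header Connection "";           # clear per-request Connection header
        proxy_next_upstream error timeout http_502;  # retry the next server on failure
    }
}
```

The keepalive pool cuts per-request TCP handshakes to the backends, and proxy_next_upstream makes failover transparent to the client for retryable errors.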
Connections
Load Balancers (general networking)
Upstream blocks implement load balancing, a core function of network load balancers.
Understanding upstream blocks helps grasp how load balancers distribute traffic to multiple servers to improve performance and reliability.
Service Discovery in Microservices
Upstream blocks can be integrated with service discovery to dynamically update backend servers.
Knowing how upstream blocks work clarifies how service discovery tools automate backend management in modern cloud environments.
Restaurant Seating Management
Both upstream blocks and restaurant hosts manage distributing guests/requests efficiently to available resources.
This cross-domain view highlights the universal challenge of balancing load and maintaining service quality.
Common Pitfalls
#1: nginx keeps sending requests to a backend server that is down, causing errors.
Wrong approach:

    upstream backend_group {
        server backend1.example.com;
        server backend2.example.com;
    }
    location / {
        proxy_pass http://backend_group;
    }
    # max_fails and fail_timeout left at defaults

Correct approach:

    upstream backend_group {
        server backend1.example.com max_fails=3 fail_timeout=30s;
        server backend2.example.com;
    }
    location / {
        proxy_pass http://backend_group;
    }

Root cause: Relying on the defaults (max_fails=1, fail_timeout=10s) means failure detection may not match your traffic; explicit tuning controls how quickly nginx marks a server down and for how long.
#2: Using variables in proxy_pass without a resolver, causing backend hostname resolution failure.
Wrong approach:

    location / {
        proxy_pass http://$backend_server;
    }
    # No resolver directive configured

Correct approach:

    resolver 8.8.8.8;
    location / {
        proxy_pass http://$backend_server;
    }

Root cause: When proxy_pass contains a variable, nginx resolves the hostname at request time and needs an explicit resolver; without one, requests fail unless the name matches a defined upstream.
#3: Expecting ip_hash to maintain session affinity after backend servers change.
Wrong approach:

    upstream backend_group {
        ip_hash;
        server backend1.example.com;
        server backend2.example.com;
    }
    # Later adding or removing servers without session management

Correct approach: Use shared session storage or sticky cookies alongside ip_hash to maintain session consistency when backend servers change.
Root cause: ip_hash depends on server list stability; changing servers breaks the client-server mapping.
Key Takeaways
Upstream blocks in nginx group backend servers to distribute client requests efficiently.
They enable load balancing, failover, and session affinity to improve web service reliability and performance.
nginx resolves backend server IPs once at startup by default, so dynamic DNS requires special configuration.
Configuring failure detection parameters prevents nginx from sending requests to downed servers.
Advanced use includes health checks, dynamic backend updates, and integration with service discovery for cloud environments.