Node.js · framework · ~10 mins

Load balancing between workers in Node.js - Step-by-Step Execution

Concept Flow - Load balancing between workers
Start Master Process → Fork Worker Processes → Workers Listen for Requests → Distribute Requests to Workers → Workers Process Requests → Send Response Back → More Requests? (Yes: distribute again; No: Shutdown)
The master (primary) process forks workers. Workers listen for requests on the same port; by default the primary accepts incoming connections and hands them to workers round-robin (on Windows, the OS distributes them), and each worker processes its requests independently.
Execution Sample
Node.js
import cluster from 'node:cluster';
import { cpus } from 'node:os';
import http from 'node:http';

if (cluster.isPrimary) {
  // Fork one worker per CPU core.
  for (let i = 0; i < cpus().length; i++) cluster.fork();
} else {
  // Each worker runs its own HTTP server on the shared port 8000.
  http.createServer((req, res) => {
    res.end(`Handled by worker ${process.pid}`);
  }).listen(8000);
}
This code forks one worker per CPU core and balances incoming HTTP requests among them.
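The execution table below ends with the master shutting the workers down (step 13). A minimal sketch of that step, assuming the standard cluster API; the `shutdownWorkers` helper name is ours:

```javascript
import cluster from 'node:cluster';

// Hypothetical helper for the shutdown step: ask every worker to finish
// its in-flight requests and exit, rather than killing it outright.
function shutdownWorkers() {
  for (const id in cluster.workers) {
    // disconnect() closes the worker's server(s) gracefully; the worker
    // process exits once its existing connections are done.
    cluster.workers[id].disconnect();
  }
}

// Run standalone (no workers forked), this is a no-op.
shutdownWorkers();
```

In a real server this would typically run from a `SIGTERM` handler in the primary, so deploys drain traffic instead of dropping it.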
Execution Table
| Step | Action | Master State | Worker State | Load Distribution |
| --- | --- | --- | --- | --- |
| 1 | Master starts and checks cluster.isPrimary | Primary: true, forks 4 workers | No workers yet | No load yet |
| 2 | Master forks worker 1 | Primary: true, 1 worker forked | Worker 1 started | No load yet |
| 3 | Master forks worker 2 | Primary: true, 2 workers forked | Worker 2 started | No load yet |
| 4 | Master forks worker 3 | Primary: true, 3 workers forked | Worker 3 started | No load yet |
| 5 | Master forks worker 4 | Primary: true, 4 workers forked | Worker 4 started | No load yet |
| 6 | Workers listen on port 8000 | Primary: true | Workers listening | No load yet |
| 7 | Request 1 arrives | Primary: true | Worker 1 handles request | Request 1 -> Worker 1 |
| 8 | Request 2 arrives | Primary: true | Worker 2 handles request | Request 2 -> Worker 2 |
| 9 | Request 3 arrives | Primary: true | Worker 3 handles request | Request 3 -> Worker 3 |
| 10 | Request 4 arrives | Primary: true | Worker 4 handles request | Request 4 -> Worker 4 |
| 11 | Request 5 arrives | Primary: true | Worker 1 handles request | Request 5 -> Worker 1 |
| 12 | No more requests | Primary: true | Workers idle | Load balanced evenly |
| 13 | Master shuts down workers | Primary: true, workers shutdown | Workers stopped | No load |
💡 No more requests; all workers have handled requests evenly; master shuts down.
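The round-robin pattern in the table above can be sketched as a tiny simulation. This is plain JavaScript, not the cluster module itself; the worker count and numbering follow the table:

```javascript
// Simulate round-robin assignment across 4 workers:
// requests cycle 1 -> 2 -> 3 -> 4 -> 1 -> ...
const WORKERS = 4;

function assignWorker(requestNumber) {
  // Request 1 goes to worker 1; request 5 wraps back to worker 1.
  return ((requestNumber - 1) % WORKERS) + 1;
}

const assignments = [1, 2, 3, 4, 5].map(assignWorker);
console.log(assignments); // [ 1, 2, 3, 4, 1 ]
```

This matches the execution table exactly: with five requests and four workers, worker 1 ends up handling two requests and the rest one each.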
Variable Tracker
| Variable | Start | After 1 | After 2 | After 3 | After 4 | After 5 | Final |
| --- | --- | --- | --- | --- | --- | --- | --- |
| workersForked | 0 | 1 | 2 | 3 | 4 | 4 | 4 |
| requestsHandledByWorker1 | 0 | 1 | 1 | 1 | 1 | 2 | 2 |
| requestsHandledByWorker2 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |
| requestsHandledByWorker3 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| requestsHandledByWorker4 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
Key Moments - 3 Insights
Why does the master fork multiple workers instead of handling requests itself?
The master process forks workers to use multiple CPU cores for parallel processing. The master does not handle requests directly; workers do, as shown in steps 1-5 and 7-11 in the execution table.
How does the master decide which worker gets a request?
By default on all platforms except Windows, the primary process accepts incoming connections and distributes them to workers in round-robin fashion, which spreads requests evenly. The execution table shows this: requests 1 to 4 go to workers 1 to 4, and request 5 wraps back to worker 1.
What happens if a worker crashes?
The master can detect worker exit events and fork a new worker to replace it, maintaining load balance. This is not shown in the sample but is part of cluster management best practices.
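A minimal sketch of that restart logic, assuming the cluster module's `'exit'` event and the `worker.exitedAfterDisconnect` flag; the `shouldRefork` helper name is ours:

```javascript
import cluster from 'node:cluster';

// Decide whether a dead worker should be replaced. Reforking on an
// intentional shutdown would keep the cluster from ever stopping, so
// only unexpected exits (not disconnects, nonzero exit code) qualify.
function shouldRefork(worker, code) {
  return !worker.exitedAfterDisconnect && code !== 0;
}

if (cluster.isPrimary) {
  cluster.on('exit', (worker, code) => {
    if (shouldRefork(worker, code)) cluster.fork();
  });
}
```

Production setups often add a backoff or a crash counter here, so a worker that dies instantly on startup does not trigger an endless fork loop.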
Visual Quiz - 3 Questions
Test your understanding
Looking at the execution table, which worker handles the 3rd request?
AWorker 3
BWorker 2
CWorker 1
DWorker 4
💡 Hint
Check step 9 in the execution table, where Request 3 is handled.
At which step does the master finish forking all workers?
AStep 4
BStep 5
CStep 6
DStep 7
💡 Hint
Look at the 'Master forks worker 4' action in the execution table.
If the number of CPUs doubles, how would the 'workersForked' variable change after forking?
AIt stays the same
BIt halves
CIt doubles
DIt becomes zero
💡 Hint
Refer to the 'workersForked' row in the Variable Tracker and the for loop in the code sample.
Concept Snapshot
Load balancing in Node.js cluster:
- Master process forks workers equal to CPU cores
- Workers listen on the same port
- Requests are distributed round-robin by the primary by default (on Windows, the OS kernel distributes them)
- Workers handle requests independently
- Master can restart crashed workers to maintain balance
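The distribution strategy can also be chosen explicitly through `cluster.schedulingPolicy`, which must be set before the first fork: `cluster.SCHED_RR` for round-robin in the primary (the default everywhere except Windows) and `cluster.SCHED_NONE` to leave distribution to the OS. A minimal check:

```javascript
import cluster from 'node:cluster';

// Pick the distribution strategy before any worker is forked.
// SCHED_RR: the primary accepts connections and hands them out round-robin.
// SCHED_NONE: workers accept directly and the OS kernel decides.
cluster.schedulingPolicy = cluster.SCHED_RR;

console.log(cluster.schedulingPolicy === cluster.SCHED_RR); // true
```

The same choice can be made without code changes via the `NODE_CLUSTER_SCHED_POLICY` environment variable (`rr` or `none`).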
Full Transcript
In Node.js, load balancing between workers is done with the cluster module. The primary (master) process starts and forks one worker per CPU core. Each worker runs its own server instance on the same port. By default, the primary accepts incoming connections and distributes them to workers in round-robin fashion (on Windows, distribution is left to the OS). Workers process requests independently and send responses back, so all CPU cores are used efficiently. The primary can also monitor workers and restart any that crash to keep the system balanced. The execution table shows the primary forking workers, workers starting servers, and requests being assigned one by one to each worker in turn; the variable tracker records how many workers are forked and how many requests each worker handles. This approach improves performance by parallelizing work across CPU cores.