
Handling worker crashes and restart in Node.js

Introduction

Worker processes sometimes exit unexpectedly. Detecting the crash and starting a replacement keeps the rest of your app serving requests and limits downtime.

This approach is useful when:
You run multiple worker processes to handle tasks in parallel.
You want your app to recover automatically if a worker crashes.
You need to keep your server stable and responsive.
You want to monitor worker health and restart workers when needed.
Syntax
Node.js
import cluster from 'node:cluster';
import os from 'node:os';

if (cluster.isPrimary) {
  // Fork workers
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died. Restarting...`);
    cluster.fork();
  });
} else {
  // Worker code here
}

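cluster.isPrimary is true when the current process is the primary, the one that forks and supervises the workers. It is available since Node.js 16; older versions use the deprecated cluster.isMaster.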

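The exit event fires in the primary whenever a worker process ends, whatever the reason. The handler receives the worker, its exit code, and the signal that killed it (if any), and can call cluster.fork() to start a replacement.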
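The arguments passed to the exit handler are enough to tell a crash apart from a planned shutdown. The following is a minimal sketch, assuming a helper named describeExit (a hypothetical name, not part of the cluster module). worker.exitedAfterDisconnect is the real cluster property that is true when the worker exited after the primary asked it to disconnect.

```javascript
// Hypothetical helper: classify why a worker exited.
function describeExit(worker, code, signal) {
  if (worker.exitedAfterDisconnect === true) {
    return 'intentional'; // worker.disconnect() or worker.kill() was called
  }
  if (signal) {
    return `killed by ${signal}`; // for example SIGKILL sent by the OS
  }
  if (code === 0) {
    return 'clean exit';
  }
  return `crashed with code ${code}`;
}

// Usage inside the primary (cluster imported as in the Syntax section):
// cluster.on('exit', (worker, code, signal) => {
//   const reason = describeExit(worker, code, signal);
//   console.log(`Worker ${worker.process.pid}: ${reason}`);
//   if (reason !== 'intentional') cluster.fork();
// });
```

Skipping the restart for intentional exits is one possible policy; it keeps a deliberate shutdown from being undone by the handler.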

Examples
Restart a worker immediately after it crashes.
Node.js
cluster.on('exit', (worker, code, signal) => {
  console.log(`Worker ${worker.process.pid} crashed.`);
  cluster.fork();
});
Restart a worker with a delay to avoid rapid crash loops.
Node.js
cluster.on('exit', (worker) => {
  setTimeout(() => {
    cluster.fork();
  }, 1000); // Restart after 1 second delay
});
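A fixed delay still restarts a broken worker once per second forever. One common variation, sketched here under stated assumptions, doubles the delay after each consecutive crash. The names backoffDelay and restartWithBackoff are hypothetical, the cluster object is passed in as a parameter so the logic can be tried with a stub, and resetting the counter on the listening event assumes that workers start a server.

```javascript
// Hypothetical helper: the delay doubles with each consecutive crash, up to a cap.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Wires the backoff into a cluster-like object supplied by the caller.
// `schedule` defaults to setTimeout and can be swapped out in tests.
function restartWithBackoff(cluster, schedule = setTimeout) {
  let consecutiveCrashes = 0;

  cluster.on('exit', () => {
    const delay = backoffDelay(consecutiveCrashes);
    consecutiveCrashes += 1;
    schedule(() => cluster.fork(), delay);
  });

  // Assumption: a worker that gets as far as listening counts as healthy,
  // so the crash counter starts over.
  cluster.on('listening', () => {
    consecutiveCrashes = 0;
  });
}

// Usage inside the primary: restartWithBackoff(cluster);
```

The counter is shared by all workers in this sketch; tracking it per worker slot would be a reasonable refinement.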
Basic cluster setup with one worker and restart on crash.
Node.js
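import cluster from 'node:cluster';

if (cluster.isPrimary) {
  // Start a single worker
  cluster.fork();
  // Replace the worker whenever it exits
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died.`);
    cluster.fork();
  });
} else {
  // Worker code
}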
Sample Program

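This program starts one worker per CPU core. Each worker exits with code 1 after 2 seconds to simulate a crash. The primary process detects the exit and forks a replacement automatically. Because every replacement also exits after 2 seconds, the program keeps cycling until you stop it (for example with Ctrl+C). It uses import, so run it as an ES module, for example from a file with the .mjs extension.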

Node.js
import cluster from 'node:cluster';
import os from 'node:os';

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} is running`);

  // Fork workers equal to number of CPU cores
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died. Restarting...`);
    cluster.fork();
  });
} else {
  console.log(`Worker ${process.pid} started`);

  // Simulate a crash after 2 seconds
  setTimeout(() => {
    console.log(`Worker ${process.pid} crashing now.`);
    process.exit(1);
  }, 2000);
}
Important Notes

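A worker that crashes on startup will be restarted forever by an unconditional cluster.fork(). Track how often workers crash, and slow down or stop restarts when the rate is too high.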
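One way to guard against a restart loop is to cap the number of restarts within a time window. This is a minimal sketch, not a prescribed approach: createRestartLimiter and its default limits (5 restarts per 60 seconds) are assumptions chosen for illustration.

```javascript
// Hypothetical helper: allow at most `maxRestarts` restarts per `windowMs`.
function createRestartLimiter(maxRestarts = 5, windowMs = 60000) {
  let recent = []; // timestamps of restarts inside the current window

  return function shouldRestart(now = Date.now()) {
    // Forget restarts that are older than the window.
    recent = recent.filter((t) => now - t < windowMs);
    if (recent.length >= maxRestarts) {
      return false;
    }
    recent.push(now);
    return true;
  };
}

// Usage inside the primary:
// const shouldRestart = createRestartLimiter();
// cluster.on('exit', (worker) => {
//   if (shouldRestart()) {
//     cluster.fork();
//   } else {
//     console.error('Too many worker crashes, not restarting.');
//   }
// });
```

When the limit is hit, the primary keeps running with fewer workers, which leaves room to inspect logs instead of burning CPU on repeated forks.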

You can add logging or alerts inside the exit event handler for better monitoring.
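As an illustration of that idea, the sketch below builds an exit handler that writes one JSON log line per exit and calls an alert callback for abnormal exits. createExitReporter, log, and alert are hypothetical names supplied by the caller, not cluster APIs.

```javascript
// Hypothetical factory: builds an exit handler that logs and alerts.
function createExitReporter(log = console.log, alert = () => {}) {
  return function onExit(worker, code, signal) {
    const entry = {
      event: 'worker_exit',
      pid: worker.process.pid,
      code,
      signal,
      time: new Date().toISOString(),
    };
    log(JSON.stringify(entry));

    // Treat a non-zero code or a terminating signal as abnormal.
    if (code !== 0 || signal) {
      alert(entry);
    }
  };
}

// Usage inside the primary (sendAlert is a placeholder for your own function):
// cluster.on('exit', createExitReporter(console.log, sendAlert));
```

cluster.on accepts several listeners for the same event, so a reporter like this can be registered alongside the handler that forks the replacement.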

Call process.exit(1), or any other non-zero code, in a worker to simulate a crash during testing.

Summary

Use the cluster module to run several worker processes, typically one per CPU core, so work is spread across cores.

Listen to the exit event to detect worker crashes.

Restart workers automatically to keep your app running smoothly.