Overview - Health checks (Terminus)

What is it?

Health checks with Terminus in NestJS let you monitor whether your application and its dependencies are working correctly. You expose endpoints that report the status of your app, such as whether the database or other services are reachable. Monitoring tools and orchestrators can poll these endpoints and react when something breaks, which helps keep your app reliable. Terminus (the @nestjs/terminus package) makes adding these checks easy and standardized.

Why it matters

Without health checks, you might not know when your app or its dependencies fail until users complain or data is lost. Health checks help detect problems early, so you can fix them before they affect users. They also allow automated systems to restart or replace failing parts, keeping your app running smoothly. This is crucial for apps that need to be always available, like websites or APIs.

Where it fits

Before learning health checks with Terminus, you should understand basic NestJS concepts like modules, controllers, and services. After this, you can explore advanced monitoring, logging, and deployment strategies that use health check data to improve app reliability.

Mental Model

Core Idea

Health checks are like regular doctor visits for your app, checking if all parts are healthy and working as expected.

Think of it like...

Imagine your app is a car, and health checks are the dashboard lights and gauges that tell you if the engine, brakes, or fuel system are okay. If a light turns on, you know something needs attention before the car breaks down.

┌───────────────┐
│   Health Check│
│   Endpoint    │
└──────┬────────┘
       │
       ▼
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│ Database Check│   │ API Check     │   │ Disk Space    │
│ (reachable?)  │   │ (responsive?) │   │ (enough free) │
└──────┬────────┘   └──────┬────────┘   └──────┬────────┘
       │                   │                   │
       └──────────┬────────┴──────────┬────────┘
                  ▼                   ▼
             ┌───────────────┐   ┌───────────────┐
             │ All Healthy?  │──▶│ Return 200 OK │
             │ (true/false)  │   │ or 503 Error  │
             └───────────────┘   └───────────────┘

Build-Up - 7 Steps

1

Foundation: Understanding Health Check Basics

Concept: Health checks are endpoints that report if your app is working properly.

A health check is a special URL your app exposes. When you visit it, the app runs tests on itself or its dependencies and returns a status. If everything is fine, it returns a success message; if not, it returns an error. This helps external systems know if your app is healthy.

Result

You get a simple status response like 'OK' or 'Service Unavailable' when you visit the health check URL.

Knowing that health checks are just special URLs helps you see them as simple tools for monitoring, not complicated magic.

2

Foundation: Installing and Setting Up Terminus
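
As a minimal sketch of what this setup can look like, the fragment below wires a health module and a /health route. It assumes a NestJS project with `@nestjs/terminus`, `@nestjs/axios`, and `axios` installed (for example via `npm install @nestjs/terminus @nestjs/axios axios`). The indicator keys `docs` and `storage`, the pinged URL, and the 90% disk threshold are illustrative choices, not requirements.

```typescript
import { Controller, Get, Module } from '@nestjs/common';
import { HttpModule } from '@nestjs/axios';
import {
  DiskHealthIndicator,
  HealthCheck,
  HealthCheckService,
  HttpHealthIndicator,
  TerminusModule,
} from '@nestjs/terminus';

@Controller('health')
export class HealthController {
  constructor(
    private readonly health: HealthCheckService,
    private readonly http: HttpHealthIndicator,
    private readonly disk: DiskHealthIndicator,
  ) {}

  @Get()
  @HealthCheck()
  check() {
    // Each array entry is an indicator function; Terminus runs them on every request.
    return this.health.check([
      // Placeholder URL: replace with a dependency your app actually calls.
      () => this.http.pingCheck('docs', 'https://docs.nestjs.com'),
      // Reports "down" when more than 90% of the disk at "/" is used (illustrative threshold).
      () => this.disk.checkStorage('storage', { path: '/', thresholdPercent: 0.9 }),
    ]);
  }
}

@Module({
  // HttpModule is needed because HttpHealthIndicator makes HTTP requests.
  imports: [TerminusModule, HttpModule],
  controllers: [HealthController],
})
export class HealthModule {}
```

After importing `HealthModule` into the root module, a GET request to /health should return 200 when both indicators pass and 503 otherwise.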

3

Intermediate: Adding Basic Health Indicators

4

Intermediate: Customizing Health Check Responses

5

Intermediate: Using Asynchronous Health Indicators

6

Advanced: Implementing Graceful Shutdown with Terminus

7

Expert: Extending Terminus with Custom Health Indicators

Under the Hood

With @nestjs/terminus, you expose a regular HTTP route (commonly /health) in a controller and have it call the health check service with a list of health indicator functions. When the route is requested, Terminus runs those indicators, which can be synchronous or asynchronous. Each indicator returns a status object keyed by the indicator's name. Terminus collects the results, combines them into a JSON response, and sets the HTTP status code to 200 if all indicators are healthy or 503 if any fail. For graceful shutdown, it builds on the NestJS shutdown hooks, which react to process signals such as SIGTERM so that cleanup can run before the process exits.
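
The aggregation step can be modeled in a few lines of plain TypeScript. This is a simplified sketch of the idea, not the Terminus source code: the names `runHealthChecks`, `Indicator`, and `HealthReport` are made up for illustration, and the response shape (`status`, `info`, `error`, `details`) only loosely mirrors what Terminus returns.

```typescript
// Simplified model of health-check aggregation; all names here are illustrative.
type IndicatorResult = Record<string, { status: "up" | "down"; [detail: string]: unknown }>;
type Indicator = () => IndicatorResult | Promise<IndicatorResult>;

interface HealthReport {
  httpStatus: 200 | 503;
  body: {
    status: "ok" | "error";
    info: IndicatorResult;    // indicators that reported "up"
    error: IndicatorResult;   // indicators that reported "down" or threw
    details: IndicatorResult; // everything combined
  };
}

async function runHealthChecks(indicators: Indicator[]): Promise<HealthReport> {
  // Run all indicators concurrently; an indicator that throws is treated as "down".
  const results = await Promise.all(
    indicators.map(async (run, index): Promise<IndicatorResult> => {
      try {
        return await run();
      } catch (err) {
        return { [`indicator_${index}`]: { status: "down", message: String(err) } };
      }
    }),
  );

  const info: IndicatorResult = {};
  const error: IndicatorResult = {};
  for (const result of results) {
    for (const key of Object.keys(result)) {
      const entry = result[key];
      if (entry.status === "up") info[key] = entry;
      else error[key] = entry;
    }
  }

  // One failing indicator is enough to make the whole report unhealthy.
  const healthy = Object.keys(error).length === 0;
  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "error", info, error, details: { ...info, ...error } },
  };
}
```

Calling `runHealthChecks` with one passing and one failing indicator yields `httpStatus` 503, with the failing key listed under `error` and the passing key under `info`.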

Why designed this way?

Terminus was designed to standardize health checks in NestJS apps, making it easy to add reliable monitoring without reinventing the wheel. It uses asynchronous checks to handle real-world dependencies that may respond slowly. The design separates indicators so developers can add or remove checks modularly. Graceful shutdown support was added to help apps integrate smoothly with container orchestrators like Kubernetes, which rely on health status to manage app lifecycle.

┌───────────────┐
│ HTTP Request  │
│ to /health    │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Terminus Core │
│ (Health Check)│
└──────┬────────┘
       │
       ▼
┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│ Indicator 1   │   │ Indicator 2   │   │ Indicator N   │
│ (DB Check)    │   │ (API Check)   │   │ (Custom Check)│
└──────┬────────┘   └──────┬────────┘   └──────┬────────┘
       │                   │                   │
       └──────────┬────────┴──────────┬────────┘
                  ▼                   ▼
             ┌───────────────┐   ┌───────────────┐
             │ Collect Status│──▶│ Compose JSON  │
             └───────────────┘   └───────────────┘
                      │                   │
                      └──────────┬────────┘
                                 ▼
                        ┌───────────────┐
                        │ HTTP Response │
                        │ 200 or 503    │
                        └───────────────┘

Myth Busters - 4 Common Misconceptions

Quick: do you think a health check guarantees your app is bug-free? Commit to yes or no.

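Common Belief: Health checks mean the app is fully working and bug-free.

Reality: A health check only verifies what its indicators test, such as whether a database is reachable. It says nothing about bugs in code paths the indicators never exercise.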

Quick: do you think health checks should always return HTTP 200 even if some services fail? Commit to yes or no.

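Common Belief: Health checks should always return 200 OK to avoid alarming monitoring systems.

Reality: A failing check should return an error status (Terminus uses 503) so that load balancers and orchestrators can react. Always returning 200 hides outages.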

Quick: do you think health checks run continuously in the background? Commit to yes or no.

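Common Belief: Health checks run constantly inside the app to monitor health in real-time.

Reality: The indicators run when the health endpoint is requested. Continuous monitoring comes from an external system polling that endpoint on a schedule.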

Quick: do you think Terminus health checks can only test HTTP services? Commit to yes or no.

Common Belief: Terminus health checks are limited to checking HTTP endpoints only.

Reality: Indicators can cover databases, disk space, memory, and any custom logic you write, not only HTTP endpoints.

Expert Zone

1

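Terminus health indicators are plain functions that return (or resolve to) a result, so you can compose them inside a custom indicator to express AND/OR rules, for example treating a dependency as healthy when either the primary or a replica responds. Treat this composition as code you write yourself rather than a built-in operator.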
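
One way to express such rules is with small combinator functions. The sketch below is plain TypeScript and does not use any Terminus API; `Check`, `allOf`, `anyOf`, and `safely` are hypothetical names, and each check is reduced to a boolean for brevity.

```typescript
// A check resolves to true when the dependency is healthy.
type Check = () => Promise<boolean>;

// Treats a thrown error or rejected promise as an unhealthy result.
async function safely(check: Check): Promise<boolean> {
  try {
    return await check();
  } catch {
    return false;
  }
}

// Healthy only if every check passes (logical AND).
function allOf(...checks: Check[]): Check {
  return async () => {
    const results = await Promise.all(checks.map((check) => safely(check)));
    return results.every((ok) => ok);
  };
}

// Healthy if at least one check passes (logical OR), e.g. primary or replica.
function anyOf(...checks: Check[]): Check {
  return async () => {
    const results = await Promise.all(checks.map((check) => safely(check)));
    return results.some((ok) => ok);
  };
}
```

A rule such as "the cache is up AND (the primary OR the replica database is up)" then reads as `allOf(cacheCheck, anyOf(primaryCheck, replicaCheck))`, where the three check names stand for functions you would supply.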

2

Graceful shutdown integration with health checks prevents traffic routing to an app instance that is closing, avoiding errors during deployments.
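
The interaction between shutdown and readiness can be sketched without any framework. The model below is an assumption-laden illustration, not how Terminus implements it: `ReadinessGate` and `gracefulShutdown` are hypothetical names, and in a real app `gracefulShutdown` would be triggered from a shutdown hook (for example on SIGTERM).

```typescript
// Model of a readiness flag that flips as soon as shutdown begins.
class ReadinessGate {
  private shuttingDown = false;

  // Call this first when the process is asked to stop.
  beginShutdown(): void {
    this.shuttingDown = true;
  }

  // What a readiness endpoint would answer at this moment.
  readiness(): { httpStatus: 200 | 503; body: { status: "ok" | "shutting_down" } } {
    return this.shuttingDown
      ? { httpStatus: 503, body: { status: "shutting_down" } }
      : { httpStatus: 200, body: { status: "ok" } };
  }
}

// Marks the instance unready, waits so the load balancer can notice, then runs cleanup.
async function gracefulShutdown(
  gate: ReadinessGate,
  drainMs: number,
  cleanup: () => Promise<void>,
): Promise<void> {
  gate.beginShutdown();
  await new Promise<void>((resolve) => setTimeout(resolve, drainMs));
  await cleanup();
}
```

The drain delay matters because the load balancer only learns about the 503 on its next probe. Closing connections immediately would still drop requests that were routed before that probe ran.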

3

Custom health indicators can include performance metrics or business logic checks, not just connectivity, providing richer health insights.
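
As an illustration of a business-logic check, the sketch below reports a job queue as unhealthy when its backlog grows past a limit. It is framework-free TypeScript that only imitates the keyed result shape used by health indicators. `queueBacklogIndicator`, `getPendingJobs`, and the `email_queue` key are hypothetical. In Terminus itself, a custom indicator is usually written as an injectable class, and the exact base class or helper depends on the library version.

```typescript
type IndicatorResult = Record<string, { status: "up" | "down"; [detail: string]: unknown }>;

// Builds an indicator that reports "down" when the queue backlog exceeds a limit.
function queueBacklogIndicator(
  key: string,
  getPendingJobs: () => Promise<number>, // hypothetical: supplied by your queue client
  maxPendingJobs: number,
): () => Promise<IndicatorResult> {
  return async () => {
    const pending = await getPendingJobs();
    const status = pending <= maxPendingJobs ? "up" : "down";
    // Include the measured value so dashboards can show how close the queue is to the limit.
    return { [key]: { status, pending, maxPendingJobs } };
  };
}
```

For example, `queueBacklogIndicator("email_queue", () => queue.countPending(), 100)` would produce an indicator function for a `queue` object of your own, suitable for passing alongside connectivity checks.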

When NOT to use

Terminus health checks are not suitable for deep application performance monitoring or security scanning. For those, use specialized APM tools or security scanners. Also, if your app is extremely simple with no external dependencies, basic health checks may be unnecessary.

Production Patterns

In production, Terminus health checks are often combined with Kubernetes readiness and liveness probes to manage container lifecycle. Teams extend Terminus with custom indicators for databases, caches, message brokers, and third-party APIs. Health check data is integrated into monitoring dashboards and alerting systems to automate incident response.

Connections

Kubernetes Liveness and Readiness Probes

Kubernetes probes call the Terminus health endpoint and use its HTTP status code to decide whether a container is ready to receive traffic or should be restarted.

Understanding Terminus health checks helps you configure Kubernetes probes correctly, improving app stability in cloud environments.

Circuit Breaker Pattern

Health checks inform circuit breakers about service availability to prevent cascading failures.

Knowing how health checks feed circuit breakers helps design resilient distributed systems.

Medical Checkups

Both health checks in apps and medical checkups assess the status of complex systems to prevent failures.

Seeing health checks as regular checkups highlights the importance of proactive monitoring and maintenance.

Common Pitfalls

#1 Not returning proper HTTP status codes on failure.

Wrong approach: return { status: 'ok' }; // always returns 200 even if DB is down

Correct approach: throw new ServiceUnavailableException(); // NestJS responds with 503; the Terminus health check service does this for you when an indicator fails

Root cause: Not realizing that load balancers and orchestrators act on the HTTP status code, not on the response body.

#2 Running heavy or slow checks synchronously, blocking the event loop.

Wrong approach: function slowCheck() { while (true) {} } // never returns and blocks the event loop

Correct approach: async function slowCheck() { await someAsyncCall(); return { status: 'ok' }; } // someAsyncCall is a placeholder for non-blocking I/O

Root cause: Not using asynchronous code for checks that involve waiting, which makes the whole app unresponsive while the check runs.
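
Even an asynchronous check can hang if the dependency never answers, so a common complement is to cap how long a check may take. The helper below is a sketch under that assumption. `withTimeout` and `CheckResult` are hypothetical names, and it only helps with checks that await I/O, not with synchronous loops that block the event loop.

```typescript
type CheckResult = { status: "up" | "down"; reason?: string };

// Resolves with the check's result, or with "down" if the check fails or exceeds timeoutMs.
function withTimeout(check: () => Promise<CheckResult>, timeoutMs: number): Promise<CheckResult> {
  return new Promise<CheckResult>((resolve) => {
    const timer = setTimeout(
      () => resolve({ status: "down", reason: `timed out after ${timeoutMs} ms` }),
      timeoutMs,
    );
    check()
      .then((result) => resolve(result))
      .catch((err) => resolve({ status: "down", reason: String(err) }))
      // Clear the timer once the check settles; a late resolve() call is a no-op anyway.
      .then(() => clearTimeout(timer));
  });
}
```

Wrapping a database ping as `withTimeout(pingDatabase, 2000)` (with `pingDatabase` being your own function) keeps the health endpoint responsive even when the database stops answering.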

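#3 Exposing health checks without security considerations.

Wrong approach: @Get() @HealthCheck() check() { ... } // reachable by anyone and returns full dependency details

Correct approach: @UseGuards(InternalOnlyGuard) @Get() @HealthCheck() check() { ... } // InternalOnlyGuard is a hypothetical guard; restricting the route at the network or ingress level also works

Root cause: Ignoring that health endpoints can reveal sensitive information, such as which dependencies exist and which are failing, and should be protected.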
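
Where the endpoint must stay publicly reachable (for example, for an external uptime monitor), one option is to return only the overall status to untrusted callers. This is a sketch of that idea, not a Terminus feature. `HealthBody`, `bodyForCaller`, and the `isTrusted` flag are hypothetical, and how you decide that a caller is trusted is up to your own authentication.

```typescript
interface HealthBody {
  status: "ok" | "error";
  details: Record<string, { status: "up" | "down"; [detail: string]: unknown }>;
}

// Returns the full body to trusted callers and only the overall status to everyone else.
function bodyForCaller(
  body: HealthBody,
  isTrusted: boolean,
): HealthBody | { status: HealthBody["status"] } {
  return isTrusted ? body : { status: body.status };
}
```

The HTTP status code stays the same in both cases, so probes and monitors keep working while dependency names and error messages stay internal.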

Key Takeaways

Health checks with Terminus in NestJS provide a standardized way to monitor app and dependency health via HTTP endpoints.

They help detect problems early, enabling automated recovery and improving app reliability and uptime.

Health indicators can be synchronous or asynchronous and cover databases, APIs, and custom logic.

Proper HTTP status codes and detailed JSON responses are essential for effective monitoring integration.

Terminus supports graceful shutdown and custom indicators, making it flexible for real-world production use.