
Why cluster monitoring matters in Kubernetes - Why It Works

Introduction
When you run many applications on a Kubernetes cluster, things can break or slow down. Cluster monitoring helps you see what is happening inside your cluster so you can fix problems quickly and keep your apps running smoothly.
Cluster monitoring is useful:
  • When you want to know whether your apps are healthy and working as expected.
  • When you need to find out why an app is slow or not responding.
  • When you want to track resource use such as CPU and memory to avoid crashes.
  • When you want alerts to warn you before problems get worse.
  • When you want to understand usage trends to plan for future growth.
Commands
This command shows the current CPU and memory usage of every node in your cluster, so you can see whether any node is overloaded. It needs the Metrics Server add-on to be running in the cluster.
Terminal
kubectl top nodes
Expected Output
NAME            CPU(cores)   MEMORY(bytes)
worker-node-1   250m         512Mi
worker-node-2   300m         768Mi
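To make "overloaded" concrete, here is a minimal sketch that filters the output above for nodes over a CPU limit. The helper name busy_nodes and the limit are hypothetical choices for illustration; the sketch assumes CPU usage is in the second column, as in the output shown.

```shell
# Hypothetical helper: read `kubectl top nodes` output on stdin and print
# the nodes whose CPU usage exceeds a limit given in millicores.
# Assumes the CPU value is in column 2, as in the output shown above.
busy_nodes() {
  limit_m="$1"
  awk -v limit="$limit_m" '
    NR == 1 && $1 == "NAME" { next }        # skip the header row if present
    {
      cpu = $2
      if (cpu ~ /m$/) { sub(/m$/, "", cpu) } # "250m" -> 250 millicores
      else            { cpu = cpu * 1000 }   # whole cores -> millicores
      if (cpu + 0 > limit + 0) print $1, $2
    }'
}
```

On a real cluster you could run it as: kubectl top nodes | busy_nodes 500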
This command shows CPU and memory usage for all pods in all namespaces. It helps identify which pods use the most resources.
Terminal
kubectl top pods --all-namespaces
Expected Output
NAMESPACE     NAME                      CPU(cores)   MEMORY(bytes)
default       my-app-1234567890-abcde   100m         200Mi
kube-system   coredns-abcdef1234        50m          100Mi
--all-namespaces - Show pods from all namespaces, not just the current one
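If the list is long, ranking it helps you find the heaviest pods quickly. The sketch below is one way to do that; the helper name top_pods_by_memory is hypothetical. It assumes memory is in the fourth column (as with --all-namespaces) and that all values use the same unit, such as Mi.

```shell
# Hypothetical helper: read `kubectl top pods --all-namespaces` output on
# stdin and print the N pods using the most memory (default: 5).
# Assumes memory is in column 4 and every value uses the same unit.
top_pods_by_memory() {
  count="${1:-5}"
  grep -v '^NAMESPACE' |   # drop the header row
    sort -k4,4nr |         # numeric sort on column 4, largest first
    head -n "$count"
}
```

On a real cluster you could run it as: kubectl top pods --all-namespaces | top_pods_by_memory 3
kubectl top can also do the ranking itself with --sort-by=memory or --sort-by=cpu.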
This command lists recent events in the current namespace, sorted by creation time from oldest to newest. It helps you spot warnings or errors; add --all-namespaces to cover the whole cluster.
Terminal
kubectl get events --sort-by=.metadata.creationTimestamp
Expected Output
LAST SEEN   TYPE      REASON    OBJECT                        MESSAGE
1m          Warning   BackOff   pod/my-app-1234567890-abcde   Back-off restarting failed container
--sort-by - Sort events by creation time in ascending order, so the newest events appear at the bottom of the output
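When there are many events, a summary of the warnings is often easier to read than the raw list. Here is a rough sketch; the helper name warning_counts is hypothetical, and it assumes the default column layout shown above, where LAST SEEN is a single value such as 1m.

```shell
# Hypothetical helper: read `kubectl get events` output on stdin and count
# Warning events per reason and object, most frequent first.
# Assumes the default columns: LAST SEEN, TYPE, REASON, OBJECT, MESSAGE.
warning_counts() {
  awk '$2 == "Warning" { count[$3 " " $4]++ }
       END { for (key in count) print count[key], key }' |
    sort -rn
}
```

On a real cluster you could run it as: kubectl get events --all-namespaces | warning_counts
With --all-namespaces the output gains a leading NAMESPACE column, so the field numbers in the sketch would need to shift by one. To let kubectl do the filtering instead, you can pass --field-selector type=Warning.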
Key Concept

If you remember nothing else from this pattern, remember: monitoring your cluster helps you catch problems early and keep your apps running well.

Common Mistakes
Ignoring resource usage until apps crash or slow down
Waiting too long makes problems harder to fix and causes downtime
Regularly check resource usage and events to catch issues early
Only monitoring one part of the cluster, like nodes but not pods
Problems can happen anywhere; missing pod issues can cause app failures
Monitor both nodes and pods to get a full picture of cluster health
Not sorting events by time, making it hard to find recent problems
Old events clutter the output and hide new warnings or errors
Use a sorting flag such as --sort-by=.metadata.creationTimestamp so events appear in time order, with the newest at the bottom
Summary
Use 'kubectl top nodes' to check node resource usage and spot overloads.
Use 'kubectl top pods --all-namespaces' to find which pods use the most CPU and memory.
Use 'kubectl get events --sort-by=.metadata.creationTimestamp' to see recent cluster warnings and errors.

Practice

1. Why is cluster monitoring important in Kubernetes?
easy
A. It removes unused containers automatically.
B. It helps detect problems early and keeps the system healthy.
C. It replaces the need for backups.
D. It automatically scales the cluster without user input.

Solution

  1. Step 1: Understand the purpose of monitoring

    Monitoring tracks system health and performance to spot issues early.
  2. Step 2: Compare options with monitoring goals

    Only early problem detection and health maintenance match monitoring's purpose.
  3. Final Answer:

    It helps detect problems early and keeps the system healthy. -> Option B
  4. Quick Check:

    Monitoring = Early problem detection [OK]
Hint: Monitoring = spotting problems early to keep system healthy [OK]
Common Mistakes:
  • Confusing monitoring with automatic scaling
  • Thinking monitoring replaces backups
  • Assuming monitoring deletes containers
2. Which command is used to check the status of nodes in a Kubernetes cluster for monitoring?
easy
A. kubectl get nodes
B. kubectl describe service
C. kubectl get pods
D. kubectl logs

Solution

  1. Step 1: Identify command to list nodes

    The command kubectl get nodes lists all cluster nodes and their status.
  2. Step 2: Eliminate other commands

    kubectl get pods lists pods, not nodes; kubectl describe service shows service details; kubectl logs shows logs of pods.
  3. Final Answer:

    kubectl get nodes -> Option A
  4. Quick Check:

    Nodes status = kubectl get nodes [OK]
Hint: Nodes status command is 'kubectl get nodes' [OK]
Common Mistakes:
  • Using 'kubectl get pods' to check nodes
  • Confusing logs with node status
  • Describing services instead of nodes
3. Given the output below from kubectl top nodes, what does it indicate?
NAME           CPU(cores)   MEMORY(bytes)
node-1         250m        512Mi
node-2         900m        1Gi
node-3         100m        256Mi
medium
A. node-3 has the highest CPU usage.
B. node-1 is using the most memory.
C. All nodes have equal resource usage.
D. node-2 is under heavy CPU and memory load compared to others.

Solution

  1. Step 1: Analyze CPU and memory usage per node

    node-2 shows 900m CPU and 1Gi memory, which is higher than node-1 and node-3.
  2. Step 2: Compare usage values

    node-3 has lowest CPU (100m), node-1 has moderate CPU (250m), node-2 is highest in both CPU and memory.
  3. Final Answer:

    node-2 is under heavy CPU and memory load compared to others. -> Option D
  4. Quick Check:

    Highest CPU and memory = node-2 [OK]
Hint: Highest CPU and memory usage means heavy load [OK]
Common Mistakes:
  • Mistaking 100m as highest CPU
  • Assuming equal resource usage
  • Confusing memory units
4. You set up cluster monitoring but notice no metrics appear when running kubectl top nodes. What is the most likely cause?
medium
A. Nodes are offline.
B. kubectl command is outdated.
C. Metrics-server is not installed or running.
D. Pods are not labeled correctly.

Solution

  1. Step 1: Understand what provides metrics for 'kubectl top'

    The metrics-server collects resource usage data for nodes and pods.
  2. Step 2: Identify why metrics might be missing

    If metrics-server is missing or not running, kubectl top cannot reach the Metrics API, so it reports an error instead of showing usage data.
  3. Final Answer:

    Metrics-server is not installed or running. -> Option C
  4. Quick Check:

    Missing metrics = metrics-server issue [OK]
Hint: No metrics? Check if metrics-server is running [OK]
Common Mistakes:
  • Blaming kubectl version without checking metrics-server
  • Assuming nodes are offline without verification
  • Thinking pod labels affect node metrics
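To check the metrics-server cause from question 4 on a real cluster, you can ask the API server whether the Metrics API is available. This is a minimal sketch: the function name check_metrics_api is hypothetical, and it assumes the Metrics API is registered under the usual APIService name v1beta1.metrics.k8s.io.

```shell
# Hypothetical helper: report whether the Metrics API used by `kubectl top`
# is available. Assumes the usual APIService name v1beta1.metrics.k8s.io.
check_metrics_api() {
  status="$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' \
    2>/dev/null)" || status=""
  if [ "$status" = "True" ]; then
    echo "metrics API available"
  else
    echo "metrics API unavailable: check the metrics-server deployment"
    return 1
  fi
}
```

If the check fails, a common next step is to look at the metrics-server deployment, which is often installed in the kube-system namespace: kubectl get deployment metrics-server -n kube-system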
5. You want to improve cluster reliability by setting up alerts for high CPU usage on nodes. Which approach best supports this goal?
hard
A. Use Prometheus to monitor node metrics and configure alert rules for CPU thresholds.
B. Manually check node CPU usage daily with kubectl top nodes.
C. Restart nodes periodically to prevent high CPU usage.
D. Disable monitoring to reduce overhead and avoid false alerts.

Solution

  1. Step 1: Identify monitoring tool for alerts

    Prometheus collects metrics and supports alerting rules for conditions like high CPU.
  2. Step 2: Evaluate options for reliability

    Manual checks are slow and error-prone; restarting nodes blindly is not a solution; disabling monitoring removes visibility.
  3. Final Answer:

    Use Prometheus to monitor node metrics and configure alert rules for CPU thresholds. -> Option A
  4. Quick Check:

    Automated alerts = Prometheus + alert rules [OK]
Hint: Automate alerts with Prometheus for reliable monitoring [OK]
Common Mistakes:
  • Relying on manual checks only
  • Restarting nodes without cause
  • Disabling monitoring to avoid alerts
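As an illustration of the alert rule idea in question 5, the sketch below writes a small Prometheus rules file from the shell. Everything specific in it is an assumption for illustration: the function name write_cpu_alert_rule, the alert name NodeHighCpu, the 80% threshold, the 10 minute wait, and the use of the node_cpu_seconds_total metric, which is available when node_exporter runs on your nodes.

```shell
# Hypothetical helper: write an example Prometheus alerting rule to a file.
# Assumes node_exporter is exposing node_cpu_seconds_total; the alert name,
# threshold (80%), and duration (10m) are illustrative values only.
write_cpu_alert_rule() {
  rule_file="$1"
  # Quoted 'EOF' keeps the shell from expanding $labels in the template.
  cat > "$rule_file" <<'EOF'
groups:
  - name: node-cpu
    rules:
      - alert: NodeHighCpu
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "CPU usage above 80% on {{ $labels.instance }}"
EOF
}
```

You could call it as write_cpu_alert_rule node-cpu-rules.yml and, if promtool is installed, validate the result with promtool check rules node-cpu-rules.yml before loading it into Prometheus.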