Kubernetes · DevOps · ~15 mins

Memory requests and limits in Kubernetes - Deep Dive

Overview - Memory requests and limits
What is it?
Memory requests and limits are settings in Kubernetes that control how much memory a container can use. A memory request is the amount of memory Kubernetes guarantees to a container, while a memory limit is the maximum memory the container is allowed to use. These settings help Kubernetes decide where to place containers and prevent any container from using too much memory and affecting others.
Why it matters
Without memory requests and limits, containers could use more memory than expected, causing the system to slow down or crash. This would be like letting one person in a shared apartment use all the hot water, leaving none for others. Setting these controls ensures fair resource sharing and system stability, which is critical for running reliable applications.
Where it fits
Before learning memory requests and limits, you should understand basic Kubernetes concepts like pods and containers. After this, you can learn about CPU requests and limits, quality of service classes, and how Kubernetes schedules workloads based on resource needs.
Mental Model
Core Idea
Memory requests guarantee a container’s minimum memory, while memory limits cap its maximum memory use to keep the system stable.
Think of it like...
Imagine a shared kitchen where each person is assigned a minimum shelf space (memory request) to store their ingredients and a maximum shelf space (memory limit) to prevent overcrowding. This keeps the kitchen organized and fair for everyone.
┌─────────────────────────────────────┐
│           Kubernetes Pod            │
│ ┌───────────────┐ ┌───────────────┐ │
│ │ Container A   │ │ Container B   │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │ Memory    │ │ │ │ Memory    │ │ │
│ │ │ Request   │ │ │ │ Request   │ │ │
│ │ │ (Min)     │ │ │ │ (Min)     │ │ │
│ │ └───────────┘ │ │ └───────────┘ │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │ Memory    │ │ │ │ Memory    │ │ │
│ │ │ Limit     │ │ │ │ Limit     │ │ │
│ │ │ (Max)     │ │ │ │ (Max)     │ │ │
│ │ └───────────┘ │ │ └───────────┘ │ │
│ └───────────────┘ └───────────────┘ │
└─────────────────────────────────────┘
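In a pod spec, these two settings live under each container's resources field. A minimal sketch (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shelf-demo          # placeholder name
spec:
  containers:
  - name: app
    image: nginx            # any image works for illustration
    resources:
      requests:
        memory: "256Mi"     # guaranteed minimum ("assigned shelf space")
      limits:
        memory: "512Mi"     # hard cap ("maximum shelf space")
```

Memory quantities accept binary suffixes like Mi (mebibytes) and Gi (gibibytes).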
Build-Up - 7 Steps
1
Foundation: Understanding Kubernetes Pods and Containers
Concept: Learn what pods and containers are in Kubernetes as the basic units where memory requests and limits apply.
A pod is the smallest deployable unit in Kubernetes and can contain one or more containers. Containers run applications and need resources like CPU and memory to work. Kubernetes manages these resources to keep the system healthy.
Result
You understand that memory settings apply to containers inside pods, which are scheduled on nodes.
Knowing the structure of pods and containers is essential because memory requests and limits are set per container, affecting how pods are scheduled and run.
2
Foundation: What Are Memory Requests and Limits?
Concept: Introduce the definitions of memory requests and limits and their roles in resource management.
Memory request is the amount of memory Kubernetes guarantees to a container. Memory limit is the maximum memory a container can use. If a container tries to use more than its limit, it may be terminated. Requests help Kubernetes decide where to place pods based on available resources.
Result
You can distinguish between guaranteed memory (request) and maximum allowed memory (limit).
Understanding these two settings clarifies how Kubernetes balances resource allocation and prevents resource hogging.
3
Intermediate: How Kubernetes Uses Memory Requests
🤔 Before reading on: do you think Kubernetes schedules pods based on memory requests or limits? Commit to your answer.
Concept: Explain that Kubernetes uses memory requests to schedule pods on nodes with enough available memory.
When you create a pod, Kubernetes looks at the memory requests of all containers inside it. It finds a node with enough free memory to meet these requests. Memory limits do not affect scheduling but control runtime usage.
Result
Pods are placed only on nodes that can guarantee the requested memory, ensuring stable operation.
Knowing that scheduling depends on requests, not limits, helps prevent resource overcommitment and scheduling failures.
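One way to see this for yourself: request more memory than any node can supply, and the pod never schedules. A sketch (assuming a small cluster whose nodes have far less than 64Gi allocatable):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oversized-request   # placeholder name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Gi"      # larger than any node's allocatable memory
```

The pod stays Pending, and kubectl describe pod shows a FailedScheduling event citing insufficient memory. Note that the scheduler sums requests, not actual usage: a node can be "full" by requests while much of its real memory sits idle.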
4
Intermediate: What Happens When Memory Limits Are Exceeded
🤔 Before reading on: do you think exceeding memory limits causes the container to slow down or to be killed? Commit to your answer.
Concept: Describe the behavior when a container uses more memory than its limit.
If a container tries to use more memory than its limit, the Linux kernel's Out Of Memory (OOM) killer terminates it. This prevents one container from crashing the whole node. Kubernetes then may restart the container based on its restart policy.
Result
Containers that exceed memory limits are killed and restarted, protecting node stability.
Understanding this prevents surprises in production where containers unexpectedly restart due to memory overuse.
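You can observe this by giving a memory-hungry process a deliberately tight limit. A sketch using a stress image (the image name and its arguments are assumptions; any workload that allocates more than the limit behaves the same):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo            # placeholder name
spec:
  containers:
  - name: stress
    image: polinux/stress   # assumed available in your environment
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]  # tries to allocate ~250M
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"     # well below what the process allocates
```

kubectl get pod oom-demo then shows the container terminated with reason OOMKilled (exit code 137), and the kubelet restarts it according to the pod's restartPolicy.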
5
Intermediate: Quality of Service Classes Based on Memory Settings
🤔 Before reading on: do you think setting requests and limits equal affects pod priority? Commit to your answer.
Concept: Introduce Kubernetes Quality of Service (QoS) classes that depend on memory requests and limits.
Kubernetes assigns each pod one of three QoS classes: Guaranteed, Burstable, or BestEffort. If every container sets both requests and limits, and they are equal for both CPU and memory, the pod is Guaranteed and is the last to be evicted. If the pod sets at least one request or limit but does not meet the Guaranteed criteria, it is Burstable. If no container sets any requests or limits, it is BestEffort and is evicted first.
Result
You can predict pod eviction priority and stability based on memory settings.
Knowing QoS classes helps design pods that survive node pressure and maintain availability.
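The three classes fall out of how requests and limits are combined. A sketch of the resources block that produces each (values are illustrative):

```yaml
# Guaranteed: every container sets limits, and requests equal limits
resources:
  requests: { memory: "512Mi", cpu: "500m" }
  limits:   { memory: "512Mi", cpu: "500m" }
---
# Burstable: some requests or limits set, but not meeting the Guaranteed criteria
resources:
  requests: { memory: "256Mi" }
  limits:   { memory: "512Mi" }
---
# BestEffort: no requests or limits on any container in the pod
resources: {}
```

Under node memory pressure, BestEffort pods are evicted first, then Burstable pods using more than their requests; Guaranteed pods go last.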
6
Advanced: Memory Requests and Limits in Production Clusters
🤔 Before reading on: do you think setting very high memory limits without requests is safe in production? Commit to your answer.
Concept: Discuss best practices and risks of memory settings in real-world Kubernetes clusters.
In production, setting memory requests below real usage lets pods land on nodes that cannot sustain them, leading to evictions and OOM kills under pressure. Setting requests far above usage strands memory the scheduler counts as taken but nothing uses. And a limit without a request is not "no request": Kubernetes defaults the request to the limit, so an oversized limit silently reserves that much memory for scheduling. Balancing requests and limits against observed usage keeps the cluster both efficient and stable.
Result
Clusters run smoothly with predictable pod behavior and minimal crashes.
Understanding the balance between requests and limits is key to avoiding costly downtime and resource waste.
7
Expert: Surprising Effects of Memory Limits on Container Performance
🤔 Before reading on: do you think memory limits can cause performance degradation even if the container stays below the limit? Commit to your answer.
Concept: Explain how memory limits can affect container performance due to kernel memory management and cgroup behavior.
Memory limits are enforced through Linux cgroups. As a container's usage approaches its limit, the kernel aggressively reclaims its page cache and can stall the workload in direct reclaim, causing latency spikes; since most Kubernetes nodes run with swap disabled, the pressure shows up as reclaim stalls rather than swapping. Some applications also change behavior under memory pressure, leading to subtle bugs or slowdowns.
Result
Containers may experience performance issues even without OOM kills if limits are too tight.
Knowing this helps experts tune memory limits not just for safety but also for optimal performance and reliability.
Under the Hood
Kubernetes uses Linux cgroups to enforce memory limits. Requests are scheduler bookkeeping: each pod's requests are debited against a node's allocatable memory, but no memory is physically reserved. Limits become cgroup constraints that the kernel enforces at runtime; if a container exceeds its limit, the kernel's OOM killer terminates it. The kubelet then restarts terminated containers according to the pod's restartPolicy.
Why designed this way?
This design separates scheduling guarantees from runtime enforcement, allowing Kubernetes to efficiently allocate resources while protecting node stability. Alternatives like only using limits would risk scheduling pods on nodes without enough memory, causing failures. Using cgroups leverages existing Linux kernel features for resource control.
┌─────────────────┐
│ Kubernetes      │
│ Scheduler       │
│ (uses requests) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Node with       │
│ Available       │
│ Memory          │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Container       │
│ cgroup limits   │
│ (memory limit)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Linux Kernel    │
│ OOM Killer      │
└─────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does Kubernetes schedule pods based on memory limits or requests? Commit to your answer.
Common Belief: Kubernetes schedules pods based on memory limits because that is the maximum memory a container can use.
Reality: Kubernetes schedules pods based on memory requests, which are the guaranteed minimum resources, not limits.
Why it matters: Scheduling based on limits would risk placing pods on nodes without enough guaranteed memory, causing failures and instability.
Quick: If a container exceeds its memory request but stays below its limit, will it be killed? Commit to your answer.
Common Belief: A container will be killed if it uses more memory than its request, even if under the limit.
Reality: Containers are OOM-killed only if they exceed their memory limit, not their request (though under node memory pressure, pods using more than their requests are prime eviction candidates).
Why it matters: Confusing requests with limits can lead to incorrect assumptions about container stability and resource usage.
Quick: Does setting no memory requests or limits mean a container can use unlimited memory safely? Commit to your answer.
Common Belief: Without memory requests or limits, containers can safely use as much memory as they want without issues.
Reality: Without limits, containers can consume all node memory, causing the node to become unstable or crash due to OOM kills affecting other pods.
Why it matters: Not setting limits risks cluster stability and can cause unpredictable outages.
Quick: Can setting memory limits too low cause performance problems even if the container does not get killed? Commit to your answer.
Common Belief: Memory limits only cause problems if the container exceeds them and gets killed; otherwise, they have no effect.
Reality: Tight memory limits can trigger aggressive page-cache reclaim and direct-reclaim stalls in the kernel, degrading performance even without any kills.
Why it matters: Ignoring this can cause subtle, hard-to-debug performance issues in production.
Expert Zone
1
Memory requests affect scheduling but do not limit runtime usage; limits enforce runtime caps but do not influence scheduling decisions.
2
Setting requests equal to limits guarantees a pod's resources and places it in the Guaranteed QoS class, reducing eviction risk under node pressure.
3
Linux kernel behavior under memory pressure can cause containers near their limits to experience latency spikes from aggressive page-cache reclaim and direct-reclaim stalls, even when nothing is killed.
When NOT to use
Memory requests and limits are not suitable for workloads with highly unpredictable memory usage patterns where static limits cause frequent restarts. In such cases, consider using vertical pod autoscaling or custom resource metrics to adjust resources dynamically.
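For such workloads, the Vertical Pod Autoscaler can observe actual usage and set requests for you. A sketch (assumes the VPA controller is installed in the cluster; the target Deployment name is a placeholder):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp             # placeholder target workload
  updatePolicy:
    updateMode: "Auto"      # VPA evicts pods to apply its recommended requests
```

With updateMode "Off", the VPA only publishes recommendations, which is a safe way to gather sizing data before letting it act.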
Production Patterns
In production, teams often set conservative memory requests to ensure scheduling success and set limits slightly higher to allow burst usage. They monitor pod memory usage with tools like Prometheus and adjust settings to balance stability and resource efficiency. Critical pods use Guaranteed QoS by matching requests and limits.
Connections
CPU requests and limits
Parallel resource management concepts in Kubernetes
Understanding memory requests and limits helps grasp CPU resource controls, as both use similar scheduling and enforcement mechanisms.
Linux cgroups
Underlying technology enforcing resource limits
Knowing how cgroups work explains why Kubernetes can enforce memory limits and how the kernel kills processes exceeding them.
Shared apartment resource allocation
Resource sharing and fairness in a shared environment
The concept of memory requests and limits mirrors how roommates share limited resources fairly to avoid conflicts and shortages.
Common Pitfalls
#1: Setting only a memory limit silently reserves that much memory for scheduling.
Wrong approach:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: myapp
    resources:
      limits:
        memory: "500Mi"
Correct approach:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: myapp
    resources:
      requests:
        memory: "300Mi"
      limits:
        memory: "500Mi"
Root cause: When only a limit is set, Kubernetes defaults the request to the limit, so this pod is scheduled as if it always needs the full 500Mi. An explicit, lower request lets the scheduler place pods based on realistic usage while the limit still allows bursts.
#2: Setting memory requests higher than actual usage wastes cluster resources.
Wrong approach:
resources:
  requests:
    memory: "2Gi"
  limits:
    memory: "3Gi"
Correct approach:
resources:
  requests:
    memory: "500Mi"
  limits:
    memory: "1Gi"
Root cause:Overestimating requests causes Kubernetes to reserve more memory than needed, reducing cluster efficiency.
#3: Setting memory limits too low causes frequent container restarts.
Wrong approach:
resources:
  requests:
    memory: "500Mi"
  limits:
    memory: "600Mi"
Correct approach:
resources:
  requests:
    memory: "500Mi"
  limits:
    memory: "1Gi"
Root cause:Too tight limits cause containers to exceed them during normal operation, triggering OOM kills.
Key Takeaways
Memory requests guarantee the minimum memory a container needs and influence pod scheduling.
Memory limits cap the maximum memory a container can use at runtime to protect node stability.
Kubernetes schedules pods based on requests, not limits, so setting requests correctly is critical.
Exceeding memory limits causes the container to be killed by the kernel's OOM killer and restarted by Kubernetes.
Balancing requests and limits properly ensures efficient resource use, stable applications, and predictable performance.