Kubernetes · DevOps · ~15 mins

Memory requests and limits in Kubernetes - Deep Dive

Overview - Memory requests and limits
What is it?
Memory requests and limits are settings in Kubernetes that control how much memory a container can use. A memory request is the amount of memory Kubernetes guarantees to a container, while a memory limit is the maximum memory the container is allowed to use. These settings help Kubernetes decide where to place containers and prevent any container from using too much memory and affecting others.
Why it matters
Without memory requests and limits, containers could use more memory than expected, causing the system to slow down or crash. This would be like letting one person in a shared apartment use all the hot water, leaving none for others. Setting these controls ensures fair resource sharing and system stability, which is critical for running reliable applications.
Where it fits
Before learning memory requests and limits, you should understand basic Kubernetes concepts like pods and containers. After this, you can learn about CPU requests and limits, quality of service classes, and how Kubernetes schedules workloads based on resource needs.
Mental Model
Core Idea
Memory requests guarantee a container’s minimum memory, while memory limits cap its maximum memory use to keep the system stable.
Think of it like...
Imagine a shared kitchen where each person is assigned a minimum shelf space (memory request) to store their ingredients and a maximum shelf space (memory limit) to prevent overcrowding. This keeps the kitchen organized and fair for everyone.
┌─────────────────────────────────────┐
│           Kubernetes Pod            │
│ ┌───────────────┐ ┌───────────────┐ │
│ │ Container A   │ │ Container B   │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │ Memory    │ │ │ │ Memory    │ │ │
│ │ │ Request   │ │ │ │ Request   │ │ │
│ │ │ (Min)     │ │ │ │ (Min)     │ │ │
│ │ └───────────┘ │ │ └───────────┘ │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │ Memory    │ │ │ │ Memory    │ │ │
│ │ │ Limit     │ │ │ │ Limit     │ │ │
│ │ │ (Max)     │ │ │ │ (Max)     │ │ │
│ │ └───────────┘ │ │ └───────────┘ │ │
│ └───────────────┘ └───────────────┘ │
└─────────────────────────────────────┘
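In a pod spec, these two settings live under each container's resources field. A minimal sketch (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shelf-demo          # placeholder name
spec:
  containers:
  - name: app
    image: nginx            # any image works for illustration
    resources:
      requests:
        memory: "256Mi"     # guaranteed minimum ("assigned shelf space")
      limits:
        memory: "512Mi"     # hard cap ("maximum shelf space")
```

Memory quantities accept binary suffixes like Mi (mebibytes) and Gi (gibibytes).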
Build-Up - 7 Steps
1
Foundation: Understanding Kubernetes Pods and Containers
Concept: Learn what pods and containers are in Kubernetes as the basic units where memory requests and limits apply.
A pod is the smallest deployable unit in Kubernetes and can contain one or more containers. Containers run applications and need resources like CPU and memory to work. Kubernetes manages these resources to keep the system healthy.
Result
You understand that memory settings apply to containers inside pods, which are scheduled on nodes.
Knowing the structure of pods and containers is essential because memory requests and limits are set per container, affecting how pods are scheduled and run.
2
Foundation: What Are Memory Requests and Limits?
Concept: Introduce the definitions of memory requests and limits and their roles in resource management.
Memory request is the amount of memory Kubernetes guarantees to a container. Memory limit is the maximum memory a container can use. If a container tries to use more than its limit, it may be terminated. Requests help Kubernetes decide where to place pods based on available resources.
Result
You can distinguish between guaranteed memory (request) and maximum allowed memory (limit).
Understanding these two settings clarifies how Kubernetes balances resource allocation and prevents resource hogging.
3
Intermediate: How Kubernetes Uses Memory Requests
🤔 Before reading on: do you think Kubernetes schedules pods based on memory requests or limits? Commit to your answer.
Concept: Explain that Kubernetes uses memory requests to schedule pods on nodes with enough available memory.
When you create a pod, Kubernetes looks at the memory requests of all containers inside it. It finds a node with enough free memory to meet these requests. Memory limits do not affect scheduling but control runtime usage.
Result
Pods are placed only on nodes that can guarantee the requested memory, ensuring stable operation.
Knowing that scheduling depends on requests, not limits, helps prevent resource overcommitment and scheduling failures.
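One way to see this for yourself: request more memory than any node can supply, and the pod never schedules. A sketch (assuming a small cluster whose nodes have far less than 64Gi allocatable):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oversized-request   # placeholder name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        memory: "64Gi"      # larger than any node's allocatable memory
```

The pod stays Pending, and kubectl describe pod shows a FailedScheduling event citing insufficient memory. Note that the scheduler sums requests, not actual usage: a node can be "full" by requests while much of its real memory sits idle.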
4
Intermediate: What Happens When Memory Limits Are Exceeded
🤔 Before reading on: do you think exceeding memory limits causes the container to slow down or to be killed? Commit to your answer.
Concept: Describe the behavior when a container uses more memory than its limit.
If a container tries to use more memory than its limit, the Linux kernel's Out Of Memory (OOM) killer terminates it. This prevents one container from crashing the whole node. Kubernetes then may restart the container based on its restart policy.
Result
Containers that exceed memory limits are killed and restarted, protecting node stability.
Understanding this prevents surprises in production where containers unexpectedly restart due to memory overuse.
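You can observe this by giving a memory-hungry process a deliberately tight limit. A sketch using a stress image (the image name and its arguments are assumptions; any workload that allocates more than the limit behaves the same):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo            # placeholder name
spec:
  containers:
  - name: stress
    image: polinux/stress   # assumed available in your environment
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]  # tries to allocate ~250M
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"     # well below what the process allocates
```

kubectl get pod oom-demo then shows the container terminated with reason OOMKilled (exit code 137), and the kubelet restarts it according to the pod's restartPolicy.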
5
Intermediate: Quality of Service Classes Based on Memory Settings
🤔 Before reading on: do you think setting requests and limits equal affects pod priority? Commit to your answer.
Concept: Introduce Kubernetes Quality of Service (QoS) classes that depend on memory requests and limits.
Kubernetes assigns each pod one of three QoS classes: Guaranteed, Burstable, or BestEffort. If every container sets both requests and limits, and they are equal for both CPU and memory, the pod is Guaranteed and is the last to be evicted. If the pod sets at least one request or limit but does not meet the Guaranteed criteria, it is Burstable. If no container sets any requests or limits, it is BestEffort and is evicted first.
Result
You can predict pod eviction priority and stability based on memory settings.
Knowing QoS classes helps design pods that survive node pressure and maintain availability.
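The three classes fall out of how requests and limits are combined. A sketch of the resources block that produces each (values are illustrative):

```yaml
# Guaranteed: every container sets limits, and requests equal limits
resources:
  requests: { memory: "512Mi", cpu: "500m" }
  limits:   { memory: "512Mi", cpu: "500m" }
---
# Burstable: some requests or limits set, but not meeting the Guaranteed criteria
resources:
  requests: { memory: "256Mi" }
  limits:   { memory: "512Mi" }
---
# BestEffort: no requests or limits on any container in the pod
resources: {}
```

Under node memory pressure, BestEffort pods are evicted first, then Burstable pods using more than their requests; Guaranteed pods go last.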
6
Advanced: Memory Requests and Limits in Production Clusters
🤔 Before reading on: do you think setting very high memory limits without requests is safe in production? Commit to your answer.
Concept: Discuss best practices and risks of memory settings in real-world Kubernetes clusters.
In production, setting memory requests below real usage lets pods land on nodes that cannot sustain them, leading to evictions and OOM kills under pressure. Setting requests far above usage strands memory the scheduler counts as taken but nothing uses. And a limit without a request is not "no request": Kubernetes defaults the request to the limit, so an oversized limit silently reserves that much memory for scheduling. Balancing requests and limits against observed usage keeps the cluster both efficient and stable.
Result
Clusters run smoothly with predictable pod behavior and minimal crashes.
Understanding the balance between requests and limits is key to avoiding costly downtime and resource waste.
7
Expert: Surprising Effects of Memory Limits on Container Performance
🤔 Before reading on: do you think memory limits can cause performance degradation even if the container stays below the limit? Commit to your answer.
Concept: Explain how memory limits can affect container performance due to kernel memory management and cgroup behavior.
Memory limits are enforced through Linux cgroups. As a container's usage approaches its limit, the kernel aggressively reclaims its page cache and can stall the workload in direct reclaim, causing latency spikes; since most Kubernetes nodes run with swap disabled, the pressure shows up as reclaim stalls rather than swapping. Some applications also change behavior under memory pressure, leading to subtle bugs or slowdowns.
Result
Containers may experience performance issues even without OOM kills if limits are too tight.
Knowing this helps experts tune memory limits not just for safety but also for optimal performance and reliability.
Under the Hood
Kubernetes uses Linux cgroups to enforce memory limits. Requests are scheduler bookkeeping: each pod's requests are debited against a node's allocatable memory, but no memory is physically reserved. Limits become cgroup constraints that the kernel enforces at runtime; if a container exceeds its limit, the kernel's OOM killer terminates it. The kubelet then restarts terminated containers according to the pod's restartPolicy.
Why designed this way?
This design separates scheduling guarantees from runtime enforcement, allowing Kubernetes to efficiently allocate resources while protecting node stability. Alternatives like only using limits would risk scheduling pods on nodes without enough memory, causing failures. Using cgroups leverages existing Linux kernel features for resource control.
┌─────────────────┐
│ Kubernetes      │
│ Scheduler       │
│ (uses requests) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Node with       │
│ Available       │
│ Memory          │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Container       │
│ cgroup limits   │
│ (memory limit)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Linux Kernel    │
│ OOM Killer      │
└─────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does Kubernetes schedule pods based on memory limits or requests? Commit to your answer.
Common Belief: Kubernetes schedules pods based on memory limits because that is the maximum memory a container can use.
Reality: Kubernetes schedules pods based on memory requests, which are the guaranteed minimum resources, not limits.
Why it matters: Scheduling based on limits would risk placing pods on nodes without enough guaranteed memory, causing failures and instability.
Quick: If a container exceeds its memory request but stays below its limit, will it be killed? Commit to your answer.
Common Belief: A container will be killed if it uses more memory than its request, even if under the limit.
Reality: Containers are OOM-killed only if they exceed their memory limit, not their request (though under node memory pressure, pods using more than their requests are prime eviction candidates).
Why it matters: Confusing requests with limits can lead to incorrect assumptions about container stability and resource usage.
Quick: Does setting no memory requests or limits mean a container can use unlimited memory safely? Commit to your answer.
Common Belief: Without memory requests or limits, containers can safely use as much memory as they want without issues.
Reality: Without limits, containers can consume all node memory, causing the node to become unstable or crash due to OOM kills affecting other pods.
Why it matters: Not setting limits risks cluster stability and can cause unpredictable outages.
Quick: Can setting memory limits too low cause performance problems even if the container does not get killed? Commit to your answer.
Common Belief: Memory limits only cause problems if the container exceeds them and gets killed; otherwise, they have no effect.
Reality: Tight memory limits can trigger aggressive page-cache reclaim and direct-reclaim stalls in the kernel, degrading performance even without any kills.
Why it matters: Ignoring this can cause subtle, hard-to-debug performance issues in production.
Expert Zone
1
Memory requests affect scheduling but do not limit runtime usage; limits enforce runtime caps but do not influence scheduling decisions.
2
Setting requests equal to limits guarantees a pod's resources and places it in the Guaranteed QoS class, reducing eviction risk under node pressure.
3
Linux kernel behavior under memory pressure can cause containers near their limits to experience latency spikes from aggressive page-cache reclaim and direct-reclaim stalls, even when nothing is killed.
When NOT to use
Memory requests and limits are not suitable for workloads with highly unpredictable memory usage patterns where static limits cause frequent restarts. In such cases, consider using vertical pod autoscaling or custom resource metrics to adjust resources dynamically.
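For such workloads, the Vertical Pod Autoscaler can observe actual usage and set requests for you. A sketch (assumes the VPA controller is installed in the cluster; the target Deployment name is a placeholder):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: myapp-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp             # placeholder target workload
  updatePolicy:
    updateMode: "Auto"      # VPA evicts pods to apply its recommended requests
```

With updateMode "Off", the VPA only publishes recommendations, which is a safe way to gather sizing data before letting it act.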
Production Patterns
In production, teams often set conservative memory requests to ensure scheduling success and set limits slightly higher to allow burst usage. They monitor pod memory usage with tools like Prometheus and adjust settings to balance stability and resource efficiency. Critical pods use Guaranteed QoS by matching requests and limits.
Connections
CPU requests and limits
Parallel resource management concepts in Kubernetes
Understanding memory requests and limits helps grasp CPU resource controls, as both use similar scheduling and enforcement mechanisms.
Linux cgroups
Underlying technology enforcing resource limits
Knowing how cgroups work explains why Kubernetes can enforce memory limits and how the kernel kills processes exceeding them.
Shared apartment resource allocation
Resource sharing and fairness in a shared environment
The concept of memory requests and limits mirrors how roommates share limited resources fairly to avoid conflicts and shortages.
Common Pitfalls
#1: Setting only a memory limit silently reserves that much memory for scheduling.
Wrong approach:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: myapp
    resources:
      limits:
        memory: "500Mi"
Correct approach:
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: myapp
    resources:
      requests:
        memory: "300Mi"
      limits:
        memory: "500Mi"
Root cause: When only a limit is set, Kubernetes defaults the request to the limit, so this pod is scheduled as if it always needs the full 500Mi. An explicit, lower request lets the scheduler place pods based on realistic usage while the limit still allows bursts.
#2: Setting memory requests higher than actual usage wastes cluster resources.
Wrong approach:
resources:
  requests:
    memory: "2Gi"
  limits:
    memory: "3Gi"
Correct approach:
resources:
  requests:
    memory: "500Mi"
  limits:
    memory: "1Gi"
Root cause:Overestimating requests causes Kubernetes to reserve more memory than needed, reducing cluster efficiency.
#3: Setting memory limits too low causes frequent container restarts.
Wrong approach:
resources:
  requests:
    memory: "500Mi"
  limits:
    memory: "600Mi"
Correct approach:
resources:
  requests:
    memory: "500Mi"
  limits:
    memory: "1Gi"
Root cause:Too tight limits cause containers to exceed them during normal operation, triggering OOM kills.
Key Takeaways
Memory requests guarantee the minimum memory a container needs and influence pod scheduling.
Memory limits cap the maximum memory a container can use at runtime to protect node stability.
Kubernetes schedules pods based on requests, not limits, so setting requests correctly is critical.
Exceeding memory limits causes the container to be killed by the kernel's OOM killer and restarted by Kubernetes.
Balancing requests and limits properly ensures efficient resource use, stable applications, and predictable performance.