
Resource requests and limits in Kubernetes - Deep Dive

Overview - Resource requests and limits
What is it?
Resource requests and limits in Kubernetes are settings that tell the system how much CPU and memory a container needs and how much it can use at most. Requests guarantee a minimum amount of resources for a container to run smoothly. Limits set the maximum resources a container can consume to avoid affecting other containers. These settings help Kubernetes manage resources efficiently across many containers.
Why it matters
Without resource requests and limits, containers could use too many resources, causing other containers to slow down or crash. This would make applications unreliable and hard to manage. By defining these, Kubernetes ensures fair sharing and stability, preventing resource shortages or waste. This leads to better performance, cost control, and predictable behavior in cloud environments.
Where it fits
Before learning resource requests and limits, you should understand basic Kubernetes concepts like pods, containers, and nodes. After this, you can learn about Kubernetes scheduling, autoscaling, and quality of service classes, which all depend on resource management.
Mental Model
Core Idea
Resource requests reserve the minimum needed resources for a container, while limits cap the maximum resources it can use to keep the system balanced.
Think of it like...
Imagine a shared kitchen where each cook reserves a certain amount of stove space (requests) to prepare their meal without interruption, but they cannot use more than their allotted burners (limits) so others can cook too.
┌─────────────────────────────┐
│       Kubernetes Node       │
│ ┌───────────────┐           │
│ │     Pod A     │           │
│ │ ┌───────────┐ │           │
│ │ │ Container │ │           │
│ │ │ Requests  │ │           │
│ │ │ Limits    │ │           │
│ │ └───────────┘ │           │
│ └───────────────┘           │
│ ┌───────────────┐           │
│ │     Pod B     │           │
│ │ ┌───────────┐ │           │
│ │ │ Container │ │           │
│ │ │ Requests  │ │           │
│ │ │ Limits    │ │           │
│ │ └───────────┘ │           │
│ └───────────────┘           │
│                             │
│    Total Node Resources     │
└─────────────────────────────┘
Build-Up - 8 Steps
1
Foundation: Understanding Kubernetes Resource Basics
Concept: Introduce what CPU and memory resources mean in Kubernetes context.
In Kubernetes, CPU is measured in cores and can be expressed in millicores (1000 millicores, written 1000m, equal 1 CPU core). Memory is measured in bytes, usually expressed in mebibytes (Mi) or gibibytes (Gi). Containers need these resources to run applications. Kubernetes manages these resources on nodes to run many containers efficiently.
Result
Learner understands the units and types of resources Kubernetes manages.
Knowing the units and types of resources is essential before setting any requests or limits.
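To make the units concrete, here is a minimal container spec fragment (the container name and values are illustrative, not a recommendation):

```yaml
# Illustrative fragment of a container spec.
# CPU: "500m" = 500 millicores = half a core; "1" = one full core.
# Memory: "Mi" = mebibytes (2^20 bytes); "Gi" = gibibytes (2^30 bytes).
containers:
  - name: demo-app        # hypothetical container name
    image: nginx
    resources:
      requests:
        cpu: "250m"       # a quarter of a core
        memory: "128Mi"   # 128 mebibytes
```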
2
Foundation: What Are Resource Requests and Limits?
Concept: Explain the difference between requests and limits simply.
A resource request is the amount of CPU or memory a container asks for to start and run reliably. A resource limit is the maximum amount it can use. If a container tries to use more than its limit, Kubernetes may stop or slow it down.
Result
Learner can distinguish between guaranteed resources (requests) and maximum allowed resources (limits).
Understanding this difference helps prevent resource conflicts and ensures fair sharing.
3
Intermediate: How Kubernetes Uses Requests for Scheduling
🤔 Before reading on: do you think Kubernetes schedules pods based on their limits or their requests? Commit to your answer.
Concept: Kubernetes uses resource requests to decide where to place pods on nodes.
When you create a pod, Kubernetes looks at the resource requests to find a node with enough free resources. It ignores limits during scheduling. This means requests affect whether a pod can start, while limits control runtime usage.
Result
Learner understands that requests influence pod placement, not limits.
Knowing that scheduling depends on requests prevents confusion about pod startup failures.
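A quick way to see that scheduling uses requests: a pod whose request exceeds every node's free capacity stays Pending, even if its limit alone would seem harmless. A sketch (names and values are illustrative):

```yaml
# If no node has 64 free cores, this pod stays Pending with a
# FailedScheduling event, regardless of its limit.
apiVersion: v1
kind: Pod
metadata:
  name: big-request-demo   # hypothetical name
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: "64"        # the scheduler must find 64 free cores
        limits:
          cpu: "64"
```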
4
Intermediate: What Happens When Limits Are Exceeded?
🤔 Before reading on: do you think exceeding CPU limits kills the container immediately or just slows it down? Commit to your answer.
Concept: Explain Kubernetes behavior when containers exceed their resource limits.
If a container uses more memory than its limit, Kubernetes kills it to protect the node. For CPU, Kubernetes throttles the container, slowing it down but not killing it. This difference is important for application stability.
Result
Learner knows the consequences of exceeding CPU vs memory limits.
Understanding these behaviors helps design applications that handle resource limits gracefully.
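The asymmetry is easiest to see side by side in one spec. A sketch, assuming the application inside tries to use more than both limits (the name and image are placeholders):

```yaml
# - memory: exceeding 200Mi gets the container OOM-killed
#   (status shows OOMKilled) and restarted per the pod's restartPolicy.
# - cpu: exceeding 500m is throttled; the container keeps
#   running, just slower.
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo        # hypothetical name
spec:
  containers:
    - name: app
      image: my-app:latest # placeholder image
      resources:
        limits:
          memory: "200Mi"
          cpu: "500m"
```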
5
Intermediate: Quality of Service Classes Based on Requests and Limits
Concept: Introduce how Kubernetes classifies pods by their resource settings.
Kubernetes assigns pods to QoS classes: Guaranteed, Burstable, or BestEffort. Guaranteed pods have equal requests and limits. Burstable pods have requests less than limits. BestEffort pods have no requests or limits. These classes affect pod eviction priority during resource pressure.
Result
Learner understands how resource settings affect pod priority and eviction.
Knowing QoS classes helps optimize resource allocation and pod reliability.
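The three classes follow mechanically from the resources block. Sketches of each (values illustrative):

```yaml
# Guaranteed: every container has requests == limits for both CPU and memory.
resources:
  requests: { cpu: "500m", memory: "256Mi" }
  limits:   { cpu: "500m", memory: "256Mi" }
---
# Burstable: at least one request is set, but the Guaranteed rule is not met.
resources:
  requests: { cpu: "250m", memory: "128Mi" }
  limits:   { cpu: "1",    memory: "512Mi" }
---
# BestEffort: no requests or limits on any container in the pod
# (omit the resources block entirely).
```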
6
Advanced: Configuring Requests and Limits in Pod Specs
🤔 Before reading on: do you think resource requests and limits are set at the pod or the container level? Commit to your answer.
Concept: Show how to define requests and limits in Kubernetes YAML manifests.
In the pod spec, under each container, you add a resources section with requests and limits. For example:

resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"

This tells Kubernetes the minimum and maximum resources for that container.
Result
Learner can write correct YAML to set resource requests and limits.
Knowing the exact syntax prevents configuration errors that cause scheduling or runtime issues.
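Put together, a complete pod manifest with the resources block in place might look like this (the name, image, and values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-demo             # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:
          cpu: "500m"        # reserved at scheduling time
          memory: "256Mi"
        limits:
          cpu: "1"           # throttled above one core
          memory: "512Mi"    # OOM-killed above 512Mi
```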
7
Advanced: Impact of Requests and Limits on Cluster Autoscaling
Concept: Explain how resource settings influence autoscaling decisions.
Cluster autoscalers look at resource requests to decide when to add or remove nodes. If pods request more resources than available, autoscalers add nodes. Limits do not affect autoscaling directly but control container usage. Proper requests help autoscalers keep the cluster balanced and cost-effective.
Result
Learner understands the role of requests in autoscaling behavior.
Knowing this helps optimize cluster size and cost by setting realistic requests.
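The autoscaler's arithmetic is driven entirely by requests. For example, assuming nodes with 4 allocatable CPU cores, the sketch below needs 10 × 500m = 5 cores of requested capacity, so at least two nodes (all names and values are illustrative):

```yaml
# 10 replicas x 500m CPU request = 5 cores of requested capacity.
# With 4 allocatable cores per node, the cluster autoscaler must
# provide at least 2 nodes; the 2-core limit plays no part in this.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-demo             # hypothetical name
spec:
  replicas: 10
  selector:
    matchLabels: { app: api-demo }
  template:
    metadata:
      labels: { app: api-demo }
    spec:
      containers:
        - name: api
          image: my-api:latest   # placeholder image
          resources:
            requests: { cpu: "500m" }
            limits:   { cpu: "2" }
```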
8
Expert: Surprising Effects of Overcommitting Resources
🤔 Before reading on: do you think setting requests lower than actual usage always improves cluster efficiency? Commit to your answer.
Concept: Discuss risks and behaviors when requests are set too low compared to actual usage.
Overcommitting means setting requests lower than what containers actually use. This can increase cluster utilization but risks pod eviction or throttling under load. Kubernetes may evict Burstable or BestEffort pods first during pressure. Also, CPU throttling can cause performance issues that are hard to debug.
Result
Learner realizes the trade-offs of overcommitting resources.
Understanding overcommit risks helps balance efficiency and reliability in production.
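An overcommitted configuration looks harmless in isolation; the risk only appears in aggregate. A sketch of the pattern (values illustrative):

```yaml
# Requests far below limits: a node can admit many such pods,
# but if they all burst toward their limits at once, CPU gets
# throttled and Burstable pods risk eviction under memory pressure.
resources:
  requests:
    cpu: "100m"      # what the scheduler reserves
    memory: "64Mi"
  limits:
    cpu: "2"         # 20x the request
    memory: "1Gi"    # 16x the request
```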
Under the Hood
Kubernetes uses the kube-scheduler to assign pods to nodes based on resource requests. The kubelet on each node enforces limits using Linux cgroups, which control CPU time and memory usage. When a container exceeds its memory limit, the kernel's OOM killer terminates it. CPU limits cause the cgroup to throttle CPU cycles, slowing the container without killing it.
Why designed this way?
This design separates scheduling decisions (requests) from runtime enforcement (limits) to optimize resource allocation and stability. Using cgroups leverages existing Linux kernel features for resource control, avoiding reinventing complex mechanisms. This approach balances fairness, efficiency, and protection against resource abuse.
┌──────────────────┐     ┌───────────────────┐     ┌────────────────┐
│  kube-scheduler  │────▶│    Node kubelet   │────▶│  Linux cgroups │
│  (uses requests) │     │ (enforces limits) │     │ (enforce CPU & │
└──────────────────┘     └───────────────────┘     │ memory limits) │
                                                   └────────────────┘
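As a rough sketch of that translation, using cgroup v1 names and the standard 100 ms CFS period (the mapping below is an approximation, not the exact kubelet implementation):

```yaml
# Container spec ...         ... approximate cgroup v1 translation
resources:
  requests:
    cpu: "500m"      # cpu.shares ≈ 512 (relative weight; 1024 per core)
    memory: "256Mi"  # informs scheduling and eviction; no hard cgroup cap
  limits:
    cpu: "1"         # cpu.cfs_quota_us = 100000 with cpu.cfs_period_us = 100000
    memory: "512Mi"  # memory.limit_in_bytes = 536870912 (OOM kill above this)
```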
Myth Busters - 4 Common Misconceptions
Quick: Do you think Kubernetes kills containers that exceed CPU limits immediately? Commit yes or no.
Common Belief: Containers that exceed CPU limits are killed immediately, like with memory limits.
Reality: Kubernetes throttles CPU usage when limits are exceeded, slowing the container but not killing it.
Why it matters: Believing containers are killed can lead to unnecessary debugging and misconfiguration.
Quick: Do you think setting only limits without requests affects pod scheduling? Commit yes or no.
Common Belief: Setting resource limits alone controls pod scheduling on nodes.
Reality: Kubernetes schedules pods based only on resource requests, ignoring limits during scheduling. (If you set only limits, Kubernetes defaults the requests to the same values, so the pod is still scheduled against that amount.)
Why it matters: Misunderstanding this causes pods to fail scheduling unexpectedly.
Quick: Do you think BestEffort pods have resource guarantees? Commit yes or no.
Common Belief: Pods without resource requests or limits still get guaranteed resources.
Reality: BestEffort pods have no resource guarantees and are the first to be evicted under pressure.
Why it matters: Assuming guarantees can cause critical workloads to be disrupted.
Quick: Do you think overcommitting resources always improves cluster efficiency? Commit yes or no.
Common Belief: Setting requests lower than actual usage always makes better use of cluster resources.
Reality: Overcommitting can cause throttling, evictions, and unpredictable performance.
Why it matters: Ignoring this leads to unstable applications and harder troubleshooting.
Expert Zone
1
Requests and limits can be set differently per container within the same pod, affecting pod QoS class and scheduling.
2
CPU requests map to cgroup CPU shares (weights), while CPU limits are enforced by CFS quota throttling, which can cause uneven CPU time slices, impacting latency-sensitive apps.
3
Memory limits are hard limits enforced by the kernel, so setting them too low can cause immediate pod crashes.
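Point 1 above in YAML form: one pod, two containers with different settings. Because the sidecar sets nothing while the app container has requests, the pod as a whole is Burstable, not Guaranteed (names and images are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mixed-qos-demo           # hypothetical name
spec:
  containers:
    - name: app                  # requests == limits for this container...
      image: my-app:latest       # placeholder image
      resources:
        requests: { cpu: "500m", memory: "256Mi" }
        limits:   { cpu: "500m", memory: "256Mi" }
    - name: sidecar              # ...but this one sets nothing, so the
      image: log-shipper:latest  # whole pod's QoS class is Burstable
```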
When NOT to use
Avoid setting resource requests and limits when running very lightweight or short-lived jobs where overhead is unnecessary. Instead, use BestEffort QoS or specialized batch scheduling. Also, in environments with strict resource isolation, consider using dedicated nodes or namespaces with quotas.
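The "namespaces with quotas" alternative can be sketched with a ResourceQuota object, which caps the total requests and limits across a namespace instead of tuning every pod (the name and namespace are illustrative):

```yaml
# Illustrative ResourceQuota: the sum of all requests/limits in the
# namespace may not exceed these totals.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota           # hypothetical name
  namespace: team-a          # hypothetical namespace
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "20Gi"
    limits.cpu: "20"
    limits.memory: "40Gi"
```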
Production Patterns
In production, teams often set requests based on average usage and limits based on peak usage to balance efficiency and stability. Monitoring tools track actual usage to adjust these values over time. QoS classes guide eviction policies during node pressure. Autoscalers rely on requests to scale clusters dynamically.
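That pattern, expressed as numbers: if monitoring showed a service averaging around 300m CPU with peaks near 900m, a common starting point might be (values are purely illustrative and should come from your own metrics):

```yaml
# requests ≈ observed average usage -> honest scheduling and autoscaling
# limits   ≈ above observed peak    -> headroom without runaway consumption
resources:
  requests:
    cpu: "300m"      # average observed usage
    memory: "256Mi"
  limits:
    cpu: "1"         # comfortably above the ~900m peak
    memory: "512Mi"
```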
Connections
Operating System Resource Management
Builds-on
Understanding Linux cgroups and kernel OOM killer helps grasp how Kubernetes enforces resource limits at the system level.
Cloud Cost Optimization
Builds-on
Proper resource requests and limits prevent overprovisioning, directly reducing cloud infrastructure costs.
Project Management Resource Allocation
Analogy in resource planning
Just like allocating team members to tasks with minimum and maximum hours, Kubernetes allocates CPU and memory to containers to ensure smooth operation.
Common Pitfalls
#1 Not setting resource requests explicitly leads to unintended scheduling behavior.
Wrong approach:
resources:
  limits:
    cpu: "1"
    memory: "512Mi"
Correct approach:
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"
Root cause: Learners often think setting limits alone is enough. In fact, when only limits are set, Kubernetes defaults the requests to the limit values, which can silently over-reserve resources; explicit requests keep scheduling predictable.
#2 Setting requests higher than limits causes pod creation errors.
Wrong approach:
resources:
  requests:
    cpu: "2"
    memory: "1Gi"
  limits:
    cpu: "1"
    memory: "512Mi"
Correct approach:
resources:
  requests:
    cpu: "500m"
    memory: "256Mi"
  limits:
    cpu: "1"
    memory: "512Mi"
Root cause: Requests must never exceed limits; misunderstanding this leads to invalid pod specs.
#3 Assuming CPU limits kill containers like memory limits do.
Wrong approach: Expecting the container to restart immediately when CPU usage spikes above its limit.
Correct approach: Understand that CPU limits throttle CPU usage without killing the container.
Root cause: Confusing CPU throttling with memory OOM killing leads to the wrong troubleshooting steps.
Key Takeaways
Resource requests guarantee the minimum CPU and memory a container needs to run reliably.
Resource limits cap the maximum CPU and memory a container can use to protect other workloads.
Kubernetes schedules pods based on requests, not limits, so requests affect pod placement.
Exceeding memory limits kills containers, but exceeding CPU limits throttles them without killing.
Properly setting requests and limits improves cluster stability, performance, and cost efficiency.