
Kubernetes Node Out of Resources Error Explained: Node Pressure, Pod Evictions & Resource Exhaustion Monitoring with CubeAPM

Published: November 1, 2025 | Kubernetes Errors

The Kubernetes “Node Out of Resources” error occurs when a node runs out of CPU, memory, or storage, blocking new pod scheduling or evicting running workloads. With 91% of organizations using containers in production, this issue poses a major stability risk — leading to Pending pods, evictions, and NotReady nodes that can trigger outages and failed deployments.

CubeAPM pinpoints these failures in real time by tracking node pressure metrics, eviction events, and kubelet logs across clusters. It correlates CPU, memory, and disk usage with pod restarts and deployment changes, helping teams identify which workloads over-consume resources and trigger node exhaustion before outages occur.

In this guide, we’ll define the error, explore its root causes, show how to fix it, integrate CubeAPM for monitoring, and provide alerting best practices.

What is Kubernetes Node Out of Resources Error


The Node Out of Resources error in Kubernetes occurs when a node exceeds its available CPU, memory, or ephemeral storage. When this happens, the kubelet marks the node as NotReady or Under Pressure, and Kubernetes either evicts pods or blocks new scheduling on that node.

This state is triggered by Kubernetes’ resource pressure detection mechanism. The kubelet constantly tracks node utilization and raises conditions such as MemoryPressure, DiskPressure, or PIDPressure when thresholds are breached. These safety measures help prevent node crashes but can disrupt workloads if resource requests and limits are poorly configured.

You’ll usually see this surfaced through node conditions and events such as:
NodeHasInsufficientMemory or NodeHasDiskPressure, along with pods rejected with reasons like OutOfmemory or OutOfcpu.

Key Characteristics

  • Node state changes: Node transitions to NotReady or SchedulingDisabled.
  • Eviction signals: Pods get terminated or rescheduled under MemoryPressure or DiskPressure.
  • Throttled workloads: CPU throttling and memory pressure increase latency and error rates.
  • Scheduling failures: New pods remain Pending due to unavailable resources.
  • Kubelet logs: Show repeated eviction events and pressure conditions.

Why Kubernetes Node Out of Resources Error Happens

When a node reports “Out of Resources,” it usually means one or more resource pools — CPU, memory, or storage — have reached exhaustion. Below are the most common, Kubernetes-specific reasons this happens.

1. Overcommitted CPU or Memory Requests

When the pods on a node collectively consume more CPU or memory than it can physically provide, the node becomes overcommitted: the scheduler places pods based on their requests, but actual usage (up to their limits, or unbounded if none are set) can climb well past that. Overcommitted nodes cause throttling and higher latency, and may eventually be marked NotReady under MemoryPressure.

Quick check:

Compare the node’s “Allocatable” capacity with its “Allocated resources”; requests approaching 100% of allocatable, or limits well beyond it, indicate overcommitment.

Bash
kubectl describe node <node-name>

2. Memory Leaks in Long-Running Pods

Pods that slowly consume more memory over time (due to inefficient code or caching) can drain node memory. The kubelet then evicts lower-priority pods to reclaim space, resulting in cascading failures across workloads.

Quick check:

Bash
kubectl top pod --sort-by=memory


Identify pods with steady, unbounded memory growth.
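
To confirm a leak rather than a one-off spike, it helps to sample usage over time. A minimal sketch (the interval and pod count are arbitrary choices):

Bash
# Print the top 5 memory consumers every 5 minutes; steady growth across samples suggests a leak
while true; do
  date
  kubectl top pod -A --sort-by=memory | head -n 6
  sleep 300
done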

3. Ephemeral Storage Exhaustion

Pod logs, pulled images, and writable container layers all consume a node’s ephemeral storage. When /var/lib/kubelet (or the image filesystem) fills up, Kubernetes raises the DiskPressure condition and starts evicting pods.

Quick check:

Bash
kubectl describe node <node-name> | grep DiskPressure


If true, check df -h on that node to confirm low disk space.
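
A preventive measure is to give pods explicit ephemeral-storage requests and limits so a single pod cannot fill the node’s disk (values below are illustrative):

YAML
resources:
  requests:
    ephemeral-storage: "1Gi"
  limits:
    ephemeral-storage: "2Gi"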

4. High Pod Density or Bursty Workloads

Running too many pods per node or hosting workloads with unpredictable spikes (e.g., autoscalers or cron jobs) can lead to short-lived resource depletion. This often results in CPU throttling and pods restarting under pressure.

Quick check:

Bash
kubectl get pods -o wide --field-selector spec.nodeName=<node-name>


Count pods exceeding normal density for your node type.
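
Pod density is ultimately capped by the kubelet’s maxPods setting (110 by default). If you manage kubelet configuration directly, lowering it is one way to enforce a hard ceiling:

YAML
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 60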

5. Insufficient Node Autoscaling or Quota Configuration

If Cluster Autoscaler or resource quotas are misconfigured, nodes can’t scale out fast enough to meet demand. Kubernetes continues scheduling workloads on already saturated nodes, triggering OutOfResource events.

Quick check:

Verify autoscaling settings in:

Bash
kubectl get configmap cluster-autoscaler-status -n kube-system

How to Fix Kubernetes Node Out of Resources Error

Fixing this issue involves freeing up node capacity, optimizing resource allocation, and tightening autoscaling policies. Below are the most effective ways to stabilize your cluster.

1. Identify Resource-Hungry Pods

Start by pinpointing pods consuming excessive CPU or memory. High resource utilization by a few workloads can starve other pods, cause heavy CPU throttling, and push the node into MemoryPressure.

Check:

Bash
kubectl top pod --sort-by=memory

If a few pods dominate usage, review their requests and limits.

Fix:
Adjust the resources.requests and resources.limits in their PodSpec to match realistic usage patterns.
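
For example, a container spec with requests sized to typical usage and limits acting as a ceiling (numbers are placeholders to derive from observed metrics):

YAML
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"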

2. Clean Up Ephemeral Storage

Old container logs, unused images, and temp files can fill /var/lib/docker or /var/lib/kubelet, causing DiskPressure.

Check:

Bash
kubectl describe node <node-name> | grep DiskPressure

Fix:

Drain the node, then clean up and restart the kubelet on the node itself:

Bash
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
sudo systemctl restart kubelet   # run this on the node, not via kubectl

You can also prune unused images (on nodes using the Docker runtime) with:

Bash
docker system prune -af
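
On nodes running containerd (the default runtime in most current clusters), a comparable image cleanup can be done with crictl, assuming it is installed on the node:

Bash
sudo crictl rmi --prune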

3. Reduce Pod Density per Node

Excess pods overload node CPU and memory, causing throttling and scheduling delays.

Check:

Bash
kubectl get pods -A -o wide --field-selector spec.nodeName=<node-name> --no-headers | wc -l

Fix:

Use topologySpreadConstraints or node taints to balance pods across nodes, for example:

YAML
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: <your-app-label>

4. Enable or Tune Cluster Autoscaler

If nodes are constantly maxed out, autoscaling may be disabled or misconfigured.

Check:

Bash
kubectl get configmap cluster-autoscaler-status -n kube-system -o yaml

Fix:

Update the minimum and maximum node group sizes to allow scaling during high load:

Bash
kubectl edit deployment cluster-autoscaler -n kube-system
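
Node group bounds are typically passed to the autoscaler as --nodes=<min>:<max>:<group-name> arguments; the provider and group name below are placeholders:

YAML
# Excerpt from the cluster-autoscaler container spec
command:
  - ./cluster-autoscaler
  - --cloud-provider=aws          # example provider
  - --nodes=2:10:my-node-group    # min:max:node-group-name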

5. Implement Pod Priority and QoS

Low-priority pods can crowd out critical workloads during resource shortages.

Fix:

Assign priorities in PodSpecs so essential services preempt less important pods:

YAML
priorityClassName: system-cluster-critical

This ensures critical workloads stay active even under pressure.
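
Note that system-cluster-critical and system-node-critical are intended for cluster components, so application workloads are usually given a dedicated PriorityClass instead (name and value below are placeholders):

YAML
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: app-critical
value: 1000000
globalDefault: false
description: "Priority class for business-critical application workloads."

Pods then reference it with priorityClassName: app-critical in their spec.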

6. Monitor Node Conditions Continuously

Proactive monitoring prevents outages. Track metrics like CPU saturation, memory pressure, and eviction counts.

Check:

Bash
kubectl describe node <node-name> | grep Pressure

If any pressure condition is true, it’s time to scale or rebalance workloads.
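
To check every node at once, the same conditions can be pulled via custom columns (a convenience sketch; True in any pressure column needs attention):

Bash
kubectl get nodes -o custom-columns='NODE:.metadata.name,MEMORY:.status.conditions[?(@.type=="MemoryPressure")].status,DISK:.status.conditions[?(@.type=="DiskPressure")].status,PID:.status.conditions[?(@.type=="PIDPressure")].status'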

Monitoring Kubernetes Node Out of Resources Error with CubeAPM

When a node hits CPU, memory, or disk limits, you need full visibility into which workloads triggered it, when it began, and what impact it caused. CubeAPM gives you the fastest path to that root cause by correlating four telemetry signals — Events, Metrics, Logs, and Rollouts — across your entire Kubernetes environment. It automatically detects pressure states (MemoryPressure, DiskPressure, PIDPressure), correlates them with pod evictions, and helps you trace the resource surge back to specific deployments.

Step 1 — Install CubeAPM (Helm)

Use Helm to deploy CubeAPM in your cluster.

Bash
helm install cubeapm cubeapm/cubeapm --namespace cubeapm --create-namespace

For upgrades:

Bash
helm upgrade cubeapm cubeapm/cubeapm --namespace cubeapm

If you need custom settings, modify values.yaml to include your OpenTelemetry and log exporter configs before installation.
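
For example, using standard Helm options with a custom values file (the file name is just a convention):

Bash
helm upgrade --install cubeapm cubeapm/cubeapm \
  --namespace cubeapm --create-namespace \
  -f values.yaml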

Step 2 — Deploy the OpenTelemetry Collector (DaemonSet + Deployment)

CubeAPM uses two collector modes:

DaemonSet: Collects node-level and kubelet metrics from every node.

Bash
helm install otel-agent open-telemetry/opentelemetry-collector --set mode=daemonset

Deployment: Handles trace, event, and log pipelines centrally.

Bash
helm install otel-collector open-telemetry/opentelemetry-collector --set mode=deployment
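
Both installs assume the OpenTelemetry Helm chart repository is already configured:

Bash
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update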

This ensures complete data flow from nodes, pods, and namespaces into CubeAPM’s backend.

Step 3 — Collector Configs Focused on Node Out of Resources

Below are minimal configuration snippets for both collectors.

DaemonSet config (otel-agent-config.yaml):

YAML
receivers:
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount
    metrics:
      k8s.node.cpu.utilization:
        enabled: true
      k8s.node.memory.working_set:
        enabled: true
      k8s.node.filesystem.usage:
        enabled: true

processors:
  batch:

exporters:
  otlp:
    endpoint: cubeapm:4317

service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      processors: [batch]
      exporters: [otlp]

  • kubeletstats: Captures per-node CPU, memory, and disk usage from the kubelet.
  • batch: Groups telemetry for optimized export.
  • otlp exporter: Sends node metrics to CubeAPM in real time.
  • service.pipelines: Wires the receiver, processor, and exporter together so node metrics actually flow.

Deployment config (otel-collector-config.yaml):

YAML
receivers:
  k8s_events:
  filelog:
    include: [ /var/log/kubelet.log ]

processors:
  attributes:
    actions:
      - key: k8s.node.name
        from_attribute: host.name
        action: insert

exporters:
  otlp:
    endpoint: cubeapm:4317

service:
  pipelines:
    logs:
      receivers: [k8s_events, filelog]
      processors: [attributes]
      exporters: [otlp]

  • k8s_events: Captures node pressure and eviction events as log records.
  • filelog: Streams kubelet and container runtime logs (the log path varies by distribution).
  • attributes: Adds node-level metadata to logs and events.
  • service.pipelines: Routes events and kubelet logs through the attributes processor to CubeAPM.

Step 4 — Supporting Components

To enrich node telemetry, deploy kube-state-metrics:

Bash
helm install kube-state-metrics prometheus-community/kube-state-metrics

This provides real-time resource condition metrics like kube_node_status_condition and kube_pod_container_resource_limits.
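
For example, kube_node_status_condition makes it straightforward to query nodes that are currently under pressure (a PromQL sketch):

PromQL
# Nodes currently reporting MemoryPressure, DiskPressure, or PIDPressure
kube_node_status_condition{condition=~"MemoryPressure|DiskPressure|PIDPressure", status="true"} == 1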

Step 5 — Verification (What You Should See in CubeAPM)

After successful setup, you should see:

  • Events: Node eviction and pressure events (MemoryPressure, DiskPressure).
  • Metrics: CPU, memory, and disk utilization visualized per node.
  • Logs: Kubelet warnings such as “evicting pods due to disk pressure.”
  • Restarts: Sudden spike in pod restarts correlated with node pressure events.
  • Rollouts: Deployment timeline showing which workload triggered exhaustion.

These correlated views allow you to see the full sequence — from node overload to eviction — in one dashboard, making CubeAPM ideal for diagnosing and preventing Out of Resources incidents.

Example Alert Rules for Node Out of Resources Error

Proactive alerting helps identify resource saturation long before Kubernetes starts evicting pods or marking nodes as NotReady. With CubeAPM, you can define these PromQL-based rules in your alerts dashboard and route them to Slack, Teams, or PagerDuty for real-time action. Each alert below targets a specific pressure signal — memory, disk, CPU, or eviction rate — commonly seen during node exhaustion. The memory, disk, and CPU rules use standard node-level metrics (node_memory_*, node_filesystem_*, node_cpu_seconds_total), so they assume node-exporter-style host metrics are being collected alongside the kubelet data.

1. Node Memory Pressure Alert

This alert fires when a node’s memory usage exceeds 90% of total allocatable memory for more than five minutes. Sustained high memory usage often leads to MemoryPressure, triggering evictions or throttling. Detecting it early helps teams rebalance pods or scale out nodes before workloads are terminated.

YAML
- alert: NodeMemoryPressure
  expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.9
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High memory usage on {{ $labels.instance }}"
    description: "Node memory usage above 90% — potential MemoryPressure condition."

2. Disk Pressure Alert

This rule monitors ephemeral storage utilization across nodes and triggers when disk usage stays above 85% for ten minutes. Disk saturation is one of the most frequent causes of DiskPressure, which can force Kubernetes to evict pods and delay deployments. By alerting early, CubeAPM helps you clean up unused images, logs, and containers before space runs out.

YAML
- alert: NodeDiskPressure
  expr: (node_filesystem_size_bytes - node_filesystem_free_bytes) / node_filesystem_size_bytes > 0.85
  for: 10m
  labels:
    severity: critical
  annotations:
    summary: "High disk usage on {{ $labels.instance }}"
    description: "Disk utilization above 85% — may trigger pod evictions."

3. CPU Saturation Alert

This alert fires when the average CPU utilization across a node remains above 90% for more than ten minutes. Prolonged CPU saturation often causes latency spikes, pod throttling, and failed scheduling attempts. With CubeAPM’s correlated metrics view, you can trace which deployments or workloads are consuming excessive CPU before performance degradation spreads.

YAML
- alert: NodeCPUSaturation
  expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.9
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "High CPU utilization on {{ $labels.instance }}"
    description: "Node CPU usage above 90% — may lead to throttling and NotReady states."

4. Pod Eviction Rate Alert

This alert tracks the frequency of pod evictions across nodes — a strong indicator of resource pressure or imbalance. When evictions exceed normal operational thresholds, it signals that one or more nodes are out of capacity and need immediate attention. The rule below uses the kubelet’s kubelet_evictions counter; adjust the metric name if your pipeline exposes eviction counts differently.

YAML
- alert: HighPodEvictionRate
  expr: sum(rate(kubelet_evictions[5m])) > 2
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "High pod eviction rate detected"
    description: "Pods are being evicted frequently due to node resource exhaustion."

Conclusion

The Kubernetes Node Out of Resources error is one of the most disruptive cluster issues, often caused by poor resource planning, overcommitment, or misconfigured autoscaling. When left unchecked, it can lead to widespread pod evictions, NotReady nodes, and application downtime that directly impact reliability and SLAs.

Traditional monitoring tools only show raw metrics but miss the relationships between node pressure, pod evictions, and deployment events. CubeAPM solves this by correlating metrics, logs, events, and rollout data to pinpoint which workloads triggered node exhaustion and when it began. This end-to-end visibility helps teams act before the cluster becomes unstable.

With real-time dashboards, OpenTelemetry-native collection, and smart alerting, CubeAPM empowers DevOps teams to detect resource bottlenecks early, optimize scheduling, and maintain high uptime across all Kubernetes nodes.
