A Kubernetes disk pressure error occurs when a node runs critically low on disk space or inodes, prompting the kubelet to evict pods and halt new scheduling. With 93% of organizations now running Kubernetes in production, disk and resource management has become a critical reliability concern. When a node hits DiskPressure, workloads are rescheduled, deployments fail, and performance across the cluster can quickly degrade.
CubeAPM helps teams detect and prevent Kubernetes disk pressure errors before they impact workloads by continuously tracking filesystem metrics. It correlates these with eviction events, highlights pods or images consuming excessive space, and triggers anomaly alerts — ensuring storage health, predictable performance, and full visibility across Kubernetes clusters.
In this guide, we’ll explain what the Kubernetes DiskPressure error means, why it happens, how to fix it, and how to monitor and alert on it effectively using CubeAPM.
What is Kubernetes Disk Pressure Error
The DiskPressure condition in Kubernetes signals that a node is running out of available disk resources. It’s one of the core node conditions (MemoryPressure, PIDPressure, DiskPressure, NetworkUnavailable, Ready) that the kubelet reports to the control plane to describe node health. When disk usage crosses predefined eviction thresholds — typically set under evictionHard or evictionSoft in the kubelet configuration — the kubelet marks the node as DiskPressure=True.
This alert means the kubelet can no longer guarantee storage for running containers or system processes. Kubernetes reacts by evicting low-priority pods, pausing new scheduling on the node, and freeing disk space to recover stability. While this helps protect the node, it often disrupts workloads unexpectedly — especially when the underlying cause is image bloat, uncollected logs, or persistent volume growth that goes unnoticed.
In short, the DiskPressure error is Kubernetes’ self-protection mechanism: it prevents total node failure by evicting pods when disk space becomes dangerously low, but it can also create cascading issues if teams lack visibility into what’s consuming the storage.
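For reference, these thresholds live in the kubelet configuration file (usually /var/lib/kubelet/config.yaml). A minimal sketch is shown below; the hard values mirror the kubelet's disk-related defaults, while evictionSoft has no default and must be paired with a grace period. Tune all of these to your node sizes.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  nodefs.available: "10%"      # default: evict when node filesystem has <10% free
  imagefs.available: "15%"     # default: evict when image filesystem has <15% free
  nodefs.inodesFree: "5%"      # default: evict when <5% of inodes remain
evictionSoft:
  nodefs.available: "15%"      # illustrative soft threshold
evictionSoftGracePeriod:
  nodefs.available: "2m"       # soft thresholds require a grace period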
Key Characteristics of the Kubernetes DiskPressure Error:
- Triggered when nodefs or imagefs crosses eviction thresholds
- Node status changes to DiskPressure=True under kubectl describe node (see the quick checks after this list)
- The kubelet evicts low-priority pods and pauses new scheduling
- Often caused by log bloat, orphaned images, or temporary volume overflow
- Can affect StatefulSets, CI/CD jobs, and nodes running high I/O workloads
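To confirm the condition on a specific node, the quick checks below list the node's reported conditions and any recent eviction events (replace <node-name> with the affected node).
# Node conditions, including DiskPressure
kubectl describe node <node-name> | grep -A 7 Conditions
# Recent evictions across all namespaces
kubectl get events -A --field-selector reason=Evicted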
Why Kubernetes Disk Pressure Error Happens
Kubernetes disk pressure error typically appears when the kubelet detects insufficient free space on either the node filesystem (nodefs) or the container image filesystem (imagefs). Below are the most common causes behind it.
1. Excessive Container Logs
Large application logs stored under /var/lib/docker/containers can quickly consume node storage, especially when log rotation isn’t configured. This is one of the leading triggers for DiskPressure alerts in long-running workloads.
Quick check:
du -sh /var/lib/docker/containers/* | sort -h
If you see one or more log directories exceeding several gigabytes, enable log rotation or redirect logs to a centralized backend.
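On CRI runtimes such as containerd, the kubelet can rotate container logs itself. A minimal sketch of the relevant KubeletConfiguration fields follows (the sizes are illustrative, not recommendations); restart the kubelet after editing. Docker-managed nodes configure rotation in /etc/docker/daemon.json instead, covered in the fix section later in this guide.
# /var/lib/kubelet/config.yaml (fragment)
containerLogMaxSize: 100Mi    # rotate each container log once it reaches 100 MiB
containerLogMaxFiles: 3       # keep at most 3 rotated files per container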
2. Unused or Orphaned Images
Old images left on the node after deployments, failed pulls, or rollbacks accumulate over time. The kubelet might not automatically clean them if disk thresholds aren’t reached yet, leading to gradual storage exhaustion.
Quick check:
crictl images | grep <repository>
If you see a large list of outdated or untagged images, run crictl rmi --prune (or docker system prune -a on Docker nodes) to reclaim space safely.
3. Temporary Volume Growth (emptyDir and Cache Directories)
Temporary directories used by emptyDir volumes or app-level caching (e.g., npm, Maven, or build artifacts) can silently expand until they consume all available disk space.
Quick check:
kubectl describe pod <pod-name> | grep emptyDir -A5
If you see large emptyDir volumes or temporary mounts without size limits, adjust manifests or move cache data to external storage.
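As a sketch of the manifest change, the fragment below caps an emptyDir volume; the volume name cache and the 500Mi limit are illustrative. When usage exceeds the limit, the kubelet evicts the pod instead of letting it exhaust the node.
# Pod spec fragment
volumes:
  - name: cache
    emptyDir:
      sizeLimit: 500Mi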
4. Containerd or Docker Overlay Data Expansion
OverlayFS directories under /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs or /var/lib/docker/overlay2 may grow excessively due to incomplete cleanup of layers or build cache.
Quick check:
du -sh /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/* | sort -h
If you see subdirectories consuming tens of gigabytes, clear unused layers and restart the container runtime to trigger cleanup.
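Before deleting anything under the snapshotter path by hand, it is usually safer to let the runtime report and reclaim its own space. A sketch for containerd nodes with crictl v1.23+:
# Show image filesystem usage from the runtime's point of view
crictl imagefsinfo
# Remove images not referenced by any running container
crictl rmi --prune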
5. Persistent Volume Mismanagement
Large or unbounded PersistentVolumeClaims can cause unexpected disk consumption on the node hosting the volume, especially when dynamic provisioning is used without limits.
Quick check:
kubectl get pvc -A -o wide
If you see PVCs bound to the affected node with high capacity or unbounded requests, set storage quotas or use dedicated storage classes.
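One way to keep PVC growth bounded is a per-namespace ResourceQuota. The sketch below is illustrative only; the namespace and limits are assumptions, not recommendations.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-a              # hypothetical namespace
spec:
  hard:
    requests.storage: 200Gi          # total requested PVC capacity in the namespace
    persistentvolumeclaims: "10"     # maximum number of PVCs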
6. Image Cache Bloat and Registry Mirroring
Nodes that pull images frequently from private registries or mirror large layers without garbage collection often hit DiskPressure thresholds faster.
Quick check:
crictl imagefsinfo
If the image filesystem is close to full, or crictl images shows a long list of cached images, enable automatic pruning (kubelet image garbage collection) or use smaller base images to reduce the layer footprint.
7. Eviction Thresholds Set Too Low
In some clusters, kubelet eviction settings under /var/lib/kubelet/config.yaml are configured too aggressively (e.g., evictionHard: {"nodefs.available": "10%"}). This triggers premature DiskPressure even with sufficient usable space.
Quick check:
cat /var/lib/kubelet/config.yaml | grep eviction
If you see strict thresholds (10% or higher), adjust them to more relaxed values (e.g., 5% for nodefs.available and 3% for imagefs.available), keeping in mind that lower thresholds leave less headroom before the node truly runs out of disk.
How to Fix Kubernetes Disk Pressure Error
Fixing DiskPressure requires cleaning up unused data, reconfiguring eviction thresholds, and optimizing container storage. Follow these steps to resolve it efficiently.
1. Clear Excessive Container Logs
Container logs in /var/lib/docker/containers often grow unchecked when rotation isn’t configured.
Quick check:
du -sh /var/lib/docker/containers/* | sort -h
Fix: Enable log rotation and truncate oversized logs to free up disk space immediately.
find /var/lib/docker/containers/ -name "*.log" -type f -size +500M -exec truncate -s 0 {} \;
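For a permanent fix on Docker-based nodes, log rotation can be enabled in /etc/docker/daemon.json. The one-liner below is a sketch: it overwrites the file (merge by hand if it already has settings), the sizes are illustrative, and the change only affects containers created after the restart.
echo '{"log-driver":"json-file","log-opts":{"max-size":"100m","max-file":"3"}}' | sudo tee /etc/docker/daemon.json && sudo systemctl restart docker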
2. Remove Unused or Orphaned Images
Old or failed images can consume large portions of the node’s filesystem.
Quick check:
crictl images | grep <repository>
Fix: Prune unused images safely using the container runtime.
crictl rmi --prune
3. Clean Temporary Volumes and EmptyDir Data
Temporary emptyDir volumes or caches can silently fill node storage.
Quick check:
kubectl describe pod <pod-name> | grep emptyDir -A5
Fix: Apply a sizeLimit to temporary volumes in the workload manifest and redeploy. Pod volumes cannot be patched in place, so make the change on the owning controller (the Deployment name and cache volume below are illustrative).
kubectl patch deployment <deployment-name> -p '{"spec":{"template":{"spec":{"volumes":[{"name":"cache","emptyDir":{"sizeLimit":"500Mi"}}]}}}}'
4. Reclaim OverlayFS and Containerd Cache
Overlay and snapshot directories in containerd often retain unused layers.
Quick check:
du -sh /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/* | sort -h
Fix: Prune unused images through the runtime first (crictl rmi --prune). As a last resort on a drained node, stop containerd and wipe the snapshotter directory; note that this deletes every image and container layer on the node, so everything must be re-pulled.
sudo systemctl stop containerd && sudo rm -rf /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/* && sudo systemctl start containerd
5. Adjust Eviction Thresholds
Aggressive eviction thresholds may trigger DiskPressure early.
Quick check:
cat /var/lib/kubelet/config.yaml | grep eviction
Fix: Relax the kubelet eviction thresholds for the node and image filesystems in /var/lib/kubelet/config.yaml, then restart the kubelet. The one-liner below assumes evictionHard is defined on a single line; if it spans multiple lines, edit the file by hand instead.
sudo sed -i '/evictionHard/d' /var/lib/kubelet/config.yaml && echo 'evictionHard: {"nodefs.available": "5%", "imagefs.available": "3%"}' | sudo tee -a /var/lib/kubelet/config.yaml && sudo systemctl restart kubelet
6. Enable Automatic Image Garbage Collection
Without garbage collection, nodes accumulate unused image layers indefinitely.
Quick check:
grep -i imagegc /var/lib/kubelet/config.yaml
Fix: Configure image garbage collection through the KubeletConfiguration fields (the legacy --image-gc-* kubelet flags are deprecated). If the keys already exist, edit them in place instead of appending.
printf 'imageGCHighThresholdPercent: 85\nimageGCLowThresholdPercent: 80\n' | sudo tee -a /var/lib/kubelet/config.yaml && sudo systemctl restart kubelet
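For reference, the resulting fragment of /var/lib/kubelet/config.yaml should look like the sketch below. The 85/80 values match the kubelet defaults; imageMinimumGCAge (default 2m) keeps very recently pulled images from being pruned.
imageGCHighThresholdPercent: 85   # start pruning when imagefs usage exceeds 85%
imageGCLowThresholdPercent: 80    # prune down to 80% usage
imageMinimumGCAge: 2m             # never prune images younger than this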
7. Move Container Storage to a Dedicated Disk
Nodes with small root partitions run out of space faster.
Quick check:
df -h | grep -E 'var/lib/(docker|containerd)'
Fix: Mount a larger or dedicated disk for container storage. Mounting over a populated directory hides the existing data rather than freeing it, so drain the node and migrate the contents first (a rough sketch follows the command below).
sudo mount /dev/sdb1 /var/lib/docker
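The migration sketch below assumes containerd, a blank /dev/sdb1, and a node that has already been drained; adapt the paths for Docker (/var/lib/docker).
sudo systemctl stop kubelet containerd
sudo mkfs.ext4 /dev/sdb1                               # destroys anything already on /dev/sdb1
sudo mount /dev/sdb1 /mnt
sudo rsync -aHAX /var/lib/containerd/ /mnt/            # copy existing images and snapshots
sudo mv /var/lib/containerd /var/lib/containerd.old    # keep until the node is verified
sudo mkdir /var/lib/containerd
echo '/dev/sdb1 /var/lib/containerd ext4 defaults 0 2' | sudo tee -a /etc/fstab
sudo umount /mnt && sudo mount /var/lib/containerd
sudo systemctl start containerd kubelet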
Monitoring Kubernetes Disk Pressure Error with CubeAPM
The fastest way to troubleshoot DiskPressure is by correlating node metrics, kubelet events, and filesystem logs. CubeAPM brings these together through its four unified signal streams — Metrics, Events, Logs, and Rollouts — to pinpoint which nodes or workloads are consuming excessive disk space. By continuously tracking filesystem usage (nodefs, imagefs) and kubelet eviction signals, CubeAPM helps teams detect early disk saturation before it triggers pod evictions.
Step 1 — Install CubeAPM (Helm)
Install CubeAPM in your cluster using Helm. This deploys dashboards, pipelines, and alert templates for Kubernetes node and storage metrics.
helm install cubeapm https://charts.cubeapm.com/cubeapm-latest.tgz --namespace cubeapm --create-namespace
Upgrade later with:
helm upgrade cubeapm https://charts.cubeapm.com/cubeapm-latest.tgz -n cubeapm
Configure custom tokens or endpoints in values.yaml if using BYOC or on-premise mode.
Step 2 — Deploy the OpenTelemetry Collector (DaemonSet + Deployment)
Deploy the collector in two modes for full coverage:
DaemonSet: Gathers per-node filesystem metrics and kubelet events.
helm install cube-otel-ds https://charts.cubeapm.com/otel-collector-ds.tgz -n cubeapm
Deployment: Acts as the central telemetry pipeline to CubeAPM.
helm install cube-otel-deploy https://charts.cubeapm.com/otel-collector-deploy.tgz -n cubeapm
Step 3 — Collector Configs Focused on DiskPressure
DaemonSet Config (Node Metrics + Disk Stats):
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'kubelet-node'
          static_configs:
            - targets: ['localhost:10255']
processors:
  batch:
exporters:
  otlp:
    endpoint: cubeapm:4317
service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [otlp]
- Prometheus receiver: Scrapes kubelet metrics for nodefs and imagefs usage.
- Batch processor: Optimizes transmission of large metric sets.
- OTLP exporter: Sends data directly to CubeAPM’s ingestion endpoint.
Deployment Config (Events + Logs):
receivers:
  kubeletstats:
    collection_interval: 60s
  filelog:
    include: [/var/log/kubelet.log, /var/log/syslog]
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 500
  batch:
exporters:
  otlp:
    endpoint: cubeapm:4317
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [memory_limiter, batch]
      exporters: [otlp]
    metrics:
      receivers: [kubeletstats]
      processors: [batch]
      exporters: [otlp]
- kubeletstats receiver: Collects node and pod filesystem usage from the kubelet (the raw signal behind conditions like DiskPressure=True).
- filelog receiver: Streams kubelet logs containing eviction and disk usage events.
- memory_limiter: Prevents overload during event spikes.
Step 4 — Supporting Components (Optional)
Deploy kube-state-metrics for richer visibility into pod and PVC states.
helm install kube-state-metrics https://charts.cubeapm.com/kube-state-metrics.tgz -n cubeapm
Step 5 — Verification Checklist
Before going live, validate that CubeAPM is ingesting all signals correctly:
- Events: Eviction warnings such as “Pod evicted due to DiskPressure” appear in the Events view.
- Metrics: node_filesystem_avail_bytes and imagefs_available_bytes show node-level trends.
- Logs: Kubelet log entries confirm eviction or cleanup events.
- Restarts: Pods redeploy automatically when disk pressure resolves.
- Rollouts: Deployment view highlights which workload triggered the condition.
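A quick cluster-side sanity check (the namespace matches the Helm installs above; the jsonpath expression simply prints each node's DiskPressure status):
# Collector and CubeAPM pods should be Running
kubectl -n cubeapm get pods
# Every node should report DiskPressure=False once cleanup succeeds
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'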
Example Alert Rules for Kubernetes DiskPressure Error
You can use these alert rules to proactively detect nodes nearing disk exhaustion and automatically trigger alerts before the kubelet evicts pods. These PromQL rules integrate directly into CubeAPM’s alert manager or Prometheus-compatible pipelines.
1. Node Disk Usage Above Threshold
This alert triggers when a node’s filesystem usage exceeds 85%, indicating imminent DiskPressure risk.
- alert: NodeDiskUsageHigh
  expr: (1 - (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"})) * 100 > 85
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "High Disk Usage on Node"
    description: "Node {{ $labels.instance }} is using more than 85% of its disk space."
2. Node Condition: DiskPressure True
This alert fires when the kubelet explicitly reports a node in DiskPressure=True state.
- alert: NodeDiskPressureDetected
  expr: kube_node_status_condition{condition="DiskPressure",status="true"} == 1
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "Kubernetes Node Under Disk Pressure"
    description: "Node {{ $labels.node }} is reporting DiskPressure=True. Evictions or scheduling failures may occur."
3. Low ImageFS Space (Container Cache Saturation)
This alert identifies when container image storage (imagefs) is nearly full, which often precedes DiskPressure.
- alert: NodeImageFSLow
  expr: (1 - (node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"})) * 100 > 90
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Low ImageFS Space Detected"
    description: "Image filesystem on node {{ $labels.instance }} is above 90% capacity."
These rules help CubeAPM’s alert engine trigger early warnings through Slack, Teams, or WhatsApp integrations — giving operators time to prune images, rotate logs, or reschedule workloads before node DiskPressure disrupts deployments.
Conclusion
DiskPressure is one of the most disruptive node-level conditions in Kubernetes, often leading to cascading pod evictions and unpredictable outages. Without proactive monitoring, it can silently degrade performance and stall deployments across the cluster.
By correlating node metrics, kubelet events, and container logs, CubeAPM helps teams detect DiskPressure before it causes service impact. Its OpenTelemetry-native pipelines continuously track storage saturation, log growth, and eviction trends across all nodes and workloads.
Start monitoring your Kubernetes clusters with CubeAPM today — gain real-time visibility into disk health, prevent evictions, and maintain peak reliability at predictable cost.