Kubernetes Exit Code 139 error (Segmentation Fault) occurs when a container process attempts to access memory it isn't allowed to, forcing the operating system to kill it. These runtime crashes don't just restart containers; they ripple through clusters, stalling deployments, disrupting microservices, and delaying rollouts. The business impact is massive: unplanned IT downtime now costs an average of $14,056 per minute, with large enterprises losing up to $23,750 per minute.
CubeAPM is built for Kubernetes reliability. It captures Kubernetes Events like container exits, correlates them with pod- and node-level memory metrics, ingests logs showing segmentation faults, and ties everything back to rollout histories and scaling actions. This gives SREs and developers a single timeline of why Pods crash with Exit Code 139, dramatically cutting down the time to root cause.
In this guide, we’ll explain what Exit Code 139 means in Kubernetes, why it happens, the most effective fixes, and how CubeAPM helps you detect, monitor, and resolve segmentation faults faster in production clusters.
What is Exit Code 139 (Segfault) in Kubernetes
When a Pod reports Exit Code 139, the process inside its container crashed with a segmentation fault: it tried to read or write memory it doesn't own, so the kernel killed it with SIGSEGV (signal 11), and 128 + 11 produces exit code 139. Kubernetes records the container's termination with code 139, but it doesn't include the code-level root cause (that lives in your app/runtime logs and core dumps).
Because the kubelet follows the Pod’s restartPolicy, a segfault often triggers automatic restarts. If the underlying bug persists, the Pod flips into CrashLoopBackOff (short bursts of up → crash → backoff). During rollouts, new replicas may never become Ready, blocking deployments and cascading into dependency outages.
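A quick way to confirm you are dealing with 139 (and not, say, 137) is to read the last termination state straight from the Pod status; this only shows data once the container has restarted at least once:
# Show each container's last termination reason and exit code
kubectl get pod <pod> -o jsonpath='{range .status.containerStatuses[*]}{.name}{": "}{.lastState.terminated.reason}{" (exit "}{.lastState.terminated.exitCode}{")"}{"\n"}{end}'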
- How it typically shows up (operator view):
- Pod status shows terminated with exit code 139; restarts count increases
- kubectl logs <pod> -c <container> --previous reveals messages like “Segmentation fault (core dumped)”
- Crash loops during Deployment or StatefulSet rollouts; readiness probes fail; container_restart_count metrics spike
- Node is healthy; no memory OOM events—so the failure is application/runtime, not resource eviction
- Why it matters (cluster & release impact):
- Rollouts stall: new versions can’t stabilize, causing automated rollbacks or partial availability
- Microservice ripple effects: upstream timeouts, downstream 5xx, and queue backlogs when a critical Pod keeps dying
- Harder triage than OOMKilled (137): resource dashboards look fine; root cause lives in app logs, native libs, or ABI/arch mismatches
- SLO/SLA risk: repeated segfaults inflate error budgets and jeopardize contractual uptime commitments
Why Exit Code 139 (Segfault) in Kubernetes Pods Happens
1. Buggy memory access in the application
Segfaults are most often caused by unsafe memory operations such as invalid pointer dereferences or buffer overflows. In Kubernetes, the container exits abruptly and the Pod goes into repeated restarts.
Example: Native modules in C or Rust crashing with “Segmentation fault (core dumped)”.
Quick check:
kubectl logs <pod-name> -c <container-name> --previous | grep "Segmentation fault"
When you see this: If logs contain “Segmentation fault (core dumped),” it confirms the failure is inside the application code. Next step: reproduce locally with the same build, enable core dumps, and inspect stack traces to isolate the faulty function or module.
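Optional next step: a minimal local-reproduction sketch, assuming you can run the same build outside the cluster and have gdb installed (the core file location depends on your system's core_pattern; on systemd hosts use coredumpctl instead):
# allow core dumps in this shell, re-run the crashing binary, then inspect the stack
ulimit -c unlimited
./server --port=8080            # same build and input that crashed in the Pod (path/flags illustrative)
gdb ./server core               # then run `bt` to see the faulting frame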
2. Binary and OS/ABI mismatch
If a binary was compiled for glibc but the container base image uses musl (e.g., Alpine), it can crash immediately at startup. This shows up as Exit Code 139 even though the failure is at the compatibility layer.
Example: Running a glibc-linked Go binary inside Alpine.
Quick check:
kubectl exec <pod-name> -c <container-name> -- ldd /app/binary
When you see this: If ldd reports “not found” for critical libraries, the crash is due to missing or mismatched dependencies. Rebuild the image with the correct base or bundle the required runtime libraries.
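Optional deeper check, assuming a shell exists in the image: musl-based images (Alpine) ship a loader at /lib/ld-musl-*, so its presence is a quick tell:
kubectl exec <pod-name> -c <container-name> -- sh -c 'ls /lib/ld-musl-* 2>/dev/null && echo "musl (Alpine) base" || echo "no musl loader found; likely glibc"'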
3. Wrong CPU architecture
Deploying an image built for amd64 onto an arm64 Kubernetes node can cause instant segfaults. The Pod will continuously restart and never become Ready.
Example: An amd64-only image scheduled onto an arm64 node in a mixed-architecture cluster.
Quick check:
kubectl get node <node-name> -o jsonpath='{.status.nodeInfo.architecture}'
When you see this: If the node architecture doesn’t match the image architecture, Kubernetes cannot run the binary correctly. Use multi-arch images or rebuild for the node’s architecture.
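To see which platforms an image actually provides, Docker Buildx can inspect the manifest list (the image reference is a placeholder):
docker buildx imagetools inspect <image> | grep -i platform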
4. Native/shared library drift
Segfaults can occur if a container ships libraries compiled against different or missing .so versions. The crash happens during runtime linking, leaving little context besides Exit Code 139.
Example: A Python wheel compiled with glibc 2.35 running inside a base image with glibc 2.31.
Quick check:
kubectl exec <pod-name> -c <container-name> -- ldd /usr/lib/libexample.so | grep "not found"
When you see this: If you find unresolved symbols or “not found” libraries, the segfault stems from dependency drift. Align your build and runtime environments or use container images with stable, pinned versions.
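Optional deeper check for glibc-based images: compare the runtime image's glibc version against the version your wheel or binary was built with:
kubectl exec <pod-name> -c <container-name> -- ldd --version | head -n 1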
5. Entrypoint or command issues
A broken entrypoint script or corrupted binary can trigger a segfault before the main application logic even runs. This leaves the Pod in a CrashLoopBackOff state without meaningful logs.
Example: A shell script that calls a non-existent binary in CMD.
Quick check:
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].command}'
When you see this: If the command references a path or binary that doesn’t exist, the segfault is due to a bad entrypoint. Fix the Pod spec or Dockerfile to point to a valid, tested executable.
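If the Pod spec sets no command, the image's own ENTRYPOINT/CMD applies; with local Docker access you can inspect what the image defines (image reference is a placeholder):
docker pull <image>
docker inspect --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' <image>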
6. Init container side-effects
Init containers sometimes manipulate shared libraries, environment variables, or file permissions in ways that break the main container. For example, overwriting /usr/lib or injecting an LD_PRELOAD can cause the app to segfault immediately at startup.
Example: An init container replacing glibc with a custom build.
Quick check:
kubectl logs <pod-name> -c <init-container-name> --previous
When you see this: If init logs show library changes or permission modifications, the segfault is likely caused upstream. Review the init container steps and ensure they don’t tamper with runtime dependencies for the main container.
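A related check, assuming the injection happens via environment variables: list the main container's env names and look for LD_PRELOAD or LD_LIBRARY_PATH:
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].env[*].name}' | tr ' ' '\n' | grep -E '^LD_' || echo 'no LD_* variables set'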
7. Stack overflow from recursion or tiny stack limits
Deep recursion or excessive stack allocations can overflow the process stack, causing a segfault. Minimal container images may also set restrictive default stack sizes, making the problem appear suddenly in production.
Example: A recursive function without a termination condition crashing under production input.
Quick check:
kubectl exec <pod-name> -c <container-name> -- sh -c 'ulimit -s'
When you see this: If the reported stack size is limited (the common default is 8192 KB) and your code recurses deeply or allocates large stack frames, the crash is a stack overflow. Increase the stack limit or refactor recursive logic into iterative code.
8. Native extensions and JIT/runtime mismatches
Languages like Java, Node.js, and Python rely on native extensions or JIT compilers that can crash when versions don’t align. A Node add-on compiled for a different ABI, or a Python wheel built for the wrong interpreter, often segfaults without clear error messages.
Example: A Node.js native module crashing when loaded by a newer Node runtime.
Quick check:
kubectl logs <pod-name> -c <container-name> --previous | head -n 50
When you see this: If logs show segfaults without OOM errors, focus on runtime version mismatches. Rebuild or reinstall native modules to match the runtime shipped in your container.
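Whichever runtime is involved, the node's kernel log is a useful cross-check: the kernel records every segfault with the faulting binary and address. A hedged way to read it via a node debug pod (dmesg may be restricted on some distros; newer kubectl versions also accept --profile=sysadmin):
kubectl debug node/<node-name> -it --image=busybox -- sh -c 'dmesg | grep -i segfault | tail -n 5'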
How to Fix Kubernetes Exit Code 139 Error (Segfault)
1. Fix invalid entrypoint or command
A bad command/args or a missing binary can segfault before your app starts.
Check:
kubectl get pod <pod> -o jsonpath='{.spec.containers[0].command}{" "}{.spec.containers[0].args}'
Fix (patch to a valid entrypoint):
kubectl patch deploy <deploy> --type='json' -p='[{"op":"replace","path":"/spec/template/spec/containers/0/command","value":["/app/bin/server"]},{"op":"replace","path":"/spec/template/spec/containers/0/args","value":["--port=8080"]}]'
2. Align OS/ABI (glibc vs musl) base image
Running a glibc-linked binary on Alpine (musl) often crashes at startup.
Check:
kubectl exec <pod> -c <container> -- sh -c 'cat /etc/alpine-release 2>/dev/null || cat /etc/debian_version 2>/dev/null; ldd /app/bin/server 2>&1 | head -n 20'
Fix (switch image tag to glibc base):
kubectl set image deploy/<deploy> <container>=registry.example.com/app/server:debian-glibc-1.4.2
3. Correct CPU architecture mismatches
An amd64 image on an arm64 node (or vice-versa) may just crash.
Check:
kubectl get node $(kubectl get pod <pod> -o jsonpath='{.spec.nodeName}') -o jsonpath='{.status.nodeInfo.architecture}{" "}' && kubectl get pod <pod> -o jsonpath='{.spec.containers[0].image}'
Fix (pin scheduling to amd64 nodes):
kubectl patch deploy <deploy> --type='merge' -p='{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/arch":"amd64"}}}}}'
Fix (or swap to multi-arch image):
kubectl set image deploy/<deploy> <container>=registry.example.com/app/server:1.4.2@sha256:<multi-arch-manifest-digest>
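If you own the build, publishing a multi-arch image avoids the mismatch entirely; a minimal Buildx sketch using the article's example registry and tag:
docker buildx build --platform linux/amd64,linux/arm64 -t registry.example.com/app/server:1.4.2 --push .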
4. Resolve shared-library drift (missing .so / symbol mismatch)
Different .so versions between build and runtime images cause segfaults during linking.
Check:
kubectl exec <pod> -c <container> -- sh -c 'ldd /app/bin/server | grep -i "not found" || true'
Fix (pin runtime to build base):
kubectl set image deploy/<deploy> <container>=registry.example.com/app/server:build-base-1.4.2
Fix (or bundle needed libs via initContainer volume):
# Minimal example (add to the Pod spec); copy the exact libs your app needs
initContainers:
  - name: copy-libs
    image: registry.example.com/ops/libbundle:glibc-2.35
    command: ["/bin/sh", "-c", "cp -R /opt/libs/* /work/libs"]
    volumeMounts:
      - { name: app-libs, mountPath: /work/libs }
containers:
  - name: app
    image: registry.example.com/app/server:1.4.2
    env:
      - { name: LD_LIBRARY_PATH, value: /work/libs }
    volumeMounts:
      - { name: app-libs, mountPath: /work/libs, readOnly: true }
volumes:
  - { name: app-libs, emptyDir: {} }
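Verify (after the rollout, confirm the bundled libraries resolve from the new path):
kubectl exec <pod> -c app -- sh -c 'echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"; ldd /app/bin/server | grep -i "not found" || echo "all libraries resolved"'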
5. Rebuild native modules to match runtime (Node/Python/JVM)
Native addons compiled for another ABI or runtime will segfault at load.
Check:
kubectl logs <pod> -c <container> --previous | head -n 200 | grep -Ei "segmentation fault|core dumped|illegal instruction" || true
Fix (Node: rebuild for current runtime in Dockerfile):
RUN npm ci --omit=dev && npm rebuild --update-binary && node -e "console.log(process.versions)"
Fix (Python: ensure manylinux wheels / correct ABI):
RUN pip install --no-cache-dir --upgrade pip && pip install --no-cache-dir --only-binary=:all: -r requirements.txt
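Verify (smoke-test native imports in the rebuilt image before rolling it out; the image reference and module names are illustrative, so swap in one your app actually loads):
docker run --rm <image> node -e "require('bcrypt'); console.log('native addon loads')"
docker run --rm <image> python -c "import numpy; print('wheel imports cleanly')"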
6. Undo harmful init-container side effects
Init steps that overwrite system libs, inject LD_PRELOAD, or chmod system paths can trigger segfaults.
Check:
kubectl logs <pod> -c <init-container-name> --previous | tail -n 100
Fix (remove the risky env/steps):
kubectl patch deploy <deploy> --type='json' -p='[{"op":"remove","path":"/spec/template/spec/initContainers/0/env/0"}]'
(Adjust path to the exact field that sets LD_PRELOAD or modifies /usr/lib*.)
7. Increase stack size or eliminate deep recursion
Small stacks plus deep recursion or big frames → SIGSEGV.
Check:
kubectl exec <pod> -c <container> -- sh -c 'ulimit -s'
Fix (raise stack then exec app):
kubectl patch deploy <deploy> --type='strategic' -p='{"spec":{"template":{"spec":{"containers":[{"name":"<container>","command":["/bin/sh","-c"],"args":["ulimit -s 16384; exec /app/bin/server --port=8080"]}]}}}}'
8. Loosen security profile to confirm syscall/capability issues
Over-restrictive seccomp/AppArmor/cap drops can crash native paths.
Check:
kubectl get pod <pod> -o yaml | grep -A3 securityContext
Fix (test with RuntimeDefault seccomp via the securityContext field; non-prod only, since the old seccomp annotations are no longer honored on recent Kubernetes):
kubectl patch deploy <deploy> -p='{"spec":{"template":{"spec":{"containers":[{"name":"<container>","securityContext":{"seccompProfile":{"type":"RuntimeDefault"}}}]}}}}'
Fix (or temporarily disable AppArmor via annotation if enabled on your cluster):
kubectl patch deploy <deploy> --type='merge' -p='{"spec":{"template":{"metadata":{"annotations":{"container.apparmor.security.beta.kubernetes.io/<container>":"unconfined"}}}}}'
9. Roll back the first bad build and bisect
Segfaults frequently appear right after an image or config bump.
Check:
kubectl rollout history deploy/<deploy>
Fix (roll back fast):
kubectl rollout undo deploy/<deploy> --to-revision=<last-good-rev>
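Verify (confirm the rollback converged and restarts stopped climbing; the label selector is illustrative):
kubectl rollout status deploy/<deploy> --timeout=5m
kubectl get pods -l app=<app-label> -w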
10. Stabilize rollouts to reduce churn while you debug
Let the app fail fast so that flapping probes and constant restarts don't mask the logs you need.
Check: review probes and backoff in the spec
Fix (add startupProbe + stricter readiness):
containers:
  - name: app
    image: registry.example.com/app/server:1.4.2
    startupProbe:
      httpGet: { path: /healthz, port: 8080 }
      failureThreshold: 30
      periodSeconds: 2
    readinessProbe:
      httpGet: { path: /ready, port: 8080 }
      periodSeconds: 5
      failureThreshold: 3
Monitoring Exit Code 139 (Segfault) in Kubernetes with CubeAPM
The fastest way to root-cause segfaults is to correlate four signal streams in one place: Kubernetes Events (container terminations, restarts), pod/node memory metrics, container logs (“Segmentation fault (core dumped)”), and rollout history (which change introduced the crash). CubeAPM ingests all four via the OpenTelemetry Collector and stitches them into a single timeline so you can see what failed, when, and why.
Step 1 — Install CubeAPM (Helm)
Install once, then keep it updated with a single command.
helm install cubeapm cubeapm/cubeapm --namespace cubeapm --create-namespace
helm upgrade cubeapm cubeapm/cubeapm --namespace cubeapm --reuse-values
(If you manage config centrally, point to your values.yaml with -f values.yaml.)
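A quick sanity check before wiring up collectors:
helm status cubeapm --namespace cubeapm
kubectl get pods --namespace cubeapm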
Step 2 — Deploy the OpenTelemetry Collector (DaemonSet + Deployment)
Use DaemonSet for node/pod-local scraping and log tailing; use Deployment for central pipelines, enrichment, and export to CubeAPM.
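If the OpenTelemetry chart repo isn't added yet, register it once (standard upstream Helm charts location):
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update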
DaemonSet:
helm install otel-ds open-telemetry/opentelemetry-collector --namespace observability --set mode=daemonset
Deployment:
helm install otel-gw open-telemetry/opentelemetry-collector --namespace observability --set mode=deployment
Step 3 — Collector configs focused on Exit Code 139
DaemonSet (node-local) — capture Events, metrics, and container logs with a segfault signal
receivers:
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount
    metrics:
      k8s.container.restart_count:
        enabled: true
  k8s_events:
    auth_type: serviceAccount
  filelog:
    include: [/var/log/containers/*.log]
    start_at: beginning
    operators:
      - type: regex_parser
        regex: '(?P<msg>Segmentation fault \(core dumped\))'
        parse_from: body
        on_error: send   # keep non-matching log lines instead of failing on them
      - type: add
        field: attributes.error.type
        value: segfault
      - type: add
        field: attributes.exit.code
        value: "139"
processors:
  batch: {}
  k8sattributes: {}
exporters:
  otlphttp:
    endpoint: http://otel-gw.observability:4318
    compression: gzip
service:
  pipelines:
    metrics:
      receivers: [kubeletstats]
      processors: [k8sattributes, batch]
      exporters: [otlphttp]
    logs:
      receivers: [filelog, k8s_events]
      processors: [k8sattributes, batch]
      exporters: [otlphttp]
- kubeletstats: collects pod/container metrics like container_restart_count to spot crash loops.
- k8s_events: streams Pod/Container events (terminated with exit code 139, restarts).
- filelog + regex_parser: tails container logs and tags messages that match “Segmentation fault (core dumped)” with error.type=segfault and exit.code=139.
- k8sattributes: enriches data with pod, namespace, node, and owner (Deployment/StatefulSet).
- otlphttp → otel-gw: ships logs/metrics to the central gateway which forwards to CubeAPM.
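One way to apply the DaemonSet snippet above, assuming you wrap it under the chart's config: key in a values file (file name illustrative):
helm upgrade otel-ds open-telemetry/opentelemetry-collector --namespace observability --reuse-values -f otel-daemonset-values.yaml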
Deployment (gateway) — centralize, enrich, and forward to CubeAPM
receivers:
  otlp:
    protocols:
      http:
processors:
  batch: {}
  attributes:
    actions:
      - key: error.category
        value: segmentation_fault
        action: upsert
exporters:
  otlphttp:
    endpoint: https://ingest.cubeapm.com/otlp
    headers:
      authorization: "Bearer ${CUBEAPM_TOKEN}"
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [attributes, batch]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      processors: [attributes, batch]
      exporters: [otlphttp]
- otlp (recv): accepts data from all DaemonSet collectors.
- attributes: normalizes a signal label (error.category=segmentation_fault) for easy querying.
- otlphttp (exp): forwards everything to CubeAPM’s OTLP endpoint using your API token.
Step 4 — Supporting components (optional but helpful)
kube-state-metrics exposes the kube_pod_container_status_* and kube_deployment_status_* series that the alert rules below rely on.
helm install ksm prometheus-community/kube-state-metrics --namespace observability
Step 5 — Verification (what you should see in CubeAPM)
- Events: Pod/Container events showing “container terminated” with exit code 139, followed by restart events aligned on the same timeline.
- Metrics: A spike in container_restart_count and corresponding pod readiness flaps during the failure window.
- Logs: Lines containing “Segmentation fault (core dumped)” enriched with error.type=segfault and exit.code=139.
- Restarts timeline: A clear sequence of crash → backoff → restart loops keyed to the offending Deployment/ReplicaSet.
- Rollout context: The failing replicas tied to a specific revision; you can pivot to rollout history to identify the first bad build.
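You can also cross-check the same signals from the cluster itself while data flows into CubeAPM:
kubectl get events -A --field-selector reason=BackOff --sort-by=.lastTimestamp | tail -n 20
kubectl get pods -A --sort-by='{.status.containerStatuses[0].restartCount}' | tail -n 10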
Example Alert Rules for Exit Code 139 (Segfault) in Kubernetes
1. Alert on container restarts due to segfaults
Repeated container restarts with Exit Code 139 are a clear sign of a segfault loop. This rule alerts when restart counts spike in a short time window.
groups:
  - name: segfault-alerts
    rules:
      - alert: ContainerSegfaults
        expr: |
          increase(kube_pod_container_status_restarts_total[5m]) > 3
          and on (namespace, pod, container)
          kube_pod_container_status_last_terminated_reason{reason="Error"} == 1
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Segfault detected in {{ $labels.pod }}"
          description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} restarted more than 3 times in 5m and last terminated with reason Error (the reason Kubernetes reports for Exit Code 139)."
2. Alert on segfault log patterns
Segfaults often only surface in container logs. This rule scans logs for “Segmentation fault (core dumped)” and raises an alert when detected.
groups:
  - name: segfault-logs
    rules:
      - alert: SegfaultLogDetected
        expr: count_over_time({error_type="segfault"}[2m]) > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "Segfault log line in {{ $labels.pod }}"
          description: "Logs show a segmentation fault for container {{ $labels.container }} in {{ $labels.namespace }}."
3. Alert on CrashLoopBackOff caused by segfaults
When a segfault keeps recurring, Pods end up in CrashLoopBackOff. This alert notifies you when containers are stuck in that state.
groups:
  - name: segfault-crashloops
    rules:
      - alert: CrashLoopBackOffSegfault
        expr: kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
        for: 3m
        labels:
          severity: critical
        annotations:
          summary: "Pod {{ $labels.pod }} stuck in CrashLoopBackOff"
          description: "Container {{ $labels.container }} is stuck restarting with Exit Code 139."
4. Alert on rollout failures due to segfaults
A wave of segfaults during a rollout can leave deployments stuck with unavailable replicas. This rule triggers when rollouts stall.
groups:
  - name: segfault-rollouts
    rules:
      - alert: RolloutStalledSegfault
        expr: kube_deployment_status_replicas_unavailable > 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Deployment rollout stalled"
          description: "Deployment {{ $labels.deployment }} has unavailable replicas due to Exit Code 139 segfaults."
Conclusion
Exit Code 139 in Kubernetes is one of those tricky failures that looks simple — a process died with a segfault — but hides deeper issues inside the container image, runtime libraries, or application code. Left undiagnosed, it forces Pods into endless restart loops, stalls rollouts, and cascades across dependent microservices.
The impact isn’t just technical. Segfaults that block deployments or trigger outages can cost hundreds of thousands in downtime and SLA penalties. They also drain engineering time, as teams scramble across logs, events, and metrics that are scattered across the stack. Without a correlated view, mean time to resolution stretches dangerously long.
CubeAPM gives you that correlation in one place. By combining Kubernetes Events, node and container metrics, container logs, and rollout context, CubeAPM builds a clear timeline of why Pods crashed with Exit Code 139. This makes it faster to detect, alert, and fix segmentation faults before they hit customer experience. Start monitoring segfaults with CubeAPM today to keep your Kubernetes workloads reliable and your business protected.