How to Fix Kubernetes 0/3 Nodes Are Available (Scheduling Failed)

Severity: CRITICAL

Quick Fix Summary (TL;DR)

Check node taints, resource requests, and node conditions with `kubectl describe node` and `kubectl get events`.

The Kubernetes scheduler cannot find a node that satisfies your pod's requirements, so the pod stays in the Pending phase and a 'FailedScheduling' event is recorded. This is a critical failure that blocks application deployment and scaling.
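
To confirm the failure cluster-wide, you can filter events by reason. The command below is a quick starting point and assumes your kubectl context points at the affected cluster.

bash
# List recent FailedScheduling events across all namespaces
kubectl get events --all-namespaces --field-selector reason=FailedScheduling --sort-by='.lastTimestamp'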

Diagnosis & Causes

  • Node taints preventing pod placement.
  • Pod resource requests exceed node capacity.
  • NodeSelector or node affinity rules not satisfied.
  • All nodes are in a NotReady state.
  • PersistentVolumeClaim binding failures.

Recovery Steps

    Step 1: Diagnose with kubectl describe and get events

    Immediately gather the error details from the pod and cluster events to understand the scheduler's decision.

    bash
    kubectl describe pod <pod-name> -n <namespace>
    kubectl get events --sort-by='.lastTimestamp' -n <namespace>
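
    An unschedulable pod stays in the Pending phase with no node assigned; a quick way to confirm that before reading the events:

    bash
    # Confirm the pod is Pending and the NODE column shows <none>
    kubectl get pod <pod-name> -n <namespace> -o wide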

    Step 2: Inspect Node Status and Taints

    Check if nodes are ready and examine their taints, which can repel pods unless the pod has a matching toleration.

    bash
    kubectl get nodes
    kubectl describe node <node-name> | grep -A 10 -i taint
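
    If you find a taint that should not be on the node (for example one left over from maintenance), it can be removed by key; the key below is only a placeholder.

    bash
    # Remove a taint by appending '-' to its key and effect (placeholder key)
    kubectl taint nodes <node-name> <taint-key>:NoSchedule-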

    Step 3: Verify Node Resources vs. Pod Requests

    Compare the pod's resource requests (CPU/memory) against the allocatable resources on each node.

    bash
    kubectl describe node <node-name> | grep -A 5 -i allocatable
    kubectl describe pod <pod-name> | grep -i requests -A 2
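
    The 'Allocated resources' section of the node description shows how much of the allocatable capacity is already requested, and `kubectl top nodes` adds live usage, assuming the metrics-server add-on is installed.

    bash
    # How much CPU/memory is already requested on the node
    kubectl describe node <node-name> | grep -A 10 -i 'allocated resources'
    # Live usage (requires metrics-server)
    kubectl top nodes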

    Step 4: Check for NodeSelector and Affinity Rules

    Ensure your pod's nodeSelector or nodeAffinity rules match labels present on at least one ready node.

    bash
    kubectl get nodes --show-labels
    kubectl describe pod <pod-name> | grep -i node-selector -A 2 -B 2
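
    If the selector is correct but no node carries the label, adding the label to a suitable node is often simpler than editing the pod; the key and value here are hypothetical.

    bash
    # Label a node so an existing nodeSelector can match it (hypothetical label)
    kubectl label nodes <node-name> disktype=ssd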

    Step 5: Resolve PersistentVolumeClaim Issues

    If the pod requires a PersistentVolume, ensure a matching StorageClass exists and the PVC is bound.

    bash
    kubectl get pvc -n <namespace>
    kubectl describe pvc <pvc-name> -n <namespace>
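
    Also confirm that the StorageClass named in the claim exists and whether a default class is set; these read-only checks are safe to run on any cluster.

    bash
    # The default StorageClass is marked '(default)'
    kubectl get storageclass
    # Show the class and binding status of the claim
    kubectl get pvc <pvc-name> -n <namespace> -o wide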

    Step 6: Add Tolerations or Fix Node State

    Based on your diagnosis, either add tolerations to your pod spec so it can land on tainted nodes, or repair the problematic node (cordoning and draining it first if it needs maintenance).

    yaml
    # Example Toleration for a common node taint
    tolerations:
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoSchedule"

    Architect's Pro Tip

    "Use `kubectl get pods -o wide --all-namespaces | grep -v Running` to instantly see all non-running pods and their assigned nodes, highlighting cluster-wide scheduling issues."

    Frequently Asked Questions

    What's the difference between a taint and a nodeSelector?

    A taint repels pods from a node unless they tolerate it (node-pushes-pod-away). A nodeSelector attracts a pod to nodes with matching labels (pod-pulls-to-node).
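
    A minimal pod-spec fragment showing both mechanisms side by side; the label, taint key, and values are placeholders.

    yaml
    # nodeSelector pulls the pod toward labelled nodes;
    # the toleration lets it ignore a matching taint (placeholder values)
    spec:
      nodeSelector:
        disktype: ssd
      tolerations:
      - key: "dedicated"
        operator: "Equal"
        value: "batch"
        effect: "NoSchedule"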

    My node shows 'Ready' but pods still won't schedule. Why?

    The node may be cordoned (scheduling disabled), under resource pressure (memory or disk), or carrying a taint with the NoSchedule effect that your pod does not tolerate.
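
    Two quick checks for those cases; both are read-only.

    bash
    # Cordoned nodes show 'SchedulingDisabled' in the STATUS column
    kubectl get nodes
    # Pressure conditions (MemoryPressure, DiskPressure) appear here
    kubectl describe node <node-name> | grep -A 8 -i 'conditions'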

    How can I prevent this error during deployment?

    Define realistic resource requests/limits in your pod spec, use PodDisruptionBudgets for voluntary disruptions, and ensure node affinity/toleration rules are tested in non-production first.
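
    As one illustration of the first point, a container resources block with hypothetical values; size these to what your workload actually needs.

    yaml
    # Example requests/limits (hypothetical values; tune per workload)
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"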

    Related Kubernetes Guides