
How to Fix Kubernetes 502 Bad Gateway in Ingress (K8s 1.31+)

Quick Fix Summary

TL;DR

Check Ingress backend service endpoints and verify pod readiness probes are passing.

A 502 Bad Gateway from Kubernetes Ingress indicates the Ingress controller (e.g., NGINX) cannot establish a connection to, or receives an invalid response from, the backend Pods selected by your Service. This is a critical networking failure between the Ingress layer and your application.

Diagnosis & Causes

  • Backend Service has no healthy endpoints.
  • Pod readiness or liveness probes are failing.
  • NetworkPolicy blocking Ingress controller traffic.
  • Misconfigured `service.name` or `service.port` in the Ingress backend.
  • Resource constraints causing Pod crashes or throttling.
Recovery Steps

    Step 1: Verify Service Endpoints and Pod Status

    First, confirm your Service is correctly targeting running Pods and that the Pods are ready.

    bash
    # Check if your Service has Endpoints
    kubectl get endpoints <your-service-name> -n <namespace>
    # Describe the Service to see selector and port mapping
    kubectl describe svc <your-service-name> -n <namespace>
    # Check Pod status and readiness
    kubectl get pods -n <namespace> -l app=<your-app-label> -o wide
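
    If the Endpoints list is empty, the usual culprits are a Service selector that doesn't match the Pod labels, or a targetPort that doesn't match the container's listening port. A minimal sketch of a correctly wired Service follows; the names, label key, and port numbers are placeholder assumptions to adapt to your workload.

    yaml
    # Hypothetical Service wiring - names, labels, and ports are placeholders
    apiVersion: v1
    kind: Service
    metadata:
      name: your-service-name
      namespace: your-namespace
    spec:
      selector:
        app: your-app-label        # must match the Pod template labels exactly
      ports:
        - name: http
          port: 8080               # the port the Ingress backend references
          targetPort: 8080         # the containerPort your application listens on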

    Step 2: Inspect Pod Readiness/Liveness Probes

    A failing readiness probe removes a Pod from Service endpoints. Check probe configuration and logs.

    bash
    # Get the Pod's YAML to review probe configuration
    kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 15 readinessProbe
    # Check for probe-related errors in Pod events
    kubectl describe pod <pod-name> -n <namespace> | tail -30
    # Check application logs for probe requests
    kubectl logs <pod-name> -n <namespace> --tail=50
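
    If the probe configuration looks suspect, compare it against a baseline like the sketch below, which belongs under the container spec of your Deployment or Pod. The /healthz path, port, and timings are illustrative assumptions; use whatever health endpoint your application actually exposes.

    yaml
    # Hypothetical readiness probe - path, port, and timings are assumptions
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 10    # give the app time to start before the first probe
      periodSeconds: 5
      timeoutSeconds: 2
      failureThreshold: 3        # Pod is removed from endpoints after 3 consecutive failures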

    Step 3: Check Ingress Controller Logs & Configuration

    Examine the Ingress controller logs for upstream connection errors and verify its configuration.

    bash
    # Get the Ingress controller Pod name (adjust label selector for your setup)
    kubectl get pods -n ingress-nginx --show-labels
    # Tail the error logs for 502s
    kubectl logs -n ingress-nginx -l app.kubernetes.io/component=controller --tail=100 | grep -i "502\|upstream"
    # Check the generated NGINX config for your upstream
    kubectl exec -n ingress-nginx <controller-pod> -- cat /etc/nginx/nginx.conf | grep -A 10 -B 5 "<your-service-name>"

    Step 4: Validate Network Connectivity

    Test connectivity from the Ingress controller namespace to your application Pods to rule out NetworkPolicy issues.

    bash
    # Run a temporary curl Pod in the Ingress controller namespace
    kubectl run curl-test --image=curlimages/curl:latest -n ingress-nginx --rm -it --restart=Never -- sh
    # Inside the test pod, curl your Service's ClusterIP and Port
    curl -v http://<service-cluster-ip>:<port>
    # Also test direct Pod IP (bypass Service)
    curl -v http://<pod-ip>:<container-port>
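
    If the Service ClusterIP fails but the direct Pod IP works, suspect the Service/kube-proxy layer; if both fail while the Pod itself is healthy, a NetworkPolicy may be dropping traffic from the controller. Below is a sketch of a policy that admits traffic from the ingress-nginx namespace; the app label and port are assumptions, and it relies on the standard kubernetes.io/metadata.name namespace label.

    yaml
    # Hypothetical NetworkPolicy - labels and port are assumptions
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-ingress-controller
      namespace: your-namespace
    spec:
      podSelector:
        matchLabels:
          app: your-app-label
      policyTypes:
        - Ingress
      ingress:
        - from:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: ingress-nginx
          ports:
            - protocol: TCP
              port: 8080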

    Step 5: Review Ingress Resource Definition

    Ensure the Ingress spec correctly references the Service name and port.

    bash
    # Get your Ingress resource YAML
    kubectl get ingress <ingress-name> -n <namespace> -o yaml
    # Pay close attention to the `backend.service` block:
    # spec:
    #   rules:
    #   - host: ...
    #     http:
    #       paths:
    #       - path: /
    #         pathType: Prefix
    #         backend:
    #           service:
    #             name: your-correct-service-name  # <-- Must match
    #             port:
    #               number: 8080                    # <-- Must be a port defined in the Service
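
    For reference, the port chain has to line up end to end: the Ingress `port.number` must match a `port` declared on the Service, and the Service `targetPort` must match the container's listening port. A sketch of a complete Ingress is shown below; the host, ingressClassName, names, and port number are placeholder assumptions.

    yaml
    # Hypothetical Ingress - host, class, names, and port are placeholders
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: your-ingress
      namespace: your-namespace
    spec:
      ingressClassName: nginx
      rules:
        - host: app.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: your-correct-service-name   # must match the Service metadata.name
                    port:
                      number: 8080                    # must match a port defined in that Service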

    Step 6: Check for Resource Limits & Pod Eviction

    Insufficient CPU/Memory can cause Pod crashes or throttling, leading to intermittent 502s.

    bash
    # Check for evicted, crash-looping, or OOMKilled Pods
    kubectl get pods -n <namespace> -o wide | grep -E "Evicted|CrashLoopBackOff|OOMKilled"
    # Describe a problematic Pod for resource events
    kubectl describe pod <pod-name> -n <namespace> | grep -A 5 -B 5 "Events:"
    # Check node resource pressure
    kubectl describe nodes | grep -A 5 "Allocatable"
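
    If containers are being OOMKilled or heavily CPU-throttled, set explicit requests and limits on the container spec. The values below are placeholder assumptions; size them from observed usage (e.g., kubectl top pods, if metrics-server is installed).

    yaml
    # Hypothetical resource settings for the container spec - values are assumptions
    resources:
      requests:
        cpu: 250m
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi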

    Architect's Pro Tip

    "For intermittent 502s under load, increase `proxy-next-upstream-tries` and `proxy-connect-timeout` in your Ingress Controller ConfigMap to handle upstream flakiness."

    Frequently Asked Questions

    My Service has Endpoints, but I still get a 502. What's next?

    The Pods are 'Ready' but may not be accepting traffic. Check application startup time vs. initialDelaySeconds in your readiness probe. Also, run a connectivity test from the Ingress controller namespace directly to a Pod IP to bypass potential kube-proxy or Service issues.

    Does this guide apply to AWS ALB Ingress Controller or other Ingress controllers?

    The core principles (Service/Endpoint health, probes, networking) are universal. However, diagnostic commands and specific configurations (like the Pro Tip parameters) differ. Always consult your specific Ingress controller's documentation and logs.

    Why did this start happening after upgrading to K8s 1.31+?

    Kubernetes 1.31 may include updates to the `EndpointSlice` API, kube-proxy, or core networking that could affect Service discovery. Ensure your Ingress controller version is compatible with 1.31. Also, review any deprecated API removals that might affect your Ingress or Service definitions.
