How to Fix GCP Cloud-Run-429

Quick Fix Summary

TL;DR

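Raise your service's maximum instance count and, where the container can sustain it, its per-instance concurrency in the Cloud Run configuration. A longer request timeout does not by itself prevent 429s.

A Cloud Run 429 (Too Many Requests) response means no container instance was available to accept the request: the existing instances were at their concurrency limit, and the service had either reached its maximum instance count or could not scale out fast enough. It is server-side throttling, not a fault in the client.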

Diagnosis & Causes

Service max concurrency limit is too low.

Sudden traffic spike exceeding configured capacity.

Slow request processing causing request queue buildup.

Downstream dependency latency increasing request duration.

Misconfigured autoscaling parameters (min/max instances).
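
The causes above share one piece of arithmetic. By Little's law, the average number of in-flight requests is roughly the arrival rate multiplied by the average request duration, and the service can hold at most (max instances × concurrency) requests at once. The following is a minimal Node.js sketch of that check. The function names are hypothetical and the traffic numbers are illustrative, not measurements.

```javascript
// Hypothetical helpers for back-of-the-envelope capacity checks.

// Little's law: average in-flight requests = arrival rate x average duration.
function inFlightRequests(requestsPerSecond, avgLatencySeconds) {
  return requestsPerSecond * avgLatencySeconds;
}

// Upper bound on the requests the service can hold at once.
function serviceCapacity(maxInstances, concurrency) {
  return maxInstances * concurrency;
}

// Instances needed to hold a given load at a given per-instance concurrency.
function instancesNeeded(requestsPerSecond, avgLatencySeconds, concurrency) {
  return Math.ceil(inFlightRequests(requestsPerSecond, avgLatencySeconds) / concurrency);
}

// Illustrative numbers only: 50 instances x 80 concurrency, 2000 req/s.
console.log(serviceCapacity(50, 80));        // 4000 requests at once
console.log(inFlightRequests(2000, 0.5));    // 1000 in flight at 0.5 s per request
console.log(inFlightRequests(2000, 4));      // 8000 in flight if a dependency slows requests to 4 s
console.log(instancesNeeded(2000, 4, 80));   // 100 instances needed, above the limit of 50
```

In this example the traffic rate never changes. The slowdown from 0.5 s to 4 s per request alone pushes demand past capacity, which is why slow processing and downstream latency appear in the list of causes.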

Recovery Steps

Step 1: Immediate Capacity Increase via Console

Raise the instance and concurrency limits to stop the 429 errors and let traffic flow.

bash

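# Navigate to Cloud Run > Your Service > Edit & Deploy New Revision
# (setting labels vary slightly between console versions)
# 1. Increase 'Maximum number of instances' so the service can scale out further
# 2. Raise the maximum concurrent requests per instance above the default of 80
#    only if the container is known to handle more
# 3. Leave 'Request timeout' at its default (300s) unless requests legitimately
#    run longer; it does not affect 429s
# 4. Click 'Deploy'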

Step 2: Configure via gcloud CLI for CI/CD

Programmatically update the service configuration using the gcloud command-line tool. This is ideal for reproducible deployments.

bash

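# Values are examples; 80 (concurrency) and 300s (timeout) match Cloud Run's defaults.
# Raising --max-instances is the change that most directly relieves 429s.
gcloud run services update YOUR_SERVICE_NAME \
  --region=YOUR_REGION \
  --concurrency=80 \
  --max-instances=50 \
  --timeout=300s \
  --cpu=2 \
  --memory=2Gi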

Step 3: Implement Application-Level Rate Limiting & Queuing

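For more control, implement a request queue and worker pattern within your service to smooth out traffic bursts. Bull stores jobs in Redis, so a reachable Redis instance is required. On Cloud Run, work that continues after the response has been sent only runs reliably when CPU is always allocated (instance-based billing); otherwise, consider running the worker as a separate service.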

javascript

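// Node.js example using Express and a Bull queue (Bull needs a reachable Redis instance)
const express = require('express');
const Queue = require('bull');
const app = express();
app.use(express.json()); // parse JSON bodies so req.body is populated
// With no connection options, Bull connects to Redis at 127.0.0.1:6379
const processQueue = new Queue('requests');
processQueue.process(async (job) => {
  // Your request logic here; job.data holds the enqueued request body
});
app.post('/endpoint', async (req, res) => {
  const job = await processQueue.add(req.body);
  // 202 Accepted: the work is queued, not yet done
  res.status(202).json({ jobId: job.id });
});
// Cloud Run passes the listening port in the PORT environment variable
app.listen(process.env.PORT || 8080);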
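
The queue above covers the queuing half of this step. For the rate-limiting half, one dependency-free option is an in-process limiter that caps how many requests are handled at once, holds a bounded number in a waiting line, and rejects the rest so the caller can back off. The sketch below is one possible shape, not a library API; the `ConcurrencyLimiter` class and its `OVERLOADED` error code are hypothetical names.

```javascript
// Hypothetical in-process limiter: at most `maxActive` tasks run at once,
// up to `maxQueued` wait in FIFO order, and anything beyond that is rejected.
class ConcurrencyLimiter {
  constructor(maxActive, maxQueued) {
    this.maxActive = maxActive;
    this.maxQueued = maxQueued;
    this.active = 0;
    this.waiting = [];
  }

  // Runs `task` (a function returning a promise) when a slot is free.
  // Rejects with code 'OVERLOADED' if the waiting line is already full.
  run(task) {
    if (this.active < this.maxActive) {
      return this._start(task);
    }
    if (this.waiting.length >= this.maxQueued) {
      const err = new Error('limiter overloaded');
      err.code = 'OVERLOADED';
      return Promise.reject(err);
    }
    return new Promise((resolve, reject) => {
      this.waiting.push(() => this._start(task).then(resolve, reject));
    });
  }

  _start(task) {
    this.active += 1;
    // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
    return Promise.resolve()
      .then(task)
      .finally(() => {
        // Free the slot and hand it to the oldest waiting task, if any.
        this.active -= 1;
        const next = this.waiting.shift();
        if (next) next();
      });
  }
}
```

In an Express handler, wrapping the request logic in `limiter.run(...)` and answering an `OVERLOADED` rejection with a 429 or 503 plus a `Retry-After` header lets the service shed load deliberately instead of piling up slow requests. Keeping `maxActive` below the Cloud Run concurrency setting leaves room for the waiting line.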

Step 4: Proactive Monitoring & Alerting

Set up alerts for concurrency and request count to catch issues before they cause 429 errors.

bash

# Create an alerting policy for high concurrency
# (this command is in the gcloud alpha track and may change)
gcloud alpha monitoring policies create \
  --policy-from-file="alert_policy.json"
# Sample alert_policy.json condition:
# metric: run.googleapis.com/container/max_request_concurrencies
#   (verify the exact metric type in Metrics Explorer before relying on it)
# condition: above 70% of the configured max concurrency for 1 minute
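
The contents of alert_policy.json are only outlined above. As a rough sketch, the file could be generated with a small Node.js script like the one below. The field names follow the Cloud Monitoring AlertPolicy resource as best understood here. The service name `my-service`, the 70% fraction, the metric type, and the 99th-percentile aligner are assumptions to check against your project before use.

```javascript
// Builds the JSON for an alerting policy that fires when observed per-instance
// concurrency stays above a fraction of the configured limit for one minute.
// Assumptions to verify: metric type, aligner, and the example service name.
function buildConcurrencyAlertPolicy(serviceName, configuredConcurrency, fraction = 0.7) {
  const filter = [
    'resource.type = "cloud_run_revision"',
    `resource.labels.service_name = "${serviceName}"`,
    'metric.type = "run.googleapis.com/container/max_request_concurrencies"',
  ].join(' AND ');

  return {
    displayName: `${serviceName}: concurrency above ${Math.round(fraction * 100)}% of limit`,
    combiner: 'OR',
    conditions: [
      {
        displayName: 'High per-instance concurrency',
        conditionThreshold: {
          filter,
          comparison: 'COMPARISON_GT',
          // Rounded so the threshold is a whole number of requests.
          thresholdValue: Math.round(configuredConcurrency * fraction),
          duration: '60s',
          aggregations: [
            { alignmentPeriod: '60s', perSeriesAligner: 'ALIGN_PERCENTILE_99' },
          ],
        },
      },
    ],
  };
}

// Print the policy; redirect the output to alert_policy.json.
const policy = buildConcurrencyAlertPolicy('my-service', 80);
console.log(JSON.stringify(policy, null, 2));
```

Generating the file from the configured concurrency keeps the threshold in step with the service settings when they change.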

Architect's Pro Tip

"Set your 'max concurrency' to ~80% of your container's proven stable limit. This leaves headroom for health checks and garbage collection, preventing cascading failures during traffic spikes."
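
As a worked example of that rule of thumb, the helper below derives a concurrency setting from a load-tested limit. The function name is hypothetical, and the 80% default simply mirrors the tip above.

```javascript
// Hypothetical helper: derive a concurrency setting from a load-tested stable limit.
// Integer percentages avoid floating-point results that land just under a whole number.
function concurrencyWithHeadroom(provenStableLimit, percent = 80) {
  return Math.max(1, Math.floor((provenStableLimit * percent) / 100));
}

console.log(concurrencyWithHeadroom(100)); // 80
console.log(concurrencyWithHeadroom(250)); // 200
```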

Frequently Asked Questions

Is a 429 error from Cloud Run a problem with my code?

Not necessarily. It's primarily an infrastructure throttling error. However, inefficient code that processes requests slowly can trigger it by exhausting concurrency slots.

What's the difference between Cloud Run 429 and a 5xx error?

A 429 means the service is too busy to accept more work (scaling/limit issue). A 5xx error (like 502 or 503) typically indicates the container crashed or failed to start.

Will increasing concurrency increase my costs?

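Not necessarily. Higher concurrency lets each container instance handle more requests, which can reduce the number of instances needed and so lower cost, as long as each instance has enough CPU and memory for the extra load. Raising the maximum instance count, CPU, or memory can increase cost under load. Monitor your bill and CPU utilization.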

Related GCP Guides

How to Fix GCP Container Startup Timeout

GCP Cloud SQL Instance Disk Full: Troubleshooting Guide

How to Fix GCP IAM PERMISSION_DENIED Error