How to Fix GCP Cloud-Run-429

Quick Fix Summary

TL;DR

Increase your service's maximum concurrency and request timeout limits in Cloud Run configuration.

A Cloud-Run-429 error indicates your service is receiving more concurrent requests than it is configured to handle. This is server-side throttling by the platform, not a client-side bug: Cloud Run rejects the excess requests with HTTP 429 instead of queuing them indefinitely.
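Because 429 is a throttling signal, well-behaved clients should retry with backoff rather than hammer the service. A minimal Node.js sketch (assumes Node 18+ for the global `fetch`; the URL and attempt counts are illustrative, not from this article):

```javascript
// Exponential backoff with jitter for retrying 429 responses.
// attempt 0 waits ~0.5-1 s, attempt 1 ~1-2 s, ..., capped at capMs.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000) {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return exp / 2 + Math.random() * (exp / 2); // "equal jitter"
}

async function fetchWithRetry(url, options = {}, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url, options); // Node 18+ global fetch
    if (res.status !== 429) return res;
    // Honor a Retry-After header if present, else back off exponentially.
    const retryAfter = Number(res.headers.get('retry-after'));
    const delay = retryAfter ? retryAfter * 1000 : backoffDelayMs(attempt);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  throw new Error(`Still receiving 429 after ${maxAttempts} attempts`);
}
```

This only papers over the symptom on the client side; the recovery steps below address the server-side capacity limits themselves.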

Diagnosis & Causes

  • Service max concurrency limit is too low.
  • Sudden traffic spike exceeding configured capacity.
  • Slow request processing causing request queue buildup.
  • Downstream dependency latency increasing request duration.
  • Misconfigured autoscaling parameters (min/max instances).
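Several of these causes reduce to the same arithmetic: by Little's law, the number of in-flight requests is roughly peak requests per second times average request duration. A small sketch for sizing instances (the numbers are illustrative assumptions, not from this article):

```javascript
// Rough capacity check using Little's law:
// concurrent requests ≈ requests/sec × average request duration (sec).
function instancesNeeded(peakRps, avgLatencySec, concurrencyPerInstance) {
  const concurrentRequests = peakRps * avgLatencySec;
  return Math.ceil(concurrentRequests / concurrencyPerInstance);
}

// Example: 400 req/s at 0.5 s average latency = 200 in-flight requests.
// At concurrency 80 per instance that needs 3 instances; if a slow
// downstream dependency pushes latency to 2 s, the same traffic needs 10.
```

This is why slow request processing or downstream latency can trigger 429s even when traffic itself has not grown: the in-flight count climbs with duration.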
Recovery Steps

Step 1: Immediate Capacity Increase via Console

    Quickly raise the concurrency and timeout limits to stop the 429 errors and allow traffic to flow.

    bash
    # Navigate to Cloud Run > Your Service > Edit & Deploy New Revision
    # Under 'Capacity':
    # 1. Set 'Maximum number of requests per container' to 80 (or higher)
    # 2. Set 'Request timeout' to 300s (5 minutes) if needed
    # 3. Under 'Autoscaling', increase 'Maximum number of instances'
    # 4. Click 'Deploy'

    Step 2: Configure via gcloud CLI for CI/CD

    Programmatically update the service configuration using the gcloud command-line tool. This is ideal for reproducible deployments.

    bash
    gcloud run services update YOUR_SERVICE_NAME \
    --region=YOUR_REGION \
    --concurrency=80 \
    --max-instances=50 \
    --timeout=300s \
    --cpu=2 \
    --memory=2Gi

    Step 3: Implement Application-Level Rate Limiting & Queuing

    For more control, implement a request queue and worker pattern within your service to smooth out traffic bursts.

javascript
// Node.js example using Express and a Bull queue (requires a Redis instance;
// Bull connects to localhost:6379 by default).
const express = require('express');
const Queue = require('bull');

const app = express();
app.use(express.json());

const processQueue = new Queue('requests');

// A worker drains the queue at a rate the service can sustain.
processQueue.process(async (job) => {
  // Your request logic here, using job.data
});

// The handler enqueues work and returns immediately, so traffic bursts
// queue up instead of exhausting concurrency slots.
app.post('/endpoint', async (req, res) => {
  const job = await processQueue.add(req.body);
  res.json({ jobId: job.id });
});

app.listen(8080);
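If adding Redis is too heavy for your service, the same smoothing can be done in-process. A minimal limiter sketch (an illustrative pattern, not tied to any specific library) that caps how many requests run concurrently and queues the rest:

```javascript
// Caps concurrent work at `limit`; extra callers wait their turn.
class ConcurrencyLimiter {
  constructor(limit) {
    this.limit = limit;
    this.active = 0;    // slots currently in use
    this.waiting = [];  // resolvers for queued callers
  }

  _release() {
    const next = this.waiting.shift();
    if (next) next();   // hand the freed slot directly to the next waiter
    else this.active--;
  }

  async run(task) {
    if (this.active >= this.limit) {
      // Queue until a running task hands us its slot.
      await new Promise((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      this._release();
    }
  }
}
```

Note the trade-off: an in-process queue is lost if the instance is killed, whereas the Redis-backed Bull queue above survives restarts.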

    Step 4: Proactive Monitoring & Alerting

    Set up alerts for concurrency and request count to catch issues before they cause 429 errors.

    bash
    # Create an alerting policy for high concurrency
    gcloud alpha monitoring policies create \
    --policy-from-file="alert_policy.json"
    # Sample alert_policy.json condition:
    # metric: run.googleapis.com/container/concurrent_requests
    # condition: above 70% of max_configured_concurrency for 1 minute
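The full alert_policy.json is not shown above; a sketch of what such a policy can look like (field names follow the Cloud Monitoring AlertPolicy API; the threshold of 64 assumes a configured max concurrency of 80, i.e. the 80% headroom rule below):

```json
{
  "displayName": "Cloud Run concurrency near limit",
  "combiner": "OR",
  "conditions": [{
    "displayName": "concurrent_requests above 80% of configured max",
    "conditionThreshold": {
      "filter": "resource.type = \"cloud_run_revision\" AND metric.type = \"run.googleapis.com/container/concurrent_requests\"",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 64,
      "duration": "60s",
      "aggregations": [{
        "alignmentPeriod": "60s",
        "perSeriesAligner": "ALIGN_PERCENTILE_99"
      }]
    }
  }]
}
```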

    Architect's Pro Tip

    "Set your 'max concurrency' to ~80% of your container's proven stable limit. This leaves headroom for health checks and garbage collection, preventing cascading failures during traffic spikes."

    Frequently Asked Questions

    Is a 429 error from Cloud Run a problem with my code?

    Not necessarily. It's primarily an infrastructure throttling error. However, inefficient code that processes requests slowly can trigger it by exhausting concurrency slots.

    What's the difference between Cloud Run 429 and a 5xx error?

    A 429 means the service is too busy to accept more work (scaling/limit issue). A 5xx error (like 502 or 503) typically indicates the container crashed or failed to start.

    Will increasing concurrency increase my costs?

Potentially, but often the opposite: because higher concurrency lets each container instance handle more simultaneous requests, the same traffic can be served by fewer instances. Monitor your bill alongside CPU and memory utilization to confirm the setting suits your workload.
