How to Fix AWS RateExceeded

Quick Fix Summary

TL;DR

Immediately implement exponential backoff with jitter in your application code and request a service limit increase.

AWS RateExceeded occurs when you exceed the API request rate limits for a specific AWS service. This is a hard throttling mechanism AWS uses to protect service stability and prevent resource exhaustion.

Diagnosis & Causes

  • Aggressive polling without backoff logic
  • Sudden traffic spikes overwhelming API limits
  • Misconfigured auto-scaling triggering rapid API calls
  • Buggy code stuck in infinite retry loops
  • Multiple applications or teams sharing the same per-account, per-region limits

Recovery Steps

    Step 1: Implement Exponential Backoff with Jitter

    Modify your application to handle throttling gracefully by implementing retry logic with exponential backoff and jitter to prevent synchronized retries.

    python
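    import random
    import time

    import boto3
    from botocore.config import Config

    # Built-in option: let boto3 retry throttled calls itself (example values;
    # "adaptive" mode also rate-limits the client).
    ec2 = boto3.client("ec2", config=Config(retries={"max_attempts": 10, "mode": "adaptive"}))

    def exponential_backoff_with_jitter(base_delay, max_delay, attempt):
        """Sleep before retry number `attempt` (0-based) in a hand-written retry loop."""
        # Double the delay on every attempt, capped at max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Add up to 10% random jitter so clients do not retry in lockstep.
        jitter = random.uniform(0, delay * 0.1)
        time.sleep(delay + jitter)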
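
A backoff helper only pays off inside a bounded retry loop. The sketch below shows one way to wrap a call, swapping in "full jitter" (a random wait between zero and the capped delay) in place of the 10% jitter used above. `ThrottledError` and `call_with_backoff` are hypothetical names for illustration; with boto3 you would typically catch `botocore.exceptions.ClientError` and inspect its error code instead.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by the wrapped call."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=20.0, sleep=time.sleep):
    """Call func(), retrying on ThrottledError with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error instead of retrying forever.
            # Cap the exponential delay, then wait a random time in [0, cap].
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

Capping the number of attempts matters as much as the delay itself: an unbounded retry loop is one of the causes listed under Diagnosis & Causes. The `sleep` parameter exists only so the wait can be replaced in tests.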

    Step 2: Request Service Limit Increase via AWS Support

    For production-critical services, request a permanent limit increase through the AWS Support Center.

    bash
    # Navigate to AWS Support Center -> Create Case -> Service Limit Increase
    # Select the specific service (e.g., EC2, Lambda, S3)
    # Specify the limit type and desired new limit
    # Provide business justification for production requirements
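
Where a limit is exposed through Service Quotas, the same request can be submitted programmatically. The following is a minimal sketch, assuming the limit in question is adjustable there (not every API rate limit is). `request_quota_increase` is a hypothetical helper, and `L-XXXXXXXX` is a placeholder quota code, not a real one.

```python
def request_quota_increase(client, service_code, quota_code, desired_value):
    """Submit a quota increase unless the applied value already meets the target.

    `client` is expected to behave like boto3.client("service-quotas").
    Returns the request status string, or None if no request was needed.
    """
    current = client.get_service_quota(ServiceCode=service_code, QuotaCode=quota_code)
    if current["Quota"]["Value"] >= desired_value:
        return None  # Already at or above the target; avoid a redundant request.
    response = client.request_service_quota_increase(
        ServiceCode=service_code,
        QuotaCode=quota_code,
        DesiredValue=float(desired_value),
    )
    return response["RequestedQuota"]["Status"]


# Usage (needs AWS credentials; "L-XXXXXXXX" is a placeholder quota code):
#   import boto3
#   request_quota_increase(boto3.client("service-quotas"), "ec2", "L-XXXXXXXX", 200)
```

Passing the client in as an argument keeps the helper testable without AWS access.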

    Step 3: Implement Client-Side Caching

    Reduce API calls by caching responses for read-heavy operations, especially for Describe* and List* API calls.

    python
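    import boto3
    from cachetools import TTLCache

    # Keep up to 100 entries, each for 5 minutes (300 seconds).
    cache = TTLCache(maxsize=100, ttl=300)

    def get_cached_instance_types():
        # A single get() avoids a KeyError if the entry expires between a check and a read.
        result = cache.get('instance_types')
        if result is None:
            ec2 = boto3.client('ec2')
            # Only the first page of results is cached; use a paginator for the full list.
            result = ec2.describe_instance_types()
            cache['instance_types'] = result
        return result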
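
If adding a dependency is not an option, the same idea can be sketched with the standard library alone. `ttl_cached` below is a hypothetical decorator for zero-argument functions, shown only to illustrate the expiry logic. The `clock` parameter is an assumption added so the behaviour can be checked without waiting.

```python
import time
from functools import wraps


def ttl_cached(ttl_seconds, clock=time.monotonic):
    """Cache a zero-argument function's result for ttl_seconds."""
    def decorator(func):
        state = {"value": None, "expires": None}

        @wraps(func)
        def wrapper():
            now = clock()
            if state["expires"] is None or now >= state["expires"]:
                state["value"] = func()  # Cache miss or expired entry: make the real call.
                state["expires"] = now + ttl_seconds
            return state["value"]

        return wrapper

    return decorator
```

Decorating a function that wraps a Describe* or List* call with `@ttl_cached(300)` would then issue at most one API request per five minutes per process.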

    Step 4: Set Up CloudWatch Alarms for Throttling

    Monitor and alert on throttling events before they impact production using CloudWatch metrics.

    bash
    # Example for Lambda. The throttling metric and namespace differ per service
    # (for example, ThrottledRequests lives in AWS/DynamoDB, not AWS/ApiGateway).
    aws cloudwatch put-metric-alarm \
        --alarm-name "API-Throttling-Alarm" \
        --metric-name "Throttles" \
        --namespace "AWS/Lambda" \
        --statistic "Sum" \
        --period 300 \
        --threshold 10 \
        --comparison-operator "GreaterThanThreshold" \
        --evaluation-periods 1 \
        --alarm-actions "arn:aws:sns:us-east-1:123456789012:OpsTeam"

    Architect's Pro Tip

    "Use the AWS Service Quotas API to programmatically monitor your usage against limits and trigger alerts at 80% utilization, not 100%."
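
A rough sketch of that 80% check, assuming the current usage figure is obtained separately (for example from CloudWatch usage metrics, which is not shown here). `quota_utilization` and `should_alert` are hypothetical helper names, and `client` is expected to behave like `boto3.client("service-quotas")`.

```python
def quota_utilization(client, service_code, quota_code, current_usage):
    """Return current_usage as a fraction of the applied quota value."""
    quota = client.get_service_quota(ServiceCode=service_code, QuotaCode=quota_code)
    limit = quota["Quota"]["Value"]
    if limit <= 0:
        raise ValueError("quota value must be positive to compute utilization")
    return current_usage / limit


def should_alert(utilization, threshold=0.8):
    """True once utilization reaches the alert threshold (80% by default)."""
    return utilization >= threshold
```

Alerting at a threshold below 100% leaves time to request an increase before requests start failing.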

    Frequently Asked Questions

    How long does it take AWS to process a limit increase request?

    Turnaround varies by service and by the size of the increase. Standard support requests typically take 24-48 hours. Business and Enterprise support can often expedite this to 2-12 hours for critical production issues.

    Can I bypass RateExceeded errors by switching AWS regions?

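    Not as a true bypass. Service limits are typically applied per region, so requests sent to another region count against that region's own limit. Distributing load across regions can therefore help if you are hitting a regional limit, but it does not raise the limit in any single region.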

    What's the difference between RateExceeded and ThrottlingException?

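    Both mean that AWS throttled the request. The exact error code depends on the service: ThrottlingException is common (e.g., DynamoDB), while other services return codes such as Throttling (often with the message "Rate exceeded"), RequestLimitExceeded (EC2), or TooManyRequestsException. Handle them all the same way: back off and retry.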
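
Because the code varies by service, retry logic usually checks the error code against a set rather than a single string. A minimal sketch follows, assuming a botocore-style error response dictionary (the shape found on `ClientError.response`). The set is an illustrative subset, not a complete list, and `is_throttling_error` is a hypothetical helper name.

```python
# Illustrative subset of throttling-related error codes; not a complete list.
THROTTLING_CODES = frozenset({
    "Throttling",
    "ThrottlingException",
    "ThrottledException",
    "RequestLimitExceeded",
    "TooManyRequestsException",
    "RequestThrottled",
    "ProvisionedThroughputExceededException",
})


def is_throttling_error(error_response):
    """Check a botocore-style error response dict for a throttling error code."""
    code = error_response.get("Error", {}).get("Code", "")
    return code in THROTTLING_CODES


# Usage with boto3 (not executed here):
#   except botocore.exceptions.ClientError as err:
#       if is_throttling_error(err.response):
#           ...  # back off and retry
```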
