How to Fix AWS General SDK ThrottlingException

Quick Fix Summary

TL;DR

Implement exponential backoff with jitter in your SDK client configuration.

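AWS services enforce API request rate limits, usually scoped per account and Region and, in some services, per resource or IAM principal. A ThrottlingException is returned when these limits are exceeded; depending on the service, it arrives as HTTP 400 or HTTP 429. The client should retry after a delay rather than immediately.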
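The backoff-with-jitter idea can be sketched in plain Python. This is an illustration of the technique under stated assumptions, not the SDK's internal implementation: the names `backoff_delay` and `call_with_backoff`, the `is_throttled` predicate, and the default `base`, `cap`, and `max_attempts` values are all hypothetical.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=20.0):
    """Return a 'full jitter' delay in seconds for a zero-based retry attempt."""
    # Exponential ceiling: base * 2^attempt, clamped to cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Pick uniformly below the ceiling so clients do not retry in lockstep.
    return random.uniform(0, ceiling)


def call_with_backoff(func, is_throttled, max_attempts=5, sleep=time.sleep):
    """Call func(); on a throttling error, wait with jittered backoff and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            # Re-raise anything that is not throttling, or when attempts run out.
            if not is_throttled(exc) or attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

In practice the SDK's built-in retry configuration is usually preferable; a hand-rolled loop like this is mainly useful for operations that sit outside the SDK's retry handling.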

Diagnosis & Causes

  • Exceeding service-specific API request rate limits.
  • Burst traffic without proper client-side throttling.
  • Shared service quotas across multiple applications.
  • Missing or inadequate SDK retry configuration.
  • Aggressive polling or tight loops in application code.
Recovery Steps

    Step 1: Configure SDK Retry Logic with Exponential Backoff

    The most critical fix is to configure your AWS SDK client to automatically retry throttled requests using an exponential backoff strategy with jitter to prevent synchronized retry storms.

    python
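    import boto3
    from botocore.config import Config

    # Retry throttled requests automatically; 'adaptive' mode adds
    # client-side rate limiting on top of the standard backoff behavior.
    config = Config(
        retries={
            'max_attempts': 10,
            'mode': 'adaptive',
        }
    )
    client = boto3.client('dynamodb', config=config)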

    Step 2: Implement Application-Level Throttling (Token Bucket)

    For high-throughput applications, proactively throttle your own request rate to stay below AWS limits, using a token bucket or leaky bucket algorithm.

    python
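    import asyncio
    import time

    class TokenBucket:
        """Async token bucket: refills at `rate` tokens/second up to `capacity`."""

        def __init__(self, rate, capacity):
            self.rate = rate  # tokens added per second
            self.capacity = capacity  # maximum burst size
            self.tokens = capacity  # start with a full bucket
            self.last_update = time.monotonic()  # unaffected by wall-clock changes

        async def consume(self, tokens=1):
            # A request larger than the bucket could never be satisfied.
            if tokens > self.capacity:
                raise ValueError("cannot consume more tokens than the bucket capacity")
            while True:
                # Refill based on the time elapsed since the last update.
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last_update) * self.rate)
                self.last_update = now
                if self.tokens >= tokens:
                    self.tokens -= tokens
                    return
                # Sleep just long enough for the missing tokens to accrue.
                await asyncio.sleep((tokens - self.tokens) / self.rate)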
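One way to put such a bucket in front of blocking SDK calls is a small wrapper, sketched below. The helper name `throttled_call` is hypothetical; it only assumes the bucket object exposes an async `consume()` method like the one above, and it relies on `asyncio.to_thread`, which requires Python 3.9 or later.

```python
import asyncio


async def throttled_call(bucket, func, *args, **kwargs):
    """Wait for one token from the bucket, then run a blocking call in a worker thread."""
    await bucket.consume()
    # asyncio.to_thread keeps the event loop responsive while func blocks on I/O.
    return await asyncio.to_thread(func, *args, **kwargs)
```

With a Boto3 client this might be used as `await throttled_call(bucket, client.describe_table, TableName="my-table")`, where the table name is a placeholder.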

    Step 3: Check and Request Service Quota Increases

    Identify the specific service and API operation being throttled via CloudWatch Metrics or the error message, then request a quota increase in the AWS Service Quotas console.

    bash
    # 1. Identify the throttled API from CloudTrail or error message.
    # 2. Navigate to AWS Service Quotas Console.
    # 3. Search for the service (e.g., 'Amazon DynamoDB').
    # 4. Find the specific quota (e.g., 'Read capacity units').
    # 5. Request an increase with a justified business case.
    # CLI: Request a quota increase (example for Lambda Concurrent Executions)
    aws service-quotas request-service-quota-increase --service-code lambda --quota-code L-B99A9384 --desired-value 1500
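Before requesting an increase, it can help to confirm the current value and whether the quota is adjustable at all. The sketch below assumes a Boto3 `service-quotas` client is passed in; the helper name `quota_summary` is hypothetical. If no applied value is available for a quota, `get_aws_default_service_quota` may be the call to use instead.

```python
def quota_summary(client, service_code, quota_code):
    """Return (current_value, adjustable) for a single quota."""
    response = client.get_service_quota(ServiceCode=service_code, QuotaCode=quota_code)
    quota = response["Quota"]
    return quota["Value"], quota["Adjustable"]


# Usage (requires AWS credentials, so it is left commented out):
# import boto3
# client = boto3.client("service-quotas")
# value, adjustable = quota_summary(client, "lambda", "L-B99A9384")
```

A quota reported as not adjustable cannot be raised through Service Quotas, so the remaining options are reducing or redistributing load.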

    Step 4: Distribute Load Across Partitions (Shard Keys/Accounts)

    For services like DynamoDB or Kinesis, distribute requests across multiple partition keys, streams, or even AWS accounts to avoid hitting per-partition or per-account limits.

    python
    # Example: Using a random partition key suffix for DynamoDB to avoid hot partitions.
    # Keep the number of suffixes small and fixed: a reader must query every suffix.
    import random

    NUM_SHARDS = 10  # illustrative value; size it to your write throughput
    base_partition_key = "USER#12345"
    suffix = random.randrange(NUM_SHARDS)
    distributed_partition_key = f"{base_partition_key}#{suffix}"
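Suffixing the key spreads writes, but reads then have to cover every suffix. The sketch below assumes a small, fixed number of numeric suffixes; `NUM_SHARDS`, `write_key`, and `read_keys` are hypothetical names. It also swaps the random suffix for one calculated from an item identifier, so a single item can be located without scanning all shards.

```python
import zlib

NUM_SHARDS = 10  # hypothetical shard count


def write_key(base_key, item_id, num_shards=NUM_SHARDS):
    """Derive a stable suffix from the item id so an item always maps to one shard."""
    shard = zlib.crc32(item_id.encode("utf-8")) % num_shards
    return f"{base_key}#{shard}"


def read_keys(base_key, num_shards=NUM_SHARDS):
    """List every suffixed key a reader must query to see all items."""
    return [f"{base_key}#{shard}" for shard in range(num_shards)]
```

A full read issues one query per key returned by `read_keys` and merges the results, which is the cost paid for the higher write throughput.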

    Architect's Pro Tip

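    "Enable 'adaptive' retry mode in the SDK. On top of standard exponential backoff, it adds client-side rate limiting that adjusts the send rate as throttling responses arrive, which can help during sustained throttling. The Boto3 documentation describes this mode as experimental, so test it before relying on it in production."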

    Frequently Asked Questions

    What's the difference between a ThrottlingException and a ProvisionedThroughputExceededException?

    ThrottlingException is a general API rate-limit error used by many AWS services. ProvisionedThroughputExceededException is raised by DynamoDB (and by Kinesis Data Streams) when requests exceed the provisioned read/write capacity of a table, index, or shard.

    Will increasing service quotas stop all throttling?

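    No. Some limits cannot be raised through Service Quotas (e.g., Amazon S3 request rates of about 5,500 GET/HEAD requests per second per prefix). For these, you must design your application to distribute requests (e.g., by spreading objects across multiple prefixes).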

    How do I find which service is throttling me?

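    Check the error code and operation name reported with the exception (in Boto3, a `ClientError` exposes them as `response['Error']['Code']` and `operation_name`), or look for throttling metrics such as `ThrottledRequests` (DynamoDB) or `Throttles` (Lambda) in the suspected service's CloudWatch namespace.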
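A small helper can pull those details out of a caught exception. This is a sketch: `describe_throttle` is a hypothetical name, and `THROTTLING_CODES` is an illustrative, non-exhaustive set of error codes. It relies only on the `response` and `operation_name` attributes that botocore's `ClientError` carries.

```python
# Illustrative, non-exhaustive set of error codes that indicate throttling.
THROTTLING_CODES = {
    "Throttling",
    "ThrottlingException",
    "ThrottledException",
    "RequestThrottledException",
    "TooManyRequestsException",
    "ProvisionedThroughputExceededException",
    "RequestLimitExceeded",
    "SlowDown",
}


def describe_throttle(exc):
    """Return (error_code, operation_name) if exc looks like throttling, else None."""
    response = getattr(exc, "response", None) or {}
    code = response.get("Error", {}).get("Code")
    if code in THROTTLING_CODES:
        return code, getattr(exc, "operation_name", None)
    return None


# Usage (requires AWS credentials, so it is left commented out):
# try:
#     client.describe_table(TableName="my-table")  # placeholder table name
# except botocore.exceptions.ClientError as exc:
#     print(describe_throttle(exc))
```

Logging the returned pair alongside the client's service name makes it easier to match throttling events to a specific quota.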
