
How to Fix AWS Application Load Balancer 504 Gateway Timeout Error

Quick Fix Summary

TL;DR

Increase the ALB idle timeout and verify your target's response time is under this limit.

A 504 Gateway Timeout from an AWS Application Load Balancer (ALB) indicates the ALB did not receive a response from its registered target (e.g., EC2, Lambda, ECS) within its configured idle timeout. From the ALB's point of view it is the client: it forwarded the request, waited the full idle timeout, and gave up before the target answered. The target may still be processing the request; the ALB simply stops waiting.
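
To confirm the symptom end to end, time a request against the ALB yourself. A minimal sketch; the hostname and path below are hypothetical placeholders for your ALB's DNS name and a slow endpoint:

bash
# Print the HTTP status code and total request time in seconds.
# A 504 arriving at roughly the configured idle timeout (default 60s) matches the ALB giving up on the target.
curl -s -o /dev/null -w "status=%{http_code} total_time=%{time_total}s\n" "https://your-alb-dns-name.example.com/slow-endpoint"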

Diagnosis & Causes

  • Application response time exceeds ALB idle timeout (default 60s).
  • Target instance is under high CPU/Memory load.
  • Network connectivity issues between ALB and targets.
  • Application is stuck in a long-running process.
  • Lambda function timeout exceeds ALB timeout.
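
If ALB access logging is enabled, the per-request entries help separate these causes: a 504 where `target_processing_time` is `-1` generally means the target never completed a response. A minimal sketch over a log file already downloaded from the access-log S3 bucket (the file name is hypothetical; verify the field positions against the ALB access log documentation for your log format version):

bash
# elb_status_code is the 9th space-separated field; target_processing_time is the 7th.
# Print the timestamp, target address, and target processing time for requests the ALB answered with 504.
awk '$9 == 504 {print $2, $5, $7}' alb-access-log.log
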
Recovery Steps

    Step 1: Check ALB & Target Group Metrics in CloudWatch

    Identify whether the timeout originates at the ALB or at the target. A high `TargetResponseTime` together with a rising `HTTPCode_ELB_5XX_Count` confirms that targets are responding too slowly.

    bash
    aws cloudwatch get-metric-statistics --namespace AWS/ApplicationELB --metric-name TargetResponseTime --statistics Average --period 300 --start-time $(date -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) --end-time $(date +%Y-%m-%dT%H:%M:%SZ) --dimensions Name=LoadBalancer,Value=app/your-alb-name/1234567890abcdef
    aws cloudwatch get-metric-statistics --namespace AWS/ApplicationELB --metric-name HTTPCode_ELB_5XX_Count --statistics Sum --period 300 --start-time $(date -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) --end-time $(date +%Y-%m-%dT%H:%M:%SZ) --dimensions Name=LoadBalancer,Value=app/your-alb-name/1234567890abcdef

    Step 2: Increase the ALB Idle Timeout

    The default idle timeout is 60 seconds. Increase it if your application legitimately needs longer to respond. The maximum is 4000 seconds.

    bash
    aws elbv2 modify-load-balancer-attributes --load-balancer-arn arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/your-alb-name/1234567890abcdef --attributes Key=idle_timeout.timeout_seconds,Value=120
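
    As a quick check, read the attribute back after the change (same placeholder ARN as above):

    bash
    aws elbv2 describe-load-balancer-attributes --load-balancer-arn arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/your-alb-name/1234567890abcdef --query "Attributes[?Key=='idle_timeout.timeout_seconds']"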

    Step 3: Adjust the Target Group Deregistration Delay

    When a target is deregistered (for example during a scale-in event or a deployment), the ALB stops sending it new requests but lets in-flight requests finish during the deregistration delay. Setting `deregistration_delay.timeout_seconds` too low cuts those requests off early, which can surface as 504s during scaling events.

    bash
    aws elbv2 modify-target-group-attributes --target-group-arn arn:aws:elasticloadbalancing:region:account-id:targetgroup/your-tg-name/abcdef1234567890 --attributes Key=deregistration_delay.timeout_seconds,Value=60
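
    To see the current value before changing it (same placeholder target group ARN as above):

    bash
    aws elbv2 describe-target-group-attributes --target-group-arn arn:aws:elasticloadbalancing:region:account-id:targetgroup/your-tg-name/abcdef1234567890 --query "Attributes[?Key=='deregistration_delay.timeout_seconds']"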

    Step 4: Profile Your Application's Response Time

    Use application profiling (e.g., X-Ray, custom logging) to find slow database queries, external API calls, or code paths causing the delay.

    javascript
    // The ALB adds an X-Amzn-Trace-Id header to each request, which you can correlate with X-Ray traces.
    // Sample: logging per-request response time in a Node.js (Express) app
    app.use((req, res, next) => {
      const startHrTime = process.hrtime();
      res.on('finish', () => {
        const elapsedHrTime = process.hrtime(startHrTime);
        const elapsedTimeInMs = elapsedHrTime[0] * 1000 + elapsedHrTime[1] / 1e6;
        console.log(`Request Path: ${req.path}, Method: ${req.method}, Response Time: ${elapsedTimeInMs}ms`);
      });
      next();
    });
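
    ALB access logs complement in-app timing because they record a `target_processing_time` for every request, even ones your application never logs. A sketch of enabling them via the CLI, assuming an S3 bucket (the bucket name is a hypothetical placeholder) whose bucket policy already allows ALB log delivery:

    bash
    aws elbv2 modify-load-balancer-attributes --load-balancer-arn arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/your-alb-name/1234567890abcdef --attributes Key=access_logs.s3.enabled,Value=true Key=access_logs.s3.bucket,Value=your-alb-logs-bucket Key=access_logs.s3.prefix,Value=alb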

    Step 5: Verify Target Health & Capacity

    Ensure your backend instances (EC2, ECS tasks) have sufficient CPU and memory and are passing health checks. An unhealthy or overloaded target that still receives requests is likely to time out.

    bash
    aws elbv2 describe-target-health --target-group-arn arn:aws:elasticloadbalancing:region:account-id:targetgroup/your-tg-name/abcdef1234567890
    # Check EC2 Instance Metrics
    aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization --dimensions Name=InstanceId,Value=i-1234567890abcdef0 --start-time $(date -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) --end-time $(date +%Y-%m-%dT%H:%M:%SZ) --period 300 --statistics Average

    Step 6: For Lambda Targets: Align Timeouts

    If you use Lambda as a target, the ALB idle timeout must be greater than the Lambda function timeout; otherwise the ALB gives up and returns a 504 before the function has a chance to respond or fail cleanly.

    bash
    # 1. Set the Lambda function timeout (e.g., 30s) below the ALB idle timeout (e.g., 120s).
    aws lambda update-function-configuration --function-name your-function-name --timeout 30
    # 2. Confirm the configured timeout and memory; an out-of-memory kill also looks to the ALB like an unanswered request.
    aws lambda get-function-configuration --function-name your-function-name --query '[Timeout,MemorySize]'

    Architect's Pro Tip

    "Monitor the `UnHealthyHostCount` metric. A rising count with 504s often points to a backend capacity issue, not just a slow request."

    Frequently Asked Questions

    What's the difference between an ALB 504 and a 502 error?

    A 504 (Gateway Timeout) means the ALB gave up waiting for your target. A 502 (Bad Gateway) means the target responded, but with an invalid or malformed HTTP response that the ALB could not process.

    Can I set the ALB idle timeout to 0?

    No. The valid range for the ALB idle timeout is 1-4000 seconds; it cannot be set to 0 or disabled. If clients hold long-lived connections, raise the timeout and make sure some data is exchanged before it elapses.

    My application is fast, but I still get sporadic 504s. Why?

    This is often due to garbage collection pauses (Java/Python), cold starts (Lambda, containers), or TCP connection exhaustion between the ALB and your targets. Enable ALB access logs to see the `target_processing_time` field for outliers.
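
    A minimal sketch for spotting those outliers in a downloaded access log file (hypothetical file name; `target_processing_time` is the 7th space-separated field in the documented ALB access log format):

    bash
    # Show the 20 slowest requests by target processing time (seconds).
    sort -k7,7 -rn alb-access-log.log | head -20 | awk '{print $2, $5, $7}'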
