
How to Fix AWS Application Load Balancer 504 Gateway Timeout Error

Quick Fix Summary

TL;DR

Increase the ALB idle timeout and verify your target's response time is under this limit.

A 504 Gateway Timeout from an AWS Application Load Balancer (ALB) indicates the ALB did not receive a response from its registered target (e.g., EC2, Lambda, ECS) within its configured idle timeout. From the ALB's point of view it is the client: it forwarded the request, waited the full idle timeout, and gave up before the target answered. The target may still be processing the request; the ALB simply stops waiting.
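
To confirm the symptom end to end, time a request against the ALB yourself. A minimal sketch; the hostname and path below are hypothetical placeholders for your ALB's DNS name and a slow endpoint:

bash
# Print the HTTP status code and total request time in seconds.
# A 504 arriving at roughly the configured idle timeout (default 60s) matches the ALB giving up on the target.
curl -s -o /dev/null -w "status=%{http_code} total_time=%{time_total}s\n" "https://your-alb-dns-name.example.com/slow-endpoint"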

Diagnosis & Causes

  • Application response time exceeds ALB idle timeout (default 60s).
  • Target instance is under high CPU/Memory load.
  • Network connectivity issues between ALB and targets.
  • Application is stuck in a long-running process.
  • Lambda function timeout exceeds ALB timeout.
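
If ALB access logging is enabled, the per-request entries help separate these causes: a 504 where `target_processing_time` is `-1` generally means the target never completed a response. A minimal sketch over a log file already downloaded from the access-log S3 bucket (the file name is hypothetical; verify the field positions against the ALB access log documentation for your log format version):

bash
# elb_status_code is the 9th space-separated field; target_processing_time is the 7th.
# Print the timestamp, target address, and target processing time for requests the ALB answered with 504.
awk '$9 == 504 {print $2, $5, $7}' alb-access-log.log
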
Recovery Steps

    Step 1: Check ALB & Target Group Metrics in CloudWatch

    Identify whether the timeout originates at the ALB or at the target. A high `TargetResponseTime` together with a rising `HTTPCode_ELB_5XX_Count` confirms that targets are responding too slowly.

    bash
    aws cloudwatch get-metric-statistics --namespace AWS/ApplicationELB --metric-name TargetResponseTime --statistics Average --period 300 --start-time $(date -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) --end-time $(date +%Y-%m-%dT%H:%M:%SZ) --dimensions Name=LoadBalancer,Value=app/your-alb-name/1234567890abcdef
    aws cloudwatch get-metric-statistics --namespace AWS/ApplicationELB --metric-name HTTPCode_ELB_5XX_Count --statistics Sum --period 300 --start-time $(date -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) --end-time $(date +%Y-%m-%dT%H:%M:%SZ) --dimensions Name=LoadBalancer,Value=app/your-alb-name/1234567890abcdef

    Step 2: Increase the ALB Idle Timeout

    The default idle timeout is 60 seconds. Increase it if your application legitimately needs longer to respond. The maximum is 4000 seconds.

    bash
    aws elbv2 modify-load-balancer-attributes --load-balancer-arn arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/your-alb-name/1234567890abcdef --attributes Key=idle_timeout.timeout_seconds,Value=120
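
    As a quick check, read the attribute back after the change (same placeholder ARN as above):

    bash
    aws elbv2 describe-load-balancer-attributes --load-balancer-arn arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/your-alb-name/1234567890abcdef --query "Attributes[?Key=='idle_timeout.timeout_seconds']"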

    Step 3: Adjust the Target Group Deregistration Delay

    When a target is deregistered (for example during a scale-in event or a deployment), the ALB stops sending it new requests but lets in-flight requests finish during the deregistration delay. Setting `deregistration_delay.timeout_seconds` too low cuts those requests off early, which can surface as 504s during scaling events.

    bash
    aws elbv2 modify-target-group-attributes --target-group-arn arn:aws:elasticloadbalancing:region:account-id:targetgroup/your-tg-name/abcdef1234567890 --attributes Key=deregistration_delay.timeout_seconds,Value=60
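
    To see the current value before changing it (same placeholder target group ARN as above):

    bash
    aws elbv2 describe-target-group-attributes --target-group-arn arn:aws:elasticloadbalancing:region:account-id:targetgroup/your-tg-name/abcdef1234567890 --query "Attributes[?Key=='deregistration_delay.timeout_seconds']"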

    Step 4: Profile Your Application's Response Time

    Use application profiling (e.g., X-Ray, custom logging) to find slow database queries, external API calls, or code paths causing the delay.

    javascript
    // The ALB adds an X-Amzn-Trace-Id header to each request, which you can correlate with X-Ray traces.
    // Sample: logging per-request response time in a Node.js (Express) app
    app.use((req, res, next) => {
      const startHrTime = process.hrtime();
      res.on('finish', () => {
        const elapsedHrTime = process.hrtime(startHrTime);
        const elapsedTimeInMs = elapsedHrTime[0] * 1000 + elapsedHrTime[1] / 1e6;
        console.log(`Request Path: ${req.path}, Method: ${req.method}, Response Time: ${elapsedTimeInMs}ms`);
      });
      next();
    });
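
    ALB access logs complement in-app timing because they record a `target_processing_time` for every request, even ones your application never logs. A sketch of enabling them via the CLI, assuming an S3 bucket (the bucket name is a hypothetical placeholder) whose bucket policy already allows ALB log delivery:

    bash
    aws elbv2 modify-load-balancer-attributes --load-balancer-arn arn:aws:elasticloadbalancing:region:account-id:loadbalancer/app/your-alb-name/1234567890abcdef --attributes Key=access_logs.s3.enabled,Value=true Key=access_logs.s3.bucket,Value=your-alb-logs-bucket Key=access_logs.s3.prefix,Value=alb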

    Step 5: Verify Target Health & Capacity

    Ensure your backend instances (EC2, ECS tasks) have sufficient CPU and memory and are passing health checks. An unhealthy or overloaded target that still receives requests is likely to time out.

    bash
    aws elbv2 describe-target-health --target-group-arn arn:aws:elasticloadbalancing:region:account-id:targetgroup/your-tg-name/abcdef1234567890
    # Check EC2 Instance Metrics
    aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization --dimensions Name=InstanceId,Value=i-1234567890abcdef0 --start-time $(date -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) --end-time $(date +%Y-%m-%dT%H:%M:%SZ) --period 300 --statistics Average

    Step 6: For Lambda Targets: Align Timeouts

    If you use Lambda as a target, the ALB idle timeout must be greater than the Lambda function timeout; otherwise the ALB gives up and returns a 504 before the function has a chance to respond or fail cleanly.

    bash
    # 1. Set the Lambda function timeout (e.g., 30s) below the ALB idle timeout (e.g., 120s).
    aws lambda update-function-configuration --function-name your-function-name --timeout 30
    # 2. Confirm the configured timeout and memory; an out-of-memory kill also looks to the ALB like an unanswered request.
    aws lambda get-function-configuration --function-name your-function-name --query '[Timeout,MemorySize]'

    Architect's Pro Tip

    "Monitor the `UnHealthyHostCount` metric. A rising count with 504s often points to a backend capacity issue, not just a slow request."

    Frequently Asked Questions

    What's the difference between an ALB 504 and a 502 error?

    A 504 (Gateway Timeout) means the ALB gave up waiting for your target. A 502 (Bad Gateway) means the target responded, but with an invalid or malformed HTTP response that the ALB could not process.

    Can I set the ALB idle timeout to 0?

    No. The valid range for the ALB idle timeout is 1-4000 seconds; it cannot be set to 0 or disabled. If clients hold long-lived connections, raise the timeout and make sure some data is exchanged before it elapses.

    My application is fast, but I still get sporadic 504s. Why?

    This is often due to garbage collection pauses (Java/Python), cold starts (Lambda, containers), or TCP connection exhaustion between the ALB and your targets. Enable ALB access logs to see the `target_processing_time` field for outliers.
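
    A minimal sketch for spotting those outliers in a downloaded access log file (hypothetical file name; `target_processing_time` is the 7th space-separated field in the documented ALB access log format):

    bash
    # Show the 20 slowest requests by target processing time (seconds).
    sort -k7,7 -rn alb-access-log.log | head -20 | awk '{print $2, $5, $7}'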
