CRITICAL

Redis Sentinel Failing Health Checks: Troubleshooting the CLUSTERDOWN Error

Quick Fix Summary

TL;DR

Confirm that a majority of Sentinel nodes are reachable and can communicate with one another. If they cannot, restore connectivity or restart the Sentinel service on enough nodes to regain a majority.

The CLUSTERDOWN error occurs when Redis Sentinel cannot achieve a quorum to perform failover operations, often due to network partitions, misconfiguration, or insufficient healthy Sentinel instances.

Diagnosis & Causes

Network connectivity issues between Sentinel nodes

Insufficient number of healthy Sentinel instances to form a quorum

Recovery Steps

Step 1: Verify Sentinel Cluster State and Quorum

Check the status of all Sentinel instances to see which are reachable and confirm the current master.

bash

redis-cli -p 26379 sentinel masters
redis-cli -p 26379 sentinel sentinels <master-name>
redis-cli -p 26379 sentinel get-master-addr-by-name <master-name>
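
When piped, `redis-cli` prints the reply of `sentinel master <master-name>` as alternating lines: a field name, then its value. The helper below is a minimal sketch for pulling one field out of that flat reply. The function name `sentinel_field` is an assumption for illustration, and the sketch assumes a single-master reply where keys sit on odd-numbered lines.

```shell
# Print the value that follows a given key in redis-cli's flat reply
# (field name on one line, its value on the next). Reads from stdin.
sentinel_field() {
  awk -v key="$1" 'NR % 2 == 1 && $0 == key { getline; print; exit }'
}

# Example usage (run against a live Sentinel):
#   redis-cli -p 26379 sentinel master <master-name> | sentinel_field quorum
#   redis-cli -p 26379 sentinel master <master-name> | sentinel_field num-other-sentinels
#   redis-cli -p 26379 sentinel master <master-name> | sentinel_field flags
```

Comparing `num-other-sentinels` plus one against `quorum` gives a quick read on whether this Sentinel knows about enough peers.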

Step 2: Check Sentinel and Redis Logs for Errors

Examine logs for connection failures, vote disagreements, or configuration errors.

bash

sudo journalctl -u redis-sentinel --since "1 hour ago"
sudo tail -f /var/log/redis/sentinel.log
sudo grep -E "(failover|vote|quorum|down)" /var/log/redis/sentinel.log
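
Sentinel logs state changes as named events such as `+sdown`, `+odown`, `+try-failover`, and `-failover-abort-not-elected`. The sketch below tallies those events so that repeated aborted failovers stand out. The function name `summarize_sentinel_events` is hypothetical, and the event list is a selection rather than a complete catalogue.

```shell
# Count Sentinel state-change events in log text read from stdin.
# Prints one "event count" pair per line.
summarize_sentinel_events() {
  grep -oE '[+-](sdown|odown|try-failover|vote-for-leader|elected-leader|failover-abort-[a-z-]+|switch-master)' \
    | sort | uniq -c | awk '{ print $2, $1 }'
}

# Example usage:
#   sudo cat /var/log/redis/sentinel.log | summarize_sentinel_events
```

Many `+try-failover` entries with no `+switch-master` suggest that failovers start but never gain authorization.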

Step 3: Validate Network Connectivity Between Sentinels

Ensure all Sentinel nodes can communicate on their configured ports (default 26379).

bash

for ip in $(redis-cli -p 26379 sentinel sentinels <master-name> | awk '/^ip$/ {getline; print}'); do nc -zv "$ip" 26379; done
sudo ss -tlnp | grep 26379
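
Peers do not always listen on the default port, so it can help to probe each Sentinel on the port it actually announced. The following is a sketch under assumptions: the function names `sentinel_peers` and `probe_peers` are made up for illustration, and the `-z` and `-w` flags assume an OpenBSD-style `nc`.

```shell
# Turn the flat reply of `sentinel sentinels <master-name>` (read from stdin)
# into one "ip port" pair per peer Sentinel.
sentinel_peers() {
  awk '
    $0 == "ip"   { getline peer_ip }
    $0 == "port" { getline peer_port; print peer_ip, peer_port }
  '
}

# Probe each "ip port" pair from stdin with a 2-second connect timeout.
probe_peers() {
  while read -r peer_ip peer_port; do
    if nc -z -w 2 "$peer_ip" "$peer_port" 2>/dev/null; then
      echo "reachable   $peer_ip:$peer_port"
    else
      echo "UNREACHABLE $peer_ip:$peer_port"
    fi
  done
}

# Example usage:
#   redis-cli -p 26379 sentinel sentinels <master-name> | sentinel_peers | probe_peers
```

Run the probe from every Sentinel host, since a partition can block traffic in one direction only.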

Step 4: Confirm Sentinel Configuration and Quorum Settings

Verify that the `sentinel monitor` directive, whose final argument is the quorum, is consistent across all Sentinel configs.

bash

sudo grep -E "^sentinel (monitor|down-after-milliseconds)" /etc/redis/sentinel.conf
cat /etc/redis/sentinel.conf
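
Because the quorum is the last argument of `sentinel monitor <name> <ip> <port> <quorum>`, reducing each config to that one line makes mismatches easy to spot. This is a minimal sketch; the function name `monitor_settings` is an assumption for illustration.

```shell
# Print "name ip port quorum" for every `sentinel monitor` line
# in the given config file(s).
monitor_settings() {
  awk '$1 == "sentinel" && $2 == "monitor" { print $3, $4, $5, $6 }' "$@"
}

# Example usage (run on each Sentinel host and compare the output):
#   sudo cat /etc/redis/sentinel.conf | monitor_settings
```

The name and quorum should match on every node. The ip and port can legitimately differ for a short time after a failover, because Sentinel rewrites its own config.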

Step 5: Force a Sentinel Failover if Quorum is Achievable

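If a quorum of Sentinels is reachable but the cluster is stuck, manually trigger a failover from one of the reachable Sentinels. The command promotes a replica as though the master were unreachable, so run it only when you intend to replace the current master.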

bash

redis-cli -p 26379 sentinel failover <master-name>
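
Before forcing a failover, `sentinel ckquorum <master-name>` can confirm that this Sentinel sees enough peers to reach the quorum and to authorize a failover. The guard below is a sketch: the function name `quorum_ok` is hypothetical, and it assumes a healthy reply begins with `OK` while a failing one begins with an error code such as `NOQUORUM`.

```shell
# Succeed only when the first line of a `sentinel ckquorum` reply
# (read from stdin) begins with "OK".
quorum_ok() {
  first_line=$(head -n 1)
  case "$first_line" in
    OK*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Example usage:
#   if redis-cli -p 26379 sentinel ckquorum <master-name> | quorum_ok; then
#     redis-cli -p 26379 sentinel failover <master-name>
#   else
#     echo "quorum not reachable; fix connectivity first" >&2
#   fi
```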

Step 6: Restart Sentinel Services to Clear State

Gracefully restart Sentinel instances one at a time, starting with the one that can see the current master, and confirm each is running again before moving to the next.

bash

sudo systemctl restart redis-sentinel
sudo systemctl status redis-sentinel
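
For a rolling restart, it helps to wait until the restarted Sentinel answers `PING` before touching the next node. The generic helper below is a sketch; `retry` is a hypothetical name, and the usage assumes `redis-cli` exits non-zero when it cannot connect.

```shell
# Run a command up to N times, sleeping DELAY seconds between attempts.
# Usage: retry N DELAY command [args...]
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_i=1
  while [ "$retry_i" -le "$retry_attempts" ]; do
    "$@" && return 0
    retry_i=$((retry_i + 1))
    sleep "$retry_delay"
  done
  return 1
}

# Example usage:
#   sudo systemctl restart redis-sentinel
#   retry 10 1 redis-cli -p 26379 ping || echo "Sentinel did not come back" >&2
```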

Step 7: Check Underlying Redis Master/Slave Health

Ensure the Redis instances being monitored are themselves healthy and replicating.

bash

redis-cli -h <master-ip> -p 6379 info replication
redis-cli -h <slave-ip> -p 6379 info replication
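
The fields that matter most in `info replication` are `role` and `connected_slaves` on the master, and `master_link_status` on each replica. Redis terminates `INFO` lines with a carriage return and line feed, which can break naive string comparisons in scripts. The sketch below strips the carriage returns first; the function name `info_field` is an assumption for illustration.

```shell
# Print the value of one "key:value" field from INFO output read on stdin.
# Carriage returns are removed so the value compares cleanly.
info_field() {
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# Example usage:
#   redis-cli -h <master-ip> -p 6379 info replication | info_field connected_slaves
#   redis-cli -h <slave-ip> -p 6379 info replication | info_field master_link_status
```

A replica that reports `master_link_status` as `down` is not a good failover candidate until it reconnects.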

Architect's Pro Tip

"A split-brain scenario where two subsets of Sentinels each elect a different master is a common root cause. Always compare the master address (the `ip` and `port` fields from `sentinel masters`) on ALL Sentinel nodes to ensure consensus."

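The consensus check described in the tip can be scripted by collecting each Sentinel's answer and testing whether they are all identical. This is a minimal sketch; the function name `masters_agree` and the host names in the usage comment are hypothetical.

```shell
# Read one "ip port" answer per line from stdin.
# Succeed only if at least one answer is present and all are identical.
masters_agree() {
  [ "$(sort -u | grep -c .)" -eq 1 ]
}

# Example usage (sentinel-a, sentinel-b, sentinel-c are placeholder hosts):
#   for s in sentinel-a sentinel-b sentinel-c; do
#     redis-cli -h "$s" -p 26379 sentinel get-master-addr-by-name <master-name> | paste -sd' ' -
#   done | masters_agree || echo "Sentinels disagree on the master" >&2
```
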
Frequently Asked Questions

How many Sentinel nodes do I need to avoid CLUSTERDOWN?

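You need enough healthy Sentinels to reach the configured quorum, which marks the master as down, plus a majority of all known Sentinels to authorize the failover. The quorum is typically set to a majority: for 3 nodes, quorum is 2; for 5 nodes, quorum is 3. Always deploy an odd number (3, 5) to avoid ties.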
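
The arithmetic behind those numbers is integer division: a majority of n is n/2 + 1, and the number of Sentinels that can fail is whatever remains. The helper names below are assumptions for illustration.

```shell
# Smallest number of Sentinels that forms a majority of N.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# How many of N Sentinels can fail while a majority survives.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Example usage:
#   majority 5            # prints 3
#   tolerated_failures 4  # prints 1
```

Four Sentinels tolerate one failure, the same as three, which is why the extra even node adds cost without adding resilience.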

Can I temporarily fix this by restarting just one Sentinel?

No. Restarting a single Sentinel often won't resolve a quorum issue. You must restore connectivity or restart enough Sentinels to re-establish a majority quorum.

Related Redis Guides

MISCONF / NOREPLICAS

Root Cause Analysis: Why Redis Cluster Fails Over in High Throughput Scenarios

How to Fix Redis MISCONF: Persistence Save Failed

How to Fix Redis NOAUTH Authentication Required