What Does “Cluster Load is Balanced” Really Mean?
When you see the message "Cluster Load is Balanced" in YugabyteDB, it means that the cluster balancer is idle: it has not needed to move any data or tablet leaders recently, and it has not run into any balancing problems. It does NOT guarantee that every node is healthy or that load is perfectly even. The message reflects whether the balancer is busy or blocked, not the overall health of the cluster.
What Happens When a Node is Stopped?
Less than 15 minutes after stopping a node (by default):
- The system quickly moves tablet leadership (the leader replica of each affected tablet) off the stopped node so that reads and writes can continue.
- For a while after a node goes down, the system still "remembers" it as part of the group, so no warnings are shown.
- During this period, the "Cluster Load is Balanced" message appears.
After about 15 minutes (by default):
- The system considers the stopped node as unavailable for storing data.
- If the cluster still has enough healthy nodes, the rebalancer begins re-replicating data and shifting leadership roles to the remaining nodes as needed.
- The message changes to "Cluster Load is Not Balanced" while the cluster is actively rebalancing or if there’s a problem (such as not enough nodes/disk space for the replicas).
Note: The timeout used to decide when a stopped node is considered failed is set by follower_unavailable_considered_failed_sec (default: 900 seconds, i.e. 15 minutes). If you have changed this value, the transition happens after your configured interval instead.
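The timeline above can be sketched as a small calculation. This is only an illustrative model, not YugabyteDB code: the function names and the phase labels are made up for this example, and the only value taken from the article is the 15-minute default of follower_unavailable_considered_failed_sec.

```python
from datetime import datetime, timedelta

# Default from the note above: 15 minutes, expressed in seconds.
DEFAULT_FOLLOWER_UNAVAILABLE_CONSIDERED_FAILED_SEC = 15 * 60


def considered_failed_at(stopped_at, timeout_sec=DEFAULT_FOLLOWER_UNAVAILABLE_CONSIDERED_FAILED_SEC):
    """Return the time at which a stopped node would be treated as failed.

    Hypothetical helper: it only adds the configured timeout to the stop time.
    """
    return stopped_at + timedelta(seconds=timeout_sec)


def expected_phase(stopped_at, now, timeout_sec=DEFAULT_FOLLOWER_UNAVAILABLE_CONSIDERED_FAILED_SEC):
    """Describe which phase of the timeline applies at time `now`.

    The labels are illustrative and are not messages emitted by YugabyteDB.
    """
    if now < considered_failed_at(stopped_at, timeout_sec):
        # Leaders have moved off the node, but its data is not yet re-replicated.
        return "grace period"
    # The node no longer counts as a replica holder; rebalancing may begin.
    return "considered failed"


if __name__ == "__main__":
    stopped = datetime(2024, 1, 1, 12, 0, 0)
    print(considered_failed_at(stopped))
    print(expected_phase(stopped, datetime(2024, 1, 1, 12, 10, 0)))
    print(expected_phase(stopped, datetime(2024, 1, 1, 12, 20, 0)))
```

If you have lowered or raised the flag, pass your own value as timeout_sec to see when the "Not Balanced" phase could start for your cluster.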
What Does "Cluster Load is Not Balanced" Mean?
- “Not Balanced” does NOT always mean there is a serious problem.
- Most of the time, it simply means the cluster data is not yet evenly distributed across all available nodes (for example, after scaling out or in, or after node restarts/replacements).
- During these times, the load balancer actively moves data and leadership roles around to restore balance. This state is normal and typically temporary. It may clear up in seconds, or it could last much longer (such as when rebalancing terabytes of data).
- “Not Balanced” should only be considered a critical issue if the cluster is unable to maintain the desired level of replication, such as when all nodes in a zone are down, disks are full, or nodes are overloaded.
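The points above amount to a simple triage rule, sketched below. Everything in this snippet is an assumption made for illustration: the function, its inputs, and the one-hour threshold are hypothetical and do not correspond to any YugabyteDB API or setting. You would supply the inputs yourself from whatever monitoring you already use.

```python
def triage_not_balanced(duration_sec, has_replication_warnings, expected_settle_sec=3600):
    """Suggest how to treat a "Not Balanced" state.

    duration_sec: how long the state has lasted (observed by the operator).
    has_replication_warnings: True if there are warnings about
        under-replication or unavailable data.
    expected_settle_sec: hypothetical threshold for "this has persisted";
        large data moves may legitimately need a much higher value.
    """
    if has_replication_warnings and duration_sec > expected_settle_sec:
        # Persistent and accompanied by replication warnings: likely a real problem.
        return "investigate"
    if has_replication_warnings:
        # Warnings present but the state is recent: it may still resolve on its own.
        return "watch closely"
    # No replication warnings: normal redistribution, however long it takes.
    return "expected"


if __name__ == "__main__":
    print(triage_not_balanced(120, False))
    print(triage_not_balanced(120, True))
    print(triage_not_balanced(7200, True))
```

The threshold is deliberately a parameter, because an acceptable rebalancing time depends on cluster size and data volume.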
In short: The “Cluster Load is Balanced” message just means the rebalance process is not currently running and no problems were detected recently. When you see “Not Balanced”, it’s usually a temporary and expected part of operations, unless it persists and is accompanied by warnings about under-replication or unavailable data.
Tip: If you are testing or scaling your cluster, expect to see “Not Balanced” for a while as the system redistributes data. The duration depends on your cluster size and data volume.
Reference ID: SUPPORT-858