Percona/Galera clusters rely on quorum to maintain data consistency and high availability. For a cluster to function correctly and tolerate node failures, a minimum of three nodes is recommended. Losing quorum, meaning a majority of nodes cannot see each other or agree on the cluster state, prevents the cluster from processing writes and can leave nodes in an 'Initialized' or non-primary state.
This article provides a step-by-step guide to safely restoring your cluster to an operational state after quorum is lost, including best practices to prevent data inconsistency.
Understanding Quorum Loss in Percona/Galera Clusters
A typical Percona XtraDB/Galera Cluster consists of three or more nodes. Quorum is achieved when more than half of the nodes are communicating and agree on the cluster state. If a node fails and the remaining nodes cannot form a majority (for example, if the network is partitioned or nodes are restarted incorrectly), the cluster will refuse to process transactions to protect your data.
In this scenario, you may see two or more nodes running, but none is in the 'Primary' state; instead, they remain 'Initialized' or 'Non-Primary', and no writes are possible. This is a protective measure against split-brain and data corruption.
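You can confirm this state from the MySQL console on any running node. The key indicator is wsrep_cluster_status reporting something other than 'Primary' (the variable name is standard; the exact value and error text vary by version):
SHOW STATUS LIKE 'wsrep_cluster_status';
-- On a node that has lost quorum this typically returns 'non-Primary',
-- and write attempts fail with an error such as
-- "WSREP has not yet prepared node for application use".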
Recommended Recovery Procedure
1. Identify the Most Advanced Node
Before restarting the cluster, it's crucial to determine which remaining node has the most up-to-date data. Restarting the cluster from an out-of-date node can cause data loss. Percona and MariaDB recommend checking the following on each node:
- grastate.dat: Located in the MySQL data directory, this file records the last known cluster state. The node with the highest seqno (sequence number) is usually the most advanced.
- wsrep_recover: Running mysqld --wsrep_recover will print the recovered position to the error log, helping you identify the best node for bootstrapping.
Example: Checking grastate.dat
cat /var/lib/mysql/grastate.dat
Look for the seqno: line; the node with the highest value is the safest candidate.
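For reference, a typical grastate.dat looks like the following; the uuid and seqno values here are purely illustrative:
# GALERA saved state
version: 2.1
uuid:    6b1f5c0e-32a9-11ec-8d3d-0242ac130003
seqno:   1932
safe_to_bootstrap: 0
If seqno is -1 on every node (common after an unclean shutdown), fall back to wsrep_recover and read the recovered position from the error log. The invocation and log path below are typical but vary by distribution and packaging:
sudo mysqld --wsrep_recover --user=mysql
sudo grep 'Recovered position' /var/log/mysqld.log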
2. Stop MySQL on All Other Nodes
To avoid data conflicts or split-brain scenarios, gracefully stop the MySQL service on all nodes except the most advanced one:
sudo systemctl stop mysql
or
sudo service mysql stop
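Before bootstrapping, it is worth confirming that mysqld has actually exited on each of these nodes; the service name mysql matches the commands above but may differ on your system:
systemctl is-active mysql
pgrep -x mysqld    # should print nothing if the server is fully stopped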
3. Bootstrap the Cluster from the Most Advanced Node
On the chosen node, force the cluster to form a new Primary component. If MySQL is still running on that node in a Non-Primary state, this can be done by setting the pc.bootstrap option:
Open the MySQL console:
mysql -u root -p
Then execute:
SET GLOBAL wsrep_provider_options='pc.bootstrap=true';
Alternatively, if MySQL is stopped on the chosen node, start it with a dedicated bootstrap command, which depends on your distribution and configuration. On MariaDB Galera Cluster, for example:
sudo galera_new_cluster
Note: In recent Percona XtraDB Cluster releases (5.7 and later), grastate.dat also contains a safe_to_bootstrap flag. The server refuses to bootstrap from a node where this flag is 0, so if the most advanced node was not the last to leave the cluster, set safe_to_bootstrap: 1 in its grastate.dat before starting MySQL.
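On Percona XtraDB Cluster with systemd packaging, bootstrapping a stopped node is typically done through a dedicated service unit rather than galera_new_cluster. A minimal sketch, assuming the default mysql@bootstrap.service unit and the standard /var/lib/mysql data directory:
# Mark the chosen node as safe to bootstrap only after verifying it is the most advanced one
sudo sed -i 's/^safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat
# Start the node as a new cluster of one
sudo systemctl start mysql@bootstrap.service
After the other nodes have rejoined, this node is usually stopped and restarted under the regular mysql service.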
4. Rejoin Remaining Nodes
Once the cluster is running in Primary mode with one node, start MySQL on the other nodes one at a time. Each node will join the cluster and synchronize its data.
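On each joining node this usually amounts to starting the service and watching the error log while the node catches up via IST or SST; the log path below is an assumption and varies by distribution:
sudo systemctl start mysql
sudo tail -f /var/log/mysqld.log    # watch for state transfer messages and the node reaching Synced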
Monitor the cluster status by running:
SHOW STATUS LIKE 'wsrep_cluster_size';
SHOW STATUS LIKE 'wsrep_cluster_status';
These commands help you verify that all nodes have joined the cluster and are in the correct state.
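Once recovery is complete, wsrep_cluster_size should equal the number of nodes and wsrep_cluster_status should report Primary. Checking that each node is fully synced is also common practice:
SHOW STATUS LIKE 'wsrep_local_state_comment';
-- expected value on every node after recovery: Synced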
Important Considerations
- Always use the most advanced node for cluster bootstrap to avoid losing committed transactions.
- Never bootstrap more than one node; doing so creates two independent clusters and leads to split-brain and irreconcilable data differences.
- Ensure the other nodes are properly shut down before bootstrapping the chosen node.
- If in doubt, consult your database administrator or Percona support for complex scenarios.
For more information and help, consult the following sources: