
Rapid network fluctuations result in dead partitions in Strong Consistency mode

Problem Description

When an Aerospike cluster running Strong Consistency namespaces experiences rapid fluctuations in network connectivity across multiple nodes, partitions are marked as dead.

Explanation

When a namespace is running in Strong Consistency (SC) mode, the database takes a very cautious approach to data consistency. When a node enters or leaves the Aerospike cluster, the cluster rebalances during the clustering cycle. At each rebalance, the regime of the Aerospike namespace increments. The regime is an integer that allows nodes to know whether they are aligned with the current version of the cluster. The regime increment is normal and easy to observe in the Aerospike logs.

If two nodes enter or leave in the same atomic moment, two rebalances could be triggered in rapid succession, with the second interrupting the one that was in progress. This is extremely unusual and would normally occur only when there are very severe fluctuations in cluster connectivity. The interrupted rebalance is easy to spot, as the regime increments by more than 2 in a single log line.

A normal regime increment looks as follows:
 

Nov 01 2023 15:39:22 GMT: INFO (partition): (partition_balance_ee.c:1000) {myns} 22 of 22 nodes participating - regime 475 -> 477


A rapidly interrupted rebalance, often referred to as a 'double pump', looks like this:
 

Nov 01 2023 15:41:24 GMT: INFO (partition): (partition_balance_ee.c:1000) {myns} 22 of 22 nodes participating - regime 477 -> 481


In the example above, the regime increments by 4, a 'double pump'. An increment of 6 or more is also possible, depending on how many concurrent rebalances happen. Whether the concurrent rebalance is a 'double' or 'triple pump' is not important: any concurrent rebalance causes the impacted partitions to be subset.
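As a quick way to spot interrupted rebalances, the log lines can be scanned for regime jumps larger than 2. The following is a minimal sketch, not an Aerospike tool: it assumes the rebalance line format matches the two examples above, and the function name `find_interrupted_rebalances` is made up for illustration.

```python
import re

# Pattern derived from the two example log lines in this article; the exact
# format in other server versions is an assumption.
REGIME_RE = re.compile(
    r"\{(?P<ns>[^}]+)\} (?P<present>\d+) of (?P<total>\d+) nodes participating"
    r" - regime (?P<old>\d+) -> (?P<new>\d+)"
)

def find_interrupted_rebalances(lines):
    """Yield (namespace, old_regime, new_regime, step) for jumps greater than 2."""
    for line in lines:
        m = REGIME_RE.search(line)
        if m is None:
            continue  # not a rebalance line
        old, new = int(m["old"]), int(m["new"])
        step = new - old
        if step > 2:
            yield m["ns"], old, new, step

sample = [
    "Nov 01 2023 15:39:22 GMT: INFO (partition): (partition_balance_ee.c:1000) "
    "{myns} 22 of 22 nodes participating - regime 475 -> 477",
    "Nov 01 2023 15:41:24 GMT: INFO (partition): (partition_balance_ee.c:1000) "
    "{myns} 22 of 22 nodes participating - regime 477 -> 481",
]

for ns, old, new, step in find_interrupted_rebalances(sample):
    print(f"{ns}: regime {old} -> {new} (+{step})")
```

Run against the two sample lines, only the second one (477 -> 481) is reported.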

Earlier versions of Aerospike would subset all 4096 partitions within the namespace.  Later versions are more nuanced and will only subset those where the connectivity fluctuation has introduced a chance of inconsistency.

When a partition is subset, it cannot be counted when SC determines availability. If both the roster master and roster replica partitions become subset, the partition becomes unavailable.

If a partition is unavailable when the entire roster of nodes is present, then that partition is, by definition, dead.
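The two rules above can be expressed as a small decision function. This is a deliberately simplified model for illustration only: it assumes a replication factor of 2, covers only the subset rules described in this article, and ignores every other condition the server checks. The names `PartitionState` and `classify` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PartitionState:
    master_subset: bool   # roster master copy has been marked subset
    replica_subset: bool  # roster replica copy has been marked subset

def classify(partition, full_roster_present):
    """Apply the two rules from this article to a single partition."""
    # A subset copy cannot be counted towards availability, so the partition
    # stays available only while at least one roster copy is not subset.
    if not (partition.master_subset and partition.replica_subset):
        return "available"
    # Both copies are subset: unavailable, and dead once every roster node
    # is present and nothing more can arrive to resolve the situation.
    return "dead" if full_roster_present else "unavailable"

print(classify(PartitionState(True, False), full_roster_present=True))
print(classify(PartitionState(True, True), full_roster_present=False))
print(classify(PartitionState(True, True), full_roster_present=True))
```

The three calls cover the cases discussed here: one copy subset, both subset with nodes missing, and both subset with the full roster present.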
 


Solution

The key to handling dead partitions is to ignore the over-dramatic name and focus on its meaning. A partition is marked as 'dead' when Aerospike cannot use programmatic means to determine whether the partition is consistent. Most often this is due to node crashes; however, as discussed here, the cause can also be poor connectivity or a clustering exchange being interrupted by rapid changes.

As with any other circumstance where dead partitions arise, the partitions can be returned to service by running the revive command. This simply requires a human operator to make the decision with fuller context on what occurred.

Starting with version 7.1, asd can also auto-revive dead partitions at start up.

In most instances of the double pump it is quite safe to revive partitions, provided that the connectivity issues have been resolved.
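The check-then-revive decision can be sketched as follows. This is an illustrative helper, not an official procedure: it assumes the namespace statistics arrive as the semicolon-separated `key=value` text returned by an info call such as `asinfo -v 'namespace/myns'`, and that the `dead_partitions` statistic is present. The function names and the sample statistics string are made up. The helper only builds the command lines and leaves running them to the operator; revive is normally issued on every node, followed by a recluster.

```python
def parse_info(response):
    """Parse 'k1=v1;k2=v2' info text into a dict of strings."""
    pairs = (item.split("=", 1) for item in response.strip().split(";") if "=" in item)
    return dict(pairs)

def revive_plan(namespace, ns_info):
    """Return the asinfo commands an operator might run, or [] if nothing is dead."""
    stats = parse_info(ns_info)
    dead = int(stats.get("dead_partitions", 0))
    if dead == 0:
        return []
    return [
        ["asinfo", "-v", f"revive:namespace={namespace}"],  # run on each node
        ["asinfo", "-v", "recluster:"],                      # then trigger a rebalance
    ]

# Hypothetical statistics string, for illustration only.
sample_info = "ns_cluster_size=22;dead_partitions=12;unavailable_partitions=0"

for cmd in revive_plan("myns", sample_info):
    print(" ".join(cmd))
```

Keeping the plan separate from execution mirrors the point above: the revive itself stays a deliberate human decision, made once the connectivity issues are understood and resolved.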


Notes

If the connectivity issues leading to the 'double pump' result in the loss of some nodes from the cluster, partitions will still be subset, but they will be marked as unavailable rather than dead. They will only be marked as dead when the full roster of nodes is present (via a reset of the roster or a return of the missing nodes).


Applies To Earliest Version

Pre 4.9

Applies To Latest Version

Current Version