Problem Description
When an Aerospike node is taken down for routine maintenance, the following exception is seen in the Aerospike client logs (the Java client is used as an example here):
com.aerospike.client.AerospikeException: Error Code -1: java.net.ConnectException: Connection refused
What causes this error and how can it be prevented?
Explanation
So that clients can go to the right node for a given partition, each client maintains a partition map that lists which node owns which partitions. Every second, each client tends the nodes in the cluster and retrieves the up-to-date list of partitions each node owns.
When a node is taken out of the cluster (i.e. the asd process is stopped), clients will still try to connect to that node until their partition map is refreshed.
The cluster will re-form once it realizes a node has departed. The cluster determines that a node has left by counting missed heartbeats. Each node sends a heartbeat to the other nodes in the cluster at the interval defined in the Aerospike configuration file. This interval is measured in milliseconds and defaults to 150.
The cluster tolerates a number of missed heartbeats, defined by the timeout configuration parameter (default 10), before declaring the node dead.
Finally, there is an overhead called the quantum interval, which optimizes for potentially multiple rapid cluster changes. With default configuration parameters, the quantum interval is 1.8s; however, it is adjusted based on a few heartbeat and network latency related settings. Refer to the What is the Quantum Interval article for further details.
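As a quick illustration of the heartbeat arithmetic, the sketch below multiplies the heartbeat interval by the number of tolerated missed heartbeats. The class and method names are hypothetical and exist only for this example; the values are the defaults quoted above. The result covers only the heartbeat portion of the delay, not the full time the cluster takes to re-form.

```java
// Minimal sketch: heartbeat-based detection window under default settings.
// HeartbeatWindow and detectionWindowMs are illustrative names, not Aerospike APIs.
class HeartbeatWindow {

    // Time, in milliseconds, covered by the tolerated number of missed heartbeats.
    static long detectionWindowMs(long intervalMs, long timeoutCount) {
        return intervalMs * timeoutCount;
    }

    public static void main(String[] args) {
        long intervalMs = 150;  // default heartbeat interval, in ms
        long timeoutCount = 10; // default number of missed heartbeats tolerated
        // Prints 1500
        System.out.println(detectionWindowMs(intervalMs, timeoutCount));
    }
}
```

With the defaults, a node therefore has to miss about 1.5 seconds' worth of heartbeats before the rest of the cluster declares it dead.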
Therefore, with the default configuration, it typically takes 2.5 to 3 seconds for a cluster to re-form after one or more nodes stop responding and are ejected from the cluster. At that point, a new partition distribution is set, which the clients gradually discover as they tend each node every second (the default tend interval). This can add up to an extra second before the clients have the updated partition map. That said, most transaction types can be configured with policies that retry against an alternate node (typically a replica node), which will either process the transaction or potentially proxy it.
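The retry behaviour described above can be pictured with the following self-contained sketch. It is not the Aerospike client's implementation: the class, the NodeCall interface and the node names are hypothetical, and in the real Java client this behaviour is driven by policy settings (for example the retry count and replica selection) rather than by application code like this. The sketch only shows the idea of moving on to the next candidate node when a connection is refused.

```java
import java.net.ConnectException;
import java.util.List;

// Illustrative sketch only: retry a transaction against alternate nodes.
// ReplicaRetrySketch and NodeCall are hypothetical names, not Aerospike APIs.
class ReplicaRetrySketch {

    // Hypothetical stand-in for sending one transaction to one node.
    interface NodeCall<T> {
        T send(String node) throws ConnectException;
    }

    // Tries the nodes in order (master first, then replicas), cycling through
    // the list, for one initial attempt plus up to maxRetries retries.
    static <T> T executeWithRetry(List<String> nodes, int maxRetries, NodeCall<T> call) {
        if (nodes.isEmpty() || maxRetries < 0) {
            throw new IllegalArgumentException("need at least one node and maxRetries >= 0");
        }
        ConnectException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String node = nodes.get(attempt % nodes.size());
            try {
                return call.send(node);
            } catch (ConnectException e) {
                last = e; // node unreachable: fall through to the next candidate
            }
        }
        throw new IllegalStateException("all attempts failed", last);
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("master", "replica");
        String result = executeWithRetry(nodes, 1, node -> {
            if (node.equals("master")) {
                throw new ConnectException("Connection refused");
            }
            return "ok from " + node;
        });
        System.out.println(result);
    }
}
```

In this example the first attempt against the stopped master fails with a refused connection and the single retry succeeds against the replica, so the application never sees the exception.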
Solution
The quiesce feature is available in Aerospike 4.3.1 and higher. When a node is marked for quiescence, nothing happens immediately; however, on issuing a recluster command, the node gives up ownership of its partitions. Those partitions are taken over by a new node, typically the next node in the succession list (i.e. the node that owned the replica partition prior to the quiesced node leaving). If rack aware or uniform balance is configured, a different node could take over instead.
Before clients tend and build their new partition maps, the quiesced node will proxy any transactions that it receives to the new master node. Once the clients have built their new partition map and are no longer directing transactions towards the quiesced node, it can be shut down safely without any disruption to client traffic.
However, a quiesced node remains in the cluster until it is shut down. Clients will still tend the node, and when it is shut down, the tend calls will fail until the cluster re-forms without the node (but as the node was already quiesced, partition ownership will not change at that time).