Context
To recover shared memory usage, nodes must perform a rolling cold start, which rebuilds the primary index from storage.
⚠️ Important: You may not want to cold start if:
- You do not handle resurrection of deleted objects.
- Your cluster is already low on RAM, which may increase the risk of OOM-kill.
For more details, please refer to this KB article: How do I speed up cold-start eviction?
Method
- Run this in asadm (in Enable Mode) to stop migrations on all nodes:
asinfo -v "set-config:context=service;migrate-threads=0"
- Quiesce the node so that partition ownership transfers gracefully to the other nodes, then recluster (xxx.xxx.xxx.xxx is the IP address of the node):
manage quiesce with xxx.xxx.xxx.xxx
manage recluster
- Shut down Aerospike on the node
- Trigger a cold start, for example by running the following (see the notes below for other methods):
asd-coldstart
- After the node re-joins the cluster, restore migration:
asinfo -v "set-config:context=service;migrate-threads=1"
- Wait for migration to complete.
- Repeat all the steps for the next node.
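The per-node sequence above could be wrapped in a small helper along these lines. This is only a sketch under several assumptions: the function names (`run`, `asadm_enable`, `cold_start_node`, `restore_migrations`) and the `DRY_RUN` variable are hypothetical, the script is assumed to run on the node being restarted, and `systemctl stop aerospike` is assumed to be how Aerospike is shut down on that host. Waiting for the node to rejoin and for migrations to complete is deliberately left to the operator.

```shell
#!/usr/bin/env bash
# Sketch of one rolling cold-start iteration for a single node.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run a single command through asadm in Enable Mode.
asadm_enable() {
    run asadm --enable -e "$1"
}

# Steps up to and including the cold start, for the node with the given IP.
cold_start_node() {
    local node_ip="$1"
    # Stop migrations on all nodes.
    asadm_enable 'asinfo -v "set-config:context=service;migrate-threads=0"'
    # Hand partition ownership over to the other nodes.
    asadm_enable "manage quiesce with $node_ip"
    asadm_enable "manage recluster"
    # Assumption: Aerospike runs as a systemd service on this host.
    run systemctl stop aerospike
    run asd-coldstart
}

# Call this only after confirming that the node has rejoined the cluster.
restore_migrations() {
    asadm_enable 'asinfo -v "set-config:context=service;migrate-threads=1"'
}
```

With the default `DRY_RUN=1`, calling `cold_start_node 10.0.0.1` followed by `restore_migrations` only prints the planned commands for review (shell quoting is not preserved in the printed form). Splitting the restore step into its own function keeps the manual checks between the cold start and the re-enabling of migrations explicit.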
Notes
- Using ipcrm to remove shmem segments can also trigger a cold start. This KB article has an example of how to use ipcrm: How to change data-size config in a running cluster.
- Changing certain configuration parameters, for example partition-tree-sprigs, will also trigger a cold start. This may be a good opportunity to apply such changes if you haven't already.
- Setting "cold-start-empty true" in aerospike.conf is another option if the cluster is stable, that is, when all data is fully replicated with RF ≥ 2. In this case, the node can rejoin empty and the missing data will be filled in by the other nodes.
- Remember, cold starting a node can result in the resurrection of deleted data, or of records whose expiration was lowered after they were written.
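For the ipcrm route mentioned in the notes, a cautious way to proceed is to list the candidate segments first and review the removal commands before running any of them. The sketch below assumes that Aerospike's shared memory keys begin with `0xae` and that `ipcs -m` prints the key in the first column (as it does on Linux); both should be verified on your system. The function names are hypothetical, and nothing here removes a segment by itself.

```shell
#!/usr/bin/env bash
# Filter `ipcs -m` output down to keys that look like Aerospike segments.
# Assumption: Aerospike shared memory keys start with 0xae.
aerospike_shm_keys() {
    awk '$1 ~ /^0x[aA][eE]/ { print $1 }'
}

# Turn each key read from stdin into an `ipcrm -M <key>` command line.
# The commands are printed for review, not executed.
print_ipcrm_commands() {
    while read -r key; do
        printf 'ipcrm -M %s\n' "$key"
    done
}
```

A typical use, with Aerospike already stopped on the node, would be `ipcs -m | aerospike_shm_keys | print_ipcrm_commands`, followed by running the printed commands only after checking that every listed key belongs to Aerospike.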