Context
An Aerospike cold restart scans all records on the persisted storage layer and rebuilds the primary index (and the data, if data-in-memory is set to true), followed by any secondary indexes, in memory. This process can take a long time, depending on the number of records, disk I/O capability, or even CPU capacity.
Method
There are various ways to speed up a cold start, depending on where the bottleneck is.
- The asmt tool can be leveraged to avoid having to rebuild the primary index.
- If you do still need to rebuild the primary index from a disk scan, our tests have shown that having one device partition per CPU core is usually optimal, so it may be advantageous to partition your devices or use more files.
- You may be able to speed up the building of secondary indexes by setting sindex-startup-device-scan to true or (carefully) increasing sindex-builder-threads. Please see the cautions on those configuration items before proceeding.
- If one of the high-water marks (high-water-disk-pct, high-water-memory-pct, or mounts-high-water-pct) is being exceeded at cold-start time, the node will have to perform evictions, which adds time to the startup. You may be able to avoid this by increasing the appropriate high-water mark, or by setting disable-cold-start-eviction to true, but this may also cause the node to run out of space and fail to start up. Exercise caution with these parameters. See the article 'How do I speed up cold start eviction?' for more details.
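As a rough planning aid for the "one device partition per CPU core" guideline above, the arithmetic can be sketched in Python. The function name and the example device count are hypothetical. Rounding up is an assumption made here so that the total number of partitions is never lower than the core count; it is not an Aerospike recommendation.

```python
import math
import os


def partitions_per_device(cpu_cores: int, device_count: int) -> int:
    """Return how many partitions to create on each device so that the
    total number of partitions is at least the number of CPU cores."""
    if cpu_cores < 1 or device_count < 1:
        raise ValueError("cpu_cores and device_count must be positive")
    # Round up so total partitions >= cores even when the core count is
    # not an exact multiple of the device count (an assumption of this sketch).
    return math.ceil(cpu_cores / device_count)


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    devices = 4  # hypothetical number of physical devices on the node
    per_device = partitions_per_device(cores, devices)
    print(f"{cores} cores, {devices} devices: {per_device} partitions per device")
```

For example, a node with 32 cores and 4 devices would get 8 partitions per device. Check the result against the partition-size caveat in the Notes before repartitioning.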
Notes
When planning to use smaller disk partitions in parallel for your namespace storage, keep in mind that partitions or files that are too small may prevent defragmentation, which can lead to a low device_available_pct. This can occur if the post-write-queue is greater than or close to the size of the partition itself.
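To get a feel for how close the post-write-queue is to the partition size, a minimal Python sketch follows. It assumes that post-write-queue is expressed as a number of write blocks per device, so its footprint is post-write-queue multiplied by write-block-size. The function name and the example values are illustrative only; substitute the values from your own configuration.

```python
def post_write_queue_share(partition_bytes: int,
                           write_block_bytes: int,
                           post_write_queue_blocks: int) -> float:
    """Return the fraction of one partition (or file) that could be held
    by blocks sitting in the post-write-queue."""
    if partition_bytes <= 0:
        raise ValueError("partition_bytes must be positive")
    # Assumption: the queue holds up to post_write_queue_blocks blocks,
    # each of write_block_bytes, per device.
    queue_bytes = write_block_bytes * post_write_queue_blocks
    return queue_bytes / partition_bytes


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Illustrative values: a 1 GiB partition, 1 MiB write blocks, 256 blocks queued.
    share = post_write_queue_share(1024 * MIB, 1 * MIB, 256)
    print(f"post-write-queue covers {share:.0%} of the partition")
```

A share close to or above 1.0 corresponds to the situation described above, where the queue is about as large as the partition itself.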