Avail-pct drops without defragmentation starting

Problem Description

On a small namespace (less than 256 MB) stored in a file on a filesystem, device_avail_pct keeps dropping, but the logs show that defrag-q is empty, so the defragger is not running. device_free_pct is still high. An example of info output from asadm might look as follows:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Information~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Namespace                                       Node   Avail%   Evictions      Master     Replica     Repl     Stop     Pending         Disk    Disk     HWM          Mem     Mem    HWM      Stop
                    .                                          .        .           .     Objects     Objects   Factor   Writes    Migrates         Used   Used%   Disk%         Used   Used%   Mem%   Writes%
                    .                                          .        .           .           .           .        .        .   (tx%,rx%)            .       .       .            .       .      .         .

Namespace__Storage         host1:3000                           33               0     1.000       4.000     2        false    (0,0)         1.250 KB   1       50       18.479 KB   1       60     90
Namespace__Storage         host2:3000                           53               0     3.000       0.000     2        false    (0,0)       768.000 B    1       50       18.276 KB   1       60     90
Namespace__Storage         host3:3000                           43               0     2.000       2.000     2        false    (0,0)         1.000 KB   1       50       18.339 KB   1       60     90
Namespace__Storage                                                               0     6.000       6.000                       (0,0)         3.000 KB                    55.095 KB

The output above shows that the namespace holds a small number of very small objects and that Disk Used% is very low, yet Avail% is also low.

Looking into the logs, the defrag-q is empty and free-wblocks is dropping:

Caprica6:5477 user$ grep -i defrag-q aerospike.log | grep Namespace__Storage | more

July 12 2022 14:39:13 GMT: INFO (drv_ssd): (drv_ssd.c:9999) {Namespace__Storage} /dev/sda: used-bytes 296160983424 free-wblocks 103 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)
July 12 2022 14:39:23 GMT: INFO (drv_ssd): (drv_ssd.c:9999) {Namespace__Storage} /dev/sda: used-bytes 296160983424 free-wblocks 102 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)
July 12 2022 14:39:33 GMT: INFO (drv_ssd): (drv_ssd.c:9999) {Namespace__Storage} /dev/sda: used-bytes 296160983424 free-wblocks 101 write-q 0 write (12659541,43.3) defrag-q 0 defrag-read (11936852,39.1) defrag-write (3586533,10.2) shadow-write-q 0 tomb-raider-read (13758,598.0)
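
To make the trend in these lines easier to see, the two relevant fields can be pulled out with awk. This is a minimal sketch: the sample lines are copied from the excerpt above and trimmed after defrag-q, and the variable names log_sample and summary are illustrative only. Against a real log, the same awk program could be fed from the grep pipeline shown above.

```shell
# Sample lines copied from the log excerpt above, trimmed after the defrag-q field.
log_sample='July 12 2022 14:39:13 GMT: INFO (drv_ssd): (drv_ssd.c:9999) {Namespace__Storage} /dev/sda: used-bytes 296160983424 free-wblocks 103 write-q 0 write (12659541,43.3) defrag-q 0
July 12 2022 14:39:23 GMT: INFO (drv_ssd): (drv_ssd.c:9999) {Namespace__Storage} /dev/sda: used-bytes 296160983424 free-wblocks 102 write-q 0 write (12659541,43.3) defrag-q 0
July 12 2022 14:39:33 GMT: INFO (drv_ssd): (drv_ssd.c:9999) {Namespace__Storage} /dev/sda: used-bytes 296160983424 free-wblocks 101 write-q 0 write (12659541,43.3) defrag-q 0'

# For each line, find the value that follows "free-wblocks" and "defrag-q"
# and print them next to the timestamp (the fourth whitespace-separated field).
summary=$(printf '%s\n' "$log_sample" | awk '
    {
        for (i = 1; i < NF; i++) {
            if ($i == "free-wblocks") fw = $(i + 1)
            if ($i == "defrag-q")     dq = $(i + 1)
        }
        print $4, "free-wblocks=" fw, "defrag-q=" dq
    }')

echo "$summary"
```

A free-wblocks value that falls from line to line while defrag-q stays at 0 is the pattern described in this article.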

Explanation

In this situation write-block-size has not been set, so it takes the default value of 1 MB. The default value for post-write-queue is 256 blocks, which in this case equates to 256 MB. That is bigger than the file used for the namespace, which is 100 MB. Blocks in the post-write-queue are not eligible for defragmentation. As a result, the blocks sit in the post-write-queue, which can exceed the size of the namespace, so defrag-q stays empty while free-wblocks keeps dropping.
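
The arithmetic behind this explanation can be checked with a few lines of shell. This is only a sketch using the figures quoted above (1 MB write-block-size, 256-block post-write-queue, 100 MB namespace file); the variable names are illustrative and the values should be replaced with those of the affected namespace.

```shell
# Figures taken from the explanation above; adjust for your own namespace.
write_block_size_mb=1     # default write-block-size, in MB
post_write_queue=256      # default post-write-queue, in blocks
namespace_file_mb=100     # size of the namespace file in this example, in MB

# Space the post-write-queue can cover when it is full.
pwq_mb=$((post_write_queue * write_block_size_mb))
echo "post-write-queue can cover up to ${pwq_mb} MB"

if [ "$pwq_mb" -ge "$namespace_file_mb" ]; then
    echo "queue covers the whole ${namespace_file_mb} MB file: blocks in it are not eligible for defrag"
fi

# Number of write blocks that fit in the file; the queue should not exceed this.
max_blocks=$((namespace_file_mb / write_block_size_mb))
echo "namespace file holds ${max_blocks} write blocks"
```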

Solution

The solution is to reduce the size of the post-write-queue so that the space it covers (post-write-queue multiplied by write-block-size) is no bigger than the size of the namespace. The parameter is dynamic, so it can be set without a node restart. It is configured at the namespace level (it is part of the storage-engine sub-stanza). The command to alter the size of the post-write-queue is:

asinfo -v 'set-config:context=namespace;id=<NAMESPACE>;post-write-queue=8'
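
After applying the change, it may be worth reading the value back to confirm it took effect. The sketch below assumes the namespace name from the example output (Namespace__Storage) and the value 8 from the command above. It builds the same set-config request, then issues a get-config request and filters for post-write-queue. The NAMESPACE, PWQ, set_cmd and get_cmd variables are illustrative, and the script only calls asinfo if it is found on the PATH.

```shell
NAMESPACE="Namespace__Storage"   # namespace name from the example output; replace with your own
PWQ=8                            # value used in the command above

set_cmd="set-config:context=namespace;id=${NAMESPACE};post-write-queue=${PWQ}"
get_cmd="get-config:context=namespace;id=${NAMESPACE}"

if command -v asinfo >/dev/null 2>&1; then
    # Apply the dynamic change on the local node.
    asinfo -v "$set_cmd"
    # Read the namespace configuration back, one setting per line,
    # and keep only the post-write-queue entry.
    asinfo -v "$get_cmd" -l | grep post-write-queue
else
    echo "asinfo not found; would run: asinfo -v '$set_cmd'"
fi
```

A dynamically set value does not survive a node restart, so the same value would also need to be placed in the namespace's storage-engine sub-stanza of the configuration file to make it permanent.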

Applies To Earliest Version

Pre 4.9

Applies To Latest Version

Current Version