Context
When a new device is added to a namespace, only newly written records are distributed evenly across all devices (based on the RIPEMD160 hash). Existing records on the old devices are not redistributed until they are rewritten. This may cause device_available_pct to drop on the old devices.
Method
Check the logs for the used-bytes statistic of each device. In the example below, /dev/sdu was added later than the other devices, so the used-bytes values in the Aerospike log are unbalanced:
/dev/sds: used-bytes 64327798656 free-wblocks 38659 write-q 0 write (3939741,0.0) defrag-q 0 defrag-read (3765781,0.0) defrag-write (1814464,0.0)
/dev/sdu: used-bytes 31775779968 free-wblocks 407900 write-q 0 write (3865747,0.0) defrag-q 0 defrag-read (3506795,0.0) defrag-write (1810913,0.0)
The used-bytes of the old device (/dev/sds) is about twice that of the new device (/dev/sdu) because the old records still exist on /dev/sds. They are moved to the new device (based on their digest hash) only when they are updated or when the block they belong to is defragmented. Until then, device_available_pct may stay low on the old devices.
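The check described above can be sketched in Python. This is a minimal illustration, not an Aerospike tool: the function names `parse_device_usage` and `imbalance_ratio` are hypothetical, and the regular expression assumes the per-device line layout shown in the log excerpt above (device path, then used-bytes, then free-wblocks).

```python
import re

# Matches the per-device portion of a drv_ssd log line, e.g.
# "/dev/sds: used-bytes 64327798656 free-wblocks 38659 ..."
# (layout assumed from the excerpt in this article)
DEVICE_RE = re.compile(
    r"(?P<device>/dev/\S+): used-bytes (?P<used>\d+) free-wblocks (?P<free>\d+)"
)


def parse_device_usage(log_text):
    """Return {device: used_bytes}, keeping the last value seen per device."""
    usage = {}
    for match in DEVICE_RE.finditer(log_text):
        usage[match.group("device")] = int(match.group("used"))
    return usage


def imbalance_ratio(usage):
    """Ratio of the fullest device to the emptiest one (1.0 means balanced)."""
    return max(usage.values()) / min(usage.values())


sample = """
/dev/sds: used-bytes 64327798656 free-wblocks 38659 write-q 0 write (3939741,0.0) defrag-q 0 defrag-read (3765781,0.0) defrag-write (1814464,0.0)
/dev/sdu: used-bytes 31775779968 free-wblocks 407900 write-q 0 write (3865747,0.0) defrag-q 0 defrag-read (3506795,0.0) defrag-write (1810913,0.0)
"""

usage = parse_device_usage(sample)
print(usage)
print(round(imbalance_ratio(usage), 2))
```

A ratio well above 1.0, as in this sample, suggests that one device holds far more data than another and that a rebalance may be worth considering.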
When adding a new disk results in unbalanced device usage, there are two options:
- You could temporarily increase defrag-lwm-pct to force defragmentation to redistribute the records across the devices.
- You could touch every record, which forces an update and a rewrite to the correct device, rebalancing the records across the devices.
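As a rough sketch of the first option, the snippet below builds (but does not run) an asinfo command that changes defrag-lwm-pct dynamically. The parameter string mirrors the one reported in the "config-set command completed" log line later in this article; the helper name `build_set_config_command` is hypothetical, and the sketch assumes asinfo is available on the node and that the command is issued on every node that needs the change.

```python
import shlex


def build_set_config_command(namespace, parameter, value):
    """Build (but do not run) an asinfo set-config command for a namespace.

    Assumption: the info command has the form
    set-config:context=namespace;id=<ns>;<parameter>=<value>
    """
    info = f"set-config:context=namespace;id={namespace};{parameter}={value}"
    return ["asinfo", "-v", info]


# Example: raise defrag-lwm-pct to 60 on namespace ns1, as in the log below.
cmd = build_set_config_command("ns1", "defrag-lwm-pct", 60)

# Print a shell-safe rendering; pass `cmd` to subprocess.run() to execute it.
print(shlex.join(cmd))
```

Keeping the command as an argument list avoids shell-quoting problems with the semicolons in the info string if it is later passed to subprocess.run.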
Here is the log output of a namespace that had its defrag-lwm-pct temporarily increased in order to force a rebalance of the records through defragmentation:
Dec 11 2017 14:15:02 GMT: INFO (info): (thr_info.c:3097) Changing value of defrag-lwm-pct of ns ns1 from 55 to 60
Dec 11 2017 14:15:02 GMT: INFO (drv_ssd): (drv_ssd.c:4175) {ns1} sweeping all devices for wblocks to defrag ...
Dec 11 2017 14:15:02 GMT: INFO (info): (thr_info.c:3476) config-set command completed: params context=namespace;id=ns1;defrag-lwm-pct=60
Dec 11 2017 14:15:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 64159201152 free-wblocks 41586 write-q 0 write (3943375,9.2) defrag-q 198841 defrag-read (3971184,9965.5) defrag-write (1818093,9.2)
Dec 11 2017 14:15:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 31632894720 free-wblocks 410508 write-q 0 write (3869428,17.4) defrag-q 87503 defrag-read (3600588,4395.2) defrag-write (1814589,17.4)
Dec 11 2017 14:15:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 63924358016 free-wblocks 47229 write-q 0 write (3947045,183.5) defrag-q 189534 defrag-read (3971189,0.2) defrag-write (1821763,183.5)
Dec 11 2017 14:15:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 31867974144 free-wblocks 411926 write-q 0 write (3876076,332.4) defrag-q 79467 defrag-read (3600618,1.5) defrag-write (1821237,332.4)
Dec 11 2017 14:15:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 63690468736 free-wblocks 52877 write-q 0 write (3950760,185.8) defrag-q 180179 defrag-read (3971198,0.4) defrag-write (1825478,185.8)
Dec 11 2017 14:15:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 32101641216 free-wblocks 413381 write-q 0 write (3882748,333.6) defrag-q 71371 defrag-read (3600648,1.5) defrag-write (1827909,333.6)
Dec 11 2017 14:16:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 63458194432 free-wblocks 58517 write-q 0 write (3954476,185.8) defrag-q 170830 defrag-read (3971205,0.3) defrag-write (1829194,185.8)
Dec 11 2017 14:16:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 32333685504 free-wblocks 414832 write-q 0 write (3889375,331.4) defrag-q 63314 defrag-read (3600669,1.0) defrag-write (1834536,331.4)
Dec 11 2017 14:16:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 63225907712 free-wblocks 64130 write-q 0 write (3958153,183.9) defrag-q 161545 defrag-read (3971210,0.2) defrag-write (1832871,183.9)
Dec 11 2017 14:16:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 32565940992 free-wblocks 415998 write-q 0 write (3895567,309.6) defrag-q 55972 defrag-read (3600686,0.9) defrag-write (1840728,309.6)
Dec 11 2017 14:16:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 62990220544 free-wblocks 69788 write-q 0 write (3961827,183.7) defrag-q 152222 defrag-read (3971218,0.4) defrag-write (1836545,183.7)
Dec 11 2017 14:16:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 32801846912 free-wblocks 417097 write-q 0 write (3901745,308.9) defrag-q 48710 defrag-read (3600700,0.7) defrag-write (1846906,308.9)
Dec 11 2017 14:17:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 62756865536 free-wblocks 75398 write-q 0 write (3965489,183.1) defrag-q 142962 defrag-read (3971231,0.6) defrag-write (1840207,183.1)
Dec 11 2017 14:17:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 33035008128 free-wblocks 418198 write-q 0 write (3907868,306.1) defrag-q 41504 defrag-read (3600718,0.9) defrag-write (1853029,306.1)
Dec 11 2017 14:17:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 62520854784 free-wblocks 81048 write-q 0 write (3969169,184.0) defrag-q 133640 defrag-read (3971239,0.4) defrag-write (1843887,184.0)
Dec 11 2017 14:17:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 33271071104 free-wblocks 419286 write-q 0 write (3914029,308.0) defrag-q 34272 defrag-read (3600736,0.9) defrag-write (1859190,308.0)
Dec 11 2017 14:17:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 62286831872 free-wblocks 86694 write-q 0 write (3972870,185.1) defrag-q 124307 defrag-read (3971253,0.7) defrag-write (1847588,185.1)
Dec 11 2017 14:17:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 33504822528 free-wblocks 420390 write-q 0 write (3920183,307.7) defrag-q 27035 defrag-read (3600757,1.0) defrag-write (1865344,307.7)
Dec 11 2017 14:18:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 62055264256 free-wblocks 92308 write-q 0 write (3976573,185.1) defrag-q 114998 defrag-read (3971260,0.3) defrag-write (1851291,185.1)
Dec 11 2017 14:18:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 33736587264 free-wblocks 421501 write-q 0 write (3926310,306.4) defrag-q 19826 defrag-read (3600785,1.4) defrag-write (1871471,306.4)
Dec 11 2017 14:18:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 61824949760 free-wblocks 97896 write-q 0 write (3980275,185.1) defrag-q 105724 defrag-read (3971276,0.8) defrag-write (1854993,185.1)
Dec 11 2017 14:18:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 33967003776 free-wblocks 422647 write-q 0 write (3932445,306.8) defrag-q 12564 defrag-read (3600805,1.0) defrag-write (1877606,306.8)
Dec 11 2017 14:18:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 61588534016 free-wblocks 103549 write-q 0 write (3983952,183.9) defrag-q 96404 defrag-read (3971286,0.5) defrag-write (1858670,183.9)
Dec 11 2017 14:18:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 34203251968 free-wblocks 423772 write-q 0 write (3938653,310.4) defrag-q 5248 defrag-read (3600822,0.9) defrag-write (1883814,310.4)
Dec 11 2017 14:19:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 61358507264 free-wblocks 109163 write-q 0 write (3987683,186.6) defrag-q 87070 defrag-read (3971297,0.6) defrag-write (1862401,186.6)
Dec 11 2017 14:19:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 34433026176 free-wblocks 424110 write-q 0 write (3943583,246.5) defrag-q 0 defrag-read (3600841,0.9) defrag-write (1888744,246.5)
Dec 11 2017 14:19:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 61126012672 free-wblocks 114780 write-q 0 write (3991366,184.1) defrag-q 77779 defrag-read (3971306,0.4) defrag-write (1866084,184.1)
Dec 11 2017 14:19:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 34665224576 free-wblocks 422280 write-q 0 write (3945418,91.8) defrag-q 0 defrag-read (3600846,0.2) defrag-write (1890579,91.8)
Dec 11 2017 14:19:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 60903349888 free-wblocks 120223 write-q 0 write (3994998,181.6) defrag-q 68717 defrag-read (3971320,0.7) defrag-write (1869716,181.6)
Dec 11 2017 14:19:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 34887782144 free-wblocks 420529 write-q 0 write (3947178,88.0) defrag-q 0 defrag-read (3600855,0.4) defrag-write (1892339,88.0)
Dec 11 2017 14:20:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 60696712320 free-wblocks 125314 write-q 0 write (3998421,171.1) defrag-q 60213 defrag-read (3971329,0.4) defrag-write (1873139,171.1)
Dec 11 2017 14:20:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35094744960 free-wblocks 418899 write-q 0 write (3948816,81.9) defrag-q 0 defrag-read (3600863,0.4) defrag-write (1893977,81.9)
Dec 11 2017 14:20:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 60485303296 free-wblocks 130488 write-q 0 write (4001880,172.9) defrag-q 51591 defrag-read (3971341,0.6) defrag-write (1876598,172.9)
Dec 11 2017 14:20:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35305846656 free-wblocks 417236 write-q 0 write (3950483,83.3) defrag-q 0 defrag-read (3600867,0.2) defrag-write (1895644,83.3)
Dec 11 2017 14:20:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 60272513280 free-wblocks 135735 write-q 0 write (4005415,176.8) defrag-q 42820 defrag-read (3971352,0.6) defrag-write (1880133,176.8)
Dec 11 2017 14:20:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35518842240 free-wblocks 415559 write-q 0 write (3952164,84.1) defrag-q 0 defrag-read (3600871,0.2) defrag-write (1897325,84.1)
Dec 11 2017 14:21:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 60061075968 free-wblocks 140926 write-q 0 write (4008888,173.6) defrag-q 34173 defrag-read (3971368,0.8) defrag-write (1883606,173.6)
.
.
.
Dec 11 2017 14:21:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809112832 free-wblocks 413270 write-q 0 write (3954460,31.2) defrag-q 0 defrag-read (3600878,0.1) defrag-write (1899621,31.2)
Dec 11 2017 14:21:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 59980192128 free-wblocks 148118 write-q 0 write (4017821,234.4) defrag-q 18076 defrag-read (3971397,0.7) defrag-write (1892539,234.4)
Dec 11 2017 14:21:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809056384 free-wblocks 413270 write-q 0 write (3954461,0.1) defrag-q 0 defrag-read (3600879,0.1) defrag-write (1899622,0.1)
Dec 11 2017 14:22:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 59980192128 free-wblocks 151346 write-q 0 write (4022519,234.9) defrag-q 10160 defrag-read (3971407,0.5) defrag-write (1897237,234.9)
Dec 11 2017 14:22:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809056384 free-wblocks 413270 write-q 0 write (3954461,0.0) defrag-q 0 defrag-read (3600879,0.0) defrag-write (1899622,0.0)
Dec 11 2017 14:22:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 59980192128 free-wblocks 154574 write-q 0 write (4027228,235.4) defrag-q 2240 defrag-read (3971423,0.8) defrag-write (1901946,235.4)
Dec 11 2017 14:22:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809056384 free-wblocks 413270 write-q 0 write (3954461,0.0) defrag-q 0 defrag-read (3600879,0.0) defrag-write (1899622,0.0)
Dec 11 2017 14:22:43 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 59980140800 free-wblocks 155495 write-q 0 write (4028558,66.5) defrag-q 0 defrag-read (3971434,0.6) defrag-write (1903276,66.5)
Dec 11 2017 14:22:43 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809007360 free-wblocks 413272 write-q 0 write (3954462,0.1) defrag-q 0 defrag-read (3600882,0.2) defrag-write (1899623,0.1)
Dec 11 2017 14:23:03 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 59980140800 free-wblocks 155495 write-q 0 write (4028558,0.0) defrag-q 0 defrag-read (3971434,0.0) defrag-write (1903276,0.0)
Dec 11 2017 14:23:03 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809007360 free-wblocks 413272 write-q 0 write (3954462,0.0) defrag-q 0 defrag-read (3600882,0.0) defrag-write (1899623,0.0)
Dec 11 2017 14:23:23 GMT: INFO (drv_ssd):... /dev/sds: used-bytes 59980140800 free-wblocks 155495 write-q 0 write (4028558,0.0) defrag-q 0 defrag-read (3971434,0.0) defrag-write (1903276,0.0)
Dec 11 2017 14:23:23 GMT: INFO (drv_ssd):... /dev/sdu: used-bytes 35809007360 free-wblocks 413272 write-q 0 write (3954462,0.0) defrag-q 0 defrag-read (3600882,0.0) defrag-write (1899623,0.0)
As soon as defrag-lwm-pct was increased from 55 to 60, the defrag-q surged (to about 199K on /dev/sds and 87K on /dev/sdu). The defragmentation activity (defrag-read/defrag-write) went to 0.0 after about 8 minutes. During that time, the used-bytes for /dev/sds decreased from 64159201152 to 59980140800 while that of /dev/sdu increased from 31632894720 to 35809007360. The devices are still unbalanced, but much less so than before.
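The size of the improvement can be worked out from the first and last used-bytes values in the log above. The short calculation below is only an illustration; the variable names are arbitrary.

```python
# used-bytes values copied from the first and last log lines above
before = {"/dev/sds": 64159201152, "/dev/sdu": 31632894720}
after = {"/dev/sds": 59980140800, "/dev/sdu": 35809007360}


def old_to_new_ratio(usage):
    """used-bytes of the old device divided by that of the new device."""
    return usage["/dev/sds"] / usage["/dev/sdu"]


moved_off_sds = before["/dev/sds"] - after["/dev/sds"]
gained_on_sdu = after["/dev/sdu"] - before["/dev/sdu"]

print(f"/dev/sds shrank by {moved_off_sds / 2**30:.2f} GiB")
print(f"/dev/sdu grew by   {gained_on_sdu / 2**30:.2f} GiB")
print(f"old/new ratio before: {old_to_new_ratio(before):.2f}")
print(f"old/new ratio after:  {old_to_new_ratio(after):.2f}")
```

Both devices changed by roughly the same amount, which is consistent with records leaving /dev/sds and landing on /dev/sdu.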
Notice that defrag-write is quite high (up to 333.6 blocks per second on /dev/sdu), which could affect the latency of the regular read/write traffic from client applications. To minimize this impact, defrag-sleep could be increased to slow down the defragmentation reads.
After recovery, defrag-lwm-pct should be set back to 50% for a healthy 2x write amplification.
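To see why the threshold matters, a common rule-of-thumb model (an assumption here, not something stated in the log or derived from Aerospike internals) is that a block defragmented when it is p percent full must have that p percent rewritten, so each block of new data costs about 1 / (1 - p/100) blocks of writes. The model agrees with the 2x figure at 50%; the function name is hypothetical.

```python
def approx_write_amplification(defrag_lwm_pct):
    """Rule-of-thumb write amplification for a given defrag-lwm-pct.

    Assumed model: 1 / (1 - pct/100). It reproduces the 2x figure at 50%,
    but real workloads can deviate from it.
    """
    if not 0 <= defrag_lwm_pct < 100:
        raise ValueError("defrag_lwm_pct must be in the range [0, 100)")
    return 1.0 / (1.0 - defrag_lwm_pct / 100.0)


for pct in (50, 55, 60):
    print(pct, round(approx_write_amplification(pct), 2))
```

Under this model the write amplification rises steeply as the threshold increases, which is why the higher setting is best treated as temporary.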