How to check partition ownership per rack?

Context

Older versions of Aerospike had an issue where the rack-aware rules for master and replica placement could be broken. In this scenario, the master and the prole (replica) for a specific partition could end up on the same rack. (Fixed in AER-6726.)

Method

We can get a dump of partition-info from each node to verify partition-id ownership.

Steps:

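1) Get partition-info from each node and save it to an individual file per node. The file name should include the rack-id of the node (for example, rack_id_1), because the commands below filter on that string.
2) Run the following commands to count duplicate partition IDs, per namespace, per rack. T* and TIGER* are file-name globs; adjust them to match your own files.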
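For step 1, a minimal sketch of the collection could look like the following. The function name, the output file naming, and the host argument are assumptions made for illustration; the rack-id is supplied by the operator rather than discovered. The output is split on ';' so that each partition lands on its own line, which is what the grep-based commands expect.

```shell
# Hypothetical helper: dump partition-info from one node into a rack-tagged file.
# Arguments: host, rack-id of that host (supplied by the operator), output directory.
dump_partition_info() {
    host="$1"
    rack="$2"
    outdir="${3:-.}"
    # The file name carries "rack_id_<N>" so later filters can match on it.
    outfile="${outdir}/${host}_rack_id_${rack}.txt"
    # asinfo returns one ';'-separated string; put each partition on its own line.
    asinfo -h "$host" -v 'partition-info' | tr ';' '\n' > "$outfile"
    echo "$outfile"
}

# Example usage (host names are placeholders):
#   dump_partition_info A1 1
#   dump_partition_info B1 2
```

With this naming the files do not start with T, so the glob in the commands would need to change (for example to *rack_id_*). Host names containing ':' would shift the awk field positions and should be avoided in file names.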
 

$  egrep 'S:2:1|S:2:0' T* |awk -F  ':' '{print $3,$1,$2,$4":"$5":"$6}'|grep rack_id_2|awk '{print $1" "$4}'|sort|uniq -c|sort -k 2 -n > rack_id_2_partitions.txt

$  egrep 'S:2:1|S:2:0' T* |awk -F  ':' '{print $3,$1,$2,$4":"$5":"$6}'|grep rack_id_1|awk '{print $1" "$4}'|sort|uniq -c|sort -k 2 -n > rack_id_1_partitions.txt


Or, to print only the entries whose count exceeds the number of namespaces (replace NAMESPACECOUNT with that number):
 

 egrep 'S:2:1|S:2:0' TIGER* |awk -F ':' '{print $3,$1,$2,$4":"$5":"$6}'|grep rack_id_1|awk '{print $1" "$4}'|sort|uniq -c|sort -k 2 -n |awk '{if ($1 > NAMESPACECOUNT) print $0}'

 

 egrep 'S:2:1|S:2:0' TIGER* |awk -F ':' '{print $3,$1,$2,$4":"$5":"$6}'|grep rack_id_2|awk '{print $1" "$4}'|sort|uniq -c|sort -k 2 -n |awk '{if ($1 > NAMESPACECOUNT) print $0}'


The count in the first column should equal the number of namespaces (for example, 4 for four namespaces).
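As a complementary and more direct check, the sketch below counts how many copies of each namespace/partition pair a single rack holds and prints the pairs with more than one copy, which is the master-and-prole-on-the-same-rack condition described in the Context section. It is not the method from this article. It assumes that file names contain rack_id_<N>, that each line holds one partition, and that the fields are ordered namespace:partition:state:n_replicas:replica, which is inferred from the S:2:0 and S:2:1 patterns used above. The function name is hypothetical.

```shell
# Sketch: report namespace/partition pairs with more than one copy on the same rack.
# Assumed line format: namespace:partition:state:n_replicas:replica:...
find_same_rack_copies() {
    awk -F ':' '
        FNR == 1 {
            # Derive the rack label from the name of the file being read.
            rack = "unknown"
            if (match(FILENAME, /rack_id_[0-9]+/))
                rack = substr(FILENAME, RSTART, RLENGTH)
        }
        # Keep synced partitions with 2 replicas: master (0) or prole (1).
        $3 == "S" && $4 == 2 && ($5 == 0 || $5 == 1) {
            copies[rack " " $1 " " $2]++
        }
        END {
            # Print: rack, namespace, partition ID, number of copies on that rack.
            for (key in copies)
                if (copies[key] > 1)
                    print key, copies[key]
        }
    ' "$@" | sort
}

# Example usage:
#   find_same_rack_copies *rack_id_*.txt
```

On a cluster where rack-aware placement holds, the function should print nothing. Each output line names a rack, a namespace, and a partition ID that has more than one copy on that rack.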
 


Notes

Internal KB for AER-6726

Reproduction environment:

All that is needed is a 5-node cluster with 3 racks:

Rack 1 node-ids: A1, A2, A3
Rack 2 node-id: B1
Rack 3 node-id (stay-quiesced true): C1


  • No data is needed in the cluster
  • A2 was quiesced and migrations were allowed to complete, while C1 remained permanently quiesced.
  • At the end, 269 partitions had both master and replica on Rack 1.
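The quiesce step in the reproduction above might be scripted roughly as follows. This is a sketch, not the exact procedure used for the reproduction: the function names and host arguments are hypothetical. It relies on the quiesce: and recluster: info commands and on the migrate_partitions_remaining statistic. Since a recluster request only takes effect on the principal node, the sketch sends it to every node listed.

```shell
# Sketch: quiesce one node, then ask the cluster to recluster.
# Arguments: node to quiesce, followed by every node in the cluster.
quiesce_node() {
    target="$1"
    shift
    asinfo -h "$target" -v 'quiesce:'
    # Only the principal acts on recluster, so send it to all nodes.
    for host in "$@"; do
        asinfo -h "$host" -v 'recluster:'
    done
}

# Sketch: print the number of partition migrations still pending on one node.
migrations_remaining() {
    asinfo -h "$1" -v 'statistics' | tr ';' '\n' |
        sed -n 's/^migrate_partitions_remaining=//p'
}

# Example usage (host names are placeholders):
#   quiesce_node A2 A1 A2 A3 B1 C1
#   migrations_remaining A1
```

Once migrations_remaining reports 0 on every node, partition-info can be collected and checked as described in the Method section.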

Applies To Earliest Version

5.0

Applies To Latest Version

6.0