Context
When a new remote destination namespace is added to a cluster (XDR configuration), it is necessary to populate that destination with data. This article discusses the two methods that are available within Aerospike 5.x and higher to achieve this.
Method
Method 1
The first method uses the XDR rewind functionality. Rewind can be used to ship the whole namespace or only a specific set, and it can start from a specific timestamp. The article How to rewind XDR for a namespace? describes how to use rewind.
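As a rough illustration of what a rewind looks like in practice, the sketch below builds the two info commands that are typically issued through asinfo: one to remove the namespace from the DC and one to add it back with a rewind value. The command format is assumed from the Aerospike 5.x dynamic XDR configuration and should be checked against the rewind article. The names DC1 and test are placeholders, and the helper rewind_commands is hypothetical.

```python
def rewind_commands(dc, namespace, rewind="all"):
    """Return the info commands that re-add a namespace to a DC with a rewind.

    rewind is either the string "all" (ship the whole namespace) or a
    positive number of seconds to rewind from the current time.
    The command format is an assumption based on Aerospike 5.x.
    """
    is_seconds = isinstance(rewind, int) and not isinstance(rewind, bool)
    if rewind != "all" and not (is_seconds and rewind > 0):
        raise ValueError("rewind must be 'all' or a positive number of seconds")
    base = f"set-config:context=xdr;dc={dc};namespace={namespace}"
    return [
        f"{base};action=remove",                  # stop shipping the namespace
        f"{base};action=add;rewind={rewind}",     # re-add it with a rewind
    ]


# Each string would be passed to asinfo, for example: asinfo -v "<command>"
for command in rewind_commands("DC1", "test", rewind=3600):
    print(command)
```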
When using the rewind feature, XDR will reduce the namespace partitions and ship records as it encounters them. This means that the shipping volume is controlled by the overall XDR velocity controls.
This implies that the following points should be considered:
- If the XDR ship latency is constant, i.e. close to the actual network link latency between source and destination, then the throughput can be increased by increasing the XDR max-throughput parameter (if it was previously configured). The XDR latency_ms metric can be monitored directly across a cluster through the info xdr command in asadm.
- If latency_ms starts increasing, it indicates that the network or destination is likely being overloaded. At this point, the XDR throughput should be reduced via the max-throughput configuration parameter.
- For versions 5.5 and above, the max-recoveries-interleaved configuration parameter can be used to control how recoveries are done. This can be useful if recoveries keep occurring because the time it takes to complete a round for a partition is longer than the time it takes for the XDR transaction queue to fill up.
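The first two points amount to a simple feedback rule: raise max-throughput while latency_ms stays near the link latency, and lower it when latency_ms climbs. The sketch below is one possible way to express that rule, not an Aerospike feature. The function name, the 1.5x tolerance, the halving on back-off, and the rounding to a multiple of 100 are all assumptions chosen for illustration; check the configuration reference for the values max-throughput actually accepts.

```python
def next_max_throughput(current, latency_ms, baseline_ms,
                        step=100, tolerance=1.5, floor=100):
    """Suggest the next max-throughput value from the observed XDR latency.

    current      -- the max-throughput currently configured (must be > 0)
    latency_ms   -- the latest latency_ms reading for the DC
    baseline_ms  -- the normal network link latency to the destination
    """
    if current <= 0:
        raise ValueError("max-throughput must already be configured")
    if latency_ms > baseline_ms * tolerance:
        proposed = current // 2        # latency rising: back off quickly
    else:
        proposed = current + step      # latency steady: probe upwards
    proposed = (proposed // 100) * 100  # assumption: keep a multiple of 100
    return max(floor, proposed)


print(next_max_throughput(1000, latency_ms=5, baseline_ms=5))
print(next_max_throughput(1000, latency_ms=20, baseline_ms=5))
```

The suggested value would then be applied through the max-throughput configuration parameter, and latency_ms observed again before making the next adjustment.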
Method 2
The second method is to use a UDF. Please see How to create a scan and touch UDF. This UDF touches either all records in a set or only those records updated after a certain last-update time (LUT). When a record is touched, it is automatically added to the XDR transaction queue for subsequent shipping. A touch does not affect the contents of the record, but it does update its metadata (specifically the LUT, at the record level and also at the bin level if configured through the conflict-resolve-writes parameter). The UDF scan traverses the whole index and processes either all records, those in a specific set, or those with a specific LUT. Aerospike Expressions can be used with UDFs.
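As a minimal sketch of how such a UDF might be launched, the function below starts a background scan that applies a record UDF to every record in a set, using the Aerospike Python client's scan, apply and execute_background calls. It assumes the UDF has already been registered on the cluster. The module name touch_udf and the function name touch are hypothetical placeholders for whatever the referenced article produces, as is the helper start_touch_scan.

```python
def start_touch_scan(client, namespace, set_name,
                     module="touch_udf", function="touch"):
    """Start a background scan that applies a registered UDF to each record.

    client is a connected Aerospike Python client. The module and function
    names are placeholders for an already registered touch UDF.
    Returns the job id of the background scan.
    """
    scan = client.scan(namespace, set_name)
    scan.apply(module, function)        # run the UDF against every record
    return scan.execute_background()    # returns immediately with a job id


# Usage sketch (requires the aerospike package and a reachable cluster):
#   import aerospike
#   client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()
#   job_id = start_touch_scan(client, "test", "demo")
#   client.close()
```

Because the scan runs in the background on the server, the call returns before any records have been touched; the XDR transaction queue then fills as the scan progresses.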
Using this UDF method, the XDR shipping rate can also be controlled using the max-throughput parameter. Additionally, the single-scan-threads and background-scan-max-rps parameters can be used to control the pace of the scan itself.
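To make the scan-side throttles concrete, the sketch below builds the info commands that would adjust them dynamically. It assumes both parameters live in the namespace context and can be changed with set-config, as is understood to be the case in Aerospike 5.x; the helper scan_pace_commands, the namespace name test, and the values shown are illustrative only.

```python
def scan_pace_commands(namespace, max_rps=None, threads=None):
    """Build set-config commands for the background scan throttles.

    max_rps  -- value for background-scan-max-rps (records per second)
    threads  -- value for single-scan-threads
    Only the parameters that are given produce a command.
    """
    settings = {
        "background-scan-max-rps": max_rps,
        "single-scan-threads": threads,
    }
    commands = []
    for name, value in settings.items():
        if value is None:
            continue
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
        commands.append(
            f"set-config:context=namespace;id={namespace};{name}={value}"
        )
    return commands


# Each string would be passed to asinfo, for example: asinfo -v "<command>"
for command in scan_pace_commands("test", max_rps=5000, threads=2):
    print(command)
```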
Because the second method uses the same XDR controls as the first but can also be throttled using the scan controls described above, it offers slightly more control over shipping velocity, at the price of tampering with the record's LUT(s).
Notes
- It is possible to use backup and restore to create a baseline data load for the remote cluster and use XDR to migrate the delta in real time. This method is discussed in How to Migrate Aerospike from one Cluster to Another.