Problem Description
xdr_client_write_error increases steadily on the destination cluster after bin convergence is enabled on that cluster only.
Admin> show stat like client_write_error -flip
~test Namespace Statistics (2022-10-10 16:49:43 UTC)~
Node|client_write_error|xdr_client_write_error
dc6-1:3000| 10| 10
Number of rows: 1
Server log errors on destination cluster:
WARNING (rw): (write.c:939) {test} write_master: can't replace record c1ed607e1b3ccfc10fe746ee42bdbbbcdd3a1b7c if conflict resolving
WARNING (rw): (rw_utils_ee.c:530) unexpected src-lut 0
Server log errors on source cluster:
WARNING (xdr): (dc.c:2628) {test} DC dc6 abandon result 22
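To confirm that a log file shows this specific combination of symptoms, the three warnings above can be counted with a short script. This is a minimal sketch: the regular expressions are derived only from the sample lines quoted in this article, and the label names (replace_rejected, missing_src_lut, xdr_abandon_22) are illustrative, not Aerospike terminology.

```python
import re
from collections import Counter

# Patterns derived from the warning lines quoted in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "replace_rejected": re.compile(r"can't replace record \w+ if conflict resolving"),
    "missing_src_lut": re.compile(r"unexpected src-lut 0"),
    "xdr_abandon_22": re.compile(r"DC \S+ abandon result 22"),
}


def count_symptoms(lines):
    """Count how many log lines match each known warning pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


sample = [
    "WARNING (rw): (write.c:939) {test} write_master: can't replace record "
    "c1ed607e1b3ccfc10fe746ee42bdbbbcdd3a1b7c if conflict resolving",
    "WARNING (rw): (rw_utils_ee.c:530) unexpected src-lut 0",
    "WARNING (xdr): (dc.c:2628) {test} DC dc6 abandon result 22",
]

for label, count in sorted(count_symptoms(sample).items()):
    print(f"{label}: {count}")
```

In practice the lines would come from the server log on each cluster, for example by passing an open file object to count_symptoms.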
Explanation
If bin convergence is enabled only on the destination cluster, the “can’t replace record…if conflict resolving” and “unexpected src-lut 0” warnings appear in the destination logs, and xdr_client_write_error also increases. The “can’t replace record…if conflict resolving” warning occurs because, when bin convergence is enabled, clients cannot write with the record replace policy. Please see our documentation on Record replace not allowed for more details.
The source cluster logs “abandon result 22” because, without bin convergence enabled, the records it ships carry no src-lut, so the destination rejects them.
NOTE: A bin-convergence cluster can ship to a non-bin-convergence cluster without any issues. Only shipping from a non-bin-convergence cluster to a bin-convergence cluster results in these warnings and an increasing xdr_client_write_error.
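The direction rule in the note above can be written as a one-line check. This is only an illustration of the rule as stated in this article; the function name and its boolean parameters are hypothetical and do not correspond to any Aerospike API.

```python
def shipping_causes_errors(source_has_bin_convergence, destination_has_bin_convergence):
    """Return True when XDR shipping in this direction is expected to
    produce the warnings described in this article.

    Only one combination is a problem: the destination has bin
    convergence enabled and the source does not.
    """
    return destination_has_bin_convergence and not source_has_bin_convergence


# Print the full truth table for the four possible pairings.
for source in (False, True):
    for destination in (False, True):
        outcome = "errors" if shipping_causes_errors(source, destination) else "ok"
        print(f"source={source!s:<5} destination={destination!s:<5} -> {outcome}")
```

For an active-active pair, the check has to pass in both directions, which is only possible when both clusters have the same setting.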
Solution
The only way to resolve these errors is to enable bin convergence on both the source and the destination. In an active-active topology, every cluster is both a source and a destination, so bin convergence must be enabled on every cluster.
There are two ways to enable bin convergence, statically and dynamically.
The following examples show the minimum configurations you would need to enable bin convergence.
Statically
namespace test {
    # other namespace settings...
    conflict-resolve-writes true
    # other namespace settings...
}
# ...
xdr {
    src-id 1
    dc dc5 {
        node-address-port 172.17.0.2 3000
        namespace test {
            # bin-policy may also be changed-and-specified or changed-or-specified
            bin-policy only-changed
            ship-bin-luts true
            write-policy update
            # more xdr settings ...
        }
    }
    # more xdr settings ...
}
Dynamically
The following changes should be made in the order listed below. For example, trying to set ship-bin-luts to true first would cause errors such as:
WARNING (xdr): (info.c:769) can't set 'ship-bin-luts' true if xdr context 'src-id' is 0
WARNING (xdr): (info.c:774) can't set 'ship-bin-luts' true if bin policy is 'all'
src-id: valid numbers 0-255, but it must be non-zero for bin convergence (ship-bin-luts cannot be enabled while src-id is 0); each cluster must have a unique src-id
asinfo -v "set-config:context=xdr;src-id=1"
bin-policy: must be set to only-changed, changed-and-specified, or changed-or-specified
asinfo -v "set-config:context=xdr;dc=DC1;namespace=namespaceName;bin-policy=only-changed"
write-policy: must be set to update
asinfo -v "set-config:context=xdr;dc=DC1;namespace=namespaceName;write-policy=update"
ship-bin-luts: must be set to true
asinfo -v "set-config:context=xdr;dc=DC1;namespace=namespaceName;ship-bin-luts=true"
conflict-resolve-writes: must be set to true
asinfo -v "set-config:context=namespace;id=namespaceName;conflict-resolve-writes=true"
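Because the order matters, it can help to generate the five commands from one place rather than typing them individually. The sketch below only builds and prints the command strings shown above; it does not run asinfo. The function name bin_convergence_commands and its parameters are hypothetical, and the two checks are assumptions based on the two warnings quoted earlier in this section (src-id must not be 0, bin-policy must not be all).

```python
# bin-policy values listed in this article as compatible with ship-bin-luts.
ALLOWED_BIN_POLICIES = {"only-changed", "changed-and-specified", "changed-or-specified"}


def bin_convergence_commands(dc, namespace, src_id, bin_policy="only-changed"):
    """Return the asinfo commands for enabling bin convergence, in order."""
    # Mirrors the "can't set 'ship-bin-luts' true if xdr context 'src-id' is 0" warning.
    if not 1 <= src_id <= 255:
        raise ValueError("src-id must be in the range 1-255")
    # Mirrors the "can't set 'ship-bin-luts' true if bin policy is 'all'" warning.
    if bin_policy not in ALLOWED_BIN_POLICIES:
        raise ValueError(f"bin-policy {bin_policy!r} is not compatible with ship-bin-luts")

    xdr_ns = f"set-config:context=xdr;dc={dc};namespace={namespace}"
    return [
        f'asinfo -v "set-config:context=xdr;src-id={src_id}"',
        f'asinfo -v "{xdr_ns};bin-policy={bin_policy}"',
        f'asinfo -v "{xdr_ns};write-policy=update"',
        f'asinfo -v "{xdr_ns};ship-bin-luts=true"',
        f'asinfo -v "set-config:context=namespace;id={namespace};conflict-resolve-writes=true"',
    ]


for command in bin_convergence_commands("DC1", "test", src_id=1):
    print(command)
```

Each printed command can then be run on the cluster in the order given, checking the response of one before moving on to the next.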
NOTE: If you plan to enable bin convergence, any clients writing to those clusters also need their write policies set to update; otherwise the “can’t replace record…if conflict resolving” warning will occur.