Problem Description
A standalone cluster that is neither an XDR source nor an XDR destination can throw this exception. The exception looks misleading, as there is no XDR involved here. The exception is ERROR-28 AS_ERR_LOST_CONFLICT, which indicates that the client write lost to a conflicting write by XDR. This can happen when bin convergence is on and XDR wrote a later version of the record than the client write.
There is one more case where this can happen even when there is no XDR involved.
Explanation
Here are the conditions for this to happen in a standalone cluster.
- conflict-resolve-writes is set to true.
- Namespace is SC.
- XDR is not configured.
If conflict-resolve-writes is set to true and the namespace is SC, but XDR is not configured, then we do not introduce the delay that would otherwise keep the last-update time (LUT) of the bins distinct from the previous one. We let the writes go through, which is when we see this error.
This is specific to SC only. If the namespace is AP with the rest of the parameters unchanged, one would not see this.
In short, these are the points.
- In SC, if an update comes in and the LUT it would get is earlier than the LUT of the existing record, we wait and do not let it through. We re-queue it internally until the current time is equal to or greater than the existing record's LUT (or we time out if we wait too long).
- If the namespace were configured to ship via XDR, we would wait and not ship if the LUT of the bins to be shipped matched the previously existing LUT.
- In this case, we don't have XDR configured, so we let records go through with the same LUT as before for the bins.
If we end up trying to write a bin with the same LUT as before, the write fails if conflict-resolve-writes is turned on.
So, this would only happen when XDR is not configured to ship, the namespace is running in SC, and conflict-resolve-writes is turned on. The clock does not need to go back: simply having two writes in the same millisecond is enough.
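
The decision flow described above can be sketched as a small toy model. This is only an illustration of the conditions in this article, not the server's actual code: the `Namespace` class, the `apply_write` function and the way the "wait" is modelled (bumping the LUT by one millisecond) are all hypothetical. The AP branch simply follows the statement above that AP namespaces do not show this error; the reason is not modelled.

```python
from dataclasses import dataclass

AS_ERR_LOST_CONFLICT = 28  # error code named in this article
OK = 0                     # hypothetical "write applied" result for this sketch


@dataclass
class Namespace:
    """Hypothetical stand-in for the three settings discussed in the article."""
    strong_consistency: bool       # namespace is SC
    conflict_resolve_writes: bool  # conflict-resolve-writes is true
    xdr_configured: bool           # namespace ships via XDR


def apply_write(ns, existing_lut_ms, now_ms):
    """Toy model of one bin update; returns (result_code, resulting_lut_ms)."""
    lut = now_ms
    if ns.strong_consistency and lut < existing_lut_ms:
        # SC: the write is re-queued until the clock reaches the existing LUT.
        lut = existing_lut_ms
    if lut == existing_lut_ms:
        if ns.xdr_configured:
            # With XDR shipping configured, the write waits so the LUT differs
            # (modelled here as a one-millisecond bump).
            lut = existing_lut_ms + 1
        elif ns.strong_consistency and ns.conflict_resolve_writes:
            # No XDR, SC, conflict resolution on: same LUT as before, so it fails.
            return AS_ERR_LOST_CONFLICT, existing_lut_ms
    return OK, lut


standalone_sc = Namespace(strong_consistency=True,
                          conflict_resolve_writes=True,
                          xdr_configured=False)

# Two writes in the same millisecond on the standalone SC namespace.
same_ms = apply_write(standalone_sc, existing_lut_ms=1000, now_ms=1000)
# A write one millisecond later goes through.
next_ms = apply_write(standalone_sc, existing_lut_ms=1000, now_ms=1001)
print(same_ms, next_ms)
```

In this model the first call returns error 28 and the second succeeds, which matches the scenario above: no clock regression is needed, only two writes landing in the same millisecond.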