Asrestore failing with an error "Max key-put retries exceeded (5)"

Problem Description

asrestore fails with the error "Max key-put retries exceeded (5)".

Command used for asrestore:
asrestore --host <IP Address> --directory /<directory-path> --namespace <namespace name>

Explanation

Because the asrestore output shows no error other than "Max key-put retries exceeded (5)", even with the --verbose option, a dev build was used to enable further debugging and obtain more detailed asrestore log output.

In the detailed log output, the following errors can be observed:

Example asrestore output with detailed logging:
2022-12-28 13:21:11 GMT [VER] [13000] Retrying unhandled write error - code -6: Socket create failed: -1 BB96B0016AC0142 172.22.0.107:3000 at src/main/aerospike/as_event_uv.c:1400
2022-12-28 13:21:11 GMT [INF] [13032] Restoring /mnt/userprofile_00348.asb
2022-12-28 13:21:11 GMT [VER] [13032] Opening backup file /mnt/userprofile_00348.asb
2022-12-28 13:21:11 GMT [VER] [13000] Retrying unhandled write error - code -6: Socket create failed: -1 BB96B0016AC0142 172.22.0.107:3000 at src/main/aerospike/as_event_uv.c:1400
2022-12-28 13:21:11 GMT [VER] [13000] Retrying unhandled write error - code -6: Socket create failed: -1 BB96B0016AC0142 172.22.0.107:3000 at src/main/aerospike/as_event_uv.c:1400
2022-12-28 13:21:11 GMT [ERR] [13032] Failed to open file /mnt/userprofile_00348.asb in "r" mode (error 24: Too many open files)
2022-12-28 13:21:11 GMT [ERR] [13032] Error while opening backup file
2022-12-28 13:21:11 GMT [VER] [13000] Retrying unhandled write error - code -6: Socket create failed: -1 BB96B0016AC0142 172.22.0.107:3000 at src/main/aerospike/as_event_uv.c:1400
2022-12-28 13:21:11 GMT [VER] [13000] Retrying unhandled write error - code -6: Socket create failed: -1 BB96B0016AC0142 172.22.0.107:3000 at src/main/aerospike/as_event_uv.c:1400
The original failure "Max key-put retries exceeded (5)" and the "Too many open files" error are related.

The Linux default limit on open file descriptors per process is 1024. When the server version is earlier than 6.0, asrestore can use up to max-async-batches * batch-size asynchronous connections, each of which is a socket and therefore consumes a file descriptor. With the defaults of 32 for max-async-batches and 16 for batch-size, that is up to 32 * 16 = 512 sockets.
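As a rough illustration of the arithmetic above, the following sketch works out how many descriptors the async sockets can consume and how many remain under a 1024 limit. The variable names are illustrative only and the values are the defaults quoted in this article; the real limit on a given host may differ.

```shell
# Defaults quoted in this article for servers earlier than 6.0.
max_async_batches=32
batch_size=16
# Assumed per-process limit (the common Linux default); check with `ulimit -n`.
fd_limit=1024

# Each async connection is a socket, so each one uses a file descriptor.
sockets=$((max_async_batches * batch_size))
remaining=$((fd_limit - sockets))

echo "async sockets at most: $sockets"
echo "descriptors left for backup files and everything else: $remaining"
```

With these values, half of the default limit can be taken by sockets alone, which leaves limited room for open backup files.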

If there are many backup files, asrestore can exceed the maximum number of open file descriptors. Once the limit is reached, the C client fails to open new sockets and returns error -6 (AEROSPIKE_ERR_ASYNC_CONNECTION). asrestore retries the failed writes and eventually fails with the max-retry exceeded error.
 

Solution

To resolve this, check the per-process file descriptor limit and raise it for asrestore if it is too low.

For servers earlier than 6.0, the limit should be at least number_of_backup_files + (max-async-batches * batch-size). For server 6.0 and later, it should be at least number_of_backup_files + max-async-batches. It is better to set the limit a little above the result of the formula to be safe. Alternatively, restore from fewer, larger backup files.
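The formula can be applied with a small shell sketch such as the one below. The function name required_fds, the BACKUP_DIR variable, the placeholder path, and the headroom of 64 descriptors are assumptions made for illustration; they are not part of asrestore. The sketch also assumes the backup files end in .asb, as in the log output above.

```shell
# required_fds FILES MAX_ASYNC_BATCHES BATCH_SIZE SERVER_MAJOR [HEADROOM]
# Prints the suggested descriptor limit using the formula from this article.
# HEADROOM (default 64) is an arbitrary safety margin, not an asrestore value.
required_fds() {
    files=$1; batches=$2; batch_size=$3; major=$4; headroom=${5:-64}
    if [ "$major" -lt 6 ]; then
        sockets=$((batches * batch_size))   # servers earlier than 6.0
    else
        sockets=$batches                    # server 6.0 and later
    fi
    echo $((files + sockets + headroom))
}

# Placeholder path: point BACKUP_DIR at the directory passed to --directory.
backup_dir=${BACKUP_DIR:-/path/to/backup}
file_count=$(find "$backup_dir" -maxdepth 1 -name '*.asb' 2>/dev/null | wc -l | tr -d ' ')

# Example values: default max-async-batches 32, batch-size 16, server major version 5.
needed=$(required_fds "$file_count" 32 16 5)
current=$(ulimit -n)

if [ "$current" = "unlimited" ] || [ "$current" -ge "$needed" ]; then
    echo "Current limit ($current) covers the estimated need ($needed)."
else
    echo "Current limit ($current) is below the estimated need ($needed)."
    echo "Run 'ulimit -n $needed' in the same shell before starting asrestore."
fi
```

Note that ulimit -n only raises the soft limit up to the hard limit of the session; if the hard limit is lower than the required value, it has to be raised by an administrator first.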


Applies To Earliest Version

Pre 4.9

Applies To Latest Version

Current Version