Server: Bad File Descriptor or similar error potentially leading to a crash (SIGUSR1, SIGTERM, SIGKILL) when using LDAP

Problem Description

Environment

This applies only to environments that use an OpenLDAP version earlier than 2.5.17 (on the 2.5.x line) or earlier than 2.6.7 (on the 2.6.x line).

Symptoms

An error similar to the following is observed in the logs:

Oct 20 2024 11:38:44 GMT-0500: CRITICAL (socket): (socket.c:1642) Error while adding FD 2817 to epoll instance 1105: 9 (Bad file descriptor)

This may be followed by a crash with a SIGUSR1, SIGTERM, or SIGKILL signal.

The following errors may also be observed on the affected nodes as well as other nodes in the cluster:

Oct 20 2024 11:38:44 GMT-0500: WARNING (security): (ldap_ee.c:703) error binding to ldap for user <...>: -1 (Can't contact LDAP server) 

Explanation of the issue:
The issue is caused by certain older versions of OpenLDAP (pre-2.5.17 and pre-2.6.7). If a connection to an LDAP server fails, the affected versions of OpenLDAP attempt to close that connection’s file descriptor (FD) twice. In most cases the second close fails silently and causes no harm. However, if the same FD number is reused between the first and second close, the second close shuts the FD out from under its new owner, which causes the issue described here.

Here is an example:

  • OpenLDAP uses FD 1234 for its connection to the LDAP server.

  • OpenLDAP closes FD 1234. This makes FD 1234 available to be handed out to another asd thread.

  • A new client connection is accepted by asd. FD 1234 gets used for that client connection.

  • OpenLDAP (erroneously) closes FD 1234 again due to the internal OpenLDAP issue. At this point, FD 1234 is associated with a new valid client connection.

  • When asd then tries to assign the client connection’s FD 1234 to a service thread, the attempt fails with error code 9 (Bad file descriptor).
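
The sequence above can be reproduced in miniature with a short Python sketch. This is only an illustration under stated assumptions, not the code of asd or OpenLDAP: the function name double_close is hypothetical, /dev/null stands in for both the LDAP socket and the client socket, and os.fstat stands in for the step that registers the FD with epoll. It relies on the POSIX rule that a newly opened descriptor gets the lowest free number, and it assumes nothing else in the process opens or closes descriptors at the same time.

```python
import errno
import os


def double_close(reuse_fd_between_closes: bool) -> dict:
    """Close one descriptor number twice, optionally reusing it in between."""
    # Stand-in for the FD that the LDAP library uses for its server connection.
    ldap_fd = os.open(os.devnull, os.O_RDONLY)
    os.close(ldap_fd)  # first, legitimate close

    client_fd = None
    if reuse_fd_between_closes:
        # Stand-in for a newly accepted client connection. The lowest free
        # descriptor number is handed out, which is normally ldap_fd's number.
        client_fd = os.open(os.devnull, os.O_RDONLY)

    # Second, erroneous close of the same descriptor number.
    try:
        os.close(ldap_fd)
        second_close_errno = 0  # succeeded: the number had a new owner
    except OSError as exc:
        second_close_errno = exc.errno  # EBADF: nothing was open, harmless

    client_errno = None
    if client_fd is not None:
        # The client side still believes its descriptor is open.
        try:
            os.fstat(client_fd)
            client_errno = 0
            os.close(client_fd)  # still valid, so clean it up
        except OSError as exc:
            client_errno = exc.errno

    return {
        "fd_reused": client_fd == ldap_fd,
        "second_close_errno": second_close_errno,
        "client_errno": client_errno,
    }


if __name__ == "__main__":
    print("no reuse:", double_close(False))
    print("reuse:   ", double_close(True))
```

Without reuse, the second close fails with EBADF and nothing else is affected. With reuse, the second close succeeds, and the next operation on the client descriptor is the one that fails with EBADF, which mirrors the error reported in the log.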

Fix:
Upgrade OpenLDAP to 2.5.17 (or a later 2.5.x release) or to 2.6.7 (or a later 2.6.x release). See here for details.
Alternatively, use another solution such as Identity Management or Red Hat Directory Server. 
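
As a rough aid for checking whether an installed OpenLDAP version falls in the affected range, the comparison can be sketched in Python as below. The function names parse_version and is_affected are hypothetical, and the sketch makes two assumptions that the article does not state explicitly: release lines older than 2.5 are treated as affected (they are earlier than 2.5.17), and release lines newer than 2.6 are treated as not affected. How the version string is obtained depends on the platform and is not covered here.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '2.5.16' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(version: str) -> bool:
    """Return True if the version is earlier than the fixed release on its line."""
    v = parse_version(version)
    if v[:2] == (2, 6):
        # 2.6.x line: fixed in 2.6.7.
        return v < (2, 6, 7)
    # 2.5.x line is fixed in 2.5.17. Older lines are assumed affected,
    # newer lines are assumed not affected.
    return v < (2, 5, 17)


if __name__ == "__main__":
    for candidate in ("2.5.16", "2.5.17", "2.6.3", "2.6.7"):
        print(candidate, "affected" if is_affected(candidate) else "not affected")
```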

Workaround:
It may be possible to avoid this issue by ensuring that only one LDAP server is specified in the Aerospike server configuration.

