Context
Service threads are responsible for demarshalling incoming client traffic. When these threads become overwhelmed or stalled, a backlog builds up in the network queue, which can lead to service thread starvation. This article describes the steps needed to identify and confirm service thread starvation.
A further indicator is the count of connection states in the netstat output: the number of CLOSE_WAIT connections increases without a corresponding number of ESTABLISHED connections. This indicates that clients attempted to make new connections but failed, leaving a significant number of CLOSE_WAIT entries in the netstat output.
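The state comparison described above can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: the function name count_tcp_states is hypothetical, it assumes the usual netstat column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and the client ports 40050 and 40052 in the sample text are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_text, port):
    """Count TCP states for connections whose local address ends with :port."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State [PID/Program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[3].endswith(f":{port}"):
            counts[fields[5]] += 1
    return counts


# Illustrative input only; the second and third lines are invented.
sample = """\
tcp 0 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
tcp 2195 0 192.168.202.18:3010 192.168.34.194:40050 CLOSE_WAIT 1/asd
tcp 2195 0 192.168.202.18:3010 192.168.34.194:40052 CLOSE_WAIT 1/asd
"""
states = count_tcp_states(sample, 3010)
print(states["CLOSE_WAIT"], states["ESTABLISHED"])
```

Running the count on successive captures and comparing the CLOSE_WAIT and ESTABLISHED totals over time shows whether CLOSE_WAIT is growing on its own.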
Method
A reliable way to assess service thread performance is to run a script on the server node(s) that captures netstat output every 10 seconds. When analyzing the captures, focus on the Recv-Q column, which for an established socket reports the bytes that have been received but not yet read by the application.
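One possible shape for such a capture script is sketched below in Python. It is an illustration under stated assumptions rather than the script used to produce the capture in this article: the function names filter_port and capture, the log path, and the choice of the netstat -tanp flags are all assumptions, and netstat typically needs root privileges to show the PID/Program column for every socket.

```python
import subprocess
import time
from datetime import datetime, timezone


def filter_port(netstat_text, port):
    """Return the TCP lines whose local address ends with :port."""
    kept = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State [PID/Program]
        if (
            len(fields) >= 6
            and fields[0].startswith("tcp")
            and fields[3].endswith(f":{port}")
        ):
            kept.append(line.strip())
    return kept


def capture(port, log_path, interval=10, samples=None):
    """Append a timestamped netstat snapshot to log_path every `interval` seconds.

    samples=None runs until interrupted; an integer stops after that many snapshots.
    """
    taken = 0
    with open(log_path, "a") as log:
        while samples is None or taken < samples:
            stamp = datetime.now(timezone.utc).strftime("%a %b %d %H:%M:%S UTC %Y")
            # -t TCP only, -a all sockets, -n numeric addresses, -p owning process
            result = subprocess.run(
                ["netstat", "-tanp"], capture_output=True, text=True, check=False
            )
            log.write(stamp + "\n")
            for line in filter_port(result.stdout, port):
                log.write(line + "\n")
            log.flush()
            taken += 1
            if samples is None or taken < samples:
                time.sleep(interval)
```

Calling capture(3010, "netstat_3010.log") would append a snapshot every 10 seconds until interrupted; both arguments here are placeholders for the service port and a log location of your choice.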
The following sample shows netstat output for a connection between the server on port 3010 and a client on port 40048. The second column is Recv-Q and the third is Send-Q:
Tue Oct 10 20:50:52 UTC 2023
tcp 0 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:02 UTC 2023
tcp 2163 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:12 UTC 2023
tcp 2163 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:22 UTC 2023
tcp 2163 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:32 UTC 2023
tcp 2195 0 192.168.202.18:3010 192.168.34.194:40048 CLOSE_WAIT 1/asd
In this sample, the service threads are having difficulty demarshalling the queue: Recv-Q rises from 0 to 2163, stays at that value for three consecutive samples, and peaks at 2195. At that point the connection state changes from ESTABLISHED to CLOSE_WAIT, which indicates that the client most likely timed out and closed the connection from its side.
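The pattern above (a nonzero Recv-Q that never drains across consecutive samples) can be checked automatically against a capture log. The following Python sketch assumes the log format shown in the sample, with a timestamp line followed by netstat lines. The function names parse_samples and find_stalled are hypothetical, and the threshold of three consecutive samples is an arbitrary assumption that should be tuned to your capture interval.

```python
def parse_samples(log_text, port):
    """Split a capture log into [(timestamp, {(local, remote): (recv_q, state)})]."""
    samples = []
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("tcp") and len(fields) >= 6:
            if samples and fields[3].endswith(f":{port}"):
                conn = (fields[3], fields[4])
                samples[-1][1][conn] = (int(fields[1]), fields[5])
        else:
            # Any other non-empty line is treated as the start of a new sample.
            samples.append((line.strip(), {}))
    return samples


def find_stalled(samples, min_repeats=3):
    """Return connections whose Recv-Q stayed nonzero without shrinking
    for at least min_repeats consecutive samples (assumed threshold)."""
    streak = {}    # conn -> (consecutive count, last Recv-Q seen)
    findings = {}  # conn -> (timestamp, Recv-Q, state) at the latest match
    for stamp, conns in samples:
        for conn, (recv_q, state) in conns.items():
            count, last = streak.get(conn, (0, 0))
            count = count + 1 if recv_q > 0 and recv_q >= last else 0
            streak[conn] = (count, recv_q)
            if count >= min_repeats:
                findings[conn] = (stamp, recv_q, state)
        # Forget connections that are absent from this sample.
        for conn in list(streak):
            if conn not in conns:
                del streak[conn]
    return findings


sample_log = """\
Tue Oct 10 20:50:52 UTC 2023
tcp 0 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:02 UTC 2023
tcp 2163 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:12 UTC 2023
tcp 2163 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:22 UTC 2023
tcp 2163 0 192.168.202.18:3010 192.168.34.194:40048 ESTABLISHED 1/asd
Tue Oct 10 20:51:32 UTC 2023
tcp 2195 0 192.168.202.18:3010 192.168.34.194:40048 CLOSE_WAIT 1/asd
"""
for conn, (stamp, recv_q, state) in find_stalled(parse_samples(sample_log, 3010)).items():
    print(conn, stamp, recv_q, state)
```

A connection reported here with a CLOSE_WAIT state is a strong candidate for the client-side timeout described above, while one still in ESTABLISHED is backed up but has not yet been abandoned by the client.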