Socket auto disconnect after inaction

georgbat wrote on Thursday, May 16, 2019:

I want to configure the auto-disconnection time for the TCP stack. My config file has two time-related values: ipconfigTCP_HANG_PROTECTION_TIME=300 and ipconfigTCP_KEEP_ALIVE_INTERVAL=20.
But my observations show that the disconnect occurs after 90 seconds.
Can you suggest how to control this time deliberately, rather than by trial and error?
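
For reference, the two values above would typically sit in FreeRTOSIPConfig.h as shown in this fragment. The two enable macros (ipconfigTCP_HANG_PROTECTION and ipconfigTCP_KEEP_ALIVE) are not mentioned in the post; they are assumed here to be set to 1, since the time values only matter when the corresponding feature is switched on.

```c
/* FreeRTOSIPConfig.h (fragment) */

/* Assumption: both features are enabled; the post only quotes the time values. */
#define ipconfigTCP_HANG_PROTECTION         1
#define ipconfigTCP_KEEP_ALIVE              1

/* Values quoted in the post, both in seconds. */
#define ipconfigTCP_HANG_PROTECTION_TIME    300
#define ipconfigTCP_KEEP_ALIVE_INTERVAL     20
```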

heinbali01 wrote on Thursday, May 16, 2019:

Hi George, let me first ask what you are creating? Is it a TCP server?
Is it a web-server, or a server that responds to questions?
Do I understand correctly that you want to disconnect automatically from a client, once the client doesn’t seem to respond? Are those clients somewhere on the web?

Normally, and in most cases, a TCP connection comes to an end because of a shutdown. All operating systems are programmed to finish the last steps correctly (FIN, FIN+ACK, ACK).

So, using keep-alive messages is an emergency thing, normally only useful when the peer is on the Internet.

I will summarise the two options that you mention:

ipconfigTCP_HANG_PROTECTION_TIME (seconds)
when the socket is not yet connected, we cannot send keep-alive packets. During that state, a simple timer is used to protect the socket from staying in that phase forever.
Normally, this timer shouldn’t do much: an active connect is limited in time: it makes 3 attempts, and after the last attempt, it gets into an error state eCLOSE_WAIT.

ipconfigTCP_KEEP_ALIVE_INTERVAL (seconds)
when a connection is idle ( i.e. no data are being sent or received ), the socket will send keep-alive packets. It has a limited use: it is not supported by all routers or TCP clients.
The timing of this process has some randomness: as long as data is being exchanged, the socket is happy. Only after a total silence of ipconfigTCP_KEEP_ALIVE_INTERVAL seconds, the socket will send a keep-alive message.
A second and third keep-alive message will be sent, each with a fixed time-out of 3 seconds (yes, hard-coded).
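
The timing described above can be turned into a small back-of-the-envelope calculation. This is only a rough model of the description in this post, not code taken from the stack: the constant names are made up for illustration, and it assumes the socket gives up one retry period after the third probe goes unanswered.

```c
#include <stdint.h>

/* Hypothetical names; the values come from the thread. */
#define KEEP_ALIVE_INTERVAL_S   20u  /* ipconfigTCP_KEEP_ALIVE_INTERVAL from the question */
#define KEEP_ALIVE_RETRY_S       3u  /* fixed time-out between probes, as described above */
#define KEEP_ALIVE_PROBES        3u  /* first probe plus the second and third */

/* Seconds of silence after which the n-th probe (1-based) is sent. */
static uint32_t probe_time_s(uint32_t n)
{
    return KEEP_ALIVE_INTERVAL_S + (n - 1u) * KEEP_ALIVE_RETRY_S;
}

/* Seconds of silence after which the last probe has timed out.
 * Assumption: the connection is considered dead at this point. */
static uint32_t give_up_time_s(void)
{
    return probe_time_s(KEEP_ALIVE_PROBES) + KEEP_ALIVE_RETRY_S;
}
```

Under these assumptions the probes go out after 20, 23 and 26 seconds of silence, and the model gives up after 29 seconds. If the measured time differs a lot from this figure, logging is the way to find out which timer actually fires.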

Note that the low-level keep-alive messages are sent to the other OS, and not to the client. Maybe the client is dead/non-functional, while the OS still responds to the keep-alive messages. In that case, you may get a false OK.

I don’t know if it’s possible in your case, but a better solution is to check the peer’s status on a higher level: by actively polling the client, sending a message of the type “are you alive?”, to which it must respond.
Or make the client responsible: let it send a poll message at least every N seconds.
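
The second variant (the client must be heard from at least every N seconds) could be sketched on the server side roughly as below. Everything here is hypothetical: the type and function names, the slot count of 3 and the 30-second limit are made up for illustration, and the sketch only does the bookkeeping. A real server would also close the socket that belongs to a stale slot.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CLIENTS    3u   /* hypothetical number of connection slots */
#define IDLE_LIMIT_S  30u   /* hypothetical N: maximum silence per client, in seconds */

typedef struct {
    bool     in_use;        /* slot is occupied by a connected client */
    uint32_t last_seen_s;   /* time of the last message from that client */
} client_slot_t;

/* Call whenever a command or poll message arrives from the client. */
static void client_touch(client_slot_t *slot, uint32_t now_s)
{
    slot->in_use = true;
    slot->last_seen_s = now_s;
}

/* Unsigned subtraction keeps the comparison valid across counter wrap-around. */
static bool client_is_stale(const client_slot_t *slot, uint32_t now_s)
{
    return slot->in_use && (uint32_t)(now_s - slot->last_seen_s) >= IDLE_LIMIT_S;
}

/* Call periodically; frees every slot that has been silent for too long
 * and returns the number of slots freed. */
static unsigned reap_stale(client_slot_t slots[], unsigned count, uint32_t now_s)
{
    unsigned freed = 0;
    for (unsigned i = 0; i < count; i++) {
        if (client_is_stale(&slots[i], now_s)) {
            slots[i].in_use = false;   /* a real server would close the socket here */
            freed++;
        }
    }
    return freed;
}
```

With this approach the disconnect time is set by the application (IDLE_LIMIT_S here) and does not depend on how the low-level keep-alive mechanism is timed.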

The best way to test/understand the “keep-alive” behaviour is to create some logging.

georgbat wrote on Thursday, May 16, 2019:

Thanks, Hein!
I am creating an FTP server. It has a limited number of active connections (3). To emulate a bad network, I use a router that brings its interfaces down and up at random moments. The client and server don’t know that the connection was lost, because it happened in the “internet”, not through a normal shutdown (FIN, FIN+ACK, ACK).
The FTP client has an option that defines an inactivity timeout. After the timeout, the client will try to reconnect. If this timeout is less than the “auto disconnect” time and the network drops out regularly, then one client blocks all available connections (one is really active, and the others stay “frozen” until the auto disconnect occurs).
I can set this option in the client to 90, but I think that 90 seconds is too much.
Do you have any advice for me?