Socket taking 40s to close

I follow the recommended practice to close a socket by performing the following:

FreeRTOS_shutdown( xSocket, FREERTOS_SHUT_RDWR );

for( ;; )
{
    BaseType_t xResult;

    xResult = FreeRTOS_recv( xSocket, pcBuffer, sizeof( pcBuffer ), 0 );
    if( ( xResult < 0 ) && ( xResult != -pdFREERTOS_ERRNO_EAGAIN ) )
    {
        break;
    }
    if( xResult > 0 )
    {
        /* Do something with the data received. */
    }
}

FreeRTOS_closesocket( xSocket );

What I have found is that FreeRTOS_recv returns 0 for 40 seconds before it returns an error code that breaks out of the loop. I was expecting it to complete on the order of milliseconds.

Any ideas?

@blanco-ether , welcome to the FreeRTOS forum!

You wrote:

I was expecting it to complete on the order of milliseconds.

I would expect so too. When you initiate a shutdown, then the peer should respond within a few ms.

A period of 40 seconds makes me think that the peer is unreachable, or maybe the peer has closed the socket already.

Have a look at these config parameters:

/* Include support for TCP hang protection
 * All sockets in a connecting or disconnecting stage
 * will timeout after a period of non-activity
 */

#	define ipconfigTCP_HANG_PROTECTION         ( 1 )
#	define ipconfigTCP_HANG_PROTECTION_TIME    ( 30 ) /* in seconds */

/* Include support for TCP keep-alive messages */
#	define ipconfigTCP_KEEP_ALIVE              ( 1 )
#	define ipconfigTCP_KEEP_ALIVE_INTERVAL     ( 20 ) /* in seconds */

ipconfigTCP_HANG_PROTECTION_TIME is used during the connection phase.
ipconfigTCP_KEEP_ALIVE_INTERVAL is used once connected to the peer.

If your DUT is communicating with a laptop, you can monitor the TCP conversation on the laptop using a program like WireShark.

Please let us know what caused the slow shutdown.

Hein,
The config parameters are defined with the same values.

I am working on a project that will provide a Web server for
configuration. The Web server must be enabled through a physical
user interface and will time out and shut down if no connection is
made within a few minutes. The same will happen if the user connects
but fails to enter a one time password that the physical UI presents.
The server can also be shut down on demand through the physical UI.

The server code is based on the FreeRTOS HTTP demo. I also added
a websocket implementation. Everything is working and we can configure
our product as desired. The only remaining problem is that if I shut down the
server on demand I experience the issue I outlined.

As I said the HTTP server is based on the demo. The listener is created and bound as in the
demo. The following options are employed as in the demo.
BaseType_t xNoTimeout = 0;
FreeRTOS_setsockopt(xSocket, 0, FREERTOS_SO_RCVTIMEO, (void*) &xNoTimeout,
sizeof(BaseType_t));
FreeRTOS_setsockopt(xSocket, 0, FREERTOS_SO_SNDTIMEO, (void*) &xNoTimeout,
sizeof(BaseType_t));

  WinProperties_t xWinProps;
  
   memset( &xWinProps, '\0', sizeof(xWinProps));
   /* The parent socket itself won't get connected.  The properties below
     will be inherited by each new child socket. */
   xWinProps.lTxBufSize = ipconfigHTTP_TX_BUFSIZE;
   xWinProps.lTxWinSize = ipconfigHTTP_TX_WINSIZE;
   xWinProps.lRxBufSize = ipconfigHTTP_RX_BUFSIZE;
   xWinProps.lRxWinSize = ipconfigHTTP_RX_WINSIZE;
  
   /* Set the window and buffer sizes. */
  FreeRTOS_setsockopt(xSocket, 0, FREERTOS_SO_WIN_PROPERTIES, (void*) &xWinProps,
      sizeof(xWinProps));

I am not re-using the listener; the same one accepts each new client connection. Two to four client connections (sockets) may be open when a browser connects, depending on whether the one-time password is entered correctly. The select call waits for up to 1s, after which a variable is checked for shutdown.

The server is shut down within the task that manages the web server (listener, clients, etc.)
The server task shuts down as follows:

Listener

      FreeRTOS_FD_CLR( listenerSocket, xSocketSet, eSELECT_ALL );
      FreeRTOS_shutdown(listenerSocket, FREERTOS_SHUT_RDWR);
      FreeRTOS_closesocket(listenerSocket);

Followed by each client in sequence:
for each client connection…

{
    FreeRTOS_FD_CLR(xSocket, xSocketSet, eSELECT_ALL);
    FreeRTOS_shutdown(xSocket, FREERTOS_SHUT_RDWR);
    while(1)
    {
        unsigned buf;
        int rc;
        rc = FreeRTOS_recv(pxTCPClient->xSocket, &buf, 1, 0);
        if((rc < 0) && (rc != -pdFREERTOS_ERRNO_EWOULDBLOCK))
        {
            break;
        }

        vTaskDelay(125);
    }

    FreeRTOS_closesocket(xSocket);
}

I uploaded a pcap. At the end of the capture a websocket closes almost immediately
because the one time password is incorrect. A few seconds later I accessed the physical
UI to shut down the server that had 2 connections remaining from HTTP. One takes 7
seconds to shut down. The other takes 40 seconds to shut down. By shut down I mean for
FreeRTOS_recv to return an error.

These sockets usually take exactly 40s to shut down. The times were shorter in this capture for some reason.

Any help greatly appreciated.
long_socket_shutdown.7z (259.0 KB)

Hi @blanco-ether,
Upon analyzing the pcap file you provided, it appears that one of the clients acknowledged the FIN packet from the server after a delay of 96 seconds.

Here is the analysis:


Regarding port 15990: port 7681 sent a FIN packet at timestamp 84.740, but port 15990 did not respond with its own FIN packet until timestamp 180.795. This suggests that the client stack may not be sending its FIN packet properly, leading to a delayed wake-up and response.

Could you please check if the FreeRTOS_shutdown() function is being called correctly by the clients? It would be helpful if you could elaborate on the timing of when this function is invoked in relation to the FIN packet.

Thank you.

Thank you for taking a look Hein and ActoryOu.

Our application implements a Web server that can be accessed for a limited time for the purpose of configuring our device. The Web server client app consists of only 6 files and takes about 5s to load in a browser. The client browser typically opens 2 sockets to load the files in parallel. Our application will shut down the server it implements if it detects a period of inactivity or can also be shut down on demand through a physical user interface on our device.

The Wireshark screen shots that follow show Google Chrome loading the web app, followed by the sequence that we typically see when we shut down the server on demand through the device's physical user interface. The IP 10.219.1.118 is the browser; 10.219.1.130 is our device. Two sockets are opened by Chrome in the screen shot below.

The screen shot below shows that the web app loaded in 5 seconds. The Web server was shut down through the device user interface at entry 541. The browser doesn’t send back FIN, ACK until 19s later, at which time FreeRTOS_recv finally returns a negative value and shutdown is called on the next socket in entry 545. That one takes about 31s before FreeRTOS_recv returns a negative value.

Quite often there are 2 or more sequences of [TCP Keep-Alive]/[TCP Keep-Alive ACK] between the browser and our server after FreeRTOS_shutdown is called, and FreeRTOS_recv then returns a negative value exactly 40s later.

We can’t figure out if this is normal behavior of a browser dragging along after we call FreeRTOS_shutdown or something is wrong on our end.

Interestingly, our web application will open a websocket (not shown) after a one-time password is entered correctly. The websocket always closes in milliseconds; however, closing a websocket is a more well-defined sequence, as the server issues a close header followed by closing the socket.

Any ideas are greatly appreciated.

Thanks

IMO, the behavior of the TCP stack is as expected. The FreeRTOS_recv() call after FreeRTOS_shutdown() is intended to check if the remote peer sends a FIN packet to gracefully close the connection. If no packet is received from the remote peer, the FreeRTOS_recv() function will block if you’re using the blocking mode.

Note that even after shutting down the socket handler, it’s still possible to receive some remaining packets through FreeRTOS_recv(). This is because there might be packets that were already in transit before the shutdown occurred.
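
If the open-ended wait is the concern, one option is to cap it. The following is only a minimal sketch of that idea, not code from FreeRTOS+TCP: the callback types, `SKETCH_EAGAIN` and the `fake_*` helpers are hypothetical names chosen so that the loop builds and runs on a host PC. On the target, `rx` would wrap FreeRTOS_recv() (plus a short vTaskDelay() between polls) and `now` would wrap xTaskGetTickCount(), assuming a 32-bit tick counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder standing in for pdFREERTOS_ERRNO_EAGAIN on the target. */
#define SKETCH_EAGAIN 11

/* Hypothetical callback types so the sketch builds without FreeRTOS headers. */
typedef int32_t  (*recv_cb_t)(void *ctx);  /* would wrap FreeRTOS_recv()      */
typedef uint32_t (*tick_cb_t)(void *ctx);  /* would wrap xTaskGetTickCount()  */

typedef enum { DRAIN_PEER_CLOSED, DRAIN_TIMED_OUT } drain_result_t;

/* Poll the socket after shutdown until recv reports a real error (the
 * connection is gone) or max_wait_ticks have elapsed, whichever is first. */
drain_result_t drain_after_shutdown(recv_cb_t rx, tick_cb_t now, void *ctx,
                                    uint32_t max_wait_ticks)
{
    uint32_t start;

    assert(rx != NULL && now != NULL);
    start = now(ctx);

    for (;;) {
        int32_t rc = rx(ctx);

        if (rc < 0 && rc != -SKETCH_EAGAIN) {
            return DRAIN_PEER_CLOSED;
        }
        /* rc > 0 is late data (discarded here); 0 or -EAGAIN means
         * nothing has arrived yet. */

        /* Unsigned subtraction stays correct across tick wrap-around. */
        if ((uint32_t)(now(ctx) - start) >= max_wait_ticks) {
            return DRAIN_TIMED_OUT;
        }
    }
}

/* Host-side fake peer, only to exercise the loop: recv reports "nothing
 * yet" until fin_at_tick, then an arbitrary fatal error. Every recv call
 * advances the fake clock by one tick. */
typedef struct { uint32_t tick; uint32_t fin_at_tick; } fake_peer_t;

int32_t fake_recv(void *ctx)
{
    fake_peer_t *peer = (fake_peer_t *)ctx;
    peer->tick++;
    return (peer->tick >= peer->fin_at_tick) ? -128 : 0;
}

uint32_t fake_now(void *ctx)
{
    return ((fake_peer_t *)ctx)->tick;
}
```

Either result would be followed by FreeRTOS_closesocket(); the only difference is whether the peer got the chance to finish the handshake within the time you were willing to wait.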

Thank you.

You cannot control when the client (the browser in this case) decides to send FIN. You mentioned that this happens when you terminate the server. Is it hampering your application's performance? If yes, you can choose to close the socket right away (instead of performing a graceful shutdown), assuming that you do not care about receiving any more data from the browser.

Thank you for clarifying. We somehow assumed that FIN ACK would be issued promptly from the browser as the thought was that this response was generated automatically by a TCP stack. I see now that it is probably sent when shutdown is called on the socket on the browser side and the browser isn’t monitoring open sockets closely.

Thank you also for clarifying that the only reason to loop on FreeRTOS_recv after shutdown is specifically to eliminate the risk of losing data. It wasn’t clear that it is safe and acceptable to call FreeRTOS_closesocket right away if that isn’t a concern.

The long close wasn’t a major issue in our application, other than that it took a long time for our user interface to indicate that the server was shut down, as it didn’t indicate so until the last FreeRTOS_closesocket was issued.

Thanks again!

I think that you can update the UI right after issuing shutdown (which I guess is done in a separate task) and then continue the graceful close sequence.
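
As a rough illustration of that ordering (a sketch only, with made-up names): the hooks in `server_hooks_t` are hypothetical wrappers that, on the device, would call FreeRTOS_shutdown(), the FreeRTOS_recv() wait loop, FreeRTOS_closesocket() and whatever updates your physical UI. The `fake_*` functions just record the call order so the sketch can be run on a PC. Issuing shutdown on every client before waiting on any of them is my own assumption here; it lets the browser's delays overlap instead of adding up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hooks so the sketch builds off-target. */
typedef struct {
    void (*begin_shutdown)(void *sock);  /* e.g. FreeRTOS_shutdown(sock, FREERTOS_SHUT_RDWR) */
    void (*wait_for_peer)(void *sock);   /* the recv loop; may take tens of seconds */
    void (*close_socket)(void *sock);    /* e.g. FreeRTOS_closesocket(sock) */
    void (*ui_server_stopped)(void);     /* tell the user the server is down */
} server_hooks_t;

void stop_server(const server_hooks_t *hooks, void *clients[], size_t count)
{
    size_t i;

    assert(hooks != NULL);

    /* 1. Start the shutdown on every client first. */
    for (i = 0; i < count; i++) {
        hooks->begin_shutdown(clients[i]);
    }

    /* 2. Nothing more will be served from here on, so update the UI now
     *    instead of after the last close. */
    hooks->ui_server_stopped();

    /* 3. Finish the graceful close sequence at the peers' pace. */
    for (i = 0; i < count; i++) {
        hooks->wait_for_peer(clients[i]);
        hooks->close_socket(clients[i]);
    }
}

/* Host-side recorder, only to make the call order visible. */
char event_log[32];
size_t event_count;

static void log_event(char c)
{
    if (event_count < sizeof(event_log) - 1) {
        event_log[event_count++] = c;
        event_log[event_count] = '\0';
    }
}

void fake_shutdown(void *sock) { (void)sock; log_event('S'); }
void fake_wait(void *sock)     { (void)sock; log_event('W'); }
void fake_close(void *sock)    { (void)sock; log_event('C'); }
void fake_ui(void)             { log_event('U'); }
```

With two clients and the fake hooks, the recorded order is S, S, U, W, C, W, C: the UI update lands right after the shutdowns are issued and before any of the slow waits.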