Sockets end in state eCLOSE_WAIT

Hi Guys,

I’m running an HTTP server based on FreeRTOS and FreeRTOS+TCP version 2.3.2-LTS-Patch 2. The server serves a REST API, and queries arrive about every 30 ms. On some nodes, after several hours or days, all sockets end up in the state “eCLOSE_WAIT”.

I’ve read the post about “FreeRTOS+TCP : Why do sockets stay in the status eCLOSE_WAIT ?” (no links allowed) and tried to adapt the vHTTPClientDelete-function accordingly so that these sockets are actually closed, but this does not seem to work reliably.

Is there another way to safely close these sockets or another way to solve the problem?
Can I directly check the socket state and close it if “eCLOSE_WAIT”?

Alex.

Here also the netstat output:

Arp  1:  23 -          ab2a8c0ip : 98:28:a6 : 2e:f7:d5
Arp has 1 entries

Prot Port IP-Remote       : Port  R/T Status       Alive  tmout Child
TCP 15001 0               ip:    0 0/0 eTCP_LISTEN        0      0 0/3
TCP    80 0               ip:    0 0/0 eTCP_LISTEN        0      0 12/12
TCP    80 c0a8b20a        ip:64497 1/1 eESTABLISHED       0      0
TCP    80 c0a8b20a        ip:64715 1/0 eCLOSE_WAIT        0   6057
TCP    80 c0a8b20a        ip:64934 1/0 eCLOSE_WAIT        0  13806
TCP    80 c0a8b20a        ip:65370 1/0 eCLOSE_WAIT        0   2393
TCP    80 c0a8b20a        ip:49423 1/0 eCLOSE_WAIT        0  15737
TCP    80 c0a8b20a        ip:52490 1/0 eCLOSE_WAIT        0   8294
TCP    80 c0a8b20a        ip:52923 1/0 eCLOSE_WAIT        0  17762
TCP    80 c0a8b20a        ip:53355 1/0 eCLOSE_WAIT        0   5738
TCP    80 c0a8b20a        ip:58779 1/0 eCLOSE_WAIT        0  16802
TCP    80 c0a8b20a        ip:62684 1/0 eCLOSE_WAIT        0   8043
TCP    80 c0a8b20a        ip:65085 1/0 eCLOSE_WAIT        0  20000
TCP    80 c0a8b20a        ip:49573 1/0 eCLOSE_WAIT        0   5029
FreeRTOS_netstat: 14 sockets 0 < 19 < 40 buffers free

What does it mean that for all sockets in the “eCLOSE_WAIT”-state only the receive stream is available?

Thanks,
Alex

I’m running into this exact same problem.

Hi @oberondoro, thank you for reporting this.

You should be able to post links by now.

We are not aware that the +TCP library has or had a problem with “hanging sockets”.

Somewhere in the code I read:

  /* And wait for the user to close this socket. */
     vTCPStateChange( pxSocket, eCLOSE_WAIT );

Which says that the eCLOSE_WAIT state must be handled by the owner of the socket.

@oberondoro wrote:

Is there another way to safely close these sockets or another way to solve the problem?
Can I directly check the socket state and close it if “eCLOSE_WAIT”?

Yes, if you are sure that the socket won’t be referred to by your application or by the HTTP library, you can call closesocket() to delete the socket.
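A periodic sweep over the server's client table is one way to do that. The following is only a host-side sketch, not code from the +TCP library: `DemoSocket_t`, `eDemoGetState()`, `vDemoClose()` and `uxReapCloseWait()` are hypothetical names. On a target the state query would presumably map to `FreeRTOS_connstatus()` and the close to `FreeRTOS_closesocket()`; please verify both against your +TCP version.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-ins (hypothetical) so the sketch builds without +TCP. */
typedef enum { DEMO_ESTABLISHED, DEMO_CLOSE_WAIT } DemoState_t;

typedef struct
{
    DemoState_t eState;
    int iClosed;
} DemoSocket_t;

/* Assumption: on target this would be FreeRTOS_connstatus( xSocket ). */
DemoState_t eDemoGetState( const DemoSocket_t * pxSocket )
{
    return pxSocket->eState;
}

/* Assumption: on target this would be FreeRTOS_closesocket( xSocket ). */
void vDemoClose( DemoSocket_t * pxSocket )
{
    pxSocket->iClosed = 1;
}

/* Close every client socket whose peer has already shut the connection down.
 * The table slot is cleared before the close, so nothing can use the handle
 * after it has been freed. Returns the number of sockets closed. */
size_t uxReapCloseWait( DemoSocket_t * apxClients[], size_t uxCount )
{
    size_t uxClosed = 0;

    assert( apxClients != NULL );

    for( size_t ux = 0; ux < uxCount; ux++ )
    {
        DemoSocket_t * pxSocket = apxClients[ ux ];

        if( ( pxSocket != NULL ) &&
            ( eDemoGetState( pxSocket ) == DEMO_CLOSE_WAIT ) )
        {
            apxClients[ ux ] = NULL;
            vDemoClose( pxSocket );
            uxClosed++;
        }
    }

    return uxClosed;
}
```

Clearing the slot first reflects the condition stated above: nothing else may still refer to the socket. Running the sweep from the task that owns the client table avoids racing with the HTTP code.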

Here below I will describe how this normally happens:

When a TCP API returns a negative number, it contains an errno value.

Most negative codes are fatal errors, except for pdFREERTOS_ERRNO_EAGAIN (AKA pdFREERTOS_ERRNO_EWOULDBLOCK) and pdFREERTOS_ERRNO_EINTR.
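That rule can be captured in a small helper. This is a minimal sketch, not part of +TCP: `xIsFatalSocketError()` is a hypothetical name, and the fallback values 11 and 4 are assumed to match what `projdefs.h` defines. On a real target the header's definitions are used and the `#ifndef` fallbacks are never reached.

```c
#include <assert.h>
#include <stdbool.h>

/* On target these come from FreeRTOS's projdefs.h. The fallbacks below are
 * assumptions that only let the sketch build on a host. */
#ifndef pdFREERTOS_ERRNO_EAGAIN
    #define pdFREERTOS_ERRNO_EAGAIN    11
#endif
#ifndef pdFREERTOS_ERRNO_EINTR
    #define pdFREERTOS_ERRNO_EINTR     4
#endif

/* Hypothetical helper: returns true when the return value of a TCP API call
 * means the connection is gone and the socket should be closed. */
bool xIsFatalSocketError( long lRc )
{
    if( lRc >= 0 )
    {
        /* Zero or a byte count: not an error. */
        return false;
    }

    /* Negative: fatal unless it is one of the two "try again" codes. */
    return ( lRc != -pdFREERTOS_ERRNO_EAGAIN ) &&
           ( lRc != -pdFREERTOS_ERRNO_EINTR );
}
```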

This is a simple TCP echo server:

    for( ;; )
    {
        BaseType_t xRc = FreeRTOS_recv( xSocket, pcBuffer, sizeof pcBuffer, 0 );
        if( xRc > 0 )
        {
            xRc = FreeRTOS_send( xSocket, pcBuffer, ( size_t ) xRc, 0 );
        }
        if( xRc < 0 )
        {
            if( ( xRc != -pdFREERTOS_ERRNO_EAGAIN ) &&
                ( xRc != -pdFREERTOS_ERRNO_EINTR ) )
            {
                printf("The connection is broken with errno %d\n", -xRc );
                break;
            }
        }
    }
    /* The socket will be deleted. */
    FreeRTOS_closesocket( xSocket );

Note that in the above example there is no need to call shutdown(). The peer already took the initiative to shut down the connection.

Some users are confused about the use of shutdown(). HTTP servers are usually passive, they will only close a socket when the connection got disrupted, as in the above example.

I am not sure who is responsible to close a connection when using REST-API, but if you need to close a connection actively (and gracefully), the following can be done:

    /* Start closing the connection: */
    FreeRTOS_shutdown( xSocket, FREERTOS_SHUT_RDWR );
    for( ;; )
    {
        BaseType_t xRc = FreeRTOS_recv( xSocket, pcBuffer, sizeof pcBuffer, 0 );
        if( xRc > 0 )
        {
            /* Very important: handle the last bytes received. */
        }
        else if( xRc < 0 )
        {
            if( ( xRc != -pdFREERTOS_ERRNO_EAGAIN ) &&
                ( xRc != -pdFREERTOS_ERRNO_EINTR ) )
            {
                printf("The shutdown is acknowledged by the peer (errno %d)\n", -xRc );
                break;
            }
        }
    }
    /* Close ( free ) the socket. */
    FreeRTOS_closesocket( xSocket );

In other words, call shutdown() once, wait until an API call (like recv()) returns a fatal code, and close the socket.
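The three steps can also be exercised on a host with a fake socket, which makes the sequence easy to test. This is a sketch under assumptions: `DemoSocket_t`, `vDemoShutdown()`, `lDemoRecv()`, `vDemoClose()` and `lGracefulClose()` are hypothetical stand-ins for the real +TCP calls, the errno values are assumed, and the `uxMaxPolls` limit is an addition of this sketch so that a silent peer cannot keep the loop running forever.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EINTR       4      /* assumed value of pdFREERTOS_ERRNO_EINTR */
#define DEMO_EAGAIN      11     /* assumed value of pdFREERTOS_ERRNO_EAGAIN */
#define DEMO_ENOTCONN    128    /* assumed value of pdFREERTOS_ERRNO_ENOTCONN */

/* Hypothetical fake socket that replays a scripted list of recv() results. */
typedef struct
{
    const long * plScript;
    size_t uxLength;
    size_t uxNext;
    int iShutdownCalls;
    int iCloseCalls;
} DemoSocket_t;

/* Stand-in for FreeRTOS_shutdown( xSocket, FREERTOS_SHUT_RDWR ). */
void vDemoShutdown( DemoSocket_t * pxSocket )
{
    pxSocket->iShutdownCalls++;
}

/* Stand-in for FreeRTOS_recv(): next scripted value, then 0 (nothing read). */
long lDemoRecv( DemoSocket_t * pxSocket )
{
    if( pxSocket->uxNext < pxSocket->uxLength )
    {
        return pxSocket->plScript[ pxSocket->uxNext++ ];
    }

    return 0;
}

/* Stand-in for FreeRTOS_closesocket(). */
void vDemoClose( DemoSocket_t * pxSocket )
{
    pxSocket->iCloseCalls++;
}

/* Shut down once, poll recv() until it reports a fatal code or the poll
 * limit is reached, then close. Returns the number of bytes drained. */
long lGracefulClose( DemoSocket_t * pxSocket, size_t uxMaxPolls )
{
    long lDrained = 0;

    assert( pxSocket != NULL );
    vDemoShutdown( pxSocket );

    for( size_t ux = 0; ux < uxMaxPolls; ux++ )
    {
        long lRc = lDemoRecv( pxSocket );

        if( lRc > 0 )
        {
            /* The last bytes from the peer would be handled here. */
            lDrained += lRc;
        }
        else if( ( lRc < 0 ) && ( lRc != -DEMO_EAGAIN ) && ( lRc != -DEMO_EINTR ) )
        {
            /* Fatal code: the connection is gone. */
            break;
        }
    }

    vDemoClose( pxSocket );
    return lDrained;
}
```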

Just to be sure, I just linked one of my HTTP projects with TCP Version2.3.2-LTS-Patch 2 and ran some tests.

All files loaded well using 8 TCP sockets. A POST of a file also worked well. After closing the browser, all connections and sockets got closed immediately:

/* Loading the HTML pages. */

00.223 [IP-task] Prot Port IP-Remote       : Port  R/T Status       Alive  tmout Child
00.223 [IP-task] TCP    21 0.0.0.0:    0 0/0 eTCP_LISTEN   104414      0 0/12
00.223 [IP-task] TCP  2402 0.0.0.0:    0 0/0 eTCP_LISTEN   104370      0 0/3
00.223 [IP-task] TCP    80 0.0.0.0:    0 0/0 eTCP_LISTEN   102031      0 8/16
00.223 [IP-task] TCP  8080 0.0.0.0:    0 0/0 eTCP_LISTEN   102031      0 0/16
00.223 [IP-task] TCP    80 192.168.2.5: 3207 1/1 eESTABLISHED    4078  16378
00.223 [IP-task] TCP    80 192.168.2.5: 3208 1/1 eESTABLISHED    4077  16378
00.223 [IP-task] TCP    80 192.168.2.5: 3209 1/1 eESTABLISHED    4077  16380
00.223 [IP-task] TCP    80 192.168.2.5: 3210 1/1 eESTABLISHED    4171  16282
00.223 [IP-task] TCP    80 192.168.2.5: 3211 1/1 eESTABLISHED    5607  15002
00.223 [IP-task] TCP    80 192.168.2.5: 3212 1/1 eESTABLISHED    5594  14983
00.223 [IP-task] TCP    80 192.168.2.5: 3213 1/1 eESTABLISHED    4076  17000
00.223 [IP-task] TCP    80 192.168.2.5: 3215 1/1 eESTABLISHED    4078  16367
00.224 [IP-task] UDP Port  2402
00.224 [IP-task] UDP Port 30718
00.224 [IP-task] UDP Port  2000
00.224 [IP-task] UDP Port 64887
00.224 [IP-task] FreeRTOS_netstat: 16 sockets 4 < 12 < 20 buffers free

/* Closing the browser. */

07.246 [IP-task] Prot Port IP-Remote       : Port  R/T Status       Alive  tmout Child
07.246 [IP-task] TCP    21 0.0.0.0:    0 0/0 eTCP_LISTEN   111999      0 0/12
07.246 [IP-task] TCP  2402 0.0.0.0:    0 0/0 eTCP_LISTEN   111955      0 0/3
07.246 [IP-task] TCP    80 0.0.0.0:    0 0/0 eTCP_LISTEN   109616      0 0/16
07.246 [IP-task] TCP  8080 0.0.0.0:    0 0/0 eTCP_LISTEN   109616      0 0/16
07.246 [IP-task] UDP Port  2402
07.246 [IP-task] UDP Port 30718
07.246 [IP-task] UDP Port  2000
07.246 [IP-task] UDP Port 64887
07.246 [IP-task] FreeRTOS_netstat: 8 sockets 4 < 12 < 20 buffers free

Note that some browsers keep socket connections open even after you have closed the tab to the server, so you will only see the eCLOSE_WAIT state appear when the browser is closed entirely (or after a long timeout).
It is also worth knowing that some browsers use fewer socket connections than others. I just tested on both Firefox and Chrome.

Hi Hein,

Thanks for your inputs. I will improve my “vHTTPClientDelete”-Function, run some longer tests and report back.

Alex.

There is a similar problem if:

ipconfigTCP_HANG_PROTECTION == 1

and a communication problem causes

xAge > (ipconfigTCP_HANG_PROTECTION_TIME * configTICK_RATE_HZ)

to become true for a socket with

pxSocket->u.xTCP.bits.bReuseSocket

being set. I added a solution in my code and reported it some years ago.

Today I downloaded FreeRTOS+TCP V3.0.0 and took a look at the source: the possible deadlock is still present there.

I added in “prvTCPStatusAgeCheck”:

if( xAge > (ipconfigTCP_HANG_PROTECTION_TIME * configTICK_RATE_HZ) )
{
    if( pxSocket->u.xTCP.bits.bReuseSocket )
    {
        switch( pxSocket->u.xTCP.ucTCPState )
        {
        case eSYN_FIRST:    // 3 (server) Just created, must ACK the SYN request
        case eSYN_RECEIVED: // 4 (server) waiting for a confirming connection request
                            // acknowledgement after having both received and sent a connection request
            #if( ipconfigHAS_DEBUG_PRINTF == 1 )
            FreeRTOS_debug_printf( ("Inactive socket rem %lXip: %u status %s: listen again\r\n",
                pxSocket->u.xTCP.ulRemoteIP,
                pxSocket->u.xTCP.usRemotePort,
                FreeRTOS_GetTCPStateName((UBaseType_t)pxSocket->u.xTCP.ucTCPState)));
            #endif
            // Fall back to eTCP_LISTEN to reassign RemoteIP / RemotePort to the socket
            // because at next connection request one or both of them may be different!
            vTCPStateChange(pxSocket, eTCP_LISTEN);
            return pdFALSE;
        }
    }

    #if( ipconfigHAS_DEBUG_PRINTF == 1 )
    FreeRTOS_debug_printf( ( "Inactive socket closed: port %u rem %lXip: %u status %s\r\n",

Some information was missing in the text

… becomes true for a socket with pxSocket->u.xTCP.bits.bReuseSocket being set while waiting in “FreeRTOS_accept(…” for a client connection

A snippet from my code:

if( conn != FREERTOS_INVALID_SOCKET )
{
    BaseType_t xValue = pdTRUE;
    FreeRTOS_setsockopt(conn, 0, FREERTOS_SO_REUSE_LISTEN_SOCKET, (void *)&xValue, sizeof(xValue));
    ...
    ...
    FreeRTOS_bind(conn, &conn_BindAddress, sizeof(conn_BindAddress));
    FreeRTOS_listen(conn, 1);
    datconn = FreeRTOS_accept(conn, &datconn_BindAddress, &datconn_BindAddress_size);
}
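The decision made by the proposed branch can be modelled as a pure function, which is convenient for checking the boundary cases on a host. This is only a sketch of the logic described above, not the +TCP source: the `DEMO_` names and `eAgeCheck()` are hypothetical, the timeout and tick rate are assumed demo values, and `DEMO_LAST_ACK` merely stands for any other state that the age check covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed demo values; the real ones are ipconfigTCP_HANG_PROTECTION_TIME
 * (FreeRTOSIPConfig.h) and configTICK_RATE_HZ (FreeRTOSConfig.h). */
#define DEMO_HANG_PROTECTION_TIME_S    30U
#define DEMO_TICK_RATE_HZ              1000U

typedef enum { DEMO_SYN_FIRST, DEMO_SYN_RECEIVED, DEMO_LAST_ACK } DemoTcpState_t;
typedef enum { DEMO_KEEP, DEMO_LISTEN_AGAIN, DEMO_CLOSE } DemoAction_t;

/* Decide what to do with a socket that has been inactive for ulAgeTicks. */
DemoAction_t eAgeCheck( DemoTcpState_t eState,
                        bool bReuseSocket,
                        uint32_t ulAgeTicks )
{
    if( ulAgeTicks <= ( DEMO_HANG_PROTECTION_TIME_S * DEMO_TICK_RATE_HZ ) )
    {
        /* Not old enough yet: leave the socket alone. */
        return DEMO_KEEP;
    }

    if( bReuseSocket &&
        ( ( eState == DEMO_SYN_FIRST ) || ( eState == DEMO_SYN_RECEIVED ) ) )
    {
        /* A re-used listening socket stuck in the SYN phase goes back to
         * listening, so that the task blocked in accept() can still be woken
         * by a later connection. */
        return DEMO_LISTEN_AGAIN;
    }

    /* Any other inactive socket is closed by the hang protection. */
    return DEMO_CLOSE;
}
```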

NOTE: I ( @htibosch ) edited your post because it was hard to read.
When you post source code, please put it between two lines that contain 3 tildes like here:

~~~
if( conn != FREERTOS_INVALID_SOCKET )
{
    BaseType_t xValue = pdTRUE;
}
~~~


Hello @hs4FreeRtos, thank you for your remarks.

Let me first explain the re-used socket: it is a socket that is put into listening mode in order to receive a single connection. When the call FreeRTOS_accept( xSocket ) succeeds, it returns the same pointer xSocket that was passed as an argument.

Finally, the child socket will get disconnected, and then it must be closed by calling FreeRTOS_closesocket().

Indeed it can happen that a peer stops living while in the SYN phase, and that our socket will hang in either eSYN_FIRST or eSYN_RECEIVED.

When that happens, you propose to put it back to the listening state eTCP_LISTEN. I think that is correct!
At that moment, the socket is still connecting and has not been returned to the application, so yes, I agree. Also, it hasn’t allocated any stream buffers.

I am only looking for a way to test the proposed change :slight_smile:
Were you able to simulate this situation?

Hello htibosch, thank you for your fast response.
I detected this problem some years ago while testing with “Nord VPN” active. That version of “Nord VPN” did not have the “Split tunneling” feature it has now, so the “SYN” arrived, but sending the “ACK” response was blocked.
Sorry, but I never tried to simulate this behavior since I solved it :wink:
I’ll do that in the next few days; hopefully “Nord VPN” can be used to simulate this situation. I’ll let you know the result.

I’m sorry - I wasn’t able to reproduce the behaviour that caused the reported problem. It should be possible to do something like that with Wireshark by sending packets, but unfortunately I don’t have enough experience with the tool to know how.

@hs4FreeRtos : please have a look at PR #545, in which I made the necessary changes.

Thank you for reporting it and for helping to analyse the problem.
Hein

I finally found time to solve the above-mentioned problems in PR #559.

It also addresses the issues mentioned in this post.

Thank you for reporting and for helping to find a solution.