Issue Closing TCP Socket

I am having problems closing my TLS session and it seems to be stemming from when I close my TCP socket. I am looking for some insight on why this is happening.

The issue happens when vSocketClose is called and calls vEventGroupDelete( pxSocket->xEventGroup );

In vEventGroupDelete I hit the following assert:
configASSERT( pxTasksWaitingForBits->xListEnd.pxNext != ( const ListItem_t * ) &( pxTasksWaitingForBits->xListEnd ) );

The versions I am working with are:
FreeRTOS+TCP V2.4.0
FreeRTOS Kernel V10.4.6

If anyone has some starting point ideas about this assert I would appreciate it. I am not sure what the assert is looking for.

Code:

void vEventGroupDelete( EventGroupHandle_t xEventGroup )
{
    EventGroup_t * pxEventBits = xEventGroup;
    const List_t * pxTasksWaitingForBits;

    configASSERT( pxEventBits );

    pxTasksWaitingForBits = &( pxEventBits->xTasksWaitingForBits );

    vTaskSuspendAll();
    {
        traceEVENT_GROUP_DELETE( xEventGroup );

        while( listCURRENT_LIST_LENGTH( pxTasksWaitingForBits ) > ( UBaseType_t ) 0 )
        {
            /* Unblock the task, returning 0 as the event list is being deleted
             * and cannot therefore have any bits set. */
            configASSERT( pxTasksWaitingForBits->xListEnd.pxNext != ( const ListItem_t * ) &( pxTasksWaitingForBits->xListEnd ) );
            vTaskRemoveFromUnorderedEventList( pxTasksWaitingForBits->xListEnd.pxNext, eventUNBLOCKED_DUE_TO_BIT_SET );
        }

        #if ( ( configSUPPORT_DYNAMIC_ALLOCATION == 1 ) && ( configSUPPORT_STATIC_ALLOCATION == 0 ) )
            {
                /* The event group can only have been allocated dynamically - free
                 * it again. */
                vPortFree( pxEventBits );
            }
        #elif ( ( configSUPPORT_DYNAMIC_ALLOCATION == 1 ) && ( configSUPPORT_STATIC_ALLOCATION == 1 ) )
            {
                /* The event group could have been allocated statically or
                 * dynamically, so check before attempting to free the memory. */
                if( pxEventBits->ucStaticallyAllocated == ( uint8_t ) pdFALSE )
                {
                    vPortFree( pxEventBits );
                }
                else
                {
                    mtCOVERAGE_TEST_MARKER();
                }
            }
        #endif /* configSUPPORT_DYNAMIC_ALLOCATION */
    }
    ( void ) xTaskResumeAll();
}

Seems like a data corruption issue. Have you defined configASSERT and enabled stack overflow and malloc failed checks?

When you hit this assert, can you examine the value of pxTasksWaitingForBits and see if it looks legitimate.

After some investigating I have found the following behaviour. Here is the process that leads to this crash:

  1. Connect to TCP
  2. Connect to TLS
  3. MQTT Loop Send/Receive (Works)
  4. Unplug Ethernet
  5. Unsubscribe MQTT
  6. Disconnect TCP Socket
  7. Free TLS Connection
  8. Plug in Ethernet
  9. Connect to TCP (Creates new socket)
  10. Connect to TLS
  11. MQTT Loop Send/Receive (Only Receive works)
  12. When TCP/TLS Tries to cleanup crashes

If you look at the debug messages you can see it referencing the original socket. Iā€™m wondering if this could be the issue. Still looking into this.

Debug Output:

prvIPTask started
prvInitialiseDHCP: start after 250 ticks
vDHCPProcess: discover
vDHCPProcess: discover
vDHCPProcess: timeout 10000 ticks
DNS: 33627308l
vDHCPProcess: offer a601a7eip
vDHCPProcess: reply a601a7eip
vDHCPProcess: offer a601a7eip
vDHCPProcess: acked a601a7eip
[DEBUG] [PKCS11] [createUdpSocket:503] UDP socket has been bound to port 55743

prvProcessDNSCache: add: 'pool.ntp.org' @ d173b56cip
DNS[0x56B8]: The answer to 'pool.ntp.org' (209.115.181.108) will be stored
[INFO] [PKCS11] [resolveDns:210] Resolved time server as 209.115.181.108

[INFO] [PKCS11] [sntpClient_SetTime:423] Received time from time server: pool.ntp.org

FreeRTOS_connect: 20692 to 23b8bedbip:1884
Socket 20692 -> 23b8bedbip:1884 State eCLOSED->eCONNECT_SYN
prvSocketSetMSS: 1400 bytes for 23b8bedbip:1884
Connect[23b8bedbip:1884]: next timeout 1: 3000 ms
MSS change 1400 -> 1460
Socket 20692 -> 23b8bedbip:1884 State eCONNECT_SYN->eESTABLISHED
[DEBUG] [PKCS11] [prvMbedTLS_Initialize:463] PKCS #11 module was successfully initialized.

[INFO] [PKCS11] [C_Initialize:1405] PKCS #11 successfully initialized.

[DEBUG] [PKCS11] [C_GetSlotList:1619] Successfully Returned a PKCS #11 slot with ID 1 with a count of 1.

[DEBUG] [PKCS11] [C_OpenSession:1854] Assigned a 0x2 Type Session.

[DEBUG] [PKCS11] [C_OpenSession:1865] Assigned Mechanisms to no operation in progress.

[DEBUG] [PKCS11] [C_OpenSession:1881] Current session count at 0

[DEBUG] [PKCS11] [C_Login:1967] C_Login is not implemented.

[DEBUG] [PKCS11] [C_GenerateRandom:5627] Successfully generated 32 random bytes.

win_tx_ack: acked 1456 expc 1456 len 276
win_tx_ack: acked 12403 expc 12403 len 1400
win_tx_ack: acked 15408 expc 15408 len 1400

**HERE I DISCONNECTED THE ETHERNET**

keep-alive: giving up 23b8bedbip:1884
Socket 20692 -> 23b8bedbip:1884 State eESTABLISHED->eCLOSE_WAIT
[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_recv:1031] Failed to read data: mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

FreeRTOS_closesocket[20692 to 23b8bedbip:1884]: buffers 16 socks 0
[INFO] [PKCS11] [C_CloseSession:1934] Successfully closed PKCS #11 session.

** HERE I RECONNECTED THE ETHERNET **

FreeRTOS_connect: 6066 to 23b8bedbip:1884
Socket 6066 -> 23b8bedbip:1884 State eCLOSED->eCONNECT_SYN
prvSocketSetMSS: 1400 bytes for 23b8bedbip:1884
Connect[23b8bedbip:1884]: next timeout 1: 3000 ms
Connect[23b8bedbip:1884]: next timeout 2: 6000 ms
[ERROR] [Sockets] [Sockets_Connect:102] Failed to connect to server: FreeRTOS_Connect failed: ReturnCode=-116, 

FreeRTOS_closesocket[6066 to 23b8bedbip:1884]: buffers 16 socks 0

FreeRTOS_connect: 25699 to 23b8bedbip:1884
Socket 25699 -> 23b8bedbip:1884 State eCLOSED->eCONNECT_SYN
prvSocketSetMSS: 1400 bytes for 23b8bedbip:1884
Connect[23b8bedbip:1884]: next timeout 1: 3000 ms
MSS change 1400 -> 1460
Socket 25699 -> 23b8bedbip:1884 State eCONNECT_SYN->eESTABLISHED
[WARN] [PKCS11] [C_Initialize:1400] Failed to initialize PKCS #11. PKCS #11 was already initialized.

[DEBUG] [PKCS11] [C_GetSlotList:1619] Successfully Returned a PKCS #11 slot with ID 1 with a count of 1.

[DEBUG] [PKCS11] [C_OpenSession:1854] Assigned a 0x2 Type Session.

[DEBUG] [PKCS11] [C_OpenSession:1865] Assigned Mechanisms to no operation in progress.

[DEBUG] [PKCS11] [C_OpenSession:1881] Current session count at 0

[DEBUG] [PKCS11] [C_Login:1967] C_Login is not implemented.

[DEBUG] [PKCS11] [C_GenerateRandom:5627] Successfully generated 32 random bytes.

TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
win_tx_ack: acked 1456 expc 1456 len 276
TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
TCP: No active socket on port 20692 (23b8bedbip:1884)
[ERROR] [PkcsTlsTransport] [TLS_FreeRTOS_send:1090] Failed to send data:  mbedTLSError= SSL - Internal error (eg, unexpected failure in lower-level module) : <No-Low-Level-Code>.

It is good that you have found a way to reliably repro this. As I mentioned before, if you can examine the state in the debugger when it hits the assert, that may give us some clue.

I found the problem. There was a portion of the code which ended up using the connection after the SSL session had been freed. This was causing the undefined behaviour.

Thank you for reporting back.

@drowsell
Excuse me for abruptly coming to you.
I could be facing the same problem now. What kind of fix did you make? Please give me some details.

I believe there were two errors for me. On was an error on my part where the SSL connection was being referenced after is was freed.

The second that was trickier was an undefined log statement for vLoggingPrintf. In sockets_wrapper.h I believe this code was causing me issues:

/* Prototype for the function used to print to console on Windows simulator
 * of FreeRTOS.
 * The function prints to the console before the network is connected;
 * then a UDP port after the network has connected. */
extern void vLoggingPrintf( const char * pcFormatString,
                            ... );

/* Map the SdkLog macro to the logging function to enable logging
 * on Windows simulator. */
#ifndef SdkLog
    #define SdkLog( message )    vLoggingPrintf message
#endif

I ended up fixing this with the following code in a config file:

#if MY_LOG_LEVEL == LOG_DEBUG
    #define vLoggingPrintf LOG_message
    #define configPRINTF( message ) LOG_message message
#else
    #define vLoggingPrintf LOG_nothing
    #define configPRINTF( message ) LOG_nothing message
#endif

LOG_message is my custom logging function and LOG_nothing is an empty function.

If you let me know what is happening I can see if I remember anything else.

1 Like

Thanks for your reply.
The problem I am experiencing is that when I unplug and plug the Ethernet, I get a data abort exception within MQTT_AGENT_Connect().
I asked this question because I think it is almost the same as your post.
I will try disabling vLoggingPrintf().
Thank you so much, I really really appreciate it.