Code stuck in vDHCPProcess, mostly when the DHCP state machine is in eWaitingAcknowledge

Hi,
I am using an RA6M4 MCU with FreeRTOS and the FreeRTOS+TCP stack; the FreeRTOS version is 10.6.1.
The problem occurs in DHCP while the state machine is in the eWaitingAcknowledge state. If the client receives a NACK in the options from the server, it sets its current DHCP state to eInitialWait:

        switch( pxSet->ucOptionCode )
        {
            case dhcpIPv4_MESSAGE_TYPE_OPTION_CODE:

                if( pxSet->pucByte[ pxSet->uxIndex ] == ( uint8_t ) xExpectedMessageType )
                {
                    /* The message type is the message type the
                     * state machine is expecting. */
                    pxSet->ulProcessed++;
                }
                else
                {
                    if( pxSet->pucByte[ pxSet->uxIndex ] == ( uint8_t ) dhcpMESSAGE_TYPE_NACK )
                    {
                        if( xExpectedMessageType == ( BaseType_t ) dhcpMESSAGE_TYPE_ACK )
                        {
                            /* Start again. */
                            EP_DHCPData.eDHCPState = eInitialWait;
                        }
                    }

                    /* Stop processing further options. */
                    pxSet->uxLength = 0;
                }

                break;

If another DHCP packet happens to arrive at this instant and the client peeks it, then, because the current state and the expected state do not match, the event is ignored and the code gets stuck in the loop:

        if( ( EP_DHCPData.eDHCPState != EP_DHCPData.eExpectedState ) && ( xReset == pdFALSE ) )
        {
            /* When the DHCP event was generated, the DHCP client was
            * in a different state.  Therefore, ignore this event. */
            FreeRTOS_debug_printf( ( "vDHCPProcessEndPoint: wrong state: expect: %d got: %d : ignore\n",
                                     EP_DHCPData.eExpectedState, EP_DHCPData.eDHCPState ) );
        }
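
To make the stall easier to see, here is a minimal sketch of the situation described above. It is a simplified model, not FreeRTOS+TCP code: the `Client` type, `process_endpoint()` and `peek_loop_terminates()` are hypothetical names, and the iteration bound exists only so the model can report the stall (the loop quoted in this thread has no such bound).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these names are not FreeRTOS+TCP identifiers. */
typedef enum { STATE_INITIAL_WAIT, STATE_WAITING_ACK } State;

typedef struct
{
    State current;   /* Stands in for eDHCPState. */
    State expected;  /* Stands in for eExpectedState. */
    size_t queued;   /* Packets waiting on the socket. */
} Client;

/* Models the per-endpoint handler: the packet is consumed only when the
 * states agree; on a mismatch the event is ignored and the packet stays. */
static void process_endpoint( Client * c )
{
    if( c->current != c->expected )
    {
        return; /* "wrong state ... ignore": nothing is read from the queue. */
    }

    if( c->queued > 0U )
    {
        c->queued--;
    }
}

/* Models the peek loop. Returns true if the queue drained within
 * max_iterations, false if the loop made no progress. */
static bool peek_loop_terminates( Client * c, size_t max_iterations )
{
    size_t i;

    assert( c != NULL );

    for( i = 0U; ( i < max_iterations ) && ( c->queued > 0U ); i++ )
    {
        process_endpoint( c );
    }

    return c->queued == 0U;
}
```

With `current` and `expected` different and one packet queued, the model never drains the queue, which matches the behaviour reported here.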

10.6.1 should be the version of the FreeRTOS Kernel; can you share the version of FreeRTOS+TCP you are using?

"Now, if by any chance, it receives any dhcp packet at this instant and peeks the packet, then due to mismatch of present state and expected state, it will be stuck in the loop"

In the latest version of FreeRTOS+TCP, when you receive a packet on the DHCP socket, the expected state (EP_DHCPData.eExpectedState) is set to the current DHCP state (EP_DHCPData.eDHCPState) in xSendDHCPEvent(); see FreeRTOS-Plus-TCP/source/FreeRTOS_IP_Utils.c on the main branch of the FreeRTOS/FreeRTOS-Plus-TCP repository on GitHub.

The version of FreeRTOS+TCP is 4.0.0.

Sir, I know that, but the expected state is only set once execution leaves the vDHCPProcess function, and it is stuck in this loop:

        else if( xDHCPv4Socket != NULL ) /* If there is a socket, check for incoming messages first. */
        {
            /* No need to initialise 'pucUDPPayload', it just looks nicer. */
            uint8_t * pucUDPPayload = NULL;
            const DHCPMessage_IPv4_t * pxDHCPMessage;
            int32_t lBytes;

            while( xDHCPv4Socket != NULL )
            {
                BaseType_t xRecvFlags = FREERTOS_ZERO_COPY + FREERTOS_MSG_PEEK;
                NetworkEndPoint_t * pxIterator = NULL;

                /* Peek the next UDP message. */
                lBytes = FreeRTOS_recvfrom( xDHCPv4Socket, &( pucUDPPayload ), 0, xRecvFlags, NULL, NULL );

                if( lBytes < ( ( int32_t ) sizeof( DHCPMessage_IPv4_t ) ) )
                {
                    if( ( lBytes < 0 ) && ( lBytes != -pdFREERTOS_ERRNO_EAGAIN ) )
                    {
                        FreeRTOS_printf( ( "vDHCPProcess: FreeRTOS_recvfrom returns %d\n", ( int ) lBytes ) );
                    }

                    break;
                }

                /* Map a DHCP structure onto the received data. */
                /* MISRA Ref 11.3.1 [Misaligned access] */
                /* More details at: https://github.com/FreeRTOS/FreeRTOS-Plus-TCP/blob/main/MISRA.md#rule-113 */
                /* coverity[misra_c_2012_rule_11_3_violation] */
                pxDHCPMessage = ( ( const DHCPMessage_IPv4_t * ) pucUDPPayload );

                /* Sanity check. */
                if( ( pxDHCPMessage->ulDHCPCookie == dhcpCOOKIE ) && ( pxDHCPMessage->ucOpcode == dhcpREPLY_OPCODE ) )
                {
                    pxIterator = pxNetworkEndPoints;

                    /* Find the end-point with given transaction ID. */
                    while( pxIterator != NULL )
                    {
                        if( pxDHCPMessage->ulTransactionID == FreeRTOS_htonl( pxIterator->xDHCPData.ulTransactionId ) )
                        {
                            break;
                        }

                        pxIterator = pxIterator->pxNext;
                    }
                }

                if( ( pxIterator != NULL ) && ( pxIterator->xDHCPData.eDHCPState == eLeasedAddress ) )
                {
                    /* No DHCP messages are expected while in eLeasedAddress state. */
                    pxIterator = NULL;
                }

                if( pxIterator != NULL )
                {
                    /* The second parameter pdTRUE tells to check for a UDP message. */
                    vDHCPProcessEndPoint( pdFALSE, pdTRUE, pxIterator );

                    if( pxEndPoint == pxIterator )
                    {
                        xDoProcess = pdFALSE;
                    }
                }
                else
                {
                    /* Target not found, fetch the message and delete it. */
                    /* Pass the address of the pointer pucUDPPayload, because zero-copy is used. */
                    lBytes = FreeRTOS_recvfrom( xDHCPv4Socket, &( pucUDPPayload ), 0, FREERTOS_ZERO_COPY, NULL, NULL );

                    if( ( lBytes > 0 ) && ( pucUDPPayload != NULL ) )
                    {
                        /* Remove it now, destination not found. */
                        FreeRTOS_ReleaseUDPPayloadBuffer( pucUDPPayload );
                        FreeRTOS_printf( ( "vDHCPProcess: Removed a %d-byte message: target not found\n", ( int ) lBytes ) );
                    }
                }
            }
        }

Can you try making the change below?
In vProcessHandleOption(), add "EP_DHCPData.eExpectedState = eInitialWait;" and see whether the state mismatch still happens.

    static void vProcessHandleOption( NetworkEndPoint_t * pxEndPoint,
                                      ProcessSet_t * pxSet,
                                      BaseType_t xExpectedMessageType )
    {
        /* Option-specific handling. */

        switch( pxSet->ucOptionCode )
        {
            case dhcpIPv4_MESSAGE_TYPE_OPTION_CODE:

                if( pxSet->pucByte[ pxSet->uxIndex ] == ( uint8_t ) xExpectedMessageType )
                {
                    /* The message type is the message type the
                     * state machine is expecting. */
                    pxSet->ulProcessed++;
                }
                else
                {
                    if( pxSet->pucByte[ pxSet->uxIndex ] == ( uint8_t ) dhcpMESSAGE_TYPE_NACK )
                    {
                        if( xExpectedMessageType == ( BaseType_t ) dhcpMESSAGE_TYPE_ACK )
                        {
                            /* Start again. */
                            EP_DHCPData.eDHCPState = eInitialWait;

                            /* ADD THIS LINE to fix the state mismatch */
                            EP_DHCPData.eExpectedState = eInitialWait;
                        }
                    }

                    /* Stop processing further options. */
                    pxSet->uxLength = 0;
                }

                break;

@NikhilKamath, I tried adding this line to the code. It does solve the state mismatch, but it does not remove the real problem, which I found while debugging.
The real problem is that when the MCU receives a NAK from one server, another server may also send it an offer. After handling the NAK packet (and changing its state to eWaitingAcknowledge), the client peeks for the next received packet, finds the offer message, and gets stuck in the loop.

The same thing causes other bugs: for example, when we receive an ACK and an offer arrives right after it, the code gets stuck there too.

So far, the only way I can think of to handle this is to release the packet when the states mismatch.
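
The idea of releasing the packet on a state mismatch could look roughly like the sketch below. It is a simplified model with hypothetical names (`Client`, `process_or_release()`, `drain()`), not a patch for the library. In the real stack the release step would presumably be a non-peek FreeRTOS_recvfrom() with FREERTOS_ZERO_COPY followed by FreeRTOS_ReleaseUDPPayloadBuffer(), as in the "target not found" branch quoted earlier, but that mapping is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; these names are not FreeRTOS+TCP identifiers. */
typedef enum { STATE_INITIAL_WAIT, STATE_WAITING_ACK } State;

typedef struct
{
    State current;   /* Stands in for eDHCPState. */
    State expected;  /* Stands in for eExpectedState. */
    size_t queued;   /* Packets waiting on the socket. */
    size_t dropped;  /* Packets released because of a state mismatch. */
} Client;

/* Handle one queued packet. On a state mismatch the packet is still
 * removed from the queue (released), so the caller's loop makes progress. */
static void process_or_release( Client * c )
{
    assert( c != NULL );

    if( c->queued == 0U )
    {
        return;
    }

    c->queued--; /* The packet leaves the queue in both branches. */

    if( c->current != c->expected )
    {
        c->dropped++; /* Mismatch: discard instead of peeking it again. */
    }
}

/* Drain the queue and return how many iterations were needed. */
static size_t drain( Client * c )
{
    size_t iterations = 0U;

    while( c->queued > 0U )
    {
        process_or_release( c );
        iterations++;
    }

    return iterations;
}
```

A trade-off of this approach is that a packet dropped this way is lost to the state machine, so the client would have to rely on a later message or a retry.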

If anyone has another approach, please share.

Thanks for the feedback, we will take a look early next week and get back to you.

@Raghav

Can you add the changes in the git diff below to your project and see if it helps?

FreeRTOS-Plus-TCP> git diff  
diff --git a/source/FreeRTOS_DHCP.c b/source/FreeRTOS_DHCP.c
index 6bca3e17..ce8cf219 100644
--- a/source/FreeRTOS_DHCP.c
+++ b/source/FreeRTOS_DHCP.c
@@ -209,14 +209,19 @@
             uint8_t * pucUDPPayload = NULL;
             const DHCPMessage_IPv4_t * pxDHCPMessage;
             int32_t lBytes;
+            struct freertos_sockaddr xSourceAddress;
+
+            memset(&xSourceAddress, 0, sizeof(xSourceAddress));

             while( EP_DHCPData.xDHCPSocket != NULL )
             {
                 BaseType_t xRecvFlags = FREERTOS_ZERO_COPY + FREERTOS_MSG_PEEK;
                 NetworkEndPoint_t * pxIterator = NULL;
+                struct freertos_sockaddr xSourceAddressCurrent;
+                socklen_t xSourceAddressCurrentLength = 0;

                 /* Peek the next UDP message. */
-                lBytes = FreeRTOS_recvfrom( EP_DHCPData.xDHCPSocket, &( pucUDPPayload ), 0, xRecvFlags, NULL, NULL );
+                lBytes = FreeRTOS_recvfrom( EP_DHCPData.xDHCPSocket, &( pucUDPPayload ), 0, xRecvFlags, &xSourceAddressCurrent, &xSourceAddressCurrentLength );

                 if( lBytes < ( ( int32_t ) sizeof( DHCPMessage_IPv4_t ) ) )
                 {
@@ -228,6 +233,11 @@
                     break;
                 }

+                if( xSourceAddress.sin_address.ulIP_IPv4 == 0U )
+                {
+                    memcpy(&xSourceAddress, &xSourceAddressCurrent, xSourceAddressCurrentLength);
+                }
+
                 /* Map a DHCP structure onto the received data. */
                 /* MISRA Ref 11.3.1 [Misaligned access] */
                 /* More details at: https://github.com/FreeRTOS/FreeRTOS-Plus-TCP/blob/main/MISRA.md#rule-113 */
@@ -239,10 +249,12 @@
                 {
                     pxIterator = pxNetworkEndPoints;

-                    /* Find the end-point with given transaction ID. */
+                    /* Find the end-point with given transaction ID and verify DHCP server address. */
                     while( pxIterator != NULL )
                     {
-                        if( pxDHCPMessage->ulTransactionID == FreeRTOS_htonl( pxIterator->xDHCPData.ulTransactionId ) )
+                        if( (pxDHCPMessage->ulTransactionID == FreeRTOS_htonl( pxIterator->xDHCPData.ulTransactionId )) && 
+                        ((xSourceAddress.sin_address.ulIP_IPv4 == xSourceAddressCurrent.sin_address.ulIP_IPv4) &&
+                        (xSourceAddress.sin_port == xSourceAddressCurrent.sin_port)))
                         {
                             break;
                         }
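
The filtering idea in the diff above can be sketched in isolation as follows. This is a simplified model with hypothetical names (`Source`, `Reply`, `Filter`, `accept_reply()`): it remembers the address and port of the first reply it sees and then only accepts replies that match both that source and the expected transaction ID. Replies that fail the check would fall through to the existing "target not found" path, which fetches and releases them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; these names are not FreeRTOS+TCP identifiers. */
typedef struct
{
    uint32_t ip;    /* Stands in for sin_address.ulIP_IPv4. */
    uint16_t port;  /* Stands in for sin_port. */
} Source;

typedef struct
{
    uint32_t transaction_id;
    Source source;
} Reply;

typedef struct
{
    Source locked;  /* ip == 0 means no server has been remembered yet. */
} Filter;

/* Remember the first sender seen, then accept a reply only if both the
 * transaction ID and the sender (address and port) match. */
static bool accept_reply( Filter * f, const Reply * r, uint32_t expected_xid )
{
    assert( ( f != NULL ) && ( r != NULL ) );

    if( f->locked.ip == 0U )
    {
        f->locked = r->source; /* First reply decides which server we follow. */
    }

    return ( r->transaction_id == expected_xid ) &&
           ( r->source.ip == f->locked.ip ) &&
           ( r->source.port == f->locked.port );
}
```

In the diff, the remembered address is a local variable declared before the while loop, so the lock only lasts for one pass over the queued packets; in this sketch the caller decides the lifetime of `Filter`.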

Hey @tony-josi-aws,

I ran the code you gave and it worked fine.
I think it has solved this problem for now.
So, thanks a lot for your help.

Thanks for reporting back. We will add these changes to the library.