+TCP multi static and ARP issues

Is there a trick to using a static IP, or is there a known bug?

Calling like this

  FreeRTOS_FillEndPoint(&xInterfaceETH, &xEndPointETH,
                        (uint8_t*)systemConfig.netConf.ip,
                        (uint8_t*)systemConfig.netConf.netmask,
                        (uint8_t*)systemConfig.netConf.gateway,
                        (uint8_t*)systemConfig.netConf.primaryDNS,
                        (uint8_t*)systemConfig.netConf.ethMac);
  xEndPointETH.ipv4_settings.ulDNSServerAddresses[1] = secondaryDNS;
  xEndPointETH.ipv4_settings.ulBroadcastAddress =
      xEndPointETH.ipv4_settings.ulIPAddress |
      ~(xEndPointETH.ipv4_settings.ulNetMask);
  xEndPointETH.bits.bWantDHCP = pdFALSE;

This does not work correctly, and it seems to be related to ARP. The gateway’s ARP entry does not resolve until I ping the unit from the gateway or call FreeRTOS_NetworkDown on the interface. Ideas?

Edit: This doesn’t work until I call vARPAgeCache(). However, I must call it later; I can’t call it right when the interface first comes up. I am using two interfaces, but only one is in use.

Thanks for reporting this. I can not replicate the problem in my DUT, even while DHCP is disabled and a static IP-address is assigned.

Can you check how often and when vARPAgeCache() is being called “automatically”?

Thanks

My development switch is a Ubiquiti EdgeSwitch 8 POE. I know it is related to Spanning tree protocol because if I turn it off everything works normally.

What actually happens is something like this:

The initial gratuitous arp(s) sent are dropped by the switch. Then, since the gateway arp won’t resolve, no packets ever get sent again; it’s a circular problem. In my testing, it looks like the stack only sends the gratuitous arp once for some reason.

I think the reason dhcp works ok is that the stack keeps retrying and actually sends something until the switch begins to forward packets. It looks like it takes about 4-5 outbound packets to get this working.

I’m unclear how often vARPAgeCache() is being called, but I can check.

Calling this once every 15 seconds resolves this eventually.

static void checkArpStatus(void) {

  uint32_t ulGatewayIP = activeEndpoint->ipv4_settings.ulGatewayAddress;
  MACAddress_t xMACAddress;
  eResolutionLookupResult_t eResult;
  eResult = eARPGetCacheEntry(&ulGatewayIP, &xMACAddress, &activeEndpoint);
  if (eResult != eResolutionCacheHit) {
    /* Gateway MAC not resolved, send gratuitous ARP and request */
    vARPSendGratuitous();
    FreeRTOS_OutputARPRequest(ulGatewayIP);
  }
}

I don’t think vARPAgeCache() is ever called by the stack. If I breakpoint at

        /* Is it time for ARP processing? */
        if( prvIPTimerCheck( &xARPTimer ) != pdFALSE )

then inside prvIPTimerCheck() the condition pxTimer->bActive == pdFALSE_UNSIGNED is true, so it just returns.

I see that vIPSetARPTimerEnableState( pdFALSE ); is called in prvProcessNetworkDownEvent, so how and where does the timer ever get enabled again? As an experiment, if I set xARPTimer.bActive to 1 in the watch window, everything takes off.

I’ve attached my IP config file; no, it’s not that pretty.

FreeRTOSIPConfig.h (10.3 KB)

Great that you found the cause of the problem, and a trick to avoid the problem.
It is funny to note that I’m also working with a new POE managed TP-Link, which has port mirroring.

The following text does not solve your problem. I will put it here in case other people run into the problems mentioned:

Some text about my experiences of today’s testing:

it looks like the stack only sends the gratuitous arp once for some reason

I decided to send gratuitous ARP messages regularly to make sure that the ARP caches in other devices are up-to-date all the time.

I just tested your problem, and I did find a firewall problem: although Wireshark sees incoming ICMP ping messages, the firewall doesn’t allow my laptop to reply.

After allowing them, the DUT can ping my laptop and vice versa. When I tell the firewall to trust devices in 192.168.178.0/24, ping messages are answered.

It doesn’t make a difference if DHCP is enabled or not, it works well with and without.
The function vARPAgeCache() is being called regularly, as it should.

Now we should find out why pxTimer->bActive is false in your case.

Well, I can’t find anywhere it is enabled in code once disabled. Shouldn’t this happen on network up?

Adding to this, could the problem be related to ARP being global instead of per interface? In my case, I think one of the interfaces is FreeRTOS_NetworkDown’d.

That sounds like a good explanation.

The ARP implementation is global in the sense that there is only one ARP table.

Each entry in the table has a property xNetworkEndPoint, so in a way it behaves like serving multiple ARP tables.

It is more efficient to have 100 entries shared by two end-points, instead of a fixed 50 entries per end-point.
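To illustrate that idea, here is a minimal sketch of one shared table whose entries each record the end-point they belong to. The types and the names demoCacheAdd and demoCacheFind are hypothetical stand-ins, not the FreeRTOS+TCP definitions, and ageing and entry replacement are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified types: NOT the FreeRTOS+TCP definitions. */
#define DEMO_ARP_CACHE_ENTRIES    4

typedef struct DemoEndPoint
{
    int iId;
} DemoEndPoint_t;

typedef struct DemoARPEntry
{
    uint32_t ulIPAddress;              /* 0 marks a free slot. */
    uint8_t ucMAC[ 6 ];
    const DemoEndPoint_t * pxEndPoint; /* End-point that owns this entry. */
} DemoARPEntry_t;

/* One table shared by every end-point. */
static DemoARPEntry_t xDemoCache[ DEMO_ARP_CACHE_ENTRIES ];

/* Store a mapping in the first free slot. Returns 1 on success, 0 when full. */
static int demoCacheAdd( uint32_t ulIP,
                         const uint8_t * pucMAC,
                         const DemoEndPoint_t * pxEndPoint )
{
    size_t i;

    assert( ( ulIP != 0U ) && ( pucMAC != NULL ) && ( pxEndPoint != NULL ) );

    for( i = 0; i < DEMO_ARP_CACHE_ENTRIES; i++ )
    {
        if( xDemoCache[ i ].ulIPAddress == 0U )
        {
            xDemoCache[ i ].ulIPAddress = ulIP;
            memcpy( xDemoCache[ i ].ucMAC, pucMAC, sizeof( xDemoCache[ i ].ucMAC ) );
            xDemoCache[ i ].pxEndPoint = pxEndPoint;
            return 1;
        }
    }

    return 0;
}

/* Look up ulIP, considering only entries owned by pxEndPoint. */
static const DemoARPEntry_t * demoCacheFind( uint32_t ulIP,
                                             const DemoEndPoint_t * pxEndPoint )
{
    size_t i;

    for( i = 0; i < DEMO_ARP_CACHE_ENTRIES; i++ )
    {
        if( ( xDemoCache[ i ].ulIPAddress == ulIP ) &&
            ( xDemoCache[ i ].pxEndPoint == pxEndPoint ) )
        {
            return &xDemoCache[ i ];
        }
    }

    return NULL;
}
```

With this layout a busy end-point can use every slot, which is the efficiency argument above. The flip side is that a lookup through the wrong end-point misses even though the address is in the table.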

Here ( in FreeRTOS_IP.c ) is the guilty code:

 void prvProcessNetworkDownEvent( struct xNetworkInterface * pxInterface )
 {
     NetworkEndPoint_t * pxEndPoint;
 
     configASSERT( pxInterface != NULL );
     configASSERT( pxInterface->pfInitialise != NULL );
     /* Stop the Address Resolution timer while there is no network. */
     #if ipconfigIS_ENABLED( ipconfigUSE_IPv4 )
+        /* __HT__ Should be done only if all network interfaces are down. */
+        vIPSetARPTimerEnableState( pdFALSE );
     #endif
     ...
 }

I don’t think it hurts if the ARP timer is always enabled.
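For anyone who would rather keep the timer stop, the comment in the snippet suggests doing it only when every interface is down. A minimal sketch of that check follows; DemoEndPoint_t, iDemoARPTimerActive and both functions are hypothetical stand-ins for illustration, not FreeRTOS+TCP code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins: NOT the FreeRTOS+TCP types. */
typedef struct DemoEndPoint
{
    int iIsUp;                    /* Non-zero while the end-point is up. */
    struct DemoEndPoint * pxNext; /* Next end-point in the global list. */
} DemoEndPoint_t;

/* Stand-in for the active flag of the ARP timer. */
static int iDemoARPTimerActive = 1;

/* Returns 1 when no end-point in the list is up, 0 otherwise. */
static int demoAllEndPointsDown( const DemoEndPoint_t * pxFirst )
{
    const DemoEndPoint_t * pxEndPoint;

    for( pxEndPoint = pxFirst; pxEndPoint != NULL; pxEndPoint = pxEndPoint->pxNext )
    {
        if( pxEndPoint->iIsUp != 0 )
        {
            return 0;
        }
    }

    return 1;
}

/* Mark one end-point as down; stop the ARP timer only if nothing else is up. */
static void demoProcessNetworkDown( DemoEndPoint_t * pxFirst,
                                    DemoEndPoint_t * pxDown )
{
    assert( pxDown != NULL );

    pxDown->iIsUp = 0;

    if( demoAllEndPointsDown( pxFirst ) != 0 )
    {
        iDemoARPTimerActive = 0;
    }
}
```

In a two-interface setup like the one described in this thread, taking the unused interface down would then leave the timer running for the interface that is still up.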

I can confirm that this pretty much fixes this. Is there any drawback to removing that disable patch?

This bug would have been introduced in multi only, correct?

Further testing shows another issue. If I change interfaces, the gateway IP is “flushed” from the ARP cache and it never repopulates. Do I need to trigger this manually via FreeRTOS_OutputARPRequest for the gateway, or what normally triggers the lookup?

Actually, what is happening is:

FreeRTOS_OutputARPRequest is going to the dead interface, because FreeRTOS_InterfaceEndPointOnNetMask returns the first interface with a matching netmask.

Is the solution to zero the IP address on the unused network, or should FreeRTOS_InterfaceEndPointOnNetMask validate that the network is actually up?

I think the solution may be something like

    NetworkEndPoint_t * FreeRTOS_InterfaceEndPointOnNetMask( const NetworkInterface_t * pxInterface,
                                                             uint32_t ulIPAddress )
    {
        NetworkEndPoint_t * pxEndPoint = pxNetworkEndPoints;

        /* Find the best fitting end-point to reach a given IP-address. */

        /*_RB_ Presumably then a broadcast reply could go out on a different end point to that on
         * which the broadcast was received - although that should not be an issue if the nodes are
         * on the same LAN it could be an issue if the nodes are on separate LAN's. */

        while( pxEndPoint != NULL )
        {
            if( ( pxInterface == NULL ) || ( pxEndPoint->pxNetworkInterface == pxInterface ) )
            {
                #if ( ipconfigUSE_IPv4 != 0 )
                    #if ( ipconfigUSE_IPv6 != 0 )
                        if( pxEndPoint->bits.bIPv6 == pdFALSE_UNSIGNED )
                    #endif
                    {
                        if( pxEndPoint->bits.bEndPointUp == pdTRUE_UNSIGNED ){
                          if( ( ulIPAddress == ~0U ) ||
                              ( ( ulIPAddress & pxEndPoint->ipv4_settings.ulNetMask ) == ( pxEndPoint->ipv4_settings.ulIPAddress & pxEndPoint->ipv4_settings.ulNetMask ) ) )
                          {
                              /* Found a match. */
                              break;
                          }
                        }
                    }
                #endif /* if ( ipconfigUSE_IPv4 != 0 ) */
            }

            pxEndPoint = pxEndPoint->pxNext;
        }

        /* This was only for debugging. */
        if( pxEndPoint == NULL )
        {
            FreeRTOS_debug_printf( ( "FreeRTOS_FindEndPointOnNetMask: No match for %xip\n",
                                     ( unsigned ) FreeRTOS_ntohl( ulIPAddress ) ) );
        }

        return pxEndPoint;
    }

But I’m not sure what this might break.

You wrote:

I think the solution may be something like:

NetworkEndPoint_t * FreeRTOS_InterfaceEndPointOnNetMask( const NetworkInterface_t * pxInterface,
uint32_t ulIPAddress )
{
}

In this solution, you are changing the behaviour of a function that is often used. In some situations, the user wants to see all matching end-points, either up or down.

Besides that, I think that it would solve the problem that you describe.
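One way to address that concern might be to leave the existing lookup alone and add a second one that prefers end-points that are up. Below is a minimal, self-contained sketch of the idea. DemoEndPoint_t, demoFindFirstMatch and demoFindPreferUp are hypothetical names with simplified types, not part of FreeRTOS+TCP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified end-point: NOT the FreeRTOS+TCP NetworkEndPoint_t. */
typedef struct DemoEndPoint
{
    uint32_t ulIPAddress;
    uint32_t ulNetMask;
    int iIsUp;                    /* Non-zero while the end-point is up. */
    struct DemoEndPoint * pxNext;
} DemoEndPoint_t;

/* Returns non-zero when ulIP lies in the subnet of pxEndPoint. */
static int demoOnSubnet( const DemoEndPoint_t * pxEndPoint,
                         uint32_t ulIP )
{
    assert( pxEndPoint != NULL );

    return ( ulIP & pxEndPoint->ulNetMask ) ==
           ( pxEndPoint->ulIPAddress & pxEndPoint->ulNetMask );
}

/* Unchanged behaviour: first end-point on the subnet, up or down. */
static DemoEndPoint_t * demoFindFirstMatch( DemoEndPoint_t * pxFirst,
                                            uint32_t ulIP )
{
    DemoEndPoint_t * pxEndPoint;

    for( pxEndPoint = pxFirst; pxEndPoint != NULL; pxEndPoint = pxEndPoint->pxNext )
    {
        if( demoOnSubnet( pxEndPoint, ulIP ) != 0 )
        {
            return pxEndPoint;
        }
    }

    return NULL;
}

/* New variant: first matching end-point that is up; when none is up,
 * fall back to the first match so callers still see a result. */
static DemoEndPoint_t * demoFindPreferUp( DemoEndPoint_t * pxFirst,
                                          uint32_t ulIP )
{
    DemoEndPoint_t * pxEndPoint;

    for( pxEndPoint = pxFirst; pxEndPoint != NULL; pxEndPoint = pxEndPoint->pxNext )
    {
        if( ( pxEndPoint->iIsUp != 0 ) && ( demoOnSubnet( pxEndPoint, ulIP ) != 0 ) )
        {
            return pxEndPoint;
        }
    }

    return demoFindFirstMatch( pxFirst, ulIP );
}
```

Callers that need every match regardless of state keep using the first function, and only the ARP and gateway paths would switch to the second. Whether that split fits the library is for the maintainers to judge.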

Actually, this may be something unresolvable on the DUT side. If the switch port your unit is connected to is configured for 802.1X, it may require a known MAC-IP address combination. I have seen this many times; it is simply assumed that any network node generates some traffic of its own, thereby revealing its IP address to the switch.

We generally worked around this issue by having a daemon that would attempt to establish a TCP connection to some dummy server, which would force egress packets (non-gratuitous ARP requests) out the port, thereby informing the switch about our MAC and IP address.

Note that on switches configured that way, you frequently have problems with multiple end-points as well, because the switch may consider anything but a 1:1 correspondence a security breach and block the port.

Since you write that this is related to STP, it may also be the case that the switch cannot determine the role of your DUT in the network and thus blocks the port. What is the MAC address you configured your DUT with? Have you checked with Ubiquiti whether that could be a known problem with them? Can you query the port status on the switch side via a management interface?

I’m unclear on this; in my view these are clear bugs, or at least undesirable behaviours in most typical embedded systems, with the potential to leave them offline.

Issue 1: The arp timer needs to run regardless

Issue 2: FreeRTOS_OutputARPRequest is fundamentally broken if it outputs ARP based on netmask only. All it takes is for a static IP to be set on the first end-point in the list (or for DHCP to have completed there, even if it is currently offline), and all ARPs will then be sent out on port #1.

Hi Erik,

yes, there may be issues in FreeRTOS+TCP, I have no doubts about that.

I was just elaborating on your observation that communication had to be “jump started” by an external ARP request. There ARE scenarios in which this is an inherent issue stemming from the behavior of switches, one that cannot be fixed by the affected devices’ network stacks (only worked around, as outlined above).

Apologies for the noise if this should not be applicable in your case.

Another place this fails is FreeRTOS_FindGateWay, because it just selects the first matching gateway, regardless of pxEndPoint status.

@FreeRTOS TCP Maintainers?

For FreeRTOS_FindGateWay, maybe something like (Only IPv4 edit)

    NetworkEndPoint_t * FreeRTOS_FindGateWay( BaseType_t xIPType )
    {
        NetworkEndPoint_t * pxEndPoint = pxNetworkEndPoints;
        NetworkEndPoint_t * pxFallback = NULL;

        while( pxEndPoint != NULL )
        {
            #if ( ipconfigUSE_IPv6 == 0 )
                ( void ) xIPType;

                if( pxEndPoint->ipv4_settings.ulGatewayAddress != 0U ) /* access to ipv4_settings is checked. */
                {
                    if( pxFallback == NULL ) {
                        pxFallback = pxEndPoint;
                    }
                    if( pxEndPoint->bits.bEndPointUp == pdTRUE_UNSIGNED ) {
                        return pxEndPoint;
                    }
                }
            #else
                if( ( xIPType == ( BaseType_t ) ipTYPE_IPv6 ) && ( pxEndPoint->bits.bIPv6 != pdFALSE_UNSIGNED ) )
                {
                    /* Check if the IP-address is non-zero. */
                    if( memcmp( FreeRTOS_in6addr_any.ucBytes, pxEndPoint->ipv6_settings.xGatewayAddress.ucBytes, ipSIZE_OF_IPv6_ADDRESS ) != 0 )
                    {
                        break;
                    }
                }

                #if ( ipconfigUSE_IPv4 != 0 )
                    else
                    if( ( xIPType == ( BaseType_t ) ipTYPE_IPv4 ) && ( pxEndPoint->bits.bIPv6 == pdFALSE_UNSIGNED ) )
                    {
                        if( pxEndPoint->ipv4_settings.ulGatewayAddress != 0U )
                        {
                            break;
                        }
                    }
                #endif /* ( ipconfigUSE_IPv4 != 0 ) */
                else
                {
                    /* This end-point is not the right IP-type. */
                }
            #endif /* ( ipconfigUSE_IPv6 != 0 ) */
            pxEndPoint = pxEndPoint->pxNext;
        }

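        /* In an IPv6-enabled build the loop above exits via break with
         * pxEndPoint set; otherwise use the first match, up or down. */
        return ( pxEndPoint != NULL ) ? pxEndPoint : pxFallback;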
    }

And for FreeRTOS_InterfaceEndPointOnNetMask

    NetworkEndPoint_t * FreeRTOS_InterfaceEndPointOnNetMask( const NetworkInterface_t * pxInterface,
                                                             uint32_t ulIPAddress )
    {
        NetworkEndPoint_t * pxEndPoint = pxNetworkEndPoints;
        NetworkEndPoint_t * pxFallback = NULL;
        /* Find the best fitting end-point to reach a given IP-address. */

        /*_RB_ Presumably then a broadcast reply could go out on a different end point to that on
         * which the broadcast was received - although that should not be an issue if the nodes are
         * on the same LAN it could be an issue if the nodes are on separate LAN's. */

        while( pxEndPoint != NULL )
        {
            if( ( pxInterface == NULL ) || ( pxEndPoint->pxNetworkInterface == pxInterface ) )
            {
                #if ( ipconfigUSE_IPv4 != 0 )
                    #if ( ipconfigUSE_IPv6 != 0 )
                        if( pxEndPoint->bits.bIPv6 == pdFALSE_UNSIGNED )
                    #endif
                    {
                        BaseType_t xMatches =
                              ( ( ulIPAddress == ~0U ) ||
                              ( ( ulIPAddress & pxEndPoint->ipv4_settings.ulNetMask ) == ( pxEndPoint->ipv4_settings.ulIPAddress & pxEndPoint->ipv4_settings.ulNetMask ) ) );

                        if( xMatches ) {
                            if( pxFallback == NULL ) {
                                pxFallback = pxEndPoint;  /* record first match regardless */
                            }

                            if( pxEndPoint->bits.bEndPointUp == pdTRUE_UNSIGNED ) {
                                break;
                            }
                        }
                    }
                #endif /* if ( ipconfigUSE_IPv4 != 0 ) */
            }

            pxEndPoint = pxEndPoint->pxNext;
        }

        /* No matching end-point is up: fall back to the first match, up or down. */
        if( pxEndPoint == NULL )
        {
            pxEndPoint = pxFallback;
        }

        /* This was only for debugging. */
        if( pxEndPoint == NULL )
        {
            FreeRTOS_debug_printf( ( "FreeRTOS_FindEndPointOnNetMask: No match for %xip\n",
                                     ( unsigned ) FreeRTOS_ntohl( ulIPAddress ) ) );
        }

        return pxEndPoint;
    }

Would there be any side effects from these changes?