We have a so-called ‘cluster’ of Windows Server 2022 machines.
The cluster contains two servers; let’s call them server-1 and server-2.
To complete the introduction, there is also a witness, but that’s not important for my issue.
The cluster is set up for redundancy. If a server fails (e.g. through power loss), the other server becomes active so the system continues running. Let’s call the server that fails the ‘old server’ and the server that takes over the ‘new server’.
When we tested this by shutting down a server, we saw that the embedded module, which runs FreeRTOS with FreeRTOS+TCP, did not connect to the new server but kept sending packets to the old server.
Technically, when the cluster fails over, the server IP address also moves from the old server to the new server.
Wireshark confirms this: the server IP address 10.51.1.1 moves to MAC address ac:1f:6b:ef:af:45, whereas the old server had a different MAC address (ac:1f:6b:ef:ab:01).
1055 781.388672 SuperMic_ef:af:45 Broadcast ARP 60 Gratuitous ARP for 10.51.1.1 (Request) (duplicate use of 10.51.1.1 detected!) 2022-03-16 10:51:45.415845
Frame 1055: 60 bytes on wire (480 bits), 60 bytes captured (480 bits) on interface 0
Ethernet II, Src: SuperMic_ef:af:45 (ac:1f:6b:ef:af:45), Dst: Broadcast (ff:ff:ff:ff:ff:ff)
[Duplicate IP address detected for 10.51.1.1 (ac:1f:6b:ef:af:45) - also in use by ac:1f:6b:ef:ab:01 (frame 897)]
Address Resolution Protocol (request/gratuitous ARP)
Hardware type: Ethernet (1)
Protocol type: IPv4 (0x0800)
Hardware size: 6
Protocol size: 4
Opcode: request (1)
[Is gratuitous: True]
Sender MAC address: SuperMic_ef:af:45 (ac:1f:6b:ef:af:45)
Sender IP address: 10.51.1.1
Target MAC address: 00:00:00_00:00:00 (00:00:00:00:00:00)
Target IP address: 10.51.1.1
The problem is that FreeRTOS+TCP on the module doesn’t update its ARP table when it receives this gratuitous ARP request, so after the failover it keeps sending TCP/IP packets to the old MAC address instead of the new one.
We use FreeRTOS+TCP V2.3.2 LTS Patch 1
Source FreeRTOS_ARP.c
Function eARPProcessPacket()
case ipARP_REQUEST:
/* The packet contained an ARP request. Was it for the IP
* address of the node running this code? */
if( ulTargetProtocolAddress == *ipLOCAL_IP_ADDRESS_POINTER )
The check above drops this gratuitous ARP request, because the target IP address isn’t the local IP address of the module.
I have searched the internet, and I think FreeRTOS+TCP should handle this gratuitous ARP request so that existing ARP entries are updated.
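To make the mismatch concrete, here is a minimal standalone sketch in plain C. It is not FreeRTOS+TCP code: the helper names are made up for illustration, and a module address such as 10.51.1.20 would be an assumption, since only the server address 10.51.1.1 appears in the capture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack four octets into one 32-bit value. */
static uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Same idea as the existing check: is the request aimed at our address? */
static bool is_request_for_us(uint32_t target_ip, uint32_t local_ip)
{
    return target_ip == local_ip;
}

/* A gratuitous ARP announces the sender's own address, so the
 * sender IP and the target IP in the packet are identical. */
static bool is_gratuitous_arp(uint32_t sender_ip, uint32_t target_ip)
{
    return sender_ip == target_ip;
}
```

For the captured frame, both the sender and the target IP are 10.51.1.1, so is_gratuitous_arp() is true, while is_request_for_us() is false for any module address other than 10.51.1.1. That is why the frame never reaches the code that touches the ARP table.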
I changed it to the following code:
case ipARP_REQUEST:
    /* The packet contained an ARP request. Was it for the IP
     * address of the node running this code? */
    if( ulTargetProtocolAddress == *ipLOCAL_IP_ADDRESS_POINTER )
    {
        // Existing code of FreeRTOS+TCP
    }
    else if( ulSenderProtocolAddress == ulTargetProtocolAddress ) // Gratuitous ARP request?
    {
        /* The request is a Gratuitous ARP message.
         * Refresh the entry if it already exists. */
        /* Determine the ARP cache status for the requested IP address. */
        if( eARPGetCacheEntry( &( ulSenderProtocolAddress ), &( xHardwareAddress ) ) == eARPCacheHit )
        {
            vARPRefreshCacheEntry( &( pxARPHeader->xSenderHardwareAddress ), ulSenderProtocolAddress );
        }
    }
With this change, the switch from the old to the new server worked perfectly once the gratuitous ARP request had been received and handled.
What do you think? Is FreeRTOS+TCP lacking support for this feature, or is Windows not configured correctly?
After reading up on gratuitous ARP requests, my personal opinion is that FreeRTOS+TCP should be updated with this code.
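The intended behaviour (refresh an entry only if the sender is already in the cache, never create a new one) can also be sketched outside the stack. The following is a simplified model in plain C, not the FreeRTOS+TCP implementation: the cache layout, CACHE_SIZE and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE 4u   /* hypothetical size, for illustration only */

typedef struct {
    uint32_t ip;        /* IPv4 address packed into one 32-bit value */
    uint8_t  mac[6];
    bool     valid;
} arp_entry_t;

/* Return the entry for ip, or NULL when the address is not cached. */
static arp_entry_t *arp_lookup(arp_entry_t *cache, uint32_t ip)
{
    assert(cache != NULL);
    for (size_t i = 0; i < CACHE_SIZE; i++) {
        if (cache[i].valid && cache[i].ip == ip) {
            return &cache[i];
        }
    }
    return NULL;
}

/* Handle an ARP request that is not addressed to this node.
 * Returns true when an existing entry was refreshed. */
static bool arp_handle_gratuitous(arp_entry_t *cache,
                                  uint32_t sender_ip, uint32_t target_ip,
                                  const uint8_t sender_mac[6])
{
    if (sender_ip != target_ip) {
        return false;   /* ordinary request meant for another host */
    }
    arp_entry_t *entry = arp_lookup(cache, sender_ip);
    if (entry == NULL) {
        return false;   /* unknown peer: do not create a new entry */
    }
    memcpy(entry->mac, sender_mac, sizeof entry->mac);
    return true;
}
```

If the cache holds 10.51.1.1 with the old MAC ac:1f:6b:ef:ab:01, calling arp_handle_gratuitous() with sender and target 10.51.1.1 and the new MAC ac:1f:6b:ef:af:45 overwrites the entry. A gratuitous ARP from an address that is not cached leaves the table untouched, so unrelated hosts on the network cannot fill it up.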