FreeRTOS+TCP Multi: Issues with multiple endpoints on one interface

Ultimately I’m trying to set up two IP addresses on one interface on the Zynq (Zedboard) so that I can have two TCP servers:
TCP Server 1: 10.100.100.162, Port 5001
TCP Server 2: 10.111.111.163, Port 5001

Issue #1:
Even though TCP Server 1 and TCP Server 2 have different IP addresses,
if I use the same port number I get a configASSERT when creating the second socket, which ends my application.
As a workaround I had to use a different port number for the second TCP server.
So I set:
TCP Server 1: 10.100.100.162, Port 5001
TCP Server 2: 10.111.111.163, Port 5002

And then the assert did not trigger.
Although the workaround fixes my issue, I wonder why this issue exists.

Issue #2:
I have a Raspberry Pi (aka Rpi_151) connected to the Zynq through an Ethernet Switch.
This Rpi also has a dual IP address (10.100.100.151 and 10.111.111.151).

Observation 2A:
I opened up two terminal windows on Rpi_151.
In one terminal window Rpi_151 pings the Zynq on the 10.100.100.1 network (i.e. ping 10.100.100.162)
On the other terminal Rpi_151 pings the Zynq on the 10.111.111.1 network (i.e. ping 10.111.111.163)

Here’s the weird part: when I ping 10.111.111.163 in terminal 2, the replies from 10.100.100.162 in the terminal 1 window pause or stop. Many packets are reported missing.

If I reverse the order of how I add the endpoints on the Zynq then the reverse is true.
(Pinging 10.100.100.162 stops/ pauses the pings/replies from 10.111.111.163).

Observation 2B:
I decided to get a second Raspberry Pi (Rpi_152) which also has dual IP addresses
(10.100.100.152 and 10.111.111.152).

If I use Rpi_151 to ping Zynq at 10.100.100.162
and use Rpi_152 to ping Zynq at 10.111.111.163,
then the Zynq is able to reply to both ping requests no problem.
It’s only when I’m using only ONE Rpi to ping the Zynq on both IP addresses that it seems like something goes wrong with the Zynq.

It’s like the FreeRTOS+TCP gets confused that my Rpi has two IP addresses.

Issue #3:
So in order to test my two TCP servers running on the Zynq I have to use two Rpis.
One Rpi communicating on the 10.100.100.1 network
and the other communicating on the 10.111.111.1 network.

That’s the workaround for now at least.

Anyway, I’ll have both TCP Servers running on the Zynq.
The third issue I’m seeing is that one of the TCP servers (sometimes TCP Server 1, sometimes TCP Server 2)
will randomly end the connection if I have both TCP servers running simultaneously.

It appears that the “weaker” endpoint is the one I add first (this is the one associated with TCP Server 1). Even with TCP Server 2 just on standby (waiting for clients), having TCP Server 1 actively communicating with a client results in the connection being dropped after a short period of time (5 min).

In the reverse scenario (TCP Server 2 actively communicating with a client, and TCP Server 1 waiting), it seems like TCP Server 2 doesn’t drop its client.
I’d have to run this test overnight to see if the second endpoint really is stronger, or if I just got lucky.

Whenever the TCP server drops the client, I see this on the Zynq console:
vTCPStateChange: Closing (Queued 0, Accept 0 Reuse 0)
vTCPStateChange: me 0x40cc68 parent 0x40cc68 peer 0x0 clear 0
vTCPStateChange: xHasCleared = 0

Also, when the connection gets dropped, I go to the client terminal on the Rpi and try to ping the Zynq, and I notice that I can’t ping the Zynq on that IP address for a period of time (approx. 10 sec). After that I can ping again normally and reconnect my TCP client.

The source code for the TCP server included in my main.c was mostly taken from the tutorials here:

I wonder if one or more of my issues is related to the following forum post here:

I am attaching my main.c file.
I am also attaching my FreeRTOSIPConfig.h file
main_sdog.c (9.8 KB)
freeRTOSIPConfig_sdog.h (17.9 KB)

@svgarcia

Even though TCP Server 1 and TCP Server 2 have different IP addresses,
if I use the same port number I get a configASSERT when creating the second socket, which ends my application.

FreeRTOS+TCP shares the entire port number range across all endpoints and interfaces, meaning that all endpoints (even those on different interfaces) have to use ports from the same set. Port numbers are therefore independent of endpoints and have to be unique across applications, even if the endpoints used for communication are different.
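
To make the point above concrete, here is a toy model in plain C. It is not the FreeRTOS+TCP source; `BoundPortTable_t` and `toy_bind` are made-up names for illustration. It assumes, as described above, that the table of bound ports is keyed by port number only, so the IP address plays no part in the uniqueness check:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BOUND_PORTS    8

/* Toy model only: one table of bound ports for the whole stack,
 * keyed by port number alone (the IP address is not part of the key). */
typedef struct
{
    uint16_t ports[ MAX_BOUND_PORTS ];
    size_t count;
} BoundPortTable_t;

/* Returns true if the port was free and is now recorded as bound. */
bool toy_bind( BoundPortTable_t * table,
               uint32_t ip_address,
               uint16_t port )
{
    ( void ) ip_address; /* Deliberately ignored: ports are shared by all endpoints. */

    for( size_t i = 0; i < table->count; i++ )
    {
        if( table->ports[ i ] == port )
        {
            return false; /* Port already taken, whatever the address is. */
        }
    }

    if( table->count == MAX_BOUND_PORTS )
    {
        return false; /* Table full. */
    }

    table->ports[ table->count ] = port;
    table->count++;
    return true;
}
```

In this model, binding 10.100.100.162:5001 succeeds, a second bind to 10.111.111.163:5001 is rejected, and 10.111.111.163:5002 succeeds, which matches the workaround of moving the second server to port 5002.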

This Rpi also has a dual IP address (10.100.100.151 and 10.111.111.151).
When I ping the 10.111.111.163 in terminal 2, then in terminal 1 window the replies to the 10.100.100.162 are paused or stop. Many packets are reported missing.

Does your RPi have different MAC addresses for each endpoint? Can you share the full endpoint details of both your Zynq endpoints and RPi endpoints, including netmask, gateway, MAC, etc.?

Also, please check if the behavior is the same if you connect the RPi to a router instead of switch to communicate with the Zynq.

TCP server2 actively communicating with a client, and TCP Server 1 waiting, then it seems like TCP server 2 doesn’t drop his client.

It seems like either the server or the client is closing the connection because of inactivity. Try decreasing ipconfigTCP_KEEP_ALIVE_INTERVAL to a lower value, maybe 5 seconds or so.
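
For reference, a sketch of what that change could look like in FreeRTOSIPConfig.h. The value 5 is just the suggestion above, and the fragment assumes keep-alive probes are switched on through ipconfigTCP_KEEP_ALIVE:

```c
/* FreeRTOSIPConfig.h (fragment, values are examples only) */
#define ipconfigTCP_KEEP_ALIVE             1 /* Send keep-alive probes on idle TCP connections. */
#define ipconfigTCP_KEEP_ALIVE_INTERVAL    5 /* Interval between probes, in seconds. */
```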

@tony-josi-aws

I believe the Rpi has only one MAC address for its Ethernet interface.
The only thing I did was configure a second IP address on the Rpi Ethernet interface.

**Rpi Network Settings:**
sdog@rpi151:~/config $ ip addr show eth0
    eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether <REMOVED FOR PRIVACY> brd ff:ff:ff:ff:ff:ff
    inet 10.100.100.151/24 brd 10.100.100.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 10.111.111.151/24 brd 10.111.111.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::592:6ea6:70de:85c7/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

Not sure if there is another Linux command to show more info about the Rpi Network setting.
Do you have a recommendation?

---
**Zynq Network settings**
Here are the Zynq network settings (also seen in the main_sdog.c file I attached previously):

/* The MAC address array is not declared const as the MAC address will
   normally be read from an EEPROM and not hard coded (in real deployed
   applications).*/

static uint8_t ucMACAddress[ 6 ] = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };

/* Define the network addressing.  These parameters will be used if either
   ipconfigUSE_DHCP is 0 or if ipconfigUSE_DHCP is 1 but DHCP auto configuration
   failed. */

static const uint8_t ucIPAddress1[ 4 ] = { 10, 100, 100, 162 };
static const uint8_t ucGatewayAddress1[ 4 ] = { 10, 100, 100, 1 };

static const uint8_t ucIPAddress2[ 4 ] = { 10, 111, 111, 163 };
static const uint8_t ucGatewayAddress2[ 4 ] = { 10, 111, 111, 1 };

static const uint8_t ucNetMask[ 4 ] = { 255, 255, 255, 0 };

/* The following is the address of an OpenDNS server. */
static const uint8_t ucDNSServerAddress[ 4 ] = { 208, 67, 222, 222 };


/* Only one Ethernet Interface */
static NetworkInterface_t xInterfaces[ 1 ];

/* Two end-points (i.e. two ip addresses on that interface). */
static NetworkEndPoint_t xEndPoints[ 2 ];

Keepalive:
I set the ipconfigTCP_KEEP_ALIVE_INTERVAL to 1 sec.
I still see TCP Server 1 (the one with the “weaker” ip address) close the TCP connection after a short while (after 74 messages).
I’ll keep it running overnight to see how long the other TCP Server2 stays open.

Not sure if I’ll be able to connect my devices to router instead of a switch.
I’ll have to look around to see if I can find a spare router.

Again, you cannot have two MAC addresses on the same physical interface. At best you may be able to get it to work somehow in niche network setups, but for any given network infrastructure it is guesswork whether it works or not. See here: Changing the IP Address and MAC address during Runtime - Libraries - FreeRTOS Community Forums

Likewise, a single MAC address with multiple IP addresses on it is a grey zone; many routers and switches may not like that at all.

@svgarcia : What happens if you leave out your Zynq target altogether, but try to cross ping your RPis in the same network setup? Does that work?

Also, as a side note, there should really not be a meaningful use for multiple IP addresses on the same interface in the first place. It somehow contradicts the working model of IP addressing. What do you need that setup for?

@RAc

My setup is 1 interface, 1 MAC address, but multiple endpoints (i.e. IP addresses).

If I leave Zynq out of the equation, the Rpi devices have no problem communicating with each other. Ping or TCP server / client.

Also, as a side note, there should really not be a meaningful use for multiple IP addresses on the same interface in the first place. It somehow contradicts the working model of IP addressing. What do you need that setup for?

If that were true, then there would be no need for +TCP multi then, right? :wink:

In my particular application I do need to use multiple endpoints because the board with my Zynq only has one ethernet interface and I need to talk to two different IP address networks.
(in my example the 10.100.100.1 network and the 10.111.111.1 network).

You do not need two IP addresses for that use case; just define a dedicated route (i.e. a gateway) to the other subnet. I do not know how to do that in FreeRTOS+TCP, unfortunately, but that is the standard way to handle this scenario.

@svgarcia

I still see TCP Server 1 (the one with the “weaker” ip address) close the TCP connection after a short while (after 74 messages).

There are a couple of issues with your server code (main_sdog.c):

static void prvEchoClientRxTask( void *pvParameters )
{
    Socket_t xSocket;
    static char cRxedData[ BUFFER_SIZE ];

The buffer cRxedData is a static buffer, meaning that all the prvEchoClientRxTask tasks created by both servers are using the same buffer to receive the data, leading to buffer corruption. You need to either use a local buffer (should check if your task’s stack size allows that) or allocate a dynamic memory block specific to each task.
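
A small plain C sketch of the difference, outside of FreeRTOS. `shared_rx_buffer`, `private_rx_buffer` and `ConnectionContext_t` are hypothetical names, and the value of BUFFER_SIZE is an assumption. A function-local static array exists once for the whole program, while a per-connection context gives each task its own storage:

```c
#include <stddef.h>

#define BUFFER_SIZE    128 /* Assumed value, for illustration only. */

/* Shared: every caller of this function gets the same address, so two
 * tasks running the same task function would overwrite each other's data. */
char * shared_rx_buffer( void )
{
    static char cRxedData[ BUFFER_SIZE ];

    return cRxedData;
}

/* Private: one context object per connection or task. */
typedef struct
{
    char cRxedData[ BUFFER_SIZE ];
    size_t xLength;
} ConnectionContext_t;

char * private_rx_buffer( ConnectionContext_t * pxContext )
{
    return pxContext->cRxedData;
}
```

In a FreeRTOS task the context could live on the task's stack (if the stack is large enough) or be allocated per task, for example with pvPortMalloc, and freed when the connection closes.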

printf("Rcv'd: %.*s\n\n", (int)lBytesReceived, cRxedData);

I’m not sure what underlying output function your implementation uses for printf. If it’s using serial (for example, UART) output or similar without proper synchronization in place, it can lead to race conditions.

You can take a look at the thread-safe logging for debug messages that is used in the FreeRTOS demos:

Also check this sample TCP echo server demo to see how the sockets are configured (with TX RX timeouts).

@tony-josi-aws

Good catch on the static buffer for rcv.
Originally I just had one TCP server, and didn’t modify the code when I added the second TCP server.

My printf is going to the UART port.

I’ll take a look at the sample demo you linked for comparison and see how I can incorporate some of that into my application.

You do not need two IP addresses for that use case; just define a dedicated route (i.e. a gateway) to the other subnet. I do not know how to do that in FreeRTOS+TCP, unfortunately, but that is the standard way to handle this scenario.

Do you know if this suggestion from @RAc makes more sense vs having two IP addresses on one interface?

Right. In the IP world, an IP address represents a physical device, and subnets are used to group physical devices into logical clusters. “End points” (unfortunately, the term is frequently misused) are combinations of IP addresses and ports relative to the IP address, so to address an end point on a physical device, you need a combination of the device’s address and a port.

All of that belongs to a network infrastructure and cannot be looked at on a device level alone. For example, more and more routers support VLANs, which allow network administrators to organize logical subnets, e.g. route all packets to/from subnet 192.168.1.x through one port of a given router, and all packets within 192.168.2.x through another port. In such setups, your “violation” of the principle “one IP address = one end device” must comply with the VLAN configuration of the network in the field, or it will very simply not work. Another issue is port security; many routers will block a port if traffic to/from one end node with multiple IP addresses is detected.

I do not know why multiple IP addresses on the same interface are supported by FreeRTOS+TCP (or other network stacks, for that matter) in the first place. It does not interoperate well with the addressing philosophy of networking. For the above reasons, I would not use it, at least not for end devices.

Edit/additional thoughts:

If you look at the issue from a system wide point of view, then the question is “who organizes/structures your network?”

Network administrators look at their networks as a (potentially very large) set of devices. It is their job to structure their network such that it is easily maintainable, scalable, secure, efficient etc. Thus, the structure (partitioning/segmenting/routing etc) of a network largely reflects their view on things, and third party components must fit into that view.

You will get very funny looks from network admins if you try to interfere with their structuring of the network, for example by trying to tie one end device to two subnets at the same time. The mildest reaction you can expect is “we must reserve the right to restructure the network at any time without any prior notice, and that must be transparent, so your end device may get a different IP address from us and must work just like before. So what you want may work now with some tweaking, but we can not reconfigure your end devices when we do restructure the network.” In practice, that is not always possible, but the more special needs you demand from the admins in whose nets your targets are supposed to run, the harder it will be for your company to sell your systems.

@tony-josi-aws

I went and incorporated the code from the example you pointed me to.
Any differences from the original example can be found by searching for term “sdog”.
The only thing I did differently was use xTaskCreate instead of xTaskCreateStatic.

When I have both TCP servers running on the Zynq, very soon afterwards
(anywhere from 30sec up to 400sec or more), one of the TCP servers will end the connection.

When I go back to the client (Rpi Terminal), it may take a few seconds before I am able to manually restart the client and have the client reconnect to the TCP server that kicked it off.
(During that interim I can’t reconnect the client or even ping the Zynq on that IP address that the TCP server was using).

Sometimes TCP server 1 will “fail”, and sometimes TCP server 2 will fail.
Although it seems like TCP server 1 fails more often than server 2.
(Server 1 is the endpoint I added first to the interface).

When this failure happens I see this on the Server side:
vTCPStateChange: Closing (Queued 0, Accept 0 Reuse 0)
vTCPStateChange: me 0x40a3e0 parent 0x40a3e0 peer 0x0 clear 0
vTCPStateChange: xHasCleared = 0

On the client side I see this:
Traceback (most recent call last):
  File "/home/sdog/dev/tcp/tcp_py/tcp_client.py", line 66, in <module>
    data = sock.recv(BUF_SIZE)
           ^^^^^^^^^^^^^^^^^^^
ConnectionResetError: [Errno 104] Connection reset by peer

Recall that this is when Client 1 and Client 2 are on separate machines.

Issue regarding using One Rpi for all clients VS Each TCP Client on separate machines:
If I use the same machine for Client 1 and Client 2,
then the situation is even worse.
For example I have a Raspberry Pi called Rpi_151 machine which has two IP addresses
(10.100.100.151 and 10.111.111.151)

If I use Rpi_151 to ping the Zynq at 10.100.100.162 replies come back.
However, in a separate terminal window when I use Rpi_151 to ping the Zynq at 10.111.111.163,
then I stop seeing ping replies from 10.100.100.162 on the first terminal window (or I see many ping replies getting missed).

Similar behavior happens when TCP Client 1 and TCP Client 2 are run on the Rpi_151 machine.
Hence why I was using a separate Rpi for each client.

Note that if I only initialize the Zynq endpoints, but don’t start any TCP servers,
then one Rpi has no problem pinging the Zynq from both IP addresses at the same time.

I have included my new main.c and FreeRTOSIPConfig.h files here:
main_sdog2.c (11.5 KB)
freeRTOSIPConfig_sdog2.h (18.0 KB)

@svgarcia

static const uint8_t ucIPAddress1[ 4 ] = { 10, 100, 100, 162 };
static const uint8_t ucGatewayAddress1[ 4 ] = { 10, 100, 100, 1 };
static const uint8_t ucIPAddress2[ 4 ] = { 10, 111, 111, 163 };
static const uint8_t ucGatewayAddress2[ 4 ] = { 10, 111, 111, 1 };

What are these gateway devices, ucGatewayAddress1 and ucGatewayAddress2? Are those routers?

I suppose your network setup is not able to work correctly with more than one IP address using the same MAC address, as suggested by @RAc.

Could you try checking the behavior with a router instead of a switch?
Also add debug logs before the following break statements:

                if( lSent < 0 )
                {
                    /* Socket closed? */
                    break;
                }
            }
            else
            {
                /* Socket closed? */
                break;
            }

to log the error code returned by the socket APIs.

@tony-josi-aws
TCP Server 1 connection will be to 10.100.100.x
and TCP Server 2 connection will be to 10.111.111.x
Hence why I made the GatewayAddresses 10.100.100.1 and 10.111.111.1 respectively.

Should I have used a different value for the Gateway?

There is no router in my lab setup. It is just two Raspberry Pis and the Zynq all connected by an unmanaged ethernet switch (D-link DGS-2205).

I’ll add some logging before the breaks and report back.

@tony-josi-aws
For debugging are you just asking for values of
lSent, lTotalSent, and lBytes?

Or what exact debug code do you recommend putting in these locations to capture error codes?

@svgarcia

Looking at your application code, there is a chance that your TCP server tasks are created and FreeRTOS+TCP APIs are called before the TCP/IP stack is ready.

Take a look at how tasks are created after the TCP/IP stack is ready in the same demo I shared in my last post. It shows how vApplicationIPNetworkEventHook_Multi can be used to create the application tasks only once the TCP/IP stack is ready.
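
The idea can be sketched in plain C as a one-shot guard. This is not the demo's code: `NetworkEvent_t`, `AppStartup_t` and `on_network_event` are hypothetical stand-ins for the stack's event type and for the body of vApplicationIPNetworkEventHook_Multi. The assumption is that the hook can fire more than once (for example once per endpoint, or again after the link comes back), so the tasks should only be created on the first network-up event:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the stack's network event type (illustrative names only). */
typedef enum
{
    NETWORK_UP,
    NETWORK_DOWN
} NetworkEvent_t;

typedef void ( * StartTasksFn_t )( void );

typedef struct
{
    bool tasks_created;   /* Set once the server tasks have been started. */
    StartTasksFn_t start; /* Creates the server tasks; may be NULL in tests. */
} AppStartup_t;

/* Returns true only on the call that actually started the tasks. */
bool on_network_event( AppStartup_t * app,
                       NetworkEvent_t event )
{
    if( ( event != NETWORK_UP ) || app->tasks_created )
    {
        return false; /* Network not up yet, or tasks already running. */
    }

    if( app->start != NULL )
    {
        app->start();
    }

    app->tasks_created = true;
    return true;
}
```

In a real application the `start` callback would be where the two server tasks are created, so no socket API is called before the stack reports that the network is up.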

Should I have used a different value for the Gateway?

There is no router in my lab setup. It is just two Raspberry Pis and the Zynq all connected by an unmanaged ethernet switch (D-link DGS-2205).

Based on your setup information, I believe you don’t have a Layer 3 device that could perform routing across subnets.
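
A quick way to see which hosts can reach each other directly is to compare the network part of their addresses. The sketch below is plain C with hypothetical helper names (`make_ipv4`, `same_subnet`); the addresses and the 255.255.255.0 netmask in the note after it come from the settings posted earlier in this thread:

```c
#include <stdbool.h>
#include <stdint.h>

/* Packs four octets into one 32-bit value, most significant octet first. */
uint32_t make_ipv4( uint8_t a,
                    uint8_t b,
                    uint8_t c,
                    uint8_t d )
{
    return ( ( uint32_t ) a << 24 ) |
           ( ( uint32_t ) b << 16 ) |
           ( ( uint32_t ) c << 8 ) |
           ( uint32_t ) d;
}

/* Two hosts are on the same subnet when their masked addresses match. */
bool same_subnet( uint32_t ip_a,
                  uint32_t ip_b,
                  uint32_t netmask )
{
    return ( ip_a & netmask ) == ( ip_b & netmask );
}
```

With netmask 255.255.255.0, 10.100.100.162 and 10.100.100.151 share a subnet, while 10.100.100.162 and 10.111.111.151 do not. Traffic between the two subnets would need a router, which this setup does not have.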

For debugging are you just asking for values of
lSent, lTotalSent, and lBytes?

Yes, log all those values before each break.
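
One possible shape for that debug output, as a plain C sketch. `format_socket_exit` is a hypothetical helper, and the variable names are taken from the echo loop quoted earlier. It only formats the line, so the result can be passed to whatever logging function the application uses:

```c
#include <stddef.h>
#include <stdio.h>

/* Builds one log line describing the state of the echo loop just before
 * it exits. Returns the number of characters snprintf wanted to write. */
int format_socket_exit( char * out,
                        size_t out_len,
                        const char * where,
                        long lBytes,
                        long lSent,
                        long lTotalSent )
{
    /* A negative value from the socket calls is treated as an error code. */
    const char * suffix = ( ( lBytes < 0 ) || ( lSent < 0 ) ) ? " [error]" : "";

    return snprintf( out, out_len,
                     "%s: lBytes=%ld lSent=%ld lTotalSent=%ld%s",
                     where, lBytes, lSent, lTotalSent, suffix );
}
```

For example, calling `format_socket_exit( buf, sizeof( buf ), "send", lBytes, lSent, lTotalSent )` just before each break and printing `buf` would show whether the loop ended on a negative return value from the socket API.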