Speed up FreeRTOS+TCP on Zynq

Hi!
Can someone assist me in speeding up the TCP/IP performance on a Zynq using the FreeRTOS+TCP library?
Using the FreeRTOS+TCP implementation of iperf3 I am measuring around 50Mbit/s, while Windows claims that a 1Gb/s link is established.
I guess there’s multiple things to consider here, but I don’t really know where to start and/or which code parts to show.
Help is much appreciated.

I find the biggest performance limitation to often be design-based. How are you using the FreeRTOS APIs? If you’re using them from a single task, what is the priority of the task in relation to other tasks? How is this task communicating with other tasks?

Have you built your solution off of an example? If so, what example?

Adding on to what Kody suggested, did you check what the maximum speed supported by your PHY, cable, router, etc. is?

Windows claims that a 1Gb/s link is established.

Are you directly connecting your device to Windows? If so, did you compare the speed results when connected via a router to make sure the PHY is negotiated to the best configuration?

If you have enough RAM, try increasing the sizes of the buffers used by IPERF. You can do that by tweaking these configs in the iperf_config.h:

#define ipconfigIPERF_TX_BUFSIZE				( 18 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE				( 12 )
#define ipconfigIPERF_RX_BUFSIZE				( 18 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE				( 12 )

#define ipconfigIPERF_RECV_BUFFER_SIZE			( 12 * ipconfigTCP_MSS )
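
To judge whether there is enough RAM before raising these values, the requested byte counts can be worked out by hand. The sketch below is plain C, not FreeRTOS code. `iperf_buffer_bytes` is a hypothetical helper, and it assumes an MSS of 1460 bytes (a 1500-byte MTU minus 40 bytes of IPv4 and TCP headers) and a single active TCP connection. The real footprint also depends on how the stack allocates its stream buffers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: MSS for a 1500-byte MTU with 20-byte IPv4 and TCP headers. */
#define EXAMPLE_TCP_MSS    1460u

/* Hypothetical helper: bytes requested for the TX stream buffer, the RX
 * stream buffer and the iperf receive buffer, each given in units of MSS. */
size_t iperf_buffer_bytes( unsigned tx_mss, unsigned rx_mss, unsigned recv_mss )
{
    assert( ( tx_mss > 0u ) && ( rx_mss > 0u ) && ( recv_mss > 0u ) );
    return ( size_t ) ( tx_mss + rx_mss + recv_mss ) * EXAMPLE_TCP_MSS;
}
```

With the values above, `iperf_buffer_bytes( 18, 18, 12 )` comes to 70080 bytes, roughly 68 KiB for one connection.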

Does the 1Gb/s link reported by Windows mean that the PHY is 1Gb/s? If so, that is the maximum speed at which the PHY can operate. It doesn’t directly translate to the actual throughput between the devices, because a lot of processing is involved in directing the packet content to the correct socket, which is then read by an application.

Thanks for the answers so far!

My setup is as follows:

  • xc7z014s on a Trenz Electronic TE0720 SoM, SoM on a custom carrier but I can go to a Trenz TE0701 if necessary
  • The PHY on the SoM is a Marvell Alaska 88E1512 capable of 1Gb/s
  • direct Ethernet connection to a Windows PC via a ~1m CAT7 cable
  • the PC’s NIC is a Realtek PCIe GbE controller type
  • FreeRTOS-Kernel and FreeRTOS-Plus-TCP are the latest version
  • iperf is from a post by @htibosch, sorry I can’t post the link as I am a “new user”
  • iperf window sizes are as suggested by @tony-josi-aws
  • There is only one task that initializes the interface via the pxZynq_FillInterfaceDescriptor, FreeRTOS_FillEndPoint and FreeRTOS_IPInit_Multi() functions. After that vIPerfInstall() is called and the init task is suspended.
  • I didn’t change the priority of the iperf task so it is MAX - 3 (i.e. 7)
  • ipconfigIP_TASK_PRIORITY is unchanged, i.e., MAX - 2
  • configMAC_INPUT_TASK_PRIORITY is unchanged, i.e., MAX - 1
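
Written out as a config fragment, the priority ordering from the list looks roughly like this. It is a sketch, not the actual project files: `configMAX_PRIORITIES` is assumed to be 10 (inferred from "MAX - 3 (i.e. 7)"), and in a real project these macros live in different headers. The compile-time checks encode the commonly recommended ordering, with the EMAC handler above the IP task and the IP task above the tasks that use the stack.

```c
#include <assert.h> /* static_assert (C11) */

/* Assumption: 10 priority levels, inferred from "MAX - 3 (i.e. 7)". */
#define configMAX_PRIORITIES                 ( 10 )

#define configMAC_INPUT_TASK_PRIORITY        ( configMAX_PRIORITIES - 1 )  /* EMAC handler: 9 */
#define ipconfigIP_TASK_PRIORITY             ( configMAX_PRIORITIES - 2 )  /* IP task: 8 */
#define ipconfigIPERF_PRIORITY_IPERF_TASK    ( configMAX_PRIORITIES - 3 )  /* iperf task: 7 */

/* Compile-time check of the ordering: EMAC above IP above the application. */
static_assert( configMAC_INPUT_TASK_PRIORITY > ipconfigIP_TASK_PRIORITY,
               "expected the EMAC task to outrank the IP task" );
static_assert( ipconfigIP_TASK_PRIORITY > ipconfigIPERF_PRIORITY_IPERF_TASK,
               "expected the IP task to outrank the iperf task" );
```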

With this I am still measuring around 50Mbit/s.

@tony-josi-aws , I’m not sure what you mean by “If so, did you compare the speed results when connected via a router to make sure the PHY is negotiated to the best configuration?”.
No, I only ever tried it with a direct connection. How would I use a router to make sure the negotiation is correct? As far as I know, the negotiated configuration is reflected in the speed rating Windows gives in the Status dialog of the Ethernet interface, which says 1Gb/s.

Does no one have an idea how to improve this?

@LinkwitzRiley

Ideally there shouldn’t be a difference if you are using a router in between or not, but if you are using it, the link from your PC to the router should be always up and you can make sure that the PC side is communicating with the best possible negotiated settings (ruling out the possibility that there is something going wrong with the negotiation between the PC and your device).

What speed do you observe with UDP traffic in IPERF?
When IPERF is run, do you have any other tasks that are using the CPU time?
Is your code compiled with optimization enabled and debug off?

Is it possible to share your FreeRTOSIPConfig.h, specifically ipconfigTCP_MSS?

Thanks, @tony-josi-aws !

I don’t have a router available to test in the way you are suggesting but I just tested the Ethernet on the PC side with another PC.
Running iperf3 (TCP) I measured 926Mbit/s, so I’m quite sure the PC side of things is okay.

I am afraid the iperf implementation I am using doesn’t support UDP or it doesn’t work - I have to look into that and come back to that.

No, as I’ve pointed out in my “setup” there are no other tasks running and the order of task priority is emac, ip, iperf from highest to lowest priority.

The code is compiled with -O3, I am not debugging.

In FreeRTOSIPConfig.h ipconfigTCP_MSS is defined as
#define ipconfigTCP_MSS ( ipconfigNETWORK_MTU - ( ipSIZE_OF_IPv4_HEADER + ipSIZE_OF_TCP_HEADER ) )
which evaluates to 1460 with MTU being 1500 and both the header sizes being 20.
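
The arithmetic behind that macro can be checked in isolation. This is a minimal sketch in plain C: the two `EXAMPLE_` constants are local stand-ins for the FreeRTOS+TCP header-size constants, `tcp_mss_for_mtu` is a hypothetical helper, and IP or TCP options are assumed to be absent.

```c
#include <assert.h>

/* Local stand-ins for the FreeRTOS+TCP header-size constants. */
#define EXAMPLE_SIZE_OF_IPv4_HEADER    20u
#define EXAMPLE_SIZE_OF_TCP_HEADER     20u

/* MSS = MTU minus the IPv4 and TCP headers (no options assumed). */
unsigned tcp_mss_for_mtu( unsigned mtu )
{
    assert( mtu > ( EXAMPLE_SIZE_OF_IPv4_HEADER + EXAMPLE_SIZE_OF_TCP_HEADER ) );
    return mtu - ( EXAMPLE_SIZE_OF_IPv4_HEADER + EXAMPLE_SIZE_OF_TCP_HEADER );
}
```

`tcp_mss_for_mtu( 1500 )` returns 1460.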

@LinkwitzRiley

Can you try running this implementation of IPERF by @htibosch : freertos_plus_projects/plus/Common/Utilities/iperf_task_v3_0f.c at master · htibosch/freertos_plus_projects · GitHub

It supports both UDP and TCP.

@tony-josi-aws
I was using his implementation v3_0d.
However I tried with v3_0f and it still doesn’t work.
I am using iperf-3.1.3 on Win 11.
There is only one UDP packet sent and then the measurement just freezes.

Apart from that, why do you expect a significantly different result with UDP than with TCP?
(Btw. the result with TCP using v3_0f was pretty much the same as with v3_0d).

There is only one UDP packet sent and then the measurement just freezes.

What’s the IPERF command you use on the PC side when running for UDP?
Can you check what happens with this command?
iperf3.exe -c <server-ip> --port 5001 -u --bandwidth 0 -V

why do you expect a significantly different result with UDP than with TCP?

It’s common to observe higher throughput for UDP compared to TCP because of protocol overhead, retransmissions, etc., in TCP.

Comparing that will help to identify if the slow throughput is caused by suboptimal TCP settings or the network link itself.
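
To put a number on what the link itself can deliver, the payload ceiling of a gigabit link can be estimated from the per-frame overhead. The sketch below assumes full-size untagged Ethernet frames (8 bytes preamble and SFD, 14 bytes header, 4 bytes FCS, 12 bytes inter-frame gap) and no IP or TCP options. `max_goodput_mbps` is a hypothetical helper, not part of iperf or FreeRTOS+TCP.

```c
#include <assert.h>

/* Per-frame bytes on the wire besides the IP packet:
 * preamble + SFD 8, Ethernet header 14, FCS 4, inter-frame gap 12. */
#define ETH_OVERHEAD_BYTES    38u

/* Upper bound on payload throughput for full-size frames.
 * l3l4_header_bytes is 40 for IPv4 + TCP, 28 for IPv4 + UDP. */
double max_goodput_mbps( double link_mbps, unsigned mtu, unsigned l3l4_header_bytes )
{
    assert( mtu > l3l4_header_bytes );

    double payload = ( double ) ( mtu - l3l4_header_bytes );
    double on_wire = ( double ) ( mtu + ETH_OVERHEAD_BYTES );

    return link_mbps * payload / on_wire;
}
```

For a 1000 Mbit/s link and a 1500-byte MTU this gives about 949 Mbit/s for TCP and about 957 Mbit/s for UDP, so a measurement far below those figures points at the end hosts rather than the wire.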

Can we see a Wireshark trace of the traffic? @aggarg, can you enable file uploads for @LinkwitzRiley?

@tony-josi-aws
I used iperf3.exe -c 192.168.168.212 --port 5001 --udp --bandwidth 50M.
With iperf3.exe -c 192.168.168.212 --port 5001 -u --bandwidth 0 -V basically the same happens:
I get

iperf 3.1.3
CYGWIN_NT-10.0 WORKSTATION12 2.5.1(0.297/5/3) 2016-04-21 22:14 x86_64
Time: Fri, 20 Dec 2024 10:25:20 GMT
Connecting to host 192.168.168.212, port 5001
      Cookie: WORKSTATION12.1734690320.600992.43e6

then nothing happens until I hit Ctrl+C (after several seconds) and I get

- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
CPU Utilization: local/sender 0.0% (0.0%u/0.0%s), remote/receiver 0.0% (0.0%u/0.0%s)
iperf3: interrupt - the client has terminated

@RAc
I attached a zip of a pcapng. This is all that happens until I hit Ctrl+C.
zynq_iperf3.zip (1.1 KB)

I do not understand too much about iPerf, but the payload packets appear to be trashed. UDP packets with lengths of 4 and 8 look incomplete, and the only packet in the TCP stream that appears to make sense is packet 11, but it says something about udp in the payload.

Other than that, the packet flow looks ok (no spurious retransmissions, no shrinking window sizes etc).

The UDP issue may be caused by the peer’s firewall settings.

@RAc
iperf3 -u works fine with e.g. another PC as the iperf server, pretty sure this is caused by the Zynq side of things.

Can anyone comment on the initial problem?
IP traffic is very slow in any case, not only when using iperf, although that is probably another topic on top.

@LinkwitzRiley

Can you share the FreeRTOSIPConfig.h?
I will try running IPERF on a similar platform and share my observations.

Sure!
FreeRTOSIPConfig.h (21.4 KB)

Anyone have an idea what’s wrong with my FreeRTOS+TCP or how to debug?

@LinkwitzRiley

I tried running IPERF on Zynq with version V4.2.2 of FreeRTOS+TCP and the following IPERF settings:

#if BUILD_IPERF3

#define ipconfigIPERF_PRIORITY_IPERF_TASK               6
#define ipconfigIPERF_DOES_ECHO_UDP                     1
#define ipconfigIPERF_VERSION                           3
#define ipconfigIPERF_STACK_SIZE_IPERF_TASK             680
#define ipconfigIPERF_TX_BUFSIZE                        ( 32 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE                        ( 32 )
#define ipconfigIPERF_RX_BUFSIZE                        ( 32 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE                        ( 32 )
/* The iperf module declares a character buffer to store its send data. */
#define ipconfigIPERF_RECV_BUFFER_SIZE                  ( 32 * ipconfigTCP_MSS )

#endif
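
One reason the window settings matter: a TCP sender can have at most one window of unacknowledged data in flight, so throughput is bounded by window size divided by round-trip time. The sketch below illustrates that bound. It assumes the WINSIZE values count segments of one MSS each, and the round-trip time passed in is purely hypothetical since nobody in this thread measured it. `window_limited_mbps` is an illustrative helper, not a FreeRTOS+TCP function.

```c
#include <assert.h>

/* Upper bound on TCP throughput when at most window_segments segments of
 * mss_bytes each may be unacknowledged during one round trip. */
double window_limited_mbps( unsigned window_segments, unsigned mss_bytes, double rtt_seconds )
{
    assert( rtt_seconds > 0.0 );

    double window_bits = ( double ) window_segments * ( double ) mss_bytes * 8.0;

    return window_bits / rtt_seconds / 1e6;
}
```

With an MSS of 1460 and an assumed 1 ms round trip, a window of 12 segments caps throughput at about 140 Mbit/s and a window of 32 segments at about 374 Mbit/s. The real round-trip time on a direct link may be quite different, so treat these as an illustration of the relationship only.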

and these were the results:

UDP:

C:\IPERF\iperf3.1.3_64>iperf3.exe -c 192.168.35.253 --port 5001 --udp --bandwidth 0 --set-mss 1460 --bytes 2G
Connecting to host 192.168.35.253, port 5001
[  4] local 192.168.35.13 port 63699 connected to 192.168.35.253 port 5001
[ ID] Interval           Transfer     Bandwidth       Total Datagrams
[  4]   0.00-1.00   sec   114 MBytes   952 Mbits/sec  14540
[  4]   1.00-2.00   sec   113 MBytes   951 Mbits/sec  14500
[  4]   2.00-3.00   sec   113 MBytes   951 Mbits/sec  14510
[  4]   3.00-4.00   sec   113 MBytes   950 Mbits/sec  14500
[  4]   4.00-5.00   sec   113 MBytes   951 Mbits/sec  14500
[  4]   5.00-6.00   sec   113 MBytes   951 Mbits/sec  14510
[  4]   6.00-7.00   sec   113 MBytes   951 Mbits/sec  14510
[  4]   7.00-8.00   sec   113 MBytes   951 Mbits/sec  14500
[  4]   8.00-9.00   sec   113 MBytes   949 Mbits/sec  14490
[  4]   9.00-10.00  sec   113 MBytes   950 Mbits/sec  14490
[  4]  10.00-11.00  sec   113 MBytes   948 Mbits/sec  14470
[  4]  11.00-12.00  sec   113 MBytes   951 Mbits/sec  14510
[  4]  12.00-13.00  sec   113 MBytes   951 Mbits/sec  14510
[  4]  13.00-14.00  sec   113 MBytes   949 Mbits/sec  14480
[  4]  14.00-15.00  sec   113 MBytes   951 Mbits/sec  14510
[  4]  15.00-16.00  sec   112 MBytes   938 Mbits/sec  14320
[  4]  16.00-17.00  sec   113 MBytes   951 Mbits/sec  14500
[  4]  17.00-18.00  sec   113 MBytes   951 Mbits/sec  14510
[  4]  18.00-18.09  sec  10.1 MBytes   950 Mbits/sec  1290
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  4]   0.00-18.09  sec  2.00 GBytes   950 Mbits/sec  0.000 ms  0/0 (0%)
[  4] Sent 0 datagrams

iperf Done.

TCP:

C:\IPERF\iperf3.1.3_64>iperf3.exe -c 192.168.35.253  --port 5001 --bytes 1G -V
iperf 3.1.3
CYGWIN_NT-10.0 lab-win-13 2.5.1(0.297/5/3) 2016-04-21 22:14 x86_64
Time: Fri, 17 Jan 2025 05:05:19 GMT
Connecting to host 192.168.35.253, port 5001
      Cookie: lab-win-13.1737090319.874042.58ac7da
      TCP MSS: 0 (default)
[  4] local 192.168.35.13 port 52970 connected to 192.168.35.253 port 5001
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 1073741824 bytes to send
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  39.9 MBytes   334 Mbits/sec
[  4]   1.00-2.00   sec  41.8 MBytes   350 Mbits/sec
[  4]   2.00-3.00   sec  42.5 MBytes   357 Mbits/sec
[  4]   3.00-4.00   sec  42.4 MBytes   355 Mbits/sec
[  4]   4.00-5.00   sec  41.9 MBytes   351 Mbits/sec
[  4]   5.00-6.00   sec  42.8 MBytes   359 Mbits/sec
[  4]   6.00-7.00   sec  35.8 MBytes   299 Mbits/sec
[  4]   7.00-8.00   sec  35.1 MBytes   296 Mbits/sec
[  4]   8.00-9.00   sec  35.5 MBytes   297 Mbits/sec
[  4]   9.00-10.00  sec  35.1 MBytes   295 Mbits/sec
[  4]  10.00-11.00  sec  35.0 MBytes   293 Mbits/sec
[  4]  11.00-12.00  sec  34.4 MBytes   289 Mbits/sec
[  4]  12.00-13.00  sec  34.9 MBytes   293 Mbits/sec
[  4]  13.00-14.00  sec  28.9 MBytes   242 Mbits/sec
[  4]  14.00-15.00  sec  35.9 MBytes   301 Mbits/sec
[  4]  15.00-16.00  sec  42.8 MBytes   359 Mbits/sec
[  4]  16.00-17.00  sec  42.0 MBytes   351 Mbits/sec
[  4]  17.00-18.00  sec  42.6 MBytes   358 Mbits/sec
[  4]  18.00-19.00  sec  42.4 MBytes   356 Mbits/sec
[  4]  19.00-20.00  sec  41.9 MBytes   352 Mbits/sec
[  4]  20.00-21.00  sec  41.8 MBytes   350 Mbits/sec
[  4]  21.00-22.00  sec  42.6 MBytes   358 Mbits/sec
[  4]  22.00-23.00  sec  42.5 MBytes   357 Mbits/sec
[  4]  23.00-24.00  sec  42.2 MBytes   354 Mbits/sec
[  4]  24.00-25.00  sec  42.6 MBytes   357 Mbits/sec
[  4]  25.00-25.93  sec  39.0 MBytes   354 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-25.93  sec  1.00 GBytes   331 Mbits/sec                  sender
[  4]   0.00-25.93  sec  1024 MBytes   331 Mbits/sec                  receiver
CPU Utilization: local/sender 0.5% (0.1%u/0.4%s), remote/receiver 0.0% (0.0%u/0.0%s)

iperf Done.
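
The summary figures can be cross-checked from the byte counts and durations in the logs. This is a small sketch in plain C with a hypothetical helper name. It assumes that iperf3 counts 1G as 2^30 bytes (consistent with the "1073741824 bytes to send" line above) and that Mbits/sec means 10^6 bits per second.

```c
#include <assert.h>

/* Average throughput in Mbit/s (10^6 bits per second) for a transfer. */
double average_mbps( double bytes, double seconds )
{
    assert( seconds > 0.0 );
    return bytes * 8.0 / seconds / 1e6;
}
```

`average_mbps( 1073741824.0, 25.93 )` is about 331.3, matching the TCP summary, and `average_mbps( 2147483648.0, 18.09 )` is about 949.7, matching the 950 Mbits/sec reported for UDP.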