Hi!
Can someone assist me in speeding up the TCP/IP performance on a Zynq using FreeRTOS+TCP library?
Using the FreeRTOS+TCP implementation of iperf3 I am measuring around 50Mbit/s, while Windows claims that a 1Gb/s link is established.
I guess there are multiple things to consider here, but I don't really know where to start or which code parts to show.
Help is much appreciated.
I find the biggest performance limitation to often be design-based. How are you using the FreeRTOS APIs? If you’re using them from a single task, what is the priority of the task in relation to other tasks? How is this task communicating with other tasks?
Have you built your solution off of an example? If so, what example?
Adding on to what Kody suggested, did you check what's the maximum speed supported by your PHY and the cable, router, etc.?
Windows claims that a 1Gb/s link is established.
Are you directly connecting your device to Windows? If so, did you compare the speed results when connected via a router to make sure the PHY is negotiated to the best configuration?
If you have enough RAM, try increasing the sizes of the buffers used by iperf. You can do that by tweaking these configs in the iperf_config.h:
#define ipconfigIPERF_TX_BUFSIZE ( 18 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE ( 12 )
#define ipconfigIPERF_RX_BUFSIZE ( 18 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE ( 12 )
#define ipconfigIPERF_RECV_BUFFER_SIZE ( 12 * ipconfigTCP_MSS )
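For reference, the iperf task applies these values to its socket via the FREERTOS_SO_WIN_PROPERTIES socket option. A minimal sketch of that pattern (not the exact code from the iperf source; prvApplyWindowProperties is just a placeholder name):

#include <string.h>
#include "FreeRTOS.h"
#include "FreeRTOS_Sockets.h"
#include "iperf_config.h"

/* Apply the iperf buffer/window settings to a TCP socket before use. */
static void prvApplyWindowProperties( Socket_t xSocket )
{
    WinProperties_t xWinProps;

    memset( &xWinProps, 0, sizeof( xWinProps ) );
    xWinProps.lTxBufSize = ipconfigIPERF_TX_BUFSIZE; /* TX stream buffer size in bytes. */
    xWinProps.lTxWinSize = ipconfigIPERF_TX_WINSIZE; /* TX window size in units of MSS. */
    xWinProps.lRxBufSize = ipconfigIPERF_RX_BUFSIZE; /* RX stream buffer size in bytes. */
    xWinProps.lRxWinSize = ipconfigIPERF_RX_WINSIZE; /* RX window size in units of MSS. */

    FreeRTOS_setsockopt( xSocket, 0, FREERTOS_SO_WIN_PROPERTIES,
                         ( void * ) &xWinProps, sizeof( xWinProps ) );
}

Larger buffers and windows let TCP keep more data in flight, which is usually what limits throughput on a Gigabit link.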
Does this mean that the PHY link is 1Gb/s? If so, that is only the maximum possible speed at which the PHY can operate. It doesn't directly translate into the actual speed at which the devices can talk to each other, as there is a lot of processing involved in directing the packet contents to the correct socket, which is then read by an application.
Thanks for the answers so far!
My setup is as follows:
- xc7z014s on a Trenz Electronic TE0720 SoM, SoM on a custom carrier but I can go to a Trenz TE0701 if necessary
- The PHY on the SoM is a Marvell Alaska 88E1512 capable of 1Gb/s
- direct Ethernet connection to a Windows PC via a ~1m CAT7 cable
- the PC's NIC is a Realtek PCIe GbE controller
- FreeRTOS-Kernel and FreeRTOS-Plus-TCP are the latest versions
- iperf is from a post by @htibosch, sorry I can’t post the link as I am a “new user”
- iperf window sizes are as suggested by @tony-josi-aws
- There is only one task, which initializes the interface via the pxZynq_FillInterfaceDescriptor, FreeRTOS_FillEndPoint and FreeRTOS_IPInit_Multi() functions. After that vIPerfInstall() is called and the init task is suspended (see the sketch below this list).
- I didn't change the priority of the iperf task, so it is MAX - 3 (i.e. 7)
- ipconfigIP_TASK_PRIORITY is unchanged, i.e., MAX - 2
- configMAC_INPUT_TASK_PRIORITY is unchanged, i.e., MAX - 1
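For completeness, the init task does roughly the following (a minimal sketch; the MAC address, variable names and prvInitTask are placeholders, not my real configuration):

#include "FreeRTOS.h"
#include "task.h"
#include "FreeRTOS_IP.h"
#include "FreeRTOS_Routing.h"

/* Declared by the Zynq NetworkInterface driver. */
extern NetworkInterface_t * pxZynq_FillInterfaceDescriptor( BaseType_t xEMACIndex,
                                                            NetworkInterface_t * pxInterface );
/* Declared by the iperf module. */
extern void vIPerfInstall( void );

static NetworkInterface_t xInterface;
static NetworkEndPoint_t xEndPoint;

static const uint8_t ucMACAddress[ 6 ]       = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };
static const uint8_t ucIPAddress[ 4 ]        = { 192, 168, 168, 212 };
static const uint8_t ucNetMask[ 4 ]          = { 255, 255, 255, 0 };
static const uint8_t ucGatewayAddress[ 4 ]   = { 192, 168, 168, 1 };
static const uint8_t ucDNSServerAddress[ 4 ] = { 192, 168, 168, 1 };

static void prvInitTask( void * pvParameters )
{
    ( void ) pvParameters;

    /* Describe the first (and only) EMAC of the Zynq. */
    pxZynq_FillInterfaceDescriptor( 0, &xInterface );

    /* Attach a static IPv4 end-point to that interface. */
    FreeRTOS_FillEndPoint( &xInterface, &xEndPoint, ucIPAddress, ucNetMask,
                           ucGatewayAddress, ucDNSServerAddress, ucMACAddress );

    /* Start the IP task with the interface/end-point just filled in. */
    FreeRTOS_IPInit_Multi();

    /* Create the iperf server task, then park this task. */
    vIPerfInstall();
    vTaskSuspend( NULL );
}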
With this I am still measuring around 50Mbit/s.
@tony-josi-aws, I'm not sure what you mean by "If so, did you compare the speed results when connected via a router to make sure the PHY is negotiated to the best configuration?".
No, I have only ever tried it with a direct connection. How would I use a router to make sure the negotiation is correct? As far as I know, the negotiated configuration is reflected in the speed rating Windows shows in the Status dialog of the Ethernet interface, which says 1Gb/s.
Does no one have an idea how to improve this?
Ideally there shouldn't be a difference whether a router is in between or not. But with a router, the link from your PC to the router is always up, so you can make sure the PC side is communicating with the best possible negotiated settings (ruling out the possibility that something is going wrong with the negotiation between the PC and your device).
What speed do you observe with UDP traffic in IPERF?
When IPERF is run, do you have any other tasks that are using the CPU time?
Is your code compiled with optimization enabled and debug off?
Is it possible to share your FreeRTOSIPConfig.h, specifically ipconfigTCP_MSS?
Thanks, @tony-josi-aws !
I don’t have a router available to test in the way you are suggesting but I just tested the Ethernet on the PC side with another PC.
Running iperf3 (TCP) I measured 926Mbit/s, so I’m quite sure the PC side of things is okay.
I am afraid the iperf implementation I am using either doesn't support UDP or it doesn't work - I have to look into that and come back to it.
No, as I've pointed out in my "setup", there are no other tasks running, and the order of task priorities is EMAC, IP, iperf from highest to lowest.
The code is compiled with -O3, I am not debugging.
In FreeRTOSIPConfig.h, ipconfigTCP_MSS is defined as
#define ipconfigTCP_MSS ( ipconfigNETWORK_MTU - ( ipSIZE_OF_IPv4_HEADER + ipSIZE_OF_TCP_HEADER ) )
which evaluates to 1420 with MTU being 1500 and both the header sizes being 20.
Can you try running this implementation of IPERF by @htibosch : freertos_plus_projects/plus/Common/Utilities/iperf_task_v3_0f.c at master · htibosch/freertos_plus_projects · GitHub
It supports both UDP and TCP.
@tony-josi-aws
I was using his implementation v3_0d.
However I tried with v3_0f and it still doesn’t work.
I am using iperf-3.1.3 on Win 11.
There is only one UDP packet sent and then the measurement just freezes.
Apart from that, why do you expect a significantly different result with UDP than with TCP?
(Btw. the result with TCP using v3_0f was pretty much the same as with v3_0d).
There is only one UDP packet sent and then the measurement just freezes.
What’s the IPERF command you use on the PC side when running for UDP?
Can you check what happens with this command?
iperf3.exe -c <server-ip> --port 5001 -u --bandwidth 0 -V
why do you expect a significantly different result with UDP than with TCP?
It’s common to observe higher throughput for UDP compared to TCP because of protocol overhead, retransmissions, etc., in TCP.
Comparing that will help to identify if the slow throughput is caused by suboptimal TCP settings or the network link itself.
Can we see a Wireshark trace of the traffic? @aggarg, can you enable file uploads for @LinkwitzRiley?
@tony-josi-aws
I used iperf3.exe -c 192.168.168.212 --port 5001 --udp --bandwidth 50M.
With iperf3.exe -c 192.168.168.212 --port 5001 -u --bandwidth 0 -V basically the same happens:
I get
iperf 3.1.3
CYGWIN_NT-10.0 WORKSTATION12 2.5.1(0.297/5/3) 2016-04-21 22:14 x86_64
Time: Fri, 20 Dec 2024 10:25:20 GMT
Connecting to host 192.168.168.212, port 5001
Cookie: WORKSTATION12.1734690320.600992.43e6
then nothing happens until I hit Ctrl+C (after several seconds) and I get
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
CPU Utilization: local/sender 0.0% (0.0%u/0.0%s), remote/receiver 0.0% (0.0%u/0.0%s)
iperf3: interrupt - the client has terminated
@RAc
I attached a zip of a pcapng. This is all that happens until I hit Ctrl+C.
zynq_iperf3.zip (1.1 KB)
I do not understand too much about iPerf, but the payload packets appear to be trashed. For the UDP packets to have lengths of 4 and 8 looks incomplete, and the only packet in the TCP stream that appears to make sense is packet 11, but it says something about UDP in the payload.
Other than that, the packet flow looks ok (no spurious retransmissions, no shrinking window sizes etc).
The UDP issue may be caused by the peer’s firewall settings.
@RAc
iperf3 -u works fine with e.g. another PC as the iperf server, so I'm pretty sure this is caused by the Zynq side of things.
Can anyone comment on the initial problem?
IP traffic is very slow in any case, not only when using iperf, although that is probably a separate topic on top.
Can you share the FreeRTOSIPConfig.h?
I will try running IPERF on a similar platform and share my observations.
Sure!
FreeRTOSIPConfig.h (21.4 KB)
Anyone have an idea what’s wrong with my FreeRTOS+TCP or how to debug?
I tried running IPERF on a Zynq with the V4.2.2 version of FreeRTOS+TCP and the following IPERF settings:
#if BUILD_IPERF3
#define ipconfigIPERF_PRIORITY_IPERF_TASK 6
#define ipconfigIPERF_DOES_ECHO_UDP 1
#define ipconfigIPERF_VERSION 3
#define ipconfigIPERF_STACK_SIZE_IPERF_TASK 680
#define ipconfigIPERF_TX_BUFSIZE ( 32 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE ( 32 )
#define ipconfigIPERF_RX_BUFSIZE ( 32 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE ( 32 )
/* The iperf module declares a character buffer to store its send data. */
#define ipconfigIPERF_RECV_BUFFER_SIZE ( 32 * ipconfigTCP_MSS )
#endif
and these were the results:
UDP:
C:\IPERF\iperf3.1.3_64>iperf3.exe -c 192.168.35.253 --port 5001 --udp --bandwidth 0 --set-mss 1460 --bytes 2G
Connecting to host 192.168.35.253, port 5001
[ 4] local 192.168.35.13 port 63699 connected to 192.168.35.253 port 5001
[ ID] Interval Transfer Bandwidth Total Datagrams
[ 4] 0.00-1.00 sec 114 MBytes 952 Mbits/sec 14540
[ 4] 1.00-2.00 sec 113 MBytes 951 Mbits/sec 14500
[ 4] 2.00-3.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 3.00-4.00 sec 113 MBytes 950 Mbits/sec 14500
[ 4] 4.00-5.00 sec 113 MBytes 951 Mbits/sec 14500
[ 4] 5.00-6.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 6.00-7.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 7.00-8.00 sec 113 MBytes 951 Mbits/sec 14500
[ 4] 8.00-9.00 sec 113 MBytes 949 Mbits/sec 14490
[ 4] 9.00-10.00 sec 113 MBytes 950 Mbits/sec 14490
[ 4] 10.00-11.00 sec 113 MBytes 948 Mbits/sec 14470
[ 4] 11.00-12.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 12.00-13.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 13.00-14.00 sec 113 MBytes 949 Mbits/sec 14480
[ 4] 14.00-15.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 15.00-16.00 sec 112 MBytes 938 Mbits/sec 14320
[ 4] 16.00-17.00 sec 113 MBytes 951 Mbits/sec 14500
[ 4] 17.00-18.00 sec 113 MBytes 951 Mbits/sec 14510
[ 4] 18.00-18.09 sec 10.1 MBytes 950 Mbits/sec 1290
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 4] 0.00-18.09 sec 2.00 GBytes 950 Mbits/sec 0.000 ms 0/0 (0%)
[ 4] Sent 0 datagrams
iperf Done.
TCP:
C:\IPERF\iperf3.1.3_64>iperf3.exe -c 192.168.35.253 --port 5001 --bytes 1G -V
iperf 3.1.3
CYGWIN_NT-10.0 lab-win-13 2.5.1(0.297/5/3) 2016-04-21 22:14 x86_64
Time: Fri, 17 Jan 2025 05:05:19 GMT
Connecting to host 192.168.35.253, port 5001
Cookie: lab-win-13.1737090319.874042.58ac7da
TCP MSS: 0 (default)
[ 4] local 192.168.35.13 port 52970 connected to 192.168.35.253 port 5001
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 1073741824 bytes to send
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 39.9 MBytes 334 Mbits/sec
[ 4] 1.00-2.00 sec 41.8 MBytes 350 Mbits/sec
[ 4] 2.00-3.00 sec 42.5 MBytes 357 Mbits/sec
[ 4] 3.00-4.00 sec 42.4 MBytes 355 Mbits/sec
[ 4] 4.00-5.00 sec 41.9 MBytes 351 Mbits/sec
[ 4] 5.00-6.00 sec 42.8 MBytes 359 Mbits/sec
[ 4] 6.00-7.00 sec 35.8 MBytes 299 Mbits/sec
[ 4] 7.00-8.00 sec 35.1 MBytes 296 Mbits/sec
[ 4] 8.00-9.00 sec 35.5 MBytes 297 Mbits/sec
[ 4] 9.00-10.00 sec 35.1 MBytes 295 Mbits/sec
[ 4] 10.00-11.00 sec 35.0 MBytes 293 Mbits/sec
[ 4] 11.00-12.00 sec 34.4 MBytes 289 Mbits/sec
[ 4] 12.00-13.00 sec 34.9 MBytes 293 Mbits/sec
[ 4] 13.00-14.00 sec 28.9 MBytes 242 Mbits/sec
[ 4] 14.00-15.00 sec 35.9 MBytes 301 Mbits/sec
[ 4] 15.00-16.00 sec 42.8 MBytes 359 Mbits/sec
[ 4] 16.00-17.00 sec 42.0 MBytes 351 Mbits/sec
[ 4] 17.00-18.00 sec 42.6 MBytes 358 Mbits/sec
[ 4] 18.00-19.00 sec 42.4 MBytes 356 Mbits/sec
[ 4] 19.00-20.00 sec 41.9 MBytes 352 Mbits/sec
[ 4] 20.00-21.00 sec 41.8 MBytes 350 Mbits/sec
[ 4] 21.00-22.00 sec 42.6 MBytes 358 Mbits/sec
[ 4] 22.00-23.00 sec 42.5 MBytes 357 Mbits/sec
[ 4] 23.00-24.00 sec 42.2 MBytes 354 Mbits/sec
[ 4] 24.00-25.00 sec 42.6 MBytes 357 Mbits/sec
[ 4] 25.00-25.93 sec 39.0 MBytes 354 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
Test Complete. Summary Results:
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-25.93 sec 1.00 GBytes 331 Mbits/sec
sender
[ 4] 0.00-25.93 sec 1024 MBytes 331 Mbits/sec
receiver
CPU Utilization: local/sender 0.5% (0.1%u/0.4%s), remote/receiver 0.0% (0.0%u/0.0%s)
iperf Done.