FreeRTOS-Plus-TCP port: poor TCP throughput

Hi,
I am porting the Plus-TCP stack (git tag V3.1.0) to my Ethernet driver. I am using the iPerf port I got from the community,
from the forum post freertos-tcp-iperf3-server.

In my test on a 100M connection, I am getting 95Mbps for UDP but only 70Mbps for TCP. I want to improve this and get throughput similar to UDP.

From my debugging, I suspect that the delayed ACK might be the cause. To test this, I edited the stack (just as an experiment). This seemed to help, and I was able to get around 94Mbps throughput.

edit:

diff --git a/source/FreeRTOS_TCP_Transmission.c b/source/FreeRTOS_TCP_Transmission.c
index a113d8f9..de58f633 100644
--- a/source/FreeRTOS_TCP_Transmission.c
+++ b/source/FreeRTOS_TCP_Transmission.c
@@ -1362,7 +1362,9 @@
                         /* Normally a delayed ACK should wait 200 ms for a next incoming
                          * packet.  Only wait 20 ms here to gain performance.  A slow ACK
                          * for full-size message. */
-                        pxSocket->u.xTCP.usTimeout = ( uint16_t ) pdMS_TO_TICKS( tcpDELAYED_ACK_LONGER_DELAY_MS );
+                        /*pxSocket->u.xTCP.usTimeout = ( uint16_t ) pdMS_TO_TICKS( tcpDELAYED_ACK_LONGER_DELAY_MS );*/
+                        pxSocket->u.xTCP.usTimeout = ( uint16_t ) tcpDELAYED_ACK_SHORT_DELAY_MS;
+                        /*pxSocket->u.xTCP.usTimeout = ( uint16_t ) 5;*/
 
                         if( pxSocket->u.xTCP.usTimeout < 1U ) /* LCOV_EXCL_BR_LINE, the second branch will never be hit */
                         {

Any suggestions on where I should be looking? Is there a way to fix this without editing the stack code? What am I missing?

@MuhammedZamroodh

In principle, UDP should be much faster than TCP because UDP has no packet retransmissions, acknowledgements, or congestion control. Given that you are on a 100M connection, a TCP throughput of 70Mbps seems normal to me.

As per RFC 9293, section 3.8.6.3 (Delayed Acknowledgments – When to Send an ACK Segment), a delayed ACK must be sent with a delay of less than 0.5 seconds. tcpDELAYED_ACK_LONGER_DELAY_MS is already 20 ms, which is far below that limit; decreasing it further might introduce unnecessary ACKs in the network, wasting bandwidth in real network scenarios, and in some cases might even reduce throughput.

If your use case is just to measure performance via iPerf, I don’t think it’s beneficial to introduce another configuration (ipconfig) macro to make this a user-configurable setting. Do you have any use cases outside of iPerf where changing this value seems necessary?

We have a Linux port for the same hardware, and my reason for investigating is that we get better results on Linux (around 94Mbps). Are you sure that 70Mbps is the expected figure?

Please share your iPerf commands and results. From my experience, iPerf is often used wrongly and gives misleading information.

Apart from that, I am also only getting around 300Mbit/s TCP throughput on a 1000M interface with FreeRTOS+TCP.

Besides seeing a PCAP of the iPerf run, I would like to hear on what platform you ran FreeRTOS, and on what hardware you ran Linux.
Please compress the PCAP well. You should have the right to upload a file by now.

I have an STM32H7, also at 100 Mbps, which also shows high performance. But I cannot tell what is really different. Faster memories, perhaps?

I would NOT dare to play with the delayed ACK times. Let’s have a look first at what is happening (the PCAP).

EDIT
Don’t forget to have a look at FREERTOS_SO_WIN_PROPERTIES, a socket option by which you determine the size of the sliding window, and also the maximum buffer size the socket will allocate.
The defaults may give a poor performance.
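
For readers who have not used this option, here is a minimal sketch of how the four values might be prepared. The struct is a local stand-in that mirrors the fields of WinProperties_t as I understand them (buffer sizes in bytes, window sizes in units of MSS). SKETCH_MSS, WinPropsSketch_t, prvMakeWinProps() and the "buffer is twice the window" rule are hypothetical choices for illustration, not part of the stack.

```c
#include <stdint.h>

/* Assumed MSS for plain Ethernet/IPv4; on the target this is ipconfigTCP_MSS. */
#define SKETCH_MSS    1460

/* Local stand-in mirroring the fields of WinProperties_t:
 * buffer sizes are in bytes, window sizes are in units of MSS. */
typedef struct
{
    int32_t lTxBufSize;
    int32_t lTxWinSize;
    int32_t lRxBufSize;
    int32_t lRxWinSize;
} WinPropsSketch_t;

/* Hypothetical helper: size each stream buffer as twice its window. */
static WinPropsSketch_t prvMakeWinProps( int32_t lTxSegments,
                                         int32_t lRxSegments )
{
    WinPropsSketch_t xProps;

    xProps.lTxWinSize = lTxSegments;
    xProps.lTxBufSize = 2 * lTxSegments * SKETCH_MSS;
    xProps.lRxWinSize = lRxSegments;
    xProps.lRxBufSize = 2 * lRxSegments * SKETCH_MSS;

    return xProps;
}

/* On the target, the real struct would be handed to the socket before it
 * connects or starts listening, along these lines:
 *
 *   WinProperties_t xWinProps = { ... same four fields ... };
 *   FreeRTOS_setsockopt( xSocket, 0, FREERTOS_SO_WIN_PROPERTIES,
 *                        &xWinProps, sizeof( xWinProps ) );
 */
```

With 8 segments in each direction, this gives a window of 8 MSS and a 23360-byte buffer per direction.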

pcap.7z (346.6 KB)
I am attaching the PCAP here. I can see some packet loss. Do you see anything suspicious?

I played with FREERTOS_SO_WIN_PROPERTIES and it helped.
The following configuration seems to bring the throughput to 86Mbps:

#ifndef IPERF_TASK_H_

#define IPERF_TASK_H_

#include <FreeRTOSConfig.h>

#define ipconfigIPERF_PRIORITY_IPERF_TASK ( configMAX_PRIORITIES - 3 )

#define USE_IPERF                               1
//#define ipconfigIPERF_DOES_ECHO_UDP             0

#define ipconfigIPERF_VERSION                   3
#define ipconfigIPERF_STACK_SIZE_IPERF_TASK     (configMINIMAL_STACK_SIZE + 680)

#define ipconfigIPERF_TX_BUFSIZE                ( 256 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE                ( 128 )
#define ipconfigIPERF_RX_BUFSIZE                ( 256 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE                ( 128 )

/* The iperf module declares a character buffer to store its send data. */
#define ipconfigIPERF_RECV_BUFFER_SIZE          ( 16 * ipconfigTCP_MSS )

#define ipconfigIPERF_USE_ZERO_COPY 0

void vIPerfInstall( void );

#endif
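
To put these numbers in perspective, here is a rough calculation of what the configuration above asks for per connection. It assumes ipconfigTCP_MSS is 1460 bytes; the SKETCH_ names and both helper functions are made up for this illustration, and the stack may allocate slightly more than the raw figures.

```c
#include <stddef.h>

/* Assumption: ipconfigTCP_MSS is 1460 bytes (Ethernet, IPv4). */
#define SKETCH_TCP_MSS       1460u

/* Values copied from the header above. */
#define SKETCH_TX_BUFSIZE    ( 256u * SKETCH_TCP_MSS )
#define SKETCH_RX_BUFSIZE    ( 256u * SKETCH_TCP_MSS )
#define SKETCH_RX_WINSIZE    128u

/* Bytes of stream-buffer space one connection may claim (TX plus RX). */
static size_t uxStreamBufferBytes( void )
{
    return ( size_t ) SKETCH_TX_BUFSIZE + ( size_t ) SKETCH_RX_BUFSIZE;
}

/* Bytes the peer may have in flight before it must wait for an ACK. */
static size_t uxWindowBytes( void )
{
    return ( size_t ) SKETCH_RX_WINSIZE * SKETCH_TCP_MSS;
}
```

Under that assumption the two stream buffers add up to 747520 bytes, and the receive window covers 186880 bytes of unacknowledged data.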

Have you tried enabling zero copy for IPERF?

Yes, enabling ipconfigIPERF_USE_ZERO_COPY could make iPERF’s reception run faster.

About the TCP window size: normally when WIN increases, more packets can be sent “in one go”. Only after loads of packets of 1460 bytes each, an acknowledgement will be expected.

Two remarks:

  1. When the peer is on the Internet, do not use such long WIN sizes. Much safer to use a few outstanding packets.
  2. Also on a LAN I wouldn’t use such long sizes. Have a try with smaller WIN sizes: it uses much less RAM and the performance doesn’t really decrease.

If your device has something like 220 outstanding packets, things will get very difficult when one packet is missing.
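
A rough way to see why a smaller window can still fill the link: the sender needs only about one bandwidth-delay product in flight. The sketch below assumes 1460-byte segments and ignores header overhead. The function name is hypothetical, and the round-trip time is an input you would have to measure; nothing here is taken from the PCAP in this thread.

```c
#include <stdint.h>

#define SKETCH_MSS_BYTES    1460u

/* Segments needed in flight to keep a link busy:
 * ceil( bandwidth * RTT / ( 8 * MSS ) ), computed in integers. */
static uint32_t ulSegmentsToFillLink( uint32_t ulLinkMbps,
                                      uint32_t ulRttMicroseconds )
{
    /* Mbit/s multiplied by microseconds gives bits. */
    uint64_t ullBitsInFlight = ( uint64_t ) ulLinkMbps * ulRttMicroseconds;
    uint64_t ullBytesInFlight = ( ullBitsInFlight + 7u ) / 8u;

    return ( uint32_t ) ( ( ullBytesInFlight + SKETCH_MSS_BYTES - 1u ) /
                          SKETCH_MSS_BYTES );
}
```

For a 100 Mbps link and an assumed round trip of 1 ms, ulSegmentsToFillLink( 100, 1000 ) returns 9, far fewer than 128. If the sender has to wait 20 ms for an acknowledgement, the same formula gives 172.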

Have you optimised your application, using -Os or -O2? If you want, attach your FreeRTOSConfig.h as well.

I just looked at iperf using zero-copy, and I think it needs a change:

See the latest iperf_task_v3_0h.c.

This changed:

+const BaseType_t xRecvSize = 0x10000;
 xRecvResult = FreeRTOS_recv( pxClient->xServerSocket, /* The socket. */
                              &pcRecvBuffer,           /* A pointer to a pointer */
-                             sizeof pcRecvBuffer,     /* Any size is OK here. */
+                             xRecvSize,               /* Any size is OK here. */
                              FREERTOS_ZERO_COPY );

Here, pcRecvBuffer is not a real buffer but a pointer. The variable will point to an internal stream buffer of the TCP socket.

The received message will be freed in this call:

FreeRTOS_recv( pxClient->xServerSocket, /* The socket being received from. */
               NULL,                    /* The buffer isn't used. */
               xRecvResult,             /* This is important now. */
               0 );

The PCAP you sent shows 44 packets summarized in a single packet, as though the packet was 64240 bytes long, which it isn’t.
Would you know a way of disabling this offloading on the laptop?
Maybe it goes together with checksum offloading?