I see others getting 80 Mbps. Any pointers as to why these results are so poor?
I am using the STM32H7 Ethernet driver that comes with the +TCP stack:
FreeRTOS-Plus-TCP/portable/NetworkInterface/STM32Hxx/NetworkInterface.c
…as opposed to the Ethernet driver that comes with the ST HAL. Which Ethernet driver should I be using?
Basically, it's better to avoid the HAL when striving for performance and efficiency.
In your case the bottleneck is almost certainly not the driver. The question is what it is.
Is there something else eating up the CPU? Did you configure the stack for performance, giving appropriate priorities, enough buffers, a suitable MTU size, etc.? Which allocation scheme do you use? Perhaps post your FreeRTOSIPConfig.h.
Do you have Wireshark at hand, and have you had a look at the traffic? Could you verify that the PHY came up with a 100 MBit full-duplex link? …
When I tested iperf3 on my STM32H747, I was happy with the results, and I used the same driver as you do.
In my testing project, the following was defined in FreeRTOSIPConfig.h:
#define USE_IPERF 1
#define ipconfigIPERF_DOES_ECHO_UDP 0
#define ipconfigIPERF_VERSION 3
#define ipconfigIPERF_STACK_SIZE_IPERF_TASK 680
#define ipconfigIPERF_TX_BUFSIZE ( 24 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE ( 12 )
#define ipconfigIPERF_RX_BUFSIZE ( 24 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE ( 12 )
/* The iperf module declares a character buffer to store its received data. */
#define ipconfigIPERF_RECV_BUFFER_SIZE ( 24 * ipconfigTCP_MSS )
EDIT: The original iperf settings in this post were not the right ones. The settings above give much better performance. They assume that there is plenty of RAM available for buffering.
Since the traffic is so slow, it would be interesting to look at the actual iperf session in a PCAP. Could you make a PCAP, zip it, and attach it?
In case you can not attach a ZIP file, you can also email it to me:
hein [at] htibosch [dot] net.
But I have asked the moderator to allow you to attach files to your posts.
You can filter the iperf packets with tcp.port==5001.
And can you please also look at the questions from Hartmut?
About priorities:
The task in NetworkInterface: higher priority
The IP-task: normal priority
All tasks that make use of the IP-stack: lower priority
So the iperf task gets a lower priority. My test setup, for example:
The Cortex-M7 runs at 400 MHz (maximum 480 MHz).
The IDE is STM32CubeIDE, Version: 1.4.2, Build: 7643.
The test shows the following performance:
Receiving TCP data at 80 Mbits/sec or better
Sending TCP data at 90 Mbits/sec or better ( using iperf -R )
Both the laptop and the STM32 board are connected to a (100/1000 Mbps) network switch.
The PHY/EMAC of the STM32 board has a speed of 100 Mbps.
During the test, the CPU had nothing else to do but work on the iperf session.
Thanks for the help, guys! I have attached a pcap of about 5 seconds' worth of iperf3 testing. Here is the console output during this test (note my speeds have increased to almost 2 Mbps now that I have taken the test device off my busy LAN and instead connected it directly to a laptop via Ethernet):
I saw a slight improvement in speed when I changed the priority of the iperf3 task to be the highest priority task (55 in my setup). I switched the priority back in the attached FreeRTOSIPConfig.h file, as I was trying different things.
I am suspicious of my memory setup. In my linker script, I set up an area for the Ethernet buffers (which are surprisingly large, especially once the iperf server starts):
.ethernet_data :
{
. = ABSOLUTE(0x30000000);
PROVIDE_HIDDEN (__ethernet_data_start = .);
KEEP (*(SORT(.ethernet_data.*)))
KEEP (*(.ethernet_data))
PROVIDE_HIDDEN (__ethernet_data_end = .);
} >RAM_D2
And I also set up the MPU to disable caching for this memory region:
(see attached image of mpu.png)
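For readers who cannot see the image, the equivalent ST HAL configuration would look roughly like the fragment below. The region number, region size, and attribute values are assumptions here; the essential part is marking the buffer area non-cacheable:

```c
/* Sketch: mark the Ethernet buffer area at 0x30000000 (RAM_D2) as
 * non-cacheable. Requires stm32h7xx_hal.h; field values are assumptions. */
MPU_Region_InitTypeDef xRegion = { 0 };

HAL_MPU_Disable();

xRegion.Enable           = MPU_REGION_ENABLE;
xRegion.Number           = MPU_REGION_NUMBER0;       /* assumed free region */
xRegion.BaseAddress      = 0x30000000UL;
xRegion.Size             = MPU_REGION_SIZE_256KB;    /* assumed buffer area size */
xRegion.AccessPermission = MPU_REGION_FULL_ACCESS;
xRegion.IsBufferable     = MPU_ACCESS_NOT_BUFFERABLE;
xRegion.IsCacheable      = MPU_ACCESS_NOT_CACHEABLE; /* the important part */
xRegion.IsShareable      = MPU_ACCESS_SHAREABLE;
xRegion.DisableExec      = MPU_INSTRUCTION_ACCESS_DISABLE;
xRegion.TypeExtField     = MPU_TEX_LEVEL0;
xRegion.SubRegionDisable = 0x00;

HAL_MPU_ConfigRegion( &xRegion );
HAL_MPU_Enable( MPU_PRIVILEGED_DEFAULT );
```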
This device is also running all three ADCs at 60 kHz, as well as the TinyUSB stack (composite MSC device + VCP). The device only uses Ethernet or USB, never both (config option at boot). In my testing, the USB stack is not running, so it should not be affecting this performance.