I’m having difficulties with maximizing throughput with FreeRTOS+TCP. The same code with very old FreeRTOS and lwip achieved ~47 MB/s, but now I can’t get more than ~17 MB/s so I must be doing something wrong. Any suggestions on configuration parameters settings that could help me? I attached my current config files.
Can you monitor your CPU load? Do the 2 tasks (IP-task and the MAC-driver) get enough CPU time? Or do you have higher-priority tasks that keep them from running?
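One way to check this is FreeRTOS's built-in run-time statistics. A sketch, assuming configGENERATE_RUN_TIME_STATS, configUSE_TRACE_FACILITY and configUSE_STATS_FORMATTING_FUNCTIONS are all enabled in FreeRTOSConfig.h (not something from the original posts, just an illustration):

```c
/* Sketch: print per-task CPU usage with FreeRTOS run-time stats.
 * Requires configGENERATE_RUN_TIME_STATS == 1,
 * configUSE_TRACE_FACILITY == 1 and
 * configUSE_STATS_FORMATTING_FUNCTIONS == 1 in FreeRTOSConfig.h. */
#include <stdio.h>

#include "FreeRTOS.h"
#include "task.h"

static void vPrintCpuLoad( void )
{
    /* One line per task: name, absolute run-time counter, and
     * the percentage of total run time. Size the buffer for the
     * number of tasks in your system. */
    static char pcStatsBuffer[ 512 ];

    vTaskGetRunTimeStats( pcStatsBuffer );
    printf( "Task          Abs time      %% time\r\n%s", pcStatsBuffer );
}
```

If the IP-task or the MAC-driver task shows an unexpectedly low share while throughput stalls, some other task is starving them.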
I almost started doubting myself, so I ran the demo on my Zybo board again.
When running iperf in sending mode, the board receives data at a speed of about 477 Mbits/sec.
When adding the -R option, the board sends data at a speed of about 517 Mbits/sec.
That is twice as much as what you measured. That is also what I remember observing when I developed the FreeRTOS Zynq demo.
So what is the difference between your board / application and mine? Is your LAN
I won’t attach my application, but if you write me an email, I will forward it to you. It runs on both MicroZed as well as a Zybo board. My address is hein [at] htibosch [point] com
Some questions:
What compiler optimisation are you using? I’m using GCC with -Os (optimise for size).
What version of memcpy() are you using? I’m using GCC’s own version.
I attached a PCAP, showing the first 1000 packets of the iperf conversation.
It shows that a window of about 15 KB would be enough to get optimal results.
the difference between your board / application and mine? Is your LAN
Oops. I left that question unfinished: I wanted to ask you if your LAN is quiet enough to leave space for your iperf data? My own LAN was 99% available during the test.
About the PCAP that I attached: you will see that about every 10 packets receive an ACK. That is why I wrote that a TCP-Window size of 10 x 1.46 KB would be enough. Here is an example of a TCP Window size of 12 packets:
#define ipconfigIPERF_TX_BUFSIZE ( 24 * ipconfigTCP_MSS ) /* Units of bytes. */
#define ipconfigIPERF_TX_WINSIZE ( 12 ) /* Size in units of MSS */
#define ipconfigIPERF_RX_BUFSIZE ( 24 * ipconfigTCP_MSS ) /* Units of bytes. */
#define ipconfigIPERF_RX_WINSIZE ( 12 ) /* Size in units of MSS */
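Those ipconfigIPERF_xxx macros only configure the iperf demo module. In your own application the same sizes can be applied per socket through the FREERTOS_SO_WIN_PROPERTIES option; a sketch using the public FreeRTOS+TCP API, mirroring the 24/12 figures above:

```c
#include "FreeRTOS.h"
#include "FreeRTOS_Sockets.h"

/* Apply the same buffer and window sizes to an application socket.
 * Must be called before FreeRTOS_connect() or FreeRTOS_listen(). */
static void vSetWindowProperties( Socket_t xSocket )
{
    WinProperties_t xWinProps;

    xWinProps.lTxBufSize = 24 * ipconfigTCP_MSS; /* Units of bytes. */
    xWinProps.lTxWinSize = 12;                   /* Units of MSS. */
    xWinProps.lRxBufSize = 24 * ipconfigTCP_MSS; /* Units of bytes. */
    xWinProps.lRxWinSize = 12;                   /* Units of MSS. */

    FreeRTOS_setsockopt( xSocket, 0, FREERTOS_SO_WIN_PROPERTIES,
                         &( xWinProps ), sizeof( xWinProps ) );
}
```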
No one can guarantee a constant high TCP throughput. If you write the application for both sides, I would consider using UDP instead.
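For completeness, a minimal UDP sender with the FreeRTOS+TCP sockets API; the address and port are placeholders, and note that with UDP the application must tolerate (or handle) packet loss itself:

```c
#include "FreeRTOS.h"
#include "FreeRTOS_Sockets.h"

/* Send one datagram.  UDP adds no retransmission or flow control,
 * so lost packets are simply gone unless the application recovers. */
static void vSendDatagram( const void *pvData, size_t uxLength )
{
    Socket_t xSocket;
    struct freertos_sockaddr xDest = { 0 };

    xSocket = FreeRTOS_socket( FREERTOS_AF_INET,
                               FREERTOS_SOCK_DGRAM,
                               FREERTOS_IPPROTO_UDP );

    xDest.sin_port = FreeRTOS_htons( 5001 );                    /* Placeholder port. */
    xDest.sin_addr = FreeRTOS_inet_addr_quick( 172, 16, 0, 1 ); /* Placeholder IP. */

    FreeRTOS_sendto( xSocket, pvData, uxLength, 0,
                     &( xDest ), sizeof( xDest ) );
    FreeRTOS_closesocket( xSocket );
}
```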
I’m using -O2 optimisation and GCC’s memcpy. I tried the -Os flag, but the differences are very small. Also, I use a dedicated LAN connection: the PC and my device are connected directly via cable, and I don’t use that connection for anything else.
I didn’t try iperf with -R option before, so here are results:
iperf3.exe -c 172.16.0.215 --port 5001 --bytes 100M
Connecting to host 172.16.0.215, port 5001
[ 4] local 172.16.0.1 port 49832 connected to 172.16.0.215 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 30.6 MBytes 257 Mbits/sec
[ 4] 1.00-2.00 sec 31.6 MBytes 265 Mbits/sec
[ 4] 2.00-3.00 sec 31.9 MBytes 267 Mbits/sec
[ 4] 3.00-3.17 sec 5.88 MBytes 292 Mbits/sec
That’s much better. But if I switch to my code, I still get ~19 MB/s. So, it seems the problem is somewhere in my code, not FreeRTOS. I checked the task priorities and all the tasks I create have lower priorities than the IP-task. I haven’t profiled the code yet to see what is using the CPU the most, since that is a bit tricky in this environment.
When you use the -R option, it means that Zynq is sending data. That gives a good performance. If your application mostly sends from the device to the outside world, that is very good news.
When you omit the -R option, the Zynq receives data and it is more dependent on the speed of its partner. In your test you only get an average speed of about 260 Mbps.
If your embedded device mostly receives data, you may also want to optimise the software on your host… if that is possible.
Does your embedded application mostly send or mostly receive data?
There are many techniques to optimise TCP-communication. You can have a look in the iperf module, or the project under “protocols”, such as the FTP server.
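One such technique is simply offering the stack the largest possible block per call and letting FreeRTOS_send() report how much was accepted. A hedged sketch of a bulk-send loop (error handling reduced to the essentials):

```c
#include "FreeRTOS.h"
#include "FreeRTOS_Sockets.h"

/* Send uxLength bytes, passing the largest possible chunk to the
 * stack each time.  FreeRTOS_send() returns the number of bytes
 * actually queued, which may be less than requested when the TX
 * buffer is momentarily full. */
static BaseType_t xSendBulk( Socket_t xSocket,
                             const uint8_t *pucData,
                             size_t uxLength )
{
    size_t uxSent = 0;

    while( uxSent < uxLength )
    {
        BaseType_t xResult = FreeRTOS_send( xSocket,
                                            &( pucData[ uxSent ] ),
                                            uxLength - uxSent,
                                            0 );
        if( xResult < 0 )
        {
            /* Socket error, e.g. -pdFREERTOS_ERRNO_ENOTCONN. */
            return xResult;
        }
        uxSent += ( size_t ) xResult;
    }

    return ( BaseType_t ) uxSent;
}
```

Many small FreeRTOS_send() calls cost task switches and per-call overhead; a loop like this amortises that over large blocks.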
If it is possible, can you post some code that sends/receives the ‘bulk’ TCP data? You can also send it to the email address that I mentioned above.
My application mostly sends data, so the difference between my throughput and iperf’s is even more interesting. I tried to send you an email at the address you provided, but Gmail said it cannot find that hostname.