FreeRTOS + LWIP Stuck issues

diama13 · October 27, 2023, 1:44pm

Hi,

I have multiple FPGA boards equipped with Zynq UltraScale+ devices, and I exchange data packets (e.g. 10000 data packets of variable size) among them over TCP/IP. The boards are connected sequentially. For this I use FreeRTOS + LWIP on one of the available processors. In most experiments the boards exchange all the data packets successfully, but sometimes they exchange only some of the packets and then stop. In that case the whole system is stuck. The number of successfully exchanged packets differs from one failed experiment to the next.
Since every board waits to receive data (lwip_recv) and forwards it (lwip_send) to the next one, I suspect that although one of the boards has sent the data, the next one has not received it. Is that possible?
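
One thing worth ruling out for the "sent but not received" suspicion: TCP is a byte stream, so a single receive call may return fewer bytes than the sender wrote in one send call. If the receiving code assumes one call equals one packet, it can end up waiting for bytes that were already delivered in pieces. The following is a minimal sketch of a "read exactly N bytes" loop, not the original poster's code. `recv_exact`, `recv_fn` and `demo_recv` are hypothetical names; the function pointer has the shape of POSIX `recv()`, which I am assuming matches `lwip_recv` in your lwIP version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Same shape as POSIX recv(); assumed to match lwip_recv() as well. */
typedef ssize_t (*recv_fn)(int fd, void *buf, size_t len, int flags);

/* Keep calling rx() until exactly len bytes have arrived.
 * Returns 0 on success, -1 if the peer closed early, -2 on a receive error. */
static int recv_exact(recv_fn rx, int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    assert(rx != NULL && buf != NULL);
    while (got < len) {
        ssize_t n = rx(fd, p + got, len - got, 0);
        if (n == 0)
            return -1;          /* connection closed before the packet was complete */
        if (n < 0)
            return -2;          /* error or timeout; the caller decides what to do */
        got += (size_t)n;       /* partial read: loop again for the remainder */
    }
    return 0;
}

/* Hypothetical stand-in for a socket, for illustration only:
 * it hands out at most 3 bytes per call, like a fragmented TCP stream. */
static const char demo_stream[] = "HELLO-WORLD";
static size_t demo_pos = 0;

static ssize_t demo_recv(int fd, void *buf, size_t len, int flags)
{
    size_t left = sizeof demo_stream - 1 - demo_pos;
    size_t n = len < 3 ? len : 3;

    (void)fd;
    (void)flags;
    if (n > left)
        n = left;
    memcpy(buf, demo_stream + demo_pos, n);
    demo_pos += n;
    return (ssize_t)n;          /* 0 once the stream is exhausted */
}
```

On the target you would pass `lwip_recv` instead of `demo_recv`. With variable-size packets the receiver also needs to know `len` in advance, for example from a fixed-size length header read with the same helper.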

manvensh · October 27, 2023, 10:47pm

Welcome to the FreeRTOS forum and thanks for posting!!

This is an interesting issue you’re bringing up here. Did you use a packet analyzer such as Wireshark to see whether the packets are actually being sent when you hit this issue?
In any case, I have forwarded this ticket to the team working on the connectivity software and will get back to you soon!!

glenenglish · October 29, 2023, 6:03am

Why are you using LWIP on an UltraScale Zynq? LWIP is really designed for doing the minimum required on a tiny microprocessor.
FreeRTOS has a great, stable and very capable TCP stack. There is a good UltraScale Zynq port available. You will need to learn and customize a bit; it is not a simple out-of-the-box job. But if you are using an UltraScale Zynq, then you know just how crazy complex the device is, and this is the easy bit.
No need to use LWIP. I use FreeRTOS on UltraScale+ Zynq for many projects.
Regards,
glen english, Xilinx Alliance Partner.

RAc · October 29, 2023, 4:17pm

What do you mean, “stuck?” Is your unit caught in a fault handler? Does FreeRTOS still work (which you can monitor, for example, by setting a breakpoint in your system tick handler)? Or are you not in a fault but no activity is visible (meaning that when you break into the debugger or look at the real-time trace, all you see is cycles in the idle task)?

It is important to provide info as precise as possible, as each of those scenarios takes a different branch in the problem-solving flow chart. The last scenario would hint at a system-wide deadlock, typically caused by critical section misuse; the middle one possibly at a memory, driver, porting or local deadlock issue; and the first one possibly at a stack overflow or an application programming error (among others).

It is also possible that your system basically works but one of your communication peers does not respond appropriately to a communication problem such as a timeout. As @manvensh mentioned, these kinds of problems are best narrowed down with a Wireshark trace.
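
To illustrate the timeout point, here is a minimal sketch of a receive that cannot block forever and that reports why it returned, so a task in the chain can log or recover instead of silently waiting. It is written against the POSIX socket API so it can be tried on a host machine. `set_recv_timeout_ms`, `recv_once` and `enum rx_status` are hypothetical names. For lwIP I am assuming that `LWIP_SO_RCVTIMEO` is enabled in `lwipopts.h`; depending on `LWIP_SO_SNDRCVTIMEO_NONSTANDARD`, the option value may need to be an `int` in milliseconds instead of a `struct timeval`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Make blocking recv() calls on fd give up after timeout_ms milliseconds. */
static int set_recv_timeout_ms(int fd, long timeout_ms)
{
    struct timeval tv;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Outcome of a single receive attempt. */
enum rx_status { RX_DATA, RX_TIMEOUT, RX_CLOSED, RX_ERROR };

/* Try one recv() and classify the result; *out_n is set only for RX_DATA. */
static enum rx_status recv_once(int fd, void *buf, size_t len, ssize_t *out_n)
{
    ssize_t n;

    assert(buf != NULL && out_n != NULL);
    n = recv(fd, buf, len, 0);
    if (n > 0) {
        *out_n = n;
        return RX_DATA;
    }
    if (n == 0)
        return RX_CLOSED;       /* peer closed the connection */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return RX_TIMEOUT;      /* nothing arrived within the timeout */
    return RX_ERROR;            /* any other socket error */
}
```

A board that sees `RX_TIMEOUT` repeatedly while its upstream neighbour believes it has sent everything is a good candidate for the Wireshark capture, since the trace shows whether the bytes ever reached the wire.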

Shub · November 2, 2023, 3:31am

Hi @diama13 ,

As others have already mentioned, a Wireshark trace would be a good way forward.
However, we have run multiple tests, including protocol tests, on Zynq UltraScale hardware using the FreeRTOS+TCP stack and have not faced any issues.