I created a simple program for the STM32F407 Discovery board that uses FreeRTOS (10.0.1) and LWIP (2.0.3), built in STM32CubeIDE 1.1.0. A task starts LWIP and DHCP and then kills itself. A software timer blinks an LED and calls printf to print a counter. I implemented _write to redirect to TIM. That’s it. For debugging I have SWV, the data trace timeline and the fault analyzer enabled. I kept the host PC running the whole time without letting it go to sleep. The board worked for hours (>6), then the LED stopped blinking and pings from the host PC got no response. No fault was captured.
When I tried to restart debugging and hit Run, nothing happened. I had to power-cycle the board.
Before this I tried FreeRTOS with LWIP many times, and the board always died within a couple of hours.
Any thoughts on the combination of FreeRTOS and LWIP and how to improve debugging?
I put the project here: https://github.com/xmkllc/FreeRTOS_LWIP_Auto_Reconnect
Last run died in about 2 hours.
As the project uses lwIP and was created by STM32Cube, I’m afraid I’m not familiar with the drivers or the threading model used by STM32Cube projects, so I am a bit in the dark as to what to suggest other than waiting for the error condition to happen and then trying to figure out what state the system is in. For example, has the code hit an assert() that prevents anything else from executing, has it run out of buffers due to a memory leak, or is it any number of other conditions? I think the first thing to determine is whether the whole system has crashed or only the network has stopped running.
Thanks for the feedback. The failure appears as the LED no longer blinking and no ping response. The debugging session still appears to be running, and the program doesn’t end up in a hard fault, for example. The fault analyzer shows no fault detected. I am not sure how to debug this situation. I have another session that has been running for 4 hours so far. I will see how long it goes and whether the IDE catches anything.
Are you running an up-to-date version of FreeRTOS with configASSERT() defined to something that will let you know if an assert has been hit? The newer the FreeRTOS version, the more asserts the code contains to trap misconfigurations. Also, did you go through all the items on this page: https://www.freertos.org/FAQHelp.html to ensure you have things like stack overflow checking turned on, etc?
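For illustration, here is a minimal sketch of what that could look like. It is not taken from the project: vAssertCalled(), pcAssertFile and ulAssertLine are hypothetical names, and the fragment assumes the GCC port, where FreeRTOSConfig.h is only included from C files.

```c
/* --- FreeRTOSConfig.h (fragment) --- */

/* Hypothetical helper (not part of FreeRTOS): records where the assert fired. */
void vAssertCalled(const char *pcFile, unsigned long ulLine);

/* On a failed assert: record the location, mask interrupts, then spin so
 * that pausing the debugger lands in this loop. */
#define configASSERT(x)                              \
    do {                                             \
        if ((x) == 0) {                              \
            vAssertCalled(__FILE__, __LINE__);       \
            taskDISABLE_INTERRUPTS();                \
            for (;;) { }                             \
        }                                            \
    } while (0)

/* Stack overflow checking, method 2. The application must then provide
 * vApplicationStackOverflowHook(). */
#define configCHECK_FOR_STACK_OVERFLOW 2

/* Call vApplicationMallocFailedHook() if pvPortMalloc() ever returns NULL.
 * The application must provide that hook too. */
#define configUSE_MALLOC_FAILED_HOOK   1

/* --- application source file --- */

/* Hypothetical globals, inspected from the debugger after a hang. */
const char *volatile pcAssertFile = 0;
volatile unsigned long ulAssertLine = 0;

void vAssertCalled(const char *pcFile, unsigned long ulLine)
{
    pcAssertFile = pcFile;
    ulAssertLine = ulLine;
}
```

With something like this in place, pausing the debugger after a hang and finding the program counter in the for (;;) loop means an assert was hit, and pcAssertFile and ulAssertLine say which one.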
These are symptoms of the failure, not the cause of the failure.
printf() is very [very] often the root cause of issues as it can use a lot of stack, not be thread safe, and even call malloc(). What happens if you run the application with all calls to printf() removed? Also, what is TIM? I thought it was a general purpose timer, so it is not clear to me what effect redirecting _write to TIM will have.
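If the counter output is still wanted while testing without printf(), one option is a small formatter that uses a fixed amount of stack and never allocates. The sketch below is an assumption about what would be enough for printing a counter; u32_to_dec() is a hypothetical helper, not part of FreeRTOS or the C library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes 'value' as decimal text into 'out', which
 * must hold at least 11 bytes (10 digits plus the terminating NUL).
 * Returns the number of digits written. Uses no heap and only a few
 * bytes of stack. */
size_t u32_to_dec(uint32_t value, char out[11])
{
    char tmp[10];   /* digits, least significant first */
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + (value % 10u));
        value /= 10u;
    } while (value != 0u);

    /* Reverse into the caller's buffer. */
    for (size_t i = 0; i < n; i++) {
        out[i] = tmp[n - 1u - i];
    }
    out[n] = '\0';
    return n;
}
```

The resulting characters could then be passed one at a time to whatever output routine _write currently uses.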
I presume that means the application is still running under the control of the debugger.
How are you determining that to be the case? If you pause the debugger what is actually executing? You say the software timer that toggles the LED has stopped, so perhaps you will find the code is sat in a tight loop or assert at a task priority above the timer task priority?
What is configTIMER_TASK_PRIORITY set to in FreeRTOSConfig.h - is it higher than the priority of every other task in the system?
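As a sketch of the setting being asked about, the fragment below gives the timer service task the highest available priority. The queue length and stack depth are illustrative assumptions, not values from the project, and configMAX_PRIORITIES is whatever the project already defines.

```c
/* FreeRTOSConfig.h (fragment); values are illustrative assumptions */
#define configUSE_TIMERS             1
/* Highest priority: a task spinning at a lower priority cannot starve
 * the timer service task. */
#define configTIMER_TASK_PRIORITY    (configMAX_PRIORITIES - 1)
#define configTIMER_QUEUE_LENGTH     10
#define configTIMER_TASK_STACK_DEPTH 256
```

With the timer task at the top priority, an LED that stops blinking points at something other than a lower-priority task stuck in a loop.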
Thanks for the feedback.
Yes, the LED not blinking is a symptom. So is losing the network.
Correction: the new _write uses ITM_SendChar. I increased the timer stack size for printf, but I will remove the printf calls.
The LWIP thread runs at priority 3.
Initially there is a task that starts LWIP and then kills itself.
How do you find where the tight loop is, or where an assertion was hit? With a breakpoint? printf?
Typically, when the program seems to hang and you are running connected to a debugger, you can pause the execution of the program, which effectively puts a breakpoint right where the code is currently executing, and you can see where the program is stuck.
Thanks for the tip. Will try.
After 12 hours the program eventually hung and I got this:
It looks like something has written over the top of the queue structure.
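One generic way to narrow down this kind of overwrite is to put known guard words either side of a statically allocated object and check them periodically, or to set a data watchpoint on them in the debugger. This is only a sketch of the idea: guarded_block_t and its functions are hypothetical, and the 32-byte payload stands in for whatever object is being corrupted.

```c
#include <stdint.h>

#define GUARD_PATTERN 0xA5A5A5A5u

/* Hypothetical wrapper: a payload with a guard word on each side. */
typedef struct {
    uint32_t head_guard;
    uint8_t  payload[32];   /* stand-in for the object under suspicion */
    uint32_t tail_guard;
} guarded_block_t;

/* Write the known pattern into both guard words. */
void guarded_init(guarded_block_t *b)
{
    b->head_guard = GUARD_PATTERN;
    b->tail_guard = GUARD_PATTERN;
}

/* Returns 1 while both guard words still hold the pattern, else 0. */
int guarded_is_intact(const guarded_block_t *b)
{
    return (b->head_guard == GUARD_PATTERN) &&
           (b->tail_guard == GUARD_PATTERN);
}
```

Checking guarded_is_intact() from a periodic task, and asserting when it fails, would catch the corruption closer to the time it happens than waiting for the hang.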