Hard fault while calling Check Tasks Waiting Termination in Idle task

Hello,
I have an issue similar to this topic
I’m getting a hard fault while running my STM32H7 application, which is based on FreeRTOS 10.3.1 and built out of 3 tasks. It works for a while (10 minutes to an hour) with no problems and then falls into a hard fault, always with the same PC value, which leads to the call to prvCheckTasksWaitingTermination in the idle task.
I’m running with configCHECK_FOR_STACK_OVERFLOW, configUSE_MALLOC_FAILED_HOOK and configASSERT.
I also added configUSE_LIST_DATA_INTEGRITY_CHECK_BYTES because, after many attempts to catch the source of the issue before the hard fault, I think some corruption is happening around the list handling.
I’m working with ST CubeIDE, and its fault analyzer lists the fault as a “Bus, memory management or usage fault” and “Instruction access violation”.
Do you have any suggestions on how I can resolve this?
Thanks,
Eyal

That surely seems like memory corruption. Can you try increasing the stack sizes of the tasks one by one and see if that fixes the issue? We can then use uxTaskGetStackHighWaterMark to determine the stack requirements. If that does not help, can you try disabling the tasks one by one to determine the culprit?
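To illustrate how the high-water mark is measured: FreeRTOS fills each task stack with a known byte (tskSTACK_FILL_BYTE, 0xa5) and uxTaskGetStackHighWaterMark reports how much of that pattern has never been overwritten. The sketch below is a host-side simulation of that idea in plain C, not kernel code. The names stack_fill and stack_unused_bytes are hypothetical, and it assumes a stack that grows downward, as on Cortex-M.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same value FreeRTOS uses for tskSTACK_FILL_BYTE. */
#define STACK_FILL_BYTE 0xA5u

/* Paint the whole stack buffer with the known pattern,
 * as the kernel does when a task is created. */
void stack_fill(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* Count the bytes at the low end of the buffer that still hold the
 * fill pattern. With a downward-growing stack, index 0 is the deepest
 * point the task could reach, so this count is the space never used. */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t unused = 0;

    while (unused < size && stack[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

On the target, calling uxTaskGetStackHighWaterMark(NULL) from inside each task (with INCLUDE_uxTaskGetStackHighWaterMark set to 1) gives the same kind of figure, reported in words rather than bytes. A value close to zero would suggest that task is near overflow.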

Thanks.

Hi Gaurav,
I tried all your suggestions, but the issue is still there. As an outcome of some of the changes I made, I got an assertion from a graphics library I’m using (TouchGFX), but it’s hard for me to believe that the problem is in the library. I assume it is some memory issue that is “hiding” from me.
The fact that most of the time, after I make some changes during the debug process, I keep getting a hard fault in the same FreeRTOS place (i.e. prvCheckTasksWaitingTermination) is some kind of a clue, don’t you think?
The only problem is that I do not understand the clue…
Any ideas?
Thanks,
Eyal

Memory corruptions are hard to debug. When it crashes, can you look at the task list and examine whether any TCB seems corrupted? That would tell us whether a TCB is what is getting overwritten.

One usual technique to debug memory corruption is to put a variable next to the one getting corrupted. You can then put a data breakpoint on that variable to catch the corruption when it happens. The hope is that the corruption is large enough to also overwrite the variable we added.
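A minimal sketch of that guard-variable idea, assuming the suspect object can be wrapped in a struct so the guards are guaranteed to sit directly next to it (separate globals may be reordered by the linker). The type guarded_buffer_t, the 32-byte payload, the pattern value and the helper names are all hypothetical stand-ins for whatever is being corrupted in the real application.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary, easily recognisable pattern (an assumption, any value works). */
#define GUARD_PATTERN 0xDEADBEEFu

typedef struct {
    uint32_t guard_before;  /* catches writes that run off the front */
    uint8_t  payload[32];   /* stand-in for the object being corrupted */
    uint32_t guard_after;   /* catches writes that run off the end */
} guarded_buffer_t;

/* Set both guards to the known pattern and clear the payload. */
void guarded_init(guarded_buffer_t *g)
{
    g->guard_before = GUARD_PATTERN;
    memset(g->payload, 0, sizeof g->payload);
    g->guard_after = GUARD_PATTERN;
}

/* Return true while both guards still hold the pattern. */
bool guarded_intact(const guarded_buffer_t *g)
{
    return g->guard_before == GUARD_PATTERN &&
           g->guard_after == GUARD_PATTERN;
}
```

With the guards in place, a data breakpoint (for example a GDB watch expression on guard_after) should stop the debugger at the instruction that does the overwrite. If no hardware watchpoint is free, calling guarded_intact periodically, for instance from the idle hook inside a configASSERT, at least narrows down when the corruption happens.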

Which STM32H7 are you using? I have one Nucleo-H743ZI2 dev board - if you are using that and are willing to share your code, I can give it a try too.

Thanks.