HardFault - Can't figure out why

ranwa · August 26, 2022, 6:12am

Hi,
I am getting a hard fault in my code, running on an STM32G0. It does not look like an overflow: all stacks are good and ucHeap is large enough.
Sometimes, when it happens, I get the following screen; pTCB is all corrupted.
How can I further understand what call caused the hard fault?

aggarg · August 26, 2022, 6:54am

You can use the following instructions to find the faulting instruction, which can provide some hint: Debugging and diagnosing hard faults on ARM Cortex-M CPUs
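
As a rough illustration of what those instructions rely on: on exception entry a Cortex-M core pushes eight words (R0-R3, R12, LR, PC, xPSR) onto the active stack, and the stacked PC is the usual starting point for locating the faulting instruction. The sketch below only decodes such a frame from a pointer. `fault_frame_t` and `decode_fault_frame` are made-up names, and obtaining the right stack pointer (MSP or PSP) inside the handler is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the eight words stacked on exception entry. */
typedef struct {
    uint32_t r0, r1, r2, r3;
    uint32_t r12;
    uint32_t lr;   /* LR value of the interrupted code */
    uint32_t pc;   /* program counter of the interrupted code */
    uint32_t xpsr;
} fault_frame_t;

/* Copy the stacked words, lowest address first, into named fields. */
fault_frame_t decode_fault_frame(const uint32_t *sp)
{
    assert(sp != NULL);
    fault_frame_t f = {
        .r0 = sp[0], .r1 = sp[1], .r2 = sp[2], .r3 = sp[3],
        .r12 = sp[4], .lr = sp[5], .pc = sp[6], .xpsr = sp[7],
    };
    return f;
}
```

The decoded `pc` can then be looked up in the map file or disassembly to see which function was executing.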

Since you know that pxCurrentTCB is getting corrupted, you can define a variable right next to it and put a data breakpoint on it. Assuming that the memory corruption is more than just the pxCurrentTCB, it will allow you to catch the corruption right when it happens.
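
A minimal sketch of that idea, with assumptions stated up front: `pxTaskPointer` stands in for pxCurrentTCB (which lives in tasks.c), and the struct is used only to keep the guard words adjacent to it, since the linker is otherwise free to place separate variables anywhere. `GUARD_PATTERN`, `xGuarded` and `prvGuardsIntact` are hypothetical names. On target, the data breakpoint would be set on the guard words; the check function is just a software fallback.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_PATTERN 0xA5A5A5A5u  /* arbitrary, easy to spot in a memory view */

/* Stand-in layout: guard words directly before and after the pointer of interest. */
struct {
    volatile uint32_t ulGuardBefore;
    void * volatile pxTaskPointer;   /* placeholder for pxCurrentTCB */
    volatile uint32_t ulGuardAfter;
} xGuarded = { GUARD_PATTERN, NULL, GUARD_PATTERN };

/* Returns true while neither guard word has been overwritten. */
bool prvGuardsIntact(void)
{
    return xGuarded.ulGuardBefore == GUARD_PATTERN &&
           xGuarded.ulGuardAfter == GUARD_PATTERN;
}
```

With GDB, a hardware watchpoint on one of the guard words (the `watch` command) would stop execution at the write that corrupts it, provided the corruption actually reaches the guards.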

ranwa · August 26, 2022, 11:52am

Thanks.
I’ve noticed that when I set the optimization level to -Og (optimize for debugging), the hard fault is “gone”.
What can explain this?
Ran

hs2 · August 26, 2022, 12:13pm

Very likely code problems such as undefined behaviour or a missing volatile qualifier where one is required; these are often revealed by the optimizer.
I guess -Og makes the program run and the problems occur with -O2 or -Os.
You can find many, many issues of this kind on the web.
Also, different optimizer / code generation flags produce different code, which usually differs with regard to stack usage, of course.
Did you define configASSERT and enable stack checking (as strongly recommended)?
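
To illustrate the missing-volatile case, here is a minimal sketch with made-up names (`xDataReady`, `vUartRxIsr`, `xConsumeIfReady`). Without `volatile`, the compiler is allowed to keep the flag in a register, so a polling loop may never observe the ISR's write at some optimization levels. Behaviour that changes with the -O flags is a typical symptom of this kind of bug.

```c
#include <stdbool.h>

/* Written from interrupt context and read from task context, hence volatile. */
volatile bool xDataReady = false;

/* Hypothetical ISR: only records that data has arrived. */
void vUartRxIsr(void)
{
    xDataReady = true;
}

/* Task-side check: returns true if the flag was set, and clears it. */
bool xConsumeIfReady(void)
{
    if (xDataReady) {
        xDataReady = false;
        return true;
    }
    return false;
}
```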

ranwa · August 26, 2022, 2:03pm

Hi,
configASSERT and stack checking are both enabled.
The hard fault usually happens when trying to access an address within the ucHeap address range.
I’ve tried analyzing the saved stack registers after the hard fault, but this leads me nowhere.
The only thing that works is running in release or with -Og,
meaning the program crashes only when running with no optimization.

Any ideas?
Ran

kanherea · August 26, 2022, 9:27pm

Hello Ran,

Did you try the method that Gaurav mentioned above? Can you set a data breakpoint?

- Aniruddha

aggarg · August 29, 2022, 5:28am

It is difficult without looking at the code. Would it be possible for you to create a minimal sample demonstrating the problem?

RAc · August 29, 2022, 6:08am

You mean running in DEBUG or -Og?

As a side note: -Og is NOT the same as -O0; it does not disable all optimizations (cf. the GCC docs).

It is also possible that the problem is still there without optimization, but that the timing conditions change, making the problem show up much less frequently.

Which memory manager are you using? Are you ensuring memory serialization correctly (either via pvPortMalloc() etc. using mutexes or by implementing malloc_lock())?
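
A rough sketch of what serialization means here, under loud assumptions: `vLockAllocator` and `vUnlockAllocator` are hypothetical stand-ins for whatever the port really uses (for example suspending the scheduler or taking a mutex), and `pvSerializedAlloc` / `vSerializedFree` are made-up wrappers around the C library allocator. The point is only that every allocator call from every task goes through the same lock/unlock pair.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real lock (scheduler suspend/resume or a mutex). */
int lLockDepth = 0;
void vLockAllocator(void)   { lLockDepth++; }
void vUnlockAllocator(void) { lLockDepth--; }

/* Allocation path: the allocator is only entered while the lock is held. */
void *pvSerializedAlloc(size_t xSize)
{
    void *pv;

    vLockAllocator();
    pv = malloc(xSize);
    vUnlockAllocator();
    return pv;
}

/* Matching release path, guarded by the same lock. */
void vSerializedFree(void *pv)
{
    vLockAllocator();
    free(pv);
    vUnlockAllocator();
}
```

A lock of this kind does not help if the allocator is also called from an interrupt handler, so that is worth ruling out separately.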

ranwa · August 31, 2022, 3:00pm

Hi,
The problem is that when I remove code segments, the problem also disappears.
I already tried the “comment out and test” path; it didn’t help.
I am going through the code with a fine-tooth comb now.

ranwa · August 31, 2022, 3:01pm

I am running with -Og and the problem goes away.
I am using pvPortMalloc from FreeRTOS.

ranwa · August 31, 2022, 3:03pm

It does not look like a data / unaligned access problem. I’ve inspected the CPU registers when the fault happens, and the LR points to a different function on every crash.
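
One thing worth checking when reading LR, sketched below with a made-up helper name: inside the fault handler the live LR register holds an EXC_RETURN code rather than a code address, and bit 2 of that code indicates whether the exception frame was pushed onto the process stack (PSP, used by FreeRTOS tasks) or the main stack (MSP). The LR of the interrupted code is the one stored in that frame.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of EXC_RETURN: 1 = frame is on the process stack, 0 = on the main stack. */
#define EXC_RETURN_SPSEL_BIT (1u << 2)

/* Hypothetical helper: true if the stacked registers must be read via PSP. */
bool xFrameIsOnPsp(uint32_t ulExcReturn)
{
    return (ulExcReturn & EXC_RETURN_SPSEL_BIT) != 0u;
}
```

For a fault taken while a task was running, the expected value is 0xFFFFFFFD (thread mode, PSP); 0xFFFFFFF9 and 0xFFFFFFF1 both indicate a frame on MSP.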

RAc · August 31, 2022, 3:27pm

But which heap manager?

Again, you do not know whether the problem goes away or just shows much less frequently unless you know what it is.

ranwa · August 31, 2022, 5:09pm

Heap 4.
I agree, I can’t really tell whether it goes away or the timing is just different.
