Debugging and trying to print stack call trace for a crash - https://elmagnifico.tech/2017/07/27/CmBacktrace/

We are trying to analyze an application crash on a Renesas MCU using FreeRTOS. I stumbled on CmBacktrace/cm_backtrace.c at master · armink-rtt-pkgs/CmBacktrace · GitHub, where it calls the FreeRTOS API vTaskStackAddr() [https://community.nxp.com/pwmxy87654/attachments/pwmxy87654/S32K/29035/1/tasks.c]

I am not sure whether I am referring to the correct sources/findings, but those were my observations from a Google search.

If I am on the right track, please guide me on how I can integrate https://community.nxp.com/pwmxy87654/attachments/pwmxy87654/S32K/29035/1/tasks.c into our code and into CmBacktrace (elmagnifico's blog).

It uses code from FreeRTOS when CMB_OS_PLATFORM_TYPE == CMB_OS_PLATFORM_FREERTOS.

Please guide - the objective is to get debug information by printing a stack call trace after a crash.

What is your question? If you are uncertain where to begin, the key steps are most likely to hook CmBacktrace's fault handler into your interrupt vector table (e.g. HardFault_Handler). You might also need to call CmBacktrace's init function (as early as possible). I don’t have experience with CmBacktrace, but we have similar crash-debugging features (and more) in Percepio Detect (see Percepio Detect - Percepio).

While I can’t contribute to the question at hand, let me comment on the usefulness of stack traces in general:

  1. Even the best, most expensive and best-reputed debuggers have serious difficulties providing useful stack backtraces. This holds particularly true for ARM Cortex MCUs and rests on the fact that compilers can (and typically do) optimize the code to a degree that at times leaves few recognizable relationships between the source and the executing code. There is loop unrolling, code inlining, tail recursion optimization and many other under-the-hood modifications that significantly improve run-time performance but in turn also affect control flow.

  2. A corollary of this is that an unoptimized build (which gives you the highest probability of a useful stack back trace) may produce an executable so different from an optimized one that you may not be able to reproduce your original fault and/or encounter different issues.

  3. (iterated only for about the millionth time): In RTOS development, chasing the symptom of a crash (i.e. studying the faulted system state) very rarely helps you pinpoint the problem. Typically the root cause occurred possibly tens of thousands of cycles prior to the actual crash, so you need to adjust your thinking toward rounding up the usual suspects (stack overflows, illegal memory writes, violation of ISR rules, etc.). Normally things like deep instruction traces, hardware breakpoints or MPU-supported memory watches get you to the heart of your fault a bazillion times faster than trying to figure out the system state at symptom time. Frequently, due to concurrency, the same root cause manifests itself differently every time, which means that if you, say, add diagnostic code to task A after determining that there was a crash in A, the next time you may crash in B, so all the work you put into the diagnostics was in vain.

  4. I did not look too closely at the code you refer to, but to me it seems as if it requires you to splice custom code into your code base to support the tracing. Be aware that this again is subject to the Heisenberg effect, i.e. it changes the run-time behavior of your system. You may find yourself chasing a bug in that suite itself, or the bug you are trying to chase may in turn render the utility’s results unusable.

Best of luck, please let us know if you got that to work and whether it eventually helped you. I am not holding my breath that it does, but the world is full of surprises.

[quote=“RAc, post:3, topic:24884”]different issues.[/quote]

I would like to know whether it is possible to use Percepio Detect on custom-tailored embedded PCB boards. Our embedded board runs on another main PCB board and has Ethernet I/O as its only interface. So how do I test my PCB using Percepio Detect? We are using the Renesas R9A07G084M04GBGBC0 MCU.

What interface does one's embedded board require in order to use Percepio Detect?

Hi

So, I have not seen this particular implementation of a stack backtrace library before, so I can't help you with that. However, thanks for the link. It is always good to find new tools.

A few years ago I spent some time trying to integrate red-rocket-computing's library, found here: GitHub - red-rocket-computing/backtrace: Embedded Backtrace and Stack Unwinder for ARM Cortex-M. If you fail with CmBacktrace, you might want to try that one.

Regards,

Martin

I have not used CmBacktrace before, but this file seems like a copy of FreeRTOS's tasks.c with some modifications. Try replacing your tasks.c file with this one. Looking at the license header, this file is based on FreeRTOS version 10.4.6, so it is best to use FreeRTOS version 10.4.6 and then replace tasks.c.

Just an observation:

When dealing with a mystery crash in FreeRTOS in a system that makes heavy use of dynamic memory allocation and does not use static allocation for the FreeRTOS objects (tasks, queues, stacks, etc.), and you get meaningless stack traces, I have often found that the saved context for one of the tasks has been corrupted and the crash occurs at the end of the context switch. Because the stack pointer is often invalid at that point, the registers and traceback have no relation to the code that caused the corruption.

[edit]

Many years ago, in the FreeRTOS 9 era, I added some code to the context switch to do a CRC32 calculation on a task's stack and context after it had been saved, then check it before restoring it, and also to keep a trace buffer with the last N context switches and the return addresses from the call the ISR makes to notify FreeRTOS that a switch has happened. It helped me find what was overwriting the saved context, but it didn't identify the code itself; it just gave me a list of suspects.

I have since lost that code. Before doing it in code, I tried to do it through debugger scripting, but my GDB scripting chops were not up to the challenge.


Thanks for all the information. Recently I have also been trying to extract debug information using the DWARF mechanism, but have not succeeded so far.

First, I need to find the exact ARM DWARF library that supports the Renesas R9A07G084M04GBGBC0 MCU.

Next, I need to integrate the same mechanism into our application - there is very little information on either topic.

Where can we find libdwarf code for the Renesas R9A07G084M04GBGBC0 MCU? And how do we use it?

Typically, the DWARF debug information is not loaded onto the target system when running FreeRTOS natively, but is used by the debugger from a copy of the ELF image file. Having a DWARF library on the execution target is generally not useful.

Getting an annotated traceback with function names and source lines is very unusual on the target itself.

I agree, but when an issue occurs at a client site or on a target system, how do we identify where the issue propagated from? How do we get such a traceback with function names?

Generally you don't; at best you get a traceback with addresses that you look up in the map file from that release of the software.

If it is a big system, maybe you can make a core dump that you can examine, but even that is unusual for an embedded system.

The point is this isn’t a “*nix” level system.

Thanks @richard-damon - that’s exactly what I am looking for

a) core dump
b) get a traceback with addresses

Can you please let me know how to achieve the same?

Thanks in advance

To get a core dump, you just need to program your "crash" trap to copy all of memory (or all the memory you are interested in) into some programmable medium whose contents you can then transmit to be analyzed. That tends to require that the system has a block of flash memory big enough to do this, which is not very common.

As for getting a traceback, you just need to write your crash handler to understand your processor's ABI for building stack frames (and hoping that the point where this happens has a normal stack frame). You will at least know the address that went into the trap handler, so you can look that up in your linker map.

The point is that you aren’t running under a big OS that supplies all these for you.

And at this point, the OP may wish to go back to my initial remarks about the value of stack backtraces.

It should also be noted that Richard’s remark “and hoping that the point where this happens has a normal stack frame” is very appropriate. That hope will remain unfulfilled many more times than one would wish.