Determining the cause of stack overflow and fixing it

I am getting a stack overflow and I’m trying to determine ways around it. I’m using nRF52840.

  • How would you calculate how much stack has been consumed in total? In other words, by how much did I overshoot?

I have this hook function, but I don’t get any valuable info out of it. uxHighWaterMark is 0 and I can’t see a proper task name in the debugger.

extern "C"
 {
    void vApplicationStackOverflowHook( TaskHandle_t xTask, signed char *pcTaskName )
    {
        ( void ) pcTaskName;
        ( void ) xTask;

        UBaseType_t uxHighWaterMark = uxTaskGetStackHighWaterMark(NULL);
    
        for( ;; ); 
    }
}

Currently, I only have 2 tasks, each occupying 4 * 100 bytes of stack:

xTaskCreate(SystemTask::Process, "Run", 100, this, 0, &mTaskHandle);
xTaskCreate(MCP9808::Process, "Process", 100, this, 0, &mTaskHandle);

100 Words is somewhat small unless you know the task isn’t using much stack.

You likely want to store the handles you get into different variables, so that you can tell which one is which. Then, when you get the overflow, you can compare the handle to your tasks to see which one it is. Likely you are going over so far as to wipe out the name in the TCB.

Yeah mTaskHandle is specific to a class; one is in SystemTask and the other MCP9808 so they’re stored in their own entities.

Increasing the stack depth to 300 resulted in no memory error.

What platform are you using? Most Cortex-M series debug pods support DWT breakpoints, so you can set a breakpoint that hits when a write to a specific location near the top of the stack occurs, which tends to be a big help. It may not work immediately, as there is a chance that the location you choose is in a gap that is not directly written to, but generally after a few attempts the breakpoint gets you right to the culprit.

Cortex M, yes. Where can I enable the breakpoints?

Hi @MasterSil,
Are you using SEGGER Embedded Studio? Refer to the debugging page on the Nordic website for detailed instructions.

Thanks.

Yes, making sure that you have sufficient stack for each task without wasting RAM that may be needed for something else is a perennial issue. Here are some approaches we use:

  1. In your project, include a crash handler that writes useful data (e.g. task name, nature of fault, and a number of words from the current stack) to non-volatile memory before resetting. Provide a means to read back that non-volatile memory (this can be through a user interface if the project has one, or using a debugger). This can be used for hard faults, FreeRTOS assertion failures, and other unrecoverable situations.
  2. Enable stack overflow checking in FreeRTOS. This is by no means perfect but it will catch some stack overflows. Have a stack check failure go to the crash handler.
  3. If the CPU has a stack limit register (e.g. most ARM Cortex M33 CPUs) then make sure that your FreeRTOS port makes use of it.
  4. Provide a means of reporting the amount of free stack for each task, or recording the minimum free stack seen. We write a specific pattern to all unused memory, then we have a function that iterates through all the tasks, measuring how much memory there is at the end of the stack that still contains the pattern.
  5. Some compilers (including gcc, with the -fstack-usage option) can output a file giving the stack used by each function. This doesn’t take account of called functions, so you have to trace the call tree and add up the stack used by each nested call to get the total required.
  6. It’s unlikely that your real-time tasks use recursive functions, but if you have any tasks that do call recursive functions, and you can’t place an upper bound on the recursion depth, then additional precautions are needed. One of our applications falls into this category because it includes a language parser. In this case you need to check the amount of free stack in at least one point of each possible cycle of recursion, and if it isn’t sufficient for a cycle (see #5 for how to determine this), gracefully terminate the operation concerned.
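
The pattern-based measurement in item 4 can be sketched roughly as follows. This is a host-side illustration under stated assumptions, not the actual implementation described above: the fill value and the names `kFillPattern`, `PaintStack` and `FreeStackWords` are made up for the example, and the stack is assumed to grow downward (as on Cortex-M), so the untouched words sit at the low-address end of the buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Arbitrary fill value chosen for this sketch.
constexpr std::uint32_t kFillPattern = 0xA5A5A5A5u;

// Paint the whole stack buffer before the task runs for the first time.
void PaintStack(std::uint32_t *stack, std::size_t words)
{
    for (std::size_t i = 0; i < words; ++i) {
        stack[i] = kFillPattern;
    }
}

// Count the words that still hold the pattern, starting at the lowest
// address. With a descending stack these are the words never written to,
// i.e. the minimum free stack seen so far.
std::size_t FreeStackWords(const std::uint32_t *stack, std::size_t words)
{
    std::size_t freeWords = 0;
    while (freeWords < words && stack[freeWords] == kFillPattern) {
        ++freeWords;
    }
    return freeWords;
}
```

A result of 0 means every word has been written at some point, so the task has used its whole stack and may well have gone past the end of it.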

HTH David

Since you are already using Embedded Studio this could be of interest for you:

That might require some changes in the FreeRTOS Cortex-M code:
https://wiki.segger.com/Stack_Overflow_Prevention

Maybe I can help with that since I already implemented it for embOS.

One of my tasks is named “Process”:
xTaskCreate(A::Process, "Process", 100, this, 0, &mTaskHandle);

but buff here contains Pro. Why could that be?

extern "C"
 {
    void vApplicationStackOverflowHook( TaskHandle_t xTask, signed char *pcTaskName )
    {
        ( void ) pcTaskName;
        ( void ) xTask;

        char buff[100] = {0};
        strncpy(buff, (const char *)pcTaskName, sizeof(buff) - 1); // buff = "Pro"

        UBaseType_t uxHighWaterMark = uxTaskGetStackHighWaterMark(NULL);
    
        while(1);
    }
}

What do you have configMAX_TASK_NAME_LEN set to?

The name gets copied into a fixed length buffer in the TCB.

In FreeRTOSConfig.h

#define configMAX_TASK_NAME_LEN                                                   ( 4 )

So, that is the expected result, the name is truncated to the value specified.

Thanks.
Is the way to tackle this situation to take a trial-and-error approach? If a stack depth of 100 is too small, slowly increment the value until you no longer run into the issue? Or, of course, try using smaller arrays and fewer nested calls?

You can try to estimate the usage by looking at the tasks and what they use as local variables. I tend to start with an over-estimate and then scale down, and perhaps then use the extra space to increase buffers for performance.

If you run your system for a while, and then check the high water mark for all your tasks, you can see where you gave too much extra.

The high water mark seems to always return 0, probably implying there’s nothing left on the stack, no?

High-water mark of zero means that you are using ALL your stack, so I would make it bigger.

Yeah. Once I increased from 100 to 200 stack depth, I stopped running into the issue.
Where else would you read the water mark to see how much stack is left?

In most of my applications, I include a routine that calls uxTaskGetSystemState() to get the information for all the tasks and sends a report to the “debug console”. I then activate that routine after the system has been exercised for all functions. Tasks close to their limit may be given a bit more stack, and any task with a lot of extra might be reduced, unless there could be cases where it needs more.

Where do you call this uxTaskGetSystemState() routine? In FreeRTOS, tasks only start running after the scheduler is started, and once the scheduler has started, execution won’t go past that call in main. Where should I call it? In the tasks themselves?

Yes, you call it from a task. I tend to have either a command interface task as part of the system interface taking commands from an external serial port, so I add a command to do this operation, or I include a supplemental debugging task, connected to a debugging serial port with that capability.