Determining the cause of stack overflow and fixing it

MasterSil · October 17, 2023, 5:26pm

I am getting a stack overflow and I’m trying to determine ways around it. I’m using an nRF52840.

How would you calculate how much stack has been consumed in total? In other words, by how much did I overshoot?

I have this hook function, but I don’t get any valuable info out of it. uxHighWaterMark is 0 and I can’t see a proper task name in the debugger.

extern "C"
 {
    void vApplicationStackOverflowHook( TaskHandle_t xTask, signed char *pcTaskName )
    {
        ( void ) pcTaskName;
        ( void ) xTask;

        UBaseType_t uxHighWaterMark = uxTaskGetStackHighWaterMark(NULL);
    
        for( ;; ); 
    }
}

Currently I only have two tasks, each with a stack depth of 100 words (4 * 100 = 400 bytes):

xTaskCreate(SystemTask::Process, "Run", 100, this, 0, &mTaskHandle);
xTaskCreate(MCP9808::Process, "Process", 100, this, 0, &mTaskHandle);

richard-damon · October 17, 2023, 5:47pm

100 words is somewhat small unless you know the task isn’t using much stack.

You likely want to store the handles you get into different variables, so that you can tell which one is which. Then, when you get the overflow, you can compare the handle to your tasks to see which one it is. Likely you are going over so far as to wipe out the name in the TCB.
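
The handle comparison suggested above could look roughly like the sketch below. It is written to build on a host, so TaskHandle_t is a stand-in typedef rather than the real FreeRTOS type, and the handle variables and identify_task() are hypothetical names, not anything from the original code.

```c
#include <stddef.h>

/* Stand-in so the sketch builds on a host. On target, include
   FreeRTOS.h and task.h instead of declaring this typedef. */
typedef void *TaskHandle_t;

/* One handle variable per task (hypothetical names). Each would be
   filled in by the last argument of its xTaskCreate() call. */
TaskHandle_t xSystemTaskHandle = NULL;
TaskHandle_t xSensorTaskHandle = NULL;

typedef enum { TASK_UNKNOWN, TASK_SYSTEM, TASK_SENSOR } TaskId;

/* Map a handle to a known task without touching the TCB, whose
   name field may already have been overwritten by the overflow. */
TaskId identify_task(TaskHandle_t xTask)
{
    if (xTask == NULL) {
        return TASK_UNKNOWN;
    }
    if (xTask == xSystemTaskHandle) {
        return TASK_SYSTEM;
    }
    if (xTask == xSensorTaskHandle) {
        return TASK_SENSOR;
    }
    return TASK_UNKNOWN;
}
```

Inside vApplicationStackOverflowHook() one would call identify_task(xTask) and inspect the result in the debugger, instead of relying on pcTaskName.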

MasterSil · October 17, 2023, 6:32pm

Yeah, mTaskHandle is a class member; one is in SystemTask and the other in MCP9808, so each handle is stored in its own object.

Increasing the stack depth to 300 resulted in no memory error.

RAc · October 17, 2023, 8:44pm

What platform are you using? Most Cortex-M series debug pods support DWT breakpoints, so you can set a breakpoint to hit when a write to a specific location near the top of the stack occurs; that tends to be a big help. It may not work immediately, as there is a chance that the location you choose is in a gap that is not directly written to, but generally after a few attempts the breakpoint gets you right to the culprit.

MasterSil · October 17, 2023, 9:00pm

Cortex-M, yes. Where can I enable the breakpoints?

ActoryOu · October 18, 2023, 3:40am

Hi @MasterSil,
Are you using SEGGER Embedded Studio? Refer to the debugging page on the Nordic website for detailed instructions.

Thanks.

dc42 · October 19, 2023, 12:46pm

Yes, making sure that you have sufficient stack for each task without wasting RAM that may be needed for something else is a perennial issue. Here are some approaches we use:

1. In your project, include a crash handler that writes useful data (e.g. task name, nature of fault, and a number of words from the current stack) to non-volatile memory before resetting. Provide a means to read back that non-volatile memory (this can be through a user interface if the project has one, or using a debugger). This can be used for hard faults, FreeRTOS assertion failures, and other unrecoverable situations.
2. Enable stack overflow checking in FreeRTOS. This is by no means perfect but it will catch some stack overflows. Have a stack check failure go to the crash handler.
3. If the CPU has a stack limit register (e.g. most ARM Cortex M33 CPUs) then make sure that your FreeRTOS port makes use of it.
4. Provide a means of reporting the amount of free stack for each task, or recording the minimum free stack seen. We write a specific pattern to all unused memory, then we have a function that iterates through all the tasks, measuring how much memory there is at the end of the stack that still contains the pattern.
5. Some compilers (including gcc) can output a file giving the stack used by each function. This doesn’t take account of called functions, so you have to trace the call tree and add up the stack used by each nested call to get the total required.
6. It’s unlikely that your real-time tasks use recursive functions, but if you have any tasks that do call recursive functions, and you can’t place an upper bound on the recursion depth, then additional precautions are needed. One of our applications falls into this category because it includes a language parser. In this case you need to check the amount of free stack in at least one point of each possible cycle of recursion, and if it isn’t sufficient for a cycle (see #5 for how to determine this), gracefully terminate the operation concerned.

HTH David
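
The pattern-fill measurement described in the list above can be sketched on a host as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the stack is assumed to grow downward (as on Cortex-M) so unused memory sits at the low-address end, and 0xA5 is chosen because it is the fill byte FreeRTOS itself uses for task stacks.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill value for never-used stack memory (FreeRTOS uses 0xA5 too). */
#define STACK_FILL_BYTE 0xA5u

/* Paint the whole stack region before the task runs for the first time. */
void stack_fill(uint8_t *stack, size_t size_bytes)
{
    memset(stack, STACK_FILL_BYTE, size_bytes);
}

/* Count bytes that still hold the pattern, scanning upward from the
   lowest address. With a descending stack this is the minimum free
   space seen so far, i.e. the high-water mark in bytes. */
size_t stack_unused_bytes(const uint8_t *stack, size_t size_bytes)
{
    size_t unused = 0;
    while (unused < size_bytes && stack[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

A result of 0 only says the task reached, or went past, the end of its stack. It cannot say how far past, which is why the high-water mark alone does not answer the "how much did I overshoot" question.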

Til · October 19, 2023, 1:03pm

Since you are already using Embedded Studio this could be of interest for you:
https://wiki.segger.com/Stack_Overflow_Prevention

That might require some changes in the FreeRTOS Cortex-M code.

Maybe I can help with that since I already implemented it for embOS.

MasterSil · October 19, 2023, 4:54pm

One of my tasks is named “Process”:
xTaskCreate(A::Process, "Process", 100, this, 0, &mTaskHandle);

but buff here contains "Pro". Why could that be?

extern "C"
 {
    void vApplicationStackOverflowHook( TaskHandle_t xTask, signed char *pcTaskName )
    {
        ( void ) pcTaskName;
        ( void ) xTask;

        char buff[100] = {0};
        strncpy(buff, (const char *)pcTaskName, sizeof(buff) - 1); // buff = "Pro"

        UBaseType_t uxHighWaterMark = uxTaskGetStackHighWaterMark(NULL);
    
        while(1);
    }
}

richard-damon · October 19, 2023, 5:56pm

What do you have configMAX_TASK_NAME_LEN set to?

The name gets copied into a fixed length buffer in the TCB.

MasterSil · October 19, 2023, 6:09pm

In FreeRTOSConfig.h

#define configMAX_TASK_NAME_LEN                                                   ( 4 )

richard-damon · October 19, 2023, 6:27pm

So, that is the expected result: the name is truncated to the length specified, and that length includes the terminating NUL, so only three characters survive.
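
As a rough host-side illustration of why "Process" shows up as "Pro", the sketch below mimics copying a name into a fixed 4-byte buffer that must also hold the terminating NUL. copy_task_name() and MAX_TASK_NAME_LEN are hypothetical stand-ins; this is not the kernel's own code, only the same idea.

```c
#include <stddef.h>

/* Mirrors the configMAX_TASK_NAME_LEN value quoted in the thread. */
#define MAX_TASK_NAME_LEN 4u

/* Copy at most MAX_TASK_NAME_LEN - 1 characters and always terminate,
   so the buffer holds 3 visible characters plus the NUL. */
void copy_task_name(char dst[MAX_TASK_NAME_LEN], const char *src)
{
    size_t i;
    for (i = 0; i < MAX_TASK_NAME_LEN - 1u && src[i] != '\0'; i++) {
        dst[i] = src[i];
    }
    dst[i] = '\0';
}
```

With this limit, "Run" fits exactly while "Process" is cut to "Pro". Raising configMAX_TASK_NAME_LEN in FreeRTOSConfig.h keeps longer names, at the cost of a few more bytes per TCB.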

MasterSil · October 19, 2023, 7:30pm

Thanks.
Is trial and error the way to tackle this situation? If a stack depth of 100 is too small, do you slowly increase the value until you no longer run into the issue? Or, of course, try using smaller arrays and fewer nested calls?

richard-damon · October 19, 2023, 7:34pm

You can try to estimate the usage by looking at the tasks and what they use as local variables. I tend to start with an over-estimate and then scale down, and perhaps then use the extra space to increase buffers for performance.

If you run your system for a while, and then check the high water mark for all your tasks, you can see where you gave too much extra.

MasterSil · October 19, 2023, 7:45pm

The high-water mark always seems to return 0, probably implying there’s nothing left on the stack, no?

richard-damon · October 19, 2023, 7:55pm

High-water mark of zero means that you are using ALL your stack, so I would make it bigger.

MasterSil · October 19, 2023, 8:04pm

Yeah. Once I increased the stack depth from 100 to 200, I stopped running into the issue.
Where else would you read the water mark to see how much stack is left?

richard-damon · October 19, 2023, 9:06pm

In most of my applications, I include a routine that calls uxTaskGetSystemState() to get the information for all the tasks and sends a report to the “debug console”, and I then activate that routine after the system has been exercised for all functions. Tasks close to their limit may be given a bit more stack, and any task with a lot of extra might be reduced, unless there could be cases where it needs more.
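
The review step described above (more stack for tasks near their limit, less for tasks with a lot to spare) might be sketched like this. It runs on a host with made-up numbers; on target the high-water values would come from the usStackHighWaterMark field that uxTaskGetSystemState() fills in, which needs configUSE_TRACE_FACILITY set to 1. The StackSample struct, the function names, and the 10% and 50% thresholds are all assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: one entry per task. */
typedef struct {
    const char *name;
    uint16_t high_water_words; /* minimum free stack ever seen, in words */
    uint16_t depth_words;      /* depth that was passed to xTaskCreate() */
} StackSample;

typedef enum { STACK_TIGHT, STACK_OK, STACK_GENEROUS } StackVerdict;

/* Classify a task by the share of its stack that was never used.
   The 10% and 50% limits are arbitrary example thresholds. */
StackVerdict judge_stack(const StackSample *s)
{
    if (s->depth_words == 0u) {
        return STACK_TIGHT;
    }
    uint32_t free_pct = (100u * (uint32_t)s->high_water_words) / s->depth_words;
    if (free_pct < 10u) {
        return STACK_TIGHT;
    }
    if (free_pct > 50u) {
        return STACK_GENEROUS;
    }
    return STACK_OK;
}

/* Print one line per task, as a debug-console report might. */
void print_stack_report(const StackSample *samples, size_t count)
{
    static const char *const labels[] = { "tight", "ok", "generous" };
    for (size_t i = 0; i < count; i++) {
        printf("%-8s free=%u/%u words (%s)\n", samples[i].name,
               (unsigned)samples[i].high_water_words,
               (unsigned)samples[i].depth_words,
               labels[judge_stack(&samples[i])]);
    }
}
```

A task reporting 0 free words, like the ones in this thread, lands in the "tight" bucket and should get a larger stack before anything else is tuned.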

MasterSil · October 20, 2023, 4:18am

Where do you call this uxTaskGetSystemState() routine? In FreeRTOS, tasks only start running after the scheduler is started, and once the scheduler has started, execution won’t go past that call (in main). Where should I call it? In the tasks themselves?

richard-damon · October 20, 2023, 9:36am

Yes, you call it from a task. I tend to have either a command interface task as part of the system interface taking commands from an external serial port, so I add a command to do this operation, or I include a supplemental debugging task, connected to a debugging serial port with that capability.
