100 words is a fairly small stack unless you know the task uses very little.
You likely want to store the handles you get in separate variables, so that you can tell which one is which. Then, when you get the overflow, you can compare the reported handle against your tasks to see which one it is. You are probably overflowing far enough to wipe out the task name in the TCB.
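A minimal sketch of that idea, assuming two hypothetical tasks whose handles (`xSensorTask`, `xCommsTask`) were saved through the last argument of `xTaskCreate()`. The hook compares the handle it receives instead of trusting the name string. The `TaskHandle_t` typedef is only a stand-in so the sketch builds off-target, and the exact hook prototype varies a little between FreeRTOS versions.

```c
#include <stddef.h>

/* Stand-in so this sketch builds off-target; in a real project include
 * "FreeRTOS.h" and "task.h" instead and remove this typedef. */
typedef void *TaskHandle_t;

/* Hypothetical handles, filled in through the last argument of xTaskCreate(). */
TaskHandle_t xSensorTask = NULL;
TaskHandle_t xCommsTask  = NULL;

/* Records which task overflowed; inspect it from the debugger. */
const char *volatile pcOverflowedTask = NULL;

/* Map a handle to a label that is a string literal, so it does not depend
 * on the (possibly corrupted) name stored in the TCB. */
static const char *prvIdentifyTask(TaskHandle_t xTask)
{
    if (xTask == NULL)        return "unknown";
    if (xTask == xSensorTask) return "sensor";
    if (xTask == xCommsTask)  return "comms";
    return "unknown";
}

/* Called by the kernel when configCHECK_FOR_STACK_OVERFLOW is 1 or 2. */
void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName)
{
    (void)pcTaskName;   /* may be garbage after a large overflow */
    pcOverflowedTask = prvIdentifyTask(xTask);
    /* On target you would normally not return from here: disable
     * interrupts and spin, or hand over to a crash handler. */
}
```

With a breakpoint in the hook, `pcOverflowedTask` tells you which task to look at even when `pcTaskName` is unreadable.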
What platform are you using? Most Cortex-M debug pods support DWT breakpoints, so you can set a breakpoint to hit when a write to a specific location near the top of the stack occurs. That tends to be a big help. It may not work immediately, as there is a chance that the location you choose is in a gap that is never written to directly, but generally after a few attempts the breakpoint gets you right to the culprit.
Yes, making sure that you have sufficient stack for each task without wasting RAM that may be needed for something else is a perennial issue. Here are some approaches we use:
1. In your project, include a crash handler that writes useful data (e.g. task name, nature of fault, and a number of words from the current stack) to non-volatile memory before resetting. Provide a means to read back that non-volatile memory (this can be through a user interface if the project has one, or using a debugger). This can be used for hard faults, FreeRTOS assertion failures, and other unrecoverable situations.
2. Enable stack overflow checking in FreeRTOS. This is by no means perfect but it will catch some stack overflows. Have a stack check failure go to the crash handler.
3. If the CPU has a stack limit register (e.g. most ARM Cortex M33 CPUs) then make sure that your FreeRTOS port makes use of it.
4. Provide a means of reporting the amount of free stack for each task, or recording the minimum free stack seen. We write a specific pattern to all unused memory, then we have a function that iterates through all the tasks, measuring how much memory there is at the end of the stack that still contains the pattern.
5. Some compilers (including gcc) can output a file giving the stack used by each function. This doesn’t take account of called functions, so you have to trace the call tree and add up the stack used by each nested call to get the total required.
6. It’s unlikely that your real-time tasks use recursive functions, but if you have any tasks that do call recursive functions, and you can’t place an upper bound on the recursion depth, then additional precautions are needed. One of our applications falls into this category because it includes a language parser. In this case you need to check the amount of free stack in at least one point of each possible cycle of recursion, and if it isn’t sufficient for a cycle (see #5 for how to determine this), gracefully terminate the operation concerned.
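The pattern-based free-stack measurement described above can be sketched roughly as follows. This is an illustration, not the poster's actual code: the function names and the fill value are hypothetical, and it assumes a descending stack whose lowest address is `base`. FreeRTOS's own `uxTaskGetStackHighWaterMark()` works on the same principle.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_WORD 0xA5A5A5A5u   /* hypothetical fill pattern */

/* Fill a stack area with the pattern; do this before the task first runs. */
void stack_fill(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL_WORD;
    }
}

/* Count the words that still hold the pattern, scanning from the far end
 * of a descending stack (its lowest address) towards the region in use.
 * The result is the minimum free stack seen so far, in words. */
size_t stack_free_words(const uint32_t *base, size_t words)
{
    size_t n = 0;
    while (n < words && base[n] == STACK_FILL_WORD) {
        n++;
    }
    return n;
}
```

The scan stops at the first overwritten word, so a gap of untouched words higher up (for example inside a large local array that was never fully written) is not counted as free. A task that happens to store the fill value at its deepest point would make the figure slightly optimistic, so leave some margin.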
Thanks.
So is the way to tackle this situation a trial-and-error approach? If a stack depth of 100 is too small, slowly increase the value until you no longer run into the issue? Or, of course, try using smaller arrays and fewer nested calls?
You can try to estimate the usage by looking at the tasks and what they use as local variables. I tend to start with an over-estimate and then scale down, and perhaps then use the extra space to increase buffers for performance.
If you run your system for a while, and then check the high water mark for all your tasks, you can see where you gave too much extra.
Yeah. Once I increased from 100 to 200 stack depth, I stopped running into the issue.
Where else would you read the water mark to see how much stack is left?
In most of my applications, I include a routine that calls uxTaskGetSystemState() to get the information for all the tasks and sends a report to the “debug console”, and I then activate that routine after the system has been exercised for all functions. Tasks close to their limit may be given a bit more stack, and any task with a lot of extra might be reduced, unless there could be cases where it needs more.
Where do you call this uxTaskGetSystemState() routine? In FreeRTOS, tasks only start running once the scheduler has been started, and after that, execution never gets past that call in main(). Where should I call it? In the tasks themselves?
Yes, you call it from a task. I tend to have either a command interface task as part of the system interface taking commands from an external serial port, so I add a command to do this operation, or I include a supplemental debugging task, connected to a debugging serial port with that capability.
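A rough sketch of such a report routine, under some assumptions: the `TaskStatus_t` below is a cut-down stand-in with only the two fields used here so that the formatting code builds off-target, and `format_stack_report`, `STACK_WARN_WORDS` and `MAX_TASKS` are hypothetical names. On target the array would be filled by `uxTaskGetSystemState()`, which requires `configUSE_TRACE_FACILITY` set to 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in with only the fields used here; in a real project include
 * "FreeRTOS.h" and "task.h" and use the real TaskStatus_t instead. */
typedef struct {
    const char *pcTaskName;
    uint16_t    usStackHighWaterMark;  /* minimum free stack seen, in words */
} TaskStatus_t;

#define STACK_WARN_WORDS 20u   /* hypothetical "too close" threshold */

/* Write one line per task into pcBuf and return how many tasks are
 * below the warning threshold. Output is truncated if pcBuf is too small. */
size_t format_stack_report(const TaskStatus_t *pxTasks, size_t uxCount,
                           char *pcBuf, size_t uxBufLen)
{
    size_t uxLow = 0;
    size_t uxUsed = 0;

    if (uxBufLen > 0) {
        pcBuf[0] = '\0';
    }
    for (size_t i = 0; i < uxCount; i++) {
        int isLow = pxTasks[i].usStackHighWaterMark < STACK_WARN_WORDS;
        if (isLow) {
            uxLow++;
        }
        if (uxUsed < uxBufLen) {
            int n = snprintf(pcBuf + uxUsed, uxBufLen - uxUsed, "%-16s %5u%s\n",
                             pxTasks[i].pcTaskName,
                             (unsigned)pxTasks[i].usStackHighWaterMark,
                             isLow ? " LOW" : "");
            if (n > 0) {
                uxUsed += (size_t)n;
            }
        }
    }
    return uxLow;
}

/* On target, the command or debug task would gather the data first:
 *
 *     static TaskStatus_t xStatus[MAX_TASKS];
 *     UBaseType_t n = uxTaskGetSystemState(xStatus, MAX_TASKS, NULL);
 *     format_stack_report(xStatus, n, buf, sizeof buf);
 *     (then send buf out of the debug serial port)
 */
```

For a single task, `uxTaskGetStackHighWaterMark()` returns the same figure without needing the trace facility; passing `NULL` as the handle queries the calling task.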