I’ve been working on a very annoying bug for a while now, and finally had a bit of a breakthrough tonight. I’ve been getting usage faults along with other weird behavior in one task. It looks like something is corrupting that task’s saved PC (or maybe its stack as a whole) while the task is not running, so bad things happen when a context switch resumes it.
I found this by breaking on the entry of vTaskSwitchContext() and stepping through it to the end. I have configUSE_NEWLIB_REENTRANT enabled, so the last line is:
/* Switch Newlib’s _impure_ptr variable to point to the _reent
structure specific to this task. */
_impure_ptr = &( pxCurrentTCB->xNewLib_reent );
and it is after this line that my fault fires.
Where can I find the stack pointer for each task so that I can keep an eye on the stack while other tasks are running? Also, when a task is not running, how far deep in the stack is the PC stored?
This excellent walkthrough doesn’t explicitly state how many bytes/words get pushed on the stack during a context switch, but it does say that the kernel tracks the SP, just not where.
It depends on the “Port” you are using. In every port there will be a function to restore the context of the currently selected task (which will be called just after the scheduler determines which task that will be). Normally that routine will pop off all the registers from the stack, and as a last step pop the PC and go to that address.
Now, if that statement is what causes the fault, my guess is it isn’t the “PC” of the task that has a problem, but that your pxCurrentTCB has gotten a bad value.
What’s the function name? I’m using 10.0.0 for an M4 processor.
portRESTORE_CONTEXT didn’t turn up anything.
I think something is clobbering the task’s stack after it switches out, but should know for sure if I can find the portion with the PC in it. While a task is not active, is the value pointed to by pxTopOfStack in the TCB still the actual top of stack?
For Cortex-M processors, the PendSV function will save the context of the previous task, call vTaskSwitchContext, then restore the context from the newly selected task.
Thanks.
Does each task’s pxTopOfStack value only get updated on a context swap? If so, should I be monitoring the processor’s SP while the task is running to verify that it doesn’t blow through the top?
I assume that cases where the stack grows past the limit during operation but is back under the limit for the context switch will not be caught by the taskCHECK_FOR_STACK_OVERFLOW macro, right?
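For what it’s worth, the usual way to catch a transient overflow like that is to look at how much of the stack’s fill pattern has been overwritten rather than where the SP is at switch time. As far as I know, configCHECK_FOR_STACK_OVERFLOW set to 1 only checks the stack pointer at the context switch, while setting it to 2 also checks that the end of the stack still holds the fill pattern, and uxTaskGetStackHighWaterMark() reports the minimum free space seen so far. Below is a minimal host-side sketch of the same idea. The function name is hypothetical, and the 0xA5 fill value is an assumption that should match tskSTACK_FILL_BYTE in your kernel version.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match the kernel's stack fill value (tskSTACK_FILL_BYTE). */
#define STACK_FILL_BYTE 0xA5u

/*
 * Hypothetical helper: count how many 32-bit words at the low end of a
 * descending stack still hold the fill pattern, i.e. were never written.
 * stack_low is the lowest address of the stack buffer (the overflow end).
 */
static size_t unused_stack_words(const uint8_t *stack_low, size_t stack_bytes)
{
    size_t untouched = 0;

    /* Walk up from the limit until the first byte that was overwritten. */
    while (untouched < stack_bytes && stack_low[untouched] == STACK_FILL_BYTE) {
        untouched++;
    }

    return untouched / sizeof(uint32_t);
}
```

A result of zero means the task reached (or passed) the limit at some point, even if the SP is back in range by the time the context switch happens. A data word that happens to equal the fill value can make the count slightly optimistic.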
I found at least one of my issues: a buffer on one task’s stack was passed to a different task for asynchronous processing, and the calling function returned (taking the buffer out of scope) before the buffer was used. When the buffer did get filled, the original task’s stored context was right in the middle of it.
I suspect more than one call may be doing this, so I need to dig through everything related to this now. On the right track anyway!
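One way to rule out this class of bug is to make sure any buffer handed to another task lives in static (or heap) storage instead of on the caller’s stack. The sketch below is only an illustration under that assumption: pool_buf_t, pool_acquire(), pool_release() and worker_fill() are hypothetical names, not FreeRTOS APIs, and the pool is not thread-safe as written, so real firmware would need to guard it (for example with a critical section or a mutex).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_SIZE 4u
#define BUF_BYTES 32u

/* A buffer that outlives whichever function requested the work. */
typedef struct {
    bool    in_use;
    size_t  length;
    uint8_t data[BUF_BYTES];
} pool_buf_t;

/* Static storage: never part of any task's stack. */
static pool_buf_t pool[POOL_SIZE];

/* Hand out the first free buffer, or NULL if all are taken. */
static pool_buf_t *pool_acquire(void)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].length = 0;
            return &pool[i];
        }
    }
    return NULL;
}

/* Return a buffer once the consumer has finished with it. */
static void pool_release(pool_buf_t *buf)
{
    buf->in_use = false;
}

/* Stand-in for the asynchronous worker that fills the buffer later. */
static bool worker_fill(pool_buf_t *buf, uint8_t value, size_t n)
{
    if (!buf->in_use || n > BUF_BYTES) {
        return false;  /* Refuse to write past the end or into a freed buffer. */
    }
    memset(buf->data, value, n);
    buf->length = n;
    return true;
}
```

The requesting task would acquire a buffer, pass the pointer to the worker (through a queue, say), and release it only after the result has been consumed, so a late write can never land on a stack frame that has already been reused.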
I was referring to while in debug. I’m watching the TCBs of the tasks where I was having problems. It looks like the pxTopOfStack value in each TCB only gets updated on a context switch.
I fixed the issue that prompted this question and am working on another that requires some digging around in TCBs/Stack.
What’s the order of stacked registers on a non-running task?
Looking at pendsv for my M4 port I see:
__asm volatile
(
" mrs r0, psp \n"
" isb \n"
" \n"
" ldr r3, pxCurrentTCBConst \n" /* Get the location of the current TCB. */
" ldr r2, [r3] \n"
" \n"
" tst r14, #0x10 \n" /* Is the task using the FPU context? If so, push high vfp registers. */
" it eq \n"
" vstmdbeq r0!, {s16-s31} \n"
" \n"
" stmdb r0!, {r4-r11, r14} \n" /* Save the core registers. */
" str r0, [r2] \n" /* Save the new top of stack into the first member of the TCB. */
" \n"
" stmdb sp!, {r0, r3} \n"
" mov r0, %0 \n"
" msr basepri, r0 \n"
" dsb \n"
" isb \n"
I don’t know much about ARM assembly yet, but I think this:
Loads the PSP into r0
If the task is using the FPU, pushes registers s16 through s31 onto the task’s stack and updates r0
Pushes r4 through r11 and r14 onto the task’s stack and updates r0
Saves the updated r0 in the TCB as the new top of stack
So assuming the task did not use FPU, after it has been switched out of running, the “top of stack” value should point to r14, with r11, r10 …, r4 below it and then the actual task stack space below that, correct?
The PC is on the task’s stack, put there automatically by the CPU when the interrupt that triggers the context switch occurs. On Cortex-M, the CPU automatically stacks R0 through R3, R12, the LR (R14), the PC, and the xPSR.
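Putting the PendSV code and the hardware stacking together: stmdb stores the lowest-numbered register at the lowest address, so for a switched-out task pxTopOfStack should point at the saved r4, with r5 through r11 and then r14 (the EXC_RETURN value) at increasing addresses. Above that come s16 to s31 if the task had an FPU context, and then the hardware frame (r0, r1, r2, r3, r12, LR, PC, xPSR). The sketch below turns that layout into an offset calculation. It assumes the M4F port shown above; the helper names are hypothetical, and it is worth checking against your own port before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Words pushed by PendSV: r4-r11 plus r14 (EXC_RETURN). */
#define SW_CORE_WORDS 9u
/* Extra words pushed by PendSV when the task has an FPU context: s16-s31. */
#define SW_FPU_WORDS 16u
/* Index of the PC in the hardware frame: r0, r1, r2, r3, r12, lr, pc, xPSR. */
#define HW_PC_INDEX 6u
/* EXC_RETURN bit 4 is clear when the exception frame includes FPU state. */
#define EXC_RETURN_NO_FPU_BIT (1u << 4)

/* Hypothetical helper: did the switched-out task have an FPU context? */
static bool saved_frame_has_fpu(const uint32_t *top_of_stack)
{
    assert(top_of_stack != NULL);
    /* r14 is the highest-address word of the r4-r11, r14 block. */
    uint32_t exc_return = top_of_stack[SW_CORE_WORDS - 1u];
    return (exc_return & EXC_RETURN_NO_FPU_BIT) == 0u;
}

/* Offset, in 32-bit words, from pxTopOfStack to the saved PC. */
static size_t saved_pc_word_offset(const uint32_t *top_of_stack)
{
    size_t offset = SW_CORE_WORDS + HW_PC_INDEX;
    if (saved_frame_has_fpu(top_of_stack)) {
        offset += SW_FPU_WORDS;  /* s16-s31 sit below the hardware frame. */
    }
    return offset;
}

/* Read the PC the task will resume at. */
static uint32_t saved_pc(const uint32_t *top_of_stack)
{
    return top_of_stack[saved_pc_word_offset(top_of_stack)];
}
```

So for a task that has not used the FPU, the saved PC should be 15 words (60 bytes) above pxTopOfStack, and 31 words above it when the FPU registers were saved. In a debugger, a memory watch on pxTopOfStack plus that offset shows whether the resume address changes while the task is switched out.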