Where is each task's Stack Pointer stored, or, how do I find the PC on a non-running task's stack?

hillridge · January 13, 2024, 1:00am

I’ve been working on a very annoying bug for a while now, and finally had a bit of a breakthrough tonight. I’ve been getting usage fault errors along with other weird behavior in one task and it looks like it’s due to something corrupting that task’s PC (or maybe the stack as a whole) while it is not running, then when a context switch resumes it, bad things happen.

I found this by breaking on the entry of vTaskSwitchContext() and stepping through it to the end. I have configUSE_NEWLIB_REENTRANT enabled, so the last line is:
/* Switch Newlib’s _impure_ptr variable to point to the _reent
structure specific to this task. */
_impure_ptr = &( pxCurrentTCB->xNewLib_reent );

and it is after this line that my fault fires.

Where can I find the stack pointer for each task so that I can keep an eye on the stack while other tasks are running? Also, when a task is not running, how far deep in the stack is the PC stored?

This excellent walkthrough doesn’t explicitly state how many bytes/words get pushed on the stack during a context switch, but it does say that the kernel tracks the SP, just not where.

Thanks

richard-damon · January 13, 2024, 1:24am

It depends on the “Port” you are using. In every port there will be a function to restore the context of the currently selected task (which will be called just after the scheduler determines which task that will be). Normally that routine will pop all the registers off the stack, and as a last step pop the PC and go to that address.

Now, if that statement is what causes the fault, my guess is that it isn’t the “PC” of the task that has a problem, but that your pxCurrentTCB has gotten a bad value.

hillridge · January 13, 2024, 1:56am

What’s the function name? I’m using 10.0.0 for an M4 processor.

portRESTORE_CONTEXT didn’t turn up anything.

I think something is clobbering the task’s stack after it switches out, but should know for sure if I can find the portion with the PC in it. While a task is not active, is the value pointed to by pxTopOfStack in the TCB still the actual top of stack?

richard-damon · January 13, 2024, 4:13am

For Cortex-M processors, the PendSV function will save the context of the previous task, call vTaskSwitchContext, then restore the context from the newly selected task.

hillridge · January 13, 2024, 2:26pm

Thanks.
Does each task’s pxTopOfStack value only get updated on a context swap? If so, should I be monitoring the processor’s SP while the task is running to verify that it doesn’t blow through the top?
I assume that cases where the stack grows past the limit during operation but is back under the limit for the context switch will not be caught by the taskCHECK_FOR_STACK_OVERFLOW macro, right?

I found at least one of my issues - A buffer was sent to a different task for processing asynchronously and the calling function went out of scope before the buffer was used. When the buffer did get filled, the original task’s stored context was right in the middle of it.
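The bug described here is a lifetime problem: the buffer lived on the requester's stack, so it stopped being valid once that function returned. One possible way to avoid it, sketched below in plain host-side C, is to hand out buffers from static storage so they outlive any caller. All names (`pool_take`, `pool_give`, `worker_fill`, the sizes) are hypothetical and not part of FreeRTOS, and a real multi-task version would need to protect `pool_take`/`pool_give` with a critical section or mutex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_BUFFERS 4   /* hypothetical pool depth */
#define BUFFER_BYTES 32  /* hypothetical buffer size */

typedef struct {
    uint8_t data[BUFFER_BYTES];
    int     in_use;
} pool_buffer_t;

/* Static storage: these buffers are never part of any task's stack. */
static pool_buffer_t pool[POOL_BUFFERS];

/* Claim a free buffer, or return NULL if all are in use.
   NOTE: not thread-safe as written; guard it in a real RTOS build. */
pool_buffer_t *pool_take(void)
{
    for (size_t i = 0; i < POOL_BUFFERS; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL;
}

/* Return a buffer to the pool once its consumer is done with it. */
void pool_give(pool_buffer_t *b)
{
    assert(b != NULL);
    b->in_use = 0;
}

/* Stand-in for the asynchronous worker that fills the buffer later. */
void worker_fill(pool_buffer_t *b)
{
    memset(b->data, 0x5A, BUFFER_BYTES);
}

/* Requester and worker in sequence; returns 1 if the fill landed intact. */
int demo_pool_roundtrip(void)
{
    pool_buffer_t *b = pool_take();
    if (b == NULL) {
        return -1;
    }
    /* The requesting function could return here; b stays valid. */
    worker_fill(b);
    int ok = (b->data[0] == 0x5A) && (b->data[BUFFER_BYTES - 1] == 0x5A);
    pool_give(b);
    return ok;
}
```

The point of the design is only that ownership of the buffer travels with the request instead of being tied to one function's stack frame. Copying the data by value into a queue would be another way to get the same effect.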

I suspect more than one call may be doing this, so I need to dig through everything related to this now. On the right track anyway!

richard-damon · January 13, 2024, 2:43pm

There is no way for “some task” to see the stack pointer of a different task while that task is running.

hillridge · January 13, 2024, 6:10pm

I was referring to while in debug. I’m watching the TCBs of the tasks where I was having problems. It looks like the pxTopOfStack value in each TCB only gets updated on a context switch.

richard-damon · January 13, 2024, 6:58pm

Yes, the memory location is only written on a task switch. You need to look at the actual SP register while the task is running.

aggarg · January 15, 2024, 5:26am

Method 2 described on this page should catch it, assuming the memory outside the stack range was modified.
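To illustrate the idea behind a fill-pattern check, here is a minimal host-side sketch. It assumes the stack was filled with a known byte when the task was created (FreeRTOS uses 0xa5, defined as tskSTACK_FILL_BYTE) and that the stack grows downward, as on Cortex-M. The function names and the 16-byte guard size are assumptions for illustration, not the kernel's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u /* assumed to match FreeRTOS's tskSTACK_FILL_BYTE */

/* Count fill bytes that are still untouched, starting at the lowest
   address of a descending stack (the end the stack grows toward). */
size_t untouched_bytes(const uint8_t *stack_low, size_t size)
{
    size_t n = 0;
    while (n < size && stack_low[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}

/* Pattern check: are the last `guard` bytes before the limit still intact? */
int guard_intact(const uint8_t *stack_low, size_t size, size_t guard)
{
    assert(guard <= size);
    return untouched_bytes(stack_low, guard) == guard;
}

/* A task that has used the top 40 bytes of a 64-byte stack. */
size_t demo_untouched(void)
{
    uint8_t stack[64];
    memset(stack, STACK_FILL_BYTE, sizeof stack);
    memset(stack + 24, 0, 40);
    return untouched_bytes(stack, sizeof stack);
}

/* Same task after a brief deep excursion wrote into the guard region. */
int demo_guard_after_excursion(void)
{
    uint8_t stack[64];
    memset(stack, STACK_FILL_BYTE, sizeof stack);
    memset(stack + 24, 0, 40);
    stack[3] = 0; /* written while SP was temporarily near the limit */
    return guard_intact(stack, sizeof stack, 16);
}
```

Because the damaged pattern stays damaged after the SP has moved back, this kind of check can notice an excursion that happened between context switches. It will miss one that skipped over the guard bytes without writing to them, which is the caveat above about the memory having been modified.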

hillridge · January 19, 2024, 7:46pm

I fixed the issue that prompted this question and am working on another that requires some digging around in TCBs/Stack.

What’s the order of stacked registers on a non-running task?

Looking at pendsv for my M4 port I see:

	__asm volatile
	(
	"	mrs r0, psp							\n"
	"	isb									\n"
	"										\n"
	"	ldr	r3, pxCurrentTCBConst			\n" /* Get the location of the current TCB. */
	"	ldr	r2, [r3]						\n"
	"										\n"
	"	tst r14, #0x10						\n" /* Is the task using the FPU context?  If so, push high vfp registers. */
	"	it eq								\n"
	"	vstmdbeq r0!, {s16-s31}				\n"
	"										\n"
	"	stmdb r0!, {r4-r11, r14}			\n" /* Save the core registers. */
	"	str r0, [r2]						\n" /* Save the new top of stack into the first member of the TCB. */
	"										\n"
	"	stmdb sp!, {r0, r3}					\n"
	"	mov r0, %0 							\n"
	"	msr basepri, r0						\n"
	"	dsb									\n"
	"	isb									\n"

I don’t know much about ARM assembly yet, but I think this:
Loads the PSP into r0
if the task is using FPU, push registers s16 through s31 onto the stack and update the saved PSP value
push r4 through r11 and r14 onto the stack and update the saved PSP value
save the updated PSP in the TCB as the top of stack

So assuming the task did not use FPU, after it has been switched out of running, the “top of stack” value should point to r14, with r11, r10 …, r4 below it and then the actual task stack space below that, correct?

Where does the task’s PC get saved?

jefftenney · January 19, 2024, 8:03pm

It’s on the task’s stack, put there automatically by the CPU when the interrupt that triggers the context switch occurs. On Cortex-M, the CPU automatically stacks R0 through R3, R12, the LR (R14), PC, and SR (xPSR).

hillridge · January 19, 2024, 8:53pm

So the context switch interrupt stacks those, then the handler pendsv takes the current psp and stacks the others on top?

What’s the final stack look like?

This?
[r14]
[r11]
[r10]
[…]
[r4]
[SR]
[PC]
[LR]
[r12]
[r3]
[…]
[r0]

hillridge · January 19, 2024, 9:45pm

I found a great post on it that cleared things up for me:

jefftenney · January 19, 2024, 9:52pm

Yes, that’s correct. Only a small correction in stacking order:

R4  (lowest address)
R5
...
R11
R14
R0  (here when executing "bx r14")
R1
R2
R3
R12
LR
PC
SR  (highest address)
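As a rough illustration of that layout, here is a host-side C sketch that pulls the saved PC out of a switched-out task's frame, given the value stored in pxTopOfStack. It assumes the M4F port's PendSV handler quoted earlier in the thread: R4-R11 and R14 (the EXC_RETURN value) are saved by software, and S16-S31 are saved just above them only when bit 4 of EXC_RETURN is clear. The function and constant names are hypothetical, and other ports or kernel versions may lay the frame out differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SW_FRAME_WORDS = 9,  /* R4-R11 plus R14 (EXC_RETURN), saved by PendSV */
    EXC_RETURN_IDX = 8,  /* R14 sits right after R4-R11 */
    FPU_HIGH_WORDS = 16, /* S16-S31, present only if the task used the FPU */
    HW_PC_OFFSET   = 6   /* R0, R1, R2, R3, R12, LR precede the PC */
};

/* Return the PC saved on a non-running task's stack.
   top_of_stack is the value held in the TCB's pxTopOfStack. */
uint32_t saved_pc(const uint32_t *top_of_stack)
{
    assert(top_of_stack != NULL);
    uint32_t exc_return = top_of_stack[EXC_RETURN_IDX];
    size_t idx = SW_FRAME_WORDS + HW_PC_OFFSET;
    if ((exc_return & 0x10u) == 0u) {
        idx += FPU_HIGH_WORDS; /* FPU context was pushed as well */
    }
    return top_of_stack[idx];
}

/* Fake frame for a task that did not use the FPU (17 words). */
uint32_t demo_pc_no_fpu(void)
{
    uint32_t frame[17] = {0};
    frame[EXC_RETURN_IDX] = 0xFFFFFFFDu; /* thread mode, PSP, no FPU frame */
    frame[15] = 0x08001234u;             /* made-up PC value */
    frame[16] = 0x01000000u;             /* xPSR with the Thumb bit set */
    return saved_pc(frame);
}

/* Fake frame for a task that used the FPU (S16-S31 sit after R14). */
uint32_t demo_pc_fpu(void)
{
    uint32_t frame[51] = {0};
    frame[EXC_RETURN_IDX] = 0xFFFFFFEDu; /* bit 4 clear: FPU frame present */
    frame[31] = 0x08005678u;             /* made-up PC value */
    return saved_pc(frame);
}
```

In a debugger the same arithmetic applies: without FPU context, the PC is the 16th word (index 15) from the address in pxTopOfStack, and the xPSR is the word after it.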

hillridge · January 19, 2024, 9:53pm

Thanks, I was just fixing the order in my post when I saw your reply come in. This has been helpful.