Software Watchdog Design

OK, I have studied and stepped through xPortPendSVHandler and I think I now understand what I am seeing.

The key instructions are:

tst r14, #0x10 /* Is the task using the FPU context?  If so, push high vfp registers. */
vstmdbeq	r0!, {s16-s31} // 16 words
// VSTMDB: Floating-point Store Multiple, Decrement Before
// EQ suffix: executes only if Z = 1 (Equal), which here means bit 4 of r14 is clear

and

stmdb	r0!, {r4, r5, r6, r7, r8, r9, sl, fp, lr} // 9 words
// "STore Multiple Decrement Before" pushes the callee-saved core registers r4-r11 (sl = r10, fp = r11) plus lr (the EXC_RETURN value) onto the process stack (psp)

So, when the FPU context is in use, 16 + 9 = 25 words are pushed onto the stack, which is 100 (0x64) bytes. That accounts for the 0x64 that I need to add to pxTopOfStack to get to the R0, R1, R2, R3, R12, LR, PC, and PSR that I’m interested in. Note: I think the frame diagrams I showed above don’t apply here.

With that understanding, in general I need to determine whether or not the high vfp registers are there, to know whether I need to add 9 or 25 words (0x24 or 0x64 bytes) to pxTopOfStack. I should be able to do this by looking at the lr saved at word offset 8 (byte offset 0x20) from pxTopOfStack. This should contain an EXC_RETURN value, and bit 4 (mask 0x10, as in tst r14, #0x10) will tell me whether or not there are floating-point registers: if bit 4 is clear, the floating-point registers were stacked.
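Here is a minimal sketch in C of that offset calculation, in case it helps anyone else. It assumes the layout described above (9 core words with lr last, then an optional 16 words for s16-s31, then the hardware-stacked frame) and that the task being examined is switched out, so its saved pxTopOfStack is valid. The names wdg_hw_frame_offset_words, wdg_capture and WdgCapture_t are my own placeholders, not part of FreeRTOS.

```c
#include <stdint.h>
#include <stddef.h>

/* Words pushed by xPortPendSVHandler below the hardware-stacked frame. */
#define WDG_CORE_SAVE_WORDS   9u     /* r4-r11 plus lr (EXC_RETURN) */
#define WDG_FPU_SAVE_WORDS    16u    /* s16-s31, only if the task has an FP context */
#define WDG_EXC_RETURN_INDEX  8u     /* lr is the last of the 9 core words */
#define WDG_EXC_RETURN_NO_FPU 0x10u  /* bit 4 set: no FP registers were stacked */

/* Word positions inside the hardware frame: R0, R1, R2, R3, R12, LR, PC, PSR. */
#define WDG_HW_FRAME_LR 5u
#define WDG_HW_FRAME_PC 6u

/* Hypothetical record of what the watchdog wants to keep. */
typedef struct {
    uint32_t lr;
    uint32_t pc;
} WdgCapture_t;

/* Number of words to add to pxTopOfStack to reach the stacked R0. */
static size_t wdg_hw_frame_offset_words(const uint32_t *pxTopOfStack)
{
    uint32_t exc_return = pxTopOfStack[WDG_EXC_RETURN_INDEX];
    size_t words = WDG_CORE_SAVE_WORDS;

    /* Same test as "tst r14, #0x10": bit 4 clear means s16-s31 were pushed. */
    if ((exc_return & WDG_EXC_RETURN_NO_FPU) == 0u) {
        words += WDG_FPU_SAVE_WORDS;
    }
    return words;
}

/* Read the stacked LR and PC of a switched-out task. */
static WdgCapture_t wdg_capture(const uint32_t *pxTopOfStack)
{
    const uint32_t *frame = pxTopOfStack + wdg_hw_frame_offset_words(pxTopOfStack);
    WdgCapture_t capture = { frame[WDG_HW_FRAME_LR], frame[WDG_HW_FRAME_PC] };
    return capture;
}
```

With an EXC_RETURN of 0xFFFFFFFD (no FP context) the offset comes out as 9 words (0x24 bytes), and with 0xFFFFFFED (FP context) as 25 words (0x64 bytes), which matches what I saw in the debugger.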

That answers all my questions for now. In summary, what I plan to do is:

  • Have a watchdog counter for each task

  • Each task increments its counter at strategic places within the code

  • All of the blocking calls in the monitored tasks have timeouts so they can increment the counters even if they have nothing else to do.

  • A Software Timer will periodically check the counters to make sure that no task is hung up.

  • configTIMER_TASK_PRIORITY will remain at (configMAX_PRIORITIES - 2). In this case, that means there is one higher-priority real-time task that the watchdog can’t always handle, because the timer task cannot preempt it, but the Software Timer callback watchdog can preempt any other task.

  • If the watchdog detects a hang, it will:

    • capture (at least) the Program Counter (PC) and Link Register (LR) of the hung task
    • write a fault record to flash memory for later analysis
    • do some cleanup
    • reset the system
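
The counter part of the plan above could be sketched roughly as follows. This is only an outline under my own assumptions: WDG_NUM_TASKS, wdg_kick and wdg_check are placeholder names, each counter is written by exactly one task, and wdg_check would be called from the Software Timer callback. The fault capture, flash write, and reset are left out.

```c
#include <stdint.h>

#define WDG_NUM_TASKS 3u  /* hypothetical number of monitored tasks */

/* One counter per monitored task; each is written only by its own task. */
static volatile uint32_t wdg_counter[WDG_NUM_TASKS];

/* Counter values seen at the previous check; used only by the checker. */
static uint32_t wdg_last_seen[WDG_NUM_TASKS];

/* Called by a task at strategic places, including after a blocking call
 * times out, to show that it is still making progress. */
static void wdg_kick(uint32_t task_id)
{
    wdg_counter[task_id]++;
}

/* Called periodically. Returns the index of the first task whose counter
 * has not changed since the previous check, or -1 if every task has
 * kicked at least once in the interval. */
static int wdg_check(void)
{
    int hung = -1;

    for (uint32_t i = 0; i < WDG_NUM_TASKS; i++) {
        uint32_t now = wdg_counter[i];

        if ((now == wdg_last_seen[i]) && (hung < 0)) {
            hung = (int)i;
        }
        wdg_last_seen[i] = now;  /* remember for the next interval */
    }
    return hung;
}
```

The check period has to be longer than the longest blocking timeout in any monitored task, otherwise a task that is legitimately waiting would be reported as hung.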

Thanks for your help, everyone!