" stmdb r0!, {r4-r11, r14} \n"/* Save the core registers. */
… " bl vTaskSwitchContext \n"
… " ldmia r0!, {r4-r11, r14} \n"/* Pop the core registers. */
… " bx r14 \n"
but ARM_CM0/port.c does:
" push {r3, r14} \n"
… " bl vTaskSwitchContext \n"
… " pop {r2, r3} \n"/* lr goes in r3. r2 now holds tcb pointer. */
… " bx r3 \n"
It seems to me that ARM_CM4F is saving and restoring R14 (LR) in the context save area, but ARM_CM0 is saving and restoring to the Main (MSP) stack. So, in the case of a task switch, CM4F is bx’ing to the EXC_RETURN (in LR) of the newly restored task, but CM0 is bx’ing to the EXC_RETURN of the interrupted task. Why the difference?
The Cortex-M4F port saves and restores EXC_RETURN per task because it is used to determine whether the task was using the FPU (and, as a result, whether the FPU registers need to be stored and restored). The Cortex-M0 does not have an FPU, so this is not needed.
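To make the FPU point concrete, here is a minimal sketch in C of the check the port effectively performs on EXC_RETURN. On ARMv7-M with the floating-point extension, bit 4 of EXC_RETURN is 0 when the extended (FPU) frame was stacked and 1 for the basic frame. The helper name `task_used_fpu` and the macro name are hypothetical and are not taken from port.c, which does the equivalent test in assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of EXC_RETURN: 1 = basic frame, 0 = extended (FPU) frame.
 * The macro name is illustrative, not a FreeRTOS identifier. */
#define EXC_RETURN_BASIC_FRAME_BIT (UINT32_C(1) << 4)

/* Hypothetical helper: returns true if the interrupted task had an
 * active floating-point context, i.e. the FPU registers must also be
 * saved and restored for that task. */
static bool task_used_fpu(uint32_t exc_return)
{
    return (exc_return & EXC_RETURN_BASIC_FRAME_BIT) == 0U;
}
```

For example, 0xFFFFFFFD (return to thread mode on PSP, basic frame) gives false, while 0xFFFFFFED (same, but with the extended frame) gives true. Because this bit can differ from task to task, the value has to be stored with each task's context.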
Not necessarily. R3 is not a callee-saved register, so vTaskSwitchContext is free to overwrite it, and we therefore need to preserve its value across the call by saving it on the stack. LR is pushed and popped alongside it to ensure that the stack remains double-word aligned.
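The push/pop pairing can be modelled on a host machine with a small sketch. This is only a toy model, assuming a full-descending stack where the lowest-numbered register goes to the lowest address; the struct and function names are hypothetical and do not exist in FreeRTOS. Note that `bl` itself also overwrites LR, which the model's `clobbering_call` imitates.

```c
#include <stdint.h>

/* Toy model of the registers and main stack involved (hypothetical). */
typedef struct {
    uint32_t r2, r3, lr;
    uint32_t mem[8];  /* tiny stand-in for the main stack */
    uint32_t sp;      /* byte offset into mem, full-descending */
} model_cpu_t;

/* Models "push {r3, r14}": two words, so sp moves by 8 bytes. */
static void push_r3_lr(model_cpu_t *c)
{
    c->sp -= 8U;
    c->mem[c->sp / 4U] = c->r3;       /* lower address: lower register */
    c->mem[c->sp / 4U + 1U] = c->lr;
}

/* Models "bl vTaskSwitchContext": the callee may trash r2 and r3,
 * and bl replaces lr with the return address. */
static void clobbering_call(model_cpu_t *c)
{
    c->r2 = 0xDEADBEEFu;
    c->r3 = 0xDEADBEEFu;
    c->lr = 0x08000123u;  /* made-up return address */
}

/* Models "pop {r2, r3}": old r3 lands in r2, old lr lands in r3. */
static void pop_r2_r3(model_cpu_t *c)
{
    c->r2 = c->mem[c->sp / 4U];
    c->r3 = c->mem[c->sp / 4U + 1U];
    c->sp += 8U;
}
```

Starting from an 8-byte-aligned sp, the stack stays 8-byte aligned during the call because an even number of words is pushed. After the pop, r2 holds what r3 held before the call (the TCB pointer in the CM0 port) and r3 holds the original EXC_RETURN, ready for `bx r3`.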