vPortStartFirstTask doesn't correctly start task

I’d like to run FreeRTOS on an M1 (Microsemi) board, but there is no FreeRTOS port available for the M1. So I took the M0 port, used the NXP M0+ LPC51U68 demo as an example, and made a few updates in port.c (portNVIC_PENDSV_PRI and portNVIC_SYSTICK_PRI, which require 2 bits instead of the 8 bits in the M0+, per the user manual). I also replaced heap_5 with heap_4 because it was difficult to add a heap in the 2nd segment, and added the “weak” and “section(".after_vectors")” attributes to SVC_Handler and PendSV_Handler (from the HAL) to make compilation pass.

Back in main(), to keep the code minimal at this point, I only call vTaskStartScheduler(), which creates the idle and timer tasks by default (tasks.c) and then calls xPortStartScheduler(), which in turn calls vPortStartFirstTask() (port.c).
However, I get a HardFault.

I debugged into the assembly code of vPortStartFirstTask() and noticed that in the NXP demo the last statement, “bx r3”, leads to prvTimerTask at 0x5d0b (with “push {r7, lr}” as the first statement of prvTimerTask). On the M1 board, however, “bx r3” leads to something at 0xfffc0 (with “lsls r2,r3,#13”), which then causes the HardFault.

I am not sure whether the steps to “start first task” should differ on the M1 compared to the M0/M0+. I am not proficient in assembly or ARM, so I am not sure where to find the relevant documentation or how to trace this issue. Any suggestions appreciated.


In addition, I have trouble copying the assembly code from “Debugging Hard Fault & Other Exceptions” (http://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html) and pasting it into HardFault_Handler(). Is any additional editing required to use that code in HardFault_Handler()?
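
On the HardFault handler question: as far as I can tell, the assembly stub on that page selects the stack with `tst lr, #4` inside an `ite` block, which are Thumb-2 instructions that ARMv6-M cores (M0/M0+/M1) do not provide, so the stub would have to be rewritten with plain Thumb instructions for this core. The C half of the technique should carry over. A minimal host-side sketch of that half, with made-up names (`select_fault_stack`, `stacked_word`) and assuming the usual eight-word hardware exception frame:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word indices inside the frame stacked by the core on exception entry. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12, FRAME_LR, FRAME_PC, FRAME_PSR,
    FRAME_WORDS
};

/* Bit 2 of the EXC_RETURN value found in LR inside the handler tells which
 * stack received the frame: 0 = MSP, 1 = PSP. */
uint32_t select_fault_stack(uint32_t exc_return, uint32_t msp, uint32_t psp)
{
    return (exc_return & 0x4u) ? psp : msp;
}

/* Fetch one stacked register from a copy of the frame. */
uint32_t stacked_word(const uint32_t *frame, int which)
{
    assert(frame != NULL);
    assert(which >= 0 && which < FRAME_WORDS);
    return frame[which];
}
```

The stacked PC (`FRAME_PC`) is the address of the instruction that was about to run when the fault was taken, which is usually the first thing to look at.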

I’m not familiar with M1 cores. Is there a doc that describes the differences between it and M0 cores?

I would assume the 0xfffc0 address is the wrong place to jump to, so I suspect the crash is either because it’s running random values there or because it’s an invalid address. What is the memory map for your design?
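
One more thing worth checking alongside the memory map: on Cortex-M cores, `bx` takes the execution state from bit 0 of the target, and these cores only execute Thumb code, so a target with bit 0 clear faults regardless of what is stored there. The working NXP target 0x5d0b is odd; 0xfffc0 is even. A tiny sketch of that check (function names are made up for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when bit 0 (the Thumb bit) of a bx/blx target is set. */
bool is_valid_thumb_target(uint32_t target)
{
    return (target & 1u) != 0u;
}

/* Address the core fetches from: the target with the Thumb bit cleared. */
uint32_t fetch_address(uint32_t target)
{
    assert(is_valid_thumb_target(target));
    return target & ~1u;
}
```

This matches the debugger view in the first post: the branch value is 0x5d0b while the first instruction of prvTimerTask sits at the even address 0x5d0a.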

I mainly read two documents for high-level information: “Cortex-M0+ Devices - Generic User Guide” (file name: DUI0662B_cortex_m0p_r0p1_dgug.pdf) and “Cortex-M1 Technical Reference Manual” (file name: DDI0413D_cortexm1_r1p0_trm.pdf).

From these two documents, the system control registers that appear in port.c are the same (except for the SysTick and PendSV priority bit settings in SHPR3). The overall memory maps are also the same, except that the M1 document explicitly indicates 0-0x100000 and 0x10000000-0x1000FFFF for ITCM lower and upper.
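
To relate the bad branch target to those ranges, here is a small sketch that classifies an address using only the two ITCM windows quoted above. Treating 0x100000 as an exclusive upper bound is my assumption, and the names are illustrative. Note that landing inside a window does not mean memory is actually implemented there; that depends on how much TCM the design provides.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { REGION_ITCM_LOWER, REGION_ITCM_UPPER, REGION_OTHER } region_t;

/* Classify an address against the two ITCM windows quoted in the post. */
region_t classify_address(uint32_t addr)
{
    if (addr < 0x00100000u) {
        return REGION_ITCM_LOWER;   /* 0 - 0x100000 (upper bound assumed exclusive) */
    }
    if (addr >= 0x10000000u && addr <= 0x1000FFFFu) {
        return REGION_ITCM_UPPER;   /* 0x10000000 - 0x1000FFFF */
    }
    return REGION_OTHER;            /* anything else: consult the full memory map */
}
```

With these numbers 0xfffc0 sits just under the top of the lower window, so the next question is whether the design has memory at that address and what the linker script places there.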

So the question is how to get the right address for the first task (prvTimerTask). How can I determine whether the fault comes from an error before calling “vPortStartFirstTask”, or from vPortStartFirstTask itself being incorrect and picking the wrong address?

Hi tempeli,

vPortStartFirstTask() manually builds a task context that makes the MCU believe it returns to the task function.

Like Richard, I’m not familiar with the M1, but I suspect it’s either violated stack alignment requirements or a different treatment of the bx instruction. You might want to single step through vPortStartFirstTask right up to the bx r3 instruction and then dump the memory that r3 points to (but make sure to dump it below that address because stacks on the M3 move towards lower addresses). The tech reference manual should then tell you how the bx r3 instruction recovers its registers from the stack.
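
To make “builds a task context” concrete, here is a host-side model of what the pop sequence in the GCC ARM_CM0 port does, as I read it: skip the eight R4-R11 slots, pop R0-R3, R12 and LR, then pop the PC into r3 and the xPSR into r2. The struct and function names are made up for illustration; `frame[0]` stands for the word that pxTopOfStack points at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register values the first task should start with. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
    size_t words_consumed;  /* how far PSP ends up past pxTopOfStack, in words */
} first_task_regs_t;

/* Model of the pops only; no real registers are touched. */
first_task_regs_t model_start_first_task(const uint32_t *frame)
{
    assert(frame != NULL);
    first_task_regs_t r;
    size_t sp = 8;              /* adds r0, #32: step over the R4-R11 slots */

    r.r0   = frame[sp++];       /* pop {r0-r5}: R0..R3 ... */
    r.r1   = frame[sp++];
    r.r2   = frame[sp++];
    r.r3   = frame[sp++];
    r.r12  = frame[sp++];       /* ... R12 lands in r4 in the real code */
    r.lr   = frame[sp++];       /* ... LR lands in r5, then mov lr, r5 */
    r.pc   = frame[sp++];       /* pop {r3}: the bx r3 target */
    r.xpsr = frame[sp++];       /* pop {r2}: initial xPSR */
    r.words_consumed = sp;
    return r;
}
```

If the value that ends up in r3 on the target is not the task entry address stored at word 14 of the frame, the pops are reading from a different stack than the one the frame was written to.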

Thanks for the hints. I tried that and here is the info (it would be easier if I could attach a screenshot):

  1. reg info right before running “bx r3”
    r0: 0xfffb0
    r1: 0x17f3
    r2: 0x2839
    r3: 0xfffc0
    r4: 0x3fffffff
    r5: 0x1
    r6: 0x22a5832a
    r7: 0xfffa0

    sp: 0xfffc0
    lr: 0x1
    pc: 0x7f0
    xPSR: 0x1000000
    msp: 0xfffc0
    psp: 0xfffc0

code:
ldr r2, [pc, #44]
ldr r3, [r2, #0]
ldr r0, [r3, #0]
adds r0, #32
msr PSP, r0
movs r0, #2
msr CONTROL, r0
isb sy
pop {r0, r1, r2, r3, r4, r5}
mov lr, r5
pop {r3}
pop {r2}
cpsie i
bx r3

  2. after running “bx r3”, it goes to:
    000fffc0: lsls r2, r1, #13
    000fffc2: str r0, [sp, #428]
    000fffc4: movs r0, r0
    000fffc6: movs r0, r0
    000fffc8: movs r0, #130
    000fffca: ldr r7, [sp, #136]

  3. however, after moving the scrollbar to show more info, the address changes to 0xfffc1 for “lsls r2, r1, #13”, like this:
    000fffad: ldrb r0, [r7, #17]
    000fffaf: movs r0, r0
    000fffb1: ; instruction 0xffff3fff
    000fffb5: movs r1, r0
    000fffb7: movs r0, r0
    000fffb9: vaddl.u8 q8, d0, d15
    000fffbd: cmp r0, #57 ; 0x39
    000fffbf: mov r0, r0
    000fffc1: lsls r2, r1, #13
    000fffc3: str r0, [sp, #428]

Sorry for being unclear, tempeli - what I was interested in was the memory block that r3 pointed to (again, ±40 bytes lower than r3’s value) BEFORE the bx r3 instruction was executed (not only processor registers but also memory). Then compare the register set against that block as you did.

Note: What fffc0 contains is probably irrelevant to the error because it’s some “random” address popped from the stack; the true cause of the error more likely is the erratic stack at return time.

Edit: OK, hold on, r3 already contains fffc0, which according to your initial description is already wrong. So you need to single-step up to the pop {r0,r1,r2,r3,r4,r5} instruction a little before that and analyze the memory there…
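
If the debugger’s memory view is awkward to post, one option is to copy the words around the address of interest into an array and print them as text. A minimal host-side sketch of such a dump, assuming the words have already been copied out of the target (the `snapshot_t` type and both function names are hypothetical):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A copy of target memory: `count` words starting at target address `base`. */
typedef struct {
    uint32_t base;
    const uint32_t *words;
    size_t count;
} snapshot_t;

/* Read one aligned word; false when addr is unaligned or outside the copy. */
bool snapshot_read(const snapshot_t *s, uint32_t addr, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if ((addr & 3u) != 0u || addr < s->base) {
        return false;
    }
    size_t idx = (addr - s->base) / 4u;
    if (idx >= s->count) {
        return false;
    }
    *out = s->words[idx];
    return true;
}

/* Print from `below` words under addr to `above` words over it.
 * Returns how many words were available and printed. */
size_t snapshot_dump(const snapshot_t *s, uint32_t addr,
                     uint32_t below, uint32_t above, FILE *to)
{
    size_t printed = 0;
    for (uint32_t i = 0; i <= below + above; i++) {
        uint32_t a = addr - 4u * below + 4u * i;
        uint32_t v;
        if (snapshot_read(s, a, &v)) {
            fprintf(to, "0x%08" PRIx32 ": 0x%08" PRIx32 "%s\n",
                    a, v, a == addr ? "  <- addr" : "");
            printed++;
        }
    }
    return printed;
}
```

Dumping around the stack pointer value just before the pop, and again just after it, gives a text record that can be compared word by word with the registers.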

Thanks. Here is more info; I hope I understood you correctly. I have also included a little extra info that will hopefully give more clues.

  1. before entering the block:
    r0: 0x2ee0
    r1: 0x0
    r2: 0x0
    r3: 0x2b44
    r4: 0x3ffffff
    r5: 0x1
00002b14:   ldr r4, [r2, #84]       ; 0x54
00002b16:   str r2, [r6, r5]
00002b18:   movs r0, r0
00002b1a:   movs r0, r0
00002b1c:   movs r5, #220   ; 0xdc
00002b1e:   movs r0, r0
00002b20:   movs r5, #220   ; 0xdc
00002b22:   movs r0, r0
00002b24:   movs r5, #220   ; 0xdc
00002b26:   movs r0, r0
00002b28:   movs r6, #76    ; 0x4c
00002b2a:   movs r0, r0
00002b2c:   movs r6, #96    ; 0x60
00002b2e:   movs r0, r0
00002b30:   movs r6, #156   ; 0x9c
00002b32:   movs r0, r0
00002b34:   movs r5, #220   ; 0xdc
00002b36:   movs r0, r0
00002b38:   movs r5, #220   ; 0xdc
00002b3a:   movs r0, r0
00002b3c:   movs r6, #76    ; 0x4c
00002b3e:   movs r0, r0
00002b40:   movs r6, #96    ; 0x60
00002b42:   movs r0, r0
      uxCriticalNesting:
00002b44:   movs r0, r0
00002b46:   movs r0, r0
      uxTopUsedPriority:
  2. after executing “ldr r3, [r2, #0]”
    r0: 0x2ee0
    r1: 0x0
    r2: 0x7b7c
    r3: 0x3438

(value of r3 doesn’t change until executing pop {r0, r1, r2, r3, r4, r5})

00003408:   movs r0, r0
0000340a:   movs r0, r0
0000340c:   add r5, pc, #660        ; (adr r5, 0x36a4 <ucHeap+2900>)
0000340e:   add r5, pc, #660        ; (adr r5, 0x36a4 <ucHeap+2900>)
00003410:   add r5, pc, #660        ; (adr r5, 0x36a8 <ucHeap+2904>)
00003412:   add r5, pc, #660        ; (adr r5, 0x36a8 <ucHeap+2904>)
00003414:   add r5, pc, #660        ; (adr r5, 0x36ac <ucHeap+2908>)
00003416:   add r5, pc, #660        ; (adr r5, 0x36ac <ucHeap+2908>)
00003418:   add r5, pc, #660        ; (adr r5, 0x36b0 <ucHeap+2912>)
0000341a:   add r5, pc, #660        ; (adr r5, 0x36b0 <ucHeap+2912>)
0000341c:   lsls r1, r2, #30
0000341e:   movs r0, r0
00003420:   movs r3, #205   ; 0xcd
00003422:   movs r0, r0
00003424:   movs r0, r0
00003426:   lsls r0, r0, #4
00003428:   add r5, pc, #660        ; (adr r5, 0x36c0 <ucHeap+2928>)
0000342a:   add r5, pc, #660        ; (adr r5, 0x36c0 <ucHeap+2928>)
0000342c:   add r5, pc, #660        ; (adr r5, 0x36c4 <ucHeap+2932>)
0000342e:   add r5, pc, #660        ; (adr r5, 0x36c4 <ucHeap+2932>)
00003430:   movs r0, r0
00003432:   movs r0, r0
00003434:   lsls r0, r4, #1
00003436:   strh r0, [r0, #0]
00003438:   adds r3, #232   ; 0xe8
0000343a:   movs r0, r0
0000343c:   movs r0, r0
0000343e:   movs r0, r0
  3. upon executing “pop {r0, r1, r2, r3, r4, r5}”

    r0: 0xfffb0
    r1: 0x17f3
    r2: 0x0
    r3: 0x7c78
    r4: 0x3ffffff
    r5: 0x1

00007c44:   ldrb r0, [r1, #17]
00007c46:   movs r0, r0
00007c48:    ; <UNDEFINED> instruction: 0xffffffff
00007c4c:   ldrb r0, [r1, #17]
00007c4e:   movs r0, r0
00007c50:   ldrb r0, [r1, #17]
00007c52:   movs r0, r0
      uxCurrentNumberOfTasks:
00007c54:   movs r2, r0
00007c56:   movs r0, r0
      xTickCount:
00007c58:   movs r0, r0
00007c5a:   movs r0, r0
      uxTopReadyPriority:
00007c5c:   movs r2, r0
00007c5e:   movs r0, r0
      xSchedulerRunning:
00007c60:   movs r1, r0
00007c62:   movs r0, r0
      xPendedTicks:
00007c64:   movs r0, r0
00007c66:   movs r0, r0
      xYieldPending:
00007c68:   movs r0, r0
00007c6a:   movs r0, r0
      xNumOfOverflows:
00007c6c:   movs r0, r0
00007c6e:   movs r0, r0
      uxTaskNumber:
00007c70:   movs r2, r0
00007c72:   movs r0, r0
      xNextTaskUnblockTime:
00007c74:    ; <UNDEFINED> instruction: 0xffffffff
     xIdleTaskHandle:
00007c78:   cmp r7, #96     ; 0x60
00007c7a:   movs r0, r0
      uxSchedulerSuspended:
00007c7c:   movs r0, r0
  4. then the next “pop {r3}” loads 0xfffc0 into the r3 register

Is it possible that your SP is invalid at startup or that you don’t have enough stack allocated for the startup? vPortStartFirstTask() is executed in the context of the startup procedure. What I see looks almost as if the code doesn’t have a chance to set up the first task’s stack frame correctly.

What does pxCurrentTCB point to before you call vPortStartFirstTask()? Does that look ok?

If you look at pxPortInitialiseStack() (here is the one for the M0: https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CM0/port.c#L132), you will see some registers are initialised to known values while others are left at their startup values. The ones that are left at their startup values are the ones where you see pxTopOfStack -= 5; (jumping over 5 registers) and pxTopOfStack -= 8; (jumping over 8 registers). The code that starts the scheduler should pop the values shown there into their respective registers. Getting a lot of F’s in those values, as you showed, could be because of a stack misalignment, as per RAc’s prior comment.

When I create a new port I tend to give each register a known and unique value so when I pop them off the stack into the registers I know each value is going into the correct register. You can see that being done in this port layer: https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CA53_64_BIT/port.c#L180 - it may help you to do the same, at least during the development phase.
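
A sketch of what that could look like for the M0-style frame, written so it can be tried on a host first. It follows the slot order of the M0 port’s pxPortInitialiseStack() as I read it (xPSR, PC, LR, then R12, R3, R2, R1, R0, then R11 down to R4); the function name and the marker values are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define INITIAL_XPSR 0x01000000u   /* Thumb bit set */

/* Fill a task stack so that every register slot holds a recognisable value.
 * `top` points at the highest usable word; returns the new top of stack. */
uint32_t *init_stack_with_markers(uint32_t *top, uint32_t entry,
                                  uint32_t param, uint32_t exit_addr)
{
    assert(top != 0);
    top--;                              /* leave one word free, as the M0 port does */
    *top = INITIAL_XPSR;   top--;       /* xPSR */
    *top = entry;          top--;       /* PC: task entry point */
    *top = exit_addr;      top--;       /* LR: where the task would return to */
    *top = 0x12121212u;    top--;       /* R12 */
    *top = 0x03030303u;    top--;       /* R3 */
    *top = 0x02020202u;    top--;       /* R2 */
    *top = 0x01010101u;    top--;       /* R1 */
    *top = param;          top--;       /* R0: task parameter */
    *top = 0x11111111u;    top--;       /* R11 */
    *top = 0x10101010u;    top--;       /* R10 */
    *top = 0x09090909u;    top--;       /* R9 */
    *top = 0x08080808u;    top--;       /* R8 */
    *top = 0x07070707u;    top--;       /* R7 */
    *top = 0x06060606u;    top--;       /* R6 */
    *top = 0x05050505u;    top--;       /* R5 */
    *top = 0x04040404u;                 /* R4: lowest word, the returned top */
    return top;
}
```

Note that these markers only reach the registers when the task is actually started; stepping through the initialisation function itself will not change the live registers, because it only writes to memory.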

Frankly speaking, I don’t know whether SP is valid or whether the stack is big enough. How can I check that?

Inspecting pxCurrentTCB right before calling xPortStartScheduler (which then calls vPortStartFirstTask) shows that pxCurrentTCB points to 0x343b (with pxTopOfStack pointing to 0x33e8). I don’t see either address in the map file. Is that a bad sign? Was the task created incorrectly?

However, after loading pxCurrentTCBConst2 into r2, the value of r2 is 0x7b7c, which points to pxCurrentTCB (although pxCurrentTCB itself might be wrong).

rtel,
I tried filling r1~r12 with known values (in the order 12, 3, 2, 1, 0, 11, 10, …, 5, 4, according to the original comment in the M0 port file), following the files you pointed to, with the last one being 0x04040404UL (for R4).
Before returning pxTopOfStack, inspecting pxTopOfStack shows address 0x2fc0 (ucHeap+960) and a value of 67372036 (0x04040404). But when looking at the registers, they seem to be different:
r0: 0x3000 (not 0)
r1: 0x1f55
r2: 0x4040404
r3: 0x2fc0
r4: 0x0
r5: 0x1
r6: 0x22a5832a
r7: 0xfff30
r8: 0xcd290224
r10: 0x36c77f3
r11: 0x9594f9d9
r12: 0x3fffffff

and
xpsr: 0x21000000 (not 0x1000000)
pc: 0x80a (not 0x1f55)
LR: 0x1767

it seems only
R2 was set (but with the value 04040404).
R1 was set (but with value of pc)
R4 might be set with value for R0.

I just read the two manuals again, and the register maps do look different.
For M0+:
SP + 0x1c: xPSR
SP + 0x18: PC
SP + 0x14: LR
SP + 0x10: R12
SP + 0x0c: R3
SP + 0x08: R2
SP + 0X04: R1
SP + 0X00: R0
(other registers unspecified here)

for M1:
xPSR:
R15 (PC)
R14 (LR)
R13 (SP)
R12
R11
R10

R1
R0
There is one register, R13 (SP), that is not in the M0+ list.
However, when stepping through these initializations, each time only R2 and R3 change (to the value and address of pxTopOfStack), while most other registers don’t change. The register values still don’t match the init values above.
Why does initializing these not change the registers?

Continuing to debug with the above change still doesn’t correct the R3 register (0xfffc0) before “bx r3”.

many thanks
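
A note on the two lists above: the M0+ list is the eight-word frame that the hardware stacks on exception entry, whereas the M1 list reads like the core register set. As far as I know, both cores implement ARMv6-M and stack the same eight words in the same order, so the frame offsets should not differ; please check this against the M1 TRM. A sketch that pins the M0+ offsets down in C (the type and function names are illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hardware-stacked exception frame, lowest address first. */
typedef struct {
    uint32_t r0;    /* SP + 0x00 */
    uint32_t r1;    /* SP + 0x04 */
    uint32_t r2;    /* SP + 0x08 */
    uint32_t r3;    /* SP + 0x0c */
    uint32_t r12;   /* SP + 0x10 */
    uint32_t lr;    /* SP + 0x14 */
    uint32_t pc;    /* SP + 0x18 */
    uint32_t xpsr;  /* SP + 0x1c */
} hw_exception_frame_t;

/* Compile-time checks that the struct matches the offsets listed above. */
_Static_assert(offsetof(hw_exception_frame_t, r12) == 0x10, "R12 offset");
_Static_assert(offsetof(hw_exception_frame_t, pc) == 0x18, "PC offset");
_Static_assert(sizeof(hw_exception_frame_t) == 32, "frame is eight words");

/* Byte offset of the stacked PC, i.e. where the task entry address lives. */
size_t stacked_pc_offset(void)
{
    return offsetof(hw_exception_frame_t, pc);
}
```

R4 to R11 are not part of this frame; the port saves and restores them in software, which is why they occupy the eight words below R0 on a task’s stack.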

I assume here you are talking about the function pxPortInitialiseStack: FreeRTOS-Kernel/port.c at main · FreeRTOS/FreeRTOS-Kernel · GitHub

Note down this pxTopOfStack value (0x2fc0); it should be the value in r0 after you execute the ldr r0, [r3] instruction: FreeRTOS-Kernel/port.c at main · FreeRTOS/FreeRTOS-Kernel · GitHub

If that value is the same, examine the memory at that location and see if it still contains valid values (i.e. the values you filled in pxPortInitialiseStack).

Thanks.

No, you’re not likely to see those values verbatim in the map file unless you created the tasks statically. During xTaskCreate(), memory for both the TCB and the task stack is allocated from the heap.
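
What the map file does show is the heap array itself (ucHeap for heap_4) and its size, so a quick sanity check is whether the TCB and stack addresses fall inside that array. A small sketch of the check; the function name is made up, and the start and size must be taken from your own map file:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the block [addr, addr + len) lies wholly inside the heap array
 * that starts at heap_start and is heap_size bytes long. */
bool in_heap(uint32_t heap_start, uint32_t heap_size, uint32_t addr, uint32_t len)
{
    if (addr < heap_start) {
        return false;
    }
    uint32_t off = addr - heap_start;       /* cannot wrap: addr >= heap_start */
    return off <= heap_size && len <= heap_size - off;
}
```

A TCB or stack pointer that passes this check was at least handed out by the heap; an address such as 0xfffc0 that fails it did not come from a task stack.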

Does your IDE provide a way to decode the TCB, ie look at the fields of the TCB structure? Can you post a dump of such an inspection?

Many thanks for all your suggestions. Here is some info following your instructions.

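At the end of prvAddNewTaskToReadyList for the idle task, pxCurrentTCB looks like this:

[screenshot: pxCurrentTCB for the idle task]

with the memory at pxTopOfStack as:

[screenshot: memory at pxTopOfStack for the idle task]

and for the tmr svc task:

[screenshot: pxCurrentTCB for the tmr svc task]

and memory:

[screenshot: memory at pxTopOfStack for the tmr svc task]

Looks OK, right?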

Then stepping through vPortStartFirstTask until “adds r0, #32” gives r3, r2, and r0. They seem to match the tmr svc task above, right? More step-by-step debugging info is shown below.

[screenshots: register views while stepping through vPortStartFirstTask]

Before executing “pop {r0, r1, …, r5}”, msp and psp both have the value 0xfffa0, and the value at 0xfffa0 is 0xfffb0. At 0xfffb0, it’s 0x3fffffff.

After executing it, msp and psp both have the value 0xfffb8, and the value at 0xfffb8 is 0xfffc0.

(You may need to zoom in on the picture to see the info clearly.)

Is there any evidence of a stack alignment issue? Or any other cause?

This seems interesting - looks like pop does not retrieve correct values from stack. Does the hardware have any cache (incoherent cache may cause incorrect reads from memory)?

What are the values of PSP and MSP registers and values at those memory locations before executing pop {r0, r1, r2, r3, r4, r5} and after executing it?

Thanks.

Hi, aggarg,
I don’t see any documents mentioning a cache; it’s a Microsemi ProASIC3L M1 dev board.
I just updated the above post with step-by-step info to show all registers; hopefully this gives the full picture (sorry for too many screen captures).
I also suspect there is some weird behavior from “msr PSP, r0” through “isb sy”, as in each step some other registers also change their values.

Thank you for the images. I see that both MSP and PSP are always same and even updated together when you pop - this is strange!

I guess your stack is being restored from MSP. Can you check the contents of location 0xfffa0?

Thanks.
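
For reference, this is the behaviour one would expect from a core with two banked stack pointers: once CONTROL bit 1 (SPSEL) is set in Thread mode, pops advance PSP only and MSP stays put. A toy host-side model of that expectation (all names are made up). One unverified recollection to check against the M1 TRM: I believe it describes the OS extensions, including the banked SP, as a configuration option of the core, so it may be worth confirming how the core on this board was built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal model of the two banked stack pointers plus CONTROL. */
typedef struct {
    uint32_t msp;
    uint32_t psp;
    uint32_t control;
} sp_model_t;

/* In Thread mode, CONTROL bit 1 (SPSEL) selects PSP as the active SP. */
uint32_t *active_sp(sp_model_t *m)
{
    assert(m != NULL);
    return (m->control & 2u) ? &m->psp : &m->msp;
}

/* Effect of popping n registers: only the active SP moves up by 4 * n. */
void model_pop(sp_model_t *m, unsigned n)
{
    *active_sp(m) += 4u * n;
}
```

In the captures both pointers move together during the pop, which is not what this model predicts, so that difference is the thing to explain.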

hi, aggarg,

I just updated the above post with the values of these registers. Sorry for not including them earlier.

I have seen those updated images and I am unable to explain why msr does not set psp. I am asking for the content of the location 0xfffa0.

Also, would you please read back psp - add the following two lines after msr psp, r0:

movs r0, #0 /* Clear R0 to be sure. */
mrs r0, PSP /* Check the value of r0 after this statement. */

Thanks.

Below is the info before and after executing these two lines. It seems psp is read correctly into r0. I’m not sure I understood you correctly, but the value in memory at address 0xfffa0 is 0xfffb0.

Is something wrong with “msr PSP, r0”?