Cortex M4 Hardfault in PendSV during initial bringup

Hello, I am getting hardfaults from the PendSV interrupt. I am trying to get FreeRTOS 8.2.3 running on a SAMD51 microcontroller (Cortex-M4). I have narrowed the problem down to FreeRTOS itself, Segger code, and standard Cortex-M4 features, but I still don’t understand enough about interrupt masking, recovery, and Cortex hard faults to know what my next debugging step should be.

I got the code from Atmel Start (so hopefully it came with sane defaults), then added the Segger SystemView patch (see the Segger wiki, article FreeRTOS_with_SystemView v8, with some adjustments, hopefully correct, because the patch did not apply cleanly to 8.2.3). I also increased configQUEUE_REGISTRY_SIZE, set configCHECK_FOR_STACK_OVERFLOW to 2, and enabled xTaskGetIdleTaskHandle. I also configured RTT so that it hopefully masks interrupts correctly:

#define SEGGER_RTT_MAX_INTERRUPT_PRIORITY          (0x80)   // Interrupt priority to lock on SEGGER_RTT_LOCK on Cortex-M3/4

Initially, I thought that I was messing up NVIC priorities, so first I did things like

NVIC_SetPriority(ADC1_0_IRQn, configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY);
NVIC_SetPriority(ADC1_1_IRQn, configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY);
NVIC_SetPriority(EIC_15_IRQn, configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY+2);
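
The two forms of priority value are easy to mix up here: NVIC_SetPriority() takes the unshifted "library" value, while BASEPRI and SEGGER_RTT_MAX_INTERRUPT_PRIORITY take the raw value as it sits in the top bits of the 8-bit priority field. Below is a minimal host-side sketch of that arithmetic. It assumes 3 implemented priority bits (check __NVIC_PRIO_BITS in the device header) and a library priority of 4, which is not stated in this post; the helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert an unshifted "library" priority (the form
 * NVIC_SetPriority() takes) into the raw 8-bit form used by BASEPRI.
 * prio_bits is the number of implemented priority bits, expected 1..8
 * (assumed to be 3 in the example values below). */
static inline uint8_t raw_priority(uint32_t library_prio, uint32_t prio_bits)
{
    return (uint8_t)((library_prio << (8u - prio_bits)) & 0xFFu);
}

/* Hypothetical helper: an ISR may call FreeRTOS ...FromISR() functions only
 * if its priority is numerically >= the max-syscall priority (equal or lower
 * urgency), because those are the interrupts BASEPRI masks. */
static inline bool may_call_from_isr_api(uint32_t isr_library_prio,
                                         uint32_t max_syscall_library_prio)
{
    return isr_library_prio >= max_syscall_library_prio;
}
```

Under those assumptions, raw_priority(4, 3) gives 0x80, the same form as the RTT lock value above, and a priority of "max syscall + 2" is numerically higher (less urgent), so it would still be allowed to use the FromISR API.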

but as I kept running into issues, I eventually commented out all peripheral initialization and just created 2 tasks:

#define debug_printf(ch, ...) do {UBaseType_t saved = taskENTER_CRITICAL_FROM_ISR(); SEGGER_RTT_printf(ch, __VA_ARGS__); taskEXIT_CRITICAL_FROM_ISR(saved); } while (0)
void rtos_printing_tester(void * arg)
{
    int i = 0;
    while(true) {
        debug_printf(0, "testing %d\n", i++);
        vTaskDelay(pdMS_TO_TICKS(1000));
    }
}

When I run one task, it works fine, printing to the Segger RTT console once per second. In SystemView I can see all the interrupts, context switches, etc., and everything looks fine.

When I start a second instance of the same task, I get a hardfault. I added code to retrieve the PC where the hardfault happened, and I believe it’s happening in PendSV, based on looking up that address in the disassembly:

void prvGetRegistersFromStack( uint32_t *pulFaultStackAddress )
{
/* These are volatile to try and prevent the compiler/linker optimising them
away as the variables never actually get used.  If the debugger won't show the
values of the variables, make them global by moving their declaration outside
of this function. */
volatile uint32_t r0;
volatile uint32_t r1;
volatile uint32_t r2;
volatile uint32_t r3;
volatile uint32_t r12;
volatile uint32_t lr; /* Link register. */
volatile uint32_t pc; /* Program counter. */
volatile uint32_t psr;/* Program status register. */

    r0 = pulFaultStackAddress[ 0 ];
    r1 = pulFaultStackAddress[ 1 ];
    r2 = pulFaultStackAddress[ 2 ];
    r3 = pulFaultStackAddress[ 3 ];

    r12 = pulFaultStackAddress[ 4 ];
    lr = pulFaultStackAddress[ 5 ];
    pc = pulFaultStackAddress[ 6 ];
    psr = pulFaultStackAddress[ 7 ];

    SEGGER_RTT_printf(0, "pc=%x\n", pc);
    /* When the following line is hit, the variables contain the register values. */
    for( ;; );
}

void HardFault_Handler(void)
{
    SEGGER_SYSVIEW_RecordEnterISR();
    IRQn_Type actirqn = ((int32_t)__get_IPSR()) - 16; /* -16 see IPSR bit assignments */
    SEGGER_RTT_printf(0, "hardfault interrupt prio %d\n", NVIC_GetPriority(actirqn));
    __asm volatile
    (
        " tst lr, #4                                                \n"
        " ite eq                                                    \n"
        " mrseq r0, msp                                             \n"
        " mrsne r0, psp                                             \n"
        " ldr r1, [r0, #24]                                         \n"
        " ldr r2, handler2_address_const                            \n"
        " bx r2                                                     \n"
        " handler2_address_const: .word prvGetRegistersFromStack    \n"
    );

}
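
For anyone following the assembly: `tst lr, #4` checks bit 2 of EXC_RETURN (the value in LR on exception entry) to pick MSP or PSP, and `ldr r1, [r0, #24]` reads word 6 of the stacked frame, which is the stacked PC. Here is a small host-side sketch of that decoding; the struct and function names are hypothetical and the bit meanings are the standard Cortex-M4 ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an EXC_RETURN value (what LR holds on exception entry). */
typedef struct {
    bool uses_psp;       /* bit 2: 1 = frame was stacked on PSP, 0 = on MSP */
    bool thread_mode;    /* bit 3: 1 = the exception preempted Thread mode  */
    bool has_fpu_frame;  /* bit 4: 0 = extended frame with FP registers     */
} exc_return_info_t;

static inline exc_return_info_t decode_exc_return(uint32_t lr)
{
    exc_return_info_t info;
    info.uses_psp      = (lr & (1u << 2)) != 0u;
    info.thread_mode   = (lr & (1u << 3)) != 0u;
    info.has_fpu_frame = (lr & (1u << 4)) == 0u;
    return info;
}

/* Stacked frame word order: r0, r1, r2, r3, r12, lr, pc, xPSR.
 * Word 6 is the stacked PC, i.e. byte offset 24. */
enum { STACKED_PC_WORD_INDEX = 6 };
```

One caveat: the test only means something while LR still holds EXC_RETURN. In a handler that is not naked and calls C functions before the inline assembly, the compiler may already have overwritten LR and moved the stack pointer, so the reported PC may be unreliable.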

I’m not sure where to go from here: there are no interrupts running that could call any FreeRTOS code, so I think I’ve ruled out the NVIC configuration. I don’t understand how just running 2 tasks could be resulting in a fault. I would really appreciate advice as to what I should investigate next.

Using taskENTER/EXIT_CRITICAL_FROM_ISR in debug_printf from task context is incorrect; taskENTER/EXIT_CRITICAL should be used there. You may need separate macros for task and ISR context. Unfortunately, I don’t know whether this fixes your (sporadic?) hard faults…
Is there a reason why you are sticking with this rather old FreeRTOS version?
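
One possible way to keep a single debug_printf while still using the right API in each context is to check IPSR at run time. This is only a sketch: the helper name is hypothetical, and the IPSR value is passed in as a parameter so the check can be tried on a host. On target it would come from the CMSIS intrinsic __get_IPSR().

```c
#include <stdbool.h>
#include <stdint.h>

/* IPSR[8:0] holds the active exception number; 0 means Thread mode,
 * which under FreeRTOS is task context. */
static inline bool ipsr_in_isr(uint32_t ipsr)
{
    return (ipsr & 0x1FFu) != 0u;
}

/* Target-only sketch of a context-aware wrapper, shown as a comment because
 * it needs the FreeRTOS and SEGGER headers:
 *
 *   #define debug_printf(ch, ...)                                     \
 *       do {                                                          \
 *           if (ipsr_in_isr(__get_IPSR())) {                          \
 *               UBaseType_t saved = taskENTER_CRITICAL_FROM_ISR();    \
 *               SEGGER_RTT_printf(ch, __VA_ARGS__);                   \
 *               taskEXIT_CRITICAL_FROM_ISR(saved);                    \
 *           } else {                                                  \
 *               taskENTER_CRITICAL();                                 \
 *               SEGGER_RTT_printf(ch, __VA_ARGS__);                   \
 *               taskEXIT_CRITICAL();                                  \
 *           }                                                         \
 *       } while (0)
 */
```

Two separate macros, one per context, would avoid the run-time check at the cost of having to pick the right one at every call site.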

This looks somewhat like the vector table entry for your PendSV handler points to something that is not “naked,” meaning the compiler wraps the ISR in code that saves/restores registers when it shouldn’t. That’s a no-no for the service handler. Can you post your startup code as well as the declaration of your service ISR?

Edit: Something similar has been discussed here:

I used the _FROM_ISR variants so that I could use the same debug_printf from both ISR and task contexts. I assumed the _FROM_ISR variants were slower but that there was otherwise no downside to using them.

I chose v8.2.3 originally b/c I wanted MPLAB X support for FreeRTOS. As it’s turned out, Microchip’s stuff works as well as I should have expected, not as well as I wish it did. I can try upgrading to v10 if other things don’t work.

I’m using a CMSIS compliant chip, so I went with that approach. My ISR config is

#define vPortSVCHandler SVCall_Handler

#define xPortPendSVHandler PendSV_Handler
#define xPortSysTickHandler SysTick_Handler

In port.c, here are the prototype and implementation of the PendSV handler:

void xPortPendSVHandler(void) __attribute__((naked));
void xPortSysTickHandler(void);
void vPortSVCHandler(void) __attribute__((naked));
void xPortPendSVHandler(void)
{
    /* This is a naked function. */

    __asm volatile(
        "   mrs r0, psp                         \n"
        "   isb                                 \n"
        "                                       \n"
        "   ldr r3, pxCurrentTCBConst           \n" /* Get the location of the current TCB. */
        "   ldr r2, [r3]                        \n"
        "                                       \n"
        "   tst r14, #0x10                      \n" /* Is the task using the FPU context?  If so, push high vfp
                                                       registers. */
        "   it eq                               \n"
        "   vstmdbeq r0!, {s16-s31}             \n"
        "                                       \n"
        "   stmdb r0!, {r4-r11, r14}            \n" /* Save the core registers. */
        "                                       \n"
        "   str r0, [r2]                        \n" /* Save the new top of stack into the first member of the TCB. */
        "                                       \n"
        "   stmdb sp!, {r3}                     \n"
        "   mov r0, %0                          \n"
        "   msr basepri, r0                     \n"
        "   dsb                                 \n"
        "   isb                                 \n"
        "   bl vTaskSwitchContext               \n"
        "   mov r0, #0                          \n"
        "   msr basepri, r0                     \n"
        "   ldmia sp!, {r3}                     \n"
        "                                       \n"
        "   ldr r1, [r3]                        \n" /* The first item in pxCurrentTCB is the task top of stack. */
        "   ldr r0, [r1]                        \n"
        "                                       \n"
        "   ldmia r0!, {r4-r11, r14}            \n" /* Pop the core registers. */
        "                                       \n"
        "   tst r14, #0x10                      \n" /* Is the task using the FPU context?  If so, pop the high vfp
                                                       registers too. */
        "   it eq                               \n"
        "   vldmiaeq r0!, {s16-s31}             \n"
        "                                       \n"
        "   msr psp, r0                         \n"
        "   isb                                 \n"
        "                                       \n"
#ifdef WORKAROUND_PMU_CM001 /* XMC4000 specific errata workaround. */
#if WORKAROUND_PMU_CM001 == 1
        "           push { r14 }                \n"
        "           pop { pc }                  \n"
#endif
#endif
        "                                       \n"
        "   bx r14                              \n"
        "                                       \n"
        "   .align 2                            \n"
        "pxCurrentTCBConst: .word pxCurrentTCB  \n" ::"i"(configMAX_SYSCALL_INTERRUPT_PRIORITY));
}
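
As a rough guide to what this handler costs in task stack, the sketch below adds up the words saved per context switch: the frame the hardware stacks on exception entry plus what the stmdb/vstmdbeq instructions above push. The names are hypothetical, and any alignment padding word the hardware may insert is not counted.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    HW_BASIC_FRAME_WORDS = 8,   /* r0-r3, r12, lr, pc, xPSR (stacked by hardware) */
    HW_FPU_EXTRA_WORDS   = 18,  /* s0-s15, FPSCR, one reserved word               */
    SW_CORE_WORDS        = 9,   /* r4-r11, r14 (stmdb in PendSV)                  */
    SW_FPU_WORDS         = 16   /* s16-s31 (vstmdbeq in PendSV)                   */
};

/* Words of task stack used by one saved context, excluding any alignment
 * padding. fpu_active means the task has touched the FPU, so the extended
 * frame is in use. */
static inline uint32_t saved_context_words(bool fpu_active)
{
    uint32_t words = HW_BASIC_FRAME_WORDS + SW_CORE_WORDS;
    if (fpu_active) {
        words += HW_FPU_EXTRA_WORDS + SW_FPU_WORDS;
    }
    return words;
}
```

That comes to 17 words (68 bytes) for a task that has not used the FPU and 51 words (204 bytes) for one that has, before the task's own locals and call chain are counted.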

In my startup file, it is defined as weak:

void PendSV_Handler(void) __attribute__((weak, alias("Dummy_Handler")));

My disassembled PendSV_Handler is:

00002a34 <PendSV_Handler>:
    __asm volatile(
    2a34:   f3ef 8009   mrs r0, PSP
    2a38:   f3bf 8f6f   isb sy
    2a3c:   4b14        ldr r3, [pc, #80]   ; (2a90 <pxCurrentTCBConst>)
    2a3e:   681a        ldr r2, [r3, #0]
    2a40:   f01e 0f10   tst.w   lr, #16
    2a44:   bf08        it  eq
    2a46:   ed20 8a10   vstmdbeq    r0!, {s16-s31}
    2a4a:   e920 4ff0   stmdb   r0!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    2a4e:   6010        str r0, [r2, #0]
    2a50:   f84d 3d04   str.w   r3, [sp, #-4]!
    2a54:   f04f 0080   mov.w   r0, #128    ; 0x80
    2a58:   f380 8811   msr BASEPRI, r0
    2a5c:   f3bf 8f4f   dsb sy
    2a60:   f3bf 8f6f   isb sy
    2a64:   f001 f924   bl  3cb0 <vTaskSwitchContext>
    2a68:   f04f 0000   mov.w   r0, #0
    2a6c:   f380 8811   msr BASEPRI, r0
    2a70:   bc08        pop {r3}
    2a72:   6819        ldr r1, [r3, #0]
    2a74:   6808        ldr r0, [r1, #0]
    2a76:   e8b0 4ff0   ldmia.w r0!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    2a7a:   f01e 0f10   tst.w   lr, #16
    2a7e:   bf08        it  eq
    2a80:   ecb0 8a10   vldmiaeq    r0!, {s16-s31}
    2a84:   f380 8809   msr PSP, r0
    2a88:   f3bf 8f6f   isb sy
    2a8c:   4770        bx  lr
    2a8e:   bf00        nop

It looks like the PendSV handler is appropriately naked.

After stripping down the example, I believe the fault occurs in the idle task, at or near address 0x36da:

000036cc <prvIdleTask>:

            A critical region is not required here as we are just reading from
            the list, and an occasional incorrect value will not matter.  If
            the ready list at the idle priority contains more than one task
            then a task other than the idle task is ready to execute. */
            if (listCURRENT_LIST_LENGTH(&(pxReadyTasksLists[tskIDLE_PRIORITY])) > (UBaseType_t)1) {
    36cc:   4907        ldr r1, [pc, #28]   ; (36ec <prvIdleTask+0x20>)
                taskYIELD();
    36ce:   f04f 23e0   mov.w   r3, #3758153728 ; 0xe000e000
    36d2:   f04f 5280   mov.w   r2, #268435456  ; 0x10000000
            if (listCURRENT_LIST_LENGTH(&(pxReadyTasksLists[tskIDLE_PRIORITY])) > (UBaseType_t)1) {
    36d6:   6808        ldr r0, [r1, #0]
    36d8:   2801        cmp r0, #1
    36da:   d9fd        bls.n   36d8 <prvIdleTask+0xc>
                taskYIELD();
    36dc:   f8c3 2d04   str.w   r2, [r3, #3332] ; 0xd04
    36e0:   f3bf 8f4f   dsb sy
    36e4:   f3bf 8f6f   isb sy
    36e8:   e7f5        b.n 36d6 <prvIdleTask+0xa>
    36ea:   bf00        nop
    36ec:   200028d8    .word   0x200028d8
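
For reference, the taskYIELD() part of this listing is a store of 0x10000000 to 0xe000e000 + 3332, which is the PENDSVSET bit in the ICSR register. A tiny sketch of that arithmetic, with hypothetical macro and function names:

```c
#include <stdint.h>

#define SCS_BASE_ADDR      0xE000E000u  /* System Control Space base             */
#define ICSR_OFFSET        0xD04u       /* Interrupt Control and State Register  */
#define ICSR_PENDSVSET_BIT 28u          /* writing 1 here pends PendSV           */

/* Address the str.w at 36dc writes to: r3 (0xe000e000) plus offset 0xd04. */
static inline uint32_t icsr_address(void)
{
    return SCS_BASE_ADDR + ICSR_OFFSET;
}

/* Value held in r2 in the listing: only the PENDSVSET bit set. */
static inline uint32_t pendsvset_mask(void)
{
    return 1u << ICSR_PENDSVSET_BIT;
}
```

So the loop only requests a PendSV when more than one task is ready at the idle priority; the context switch itself happens in the PendSV handler.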

What are your task priorities? What is the config for your max task priority?

One of the benefits of using a newer FreeRTOS version is that there are more configASSERT checks in the FreeRTOS code. This helps a lot in catching issues in application code. I guess you have configASSERT defined.
Note that the point of the FromISR vs. normal FreeRTOS API isn’t about being faster or slower. Each is dedicated to its calling context and has to be used accordingly.

I have #define configMAX_PRIORITIES ((uint32_t)5), and the tasks are created as

    if (xTaskCreate(
            rtos_printing_tester, "print", TASK_EXAMPLE_STACK_SIZE, NULL, TASK_EXAMPLE_STACK_PRIORITY, NULL)
            != pdPASS) {
        while (1) {
            ;
        }
    }
    if (xTaskCreate(
            rtos_printing_tester, "print", TASK_EXAMPLE_STACK_SIZE, NULL, TASK_EXAMPLE_STACK_PRIORITY, NULL)
            != pdPASS) {
        while (1) {
            ;
        }
    }
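
The value of TASK_EXAMPLE_STACK_SIZE is not shown in this thread, so as a side note: the stack depth passed to xTaskCreate() is counted in stack words (4 bytes each on Cortex-M), not bytes. The sketch below shows how a byte budget maps to words and how much is left after one saved context, which takes 17 words when the task has not used the FPU (8 stacked by hardware plus r4-r11 and r14 saved by PendSV). The helper names and the example numbers are hypothetical.

```c
#include <stdint.h>

/* On Cortex-M ports StackType_t is 32 bits wide, so stack depth is in
 * 4-byte words. */
enum { STACK_WORD_BYTES = 4 };

/* Hypothetical helper: round a byte budget up to whole stack words. */
static inline uint32_t stack_bytes_to_words(uint32_t bytes)
{
    return (bytes + STACK_WORD_BYTES - 1u) / STACK_WORD_BYTES;
}

/* Hypothetical helper: words left for the task's own locals and call chain
 * after one saved context. Returns 0 if the context alone does not fit. */
static inline uint32_t stack_headroom_words(uint32_t depth_words,
                                            uint32_t context_words)
{
    return (depth_words > context_words) ? (depth_words - context_words) : 0u;
}
```

For example, a 128-byte budget is only 32 words, which would leave 15 words for everything the task does, including the printf call chain. Whether that applies here depends on the actual value of TASK_EXAMPLE_STACK_SIZE.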

I will try to migrate to a newer FreeRTOS version. That’s a compelling reason to upgrade–I thought it was just for static allocation and streams, which I don’t need. Improved configASSERTs will hopefully help. I do have configASSERT defined.

As mentioned in this application note from ARM, can you check the content of HFSR?

If the hard fault is forced, please enable Usage, Bus and MPU faults. If you use CMSIS, you can use the following code snippet from the above doc:

SCB->SHCSR |= SCB_SHCSR_USGFAULTENA_Msk | SCB_SHCSR_BUSFAULTENA_Msk | SCB_SHCSR_MEMFAULTENA_Msk;
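
Once the register values are read out (SCB->HFSR and SCB->CFSR in CMSIS), they can be picked apart roughly as below. This is a host-side sketch with hypothetical helper names; the bit positions are the standard Cortex-M4 ones, with CFSR made up of MMFSR in bits 7:0, BFSR in bits 15:8, and UFSR in bits 31:16.

```c
#include <stdbool.h>
#include <stdint.h>

#define HFSR_VECTTBL   (1u << 1)   /* fault while reading the vector table    */
#define HFSR_FORCED    (1u << 30)  /* a configurable fault was escalated      */
#define HFSR_DEBUGEVT  (1u << 31)  /* debug event                             */

/* True if the hard fault is an escalated MemManage, Bus, or Usage fault,
 * in which case CFSR says which one. */
static inline bool hfsr_is_forced(uint32_t hfsr)
{
    return (hfsr & HFSR_FORCED) != 0u;
}

static inline bool hfsr_is_vecttbl(uint32_t hfsr)
{
    return (hfsr & HFSR_VECTTBL) != 0u;
}

/* Split CFSR into its three sub-registers. */
static inline uint8_t cfsr_mmfsr(uint32_t cfsr)
{
    return (uint8_t)(cfsr & 0xFFu);
}

static inline uint8_t cfsr_bfsr(uint32_t cfsr)
{
    return (uint8_t)((cfsr >> 8) & 0xFFu);
}

static inline uint16_t cfsr_ufsr(uint32_t cfsr)
{
    return (uint16_t)((cfsr >> 16) & 0xFFFFu);
}
```

If hfsr_is_forced() is true, the non-zero sub-register of CFSR indicates whether the original fault was a MemManage, Bus, or Usage fault.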

Also, can you try without enabling any of the interrupts (ADC and EIC)?

Thanks.

I’m taking a long weekend, but as soon as I can, I will report back with the contents of HFSR and looking at the specific faults. Thank you for that document reference–I think that exercising all the possibilities in there is going to get me to the solution, which I’ll be able to share for others who might run into this.

I have been doing all of this without enabling peripheral interrupts. I’m using dozens and dozens of peripherals, but I commented out all of their initialization before posting, because I wanted to rule out the “standard” ARM CM4 mistake of misconfiguring something in the NVIC. So nothing in this thread should have any interrupts like EIC or ADC enabled.

what is this set to??? :roll_eyes: