UsageFault after returning from xPortPendSVHandler after sending a notification

I have been chasing a crash for the past few weeks and I’m hoping to get some help here. My problem is that I’m getting a UsageFault upon returning from xPortPendSVHandler after sending a notification from an ISR. I’m running this on an XMC4800, using the ARM-GCC port. I managed to reduce it to the example below:

#include "FreeRTOS.h"
#include "task.h"

#include "XMC4800.h"
#include "core_cm4.h"

// the specific vector is not important, just that the notification comes from an IRQ
static const IRQn_Type irq_n = SCU_0_IRQn;
static TaskHandle_t waiting_task, trigger_task;

void Error_Handler() { while (1) {} }
void HardFault_Handler() { Error_Handler(); }
void MemManage_Handler() { Error_Handler(); }
void BusFault_Handler() { Error_Handler(); }
void UsageFault_Handler() { Error_Handler(); }
void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName) { Error_Handler(); }

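void SCU_0_IRQHandler() {
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;
    xTaskNotifyFromISR(waiting_task, 1, eSetBits, &xHigherPriorityTaskWoken);
    // yield at the end of the ISR, as described in the text below
    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}  // UsageFault after returning from xPortPendSVHandler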

void my_waiting_task(void *unused) {
    uint32_t notification = 0;
    // crash occurs only if task is blocked
    xTaskNotifyWait(0, 0xFFFFFFFF, &notification, portMAX_DELAY);
    while (1) {}
}

// not strictly necessary, original project had the interrupt coming in from hardware
void my_trigger_task(void *unused) {
    // assumed: pend the IRQ from software in place of the hardware source
    NVIC_SetPendingIRQ(irq_n);
    while (1) {}
}

int main() {
    xTaskCreate(
            my_waiting_task, "Waiting Task", 512,
            NULL, 2, &waiting_task
    );  // higher priority, so it is blocked by the time the notification arrives
    xTaskCreate(
            my_trigger_task, "Trigger Task", 512,
            NULL, 1, &trigger_task
    );

    // IRQ must be less important than any FreeRTOS IRQs, since it uses the FreeRTOS API
    // watch out for ARM Cortex priority pitfalls (inverted numerical priority)
    NVIC_SetPriority(irq_n, configMAX_SYSCALL_INTERRUPT_PRIORITY + 10);
    NVIC_EnableIRQ(irq_n);  // assumed: the IRQ has to be enabled before it can fire

    vTaskStartScheduler();
    return 0;
}

As far as I can see, I’m not doing anything wrong or unusual: I’m using the ISR-safe API calls, I’m yielding at the end of the ISR, and I’m aware of the pitfalls with ISR priorities on CM4s. I have updated FreeRTOS to the latest LTS version, including the ARM-GCC port, got a fresh FreeRTOSConfig.h, updated CMSIS and XMCLib, and updated to the latest GCC-ARM toolchain. I am far away from a stack overflow, and assertions are of course enabled. This was originally running on a custom board; I have since moved to the XMC48 Relax Kit from Infineon, which reproduces the same problem.

When I break inside the PendSVHandler, just before the return, the stack looks like so:

0x1ffe87a0 |  24 9a fe 1f   fd ff ff ff   01 00 00 00   01 00 00 00
0x1ffe87b0 |  00 00 00 10   04 ed 00 e0   a5 a5 a5 a5   4f 19 00 08
0x1ffe87c0 |  36 09 00 08   10 00 00 21   00 00 00 00   d4 87 fe 1f
0x1ffe87d0 |  c8 91 fe 1f   01 00 00 00   74 9a fe 1f   fd ff ff ff
0x1ffe87e0 |  00 00 00 00   00 00 f0 00   34 ef 00 e0   00 00 00 c0
0x1ffe87f0 |  00 00 00 00   e7 2d 00 08   c8 2c 00 08   00 00 00 61

Here, $sp = 0x1ffe87a4. The address stored at 0x1ffe87bc lies inside vPortValidateInterruptPriority, and the one stored after it lies at the end of my interrupt handler. From my (very limited) understanding, this looks fine as well. At least I can’t make out a difference to any other working case.

Some final notes:

  • The port mentions a workaround for an XMC4000-specific erratum, namely WORKAROUND_PMU_CM001. It replaces bx lr with push & pop. I’m somewhat confused by that: my errata sheet mentions CPU_CM.001, but that addresses interrupted loads for $sp, not $pc. Either way, the workaround does not influence the outcome.
  • The FreeRTOSConfig.h had an unused #define xPortPendSVHandler PendSV_Handler_Veneer, hidden behind an ENABLE_CPU_CM_001_WORKAROUND. I could not find any reference to that, except for an old forum post, which didn’t really clear things up for me.
  • For some reason, if I step over the PendSVHandler with my debugger, everything works just fine. If I let it run freely, or even if I single-step the assembly, it crashes…

Thanks for reading so far, I appreciate your help :)

Have you ensured that your PendSV interrupt has the lowest possible IRQ priority? To me it looks as if a context switch is attempted into an ISR.

SysTick and PendSV have a priority of 2, SVC has a priority of 0.

I do not know what SVC and PendSV on your platform are and how they map to the standard FreeRTOS interrupts, but the interrupt that will eventually execute the task switch must have the lowest priority. Normally that is the PendSV interrupt. 2 does not look right there.

Are you explicitly setting these priorities? We set them to the lowest priority in the port code and you do not need to set them explicitly.

I have set configKERNEL_INTERRUPT_PRIORITY to 2 within my FreeRTOSConfig and made sure my user interrupt is less important, apart from that I have not touched the port or FreeRTOS.

An easy way to figure out whether we are barking up the right tree is to temporarily reset the priorities to lowest and see if that solves your problem.

Can you try removing that or changing that to 0xFF?

I can edit two config defines, configKERNEL_INTERRUPT_PRIORITY and configMAX_SYSCALL_INTERRUPT_PRIORITY. I wasn’t sure if you meant numerically lower or logically lower, so I just tried a few variants.

I can set everything to 63 (the numerically highest, logically lowest priority), and then everything works as expected. But that is quite impractical and does not seem to be the intended way, because I can no longer use any priority other than 63: otherwise
configASSERT( ucCurrentPriority >= ucMaxSysCallPriority );
fails. I can set configMAX_SYSCALL_INTERRUPT_PRIORITY to be numerically lower, and everything still works, but as I understood it that is simply wrong? That would mean that the FreeRTOS ISRs can be interrupted by any other user interrupt, which is not intended.

Any other configuration I tried crashes in the same way, including setting configKERNEL_INTERRUPT_PRIORITY to 0 (numerically lowest, logically highest).

Just to be sure and to avoid confusion, on Cortex M processors, the priority is inverted, so 0 is the highest priority. Since I have 6 bits for priorities, 63 is the lowest priority.
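To make the inversion concrete, here is a minimal sketch of how a priority number ends up in the 8-bit NVIC priority field. It assumes the 6 implemented priority bits mentioned above; `PRIO_BITS`, `prio_to_register` and `is_more_urgent` are illustrative names, not part of CMSIS or FreeRTOS.

```c
#include <stdint.h>

/* Assumption: 6 implemented priority bits, as stated for this device. */
#define PRIO_BITS 6u

/* The implemented bits are the most significant bits of the 8-bit field,
 * so the priority number is shifted left by (8 - PRIO_BITS). */
static uint8_t prio_to_register(uint32_t prio)
{
    return (uint8_t)((prio << (8u - PRIO_BITS)) & 0xFFu);
}

/* Nonzero if priority a is logically more urgent than priority b.
 * A smaller number means a more urgent interrupt. */
static int is_more_urgent(uint32_t a, uint32_t b)
{
    return prio_to_register(a) < prio_to_register(b);
}
```

With these helpers, priority 2 (0x08 in the register) is more urgent than priority 63 (0xFC in the register), which is why a kernel priority of 2 is far from "lowest".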

Exactly. Also see this post Understanding priority levels of ISR and FreeRTOS APIs - #16 by aggarg containing a good and easy to understand explanation using interrupts/ISRs with FreeRTOS.


This understanding is not correct. configKERNEL_INTERRUPT_PRIORITY needs to be set to the lowest. The use of this constant has been removed in the latest code and we directly program PendSV and SysTick to the lowest priority -

configMAX_SYSCALL_INTERRUPT_PRIORITY is not related to configKERNEL_INTERRUPT_PRIORITY and can be set to any value - you just need to ensure that you do not call FreeRTOS APIs from an ISR with logically higher (numerically lower) priority interrupt.

The reason behind this assert trigger is, as you mentioned:

setting configMAX_SYSCALL_INTERRUPT_PRIORITY to 63 leads to all the priority bits being set and NVIC_SetPriority(irq_n, configMAX_SYSCALL_INTERRUPT_PRIORITY + 10); cannot set the irq_n priority to 73 since there are only 6 priority bits implemented. Therefore, when IRQ invokes task notification this assert gets triggered.

As pointed out by @aggarg, configMAX_SYSCALL_INTERRUPT_PRIORITY can be set to any value between 0 and 63. But setting it to 63 means that no IRQ with any other priority can call FreeRTOS kernel APIs. Set it to a value that leaves room on both sides (above and below the value), so that user interrupts with a numerically lower value (higher priority) can still interrupt the kernel, but cannot use kernel APIs. Any user interrupt that needs to use kernel APIs must have a numerically equal or higher value (equal or lower priority) than configMAX_SYSCALL_INTERRUPT_PRIORITY.
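The wrap-around described above can be sketched as follows. This mirrors the masking that the CMSIS `NVIC_SetPriority` applies, but the helper names are hypothetical, the priorities are treated as unshifted numbers as in this thread, and 6 priority bits are assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 6 implemented priority bits. */
#define IMPLEMENTED_PRIO_BITS 6u

/* Priority that is actually stored after shifting into the 8-bit field
 * and masking; values above 63 lose their upper bits. */
static uint32_t effective_priority(uint32_t requested)
{
    uint32_t reg = (requested << (8u - IMPLEMENTED_PRIO_BITS)) & 0xFFu;
    return reg >> (8u - IMPLEMENTED_PRIO_BITS);
}

/* Same comparison as the quoted configASSERT: an ISR may call FromISR
 * APIs only if its priority number is >= the max syscall priority. */
static bool may_call_from_isr_api(uint32_t irq_prio, uint32_t max_syscall_prio)
{
    return effective_priority(irq_prio) >= effective_priority(max_syscall_prio);
}
```

Under these assumptions a requested priority of 73 is stored as 9, which is logically far more urgent than 63, so the assert fires. With a max syscall priority of, say, 10, an IRQ at 20 may call the FromISR APIs while an IRQ at 5 may not.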

Also, I’m curious to understand the reason behind the usage fault, if you get a chance, can you please check the value in 0xE000ED28 CFSR (Configurable Fault Status Register) when usage fault occurs? Thanks

This understanding is not correct. configKERNEL_INTERRUPT_PRIORITY needs to be set to the lowest.

Thanks, that was not clear to me. As I incorrectly understood it, the kernel ISRs needed to be the most important interrupts. Apparently the opposite is the case. I am assuming this is because otherwise it could preempt critical sections within user-interrupts?

No, I already wrote this in my very first answer: Context switches are performed during this ISR, and context switches must only and exclusively save and restore task-related stack frames, not irq related stack frames.

I already did point out there that this must have the lowest pri. Sorry for not being clear enough.
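As an illustration of why this matters: the word at $sp in the stack dump from the original post is 0xfffffffd, which looks like an EXC_RETURN value. Below is a small sketch that decodes such a value according to the ARMv7-M bit layout; the struct and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of an ARMv7-M EXC_RETURN value (illustrative type). */
typedef struct {
    bool thread_mode;  /* bit 3: 1 = return to Thread mode, 0 = Handler mode */
    bool uses_psp;     /* bit 2: 1 = unstack from PSP, 0 = unstack from MSP */
    bool basic_frame;  /* bit 4: 1 = basic frame, 0 = extended (FPU) frame */
} exc_return_info;

static exc_return_info decode_exc_return(uint32_t value)
{
    exc_return_info info;
    info.thread_mode = (value & (1u << 3)) != 0u;
    info.uses_psp    = (value & (1u << 2)) != 0u;
    info.basic_frame = (value & (1u << 4)) != 0u;
    return info;
}
```

0xfffffffd decodes to "return to Thread mode using the process stack", which is what a context switch into a task needs. If PendSV has preempted another ISR at that moment, the processor is asked to return to Thread mode while an exception is still active, which would be consistent with the fault seen here.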

Oh, that makes sense, thanks! :)

No worries, the whole debacle with inverted priorities does not make communication easier. I appreciate your help!

Well, it’s been solved now, but to satisfy your curiosity, the CFSR was 0x40000.
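For reference, a minimal sketch of how that value can be decoded. The UsageFault status bits occupy the upper half of the CFSR on ARMv7-M; the macro and function names below are illustrative, not taken from CMSIS.

```c
#include <stdint.h>

/* UsageFault status bits within CFSR (ARMv7-M bit positions). */
#define CFSR_UNDEFINSTR (1u << 16)  /* undefined instruction */
#define CFSR_INVSTATE   (1u << 17)  /* invalid EPSR state */
#define CFSR_INVPC      (1u << 18)  /* invalid PC load, e.g. bad exception return */
#define CFSR_NOCP       (1u << 19)  /* coprocessor access not permitted */
#define CFSR_UNALIGNED  (1u << 24)  /* unaligned access trap */
#define CFSR_DIVBYZERO  (1u << 25)  /* divide-by-zero trap */

/* Returns the name of the first UsageFault bit that is set, or "none". */
static const char *usage_fault_cause(uint32_t cfsr)
{
    if (cfsr & CFSR_UNDEFINSTR) return "UNDEFINSTR";
    if (cfsr & CFSR_INVSTATE)   return "INVSTATE";
    if (cfsr & CFSR_INVPC)      return "INVPC";
    if (cfsr & CFSR_NOCP)       return "NOCP";
    if (cfsr & CFSR_UNALIGNED)  return "UNALIGNED";
    if (cfsr & CFSR_DIVBYZERO)  return "DIVBYZERO";
    return "none";
}
```

0x40000 is bit 18, i.e. INVPC, which fits an exception return that the core considers invalid, such as a return to Thread mode while another exception is still active.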

Well yes, it’s one of those things in computing that can be implemented (and justified) either way but raises all forms of debugging hell when you have to somehow tie them together. Like endianness or 0- vs. 1-based indexing. My gut feeling from experience is that those issues easily account for 60% of all the endless hours spent debugging.