At wit's end - going to unhandled-IRQ handler from vTaskSwitchContext for no apparent reason

Writing a program for the ATSAMC21E18A, using Microchip Studio, FreeRTOS 10.
One of my tasks, which is very similar to another task that is not misbehaving, is throwing an error that I cannot explain.

The task is continuously cycling, checking a queue for a command to execute. It receives the command, works its way through the interpreter, and gets to this one:

case FPGA_HandlerState_t::operating : //FPGA is working
  if (GlobalFlags.FPGA_Handler_RxQueue_Handle != NULL) { //safety
    if (xQueueReceive(GlobalFlags.FPGA_Handler_RxQueue_Handle, (void *)(&IncomingMessage), portMAX_DELAY) == pdPASS) { //returns errQUEUE_EMPTY if timeout and nothing on queue
      //command interpreter
      switch (IncomingMessage.Command) {
      case FPGA_Interface_Commands::reset :
        task_this_p->FH_resetFPGA(task_this_p);
        break;
      case FPGA_Interface_Commands::getmode :
        task_this_p->FH_getmode(task_this_p);           // <====== fails at this call
        break;

OK, no problem. For some reason, whenever it gets to this call the RTOS does a context switch. The next thing I know, it ends up at the "Dummy_Handler" routine in startup_samc21.c ("\brief Default interrupt handler for unused IRQs."). Looking at the call stack, it gets there every time from vTaskSwitchContext in tasks.c, from line 2667 at the call to taskCHECK_FOR_STACK_OVERFLOW. However, it's not going there from the stack-overflow handler in main.c, which is implemented; I've ended up there before, so I know it works. The stack dump looks OK and the queues look OK. I have no clue what the thing is complaining about. How do I find what is triggering this?

Further data. I traced the failure down to this point:

BaseType_t FPGA_Handler_task::send8Data_to_user(uint8_t *data_p, uint8_t count)
{
  BaseType_t retval = pdPASS;
  char dataString[16];
  
  //get semaphore
  xSemaphoreTake(GlobalFlags.UserInterface_TxQueue_Semaphore_Handle, 100);
  
  for (int i = 0; i < count; i++) {
    sprintf(dataString, "%#.2x ", data_p[i]);
    
    char *cp = dataString;
    for (size_t i = 0; i < strlen(dataString); i++) { //do not include terminal NULL, do not trigger addition of CRLF and CURSOR
      retval = xQueueSendToBack(GlobalFlags.UserInterface_TxQueue_Handle, (const void *)(cp+i), pdMS_TO_TICKS(100));
      if (retval != pdPASS) {
        
        //release semaphore
        xSemaphoreGive(GlobalFlags.UserInterface_TxQueue_Semaphore_Handle);
        
        return retval;
      }
    } //end for
  } //end for

FPGA_Handler_task is sending a response back to the UserInterface queue for transmission through the USART. The queue has 64 elements and this message is only 5 characters long, so I'm not overflowing the queue. It fails on the last character.

The weird thing is that it fails in different places depending on where I have the breakpoints. Sometimes it fails at the final return, sometimes at the xSemaphoreGive statement.

Did you check whether it is an unhandled interrupt? I think the NVIC ISPR register contains the currently pending IRQs.
If there was no matching IRQ, it's probably a stack overflow or another memory corruption. Stack overflow checking, even at level 2, can't catch ALL overflows.

You should handle a possible timeout properly and maybe not just proceed without owning the semaphore.
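
A minimal sketch of that advice, written so that it compiles off-target. The names `with_semaphore`, `take`, `give`, and `body` are hypothetical and are not FreeRTOS APIs. On the target, `take` would presumably wrap something like `xSemaphoreTake(handle, pdMS_TO_TICKS(100)) == pdTRUE` and `give` would wrap `xSemaphoreGive(handle)`.

```cpp
// Hypothetical helper: run `body` only while the semaphore is actually held.
// `take` must return true when the semaphore was obtained, false on timeout.
template <typename Take, typename Give, typename Body>
bool with_semaphore(Take take, Give give, Body body)
{
    if (!take()) {
        return false;   // timed out: the shared queue is never touched
    }
    body();             // protected section
    give();             // released once, only on the path that took it
    return true;
}
```

With this shape the caller can tell a timeout apart from a successful send, and the semaphore is never given back by code that did not take it.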

Yes, just noticed that. Will fix, but that's not the problem.

How do I look at that register? It's not in the IO or the Processor debugger window.

Usually it's possible to display the peripheral registers, and also those of the NVIC (interrupt controller), in the debugger. Atmel Studio (?) supports that as far as I remember.
Also, if you have some memory left to waste, try to increase the stack.
With such weird crashes, a stack or (sprintf) buffer overflow is my first guess.
BTW, I'm always using snprintf or something similar to avoid string buffer overflows.


I can see it but I'm not sure how to interpret it.

If ISPR is 0x0, I think there is no interrupt pending, and ending up in the unhandled-interrupt Dummy_Handler is the result of something else.
Do you catch the fatal exceptions like HardFaults etc. with dedicated handlers?
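
For reading the register: ISPR is a bitmask in which bit n set means IRQ n is pending. On a Cortex-M0 it is a single 32-bit register at 0xE000E200 (`NVIC->ISPR[0]` in CMSIS). A small sketch for decoding a value copied from the debugger follows; the helper name `first_pending_irq` is hypothetical, and mapping an IRQ number to a peripheral is device-specific (see the device header).

```cpp
#include <cstdint>

// Hypothetical helper: return the lowest-numbered pending IRQ in an ISPR
// value, or -1 when nothing is pending (ISPR == 0).
int first_pending_irq(std::uint32_t ispr)
{
    for (int irq = 0; irq < 32; ++irq) {
        if (ispr & (std::uint32_t{1} << irq)) {
            return irq;
        }
    }
    return -1;
}
```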

No, I'm only catching the FreeRTOS stuff like the stack overflow.

I'd propose adding exception handlers for the few processor exceptions, too.
For development, they can end up in a forever loop like the standard FreeRTOS assert/failed hooks. That's often useful for narrowing down weird crashes. Unfortunately, those processor exception handlers, like normal interrupt handlers, have no call stack to backtrace ...

There is a little info on how to determine which exception handler is executing here: https://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html

Thanks much. I'm going to have to do some coding here.

It's ending up in the HardFault handler. Any suggestions on how to figure out what's causing it?

See the link Richard posted before for some useful hints.
Can you single-step through the code with a debugger until it hardfaults?

The assembly code given in the link doesn't work on an M0 core.

The fault handler assembly code doesn't compile for an M0 core. I'll try single-stepping through the code; the problem is that it's running multiple tasks, so I'm having a hard time figuring out where the fault is.

Single-stepping code should be possible regardless of how many other tasks are running. Or is the faulty code executed by multiple tasks simultaneously?

Seems that the example code is for Cortex-M3 and higher and is not supported by Cortex-M0.

Edit: It's documented in "Debugging and diagnosing hard faults on ARM Cortex-M CPUs" that it's related to ARM Cortex-M3 and ARM Cortex-M4 microcontrollers.

Googling 'hardfault cortex-m0' provided these hopefully useful links:

https://community.arm.com/developer/ip-products/system/f/embedded-forum/3257/debugging-a-cortex-m0-hard-fault
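
Whatever handler code is used, the evidence it inspects is the frame the core pushes on exception entry: eight words in the order r0, r1, r2, r3, r12, lr, pc, xPSR. The stacked pc points at or near the faulting instruction. The sketch below shows only the decoding step, which can also be done by hand in a memory window. The names `ExceptionFrame`, `read_exception_frame`, and `frame_on_psp` are hypothetical, and obtaining the stack pointer inside the handler (the assembly part) is not shown.

```cpp
#include <cstdint>

// Registers pushed by the core on exception entry, in ascending address order.
struct ExceptionFrame {
    std::uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
};

// Hypothetical helper: bit 2 of EXC_RETURN (the LR value inside the handler)
// tells which stack holds the frame: set = PSP, clear = MSP.
bool frame_on_psp(std::uint32_t exc_return)
{
    return (exc_return & 0x4u) != 0;
}

// Hypothetical helper: copy the eight stacked words starting at sp, where sp
// is the MSP or PSP value that was in use when the fault occurred.
ExceptionFrame read_exception_frame(const std::uint32_t *sp)
{
    ExceptionFrame f;
    f.r0   = sp[0];
    f.r1   = sp[1];
    f.r2   = sp[2];
    f.r3   = sp[3];
    f.r12  = sp[4];
    f.lr   = sp[5];   // return address of the interrupted function
    f.pc   = sp[6];   // address at or near the faulting instruction
    f.xpsr = sp[7];
    return f;
}
```

Looking up the stacked pc in the map file or the disassembly view then narrows the search to a single function.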

Thanks much. I'm new to ARM interrupt handling. Quite a bit of a step up from AVRs.
Just out of curiosity, the FreeRTOSConfig.h has
configMAX_PRIORITIES = 5
but the M0 only uses 3 interrupt priority levels (2 bits). Is this FreeRTOS priority the same as the processor priority? Should I set this to 3?

Found it! It was an indexing problem in a loop. Thanks much, all.