FW freezes when notifying multiple tasks in an ISR

Dear Community,

I’m having problems notifying multiple tasks from an ISR via direct-to-task notifications. The firmware freezes after running for a while, and not even the tick interrupt comes through anymore. This seems to happen only when compiler optimization is turned on (-O2).
The setup looks like this:
– C++14
– Nios2 Board running with auto generated BSP (manually adapted to use FreeRTOS Kernel).
– FreeRTOS Kernel Release V10.5.1

I wrote a C++ wrapper for the API, and the notification looks like this:

#include "FreeRTOS.h"
#include "task.h"
#include <cstdint>
#include <chrono>
#include "is_isr.h"

namespace osal
{
namespace this_thread
{
std::uint32_t sleepUntilNotify(
    std::chrono::milliseconds maxWaitTime = std::chrono::milliseconds(portMAX_DELAY)) noexcept
{
  return ulTaskNotifyTake(false, pdMS_TO_TICKS(maxWaitTime.count()));
}
}

template <typename... Args>
class Thread<void(Args...)> final
{
public:
  ...

  void notify() const noexcept
  {
// a variable is modified in the assembly when entering/exiting an ISR, which gives us the knowledge whether we are in an ISR.
    if (isISR()) 
    {
      BaseType_t higherPrioTaskWoken = false;
      vTaskNotifyGiveFromISR(m_handle, &higherPrioTaskWoken);
      if (higherPrioTaskWoken)
      {
        vTaskSwitchContext();
      }
      return;
    }
    xTaskNotifyGive(m_handle);
  }

  ...

private:
  ...

  TaskHandle_t m_handle;
  ...
};
}

In the ISR, two tasks are notified directly:

void irqHandler(void* context)
{
  ...
 // let's suppose thread1&2 are just static variables that are accessible in the ISR.
  thread1.notify();
  thread2.notify();
}

What am I doing wrong here?
Thanks a lot!

Which FreeRTOS port are you using? The convention is to call portYIELD_FROM_ISR:

portYIELD_FROM_ISR( higherPrioTaskWoken );

Can you break the code in the debugger and see what it is doing when it appears frozen?

I’m using the FreeRTOS-Kernel\portable\GCC\NiosII port, where no portYIELD_FROM_ISR is defined.
There is a define for portEND_SWITCHING_ISR though

#define portEND_SWITCHING_ISR( xSwitchRequired ) 	do { if( xSwitchRequired ) vTaskSwitchContext(); } while( 0 )
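
For reference, the usual shape of an ISR that notifies several tasks is to share one "woken" flag across all the FromISR calls and request the context switch once, as the last statement. What follows is a minimal host-side model of that shape, not target code: FakeTask, notifyGiveFromIsr, endSwitchingIsr and the two globals are hypothetical stand-ins for the TCB, vTaskNotifyGiveFromISR() and portEND_SWITCHING_ISR(), so the sketch compiles without the kernel.

```cpp
// Host-side model only. Every name here is a hypothetical stand-in,
// not the FreeRTOS API.
#include <cstdint>

using Flag = long;  // plays the role of BaseType_t

struct FakeTask {
  unsigned priority;
  std::uint32_t notifyCount;
};

static unsigned g_runningPriority = 1u;  // priority of the interrupted task
static int g_switchRequests = 0;         // how often a switch was requested

// Stand-in for vTaskNotifyGiveFromISR(): it may set the flag, never clears it.
void notifyGiveFromIsr(FakeTask& task, Flag* higherPrioTaskWoken) {
  ++task.notifyCount;
  if (task.priority > g_runningPriority && higherPrioTaskWoken != nullptr) {
    *higherPrioTaskWoken = 1;
  }
}

// Stand-in for portEND_SWITCHING_ISR( xSwitchRequired ).
void endSwitchingIsr(Flag switchRequired) {
  if (switchRequired != 0) {
    ++g_switchRequests;
  }
}

// One flag for both notifications, one switch request at the very end.
void irqHandlerModel(FakeTask& t1, FakeTask& t2) {
  Flag higherPrioTaskWoken = 0;
  notifyGiveFromIsr(t1, &higherPrioTaskWoken);
  notifyGiveFromIsr(t2, &higherPrioTaskWoken);
  endSwitchingIsr(higherPrioTaskWoken);
}
```

On the target, the same handler body would call the real kernel functions with the task handles instead of the stand-ins.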

I did try to debug the optimized build, though I’m not sure the result is reliable. It often showed the FW stuck in the while loop in xTaskResumeAll() in tasks.c:

BaseType_t xTaskResumeAll( void )
{
  ...
    taskENTER_CRITICAL();
    {
        --uxSchedulerSuspended;

        if( uxSchedulerSuspended == ( UBaseType_t ) pdFALSE )
        {
            if( uxCurrentNumberOfTasks > ( UBaseType_t ) 0U )
            {
                /* Move any readied tasks from the pending list into the
                 * appropriate ready list. */
                while( listLIST_IS_EMPTY( &xPendingReadyList ) == pdFALSE )
                {
                    pxTCB = listGET_OWNER_OF_HEAD_ENTRY( ( &xPendingReadyList ) ); /*lint !e9079 void * is used as this macro is used with timers and co-routines too.  Alignment is known to be fine as the type of the pointer stored and retrieved is the same. */
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
                    portMEMORY_BARRIER();
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
                    prvAddTaskToReadyList( pxTCB );

                    /* If the moved task has a priority higher than or equal to
                     * the current task then a yield must be performed. */
                    if( pxTCB->uxPriority >= pxCurrentTCB->uxPriority )
                    {
                        xYieldPending = pdTRUE;
                    }
                    else
                    {
                        mtCOVERAGE_TEST_MARKER();
                    }
                }
...
}

I do not think you can call vTaskSwitchContext() (whatever that does) twice in one ISR. The result will necessarily be undefined; it must confuse the scheduler.

Just tried it with only one vTaskSwitchContext() call and could still reproduce the freeze. Thanks for the input though.

I trimmed the FW to narrow down the problem. It seems to occur only with the combination of two or more tasks notified from the ISR plus a task for the EtherCAT slave communication, which runs auto-generated C code from Beckhoff (SSC Tool). So don’t look too hard into the kernel code…

Here’s the trimmed example:

namespace
{
void ecatSlowLoop()
{
 while (true)
 {
   vTaskDelay(3u);
   IOWR(DUAL_CONFIGURATION_BASE, 0, 0x02u); //watchdog
   MainLoop(); //EtherCat 
 }
}

void task1()
{
 std::uint32_t cnt = 0;
 while (true)
 {
   ulTaskNotifyTake(false, portMAX_DELAY);
   ++cnt;
   if ((cnt % 100u) == 0u)
   {
     //blink some LED
   }
 }
}

void task2()
{
 std::uint32_t cnt = 0;
 while (true)
 {
   ulTaskNotifyTake(false, portMAX_DELAY);
   ++cnt;
   if ((cnt % 150u) == 0u)
   {
     //blink some LED
   }
 }
}

TaskHandle_t task1handle = nullptr;
TaskHandle_t task2handle = nullptr;

void irqHandler(void* context)
{
 ...

 BaseType_t higherPrioTaskWoken = false;
 vTaskNotifyGiveFromISR(task1handle, &higherPrioTaskWoken);

 vTaskNotifyGiveFromISR(task2handle, &higherPrioTaskWoken);
 if (higherPrioTaskWoken)
 {
   vTaskSwitchContext();
 }
}


}  // namespace

int main()
{
 xTaskCreate((TaskFunction_t)task1,
             "task1",
             configMINIMAL_STACK_SIZE,  // 1024
             nullptr,
             2,
             &task1handle);

 xTaskCreate((TaskFunction_t)task2,
             "task2",
             configMINIMAL_STACK_SIZE,  // 1024
             nullptr,
             3,
             &task2handle);

 TaskHandle_t esHandle = nullptr;
 xTaskCreate((TaskFunction_t)ecatSlowLoop,
             "es",
             configMINIMAL_STACK_SIZE,  // 1024
             nullptr,
             1,
             &esHandle);

 alt_ic_isr_register(IRQ_CONTROLLER, PDI_CHANNEL, &irqHandler, nullptr, nullptr);

 vTaskStartScheduler();
}

extern "C" void vApplicationTickHook()
{
 //blink some LED
}

extern "C" void vApplicationIdleHook()
{
 //blink some LED
}

Is the stack for your third task large enough? Is stack checking enabled?

The stack is 1K words (4 KB), which I would say is pretty large for a FW task. We are using method 2 for stack overflow checking.
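
Method 2 only inspects the words at the very end of the stack, and only when a task is switched out, so it can help to also look at how much of the fill pattern is left (on the target, uxTaskGetStackHighWaterMark() reports this). Here is a minimal host-side model of both checks, assuming a downward-growing stack whose lowest address is index 0. unusedWords and guardIntact are hypothetical names, not kernel functions.

```cpp
// Host-side model only. The helper names are hypothetical.
#include <array>
#include <cstddef>
#include <cstdint>

// Task stacks are pre-filled with 0xa5 bytes when the task is created.
constexpr std::uint32_t kFillWord = 0xa5a5a5a5u;

// Counts the words, starting at the far end of the stack, that still hold
// the fill pattern. This is the idea behind a high-water mark.
template <std::size_t N>
std::size_t unusedWords(const std::array<std::uint32_t, N>& stack) {
  std::size_t n = 0;
  while (n < N && stack[n] == kFillWord) {
    ++n;
  }
  return n;
}

// Method-2-style check: are the last four words of the stack untouched?
template <std::size_t N>
bool guardIntact(const std::array<std::uint32_t, N>& stack) {
  static_assert(N >= 4, "stack model needs at least four words");
  for (std::size_t i = 0; i < 4; ++i) {
    if (stack[i] != kFillWord) {
      return false;
    }
  }
  return true;
}
```

A small remaining count on the "es" task would point at the generated EtherCAT code using more stack than expected, even if the overflow hook never fired.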

Is it possible that the ISR fires before the scheduler is started? Can you try moving interrupt init code to one of the tasks?
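
The ordering suggested here can be sketched as a minimal host-side model: the interrupt is registered from inside a task, so the handler cannot run before the scheduler does. ModelKernel, initTask and runDeferredSetup are hypothetical names; registerIrq() and startScheduler() only stand in for alt_ic_isr_register() and vTaskStartScheduler() and do not call them.

```cpp
// Host-side model only. Every name here is a hypothetical stand-in,
// not the FreeRTOS or HAL API.
#include <functional>
#include <vector>

struct ModelKernel {
  bool schedulerRunning = false;
  bool irqRegistered = false;
  bool irqRegisteredBeforeStart = false;
  std::vector<std::function<void(ModelKernel&)>> tasks;

  // Stand-in for alt_ic_isr_register(): records when registration happened.
  void registerIrq() {
    irqRegistered = true;
    if (!schedulerRunning) {
      irqRegisteredBeforeStart = true;
    }
  }

  // Stand-in for vTaskStartScheduler(): runs each task body once.
  void startScheduler() {
    schedulerRunning = true;
    for (auto& task : tasks) {
      task(*this);
    }
  }
};

// The init task hooks up the interrupt, so it is only live once tasks run.
void initTask(ModelKernel& kernel) {
  kernel.registerIrq();
}

ModelKernel runDeferredSetup() {
  ModelKernel kernel;
  kernel.tasks.push_back(initTask);  // main() no longer registers the ISR
  kernel.startScheduler();
  return kernel;
}
```

In the trimmed example, that would mean moving the alt_ic_isr_register() call out of main() and into the start of one of the tasks.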