Assert happens in prvCheckForRunState() on SMP

Hi
I am porting FreeRTOS SMP to a dual-core Cortex-A32 (AArch32) and have encountered a problem. An assert fires in prvCheckForRunState():

         /* Enabling interrupts should cause this core to immediately
         * service the pending interrupt and yield. If the run state is still
         * yielding here then that is a problem. */
        configASSERT( pxThisTCB->xTaskRunState != taskTASK_SCHEDULED_TO_YIELD );

The function that changes TCB->xTaskRunState to taskTASK_SCHEDULED_TO_YIELD is prvYieldCore():

            portYIELD_CORE( xCoreID );                                               \
            pxCurrentTCBs[ xCoreID ]->xTaskRunState = taskTASK_SCHEDULED_TO_YIELD;   \

In my implementation, portYIELD_CORE( xCoreID ) sends an IPI (an SGI in GICv2). The SGI handler sets “ulPortYieldRequired[ portGET_CORE_ID() ] = pdTRUE”, and the target core responds to it correctly.
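
For readers following along, here is a minimal sketch of how a GICv2 SGI write for portYIELD_CORE() might look. It is not the poster's code: the helper names, the choice of SGI 0 for yield requests, and the distributor base pointer are all assumptions for illustration. Only the GICD_SGIR offset (0xF00) and field positions come from the GICv2 architecture. A real port would normally also issue a data synchronization barrier before the write.

```c
#include <stdint.h>

/* GICv2 GICD_SGIR layout (offset 0xF00 from the distributor base). */
#define GICD_SGIR_OFFSET               0xF00u
#define GICD_SGIR_TARGET_LIST_SHIFT    16u  /* CPUTargetList, bits [23:16] */
#define GICD_SGIR_INTID_MASK           0xFu /* SGIINTID, bits [3:0] */

/* Hypothetical choice: SGI 0 is reserved for yield requests. */
#define YIELD_SGI_ID                   0u

/* Build the GICD_SGIR value that sends SGI ulSgiId to core ulCoreId.
 * TargetListFilter (bits [25:24]) stays 0b00, meaning "use CPUTargetList". */
static uint32_t ulBuildSgirValue( uint32_t ulCoreId, uint32_t ulSgiId )
{
    return ( ( 1u << ulCoreId ) << GICD_SGIR_TARGET_LIST_SHIFT ) |
           ( ulSgiId & GICD_SGIR_INTID_MASK );
}

/* Write the SGI request to a distributor mapped at pulGicdBase
 * (hypothetical pointer; the real address is platform specific). */
static void vSendYieldSgi( volatile uint32_t * pulGicdBase, uint32_t ulCoreId )
{
    pulGicdBase[ GICD_SGIR_OFFSET / sizeof( uint32_t ) ] =
        ulBuildSgirValue( ulCoreId, YIELD_SGI_ID );
}
```

With these assumptions, requesting a yield on core 1 writes 0x00020000 to GICD_SGIR.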

Besides the tick interrupt and the IPI, there is only one other interrupt, used for IPC. The problem is not easy to reproduce, so I’m looking for suggestions on how to debug it.

Thanks!

Hi @Saiiijchan ,
One quick check would be to verify that the interrupt-to-core mapping in the GIC is correct. Are all the interrupts arriving on the core that is supposed to handle them?
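
One way to do that check on GICv2 is to read back the per-interrupt CPU targets byte in GICD_ITARGETSR. The sketch below is only an illustration: the helper names and the base pointer are hypothetical, and it is meaningful for shared peripheral interrupts (IDs 32 and up) only, since the entries for SGIs and PPIs (IDs 0 to 31) are banked per core and read-only.

```c
#include <stdint.h>

/* GICv2 GICD_ITARGETSR starts at offset 0x800 and holds one byte per
 * interrupt ID. Bit n set means "forward to CPU interface n". */
#define GICD_ITARGETSR_OFFSET    0x800u

/* Return the CPU targets bitmask for interrupt ulIntId.
 * pucGicdBase is a hypothetical byte pointer to the distributor base. */
static uint8_t ucGetIrqTargets( const volatile uint8_t * pucGicdBase,
                                uint32_t ulIntId )
{
    return pucGicdBase[ GICD_ITARGETSR_OFFSET + ulIntId ];
}

/* Return 1 if the interrupt is routed to exactly one core, ulCoreId. */
static int xIrqTargetsOnlyCore( const volatile uint8_t * pucGicdBase,
                                uint32_t ulIntId,
                                uint32_t ulCoreId )
{
    return ucGetIrqTargets( pucGicdBase, ulIntId ) == ( uint8_t ) ( 1u << ulCoreId );
}
```

Calling xIrqTargetsOnlyCore() for the tick and IPC interrupt IDs at start-up, or from the debugger, would show whether an interrupt is unexpectedly routed to both cores.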

When you enable interrupts on this line, the task should yield and change its state, shouldn’t it? When the assert hits, can you examine the value of pxThisTCB->xTaskRunState? In one such case that I helped someone debug before, the value was correct when examined in the debugger; there was a cache coherency problem in that case.

Hi @aggarg and @Shub
After the assert fires, I used J-Link to inspect pxCurrentTCBs on the asserting core: xTaskRunState is -2 and the TCB contents are intact. So it is probably not a cache coherency or stack overflow problem. I will check the IRQ settings and the IRQ handling flow.

Hi @aggarg & @Shub
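The implementation of FreeRTOS_Tick_Handler caused the problem. After the IPI set ulPortYieldRequired, a pending tick interrupt fired and cleared ulPortYieldRequired, because the handler assigned the return value of xTaskIncrementTick() to it unconditionally. After modifying the tick handler as follows, it works well.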

@@ -401,9 +401,7 @@ void FreeRTOS_Tick_Handler( void )
                /* Increment the RTOS tick. */
-                ulPortYieldRequired[ xCoreID ] = xTaskIncrementTick();
+               if ( xTaskIncrementTick() != pdFALSE ){
+                       ulPortYieldRequired[ xCoreID ] = pdTRUE;
+               }
        }
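
The effect of the change above can be reproduced on a host machine with a small simulation. This is a sketch, not port code: xMockIncrementTick(), vIpiHandler() and both tick handler variants are hypothetical stand-ins, and pdTRUE/pdFALSE are redefined locally so the example builds without FreeRTOS headers.

```c
#include <stdint.h>

/* Local stand-ins for the FreeRTOS definitions. */
#define pdFALSE    0
#define pdTRUE     1

static volatile int32_t ulPortYieldRequired[ 2 ];

/* Stand-in for xTaskIncrementTick(): returns whatever the caller preset. */
static int32_t xMockTickResult = pdFALSE;
static int32_t xMockIncrementTick( void )
{
    return xMockTickResult;
}

/* Stand-in for the IPI (SGI) handler: requests a yield on this core. */
static void vIpiHandler( uint32_t xCoreID )
{
    ulPortYieldRequired[ xCoreID ] = pdTRUE;
}

/* Original behaviour: unconditional assignment, which clears a yield
 * request already posted by the IPI when the tick needs no switch. */
static void vTickHandlerOverwrite( uint32_t xCoreID )
{
    ulPortYieldRequired[ xCoreID ] = xMockIncrementTick();
}

/* Fixed behaviour: only ever set the flag, never clear it. */
static void vTickHandlerSetOnly( uint32_t xCoreID )
{
    if( xMockIncrementTick() != pdFALSE )
    {
        ulPortYieldRequired[ xCoreID ] = pdTRUE;
    }
}
```

Running vIpiHandler( 0 ) followed by vTickHandlerOverwrite( 0 ), with the mock tick returning pdFALSE, leaves the flag cleared, so the pending yield is lost. The same sequence with vTickHandlerSetOnly( 0 ) keeps the flag set.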

Thanks for your help :smiley:

Glad that you figured it out!

Dear Aggerwal
I ran into the case you mentioned. I am not familiar with cache coherency; could you please suggest some solutions? Thanks a lot.

While testing another case, I also encountered this problem. Should I add memory barrier instructions somewhere?

Is it possible to test with the caches turned off? Does the problem go away when the cache is turned off? Alternatively, can you move all the TCBs into memory that is not cached and then try again?