Hi
I am porting FreeRTOS SMP to a dual-core Cortex-A32 (AArch32) and have run into a problem. An assert fires in prvCheckForRunState():
/* Enabling interrupts should cause this core to immediately
* service the pending interrupt and yield. If the run state is still
* yielding here then that is a problem. */
configASSERT( pxThisTCB->xTaskRunState != taskTASK_SCHEDULED_TO_YIELD );
The function that changes TCB->xTaskRunState to taskTASK_SCHEDULED_TO_YIELD is prvYieldCore().
In my implementation, portYIELD_CORE( xCoreID ) sends an IPI (an SGI on GICv2). The IPI handler sets “ulPortYieldRequired[ portGET_CORE_ID() ] = pdTRUE”, and the target core responds to it correctly.
Besides the tick interrupt and the IPI, there is only one other interrupt, used for IPC. The problem is not easy to reproduce, so I’m looking for suggestions on how to debug it.
Hi @Saiiijchan ,
One quick check would be to verify that the interrupt-to-core mapping in the GIC is correct. Is every interrupt arriving on the core that is supposed to handle it?
When you enable interrupts on this line, the task should yield and change its state, shouldn’t it? When the assert hits, can you examine the value of pxThisTCB->xTaskRunState? In one such case that I helped someone debug before, the value was correct when examined in the debugger; there was a cache coherency problem in that case.
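For the GIC routing check suggested above, a minimal sketch of how a GICv2 GICD_ITARGETSRn value could be decoded. It assumes the standard GICv2 distributor layout (ITARGETSR registers starting at offset 0x800, one target byte per interrupt ID). The helper names are hypothetical and not part of any FreeRTOS port. Note that for SGIs and PPIs (IDs 0 to 31) these registers are banked and read-only, so the check is mainly useful for shared interrupts such as the IPC one.

```c
#include <stdint.h>

/* Offset of GICD_ITARGETSR0 from the distributor base (GICv2). */
#define GICD_ITARGETSR_BASE_OFFSET 0x800u

/* Byte offset of the 32-bit ITARGETSR register that holds irq_id. */
static uint32_t gicd_itargetsr_offset(uint32_t irq_id)
{
    return GICD_ITARGETSR_BASE_OFFSET + (irq_id / 4u) * 4u;
}

/* Bit position of irq_id's 8-bit target field inside that register. */
static uint32_t gicd_itargetsr_shift(uint32_t irq_id)
{
    return (irq_id % 4u) * 8u;
}

/* Extract the CPU target mask (bit n set = CPU interface n) for irq_id
 * from a register value that was read at gicd_itargetsr_offset(irq_id). */
static uint8_t gicd_target_mask(uint32_t reg_value, uint32_t irq_id)
{
    return (uint8_t)((reg_value >> gicd_itargetsr_shift(irq_id)) & 0xFFu);
}

/* Non-zero if irq_id is routed to the given core according to reg_value. */
static int irq_targets_core(uint32_t reg_value, uint32_t irq_id, uint32_t core)
{
    return (int)((gicd_target_mask(reg_value, irq_id) >> core) & 1u);
}
```

On the target, reg_value would come from reading the distributor base address plus gicd_itargetsr_offset(irq_id), either from code or from the debugger.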
Hi @aggarg and @Shub
After the assert fires, I used J-Link to inspect pxCurrentTCBs on the asserting core. xTaskRunState is -2 and the TCB contents are intact, so it is probably not a cache coherency or stack overflow problem. I will check the IRQ configuration and the IRQ handling flow.
Hi @aggarg & @Shub
My implementation of FreeRTOS_Tick_Handler caused the problem. After the IPI handler set ulPortYieldRequired, a pending tick interrupt fired and cleared ulPortYieldRequired again. After modifying the tick handler, it works well.
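The race described above can be illustrated with a small host-side simulation. This is only a sketch under assumptions: the actual handler code was not posted, so the "buggy" variant below is a guess at the pattern (the tick handler writing pdFALSE when no switch is needed), and all names ending in _sketch are hypothetical stand-ins for the port's real variables and handlers.

```c
#include <stdint.h>

#define pdFALSE   0u
#define pdTRUE    1u
#define NUM_CORES 2

/* Stand-in for the port's per-core ulPortYieldRequired flag. */
static volatile uint32_t ulYieldRequired_sketch[NUM_CORES];

/* IPI (SGI) handler: another core asked this core to yield. */
static void ipi_handler_sketch(int core)
{
    ulYieldRequired_sketch[core] = pdTRUE;
}

/* Assumed buggy pattern: the tick handler overwrites the flag, so a
 * request made by an earlier IPI is lost when the tick needs no switch. */
static void tick_handler_buggy_sketch(int core, int tick_needs_switch)
{
    ulYieldRequired_sketch[core] = tick_needs_switch ? pdTRUE : pdFALSE;
}

/* Safer pattern: the tick handler only ever sets the flag. Clearing is
 * left to the interrupt exit path that consumes the request. */
static void tick_handler_fixed_sketch(int core, int tick_needs_switch)
{
    if (tick_needs_switch) {
        ulYieldRequired_sketch[core] = pdTRUE;
    }
}

/* Interrupt exit path: consume the request and report whether a
 * context switch should be performed. */
static uint32_t irq_exit_sketch(int core)
{
    uint32_t pending = ulYieldRequired_sketch[core];
    ulYieldRequired_sketch[core] = pdFALSE;
    return pending;
}
```

With the buggy variant, an IPI followed by a tick that needs no switch leaves the flag cleared, so the core never yields and the run state is still "yielding" when the assert checks it. With the fixed variant the request survives until the exit path consumes it.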
Is it possible to test with the caches turned off? Does the problem go away when the cache is disabled? Alternatively, can we move all the TCBs into memory that is not cached and then try again?
There are still some questions that I haven’t figured out yet.
I configured the MMU to place the heap (including the TCBs) in non-cacheable memory, and the problem still exists. I am tracing the cache, MMU and page table setup. Besides, the debugger behaviour is confusing: the "x <address>" command in GDB shows the value at a virtual address, and it is unclear whether that is the value in the cache or in physical memory.
I set a breakpoint at 0x60391a38, just before vAssertCalled is entered. At that point, the value of r3 was 0xfffffffe (-2, taskTASK_YIELDING), which was different from the value read through the TCB address in [fp, #-8]. Is it possible that the value had been modified again by then, because the code was not in a critical section?
portENABLE_INTERRUPTS();

/* Enabling interrupts should cause this core to immediately
 * service the pending interrupt and yield. If the run state is still
 * yielding here then that is a problem. */
configASSERT( pxThisTCB->xTaskRunState != taskTASK_YIELDING );
That should not happen in a normal case, so if it is happening, we need to find out how that value is getting changed. Do you want to have a debug session to look into this? If yes, please send me your email and preferred times in a DM.