Problems with xTaskRunState and prvYieldForTask

I’m debugging my port on a 4-core RISC-V SoC and have run into some problems. I have been working on it for days but have no idea how to solve it, so I need some help.

I create one task per CPU: thread_0 to thread_3, pinned to core 0 to core 3 respectively. All of them have priority 32, which is the highest in the system. Each task takes the mutex, prints, and then delays.

void task_entry(void *parameter)
{
    while (1)
    {
        int id = csi_get_cpu_id();
        if (xSemaphoreTake(xMutex, portMAX_DELAY) == pdTRUE)
        {
            printf("[%s] in %lu core, count: %d\r\n", pcTaskGetName(NULL), (unsigned long)id, g_count++);
            xSemaphoreGive(xMutex);
        }
        vTaskDelay(pdMS_TO_TICKS(2000));
    }
}

Preemption is enabled.

With configRUN_MULTIPLE_PRIORITIES set to 0, the system sometimes gets stuck after running for a few seconds. The debugger shows that all the cores are running idle tasks, even though there’s a higher-priority user task in pxReadyTasksLists[32].

(gdb) i th
  Id   Target Id                    Frame 
  1    Thread 1.1 (CPU#0 [running]) prvIdleTask (pvParameters=<optimized out>)
    at /home/phf/yoc212/components/freertos/FreeRTOS/Source/tasks.c:5847
  2    Thread 1.2 (CPU#1 [running]) prvPassiveIdleTask (pvParameters=<optimized out>)
    at /home/phf/yoc212/components/freertos/FreeRTOS/Source/tasks.c:5725
  3    Thread 1.3 (CPU#2 [running]) prvPassiveIdleTask (pvParameters=<optimized out>)
    at /home/phf/yoc212/components/freertos/FreeRTOS/Source/tasks.c:5725
* 4    Thread 1.4 (CPU#3 [running]) prvPassiveIdleTask (pvParameters=<optimized out>)
    at /home/phf/yoc212/components/freertos/FreeRTOS/Source/tasks.c:5725

But thread_3, with priority 32, is sitting in pxReadyTasksLists[32] and is never scheduled.

(gdb) p (TCB_t)*((TCB_t*)pxReadyTasksLists[32]->xListEnd->pxNext->pvOwner)
$2 = {pxTopOfStack = 0x500b5f70, uxCoreAffinityMask = 8, xStateListItem = {xItemValue = 0, 
    pxNext = 0x5002c088 <pxReadyTasksLists+1296>, pxPrevious = 0x5002c088 <pxReadyTasksLists+1296>, 
    pvOwner = 0x500b6450, pvContainer = 0x5002c078 <pxReadyTasksLists+1280>}, xEventListItem = {xItemValue = 30, 
    pxNext = 0x5009fa88, pxPrevious = 0x5009fa88, pvOwner = 0x500b6450, pvContainer = 0x0}, uxPriority = 32, 
  pxStack = 0x500b0ac0, xTaskRunState = -1, uxTaskAttributes = 0, 
  pcTaskName = "thread_3", '\000' <repeats 23 times>, uxTCBNumber = 11, uxTaskNumber = 0, uxBasePriority = 32, 
  uxMutexesHeld = 0, pxTaskTag = 0x0, pvThreadLocalStoragePointers = {0x0, 0x0, 0x0, 0x0, 0x0}, 
  ulRunTimeCounter = 0, ulNotifiedValue = {0}, ucNotifyState = "", ucStaticallyAllocated = 0 '\000'}

I logged some events to memory and dumped the log from GDB, and I found the problem.

There was an earlier call to prvYieldCore( 3 ), which set the xTaskRunState of the idle task on core 3 to -2, namely taskTASK_SCHEDULED_TO_YIELD. I think this stops prvYieldForTask from selecting core 3 to yield for thread_3. So this task doesn’t run, and the other cores can only run idle tasks when configRUN_MULTIPLE_PRIORITIES is set to 0.
That is because core 3 can be selected as xLowestPriorityCore in prvYieldForTask only when the condition if( ( taskTASK_IS_RUNNING( pxCurrentTCBs[ xCoreID ] ) != pdFALSE ) && ( xYieldPendings[ xCoreID ] == pdFALSE ) ) is satisfied.
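To make that condition concrete, here is a small standalone model of the core selection, written only for illustration. It is not the kernel code: every model_* name is hypothetical, the priority comparison is simplified to "strictly lower", and the priority given to Tmr Svc is a made-up placeholder. Only the two negative run-state values and the shape of the eligibility check are taken from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NUM_CORES          4     /* assumed: the 4-core SoC in this thread */
#define MODEL_NOT_RUNNING        (-1)  /* same value as taskTASK_NOT_RUNNING */
#define MODEL_SCHEDULED_TO_YIELD (-2)  /* same value as taskTASK_SCHEDULED_TO_YIELD */

/* Hypothetical stand-in for the few TCB fields that matter here. */
typedef struct {
    const char *name;
    int priority;
    int run_state;  /* core index while running, otherwise a negative marker */
} model_task_t;

/* Same idea as taskTASK_IS_RUNNING(): the run state is a valid core index. */
int model_task_is_running(const model_task_t *task)
{
    return (task->run_state >= 0) && (task->run_state < MODEL_NUM_CORES);
}

/* Returns the core that would be asked to yield for a newly ready task,
 * or -1 when no core is eligible. */
int model_pick_core_to_yield(const model_task_t current[MODEL_NUM_CORES],
                             const int yield_pending[MODEL_NUM_CORES],
                             int new_priority,
                             uint32_t affinity_mask)
{
    int lowest_priority = new_priority;
    int chosen_core = -1;

    assert(current != NULL && yield_pending != NULL);

    for (int core = 0; core < MODEL_NUM_CORES; core++) {
        /* The quoted condition: a core whose current task is not in a
         * running state, or that already has a yield pending, is skipped. */
        if (!model_task_is_running(&current[core]) || yield_pending[core]) {
            continue;
        }
        /* The new task must be allowed to run on this core. */
        if ((affinity_mask & (UINT32_C(1) << core)) == 0u) {
            continue;
        }
        /* Simplification: only strictly lower-priority tasks are preempted. */
        if (current[core].priority < lowest_priority) {
            lowest_priority = current[core].priority;
            chosen_core = core;
        }
    }
    return chosen_core;
}

/* Replays the per-core state from the "selecting cpu" log lines, with the
 * run state of the task on core 3 left as a parameter. */
int model_replay_log(int core3_run_state)
{
    const model_task_t current[MODEL_NUM_CORES] = {
        { "IDLE3",   0,  0 },
        { "Tmr Svc", 31, 1 },  /* priority is a placeholder, not from the log */
        { "IDLE0",   0,  2 },
        { "IDLE2",   0,  core3_run_state },
    };
    const int yield_pending[MODEL_NUM_CORES] = { 0, 0, 0, 0 };

    /* thread_3: priority 32, uxCoreAffinityMask = 8 (core 3 only). */
    return model_pick_core_to_yield(current, yield_pending, 32, 0x8u);
}
```

In this model, model_replay_log(MODEL_SCHEDULED_TO_YIELD) finds no core at all, while model_replay_log(3) picks core 3. That matches the picture in the log: as long as the idle task on core 3 stays marked as scheduled to yield, a task pinned to core 3 has no core that can be asked to yield for it.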

So once pxCurrentTCBs[3]->xTaskRunState is set to -2, core 3 can no longer enter the software interrupt to call vTaskSwitchContext(), because nothing triggers it.

Should an IPI be sent to core 3 at a specific time to solve this problem? If so, when and where should it be sent?

I wrote a simple log function in mylog.c and mylog.h to log the events. tasks.c and FreeRTOS.h are also modified to add some tracepoints.
The code and logs are here.

Thank you for your time!

files-and-logs.7z (64.4 KB)

The log is long, so let me walk through the relevant details.

At line 293, cpu0 released the mutex, entered xTaskRemoveFromEventList, and moved thread_3 from the event list to the ready list:

[cpu 0] add (0x500b64a0, thread_3) to ready list in xTaskRemoveFromEventList

This was the last time thread_3 was touched. It was never scheduled after that.

Now let’s look at line 183:

[cpu 3], (0x50028740, IDLE2) continues to run on core 3
[cpu 3] traceRETURN_vTaskSwitchContext

This was the last time core 3 returned from vTaskSwitchContext() in the software interrupt. It never entered the software interrupt handler again after that, and kept running prvPassiveIdleTask until I stopped it.

At line 212, core0 entered vTaskYieldWithinAPI and was selecting a core to yield:

[cpu 1] traceRETURN_vTaskYieldWithinAPI, nesting = 0
[cpu 1] selecting cpu, xCoreID = 0, task-in-core: (IDLE3), core-running-state = 0, task-running-state = -1, pending = 0
[cpu 1] selecting cpu, xCoreID = 1, task-in-core: (Tmr Svc), core-running-state = 1, task-running-state = -1, pending = 0
[cpu 1] selecting cpu, xCoreID = 2, task-in-core: (IDLE0), core-running-state = 2, task-running-state = -1, pending = 0
[cpu 1] selecting cpu, xCoreID = 3, task-in-core: (IDLE2), core-running-state = -2, task-running-state = -1, pending = 0

We can see that pxCurrentTCBs[3]->xTaskRunState = -2, so this core didn’t meet the condition to yield for the task. thread_3, whose core affinity binds it to core 3, had nowhere to run, and this never changed until the end.

vPortYield_Core(xCoreID) is the port function called by portYIELD_CORE. There is no vPortYield_Core(3) in the log after line 212, so core 3 never received another IPI request, because of its taskTASK_SCHEDULED_TO_YIELD state.

Let’s consider the following scenario:

  1. thread_0, running on core_0, acquires the mutex and is executing the print statement.
  2. thread_1 on core_1 attempts to take the mutex and blocks. The moment it blocks, the only task eligible to run on this core is the idle task. And because multiple priorities are not allowed to run in parallel, no priority 32 task can run, and as a result thread_0 is evicted.

So this seems like an application bug to me. Can you try the following 2 scenarios:

  1. Set the priorities of all the tasks to 0.
  2. Set the priorities of all the tasks back to 32 and set configRUN_MULTIPLE_PRIORITIES to 1.
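
For the second scenario, the relevant part of FreeRTOSConfig.h might look roughly like the fragment below. Only configRUN_MULTIPLE_PRIORITIES comes from this discussion; the other values are my assumptions about the setup described above (4 cores, preemption enabled, core affinity in use, priority 32 as the highest) and need to be checked against the real config.

```c
/* FreeRTOSConfig.h fragment: a sketch, not a complete configuration. */
#define configNUMBER_OF_CORES            4    /* assumed: 4-core SoC */
#define configUSE_PREEMPTION             1    /* "Preemption is enabled" */
#define configUSE_CORE_AFFINITY          1    /* assumed: tasks are pinned to cores */
#define configMAX_PRIORITIES             33   /* assumed: makes 32 the highest priority */
#define configRUN_MULTIPLE_PRIORITIES    1    /* allow different priorities to run at once */
```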

But this seems to violate the principle of priority preemption as the thread with higher priority is evicted.

When I enable configRUN_MULTIPLE_PRIORITIES, it hits the configASSERT at line 850 in tasks.c, in prvCheckForRunStateChange:

            /* Enabling interrupts should cause this core to immediately
             * service the pending interrupt and yield. If the run state is still
             * yielding here then that is a problem. */
            configASSERT( pxThisTCB->xTaskRunState != taskTASK_SCHEDULED_TO_YIELD );

After reading my log, I found that it did enter the interrupt after portENABLE_INTERRUPTS but pxThisTCB->xTaskRunState didn’t change after vTaskSwitchContext.

What is the meaning of taskTASK_SCHEDULED_TO_YIELD? When does pxCurrentTCBs[ ( xCoreID ) ]->xTaskRunState change from taskTASK_SCHEDULED_TO_YIELD to the running state (xCoreID) or to taskTASK_NOT_RUNNING?

It does seem so, but think about which threads you could run that would satisfy the constraint “multiple priorities cannot run simultaneously”.

We have removed this assert in the latest version: FreeRTOS-Kernel/tasks.c at main · FreeRTOS/FreeRTOS-Kernel · GitHub. Please try the latest main code.

I’m developing the SDK, so a tagged release is required. But fortunately we can avoid this by clearing the interrupt pending bit BEFORE calling vTaskSwitchContext() in the software interrupt handler. This is the key to the problem, because a new IPI request might be sent after vTaskSwitchContext() but before returning from the interrupt. If the pending bit is cleared AFTER the context switch, we will miss the new interrupt request.
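
To show why the ordering matters, here is a tiny standalone simulation of the two handler variants. It is only a sketch: the pending bit is a plain bool, the context switch is a counter, and all model_* names are hypothetical. The real handler depends on the interrupt controller of the SoC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one core's software-interrupt state. */
typedef struct {
    bool ipi_pending;    /* models the software-interrupt pending bit */
    int  switches_done;  /* how many times the modelled context switch ran */
} model_core_t;

typedef void (*model_event_t)(model_core_t *core);

/* Another core requesting a yield: sets the pending bit. */
void model_send_ipi(model_core_t *core)
{
    core->ipi_pending = true;
}

/* Stand-in for vTaskSwitchContext(). */
void model_switch_context(model_core_t *core)
{
    core->switches_done++;
}

/* Variant 1: acknowledge first, then switch. A request that arrives
 * late in the handler leaves the pending bit set. */
void model_handler_clear_first(model_core_t *core, model_event_t late_event)
{
    core->ipi_pending = false;
    model_switch_context(core);
    if (late_event != NULL) {
        late_event(core);  /* IPI sent after the switch, before returning */
    }
}

/* Variant 2: switch first, then acknowledge. The late request is
 * wiped out together with the one being serviced. */
void model_handler_clear_last(model_core_t *core, model_event_t late_event)
{
    model_switch_context(core);
    if (late_event != NULL) {
        late_event(core);
    }
    core->ipi_pending = false;
}

/* Sends one IPI, lets a second one arrive late during the first handler
 * run, and returns how many context switches were performed in total. */
int model_count_switches(bool clear_first)
{
    model_core_t core = { false, 0 };
    bool first_entry = true;

    model_send_ipi(&core);
    while (core.ipi_pending) {
        model_event_t late = first_entry ? model_send_ipi : NULL;
        first_entry = false;
        if (clear_first) {
            model_handler_clear_first(&core, late);
        } else {
            model_handler_clear_last(&core, late);
        }
    }
    return core.switches_done;
}
```

With the pending bit cleared first, the late request causes a second pass through the handler, so model_count_switches(true) performs two switches. With the bit cleared last, model_count_switches(false) performs only one, and the second request is silently lost. That is the situation where a task stays marked as taskTASK_SCHEDULED_TO_YIELD with nothing left to trigger the switch.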

And as you mentioned above, setting configRUN_MULTIPLE_PRIORITIES to 1 is also necessary.

Thank you for your time!

You may want to use version 11.2.0, which we just released yesterday.