Consecutive executions of the scheduler

Hi,

We’d like to report an assertion failure caused by a starved task when the time slicing policy is specified.

When the time slicing policy is specified, the scheduler is required by either the timer interrupt or tasks. If the timer interrupt just interrupts the execution of the scheduler required by a task, there is a chance that the scheduler is executed consecutively. The task selected by the first scheduler will be immediately replaced by the latter. Hence, a task is scheduled, but not executed.

We believe the chance of consecutive executions of the scheduler depends on architecture. In ARMv7M, there is an optimization called “Tail Chaining”. This optimization takes a pending exception at the exception return phase. We believe “Tail Chaining” definitely makes consecutive executions of the scheduler happens. To be specific, consider a scenario that SysTick becomes pending when PendSV is executing. Because of tail chaining, the pending SysTick will be taken at the exception return phase of PendSV. Since the time slicing is specified, SysTick sets PendSV to the pending state and then PendSV is taken again.

We exploit the discovery to starve a victim task in BlockQ.c. The task is starved for a while that an assertion in BlockQ.c is eventually violated. In short, we create an additional task in BlockQ.c called PLUG. The PLUG task runs for the time slightly shorter than a time slice and yields. It intends to make SysTick interrupt the execution of the scheduler while the victim task is selected. We make the PLUG task continuously starve the victim task by controlling the run time of the PLUG task. After a while, the check task notices that the victim task has no progress and raise an error.

BlockQ_STM32F429I_GCC.zip (21.2 KB)

The code is modified from FreeRTOS/Demo/CORTEX_M4F_STM32F407ZG-SK. We compile the code with GCC and run the binary on STM32F429I board. By the way, there are GDB reports in the code showing that the PLUG task starves the correct victim task.

We were wondering is this a well known problem? If not, we propose a simple fix for the problem without modifying the FreeRTOS kernel code. The fix is to stop the timer interrupt from requiring the scheduler by disabling the time slicing policy. If the time slicing policy is needed, however, we have no general solution. One possibility is to break the continuous starvation by delaying a task for one time slice after the task yields (either through preemption or taskYIELD()). However, we are not satisfied with the latter solution. We would appreciate any opinion from the community about possible solutions or the problem.

Thanks,
Kai

Hi Kai,

In general, if a task is willing to take 100% of the CPU, then FreeRTOS will allow it to do so when there are no higher priority tasks ready to execute. As you describe, other tasks at the same priority are not guaranteed to get any CPU time either.

The real cause of the issue is your task design and your task priority. Tasks should be designed to accomplish their work and then wait for more work. If your task’s work cannot fit into that design model (say for example a task that works continually to find the next larger prime number), then it should have the lowest priority. If you have more than one of those tasks, then you would need to implement your own purpose-built CPU sharing mechanism, such as the one you describe (no time slicing) or something else via IPC.

As a side note, tail chaining is not relevant here as the same thing occurs in CPUs that don’t tail chain. Tail chaining merely speeds entry into the next ISR. Without tail chaining, the CPU would still go from one ISR directly to the next ISR.

In general, if a task is willing to take 100% of the CPU, then FreeRTOS will allow it to do so when there are no higher priority tasks ready to execute.

I’m slightly confused by this line. Does this line still hold with the time slicing policy? I thought the FreeRTOS with time slicing policy will make the same priority tasks share the CPU.

By the way, thank you for the side note.

Or, more simply stated, time-slicing will HELP TO, but not GUARANTEE tasks of a given level get time. A “more complicated” slicing algorithm could be used, but that would always cost time/space.

FreeRTOS assumes that the programmer understands the consequences of his decisions, and trusts him.

Yes.

The time-slicing policy typically results in tasks of the same priority sharing the CPU equally. However, the actual result depends on the tasks themselves. In a contrived case (like the one you shared in your post), a task could starve.