Hi,
We’d like to report an assertion failure caused by task starvation when time slicing is enabled.
With time slicing enabled, a context switch can be requested either by the timer interrupt or by a task. If the timer interrupt arrives while the scheduler is already running on behalf of a task, the scheduler may run twice in a row. The task selected by the first run is immediately replaced by the task selected by the second. Hence, a task is scheduled but never executed.
We believe the likelihood of back-to-back scheduler runs depends on the architecture. ARMv7-M has an optimization called “tail chaining”: if an exception is pending when another exception returns, the processor takes the pending exception directly instead of first returning to the interrupted code. We believe tail chaining guarantees back-to-back scheduler runs whenever the tick arrives during a context switch. Specifically, consider a scenario in which SysTick becomes pending while PendSV is executing. Because of tail chaining, the pending SysTick is taken at the exception return of PendSV. Since time slicing is enabled, SysTick sets PendSV pending, and PendSV is then taken again.
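To make the sequence concrete, here is a minimal host-side sketch of the scenario. It is a toy model and not the FreeRTOS port code: `struct sched_model`, `pendsv_handler`, `systick_handler`, and `yield_and_return` are hypothetical names, the two flags stand in for the pending bits, and the scheduler is reduced to a round-robin pick among three equal-priority tasks.

```c
#include <stdbool.h>

#define NUM_TASKS 3

/* Toy model (assumption): two pending bits and a round-robin pick. */
struct sched_model {
    bool pendsv_pending;          /* models the PendSV pending bit */
    bool systick_pending;         /* models the SysTick pending bit */
    int  selected;                /* task picked by the most recent PendSV */
    int  selections[NUM_TASKS];   /* how often each task was picked */
    int  executions[NUM_TASKS];   /* how often each task reached thread mode */
};

/* PendSV: pick the next equal-priority task in round-robin order. */
static void pendsv_handler(struct sched_model *m)
{
    m->pendsv_pending = false;
    m->selected = (m->selected + 1) % NUM_TASKS;
    m->selections[m->selected]++;
}

/* SysTick with time slicing: every tick requests a context switch. */
static void systick_handler(struct sched_model *m)
{
    m->systick_pending = false;
    m->pendsv_pending = true;
}

/* The running task yields. Pending exceptions are then taken back to
 * back (tail chaining) and thread mode resumes only once none is left.
 * In this model the two bits are never set at the same time, so their
 * relative priority does not matter. */
static void yield_and_return(struct sched_model *m, bool tick_during_pendsv)
{
    m->pendsv_pending = true;
    while (m->pendsv_pending || m->systick_pending) {
        if (m->pendsv_pending) {
            pendsv_handler(m);
            if (tick_during_pendsv) {
                m->systick_pending = true;   /* tick fires inside PendSV */
                tick_during_pendsv = false;
            }
        } else {
            systick_handler(m);
        }
    }
    m->executions[m->selected]++;   /* only this task actually runs */
}
```

Starting from a zero-initialized model with task 0 running, `yield_and_return(&m, true)` leaves task 1 selected once but executed zero times, while task 2 runs instead. With `false` as the second argument, task 1 is selected and executed as expected.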
We exploited this behavior to starve a victim task in BlockQ.c. The task is starved long enough that an assertion in BlockQ.c is eventually violated. In short, we create an additional task in BlockQ.c called PLUG. The PLUG task runs for slightly less than one time slice and then yields, so that SysTick becomes pending while the scheduler is selecting the victim task. By controlling the run time of the PLUG task, we make it starve the victim task continuously. After a while, the check task notices that the victim task has made no progress and raises an error.
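The timing argument can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions and not the code from the attachment: the names (`struct starve_sim`, `context_switch`, `simulate`), the tick period of 1000 time units, and the context switch cost of 10 units are all made up for illustration. Three equal-priority tasks are scheduled round-robin in the order PLUG, VICTIM, OTHER.

```c
#include <stdbool.h>

enum { PLUG, VICTIM, OTHER, TASK_COUNT };

#define TICK_PERIOD 1000   /* time units per tick (assumed value) */
#define SWITCH_COST 10     /* time units one PendSV takes (assumed value) */

struct starve_sim {
    long now;                    /* current time */
    int  current;                /* task selected by the last PendSV */
    long run_time[TASK_COUNT];   /* thread-mode time each task received */
    int  picked[TASK_COUNT];     /* times the scheduler selected each task */
};

/* One PendSV: advance the round robin and charge the switch cost.
 * Returns true if a tick boundary was reached while switching. */
static bool context_switch(struct starve_sim *s)
{
    long next_tick = (s->now / TICK_PERIOD + 1) * TICK_PERIOD;
    s->current = (s->current + 1) % TASK_COUNT;
    s->picked[s->current]++;
    s->now += SWITCH_COST;
    return s->now >= next_tick;
}

/* PLUG yields plug_margin units before each tick; every other task
 * runs until the tick preempts it. */
static void simulate(struct starve_sim *s, long plug_margin, int ticks)
{
    long end = (long)ticks * TICK_PERIOD;
    while (s->now < end) {
        long next_tick = (s->now / TICK_PERIOD + 1) * TICK_PERIOD;
        long stop = next_tick;
        if (s->current == PLUG && next_tick - plug_margin > s->now)
            stop = next_tick - plug_margin;
        s->run_time[s->current] += stop - s->now;
        s->now = stop;
        /* If the tick fires during PendSV, SysTick is tail-chained and
         * requests a second PendSV before any task runs. */
        if (context_switch(s))
            context_switch(s);
    }
}
```

In this model, a margin shorter than the switch cost (for example `simulate(&s, 5, 100)` on a zero-initialized `s`) leaves VICTIM selected repeatedly with zero run time, whereas a margin longer than the switch cost (for example 50) lets VICTIM run briefly in every round.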
BlockQ_STM32F429I_GCC.zip (21.2 KB)
The code is modified from FreeRTOS/Demo/CORTEX_M4F_STM32F407ZG-SK. We compiled the code with GCC and ran the binary on an STM32F429I board. The code also includes GDB reports showing that the PLUG task starves the intended victim task.
Is this a well-known problem? If not, we propose a simple fix that does not require modifying the FreeRTOS kernel code: disable the time slicing policy, so that the timer interrupt no longer requests the scheduler. If time slicing is needed, however, we have no general solution. One possibility is to break the continuous starvation by delaying a task for one time slice after it yields (either through preemption or taskYIELD()). However, we are not satisfied with this workaround. We would appreciate any opinions from the community about the problem or possible solutions.
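For reference, disabling time slicing is a configuration change only. A minimal sketch of the relevant lines in FreeRTOSConfig.h, assuming the rest of the demo configuration stays as it is:

```c
/* FreeRTOSConfig.h (fragment) */
#define configUSE_PREEMPTION    1   /* keep priority-based preemption */
#define configUSE_TIME_SLICING  0   /* tick no longer rotates equal-priority tasks */
```

With this setting, tasks of equal priority switch only when the running task blocks or yields. As far as we understand, the tick interrupt can still request a context switch when it unblocks a task of higher priority, so this removes the per-tick rotation rather than every tick-driven switch.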
Thanks,
Kai