A scenario where a task in a ready list is not executed on the ARM_CM4F port

As the title suggests, I’m wondering whether this scenario is plausible in a real-world environment. I run the demo program (Demo/CORTEX_M4F_STM32F407ZG-SK) on an STM32F429 DISCO board, but compile only one application (Common/Minimal/dynamic.c). Note that configUSE_PORT_OPTIMISED_TASK_SELECTION is 0 and there are six tasks in the ready lists: “CNT_INC”, “LIM_INC”, “C_CTRL”, “SUSP_TX”, “SUSP_RX”, and “IDLE”. The scenario** begins when “CNT_INC”, “SUSP_RX”, and “IDLE”, all at priority 0, remain in the ready list (pxReadyTasksLists[0]). “SUSP_RX” is the task that I doubt gets executed, under three conditions.

The first condition is that “CNT_INC” changes its priority value. When “CNT_INC” is moved to a higher ready list, such as pxReadyTasksLists[1], the vacancy left by “CNT_INC” in pxReadyTasksLists[0] is taken by the previous task, because pxIndex cannot point to NULL. This reduces the opportunity for the replacing task (“IDLE” in this case) to be selected by the scheduler.

If “CNT_INC” does not change its priority value and nothing else in the ready lists changes, the scheduler will select the next ready task in the following sequence.

“CNT_INC” → “SUSP_RX” → “IDLE” → “CNT_INC” → “SUSP_RX” → “IDLE” → …

However, through GDB I observe the scheduler selecting the next ready task in the following sequence when running the demo program.

“CNT_INC” → “SUSP_RX” → “CNT_INC” → “IDLE” → “CNT_INC” → “SUSP_RX” → “CNT_INC” → “IDLE” → …
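
For reference, since configUSE_PORT_OPTIMISED_TASK_SELECTION is 0, the selection is the round-robin walk below (paraphrased from the kernel’s list.h; member names may differ slightly between kernel versions):

```c
/* Paraphrased from FreeRTOS list.h. With the generic (non-optimised) task
 * selection, taskSELECT_HIGHEST_PRIORITY_TASK() applies this macro to the
 * highest priority non-empty ready list: pxIndex advances to the next list
 * item, skips the end-of-list marker, and the owner (TCB) of that item
 * becomes pxCurrentTCB. */
#define listGET_OWNER_OF_NEXT_ENTRY( pxTCB, pxList )                                        \
{                                                                                           \
    List_t * const pxConstList = ( pxList );                                                \
    /* Advance the index to the next item and return its owner, making sure */              \
    /* the end-of-list marker is never returned. */                                         \
    ( pxConstList )->pxIndex = ( pxConstList )->pxIndex->pxNext;                            \
    if( ( void * ) ( pxConstList )->pxIndex == ( void * ) &( ( pxConstList )->xListEnd ) )  \
    {                                                                                       \
        ( pxConstList )->pxIndex = ( pxConstList )->pxIndex->pxNext;                        \
    }                                                                                       \
    ( pxTCB ) = ( pxConstList )->pxIndex->pvOwner;                                          \
}
```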

The second condition is the tail-chaining mechanism of the ARMv7-M architecture. User tasks can invoke taskYIELD() to relinquish the processor. If SysTick generates an IRQ*** before PendSV returns to Thread mode, PendSV will not return to Thread mode but to Handler mode (SysTick in this case) because of tail-chaining. SysTick then requests PendSV again, which flushes out the first selected task. In other words, the first selected task is selected by the scheduler but never executed by the processor.

Considering the second task sequence above, the third condition is that SysTick generates the IRQ exactly between “CNT_INC” invoking taskYIELD() and PendSV returning to “SUSP_RX”. In other words, SysTick repeatedly splits the processing time in a specific pattern. The repeated pattern is illustrated below.

This condition continues until a delayed task, “SUSP_TX” in this case, is woken. “SUSP_RX” is supposed to react to “SUSP_TX”; however, it is flushed out because of the second condition.

In summary, the first condition can be observed in a real environment and the second condition is based on the documentation. The third condition, however, seems unlikely, but I wonder whether a program could hit it coincidentally. Consider the processing of the “IDLE” task in the repeated pattern: it could be any other task that runs just long enough for SysTick to split the processing time as in the third condition. For example, the maximum frequency of the STM32F429 DISCO is 180 MHz and the SysTick rate is 1 kHz. If the repeated pattern takes exactly 180 K cycles to execute, the third condition might be satisfied. And things might go wrong if the task that is flushed out needs to react to others in time.

Please let me know if I have missed any information.

** “LIM_INC” is suspended; “C_CTRL” invokes vTaskDelay(), and “SUSP_TX” is also delayed because it cannot send more data to a full queue.
*** Because SysTick and PendSV have the same exception priority, SysTick sets itself to the pending state.

There is a lot to digest here, so let’s start with this part.

Are you saying the pre-condition here is that all four tasks are at the idle priority and do not change their priority so you would expect to see each execute in turn, but you are observing that CNT_INC is executing more often? What are the other tasks (LIM_INC, SUSP_TX, etc.) doing at this time? Are they also running?

No, only three tasks (“CNT_INC”, “SUSP_RX”, and “IDLE”) are in the ready list. And yes, they are at the idle priority. As for the others: “LIM_INC” suspends itself, “C_CTRL” invokes vTaskDelay(), and “SUSP_TX” is also delayed because at this time the queue is full.

“CNT_INC” → “SUSP_RX” → “CNT_INC” → “IDLE” → “CNT_INC” → “SUSP_RX” → “CNT_INC” → “IDLE” → …

Each task in the sequence is the currently running task, which is pointed to by pxCurrentTCB. The arrow means the scheduler selects the next task and switches context to it. “CNT_INC” is definitely the only running task when it raises its priority value. After it resets its priority back to the idle priority, vTaskPrioritySet() decides to yield. “SUSP_RX” continues until it is interrupted by SysTick and rescheduled by PendSV. “IDLE” also yields.

The first condition begins when the three tasks are placed in the ready list in this order.

pxReadyTasksLists[0]: “CNT_INC”, “SUSP_RX”, “IDLE”
pxReadyTasksLists[1]:

When “CNT_INC” raises its priority value, it is removed from pxReadyTasksLists[0]. The vacancy left by “CNT_INC” in pxReadyTasksLists[0] is taken by the previous task (“IDLE” in this case) because pxIndex cannot be a NULL pointer.

pxReadyTasksLists[0]: “IDLE”, “SUSP_RX”
pxReadyTasksLists[1]: “CNT_INC”

“CNT_INC” is appended to the ready list after it resets its priority value.

pxReadyTasksLists[0]: “IDLE”, “SUSP_RX”, “CNT_INC”
pxReadyTasksLists[1]:

If no task priority is changed, the next selection sequence is “SUSP_RX”, “CNT_INC”, and “IDLE”. The opportunity for “IDLE” to be selected by the scheduler is reduced.
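
The pxIndex adjustment in the walkthrough above comes from the list removal code (paraphrased from list.c; member names such as pxContainer vary slightly between kernel versions):

```c
/* Paraphrased from FreeRTOS list.c. If the item being removed is the one the
 * round-robin index currently points at, the index is moved back to the
 * previous item, which is how "IDLE" takes over the position of "CNT_INC"
 * in the walkthrough above. */
UBaseType_t uxListRemove( ListItem_t * const pxItemToRemove )
{
    List_t * const pxList = pxItemToRemove->pxContainer;

    /* Unlink the item from the list. */
    pxItemToRemove->pxNext->pxPrevious = pxItemToRemove->pxPrevious;
    pxItemToRemove->pxPrevious->pxNext = pxItemToRemove->pxNext;

    /* Make sure the index is left pointing at a valid item. */
    if( pxList->pxIndex == pxItemToRemove )
    {
        pxList->pxIndex = pxItemToRemove->pxPrevious;
    }

    pxItemToRemove->pxContainer = NULL;
    ( pxList->uxNumberOfItems )--;

    return pxList->uxNumberOfItems;
}
```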

I do not think this is a serious problem on its own, since a task has changed its priority value. Moreover, the effect is mitigated if there are more than three tasks at the idle priority. The problem I’d like to highlight is that this condition, together with two others, might cause an unexpected event. However, I have only researched the other two conditions from the documentation, so I opened this discussion to get more information from the FreeRTOS experts.

I will need to study this in more detail to know whether there is a problem or not. The FreeRTOS scheduler will select each task in sequence from the ready list (it is a simple walk of the linked list), but makes no assertions about each task at the same priority getting the same amount of execution time. If a task has run for only a small fraction of its time slice when an interrupt occurs and that interrupt causes a context switch, it will get pre-empted. Likewise, if you have multiple yields pending at the lowest priority, each will get processed - although in the case of an interrupt on a Cortex-M the yield interrupt has the lowest priority, so it will only occur when there are no higher priority interrupts pending or nested - you can have multiple interrupts at the lowest priority though. I think you know all this already - but I’m explaining for other readers.

I appreciate your response. So, I will continue with the second condition.

The second condition is the tail-chaining mechanism. According to B1.5.12 in the ARMv7-M Architecture Reference Manual, another asynchronous exception might already be pending or arrive during exception return. The tail-chaining mechanism optimizes the interrupt response when there is a pending exception with higher priority than the target[1]. In this case, the processor takes the pending exception immediately on exception return. Furthermore, according to the Cortex-M4 and Cortex-M3 Devices Generic User Guides, tail-chaining occurs if, on completion of an exception handler, there is a pending exception that meets the requirements for exception entry. Note that one of the requirements for exception return is that there is no pending exception with sufficient priority to be served.

In FreeRTOS, user tasks can invoke taskYIELD() to relinquish the processor. After PendSV is set to the pending state, a DSB and an ISB instruction ensure the side effect is recognized by the processor[2]. Assuming the exception priority is sufficient to be served, PendSV is taken immediately. SysTick may generate an IRQ during exception entry[3] or after PendSV becomes active[4]; however, it is set to the pending state because PendSV and SysTick have the same exception priority. After the PendSV handler selects the next ready task and switches context to it, PendSV executes a BX instruction to return to Thread mode[5] (return address 0xFFFFFFFD). The BX instruction triggers the exception return[6], which also enables the tail-chaining mechanism. Based on the documentation, PendSV returns to the pending SysTick rather than to the selected task. The SysTick handler then requests PendSV again, and the second PendSV flushes the first selected task out.
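
For completeness, the yield itself is just a write to the Interrupt Control and State Register followed by those barriers (paraphrased from the ARM_CM4F port’s portmacro.h):

```c
/* Paraphrased from the ARM_CM4F port layer. taskYIELD() expands to portYIELD(),
 * which pends PendSV and then issues DSB/ISB so the pended exception is
 * recognised before the next instruction of the task executes. */
#define portNVIC_INT_CTRL_REG     ( *( ( volatile uint32_t * ) 0xE000ED04 ) )
#define portNVIC_PENDSVSET_BIT    ( 1UL << 28UL )

#define portYIELD()                                                 \
{                                                                   \
    /* Set a PendSV to request a context switch. */                 \
    portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;                 \
                                                                    \
    /* Barriers so the ICSR write takes effect before execution */  \
    /* of the task continues. */                                    \
    __asm volatile ( "dsb" ::: "memory" );                          \
    __asm volatile ( "isb" );                                       \
}
```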

I will try to run an experiment in the real environment next week. In my opinion, this rarely happens outside an extreme condition (the third condition). Moreover, tail-chaining belongs to ARMv7-M only (I haven’t dug into ARMv8 or other architectures, like RISC-V).

[1] The target is the task (Thread mode) or exception (Handler mode) to which the exception return is going to return. Note that any task has the lowest exception priority (see the “Execution priority and priority boosting” section).
[2] See “System Control Space (SCS) Programming” in section 3.3.
[3] See B1.5.11, “Exceptions on exception entry”.
[4] See “Active state” at the end of this page.
[5] See Table B1-8 in B1.5.8, “Exception return behavior”.
[6] See B1.5.8 “Exception return behavior”, A7.7.20 “BX”, and BXWritePC() in the section “Pseudocode details of ARM core register operations”.

Yes, I am familiar with how the Cortex-M works. What you describe here is what I referred to in my previous post:

I apologize for giving too much detail in my last post. I’d like to focus my problem on three conditions.

The first is that the scheduler does not sequentially walk the ready list when a task’s priority has been modified. The second is that the ARMv7-M tail-chaining mechanism might result in concatenated yields, which make the FreeRTOS scheduler skip over a ready task. The third is an extreme case: if the second condition happens repeatedly, a delayed task might be woken before the skipped task can execute. This condition breaks task synchronization.

On cases 2 and 3, you may be accidentally describing a system that simply has too much work to do. Not enough CPU. You have a task delaying for something that happens right away. In that case, yes, all the starvation could conceivably land on one task, even if it shares the same priority as other tasks getting lots of CPU time.

And tail chaining seems to be irrelevant here. A yield can give 0 cycles to a task, and then we can have another yield, all without tail chaining. Would you be less concerned if a yield allowed the task to execute one instruction before another yield gives the CPU back to the CPU hogging task?

Yes, maybe I am too concerned about cases 2 and 3. Thanks for the reply.

I misunderstood this question. Case 2 alone is not my concern. If a yield gives 0 cycles to a task and another yield happens, the task is swapped out by the scheduler without having been executed. Assume the swapped task needs to respond to a delayed task (e.g. the SEND/RECV tasks in dynamic.c). My concern in case 3 is that the delayed task might wrongly conclude that the swapped task has executed and cause an error, no matter how many ticks it delays. To arrive at case 3, cases 1 and 2 must be considered together. A program might coincidentally be written such that SysTick fires repeatedly and exactly just before the yield to the swapped task (e.g. “SUSP_RX” in the figure below). After the first yield, the processor tail-chains to SysTick, which requests PendSV (the second yield) asynchronously. This causes a task in a ready list not to be executed until the delayed task is woken. I’m wondering whether case 3 is a possible scenario or just over-concern on my part.

When two tasks are at the same priority, and one of the tasks yields just before the tick interrupt, it can starve the second task. Giving it exactly 0 cycles is hard, but giving it very few isn’t that hard, possibly even accidentally. The idle task has a “should yield” option in part to handle this (I presume). With a yield, the task at the idle priority just behind the idle task will get less time per cycle than other tasks that might be at the idle priority.

One partial fix would be for the port layer to provide a function that determines whether we are ‘near’ the end of a tick, so that a task that is yielded to after that point gets to continue past the tick. You would still have the issue that tasks pre-empted by higher priority tasks won’t get a full slot.

To handle something like that, you would need to get some high resolution time value (like that used for task stats) and keep track of how much time a task has had since it was switched in by round robin, and not switch it out until it has achieved a certain minimum execution time. (Many details in that would likely need to be resolved.)
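
Something along these lines, as a purely hypothetical sketch - ulPortGetRunTimeCounter() stands in for whatever high resolution counter the port already provides for run-time stats, and the hook into the scheduler does not exist today:

```c
#include <stdint.h>
#include "FreeRTOS.h"

/* Hypothetical sketch only - none of these names exist in the kernel. For
 * simplicity this measures elapsed counter time since the task was switched in
 * by round robin; the real thing would need to accumulate actual execution
 * time across pre-emptions, stored per task. */
extern uint32_t ulPortGetRunTimeCounter( void );   /* hypothetical port function */

#define runtimeMIN_EXECUTION_TIME    1000UL        /* counter units, value is arbitrary */

static uint32_t ulSwitchedInTime;                  /* recorded when round robin switches a task in */

static BaseType_t prvTaskHadItsMinimumSlice( void )
{
    uint32_t ulNow = ulPortGetRunTimeCounter();

    /* Only move the task to the end of its ready list once it has had at
     * least the minimum execution time since round robin switched it in. */
    return ( ( ulNow - ulSwitchedInTime ) >= runtimeMIN_EXECUTION_TIME ) ? pdTRUE : pdFALSE;
}
```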

@kaizsv, from what I see, everything you describe actually can happen (and without tail chaining, too). I believe the issue at hand is whether to be concerned about it.

I might have the wrong idea, but I think you might be describing the risks of using the FreeRTOS scheduler as a synchronization mechanism. For example, Task A does job a repeatedly, Task B does job b repeatedly, and other tasks do other jobs. And the system requirement states that jobs a and b must be done one after the other precisely 1:1.

If I am understanding your concern correctly, then it may help to know that the scheduler is not intended to be a task-synchronization mechanism.

If you switch to using synchronization primitives (like semaphores, queues, task notifications, etc), does that alleviate your concern?
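
For example, here is a minimal sketch using direct-to-task notifications - the task names, handle, and vDoJobA()/vDoJobB() are invented for illustration:

```c
#include "FreeRTOS.h"
#include "task.h"

/* Hypothetical handle and job functions, purely for illustration. */
static TaskHandle_t xTaskBHandle;
extern void vDoJobA( void );
extern void vDoJobB( void );

static void vTaskA( void * pvParameters )
{
    ( void ) pvParameters;
    for( ;; )
    {
        vDoJobA();
        /* Tell task B that one instance of job a has completed. */
        xTaskNotifyGive( xTaskBHandle );
    }
}

static void vTaskB( void * pvParameters )
{
    ( void ) pvParameters;
    for( ;; )
    {
        /* Block until task A signals; decrementing (pdFALSE) makes the
         * notification act as a counting semaphore, so each job a is matched
         * by exactly one job b - no reliance on the round-robin order of the
         * ready list or on how the tick slices time between the tasks. */
        ulTaskNotifyTake( pdFALSE, portMAX_DELAY );
        vDoJobB();
    }
}
```

If strict lockstep is required in both directions, task B can notify task A back in the same way before task A starts the next job a.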


@richard-damon Thanks for the response. Maybe my wording was imprecise. My original intention was to describe a synchronization issue. The IDLE task I mentioned in case 3 of my first post stands for a possible other program that has one more task in addition to the dynamic.c application. This additional task has the idle priority and could accidentally be written such that SysTick is generated repeatedly and exactly just before the yield to the starved task. In other words, although the IDLE task itself is forbidden from invoking any blocking function, there might be other tasks at the idle priority doing the same jobs.

@jefftenney Thanks for the response. I think this information alleviates my concern about cases 2 and 3.

Because synchronization primitives may place the current task on an event list, if another task is starved (and cannot check the event list) then the same issue might still happen. But this is just an extreme case.

Actually, I’m trying to model the FreeRTOS kernel and verify it with the Spin model checker. For cases 1 to 3, I was wondering whether they happen only in the abstract model. I think my concern is alleviated; thanks for all the responses.

My understanding of the issues presented is that there are two very different issues here.

The first is an issue caused by repeatedly changing the priorities of tasks. I can believe there are conditions where continual adjustment of priority might affect the sequencing of tasks at the same priority level and possibly cause starvation. The simple solution here is: don’t keep changing the priority level of tasks. It is hard for me to imagine why a task would need rapid, repeated changes of priority; that sounds to me like the partitioning of jobs between tasks was done wrong somewhere.

The second case is a task yielding very close to the tick interrupt, and starving the next task in sequence by always leaving it very little (or zero) time in the tick. This is the case where I was suggesting it could be fixed by adding a port function to detect near-end-of-tick switch-ins and giving those tasks a second tick to run. If a port can’t easily detect where in the tick it is, it could just declare that it is ALWAYS almost at the end of the tick, and then tasks will get 2 ticks to run, the first of which might be shortened by receiving time from a yield.
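
On a Cortex-M port such a check could be as simple as comparing the SysTick current-value register against its reload value - a hypothetical sketch, xPortTickIsNearlyOver() is not an existing port function:

```c
#include <stdint.h>
#include "FreeRTOS.h"

/* Hypothetical helper, not part of any existing port. The Cortex-M SysTick
 * counts down from the reload value to zero once per tick, so a small current
 * value means the next tick interrupt is close. */
#define portSYSTICK_LOAD_REG       ( *( ( volatile uint32_t * ) 0xE000E014 ) )
#define portSYSTICK_CURRENT_REG    ( *( ( volatile uint32_t * ) 0xE000E018 ) )

BaseType_t xPortTickIsNearlyOver( void )
{
    /* "Near the end" is arbitrarily defined here as the last quarter of the period. */
    return ( portSYSTICK_CURRENT_REG < ( portSYSTICK_LOAD_REG / 4UL ) ) ? pdTRUE : pdFALSE;
}
```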

A second mitigation is that tasks need to be careful about yielding. The idle task has an option to tell it never to yield but always use its full quantum, so that if there are multiple other tasks at the idle priority that don’t block, they will all get the same amount of time (adjusted by the time taken by higher priority tasks).
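
That option is configIDLE_SHOULD_YIELD in FreeRTOSConfig.h; setting it to 0 keeps the idle task running for its whole time slice:

```c
/* In FreeRTOSConfig.h. With this set to 0 the idle task does not yield to
 * other idle-priority tasks part way through its time slice, so every task at
 * the idle priority receives whole slices (less whatever higher priority
 * tasks consume). */
#define configIDLE_SHOULD_YIELD    0
```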

You also need to be careful about tasks that block for 1 tick, since a 1-tick block can amount to virtually 0 time (if started late in the tick); normally that isn’t a desired condition.


Yes, and I appreciate your help in summarizing this scenario.

Thinking a bit more about this, the fundamental issue is that the round-robin scheduler doesn’t make any promises about ‘fairness’, and it is quite possible for poorly designed programs to accidentally (or intentionally) starve one of the tasks. I can see several approaches that might provide a higher level of fairness.

  1. Allow raising the quantum for round-robin from every tick to every N ticks (a minimal sketch of this idea follows the list). If N is 2, then even if a task gets its turn at the very end of a tick, it still gets a full tick as the running task at that priority. This is similar to how vTaskDelay(N) will actually delay between N-1 and N ticks depending on when in the tick it is executed.

  2. The disadvantage of the above is that we have slowed down the rate at which the tasks get executed, which might be an issue in some cases. A solution to that would be to add a function in the port layer to indicate that we are near the end of a tick period, and a task that gets its turn in that period gets an extra tick of quantum. If the ‘end’ is defined as the last 1/4, then every task will get between 1/4 and 1 1/4 tick periods with a quantum of 1. This could also be used with higher quantums to divide the time more evenly. We might even define a variant of vTaskDelay that increments its delay time in that case, making vTaskDelay(1) have a larger guaranteed delay.

  3. Both of those suffer from the issue that if higher priority tasks take up significant CPU time, then the task might not get as much work done itself. One solution would be to use a higher-frequency counter, like the one used for run-time stats (maybe allowing something of higher frequency since we don’t need to accumulate over extended time periods), and have the scheduler not put the task at the end of the list until it has used up its allocation, at which point its accumulated time gets reset back to zero.
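
As promised in item 1, a minimal, entirely hypothetical sketch of the N-tick quantum - xQuantumExpired() does not exist in the kernel and would have to be called from the tick interrupt in place of the current once-per-tick switch request:

```c
#include "FreeRTOS.h"

/* Hypothetical sketch of approach 1 - not an existing kernel option. The idea
 * is to request a round-robin switch only every Nth tick, so a task switched
 * in late in one tick still sees at least one full tick before its quantum
 * expires. */
#define quantumTICKS_PER_TIME_SLICE    ( ( UBaseType_t ) 2 )

static UBaseType_t uxTicksThisSlice = 0;

/* Would be called from the tick interrupt to decide whether the round-robin
 * context switch should be requested on this tick. */
BaseType_t xQuantumExpired( void )
{
    uxTicksThisSlice++;

    if( uxTicksThisSlice >= quantumTICKS_PER_TIME_SLICE )
    {
        uxTicksThisSlice = 0;
        return pdTRUE;      /* request the context switch */
    }

    return pdFALSE;         /* let the current task keep its quantum */
}
```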