How to detect task overruns

EugeneK · May 1, 2025, 7:04pm

Hello,

Is it possible to detect task period overrun with FreeRTOS and consecutive activation before previous instance of this task finishes?

For context: I have an external fixed scheduler activating tasks at a fixed period using direct-to-task notifications.

What happens when new notification is sent to a task before previous task instance finishes its execution?
How could I detect if next period activation happens before task from previous one completes? Basically, before ulTaskNotifyTake() call in the task body.
Is it possible to set activation limit per task in the FreeRTOS itself and have some error response or callout?

Regards,
Eugene

rtel · May 1, 2025, 7:23pm

What happens when new notification is sent to a task before previous task instance finishes its execution?

It depends on how you use the task notification. In your case I suggest using the notification value as a counter so notifications sent before the previous period completes are latched. The count is the difference between how many times the notification has been “given” and “taken”.

How could I detect if next period activation happens before task from previous one completes? Basically, before ulTaskNotifyTake() call in the task body.

Look at the notification count value - but be careful of race conditions. How/if this works will depend on the priorities of the tasks “giving” and “taking” the counter relative to each other.

Is it possible to set activation limit per task in the FreeRTOS itself and have some error response or callout?

There isn’t any built-in mechanism that enables that.

richard-damon · May 1, 2025, 7:30pm

The best way to detect that a new activation came before the previous one finished is to have the task, once it is done, check whether another activation is already pending by making a 0-tick call to ulTaskNotifyTake(), and flag some sort of error in your code if one has arrived.

EugeneK · May 1, 2025, 7:49pm

@rtel If I understand correctly, the notification count value is returned by ulTaskNotifyTake.

Using ulTaskNotifyTake is somewhat late because the task is already in overrun state.

Ideally, vTaskNotifyGiveFromISR call is the time when multiple activations should be detected. However, that API returns void. Is there a way to access notification count from the call site of vTaskNotifyGiveFromISR?

Regards,
Eugene

EugeneK · May 1, 2025, 7:55pm

@richard-damon IMHO, the overrun check should not be done by the overrunning task. It has overrun, and anything that it does is potentially incorrect. It is the responsibility of the OS or the external scheduler.

It would be helpful to have some indication at the time of notification or task activation.

Regards,
Eugene

rtel · May 1, 2025, 8:15pm

You can use ulTaskNotifyValueClear() to query a task’s notification value (the count in this case) without actually clearing the value, just by setting the “bits to clear” parameter to 0.

richard-damon · May 1, 2025, 8:15pm

Would using xTaskNotifyAndQuery meet your needs? To do this you would need to do the Take without clearing, and then when done do the take again to clear it.

EugeneK · May 2, 2025, 6:32pm

@richard-damon I tried xTaskNotifyAndQueryFromISR( handle, eIncrement, 0U, &prev_notify_val, &temp_yield) and it wreaked havoc in my system. According to Percepio, tasks would still run, but strangely, and the notify value was not incremented at all, whereas vTaskNotifyGiveFromISR seems to work fine. Not sure what to make of it.

EugeneK · May 2, 2025, 6:41pm

@rtel ulTaskNotifyValueClear seems to work, but that does not solve the problem: if the task is executing (the notification was taken) when a new notification is sent, the count is 1, which does not indicate a problem even though there apparently is one.

I wonder if I can somehow query the task state and use that to detect multiple notifications. Something like eTaskGetState, but that has some limitations for use in an ISR. Other options require configUSE_TRACE_FACILITY, which is not feasible. Any suggestions?

Regards,
Eugene

richard-damon · May 2, 2025, 8:52pm

That seems to have the parameters swapped (value should be before action). I would need to check, but I suspect operation 0 is eNoAction, so that is what it did.
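A minimal host-side sketch of why the swapped call leaves the value unchanged. It does not call FreeRTOS: `ModelAction` and `model_notify_and_query` are hypothetical stand-ins, and the enumerator order is assumed to mirror `eNotifyAction` in task.h (eNoAction is 0, eIncrement is 2). C converts silently between integers and enum values, so the swapped call compiles without complaint. With the real API, the corrected order would be xTaskNotifyAndQueryFromISR( handle, 0U, eIncrement, &prev_notify_val, &temp_yield).

```c
#include <stdint.h>

/* Host-side model only. The order is assumed to mirror eNotifyAction
 * in FreeRTOS task.h, where eNoAction is 0 and eIncrement is 2. */
typedef enum {
    MODEL_NO_ACTION = 0,
    MODEL_SET_BITS,
    MODEL_INCREMENT,
    MODEL_SET_VALUE_WITH_OVERWRITE,
    MODEL_SET_VALUE_WITHOUT_OVERWRITE
} ModelAction;

/* Simplified stand-in for the (value, action) part of
 * xTaskNotifyAndQueryFromISR(): report the previous value, then apply
 * the action. Only the two actions relevant here are modeled. */
static void model_notify_and_query(uint32_t *notify_value, uint32_t value,
                                   ModelAction action, uint32_t *previous)
{
    *previous = *notify_value;
    if (action == MODEL_INCREMENT) {
        (*notify_value)++;
    }
    /* MODEL_NO_ACTION leaves the notification value untouched. */
    (void)value; /* ignored by both modeled actions */
}

/* Arguments in the order used in the post above: action first, then 0U.
 * The action slot receives 0, which is MODEL_NO_ACTION. */
uint32_t after_swapped_call(void)
{
    uint32_t v = 0, prev;
    model_notify_and_query(&v, MODEL_INCREMENT, 0U, &prev);
    return v;
}

/* Arguments in the documented order: value first, then action. */
uint32_t after_correct_call(void)
{
    uint32_t v = 0, prev;
    model_notify_and_query(&v, 0U, MODEL_INCREMENT, &prev);
    return v;
}
```

In this model the swapped call returns 0 and the corrected call returns 1, which matches the observation that the notify value was never incremented.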

EugeneK · May 2, 2025, 9:49pm

I think the notification APIs may not be appropriate. They are designed to do just the opposite of multiple-activation error detection. I have also noticed that behavior can differ based on task priority, and setting a threshold may not work universally for the highest and lower priorities. Querying the task state (ready or running) could be a way out, as long as the task state can be reliably determined from an ISR. Will try that…

RAc · May 3, 2025, 8:07am

You seem to also follow the “recently fashionable” design pattern of creating and deleting tasks liberally and frequently.

Please be reminded that this is NOT a recommended design pattern in real-time operating systems. The issue you observe is only one of many possible corollaries of abusing this pattern. In an RTOS, tasks are normally created just once and remain dormant unless notified “to do something.” If you refer to “task instances,” it looks a lot like you would be much better off with that approach.

Generally, task deletion and re-creation is discouraged for many reasons. Task notifications will not be your only cause for concern when you do that. Maybe you would want to outline your use case here so we could suggest better and less error-prone architectures?

EugeneK · May 3, 2025, 11:21am

@RAc Not sure why you have come to this conclusion, but it is not true. I have a static safety-critical system where the task activation pattern is driven by a fixed schedule. The system may eventually see and tolerate task overruns, but there has to be a limit. This is what drove the question, not dynamic task creation and deletion. FreeRTOS does not seem to support activation limits, and I am trying to get help from the community on how to better implement that.

richard-damon · May 3, 2025, 11:51am

I think the issue is: what can the OS do generically if a task over-consumes the processor? This really needs to be an application decision. As has been pointed out, there ARE ways to make the overrun detectable.

To get better help, a better description of what you are looking to do might help.

EugeneK · May 3, 2025, 12:31pm

@richard-damon I am not trying to do anything that is extraordinary.

System has to acquire sensor data and process it. Acquisition has to be at fixed interval with almost no jitter. Processing is offloaded to tasks and those tasks may jitter as long as they meet system deadlines (which are not necessarily single task deadlines) as that does not violate acquired data times or system behavior. There are other tasks in the system which do not have such strict deadlines and those tasks may potentially overrun but system has enough throughput to recover due to its physical properties without compromising deadlines. However, there is a hard limit to that behavior and that is from where my question comes.

One would say that we would need faster processor but that is not realistic. If you wonder what kind of system this is, I guarantee that you have at least one in your car.

In the end, it is a really simple question regardless of the system:

“how to detect and react to multiple direct-to-task notifications when task has not completed its run before ulTaskNotifyTake given notification count limit?”

The task pattern:

void Task(void *Params)
{
    for (;;)
    {
        ulTaskNotifyTake(..);
        /* … do something */
    }
}

Timer ISR activation pattern:

{
    vTaskNotifyGiveFromISR(…);
}

richard-damon · May 3, 2025, 12:53pm

So, if you find that you have over-run your timing, what do you want to do? THAT is a question only you can answer.
If your system can’t meet the requirements as designed, what do you do? Again, that is a question only you can answer.

Part of your problem is that the simplest mechanisms don’t get the feedback you want, but that doesn’t mean you can’t get it, you just need to use the mechanisms that do. If you start with the assumption you are using the wrong methods to get what you want, you have broken your design from the start.

The simplest way to detect that you are “behind” with direct-to-task notifications is to use a shared variable. The sender checks the variable; if it is set, we are in overrun, and the sender needs to decide what to do about it.
Otherwise, it sets the variable and notifies the task. When the task is done, it resets the variable to let the sender know.
As long as the variable is of a type that can be set and cleared atomically, that system is fully robust.
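A minimal sketch of the shared-variable idea, assuming a C11 toolchain with `<stdatomic.h>`. The names `task_busy`, `activation_begin` and `activation_end` are hypothetical, and the FreeRTOS calls are only indicated in comments so that the logic can be exercised on a host.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared flag: true while the task still owns an activation.
 * Static storage is zero-initialized, so it starts out false. */
static atomic_bool task_busy;

/* Sender side, for example the timer ISR.
 * Returns true if the previous activation has not completed (overrun).
 * On a real target the sender would then decide how to react, or notify
 * the task, for example with vTaskNotifyGiveFromISR(). */
bool activation_begin(void)
{
    /* atomic_exchange sets the flag and returns its previous value
     * in a single atomic step. */
    return atomic_exchange(&task_busy, true);
}

/* Task side: call once the work for the current activation is finished,
 * just before blocking again in ulTaskNotifyTake(). */
void activation_end(void)
{
    atomic_store(&task_busy, false);
}
```

The sender would call `activation_begin()` on every period and treat a `true` result as an overrun. A single flag cannot tell how many activations are outstanding, so if the sender still notifies the task after an overrun, a counter would be needed to track the backlog.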

An alternate method using just the FreeRTOS notification primitives is to use the notify-and-query call to send, so the sender can see that you have overrun; when you receive, you DON’T clear the value immediately, but wait to clear it until after you process. This would allow “restricted” tasks to share the information through the API.

Your system setup seems to call for minimal sensor acquisition tasks that grab the data immediately on schedule and then pass it on to non-critical tasks for processing. That meets the minimal-jitter requirement.

How to handle an “overrun” is by definition very application specific, so providing the methods I described above is the best that FreeRTOS can do.

Perhaps part of the issue is that we often think of an operating system as something we work “under”, which needs to provide the resources we use. FreeRTOS isn’t that sort of system. It is a library providing some resources that allow the program to do things, but the program is ultimately in control of how it uses the “tool” of FreeRTOS. FreeRTOS doesn’t need to provide everything the program needs; the program can add capabilities to the system.

You ended up with an XY problem when you started with a defined method of signalling, and then tried to find out how to get information out of that method that isn’t there. You need to start with your real requirements, and try to figure out how to implement them.

EugeneK · May 3, 2025, 1:03pm

@richard-damon Why would querying the task state (ready or running) in the ISR using eTaskGetState() not work? This seems to be a simpler solution than any other. That is assuming that the function is safe to use in an ISR. Is it?

On the other hand, this detection behavior is not new to RTOSes and has nothing to do with system architecture.

All I am asking is how to do this with FreeRTOS and minimal effort in the application.

richard-damon · May 3, 2025, 1:29pm

It doesn’t end in FromISR, so it can’t be assumed to be usable in an ISR.

Second, there is no function by that name; the closest is vTaskGetInfo, which also gets stack usage, and that can be somewhat time consuming, and is marked for limited usage.

Note: detecting “ready” is also not a reliable way to see if it is done, as fetching the sensor data will likely use I/O with the task blocked during that I/O (you are not using a busy-loop for I/O, are you?). You want to know if it is DONE, and that needs some positive signal from the task doing the work.

EugeneK · May 3, 2025, 1:49pm

@rtel thank you for supporting this discussion.

AFAIK and per the documentation, a task waiting on a notification is in the blocked state. A task that has taken one and is not running yet is in the ready state. Thus ready and running are collision states.

So the question is whether eTaskGetState() is safe to call from the timer ISR when that ISR has the highest priority that is allowed to invoke RTOS services. If there are higher-priority ISRs, they do not interact with the OS and can’t interfere with eTaskGetState(). Even if they did, they would put the queried task into the ready state.

What’s wrong with this logic?

hs2 · May 3, 2025, 2:09pm

I’d use a simple (atomic) busy flag or budget (up/down) counter for the target task, as proposed by @richard-damon. As far as I understand, you want to detect task notifications that arrive while the target task hasn’t completed the work triggered by the previous notification.
Seems that would be sufficient.
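A minimal sketch of the budget (up/down) counter variant, assuming a C11 toolchain with `<stdatomic.h>`. `ACTIVATION_LIMIT`, `outstanding`, `budget_activate` and `budget_complete` are hypothetical names, and the limit of 2 is only an illustrative value. The FreeRTOS calls are indicated in comments only.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical limit: how many unfinished activations are tolerated. */
#define ACTIVATION_LIMIT 2

/* Activations issued by the sender but not yet completed by the task.
 * Static storage is zero-initialized. */
static atomic_int outstanding;

/* Sender side, for example the timer ISR: count the activation up.
 * Returns true if the number of unfinished activations now exceeds the
 * limit. The task would be notified after this call, for example with
 * vTaskNotifyGiveFromISR(). */
bool budget_activate(void)
{
    int now_outstanding = atomic_fetch_add(&outstanding, 1) + 1;
    return now_outstanding > ACTIVATION_LIMIT;
}

/* Task side: count down after finishing the work for one activation,
 * before blocking again in ulTaskNotifyTake(). */
void budget_complete(void)
{
    atomic_fetch_sub(&outstanding, 1);
}
```

With a limit of 1 this behaves like a busy flag that reports an overrun on the second unfinished activation. A larger limit tolerates a bounded backlog before the sender has to react.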