Possible stale software timer expiries

nigel-paton wrote on Thursday, April 25, 2019:


(FreeRTOS Kernel V10.1.1)

I think it is possible to get stale software timer expiries if the timer daemon task runs at a lower priority than the task starting/stopping the timer. By ‘stale timer expiry’ I mean an expiry callback being called after the timer has been stopped or restarted.

In timers.c, prvProcessTimerOrBlockTask() contains the following lines of code:

			( void ) xTaskResumeAll();
			prvProcessExpiredTimer( xNextExpireTime, xTimeNow );

Suppose a higher priority task becomes unblocked during the scheduler suspension period (e.g. triggered via an interrupt) and then stops and restarts the timer that is about to expire. The timer expiry function will still get called, and I don’t think there is any way for the expiry callback function to detect that it is being called for an old, stale instance of the timer.

Possible workarounds:

  1. Raise the timer task priority above all other tasks using timers. The problem with this is that timer activity can then delay possibly critical high priority tasks.
  2. Create a new timer each time. This adds the memory and processing overhead of creating a new timer every time a timer needs to be started.

Is there a way to solve this?



richarddamon wrote on Thursday, April 25, 2019:

I think you are describing a fundamental race. Unless the environment in which timer callbacks run is changed (making the callback part of a large critical section), it cannot be avoided: the event could just as well happen a bit later, not during the suspension but just after the timer callback has been called and before it gets to do anything, and in that case it is obvious that there isn’t anything the timer code could do about it.

In fact, there is a fairly large window for this race. Because timers are operated on via a queue, any operation that tries to adjust a timer’s state after the timer task has checked the queue for changes, while it goes off and runs the expired timers, won’t be processed until AFTER those timers have fired.

Thus, in your case, the timer callback ISN’T really ‘stale’: the timer has expired and has not yet really been restarted, because the restart hasn’t happened yet and is just sitting in the queue of operations.

One possible workaround for your conditions: the task setting up the timer could set a variable to the time it expects the timer to expire, and the timer callback can check whether the current time is ahead of or behind that mark. Remember that here time is a circle, so the question is about the size of the difference between the current tick count and the expected tick count. If current - expected is 0 or a small number, we are at or after the expected time; but if it is a large number when treated as a TickType_t (which is unsigned), then expected is slightly ahead of us.
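
A minimal sketch of that comparison, under some assumptions: the tick type is 32 bits wide (as when configUSE_16_BIT_TICKS is 0), and the helper name xExpiryIsCurrent and the half-range threshold are hypothetical choices for illustration, not part of FreeRTOS. In a real project TickType_t comes from the FreeRTOS headers rather than a local typedef.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 32-bit ticks. In a FreeRTOS build, use the kernel's own
   TickType_t instead of this stand-in typedef. */
typedef uint32_t TickType_t;

/* Hypothetical threshold: differences below half the tick range count as
   "small" (we are at or past the expected expiry time). */
#define HALF_TICK_RANGE ( ( TickType_t ) 0x80000000u )

/* Hypothetical helper. Returns true when xNow is at or after xExpected,
   false when xExpected still lies ahead of xNow (i.e. the timer has been
   restarted and this callback belongs to an earlier start). */
bool xExpiryIsCurrent( TickType_t xNow, TickType_t xExpected )
{
    /* Unsigned subtraction wraps modulo 2^32, so this stays correct
       when the tick counter overflows between the two readings. */
    TickType_t xElapsed = ( TickType_t ) ( xNow - xExpected );

    return xElapsed < HALF_TICK_RANGE;
}
```

The task that starts or restarts the timer would store xTaskGetTickCount() plus the timer period in a shared variable just before issuing the command, and the callback would pass xTaskGetTickCount() and that stored value to the helper, returning early when the result is false. This only works if the stored value can be read and written atomically on the target, and it cannot tell a stale expiry from a genuine one if the timer is stopped without being restarted.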

nigel-paton wrote on Thursday, April 25, 2019:

Hi Richard,

Thanks, I’ll give that solution a try.