Event flags: more discussion & practical issues

dutchuncle wrote on Tuesday, January 16, 2007:

Practical and philosophical issues of Event Flag Groups

Rather than making this a feature request, I’m working on a contribution, and hitting a snag.

Background:  Event Flags are typically implemented as bitwise flags in a portBASE_TYPE.  The flags may be set by any process that has the group’s handle, in any combination (essentially a bitwise OR).  More importantly, the group may be waited on by multiple processes at once, ALL OF WHOM see the flag post(s) simultaneously (and atomically).  This is very different from a queue or semaphore.  When an empty queue is written, ONLY the highest-priority process waiting on that queue gets the data; when a flag is posted, ALL of the processes waiting on that flag are readied (though of course they get the opportunity to run in priority order).

This is complicated by the operations provided for waiting on flags:  the process can specify a 32-bit mask and a code indicating whether ALL of the bits must match or ANY of the bits are sufficient, and in some systems whether the bits should be atomically cleared after a successful match - though as stated, clearing by one process does NOT interfere with another pending process “seeing” the bits for its own readiness test.
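
To make the wait semantics concrete, here is a minimal sketch of what such a call could look like.  All of the names here (xEventFlagsWait, the ev* options, prvFlagsMatch) are hypothetical, not existing FreeRTOS APIs; only the portBASE_TYPE/portTickType types and the pd macros are real:

```c
#include "FreeRTOS.h"

/* Hypothetical option bits for a wait call - illustrative only. */
#define evWAIT_ANY        0x00   /* ready when any requested bit is set        */
#define evWAIT_ALL        0x01   /* ready only when every requested bit is set */
#define evCLEAR_ON_MATCH  0x02   /* atomically clear matched bits on return    */

/* Hypothetical wait call: block until the group satisfies (xMask, xOptions)
   or the timeout expires. */
portBASE_TYPE xEventFlagsWait( void *pvGroup,
                               portBASE_TYPE xMask,
                               portBASE_TYPE xOptions,
                               portTickType xTicksToWait );

/* The readiness test itself is just bit arithmetic. */
static portBASE_TYPE prvFlagsMatch( portBASE_TYPE xCurrentFlags,
                                    portBASE_TYPE xMask,
                                    portBASE_TYPE xOptions )
{
    if( ( xOptions & evWAIT_ALL ) != 0 )
    {
        /* ALL: every bit in the mask must be set. */
        return ( ( xCurrentFlags & xMask ) == xMask ) ? pdTRUE : pdFALSE;
    }

    /* ANY: at least one bit in the mask must be set. */
    return ( ( xCurrentFlags & xMask ) != 0 ) ? pdTRUE : pdFALSE;
}
```

A waiter from the VRTX example below would then pass its own condition bits OR’d with the common STOP and RESET bits, using evWAIT_ANY and no clear, so that every other waiter still sees those bits.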

FreeRTOS queues are implemented by activating the highest priority task on the event list, with all queue manipulation being done in the context of the task.  The problem with doing this for flags is that the processes would run in priority order with cascading effects, rather than the atomic operations required.  Besides, unlike queues which typically have one consumer waiting, flags are designed to have multiple processes awaiting different combinations of flags.  (In a VRTX application, we had a dozen processes waiting with masks which mostly had different conditions and *all* had two common bits for STOP and RESET.)

It would be inefficient to wake up all of the processes, perform all of the context switches, and let them all test for readiness (and probably suspend again) in context; instead, efficiency suggests performing the masking by scanning the waiting-event-list within the flag-post routine to decide which process(es) to awaken.  That means the mask and operation-code isn’t kept in the context stack, and it’s per-process, so it makes sense to put it in the process control structure.   BUT in good structured style, richardbarry has hidden the process structure inside the tasks.c module along with various private functions to maintain the various process lists, which means that either (a) the masking and process-waking routine should be in tasks.c violating functional separation, or (b) tasks.c should provide accessor and setting functions to private data along with whatever new function is needed to manipulate the process queue(s) incurring more overhead and complexity and *still* violating functional separation somewhat the other direction.
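
For illustration, a post routine along those lines might look like the sketch below.  Everything in it is hypothetical (the waiter record, vEventFlagsSet(), and the group structure; it reuses prvFlagsMatch() and the ev* options from the earlier sketch).  The point is that the scan needs each waiter’s mask and options, and the “ready this task” step is exactly the part that currently lives behind tasks.c:

```c
#include "FreeRTOS.h"
#include "task.h"

/* Hypothetical per-waiter record.  In a real implementation this data would
   have to live in, or hang off, the task control structure - which is the
   structural problem described above. */
typedef struct xFLAG_WAITER
{
    xTaskHandle xTask;              /* the blocked task                           */
    portBASE_TYPE xMask;            /* bits this task is waiting for              */
    portBASE_TYPE xOptions;         /* evWAIT_ALL / evWAIT_ANY / evCLEAR_ON_MATCH */
    struct xFLAG_WAITER *pxNext;
} xFlagWaiter;

typedef struct xEVENT_FLAGS_GROUP
{
    portBASE_TYPE xFlags;           /* current state of the group's bits          */
    xFlagWaiter *pxWaiters;         /* tasks blocked on this group                */
} xEventFlagsGroup;

/* OR new bits into the group and ready every waiter whose condition is now
   met.  All waiters are tested against the same snapshot before any clearing
   happens, which gives the "everyone sees the post" behaviour. */
void vEventFlagsSet( xEventFlagsGroup *pxGroup, portBASE_TYPE xBitsToSet )
{
    xFlagWaiter *pxWaiter;
    portBASE_TYPE xBitsToClear = 0;

    /* Mutual exclusion is assumed here - see the note further down about
       vTaskSuspendAll() versus portENTER_CRITICAL(). */

    pxGroup->xFlags |= xBitsToSet;

    for( pxWaiter = pxGroup->pxWaiters; pxWaiter != NULL; pxWaiter = pxWaiter->pxNext )
    {
        if( prvFlagsMatch( pxGroup->xFlags, pxWaiter->xMask, pxWaiter->xOptions ) == pdTRUE )
        {
            if( ( pxWaiter->xOptions & evCLEAR_ON_MATCH ) != 0 )
            {
                xBitsToClear |= pxWaiter->xMask;
            }

            /* Ready pxWaiter->xTask here.  Doing this efficiently requires
               access to the lists that are private to tasks.c - hence the
               dilemma between options (a) and (b) above. */
        }
    }

    /* Clear only after every waiter has been tested against the posted bits. */
    pxGroup->xFlags &= ~xBitsToClear;
}
```

Option (a) would put vEventFlagsSet() (or at least its list-walking core) inside tasks.c; option (b) would keep it outside but force tasks.c to export enough of its private machinery to do the readying.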

My own inclination is to bring the flag operation into tasks.c for efficiency, since that also keeps more privacy around the process handling - it seems the lesser evil to me.  What do other people think?

PS:  The extra words could be used for different purposes, like “number to get” for queue or semaphore operations.  The same issues come up if one permits “value” operations on semaphores (post N or take N where N != 1).  I have seen this used to manage data-block usage where processes needed various numbers of blocks atomically for temporary or message use,  and hoarding caused resource lockup.  It’s an easier concept (and more efficient) than looping to get blocks and then having to unwind if there aren’t enough. 
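
For comparison, here is a hedged sketch of a hypothetical “take N” call next to the loop-and-unwind alternative.  xSemaphoreTake() and xSemaphoreGive() are real FreeRTOS calls; xSemaphoreTakeN(), prvTakeBlocks() and the block-pool semaphore are made up for illustration:

```c
#include "FreeRTOS.h"
#include "semphr.h"

/* Hypothetical: grant xCount units atomically, or block/fail - no partial
   allocation ever escapes, so hoarding cannot lock up the pool. */
portBASE_TYPE xSemaphoreTakeN( xSemaphoreHandle xSem,
                               portBASE_TYPE xCount,
                               portTickType xTicksToWait );

/* The alternative with single takes: loop, then unwind on failure. */
static portBASE_TYPE prvTakeBlocks( xSemaphoreHandle xBlockPool,
                                    portBASE_TYPE xBlocksNeeded,
                                    portTickType xTicksToWait )
{
    portBASE_TYPE i;

    for( i = 0; i < xBlocksNeeded; i++ )
    {
        if( xSemaphoreTake( xBlockPool, xTicksToWait ) != pdTRUE )
        {
            /* Not enough blocks - give back everything taken so far. */
            while( i-- > 0 )
            {
                xSemaphoreGive( xBlockPool );
            }
            return pdFALSE;
        }
    }

    return pdTRUE;
}
```

The looping version is not only slower; if two tasks interleave their loops while waiting indefinitely they can deadlock each other holding partial allocations, which is exactly the hoarding problem described above.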

rtel wrote on Tuesday, January 16, 2007:

Let me see if I am keeping up with this :wink:

> Rather than making this a feature request, I’m working on a contribution, and
> hitting a snag.

Excellent - thanks.

> FreeRTOS queues are implemented by activating the highest priority task on the
> event list, with all queue manipulation being done in the context of the task.

Just to clarify…FreeRTOS queues are implemented by ‘readying’ the highest priority task on the event list, rather than ‘activating’ it.  The task is removed from the event list and placed into the ready list.  It does not necessarily run, unless it coincidentally also has the highest priority out of all the Ready state tasks.

> The problem with doing this for flags is that the processes would run in priority
> order with cascading effects, rather than the atomic operations required. 

It would be possible to move more than one task from a ‘waiting for event’ list into the appropriate Ready list, and only perform one call to the scheduler when all the tasks have been moved.  The scheduler would then select the highest priority ready task as the task to enter the Running state.
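
A rough sketch of that idea, under stated assumptions: xTaskRemoveFromEventList() is the existing call the queue code uses to ready the single highest-priority task on an event list, listLIST_IS_EMPTY() comes from list.h, while the group structure and its xTasksWaiting list are hypothetical, and the whole loop would have to sit inside whatever protection the kernel requires (see below):

```c
/* Ready every task on the (hypothetical) group's event list, but request a
   context switch only once, after the loop.  For event flags the loop would
   also need to test each task's mask, which the stock call cannot do. */
portBASE_TYPE xYieldRequired = pdFALSE;

while( listLIST_IS_EMPTY( &( pxGroup->xTasksWaiting ) ) == pdFALSE )
{
    if( xTaskRemoveFromEventList( &( pxGroup->xTasksWaiting ) ) == pdTRUE )
    {
        /* At least one readied task outranks the running task. */
        xYieldRequired = pdTRUE;
    }
}

if( xYieldRequired == pdTRUE )
{
    taskYIELD();   /* the single scheduler call described above */
}
```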

>Besides,
> unlike queues which typically have one consumer waiting, flags are designed
> to have multiple processes awaiting different combinations of flags.  (In a
> VRTX application, we had a dozen processes waiting with masks which mostly had
> different conditions and *all* had two common bits for STOP and RESET.)  It
> would be inefficient to wake up all of the processes, perform all of the context
> switches, and let them all test for readiness (and probably suspend again) in
> context;

>instead, efficiency suggests performing the masking by scanning the
> waiting-event-list within the flag-post routine to decide which process(es)
> to awaken. 

Just a general comment - scanning lists can be time consuming, so it would normally not be done within a critical section implemented using portENTER_CRITICAL(), but instead under vTaskSuspendAll(), which keeps interrupts enabled.  This technique cannot be used from within an interrupt, however.
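
A minimal sketch of the two protection patterns (vTaskSuspendAll(), xTaskResumeAll(), portENTER_CRITICAL() and portEXIT_CRITICAL() are all existing FreeRTOS calls; prvScanWaiters() and the group structure are stand-ins for the flag-matching walk sketched earlier):

```c
/* Potentially long work - walking the list of waiters - goes under scheduler
   suspension, so interrupts stay enabled throughout.  Not usable from an ISR. */
vTaskSuspendAll();
{
    prvScanWaiters( pxGroup );
}
xTaskResumeAll();   /* yields here if a higher priority task became ready */

/* Short, bounded work - such as OR-ing the new bits into the group - can sit
   inside a true critical section instead. */
portENTER_CRITICAL();
{
    pxGroup->xFlags |= xBitsToSet;
}
portEXIT_CRITICAL();
```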

The data structures chosen to reference tasks to bits I think will be the key to the efficiency.

One random thought - should RAM permit - would be to have an event handler task.  This would run at a high priority and handle all the readying of tasks (at the task level).

>That means the mask and operation-code isn’t kept in the context
> stack, and it’s per-process, so it makes sense to put it in the process control
> structure.   BUT in good structured style, richardbarry has hidden the process
> structure inside the tasks.c module

That’s always a bugger :wink:

>along with various private functions to
> maintain the various process lists, which means that either (a) the masking
> and process-waking routine should be in tasks.c violating functional separation,
> or (b) tasks.c should provide accessor and setting functions to private data
> along with whatever new function is needed to manipulate the process queue(s)
> incurring more overhead and complexity and *still* violating functional separation
> somewhat the other direction.

Yes - the queues make calls into the task module to perform the event handling, with the queues themselves performing the queue data manipulation, but that call is only ever made once because, as you say, only a single task is woken.

All good stuff.

Regards.

dutchuncle wrote on Tuesday, January 16, 2007:

>>One random thought - should RAM permit - would be to have an event handler task. This would run at a high priority and handle all the readying of tasks (at the task level).

Do you mean a task that scurries around and checks the status of all of the event flags and all of the things pending on them?  Nah, the burden belongs on the task doing a post, because only that flag needs to be checked.  Unless you’re trying to decouple realtime by pushing that work into user state from interrupts . . . which helps avoid latency but winds up doing exactly the same amount of work. 

Unless we were to hide this “task” right inside the kernel, just below the clock tick, like the opposite of an idle task?  Hey, where are the timeouts handled anyway - vTaskIncrementTick is in interrupt state, isn’t it?  That could also be a process at infinite priority, if it isn’t already, with this event stuff tagging along.