Yield in critical section?

nobody wrote on Wednesday, May 10, 2006:

Hi there,

I’m attempting a port of FreeRTOS to the Motorola (now Freescale) CPU32, derivative 68332. If I understand taskYIELD() correctly, it may not return to the caller (almost by definition)… however, in the function xTaskResumeAll() (tasks.c), it seems possible that taskYIELD() is called from code that owns the critical section, which seems like a recipe for disaster to me. Or am I missing something?

rtel wrote on Wednesday, May 10, 2006:

Each task must maintain its own interrupt status.  Therefore if a task yields with interrupts disabled, when the task next runs it is guaranteed that interrupts will again be disabled.  The tasks that execute between the task yielding and next running will have their own interrupt status so interrupts will not remain disabled for the duration.

Different ports manage this in different ways.  Some enter critical sections by pushing the interrupt status onto the stack, and in these cases normally the whole process is automatic.

Other ports store critical section nesting depth (normally stored in a variable called uxCriticalNesting) as part of the task context.
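A minimal sketch of the second approach, written as a host-side simulation rather than real port code. Only `uxCriticalNesting` comes from the post above; `TaskContext`, `interrupts_enabled` and the `sim_*` functions are hypothetical names, and a real port would also save CPU registers and touch the actual interrupt mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified task context. A real port saves registers too. */
typedef struct {
    unsigned critical_nesting;   /* nesting depth saved while the task is not running */
} TaskContext;

/* Simulated port state. */
static bool interrupts_enabled = true;
static unsigned uxCriticalNesting = 0;    /* depth of the currently running task */
static TaskContext *current_task = NULL;

void sim_enter_critical(void)
{
    interrupts_enabled = false;
    uxCriticalNesting++;
}

void sim_exit_critical(void)
{
    assert(uxCriticalNesting > 0);
    if (--uxCriticalNesting == 0) {
        interrupts_enabled = true;    /* only the outermost exit re-enables */
    }
}

/* Context switch: save the outgoing task's depth, restore the incoming
 * task's depth, and make the interrupt state match the restored depth. */
void sim_switch_to(TaskContext *next)
{
    if (current_task != NULL) {
        current_task->critical_nesting = uxCriticalNesting;
    }
    current_task = next;
    uxCriticalNesting = next->critical_nesting;
    interrupts_enabled = (uxCriticalNesting == 0);
}
```

In this model, if T1 switches out while inside a critical section, T2 runs with interrupts enabled, and T1 finds them disabled again when it is next switched in.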

Regards.

nobody wrote on Thursday, May 18, 2006:

I don’t understand how a task can be allowed to yield from within a critical section.  Isn’t the whole point behind a critical section to protect some “critical” piece of code?

For example, if I have two tasks that both use a shared resource, I am supposed to protect the use of that shared resource by using a critical section.  If some OS call from within my critical section decides to yield, the other task will have an opportunity to use the shared resource.  How can this work?

I do understand that the current architecture correctly maintains the interrupt enable status even through a yield inside a critical section, but how does this help?

Another example, I have two tasks, T1 and T2.  Suppose T1 has a critical section of code during which interrupts cannot be allowed to run and further suppose that T2 currently has interrupts enabled.  A yield is allowed during the critical section of T1, switching to T2.  Now an interrupt occurs, and is serviced.  I have just messed with the critical section of T1!

Please help me understand how this can work?  What am I missing?

Thanks!

imajeff wrote on Thursday, May 18, 2006:

Keep in mind that the best design is not to try to limit possibilities. If you are writing a critical task which should not call Yield, then do not call Yield. It should be easy enough to control yourself when you write it. But if there is a time when someone will need the possibility, then it should be possible.

Remember certain protections are not meant to keep a malicious programmer safe, just to protect from unexpected conditions in the system.

After all, who is to stop you also from writing `jmp 0xfffff` when there is no code there? Just don’t do what is obviously not what you want.

nobody wrote on Thursday, May 18, 2006:

Thank you for the reply, Jeff.

I am not concerned with my code calling Yield.  I am concerned with an OS function calling Yield.  Am I to restrict myself from calling any OS function within my critical section?  I suppose that would solve the problem.  Is that the intent of the current architecture?  Perhaps we should have some documentation on which OS functions might implicitly call Yield, and therefore should not be called from a critical section (unless special precautions are taken).

Thanks again,
David

imajeff wrote on Thursday, May 18, 2006:

Oh, that makes a difference, sorry I missed it.

I have been wondering why people are complaining of unexpected delays (even extreme interrupt latency). Maybe this is the problem.

rtel wrote on Thursday, May 18, 2006:

In brief:  Normally you would not want to call yield, or call a blocking API function (which would call yield to perform the block) from within a critical section at the application level.  However, there is nothing stopping you from doing so if this is your intention and your design can cope with it.  I would say however that if your design can cope with it then it is unlikely that the critical section is being used properly.

In general I would say don’t call yield from within a critical section.

The kernel source itself (as opposed to an application) does yield from within critical sections, but only in a couple of very controlled situations.  This is why the implementation permits it.  The rationale is all to do with minimising the amount of time interrupts are left disabled during queue operations.

Take the following example from within the FreeRTOS.org source:  You want to post to a queue.  To check the queue status you access the queue structures so need exclusive access.  Rather than disable interrupts, you ‘lock’ the queue to provide thread (and isr) safe access.  Now say you find the queue is already full meaning you want to block.  You cannot block with the queue locked as nothing else could use it, but when the task unblocks you need exclusive access to the queue again.  So the procedure is:
1) Enter critical.
2) Unlock queue - now safe to yield.
3) Yield.
…other tasks can now execute and access the queue.  When data is available the task is unblocked…
4) Start executing again.  You are now automatically in a critical region as this is where you blocked from.
5) Lock the queue again.  Nothing could have accessed it since you unblocked as interrupts were disabled.
6) Now the queue is locked you can exit the critical section and carry on enjoying exclusive access to the queue even though interrupts are enabled.
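
The ordering of the six steps can be sketched as a small host-side simulation. This is not the kernel's code: `in_critical`, `queue_locked`, `sim_yield()` and `sim_block_on_full_queue()` are hypothetical names, and the assertions only encode the two conditions described above (never yield with the queue locked, and be back inside the critical section on resuming).

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated state; illustrative names, not kernel identifiers. */
static bool in_critical = false;    /* stands in for "interrupts disabled" */
static bool queue_locked = false;   /* stands in for the queue lock */
static int  yields = 0;             /* counts simulated yields */

/* Stand-in for the yield in step 3. */
void sim_yield(void)
{
    assert(in_critical);      /* the yield happens inside the critical section */
    assert(!queue_locked);    /* never block while holding the queue lock */
    yields++;
    /* Other tasks would run here with their own interrupt state. When this
     * task resumes, its critical-section state is restored with its context. */
}

/* Called after the task has locked the queue and found it full. */
void sim_block_on_full_queue(void)
{
    assert(queue_locked);     /* precondition: we hold the queue lock */
    in_critical = true;       /* 1) enter critical */
    queue_locked = false;     /* 2) unlock queue, now safe to yield */
    sim_yield();              /* 3) yield; other tasks run */
    assert(in_critical);      /* 4) resumed, still in the critical region */
    queue_locked = true;      /* 5) lock the queue again */
    in_critical = false;      /* 6) exit critical, keep exclusive access */
}
```

The point of the ordering is that the lock is released and retaken only while interrupts are disabled, so the critical section stays short and the queue is never left locked across a block.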

Regards.

imajeff wrote on Thursday, May 18, 2006:

I would conclude that xTaskResumeAll() is specifically one not to call from a critical section in which you do not want a higher-priority task to take over.

adarkar9 wrote on Thursday, May 18, 2006:

Thank you Richard, for your reply.

I understand the need of the OS to Yield during a critical section.  Again, this is during "very controlled situations" as you mentioned.  Perhaps the documentation for "taskENTER_CRITICAL()" should state something like:

"You should not call a blocking API function from within a critical section."

or

"A cooperative context switch can occur when in a critical region if a blocking API function is called."

Of course, this would require documenting which API functions are blocking.

Ideally, all blocking API functions would have non-blocking counterparts that could be used in critical regions.  The non-blocking function would fail if the operation could not be completed.  This would allow the application code to exit the critical region, Yield, and try again later.

nobody wrote on Thursday, May 18, 2006:

I don’t think this is necessary.  It is explicit that if you call a blocking api function then a block will occur.  If you want to do this from a critical section then you have to assume you know what you are doing.  The code cannot prevent application design errors.

adarkar9 wrote on Thursday, May 18, 2006:

Okay.  It is explicit that calling a blocking API function may cause a block.  I agree.  The API functions are not currently labeled as blocking or non-blocking, however.

As I said earlier, one solution is to refrain from using any API functions from within a critical section.  Another solution is to have a list of API functions that can safely be called from within any critical section.

More food for thought:  What if I have a low priority task that decides to enable two higher priority tasks?  I would like to do this in a “critical region” because I don’t want the first task to preempt me before I enable the second one.

Again, I think a note about a possible context switch from within a critical section would be useful.  I would have found it useful a couple of days ago.

Thanks

adarkar9 wrote on Thursday, May 18, 2006:

Well, things are not as bleak as I thought.  Sorry for not doing enough research before posting.  It looks like properly protecting "my" kind of critical section is a two-step process:

vTaskSuspendAll();
taskENTER_CRITICAL();

/* critical code here */

taskEXIT_CRITICAL();
xTaskResumeAll();

By doing this, it appears I can call an API function that may enable a higher priority task without switching to that task.
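
The deferral described here can be sketched with a toy scheduler model. This is a host-side simulation under stated assumptions, not the behaviour of tasks.c line by line: the `sim_*` functions and the priority variables are hypothetical, and the model only captures "no switch while suspended, switch to the highest ready priority on resume".

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated scheduler state; illustrative names only. */
static unsigned suspend_depth = 0;          /* nested suspend calls */
static int running_priority = 1;            /* priority of the running task */
static int highest_ready_priority = 1;      /* highest priority that is ready */

/* Switch to the highest ready priority, but only if not suspended. */
static void sim_reschedule(void)
{
    if (suspend_depth == 0 && highest_ready_priority > running_priority) {
        running_priority = highest_ready_priority;
    }
}

void sim_suspend_all(void)
{
    suspend_depth++;
}

/* Model of an API call that makes a task of the given priority ready. */
void sim_make_ready(int priority)
{
    if (priority > highest_ready_priority) {
        highest_ready_priority = priority;
    }
    sim_reschedule();    /* no effect while the scheduler is suspended */
}

/* Returns true if resuming caused a context switch. */
bool sim_resume_all(void)
{
    int before = running_priority;
    assert(suspend_depth > 0);
    suspend_depth--;
    sim_reschedule();
    return running_priority != before;
}
```

In this model a low-priority task can ready two higher-priority tasks back to back, and neither runs until the matching resume call.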

Also, I further found that all blocking routines do support a non-blocking mode by specifying an xBlockTime or xTicksToWait of zero.
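
As an illustration of the zero block time behaviour, here is a small stand-alone model. `SimQueue` and `sim_queue_send()` are hypothetical stand-ins written for this sketch, not FreeRTOS functions; with the real API the equivalent would be passing 0 as the xTicksToWait argument of a queue send call. Only the zero case is modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_QUEUE_LENGTH 2

/* Hypothetical fixed-size queue used only for this sketch. */
typedef struct {
    int items[SIM_QUEUE_LENGTH];
    int count;
} SimQueue;

/* Model of a send with a block time. Returns true if the item was queued.
 * With ticks_to_wait == 0 a full queue makes the call fail immediately
 * instead of blocking, so no context switch is requested. */
bool sim_queue_send(SimQueue *q, int item, unsigned ticks_to_wait)
{
    if (q->count < SIM_QUEUE_LENGTH) {
        q->items[q->count++] = item;
        return true;
    }
    /* Queue full. A real kernel would block here for up to ticks_to_wait
     * ticks; this sketch deliberately supports only the non-blocking case. */
    assert(ticks_to_wait == 0);
    return false;
}
```

A caller inside a critical region can check the return value, leave the region if the send failed, and try again later, which is the pattern suggested earlier in the thread.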

Great job guys!  Sorry for the hasty conclusion.

In light of my recent experience, perhaps a note pointing to vTaskSuspendAll() in the taskENTER_CRITICAL() documentation might be warranted.  Something like:

"Calls to some API functions may cause context switches within the critical region if the scheduler is not suspended through vTaskSuspendAll()."

Regards.