UDP: Calls to FromISR with enabled interrupts

michalmn wrote on Tuesday, June 04, 2013:

When running my own application that uses the UDP transport layer on an LPC3250 board, I experienced some issues with a corrupted xPendingReadyList. By “corrupted” I mean weird links (more than one cycle) between list items, a wrong number of contained items, etc. This problem puzzled me for quite a while until I finally looked at the implementation of xProcessReceivedUDPPacket. It calls

xSemaphoreGiveFromISR( pxSocket->xWaitingPacketSemaphore, &xHigherPriorityTaskWoken );

outside of any critical section. After I replaced it with xSemaphoreGive(), all the problems were gone.

My understanding is that the FromISR() functions are only to be used where the interrupts are disabled. This assumption is also supported by the fact that using the (from my point of view) correct function without FromISR fixed the problem.

Could anyone please tell me whether this is a bug? And, if this is considered correct, give me a short explanation of why it is OK to call FromISR functions with interrupts enabled?

Thanks,

Michal

rtel wrote on Wednesday, June 05, 2013:

The point of the “FromISR” functions is that they are safe to use from interrupts, so no, they should not need to be called from critical sections (caveat being, I think, the oddball and old UC3 port).  The implementation of the functions takes care of critical sections with interrupt safe critical section macros.  The normal critical section macros, as used by task code, should not be used in an interrupt (nothing that does not end in “FromISR” should be used in an interrupt - the interrupt safe API is kept separate from the task API to allow the kernel implementation to be smaller, simpler and faster).

That said, the critical sections inside the FromISR functions are only necessary for ports that support interrupt nesting - which most new ports do, but the 3250 port does not.  In the 3250 port the preprocessor will remove the calls to the interrupt safe critical section macros because interrupts should not be enabled anyway, making the macros redundant.
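To illustrate the point about the preprocessor removing the calls, here is a minimal host-side sketch. All names in it (MODEL_PORT_SUPPORTS_NESTING, the MODEL_... macros, model_give_from_isr) are hypothetical stand-ins invented for this example, not the kernel's own definitions. It only shows the general technique: when the port does not provide masking code, the macros expand to nothing useful and the shared data is touched with no extra protection.

```c
#include <stdint.h>

/* Hypothetical model of a port layer; the names are invented for illustration. */
static unsigned mask_calls = 0;   /* counts how often masking code really runs */
static int shared_counter = 0;    /* stands in for kernel data touched by an ISR */

#ifdef MODEL_PORT_SUPPORTS_NESTING
/* A nesting-capable port would raise and restore an interrupt mask here. */
static uint32_t model_set_mask(void) { mask_calls++; return 0u; }
static void model_clear_mask(uint32_t saved) { (void)saved; mask_calls++; }
#define MODEL_SET_INTERRUPT_MASK_FROM_ISR()     model_set_mask()
#define MODEL_CLEAR_INTERRUPT_MASK_FROM_ISR(x)  model_clear_mask(x)
#else
/* No nesting support: the macros compile away to (almost) nothing. */
#define MODEL_SET_INTERRUPT_MASK_FROM_ISR()     0u
#define MODEL_CLEAR_INTERRUPT_MASK_FROM_ISR(x)  ((void)(x))
#endif

/* Shape of an interrupt-safe API call: mask, touch shared data, unmask. */
void model_give_from_isr(void)
{
    uint32_t saved = MODEL_SET_INTERRUPT_MASK_FROM_ISR();
    shared_counter++;
    MODEL_CLEAR_INTERRUPT_MASK_FROM_ISR(saved);
}

unsigned model_mask_calls(void) { return mask_calls; }
int model_shared_counter(void) { return shared_counter; }
```

Built without MODEL_PORT_SUPPORTS_NESTING, model_give_from_isr() updates the counter without executing any masking code, so it is only safe if the caller really is running with interrupts disabled.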

If you find adding your own critical section makes a difference then I would guess the problem lies in your interrupt entry code.  Are you re-enabling interrupts before calling the handler?

Regards.

michalmn wrote on Wednesday, June 05, 2013:

Thanks for the clarification. The interrupt handler runs with disabled interrupts. At least I do not intentionally enable them. But I will check again whether this might be the case.

Just to give you some more information, I have (among others) a simple task that collects data from memory. The body is just:

f(); vTaskDelayUntil(&lastWakeTime, portMS_TO_TICKS(1));

As long as f() takes less than 1 ms, everything is fine. I wanted to see what happens under high load: if f() has to collect more data, it runs for about 1.5 ms. In this case, after a couple of seconds I end up in an infinite loop in xTaskResumeAll where

while( listLIST_IS_EMPTY( ( xList * ) &xPendingReadyList ) == pdFALSE )

loops forever because the list contains corrupted items. Then I added checks around the code that manipulates xPendingReadyList and found that after calling vListInsertEnd in xTaskRemoveFromEventList the number of elements has not increased. So I suspect that there is a race condition where two tasks access the list at the same time.
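The symptom described above (an insert that does not increase the element count) can be reproduced on a host with a small deterministic model. This is only a sketch: the list type and functions below are simplified, hypothetical stand-ins and not the kernel's list implementation, and the "interrupt" is modelled as a callback that fires part-way through an insert.

```c
#include <stddef.h>

/* Simplified circular doubly linked list with a sentinel node (model only). */
typedef struct Node { struct Node *next, *prev; } Node;
typedef struct { Node end; size_t count; } List;

void list_init(List *l)
{
    l->end.next = &l->end;
    l->end.prev = &l->end;
    l->count = 0;
}

/* Append n. If preempt is non-NULL it is called part-way through the insert,
   modelling an interrupt that arrives while the links are being rewritten. */
void list_insert_end(List *l, Node *n, void (*preempt)(List *))
{
    Node *tail = l->end.prev;     /* snapshot of the old tail */
    size_t count = l->count;      /* read half of "count++" */
    n->next = &l->end;
    n->prev = tail;
    if (preempt != NULL) {
        preempt(l);               /* unprotected window */
    }
    tail->next = n;               /* uses the stale tail */
    l->end.prev = n;
    l->count = count + 1;         /* writes back a stale count */
}

static Node task_node, isr_node;
static void isr_insert(List *l) { list_insert_end(l, &isr_node, NULL); }

/* Two inserts, the second one interrupting the first. */
size_t race_count(void)
{
    List l;
    list_init(&l);
    list_insert_end(&l, &task_node, isr_insert);
    return l.count;
}

/* Same two inserts, but the second runs only after the first has finished. */
size_t protected_count(void)
{
    List l;
    list_init(&l);
    list_insert_end(&l, &task_node, NULL);
    isr_insert(&l);
    return l.count;
}
```

In the interleaved case the second node is overwritten in the links and the stale count is written back, so two inserts leave a count of 1. When the inserts are serialized, the count is 2.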

Regards,

Michal

rtel wrote on Wednesday, June 05, 2013:

It sounds like you have looked into this in detail already, so I suspect you have read the following link already, but in case not it contains a list of things to check for:

http://www.freertos.org/FAQHelp.html

Normally when somebody reports that type of corruption it is because they are using a port that does support a full interrupt nesting model, and they have set their interrupt priorities incorrectly, which causes the kernel data structures to be accessed without the appropriate protection.  For that reason it is probably a good idea to look at the interrupt implementation first.

You can tell if interrupts are enabled on an ARM9 by looking at the I bit in the CPSR while inside the interrupt (IRQs are disabled when the bit is set).
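As a rough sketch of that check in C: the helper names below are hypothetical, and the inline assembly assumes a GCC-style compiler targeting ARM (not Thumb) state, so it is guarded and compiled out elsewhere. On the classic ARM architecture the I bit is bit 7 of the CPSR, and a set bit means IRQs are masked.

```c
#include <stdint.h>

#define CPSR_I_BIT (1u << 7)   /* set = IRQs masked (disabled) */
#define CPSR_F_BIT (1u << 6)   /* set = FIQs masked (disabled) */

/* Returns 1 if the given CPSR value has IRQs enabled, 0 otherwise. */
int cpsr_irq_enabled(uint32_t cpsr)
{
    return (cpsr & CPSR_I_BIT) == 0u;
}

#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
/* Reads the current CPSR; only built for ARM-state targets. */
static inline uint32_t read_cpsr(void)
{
    uint32_t cpsr;
    __asm volatile ("mrs %0, cpsr" : "=r" (cpsr));
    return cpsr;
}
#endif
```

On the target, calling cpsr_irq_enabled(read_cpsr()) at the top of the handler and trapping (for example with a breakpoint) when it returns 1 would show whether the handler ever runs with IRQs enabled.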

How are you entering the interrupt?  The normal way would be using a naked function or asm wrapper that uses the portSAVE_CONTEXT macro.

Do you autovector to the interrupt handlers individually, or do you have a single interrupt entry point for all interrupts?

Regards.

michalmn wrote on Wednesday, June 05, 2013:

Hi Richard,

Finally, your precise hint pointed me to the bug. As you suggested, I had indeed enabled the high speed timer interrupt (which I’m using to trigger the system tick) far too soon. This caused the interrupt handler to run, under certain circumstances, partially with interrupts enabled. Thanks for your great help.

Regards,

Michal