xQueueReceive never returns

gmulley wrote on Tuesday, December 15, 2009:

We have been trying for a while to track down a particularly nasty problem.  At some point during execution one of our tasks fails to ever return from a call to xQueueReceive() in spite of the fact that data has been added to the queue.

Specifically, the task executes at priority 10 and spends most of its time just waiting to receive data from the queue.  An interrupt occurs periodically and calls xQueueSendFromISR() to push data into the queue.  The task should then resume and process the data, but at some point it just never wakes up.

It is important to note that we changed the critical section code to mask ALL interrupts during critical sections (not just the OS tick).  We are using the PIC24 port.  We changed the following lines:

In portasm_PIC24.S line 56:
` MOV #224, W0 /* changed from 32 to 224 to*/`

In portmacro.h line 90:
`#define portINTERRUPT_BITS ( ( unsigned portSHORT ) 7 << ( unsigned portSHORT ) 5 )`

Any suggestions/thoughts/help would be greatly appreciated!

rtel wrote on Tuesday, December 15, 2009:

Are you using the stack overflow checking?  Could it be an overflow causing the problem?


mikeycosta wrote on Tuesday, December 15, 2009:

I’m working with Gabe on the same project.  We are checking the stack and it is fine.  (We have seen the stack problem before with queues, but that caused an address error.)  There is no address error this time, and we don’t hit a breakpoint in the stackcheckhook().  The OS and all other tasks are still running.

gmulley wrote on Tuesday, December 15, 2009:

I also noticed that if I pause the debugger at this point the OS is in a very bizarre state.  Specifically, the task was successfully added to the ready list, but variable uxTopReadyPriority does not equal 10 (even though it should, since that is the highest priority task ready to run).  Has anyone ever seen anything like this before?

mikeycosta wrote on Thursday, December 17, 2009:

We found that disabling interrupt nesting fixes the problem.  We tracked it down to an interrupt corrupting the ready task list, but I don’t see why this would be the case.  All our interrupt priorities are set to 1, which is the same priority as the kernel and system tick.  I’m glad we at least know what is going on, but it does not seem to make sense.

Any ideas?

gmulley wrote on Thursday, December 17, 2009:

We finally figured this out.  It appears that even if two interrupts share the same priority they **can** interrupt each other.  This can occur only if the interrupt flag has been cleared for the interrupt that is serviced first.  This is what was happening with us:

- Timer 3 interrupt occurs and the CPU enters the Timer 3 ISR
- Timer 3 interrupt flag cleared
- OS tick interrupt occurs in the middle of prvAddTaskToReadyQueue after uxTopReadyPriority had been set, but before vListInsertEnd was called.  When the OS tick yields it decrements uxTopReadyPriority down to a low priority.
- When the Timer 3 interrupt is able to run again it adds the high priority task to pxReadyTasksLists
- Now uxTopReadyPriority and pxReadyTasksLists are out of sync.  This meant that the high priority task was never woken up again.

All problems were solved by making absolutely sure that all interrupt flags are cleared **after** the last OS call in an interrupt (except for the yield).  This ensures that interrupts that use OS functions are never nested.

Alternatively, you can just disable nested interrupts if you don’t need them.

rtel wrote on Friday, December 18, 2009:

Thanks for taking the time to post the solution.