rtel wrote on Wednesday, June 27, 2012:
So, if I follow what you are saying, a task at priority 1 is being signaled by an interrupt via a semaphore, and this is working ok for some time before the system crashes. When the system crashes, pxCurrentTCB is NULL, and the ready tasks list for tasks of priority 1 is empty.
Even with the task not referenced from the ready list, the scheduler should at least select the idle task to run, so pxCurrentTCB should never be zero. This points to a corruption of the data structures somewhere.
It sounds like you have debugged this quite deeply, and are already familiar with the interrupt priority configuration parameters - so my usual reply of “it's probably the interrupt priority settings” might not apply here. However, it is still worth double checking that the interrupts are running at the priority you think they are, and that no interrupts you are not aware of are running. These are the items documented on the http://www.freertos.org/FAQHelp.html and http://www.freertos.org/RTOS-Cortex-M3-M4.html web pages.
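As a rough illustration of the priority check, here is a minimal sketch in plain C. The helper names (shifted_priority, isr_may_use_from_isr_api), the 4 priority bits and the 0x50 limit are all assumptions made up for the example, not values from your project and not part of the FreeRTOS API. It relies on two Cortex-M facts: the implemented priority bits sit at the top of the 8-bit priority register, and a numerically lower value means a logically higher priority. An interrupt that calls a FromISR function must therefore have a raw priority value at or above configMAX_SYSCALL_INTERRUPT_PRIORITY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of FreeRTOS: convert a logical priority
   (0 = most urgent) into the raw 8-bit value held in the NVIC priority
   register. The implemented bits are the most significant ones.
   prio_bits is device specific; the caller must supply the right value. */
static uint8_t shifted_priority(uint8_t logical_priority, uint8_t prio_bits)
{
    assert(prio_bits >= 1 && prio_bits <= 8);
    assert(logical_priority < (1u << prio_bits));
    return (uint8_t)(logical_priority << (8u - prio_bits));
}

/* Hypothetical helper, not part of FreeRTOS: true if an interrupt with this
   raw priority is allowed to call interrupt safe (FromISR) API functions.
   max_syscall_priority stands in for configMAX_SYSCALL_INTERRUPT_PRIORITY.
   Numerically lower means more urgent, so the raw value must be greater
   than or equal to the limit. */
static bool isr_may_use_from_isr_api(uint8_t raw_priority,
                                     uint8_t max_syscall_priority)
{
    return raw_priority >= max_syscall_priority;
}
```

With the assumed 4 priority bits and a limit of 0x50 (logical priority 5), an interrupt at logical priority 6 passes the check and one at logical priority 4 fails it. Note that a raw value of 0, which is what an interrupt has if its priority was never set, also fails, so it is worth confirming that every interrupt using the API has had its priority set explicitly.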
After that, it is a matter of looking at any other potential source of corruption in your code: where are buffers being used, is any shared memory being used, are resources being shared by multiple tasks, etc.
Can you keep cutting bits out of the code until the problem stops happening, so you have the smallest amount of code left to look at?
It is also possible to look at the task’s TCB itself, and see if you can work out where the task is. If it is not in its ready list, is it referenced from another list somewhere? You can do this by obtaining the task’s TCB from its handle, then looking at the generic list item’s container value. It should always be one of the ‘state’ lists - be that one of the ready lists, the blocked list, etc. - but in this case that is probably the last resort option, as learning where your task is after the problem has occurred is not necessarily going to lead you directly to the actual cause of the problem.
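To make the container lookup concrete, here is a minimal sketch using simplified stand-in types rather than the real FreeRTOS headers. MockList, MockListItem, MockTCB, NamedList and find_state_list are all hypothetical names invented for the example; only the member names xGenericListItem and pvContainer are meant to mirror the kernel's own. In the real kernel the state lists are private to tasks.c, so in practice you would most likely make the same comparison by hand in the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions for illustration). */
typedef struct MockList { int unused; } MockList;

typedef struct MockListItem {
    void *pvContainer;              /* list this item currently belongs to */
} MockListItem;

typedef struct MockTCB {
    MockListItem xGenericListItem;  /* links the task into one state list */
} MockTCB;

/* Pairs a list address with a readable name for reporting. */
typedef struct NamedList {
    const MockList *list;
    const char *name;
} NamedList;

/* Return the name of the state list that holds the task, or NULL if the
   container matches none of the known lists. A NULL result would suggest
   the list item (or the TCB itself) has been corrupted. */
static const char *find_state_list(const MockTCB *tcb,
                                   const NamedList *known, size_t count)
{
    assert(tcb != NULL && known != NULL);
    for (size_t i = 0; i < count; i++) {
        if (tcb->xGenericListItem.pvContainer == (const void *)known[i].list) {
            return known[i].name;
        }
    }
    return NULL;
}
```

The idea is to build the table of known lists once (each ready list, the delayed lists, the suspended list and so on), then compare the task's container pointer against each entry. A pointer that matches nothing is itself useful evidence that the structures have been overwritten.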
Regards.