Invalid entry in xPendingReadyList

dcowan0 wrote on Thursday, April 19, 2012:

I’m running FreeRTOS V7.1.0 on an MSP430F5419A.
My ISR supporting an accelerometer gets called when that device’s FIFO reaches a specified watermark level.
In the ISR I call xSemaphoreGiveFromISR on a semaphore and, if it returns pdTRUE, call portYIELD_FROM_ISR with the variable that was set by the semaphore give. This wakes a task that is blocked on the semaphore, which then drains the FIFO. All of this works fine.
The problem is that xPendingReadyList intermittently gets an entry (always 0xF3FFF) that clearly does not correspond to the address of any of my 7 TCBs. This doesn’t usually happen until the system has run for 3 to 5 minutes. Once the invalid entry is present, things quickly deteriorate and my watchdog timer resets the system. The variables on either side of xPendingReadyList look fine, so it doesn’t appear to be caused by memory being clobbered.

Anyone have any thoughts as to what could be causing this?



edwards3 wrote on Thursday, April 19, 2012:

The first thing to check for is stack overflow. Do you have configCHECK_FOR_STACK_OVERFLOW set to 2, and a stack overflow hook defined?

You might also get some hints from here

dcowan0 wrote on Thursday, April 26, 2012:

Yes, stack overflow was my first thought. However, not only have I enabled the stack overflow checking as you suggested, I have also added a method to tasks.c that is called by vTaskSwitchContext. I keep a vector of pointers to the TCBs of all the tasks I’ve launched, and this method verifies that for every task the word at pxTCB->pxStack is still 0xA5A5. If not, it halts at a breakpoint. That breakpoint is never hit.
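
For readers following along, the fill-pattern check described above might look roughly like the sketch below. It is a standalone illustration, not the poster's code and not part of tasks.c. FreeRTOS fills each task stack with 0xA5 bytes at creation; the names stack_fill, stack_guard_intact and GUARD_BYTES are hypothetical, and the stack is assumed to grow downward so that the lowest address is the last to be used.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u  /* same value the kernel uses to fill task stacks */
#define GUARD_BYTES     4u     /* hypothetical: bytes inspected at the stack limit */

/* Fill a stack buffer with the known pattern, as done at task creation. */
void stack_fill(uint8_t *stack_base, size_t size_bytes)
{
    memset(stack_base, STACK_FILL_BYTE, size_bytes);
}

/* Return 1 if the bytes at the low end of a downward-growing stack
 * still hold the fill pattern, 0 if any of them has been overwritten. */
int stack_guard_intact(const uint8_t *stack_base)
{
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (stack_base[i] != STACK_FILL_BYTE) {
            return 0;
        }
    }
    return 1;
}
```

A check like this only notices writes that land on the guard bytes themselves. An overflow that skips past them, or one on a stack that is not covered by the check, would go undetected.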

I’ve also added another method to tasks.c that verifies that a pointer to a TCB about to be inserted into a list is valid (again using my vector of saved TCB pointers). This is the check that catches an invalid pointer just before it is inserted into xPendingReadyList.
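
A pointer check of the kind described above could be sketched as follows. This is a minimal standalone illustration under assumptions, not the poster's actual code: tcb_register, tcb_is_known and MAX_KNOWN_TCBS are hypothetical names, and the TCB addresses are treated as opaque pointers recorded when each task is created.

```c
#include <stddef.h>

#define MAX_KNOWN_TCBS 8u  /* hypothetical capacity; the poster had 7 tasks */

static const void *known_tcbs[MAX_KNOWN_TCBS];
static size_t known_tcb_count = 0;

/* Record a TCB address when its task is created.
 * Returns 1 on success, 0 if the table is already full. */
int tcb_register(const void *tcb)
{
    if (known_tcb_count >= MAX_KNOWN_TCBS) {
        return 0;
    }
    known_tcbs[known_tcb_count++] = tcb;
    return 1;
}

/* Return 1 if the pointer matches a recorded TCB, 0 otherwise.
 * Intended to be called just before a TCB is inserted into a list. */
int tcb_is_known(const void *tcb)
{
    for (size_t i = 0; i < known_tcb_count; i++) {
        if (known_tcbs[i] == tcb) {
            return 1;
        }
    }
    return 0;
}
```

Calling tcb_is_known just before each list insertion and halting on a 0 result gives a place to set a breakpoint when a value such as 0xF3FFF turns up.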

anonymous wrote on Thursday, April 26, 2012:

I had a similar problem last week. I don’t know the details of your chip, but on the Cortex-M3 there is a main stack that gets used for interrupts. As I now understand it, that stack is not checked for overflow by FreeRTOS, and it was overflowing on my chip, with symptoms very similar to what you describe.

dcowan0 wrote on Monday, May 21, 2012:

Having been unable to determine the root cause of this problem (it’s not a stack overflow), I decided to upgrade to V7.1.1.
After doing that I ran our application for over 49 hours over the weekend without any problems.
Prior to the upgrade it would crash after a few minutes.

I really don’t know if I can claim the problem is fixed, but for the moment things are solid.