Delayed task list crashes

rakeshtv wrote on Friday, April 19, 2013:

The crash occurs in the following block, which is called from vTaskIncrementTick. The data suggests that pxTCB is corrupt, resulting in a data abort (dabort) exception. Taking that one step further, there is most likely a problem in the delayed task list.

Can anyone tell us whether any fix has been made to the delayed task list handling? We are using FreeRTOS v6.0.4.
Please advise.

#define prvCheckDelayedTasks()                                                                  \
{                                                                                               \
register tskTCB *pxTCB;                                                                         \
                                                                                                \
    while( ( pxTCB = ( tskTCB * ) listGET_OWNER_OF_HEAD_ENTRY( pxDelayedTaskList ) ) != NULL )  \
    {                                                                                           \
        if( xTickCount < listGET_LIST_ITEM_VALUE( &( pxTCB->xGenericListItem ) ) )              \
        {                                                                                       \
            break;                                                                              \
        }                                                                                       \
        vListRemove( &( pxTCB->xGenericListItem ) );                                            \
        /* Is the task waiting on an event also? */                                             \
        if( pxTCB->xEventListItem.pvContainer )                                                 \
        {                                                                                       \
            vListRemove( &( pxTCB->xEventListItem ) );                                          \
        }                                                                                       \
        prvAddTaskToReadyQueue( pxTCB );                                                        \
    }                                                                                           \
}

rtel wrote on Saturday, April 20, 2013:

I don’t think that code has been touched - but all the same it is always best to be using the latest version of code.  Upgrading is simply a matter of dropping the latest files over the older files in your project.

Which CPU are you using?  The symptom you describe is normally associated with incorrect interrupt priority assignments on a Cortex-M core.

Have you read the following two pages:


rakeshtv wrote on Monday, April 22, 2013:

We are using the LPC2388 CPU. Where are the interrupt priority assignments set? configMAX_SYSCALL_INTERRUPT_PRIORITY is not defined anywhere in our project. We use the VIC configure API to set the priorities of the interrupts:

#define VIC_SLOT_MAX            15u

#define VIC_SLOT_RTOS           0u              // DO NOT CHANGE
#define VIC_SLOT_TIMER0         (VIC_SLOT_RTOS) // TIMER0 alias for non-RTOS builds !!!
#define VIC_SLOT_BOD            3u
#define VIC_SLOT_DMA            4u
#define VIC_SLOT_USB            6u
#define VIC_SLOT_SD_MMC         7u
#define VIC_SLOT_EINT1          8u
#define VIC_SLOT_TIMER1         9u
#define VIC_SLOT_EINT0          10u
#define VIC_SLOT_WATCHDOG       11u
#define VIC_SLOT_RTC            11u
#define VIC_SLOT_SSP1           14u
#define VIC_SLOT_TIMER2         14u
#define VIC_SLOT_TIMER3         15u

/**
 * Configures the LPC2000 VIC controller for an interrupt source
 * @param u8SlotNo   : The slot allocated for this interrupt source
 * @param u32Channel : The interrupt source channel number
 * @param *pISR      : Address of the ISR
 * @return uint8_t: Error code
 */
uint8_t VIC_Configure(uint8_t u8SlotNo, uint32_t u32Channel, uint32_t* pISR)
{
    uint32_t *pSFR = NULL;

    if ((u8SlotNo > VIC_SLOT_MAX)      ||
        (u32Channel > VIC_CHANNEL_MAX) ||
        (pISR == NULL))
    {   // error, invalid VIC slot or channel or bad fn ptr
        return (1);
    }

    // 23xx devices don’t have a mapping, instead they increase number of channels
    // to 32 and assume a 1 to 1 map. Assume SLOT=PRIORITY

    // Configure the priority number for this interrupt source
    pSFR  = (uint32_t*)(&VICVectPriority0) + u32Channel;
    if (u8SlotNo < 0x10)            // is the slot within the max range (0x0 to 0xF) ?
    {
        *pSFR = (uint32_t)u8SlotNo; // set priority based on allocated slot
    }
    else
    {
        *pSFR = 0x0000000F;         // out of range, so default at lowest priority level
    }

    // Assign the ISR to the VIC address table
    pSFR  = (uint32_t*)(&VICVectAddr0) + u32Channel;
    *pSFR = (uint32_t)pISR;

    // Enable the VIC channel
    VICIntEnable |= (1 << u32Channel);

    // (The code below looks like the alternative path for devices that do have
    // a slot mapping; the directives separating the two paths are not in the post.)

    // Configure the slot number for this interrupt source
    pSFR  = (uint32_t*)(&VICVectCntl0) + u8SlotNo;
    *pSFR = ((u32Channel & 0x1F) | (1 << 5));

    // Assign the ISR to the VIC slot
    pSFR  = (uint32_t*)(&VICVectAddr0) + u8SlotNo;
    *pSFR = (uint32_t)pISR;

    // Enable the VIC channel
    VICIntEnable |= (1 << u32Channel);
    VIC_TRACE("VIC Channel %d configured\n", u32Channel);

    // success
    return (0);
}

edwards3 wrote on Monday, April 22, 2013:

That is an ARM7 part, so it will not have a configKERNEL_INTERRUPT_PRIORITY setting, as it does not support interrupt nesting. configKERNEL_INTERRUPT_PRIORITY is only needed when interrupt nesting is supported (Cortex-M3, RX, etc.).

rakeshtv wrote on Wednesday, April 24, 2013:

We added a lot of diagnostics to our application to report a stack dump and a checkpoint trace for each task created. The diagnostics show that the code jumps from one place to another without executing the code in between. It is a random jump within one particular task that has many nested while loops.
This also causes another task to hit a watchdog timeout.
Do you know why the code would jump from one location to another when there is no continue or goto statement involved?
Could it be a stack corruption issue?
We tried to reproduce the issue on our test bench, but it does not happen there. We see this problem only on customer devices running in the field. It is random and occurs only once in a while.

Please suggest what could be the root cause of code jumping in a task?
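
On the stack corruption question, one common way to look for evidence is to paint a stack with a known byte before the task runs and later count how many bytes at the far end are still untouched. The sketch below shows that idea in plain C on an ordinary buffer. It is an illustration under assumptions, not code from the project or from the kernel: the names stack_paint, stack_unused_bytes and STACK_FILL_BYTE are hypothetical, and it assumes a stack that grows downward, so the lowest addresses are the last to be used. FreeRTOS offers comparable facilities (the configCHECK_FOR_STACK_OVERFLOW option and uxTaskGetStackHighWaterMark()); whether and how they are available should be checked against the version in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fill pattern chosen for this sketch. */
#define STACK_FILL_BYTE 0xA5u

/* Paint the whole stack buffer with the fill pattern before the task starts. */
static void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* Count bytes, starting at the lowest address, that still hold the pattern.
 * Assumes the stack grows downward, so these are the bytes never reached.
 * A result of 0 means the stack has been used right to its limit, which is
 * consistent with (but does not prove) an overflow. */
static size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t unused = 0;

    while ((unused < size) && (stack[unused] == STACK_FILL_BYTE))
    {
        unused++;
    }
    return unused;
}
```

The count is a heuristic: a byte that was written with a value equal to the fill pattern is indistinguishable from an untouched one. Logging the value per task in the field diagnostics would show whether any stack is running close to its limit before the random jump occurs.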

rtel wrote on Thursday, April 25, 2013:

When the task is executing without any context switches it is just a normal C function, in which case the processor will simply execute the instructions and behave exactly as dictated by them. If a context switch is occurring then it is *possible* but *unlikely* that the task is being switched out and having its program counter saved, the program counter is somehow being corrupted while the task is switched out, and when the task starts running again the restored program counter continues from the wrong place. In the highly unlikely event that this occurred, it is again *unlikely* that a corrupted program counter would cause the task to continue within the same task code. It would be much more likely that a corrupted program counter would just cause a major crash as the processor started to access memory that didn’t contain instructions, a null address, or a misaligned address.

So unfortunately I don’t think I have a realistic scenario that would cause what you are seeing.

Could there be an environmental issue in the field that is not present on the test bench? For example, maybe the power is browning out, or there is some external radiated interference, or something is causing an error on the address bus (is the delta between the address at the point of the jump and the address of the jump destination a power of 2?).
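
The power-of-2 question can be checked mechanically against the addresses captured in the trace. The following is a minimal sketch with hypothetical helper names (address_delta, is_power_of_two, differs_in_one_address_bit). The XOR-based check is an addition not mentioned in the post: it tests whether the two addresses differ in exactly one bit, which is what a single faulty address line would produce, whereas a power-of-2 delta can also arise when a carry changes several bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Absolute difference between the address where execution left the expected
 * path and the address where it continued. */
static uint32_t address_delta(uint32_t from, uint32_t to)
{
    return (from > to) ? (from - to) : (to - from);
}

/* True when exactly one bit is set, i.e. the value is a power of 2. */
static bool is_power_of_two(uint32_t value)
{
    return (value != 0u) && ((value & (value - 1u)) == 0u);
}

/* True when the two addresses differ in exactly one bit position. */
static bool differs_in_one_address_bit(uint32_t from, uint32_t to)
{
    return is_power_of_two(from ^ to);
}
```

For example, with the made-up addresses 0x00012340 and 0x00012B40 the delta is 0x800 and both checks pass, while 0x0FFC and 0x1000 have a delta of 4 but differ in many bits, so only the delta check passes.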