I am getting an exception when I stress my application, which eventually causes a watchdog reset on an LPC55S69 (Cortex-M33) controller.
I captured the instruction that causes the exception, and it points to the first line of code in xTaskRemoveFromEventList(), which is invoked by xQueueGenericSend(). Could someone please let me know what could be the reason for this exception?
xQueueGenericSend() calls xTaskRemoveFromEventList() when the list of tasks waiting on the queue is not empty, but I am not sure why it fails.
/* If there was a task waiting for data to arrive on the
 * queue then unblock it now. */
if( listLIST_IS_EMPTY( &( pxQueue->xTasksWaitingToReceive ) ) == pdFALSE )
{
    if( xTaskRemoveFromEventList( &( pxQueue->xTasksWaitingToReceive ) ) != pdFALSE )
    {
The equivalent assembly code generated for the first few lines of xTaskRemoveFromEventList() is as below.
I checked for stack overflow, and that is not the issue.
I have not defined my own configASSERT; I am using the default implementation, which disables interrupts and waits indefinitely.
/* Define to trap errors during development. */
#define configASSERT( x ) if( ( x ) == 0 ) { taskDISABLE_INTERRUPTS(); for( ;; ); }
The exception occurs before the configASSERT line, and I suspect it fails when it tries to load the head of the list. I would like to understand why that fetch fails.
#define listGET_OWNER_OF_HEAD_ENTRY( pxList ) ( (&( ( pxList )->xListEnd ))->pxNext->pvOwner )
Oh yes. @phaneesh86 Recent versions of FreeRTOS come with more useful assertions, which could help to narrow down the root cause if the list is invalid or has been corrupted.
Another reason for internal data corruption might be invalid interrupt priorities when using FreeRTOS (FromISR) API in ISRs.
When in doubt, see the FreeRTOS docs regarding RTOS for ARM Cortex-M and, for example, "Understanding priority levels of ISR and FreeRTOS APIs - #16 by aggarg", which contains a pretty good explanation.
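To make the priority rule concrete, here is a minimal host-runnable sketch. It assumes 3 implemented NVIC priority bits (check __NVIC_PRIO_BITS in your device header) and a hypothetical maximum syscall priority of 2; all names prefixed with example/EXAMPLE are made up for illustration and are not part of FreeRTOS. On Cortex-M a numerically lower priority is more urgent, so an ISR that calls a FromISR API must use a numeric priority equal to or greater than the configured maximum syscall priority.

```c
#include <stdint.h>

/* Assumption: the device implements 3 NVIC priority bits. */
#define EXAMPLE_PRIO_BITS                   3U

/* Hypothetical value: the most urgent logical priority from which
 * FromISR API calls are still allowed in this sketch. */
#define EXAMPLE_MAX_SYSCALL_LOGICAL_PRIO    2U

/* Convert a logical priority (0 = most urgent) into the 8-bit value held in
 * the NVIC priority register, where only the top bits are implemented. */
static uint8_t examplePrioToRegister( uint32_t ulLogicalPrio )
{
    return ( uint8_t ) ( ( ulLogicalPrio << ( 8U - EXAMPLE_PRIO_BITS ) ) & 0xFFU );
}

/* Return 1 if an ISR at this logical priority may call FromISR APIs.
 * A numerically lower value is MORE urgent, so such an ISR would be able to
 * interrupt kernel critical sections and corrupt kernel data like lists. */
static int exampleIsrMayUseFromIsrApi( uint32_t ulLogicalPrio )
{
    return ulLogicalPrio >= EXAMPLE_MAX_SYSCALL_LOGICAL_PRIO;
}
```

With these assumed values, an ISR left at logical priority 0 or 1 would not be allowed to call xQueueSendFromISR() and friends, while priorities 2 to 7 would be.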
I suspect the same, that it is trying to access a NULL pointer, but is that even possible? Before xTaskRemoveFromEventList() is called, the code checks that the list is not empty.
/* If there was a task waiting for data to arrive on the
 * queue then unblock it now. */
if( listLIST_IS_EMPTY( &( pxQueue->xTasksWaitingToReceive ) ) == pdFALSE )
{
    if( xTaskRemoveFromEventList( &( pxQueue->xTasksWaitingToReceive ) ) != pdFALSE )
    {
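One way this can still happen: listLIST_IS_EMPTY() only looks at the item counter (uxNumberOfItems) and does not validate the pointers, so a corrupted list can pass the check and still have a bad pxNext in xListEnd. The sketch below illustrates that with simplified stand-in types; MockList_t, MockListItem_t and both helper functions are hypothetical and only model the fields relevant here, they are not the real FreeRTOS definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the FreeRTOS list types (assumption: only the
 * fields that matter for this discussion are modelled). */
typedef struct MockListItem
{
    struct MockListItem * pxNext;
    struct MockListItem * pxPrevious;
    void * pvOwner;
} MockListItem_t;

typedef struct
{
    unsigned long uxNumberOfItems;
    MockListItem_t xListEnd; /* End marker; its pxNext is the list head. */
} MockList_t;

/* Same idea as listLIST_IS_EMPTY(): only the counter is inspected. */
static int mockListIsEmpty( const MockList_t * pxList )
{
    return pxList->uxNumberOfItems == 0UL;
}

/* Hypothetical sanity check of the head link before it is dereferenced. */
static int mockHeadLooksValid( const MockList_t * pxList )
{
    const MockListItem_t * pxHead = pxList->xListEnd.pxNext;

    if( pxHead == NULL )
    {
        return 0; /* Dereferencing pxHead->pvOwner would fault. */
    }

    if( pxHead == &( pxList->xListEnd ) )
    {
        return 0; /* Head is the end marker: the list is really empty. */
    }

    /* In an intact list the head item points back at the end marker. */
    return pxHead->pxPrevious == &( pxList->xListEnd );
}
```

If something overwrites xListEnd.pxNext while uxNumberOfItems stays non-zero, mockListIsEmpty() still reports "not empty" but mockHeadLooksValid() fails, which matches a fault on the very first load in xTaskRemoveFromEventList().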
Regarding the call stack, I don't have a debugger connected, so I can't give the exact call stack, but based on the exception handler data and a code walkthrough, it is as below.
There is little merit in trying to get more information about the fault, because the real corruption has occurred many, many cycles before the crash. As @aggarg pointed out, data breakpoints are very valuable tools in pinpointing the root cause. For example, if you happen to discover that the corruption always involves the invalid value 0xdeadbeef at address 0x24448888, define a data watchpoint to break into the debugger on a write of 0xdeadyyyy to that address. That kind of thing gets you to the root cause much quicker.
Did you already verify that the ISR priorities are valid, that the correct FromISR FreeRTOS API is used, and that the heap implementation is right?
Since FreeRTOS allocates task stacks from the heap (by default/if configured accordingly), they can't overlap. In addition, on Cortex-M33 the stack check (if enabled) is very reliable because it has HW support for it (stack limit registers).
You could also use a recent FreeRTOS version, with the more sophisticated asserts and checks mentioned above, just for testing purposes to find the problem, and then apply the fix to your production application if you really can't upgrade.
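Since no debugger is attached, an assert that just loops with interrupts disabled only shows up as a watchdog reset. Below is a rough, host-runnable sketch of an assert hook that records the file and line before stopping. The names exampleASSERT, vExampleAssertCalled and xLastAssert are hypothetical; on the target the record would have to live in RAM that survives the reset (an assumption about your linker script and startup code), and the hook would normally disable interrupts and spin after storing the location.

```c
#include <stddef.h>

/* Hypothetical record of the most recent failed assertion. */
typedef struct
{
    const char * pcFile;
    int iLine;
} ExampleAssertRecord_t;

static volatile ExampleAssertRecord_t xLastAssert = { NULL, 0 };

/* Store where the assertion failed. On target one would typically follow
 * this with taskDISABLE_INTERRUPTS() and an endless loop (or a reset); that
 * part is left out so the sketch can run on a host machine. */
static void vExampleAssertCalled( const char * pcFile, int iLine )
{
    xLastAssert.pcFile = pcFile;
    xLastAssert.iLine = iLine;
}

/* Stand-in for configASSERT() that reports the failing location. */
#define exampleASSERT( x )                                  \
    do {                                                    \
        if( ( x ) == 0 )                                    \
        {                                                   \
            vExampleAssertCalled( __FILE__, __LINE__ );     \
        }                                                   \
    } while( 0 )
```

After the reset, the application could print or log xLastAssert once at startup to see which check fired.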