Find Offending Task that caused the Watchdog to Fire

I am relatively new to FreeRTOS. I have a system consisting of several tasks. Each of them checks in with a watchdog task, which periodically kicks the HW watchdog. I have set up a timer interrupt that fires just before the watchdog is about to fire, to help in debugging watchdog issues. I want to know the offending task that is going to cause the watchdog timeout: some task took a long time to run, thereby starving other tasks and not letting them check in with the watchdog.

How do I figure out what the offending task might be?

You want to figure out which task did not kick the watchdog, right? If so, you need to keep track of which tasks have checked in. One way to do that is to maintain a bitmap where each bit corresponds to a task. Each task sets its own bit in the bitmap, so any bit that is still clear when the watchdog task runs its check tells you the offending task.

#include "FreeRTOS.h"
#include "task.h"

#define TASK_0_BIT ( 1 << 0 )
#define TASK_1_BIT ( 1 << 1 )
#define TASK_2_BIT ( 1 << 2 )
#define TASK_3_BIT ( 1 << 3 )
#define ALL_BITS ( TASK_0_BIT | TASK_1_BIT | TASK_2_BIT | TASK_3_BIT )

#define WATCHDOG_CHECK_PERIOD   pdMS_TO_TICKS( 5000 )

uint32_t bitmap = 0;

void task0( void * params )
{
    for( ;; )
    {
        /* Do the work. */

        /* Set the flag. */
        taskENTER_CRITICAL();
        {
            bitmap |= TASK_0_BIT;
        }
        taskEXIT_CRITICAL();
    }
}

/* task1, 2 and 3 do the same. */

void watchdogTask( void * params )
{
    for( ;; )
    {
        vTaskDelay( WATCHDOG_CHECK_PERIOD );

        taskENTER_CRITICAL();
        {
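            if( ( bitmap & ALL_BITS ) == ALL_BITS )
            {
                /* All tasks checked in during this period. Kick the HW
                 * watchdog here, then clear the flags so that every task
                 * has to check in again before the next check. */
                bitmap = 0;
            }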
            else
            {
                /* At least one task failed to check in. Examine bitmap to
                 * find the offending task: its bit is still clear. Forcing
                 * an assert here helps in finding the offending task during
                 * development. */
                configASSERT( pdFALSE );
            }
        }
        taskEXIT_CRITICAL();
    }
}
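The failure branch above leaves "examine bitmap" to the reader. As a minimal sketch of that step, assuming the same four-task layout (bit N belongs to task N): the helper names `missedCheckIns` and `reportMissedCheckIns` are made up for illustration, and `printf()` is only a stand-in for whatever logging your target has.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TASK_COUNT    4U
#define ALL_BITS      ( ( UINT32_C( 1 ) << TASK_COUNT ) - 1U )

/* The bitmap is a uint32_t, so the shift above is only valid below 32 tasks. */
static_assert( TASK_COUNT < 32U, "too many tasks for a 32-bit bitmap" );

/* Returns a mask with a bit set for every task that did NOT check in. */
uint32_t missedCheckIns( uint32_t bitmap )
{
    return ALL_BITS & ~bitmap;
}

/* Prints the index of each task that did not check in and returns how many
 * there were. printf() is an assumption; swap in your own logging. */
unsigned int reportMissedCheckIns( uint32_t bitmap )
{
    uint32_t missed = missedCheckIns( bitmap );
    unsigned int count = 0;

    for( unsigned int i = 0; i < TASK_COUNT; i++ )
    {
        if( ( missed & ( UINT32_C( 1 ) << i ) ) != 0U )
        {
            printf( "task%u did not check in\n", i );
            count++;
        }
    }

    return count;
}
```

Something like `reportMissedCheckIns( bitmap )` could be called just before the `configASSERT()`, so the log already names the tasks when the assert fires. Keep in mind this only tells you who was starved, not who did the starving.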

If you want to know which task(s) didn’t kick the watchdog, then as aggarg mentioned, look at the list of which tasks have/haven’t kicked the watchdog.

If you want to know which task(s) are causing other tasks to fail because they are starved, then you may want to look at the run-time stats option, and see which tasks are using a lot of CPU between watchdog kicks.

Yes, I want to know which task is causing other tasks to not kick the watchdog. I often see that the task that couldn’t kick the watchdog isn’t the offender. It just did not get a chance to kick the watchdog because of its low priority.

Yes, how do I find out which tasks are using a lot of CPU?

Did you look at the run-time stats functions, such as uxTaskGetSystemState()?
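A rough sketch of how that could be used to find the CPU hog: take a snapshot of every task's run-time counter at each watchdog check, and when a check fails, compare the latest snapshot with the previous one. The task with the largest increase is the most likely offender. On the target, the snapshot would come from `uxTaskGetSystemState()`, which fills an array of `TaskStatus_t` and needs `configUSE_TRACE_FACILITY` and `configGENERATE_RUN_TIME_STATS` set to 1. To keep the example buildable off-target, the comparison is written as plain C on a hypothetical `TaskSample_t` record; that type and `findCpuHog()` are illustrative names, not FreeRTOS APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot record, one per task. On a FreeRTOS target these
 * fields would be copied from the xTaskNumber and ulRunTimeCounter members
 * of the TaskStatus_t array filled in by uxTaskGetSystemState(). */
typedef struct
{
    uint32_t taskNumber;
    uint32_t runTimeCounter;
} TaskSample_t;

/* Returns the index in 'now' of the task that accumulated the most run time
 * since 'prev' and writes its share of the elapsed run time, in percent, to
 * *pPercent. Returns -1 (and 0 percent) if no run time was accumulated. */
int findCpuHog( const TaskSample_t * prev, size_t prevCount,
                const TaskSample_t * now, size_t nowCount,
                uint32_t * pPercent )
{
    uint32_t total = 0;
    uint32_t best = 0;
    int bestIndex = -1;

    assert( ( now != NULL ) && ( pPercent != NULL ) );
    assert( ( prev != NULL ) || ( prevCount == 0U ) );

    for( size_t i = 0; i < nowCount; i++ )
    {
        /* A task with no previous sample is new: count its whole counter. */
        uint32_t delta = now[ i ].runTimeCounter;

        for( size_t j = 0; j < prevCount; j++ )
        {
            if( prev[ j ].taskNumber == now[ i ].taskNumber )
            {
                /* Unsigned subtraction stays correct across one wrap. */
                delta = now[ i ].runTimeCounter - prev[ j ].runTimeCounter;
                break;
            }
        }

        total += delta;

        if( delta > best )
        {
            best = delta;
            bestIndex = ( int ) i;
        }
    }

    *pPercent = ( total > 0U ) ? ( uint32_t ) ( ( ( uint64_t ) best * 100U ) / total ) : 0U;

    return bestIndex;
}
```

In the watchdog task, the failure branch could call this with the two most recent snapshots and log the `pcTaskName` of the winning entry. The idle task will usually show up as the biggest consumer when nothing is wrong, so the result is mainly useful right after a missed check-in.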

Consider using Tracealyzer.