Debugging suspended tasks

hillridge wrote on Tuesday, January 29, 2019:

Warning, FreeRTOS noob here.
I’m using Atmel Studio 7, a SAMG55 (ARM Cortex-M4) processor, and FreeRTOS 10.0.0 as provided by Atmel.

I’m trying to debug why a task is “crashing”. I know that it is related to an I2C peripheral, which is outside the scope of this forum, but I’m not sure how to track down the root cause. I have a debug line in each running task that prints some relevant info to the serial terminal, and when my bug occurs, multiple tasks stop outputting this debug data.

I can view task status with the FreeRTOS Viewer plugin, and I noticed that all of the problem tasks were listed as “Suspended”. When normally operating, I see these as “Running” or “Blocked” (because I have a vTaskDelay(1000) in them).

Questions:

  1. What is FreeRTOS seeing/doing that makes these tasks transition to the suspended state?
  2. How do I determine what has caused this to happen?

Thanks

richarddamon wrote on Wednesday, January 30, 2019:

A task becomes suspended if it, or another task, calls suspend on it, or if it blocks for an ‘infinite’ time (waiting on a queue/semaphore for portMAX_DELAY with the appropriate options set).

Likely you have gotten your tasks into a dead-lock, each waiting for some other task to give something they are waiting on. One option is to change all delays of portMAX_DELAY to something smaller (but still long enough that a timeout ‘shouldn’t’ occur), call an error function if the wait does time out, and set a breakpoint there to find at least part of the dead-lock.

One other cause of dead-lock is to try to take a non-recursive mutex that you have already taken.
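
The finite-timeout suggestion could look roughly like the sketch below. The names xI2cMutex, vLockTimeoutError and vUseI2cBus, and the 5 second limit, are made up for illustration and are not from the original poster's code; only the FreeRTOS calls themselves (xSemaphoreTake, xSemaphoreGive, pdMS_TO_TICKS, taskDISABLE_INTERRUPTS) are real API.

```c
#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"

/* Hypothetical timeout: long enough that it should never expire in a
 * healthy system. Pick a value that suits the application. */
#define I2C_LOCK_TIMEOUT_TICKS    pdMS_TO_TICKS( 5000 )

/* Hypothetical mutex guarding the peripheral, assumed to be created
 * elsewhere (for example with xSemaphoreCreateMutex()). */
extern SemaphoreHandle_t xI2cMutex;

/* Single place to set a breakpoint. Reaching it means a wait that
 * should always succeed has timed out. */
static void vLockTimeoutError( const char *pcWhere )
{
    ( void ) pcWhere;            /* inspect this in the debugger */
    taskDISABLE_INTERRUPTS();    /* freeze the system state */

    for( ;; )
    {
    }
}

void vUseI2cBus( void )
{
    /* Previously: xSemaphoreTake( xI2cMutex, portMAX_DELAY ); */
    if( xSemaphoreTake( xI2cMutex, I2C_LOCK_TIMEOUT_TICKS ) != pdTRUE )
    {
        vLockTimeoutError( "vUseI2cBus" );
    }

    /* ... access the peripheral here ... */

    ( void ) xSemaphoreGive( xI2cMutex );
}
```

When the breakpoint in the error function is hit, the call stack shows which task gave up and what it was waiting for, which is usually enough to start untangling the dead-lock.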

rtel wrote on Wednesday, January 30, 2019:

Some kernel-aware debuggers will show tasks that are blocked indefinitely as being suspended, although in reality they are not. It tends to be the older plug-ins that have that quirk, which comes from an internal implementation detail in the kernel.

hillridge wrote on Wednesday, January 30, 2019:

Thanks for the suggestions. I’ll start looking at mutex issues, as there is one involved with the peripheral in question. Richard - how can I tell if the task is actually suspended or indefinitely blocked?

rtel wrote on Wednesday, January 30, 2019:

You would have to know some of the internals within the task control
block, then view those in the debugger.

If a task is blocked indefinitely it is referenced from the suspended
task list called xSuspendedTaskList in tasks.c (because it is not
waiting for a time) but it is also waiting for an event so the
pvContainer member of its event list item (xEventListItem in the TCB) is
not NULL. If it were truly suspended then it would not be waiting for
an event so the pvContainer member of xEventListItem would be NULL.

Which is a long route to saying the difference is whether xEventListItem
is contained in an event list or not. More recent kernel plug-ins check
that.
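
As a rough illustration of that check, here is a host-side sketch. The Sketch* types are simplified stand-ins and not the real kernel structures; they model only the pvContainer field described above. The xStateListItem name and the sketch_classify() function are assumptions made for the example.

```c
#include <stddef.h>

/* Stand-in for a kernel list item: only the container pointer is modelled. */
typedef struct
{
    void *pvContainer;    /* list the item currently sits in, or NULL */
} SketchListItem;

/* Stand-in for a task control block with the two list items of interest. */
typedef struct
{
    SketchListItem xStateListItem;    /* ready/delayed/suspended list membership */
    SketchListItem xEventListItem;    /* event list membership (queue, semaphore, ...) */
} SketchTCB;

typedef enum
{
    SKETCH_OTHER,                  /* not referenced from the suspended list */
    SKETCH_BLOCKED_INDEFINITELY,   /* on the suspended list AND waiting for an event */
    SKETCH_SUSPENDED               /* on the suspended list, waiting for nothing */
} SketchState;

/* Classify a task given the address of the suspended task list. */
SketchState sketch_classify( const SketchTCB *pxTCB, const void *pvSuspendedList )
{
    if( pxTCB->xStateListItem.pvContainer != pvSuspendedList )
    {
        return SKETCH_OTHER;
    }

    /* Same list in both cases; the event list item tells them apart. */
    if( pxTCB->xEventListItem.pvContainer != NULL )
    {
        return SKETCH_BLOCKED_INDEFINITELY;
    }

    return SKETCH_SUSPENDED;
}
```

In a debugger the equivalent is to expand the task's TCB and look at the pvContainer member of xEventListItem by hand. If INCLUDE_eTaskGetState is set to 1, calling eTaskGetState() from application code should, as far as I know, apply a comparable test and report eBlocked rather than eSuspended for an indefinitely blocked task.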

hillridge wrote on Friday, March 22, 2019:

Re-visiting this as I have another issue presenting similar symptoms.

I’m using this viewer: https://gallery.microchip.com/packages/6ed3add9-9a41-447f-98cc-1491e87d945d/

Anyone know if it reports suspended correctly?

Where is a good place to put a breakpoint so that I can see what’s going on when this task gets moved to a suspended state? I don’t think anything is using portMAX_DELAY, so I doubt that trick will help.

I have a really weird bug I’m working on right now that was triggered by changing a uint16 to an int16. It’s used by a completely different task than the one that is ending up suspended. It smells like a stack issue, but my breakpoints in vApplicationMallocFailedHook() and vApplicationStackOverflowHook() aren’t getting hit.