FreeRTOS vEventGroupDelete assert failing

nickm2018 wrote on Monday, October 08, 2018:

I am running a custom platform (Cypress CYW43907) using the heap_3 configuration with FreeRTOS v9.0.0, provided in the WICED SDK.

In my application, I am trying to reset a library to a known state after an error, but I end up hitting a configASSERT call in the FreeRTOS queue code. The problem appears to be related to vEventGroupDelete.

I stepped into the code with my debugger and the eventGroup is valid and not null.

Call tree

–> vPortFree
–> free
–> some platform library call
–> malloc_lock (heap_3.c)
configASSERT( !( ( xTaskGetSchedulerState() == taskSCHEDULER_SUSPENDED ) && ( xTicksToWait != 0 ) ) );

Going back through the call stack, configUSE_NEWLIB_MALLOC_LOCK is defined to 1, which should suspend the scheduler through vTaskSuspendAll. If I step into xTaskGetSchedulerState(), I see:

xSchedulerRunning = 1, so scheduler is started
uxSchedulerSuspended = 1, so scheduler is suspended.

So xTaskGetSchedulerState returns taskSCHEDULER_SUSPENDED and xTicksToWait is set to 0xFFFFFFFF, which means the assert will fail.
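
The failing check can be reproduced in isolation. The sketch below is a host-side model, not FreeRTOS code: the enum, `blocking_call_allowed()` and `model_blocking_receive()` are hypothetical stand-ins that only mirror the expression in the configASSERT quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the scheduler states; values are illustrative. */
enum model_sched_state {
    MODEL_SCHED_NOT_STARTED,
    MODEL_SCHED_RUNNING,
    MODEL_SCHED_SUSPENDED
};

/* Mirrors the asserted expression:
 * !( ( state == SUSPENDED ) && ( xTicksToWait != 0 ) ) */
static bool blocking_call_allowed(enum model_sched_state state,
                                  uint32_t ticks_to_wait)
{
    return !((state == MODEL_SCHED_SUSPENDED) && (ticks_to_wait != 0u));
}

/* Stand-in for a blocking receive that starts with the same assert. */
static void model_blocking_receive(enum model_sched_state state,
                                   uint32_t ticks_to_wait)
{
    assert(blocking_call_allowed(state, ticks_to_wait));
    /* ... the real function would wait on the queue here ... */
}
```

With the values seen in the debugger (scheduler suspended, block time 0xFFFFFFFF) `blocking_call_allowed()` returns false, so the assert fires. A block time of 0, or a running scheduler, passes the check.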

It seems there are two calls to vTaskSuspendAll in the call tree, one in vPortFree and the other in vEventGroupDelete. It also seems that the scheduler must not be suspended when xQueueGenericReceive is called with a non-zero block time.

I doubt this is a bug but more of a configuration issue. Anyone have some ideas?

nickm2018 wrote on Monday, October 08, 2018:

My platform has added _malloc_lock and _malloc_unlock functions to provide reentrancy for malloc/free calls. These add a mutex to the system, which is taken by calling xSemaphoreTakeRecursive with a delay of portMAX_DELAY. I am guessing vEventGroupDelete was never meant to be called in a system where the functions it calls need the scheduler running for timeouts, since the first line in vEventGroupDelete suspends the scheduler.

Not exactly sure how to proceed…

rtel wrote on Monday, October 08, 2018:

Sorry to hear you are having troubles. First I can confirm that, as a
general rule of thumb, you should not call FreeRTOS kernel API functions
that can block while the scheduler is either suspended or you are in a
critical section. That is to prevent logic errors - if a function needs
to block because, for example, you are reading from a queue with a block
time but the queue is empty then a context switch must be allowed to
happen otherwise the queue read function will return without either
timing out or receiving data.

Normally the C library malloc() and free() would only be used if you are
using other third party libraries that are using these functions,
otherwise heap_4.c is the preferred memory allocator. If you need to use the standard
library heap then you can use heap_3.c, which wraps the standard library
malloc() and free() to make them thread safe (somewhat crudely), and
then you can direct malloc() to pvPortMalloc() and free() to vPortFree()
(see the link already posted).

I’m not sure what configUSE_NEWLIB_MALLOC_LOCK does - I searched the
kernel source files and can’t find a reference to it.

nickm2018 wrote on Monday, October 08, 2018:

configUSE_NEWLIB_MALLOC_LOCK is added from my platform SDK.

It looks like the simplest workaround is to change to a statically allocated EventGroup.

richard_damon wrote on Tuesday, October 09, 2018:

My thought is that the simpler solution is, rather than using a mutex, to have your malloc lock stop the scheduler. Before stopping it, check whether it is already stopped, and if so, don’t stop or restart it (I don’t think vTaskSuspendAll/xTaskResumeAll handle nesting). The problem with a mutex here is that you might be in the middle of another memory allocation (and thus can’t nest) and then switch to another task that is doing something that can’t block (like you hit) and that also wants to allocate memory. Suspending the scheduler ensures that can’t happen.
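
A minimal host-side sketch of that idea, under stated assumptions: the `model_*` functions and variables are hypothetical stand-ins for the kernel and for the platform's malloc lock hooks, not FreeRTOS or SDK code. The lock suspends the scheduler only if it is not already suspended, and the unlock resumes it only if the lock was the one that suspended it. A depth counter is added here because the allocator may take the lock recursively.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel's suspend state and suspend/resume calls. */
static bool model_scheduler_suspended = false;
static void model_suspend_all(void) { model_scheduler_suspended = true; }
static void model_resume_all(void)  { model_scheduler_suspended = false; }

/* True while the outermost lock call is the one that suspended the scheduler. */
static bool lock_owns_suspension = false;
/* Number of lock calls not yet matched by an unlock. */
static unsigned lock_depth = 0u;

static void model_malloc_lock(void)
{
    /* Only the outermost lock decides whether to suspend. */
    if (lock_depth == 0u && !model_scheduler_suspended) {
        model_suspend_all();
        lock_owns_suspension = true;
    }
    ++lock_depth;
}

static void model_malloc_unlock(void)
{
    assert(lock_depth > 0u);
    /* Resume only when the outermost unlock runs and we did the suspend. */
    if (--lock_depth == 0u && lock_owns_suspension) {
        lock_owns_suspension = false;
        model_resume_all();
    }
}
```

In this model, a lock taken while the scheduler is already suspended (the vEventGroupDelete case) leaves the suspension untouched, so the caller that suspended the scheduler is still the one that resumes it.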

rtel wrote on Tuesday, October 09, 2018:

Minor correction - the suspend/resume scheduler functions can be nested.
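
To illustrate what nesting means here, the following is a host-side model only, with hypothetical `model_*` names rather than kernel code: each suspend increments a counter, each resume decrements it, and the scheduler counts as suspended while the counter is non-zero. Under that behaviour a malloc lock can suspend and resume unconditionally, even when its caller has already suspended the scheduler.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a nesting suspend count; a stand-in, not the kernel's variable. */
static unsigned model_suspend_depth = 0u;

static void model_task_suspend_all(void)
{
    ++model_suspend_depth;
}

static void model_task_resume_all(void)
{
    /* Every resume must match an earlier suspend. */
    assert(model_suspend_depth > 0u);
    --model_suspend_depth;
}

static bool model_is_suspended(void)
{
    return model_suspend_depth != 0u;
}

/* Hypothetical lock hooks that rely on nesting instead of a mutex. */
static void model_nested_malloc_lock(void)   { model_task_suspend_all(); }
static void model_nested_malloc_unlock(void) { model_task_resume_all(); }
```

In this model, an outer suspend followed by a lock/unlock pair leaves the scheduler suspended until the outer resume runs, and nothing in the lock path needs a block time.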