FreeRTOS Heap corruption

segonzal · July 27, 2023, 6:58am

Hello,

I’m playing around with a FreeRTOS app and I’ve run into something that looks a bit strange: some heap corruption. The sequence goes something like this:

1. An application task with a priority higher than the priority of the timer task is running and is about to delete its RTOS primitives before terminating. One of these primitives is an event group.
2. An ISR runs and signals the event group using xEventGroupSetBitsFromISR(). That sends a message to the low-priority timer task, and that message contains a pointer to the event group object.
3. When the ISR exits, the application task with a priority higher than the priority of the timer task continues running. The task calls vEventGroupDelete() to delete the event group and then terminates itself.
4. The FreeRTOS timer task is scheduled to run. It picks up the message that’s asking it to signal the event flags object, but that object was deleted already in step 3. The signalling changes some state in the object, but that ends up stepping on some random part of the heap and corrupts the heap.
5. The application eventually crashes when allocating from the corrupted heap.
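
To make the ordering concrete, here is a minimal host-side model of the sequence above. It is only a sketch and does not use FreeRTOS: every name in it (ModelEventGroup, model_set_bits_from_isr(), and so on) is hypothetical, and "deleting" just clears a flag so the stale access can be counted instead of actually touching freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model; none of these names are FreeRTOS APIs. */
typedef struct {
    bool     alive;  /* false once the object has been "deleted" */
    uint32_t bits;
} ModelEventGroup;

typedef struct {
    ModelEventGroup *target;      /* pointer captured when the "ISR" ran */
    uint32_t         bits_to_set;
} DeferredCmd;

#define MODEL_QUEUE_LEN 4
static DeferredCmd model_queue[MODEL_QUEUE_LEN];
static size_t      model_queue_count;

/* The "ISR" does not touch the event group; it only queues a request
 * that carries a pointer to it. */
static bool model_set_bits_from_isr(ModelEventGroup *eg, uint32_t bits)
{
    assert(eg != NULL);
    if (model_queue_count == MODEL_QUEUE_LEN) {
        return false;
    }
    model_queue[model_queue_count].target = eg;
    model_queue[model_queue_count].bits_to_set = bits;
    model_queue_count++;
    return true;
}

/* The high-priority task deletes the object. A real kernel would free the
 * memory here; the model only marks it dead so the bug can be observed. */
static void model_delete(ModelEventGroup *eg)
{
    eg->alive = false;
}

/* The low-priority "timer task" finally runs and processes the queue.
 * Returns how many requests pointed at an already deleted object. */
static size_t model_timer_task_drain(void)
{
    size_t stale = 0;
    for (size_t i = 0; i < model_queue_count; i++) {
        DeferredCmd *cmd = &model_queue[i];
        if (cmd->target->alive) {
            cmd->target->bits |= cmd->bits_to_set;
        } else {
            stale++;  /* in the real system this write lands in freed heap */
        }
    }
    model_queue_count = 0;
    return stale;
}

/* Replays the signal, delete, and drain steps in the problematic order. */
size_t model_replay_sequence(void)
{
    static ModelEventGroup eg;
    eg.alive = true;
    eg.bits = 0;
    model_queue_count = 0;

    (void)model_set_bits_from_isr(&eg, 0x01u);  /* ISR signals            */
    model_delete(&eg);                          /* task deletes first     */
    return model_timer_task_drain();            /* timer task runs last   */
}
```

Calling model_replay_sequence() reports one stale request, which corresponds to the write into freed memory in step 4.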

I think that the documentation for vEventGroupDelete() does not say whether you are allowed to delete the object from a task that has a higher priority than the timer task if the object was recently signalled from an ISR. Does it make sense that, if the kernel implements the signalling operation via a deferred call, that deferred call would be canceled by the delete operation if it has not yet run?

I would like to understand this better.

Also, to avoid accessing memory that has been freed, should the message sent by xEventGroupSetBitsFromISR() to the timer task’s queue be somehow canceled by vEventGroupDelete() if the timer task hasn’t run yet, or should vEventGroupDelete() document some extra limitation regarding when you’re allowed to delete the Event Group object?

Kind regards,
Sergio.

aggarg · July 27, 2023, 8:36am

This is a case of “use after free”. I do not think there is a way to cancel the call posted to the timer queue. Can you not make the timer task the highest priority? Why do you need to delete RTOS primitives and tasks?
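
For reference, the timer task priority is set in FreeRTOSConfig.h. The fragment below is only a sketch: the value chosen for configMAX_PRIORITIES is an assumption for illustration, and the rest of your configuration will differ.

```c
/* FreeRTOSConfig.h (fragment). Numeric values are illustrative assumptions. */
#define configUSE_TIMERS            1   /* the timer/daemon task must exist */
#define configMAX_PRIORITIES        5   /* assumed value for this example   */

/* Give the timer task the highest priority so that a call deferred from an
 * ISR runs before any application task resumes. */
#define configTIMER_TASK_PRIORITY   ( configMAX_PRIORITIES - 1 )
```

With this setting the deferred set-bits call should be processed as soon as the ISR exits, before the application task gets a chance to delete the event group.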

richard-damon · July 27, 2023, 12:05pm

A basic rule for handling the “lifetime” of objects in a multi-threaded system is that you can not end the life of an object unless you KNOW you have exclusive access to it, that is, no other thread might attempt an access.

Since the ISR might access the EventGroup, the task can’t delete the EventGroup until after it makes sure the ISR will not access it anymore. And since the signaling from the ISR to the EventGroup is done via the Timer/Service task, if the ISR has already signaled it, the task can’t delete the EventGroup until that signal is processed. (That second part of the problem can only occur if the task isn’t lower priority than the Timer/Service task.)

Your operation is fundamentally built around a race condition, so it needs to be modified to eliminate it.
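
One possible modification, sketched as a host-side model rather than real FreeRTOS code: stop the interrupt source, queue a "fence" call behind whatever is already pending, wait for the fence to run, and only then delete. All names below (pend_call(), service_task_step(), fence_cb(), and so on) are hypothetical. In a real application the fence could plausibly be a function pended with xTimerPendFunctionCall() that gives a semaphore the task blocks on, but that mapping is an assumption and is not exercised here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of a FIFO deferred-call queue. */
typedef void (*DeferredFn)(void *arg1, uint32_t arg2);

typedef struct {
    DeferredFn fn;
    void      *arg1;
    uint32_t   arg2;
} DeferredCmd;

#define QUEUE_LEN 8
static DeferredCmd queue[QUEUE_LEN];
static size_t      queue_head;
static size_t      queue_count;

/* Append a call to the back of the queue (FIFO order). */
static bool pend_call(DeferredFn fn, void *arg1, uint32_t arg2)
{
    if (queue_count == QUEUE_LEN) {
        return false;
    }
    queue[(queue_head + queue_count) % QUEUE_LEN] = (DeferredCmd){ fn, arg1, arg2 };
    queue_count++;
    return true;
}

/* One iteration of the "service task": run the oldest pending call. */
static bool service_task_step(void)
{
    if (queue_count == 0) {
        return false;
    }
    DeferredCmd cmd = queue[queue_head];
    queue_head = (queue_head + 1) % QUEUE_LEN;
    queue_count--;
    cmd.fn(cmd.arg1, cmd.arg2);
    return true;
}

typedef struct {
    bool     alive;
    uint32_t bits;
    unsigned stale_writes;  /* set-bits calls that ran after deletion */
} ModelEventGroup;

static void set_bits_cb(void *arg1, uint32_t bits)
{
    ModelEventGroup *eg = arg1;
    if (eg->alive) {
        eg->bits |= bits;
    } else {
        eg->stale_writes++;
    }
}

/* The fence only records that everything queued before it has run. */
static void fence_cb(void *arg1, uint32_t unused)
{
    (void)unused;
    *(bool *)arg1 = true;
}

/* Returns the number of stale writes seen after a fenced teardown. */
unsigned model_safe_teardown(void)
{
    static ModelEventGroup eg;
    eg = (ModelEventGroup){ .alive = true };
    queue_head = 0;
    queue_count = 0;

    /* The ISR signalled just before teardown started. */
    (void)pend_call(set_bits_cb, &eg, 0x01u);

    /* 1. Stop the interrupt source (modelled by queueing no more signals). */
    /* 2. Queue a fence behind everything that is already pending. */
    bool fence_done = false;
    (void)pend_call(fence_cb, &fence_done, 0u);

    /* 3. Wait for the fence. A real task would block here so the
     *    lower-priority service task can run; the model runs it inline. */
    while (!fence_done && service_task_step()) {
    }
    assert(fence_done);

    /* 4. Nothing queued can reference the object any more: delete it. */
    eg.alive = false;

    while (service_task_step()) {
    }
    return eg.stale_writes;
}
```

Because the queue is processed in order, the fence cannot run before the earlier set-bits request, so model_safe_teardown() sees no stale writes. The important part is step 1: without first stopping the interrupt source, a new signal could still be queued after the fence.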

Note, unless this task is the only task to be using the EventGroup, you also have the problem that any other task using it needs to handle the EventGroup going away.

My normal answer is that these sorts of primitives don’t get deleted and then recreated later, but just live for the life of the program. They are created before the scheduler starts and just iv

segonzal · July 28, 2023, 4:32am

I had already tested giving the Timer Task the higher priority, and it does solve the task priority problem. I will take these points into consideration so that I can adapt my architecture.

Thank you very much for your answers, they have helped a lot!