Releasing Resources on Task Exit/Delete

system · August 11, 2016, 3:23pm

herlien wrote on Thursday, August 11, 2016:

FreeRTOS v9.0.0 on PIC32MX using MPLabX v3.35 with XC32 v1.42. I.e., latest of everything, as far as I can tell.

I’m porting a data acquisition / instrument controller system from a home-grown OS on a 68332 to FreeRTOS on PIC32MX. Part of what I’m porting is a mechanism to release system resources when a task exits or is deleted. To do so, I have a porting layer that wraps functions for task create/delete, semaphore creation/give/take, malloc/free, etc. It stores information about the resources used in ThreadLocalStorage, and uses that information to:

release resources when the task exits or is deleted. E.g., frees memory, gives back MUTEX sems, etc.
for instrumentation and display purposes, to allow the user and/or developer (me) to observe system state.
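The bookkeeping described above can be sketched on its own, independent of the kernel. The following is a minimal host-side model, not the poster's actual porting layer: `TaskResources`, `res_note_take`, `res_note_give`, `res_pop`, and `MAX_HELD` are all hypothetical names, and mutex handles are represented as plain `void *` so the code compiles without FreeRTOS headers. In a real port, a pointer to the record would presumably live in one of the task's thread-local storage slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8 /* hypothetical per-task limit */

/* Hypothetical per-task record of the mutexes the task currently holds. */
typedef struct {
    void *held[MAX_HELD]; /* mutex handles, oldest first */
    size_t count;
} TaskResources;

/* Call after a successful take. Returns false if the table is full. */
bool res_note_take(TaskResources *r, void *mutex)
{
    assert(r != NULL && mutex != NULL);
    if (r->count == MAX_HELD) {
        return false;
    }
    r->held[r->count++] = mutex;
    return true;
}

/* Call after a successful give. Removes the most recent matching entry and
   returns false if the task has no record of holding this mutex. */
bool res_note_give(TaskResources *r, void *mutex)
{
    assert(r != NULL);
    for (size_t i = r->count; i > 0; i--) {
        if (r->held[i - 1] == mutex) {
            /* Close the gap while keeping the remaining entries in order. */
            for (size_t j = i; j < r->count; j++) {
                r->held[j - 1] = r->held[j];
            }
            r->count--;
            return true;
        }
    }
    return false;
}

/* Exit path: pops the most recently taken mutex, or NULL when none remain. */
void *res_pop(TaskResources *r)
{
    assert(r != NULL);
    return (r->count > 0) ? r->held[--r->count] : NULL;
}
```

On a voluntary exit, the task itself would loop on `res_pop()` and give each returned mutex from its own context, which is the case reported to work. Popping newest-first is a design choice here so that nested locks are released in reverse order of acquisition.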

The malloc/free works fine. Giving back semaphores works when the task voluntarily exits, since I catch that fact and release the semaphores from within that task’s context. The problem is when the task is externally deleted by another task. I fail an assertion at line 3783 in file tasks.c. From inspection, it appears that the assertion is checking that the caller owns the MUTEX. Questions:

Why an assertion? I could perhaps see checking this condition in the code and returning a failure for the xSemaphoreGive() call. But an assertion seems rather extreme.
Can someone please elucidate this situation, and come up with some suggestion for how I may accomplish my goal? That goal is to make sure that the MUTEX is released when the task dies.

I’ve now changed my code to NOT try to release semaphores in this case. From some cursory testing, it appears that the semaphore is released anyway. Can someone please confirm this?

However, in this case (semaphore had been taken by a task that subsequently exits), my system status display routine gets and displays a garbage string when using
pcTaskGetName(xSemaphoreGetMutexHolder(mySemaphore))

Again, could someone verify and advise?
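A plausible explanation for the garbage string is that the mutex still records the handle of the deleted task, so the name lookup reads memory that has already been freed. One way a status display could guard against this is to check the holder against a table of live tasks maintained by the task create/delete wrappers. The sketch below is a host-side model under that assumption: `LiveTaskTable`, `live_add`, `live_remove`, `holder_name`, and `MAX_TASKS` are hypothetical, and handles are plain pointers rather than real task handles.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 16 /* hypothetical table size */

typedef struct {
    const void *handle; /* stands in for a task handle; NULL = empty slot */
    const char *name;
} LiveTask;

typedef struct {
    LiveTask slots[MAX_TASKS];
} LiveTaskTable;

/* Called by the task-create wrapper. Returns 1 on success, 0 if full. */
int live_add(LiveTaskTable *t, const void *handle, const char *name)
{
    assert(t != NULL && handle != NULL);
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (t->slots[i].handle == NULL) {
            t->slots[i].handle = handle;
            t->slots[i].name = name;
            return 1;
        }
    }
    return 0;
}

/* Called by the task-delete wrapper before the task is actually deleted. */
void live_remove(LiveTaskTable *t, const void *handle)
{
    assert(t != NULL);
    if (handle == NULL) {
        return;
    }
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (t->slots[i].handle == handle) {
            t->slots[i].handle = NULL;
            t->slots[i].name = NULL;
        }
    }
}

/* Display helper: never dereferences the holder handle itself. */
const char *holder_name(const LiveTaskTable *t, const void *holder)
{
    assert(t != NULL);
    if (holder == NULL) {
        return "(free)";
    }
    for (size_t i = 0; i < MAX_TASKS; i++) {
        if (t->slots[i].handle == holder) {
            return t->slots[i].name;
        }
    }
    return "(deleted task)";
}
```

In the real system, the value passed as `holder` would be whatever xSemaphoreGetMutexHolder(mySemaphore) returns. One caveat of this approach is that a later task could be allocated at the same address as the deleted one, in which case the table would report the new task as the holder.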

Thank you very much for your help.

rtel · August 12, 2016, 7:43am

rtel wrote on Friday, August 12, 2016:

If a semaphore is held by a task, and that task gets deleted, then there is nothing in the code that will automatically release the semaphore.

Looking at the source code I don’t think there is an easy way around this. The mutex can be reset by passing its handle into xQueueReset(), but then to make it a mutex rather than a queue you would need to call prvInitialiseMutex() too - and that function is not publicly accessible.

Perhaps, if you are 100% sure there are no other tasks blocked on the mutex, it could be deleted then re-created?

system · August 16, 2016, 10:10pm

herlien wrote on Tuesday, August 16, 2016:

Thank you. In general, there is no way to assure that no other tasks are blocked on the mutex. Indeed, the most common (and useful) scenario is:

User notices that the system appears hung
Using inspection routines, user ascertains that many threads are pending on a particular mutex, and determines which thread owns that mutex
User kills the thread that owns the mutex.

In the existing system, killing the mutex owner releases the semaphore, allowing the waiting threads to acquire it and run, in order. Unless I can resolve this dilemma, it appears there will be no recourse other than to reboot the system.

I should add that this typically takes place over a comms link to a system deployed at sea. In some cases, it’s over a cabled link to a system some hundreds or thousands of meters under the sea.

rtel · August 17, 2016, 8:52am

rtel wrote on Wednesday, August 17, 2016:

Yikes. It sounds like you have a deadlock built into your system, which is deployed in an inaccessible place. Could you re-architect the system so the deadlock is avoided in the first place?

system · August 17, 2016, 8:01pm

herlien wrote on Wednesday, August 17, 2016:

It’s not that, so much. It’s that periodically someone will integrate a new instrument handler that’s not well behaved. Hasn’t happened in a long time, as our instrument suite is relatively stable for most deployment scenarios. But it gives me a warm fuzzy feeling to know that, if a scientist decides to add a new strange instrument, or worse yet, a homegrown instrument where the instrument itself is not stable, then we have mechanisms in place in case things go south. Of course, the optimal solution is to always thoroughly debug and test any new instrument handlers. But the scientists don’t always understand software engineering principles.

richard-damon · August 18, 2016, 12:27am

richard_damon wrote on Thursday, August 18, 2016:

The issue is we have tasks, which are, in many ways, more like threads than processes. On a ‘big’ system, where you have a number of independent processes, each protected from the others and sharing information only via OS-provided connections, it is standard for the OS to keep track of resources allocated to a process and automatically free them when the process terminates.

With tasks, as with threads, there isn’t a strong wall between them, so things are shared in a much more ad hoc manner, and defining ‘ownership’ really can be difficult. It is quite possible for one task to create something and give it to another, and if the thing went away just because the first task died it would cause a lot of trouble. Mutexes are a bit special here, as when a task takes one, it owns the mutex, but FreeRTOS doesn’t keep a central repository listing all the mutexes currently in the system.

I find that, in an environment like FreeRTOS, the ‘random’ aborting of a task is generally a bad idea. If something has gone wrong, you really need to reboot to fix things, as you have no idea what else might be in a ‘bad’ state.

system · August 18, 2016, 4:44pm

herlien wrote on Thursday, August 18, 2016:

Thank you Richard. I don’t disagree with anything you said. I do have a bit of a problem with one particular implementation decision in FreeRTOS. I believe (and I wrote my original post partially to get confirmation, as I can’t be sure) that somewhere in the software stack executed with xSemaphoreGive(), it uses an assertion to ensure that the caller actually owns the mutex. Given your statements about fluid ownership, I would prefer the OS to simply allow the mutex give to take place. But as I said, I may be misinterpreting what’s happening under the hood.

I also agree that killing a task is generally a bad idea. But when something of this nature occurs, it may be desirable to salvage whatever you can of the deployment (i.e. try to accomplish the scientific goals), and then recover the equipment for post mortem and further testing. Especially since ship time typically costs on the order of $30K/day, with a day each required for deployment and recovery.

tlafleur · August 18, 2016, 5:42pm

tlafleur wrote on Thursday, August 18, 2016:

Hi Bob… It’s been a long time since the Kildall-DRI-CP/M days…
I’ve been using FreeRTOS for over 10 years now in my projects…

tom [at] lafleur (dot) us

richard-damon · August 19, 2016, 1:48am

richard_damon wrote on Friday, August 19, 2016:

Mutexes check that the giver is the same task that took the mutex. The issue with trying to automatically give any mutex that was taken is that FreeRTOS has no list of mutexes to check to see if the task has a hold on any of them. My understanding is that there are technical reasons relating to possible priority inheritance that make the ownership check necessary (the task holding the mutex might have had its priority raised if a higher priority task is waiting on the mutex).
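The link between the ownership check and priority inheritance can be illustrated with a toy model. This is not FreeRTOS source: `Task`, `Mutex`, `mutex_take`, and `mutex_give` are hypothetical, and the model returns a failure for a give by a non-owner (the behavior herlien said he would have expected) rather than asserting. The point it tries to show is that a give has to undo any priority boost applied to the holder, which only makes sense when the giver is the holder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int base_priority; /* priority assigned at creation */
    int priority;      /* current, possibly inherited, priority */
} Task;

typedef struct {
    Task *holder; /* NULL when the mutex is free */
} Mutex;

/* Returns true if the caller now holds the mutex. If it is already held,
   the holder inherits the caller's priority when that is higher; a real
   kernel would then block the caller instead of returning. */
bool mutex_take(Mutex *m, Task *caller)
{
    assert(m != NULL && caller != NULL);
    if (m->holder == NULL) {
        m->holder = caller;
        return true;
    }
    if (m->holder->priority < caller->priority) {
        m->holder->priority = caller->priority;
    }
    return false;
}

/* Only the holder may give. Giving restores the holder's base priority,
   undoing any inheritance, and then frees the mutex. */
bool mutex_give(Mutex *m, Task *caller)
{
    assert(m != NULL && caller != NULL);
    if (m->holder != caller) {
        return false; /* model choice: report failure instead of asserting */
    }
    caller->priority = caller->base_priority;
    m->holder = NULL;
    return true;
}
```

In this model, if a low-priority holder is boosted by a waiting high-priority task and is then deleted from outside, nothing ever runs `mutex_give()` on its behalf, so the mutex stays held, which matches rtel's earlier statement that nothing releases it automatically.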
