Making sure a deleted task's OS/clib resources have been freed

I know this has been asked before but I’ve been unable to find an answer. Hopefully someone can put me right quickly.

When a task deletes itself by calling vTaskDelete(NULL) and returning, I am aware that the idle task needs to run to clean up any OS resources the task occupied. I have been calling vTaskDelay() from the parent task with a nice long time in it (seconds), with no other tasks running, to let the idle task do the clean-up. However, I am still seeing memory being leaked.

I believe what is happening is that memory allocated due to library calls (newlib) made from the task is not being freed. For instance, the first call to malloc() from the new task, say for just four bytes of storage, leads to the heap being reduced by something like 1400 bytes, when only four bytes are recovered by the accompanying free(). Subsequent calls to malloc() from the new task behave normally. I see similar heap loss with other library functions called from the new task that require re-entrancy protection (e.g. strtok()).

Obviously the RTOS needs to handle C library re-entrancy (configUSE_NEWLIB_REENTRANT is 1) but what I don’t understand is why I am unable to recover that memory when the task is deleted. Might I be missing a configuration option somewhere? Is my scheme of calling vTaskDelay() actually not working at all and I should be doing something else? Or am I expecting too much?

This is on an STM32F4 processor incorporating the Dave Nadler _sbrk() implementation with FreeRTOS v10.2.1.

If the task has allocated any memory, or the library allocated memory for that task, something needs to free that memory. FreeRTOS itself has no knowledge of any memory that the task has allocated for itself, so such memory will not be freed by FreeRTOS. FreeRTOS knows enough to free up the structures that IT allocated when setting up the task. If newlib, while running the task, made memory allocations for task buffers, then it is the responsibility of newlib, maybe with the help of the task, to free those allocations. You may need to step through that initial call to malloc to see what new buffers are being set up and whether newlib provides a mechanism to release those.

Yes, obviously, all malloc() calls have accompanying free() calls, there is no user-driven malloc() left hanging in this scenario. I will try a simple test task and post the result here.

First, unless you are using an old version of FreeRTOS, you do not need to let the idle task run for the resources allocated by the scheduler to a task (the task control block and stack) to be freed when a task is deleted, unless the task deletes itself.

To the point of your question: Can you put a break point on the memory allocator’s internals to see what is allocating the memory that is not getting freed? I have seen unexpected things allocate memory (printf(), rand(), etc.), but in this case I suspect it will be the C run time allocating something for itself.

Seems you create your tasks with dynamic stack/TCB allocation and I think it could be caused by heap fragmentation effects.
If, for instance, the 1st task had a stack of 1k but the 2nd task needs 1.1k, then a new block from _sbrk might be taken even if the 1st task was already cleaned up by the idle task: its 1k stack block would be free to be (re-)used, but it’s simply too small.
I guess that’s what you see when checking the heap after the 4 byte allocation of the 2nd task, which is surely not taking 1400 bytes itself.
However, even if there are a few surprising (raw) memory allocations from _sbrk at the beginning, chances are that this will settle during runtime. This depends on your application’s allocation behavior and also on the heap implementation used, which usually tries to merge freed blocks into larger free blocks, with more or less effort, to mitigate heap fragmentation.
Try to avoid (legacy) stateful libc functions due to reentrancy and internal memory allocation issues. Sometimes there are better versions like strtok_r.
You might think about creating your task statically i.e. with stack and TCB buffer provided by the caller to avoid workarounds like waiting (and praying) that the idle task has freed up the task resources of deleted tasks.
Yes, there are some pitfalls with dynamic task creation/deletion related to resource management and task synchronization. Hence it’s often recommended to design and use a fixed set of tasks (if possible).
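
For illustration, the strtok_r suggestion above could look something like the sketch below. The helper name pair_fields and the "key=val;" output format are made up for this example. The point is only that each strtok_r() scan keeps its position in a pointer owned by the caller, so nothing has to be stored in hidden per-task library state.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r() on strict toolchains */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: walk two comma-separated strings in lock-step and
 * write "key=val;" pairs into out. Each scan keeps its position in a
 * caller-owned pointer, so the two scans do not disturb each other
 * (strtok() has a single hidden state and could not be interleaved like
 * this). Both input strings are modified in place.
 * Returns the number of pairs written.
 */
static int pair_fields(char *keys, char *vals, char *out, size_t out_size)
{
    char *key_state = NULL;
    char *val_state = NULL;
    size_t used = 0;
    int pairs = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    char *k = strtok_r(keys, ",", &key_state);
    char *v = strtok_r(vals, ",", &val_state);
    while (k != NULL && v != NULL) {
        int n = snprintf(out + used, out_size - used, "%s=%s;", k, v);
        if (n < 0 || (size_t) n >= out_size - used) {
            out[used] = '\0';   /* drop the truncated pair */
            break;              /* output buffer is full */
        }
        used += (size_t) n;
        pairs++;
        k = strtok_r(NULL, ",", &key_state);
        v = strtok_r(NULL, ",", &val_state);
    }
    return pairs;
}
```

Called with the writable strings "red,green,blue" and "1,2,3" and a 64 byte output buffer, this returns 3 and leaves "red=1;green=2;blue=3;" in the buffer.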

It’s busy in here! Thanks for all the suggestions guys. First, here’s a simple test example to show what I mean:

// A test task launched by the application task
static void testTask(void *pParameters)
{
    size_t x;
    void *pMem;

    x = xPortGetFreeHeapSize();
    printf("*** testTask started, %u byte(s) of heap free.\n", (unsigned) x);

    printf("malloc()ing 1 byte...\n");
    pMem = malloc(1);
    printf("%u byte(s) of heap now free (%u byte(s) consumed).\n",
           (unsigned) xPortGetFreeHeapSize(),
           (unsigned) (x - xPortGetFreeHeapSize()));

    printf("free()ing that byte again...\n");
    free(pMem);
    printf("%u byte(s) of heap now free (still %u byte(s) consumed).\n",
           (unsigned) xPortGetFreeHeapSize(),
           (unsigned) (x - xPortGetFreeHeapSize()));

    printf("testTask ending...\n");

    vTaskDelete(NULL);
}

// The application task
static void appTask(void *pParam)
{
    TaskHandle_t handle;

    printf("*** appTask started, %u byte(s) of heap free.\n",
           (unsigned) xPortGetFreeHeapSize());

    if (xTaskCreate(testTask, NULL,
                    256, NULL, 5, &handle) == pdPASS) {
        printf("Task is running, waiting for 1 second...\n");
        vTaskDelay(1000 / portTICK_PERIOD_MS);
    }

    printf("appTask completed, %u byte(s) of heap now free.\n",
           (unsigned) xPortGetFreeHeapSize());

    while (1) {}
}

// Entry point
int main(void)
{
    TaskHandle_t handle;

    // Reset all peripherals, initialize the Flash interface and the Systick
    HAL_Init();

    // Configure the system clock
    systemClockConfig();

    printf("\n\nStarting the RTOS to run appTask...\n");
    if (xTaskCreate(appTask, NULL, 256, NULL, 2, &handle) == pdPASS) {
        // Start the scheduler.
        vTaskStartScheduler();
    }

    // Should never get here
    assert(false);

    return 0;
}

This results in the following output:

Starting the RTOS to run appTask...
*** appTask started, 113020 byte(s) of heap free.
*** testTask started, 110324 byte(s) of heap free.
malloc()ing 1 byte...
108844 byte(s) of heap now free (1480 byte(s) consumed).
free()ing that byte again...
108856 byte(s) of heap now free (still 1468 byte(s) consumed).
testTask ending...
Task is running, waiting for 1 second...
appTask completed, 110084 byte(s) of heap now free.

I’ll try putting some break points in next.

On the specifics: good to know I don’t have to let the idle task run to clean up resources, that’s useful. On the fragmentation question, I’d assumed that newlibs mallinfo, specifically fordblks, was literally that, the free blocks. Whether they are useable or not due to fragmentation is another matter, they would still be reported as free would they not?

The implementation of xPortGetFreeHeapSize() is just the one from Dave Nadler’s heap code.

First, since you are doing a vTaskDelete(NULL) you will need to let idle run to reclaim, but in my mind that should be happening in any well designed system. Any task that does not block for SOMETHING periodically, needs to be at task priority 0 to avoid task starvation.

Second, my concern is that newlib, in the malloc call, decided to allocate some internal buffers for the task. (One issue is that there are lots of variations of newlib configurations, and I don’t know which one is being used here.) The real test is to step through that call to malloc and see what it is doing, and what it is allocating.

Yes, you’re quite right: with a break-point on _sbrk_r() (see call list below) I can see that it is the first call to printf() from a new task which is malloc()ating 1468 bytes of memory and never giving it back. I had kind of assumed that the re-entrancy protection in FreeRTOS helped with this but I guess it only stores pointers for newlib and there’s nothing to tell newlib it’s time to clean up. Bugger. Not sure what to do now. Could switch to using blah_r() but would that actually help? If printf() needs 1468 bytes it’s still going to allocate them… This is the newlib nano that is provided by ST in their STM32F4 FW.

EDIT: looking at what I believe to be the source code for the _reclaim_reent() which FreeRTOS correctly calls, the last comment in the function is “/* Malloc memory not reclaimed; no good way to return memory anyway. */”. Like free() is not a good way…?

- in main() by _puts_r() with reent = 0x200006f0 asking for 436 bytes: this is the first printf().
- in main() by _puts_r() with reent = 0x200006f0 asking for 1032 bytes: this is also the first printf().
- in main() by xTaskCreate() with reent = 0x200006f0 asking for 1032 bytes: creating appTask.
- in main() by xTaskCreate() with reent = 0x200006f0 asking for 196 bytes: TCB for appTask?
- in main() by vTaskStartScheduler() with reent = 0x200006f0 asking for 520 bytes: kicking off FreeRTOS.
- in main() by vTaskStartScheduler() with reent = 0x200006f0 asking for 196 bytes: TCB for idle task?.

- in appTask() by _mallinfo_r() with reent = 0x2000414c asking for zero bytes: xPortGetFreeHeapSize(). 
- in appTask() by printf() with reent = 0x2000414c asking for 436 bytes: me printing the heap size.
- in appTask() by printf() with reent = 0x2000414c asking for 1032 bytes: also me printing the heap size.
- in appTask() by xTaskCreate() with reent = 0x2000414c asking for 1032 bytes: creating testTask.
- in appTask() by xTaskCreate() with reent = 0x2000414c asking for 196 bytes: TCB for testTask?

- in testTask() by _mallinfo_r() with reent = 0x20004ea4 asking for zero bytes: xPortGetFreeHeapSize(). 
- in testTask() by printf() with reent = 0x20004ea4 asking for 436 bytes: me printing the heap size.
- in testTask() by printf() with reent = 0x20004ea4 asking for 1032 bytes: also me printing the heap size.
- at last, in testTask() by _malloc_r() with reent = 0x20004ea4 asking for 12 bytes: my malloc() plus overhead.

That may be a question to ask on some newlib support forum. The _reclaim_reent() is supposed to return all the memory newlib used. I think the ‘Malloc memory’ the comment is referring to is general memory allocated by malloc in the task (which well might not want to be used).

One thing to check is whether the allocation wasn’t a ‘task specific’ buffer, but was instead something like allocating space for stdout, and thus is a one-time cost for the whole program, and not a leakage per task. As I mentioned, I would try stepping through that first call through its setup and see what is allocated and what it is used for (and compare that to the newlib code to figure out what options your system was built with).

Yes, thanks for all your help, it’s certainly not anything to do with FreeRTOS whatever it is. The code in question, in this particular case, is test code which I am, of course, using to check for memory leaks. I don’t think any of our actual operational code suffers from this issue, thankfully.

So for now I can take account of it in the memory leak testing and that is sufficient. Have a good [rest of] your Saturday night :-).

This looks like a FreeRTOS bug, because of a misunderstanding about newlib behavior. Here’s what I think is happening when FreeRTOS newlib reentrancy is enabled:

  1. newlib often references the global _impure_ptr to access the reentrancy structure for the current context. newlib starts with _impure_ptr pointing to an initial reentrancy structure. In a single-threaded environment that’s all that is ever used.

  2. FreeRTOS does not re-use that initial reent structure??

  3. FreeRTOS allocates a new reentrancy structure each time a new task is created (actually allocated as part of TCB), and sets _impure_ptr on each context switch.

OK so far. Unfortunately:

  1. On task deletion, FreeRTOS calls _reclaim_reent() - which seems appropriate. Note _reclaim_reent() frees any allocated reentrancy sub-structures allocated, using pointers within reentrancy struct. Unfortunately the first thing newlib does, trying to protect itself, is ignore the request if _impure_ptr still points to the reentrancy structure to be reclaimed. First _reclaim_reent() line is:

    if (ptr != _impure_ptr)

Sooo… I think what FreeRTOS should do is set _impure_ptr to the initial static pool prior to requesting reclamation. Setting it to 0 would be a bit scary.
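
To make the hypothesis above concrete, here is a toy model of the guard, written for illustration only. It is not newlib or FreeRTOS code: struct toy_reent, toy_impure_ptr, toy_reclaim() and toy_self_delete() are all invented names, and the single buf field stands in for whatever sub-structures the real reentrancy structure owns. Whether FreeRTOS really reaches the reclaim call with _impure_ptr still pointing at the dying task is a separate question that this sketch does not answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for newlib's reentrancy structure. */
struct toy_reent {
    void *buf;   /* lazily allocated on the first "library call" */
};

static struct toy_reent toy_initial;                     /* plays the initial static structure */
static struct toy_reent *toy_impure_ptr = &toy_initial;  /* plays _impure_ptr */

/*
 * Mimics the guard quoted above: the structure is only cleaned up if it
 * is NOT the one currently in use. Returns true if memory was freed.
 */
static bool toy_reclaim(struct toy_reent *ptr)
{
    if (ptr != toy_impure_ptr) {
        free(ptr->buf);
        ptr->buf = NULL;
        return true;
    }
    return false;
}

/*
 * Models a task deleting itself, i.e. reclaim is requested while the
 * task's own structure is still the current one. With repoint_first set,
 * the current pointer is moved back to the initial structure before the
 * reclaim request, as proposed above.
 */
static bool toy_self_delete(struct toy_reent *task_reent, bool repoint_first)
{
    toy_impure_ptr = task_reent;            /* "context switch" into the task */
    if (task_reent->buf == NULL) {
        task_reent->buf = malloc(16);       /* first library call allocates */
    }
    if (repoint_first) {
        toy_impure_ptr = &toy_initial;      /* the proposed change */
    }
    bool freed = toy_reclaim(task_reent);
    toy_impure_ptr = &toy_initial;          /* leave the model in a known state */
    return freed;
}
```

In this model, toy_self_delete(&r, false) returns false and leaves r.buf allocated, while toy_self_delete(&r, true) returns true and frees it.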

@rtel Richard?

Hope that helps (and hope I’m understanding the code correctly),
Thanks,
Best Regards, Dave

PS: Let me know if you agree and want me to attempt a patch and pull request…

PPS: @RobMeades - You can check this by setting _impure_ptr to 0 immediately prior to the reclaim call in tasks.c …

Actually, it sounds like it is only a problem for self-deletion, as that is the case where ptr will == _impure_ptr. In that case we WILL switch to a new task, and the starting of that task will set _impure_ptr, so we should be safe. Setting it to zero for the case of self-deletion only should therefore be safe, and the 0 is a good trap on the many machines where address 0 is read-only memory.

@richard-damon Yup, Thanks!

Actually it looks like self deletion is already covered correctly here.

So a task never calls prvDeleteTCB() for itself, right?

The code he posted ONLY had self deletion.

One thought I had was: does printf do something to ‘open’ stdout, and does that take some fixed resources?

That idea makes sense to me. To test for a leak then, we’d have to repeat the process of task create, task run, task delete, multiple times. The first iteration might indicate a leak, but subsequent iterations would hopefully indicate no (further) leak. @RobMeades that test would be worth doing.
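
As a rough sketch of how the readings from such a repeated test could be judged: the helper below is hypothetical (the name classify_heap and the three categories are invented here) and it assumes the free-heap figure is recorded once before the first cycle and once after each create/run/delete cycle, for example from xPortGetFreeHeapSize() on the target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a series of free-heap readings. */
enum heap_trend {
    HEAP_STABLE,         /* nothing lost at all */
    HEAP_ONE_TIME_COST,  /* lost once on the first cycle, then flat */
    HEAP_LEAK_PER_RUN    /* free heap keeps shrinking on later cycles */
};

/*
 * baseline      free heap before the first create/run/delete cycle
 * free_after[i] free heap recorded after cycle i has completed
 * runs          number of cycles recorded (at least 2, otherwise a
 *               one-time cost cannot be told apart from a leak)
 */
static enum heap_trend classify_heap(size_t baseline,
                                     const size_t *free_after,
                                     size_t runs)
{
    assert(free_after != NULL && runs >= 2);

    /* Any drop between consecutive later cycles counts as a leak. */
    for (size_t i = 1; i < runs; i++) {
        if (free_after[i] < free_after[i - 1]) {
            return HEAP_LEAK_PER_RUN;
        }
    }

    /* Flat after the first cycle: either a one-time cost or no loss. */
    return (free_after[0] < baseline) ? HEAP_ONE_TIME_COST : HEAP_STABLE;
}
```

With made-up numbers, a baseline of 1000 followed by readings of 900, 900, 900 would be classed as a one-time cost, whereas 900, 800, 700 would be classed as a per-run leak.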

I hacked _impure_ptr to 0 in tasks.c:

#if ( configUSE_NEWLIB_REENTRANT == 1 )
{
    _impure_ptr = 0;
    _reclaim_reent( &( pxTCB->xNewLib_reent ) );
}

…and re-ran the above test code but, unfortunately, the outcome was the same, still 1468 bytes consumed.

I don’t fully understand the “delete task or not” discussion, so I tried removing vTaskDelete(NULL) from the end of testTask and of course I just hit the assert in prvTaskExitError().

Then, to @JeffTenney’s point, I cut and pasted another creation of testTask into the same test code and the result is below. Just to entertain us, remembering that printf() appeared to make two allocations, 436 and 1032 bytes, it looks like the allocation of 436 happens once, when printf() is called in the first testTask created by appTask (noting that this is NOT the first time printf() has been called; appTask calls it first) while the allocation of 1032 happens the first time printf() is called in both of the testTasks created by appTask. Happy to try out any more suggestions, though I’m going sleepy-byes for around 8 hours now.

Starting the RTOS to run appTask...
*** appTask started, 113012 byte(s) of heap free.
*** testTask started, 110316 byte(s) of heap free.
malloc()ing 1 byte...
108836 byte(s) of heap now free (1480 byte(s) consumed).
free()ing that byte again...
108848 byte(s) of heap now free (still 1468 byte(s) consumed).
testTask ending...
Task is running, waiting for 1 second...
*** Doing it again, 110076 byte(s) of heap now free.
*** testTask started, 108848 byte(s) of heap free.
malloc()ing 1 byte...
107804 byte(s) of heap now free (1044 byte(s) consumed).
free()ing that byte again...
107816 byte(s) of heap now free (still 1032 byte(s) consumed).
testTask ending...

Right, an IO control structure is allocated, as I found out here: newlib and FreeRTOS
I thought it would be cleaned up in _reclaim_reent but perhaps it’s not doing what I expected. The newlib code is a bit confusing, with a thicket of conditional macros to accommodate different newlib build options.

@RobMeades - could you post your newlib.h and _newlib_version.h files?

Sorry I don’t see what’s happening…
Thanks,
Best Regards, Dave

The I/O control structures aren’t task specific, but I think what is being allocated is the output buffer for stdout, which I think is shared by all threads.

Certainly: files attached (from STM32Cube 1.4.0, matching the newlib libraries from there hopefully):

newlib.zip (2.1 KB)