Heap, stack, printf and malloc failure

zetter wrote on Wednesday, August 15, 2018:

Sorry for the somewhat vague subject, but I’m not really sure where our problems originate. It is probably a lack of understanding of how FreeRTOS allocates memory.

We’re rewriting a project to run on FreeRTOS. The original was based on an Atxmega256 without any OS, where we basically wrote everything ourselves. We’re currently having some trouble with memory allocation. Every now and then vApplicationMallocFailedHook is called, and we’re not really sure why, since we should have a lot of free memory. Maybe someone with a better understanding of FreeRTOS can shed some light on this.

First of all, we’re running the code on an ATSAM4E16E. The project uses FreeRTOS 10 with FreeRTOS+TCP and FreeRTOS+FAT. The current code samples the internal thermometer of the ATSAM every 100 ms and writes the sample to a text file on an SD card. A parallel task reads from the same file (access is guarded with a binary semaphore) and uploads the data to a server via Ethernet. A third task runs a small web server. We based the web server code on the sample code available with FreeRTOS+TCP but rewrote it to serve pages located in predefined strings instead of reading files from the flash. The idea is that the production code will also feature the web server, and we don’t want to write HTML files to an external flash memory in production. This of course leads to a lot of strings defined in memory, but the pages served are very small compared to the amount of SRAM available.

So, first of all, the ATSAM has 128 Kbytes of embedded SRAM. Since almost all code runs inside FreeRTOS tasks, we’ve tried to allocate as much memory as possible to the FreeRTOS heap. However, the linker complains that we’ve run out of RAM during the build if we set configTOTAL_HEAP_SIZE to more than 96x1024 (96 kB). It doesn’t really add up that the code outside the FreeRTOS tasks would require that much memory. Have we missed something? Shouldn’t we be able to set it to close to 128x1024?

Also, we’ve assigned a total of 15 kB of stack (4 x the sum of all stack depths given to the tasks in xTaskCreate). But when we poll xPortGetFreeHeapSize(), it’s typically at 40 kB. Shouldn’t it be around 80 kB? xPortGetMinimumEverFreeHeapSize() is often down to about 20 kB. Where’s all the heap going? Shouldn’t the sum of all available stack be the maximum amount of memory consumed by the tasks? We’re using heap_4.

For the individual tasks, the usStackHighWaterMark shows reasonable values with quite low memory usage.

So, to the real problem. Apart from the desire to understand the memory allocation better, we’re having some crashes. Every now and then vApplicationMallocFailedHook is called. It seems to occur more often when the system is stressed (reloading the web pages fast while doing a lot of other stuff), but sometimes it just occurs after a while of normal operation. At first, the malloc failures were quite frequent, but we recently realized the IDLE task didn’t get to run often enough and therefore didn’t have time to free memory from other tasks. We’ve since decreased the priority of almost all tasks to the lowest priority so that IDLE can run more. This reduced the problem but didn’t make it go away. If a task is running out of memory, shouldn’t vApplicationStackOverflowHook be called for that task? The stack overflow detection is working and has been triggered every now and then during development when we’ve allocated too little stack to a task.

Could the problems be related to printf and snprintf? We’ve implemented our own logger so we can call Logger_info(format, …) and Logger_debug(format, …) in the code. The calls are defined as a function that adds __FUNCTION__, __FILE__, __LINE__ and the RTC time to the debug string via vprintf. It’s used quite extensively in the code.
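To make the logger description above concrete, here is a minimal sketch of such a wrapper. It is not the poster's code: `logger_format` and `LOGGER_FORMAT` are hypothetical names, the RTC timestamp is left out, and it formats into a caller-supplied fixed buffer with vsnprintf rather than printing with vprintf. The point it illustrates is that the formatting buffer, plus whatever the C library's printf family needs internally, lives on the calling task's stack.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not from the thread): writes
 * "[file:line func] message" into a caller-supplied buffer.
 * Output is truncated to buf_len - 1 characters plus the terminator.
 * Returns the length the full message would have had, or a negative
 * value on a formatting error. */
int logger_format(char *buf, size_t buf_len,
                  const char *file, int line, const char *func,
                  const char *fmt, ...)
{
    int n = snprintf(buf, buf_len, "[%s:%d %s] ", file, line, func);
    if (n < 0 || (size_t)n >= buf_len) {
        return n; /* error, or the prefix alone already filled the buffer */
    }

    va_list args;
    va_start(args, fmt);
    int m = vsnprintf(buf + n, buf_len - (size_t)n, fmt, args);
    va_end(args);

    return (m < 0) ? m : n + m;
}

/* Hypothetical convenience macro that fills in the call-site location. */
#define LOGGER_FORMAT(buf, len, ...) \
    logger_format((buf), (len), __FILE__, __LINE__, __func__, __VA_ARGS__)
```

A call such as `LOGGER_FORMAT(line_buf, sizeof line_buf, "free heap %d", value)` keeps memory use bounded by the size of `line_buf`, which makes it easier to budget the stack of every task that logs.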

The other suspect is snprintf, which we use a lot for building the HTML pages. The HTML pages are defined as strings, e.g. “free heap%d”, which are used in functions that add HTML code with values from various parts of the system to a buffer with snprintf. The buffer is then streamed in chunks to the HTTP client via Ethernet.

Have we misunderstood something in the way FreeRTOS handles memory?

rtel wrote on Wednesday, August 15, 2018:

Which heap memory implementation are you using for FreeRTOS? heap_1.c
to heap_5.c are available in the download.
https://www.freertos.org/a00111.html

Which buffer allocation scheme are you using for the TCP stack?
https://www.freertos.org/FreeRTOS-Plus/FreeRTOS_Plus_TCP/Embedded_Ethernet_Buffer_Management.html

However, the linker
complains that we’ve run out of ram during build if we set
configTOTAL_HEAP_SIZE to more than 96x1024 (96 kB).

You can check the map file output by the compiler to see where the other
RAM is being allocated.

Also, we’ve assigned a total of 15 kB of stack (4 x the sum of all
stack depths given to the tasks in xTaskCreate). But when we poll
xPortGetFreeHeapSize(), it’s typically at 40 kB. Shouldn’t it be around 80
kB? xPortGetMinimumEverFreeHeapSize() is often down to about 20 kB.
Where’s all the heap going? Shouldn’t the sum of all available stack be
the maximum amount of memory consumed by the tasks? We’re using heap_4.

Remember stacks are specified in words, not bytes, so if you set the
stack size to 1K when calling xTaskCreate() on a SAM you will get 4K
bytes of stack as the word size is 4 bytes.
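The word-versus-byte point above can be sketched as a small calculation. This is only an illustration: `stack_word_t` is a stand-in for FreeRTOS's StackType_t, assumed here to be 4 bytes as on a 32-bit Cortex-M port, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for StackType_t; assumed to be 32 bits wide as on a SAM4E. */
typedef uint32_t stack_word_t;

/* Bytes of stack obtained for a task created with the given depth in words. */
size_t stack_bytes(size_t depth_words)
{
    return depth_words * sizeof(stack_word_t);
}

/* Total stack bytes for a set of tasks, given their depths in words. */
size_t total_stack_bytes(const size_t *depths_words, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += stack_bytes(depths_words[i]);
    }
    return total;
}
```

With these assumptions, a depth of 1024 words corresponds to 4096 bytes. When tasks are created dynamically, each one also takes a task control block from the same heap, so the sum of the stacks is a lower bound on what the tasks consume rather than the full figure.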

So, to the real problem. Apart from the desire to understand the memory
allocation better, we’re having some crashes. Every now and then the
vApplicationMallocFailedHook is called. It seems to occur more often
when the system is stressed (reload the web pages fast while doing a lot
of other stuff), but sometimes it will just occur after a while of
normal operation. At first, the malloc failures were quite frequent, but
we recently realized the IDLE task didn’t have time to run often enough
and therefore didn’t have time to free memory from other tasks. We’ve
since decreased the priority of almost all tasks to the lowest priority
so that IDLE can run more. This decreased the problem but didn’t make it
go away.

In FreeRTOS V10 that should only make a difference if a task deletes
itself. If a task deletes another task then the memory for the deleted
task will get freed immediately. Do you have tasks deleting themselves?

Could the problems be related to printf and snprintf? We’ve implemented
our own logger so we can call Logger_info(format, …) and
Logger_debug(format, …) in the code. The calls are defined as a
function that adds __FUNCTION__, __FILE__, __LINE__ and the RTC time to the
debug string via vprintf. It’s used quite extensively in the code.

Potentially. Which printf() implementation are you using? The one that
comes with Atmel studio? You might find printf() calls malloc(), but if
you are using any heap allocation scheme other than heap_3.c you would
normally set the size of the heap allocated by the linker to 0 as it
isn’t used - hence something calling the standard C malloc internally
can be a problem. In general, printf() like functions in GCC can use a
LOT of stack, but I don’t think they would normally use much heap.

zetter wrote on Wednesday, August 15, 2018:

Richard Barry, thanks for the quick response!

Which heap memory implementation are you using for FreeRTOS? heap_1.c to heap_5.c are available in the download.
We’re using heap_4.c.

Which buffer allocation scheme are you using for the TCP stack?
BufferAllocation_2.c. We haven’t really thought too much about the buffer allocation but rather used the same one as in the demo without looking closely at it. Now, reading about it, the documentation states: “The TCP/IP stack will recover from a failed attempt to allocate a network buffer, however, as the standard heap implementation is used such a failure will result in the malloc failed hook being called (if configUSE_MALLOC_FAILED_HOOK is set to 1 in FreeRTOSConfig.h).” That might be the issue. However, I can’t see why it would consume that much heap.

Remember stacks are specified in words, not bytes, so if you set the
stack size to 1K when calling xTaskCreate() on a SAM you will get 4K
bytes of stack as the word size is 4 bytes.

Yes, we multiply the stack size by sizeof(portSTACK_TYPE) when calculating the number of bytes allocated in total to all tasks. It adds up to 15 kB, so something else is consuming the heap.

In FreeRTOS V10 that should only make a difference if a task deletes
itself. If a task deletes another task then the memory for the deleted
task will get freed immediately. Do you have tasks deleting themselves?

Well, one task that initializes the storage (FreeRTOS+FAT) deletes itself. We would prefer not to run it in a task, but some parts of +FAT required it because of locking. Otherwise the tasks run continuously. I’m not sure about the FreeRTOS_TCP_server though. We have only modified FreeRTOS_HTTP_server.c, which runs with the FreeRTOS_TCP_server. I guess that code might create stuff dynamically.

Potentially. Which printf() implementation are you using? The one that
comes with Atmel studio? You might find printf() calls malloc(), but if
you are using any heap allocation scheme other than heap_3.c you would
normally set the size of the heap allocated by the linker to 0 as it
isn’t used - hence something calling the standard C malloc internally
can be a problem. In general, printf() like functions in GCC can use a
LOT of stack, but I don’t think they would normally use much heap.

We’re not using Atmel Studio but rather our own makefile with the arm-none-eabi toolchain.

Best regards,
Fredrik

rtel wrote on Wednesday, August 15, 2018:

"The TCP/IP stack will recover from a failed attempt to allocate a
network buffer, however, as the standard heap implementation is used
such a failure will result in the malloc failed hook being called
(if configUSE_MALLOC_FAILED_HOOK is set to 1 in FreeRTOSConfig.h)."
That might be the issue. However, I can't see why it would consume
that much heap.

Put a breakpoint in the malloc failed hook then look at the call stack.
If it is just a network buffer that failed to allocate, perhaps because
there was a lot of network traffic at that time, then the hook can
safely return and the TCP protocol will do whatever is necessary.

arm-none-eabi toolchain.

You could use newlib nano then.

zetter wrote on Thursday, August 16, 2018:

We don’t really have any debug setup yet; we have been using printf debugging so far. Is there any way to print the call stack in FreeRTOS? Even if it turns out to be the Ethernet buffers, I still have a hard time understanding why they would consume that much heap. We have one open socket to the server where we upload data, and I guess the FreeRTOS_TCP_Server opens sockets on incoming requests.

You could use newlib nano then.
I think we’re already using newlib (probably not nano; ARM states on the download page that the toolchain is based on newlib). There seem to be varying opinions on whether using newlib is a good idea or not. In the end, some version of printf-stdarg.c seems to be recommended. Would you agree?

rtel wrote on Thursday, August 16, 2018:

To see the stack you really need a debugger.

A network could conceivably consume many packets if there was heavy
traffic, especially if there are lots of broadcast messages. Make sure
you don’t have the network card in promiscuous mode - basically perform
as much filtering in the hardware, or at least before you need a buffer,
as possible.

zetter wrote on Thursday, August 16, 2018:

Richard, ok, thanks for the great support!

We have ipconfigNETWORK_MTU set to 1500 and ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS set to 45, which I guess would make a total of ~66 kB of buffer allocation possible. If that happened, the heap would definitely run out. Maybe we should decrease these values significantly? I’m unsure about the hardware filtering. We’re using the gmac, gmac_raw, gmac_phy and ethernet_phy drivers from ASF 3.4. We’ve based our network_interface.c on the example found in the FreeRTOS+TCP demo for the ATSAM4E16E with some minor changes (if I remember correctly, a missing include). If filtering of packets isn’t working, and the buffers can sum up to ~66 kB plus some overhead, this would most likely cause issues, since the test unit is connected to our office network with a lot of traffic.
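The ~66 kB estimate above is simply the MTU multiplied by the number of buffer descriptors. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the result deliberately ignores per-buffer overhead such as the Ethernet header and the allocator's own bookkeeping, so the real worst case is somewhat higher.

```c
#include <stddef.h>

/* Rough upper bound on heap taken by network buffer payloads when every
 * descriptor holds a full-size buffer. Per-buffer overhead is not included. */
size_t worst_case_buffer_payload(size_t mtu_bytes, size_t descriptors)
{
    return mtu_bytes * descriptors;
}
```

For an MTU of 1500 and 45 descriptors this gives 67500 bytes, a little under 66 x 1024, which is most of a 96 kB heap before any task stacks are counted.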

Also, maybe we should switch to printf-stdarg.c from FreeRTOS+TCP just to avoid possible excessive heap usage.

zetter wrote on Friday, August 17, 2018:

For Richard and anyone who might stumble upon similar problems in the future:

We switched to printf-stdarg.c from FreeRTOS+TCP which drastically decreased stack usage for most tasks. Works well, even though it didn’t do much for heap usage.

We analyzed the heap usage a little more, and as Richard suggested above, it seems to be related to network traffic. We therefore decreased the ipconfigNETWORK_MTU to 586 (lowest possible with DHCP) and ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS to 10 (just a low number as suggested by the FreeRTOS example config page for low RAM usage). This freed up a bunch of heap during heavy network usage, and it looks like it has solved our problems.
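For readers wanting to try the same change, the two settings would normally live in FreeRTOSIPConfig.h. The fragment below only restates the values reported above; it is not the poster's actual file, and suitable values depend on the application's traffic.

```c
/* FreeRTOSIPConfig.h (fragment, values as reported in this thread) */

/* Smaller MTU means smaller network buffers. */
#define ipconfigNETWORK_MTU                      586

/* Fewer descriptors cap how many buffers can be allocated at once. */
#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS   10
```

With these values the payload bound drops from 45 x 1500 = 67500 bytes to 10 x 586 = 5860 bytes, at the cost of lower throughput and less tolerance for traffic bursts.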

We’ll keep analysing the behaviour and will log our findings here if it turns out the problem wasn’t solved by the above.

Thanks again for the support.

rtel wrote on Friday, August 17, 2018:

Thanks for taking the time to report back.