Hello. I'm implementing an HTTP server on an STM32F746 using FreeRTOS. The system has 3 tasks in total and serves two different .html pages containing media files (images in this case). Everything worked great until I decided to open both webpages at the same time: one loaded perfectly but the other caused a malloc failure (the program was reaching vApplicationMallocFailedHook). I then realized (using the xPortGetFreeHeapSize function) that I was running out of heap when I requested the second webpage.

At that point I tried increasing configTOTAL_HEAP_SIZE from 16*1024 to 30*1024. The lack-of-heap problem was solved, but that brought a new problem: neither webpage loads its images, or they load indefinitely. Sometimes I get strange symbols on my CubeIDE console (I use the SWV ITM Data Console for logging), and sometimes one of the tasks is not created at all. I dropped configTOTAL_HEAP_SIZE down to 25*1024 and the first page now loads fully, but the second still misses content or hangs.
I found a thread describing a similar issue here and tried static task creation instead, but I got the same problems, and sometimes worse.
I increased _Min_Heap_Size and _Min_Stack_Size to 0x4000 in my .ld file but got no improvement. I am not getting any stack overflow.
Apologies if the main reply text is a bit jumbled; I typed this between work-related tasks.
The TL;DR is that I think you may have a DMA transfer crossing a memory block boundary, and that you can fix it by carefully controlling where the buffers are allocated so that they do not cross those boundaries.
There are some possibilities I can think of; here’s my line of thought:
Assuming you're not sizing the heap through the linker control file (which memory allocator are you using?), the FreeRTOS heap ends up in the .bss section of the ELF image.
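For reference, the stock FreeRTOS allocators (heap_1, heap_2, heap_4) reserve the heap as a plain static array, roughly like the sketch below, which is why it lands in .bss along with everything else unless you intervene:

```c
/* Simplified from heap_4.c: unless configAPPLICATION_ALLOCATED_HEAP is set,
 * the heap is just a static array, so the linker places it in .bss wherever
 * .bss happens to fall. */
static uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];
```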
The STM32F746 has several different, and in some places non-contiguous, blocks of embedded SRAM:
0x0000_0000 - 0x0000_3FFF - ITCM-RAM (16 Kbytes)
0x2000_0000 - 0x2000_FFFF - DTCM-RAM (64 Kbytes)
0x2001_0000 - 0x2004_BFFF - SRAM1 (240 Kbytes)
0x2004_C000 - 0x2004_FFFF - SRAM2 (16 Kbytes)
The tightly coupled instruction memory (ITCM) address range is not contiguous with the tightly coupled data RAM, SRAM1, and SRAM2; however, it is the best place for code to be stored. The first part of that address range is where the interrupt vectors go, leaving less room for the executable code of your application and the FreeRTOS kernel, so by default the toolchain puts the .text segments in SRAM1 (probably).
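If you do want to take advantage of ITCM, the usual mechanism is a per-function section attribute, sketched below; the section name .itcm_text is hypothetical and would need a matching output section in the linker script located inside the ITCM region, plus startup code that copies the section from flash into ITCM:

```c
/* Hypothetical example: pin a time-critical routine into ITCM-RAM.
 * ".itcm_text" must be defined in the linker script as an output section
 * inside the 0x0000_0000 - 0x0000_3FFF ITCM range. */
__attribute__( ( section( ".itcm_text" ) ) )
void prvHotPathRoutine( void )
{
    /* time-critical work here */
}
```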
The linker script determines where any given ELF segment ends up.
If the linker script is naive, it might ignore DTCM (or just put the linker-defined program stack and heap there, which FreeRTOS doesn't use by default) and treat SRAM1 + SRAM2 as a contiguous 256K memory range. The text segments (instructions) and read-only data go first, so it's possible that .bss, or some other data segment, is crossing the SRAM1-to-SRAM2 boundary.
I don't think that a single DMA operation can cross boundaries between different memories.
If a buffer used by the HTTP server crosses the SRAM1-SRAM2 boundary, it's likely that the DMA operation feeding the Ethernet controller will abort; this would cause symptoms that may match what you're seeing.
If my reasoning is correct, a possible solution is to create or edit the project linker script to explicitly place the allocator heap within one of the SRAM banks (probably SRAM1), starting at a location that ensures it cannot cross a RAM bank boundary. Also, if you're using dynamic task creation, the 4K stack size may be wasting memory (though if the code creates large arrays or structures on the stack, this is not necessarily true).
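As a sketch of that approach (assuming heap_4.c and GCC; the section name .freertos_heap is made up and would need a matching output section in your .ld file, placed entirely within SRAM1), you can have the application define the heap array itself and pin it to a known region:

```c
#include "FreeRTOS.h"   /* for configTOTAL_HEAP_SIZE */

/* In FreeRTOSConfig.h:
 *     #define configAPPLICATION_ALLOCATED_HEAP    1
 * so that heap_4.c uses this application-supplied array instead of
 * declaring its own. */

/* ".freertos_heap" is a hypothetical section name; the linker script must
 * map it entirely inside SRAM1 so the heap can never straddle the
 * SRAM1/SRAM2 boundary. */
uint8_t ucHeap[ configTOTAL_HEAP_SIZE ] __attribute__( ( section( ".freertos_heap" ), aligned( 8 ) ) );
```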
My preference when working with FreeRTOS is to not use dynamic object allocation in the kernel, and to statically define all of the task structures and stacks, along with all of the other FreeRTOS objects you're using. That way the heap is not used for kernel objects or for your tasks' stacks.

When I can't make the code use statically allocated buffers, I use the heap manager that automatically uses the heap section defined in the linker script rather than declaring a static array of configTOTAL_HEAP_SIZE bytes; I don't recall which allocator that one is. It is also possible to give the compiler a section name on a per-object basis for statically defined objects, and the linker script will group like-named sections into a contiguous area within bounds that you define. A well-designed linker script, tailored to the application, plus a handful of #pragmas or __attribute__(()) annotations telling the compiler which sections to put code and statically defined data in, lets you optimize memory usage and improve performance by putting critical code and data in the ITCM and DTCM, respectively, leaving more space in SRAM1 and SRAM2 for everything else.
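A minimal sketch of that style (the task name, stack depth, and priority below are placeholders; it requires configSUPPORT_STATIC_ALLOCATION set to 1 in FreeRTOSConfig.h, which also obliges you to provide vApplicationGetIdleTaskMemory(), and vApplicationGetTimerTaskMemory() if software timers are used):

```c
#include "FreeRTOS.h"
#include "task.h"

#define HTTP_TASK_STACK_DEPTH    512u    /* in StackType_t words, not bytes; placeholder */

/* The stack and TCB are ordinary statically allocated objects, so they come
 * out of .bss (or a named section) rather than the FreeRTOS heap. */
static StackType_t  xHttpTaskStack[ HTTP_TASK_STACK_DEPTH ];
static StaticTask_t xHttpTaskTCB;

/* Task stacks are CPU-only data, so they are also good candidates for DTCM,
 * e.g. with a hypothetical section attribute such as
 *     __attribute__( ( section( ".dtcm_data" ) ) )
 * provided the linker script defines that section inside the DTCM region. */

static void prvHttpServerTask( void *pvParameters )
{
    ( void ) pvParameters;
    for( ;; )
    {
        /* ...accept connections and serve pages... */
    }
}

void vStartHttpServerTask( void )
{
    ( void ) xTaskCreateStatic( prvHttpServerTask,       /* task entry point */
                                "http",                  /* task name (placeholder) */
                                HTTP_TASK_STACK_DEPTH,   /* stack depth in words */
                                NULL,                    /* no parameter */
                                tskIDLE_PRIORITY + 2,    /* priority (placeholder) */
                                xHttpTaskStack,          /* statically allocated stack */
                                &xHttpTaskTCB );         /* statically allocated TCB */
}
```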
Thank you @danielglasser for the detailed response, it added more clarity. I actually placed my DMATx and DMARx descriptors at the very beginning of SRAM2 (Memory_B1, Memory_B2); here is my linker script:
Usually the stack pointer starts at the very top of memory (0x2004_FFFF in my case) and grows downwards. The heap, in turn, grows upwards from the end of the .bss region, according to my .map file. So the hypothesis could be that the heap is growing up to and into SRAM2, corrupting the DMA Tx and Rx descriptors.
The limitation of DTCM memory is that the DMA has no access to it, so it can't be used to store the Ethernet DMA descriptors.
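One quick way to test that hypothesis at runtime is a startup check that the heap ends below the SRAM2 base address where the descriptors live. This is only a sketch: it assumes the FreeRTOS heap is the application-defined ucHeap from the earlier example and that configASSERT() is defined in FreeRTOSConfig.h; SRAM2_BASE_ADDR and vCheckHeapPlacement are made-up names. A similar check can be done against the linker-defined heap using the symbols shown in the .map file.

```c
#include <stdint.h>
#include "FreeRTOS.h"    /* for configTOTAL_HEAP_SIZE and configASSERT() */

/* SRAM2 starts at 0x2004C000 on the STM32F746. */
#define SRAM2_BASE_ADDR    0x2004C000UL

/* Defined by the application when configAPPLICATION_ALLOCATED_HEAP == 1. */
extern uint8_t ucHeap[ configTOTAL_HEAP_SIZE ];

void vCheckHeapPlacement( void )
{
    /* If this assert fires, the FreeRTOS heap reaches into SRAM2 and could
     * overwrite the Ethernet DMA descriptor tables placed there. */
    configASSERT( ( uint32_t ) &ucHeap[ configTOTAL_HEAP_SIZE ] <= SRAM2_BASE_ADDR );
}
```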