Cortex M4, Core Coupled Memory (CCM)

dkure wrote on Tuesday, February 11, 2014:

Hi All,

I am looking into using the Core Coupled Memory in my project and I was wondering how other people are going about this with FreeRTOS.

(Core Coupled Memory is a special area of memory that sits at a separate address range from standard RAM. CCM can’t be DMA’ed to, but it does provide zero-wait-state access. The benefits of CCM would be holding the run-time code of functions that need to be highly optimised, or lookup tables that need to be accessed quickly, rather than waiting on flash, which requires wait states.)

I have currently set up a section in my linker script which sets aside 64kB of memory for CCM. However, I am only able to place static variables into this region, and can’t use it as a generic RAM section. Note, I am not trying to take advantage of CCM for extra speed; I want to take advantage of CCM for extra memory. On the STM32F4 that I am using, 128kB is standard RAM and 64kB is CCM, so it’s a significant portion of the available memory.

I am thinking of extending heap_x.c to provide support for multiple memory segments, one for standard RAM and one for CCM.

This would allow me to dynamically allocate variables into CCM, rather than only being able to allocate them at compile time.

Does anyone have any thoughts on this? Or has anyone implemented something along these lines already?

rtel wrote on Tuesday, February 11, 2014:

The 4 memory allocation functions are kept in the port layer so you can modify them however you like, or provide your own. It seems to be relatively common for people to want to split the heap into several memory regions. You could do this quite easily by adding an attribute to the heap block (which is just a uint8_t array) to force it into whichever memory region you want. If you need a heap that is larger than the CCM block then you will need a little extra effort to update the heap_x.c file to use more than one block of RAM as its heap. It could try to allocate from one block first, and if that fails, try to allocate from the second block.
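
As a rough illustration of both ideas (an attribute that forces a heap array into CCM, and an allocator that tries one block first and falls back to a second), here is a minimal sketch. It is not the code from any heap_x.c file: it is a simple bump allocator that never frees. The section name `.ccmram`, the sizes, and the names `CCM_DATA`, `HeapRegion` and `twoRegionMalloc` are all assumptions made for the example. The section name has to match whatever your linker script defines, and a real port would also need to protect the allocator against concurrent callers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the linker script defines an output section named ".ccmram"
 * that maps to CCM. On a non-ARM host build the attribute is dropped so
 * the sketch still compiles. */
#if defined(__arm__)
#define CCM_DATA __attribute__((section(".ccmram")))
#else
#define CCM_DATA
#endif

#define CCM_HEAP_SIZE  (64u * 1024u)  /* hypothetical: all of CCM       */
#define MAIN_HEAP_SIZE (32u * 1024u)  /* hypothetical: part of main RAM */
#define HEAP_ALIGNMENT 8u

/* One contiguous block of heap memory (hypothetical type). */
typedef struct {
    uint8_t *base;  /* start of the block          */
    size_t   size;  /* total bytes in the block    */
    size_t   used;  /* bytes handed out so far     */
} HeapRegion;

/* The heap blocks are just uint8_t arrays; the attribute picks the region. */
static uint8_t ccmHeap[CCM_HEAP_SIZE] CCM_DATA __attribute__((aligned(8)));
static uint8_t mainHeap[MAIN_HEAP_SIZE] __attribute__((aligned(8)));

/* Order matters: the first region is tried first. */
static HeapRegion regions[] = {
    { ccmHeap,  sizeof ccmHeap,  0 },
    { mainHeap, sizeof mainHeap, 0 },
};

/* Allocate from the first region with enough room, else return NULL.
 * Blocks are never freed. A real port would suspend the scheduler (or
 * otherwise lock) around the body of this function. */
void *twoRegionMalloc(size_t wanted)
{
    if (wanted == 0 || wanted > SIZE_MAX - (HEAP_ALIGNMENT - 1u)) {
        return NULL;
    }

    /* Round the request up to a multiple of the alignment. */
    size_t rounded = (wanted + HEAP_ALIGNMENT - 1u)
                     & ~(size_t)(HEAP_ALIGNMENT - 1u);

    for (size_t i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        HeapRegion *r = &regions[i];
        if (rounded <= r->size - r->used) {
            void *p = r->base + r->used;
            r->used += rounded;
            return p;
        }
    }
    return NULL;  /* neither block has enough space left */
}
```

Putting CCM first means small allocations land in the fast memory until it fills up. If some buffers must be reachable by DMA, you would want a separate call that only ever allocates from the main RAM block.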


hawk777 wrote on Friday, February 14, 2014:

Personally, I wrote a port (because none of the three existing ones did quite what I wanted), and I implemented pvPortMallocAligned (which is only used for allocating stacks and nothing else) in CCM, while pvPortMalloc uses normal RAM. Thus, stacks (which hold locals, which should be much more common and much more frequently accessed than globals in good code) are fast, but you have to use heap or global data for DMA, which is IMO a reasonable tradeoff.
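
A minimal sketch of that split might look like the following. It is not hawk777's actual port. It assumes a FreeRTOS version in which task stacks are allocated through a `pvPortMallocAligned(x, puxStackBuffer)` macro that the port is allowed to override, and it assumes a linker section named `.ccmram`. The names `CCM_DATA`, `ccmStackPool` and `ccmStackAlloc` are hypothetical, and the pool never frees, so it only suits systems that do not delete tasks.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: ".ccmram" is the CCM section defined in the linker script.
 * The attribute is dropped on non-ARM host builds. */
#if defined(__arm__)
#define CCM_DATA __attribute__((section(".ccmram")))
#else
#define CCM_DATA
#endif

#define CCM_STACK_POOL_SIZE (48u * 1024u)  /* hypothetical pool size */

/* All task stacks are carved out of this array, which lives in CCM. */
static uint8_t ccmStackPool[CCM_STACK_POOL_SIZE]
    CCM_DATA __attribute__((aligned(8)));
static size_t ccmStackPoolUsed = 0;

/* Hypothetical helper: hand out 8-byte-aligned chunks of the CCM pool.
 * Nothing is ever returned to the pool. */
void *ccmStackAlloc(size_t bytes)
{
    if (bytes == 0 || bytes > CCM_STACK_POOL_SIZE) {
        return NULL;
    }
    size_t rounded = (bytes + 7u) & ~(size_t)7u;
    if (rounded > CCM_STACK_POOL_SIZE - ccmStackPoolUsed) {
        return NULL;
    }
    void *p = &ccmStackPool[ccmStackPoolUsed];
    ccmStackPoolUsed += rounded;
    return p;
}

/* Assumed hook: stacks come from CCM unless the caller supplies a buffer.
 * pvPortMalloc() is left untouched, so everything else stays in normal,
 * DMA-capable RAM. */
#define pvPortMallocAligned(x, puxStackBuffer) \
    (((puxStackBuffer) == NULL) ? ccmStackAlloc((x)) \
                                : (void *)(puxStackBuffer))
```

With this arrangement, any buffer handed to a DMA controller has to come from pvPortMalloc or from a global, never from a local variable on a task stack.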