pvPortMalloc "check"

Hi,
I hope this question will make sense, as my brain is currently still a bit fried after debugging an annoying issue (in our own code, of course).

The problem we caused is that we allocated memory on the FreeRTOS heap and wrote outside the buffer. Just one byte, though. But that’s enough to create really hard-to-track issues :).

What happened, in some detail:

  • at some point (early in startup of our app) we wrote outside a buffer in FreeRTOS heap
  • a while later, once our product is online through modem/eth, I print some FreeRTOS heap stats
    That resulted in a MemManage fault.

After 4 hours, I finally found the problem in our own code, but I only found it by adding this to pvPortMalloc:

void * pvPortMalloc( size_t xWantedSize )
{
    /* If malloc is called for size zero, we don't have to do anything */
    if (xWantedSize == 0)
    {
        return NULL;
    }

    BlockLink_t * pxBlock;
    BlockLink_t * pxPreviousBlock;
    BlockLink_t * pxNewBlockLink;
    void * pvReturn = NULL;
    size_t xAdditionalRequiredSize;

    vTaskSuspendAll();
    {
        // Poor mans solution to force the mem man fault earlier.... I hope...
        HeapStats_t xHeapStats;
        vPortGetHeapStats(&xHeapStats );

This worked, lucky me. I completely understand that this is no guarantee: the corruption happens somewhere between two mallocs, so a buffer from one malloc gets overrun and vPortGetHeapStats only explodes (if at all) on a later call. I realize while typing that a better place for this check would probably be AFTER marking the block as used; then I might have a chance of getting the hard fault immediately.

Anyway, my question is: can you think of a better way of doing this check (probably conditional on a FreeRTOS config compile-time option)?

What happens in my case is that one byte of the next block's administration is overwritten with 0x0, which has a fair chance of causing a hard fault when iterating over all free blocks. But I can think of many situations where it would still go unnoticed.

I am trying to find some mechanism which will fire when memory corruption is detected. Maybe it would be nice to enable/disable this check for performance reasons (although in our case, we are not super time critical, some extra check on alloc won’t be visible in our application’s performance anyway).

Could you advise me anything? Or is there maybe something already in more recent FreeRTOS releases? I am on V10.6.1.

Hope this question makes any sense at all
Thanks for any advice

Best regards, Bas

Red zones in front of and behind the user-visible memory block could be added and verified on vPortFree or by an explicit call (e.g. a pvPortMallocCheck function), quite similar to the FreeRTOS (SW) stack checking. But this is a bit costly; I once did this behind an extra compile option for my own allocator. Maybe this could be added to the FreeRTOS-provided heap implementations, too.
But it’s far from a silver bullet: it doesn’t reliably detect all kinds of memory corruption, and it can’t be done semi-automatically by the scheduler the way it is done for task stacks. So it’s of quite limited use and could just create the illusion of a memory-corruption-free application.
IMO you’re better off using (memory-)safe language features and programming techniques.
There are quite a number of such features, e.g. in (decent) C++ standards, which avoid those dreaded buffer overruns by design, e.g. range-based for loops or std::span instead of legacy pointer + size buffer descriptors.
Heavy-duty dynamic memory instrumentation as known from bigger/PC Linux platforms is usually simply not applicable due to the limited resources of an MCU platform and also due to its runtime impact.
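
To make the red-zone idea above a bit more concrete, here is a minimal sketch in plain C. It is not FreeRTOS code: it wraps the standard malloc/free so it can be tried on a PC, and every name in it (guarded_malloc, guarded_check, guarded_free, GUARD_SIZE, GUARD_BYTE) is made up for illustration. On a target you would presumably wrap pvPortMalloc/vPortFree instead and call configASSERT when a check fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 16u   /* bytes of red zone on each side (arbitrary choice) */
#define GUARD_BYTE 0xA5u /* fill pattern (arbitrary choice) */

/* Header placed in front of the leading red zone. The union pads it to the
 * strictest alignment so the user pointer stays aligned, as long as that
 * alignment divides GUARD_SIZE. */
typedef union
{
    size_t size;       /* size the caller asked for */
    max_align_t align; /* padding only */
} GuardHeader_t;

#define GUARD_OVERHEAD (sizeof(GuardHeader_t) + 2u * GUARD_SIZE)

/* Layout: [header][front red zone][user bytes][back red zone] */
void *guarded_malloc(size_t size)
{
    if (size == 0u || size > SIZE_MAX - GUARD_OVERHEAD)
    {
        return NULL;
    }

    uint8_t *raw = malloc(GUARD_OVERHEAD + size);
    if (raw == NULL)
    {
        return NULL;
    }

    ((GuardHeader_t *)(void *)raw)->size = size;
    uint8_t *user = raw + sizeof(GuardHeader_t) + GUARD_SIZE;
    memset(user - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE); /* front red zone */
    memset(user + size, GUARD_BYTE, GUARD_SIZE);       /* back red zone */
    return user;
}

/* Returns 1 if both red zones are intact, 0 if either was overwritten. */
int guarded_check(const void *ptr)
{
    const uint8_t *user = (const uint8_t *)ptr;
    const uint8_t *front = user - GUARD_SIZE;
    const GuardHeader_t *hdr =
        (const GuardHeader_t *)(const void *)(front - sizeof(GuardHeader_t));
    const uint8_t *back = user + hdr->size;

    for (size_t i = 0; i < GUARD_SIZE; i++)
    {
        if (front[i] != GUARD_BYTE || back[i] != GUARD_BYTE)
        {
            return 0;
        }
    }
    return 1;
}

/* Checks the red zones, releases the block, and reports the check result. */
int guarded_free(void *ptr)
{
    int intact = guarded_check(ptr);
    uint8_t *raw = (uint8_t *)ptr - GUARD_SIZE - sizeof(GuardHeader_t);
    free(raw);
    return intact;
}
```

A one-byte overrun like the one described in this thread (writing buf[n] on an n-byte buffer) lands in the back red zone, so guarded_check or guarded_free would report it. As said above, it only catches writes that hit a red zone and only at the moment a check runs, so a stray write that skips over the guard bytes or overwrites them with the fill pattern still goes unnoticed.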

Not sure whether it will catch your issue or not, but you may look at configENABLE_HEAP_PROTECTOR, which is a mechanism designed to catch heap corruptions: FreeRTOS-Kernel/portable/MemMang/heap_4.c at main · FreeRTOS/FreeRTOS-Kernel · GitHub.

Hello Basprins,

As @aggarg highlighted, to catch silent heap corruption, you can do the following:

  1. Upgrade the kernel to v11.x.x
    where you can enable the configENABLE_HEAP_PROTECTOR macro used in heap_4.c and heap_5.c files.

  2. Turn on the heap protector

    /* FreeRTOSConfig.h */
    #define configENABLE_HEAP_PROTECTOR   1
    
  3. Provide the canary implementation as part of your application

    void vApplicationGetRandomHeapCanary( portPOINTER_SIZE_TYPE * pxHeapCanary )
    {
        *pxHeapCanary = HW_RNG();   /* any non-zero random value */
    }
    

    FreeRTOS calls this hook as part of the prvHeapInit function to get a random value to be used as canary. Make sure your hardware RNG or another entropy source is ready by then. For early bring-up you can fall back to a fixed non-zero constant.

  4. Keep assertions enabled
    heapPROTECT_BLOCK_POINTER XORs the block pointers with the random canary value, so a heap overflow results in unpredictable pointer values, which are then caught by the
    heapVALIDATE_BLOCK_POINTER assert. Hence, you need to define your configASSERT implementation.

    #define configASSERT( x ) if( !( x ) ) { /* Your Implementation */ }
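
To illustrate why the XOR step in item 4 helps, here is a small standalone sketch in plain C. It is not the kernel source: the demo_* names and the 256-byte demo heap are invented for this example, and the validity test (a simple range check against the heap bounds) is my assumption of what such a validation would look like. The idea is that a link pointer is stored XORed with the canary, so anything that overwrites the stored value with a plausible raw address decodes to garbage and fails the range check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HEAP_SIZE 256u

/* Stand-in for the heap array; hypothetical, for illustration only. */
static uint8_t demo_heap[DEMO_HEAP_SIZE];

/* Would come from an RNG at heap init; settable here so it can be tested. */
static uintptr_t demo_canary = 0u;

void demo_set_canary(uintptr_t canary)
{
    demo_canary = canary;
}

/* Encode a block pointer before storing it in a block header. */
uintptr_t demo_protect(const void *block)
{
    return (uintptr_t)block ^ demo_canary;
}

/* Decode a stored value back into a pointer (XOR is its own inverse). */
void *demo_unprotect(uintptr_t stored)
{
    return (void *)(stored ^ demo_canary);
}

/* Returns 1 if the decoded pointer lies inside the demo heap, else 0.
 * On a target, a failed check would end up in configASSERT. */
int demo_pointer_is_valid(const void *block)
{
    uintptr_t addr = (uintptr_t)block;
    uintptr_t low = (uintptr_t)&demo_heap[0];
    uintptr_t high = (uintptr_t)&demo_heap[DEMO_HEAP_SIZE - 1u];

    return (addr >= low) && (addr <= high);
}
```

With a canary set, demo_protect(&demo_heap[16]) round-trips through demo_unprotect and passes the range check. If an overflow replaces the stored value with a raw in-heap address such as (uintptr_t)&demo_heap[32], the decoded pointer is that address XORed with the canary, which in practice falls far outside the heap and fails the check. This is probabilistic, not a guarantee: an overwrite that changes only a few low bits can still decode to an address inside the heap.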