SMP Porting Guide and Cache Coherency - Dual Core Cortex M7

Hello, my name is André Zeps and I’m an engineer at Kostal Automotive in Germany.

I’m pretty new to the FreeRTOS forum but I thought it would make sense to make an account now.

I’m currently trying to port the SMP branch to an Infineon Traveo II, which offers a dual-core Cortex-M7. Documentation about the porting process is pretty scarce, but some threads here in the forum and the source code of the RP2040 port helped a lot.

The thread “SMP Porting Checklist - A53 *4 as reference” seems to serve as an unofficial porting guide.

Sadly, not everyone had a satisfying result in the end though: SMP Porting to Cortex R5 - #29 by akj90

While it is pretty clear that spinlock hardware must exist to support mutual exclusion, the need for cache coherency is less obvious, even though it seems essential for a well-performing SMP system. Neither XCore AI nor RP2040 has a DCache, so they don’t run into this issue.

My current port runs with the DCache disabled, and I’m wondering how the DCache could be enabled in the future without introducing sporadic, hard-to-reproduce failures.

Either all kernel data must live in a non-cacheable region, or the cache must be cleaned and invalidated whenever data shared between the cores is accessed.
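
To illustrate the second option: cache maintenance works on whole cache lines, so any address range handed to a clean or invalidate operation has to be widened to line boundaries first. A minimal sketch in C, assuming the 32-byte DCache line of the Cortex-M7. `CACHE_LINE_SIZE`, `CacheRange_t` and `cache_align_range` are hypothetical names chosen for this example, not part of FreeRTOS or CMSIS:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte data cache lines, as on the Cortex-M7. */
#define CACHE_LINE_SIZE 32u

/* Hypothetical helper type: an address range widened to whole cache lines. */
typedef struct {
    uintptr_t start;   /* rounded down to a line boundary */
    size_t    length;  /* multiple of CACHE_LINE_SIZE */
} CacheRange_t;

/* Widen [addr, addr + len) so that it covers only complete cache lines. */
static CacheRange_t cache_align_range(uintptr_t addr, size_t len)
{
    const uintptr_t mask = (uintptr_t)CACHE_LINE_SIZE - 1u;
    CacheRange_t range;
    uintptr_t end;

    assert(len > 0u);

    range.start = addr & ~mask;               /* round start down */
    end = (addr + len + mask) & ~mask;        /* round end up */
    range.length = (size_t)(end - range.start);
    return range;
}
```

For example, an 8-byte object at 0x2000003C straddles two lines, so the widened range starts at 0x20000020 and is 64 bytes long. A consequence is that shared objects should ideally occupy whole lines on their own; otherwise invalidating one object can discard unrelated dirty data sitting in the same line.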

The heap data and the “heap organization structure” also have to be suitable for this.

I wanted to create this thread to discuss how much thought has already been invested into cache coherency for CPUs that do not support it in hardware through a snooping protocol. I would expect some sort of macro that the kernel could use to invalidate specific memory areas that may have been touched by the other core. This macro would then have to be implemented for every supported controller.
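
A rough sketch of what such a macro pair could look like, written so that it runs on a host PC. Everything here is an assumption for illustration: `portCACHE_CLEAN_RANGE` and `portCACHE_INVALIDATE_RANGE` do not exist in FreeRTOS, `SharedKernelData_t` is a stand-in for kernel-owned data, and the stub functions only count calls. On a Cortex-M7 port the macros would probably wrap the CMSIS-Core routines `SCB_CleanDCache_by_Addr()` and `SCB_InvalidateDCache_by_Addr()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-ins that only count calls. A real port would perform
 * the actual cache maintenance here. */
static unsigned int clean_calls;
static unsigned int invalidate_calls;

static void stub_clean(const volatile void *addr, size_t len)
{
    (void)addr;
    (void)len;
    clean_calls++;
}

static void stub_invalidate(const volatile void *addr, size_t len)
{
    (void)addr;
    (void)len;
    invalidate_calls++;
}

/* Hypothetical port macros, NOT part of FreeRTOS today. Ports with
 * coherent caches or no DCache could define both as ((void)0). */
#define portCACHE_CLEAN_RANGE(addr, len)      stub_clean((addr), (len))
#define portCACHE_INVALIDATE_RANGE(addr, len) stub_invalidate((addr), (len))

/* Stand-in for a kernel-owned object that both cores access. */
typedef struct {
    uint32_t tick_count;
    uint32_t ready_priorities;
} SharedKernelData_t;

static SharedKernelData_t kernel_data;

/* Pattern: invalidate after taking the lock so stale lines are dropped,
 * clean before releasing it so the other core sees the update. */
static void increment_tick_shared(void)
{
    /* The inter-core lock (e.g. a hardware spinlock) is taken here. */
    portCACHE_INVALIDATE_RANGE(&kernel_data, sizeof(kernel_data));

    kernel_data.tick_count++;

    portCACHE_CLEAN_RANGE(&kernel_data, sizeof(kernel_data));
    /* The inter-core lock is released here. */
}
```

The sketch relies on every writer cleaning its changes before releasing the lock, so that no dirty line is thrown away by the next invalidate. It also assumes the shared object does not share a cache line with unrelated data.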

Or is my task already in vain and SMP is not something that should be done on a controller without cache synchronization hardware?

Kind regards,
André

Hi André,

Disabling the DCache is one way to run FreeRTOS SMP on a platform without hardware cache coherency. Currently, the FreeRTOS SMP implementation doesn’t support software-managed cache coherency.

If we are going to support this feature, I can think of the following scenarios that need to be considered:

  • Task moving between cores - a task can write its data on one core, then be switched to another core where it reads that data back and uses it.
  • Kernel shared data - this data is protected by a critical section or by suspending the scheduler, and can be read or written by different cores.
  • Task shared data - FreeRTOS provides inter-task communication mechanisms such as queues, mutexes and semaphores. These mechanisms also need to be considered.
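
All three scenarios come down to the same hazard: one core writes into its own cache, and another core keeps reading an old copy. The following host-runnable toy model in C sketches this for a single shared word. It is a deliberate simplification (one line per core, write-back with write-allocate), and every name in it is made up for this example rather than taken from FreeRTOS:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 2

static uint32_t main_memory;   /* the one shared word in RAM */

/* One private, non-coherent cache line per simulated core. */
typedef struct {
    uint32_t data;
    bool     valid;
    bool     dirty;
} CacheLine_t;

static CacheLine_t cache[NUM_CORES];

/* Read through the cache: fetch from RAM only on a miss. */
static uint32_t core_read(int core)
{
    if (!cache[core].valid) {
        cache[core].data  = main_memory;
        cache[core].valid = true;
        cache[core].dirty = false;
    }
    return cache[core].data;
}

/* Write-back behaviour: the write stays in the writer's cache. */
static void core_write(int core, uint32_t value)
{
    cache[core].data  = value;
    cache[core].valid = true;
    cache[core].dirty = true;
}

/* Clean: push a dirty line out to RAM. */
static void core_clean(int core)
{
    if (cache[core].valid && cache[core].dirty) {
        main_memory = cache[core].data;
        cache[core].dirty = false;
    }
}

/* Invalidate: drop the line so the next read goes to RAM. */
static void core_invalidate(int core)
{
    cache[core].valid = false;
    cache[core].dirty = false;
}

/* Core 0 produces a value, core 1 consumes it. Returns what core 1 sees. */
static uint32_t run_handoff(bool with_maintenance)
{
    main_memory = 1u;
    core_invalidate(0);
    core_invalidate(1);

    (void)core_read(1);        /* core 1 caches the old value */
    core_write(0, 42u);        /* core 0 updates only its own cache */

    if (with_maintenance) {
        core_clean(0);         /* producer side: make the write visible */
        core_invalidate(1);    /* consumer side: discard the stale copy */
    }
    return core_read(1);
}
```

In this model `run_handoff(false)` returns the stale value 1, while `run_handoff(true)` returns 42. Both halves are needed: cleaning on the writing core alone still leaves the reading core with its old line.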

There may be other scenarios we also need to consider. If you already have some ideas about this feature, please share them in this thread. We will take them into consideration when planning this feature.