I am debugging an issue on my custom Cortex-R5 FreeRTOS port. I’m currently using FreeRTOS Kernel V11.1.0.
I traced my wait API call OS_WaitForNotificationIndexed() down to xTaskNotifyWaitIndexed, which maps to xTaskGenericNotifyWait in tasks.c.
Observed behavior:
With cache enabled, the notification wait behaves incorrectly: the task never leaves the wait state.
With cache disabled, it appears to work.
After applying this change (snippet taken from V10.6.1):
--- a/src/os/freertos/tasks.c
+++ b/src/os/freertos/tasks.c
@@ -7769,6 +7769,12 @@ TickType_t uxTaskResetEventItemValue( void )
             {
                 traceTASK_NOTIFY_WAIT_BLOCK( uxIndexToWaitOn );
                 prvAddCurrentTaskToDelayedList( xTicksToWait, pdTRUE );
+
+                /* All ports are written to allow a yield in a critical
+                 * section (some will yield immediately, others wait until the
+                 * critical section exits) - but it is not something that
+                 * application code should ever do. */
+                portYIELD_WITHIN_API();
             }
             else
             {
After adding portYIELD_WITHIN_API at that point, behavior is stable again. My understanding is:
Once the task is moved to the delayed/blocked list, it must yield immediately.
portYIELD_WITHIN_API forces an immediate scheduler handover from inside the kernel API path.
On this Cortex-R5 port, it also includes ordering barriers (DSB/ISB) after triggering the software interrupt, which likely matters more when cache is enabled.
With cache disabled, slower execution can mask this race.
Is this the expected explanation, or is there another port-specific reason why cache-enabled mode exposes this when portYIELD_WITHIN_API is omitted?
Is your custom port an SMP port? When using SMP, proper cache usage becomes very important, as all the processors need to keep a coherent view of memory, and those barrier instructions are important for that.
Note that one of the big changes in the code from v10 to v11 was the built in support for SMP.
To be precise, it is a port for the TMS570CL435. This processor is Arm Cortex-R5 based. As there is no FreeRTOS port for it, we are writing our own based on the R4 port and the code generated by HALCoGen (the code generator from Texas Instruments). As HALCoGen only supports older FreeRTOS versions, we had to adapt it ourselves.
When adapting this code we did not make any adaptations for SMP. I’m also not familiar with the implications of SMP for the port code.
The TMS570CL435 behaves like a single core. So my understanding is that SMP support is not necessary if I’m only using the single-core variant. Shouldn’t it then behave like before, without SMP?
taskYIELD_WITHIN_API is called here after resuming the scheduler and it should be enough. Attempting to yield when the scheduler is suspended does not seem like the right thing to do.
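To make that ordering concrete, here is a rough host-side model of it, a sketch and not the kernel code. It assumes the sequence is: suspend the scheduler, decide whether to block, resume the scheduler, then yield only if the resume did not already cause a switch. Every `model_*` name is hypothetical and exists only for illustration; none of them are FreeRTOS APIs, and the `tick_during_suspend` flag is an assumed stand-in for a context switch that became pending while the scheduler was suspended.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical host-side model; none of these names are kernel APIs. */
typedef struct
{
    int  suspend_depth;    /* > 0 while the modelled scheduler is suspended */
    bool on_delayed_list;  /* task has been moved to the delayed list */
    bool yield_pending;    /* a switch was deferred during the suspension */
    int  yields_requested; /* how many context switches the model asked for */
} model_t;

static void model_suspend_all( model_t * m )
{
    m->suspend_depth++;
}

/* Returns true if resuming performed the deferred switch itself. */
static bool model_resume_all( model_t * m )
{
    bool already_yielded = false;

    assert( m->suspend_depth > 0 );
    m->suspend_depth--;

    if( ( m->suspend_depth == 0 ) && m->yield_pending )
    {
        m->yield_pending = false;
        m->yields_requested++;
        already_yielded = true;
    }

    return already_yielded;
}

static void model_yield_within_api( model_t * m )
{
    /* The point made above: only yield once the scheduler runs again. */
    assert( m->suspend_depth == 0 );
    m->yields_requested++;
}

/* Models the blocking path of a notify wait.
 * Returns the number of context switches requested. */
int model_notify_wait( bool notification_pending, bool tick_during_suspend )
{
    model_t m = { 0, false, false, 0 };
    bool should_block = false;
    bool already_yielded;

    model_suspend_all( &m );

    if( !notification_pending )
    {
        should_block = true;
        m.on_delayed_list = true; /* blocked, but no yield requested yet */
    }

    if( tick_during_suspend )
    {
        m.yield_pending = true;
    }

    already_yielded = model_resume_all( &m );

    if( should_block && !already_yielded )
    {
        model_yield_within_api( &m );
    }

    return m.yields_requested;
}
```

In this model a blocking wait requests exactly one switch, and that switch always happens after the scheduler has been resumed. A yield placed inside the suspended region would trip the assertion in `model_yield_within_api()`.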
If your part is single core, you do not need to worry about SMP.
Unfortunately this port is not applicable for our MCU.
Yes, I also saw that taskYIELD_WITHIN_API() is called shortly after. Could the if clause in between take too much time and cause my problems? This small change makes the difference between a working and a non-working version, and I want to understand why.
This likely means that the issue is elsewhere and you are just masking the problem by making these small changes. Have you written this R5 port yourself or did you get it from some vendor? Can you do a quick comparison with the port I shared above and see if something is missing (like a barrier etc)?