I’m porting FreeRTOS SMP to a multi-core RISC-V processor that lacks Inter-Processor Interrupts (IPIs), and I’m seeking advice on my software-based workaround.
My Design: My approach is to use a shared array, volatile BaseType_t xYieldRequest[NUM_CORES], to signal when a core needs to reschedule.
Each core has its own timer interrupt firing at the tick rate.
In the ISR, Core 0 is the only one that calls xTaskIncrementTick().
All other cores simply poll their flag in the xYieldRequest array to decide if they need to perform a context switch.
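Roughly, the per-core tick handler I have in mind looks like this (a minimal sketch of my design, not working port code; everything except xTaskIncrementTick(), vTaskSwitchContext() and portGET_CORE_ID() is a placeholder name, and the real register save/restore would live in the port's trap-handler assembly around this C):

```c
/* Sketch only: vPortTickHandler and vPortClearTimerInterrupt are
 * placeholder names for this design, not actual FreeRTOS port code. */
volatile BaseType_t xYieldRequest[ NUM_CORES ];

void vPortTickHandler( void )
{
    BaseType_t xCoreID = portGET_CORE_ID();

    vPortClearTimerInterrupt( xCoreID );    /* ack this core's timer */

    if( xCoreID == 0 )
    {
        /* Only core 0 advances the kernel tick count. */
        if( xTaskIncrementTick() != pdFALSE )
        {
            vTaskSwitchContext( xCoreID );
        }
    }
    else if( xYieldRequest[ xCoreID ] != pdFALSE )
    {
        /* The other cores just poll their flag once per tick. */
        xYieldRequest[ xCoreID ] = pdFALSE;
        vTaskSwitchContext( xCoreID );
    }
}
```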
Key Question:
I’m aware that the most direct issue with this polling design is the context switch latency, which can be up to one tick period.
My main question is: Besides this latency, what other fundamental problems or subtle side-effects should I be concerned about with this architecture?
One thing to watch out for is that “volatile” might not be good enough. It only forces the compiler to actually re-read the variable, even if it knows it recently read it. You still need to make sure the two processors see a coherent view of that memory, so you need to take into account any caching done by the processors.
If you don’t have an IPI, you still need to make sure you can implement the inter-processor locks the kernel needs, which again requires attention to memory coherency.
Part of the issue is that shared memory with sufficient native coherency tends to be slow, as every read or write needs to arbitrate for access.
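If the hardware is cache-coherent (or you can place the array in an uncached region), C11 atomics are one way to get both the atomicity and the ordering guarantees that volatile alone does not give you. A sketch, assuming a toolchain with <stdatomic.h> (on RISC-V these calls compile down to AMO instructions with the aq/rl ordering bits):

```c
#include <stdatomic.h>

/* One request flag per core. _Atomic gives atomic access plus explicit
 * ordering control; volatile gives neither. NUM_CORES as in the post. */
static atomic_int xYieldRequest[ NUM_CORES ];

/* Requesting core: the release store publishes every write made before
 * the request (e.g. updates to the ready lists). */
static inline void vSetYieldFlag( int xTargetCore )
{
    atomic_store_explicit( &xYieldRequest[ xTargetCore ], 1,
                           memory_order_release );
}

/* Target core, in its tick ISR: the acquire exchange consumes the flag
 * and makes the requesting core's earlier writes visible here. */
static inline int xTakeYieldFlag( int xCoreID )
{
    return atomic_exchange_explicit( &xYieldRequest[ xCoreID ], 0,
                                     memory_order_acquire );
}
```

Note that atomics only help if the caches are coherent in the first place; if they are not, you are back to uncached memory or explicit cache maintenance.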
FreeRTOS SMP handles yield requests from other cores (via portYIELD_CORE()) at the following places:
Waiting to enter the critical section in prvCheckForRunStateChange()
Checking xYieldPendings when exiting a critical section
Asynchronous yield requests through IPIs when the task is not in a critical section
The first two synchronous mechanisms should be largely unaffected, but the third is directly impacted by replacing IPIs with the shared-memory xYieldRequest flags.
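In effect, what gets replaced is the port's portYIELD_CORE() macro. With an IPI it means "interrupt that core now"; with shared memory it degrades to "leave a note for that core's next tick". A sketch (vPortSendIPI() is a made-up name for whatever a real IPI-capable port would call, and vSetYieldFlag() is the hypothetical helper sketched above):

```c
/* With hardware IPIs: the target core traps and reschedules at once. */
/* #define portYIELD_CORE( xCoreID )    vPortSendIPI( ( xCoreID ) ) */

/* With the shared-memory workaround: the request just sits in memory
 * until the target core's next tick ISR polls its flag. */
#define portYIELD_CORE( xCoreID )    vSetYieldFlag( ( int ) ( xCoreID ) )
```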
I can think of the following drawbacks:
Memory-related Issues
As Richard mentioned, using shared memory in place of IPIs raises:
Memory coherency problems
Race condition concerns
Scheduler-related Issues
portYIELD_CORE() now takes longer to take effect on other cores (see the worst-case bound sketched after this list), resulting in:
Unpredictable context switch timing
Increased priority inversion duration
Delayed response to scheduling events
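To put a rough number on that (my own back-of-the-envelope arithmetic, using configTICK_RATE_HZ as the polling rate):

```c
/* Worst case: a core sets the flag just after the target core's tick
 * ISR has already polled, so the request waits almost a full tick:
 *
 *     max yield latency ≈ 1 / configTICK_RATE_HZ
 *
 * e.g. configTICK_RATE_HZ = 1000  =>  up to ~1 ms during which a
 * lower-priority task keeps running on the target core, versus the
 * microseconds a hardware IPI would take. */
```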
Performance Impacts
Polling xYieldRequest on every tick also has system-level performance impacts, including:
CPU overhead
Increased power consumption
While this solution can work, you will need to carefully evaluate these trade-offs against your system’s requirements, especially regarding real-time constraints and performance needs.
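If the polling overhead and power draw matter in practice, one partial mitigation is to let cores sleep between ticks. A sketch, assuming your harts implement the standard wfi instruction; note this helps power, not latency, because the flag is still only sampled at tick boundaries:

```c
/* Each core sleeps until its next interrupt (its per-core timer tick)
 * instead of spinning in the idle task. Requires configUSE_IDLE_HOOK = 1
 * (and, in SMP kernel versions that have it, the passive idle hook for
 * the non-running idle tasks). */
void vApplicationIdleHook( void )
{
    __asm__ volatile ( "wfi" );
}
```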