When running SMP, what is the expected performance impact between 1 CPU core and > 1 CPU core?

Hi,

I was just curious whether anyone has an idea of whether firmware performance should increase or decrease when running SMP vs. a single core. Ideally, I would assume that if you have more CPU cores available, you can run more tasks in parallel and expect better performance. But I know that with multi-core we have to introduce new mechanisms like spin-locks for task and ISR locking, and even new functions like getting the CPU core ID, whereas on a single core these things are not needed. Since these new functions have overhead, does that mean running multi-core could actually decrease performance (i.e., the CPU ticks needed to run the firmware increase on two cores vs. one)?
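To make the overhead concrete, here is a minimal sketch (not a FreeRTOS API, just an illustration in portable C11) of the kind of test-and-set spin-lock an SMP kernel needs for cross-core protection. Every acquire is an atomic read-modify-write with acquire/release ordering, which costs a locked bus or cache transaction; on a single core, the same protection is typically just a cheap interrupt disable with no atomic traffic at all, which is where the extra per-operation cost comes from.

```c
#include <stdatomic.h>

/* Illustrative SMP-style spin-lock. On one core an RTOS can protect
 * the same data by briefly disabling interrupts; with multiple cores
 * it must fall back to atomic operations like this. */
typedef struct {
    atomic_flag flag;
} spinlock_t;

static void spin_lock(spinlock_t *l)
{
    /* Atomic read-modify-write: each attempt is a locked transaction,
     * and under contention other cores burn cycles spinning here. */
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}
```

This is the uncontended cost; if several cores fight over the same lock, the spinning itself adds further overhead, which is the contention case mentioned below.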

While I can imagine cases where the overhead gets so severe that a two-core SMP system is slower than an equivalent single-core system, it would be a very contrived case, one that shouldn’t have been considering multi-core in the first place. Almost all of the overhead occurs only when interacting with the OS, not when doing “real” work, so the problem cases are mostly those where you do too little work in one place and keep passing it from actor to actor.

Most of the overhead should be pretty small in a properly designed system, unless you hit a lot of contention for the system resources that need those locks.

So I have a scenario where there are a lot of tasks that are responsible for yielding on their own (calling taskYIELD() at the end of their routine), because we are using co-operative scheduling (no preemption). If some of the tasks aren’t doing much and yield quickly/often, causing context switches, would this cause a perf drop? I noticed that context switching appears to be more expensive (more CPU cycles required) in multi-core compared to single-core. Also, when you say pass it from actor to actor, I assume you mean context switching?

Using taskYIELD() is not a great way to distribute work, as you waste time switching to tasks that have nothing to do. Rather than yielding, a task should block waiting for its next piece of work.

One of the biggest limitations of just yielding, and then yielding again if you have nothing more to do, is that such a task really needs to run at priority 0 (the idle priority); otherwise, lower-priority tasks can never get CPU time.

Tasks really should be divided into groups by how important it is for them to get their work done quickly, so the scheduler actually has something to go on.


Okay, yeah, that makes sense, thank you!