SMP Performance Benchmarking

lock contention, and yes granular locks should fix the problem.

Great - thank you for explaining!

The problem is that the critical section in the mutex doesn't just protect the mutex's own resources; it also manipulates the state of the scheduler, so there is contention.

Yes, taking a more granular lock first, and only then checking whether the more global lock is needed, might improve SMP performance, but at the cost of single-core performance or of needing two different algorithms in the code.
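
To make the trade-off concrete, here is a rough sketch of the "granular lock first, global lock only if needed" idea. It is not the kernel's actual code: every name in it (`spinlock_t`, `scheduler_lock`, `sketch_mutex_t`, `sketch_mutex_try_take`, `sketch_mutex_give`) is made up for illustration, the locks are plain C11 spinlocks, and "blocking" is only modelled by counting a waiter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock built on C11 atomics; a real port would use its own primitive. */
typedef struct { atomic_flag flag; } spinlock_t;

static void spin_lock(spinlock_t *l) {
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire)) { /* spin */ }
}

static void spin_unlock(spinlock_t *l) {
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/* Stand-in for the global lock that protects scheduler state. */
static spinlock_t scheduler_lock = { ATOMIC_FLAG_INIT };
static unsigned scheduler_lock_acquisitions; /* illustration only: counts slow-path entries */

typedef struct {
    spinlock_t lock; /* granular lock: protects only this mutex's fields */
    bool held;
    int waiters;     /* stand-in for a list of blocked tasks */
} sketch_mutex_t;

#define SKETCH_MUTEX_INIT { { ATOMIC_FLAG_INIT }, false, 0 }

/* Returns true if the mutex was taken, false if the caller would have to block. */
bool sketch_mutex_try_take(sketch_mutex_t *m) {
    /* Fast path: only the per-mutex lock, no scheduler state touched. */
    spin_lock(&m->lock);
    bool taken = !m->held;
    if (taken) {
        m->held = true;
    }
    spin_unlock(&m->lock);
    if (taken) {
        return true;
    }

    /* Slow path: blocking changes scheduler state, so take the global lock,
       then re-check because the mutex may have been given in the meantime. */
    spin_lock(&scheduler_lock);
    scheduler_lock_acquisitions++;
    spin_lock(&m->lock);
    taken = !m->held;
    if (taken) {
        m->held = true;
    } else {
        m->waiters++; /* a real kernel would move the task to a blocked list here */
    }
    spin_unlock(&m->lock);
    spin_unlock(&scheduler_lock);
    return taken;
}

/* Returns true if a waiter exists, i.e. the caller would need the scheduler lock to wake it. */
bool sketch_mutex_give(sketch_mutex_t *m) {
    spin_lock(&m->lock);
    assert(m->held);
    m->held = false;
    bool must_wake = m->waiters > 0;
    spin_unlock(&m->lock);
    return must_wake;
}
```

In the uncontended case, a take never touches `scheduler_lock`, which is where the SMP gain would come from. The cost is also visible: the contended path now takes two locks and checks `held` twice, which is the extra single-core work (or the second algorithm) mentioned above.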