I am currently experimenting with mutexes in FreeRTOS 8.2.0 on a Cortex-M4 based (168 MHz) STM32F4-Discovery board, using the GNU C compiler (gcc-arm) with optimisation level 2.
I am interested in the performance of mutexes; in particular, I want to know how quickly they can resolve a priority inversion. To measure the time FreeRTOS needs to detect and fix priority inversion, I came up with the following scenario:
There are three tasks - LowPriorityTask (LP), MiddlePriorityTask (MP) and HighPriorityTask (HP) - and a mutex. MP and HP start out sleeping. LP takes the mutex and wakes up HP, so LP is preempted by HP. HP then starts the time measurement, attempts to take the mutex, fails, and enters the blocked state until the mutex becomes available. LP therefore continues to execute and wakes up MP. In theory LP would now be preempted by MP, which would delay HP. The RTOS detects this situation and raises the priority of LP so that it can keep running until it gives the mutex. Once that has happened, HP continues executing and takes the mutex itself, at which point the time measurement stops.
The same scenario in pseudocode:
    function LowPriorityTask
        while true do
            Take_Semaphore
            Wakeup_HighPriorityTask
            Wakeup_MiddlePriorityTask
            Give_Semaphore
        end while
    end function

    function MiddlePriorityTask
        while true do
            DO_SOMETHING
        end while
    end function

    function HighPriorityTask
        while true do
            START_MEASUREMENT
            Take_Semaphore
            END_MEASUREMENT
        end while
    end function
The time is measured by toggling a GPIO pin before and after the test sequence and checking the output on an oscilloscope.
Everything works as expected, but I have noticed one thing that I do not understand:
If I increase the stack size of the three tasks, the performance improves. For example, with a stack size of 128 kB the measured time is consistently 31.3 µs, but with a stack size of 512 kB it is 30.8 µs.
Does anyone have an idea why increasing the stack size of the tasks improves the performance in this scenario? If it helps, I can also supply the actual code that I used.
I am looking forward to reading your responses!