I’m trying to understand when exactly to use a semaphore instead of a mutex in an RTOS scenario. Let me explain my case:
I have four periodic tasks running with periods of 250 ms, 500 ms, 720 ms, and 1000 ms. All of these tasks run at the same priority, and they share a common resource, a UART, for data transmission. Now, at every 1-second mark, all four tasks become ready at the same time, so there’s a chance they’ll all try to access the UART more or less together.
In this case, should I use a mutex or a semaphore to protect the UART? Both a mutex and a semaphore can be used to protect a resource. I’m confused about when exactly a semaphore should be used instead of a mutex in such scenarios.
Can someone explain in simple terms, using this UART + periodic tasks scenario, when a semaphore makes sense and when a mutex is the right choice?
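A side note on the timing in the question: assuming all four tasks are first released together at t = 0, the 720 ms task does not line up with the 1-second marks. Only the 250 ms, 500 ms, and 1000 ms tasks coincide every second; all four coincide at multiples of the least common multiple of the periods. Here is a minimal sketch to check that arithmetic. The helper names (gcd_u32, hyperperiod_ms, releases_at) are made up for illustration and are not part of any RTOS API.

```c
#include <stddef.h>
#include <stdint.h>

/* Greatest common divisor (Euclid's algorithm). */
uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple of all periods: the interval after which the
 * release pattern of the whole task set repeats. */
uint32_t hyperperiod_ms(const uint32_t *periods, size_t n)
{
    uint32_t l = 1;
    for (size_t i = 0; i < n; i++) {
        l = (l / gcd_u32(l, periods[i])) * periods[i];
    }
    return l;
}

/* Number of tasks released exactly at time t_ms.
 * Assumption: every task has its first release at t = 0. */
unsigned releases_at(const uint32_t *periods, size_t n, uint32_t t_ms)
{
    unsigned count = 0;
    for (size_t i = 0; i < n; i++) {
        if (t_ms % periods[i] == 0) {
            count++;
        }
    }
    return count;
}
```

With periods {250, 500, 720, 1000}, hyperperiod_ms gives 18000, so a four-way collision happens once every 18 s, while three tasks collide at each 1-second mark. Either way the UART still needs protection, since even two tasks colliding is enough to garble the output.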
Mutex (short for mutual exclusion): Provides mutual exclusion for a single shared resource. Only one task can hold it at a time.
Semaphore: Controls access to a pool of identical resources. Can allow multiple tasks to proceed simultaneously.
Now Binary Semaphores provide a similar overall behavior to Mutexes but do not contain a priority inheritance mechanism (this doesn’t matter for your use case for now since all tasks have the same priority). I’d recommend reading more here and following the supplied guidance…
Binary semaphores and mutexes are very similar but have some subtle differences: Mutexes include a priority inheritance mechanism, binary semaphores do not. This makes binary semaphores the better choice for implementing synchronisation (between tasks or between tasks and an interrupt), and mutexes the better choice for implementing simple mutual exclusion.
The way I think about it is that if the ONLY access method is that a TASK will acquire the lock, use the resource, then give up the lock, then you can, and should, use a mutex. Using a mutex breaks up the priority inversion problem that occurs with a semaphore, where a middle-priority task can keep a low-priority task that holds the lock from running, even if a higher-priority task wants it. With a mutex this cannot happen, because the higher-priority task blocking on the mutex will temporarily raise the priority of the low-priority task till it gives up the mutex.
Semaphores can’t do that because they don’t have the acquisition/release pattern, but are often used by one task to mark that something is available, and another task takes it.
We have had this many, many times before: IF you insist on accessing a UART from multiple tasks, use mutexes; that is what they are for.
But keep in mind that this control flow normally is a poor model of what a UART is about. As the name implies, a UART is an inherently serialized device, and the better way to model this is through a single task that services the device, itself being presented the data to transmit/receive by client tasks via IPC (typically queues). That makes mutual exclusion unnecessary. Everything else is but a Pandora’s box of problems.
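To make the single-task design concrete, here is a minimal host-side sketch of it. Everything in it is an assumption for illustration: uart_queue_t is a plain ring buffer standing in for the kernel's queue primitive, uart_post is what a client task would call, and uart_service is what the body of the dedicated UART task would do, with a character buffer standing in for the wire. Unlike a real RTOS queue, this toy queue is not safe against concurrent callers.

```c
#include <stddef.h>
#include <string.h>

#define MSG_LEN   32
#define QUEUE_LEN 8

/* Hypothetical fixed-size message queue, standing in for an RTOS queue. */
typedef struct {
    char   slots[QUEUE_LEN][MSG_LEN];
    size_t head;   /* index of the oldest message  */
    size_t count;  /* number of messages waiting   */
} uart_queue_t;

/* Called by client tasks: copy one message into the queue.
 * Returns 0 on success, -1 if the queue is full.
 * Messages longer than MSG_LEN - 1 characters are truncated. */
int uart_post(uart_queue_t *q, const char *msg)
{
    if (q->count == QUEUE_LEN) {
        return -1;
    }
    size_t tail = (q->head + q->count) % QUEUE_LEN;
    size_t len = strlen(msg);
    if (len > MSG_LEN - 1) {
        len = MSG_LEN - 1;
    }
    memcpy(q->slots[tail], msg, len);
    q->slots[tail][len] = '\0';
    q->count++;
    return 0;
}

/* What the single UART task would do: send queued messages one at a time,
 * in arrival order. `wire` must hold a NUL-terminated string on entry and
 * stands in for the UART. Returns the number of messages sent. */
size_t uart_service(uart_queue_t *q, char *wire, size_t wire_size)
{
    size_t sent = 0;
    while (q->count > 0) {
        const char *msg = q->slots[q->head];
        size_t len = strlen(msg);
        size_t used = strlen(wire);
        if (used + len + 1 > wire_size) {
            break;  /* no room left on the simulated wire */
        }
        memcpy(wire + used, msg, len + 1);
        q->head = (q->head + 1) % QUEUE_LEN;
        q->count--;
        sent++;
    }
    return sent;
}
```

Because only uart_service ever touches the wire, whole messages can never interleave, and no client ever holds a lock on the port. The cost is that messages go out strictly in arrival order, regardless of which task posted them.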
Most likely the protocol you are supposed to implement is a half duplex/response-request protocol where some entity of your system sends a request and immediately afterwards waits for a response from the peer. In that case it seems natural to “atomize” each transaction via a mutex and let each of several tasks process their respective transactions concurrently.
I promise you that you will regret that design. It bears many many pitfalls.
Oh no, not at all. I may not have been clear enough, even though I wrote “it seems natural… I promise you that you will regret that design.”
Thanks for pointing out a potential unclarity - I just meant to point out that on first glance, a protocol like that (unlike other protocols) may make it look like a feasible implementation to concurrently access the serial device from multiple threads. It only looks like that, but it is not (quoting myself “it bears many pitfalls”).
@RAc, if I’m reading you correctly, you’re recommending controlling the UART with a single dedicated task which will perform all UART transmissions and process all UART receptions. Other tasks which need UART interactions should request them through some kind of queue, stream buffer, or message buffer. Is that your recommendation?
Well, it is the only design that has proven sturdy and reliable in the many years I have coded protocol drivers, and it is also generic enough to serve different types of higher-level serial protocols.
And I have found it inadequate for many purposes, as either you limit your inter-task communication to not queuing up requests, or your “driver” will violate the principle of priority.
Using a mutex to get access to the communication port at least gives the next operation to the highest-priority task.
I find most usages of a serial port can be put into 3 categories:
Output stream: Multiple tasks want to write status messages. Best handled by a queue or stream buffer with a mutex input guard to let a task put its message into the stream.
Query Response Device: Multiple tasks arbitrate with a mutex to send a query to the device, get an answer, and then release the device. I normally have a core module, NOT a task, that different tasks call to make their requests, and internally the module uses the mutex and the serial port.
Asynchronous Device: Might send some messages without a specific query, or can queue up multiple requests before answering the first. THIS case needs a separate task. Until you reach the point of the device either giving unprompted messages or being able to handle multiple requests, so that the responses need to be sorted out, you don’t need the separate task on the serial receiver. The transmitter side may still work sometimes in the caller’s task with a mutex guard over internal data. The driver is still a separate module that has a task within it as mostly a private implementation detail.
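For the second category (query/response), the "module, not a task" idea can be sketched roughly like this. It is a host-side stand-in, not RTOS code: a POSIX pthread mutex plays the role of the RTOS mutex, and fake_device_exchange is a made-up device that answers every query with "ACK:" plus the query text. In real code that function would transmit on the UART and then wait for the reply.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the RTOS mutex that guards the serial port. */
static pthread_mutex_t port_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical device: echoes the query back with an "ACK:" prefix.
 * Returns 0 on success, -1 if the response buffer is too small. */
static int fake_device_exchange(const char *query, char *resp, size_t resp_size)
{
    int n = snprintf(resp, resp_size, "ACK:%s", query);
    return (n >= 0 && (size_t)n < resp_size) ? 0 : -1;
}

/* Module entry point, callable from any task. The mutex is held for the
 * whole transaction (send query, then receive answer), so another task's
 * query cannot slip in between a request and its response. */
int device_query(const char *query, char *resp, size_t resp_size)
{
    int rc;

    pthread_mutex_lock(&port_lock);
    rc = fake_device_exchange(query, resp, resp_size);
    pthread_mutex_unlock(&port_lock);

    return rc;
}
```

The caller runs the transaction in its own task context, so with a priority-aware mutex the next transaction goes to the highest-priority waiter rather than to whoever queued first.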
Though good advice has already been given, I will chime in with a few rules of thumb that I have about using Mutex vs. Semaphore:
If the thread will perform a blocking operation while holding the resource, use a semaphore
If the operations performed while holding the resource are asynchronous, such as sending multiple characters over a serial interface, use a semaphore which is released after the operation has been completed
If none of the above apply, a mutex may be a better choice for simple cases.
Typically, I only use mutexes to prevent concurrency problems when updating shared data structures, registers, or other direct updates, and semaphores for most other things.
I have done the separate serial output task thing many times, and endorse that solution for anything more complex than sending status out a common serial port, however for debugging it’s often overkill.
When it’s a bidirectional protocol, I always use a separate task that handles requests from the business logic tasks, with the request including “send this, put the response here, call this function when done or error” sort of information, but the UART task has to be written to understand the protocol(s) being used. When doing multi-drop RS485, this is especially useful. I have also used this technique for I2C, with different tasks responsible for different I2C-connected devices (I/O expanders, sensors, etc.).
Suppose we have a low-priority task that is currently running and has already locked a mutex to access a shared resource. While it is still using the resource, a high-priority task becomes ready and also needs the same resource. Normally a high-priority task would preempt the low-priority one, but since the mutex is already locked, the high-priority task cannot continue. It will remain blocked until the low-priority task finishes and releases the mutex.
Exactly. And due to the priority inheritance feature of a mutex the low prio task inherits the priority of the blocked higher prio task until it has finished its work on the locked resource and releases the mutex. After that the low prio task is reset back to its normal, lower prio.
That’s needed to avoid or handle the so-called priority inversion problem.
Yes, and as Hartmut mentioned, the mutex avoids the priority inversion problem, in which a task with a priority between that of the waiting high-priority task and that of the low-priority task that needs to finish uses the CPU and thereby delays the high-priority task.
With a semaphore, that middle priority task will be the highest priority task that is ready, and thus will be run. With a Mutex, when the high priority task blocks on the Mutex, the low priority task “inherits” (temporarily) its priority, so it will be higher than the middle priority task, and continues on to completion, and when it releases the mutex, it will be returned to its lower priority and the high priority task resumed.
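The difference described above can be shown with a toy scheduler model. This is not kernel code: the types task_t and lock_t and all function names are invented for the sketch, the lock supports only a single waiter, and the "scheduler" is just a function that picks the ready task with the highest effective priority. The inherit flag switches between mutex-like behaviour (1) and semaphore-like behaviour (0).

```c
#include <stddef.h>

typedef struct {
    int base_prio;  /* priority assigned at creation            */
    int prio;       /* effective priority seen by the scheduler */
    int ready;      /* 1 = runnable, 0 = blocked                */
} task_t;

typedef struct {
    task_t *owner;   /* current holder, NULL if free               */
    task_t *waiter;  /* one blocked waiter (enough for this model) */
    int     inherit; /* 1 = mutex-like, 0 = semaphore-like         */
} lock_t;

/* Scheduler model: highest effective priority among the ready tasks. */
task_t *pick_next(task_t *tasks[], size_t n)
{
    task_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i]->ready && (best == NULL || tasks[i]->prio > best->prio)) {
            best = tasks[i];
        }
    }
    return best;
}

/* Task t tries to take the lock. Returns 1 if acquired, 0 if t blocks.
 * With inheritance enabled, a blocked higher-priority waiter lends its
 * priority to the current owner. */
int lock_take(lock_t *l, task_t *t)
{
    if (l->owner == NULL) {
        l->owner = t;
        return 1;
    }
    t->ready = 0;
    l->waiter = t;
    if (l->inherit && l->owner->prio < t->prio) {
        l->owner->prio = t->prio;
    }
    return 0;
}

/* Owner releases: drop any inherited priority, hand the lock to the waiter. */
void lock_give(lock_t *l)
{
    l->owner->prio = l->owner->base_prio;
    l->owner = l->waiter;
    l->waiter = NULL;
    if (l->owner != NULL) {
        l->owner->ready = 1;
    }
}

/* Scenario from the posts: low holds the lock, high blocks on it, and
 * middle is ready. Returns the base priority of the task that runs next. */
int scenario_next_base_prio(int inherit)
{
    task_t lo = {1, 1, 1}, mid = {2, 2, 1}, hi = {3, 3, 1};
    task_t *all[] = {&lo, &mid, &hi};
    lock_t lock = {NULL, NULL, inherit};

    lock_take(&lock, &lo);  /* low-priority task gets the resource first */
    lock_take(&lock, &hi);  /* high-priority task preempts, then blocks  */
    return pick_next(all, 3)->base_prio;
}
```

In this model scenario_next_base_prio(0) returns 2: the middle task runs and the high-priority task waits behind it, which is the inversion. scenario_next_base_prio(1) returns 1: the low-priority holder runs at the borrowed priority, finishes, and releases the lock.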
Semaphores (binary or not) are meant to be used for inter-task synchronization, and mutexes are meant to be used for mutual exclusion. Chapter 7 of the book I shared above provides good details.
My rule is that a semaphore is only used when you can’t use a mutex because the giving task/ISR won’t be the task that previously took it.
My biggest usage is in drivers, where an ISR gives the semaphore to indicate an I/O operation is done, and the task no longer needs to wait for the results. This can’t use a Mutex, since you can’t give a mutex in an ISR.
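The ownership point is the crux, so here is a toy model of it. It is only a sketch of the idea and does not reproduce any kernel's behaviour: model_sem_t is a binary semaphore with no notion of an owner, model_mutex_t remembers who locked it, and uart_tx_done_isr is a hypothetical handler name. The take and lock calls are non-blocking so the model runs without a scheduler.

```c
#include <stddef.h>

/* Binary semaphore model: a flag with no owner. */
typedef struct {
    int available;
} model_sem_t;

/* Mutex model: remembers which context locked it (NULL = free). */
typedef struct {
    const void *owner;
} model_mutex_t;

/* Give from any context, task or ISR. Saturates at 1 (binary). */
void model_sem_give(model_sem_t *s)
{
    s->available = 1;
}

/* Non-blocking take: returns 1 if an event was pending, 0 otherwise. */
int model_sem_try_take(model_sem_t *s)
{
    if (!s->available) {
        return 0;
    }
    s->available = 0;
    return 1;
}

/* Non-blocking lock: returns 1 if `who` now owns the mutex, 0 if busy. */
int model_mutex_try_lock(model_mutex_t *m, const void *who)
{
    if (m->owner != NULL) {
        return 0;
    }
    m->owner = who;
    return 1;
}

/* Only the owner may unlock. Returns 0 on success, -1 otherwise. */
int model_mutex_unlock(model_mutex_t *m, const void *who)
{
    if (m->owner != who) {
        return -1;
    }
    m->owner = NULL;
    return 0;
}

/* Hypothetical "transmit complete" interrupt handler: it never took the
 * semaphore, it only gives it to wake the waiting task. */
void uart_tx_done_isr(model_sem_t *tx_done)
{
    model_sem_give(tx_done);
}
```

The driver task would start a transfer and then take tx_done; the ISR gives it when the hardware finishes. A mutex does not fit that shape, because the context that releases it is never the one that acquired it.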