Mutex problem

alan_rouse wrote on Saturday, August 11, 2012:

I have an application that uses numerous I/O devices: 3 USARTs, I2C, USB and SPI.  I have developed a common channel through which all I/O is directed:
Call Open(Stream)
If writing, call putch(Char) as many times as required
If reading, call getch() as many times as required
Call Close().

Open() sets a variable that is used by putch() and getch() to define the port to be used.  This generally sends data to or from a buffer.

If reading, Close() simply releases the port.
If writing, Close() initialises the relevant driver then enables an interrupt so the data in the buffer is sent on the relevant port. It then waits until the data has been sent before releasing the port.  To detect when the data has been sent, Close() uses xSemaphoreTake() and the interrupt routines use xSemaphoreGiveFromISR() to synchronise.

So far so good.  However, I also need mutual exclusion to ensure two tasks cannot simultaneously access the resource.  This is complicated by a number of considerations:

1) At any given time only one task can be writing.
2) At any given time only one task can be reading.
3) It is permissible for a task to simultaneously read from one port and write to another, or for one task to be reading while another is writing.
4) It is not possible to simultaneously read from and write the I2C, SPI or USB ports.

To address this I have implemented a Session mutex for each port, a Write mutex and a Read Mutex.  Open() will therefore take two mutexes, a Session mutex and either the Write mutex or the Read mutex.

However, I find that very occasionally (maybe after several hours of running) the system hangs at the dreaded FreeRTOS routine:

		/* *** NOTE ***********************************************************
		If you find your application is crashing here then likely causes are:
			1) Stack overflow -
			2) Incorrect interrupt priority assignment, especially on Cortex-M3
			   parts where numerically high priority values denote low actual
			   interrupt priories, which can seem counter intuitive.  See
			3) Calling an API function from within a critical section or when
			   the scheduler is suspended.
			4) Using a queue or semaphore before it has been initialised or
			   before the scheduler has been started (are interrupts firing
			   before vTaskStartScheduler() has been called?).
		See for more tips.
		**********************************************************************/
		for( pxIterator = ( xListItem * ) &( pxList->xListEnd ); pxIterator->pxNext->xItemValue <= xValueOfInsertion; pxIterator = pxIterator->pxNext )
		{
			/* There is nothing to do here, we are just iterating to the
			wanted insertion position. */
		}

This always happens while the Close() routine is waiting for the interrupt semaphore to indicate that a message has been sent.  It appears that interrupts are disabled, since the system hangs midway through the send process, but I can’t see anywhere in my code that would do this.

I have just seen that the API function manual description of vSemaphoreCreateBinary() states:

A fast and simple priority inheritance mechanism is implemented that assumes a task will only ever hold a single mutex at any one time.

Could the problem be caused by the fact that in my application a task holds two mutexes, a Session mutex and a Put/Get mutex?

rtel wrote on Saturday, August 11, 2012:

It is possible, but unlikely.  The priority inheritance mechanism does not stack inherited priorities.  For example, if your priority 1 task takes mutex A, then takes mutex B, then because it is holding mutex A it gets promoted to priority 2, then because it is holding mutex B it gets promoted to priority B, when it gives back mutex A it will get demoted back to priority 1 even though it still holds mutex B.  That might result in priority inversion in your case, but it would not result in a crash (deadlock maybe, but that is not what you are seeing I don’t think).


alan_rouse wrote on Saturday, August 11, 2012:


Thanks for your prompt reply.  I am a bit confused about the terminology Priority 1, Priority 2 and Priority B. In my application a task will take a Session mutex, then the Put/Get mutex. It will then release the Put/Get mutex, then release the Session mutex.  Please confirm that this is the right interpretation:

1) Task1 takes a Session mutex, then takes, say, the Put mutex and runs at Task1 Priority.
2) Task2 attempts to take the same Session mutex, so is blocked.  Task1 is promoted to (Task2 Priority + 1)
3) Task3 takes a different Session mutex then attempts to take the Put mutex, so is blocked.  Task1 is promoted to (Task3 Priority + 1).
4) Task1 releases the Put mutex, so its original priority is restored. Does it become Task1 priority or (Task2 Priority + 1)?
5) Task3 is unblocked so it takes the Put mutex and runs at Task3 Priority.
6) Task1 releases the Session mutex, so its priority is restored to Task1 Priority.
7) Task2 is unblocked so it takes the Session mutex.  It then attempts to take the Put mutex so is blocked by Task3. Task3 is promoted to (Task2 Priority + 1)
8) Task3 releases the Put mutex, so its priority is restored to Task3 Priority.
9) Task2 is unblocked so it takes the Put mutex and runs at Task2 Priority.
10) Task2 releases the Put mutex, then releases the Session mutex.

The area of uncertainty is in (4).  Is there any possibility that this could induce a system crash?


richard_damon wrote on Saturday, August 11, 2012:

A few comments.
In 2, Task 1 is promoted to Task2Priority, not Task2Priority+1. Task2Priority+1 may not be a valid priority, and Task2Priority is good enough: since Task2 will be put into a blocked state, Task1 will now run (as much as Task2 would have been able to) to finish its use of the mutex. The same holds for 3.

In 4, Task1 will be dropped to its original priority as FreeRTOS does not have a stack for priority inheritance. This can have some problems with priority inversion as a Task 4 with a priority between Task1 and Task2 can delay Task2, even though it has a lower priority.
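The non-stacking behaviour described above can be modelled in a few lines of plain C. This is only a sketch of the rule as stated in this thread, not FreeRTOS code: the names task_model_t, mutex_model_t, model_take(), model_block_on() and model_give() are hypothetical, and other FreeRTOS versions may behave differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a task: only the two priorities matter here. */
typedef struct {
    int base_priority;    /* priority assigned when the task was created */
    int priority;         /* priority the scheduler currently uses */
} task_model_t;

/* Hypothetical model of a mutex: it only remembers who holds it. */
typedef struct {
    task_model_t *holder; /* NULL when the mutex is free */
} mutex_model_t;

/* The task takes a free mutex. */
static void model_take(mutex_model_t *m, task_model_t *t)
{
    assert(m->holder == NULL);
    m->holder = t;
}

/* A waiter blocks on a held mutex: the holder inherits the waiter's
   priority if that is higher (the same priority, not priority + 1). */
static void model_block_on(mutex_model_t *m, const task_model_t *waiter)
{
    assert(m->holder != NULL);
    if (waiter->priority > m->holder->priority) {
        m->holder->priority = waiter->priority;
    }
}

/* Giving back ANY mutex restores the base priority, because there is
   no stack of inherited priorities in this model. */
static void model_give(mutex_model_t *m)
{
    assert(m->holder != NULL);
    m->holder->priority = m->holder->base_priority;
    m->holder = NULL;
}
```

With Task1 at priority 1 holding both a Session and a Put mutex, Task2 (priority 2) blocking on Session and Task3 (priority 3) blocking on Put, giving back Put drops Task1 straight to priority 1 while it still holds Session. That is the window in which a medium-priority task can delay Task2.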

None of these actions should cause a program crash/hang. The sort of hang you seem to describe is normally due to something corrupting the FreeRTOS data structures.

I am also not sure what the purpose of your “session” mutexes is. A task needs to claim exclusive use of the read or write channel of a serial port, or the whole of another type of port, but that should be enough, as you specifically allow one task to read and a different task to write to a serial port.

alan_rouse wrote on Saturday, August 11, 2012:

I understand your explanation.  It sounds as though the problem lies elsewhere, but it is going to be difficult to trace, particularly since the problem occurs so infrequently.

By the way, the reason for session mutexes was explained in my original post.  Although one task can be reading while another is writing, this is not allowed for the USB, SPI or I2C ports, so I use session mutexes on these.

richard_damon wrote on Saturday, August 11, 2012:

So you need a “session” mutex (guarding both read and write for a device) OR a “read” mutex or a “write” mutex. Each “Channel” has just one mutex. Serial ports support 2 channels, one read and one write; other devices just have a single “Channel” used for both types. No need to have two mutexes.

Unless a “Session” is some other scarce resource, not yet described, that needs to be shared to access the devices, I don’t see the need for a second mutex.
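As a rough illustration of the one-mutex-per-channel idea, the mapping from (stream, direction) to a single lock could look like the sketch below. The stream_t and dir_t enums, is_full_duplex() and channel_lock_index() are hypothetical names invented for this example, and it does not cover any additional shared resource that might require a second mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stream identifiers; the thread only names the device types. */
typedef enum {
    STREAM_USART1, STREAM_USART2, STREAM_USART3,
    STREAM_I2C, STREAM_SPI, STREAM_USB,
    STREAM_COUNT
} stream_t;

typedef enum { DIR_READ, DIR_WRITE } dir_t;

/* Serial ports may be read and written at the same time; the others may not. */
static bool is_full_duplex(stream_t s)
{
    return s == STREAM_USART1 || s == STREAM_USART2 || s == STREAM_USART3;
}

/* Return the index of the single lock that guards (stream, direction).
   Two slots are reserved per stream: slot 2*s and slot 2*s + 1.
   A serial port uses one slot per direction; a half-duplex device
   uses slot 2*s for both directions, so reads and writes exclude
   each other. */
static int channel_lock_index(stream_t s, dir_t d)
{
    assert((int)s >= 0 && (int)s < (int)STREAM_COUNT);
    if (is_full_duplex(s) && d == DIR_WRITE) {
        return 2 * (int)s + 1;
    }
    return 2 * (int)s;
}
```

Open() would then take only the mutex stored at that index in an array of 2 * STREAM_COUNT mutexes, and Close() would give it back, so no task ever holds two mutexes for a single transfer.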

alan_rouse wrote on Monday, August 13, 2012:

Thanks for the help.  There are other considerations that mean two mutexes have to be taken when accessing certain I/O streams, particularly SPI.  However, I reviewed my logic and realised I could make some simplifications. Unfortunately the problem persisted.

I eventually traced it to an interrupt priority issue.  One of my I/O routines was running at a priority higher than configMAX_SYSCALL_INTERRUPT_PRIORITY.  I was therefore wrong in suspecting that the problem was due to mutexes.
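For anyone hitting the same hang, the rule can be written down as a small check. This is a sketch that assumes a Cortex-M style port, where numerically higher values mean logically lower priority and the implemented priority bits sit at the top of an 8-bit field; the helper names raw_priority() and isr_may_call_from_isr_api() are made up for illustration and are not FreeRTOS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a priority level (0 = most urgent) into the raw 8-bit value,
   assuming the implemented bits are the most significant ones. */
static uint8_t raw_priority(unsigned level, unsigned prio_bits)
{
    assert(prio_bits >= 1u && prio_bits <= 8u);
    assert(level < (1u << prio_bits));
    return (uint8_t)(level << (8u - prio_bits));
}

/* An ISR may use the ...FromISR() API only if its priority is logically
   at or below configMAX_SYSCALL_INTERRUPT_PRIORITY. With the assumed
   encoding that means its raw value is numerically greater than or
   equal to the configured raw value. */
static bool isr_may_call_from_isr_api(uint8_t isr_raw, uint8_t max_syscall_raw)
{
    return isr_raw >= max_syscall_raw;
}
```

For example, with 4 priority bits and configMAX_SYSCALL_INTERRUPT_PRIORITY set to the raw value 0x50 (level 5), an ISR at level 3 is logically above the limit and must not call xSemaphoreGiveFromISR(), while an ISR at level 5 or numerically above may.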

The exercise has at least furthered my understanding of how FreeRTOS works, even though it has been frustrating! 

Thanks again for your help.