Safe passing of buffers through a queue

anguel78 wrote on Monday, March 25, 2013:


In all FreeRTOS manuals I read that passing large buffers by reference (pointers) through a queue is common practice. But I have not seen a good explanation of how to protect the buffer contents.

Let’s assume that  I have a few producer tasks and one consumer task that communicate through a queue. A global buffer must be filled appropriately by each producer task and then consumed by the consumer task. So I will pass the pointer to the buffer through the queue.

Now I wonder what is the best way to protect the buffer so that it can be safely shared between the producer tasks and the consumer task. Should I use semaphores? If so where and how many? And do I have to limit the queue size to one message to prevent corruption of the buffer?

Thank you in advance,


rtel wrote on Monday, March 25, 2013:

The FreeRTOS+UDP code can be used as an example as it passes network buffers up and down the stack and between tasks by passing references to the buffers on queues.

There are several different approaches that you can take.  The critical thing is to ensure that only one task has ‘ownership’ of the buffer at any one time. 

For example, if Task A obtains a buffer (from a pool, or dynamically, an array, wherever) then it ‘owns’ the buffer.  How that buffer is marked as in use (so no other tasks obtain the same buffer) is up to you.  If the buffer is allocated dynamically (using heap_4.c for example, as that does not suffer so much from memory fragmentation) then the task that allocates the buffer can use it quite safely without interference from other tasks because no other tasks know of its existence anyway.

If Task A then decides it no longer needs the buffer then, because it owns it, Task A is also responsible for returning it.

If Task A fills the buffer with data that needs processing by another task, it can pass a pointer to the buffer on a queue.  At that point, Task A no longer owns the buffer and must not modify it or deallocate it in any way.

If Task B receives a pointer to the buffer from the queue then Task B owns the buffer.  It can do what it likes with it, but once it has finished with it, it must either return/deallocate it (to avoid memory leaks) or pass it on to another task.

When the buffer is deallocated it is either freed (if it was dynamically allocated in the first place), or in some other way marked as available again (if it is returned to a pool of buffers).
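The ownership hand-off described above could be sketched roughly as below. This is a minimal illustration, not code from FreeRTOS+UDP; the queue handle, the buffer size, and the two fill/process helpers are assumed names.

```c
#include "FreeRTOS.h"
#include "queue.h"

/* Assumed to be created at start-up as:
 *   xBufferQueue = xQueueCreate( 5, sizeof( uint8_t * ) ); */
extern QueueHandle_t xBufferQueue;

void vFillBufferWithData( uint8_t *pucBuf, size_t xLen );  /* hypothetical */
void vProcessBuffer( const uint8_t *pucBuf, size_t xLen ); /* hypothetical */

void vProducerTask( void *pvParameters )
{
    for( ;; )
    {
        /* Task A owns the buffer from the moment it is allocated... */
        uint8_t *pucBuffer = pvPortMalloc( 128 );

        if( pucBuffer != NULL )
        {
            vFillBufferWithData( pucBuffer, 128 );

            /* ...until the pointer has been queued.  After a successful
             * send the producer must not touch or free the buffer. */
            if( xQueueSend( xBufferQueue, &pucBuffer, pdMS_TO_TICKS( 100 ) ) != pdPASS )
            {
                /* The send timed out, so ownership was never transferred
                 * and the producer must free the buffer itself. */
                vPortFree( pucBuffer );
            }
        }
    }
}

void vConsumerTask( void *pvParameters )
{
    uint8_t *pucBuffer;

    for( ;; )
    {
        /* Receiving the pointer transfers ownership to this task. */
        if( xQueueReceive( xBufferQueue, &pucBuffer, portMAX_DELAY ) == pdPASS )
        {
            vProcessBuffer( pucBuffer, 128 );
            vPortFree( pucBuffer ); /* the owner deallocates */
        }
    }
}
```

Note that the pointer itself (not the buffer contents) is copied into the queue, which is why the item size is `sizeof( uint8_t * )`.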


anguel78 wrote on Monday, March 25, 2013:


Thank you very much for the extremely fast reply. I actually had a look at the FreeRTOS+UDP code but did not really understand what technique it uses to protect the buffers. Probably it uses some of the buffer allocation techniques you described in your reply.

Now in my case I am sending pointers to string literal constants OR a pointer to a single modifiable global buffer in order to save RAM and to prevent the complexities of allocation. I thought to do it this way:
1. Each producer task attempts to send a message containing a buffer pointer to the queue (queue length = 1 guarantees that only one producer will actually succeed).
2. The consumer reads the message and pends on a semaphore to wait for the data to become available.
3. Once a producer’s message has made it into the mailbox, that producer has exclusive access to the buffer (because of queue length = 1) and can fill it with data. As soon as this is done it gives the semaphore to signal to the consumer that data is ready.
4. The consumer now gets the semaphore and reads the data from the buffer and uses it.
5. The consumer again starts waiting on the queue.

I don’t know if there is a more efficient or safer way to do this. Any comments are welcome.


travfrog wrote on Monday, March 25, 2013:

Hi Anguel,

I have a system with similar requirements. I have several tasks that want to pipe data through a common communications link. The producer then has two options: 1) it can block on a semaphore which the consumer will release once it has finished with the data, or 2) it can wait for a reply message from the consumer.

The difference is that the producer either blocks completely or continues to execute if there are other things it can do that do not involve the data. Option 2 is necessary if the producer must continue to process incoming data from elsewhere. Option 1 is the simpler approach.
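Option 2 might look something like the sketch below; the queue names and the helper function are assumptions, not code from a real project.

```c
#include "FreeRTOS.h"
#include "queue.h"

extern QueueHandle_t xToConsumer;  /* carries char *, read by the consumer  */
extern QueueHandle_t xReplyQueue;  /* carries char * back once consumed     */

char *pcPrepareBuffer( void );     /* hypothetical helper */

void vProducerOption2( void *pvParameters )
{
    char *pcBuf;

    for( ;; )
    {
        pcBuf = pcPrepareBuffer();
        xQueueSend( xToConsumer, &pcBuf, portMAX_DELAY );

        /* The producer can keep doing other work here, as long as that
         * work does not touch pcBuf - the consumer owns it for now. */

        /* Reclaim ownership only when the consumer sends its reply. */
        xQueueReceive( xReplyQueue, &pcBuf, portMAX_DELAY );
    }
}
```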

From your description, it appears you only have a single buffer shared between multiple tasks, right? I think a semaphore will do the trick.  Any task wishing to use the buffer must obtain the semaphore first - you might want to block for a set time period to help catch any system-load issues or deadlock conditions.  The queue size on the consumer need not be limited to 1, especially if there are other things it needs to be doing (like processing communication link interrupts).

If you use some form of buffer-in-use indication, make sure the code that marks the buffer as in use is atomic, otherwise you will end up with some hard-to-find corruption bugs.
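A sketch of that semaphore-guarded access (the names and the 100 ms timeout are illustrative, and the mutex is assumed to come from xSemaphoreCreateMutex()):

```c
#include <string.h>
#include "FreeRTOS.h"
#include "semphr.h"

extern SemaphoreHandle_t xBufferMutex; /* from xSemaphoreCreateMutex() */
extern char cSharedBuffer[ 64 ];

BaseType_t xWriteSharedBuffer( const char *pcData, size_t xLen )
{
    /* Block for a bounded time; a timeout here usually points at a
     * system-load or deadlock problem worth flagging. */
    if( xSemaphoreTake( xBufferMutex, pdMS_TO_TICKS( 100 ) ) == pdPASS )
    {
        memcpy( cSharedBuffer, pcData, xLen );
        xSemaphoreGive( xBufferMutex );
        return pdPASS;
    }

    return pdFAIL;
}
```

A mutex (rather than a plain binary semaphore) is the natural choice here because FreeRTOS mutexes include priority inheritance, which limits priority inversion when a low-priority task holds the buffer.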


anguel78 wrote on Monday, March 25, 2013:

Thanks Travfrog,

I will have to think a little more about how to do it. RTOSes look easy at first glance, but they can also introduce some very subtle problems into simple tasks :-(


travfrog wrote on Monday, March 25, 2013:

They absolutely do :-).

richard_damon wrote on Tuesday, March 26, 2013:

A good RTOS should actually help here more than hurt, at least as long as your requirements are indicative of using one (needing rapid responses to some inputs). Any time you have multiple threads of execution accessing common data, there are issues of needing interlocks of some form; this is fundamental. There are three basic ways to meet “real time” timing requirements:

1) Put in a processor so fast that it can meet the requirements with non-pre-emptive code. This will often become expensive and power hungry; if it doesn’t, then this may well be the best choice.

2) Do the critical timing in an interrupt routine, and have a single-threaded main loop handle the rest. This doesn’t need an OS, and limits how much code needs to deal with pre-emption issues, but that code is more difficult to write because you have fewer tools to deal with them, and those tools tend to be broader in effect. Typically, critical sections of code will need to disable interrupts, causing added latency, or critical operations must be done inside an interrupt routine, possibly delaying other critical operations. This may be the solution if you have just a few real-time requirements.

3) Use a Real-Time Operating System to manage your requirements. Interrupt routines exist to field hardware requests, and perhaps handle some of the most critical operations, while tasks handle the rest.

A key aspect that needs to be dealt with is working with shared data. Generally a piece of code should “own” the data that it is using, and not be sharing it at that moment. The one major exception is data so simple that its changes are “atomic” (and you need to be very careful then that all accesses ARE done atomically). An RTOS will provide the tools to allow the various parts of the program to share and take ownership of the pieces of data they need to use.

Where you are seeing “subtle problems caused by RTOSes”, it normally isn’t a problem caused by the RTOS, but by the definition of the requirements of the system, which have defined data shared between two unsynchronized sections. Once the concept of proper ownership of data is put into place, the only problem left is finding the right tool to implement it, which is what the RTOS will (hopefully) provide.

In the case presented, if there is a single buffer, which can be owned, sequentially, by one of several tasks at a time, then semaphores provide a good solution. From the problem statement I would use TWO semaphores: one to grant access to a producer, and one to grant access to the consumer. A producer takes the producer semaphore and, when it has it, fills the buffer with data; no other producer can access the buffer in the meantime, as it is busy. When done, it gives the consumer semaphore. When the consumer gets its semaphore, it can empty the buffer and then give the producer semaphore, letting the next producer work. There is no need for a queue (except in how semaphores are built on queues), as all you are passing between tasks is the fact of availability of THE buffer.
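The two-semaphore scheme could be sketched as below. The names are illustrative; both semaphores are assumed to come from xSemaphoreCreateBinary(), with the producer semaphore given once at start-up because the buffer begins empty.

```c
#include "FreeRTOS.h"
#include "semphr.h"

/* Both created with xSemaphoreCreateBinary().  xProducerSem must be
 * given once at start-up because the buffer begins empty. */
extern SemaphoreHandle_t xProducerSem;
extern SemaphoreHandle_t xConsumerSem;
extern char cSharedBuffer[ 64 ];

void vFillBuffer( char *pcBuf );    /* hypothetical */
void vConsumeBuffer( char *pcBuf ); /* hypothetical */

void vAnyProducerTask( void *pvParameters )
{
    for( ;; )
    {
        xSemaphoreTake( xProducerSem, portMAX_DELAY ); /* wait until the buffer is free */
        vFillBuffer( cSharedBuffer );                  /* exclusive access while filling */
        xSemaphoreGive( xConsumerSem );                /* hand the buffer to the consumer */
    }
}

void vTheConsumerTask( void *pvParameters )
{
    for( ;; )
    {
        xSemaphoreTake( xConsumerSem, portMAX_DELAY ); /* wait for a full buffer */
        vConsumeBuffer( cSharedBuffer );
        xSemaphoreGive( xProducerSem );                /* let the next producer in */
    }
}
```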

A second method, if you want producers not to need to wait to generate data, would be to have a set of buffers. Each producer, when it wants to generate data, gets a buffer from a source (perhaps a queue holding the list of free buffers), fills it with data, and then posts the address of the buffer to a queue for the consumer. The consumer then gets that address, consumes the data, and returns the buffer to the free buffer store.
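That buffer-pool variant might look like the sketch below: free buffers live on one queue, filled buffers travel to the consumer on another. The names, pool size and helper functions are assumptions.

```c
#include "FreeRTOS.h"
#include "queue.h"

#define poolNUM_BUFFERS   4
#define poolBUFFER_SIZE   64

static char cPool[ poolNUM_BUFFERS ][ poolBUFFER_SIZE ];

extern QueueHandle_t xFreeBuffers;   /* xQueueCreate( poolNUM_BUFFERS, sizeof( char * ) ) */
extern QueueHandle_t xFilledBuffers; /* xQueueCreate( poolNUM_BUFFERS, sizeof( char * ) ) */

void vFillBuffer( char *pcBuf );     /* hypothetical */
void vConsumeBuffer( char *pcBuf );  /* hypothetical */

void vInitBufferPool( void )
{
    for( int i = 0; i < poolNUM_BUFFERS; i++ )
    {
        char *pc = cPool[ i ];
        xQueueSend( xFreeBuffers, &pc, 0 ); /* seed the free list */
    }
}

void vPoolProducerTask( void *pvParameters )
{
    char *pc;

    for( ;; )
    {
        xQueueReceive( xFreeBuffers, &pc, portMAX_DELAY );  /* take ownership of a free buffer */
        vFillBuffer( pc );
        xQueueSend( xFilledBuffers, &pc, portMAX_DELAY );   /* pass ownership to the consumer */
    }
}

void vPoolConsumerTask( void *pvParameters )
{
    char *pc;

    for( ;; )
    {
        xQueueReceive( xFilledBuffers, &pc, portMAX_DELAY );
        vConsumeBuffer( pc );
        xQueueSend( xFreeBuffers, &pc, portMAX_DELAY );     /* return the buffer to the pool */
    }
}
```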

In both of these methods a/the buffer is in one of several states: empty/ready for a producer, being filled by a producer, full and waiting to be consumed, or being consumed. In each state its ownership is clear, and so is who can access/modify the data.

anguel78 wrote on Wednesday, March 27, 2013:


Thank you very much for the extensive explanation and the solution presented. I think that this is really a more efficient way than using the queue that I had in mind first. My initial idea was to be able to also pass additional data, like a producer ID, through that queue, but this can probably be easily added inside the semaphore-protected section dealing with the global buffer.

Regarding your comment on atomic operations, if I remember correctly I read somewhere that with modern MCUs it is risky to rely on operations being atomic, as such MCUs use internal caches and pipelines. So I will try to avoid relying on this.

Thank you once again!


richard_damon wrote on Wednesday, March 27, 2013:

When you get to MULTI-CORE processors, you need to start worrying about things like cache coherency, and making sure all processors see the same data. For these sorts of machines you need to include a memory barrier in the instruction sequence to force the processor to get the value from main RAM and put the results back into main RAM atomically too, which will be a slow operation. (Internally the semaphore code will need to do something like this too, as all synchronization must depend on atomic operations - if not at the processor level, at least at the effective execution level.)
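As an illustration of the barrier idea in plain C11 (not a FreeRTOS API - a multi-core port would hide this inside its primitives), a sequentially-consistent atomic read-modify-write performs the update as one indivisible operation with a full barrier:

```c
#include <stdatomic.h>
#include <stdint.h>

/* A plain `counter++` on a shared variable compiles to a separate
 * load, add and store, which another core can interleave with.  The
 * C11 atomic below performs the read-modify-write indivisibly, and
 * memory_order_seq_cst adds the full barrier so every core observes
 * the updates in the same order. */
static atomic_uint_fast32_t ulCounter;

void vIncrementShared( void )
{
    atomic_fetch_add_explicit( &ulCounter, 1, memory_order_seq_cst );
}

uint_fast32_t ulReadShared( void )
{
    return ( uint_fast32_t ) atomic_load( &ulCounter );
}
```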

To my knowledge, FreeRTOS is NOT set up to work on a multi-core processor, as the “disable interrupts” operation is no longer good enough to provide a critical section. (If a port did provide portENTER/EXIT_CRITICAL implementations that worked for a multi-processor it might.)

The step from single core to multi-core is a much bigger step in complexity than from single threaded to multi-threaded (in my opinion), although the OS might be able to hide that complexity.

anguel78 wrote on Friday, March 29, 2013:

The multi-core story sounds scary. For me it is hard to think of all possible scenarios even on a single core with a few tasks doing things in parallel… And besides the complexity introduced by an RTOS, what really concerns me is the time spent synchronizing data. Real-time sounds “very fast” at first, but it turns out that achieving that real-time behaviour actually takes away a lot of CPU cycles. Each time I have to access a simple shared variable from multiple tasks I feel bad about wasting CPU cycles on semaphores or queues. But on the other hand, with today’s fast MCUs we can achieve more flexibility through an RTOS.

richard_damon wrote on Saturday, March 30, 2013:

Actually, if the access to the shared variable is quick enough (within the allowable latency margin on interrupts), then using critical sections (disabling interrupts) to protect it is a much faster option. Real time is NOT about being “very fast” but about managed maximum delays.
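For a short access like a counter update, the critical-section version is just a few lines (the variable and function names below are illustrative):

```c
#include <stdint.h>
#include "FreeRTOS.h"
#include "task.h"

static volatile uint32_t ulSharedCounter;

void vIncrementSharedCounter( void )
{
    /* taskENTER_CRITICAL() disables interrupts (up to the configured
     * priority on ports that support interrupt nesting), so keep the
     * protected region as short as possible. */
    taskENTER_CRITICAL();
    ulSharedCounter++;
    taskEXIT_CRITICAL();
}
```

This costs only the interrupt disable/enable pair, far less than a semaphore take/give, but every cycle spent inside the critical section adds directly to worst-case interrupt latency.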