More efficient queue

paul_piak wrote on Monday, May 26, 2008:


My system receives commands through an RS232 serial port; the commands are passed to the application through a FreeRTOS message queue.

Usually the RS232 messages are more than 1 byte long, typically about 10, but the length is variable up to 61 bytes.
In the current situation, the application task requests 1 byte from the queue (xQueueReceive()) and is put to sleep because the queue is still empty, resulting in a task switch (to, for example, the idle task).
When the RS232 port receives a byte, the interrupt service routine moves this byte to the queue (xQueueSendFromISR()). The kernel discovers that a task is waiting on this queue for the byte, and this task is woken. This causes another task switch from the idle task back to my application task.
However, only 1 byte does not make a full command. The application task continues to receive the rest of the bytes in the message (which by now haven’t even arrived on the RS232 port, thus are not in the queue). The task is therefore immediately suspended again until it is rewoken (indirectly by the RS232 interrupt service routine) after the 2nd byte has been sent to the queue. Likewise for the 3rd, 4th, etc. bytes: the task is repeatedly suspended and rewoken as each byte is received, until the message is complete.

This way, the application is suspended and rewoken once for every byte that is received.
Every time this costs a few hundred cycles, and for a 60 bytes message on the RS232, this amounts to an awful lot of lost cycles.

I would like to have the option to specify a variable number of items to expect from the queue. The application would then only be woken after that number of bytes has been received (or on a timeout, of course, just like xQueueReceive()).
Thus only 1 suspend + wake is required for a message of arbitrary length.
Does anybody know if there is a clever way to do this with the current API?
I would think this requires a modification of the API since the queue mechanism needs a parameter to specify the number of bytes for each waiting task.

I have 3 years of experience with FreeRTOS by now and am very willing to do my part on this, but if I am correct about needing to modify the task manager code I’m going to need some support (especially, I would like it to be included in the API once it has matured enough, so I don’t get stuck with 5.0.0 for the rest of this product’s economic life…)


rtel wrote on Monday, May 26, 2008:

This is the way that most of the demos work, and it should be noted that it is not intended to be an optimal solution (this is noted in most ports).  In fact the demos are often written to generate as many interrupts and context switches as possible in order to test that the port is robust.  To do this a single char is received at a time and the UART FIFOs and DMA are not used.

The way I would normally do this is to first of all make the peripheral as efficient as possible.  This would mean making use of the buffering on the peripheral itself (FIFO) and using the DMA if one is available.  The DMA or interrupt handler can then move characters from the peripheral and place them in a circular buffer.  When a whole message is received, or a timeout occurs, the task that is going to process the characters can be woken through the use of a binary semaphore rather than a queue.  The binary semaphore does not pass data, it just unblocks the task, and only does so when there is enough data to make it worth unblocking the task.  The task can then check the circular buffer and process all the characters it finds before going back to the blocked state.
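The circular-buffer half of this scheme can be sketched in portable C. The names and sizes below are my own, and the semaphore calls (xSemaphoreGiveFromISR() in the ISR once a whole message is in, xSemaphoreTake() in the task) are left out so the sketch stands alone:

```c
#include <stdint.h>
#include <stdbool.h>

/* Size must be a power of two no larger than 256 so the free-running
   uint8_t indices wrap correctly. */
#define RX_BUF_SIZE 64

/* Single-producer (ISR) / single-consumer (task) circular buffer.
   rxHead is written only by the ISR, rxTail only by the task. */
static volatile uint8_t rxBuf[RX_BUF_SIZE];
static volatile uint8_t rxHead;   /* next slot the ISR writes */
static volatile uint8_t rxTail;   /* next slot the task reads */

static uint8_t rxCount(void)
{
    return (uint8_t)(rxHead - rxTail);   /* bytes currently buffered */
}

/* ISR side: store one received byte. Returns false if the buffer is full. */
static bool rxPut(uint8_t b)
{
    if (rxCount() == RX_BUF_SIZE)
        return false;
    rxBuf[rxHead % RX_BUF_SIZE] = b;
    rxHead++;
    return true;
}

/* Task side: fetch one byte. Returns false if the buffer is empty. */
static bool rxGet(uint8_t *b)
{
    if (rxCount() == 0u)
        return false;
    *b = rxBuf[rxTail % RX_BUF_SIZE];
    rxTail++;
    return true;
}
```

Because the ISR only ever writes rxHead and the task only ever writes rxTail, this particular buffer needs no locking on a machine where a uint8_t is read and written atomically.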


richard_damon wrote on Monday, May 26, 2008:

While Richard Barry suggests a circular buffer, another option that I have used (assuming the interrupt routine can parse message boundaries) is to have a couple of message buffers to store messages; when a message is received, you post a message to that effect identifying the buffer that contains it. This may require more memory than the circular buffer, as the buffers will typically be statically allocated for the maximum message length, but it can ease the processing of the message because the task doesn’t need to copy the circular queue into a linear buffer or deal with processing the message in the circular buffer.
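A minimal sketch of that buffer-pool idea, with names and sizes of my own choosing. In a real system the ISR would post the index returned by msgAlloc() to a short queue with xQueueSendFromISR() once the message is complete, so the task receives one queue item per message instead of one per byte:

```c
#include <stdint.h>
#include <stdbool.h>

#define MAX_MSG_LEN  61   /* longest RS232 command from the original post */
#define NUM_MSG_BUFS 3    /* a couple of buffers in flight at once */

typedef struct {
    uint8_t data[MAX_MSG_LEN];
    uint8_t len;
    bool    inUse;
} MsgBuf;

static MsgBuf msgPool[NUM_MSG_BUFS];

/* ISR side: grab a free buffer to assemble the next message into.
   Returns the buffer index, or -1 if none is free. */
static int msgAlloc(void)
{
    for (int i = 0; i < NUM_MSG_BUFS; i++) {
        if (!msgPool[i].inUse) {
            msgPool[i].inUse = true;
            msgPool[i].len = 0;
            return i;
        }
    }
    return -1;
}

/* Task side: done processing the message, return the buffer to the pool. */
static void msgFree(int i)
{
    msgPool[i].inUse = false;
}
```

The task works on a plain linear buffer, at the cost of NUM_MSG_BUFS * MAX_MSG_LEN bytes of statically allocated storage.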

paul_piak wrote on Tuesday, June 03, 2008:

Thanks for your suggestions, but…

The processor is only a PIC18, so the hardware is pretty inefficient by nature. The FIFO receive buffer can only raise the interrupt flag on the first incoming byte, so I have to take an interrupt on every byte.
Fortunately I already copy the data from the queue into a circular buffer, so the required changes are to add a (global?) variable that tells the interrupt routine the number of requested items, and to let the RS232 interrupt move the data directly into the circular buffer instead of the queue. The interrupt can unblock the task when the required number of bytes has been received. This actually saves memory, since the buffer stays the same size but the semaphore is much smaller than the queue.
The problem is that the interrupt can modify the buffer when new data is received while the task is reading the buffer, corrupting it. This has to be fixed by making access to the buffer a critical section.
All in all this is doable, but it smells a bit like a kludge. The programmer has to monitor a lot of possible situations that a modified queue would handle for him.
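The protection described above could look like the sketch below. taskENTER_CRITICAL()/taskEXIT_CRITICAL() are the real FreeRTOS macros; here they are stubbed out as no-ops purely so the fragment compiles on its own, and the buffer names are my own:

```c
#include <stdint.h>

/* Stand-ins for the real FreeRTOS macros, so this sketch is self-contained.
   In real code these disable and re-enable interrupts. */
#define taskENTER_CRITICAL()
#define taskEXIT_CRITICAL()

#define RX_BUF_SIZE 64

static volatile uint8_t rxBuf[RX_BUF_SIZE];
static volatile uint8_t rxHead;  /* advanced by the ISR  */
static volatile uint8_t rxTail;  /* advanced by the task */

/* Task side: copy up to maxLen pending bytes out of the circular buffer
   into a linear buffer, with interrupts masked so the ISR cannot move
   rxHead mid-copy. Returns the number of bytes copied. */
static uint8_t rxDrain(uint8_t *dst, uint8_t maxLen)
{
    uint8_t n = 0;
    taskENTER_CRITICAL();
    while (rxTail != rxHead && n < maxLen) {
        dst[n++] = rxBuf[rxTail];
        rxTail = (uint8_t)((rxTail + 1u) % RX_BUF_SIZE);
    }
    taskEXIT_CRITICAL();
    return n;
}
```

On a PIC18 the critical section masks interrupts, so it is worth keeping the copy loop short (or bounding maxLen) to avoid losing UART bytes while interrupts are off.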


incrediball wrote on Tuesday, June 03, 2008:

I might be missing something but this doesn’t sound too hard: since the number of bytes that needs to be received seems to be known only to the interrupt, the interrupt should signal the application by using a semaphore. The semaphore is only given when the interrupt has deemed that enough bytes have been received for a command. Otherwise you queue the bytes as you were originally doing but the application should not block on reading the queue. Instead it should wait for the semaphore, which will indicate that a complete command is ready for processing. When the application has the semaphore, it can then read the required number of bytes from the queue.
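This two-object scheme can be reduced to its data flow. In the sketch below, the FreeRTOS queue and semaphore are modelled by a plain array and a counter purely so the fragment stands alone; in real code the ISR side would be xQueueSendFromISR() plus xSemaphoreGiveFromISR(), and the task side a blocking xSemaphoreTake() followed by non-blocking xQueueReceive() calls. The newline terminator is my own assumption for where a command ends:

```c
#include <stdint.h>

#define QBUF 128   /* power of two, so free-running unsigned indices wrap */

static uint8_t  q[QBUF];
static unsigned qHead, qTail;
static unsigned cmdReady;   /* models the semaphore's count */

/* ISR side: queue every byte; when the byte ends a command,
   "give" the semaphore. */
static void isrRxByte(uint8_t b)
{
    q[qHead++ % QBUF] = b;
    if (b == '\n')
        cmdReady++;          /* xSemaphoreGiveFromISR() in real code */
}

/* Task side: called once per "take" of the semaphore; drains exactly
   one complete command from the queue without ever blocking on it.
   Returns the command length, or 0 if no command is ready. */
static unsigned taskReadCommand(uint8_t *dst, unsigned maxLen)
{
    unsigned n = 0;
    if (cmdReady == 0)
        return 0;            /* the task would block in xSemaphoreTake() here */
    cmdReady--;
    while (n < maxLen) {
        uint8_t b = q[qTail++ % QBUF];
        dst[n++] = b;
        if (b == '\n')
            break;
    }
    return n;
}
```

The task only wakes once per complete command, while the byte-by-byte queueing stays entirely inside the ISR.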

paul_piak wrote on Wednesday, June 04, 2008:

Basically, I think you’re right. I forgot that I could _keep_ the queue. In my enthusiasm for freeing up resources I tossed it away, but a queue is probably the way I am supposed to move data to a task.
Given the limitations of the framework, this is probably how it should be done.

But then, a queue that is never used to block the receiving task is only half used. It still has the associated overhead of checking whether a task should be unblocked, and I already know the answer beforehand: NO. I think this too is overhead I could do without.
Probably the plain circular buffer with some critical sections is more efficient, albeit more dangerous, since all the protection code (against simultaneous access by the task and the ISR) has to be written by me.

By the way: the _task_ is the only one that knows the number of bytes that needs to be received, not the interrupt. That is why I proposed an ‘xQueueReceive()’ variant that also takes an argument for the number of items. It should not test for tasks to unblock unless at least the requested number of bytes is in the queue.

To me this seems to be a useful feature. How many applications out there receive no more than 1 byte at a time, or have only fixed-size messages (1 queue item size)? I would guess this is the minority.
If everybody else (those who receive variable-length messages) uses 2 queues, i.e. 1 queue that never blocks plus a semaphore (= a queue without length), to collect data into a multibyte message, then there is room for improvement.


Alternatively, consider this: there is already a stripped-down queue that has no length, only the blocking mechanism: the semaphore.
Shouldn’t there also be a queue without the blocking mechanism (but with protection against simultaneous write access)?