Non-blocking UART transfer

A couple of questions:
I have a sensor task that merely reads sensor data periodically and writes to UART. Interrupts are enabled for both protocols.

  1. Does it make sense to block in StartTX after initiating a UART transmission? One reason I can think of is preventing other tasks from using the UART while a transmission is in progress. But is this how it’s typically done? Is this still considered a non-blocking approach?

  2. If we are to block, a binary semaphore or a task notification sounds like a good candidate. For the latter, though, the UART task would need access to the Sensor task’s handle, whereas the former doesn’t need it. Does that make the semaphore a better candidate?

void UART::StartTX(buffer, size)
{
/*
    1. Enable UART transmission 
    2. Read buffer into TX FIFO
    3. Read a byte from TX FIFO, and write it to TXD register 
   
    ** SHOULD WE BLOCK HERE? **
    ** IF SO, how can we use task notifications? **
*/
}

void UART::WriteToUART(buffer)
{
   StartTX(buffer, size); 
   // ISR follows...
}

static void UART::ISR()
{
   // 1. IRQ is triggered
   // 2. if FIFO != empty
   //       2.1 Read a byte from TX FIFO and write to TXD register  
   //       2.2 Back to 2 
   //    ELSE:
   //       2.3 END TRANSMISSION
   //       ** UNBLOCK THE TASK **
}

void Sensor::Task()
{
    while(1)
    {
        uint8_t data = SensorRead();
        // block this task till sensor read is done (signalled by ISR)

        uart.WriteToUART(data);
     
    }
}

I use a Mutex to guard a shared serial port. When a task wants to send a message, it acquires the Mutex, sends the message, then releases the Mutex.

Depending on how you want to transfer the data to/from the task, you might need a semaphore for the ISR to give to indicate end of transmission and/or the buffer being available. For moderate speed links I just use a Queue or a Stream Buffer for the data (since the serial port is guarded by the Mutex, and the task can only access the Stream Buffer after getting the Mutex, you meet the no-contention rule for the Stream Buffer).

Note, I encapsulate ALL the UART operations inside a driver, so tasks call the driver with pointers to the data to send and the specification of a UART control block that holds the information for the UART.

Isn’t that accomplished by using a binary semaphore so any subsequent calls to WriteToUART are blocked until the current UART transmission is done? (signalled from ISR)

void UART::StartTX(buffer, size)
{
/*
    1. Enable UART transmission 
    2. Read buffer into FIFO
    3. Write first byte to TXD register 
*/
}

void UART::WriteToUART(buffer)
{
   //  ** take a semaphore **
   StartTX(buffer, size); 
}

static void UART::ISR()
{
   // 1. IRQ is triggered
   // 2. if FIFO != empty
   //       2.1 Read from FIFO and write to TXD register  
   //       2.2 Back to 2
   //    ELSE:
   //       2.3 END TRANSMISSION
   //       ** give a semaphore **
}

Unless I misunderstood, the Sensor task is making a UART driver call as well

yes, and Richard states exactly that in the second paragraph of his answer. You two are sort of talking on two different levels; Richard is concerned with multiple tasks attempting to utilize the serial port, whereas you refer to an unverlapped execution of a single transmit operation.

The way it is currently written, a task can only send as a unit, a single buffer of text, which must be smaller than the software FIFO, and all calls must wait for that fifo to be empty before the next buffer can be sent. This makes the “Non-Blocking” operation actually need to block quite a bit, and forces the system to possibly need to use larger buffers.

My method has a Mutex used just at the task level to get exclusive access to the buffer for the serial port, and then the task can make multiple calls to the serial port to build the “message”, so the task doesn’t need to build up a discrete “message”, but can send the message as its component parts (and the UART class, via base classes, has operations to write strings or numbers to be formatted).

Once the Mutex is obtained, the calls to the individual send operations ADD to the FIFO buffer (a Queue or Stream Buffer) and if there is room, it can return immediately.

A second task sending doesn’t need to wait for the buffer to be empty, it can just add to the buffer, and will continue as long as it isn’t full.

You can’t use a single semaphore signaled from the ISR to do the job of both the Mutex and the FIFO having space.

Yes, most serial drivers provided by vendors can’t do it this way; they are based on taking a buffer, sending it, and letting the task continue only once it has been sent, so I need to write my own driver. But I find that to get better operation I need to do this anyway.

A related discussion is here:

Issue with uart RX data when sharing with multiple tasks - Kernel - FreeRTOS Community Forums

Just out of curiosity, how does your architecture handle acknowledgements (ie serial reads that need to sync up with the transmissions), or does your protocol not need those? (warning, slightly off topic)

You handle transmissions that need a reply by not giving up the Mutex for the transmitter until after you get the reply (or the reception times out). You also want to make sure there isn’t a stale response in the buffer, so it tends to be:

Get the Tx Mutex
Clear the Rx buffer
Send Message
Wait for reply or timeout
Give the Tx Mutex.
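
The five steps above can be sketched on a host PC as follows. This is only an illustration under stated assumptions: `SerialLink`, `transact`, `onReplyReceived` and `sentCount` are hypothetical names, `std::mutex` and `std::condition_variable` stand in for the FreeRTOS mutex and whatever signals a received reply, and "sending" just records the request.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical host-side stand-in for a UART with a reply channel.
class SerialLink {
public:
    // Returns true and fills `reply` if a reply arrives within `timeout`.
    bool transact(const std::string& request, std::string& reply,
                  std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::lock_guard<std::mutex> tx(txMutex_);   // 1. Get the Tx mutex
        std::unique_lock<std::mutex> rx(rxMutex_);
        rxLines_.clear();                           // 2. Clear the Rx buffer (drop stale replies)
        sent_.push_back(request);                   // 3. Send message (only recorded here)
        bool got = rxReady_.wait_for(rx, timeout,   // 4. Wait for reply or timeout
                                     [this] { return !rxLines_.empty(); });
        if (got) {
            reply = rxLines_.front();
            rxLines_.pop_front();
        }
        return got;                                 // 5. Tx mutex is given back on return
    }

    // Stand-in for the receive path (an Rx ISR or Rx task on a real target).
    void onReplyReceived(const std::string& line) {
        {
            std::lock_guard<std::mutex> rx(rxMutex_);
            rxLines_.push_back(line);
        }
        rxReady_.notify_one();
    }

    std::size_t sentCount() {
        std::lock_guard<std::mutex> rx(rxMutex_);
        return sent_.size();
    }

private:
    std::mutex txMutex_;
    std::mutex rxMutex_;
    std::condition_variable rxReady_;
    std::deque<std::string> rxLines_;
    std::deque<std::string> sent_;
};

// Usage: a stale line is discarded (first call times out), then a fresh reply is picked up.
std::string demoTransact() {
    SerialLink link;
    link.onReplyReceived("stale");
    std::string reply;
    bool first = link.transact("PING", reply, std::chrono::milliseconds(20));
    std::thread device([&link] {
        while (link.sentCount() < 2) std::this_thread::yield();  // wait for the second request
        link.onReplyReceived("PONG");
    });
    bool second = link.transact("PING", reply, std::chrono::milliseconds(5000));
    device.join();
    return std::string(first ? "reply" : "timeout") + "," +
           (second ? reply : std::string("timeout"));
}
```

Because the Tx mutex is held across steps 2 to 4, no other sender can slip a request in between a message and its reply.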

No? I’m also referring to multiple tasks attempting to utilize the serial port. Not sure what you meant by unverlapped execution of a single TX operation though.

My point was to get the binary semaphore before initializing the UART transmission to prevent other tasks from using the UART while it’s in progress already. The semaphore will be given from within ISR once the FIFO is empty and only then other tasks can utilize it.

So using a binary semaphore to achieve mutual exclusion for UART. Yes?

Also in my case, FIFO is okay to be overwritten so I am not blocking FIFO from pushing stuff in

No, with a single semaphore as shown, the second task (or the first task writing its second chunk of data) will block waiting for the semaphore until the buffer is EMPTY, as that is when the ISR gives back the semaphore. Thus your “non-blocking” operation may often block.

To allow the task that is filling the buffer to work sooner, you need the ISR to release the semaphore as soon as some space is available, but then it doesn’t provide exclusion to protect tasks from mixing their outputs.

You need TWO different primitives.

My system uses a Mutex, and a Queue/StreamBuffer to do the work. The Queue/Stream Buffer could be replaced with a “home-grown” circular buffer and a semaphore if you want.

The Mutex guards the feeding of the buffer, so a task gets all of its data sent together, and then the second primitive allows the feeding to block/unblock based on space available.
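
A minimal host-side sketch of the two-primitive idea, assuming standard C++ threads in place of FreeRTOS tasks: `TxChannel`, `sendMessage` and `transmitOne` are hypothetical names, one `std::mutex` plays the message Mutex, and a bounded deque with a condition variable plays the Queue/Stream Buffer that the ISR drains.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>

class TxChannel {
public:
    explicit TxChannel(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Task side. Primitive 1 keeps one task's message contiguous;
    // primitive 2 blocks the sender only while the buffer is full.
    void sendMessage(const std::string& msg) {
        std::lock_guard<std::mutex> whole(messageMutex_);      // primitive 1: exclusion
        for (char c : msg) {
            std::unique_lock<std::mutex> lk(bufMutex_);
            spaceFree_.wait(lk, [this] { return fifo_.size() < capacity_; });  // primitive 2
            fifo_.push_back(c);
        }
    }

    // Stand-in for the Tx ISR: moves one byte from the buffer to the wire.
    // Returns false when nothing is pending.
    bool transmitOne() {
        {
            std::lock_guard<std::mutex> lk(bufMutex_);
            if (fifo_.empty()) return false;
            wire_.push_back(fifo_.front());
            fifo_.pop_front();
        }
        spaceFree_.notify_one();   // space is available as soon as ONE byte leaves
        return true;
    }

    std::string wire() {
        std::lock_guard<std::mutex> lk(bufMutex_);
        return wire_;
    }

private:
    const std::size_t capacity_;
    std::mutex messageMutex_;
    std::mutex bufMutex_;
    std::condition_variable spaceFree_;
    std::deque<char> fifo_;
    std::string wire_;
};

// Usage: two senders, a 3-byte buffer (smaller than either message).
std::string runTwoSenders() {
    TxChannel ch(3);
    std::thread a([&ch] { ch.sendMessage("Hello"); });
    std::thread b([&ch] { ch.sendMessage("World"); });
    while (ch.wire().size() < 10) {
        if (!ch.transmitOne()) std::this_thread::yield();
    }
    a.join();
    b.join();
    return ch.wire();
}
```

The buffer here is deliberately smaller than a message: the sender resumes as soon as a single byte has left, and the two messages still come out whole, in whichever order the senders won the mutex.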

So the idea is to allow all the tasks to stuff a byte in UART TX FIFO alongside others one by one: once TaskA is done sending a byte, TaskB can now take a semaphore, send a byte, give a semaphore, now TaskA takes semaphore, sends a byte, gives a semaphore, and so on…

In other words, TaskB doesn’t have to wait for the UART TX FIFO to be fully empty before sending its data over?

If that’s the case, are we restarting a UART transmission after each byte is sent by each task?

Earlier you mentioned the following sequence. What’s a message here? Is it a byte, or a full message, i.e., the collection of bytes that the task controlling the UART has to send? The latter makes more sense, and that’s what I’m doing as well (otherwise I see data integrity issues).

Get the Tx Mutex
Clear the Rx buffer
Send Message
Wait for reply or timeout
Give the Tx Mutex.

Message is a FULL message, perhaps even multiple calls to different formatted output routines.

Normally, you establish a protocol that says that either all messages start or all messages end with the new line character, so all messages are separate.

If you are talking to a device that you need to wait for a reply from, your method doesn’t work (as the next task can send a message as soon as the first is finished being sent).

Also, because you are using a semaphore, and not a mutex (since the ISR is doing the give), you can’t do priority inheritance. So if a low priority task starts to output a message, gets interrupted by a medium priority task doing a lot of CPU processing, and then a High Priority task wants to send a message, it needs to wait for the medium priority task to block so the low priority task with the UART can finish its work.

With a Mutex, when the High Priority task blocks on the mutex, the low priority task will temporarily inherit its priority so it can finish its work quicker.

yeah in that case you probably wanna give a semaphore after 1) receiving a response 2) no response within timeout

I’m not expecting a response after TXing out the message over UART.

if a semaphore is given within a task instead of an ISR, only then it acts as a mutex? isn’t priority inversion mutually exclusive for mutex anyway?
I thought that was the case, but I can see a problem happening if a low priority task takes a semaphore, gets preempted by a medium priority task, and then a high priority task attempts to use the UART/take the semaphore but can’t because it’s held by the low priority task.

What’s the typical way to get around this issue with semaphore? (how I’m doing it)

Semaphores, because they can be given by something other than the taker, don’t support priority inheritance.

With priority inheritance, which requires using a Mutex, then as I described it, when the High Priority task blocks on the Mutex, the Low Priority task that currently has the Mutex inherits the High Priority (temporarily) so the Medium Priority task doesn’t delay it any longer.

There really isn’t a good, general purpose way to solve the issue with just a semaphore, which is why you really don’t want to use plain semaphores for mutual exclusion.

Are you saying it doesn’t make sense to use signalling mechanism from ISR once the FIFO is empty? It totally suits my needs here and I’m not sure how else can I go about it

Doing it the way you are doing it does NOT give you the “Non-blocking” performance you say you want. You might not notice it, but you aren’t getting it.

Your design says you can not add to the output buffer a new message until the previous message is FULLY sent out. (or at least loaded into the Tx output register/ hardware FIFO).

It also says that if a low priority sender gets interrupted in the window when it is loading your software FIFO buffer by a middle priority task that takes a while to run, you can block a High priority task wanting to use the serial port.

It also says that your internal software buffer MUST be big enough to handle ANY message that any task wants to send, as you need to copy the full message into it, and have no mechanism to add more while the message is going out.

The only way to get true “non-blocking” output that avoids mixing messages from multiple tasks is to use TWO primitives: a Mutex for exclusion (which could be downgraded, for worse performance, to a semaphore) and then something that interacts with the ISR to block on if the software buffer gets full (or IS the buffer that communicates with the ISR, like a Queue or Stream Buffer).

If you do that, as long as space is available in the buffer, any task can add a message to it and continue immediately. You only block if the serial port gets a full buffer behind the data being sent.

The ISR doesn’t need to do anything about the mutual exclusion for sending to the serial port; that is all handled at the task level to serialize the requests into the buffer. The ISR just needs to handle the pacing of data through the buffer and out the serial port.

So you’re suggesting using a queue from the task to the ISR, like this? What happens to xQueueReceiveFromISR if it doesn’t receive anything from the task? I reckon it doesn’t block and returns right away?

    // called within task context
    void UART::WriteToUART(uint8_t* data, size_t length) 
    {
        xSemaphoreTake(uartMutex, portMAX_DELAY);

        writeToTXD(*data);
        data++;
        length--;
      
        // Push remaining data into the queue
        for (size_t i = 0; i < length; i++)
        {
            xQueueSendToBack(uartQueue, &data[i], portMAX_DELAY);
        }

        // Release the UART access
        xSemaphoreGive(uartMutex);
    }

    void UART_ISR()
    {
      // IF TX event happened (after TXD register is written to)
        uint8_t byteToSend;
        if (xQueueReceiveFromISR(uartQueue, &byteToSend, nullptr))
        {
            writeToTXD(byteToSend);
        }
    }

Your design says you can not add to the output buffer a new message until the previous message is FULLY sent out. (or at least loaded into the Tx output register/ hardware FIFO).

Isn’t this what a typical design would look like anyway, and doesn’t it only signify that UART usage is protected against race conditions by mutual exclusion?

You only want to send a message at a time and my snippet does it. If TaskA wants to send Hello and TaskB wants to send World, assuming TaskA takes a semaphore first, I should expect the output to be HelloWorld, and that’s what I’d expect with my original design with semaphore.

And that to me sounds like the most non-blocking approach I can get to. I still didn’t get how my design could result in compromising data integrity/mixed messages.

The only issue I see with my design is priority inversion (in case I have such priorities and tasks set out) and the reason being the use of semaphore

Sorry, that was a typo, should have read “unoverlapped.”

Please refer to the other thread about multiple threads accessing serial ports which I published a link to earlier. I am a very strong advocate of the “use a single task - typically a message pump - to exclusively access the UART” approach instead of attempting to strictly serialize multiple thread accesses. That way, you do not run into the potential problems (such as deadlocks due to, for example, missing end of transmit semaphore notifications - see a) below) that the mutex solution bears. Also, again, the protocol that requires serialization is inherently synchronous and half-duplex, so having a single task serve the UART is a more natural way to model the control flow.

Another not yet mentioned potential problem in the “multiple task” approach is that clients may be starved out when individual tasks heavily use the bus. A single task can enforce mandatory service policies more easily and more naturally. Yet I understand that this is a question where points can be made for both approaches. A lot of the decision depends on the protocol. For example, there are protocols that require multiple “atomic” transactions for a single client. In that case, for the multiple thread approach, further serialization is required to ensure that after a complete Send-Ack cycle, the bus can not be released to another thread. In a single UART service task architecture, such policies can again be added much more easily.

a) of course, missing EOT signals will also - if remaining unprocessed - pose a problem in the single task solution, but as always in concurrent scenarios, too much serialization is as bad as not enough of it, so why pick a solution that requires more serialization (i.e. an additional mutex) if another solution gets away without it?
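
For illustration, the single-task (“message pump”) arrangement might look like the host-side sketch below. It is an assumption-laden stand-in rather than FreeRTOS code: `UartPump`, `post` and `stopAndCollect` are hypothetical names, a `std::thread` plays the UART task, and a mutex-protected deque plays the queue the clients post whole messages to.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

class UartPump {
public:
    UartPump() : worker_([this] { run(); }) {}
    ~UartPump() { stopAndCollect(); }

    // Client side: hand over a complete message and return immediately.
    void post(std::string msg) {
        assert(!msg.empty());
        {
            std::lock_guard<std::mutex> lk(m_);
            inbox_.push_back(std::move(msg));
        }
        cv_.notify_one();
    }

    // Drains what is queued, stops the pump, returns everything "transmitted".
    std::string stopAndCollect() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable()) worker_.join();
        return wire_;
    }

private:
    // The only code that touches the "UART" (wire_), so no Tx mutex is needed.
    void run() {
        for (;;) {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return stop_ || !inbox_.empty(); });
            if (inbox_.empty()) return;            // stop requested and inbox drained
            std::string msg = std::move(inbox_.front());
            inbox_.pop_front();
            lk.unlock();
            wire_ += msg;                          // a service policy could be applied here
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::deque<std::string> inbox_;
    bool stop_ = false;
    std::string wire_;
    std::thread worker_;   // declared last so the other members exist before it starts
};

// Usage: two clients post one message each; neither ever touches the port itself.
std::string demoPump() {
    UartPump pump;
    std::thread a([&pump] { pump.post("A:temp=21\n"); });
    std::thread b([&pump] { pump.post("B:hum=40\n"); });
    a.join();
    b.join();
    return pump.stopAndCollect();
}
```

Each message reaches the wire whole because only the pump thread writes to it; the order of the two messages depends on which client posted first.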

@MasterSil
You’re getting closer, but still seem stuck in the assumption that you can’t start processing a second message until the first is completely sent. YES, that is how a lot of manufacturer-supplied drivers work, but it is less than desirable for a real-time system. The whole discussion is how to get out of that model to something where tasks can send messages, and normally not need to block. They only need to block if your “buffer” gets full.

This means that UART::WriteToUART can’t assume the serial port is idle when it is called, but it should just add the message to the buffer, and if needed start the transmission IF it is idle.

First design comment: always using portMAX_DELAY is bad, as it means that if something goes wrong your system just “hangs”, and it can be hard to figure out what happened. My equivalent to WriteToUART takes a third parameter for the maximum time to wait to obtain the mutex, and if that fails it returns an error code.
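
As a rough host-side illustration of the bounded wait (not the actual driver): `GuardedPort`, `UartStatus` and `write` are hypothetical names, and `std::timed_mutex` stands in for a FreeRTOS mutex taken with a finite tick count instead of portMAX_DELAY.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <string>
#include <thread>

enum class UartStatus { Ok, Timeout };

class GuardedPort {
public:
    // Third piece of information besides the data: how long the caller will wait.
    UartStatus write(const std::string& data, std::chrono::milliseconds maxWait) {
        assert(maxWait.count() >= 0);
        // This unique_lock constructor calls try_lock_for(maxWait) on the mutex.
        std::unique_lock<std::timed_mutex> lock(portMutex_, maxWait);
        if (!lock.owns_lock()) return UartStatus::Timeout;   // report instead of hanging
        sent_ += data;                                       // stand-in for queuing the bytes
        return UartStatus::Ok;
    }

    std::timed_mutex& portMutex() { return portMutex_; }
    const std::string& sent() const { return sent_; }

private:
    std::timed_mutex portMutex_;
    std::string sent_;
};

// Usage: another thread holds the port, so the bounded wait expires.
UartStatus demoWriteWhileBusy() {
    GuardedPort port;
    std::atomic<bool> held{false};
    std::atomic<bool> release{false};
    std::thread holder([&] {
        std::lock_guard<std::timed_mutex> g(port.portMutex());
        held = true;
        while (!release) std::this_thread::yield();
    });
    while (!held) std::this_thread::yield();
    UartStatus status = port.write("late", std::chrono::milliseconds(10));
    release = true;
    holder.join();
    return status;
}
```

The caller can then log the failure, retry, or drop the message, which is far easier to diagnose than a task silently stuck forever.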

Then you can’t write the first character to the transmitter directly, as the transmitter might be busy at the moment, so all the bytes are added to the queue.

To start the transmission, I add an “internal” function called “kick” that checks if the serial port is actively sending a character, and if not starts the transmission. This might be by sending the character to the port, but more often I do it by enabling the Tx interrupt and maybe force-pending a Tx interrupt request. Adding this kick function allows me to have a universal base class that works on many processors, and a derived class for a particular device, and makes the code easier to port to a new processor.

Kick is called after sending each character to the Queue.

In the ISR, first you really need to have your xQueueReceiveFromISR be passed an actual value for the third parameter, as otherwise when the ISR sends a character, and makes room in the Queue, the task level loop filling the Queue won’t wake up right away. In my mind there are very few cases when you don’t want to process that wake up flag.

Second, when the xQueueReceiveFromISR returns the error for being empty, on many serial ports you need to do something to clear the TX event. Sending a character is one option, but if you don’t have another character, often you need to do something else (often disable the Tx interrupt).
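
The queue, kick and ISR behaviour described above can be sketched as a single-threaded simulation. Everything here is an assumption made for illustration: `SimUart`, `TxDriver`, `kick` and `txIsr` are hypothetical names, a `std::deque` replaces the FreeRTOS queue, and the “hardware” is modelled as firing the Tx interrupt for as long as it is enabled. On a real target the ISR would also pass a real variable as the third argument of xQueueReceiveFromISR and yield on it.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <string>

// Simulated peripheral: the Tx interrupt is assumed to fire whenever it is
// enabled and the transmitter can accept a byte.
struct SimUart {
    bool txInterruptEnabled = false;
    std::string wire;   // bytes that have "left" the port
    void writeTxd(std::uint8_t b) { wire.push_back(static_cast<char>(b)); }
};

class TxDriver {
public:
    explicit TxDriver(SimUart& hw) : hw_(hw) {}

    // Task side: never write TXD directly, the transmitter may be busy.
    void write(const std::string& data) {
        for (char c : data) {
            queue_.push_back(static_cast<std::uint8_t>(c));
            kick();   // called after each byte is queued
        }
    }

    // Tx interrupt handler.
    void txIsr() {
        assert(hw_.txInterruptEnabled);
        if (queue_.empty()) {
            hw_.txInterruptEnabled = false;   // nothing to send: clear the Tx event
            return;
        }
        hw_.writeTxd(queue_.front());
        queue_.pop_front();
    }

private:
    // Starts the transmitter only if it is idle; harmless if already running.
    void kick() {
        if (!hw_.txInterruptEnabled) hw_.txInterruptEnabled = true;
    }

    SimUart& hw_;
    std::deque<std::uint8_t> queue_;
};

// Usage: the second write restarts a transmitter that had gone idle.
std::string demoKick() {
    SimUart hw;
    TxDriver drv(hw);
    drv.write("abc");
    while (hw.txInterruptEnabled) drv.txIsr();   // simulated interrupts
    drv.write("de");
    while (hw.txInterruptEnabled) drv.txIsr();
    return hw.wire;
}
```

Keeping `kick` as the only device-specific start-up step is what lets the queuing logic stay in a portable base class.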

@RAc In my view it depends a LOT on what you are doing with the serial port, and what is on the other side. Adding a task just to “serialize” access is a heavy hammer, and best avoided when not needed. This doesn’t mean that the code to process the device gets scattered all over the place; if the device has a protocol, then that is all put into a single driver, which might be used by multiple tasks. The point where it makes sense to me to make a dedicated task is when the device might send a message asynchronously (without a just prior command sent to it) or if it can handle multiple requests at once, and the answers may not be in the order sent. Another is if you need to periodically query the device; then a task for the device might also make sense.

Yes, that is what I meant by my earlier statement that the protocol drives the architecture. When I worked in security/access control, 90% of the work on serial busses was in coding multidrop bus masters over RS485, where there is practically no alternative to a single UART driving task (everything else keeps proving a dead end).

In the application you sketch, I see things like sensor concentrators where the embedded device funnels messages from several sources into one UART with a fixed “message->protocol” sequence. As long as there are no protocol extensions such as messages split into several MTUs or meta sequences like session key negotiations that span over the single sequence->ack control paradigm, you may be ok with mutual exclusion over the sequence.

I would not call a driving task a “hammer,” though, but debating that will very likely be beyond the scope of this forum.

Am I not providing exclusive access to UART? If TaskA accesses the UART/takes a semaphore, TaskB would only be able to access it once TaskA’s message has been sent over UART (once the TX fifo populated by TaskA is empty). I’m still sensing some confusion