Advice on the preferred "architecture" of a specific communication bus?

arnold_w · December 29, 2022, 9:12am

I wasn’t sure where to post this thread. It’s not specific to FreeRTOS or its kernel, so I understand if a moderator moves or deletes it. But I get the impression there are many helpful people on this forum, so I decided to create this thread even though it might be slightly off-topic. I would like to ask for some architectural advice on the preferred way to implement (or actually, port from “bare metal”) a communication bus. I have it up and running with FreeRTOS, but the code is quite complex so I would like to simplify it.

The communication bus is based on a UART (with DMA) and has a master and a slave. Both of them can be requestors (can read/write from/to a server) and servers (can answer read/write requests from a requestor). The most important functions are called readRegs and writeRegs, and they take a 32-bit start address, a data buffer and the number of registers to access as parameters. If the master and the slave both want to access the bus at the same time, the master decides who gets the access, but there’s a rule that the master must be fair (as long as the slave wants to initiate an access, it should be granted at least every other access).

The master and slave are connected through a detachable cable. When a cable is connected (link established), a discovery process is performed where the master reads information (device type, firmware version, etc.) from the slave and then writes information about itself to the slave. The link status is then polled every second (the master sends a specific “are-you-still-connected” packet). The slave expects to receive an “are-you-still-connected” packet, which it responds to, at least every 1100 ms; otherwise the link is considered lost and link-lost interrupts are generated.

If the bus is busy and two tasks want to use it, then the task with the highest priority should get to use it first (NOT first come, first served). I want to do as little as possible in interrupt routines (preferably only set an xEventGroup event bit or queue an object into an xQueue) and I would like to have one dedicated high-priority task that is in control of the UART (and DMA) peripheral. I don’t want to use enormous amounts of RAM and I don’t want complicated queues with pointers to temporary dynamically allocated buffers.

In my code I have used the following terminology:
Local access: readRegs or writeRegs is called in your code.
Remote access: readRegs or writeRegs is called in the other microcontroller’s code and hence, the handleReceivedReadRegs or handleReceivedWriteRegs callback is called in your code.
Internal access: the 1000/1100 ms timer has expired and the master needs to do the discovery process (if we currently have no link) or query whether the slave is still connected. If this timer expires in the slave, it means the link is lost and some state-variable housekeeping needs to be done.
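
To make the fairness rule concrete, here is a minimal sketch of how the master could decide who gets the next access. It is only one possible reading of "at least every other access"; the names `arbiter_t`, `grant_t` and `arbiterNext` are made up for this illustration and do not come from my actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
typedef enum { GRANT_NONE, GRANT_MASTER, GRANT_SLAVE } grant_t;

typedef struct {
    bool lastGrantWasMaster;  /* true if the previous access went to the master */
} arbiter_t;

/* Decide who gets the next access on the bus.
 * The slave wins whenever it asks and either the master is idle or the
 * master had the previous access, so the slave is never passed over twice
 * in a row. */
grant_t arbiterNext(arbiter_t *a, bool masterWants, bool slaveWants)
{
    assert(a != NULL);

    grant_t g;
    if (slaveWants && (!masterWants || a->lastGrantWasMaster)) {
        g = GRANT_SLAVE;
    } else if (masterWants) {
        g = GRANT_MASTER;
    } else {
        g = GRANT_NONE;
    }

    if (g != GRANT_NONE) {
        a->lastGrantWasMaster = (g == GRANT_MASTER);
    }
    return g;
}
```

With both sides constantly requesting, this alternates master, slave, master, slave. If only one side requests, that side gets every access.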

Thank you in advance for any input on this.

RAc · December 29, 2022, 10:14am

Hi Arnold,

this is a very interesting and challenging issue to tackle, and I do not believe you need to fear someone deleting it. Yet I suspect that you will not receive many useful responses, reason being that this goes way beyond the scope of peer-to-peer support. Coincidentally, I have worked on similar designs in the past, and what you are looking at here is hard core consultant work that is worth several $10000 if taken up by an experienced and knowledgeable consultant. I’d love to give pointers, but I very simply do not have the time to do that.

Best of luck, I am very curious about the kind of responses you will receive.

richard-damon · December 29, 2022, 12:04pm

Quickly reading your description, it definitely sounds like you should have just a single task on each end responsible for controlling the UART.

On the master side, this is important because it is required to use a “fair” scheduling system to allocate this resource, which doesn’t exactly match FreeRTOS’s scheduler system. The receive side needs to give the UART to the operation that the master has said should have it, with some protocol to control sending requests for what the slave side would like to do.

If your processing for the various types of requests is complicated enough that you want to make them different tasks (which you seem to do, though that goes against your rule of minimizing RAM usage, as multiple tasks use up memory for the task stacks), then you could use an event group for those various tasks to indicate to the control task their desire to do an operation, at which point they block until the control task signals them (perhaps with a direct-to-task notification) that they have control over the UART now and should do their operation.
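
As a rough sketch of the decision the control task would make, assume each requesting task owns one request bit and the control task knows each task's priority. The helper below is hypothetical (`pickRequester`, `MAX_REQUESTERS` and the `prio` table are invented for illustration); in a real system the pending bits would come from the event group and the grant would be delivered with a task notification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REQUESTERS 8  /* hypothetical limit: one request bit per task */

/* Return the index of the pending requester with the highest priority
 * (larger number = more urgent, as in FreeRTOS), or -1 if nothing is
 * pending. On equal priority the lowest index wins. */
int pickRequester(uint32_t pendingBits, const uint8_t prio[MAX_REQUESTERS])
{
    assert(prio != NULL);

    int best = -1;
    for (int i = 0; i < MAX_REQUESTERS; i++) {
        if ((pendingBits & (1u << i)) == 0) {
            continue;  /* this task has not asked for the bus */
        }
        if (best < 0 || prio[i] > prio[best]) {
            best = i;
        }
    }
    return best;
}
```

Because the choice is made each time the bus becomes free, a late high-priority request still beats an earlier low-priority one, which a plain FIFO queue would not give you.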

cobusve · December 29, 2022, 6:35pm

Very broad topic. Architecturally, queues will probably not work because they violate your priority requirement. Queues will (naturally) let the highest-priority task get queue access first, but once messages are enqueued they are processed in arrival order (first come, first served), which is exactly what you said you do NOT want.

Most of what you describe in terms of arbitration of the link and the protocol definition are out of scope for FreeRTOS and entirely up to your choice based on your requirements and I think we do not have enough of that to comment either way.

In terms of what you need from FreeRTOS I think you are on the right track if you are thinking of event groups or direct to task notifications. Model the system based on events that wake things up and when you need priorities to carry into the protocol use the scheduler to do that for you by letting tasks do the work instead of enqueuing your messages. Just remember that if you have a UART task it has one universal priority.

Remember that a protocol like this has some quantization in its architecture (one message cannot interrupt another message in the middle), so you will need to frame the work. To do that you probably need a “talking stick” pattern, usually done with a mutex. This means that if a lower-priority task acquires the mutex and is halfway through sending a message, it will block higher-priority tasks until the message is completed. The corner case you must make sure works well is when the lower-priority task has lots of messages to send: make sure it does not starve higher-priority tasks. A typical anti-pattern is a higher-priority task that sleeps and periodically tries to acquire the mutex. In this case the lower-priority task will release and retake the mutex without yielding control (without the scheduler running), and every time the higher-priority task wants the mutex it will be busy …

Anyway some random thoughts.

It would be much more productive for you to ask more specific questions, though; we can discuss those.

richard-damon · December 29, 2022, 10:42pm

In my view, the “Talking Stick” is not a mutex, as that puts the control back to FreeRTOS which doesn’t understand the message scheduling rules.

The Talking Stick just needs to be a “token”, as permission of the Uart Control Task, to use the UART. You could use a Mutex as a “backup” to the stick, so tasks first ask for permission, and when given permission take the Mutex, and after their message, give the Mutex and stick back.

arnold_w · January 2, 2023, 12:23pm

I have split the implementation up into 4 files, any comments on this split?

1. A file that contains the UART-task where all the interrupts (including timer callbacks) are invoked onto the UART-task using event bits.
2. A dedicated file to pack/unpack the packets (including CRC-handling and reading/writing from/to whatever the particular address maps to).
3. A low-level file with the protocol statemachine that controls the UART (and DMA); all the invoked interrupts are handled in this file.
4. A high-level file that the rest of my application code should interface into. This file contains e.g. a user-friendly readRegs function that blocks for the duration of the read and then returns a status code.

arnold_w · January 2, 2023, 12:50pm

There are a number of issues I find challenging and in a “dream world” they have easy solutions.

When a local access is initiated, when the polling timer expires (an “internal access” is initiated) or when a remote access is initiated, simply take a binary semaphore and when it’s finished, simply give the binary semaphore. Of course, this won’t work for internal and remote accesses because the UART-task is the “engine” of the whole bus and this must never be blocked, which could be the case if a local access currently had the binary semaphore (the bus would deadlock). Also, fairness would not be guaranteed.

The discovery process the master performs consists of 4 reads and 2 writes and it would make sense to simply call readRegs 4 times and writeRegs 2 times to do the discovery. However, the discovery process is called from the UART-thread so I don’t believe that would work (deadlock?).

The only solution to the binary semaphore problem I can think of is to create a wrapper around the binary semaphore, introduce the concept of some kind of semaphore owner, and implement functions isLocalAccessWaitingForSemaphore, transferOwnershipToInternalAccess and transferOwnershipToRemoteAccess. Then, at the end of all types of accesses, a determineSubsequentAccess-function followed by an executeSubsequentAccess-function would need to be called, and the latter would transfer ownership of the semaphore appropriately. In case an internal access is triggered while a local access is ongoing, I would need a variable to indicate that an internal access is deferred, and the determineSubsequentAccess-function would need to take this into account. In case there’s an error (e.g. CRC-error in a packet) in a local or internal access, the access will be retried at most twice, and this is also something the determineSubsequentAccess-function would need to take into account. However, the retry-count is something that belongs in the high-level file (file #4 above), but now it needs to be passed down to the low-level file (file #3 above). So it doesn’t feel like a very clean solution.

arnold_w · January 5, 2023, 10:45am

I asked my boss and he was ok with me revealing more in-depth information about the protocol. There are special packets and regular packets in the protocol. The special packets only consist of one byte and they are unidirectional:
0xFF: Master → Slave. Wake-up byte (in case the slave is using stop mode and its system clock isn’t running).
0x00: Master → Slave. Slave granted next access.
0x3C: Master → Slave. Are you still connected?
0x3D: Slave → Master. Yes, I’m still connected.
0x3E: Slave → Master. Yes, I’m still connected, but please redo the discovery process (sometimes, the link-status includes additional connection information related to another bus, e.g. LINK_ESTABLISHED_WITH_PRODUCT_A_WITHOUT_A_CAN_BUS_LINK, LINK_ESTABLISHED_WITH_PRODUCT_A_WITH_A_CAN_BUS_LINK_TO_PRODUCT_B, etc).
The regular packets are readRegs, writeRegs, readRegsReply and writeRegsReply and they are bidirectional. 2 bits of the first byte in a regular packet represent the type of regular packet and the remaining 6 bits represent, loosely speaking, number of 16-bits regs involved in this packet (0 in case of a read regs reply with a non-zero status code). The total packet size in bytes can always be determined from the first byte alone. The remaining bytes are 32-bit start address (requests only), data to write/read (if appropriate), number of regs to read (if appropriate), status code (responses only) and CRC. The first byte in any packet type is always unique, it does not overlap between special and regular packets.
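
Since the first byte identifies the packet kind, the receive path can start with a simple classification. The sketch below covers only the five special bytes listed above and treats every other value as the start of a regular packet, without decoding the 2-bit type or the 6-bit count (their bit positions are not spelled out here). The type and function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the byte values are the ones listed above. */
typedef enum {
    FIRST_BYTE_WAKE_UP,          /* 0xFF, master -> slave */
    FIRST_BYTE_SLAVE_GRANTED,    /* 0x00, master -> slave */
    FIRST_BYTE_LINK_QUERY,       /* 0x3C, master -> slave */
    FIRST_BYTE_LINK_OK,          /* 0x3D, slave -> master */
    FIRST_BYTE_LINK_REDISCOVER,  /* 0x3E, slave -> master */
    FIRST_BYTE_REGULAR           /* anything else: start of a regular packet */
} firstByteKind_t;

firstByteKind_t classifyFirstByte(uint8_t b)
{
    switch (b) {
    case 0xFF: return FIRST_BYTE_WAKE_UP;
    case 0x00: return FIRST_BYTE_SLAVE_GRANTED;
    case 0x3C: return FIRST_BYTE_LINK_QUERY;
    case 0x3D: return FIRST_BYTE_LINK_OK;
    case 0x3E: return FIRST_BYTE_LINK_REDISCOVER;
    default:   return FIRST_BYTE_REGULAR;
    }
}

/* True if a first byte of this kind is expected at the receiving end.
 * Special packets are unidirectional; regular packets go both ways. */
bool firstByteExpectedAt(firstByteKind_t kind, bool receiverIsSlave)
{
    switch (kind) {
    case FIRST_BYTE_WAKE_UP:
    case FIRST_BYTE_SLAVE_GRANTED:
    case FIRST_BYTE_LINK_QUERY:
        return receiverIsSlave;
    case FIRST_BYTE_LINK_OK:
    case FIRST_BYTE_LINK_REDISCOVER:
        return !receiverIsSlave;
    default:
        return true;
    }
}
```

A special byte arriving at the wrong end (for example 0x3D received by the slave) can then be rejected before any further parsing.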

The protocol is based on a 1-wire half-duplex UART and a pin the slave controls (and the master subscribes to external interrupts on). One may ask why I didn’t simply use two independent UARTs instead, and the answer is that not all our hardware using this bus had two free UARTs. If only one pin is available, it can run in reduced 1-wire mode (the master can only be requestor and the slave can only be server), and this is the reason the master is fully in charge of the discovery process (the slave doesn’t read anything from the master during the discovery process; instead the master writes information about itself to the slave during the discovery process). During packet reception, the receiver investigates the packet each time it gets an idle interrupt (it doesn’t assume you only get one idle interrupt per packet). Everything, except the wake-up byte, is request-response based (you should always get exactly one response to your request).

When the master wants to initiate an access, it simply sends a readRegs/writeRegs packet and then receives the response packet. When the slave wants to initiate an access, it drives the request-to-send pin low and then waits to receive the special packet 0x00. If it does, it’s free to send its readRegs/writeRegs packet. If it instead receives a readRegs/writeRegs packet from the master, then there was a race condition and the master won; the slave should handle the master-initiated access, keep the request-to-send pin low and hope for better luck next time. When the slave has been granted an access, finished that access and fully interpreted the reply packet, it should drive the request-to-send pin high to indicate to the master that it is now idle (the master is free to initiate an access, if it wants to). Since the bus is running across a detachable cable, we must always use a timer to detect timeouts in case the cable is unplugged in the middle of an access.
In the case of an error (either a timeout or status code in a reply packet was non-zero), the access should be retried twice before giving up. There is a small risk that the 0x00 “Slave granted next access” special packet is received by accident by the slave if it’s powered-up after the master and the master has already started to initiate an access. In this case either the data on the UART will be corrupted (resulting in a CRC-error and re-transmission) or there will be timeouts because both the master and the slave will think they just initiated an access and now both of them are waiting for a reply packet. In this case the access will timeout and the retry will correct the situation.
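
The retry rule can be sketched as a small wrapper around a single access attempt. This is a simplified illustration, not my real code: `accessFn_t`, `accessWithRetry` and the `fakeAccess` stub are invented names, and the single attempt is assumed to report either a timeout or a non-zero status as a failure.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RETRIES 2u  /* "retried twice before giving up" */

typedef enum { ACCESS_OK = 0, ACCESS_TIMEOUT, ACCESS_BAD_STATUS } accessResult_t;

/* One attempt at an access (request plus reply). Hypothetical signature. */
typedef accessResult_t (*accessFn_t)(void *ctx);

/* Run one access with up to MAX_RETRIES retries, i.e. at most three
 * attempts in total. Returns the result of the last attempt. */
accessResult_t accessWithRetry(accessFn_t doAccess, void *ctx, unsigned *attemptsOut)
{
    assert(doAccess != NULL);

    accessResult_t result;
    unsigned attempts = 0;
    do {
        result = doAccess(ctx);
        attempts++;
    } while (result != ACCESS_OK && attempts <= MAX_RETRIES);

    if (attemptsOut != NULL) {
        *attemptsOut = attempts;
    }
    return result;
}

/* Stub for trying the wrapper out: times out until the counter that ctx
 * points to reaches zero, then succeeds. */
accessResult_t fakeAccess(void *ctx)
{
    unsigned *failuresLeft = ctx;
    if (*failuresLeft > 0) {
        (*failuresLeft)--;
        return ACCESS_TIMEOUT;
    }
    return ACCESS_OK;
}
```

With the stub set to fail twice, the wrapper succeeds on the third attempt; with three failures it gives up after the third attempt and reports the timeout.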

arnold_w · January 10, 2023, 1:14pm

My code works like this:

Local access (readRegs/writeRegs could be called from any task, except the UART-task, and the calling task could have higher or lower priority than the UART-task):

Take the semaphore (could potentially take “long” time, in case the bus is busy doing something else).
Copy the readRegs/writeRegs-parameters into static variables inside the low-level driver code.
Set an event bit, which will unblock the UART-task and the UART-task will then initiate the access, with the help of the static variables that were recently copied.
The calling task should wait for an event bit in return from the UART-task, that the access is finished.
When the event bit is set and received by the calling task, the calling task should give the semaphore, if there are no deferred accesses.

Remote or internal access (can only be called from the UART-task)

(Try to) take the semaphore with zero xBlockTime.
If the semaphore was taken, the access is initiated. Otherwise, set a flag that this access is deferred.

When any (local, internal or remote) access is completed, accessCompleteCallback is called in the UART-task. Inside accessCompleteCallback the deferred access flags are investigated and the subsequent deferred access (if any) is initiated. If there are deferred accesses, the semaphore should be inherited to the next access; otherwise
a) if the just completed access was a local access, then set the event bit that this access is now completed and also pass on information that the semaphore should be given by the task that called readRegs/writeRegs.
b) if the just completed access was not a local access, then give the semaphore directly in the end of accessCompleteCallback.

I believe I have a race condition. Let’s assume a local access was just completed and the calling task has less urgent priority than the UART-task. Inside accessCompleteCallback the deferred access list is investigated, and let’s assume there are no deferred accesses. Hence, the local access will finish after the accessCompleteCallback has finished executing and it will keep the semaphore until that point. But what if an internal access gets triggered right after the deferred access flags were investigated, but before the local access gave the semaphore? In that case the internal access would never be executed. Does anybody know how I can solve this?

arnold_w · January 10, 2023, 1:56pm

I have a feeling that I should somehow connect my deferred access flags to the event bits that my UART-task is waiting for in an infinite loop. Today, the UART-task waits for low-level event bits such as UART Tx/Rx interrupts, timer expiration, external interrupts, etc but maybe I should have a way to sometimes (when the bus is idle) also wait for high-level event bits such as initiate deferred access event bits?
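
A rough sketch of what I mean, written as plain C so the logic can be looked at without the RTOS: requests and the "access done" signal are all just event bits, and only the UART-task ever reads or changes the bus state. The names (`busState_t`, `busHandleEvents`, the `EVT_*` bits) and the fixed remote/internal/local order are assumptions made for this illustration, not my current code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical high-level event bits, alongside the low-level ones. */
enum {
    EVT_REQ_LOCAL    = 1u << 0,
    EVT_REQ_INTERNAL = 1u << 1,
    EVT_REQ_REMOTE   = 1u << 2,
    EVT_ACCESS_DONE  = 1u << 3
};
#define REQ_MASK ((uint32_t)(EVT_REQ_LOCAL | EVT_REQ_INTERNAL | EVT_REQ_REMOTE))

typedef struct {
    uint32_t pending;  /* request bits not yet served */
    uint32_t active;   /* request bit currently on the bus, 0 when idle */
} busState_t;

/* Called only from the UART-task with the bits it just woke up on.
 * Returns the request bit started by this call, or 0 if none was started. */
uint32_t busHandleEvents(busState_t *s, uint32_t events)
{
    assert(s != NULL);

    s->pending |= events & REQ_MASK;   /* remember every new request */
    if (events & EVT_ACCESS_DONE) {
        s->active = 0;                 /* bus is idle again */
    }
    if (s->active != 0 || s->pending == 0) {
        return 0;                      /* busy, or nothing to do */
    }

    /* Illustrative fixed order; the real policy could be anything. */
    uint32_t next = (s->pending & EVT_REQ_REMOTE)   ? (uint32_t)EVT_REQ_REMOTE
                  : (s->pending & EVT_REQ_INTERNAL) ? (uint32_t)EVT_REQ_INTERNAL
                  :                                   (uint32_t)EVT_REQ_LOCAL;
    s->pending &= ~next;
    s->active = next;
    return next;
}
```

A request that arrives while the bus is busy simply stays in `pending` and is picked up on the wake-up that carries EVT_ACCESS_DONE, so there is no window between checking the deferred flags and giving a semaphore. In the real task the `events` argument would be whatever xEventGroupWaitBits returned.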

adamds · January 18, 2023, 7:37pm

Arnold, are you still having issues? I am following up to see if you are stuck or if you have resolved your issue.

arnold_w · January 19, 2023, 7:54am

Yes, I’m still struggling. I’m basically trying to build a house without a drawing because I don’t know what the drawing should look like. I have written and implemented everything (well aware there are fundamental issues) and now I test it and solve all the bugs using ugly hacks. For the most part it seems stable, but I recently found an issue in the discovery process: if I disconnect the cable during the discovery process many times, it will usually hang within 25 attempts (my statemachine gets destroyed and the semaphore isn’t given afterwards). I plan to solve this using an ugly hack.