Need a primitive to abort a queue wait (force return from xQueueReceive())

system · August 13, 2015, 9:08pm

zradouch wrote on Thursday, August 13, 2015:

I need to extend FreeRTOS to allow me to abort a queue wait, i.e., to release a task suspended in xQueueReceive().
Here’s the big picture – I run WolfSSL on top of the LwIP stack with FreeRTOS below the stack. My app needs full duplex network connectivity, but LwIP is unfortunately only half duplex (either send() or receive()), so in order for my app to transmit it needs to abort the pending receive (which eventually translates to xQueueReceive).
My solution/hack is to create a new OS primitive: AbortQueueReceive(queue, task) to be called by the transmit task, that verifies that the specified [receive] task is indeed blocked on the specified queue, and releases the task with an error, as if the xQueueReceive() simply timed out.

I am looking for any pointers, suggestions on where to start, or other ideas or solutions I may not have considered. By the way, I do not consider polling a solution.

Thanks
-Z

richard-damon · August 13, 2015, 10:39pm

richard_damon wrote on Thursday, August 13, 2015:

How about posting a message to the queue the task is waiting on?

system · August 14, 2015, 2:27am

zradouch wrote on Friday, August 14, 2015:

Note that this is a TLS (over TCP) stack waiting for data. Neither the producer (TCP state machine) nor the consumer (rx socket) are my code. I can’t simply inject something into the rx data stream. Also note that such injection would have to be properly synchronized with the reader task, since it would be racing with a real rx frame possibly arriving on the same queue at “the same time”. Finally, to inject an item that would have the right type, yet that would be rejected by the stack, would be extremely difficult if not impossible, requiring detailed understanding of both the TCP and the TLS layers.
Aborting the read with an error, while quite a hack when it comes to modularity, will result in a clean, timeout-like error propagating all the way to the app. Doing this requires neither modifications, nor understanding of either the TCP stack or the TLS layer riding on top of it.

htibosch · August 14, 2015, 6:45am

heinbali01 wrote on Friday, August 14, 2015:

Hi Zdenek,

There are many ways to reach your goal.

Here are some ideas for FreeRTOS+TCP :

One way is to use two separate tasks, one for sending and one for receiving. But you probably want to do both things from within a single task?

Use FreeRTOS_select() to wait for RX, TX and Error events on a set of sockets.

    #define ipconfigSUPPORT_SELECT_FUNCTION    1

This method will need a small hack: make a new function:

    BaseType_t FreeRTOS_SelectAbort( SocketSet_t xSocketSet );

which can be called from any other task.

A possible implementation would be:

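    BaseType_t FreeRTOS_SelectAbort( SocketSet_t xSocketSet )
    {
    SocketSelect_t *pxSocketSet = ( SocketSelect_t * ) xSocketSet;

        /* Wake the task blocked in FreeRTOS_select() by setting the
        'exception' bit in the socket set's event group. */
        xEventGroupSetBits( pxSocketSet->xSelectGroup, eSELECT_EXCEPT );

        return pdTRUE;
    }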

Here is yet another way, define in FreeRTOSIPConfig.h :

    #define ipconfigSOCKET_HAS_USER_SEMAPHORE    ( 1 )

Your task must create a semaphore and, instead of calling FreeRTOS_recv() in a blocking way, your task will block on the semaphore. You can use any value for the time-out (maximum blocking time).

After creating a socket, advertise your semaphore:

    #if( ipconfigSOCKET_HAS_USER_SEMAPHORE != 0 )
        /* Pass the address of the semaphore handle and the size of that handle. */
        FreeRTOS_setsockopt( xMySocket, 0, FREERTOS_SO_SET_SEMAPHORE,
            ( void * ) &( xSemaphore ), sizeof( xSemaphore ) );
    #endif

Please note the ampersand in &( xSemaphore ) : setsockopt() wants to receive a pointer to the value.

Now you can do the following, with either Select() or using a Semaphore:

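    /* Set by another task, so declared volatile. */
    volatile BaseType_t xMustSendMessage = pdFALSE;

    void vMyTask( void *pvParameters )
    {
    BaseType_t lRc;
    char pcBuffer[ 128 ]; /* Size chosen arbitrarily for this example. */
    #if( ipconfigSOCKET_HAS_USER_SEMAPHORE != 0 )
        SemaphoreHandle_t xSemaphore;

        /* Advertise this semaphore to the socket with
        FREERTOS_SO_SET_SEMAPHORE, as shown earlier. */
        xSemaphore = xSemaphoreCreateBinary();
    #elif( ipconfigSUPPORT_SELECT_FUNCTION != 0 )
        SocketSet_t xSocketSet;

        xSocketSet = FreeRTOS_CreateSocketSet();
        /* Watch the socket for read and exception events. */
        FreeRTOS_FD_SET( xMySocket, xSocketSet, eSELECT_READ | eSELECT_EXCEPT );
    #endif

        for( ;; )
        {
        const TickType_t ulMaxBlockTime = pdMS_TO_TICKS( 5000ul );

            /* Block until there is socket activity, another task wakes
            us, or the time-out expires. */
            #if( ipconfigSOCKET_HAS_USER_SEMAPHORE != 0 )
            {
                xSemaphoreTake( xSemaphore, ulMaxBlockTime );
            }
            #elif( ipconfigSUPPORT_SELECT_FUNCTION != 0 )
            {
                FreeRTOS_select( xSocketSet, ulMaxBlockTime );
            }
            #endif

            /* Non-blocking read: returns immediately if there is no data. */
            lRc = FreeRTOS_recv( xMySocket, pcBuffer,
                sizeof pcBuffer, FREERTOS_MSG_DONTWAIT );
            if( lRc > 0 )
            {
                /* Data received */
            }
            else if( lRc < 0 )
            {
                /* Connection error, probably -pdFREERTOS_ERRNO_ENOTCONN
                or -pdFREERTOS_ERRNO_ENOMEM. */
            }

            if( xMustSendMessage != pdFALSE )
            {
                xMustSendMessage = pdFALSE;
                /* The wait was ended early because we must send
                a message. */
            }
        }
    }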

Got the idea?

The above is not a polling solution: the task may sleep/block as long as it wants.

Regards.

rtel · August 14, 2015, 9:39am

rtel wrote on Friday, August 14, 2015:

Unfortunately aborting a wait on a queue is not an easy thing to do. When the waiting task leaves the blocked state it will check the queue again - if there is data in the queue it will remove the data and exit the function, if there is no data in it then the task will adjust its block time to see if the block time has expired - and if the block time has not expired simply return to block on the queue once more. So to abort a wait on a queue you must make the task think its block time has expired - but the block time is a function parameter - so either on the stack or in a register.

The code works this way for the following reason - a task can be removed from the blocked state because the queue contains data, only to find that by the time it actually runs again a higher priority task has removed the data from the queue - leaving the queue empty once more. In that case it is incorrect behaviour for the task to leave the xQueueReceive() function [although some RTOSes will happily do it] because it has neither received data nor timed out.
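To make the wake-up rule above concrete, here is a small plain-C model of it. This is only a sketch: `WakeEvent`, `model_receive` and the result names are made up for illustration and are not taken from the FreeRTOS sources. It replays a list of wake-ups and returns only when data is present or the block time has run out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { RX_OK, RX_TIMEOUT, RX_STILL_BLOCKED } RxResult;

/* One wake-up of the blocked task (hypothetical model, not FreeRTOS code). */
typedef struct {
    unsigned tick;       /* tick count when the task runs again */
    bool     data_ready; /* is there still data in the queue at that point? */
} WakeEvent;

/* Replays a sequence of wake-ups against the rule described in the post:
 * return only when data was received or the block time has expired. */
RxResult model_receive(const WakeEvent *wakes, size_t count,
                       unsigned start_tick, unsigned block_ticks,
                       size_t *wakes_consumed)
{
    assert(wakes_consumed != NULL);
    *wakes_consumed = 0;

    for (size_t i = 0; i < count; i++) {
        *wakes_consumed = i + 1;
        if (wakes[i].data_ready) {
            return RX_OK;        /* data present: take it and leave */
        }
        if (wakes[i].tick - start_tick >= block_ticks) {
            return RX_TIMEOUT;   /* block time used up */
        }
        /* Queue empty and time remaining: go back to the blocked state. */
    }
    return RX_STILL_BLOCKED;     /* no more wake-ups in this replay */
}
```

A wake-up that finds the queue empty with time remaining never reaches a `return`, which is why simply unblocking the task does not make the receive call exit.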

In this case I would suggest you have two options. One is to block on the queue in a loop with a short block time. Each time the xQueueReceive function returns you check to see if you have data, if you don’t have data, you then check to see if you should abort waiting - and only try reading from the queue again if there is no reason to abort.

Another alternative is to block on an event group instead of blocking on the queue directly - one bit in the event group can mean “abort waiting”, while another bit can mean “there is something in the queue”. Only if the latter bit gets set do you try reading from the queue.
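The event-group idea boils down to a decision on which bits the task woke up with. A minimal sketch in plain C, assuming two made-up bit assignments (`BIT_ABORT_WAIT`, `BIT_QUEUE_DATA`) and a hypothetical helper `handle_wake_bits`; in a real application the bits would come from the event group rather than a plain variable. Checking the abort bit first is a choice made for this sketch, not something the post prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for this sketch. */
#define BIT_ABORT_WAIT  (1u << 0)   /* "abort waiting" */
#define BIT_QUEUE_DATA  (1u << 1)   /* "there is something in the queue" */

typedef enum { ACT_WAIT_AGAIN, ACT_READ_QUEUE, ACT_ABORT } Action;

/* Decide what the receiving task does with the bits it woke up with.
 * Handled bits are cleared in *pending so they are acted on only once. */
Action handle_wake_bits(uint32_t *pending)
{
    assert(pending != NULL);

    if (*pending & BIT_ABORT_WAIT) {
        *pending &= ~BIT_ABORT_WAIT;
        return ACT_ABORT;        /* the data bit, if set, stays for later */
    }
    if (*pending & BIT_QUEUE_DATA) {
        *pending &= ~BIT_QUEUE_DATA;
        return ACT_READ_QUEUE;   /* only now touch the queue */
    }
    return ACT_WAIT_AGAIN;       /* nothing relevant: block again */
}
```

Because the queue is read only after the data bit is seen, the read itself can use a zero block time and never needs to be aborted.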

Regards.

system · August 14, 2015, 10:41pm

zradouch wrote on Friday, August 14, 2015:

Thanks, I do appreciate your suggestions. Unfortunately, if you re-read my post, most of these suggestions are nonstarters.

@Hein Tibosch

Re 1)
You can’t have two tasks talking to the same half-duplex socket. Half duplex means that the socket can handle only a single direction transfer at a time.

Re 2)
My code is two third-party software packages away from the FreeRTOS calls. I have to work with that. I just know that my app task is blocked in xQueueReceive. I have no idea how it got there, nor do I know anything semantically about the data to be received.

Re 3)
Same problem as in 2)

@Real Time Engineers Ltd.

“block on the queue in a loop with a short block time”

I call this polling (or, if you prefer, polling in disguise).
I don’t want the latency and I don’t have the cycles. Plus, it would not pass my own code review; there are [very] few cases that justify polling, and this is not one of them.
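The latency and cycle cost of the short-block-time loop can be put in rough numbers. A back-of-the-envelope sketch, assuming the task wakes once per block period when no data arrives and checks the abort condition only on wake-up; `PollCost` and `poll_cost` are names invented for this example.

```c
#include <assert.h>

typedef struct {
    unsigned idle_wakeups_per_second; /* wake-ups when no data arrives */
    unsigned worst_abort_delay_ms;    /* longest time before an abort is noticed */
} PollCost;

/* Cost of blocking on the queue in a loop with a short block time. */
PollCost poll_cost(unsigned block_time_ms)
{
    PollCost cost;

    assert(block_time_ms > 0u);
    cost.idle_wakeups_per_second = 1000u / block_time_ms;
    cost.worst_abort_delay_ms = block_time_ms;
    return cost;
}
```

With a 20 ms block time this gives 50 idle wake-ups per second and up to 20 ms before an abort request is noticed; shortening the block time trades one cost for the other.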

“Another alternative is to block on an event group instead of blocking on the queue directly”

See Re 2) above.

I do understand that aborting the wait is not an easy thing to do. I have been working inside multiple OSs for thirty years. If it was an easy thing, I would have simply done it and not asked for help here :-). I have been using FreeRTOS for a while, and in the recent years it has become my RTOS of choice. But so far, I have made only minor modifications to the FreeRTOS code base, thus my understanding of it is still limited.

I have taken a quick look at the queue code, and concluded that the fake timeout would be by far my best bet. Note both the queue and task in my proposed API. I think I can just mark the task with the fake timeout state, and unblock it. Subsequently, after checking the queue and finding it empty, the task will notice that it has been marked with the fake timeout and will use this information to avoid checking the real timeout, and thus simply return an error to the direct caller (LwIP, which is supposed to propagate the error to the WolfSSL caller, which in turn should deliver the error to the [app] caller).
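The fake-timeout idea can be modeled in a few lines of plain C. This is a sketch under stated assumptions, not kernel code: `ModelTask`, `model_abort_queue_receive` and `model_on_wake` are hypothetical names, and a real implementation would also have to unblock the task and protect the marker against concurrent access. FreeRTOS releases after this thread added xTaskAbortDelay(); its exact behaviour should be checked in the official documentation rather than inferred from this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAKE_RX_OK, WAKE_TIMEOUT, WAKE_BLOCK_AGAIN } WakeDecision;

/* Hypothetical per-task state; this is not FreeRTOS's real TCB. */
typedef struct {
    const void *blocked_on;      /* queue the task is waiting on, or NULL */
    bool        abort_requested; /* the "fake timeout" marker */
} ModelTask;

/* Hypothetical AbortQueueReceive(queue, task): refuse unless the task is
 * really blocked on that queue, otherwise mark it. A real kernel would
 * also move the task to the ready list at this point. */
bool model_abort_queue_receive(const void *queue, ModelTask *task)
{
    assert(task != NULL);
    if (queue == NULL || task->blocked_on != queue) {
        return false;
    }
    task->abort_requested = true;
    return true;
}

/* Decision taken each time the task wakes up inside the receive call. */
WakeDecision model_on_wake(ModelTask *task, bool queue_has_data,
                           bool real_timeout_expired)
{
    assert(task != NULL);
    if (queue_has_data) {
        return WAKE_RX_OK;               /* real data is still delivered */
    }
    if (task->abort_requested) {
        task->abort_requested = false;   /* consume the one-shot marker */
        return WAKE_TIMEOUT;             /* reported like a normal timeout */
    }
    return real_timeout_expired ? WAKE_TIMEOUT : WAKE_BLOCK_AGAIN;
}
```

One subtle point the model exposes: if data arrives before the marker is seen, the marker stays set in this sketch and would cut short the next empty wait. Whether to clear it on every return is a design decision to settle early.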

So that’s the theory. I could really use a few pointers in order to a) find the best place to start hacking, and b) make sure I am not missing some subtle issues, which are quite likely in code like this.

Cheers,
-Z

richard-damon · August 14, 2015, 11:22pm

richard_damon wrote on Friday, August 14, 2015:

Modifying the system to “Abort” a Queue wait (in effect forcing a timeout) has the same level of issues as putting a “null” entry into the queue: you still need to understand how the code is going to react to it, and likely there are issues which need to be changed. (Adding the abort just requires changes to TWO pieces of code instead of one.) I also suspect that it would take some study to confirm that this is just what is needed to convert a half duplex driver into full duplex (you will need to confirm that NO data is reused between transmit and receive sections).

system · August 15, 2015, 12:31am

zradouch wrote on Saturday, August 15, 2015:

The LwIP code expects one of three cases when xQueueReceive returns:

Success; a pointer (to an internal data structure) has been read from the queue.
Success; a NULL has been read from the queue (indicating closed connection).
Error; the error will be propagated to the caller of the socket recv().

I can’t push a NULL; that would shut down the connection. I certainly don’t want to push some [invalid] internal structure and hope it will work. The only option left is to return an error.
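The three outcomes listed above can be written down as a small classification, which shows why only the error path is usable for an abort. A sketch only: `RecvOutcome` and `classify_receive` are illustrative names, not LwIP code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { RECV_DATA, RECV_CLOSED, RECV_ERROR } RecvOutcome;

/* Map the result of the queue read onto the three cases from the post. */
RecvOutcome classify_receive(bool receive_succeeded, const void *item)
{
    if (!receive_succeeded) {
        return RECV_ERROR;       /* propagated to the caller of recv() */
    }
    /* Success: a non-NULL item is data, NULL means the connection closed. */
    return (item != NULL) ? RECV_DATA : RECV_CLOSED;
}
```

Anything pushed into the queue lands in one of the two success cases, so a forced failure of the read is the only way to reach `RECV_ERROR` without touching the connection state.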

Note that my solution keeps the socket strictly half-duplex; I send only if the receive is not pending. This will work by definition; no further study is needed.

richard-damon · August 15, 2015, 5:44pm

richard_damon wrote on Saturday, August 15, 2015:

Let me begin by reiterating that I haven’t looked at LwIP so I am only going off general principles.

You start with a basic problem, you need a full duplex stack, but only have a half-duplex stack. This is a fundamental problem, and assuming a ‘quick fix’ will help here sounds like a dangerous assumption (if that was all that was needed to make it really work full duplex, the author/community are very apt to have made such an improvement).

My concern is that by forcing a time-out here in the middle of a reception, you may well break that reception. Thus what should be a single message to the receiving task may well be broken up into multiple messages (remember, by forcing a timeout you have said there was too long a gap between packets).

There may well be reuse of variables between the receiving operation and the transmitting operation (after all, the library was written as half-duplex, so they don’t both occur at once). So this sort of ‘tampering’ with the library may have consequences.

If I were to do this, I would need to study LwIP enough to know what is needed to make it work full duplex and work there.

You seem to object to modifying LwIP because it is “Third Party”, well to your application, so is FreeRTOS. (Is the FreeRTOS Support that much better? Maybe).

IF you have really determined that all that is needed to make things work is to get that one xQueueReceive to abort its wait, then it would seem that it would be a much smaller change to add a test in LwIP there for a special packet that you make, and treat it like a timeout (that is likely a line of code or two) than to make the more fundamental change to FreeRTOS.

I am not sure why you think you aren’t making this full duplex. If something is blocking on reading data, it obviously thinks that the current direction is a read, and thus (barring a bug somewhere) so does the other end. You can’t just arbitrarily switch the direction and still be truly half-duplex.

system · August 15, 2015, 10:10pm

zradouch wrote on Saturday, August 15, 2015:

Hi Richard,

You are right that both LwIP and FreeRTOS are third-party packages to me. So why do I want to modify FreeRTOS? Actually for multiple reasons. First of all, this is not the first time I encountered the problem of needing to abort a call in FreeRTOS, and I feel that having a mechanism similar to a signal in Unix would be a great enhancement. Secondly, based on my quick evaluation of the respective code bases, and based on my own experience, it will be much easier for me to modify FreeRTOS than to modify LwIP. Finally, I already made some modifications to the FreeRTOS code I run, and I have not touched the LwIP, so from the maintenance point of view any future changes should be made in the RTOS.

Regarding the TCP stack, I am afraid you are missing a few key points.
The duplex property can be, and in this case is, different in each layer of the communication stack. The Ethernet, once half duplex, is full duplex today. The TCP stack engine in LwIP may be (I don’t know for sure, and neither do the LwIP folks :-() full duplex, too. The socket interface in LwIP is definitely half duplex. My app supports full duplex mode if it can run that way. So, when you say “true half duplex” it does not really mean anything (which layer?).

I can’t force an LwIP socket to be full duplex; that would require fundamental redesign. So, as I stated before, I will work with its half duplex semantics. The question is how to design a peer-to-peer app on top of a half duplex channel? One possible answer is to transmit whenever the app needs to transmit, and to suspend in receive whenever the app is not transmitting. This directly implies being able to abort the receive, and luckily for me the receive is abortable by a [commonly used] timeout. So, at the end of the day, all I need is to control the socket interface, and fool it into thinking it is time to time out. It is by far the cleanest solution I have seen so far.

htibosch · August 16, 2015, 1:18am

heinbali01 wrote on Sunday, August 16, 2015:

Hi Zdenek,

Both FreeRTOS+TCP and lwIP are full duplex tcp/ip stacks.

For lwIP running on top of FreeRTOS, a call to lwip_recv() may end-up in a call to xQueueReceive().
For FreeRTOS+TCP, a call to FreeRTOS_recv() may block in xEventGroupWaitBits().

When xxx_recv() is called, the socket option ‘SO_RCVTIMEO’ determines the maximum blocking time.

Giving ‘SO_RCVTIMEO’ a low value would be a ‘polling solution’, true.

In my opinion there is nothing against polling solutions unless:

You need super-fast response times (e.g. < 10ms)
Your platform has to run “low power”

Both TCP/IP stacks provide possibilities to create an application that works full-duplex, in ways that I described here above. Those are clean and non-polling solutions. The best way I think is to use select().

Wolf’s implementation of SSL calls lwip_recv() from only one place (io.c). It shouldn’t be hard to replace that define with a call which is interruptible?

Adding signal(2)/kill(2) to FreeRTOS sounds attractive. But you’d still have to know:

which socket is being used, that is encapsulated in a SSL object
which QueueHandle_t that lwIP socket is using, that is encapsulated in the lwIP stack.

I’m not sure which way is more work

FreeRTOS already has many possibilities to interrupt blocking calls. An extra kill() doesn’t add much attractiveness and it would be extremely complex to implement in a neat and Real-Time way.

Best regards.

system · August 16, 2015, 12:54pm

zradouch wrote on Sunday, August 16, 2015:

Hi Hein,

I am not sure what makes you say that LwIP is a full duplex stack, unless you are playing with semantics and want to claim that the stack is full duplex, but its user interface is not. LwIP never supported multiple threads performing i/o on a socket. I was told (just within this last week) that there is an effort and pre-alpha code that attempts to fix that. I am deploying in half a million devices; I can’t afford using something that may or may not work. Still, it sounds like I may be missing something, so please feel free to tell me what it is.

I suggest you try setting SO_RCVTIMEO to 20 ms (you seem to think <10 would be a problem) and see what happens to your CPU usage on that particular task.

Replacing Wolf’s call to something other than recv() is certainly a simple change, but it does not help unless I have an interruptible and semantically equivalent (to recv()) call I could make. I have experimented with that already – I had to rewrite the LwIP select() to make that possible. It took too much code and changes in both Wolf and LwIP.
Perhaps naively, but based on my limited experience with FreeRTOS internals, I am thinking a FreeRTOS solution would be much more maintainable.

Note that there is a difference between modifying a third party code, and [ab]using a third party code by touching what should be private data structures (to retrieve sock, queue, etc). Also note that I don’t care which one is “more work”; I can afford to hire someone to make this work (if I could find them). What I care about is maintainability as this is deployed in a commercial product, and the maintenance will cost way more than any development that would be needed to make this work.

Finally, you say:

“it would be extremely complex to implement in a neat and Real-Time way”

Please explain this statement, especially in light of my suggestion of how it could work (in my previous post). Ultimately, this is the information I came to find here at the FreeRTOS forum. It appears to me that what I proposed is neither complex, nor does it carry any negative implications on the real-time aspects of the resulting implementation, so if I am wrong I want to know.

Cheers,
-Z

htibosch · August 17, 2015, 2:42pm

heinbali01 wrote on Monday, August 17, 2015:

Hi Zdenek,

“I am not sure what makes you say that LwIP is full duplex stack…”

To start with: most PHYs and most network drivers are working full duplex, using DMA chains.
In my opinion, if select() is being used, the Berkeley interface can be seen as full-duplex. But apparently you had to change things to make that work well under lwIP.
A second way is to use two tasks, one for reading and one for writing. As far as I know that works without a problem (I’m now testing that continuously with lwIP on a Xilinx Zynq, looks good).

( what I always miss in lwIP is a quick and easy way to find-out the status of a TCP connection )

“I am deploying in half a million devices”

Lucky you

“I can’t afford using something that may or may not work.”

I guess that the addition of kill/signal to FreeRTOS won’t make lwIP more reliable?

“I suggest you try setting SO_RCVTIMEO to 20 ms (you seem to think <10 would be a problem) and see what happens to your CPU usage on that particular task.”

It depends on the platform of course. But compared to what it takes to calculate SSL, even a time-out of 1 ms won’t give much CPU usage.

“I had to rewrite the LwIP select() to make that possible. It took too much code and changes in both Wolf and LwIP.”

FreeRTOS+TCP will be easier to use: it takes full advantage of all FreeRTOS possibilities, not only semaphores and queues.

Interesting: when +TCP was being developed (around 2013), we were thinking of introducing a signal(), it could have solved several problems.
But instead of signal(), Richard (Barry) came up with the new module ‘event_groups.c’. The same instance of EventGroupHandle_t is now being used by several APIs:

FreeRTOS_recv() will wait for RX and ERR bits
FreeRTOS_send() will wait for TX and ERR bits
FreeRTOS_connect() will wait for TX and ERR bits
FreeRTOS_accept() will wait for RX and ERR bits

If the IP-task sets the ERR bit of a socket, all blocked API calls (each waiting in xEventGroupWaitBits()) will abort simultaneously.
The +TCP select() functions also use event groups, of course.
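The effect of a shared ERR bit can be illustrated with a plain-C bitmask model. This is only a sketch: `EVT_RX`, `EVT_TX`, `EVT_ERR`, `Waiter` and `count_released` are hypothetical names for this example and do not reproduce the actual FreeRTOS+TCP definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; the real FreeRTOS+TCP names are not used here. */
#define EVT_RX   (1u << 0)
#define EVT_TX   (1u << 1)
#define EVT_ERR  (1u << 2)

typedef struct {
    const char *api;        /* which call the task is blocked in */
    uint32_t    wait_mask;  /* bits that end the wait */
} Waiter;

/* The four blocking calls listed above, each waiting on its own bits. */
static const Waiter socket_waiters[] = {
    { "recv",    EVT_RX | EVT_ERR },
    { "send",    EVT_TX | EVT_ERR },
    { "connect", EVT_TX | EVT_ERR },
    { "accept",  EVT_RX | EVT_ERR },
};

/* Count how many waiters would be released when set_bits are set. */
size_t count_released(const Waiter *waiters, size_t count, uint32_t set_bits)
{
    size_t released = 0;

    assert(waiters != NULL);
    for (size_t i = 0; i < count; i++) {
        if (waiters[i].wait_mask & set_bits) {
            released++;
        }
    }
    return released;
}
```

Setting `EVT_ERR` releases all four waiters in this model, while `EVT_RX` or `EVT_TX` alone releases only the two calls that include that bit in their mask.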

“I am thinking a FreeRTOS solution would be much more maintainable.”

FreeRTOS indeed has a very good maintainer, of course, but he tends to be quite conservative
And that is probably the reason that the source code is still so clean and comprehensible.

“What I care about is maintainability as this is deployed in a commercial product, and the maintenance will cost way more than any development that would be needed to make this work.”

What do you mean by maintenance? Once a product is tested and sold, should it change in the future? Are you expecting new ‘issues’ once it is in the field?
Are you planning to keep on upgrading the libraries used?

“Finally, you say: ‘it would be extremely complex to implement in a neat and Real-Time way’. Please explain this statement, especially in light of my suggestion of how it could work (in my previous post).”

If you think it is doable, give it a try!
I am mostly concentrating on other FreeRTOS products like +TCP and +FAT.

Regards, Hein

htibosch · August 18, 2015, 9:54am

heinbali01 wrote on Tuesday, August 18, 2015:

Hi,

For anyone who is interested: I attached sources for two quick-and-dirty echo servers:

lwip_webserver.c for lwIP
plus_webserver.c for FreeRTOS+TCP

For each new client, two tasks will be started: echo_recv_task and echo_send_task.

I must say that in the lwIP version, the tasks sometimes get stuck: they get no more cycles and they never return. The +TCP echo server still runs OK and it has no orphans.

If you compare the sources, you’ll see some differences between the two TCP stacks, both in style and naming.

Regards.