Delete and restart a task blocked on FreeRTOS_accept()

I have a server task blocked on FreeRTOS_accept():

ctx->s = FreeRTOS_accept(*s, &addr, &addrlen); //ctx->s is a Socket_t, s is a Socket_t *

Under certain circumstances, I want to restart this server task by running vTaskDelete() followed by xTaskCreate().

If the task is not yet blocked on FreeRTOS_accept(), then this works just fine. After the task restarts, the socket can be bound and placed in a listening state, the task blocks on accept again, and we continue happily…

If the task is blocked on FreeRTOS_accept(), then I cannot find a way to unblock it, delete the task, and then restart it. The task does restart, but then the call to FreeRTOS_bind() gives me an error:

vSocketBind: TCP port [port] in use

I have tried closing the socket with the code below, but it doesn’t enter the conditional (i.e., ctx->s is NULL).

if (ctx->s != NULL) { // <-- this conditional is not satisfied
  FreeRTOS_shutdown(ctx->s, FREERTOS_SHUT_RDWR);

  // flush

  FreeRTOS_closesocket(ctx->s);
  ctx->s = NULL;
}

What I think I want to be able to do is force FreeRTOS_accept() to return, and then close the socket. I have tried FreeRTOS_SignalSocket(), but I don’t think that’s what it’s for (and it didn’t work…).

Key point: aborting a task does NOT release the things the task has acquired, so it is generally a bad idea. It isn’t like a system such as Linux, where you can abort a process and get cleanup to happen; tasks are more like threads than processes in this regard.

I was trying to follow the model in the Simple TCP Echo Server demo:

The main task loop just accepts a connection on a socket and then creates a task to handle that connection, deleting the task when the connection is handled and then running through the loop again.

It seems like a reasonable use case for some logic in that main loop (or in another monitoring task) deciding that blocking on the FreeRTOS_accept() call is no longer desired or required. That’s effectively what I’m trying to do here. Is there no way to gracefully exit from that blocked call to FreeRTOS_accept()?

As often, I agree with Richard Damon.

Before killing a task, you should make sure that the task has released all handles, mutexes, sockets, heap allocations, anything!

I must admit that I never kill a task. When I want a task to terminate, I send it a request to terminate. The task knows exactly what resources must be released.

Note that a call to FreeRTOS_accept() can also return after a timeout. This timeout for FreeRTOS_accept() is determined by ipconfigSOCK_DEFAULT_RECEIVE_BLOCK_TIME, or by an individual socket property set with FREERTOS_SO_RCVTIMEO.

Hein,

Thanks. What do you use to send the request to the task? A notification? Can you point me to an example (GitHub or a demo somewhere) where the task is running through its normal infinite loop, but can receive a request to delete itself?

Thanks,

Mike
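
For reference, the "request to terminate" approach described above can be sketched in plain C. This is only a host-side model under stated assumptions, not code from the demos: server_ops_t, server_ctx_t, server_run() and the demo_* fakes are hypothetical names, and accept_timed stands in for a FreeRTOS_accept() call on a listening socket with a receive timeout (FREERTOS_SO_RCVTIMEO), assumed here to return NULL when it times out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the socket layer, so the loop can run on a host.
 * On target, accept_timed would wrap FreeRTOS_accept() on a listening socket
 * with a receive timeout; in this sketch it returns NULL on timeout. */
typedef struct {
    void *(*accept_timed)(void *listener);
    void  (*close_socket)(void *sock);
} server_ops_t;

typedef struct {
    volatile bool stop_requested; /* set by the monitoring task */
    void *listener;               /* owned by the server task */
} server_ctx_t;

/* Serve until a stop is requested, then release what the task owns.
 * Returns the number of connections accepted. */
int server_run(server_ctx_t *ctx, const server_ops_t *ops)
{
    int accepted = 0;

    while (!ctx->stop_requested) {
        void *client = ops->accept_timed(ctx->listener);
        if (client == NULL) {
            continue; /* timeout: go round and re-check the stop flag */
        }
        accepted++;
        /* ... handle the connection here ... */
        ops->close_socket(client);
    }

    /* The task cleans up its own resources before it ends. */
    ops->close_socket(ctx->listener);
    ctx->listener = NULL;
    return accepted;
}

/* --- Host-side fakes, only there to exercise the loop --- */
static server_ctx_t demo_ctx;
static int demo_calls, demo_closed;
static int demo_listener, demo_client;

static void *demo_accept(void *listener)
{
    (void)listener;
    demo_calls++;
    if (demo_calls == 2) {
        return &demo_client;            /* one incoming connection */
    }
    if (demo_calls == 4) {
        demo_ctx.stop_requested = true; /* "another task" asks us to stop */
    }
    return NULL;                        /* everything else is a timeout */
}

static void demo_close(void *sock)
{
    (void)sock;
    demo_closed++;
}

int demo(void)
{
    const server_ops_t ops = { demo_accept, demo_close };

    demo_ctx.stop_requested = false;
    demo_ctx.listener = &demo_listener;
    demo_calls = 0;
    demo_closed = 0;
    return server_run(&demo_ctx, &ops);
}
```

On target, the flag could just as well be a task notification or a semaphore, and the task would call vTaskDelete( NULL ) on itself once server_run() has returned. The worst-case reaction time is one accept timeout.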

What if you just close the server socket (from another task)?
accept should then error out (I guess with -pdFREERTOS_ERRNO_EINVAL) and you could restart the server loop by re-creating the socket and waiting for incoming connections again …
Or simply set a global restart flag before close to give the server task an additional hint what to do on accept returning an error.

Killing and respawning tasks seems like a low hanging fruit but in fact the opposite is true !
This also applies to fully featured OSs like Linux or Windows with multi-threaded, single-process applications. This kind of application is kind of similar to embedded FreeRTOS applications with global, shared resources within the same address space.
You’ll easily run into any kind of resource leaks and all the related troubles.
Well, on resource limited embedded platforms you’ll probably be hit by e.g. out-of-memory problems much earlier :slight_smile:

The magic of cleaning up all used resources on kill/exit only applies to complete processes managed by a separate kernel executive on (much bigger) platforms supporting this model.

That’s the most important reason why all experts (here) like Richard and Hein don’t recommend using this approach, except maybe in rare special cases where they know exactly what they’re doing. And I fully agree with them.
Another important point might be deterministic resource usage. With all required tasks created once, you can’t run into problems creating some of them later on and unexpectedly running short of system resources like memory, e.g. due to side effects related to heap management.

You’ve been warned :wink:

Hartmut (and Hein and Richard),

First, the solution to close the socket from another task is working as expected, so I’ve marked your response as my solution.

Second, I accept, heed, and appreciate all the warnings and concern over killing one task from another.

For background: I’m working with FreeRTOS ported to a research platform (https://www.cl.cam.ac.uk/research/security/ctsrd/). CHERI is a hybrid capability system that provides hardware support for spatial memory safety. I am currently evaluating mechanisms to recover deterministically from the signal provided by the hardware when a memory safety violation is identified. The signal effectively breaks me out of the task and leaves the task in an unknown (or at least untrusted) state, so I don’t think I can rely on the task’s own cleanup.

For initial tests, I’ve made the scope of any task resources that require cleanup (e.g., sockets or allocations) static global, and then I’ve defined companion functions for each task that can be called from the signal handler to clean those resources up. It’s pretty hacky, but I’ve avoided heap memory leaks so far, and appear to be able to restart tasks without side effects. I’d be very glad for recommendations on cleaner implementations, however.

Thanks again for the help.

Mike
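
One possible shape for those per-task companion functions is a small cleanup table: the task records each resource together with its release function as soon as it acquires it, and whoever tears the task down walks the table in reverse. The sketch below is an assumption about how that could look, not something from the FreeRTOS sources; cleanup_table_t, cleanup_register(), cleanup_run() and log_release() are hypothetical names, and MAX_CLEANUPS is an arbitrary limit.

```c
#include <stddef.h>

#define MAX_CLEANUPS 8 /* arbitrary limit for this sketch */

typedef void (*cleanup_fn_t)(void *resource);

typedef struct {
    cleanup_fn_t fn;
    void *resource;
} cleanup_entry_t;

/* One table per restartable task, kept outside the task's stack so that
 * a handler running in another context can reach it. */
typedef struct {
    cleanup_entry_t entries[MAX_CLEANUPS];
    size_t count;
} cleanup_table_t;

/* Record a resource right after acquiring it.
 * Returns 0 on success, -1 if the table is full. */
int cleanup_register(cleanup_table_t *t, cleanup_fn_t fn, void *resource)
{
    if (t->count >= MAX_CLEANUPS) {
        return -1;
    }
    t->entries[t->count].fn = fn;
    t->entries[t->count].resource = resource;
    t->count++;
    return 0;
}

/* Release everything in reverse order of acquisition and leave the table
 * empty, so a restarted task starts clean. Returns the number released. */
size_t cleanup_run(cleanup_table_t *t)
{
    size_t released = 0;

    while (t->count > 0) {
        cleanup_entry_t *e = &t->entries[--t->count];
        e->fn(e->resource);
        released++;
    }
    return released;
}

/* Example release function: logs the resource id so the order is visible.
 * A real one would call e.g. the socket close or the heap free routine. */
static int release_log[MAX_CLEANUPS];
static size_t release_log_len;

static void log_release(void *resource)
{
    release_log[release_log_len++] = *(int *)resource;
}
```

Whether the table itself can still be trusted after a memory safety fault is a separate question that this sketch does not address.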

Hartmut wrote:

What if you just close the server socket (from another task)?

I’m afraid that doing so may lead to a crash. After the socket is closed, the heap will be reused for something else, while the task is still in an API, using a pointer to the socket.

There are checks in all API’s, as many as possible, but the API can not see that the memory space has been given back to the heap.

Hartmut wrote:

Or simply set a global restart flag before close to give the server
task an additional hint what to do on accept returning an error.

Yes like that, or use task-notify, or a semaphore.

The API’s FreeRTOS_recv() and FreeRTOS_select() can be interrupted from outside, by calling :

#if ( ipconfigSUPPORT_SIGNALS != 0 )
    BaseType_t FreeRTOS_SignalSocket( Socket_t xSocket );
#endif

This feature was added for users of wolfSSL (Embedded SSL/TLS Library). That library has blocking calls to FreeRTOS_recv(), and it was not able to respond immediately to any event, only after receiving TCP data. Thanks to this signal, it could interrupt a call to FreeRTOS_recv(), and respond to a button press or whatever event.

We could extend the usage of this signal to accept() and other API’s, but I lack time. And also, it is not really compatible with BSD/Posix.

In my personal projects, I never have blocking calls for sockets, only polling. I let the task sleep ( block ) on a call to FreeRTOS_select() ( which can be signalled too ), or I have the task wait for a semaphore.

Here is a feature that is less known: you can have a socket signal a semaphore on any important event ( connected, disconnected, read, write, error ):

    #if ( ipconfigSOCKET_HAS_USER_SEMAPHORE == 1 )
        #define FREERTOS_SO_SET_SEMAPHORE          ( 3 ) /* Used to set a user's semaphore */
    #endif

    SemaphoreHandle_t xServerSemaphore;
    xServerSemaphore = xSemaphoreCreateBinary();
    /* Connect a socket to a semaphore. */
    FreeRTOS_setsockopt( xSocket, 0, FREERTOS_SO_SET_SEMAPHORE, ( void * ) &xServerSemaphore, sizeof( xServerSemaphore ) );

Now you can write a loop for the task:

    /* Let the task wake up at least every 2 seconds to do some regular tasks. */
    TickType_t xReceiveTimeOut = pdMS_TO_TICKS( 2000 );
    for( ;; )
    {
        xSemaphoreTake( xServerSemaphore, xReceiveTimeOut );
        /* Now check the socket with non-blocking calls. */
    }

With this method, any other task can get the attention of my task immediately, by signalling the semaphore.

I wrote several big TCP servers with this technique. There is a minor overhead when calling recv() in a non-polling way.

Summary: you can either use FreeRTOS_select(), which can be signalled, or use a semaphore, which can be signalled from any other task as well.

Another feature you may want to check is ipconfigUSE_CALLBACKS, which allows specific call-backs.

Mike wrote:

The signal effectively breaks me out of the task and leaves
the task in an unknown (or at least untrusted) state

And what about collecting and storing information and then rebooting the whole CPU?
That is what I normally do when an exception has taken place: dump the registers in a safe location in RAM and start a reboot.
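
A minimal sketch of what such a crash record could look like, assuming a 32-bit CPU: the names (crash_record_t, crash_record_store() and so on), the magic value and the register count are all made up for illustration. On a real target the record would have to live in RAM that the startup code does not clear, and the reboot itself is platform specific; neither is shown here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CRASH_MAGIC    0xC0DEDEADu /* arbitrary marker for this sketch */
#define CRASH_NUM_REGS 16          /* assumption: depends on the CPU */

typedef struct {
    uint32_t magic;
    uint32_t regs[CRASH_NUM_REGS];
    uint32_t checksum; /* simple sum over magic and regs */
} crash_record_t;

static uint32_t crash_sum(const crash_record_t *r)
{
    uint32_t sum = r->magic;

    for (size_t i = 0; i < CRASH_NUM_REGS; i++) {
        sum += r->regs[i];
    }
    return sum;
}

/* Called from the exception handler, just before triggering the reboot. */
void crash_record_store(crash_record_t *r, const uint32_t regs[CRASH_NUM_REGS])
{
    r->magic = CRASH_MAGIC;
    memcpy(r->regs, regs, sizeof r->regs);
    r->checksum = crash_sum(r);
}

/* Called early after boot: non-zero only if a complete record is present. */
int crash_record_valid(const crash_record_t *r)
{
    return r->magic == CRASH_MAGIC && r->checksum == crash_sum(r);
}

/* Called once the record has been reported, so it is not reported twice. */
void crash_record_clear(crash_record_t *r)
{
    memset(r, 0, sizeof *r);
}
```

The magic value and checksum are there so that random RAM contents after a cold power-up are not mistaken for a crash dump.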

Hein,

Thanks for all of this. I’ll have a go at these recommendations (probably after Christmas!) and then follow up here.

Mike

From an attacker’s standpoint, though, this is effectively a denial-of-service, right?

With the plethora of vulnerabilities (often memory-safety related) found recently in 3rd party libraries (e.g., URGENT/11, Ripple20, Amnesia:33, [something soon in the 40s…]), we are looking for ways to use CHERI to isolate memory safety exploitation and allow the critical process to decide how to proceed.

For example, it may be worth attempting to restart a non-critical task 2 or 3 times, but if the exception recurs, then cease restarting the non-critical task and allow the critical task to continue, though in a degraded state (and log/inform/etc.).
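
That policy could be sketched roughly as follows. The names (restart_policy_t, on_task_fault(), the ACTION_* values) are hypothetical and the limit is just a parameter; this is an illustration of the idea, not what our demo actually runs.

```c
#include <stdbool.h>

typedef enum { ACTION_RESTART, ACTION_DISABLE } fault_action_t;

typedef struct {
    unsigned faults;       /* faults seen so far for this task */
    unsigned max_restarts; /* e.g. 2 or 3 */
    bool disabled;         /* true once we have given up on the task */
} restart_policy_t;

/* Called by the fault handler for a non-critical task.
 * Tells the caller whether to restart the task or leave it off. */
fault_action_t on_task_fault(restart_policy_t *p)
{
    if (p->disabled) {
        return ACTION_DISABLE;
    }

    p->faults++;
    if (p->faults > p->max_restarts) {
        /* Budget used up: stay degraded; the caller should log/inform. */
        p->disabled = true;
        return ACTION_DISABLE;
    }
    return ACTION_RESTART;
}
```

With max_restarts set to 3, the first three faults lead to a restart and the fourth disables the task for good. A real version would probably also want to reset the count after a long enough fault-free period.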

In our demo application, we have a PLC running a Modbus server receiving commands via TCP from some Modbus client. If a Modbus client is actively trying to exploit the server (there’s a long list of Modbus-related CVEs…), the process control task can and should continue to operate, even if I have to shut down (with or without restarting) the Modbus server.

It may, of course, be preferable (or necessary, depending on where the corruption is) to simply restart the CPU, but if the attacker is aware of that then you have a legitimate availability threat.

Anyway, we’re way off topic here…

Even though I’m currently using the callback feature in a tailored way it would help to add the BSD/Posix feature to seamlessly close a socket from any task which errors out other blocked socket API calls. With the proper error return code it should be clear and documented that the affected socket should not be touched anymore e.g. closed twice.
If this is not possible without adding (much) more overhead the non-BSD socket signals are fine IMHO but should work with all blocking calls in a generic way.
Maybe it’s a personal thing - I don‘t like polling especially if dedicated event signaling is available :wink:
It just feels like a bad compromise…
I hope you’ll find the time / a way to add this to the stack Hein :+1:

@broomstick wrote:

Anyway, we’re way off topic here…

Off topic, but very interesting, thanks for sharing this! Now I understand better why you were asking all this.

Getting a device to reboot can indeed be viewed as a successful attack.

I wrote:

We could extend the usage of this signal to accept() and other API’s,
but I lack time. And also, it is not really compatible with BSD/Posix.

Sorry, I think it is compatible: when a thread blocks in recv(), you can signal that thread and recv() will return with the errno EINTR.
It is easy and little work to extend this functionality to accept() and send() as well.

it would help to add the BSD/Posix feature to seamlessly close a socket
from any task which errors out other blocked socket API calls. With the
proper error return code it should be clear and documented that the
affected socket should not be touched anymore e.g. closed twice.

I’m afraid that “closing a socket while still in use” would give me nightmares. I have never attempted to do so under another OS.
It is OK to have 2 tasks share a socket: one task may read, while the other task does the writing. But before closing the socket, the two tasks must make sure that there is no active API using that socket.

the non-BSD socket signals are fine IMHO but should work with
all blocking calls in a generic way.

Agreed, and (as I just said) I do think it is a normal way of working to send signals to a thread to terminate an API call.

Maybe it’s a personal thing - I don‘t like polling especially if dedicated event signaling is available :wink:
It just feels like a bad compromise…

I recognise that feeling. But I often realise that TCP/IP is not a real-time thing, there is a randomness to it, especially when you go on the Internet.

The use of ipconfigUSE_CALLBACKS is more specific than the use of ipconfigSOCKET_HAS_USER_SEMAPHORE. But mind you that you’d still have to wake-up a task, because the application hook runs from the IP-task, which imposes limitations ( many API’s may not be called, and do not use blocking calls ). It is better to TaskNotify the owning task.

I added the signal checks in FreeRTOS_accept(). I would not dare to change FreeRTOS_send() and FreeRTOS_sendto(), because that could disturb existing users of +TCP signals.

Here is the new version of FreeRTOS_Sockets.c (152.8 KB)

If this version does not compile, you can also just copy the function FreeRTOS_accept().

PS. at this moment I am not able to test the change.

Regards

I‘m aware of the restrictions using the IP-Task callbacks and do only some needed pre-processing in this context. An application network-task is then (task-)notified for further protocol handling as you proposed.
It‘s lean, fast and works for my application - I like this feature :+1: