FreeRTOS_accept() stalls until next event

I’m implementing a modbus server using FreeRTOS V8.2.3 targeting RISC-V, but currently running on QEMU (I recognise this might be my problem… I can provide more details if that would be helpful).

My server’s FreeRTOS_accept() seems to stall after my client’s FreeRTOS_connect() call until a subsequent FreeRTOS_send() from my client.

In the server task, before the main loop, I call:

Socket_t s;
s = modbus_tcp_listen(ctx, 1);
modbus_tcp_accept(ctx, s);

modbus_tcp_listen() contains the following:

Socket_t new_s;
struct freertos_sockaddr addr;
TickType_t xReceiveTimeOut = portMAX_DELAY;

BaseType_t enable = pdTRUE;
FreeRTOS_setsockopt(new_s, 0, FREERTOS_SO_REUSE_LISTEN_SOCKET, (void *)&enable, sizeof(enable));
FreeRTOS_setsockopt(new_s, 0, FREERTOS_SO_RCVTIMEO, &xReceiveTimeOut, sizeof(xReceiveTimeOut));

addr.sin_port = FreeRTOS_htons(ctx_tcp->port);
if( FreeRTOS_listen(new_s, 1) != 0) {
  return NULL;
}
return new_s;

modbus_tcp_accept() contains the following:

ctx->s = FreeRTOS_accept(s, &addr, *addrlen);
configASSERT(ctx->s != FREERTOS_INVALID_SOCKET && ctx->s != NULL);

return ctx->s;

When I get the server up and running, my debug output shows the following:
Socket 502 -> 0ip:0 State eCLOSED->eTCP_LISTEN
In accept() <-- my own debug statement

Then it stops. When I initiate the connection from the client, it advances a little bit:
xNetworkInterfaceOutput

Only when I initiate my first FreeRTOS_send() from the client does it proceed further:
prvSocketSetMSS: 1160 bytes for a000202ip:49650
Socket 502 -> a000202ip:49650 State eTCP_LISTEN->eSYN_FIRST
MSS change 1160 -> 1460
prvWinScaleFactor: uxRxWinSize 1 MSS 1160 Factor 0
Socket 502 -> a000202ip:49650 State eSYN_FIRST->eSYN_RECEIVED
xNetworkInterfaceOutput
TCP: passive 502 => a000202ip:49650 set ESTAB (scaling 0)
Socket 502 -> a000202ip:49650 State eSYN_RECEIVED->eESTABLISHED
xNetworkInterfaceOutput
The client connection from 10.0.2.2 is accepted <-- my own debug statement
Waiting for an indication… <-- from the modbus library

Two other oddities:

  1. I need to wait about five seconds between the client’s FreeRTOS_connect() and the first FreeRTOS_send() to avoid a timeout at the client. Even if the client times out, the server eventually wakes and processes that first message, but then has nobody to reply to because the client has timed out and shut down. So it seems like the message is sitting in a buffer at the server waiting for FreeRTOS_accept() to complete.
  2. I need to bump the client’s RTT tolerance to about two seconds on that first message (the first FreeRTOS_send()) to avoid a timeout at the client.

You are using quite an old version of FreeRTOS, which is curious as we didn’t add RISC-V support until much later. Could you tell me which version of FreeRTOS+TCP you are using?

Sorry. I’m new to FreeRTOS and was looking in the wrong place (a config file that has been carried by the project and maintained over time as the kernel has been updated).

The kernel files (e.g., queue.c) are actually v10.2.1.

The TCP files (e.g., FreeRTOS_IP.c) say FreeRTOS+TCP 191100 experimental.


Also, something I failed to mention in my first post:

If I attempt to connect twice in rapid succession from the client, then FreeRTOS_accept() immediately returns and everything works as expected, no delay required and no timeouts.

Can you try the latest release to see if this is still an issue for you? It should be a drop-in replacement, I think, although it may ask you to provide an additional callback function to provide some entropy (?):

This is less trivial than it should be because I’m working on a research system that tracks pointer provenance (amongst other things), and so the macros used for converting pointers don’t behave as expected, and FreeRTOS-Plus-TCP required some porting.

That said, someone has already bumped us up to v2.0.11. I am now running my code with that version, but I’m getting the same result (hanging at FreeRTOS_accept()).

I’m able to run the CLI and HTTP server demos without any problems, which are like my modbus application in that they maintain a TCP connection. I can also run the echo server demo.


The modbus application appears to hang in FreeRTOS_accept() from FreeRTOS_sockets.c at the line where it waits for an event to wake it up:

xEventGroupWaitBits( pxSocket->xEventGroup, eSOCKET_ACCEPT, pdTRUE /*xClearOnExit*/, pdFALSE /*xWaitAllBits*/, xRemainingTime );

For the CLI demo, it waits here when the application starts, and then when I connect via telnet it returns and continues on happily.

In my modbus application, it waits here when the application starts. When I connect via the modbus client, the connect() function in the client returns happily and the client continues on (eventually timing out). As I said previously, the server’s FreeRTOS_accept() function stalls at the line above until:

  1. I pause the client for about 4 seconds before it send()s anything
  2. I run connect() on the client again

Let me know if you think I should bump up from v2.0.11 to the most recent version, given the information above. I can put in the porting effort, but it seems that if the CLI demo is working as expected, then mine should as well.

Thanks again for your help.

We would need to see the version history to know if there were any changes in that area of the code to know if updating would make a difference.

I just tried to walk through…

The version update to 2.0.11 was performed in this commit:

There are lots of whitespace changes since 2.0.11 in FreeRTOS_Sockets.c, so I pulled out just the FreeRTOS_accept() function from 2.0.11 and 2.3.1, removed all leading whitespace, and performed a diff. It’s relatively short and straightforward. Here’s the output:

1c1
< Socket_t FreeRTOS_accept( Socket_t xServerSocket, struct freertos_sockaddr *pxAddress, socklen_t *pxAddressLength )
---
> Socket_t FreeRTOS_accept( Socket_t xServerSocket, struct freertos_sockaddr * pxAddress, socklen_t * pxAddressLength )
3,4c3,4
< FreeRTOS_Socket_t *pxSocket = ( FreeRTOS_Socket_t * ) xServerSocket;
< FreeRTOS_Socket_t *pxClientSocket = NULL;
---
> FreeRTOS_Socket_t * pxSocket = ( FreeRTOS_Socket_t * ) xServerSocket;
> FreeRTOS_Socket_t * pxClientSocket = NULL;
13c13
< pxClientSocket = ( FreeRTOS_Socket_t * ) FREERTOS_INVALID_SOCKET;
---
> pxClientSocket = FREERTOS_INVALID_SOCKET;
16c16
< ( pxSocket->u.xTCP.ucTCPState != eTCP_LISTEN ) )
---
> ( pxSocket->u.xTCP.ucTCPState != ( uint8_t ) eTCP_LISTEN ) )
19c19
< pxClientSocket = ( FreeRTOS_Socket_t * ) FREERTOS_INVALID_SOCKET;
---
> pxClientSocket = FREERTOS_INVALID_SOCKET;
36a37
>
44c45
< pxClientSocket->u.xTCP.bits.bPassAccept = pdFALSE_UNSIGNED;
---
> pxClientSocket->u.xTCP.bits.bPassAccept = pdFALSE;
52c53
< xTaskResumeAll();
---
> ( void ) xTaskResumeAll();
63a65
>
78c80
< client gets connected for this listening socket. */
---
> * client gets connected for this listening socket. */
80,81c82,83
< xAskEvent.pvData = ( void * ) pxSocket;
< xSendEventStructToIPTask( &xAskEvent, portMAX_DELAY );
---
> xAskEvent.pvData = pxSocket;
> ( void ) xSendEventStructToIPTask( &xAskEvent, portMAX_DELAY );
92a95
>
112c115
< xEventGroupWaitBits( pxSocket->xEventGroup, eSOCKET_ACCEPT, pdTRUE /*xClearOnExit*/, pdFALSE /*xWaitAllBits*/, xRemainingTime );
---
> ( void ) xEventGroupWaitBits( pxSocket->xEventGroup, ( EventBits_t ) eSOCKET_ACCEPT, pdTRUE /*xClearOnExit*/, pdFALSE /*xWaitAllBits*/, xRemainingTime );
116c119
< return ( Socket_t ) pxClientSocket;
---
> return pxClientSocket;
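
For anyone wanting to repeat the comparison, the whitespace-stripping step described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: strip_leading_ws() is a hypothetical helper, not something from the FreeRTOS sources or from the tooling actually used for the diff above. It removes the leading spaces and tabs of every line in a buffer, so that indentation-only changes no longer show up in a plain diff.

```c
/* Hypothetical helper for illustration only: removes the leading spaces and
 * tabs from every line of a NUL-terminated buffer, in place. Blank lines and
 * trailing whitespace are left untouched. */
void strip_leading_ws( char *text )
{
    char *dst = text;
    int at_line_start = 1;

    for( const char *src = text; *src != '\0'; src++ )
    {
        if( at_line_start && ( ( *src == ' ' ) || ( *src == '\t' ) ) )
        {
            continue; /* still inside the indentation of this line */
        }

        *dst++ = *src;
        at_line_start = ( *src == '\n' );
    }

    *dst = '\0';
}
```

Running both versions of a function through a filter like this before diffing leaves only the changes that are not pure indentation, such as the added casts and the spacing around `*` seen in the output above.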

I’m not sure if that’s helpful. Really, when I started this thread I hoped I was just doing something obviously stupid (accept() hangs in your code and not the CLI demo. Here’s your configuration problem…). I didn’t intend for anyone to trawl through a bunch of code, but this diff at least looks pretty straightforward. I don’t see anything obvious in the xEventGroupWaitBits() section, where my code seems to hang.

No, that diff just looks like changes related to coding standard conformance.

As far as the source versions are concerned: FreeRTOS+TCP runs with any kernel version that has Event Groups, starting at V8.1.2. But of course it is favourable to use the latest versions of both the kernel and FreeRTOS+TCP.

I am curious which NetworkInterface you are using, where can it be found?

Also it would be helpful to see more complete source code, some module that I can run on two CPUs to see what goes wrong. Would that be possible? Of course I don’t need any of the modbus code/contents.

Can you confirm that the server is ready well before the client issues a connect(), so it doesn’t miss any packet?

You see a timeout of 5 seconds. I don’t know of such a delay/time-out in the code. FreeRTOS_connect() uses time-outs of 3, 6, and 12 seconds. It gives up after 3 attempts, which will last about 21 seconds.
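
As a quick check of that arithmetic, here is a minimal sketch that sums a doubling back-off. The function name total_connect_wait_s() is hypothetical and not part of FreeRTOS+TCP; it only assumes the 3, 6 and 12 second pattern mentioned above and says nothing about how the library implements it.

```c
/* Hypothetical helper (not a FreeRTOS+TCP function): sums the waits of a
 * connect back-off that starts at first_timeout_s seconds and doubles on
 * every further attempt. */
unsigned total_connect_wait_s( unsigned first_timeout_s, unsigned attempts )
{
    unsigned total = 0;
    unsigned timeout = first_timeout_s;

    for( unsigned i = 0; i < attempts; i++ )
    {
        total += timeout; /* time spent waiting in this attempt */
        timeout *= 2;     /* the next attempt waits twice as long */
    }

    return total;
}
```

With a first time-out of 3 seconds and 3 attempts this gives 3 + 6 + 12 = 21 seconds, so a stall of roughly 5 seconds does not line up with the end of any single attempt.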

PS. You don’t have to call FreeRTOS_connect() in a blocking way. You can call it once, and it should return ‘0’. Then you can wait until FreeRTOS_recv() either returns data or returns an error such as -pdFREERTOS_ERRNO_ENOTCONN.
Also FreeRTOS_issocketconnected() can tell if a socket is connected.

Would it be possible to create a PCAP of the communication using Wireshark or tcpdump? Sometimes seeing the data helps in understanding the behaviour.
Hein

Hein,

Thanks for this. Lots of stuff below. I’ve tried to address your points individually. Hopefully I’ve got them all!

Mike


The repo/branch I’m working on is here:
https://github.com/CTSRD-CHERI/FreeRTOS-mirror/tree/modbus_tcp

Actually building and running it is a serious commitment, because the toolchain for this research system is not quite user-friendly enough for prime time yet. That’s why I’ve been trying to compare the execution of my modbus demo with the CLI and HTTP server demo (which are compiling and running with the same toolchain and runtime environment)… Hoping that I can provide enough useful information to get help!

If you’re interested, though, you can start here:


Assuming you don’t want to do that yet, though:

I’m using this NetworkInterface.c (for virtio):

The actual modbus server code is here:


I can confirm that FreeRTOS_connect() and FreeRTOS_listen() both return as expected before I call FreeRTOS_accept() at the server, and that all of that is happening before I ever launch the client, which then calls FreeRTOS_connect() and FreeRTOS_send().

You asked about the timeout:
I wasn’t referring to a built-in timeout for a FreeRTOS function. I found that if I make the client sleep for about 5 seconds between FreeRTOS_connect() and FreeRTOS_send(), then the client/server behaviour proceeds as expected. The server remains in FreeRTOS_accept() during those five seconds, and then returns when the client calls FreeRTOS_send() with the first modbus request.

If I don’t wait those five seconds, then the client calls FreeRTOS_send(), the server doesn’t respond quickly enough, and the client times out. A few seconds after the client times out, the server suddenly returns from FreeRTOS_accept(), finds the buffered request from the client, and tries to respond to it. Watching that behaviour was what caused me to try the 5 second timeout in the first place.
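
To make the two cases above easy to reproduce from the host, here is a minimal sketch of a stand-alone client. It is written under several assumptions: the client is a plain POSIX sockets program on a Linux host rather than the modbus library used in the real setup, the names build_read_request() and probe_server() are hypothetical, and the request is a generic 12-byte Modbus/TCP "read holding registers" frame, which may not be the request the real client sends.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Fills out[12] with a Modbus/TCP request: 7-byte MBAP header + 5-byte PDU.
 * The function code (0x03) and field values are assumptions for illustration. */
void build_read_request( uint8_t out[ 12 ], uint16_t txn_id, uint8_t unit_id,
                         uint16_t start_addr, uint16_t quantity )
{
    out[ 0 ] = ( uint8_t ) ( txn_id >> 8 );      /* transaction id, big-endian */
    out[ 1 ] = ( uint8_t ) ( txn_id & 0xFF );
    out[ 2 ] = 0;                                /* protocol id: 0 for Modbus */
    out[ 3 ] = 0;
    out[ 4 ] = 0;                                /* number of bytes that follow */
    out[ 5 ] = 6;
    out[ 6 ] = unit_id;
    out[ 7 ] = 0x03;                             /* read holding registers */
    out[ 8 ] = ( uint8_t ) ( start_addr >> 8 );
    out[ 9 ] = ( uint8_t ) ( start_addr & 0xFF );
    out[ 10 ] = ( uint8_t ) ( quantity >> 8 );
    out[ 11 ] = ( uint8_t ) ( quantity & 0xFF );
}

/* Connects, sleeps delay_s seconds, sends one request, then waits up to
 * reply_timeout_s seconds for a reply. Returns the number of reply bytes
 * (0 if the peer closed the connection), or -1 on any error or time-out. */
int probe_server( const char *ip, uint16_t port, unsigned delay_s,
                  unsigned reply_timeout_s )
{
    uint8_t request[ 12 ];
    uint8_t reply[ 260 ];
    struct sockaddr_in addr;
    struct timeval tv = { .tv_sec = reply_timeout_s, .tv_usec = 0 };
    int result = -1;

    int fd = socket( AF_INET, SOCK_STREAM, 0 );
    if( fd < 0 )
    {
        return -1;
    }

    memset( &addr, 0, sizeof( addr ) );
    addr.sin_family = AF_INET;
    addr.sin_port = htons( port );

    if( ( inet_pton( AF_INET, ip, &addr.sin_addr ) == 1 ) &&
        ( connect( fd, ( struct sockaddr * ) &addr, sizeof( addr ) ) == 0 ) &&
        ( setsockopt( fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof( tv ) ) == 0 ) )
    {
        sleep( delay_s ); /* the pause between connect and send under test */
        build_read_request( request, 1, 1, 0, 1 );

        if( send( fd, request, sizeof( request ), 0 ) == ( ssize_t ) sizeof( request ) )
        {
            result = ( int ) recv( fd, reply, sizeof( reply ), 0 );
        }
    }

    close( fd );
    return result;
}
```

Calling probe_server() once with delay_s set to 0 and once with delay_s set to 5, against whatever host address and port QEMU forwards to the guest, should exercise the same two paths described above without involving the modbus library on the client side.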


I am running FreeRTOS on QEMU on an Ubuntu 18.04 host. The modbus server is running on FreeRTOS and the client is connecting from the host via localhost:1502. Here is an annotated tcpdump (tcpdump -i lo):

Successful execution:

12:38:37.853855 IP localhost.52748 > localhost.1502: Flags [S], seq 1016627866, win 65495, options [mss 65495,sackOK,TS val 1240153279 ecr 0,nop,wscale 7], length 0
12:38:37.853863 IP localhost.1502 > localhost.52748: Flags [S.], seq 1188478022, ack 1016627867, win 65483, options [mss 65495,sackOK,TS val 1240153279 ecr 1240153279,nop,wscale 7], length 0
12:38:37.853870 IP localhost.52748 > localhost.1502: Flags [.], ack 1, win 512, options [nop,nop,TS val 1240153279 ecr 1240153279], length 0

# sleep the client for 5 seconds here.  The server is currently in FreeRTOS_accept() and the client is waiting to call FreeRTOS_send().  The server will return from FreeRTOS_accept() and immediately call FreeRTOS_recv().

12:38:42.854022 IP localhost.52748 > localhost.1502: Flags [P.], seq 1:13, ack 1, win 512, options [nop,nop,TS val 1240158279 ecr 1240153279], length 12
12:38:42.854027 IP localhost.1502 > localhost.52748: Flags [.], ack 13, win 512, options [nop,nop,TS val 1240158279 ecr 1240158279], length 0
12:38:43.843623 IP localhost.1502 > localhost.52748: Flags [P.], seq 1:13, ack 13, win 512, options [nop,nop,TS val 1240159269 ecr 1240158279], length 12
12:38:43.843632 IP localhost.52748 > localhost.1502: Flags [.], ack 13, win 512, options [nop,nop,TS val 1240159269 ecr 1240159269], length 0
12:38:43.843730 IP localhost.52748 > localhost.1502: Flags [F.], seq 13, ack 13, win 512, options [nop,nop,TS val 1240159269 ecr 1240159269], length 0
12:38:43.884886 IP localhost.1502 > localhost.52748: Flags [.], ack 14, win 512, options [nop,nop,TS val 1240159310 ecr 1240159269], length 0
12:38:43.889227 IP localhost.1502 > localhost.52748: Flags [F.], seq 13, ack 14, win 512, options [nop,nop,TS val 1240159314 ecr 1240159269], length 0
12:38:43.889236 IP localhost.52748 > localhost.1502: Flags [.], ack 14, win 512, options [nop,nop,TS val 1240159314 ecr 1240159314], length 0

Client timeout:

12:34:11.120389 IP localhost.52734 > localhost.1502: Flags [S], seq 1270366466, win 65495, options [mss 65495,sackOK,TS val 1239886545 ecr 0,nop,wscale 7], length 0
12:34:11.120400 IP localhost.1502 > localhost.52734: Flags [S.], seq 106444095, ack 1270366467, win 65483, options [mss 65495,sackOK,TS val 1239886545 ecr 1239886545,nop,wscale 7], length 0
12:34:11.120409 IP localhost.52734 > localhost.1502: Flags [.], ack 1, win 512, options [nop,nop,TS val 1239886545 ecr 1239886545], length 0

# No sleep here.  The client proceeds immediately from FreeRTOS_connect() to FreeRTOS_send().  The server is stuck in FreeRTOS_accept()

12:34:16.120576 IP localhost.52734 > localhost.1502: Flags [P.], seq 1:13, ack 1, win 512, options [nop,nop,TS val 1239891546 ecr 1239886545], length 12
12:34:16.120583 IP localhost.1502 > localhost.52734: Flags [.], ack 13, win 512, options [nop,nop,TS val 1239891546 ecr 1239891546], length 0

# Server responds to the client with an ACK, but never sends an actual response (see difference with the previous dump).

12:34:20.121040 IP localhost.52734 > localhost.1502: Flags [F.], seq 13, ack 1, win 512, options [nop,nop,TS val 1239895546 ecr 1239891546], length 0

# The client has timed out and sends its FIN.

12:34:20.164868 IP localhost.1502 > localhost.52734: Flags [.], ack 14, win 512, options [nop,nop,TS val 1239895590 ecr 1239895546], length 0

# The server responds with a FIN/ACK...

12:34:30.611201 IP localhost.1502 > localhost.52734: Flags [F.], seq 1, ack 14, win 512, options [nop,nop,TS val 1239906036 ecr 1239895546], length 0
12:34:30.611213 IP localhost.52734 > localhost.1502: Flags [.], ack 2, win 512, options [nop,nop,TS val 1239906036 ecr 1239906036], length 0

# After a few more seconds, the server will wake from FreeRTOS_accept(), process the request from the client and try to respond, but the connection is closed.
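
When reading traces like these it helps to turn the timestamps into gaps. The following is a minimal sketch, assuming the default tcpdump HH:MM:SS.uuuuuu format shown above and two timestamps from the same day; the helper names tcpdump_ts_to_us() and gap_us() are hypothetical.

```c
#include <stdio.h>

/* Parses a tcpdump "HH:MM:SS.uuuuuu" timestamp into microseconds since
 * midnight. Assumes exactly six fractional digits, as tcpdump prints by
 * default. Returns -1 if the text does not match that layout. */
long long tcpdump_ts_to_us( const char *ts )
{
    int h, m, s;
    long us;

    if( sscanf( ts, "%2d:%2d:%2d.%6ld", &h, &m, &s, &us ) != 4 )
    {
        return -1;
    }

    return ( ( h * 60LL + m ) * 60LL + s ) * 1000000LL + us;
}

/* Gap in microseconds between two valid timestamps taken on the same day. */
long long gap_us( const char *earlier, const char *later )
{
    return tcpdump_ts_to_us( later ) - tcpdump_ts_to_us( earlier );
}
```

Applied to the timestamps as posted, the gap between the handshake ACK and the first data segment is 5000152 microseconds in the successful capture and 5000167 microseconds in the time-out capture, and in the time-out capture the client's FIN follows its request by 4000464 microseconds.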

Update: This behaviour might be QEMU related.

I just modified my code to run the client as a second task. They communicate with one another through the loopback interface on the host. The behaviour was as expected without any need to delay the client. I.e., the server blocks on FreeRTOS_accept(), returns immediately when it receives the first request from the client, and proceeds to call FreeRTOS_recv().

Again, the CLI demo, running on QEMU, correctly returns from FreeRTOS_accept() when the client connects on the host. I’m not sure why my modbus demo behaves differently, but it doesn’t seem to receive or respond to the event in the same way.