Question about the TCP window

Dweb_2 · April 7, 2022, 9:52am

Hi,
we are using FreeRTOS+TCP V2.4.0 and its STM32H7 port.
Our device acts as a TCP server and answers commands from remote clients with an appropriate response. The client is a terminal program (YAT, available on SourceForge.net).

I recently observed that a shrinking TCP window size caused problems.
Our device uses the default MSS of 1460 bytes. When a command (sent from the terminal) triggered a response of 2196 bytes, the terminal client reduced its window size further and further.
The following Wireshark trace shows this, where the FreeRTOS+TCP device is 10.0.32.182 and the terminal is at 10.0.32.142:

The transmission was successful, but sending the same command again (see frame #53) to the server (with the same response) permanently bricks the socket. This means that even smaller responses can no longer be sent.

I think the problem might partly lie in the terminal client we use, but I also don’t understand the mechanisms in FreeRTOS+TCP. The responses were transmitted with FreeRTOS_send(), which returns values >0 even though nothing lands on the Ethernet wire.

It seems that the responses can be queued in the socket’s Tx stream buffer for some time. I printed the remaining space in the socket’s Tx and Rx buffers before each call to FreeRTOS_send():
[screenshot: printed Tx/Rx buffer space]
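
The observation that a send call can succeed while nothing reaches the wire can be illustrated with a toy model of a Tx stream buffer. This is only a sketch: `TxModel`, `model_send()` and `model_tx_space()` are hypothetical names, not FreeRTOS+TCP internals, and the 4380-byte capacity used in the note below is an assumed value. The point is that the return value depends only on free buffer space, not on whether the peer's window allows transmission.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a socket's Tx stream buffer (hypothetical, for illustration). */
typedef struct {
    size_t capacity; /* total Tx buffer size in bytes */
    size_t queued;   /* bytes accepted from the application, not yet acknowledged */
} TxModel;

/* Free space left in the modelled Tx buffer. */
size_t model_tx_space(const TxModel *tx)
{
    assert(tx != NULL);
    return tx->capacity - tx->queued;
}

/* Mimics a send call: accepts as many bytes as fit and returns that count.
 * Nothing here looks at the peer's window, so the call can "succeed"
 * while no byte is transmitted. */
size_t model_send(TxModel *tx, size_t len)
{
    size_t space = model_tx_space(tx);
    size_t accepted = (len < space) ? len : space;

    tx->queued += accepted;
    return accepted;
}
```

In this model, with an assumed capacity of 4380 bytes, a first 2196-byte response is accepted in full, a second one only partially (2184 bytes), and later calls accept nothing until queued data is acknowledged.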

I dug deep into the stack and found that prvTCPSendRepeated() will try to send the frames for all these commands. For the problem described above, this fails SEND_REPEATED_COUNT times.

It seems to me that the packets in the Tx queue are never dropped after all these failed attempts, so subsequent packets, which might fit into the remote TCP window, can’t get through either. Smaller responses are still put into the buffer after the failures:
[screenshot: printed Tx/Rx buffer space after the failed attempts]
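
Because TCP is an ordered byte stream, data queued later cannot overtake data queued earlier. The sketch below models that head-of-line effect. It assumes that only whole segments are sent, which is a simplification and not a statement about how FreeRTOS+TCP actually decides what to send; `sendable_in_order()` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many queued segments can leave, strictly in order, given the
 * peer's advertised window. Assumption: whole segments only. */
size_t sendable_in_order(const size_t *seg_len, size_t count, size_t window)
{
    size_t sent = 0;

    assert(seg_len != NULL || count == 0);

    /* Stop at the first segment that does not fit: everything behind it
     * has to wait, even if it is small enough on its own. */
    while (sent < count && seg_len[sent] <= window) {
        window -= seg_len[sent];
        sent++;
    }
    return sent;
}
```

With a queue of 1460, 736 and 40 bytes and an assumed window of 1000 bytes, the function returns 0: the 40-byte segment would fit, but it is stuck behind the 1460-byte one.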

The workaround I found was either to decrease the MSS from 1460 to a smaller value (800 for example), or to reconnect to the socket.
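
To make the effect of the MSS concrete, the following sketch computes how a payload is split into MSS-sized segments. The helper names are hypothetical. Why the smaller MSS helps is not confirmed in this thread; one plausible reading (an assumption) is that smaller segments fit more easily into a small advertised window.

```c
#include <assert.h>
#include <stddef.h>

/* Number of segments needed to carry `payload` bytes at a given MSS. */
size_t segment_count(size_t payload, size_t mss)
{
    assert(mss > 0);
    return (payload + mss - 1) / mss;
}

/* Length of the final segment; 0 if there is no payload at all. */
size_t last_segment_len(size_t payload, size_t mss)
{
    size_t rest;

    assert(mss > 0);
    if (payload == 0) {
        return 0;
    }
    rest = payload % mss;
    return (rest == 0) ? mss : rest;
}
```

For the 2196-byte response, an MSS of 1460 gives two segments (1460 + 736 bytes), while an MSS of 800 gives three (800 + 800 + 596 bytes).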

I’m not too sure if there might be other problems, or if some kind of “clearing” API function call would be required. Is this desired behavior after all?

Thanks and best regards

kanherea · April 7, 2022, 8:08pm

Hello @Dweb_2,

This seems like an interesting behavior of the FreeRTOS+TCP stack.
I am looking into it.
In the meantime, would you be able to add a Wireshark log showing what happens when you send the command that triggers a response of 2196 bytes twice? If possible, please attach the full log so that I can open it in Wireshark.

Thanks,
Aniruddha

RAc · April 7, 2022, 8:28pm

To me this looks more like a problem in the Ethernet driver, which for some reason fails to put a DMA packet onto the wire.

hs2 · April 7, 2022, 9:13pm

I’d agree with @RAc. I’ve never encountered such behavior even though I have similar (pretty common) use cases to @Dweb_2’s. I bet it’s not a stack problem but a driver issue.

Edit: Thinking about it again raises the question of whether the client announcing the Rx window somehow fails to read all the data stored in its socket receive buffer. In that case it’s an application-level protocol issue rather than a low-level driver or TCP stack problem on the device side.
Another indication of an application-level problem would be if closing and re-opening the socket restarts the communication with a normal, full-sized Rx window.

Dweb_2 · April 8, 2022, 6:04am

Thanks for your answers.
@kanherea this is the respective Wireshark file:
tcpWindowTrace.zip (1.7 KB)

@hs2 That could definitely be the case.

Dweb_2 · April 19, 2022, 8:53am

@kanherea Did you investigate the problem any further?

kanherea · April 19, 2022, 11:04pm

Hello @Dweb_2,

Apologies, I was not able to replicate your issue.
Can you please step through the code and see why prvTCPSendRepeated fails? That is, which line of code does it return an error from?

Regards,
Aniruddha

htibosch · April 20, 2022, 5:38am

Hello @Dweb_2 , looking at the small piece of PCAP, I conclude the same as Hartmut @hs2 in his EDIT: it looks like the client forgets to read the reply on the socket.

The PCAP shows that +TCP has done its best to deliver the packet. It even sets the PSH flag when sending data, thus notifying the remote end that data may be forwarded to the socket owner.
Normally, the peer’s window size may shrink temporarily, but not forever.
If you find it difficult to solve this, maybe you can send the TCP code of the client.
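
The window behaviour described above can be sketched with a small receiver-side model: the advertised window is roughly the free space in the receive buffer, so it shrinks while data sits unread and reopens once the application reads. `RxModel` and its functions are hypothetical names, and the 2920-byte buffer used in the note below is an assumed value, not something measured from the client.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a receiver's socket buffer (hypothetical, for illustration). */
typedef struct {
    size_t rcvbuf; /* receive buffer size in bytes */
    size_t unread; /* bytes received but not yet read by the application */
} RxModel;

/* Advertised window: the space still free in the receive buffer. */
size_t rx_window(const RxModel *rx)
{
    assert(rx != NULL);
    return rx->rcvbuf - rx->unread;
}

/* Data arrives from the peer; at most the current window is accepted. */
size_t rx_arrive(RxModel *rx, size_t len)
{
    size_t win = rx_window(rx);
    size_t accepted = (len < win) ? len : win;

    rx->unread += accepted;
    return accepted;
}

/* The application reads from the socket, which reopens the window. */
size_t rx_read(RxModel *rx, size_t len)
{
    size_t taken;

    assert(rx != NULL);
    taken = (len < rx->unread) ? len : rx->unread;
    rx->unread -= taken;
    return taken;
}
```

With an assumed 2920-byte buffer, one unread 2196-byte response leaves a 724-byte window, and a second response closes the window completely until the application reads.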

Dweb_2 · April 20, 2022, 6:32am

Hello and thanks for your answers.

@kanherea It looks to me like the loop in prvTCPSendRepeated is exited here. xSendLength is zero, even for smaller payloads which might still fit within the window.

@htibosch Thanks for the confirmation. I think we just have to accept this behavior for now.

htibosch · April 20, 2022, 7:18am

I see, don’t you maintain the source code on the host (client) side?

Dweb_2 · April 21, 2022, 7:39am

We don’t maintain the client-side code, and I don’t think I can share the TCP server source here.

htibosch · April 22, 2022, 4:15am

Hello @Dweb_2, what you can do is replace your client by a TCP testing program.

I like a simple, older utility called Hercules. It can open a TCP connection to a server and lets you send and receive data.

Or a newer utility called Packet Sender. In the settings you will find an option “Persistent TCP Connections”. You can send the request and see what is returned by the server.

Dweb_2 · April 22, 2022, 5:51am

I actually tried those two before, plus the Python 3.10 socket library. They worked fine. BUT all of them offer larger window sizes (>4 kB, at least on my machine), and I could never get the exact same parameters (MSS/window) as those used by the YAT client.
It seems to me like these are some really low level client/OS settings.

hs2 · April 22, 2022, 6:38am

I think the Rx window is determined by the socket Rx buffer size, which can be configured (via a socket option). But regardless of the absolute size, is the window shrinking over time / during traffic with your test application?

htibosch · April 22, 2022, 7:23am

It looks like your client has made this setting:

    /* Receive buffer size in bytes: 2920 = 2 x the MSS of 1460. */
    int rcvBufSize = 2920;
    setsockopt( mySocket,
                SOL_SOCKET,
                SO_RCVBUF,
                ( const char * ) &rcvBufSize,
                sizeof( rcvBufSize ) );

You could also set SO_SNDBUF.
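
To check what a test client actually ends up with, the buffer size can be read back after setting it. The sketch below assumes a POSIX system (the headers differ on Windows), and `probe_rcvbuf()` is a hypothetical helper. The operating system may round or adjust the requested value, so the result is not necessarily the number that was asked for.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a TCP socket, requests a receive buffer of `bytes`, and returns
 * the size the OS reports afterwards, or -1 on error. */
int probe_rcvbuf(int bytes)
{
    int effective = 0;
    socklen_t len = sizeof(effective);
    int fd;

    assert(bytes > 0);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    /* Ask for the buffer size, then read back what was actually applied. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) != 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) != 0) {
        effective = -1;
    }

    close(fd);
    return effective;
}
```

Calling `probe_rcvbuf(2920)` on the test machine shows whether a small buffer like the one suspected here can be reproduced with a custom client.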

@hs2 wrote:

is the window shrinking over time / during traffic with your test application?

Maybe you can attach another short PCAP?

Dweb_2 · April 22, 2022, 11:55am

So I sent a bunch of random commands via the terminal to the device. The problematic command which I showed at the start of this thread is not included.
I extracted the TCP Window sizes and plotted this graph:

This is the source .pcap file:
tcpWindowSizes.zip (91.7 KB)

htibosch · July 8, 2022, 6:40am

Hello @Dweb_2, I downloaded the YAT terminal v2.4.1, and I had my embedded +TCP server give a 2196-byte reply to the command “longresp”.

YAT received the first block, but seemed “to ignore” all further commands.

This was the session:

ver
 562.317.653 [SvrWork         ] Verbose level 0
longresp
 Hi, we are using FreeRTOS+TCP... ( your first post )

I had a second TCP client connected to the same port 2402, which showed no problems at all.

Here is my PCAP of the session:
yat_session_hein.zip (2.4 KB)

The +TCP server runs on 192.168.2.127