Short introduction:
We have an embedded device, running FreeRTOS with FreeRTOS+TCP.
The embedded device communicates with an application on a Windows 10 PC.
Communication is done with a proprietary protocol running on TCP/IP.
The embedded device is the client, the Windows 10 PC the server.
This proprietary protocol carries application data and, if there is no application data, a proprietary ‘live’ message is sent every 30 seconds to detect that the device is running correctly.
My problem is that this ‘live’ message is retransmitted each time by FreeRTOS+TCP.
I found out that the ‘SRTT’ is calculated in FreeRTOS_TCP_WIN.c and used to detect if a retransmission is needed.
However, in my case this ‘SRTT’ is calculated and tuned when both parties are exchanging a lot of data at startup. The ‘SRTT’ gets a value of about 70ms.
After exchanging a lot of data there are only ‘live’ messages.
The embedded device sends the ‘live’ message, however, Windows 10 is TCP/IP ACKing this ‘live’ message after 200ms.
This is because of the default delayed TCP/IP ACK of 200ms in Windows 10. Note: we don’t want to change this default value.
Because the ‘SRTT’ timer is only 70ms, FreeRTOS+TCP retransmits the ‘live’ message.
Windows 10 is TCP/IP ACKing after that.
It looks like a new ‘SRTT’ isn’t calculated because the segment was retransmitted (line 1917 of FreeRTOS_TCP_WIN.c)?
The next time a ‘live’ message is sent, the same behavior happens. The embedded device sends the ‘live’ message. It’s retransmitted. Windows 10 TCP/IP ACKs the message, but nothing happens with the internal ‘SRTT’ timer.
This keeps on forever, with each ‘live’ message.
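To make the loop I’m describing concrete, here is a minimal sketch of it in C. This is not the FreeRTOS+TCP code: `conn_t`, `send_segment` and `simulate_live_messages` are names I made up, and I assume the retransmission timer simply equals the SRTT and that an ACK following a retransmission never produces a new RTT sample.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical connection state, NOT the FreeRTOS+TCP structures. */
typedef struct {
    int32_t  srtt_ms;      /* smoothed round-trip time in milliseconds */
    uint32_t retransmits;  /* number of segments that had to be re-sent */
} conn_t;

/* Send one segment that the peer ACKs ack_delay_ms after the first
 * transmission. Returns true if the segment was retransmitted.
 * Assumption: the retransmission timer equals the current SRTT. */
bool send_segment(conn_t *c, int32_t ack_delay_ms)
{
    if (ack_delay_ms > c->srtt_ms) {
        /* The timer fires before the ACK arrives, so the segment is
         * re-sent. The ACK is now ambiguous (original or retransmission?),
         * so no RTT sample is taken and the SRTT stays as it was. */
        c->retransmits++;
        return true;
    }

    /* Clean sample: SRTT = 7/8 * SRTT + 1/8 * sample. */
    c->srtt_ms = (7 * c->srtt_ms + ack_delay_ms) / 8;
    return false;
}

/* Send `count` live messages and return the total number of retransmissions. */
uint32_t simulate_live_messages(conn_t *c, int count, int32_t ack_delay_ms)
{
    for (int i = 0; i < count; i++) {
        (void)send_segment(c, ack_delay_ms);
    }
    return c->retransmits;
}
```

Starting from an SRTT of 70ms with a 200ms delayed ACK, every live message in this model is retransmitted and the SRTT never moves away from 70ms, which matches what I see on the wire.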
Wireshark logging, where ‘172.16.43.1’ is the embedded device and ‘172.16.128.113’ is the Windows 10 PC.
Packet 64 is the live packet, packet 65 is the retransmission and packet 66 is the TCP/IP ACK of Windows 10.
64 12.538979 172.16.43.1 172.16.128.113 TCP 99 42832→15042 [PSH, ACK] Seq=1 Ack=46 Win=1460 Len=45 2021-11-19 12:27:55.754532
65 12.692310 172.16.43.1 172.16.128.113 TCP 99 [TCP Retransmission] 42832→15042 [PSH, ACK] Seq=1 Ack=46 Win=1460 Len=45 2021-11-19 12:27:55.907863
66 12.692700 172.16.128.113 172.16.43.1 TCP 66 15042→42832 [ACK] Seq=46 Ack=46 Win=32766 Len=0 SLE=1 SRE=46 2021-11-19 12:27:55.908253
Note that when the ‘SRTT’ timer is large, and there is no retransmission, it is seen clearly that Windows 10 is TCP/IP ACKing after 200ms.
I did a quick read of RFC6298 about the RTO (retransmission timeout) and SRTT.
If I understand it correctly, the RTO should never become less than 1 second.
Section 2, rule (2.4): Whenever RTO is computed, if it is less than 1 second, then the RTO SHOULD be rounded up to 1 second.
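As far as I can tell, the RFC 6298 computation would look roughly like the sketch below. The formulas (alpha = 1/8, beta = 1/4, K = 4, the 1 second initial RTO and the 1 second floor) are from section 2 of the RFC; the names `rto_estimator_t`, `rto_init` and `rto_on_sample` and the 1ms clock granularity are my own assumptions, and this is not how FreeRTOS+TCP is written.

```c
#include <stdbool.h>
#include <stdint.h>

#define RTO_MIN_MS            1000  /* RFC 6298 (2.4): round RTO up to 1 second */
#define CLOCK_GRANULARITY_MS  1     /* G in the RFC; 1 ms is an assumed value */

/* Hypothetical estimator state, integer milliseconds. */
typedef struct {
    bool    has_sample;  /* false until the first RTT measurement */
    int32_t srtt_ms;     /* smoothed round-trip time */
    int32_t rttvar_ms;   /* round-trip time variation */
    int32_t rto_ms;      /* retransmission timeout */
} rto_estimator_t;

/* (2.1): until an RTT measurement has been made, RTO is 1 second. */
void rto_init(rto_estimator_t *e)
{
    e->has_sample = false;
    e->srtt_ms = 0;
    e->rttvar_ms = 0;
    e->rto_ms = 1000;
}

/* Feed one RTT measurement taken from a segment that was NOT retransmitted. */
void rto_on_sample(rto_estimator_t *e, int32_t r_ms)
{
    if (!e->has_sample) {
        /* (2.2): first measurement. */
        e->srtt_ms = r_ms;
        e->rttvar_ms = r_ms / 2;
        e->has_sample = true;
    } else {
        /* (2.3): update RTTVAR first, using the old SRTT. */
        int32_t err = e->srtt_ms - r_ms;
        if (err < 0) {
            err = -err;
        }
        e->rttvar_ms = (3 * e->rttvar_ms + err) / 4;  /* beta = 1/4 */
        e->srtt_ms = (7 * e->srtt_ms + r_ms) / 8;     /* alpha = 1/8 */
    }

    /* RTO = SRTT + max(G, K * RTTVAR) with K = 4. */
    int32_t var_term = 4 * e->rttvar_ms;
    if (var_term < CLOCK_GRANULARITY_MS) {
        var_term = CLOCK_GRANULARITY_MS;
    }
    e->rto_ms = e->srtt_ms + var_term;

    /* (2.4): never let the RTO drop below 1 second. */
    if (e->rto_ms < RTO_MIN_MS) {
        e->rto_ms = RTO_MIN_MS;
    }
}
```

With a first sample of 70ms the unclamped result would be 70 + 4 * 35 = 210ms, and the floor raises it to 1000ms, which is comfortably above the 200ms delayed ACK of Windows 10.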
I have also seen that I can change the define ‘winSRTT_CAP_mS’ from 50ms to a higher value; however, I don’t like changing third-party source code.
So my question is: does FreeRTOS+TCP have the correct behavior with respect to retransmission timers, or should the RTO always have a minimum value of 1 second as the RFC describes?
If it’s correct and FreeRTOS+TCP is fine, what is the best way to fix our problem? Changing the ‘winSRTT_CAP_mS’ to a higher value that is tuned with the Windows 10 timing or another solution?