Return value of FreeRTOS_send function becomes pdFREERTOS_ERRNO_ENOSPC (=28)

Hello guys,

In a program using FreeRTOS+TCP, FreeRTOS_send() returns pdFREERTOS_ERRNO_ENOSPC (=28) under the following evaluation conditions.
I would like to understand the cause and mechanism behind pdFREERTOS_ERRNO_ENOSPC so that I can prevent this error from occurring.

The FreeRTOS_send() API explanation page states:
“If timeout occurred before any data could be sent then -pdFREERTOS_ERRNO_ENOSPC is returned.”
This explanation alone does not describe the cause and mechanism of pdFREERTOS_ERRNO_ENOSPC,
so I cannot work out measures to prevent it from being returned and need your advice.

■ Evaluation environment and conditions at the time of occurrence
The evaluation environment is as follows.
· Evaluation board using Renesas Electronics RZ/T2M (CPU core: Cortex-R52, operating clock: 800 MHz)
· PC with Windows 10 installed

Program operation:
The program running on the evaluation board transmits 1460 bytes of data to the PC every 2 ms.
A Windows 10 application on the PC displays # each time data is received.

Evaluation conditions:
PC environment:
Windows 10
Set TcpAckFrequency=1 in the registry. (This setting speeds up the ACK response in Ethernet communication.)

Program:
FreeRTOS Kernel version: V10.3.0
FreeRTOS-Plus-TCP version: V2.3.0

 FreeRTOS
  Tick rate of FreeRTOS: 16kHz
 FreeRTOS+TCP
   TCP transfer size: 1460 bytes
   Transfer interval: 2ms
   Tx/Rx Buffer size: 6000

Frequency of error occurrence:
The error occurs after about 4-5 hours of communication.

Hi @Kamei

FreeRTOS_send() returns a negative value only when no bytes could be sent. You can check that by using Wireshark to view the network traffic.

So, to prevent this from happening, you might want to cross-check whether your application is actually sending any data when pdFREERTOS_ERRNO_ENOSPC is returned. Maybe your application is sending data but does not get an ACK back? You can investigate in this direction as well.

Hi @moninom1

Thank you very much for your advice.

In the current source code, when FreeRTOS_send() returns pdFREERTOS_ERRNO_ENOSPC, the application enters a loop and does nothing. So the application does not send any data after pdFREERTOS_ERRNO_ENOSPC occurs. We do not want this error to occur and would like to know the cause.

Is it possible for you to capture the network traffic?

Yes.
The following capture shows the moment when pdFREERTOS_ERRNO_ENOSPC occurs.
192.168.1.100 is the PC side and 192.168.1.10 is the RZ/T2M side.

Hello, have you configured a send timeout for the socket? A socket can be configured using a static default with ipconfigSOCK_DEFAULT_SEND_BLOCK_TIME or at runtime by configuring FREERTOS_SO_SNDTIMEO with FreeRTOS_setsockopt().

I ask because your FreeRTOS tick rate is set to 16kHz and timeouts are set in ticks. We demo our libraries with a 1kHz tick or slower. The default timeout we provide in our configs is 5000 ticks, or 5 s at 1kHz. 5000 ticks in your case would be 312.5 ms at 16kHz.
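To make the tick arithmetic concrete, here is a minimal sketch in plain C. The helper names ms_to_ticks and ticks_to_ms are hypothetical and not part of FreeRTOS; the first one is assumed to follow the same arithmetic as pdMS_TO_TICKS(), with the tick rate passed as a parameter instead of being taken from configTICK_RATE_HZ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert milliseconds to ticks for a given tick rate.
   Assumed to match the arithmetic of pdMS_TO_TICKS(); a 64-bit intermediate
   avoids overflow at high tick rates. */
static uint32_t ms_to_ticks(uint32_t ms, uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0);
    return (uint32_t)(((uint64_t)ms * tick_rate_hz) / 1000u);
}

/* Hypothetical helper: how long a raw tick count lasts, in milliseconds. */
static double ticks_to_ms(uint32_t ticks, uint32_t tick_rate_hz)
{
    assert(tick_rate_hz > 0);
    return ((double)ticks * 1000.0) / (double)tick_rate_hz;
}
```

With these helpers, ticks_to_ms(5000, 1000) gives 5000 ms, ticks_to_ms(5000, 16000) gives 312.5 ms, and ms_to_ticks(5000, 16000) gives 80000 ticks. A raw number in a config file therefore means a 16 times shorter timeout at 16kHz than at 1kHz, while a value wrapped in pdMS_TO_TICKS() keeps its duration.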

I would also think of the send time-out.
Timeouts are indeed stored as “number of clock ticks”, so in FreeRTOSIPConfig.h you often see a translation:

#define	ipconfigDNS_RECEIVE_BLOCK_TIME_TICKS pdMS_TO_TICKS( 5000U )
#define	ipconfigDNS_SEND_BLOCK_TIME_TICKS    pdMS_TO_TICKS( 2500U )

Another thing to check is the transmission buffer size:

/* Each TCP socket has circular buffers for Rx and Tx, which have a fixed
maximum size.  Define the size of Rx buffer for TCP sockets. */
#define ipconfigTCP_RX_BUFFER_LENGTH   ( 3U * ipconfigTCP_MSS )

/* Define the size of Tx buffer for TCP sockets. */
#define ipconfigTCP_TX_BUFFER_LENGTH   ( 3U * ipconfigTCP_MSS )

Note that you can set all buffer and TCP-window properties in one socket option: FREERTOS_SO_WIN_PROPERTIES

Are you using sliding TCP windows ( ipconfigUSE_TCP_WIN )?

Thank you very much.
ipconfigSOCK_DEFAULT_SEND_BLOCK_TIME is set to 20000:
\FreeRTOSIPConfig.h(71): #define ipconfigSOCK_DEFAULT_SEND_BLOCK_TIME (20000)
In addition, FREERTOS_SO_RCVTIMEO is set with FreeRTOS_setsockopt().

I ask because your FreeRTOS tick rate is set to 16kHz and timeouts are set in ticks. We demo our libraries with a 1kHz tick or slower. The default timeout we provide in our configs is 5000 ticks, or 5 s at 1kHz. 5000 ticks in your case would be 312.5 ms at 16kHz.

I was also concerned about this tick rate setting.
I understand that the recommended tick rate is 1 kHz and that it should not be faster.
Could this setting be causing the error?

Thank you very much.

The “number of clock ticks” values are set as below in our source code (FreeRTOSIPConfigDefaults.h). These are probably the default settings.

#define	ipconfigDNS_RECEIVE_BLOCK_TIME_TICKS	pdMS_TO_TICKS( 5000U )
#define	ipconfigDNS_SEND_BLOCK_TIME_TICKS		pdMS_TO_TICKS( 500U )

The transmission buffer size is set as below in our source code.

FreeRTOSIPConfig.h(286): #define ipconfigTCP_RX_BUFFER_LENGTH                   (6000)
FreeRTOSIPConfig.h(289): #define ipconfigTCP_TX_BUFFER_LENGTH                   (30000)

Are you using sliding TCP windows ( ipconfigUSE_TCP_WIN )?

TCP windows are not used, as shown in the setting below.
\FreeRTOSIPConfig.h(225): #define ipconfigUSE_TCP_WIN (0)

I would recommend to use a configTICK_RATE_HZ of 1000, unless you have a good reason to get faster clock.

Note that a clock tick rate has nothing to do with performance. A high tick rate gives a higher precision of the clock, but an RTC from the CPU can be much more precise without the overhead of a scheduler.
So a higher tick rate can make your application slower.

#define ipconfigUSE_TCP_WIN    (0)

That is a good choice when the performance of TCP connections is not so important. When enabled, the performance will be better, at the cost of a higher usage of RAM.

But both should work, with or without TCP windows.

The error that you are seeing is that a TX packet doesn’t get delivered / acknowledged. When that happens, the DUT (192.168.1.10) is not allowed to send more data, and its TX stream will fill up.
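The mechanism described here can be illustrated with a toy model in plain C. This is only a sketch under assumptions, not the FreeRTOS+TCP implementation: tx_model_t, model_send, model_ack and sends_until_enospc are hypothetical names, the model accepts partial writes, and it ignores any rounding the real stack may apply to buffer sizes. Bytes stay queued until the peer acknowledges them, and once nothing fits the model returns -28, the value of pdFREERTOS_ERRNO_ENOSPC given in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENOSPC 28 /* numeric value of pdFREERTOS_ERRNO_ENOSPC per this thread */

/* Toy model of one socket's TX stream (hypothetical, for illustration only). */
typedef struct {
    size_t capacity; /* e.g. the configured TX buffer length */
    size_t queued;   /* bytes handed over by the application, not yet ACKed */
} tx_model_t;

/* Returns the number of bytes accepted, or -MODEL_ENOSPC when nothing fits.
   In a real stack the caller would first block for the send timeout. */
static int32_t model_send(tx_model_t *tx, size_t len)
{
    assert(tx != NULL && tx->queued <= tx->capacity);
    size_t space = tx->capacity - tx->queued;
    if (space == 0) {
        return -MODEL_ENOSPC;
    }
    size_t accepted = (len < space) ? len : space;
    tx->queued += accepted;
    return (int32_t)accepted;
}

/* An ACK from the peer releases queued bytes and makes room again. */
static void model_ack(tx_model_t *tx, size_t acked)
{
    tx->queued -= (acked < tx->queued) ? acked : tx->queued;
}

/* Counts the sends that are accepted (fully or partly) once ACKs stop coming. */
static unsigned sends_until_enospc(size_t capacity, size_t len)
{
    tx_model_t tx = { capacity, 0 };
    unsigned count = 0;
    while (model_send(&tx, len) > 0) {
        count++;
    }
    return count;
}
```

In this model, a 6000-byte buffer accepts five 1460-byte sends (four complete, one partial) after the last ACK, and a 30000-byte buffer accepts 21. At one send every 2 ms that is only tens of milliseconds of margin, after which every further send ends with the error until an ACK arrives.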

These are the last packets:

and the last packet gets confirmed with Ack=942197861.

Just some questions:

At that moment, is the IP-task still alive?
Can you still ping the DUT?
Does TCP always stop after about 1290 seconds? Or can it happen at any moment?
When TCP stops, can you relate it to any other event?

I would recommend to use a configTICK_RATE_HZ of 1000, unless you have a good reason to get faster clock.

Note that a clock tick rate has nothing to do with performance. A high tick rate gives a higher precision of the clock, but an RTC from the CPU can be much more precise without the overhead of a scheduler.
So a higher tick rate can make your application slower.

Thank you very much for your very good advice.

At that moment, is the IP-task still alive?

No. When the error occurs, the IP-task is in an infinite loop, so it is not able to process anything.

Can you still ping the DUT?

I think the DUT probably cannot receive a ping.

Does TCP always stop after about 1290 seconds? Or can it happen at any moment?

It occurs in about 4 or 5 hours after starting communication.

When TCP stops, can you relate it to any other event?

I do not know whether it can be related to any other event.

Hi @Kamei,
pdFREERTOS_ERRNO_ENOSPC refers to the “no space left” (ENOSPC) error.
This error appears when the actual TCP transfer has stopped.
I feel the issue is related to a corner case, as it only shows up after 4-5 hours.
It could be caused either by the DMA transfer getting stuck or by a MAC/PHY corner case.
To start with, can you dump the DMA and MAC registers in the failure state and analyze the values?

Hi @Kamei

Were you able to move forward?

Did this make any difference?

Can you please share these along with the wireshark capture to help us checking it further?

pdFREERTOS_ERRNO_ENOSPC refers to the “no space left” (ENOSPC) error.
This error appears when the actual TCP transfer has stopped.

In my understanding, -pdFREERTOS_ERRNO_ENOSPC is returned if a timeout occurred before any data could be sent,
because that is what is written on the FreeRTOS_send() explanation page.

I feel the issue is related to a corner case, as it only shows up after 4-5 hours.
It could be caused either by the DMA transfer getting stuck or by a MAC/PHY corner case.
To start with, can you dump the DMA and MAC registers in the failure state and analyze the values?

I also think it is related to a corner case, but first of all I suspect that the conditions under which it occurs do not match the specifications of FreeRTOS or FreeRTOS+TCP.

Regarding hardware such as the DMA, MAC and PHY, we found no particular problems as far as we investigated.

Were you able to move forward?

No, we have not been able to move forward.

recommend to use a configTICK_RATE_HZ of 1000

Did this make any difference?

Currently, we have no evaluation environment available to check the operation, so we cannot verify this.

To start with, can you dump the DMA and MAC registers in the failure state and analyze the values?

Regarding hardware such as the DMA, MAC and PHY, we found no particular problems as far as we investigated.

Can you please share these along with the wireshark capture to help us checking it further?

Because I am a new user, I could not upload the files.

You should now be able to upload files.

I was able to upload the Wireshark capture files.
The original file is huge, so I divided it into parts smaller than 4 MB and uploaded them.

logfile1_00000_20230228230528.zip (3.1 MB)
logfile1_00001_20230228230758.zip (3.1 MB)
logfile1_00002_20230228231028.zip (3.1 MB)
logfile1_00003_20230228231258.zip (3.1 MB)
logfile1_00004_20230228231528.zip (3.1 MB)
logfile1_00005_20230228231758.zip (3.1 MB)
logfile1_00006_20230228232027.zip (3.1 MB)
logfile1_00007_20230228232257.zip (3.1 MB)
logfile1_00008_20230228232527.zip (1.9 MB)

Thanks @Kamei for the logs.

After analyzing all the logs, there is one common observation.
At the point where the issue starts, there seems to be a continuous stream of ARP request broadcast packets “HP_e6:f0:cc Broadcast ARP 42 Who has 192.168.1.3? Tell 192.168.1.100” on the line. These ARP requests arrive so frequently that they seem to consume all of the FreeRTOS-Plus-TCP network buffers, so FreeRTOS_send() is starved of buffers.
One easy way to check this is to increase the number of buffers. The issue should then occur less frequently or somewhat later. However, the condition will still occur, because with a barrage of incoming ARP requests no number of buffers will be enough.
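The argument that more buffers only postpone the problem can be sketched with a small simulation in plain C. This is a toy model under assumptions, not how FreeRTOS+TCP manages its buffers: first_tx_starvation_tick is a hypothetical name, and the per-tick arrival and processing rates are invented for illustration. Received frames each take one buffer from a fixed pool, the IP-task releases a fixed number per tick, and a transmit request fails as soon as the pool is empty.

```c
#include <assert.h>

/* Toy model (hypothetical): returns the first tick at which a TX buffer
   request would fail because received frames hold every buffer, or -1 if
   that never happens within max_ticks. */
static int first_tx_starvation_tick(unsigned pool_size,
                                    unsigned rx_per_tick,
                                    unsigned processed_per_tick,
                                    unsigned max_ticks)
{
    assert(pool_size > 0);
    unsigned in_use = 0;

    for (unsigned tick = 0; tick < max_ticks; tick++) {
        /* Incoming frames (e.g. ARP broadcasts) grab whatever buffers are free. */
        unsigned free_now = pool_size - in_use;
        in_use += (rx_per_tick < free_now) ? rx_per_tick : free_now;

        /* The application now asks for one buffer to transmit. */
        if (in_use == pool_size) {
            return (int)tick;
        }

        /* The IP-task processes and releases some of the received frames. */
        in_use -= (processed_per_tick < in_use) ? processed_per_tick : in_use;
    }
    return -1;
}
```

With three frames arriving and two being processed per tick, a pool of 10 buffers starves at tick 7 and a pool of 20 at tick 17, so doubling the pool only delays the failure. When arrivals do not exceed the processing rate, the function returns -1. In FreeRTOS+TCP the pool size is set with ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS.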

Now, coming to the question of why there is an ARP request.
In the test setup, “192.168.1.100 is PC side and 192.168.1.10 is RZ/T2M side”. However, the PC suddenly seems to be trying to send data to IP address 192.168.1.3 and is therefore sending ARP requests for that address.

I have the following queries:

  1. Is there a service running on the PC that is trying to send data to 192.168.1.3 instead of 192.168.1.10?
  2. Is there any other device connected to the PC on the same subnet? If yes, that device could be sending the ARP request, with the PC just forwarding it. Can we put that device or those devices in a different subnet?
  3. Can we create a setup with only the PC and the device in one network? It would be helpful if you could explain the setup and the use-case scenario.

Thanks,
Shub

Thank you very much for your strong support.
Regarding the continuous ARP requests, perhaps you are referring to the following capture. Is that right?

The problem is that the error occurred at 91773 in logfile1_00008_20230228232527.pcapng, after which transmission from the RZ/T2M side became impossible.

  1. Is there a service running on the PC that is trying to send data to 192.168.1.3 instead of 192.168.1.10?

I could not find anything sending to 192.168.1.3, but I cannot be sure that nothing does.

  2. Is there any other device connected to the PC on the same subnet? If yes, that device could be sending the ARP request, with the PC just forwarding it. Can we put that device or those devices in a different subnet?

No. No other device is connected. Only the RZ/T2M is connected to the PC, with a wired Ethernet cable.

  3. Can we create a setup with only the PC and the device in one network? It would be helpful if you could explain the setup and the use-case scenario.

We already have a setup with only the PC and the device in one network.

Hello Kamei, would it be possible to publish or to send the source code that handles the TCP connection?
If you don’t want to publish it, you can also send it to
hein [at] htibosch [dot] net
I will share it with our +TCP team internally.

Also, have you tried to repeat the last call to FreeRTOS_send()? Will you get the same error again?
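As an illustration of what repeating the call could look like, here is a minimal sketch in plain C. Everything in it is hypothetical: send_with_retry is not a FreeRTOS+TCP function, the send function is passed in as a pointer so the logic can be tried without the stack, and fake_send is only a stand-in that returns -28 (the pdFREERTOS_ERRNO_ENOSPC value from this thread) a few times before accepting data. In a real application the callback would wrap FreeRTOS_send().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ENOSPC 28 /* numeric value of pdFREERTOS_ERRNO_ENOSPC per this thread */

/* Hypothetical callback type; a real one would wrap FreeRTOS_send(). */
typedef int32_t (*send_fn_t)(void *ctx, const uint8_t *data, size_t len);

typedef struct {
    unsigned attempts;     /* number of calls made to the send function */
    unsigned enospc_count; /* how many of them returned -MODEL_ENOSPC */
    int32_t  last_error;   /* last non-positive result, or 0 if none */
} send_report_t;

/* Sends len bytes, retrying a limited number of times on -MODEL_ENOSPC.
   Returns the number of bytes actually accepted and fills in the report. */
static size_t send_with_retry(send_fn_t send_fn, void *ctx,
                              const uint8_t *data, size_t len,
                              unsigned max_enospc_retries,
                              send_report_t *report)
{
    assert(send_fn != NULL && report != NULL);
    size_t sent = 0;
    report->attempts = 0;
    report->enospc_count = 0;
    report->last_error = 0;

    while (sent < len) {
        int32_t rc = send_fn(ctx, data + sent, len - sent);
        report->attempts++;
        if (rc > 0) {
            sent += (size_t)rc;          /* full or partial progress */
        } else if (rc == -MODEL_ENOSPC) {
            report->last_error = rc;
            report->enospc_count++;
            if (report->enospc_count > max_enospc_retries) {
                break;                   /* give up and let the caller log it */
            }
        } else {
            report->last_error = rc;     /* zero or another error: stop */
            break;
        }
    }
    return sent;
}

/* Stand-in peer for trying the logic: stalls a few times, then accepts all. */
typedef struct { unsigned stalls_left; } fake_peer_t;

static int32_t fake_send(void *ctx, const uint8_t *data, size_t len)
{
    fake_peer_t *peer = ctx;
    (void)data;
    if (peer->stalls_left > 0) {
        peer->stalls_left--;
        return -MODEL_ENOSPC;
    }
    return (int32_t)len;
}
```

The report makes it visible whether the error is transient (a later attempt succeeds) or permanent (every retry fails), which is the distinction the question above is after.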

Looking at the PCAP that you sent, I don’t see any reason why the TCP connection would stop, so I would like to see the code around it.

I asked:

At that moment, is the IP-task still alive?

to which you replied:

No. When the error occurs, the IP-task is in an infinite loop, so it is not able to process anything.

Sorry, I hadn’t seen your reply above.
What I mean is: does the IP-task still make the normal loop, constantly calling xQueueReceive()?

It occurs in about 4 or 5 hours after starting communication.

Can it also occur after 1 hour, or after 7 hours?

Could you have the DUT (192.168.1.10) send a ping every e.g. 10 seconds to the Laptop (192.168.1.100)?
And maybe also in the other direction?
I would be curious to see if both pings will remain sent and answered.

Thank you