+TCP +FAT FTP server slow transfer

prvTCPWindowFastRetransmit: Requeue sequence number 411649 < 413109
prvTCPWindowFastRetransmit: Requeue sequence number 419253 < 419841
prvTCPWindowFastRetransmit: Requeue sequence number 425985 < 427445
prvTCPWindowFastRetransmit: Requeue sequence number 433589 < 434177
prvTCPWindowFastRetransmit: Requeue sequence number 440321 < 441781
prvTCPWindowFastRetransmit: Requeue sequence number 447925 < 448513
prvTCPWindowFastRetransmit: Requeue sequence number 454657 < 456117
prvTCPWindowFastRetransmit: Requeue sequence number 462261 < 462849
prvTCPWindowFastRetransmit: Requeue sequence number 468993 < 470453
prvTCPWindowFastRetransmit: Requeue sequence number 476597 < 477185
Network buffers: 38 lowest 38
Network buffers: 37 lowest 37
emacps_send_message: Time-out waiting for TX buffer
Network buffers: 36 lowest 36
Network buffers: 35 lowest 35
Network buffers: 34 lowest 34
Network buffers: 33 lowest 33
emacps_send_message: Time-out waiting for TX buffer
emacps_send_message: Time-out waiting for TX buffer
ulTCPWindowTxGet[47954 - 19655]: Change Tx window: 26280 -> 2920
Network buffers: 32 lowest 32
Network buffers: 31 lowest 31
Network buffers: 30 lowest 30
Network buffers: 29 lowest 29
Network buffers: 28 lowest 28
emacps_send_message: Time-out waiting for TX buffer
Network buffers: 27 lowest 27
Network buffers: 26 lowest 26
Network buffers: 25 lowest 25
Network buffers: 24 lowest 24
emacps_send_message: Time-out waiting for TX buffer
emacps_send_message: Time-out waiting for TX buffer
Network buffers: 23 lowest 23
Network buffers: 22 lowest 22
Network buffers: 21 lowest 21
Network buffers: 20 lowest 20
Network buffers: 19 lowest 19
emacps_send_message: Time-out waiting for TX buffer
Network buffers: 18 lowest 18
Network buffers: 17 lowest 17
Network buffers: 16 lowest 16
Network buffers: 15 lowest 15
emacps_send_message: Time-out waiting for TX buffer
emacps_send_message: Time-out waiting for TX buffer

I was seeing this output in my logs but ONLY when I ran the FTP client code from my RaspberryPi and not from my Windows 10 dev machine.

This morning, I made changes to my FreeRTOSIPConfig.h (17.7 KB) and when I ran the FTP program from my RaspberryPi, the messages that I have shown above went away. The file download was mostly quick (1-2 pauses in the middle) and no errors - you can see that in this video.

However, I still see the 5s silent period when I run the same Python script to download the FTP files from my Windows 10 machine. I don’t see any errors being printed in the serial terminal. You can see that in this video.

Based on this note, it seems that my Raspberry Pi setup matches what you are seeing since I was seeing this message but ONLY when I ran the FTP download from the Pi, never from the Win10 machine.

I swapped the cables I am using between the RPi and Win10 machine and the performance stayed the same.

This is the Python script I am using to download the files: FTPClient.zip (983 Bytes). If you would like to use this, all you have to do is modify the variables ip_address, port, and filename in the main section of the file right at the bottom.

Do you think performance should be consistent regardless of what host OS the FTP client is downloading the files from?

This is a snippet of the FTP client program running on the RPi.

If you zoom in on this image, you can see that the file ending in _104626.bin took 5.46s to download and the next file _104628.bin took 0.46s. In the serial terminal, however, the download speed is reported the same for both the files.

So that 5s delay is still there when I run on the Pi, it is just not as frequent as it is on Windows. You can also see in the logs that the message "emacps_send_message: Time-out waiting for TX buffer". did not get printed.

When I try to run the Python script it gives an error:

Traceback (most recent call last):
  File "E:\temp\ftp\FTPClient.py", line 86, in <module>
    ftp_client.download_file(filename='megabyte.bin', download_path='./')
  File "E:\temp\ftp\FTPClient.py", line 56, in download_file
    res = self._ftp_client.retrbinary('RETR ' + filename, fp.write, blocksize=0x8000)
  File "C:\Python39\lib\ftplib.py", line 428, in retrbinary
    self.voidcmd('TYPE I')
  File "C:\Python39\lib\ftplib.py", line 283, in voidcmd
    self.putcmd(cmd)
  File "C:\Python39\lib\ftplib.py", line 205, in putcmd
    self.putline(line)
  File "C:\Python39\lib\ftplib.py", line 200, in putline
    self.sock.sendall(line.encode(self.encoding))
AttributeError: 'NoneType' object has no attribute 'sendall'

I googled this error but the answers do not seem to apply. I also tried Python 3.7, same result.

Could you lower this time-out in the code from 5 seconds to e.g. 0.5 second?

TickType_t xBlockTimeTicks = pdMS_TO_TICKS( 500U );

In none of the network interfaces, such a long time-out was chosen. But also I think that the parameters should be tuned so that no time-out ever occurs.

I forgot a line in the script - ftp_client.connect() - this needs to go in the main section of the file like this:

ftp_client = FTPClient(ip_address=ip_address, port_num=port)
ftp_client.connect()
ftp_client.download_file(filename=filename, download_path='./')

I believe you saw that error because the FTP server wasn’t connected.

I will reduce the timeout and get back to you soon.

I was trying to debug a different issue related to FreeRTOS+FAT and found myself in ff_sddisk.c for the Zynq port of the library. The issue I was having was that while writing to the eMMC, I was often running into periods where my other tasks would all stall for 5s. This all of a sudden reminded me of this issue where we had also seen a block of 5s which was causing the FTP to be slow.

I changed sdWAIT_INT_TIME_OUT_MS in that file from 5000ms to 50ms and put a breakpoint in this function: u32 XSdPs_WaitInterrupt( XSdPs *InstancePtr, u32 ulMask, u32 ulWait )

I added an extra if statement in that function to check for the timeout condition after the do-while loop exited. That is where I put the breakpoint.

I was transferring some files over FTP and all of a sudden that breakpoint got triggered! So I am pretty sure that the 5s issue I was seeing earlier is related to this.

So now what I am not sure about is why that 50ms timeout was getting triggered… It is happening at seemingly random intervals so it is difficult to say if its a driver issue or hardware issue.

Actually, this reminds me of a sync bug I had seen many years ago (and consequently I unfortunately don’t have any records of except what’s in my memory):

I once came across a DMA network driver that got out of sync, so it didn’t signal a free DMA buffer as the buffer was freed but delayed (if I remember correctly, when it was “nudged” by the next incoming packet, which had the interesting property that the bug showed only under small network load).

So I’d recommend looking more closely at the sequence of buffer/signal usage in the network driver. The 5s may simply be periodic external events that “retrigger” the buffer signaling mechanism.

@sidmodi wrote me directly:

One thing I just noticed is that the case where a timeout happens is periodic - it happens consistently after 30 writes of 512KB buffers to the eMMC… not sure what this means.

I only know a little about SD-cards, much less about MMC. But this makes me wonder if the card was to do some expensive operation after 30 writes ( sure it is not 32 writes, which would make 16 MB ).
Does the card have big erase blocks of 16 MB?

I changed the write size from 512KB to 128KB and the timeouts I was seeing while writing the eMMC went away. However, I did see a single timeout during an FTP transfer but that was 1 timeout in over 400 files.

I will dig into the eMMC datasheet a little bit to see if it could be hardware related.

For those waiting for a solution: @sidmodi and I have investigated the problem, which led to a new version in PR #33: “Xilinx/Zynq.2019.3 Avoid race condition when waiting for status”.

Thank you Sid for reporting about this problem, and also for your patience.