STM32F7 TX DMA buffers 0

Could I get some ideas on where to look for an issue with a project using the STM32F769? I am using the NetworkInterface.c code I found floating around, which as far as I know is the newest version.

The issue at hand: pings work and the network driver appears to work OK, but the debug output shows slowly diminishing counts such as “Network buffers: 17 lowest 14”, with TX DMA buffers going to lowest 0.

Iperf output is rather lopsided.

Perhaps related: do pvPortMallocLarge and pvPortMallocSocket need non-cached memory?

Would recommend non-cached memory - but this doesn’t sound like an issue related to that.

Could I get some ideas where to look for an issue with a project using STM32F769?

The STM32Fx FreeRTOS+TCP driver in the Github repo has just been updated.

If I am not mistaken, the first 64 KB of RAM is not cached. The driver wants to declare the DMA buffers in that part, for instance:

__attribute__ ((section(".first_data")))
	ETH_DMADescTypeDef  DMARxDscrTab[ ETH_RXBUFNB ];

The linker file should declare .first_data before the .data segments.

Perhaps related, but do pvPortMallocLarge and
pvPortMallocSocket need non cached memory?

The socket objects are never accessed by DMA, so pvPortMallocSocket() may return cached memory.
pvPortMallocLarge() may also return cached memory: it allocates the so-called stream buffers that belong to TCP sockets, and it is also used to allocate TCP-segment information.

However, you are advised to use BufferAllocation_1.c: otherwise pvPortMalloc() will be used to allocate Network Buffers. Network Buffers are accessed by DMA.
pvPortMalloc() will probably not return non-cached memory.

The simplest solution for now, as Richard just wrote: do not use caching at all. When that works well, you can try to enable caching.
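Should caching be enabled later, any buffer shared with DMA has to be cleaned or invalidated in whole cache lines, which is one reason the packet buffers are aligned to 32 bytes. Below is a minimal sketch of the rounding arithmetic only, assuming the 32-byte data cache line of the Cortex-M7. The names SketchCacheRange_t and xSketchCacheRange() are hypothetical and not part of the driver; on target, the resulting range would be handed to a CMSIS routine such as SCB_CleanDCache_by_Addr().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-byte data cache lines, as on the Cortex-M7. */
#define SKETCH_CACHE_LINE_SIZE    32u

/* Hypothetical helper type: a range expanded to whole cache lines. */
typedef struct
{
    uintptr_t uxStart;  /* Start address, rounded down to a cache line. */
    size_t uxLength;    /* Length in bytes, a multiple of the line size. */
} SketchCacheRange_t;

/* Expand [ uxAddress, uxAddress + uxLength ) so that it covers complete
 * cache lines. This only does the arithmetic; it touches no hardware. */
SketchCacheRange_t xSketchCacheRange( uintptr_t uxAddress,
                                      size_t uxLength )
{
    SketchCacheRange_t xRange;
    const uintptr_t uxMask = ( uintptr_t ) SKETCH_CACHE_LINE_SIZE - 1u;
    /* Round the end address up to the next cache-line boundary. */
    const uintptr_t uxEnd = ( uxAddress + uxLength + uxMask ) & ~uxMask;

    /* Round the start address down to a cache-line boundary. */
    xRange.uxStart = uxAddress & ~uxMask;
    xRange.uxLength = ( size_t ) ( uxEnd - xRange.uxStart );

    assert( ( xRange.uxLength % SKETCH_CACHE_LINE_SIZE ) == 0u );

    return xRange;
}
```

A buffer that is already 32-byte aligned and a multiple of 32 bytes long comes back unchanged, so no unrelated data shares a cache line with it.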

What I am trying to understand is the relationship between the dma descriptors and the xNetworkBuffers.

BufferAllocation_1.c defines these as

static NetworkBufferDescriptor_t xNetworkBuffers[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS ];

but these aren’t allocated in .first_data, so do they need to be in uncached/DTCM space or not?

If they don’t need to be, what is the core point behind recommending BufferAllocation_1.c for the STM32F with DMA?

BufferAllocation_1.c defines the network buffer descriptors as:

static NetworkBufferDescriptor_t xNetworkBuffers[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS ];

NetworkBufferDescriptor_t is a struct with information about a network buffer. It has one important pointer, pucEthernetBuffer, which points to the actual network packet.

When a driver is zero-copy, pucEthernetBuffer is passed to DMA, both for reception as well as for sending.

When using BufferAllocation_1.c, the network packets will be stored in a static char array ucNetworkPackets[]:

void vNetworkInterfaceAllocateRAMToBuffers( NetworkBufferDescriptor_t pxNetworkBuffers[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS ] )
{
    static
    __attribute__ ((section(".first_data")))
    __attribute__ ( ( aligned( 32 ) ) )
    uint8_t ucNetworkPackets[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS * ETH_MAX_PACKET_SIZE ] ;
    /* ... */
}

( I rearranged the declaration to make it easier to read ).

To summarise, two objects are shared with DMA:

  • The buffers pointed to by pucEthernetBuffer
  • The so-called DMA descriptors, which describe the packets that are exchanged

The DMA-descriptors are stored in DMATxDscrTab and DMARxDscrTab.
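The relationship between the network buffer descriptors and the packet memory can be sketched as follows. This is a simplified stand-in and not the driver's actual code: SketchDescriptor_t, the SKETCH_ constants and vSketchAllocateRAMToBuffers() are hypothetical names, the sizes are illustrative, and the few bytes that the real function reserves in front of each packet are left out. On target, the packet array would additionally carry the section(".first_data") attribute so that it lands in non-cached RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values; the real ones come from FreeRTOSIPConfig.h
 * and the HAL headers. */
#define SKETCH_NUM_DESCRIPTORS    4u
#define SKETCH_MAX_PACKET_SIZE    1536u /* A multiple of 32. */

/* Simplified stand-in for NetworkBufferDescriptor_t. */
typedef struct
{
    uint8_t * pucEthernetBuffer; /* Points into the packet pool. */
    size_t xDataLength;          /* Bytes currently in use. */
} SketchDescriptor_t;

/* The descriptors are only touched by the CPU, so they may live in
 * ordinary (cached) RAM. */
static SketchDescriptor_t xSketchDescriptors[ SKETCH_NUM_DESCRIPTORS ];

/* The packet pool is what DMA reads and writes. */
static uint8_t ucSketchPackets[ SKETCH_NUM_DESCRIPTORS * SKETCH_MAX_PACKET_SIZE ]
__attribute__( ( aligned( 32 ) ) );

/* Give every descriptor its own fixed slice of the packet pool. */
void vSketchAllocateRAMToBuffers( SketchDescriptor_t pxDescriptors[] )
{
    uint8_t * pucRAM = ucSketchPackets;
    size_t x;

    for( x = 0; x < SKETCH_NUM_DESCRIPTORS; x++ )
    {
        /* Each slice starts on a 32-byte boundary. */
        assert( ( ( uintptr_t ) pucRAM % 32u ) == 0u );

        pxDescriptors[ x ].pucEthernetBuffer = pucRAM;
        pxDescriptors[ x ].xDataLength = 0u;
        pucRAM += SKETCH_MAX_PACKET_SIZE;
    }
}
```

After calling vSketchAllocateRAMToBuffers( xSketchDescriptors ), each pucEthernetBuffer points to a separate 1536-byte slice of the pool. Only those slices, plus the DMA descriptor tables, need to be placed in non-cached memory.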

If things are still not clear, please say so. I will update the sources to bring more clarity.
Thanks

Erik, I just opened a new Pull Request, #1739, which adds a bit more clarity about buffers and caching.

The adapted source code can be found here. I also added comments to the README.TXT file.

But first I would try out the driver without enabling memory caching.

Following up on this, the newer iperf3 gives me good results, 60-70 Mbps.

I’m not sure why the old iperf would cause the STM32 to send zero windows (as seen in Wireshark), but with iperf2 on the v1 source it pretty much dies.

Perhaps a define should be added for the unused xCheckLoopBack in the newer source?

the newer iperf3 gives me good results, 60-70 Mbps

That is a good performance!
Remember to use compiler optimisation and to disable stack checking ( in case you want to find the maximum ).

I don’t remember what the problems with earlier versions were.

xCheckLoopBack() : yes sorry, I left it in some versions of NetworkInterface.c.
I want to create a PR for this and add it as an option.
It will add a kind of loopback interface, and it allows you to test without having Wi-Fi or a LAN. Just use the static IP address.