+TCP Can I do zero copy UDP with app memory ownership?

I am transmitting video bits provided by an FPGA in mapped memory and haven’t figured out how to send a buffer over UDP using zero copy from application owned memory instead of network stack owned memory. Is this possible?

Thank you for your time.

If you send a zero-copy UDP packet, only a pointer to the data is passed through the stack. So as long as both the TCP stack and the Ethernet driver can access the memory, you should be able to do it.

Some more information about zero-copy: you can find example code of UDP zero-copy here.

If you use BufferAllocation_1.c, the application must provide a hook that supplies the memory blocks used for storing the packet data.

In this NetworkInterface.c you find an example of the application hook:

void vNetworkInterfaceAllocateRAMToBuffers( NetworkBufferDescriptor_t pxNetworkBuffers[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS ] )
{
}

( BufferAllocation_2.c uses pvPortMalloc() to allocate buffers dynamically, which is not what you want. )
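To give an idea of what the body of that hook usually looks like, here is a minimal sketch that builds on a host PC. The descriptor type is a simplified stand-in, and the constants ETH_FRAME_SIZE, BUFFER_PADDING and BUFFER_SIZE are assumed values chosen for illustration (in a real project the padding corresponds to ipBUFFER_PADDING and the descriptor count comes from FreeRTOSIPConfig.h). Each descriptor gets a statically allocated block, and the padding in front of the frame stores a pointer back to the descriptor.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch builds without the FreeRTOS+TCP headers.
 * All values below are assumptions for illustration. */
#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS    8
#define ETH_FRAME_SIZE                            1514u /* assumed maximum frame size */
#define BUFFER_PADDING                            10u   /* assumed; stands in for ipBUFFER_PADDING */
#define BUFFER_SIZE                               ( ( ETH_FRAME_SIZE + BUFFER_PADDING + 7u ) & ~7u )

typedef struct
{
    uint8_t * pucEthernetBuffer; /* the only field this sketch needs */
} NetworkBufferDescriptor_t;     /* simplified stand-in for the real type */

/* Statically allocated storage: one block per descriptor. */
static uint8_t ucBuffers[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS ][ BUFFER_SIZE ];

void vNetworkInterfaceAllocateRAMToBuffers( NetworkBufferDescriptor_t pxNetworkBuffers[ ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS ] )
{
    for( size_t x = 0; x < ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS; x++ )
    {
        NetworkBufferDescriptor_t * pxOwner = &pxNetworkBuffers[ x ];

        /* The Ethernet frame starts after the padding. */
        pxOwner->pucEthernetBuffer = &ucBuffers[ x ][ BUFFER_PADDING ];

        /* The first bytes of the padding hold the back-pointer to the owner.
         * memcpy() avoids any alignment assumptions. */
        memcpy( &ucBuffers[ x ][ 0 ], &pxOwner, sizeof( pxOwner ) );
    }
}
```

Because the hook only assigns pointers, the storage could in principle be placed in any region the EMAC can reach.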

The function FreeRTOS_GetUDPPayloadBuffer() is used to allocate network buffers. It returns a pointer to the beginning of the payload data. That is not just a block of memory, it has a back-pointer to the Network Buffer that owns the buffer.
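The back-pointer idea can be illustrated with a small host-buildable sketch. This is not the library's code: the descriptor is a simplified stand-in, prvAttach() and prvPayloadToDescriptor() are hypothetical helper names, and the offsets (10 bytes of padding, 42 bytes of Ethernet + IPv4 + UDP headers) are assumed values for an IPv4 packet.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BUFFER_PADDING        10u /* assumed; stands in for ipBUFFER_PADDING */
#define UDP_PAYLOAD_OFFSET    42u /* 14 Ethernet + 20 IPv4 + 8 UDP header bytes */

typedef struct
{
    uint8_t * pucEthernetBuffer;
    size_t xDataLength;
} NetworkBufferDescriptor_t; /* simplified stand-in for the real type */

/* Hypothetical helper: store the owner in the padding of a raw block and
 * return the pointer the application would see as its UDP payload. */
uint8_t * prvAttach( NetworkBufferDescriptor_t * pxOwner,
                     uint8_t * pucRaw )
{
    memcpy( pucRaw, &pxOwner, sizeof( pxOwner ) );
    pxOwner->pucEthernetBuffer = pucRaw + BUFFER_PADDING;
    return pxOwner->pucEthernetBuffer + UDP_PAYLOAD_OFFSET;
}

/* Hypothetical helper: walk back from a payload pointer to its owner. */
NetworkBufferDescriptor_t * prvPayloadToDescriptor( const uint8_t * pucPayload )
{
    NetworkBufferDescriptor_t * pxOwner;

    memcpy( &pxOwner, pucPayload - UDP_PAYLOAD_OFFSET - BUFFER_PADDING, sizeof( pxOwner ) );
    return pxOwner;
}
```

This is why a payload pointer handed to a zero-copy send cannot be an arbitrary address: the bytes in front of it must follow the layout the stack expects.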

When you send a UDP packet, it travels through the IP-stack: FreeRTOS_sendto() hands it to the IP-task, which prepares a valid UDP packet and passes it to xNetworkInterfaceOutput(). No copying is done.

You must make sure though that the “application owned mapped memory” is accessible to the EMAC peripheral.

Thank you Hein for your detailed response. Always good to see your smiling face.

I should have provided more detail.

  • I am using BufferAllocation_1.c
  • I am using the UDP zero-copy example that you reference.
  • We will have TCP servers running at the same time.
  • I have implemented vNetworkInterfaceAllocateRAMToBuffers to assign regular (non video) memory to NetworkBufferDescriptor_t.
  • At the time of UDP sendto, the FPGA indicates which memory address is ready for transmission.
  • I am also writing the device driver (for SmartFusion2), which is functioning.

I may have missed something, but I haven’t figured out how to do this yet. It seems I have two different memory types (using regular or video memory), and I don’t see how to create and request two different types using vNetworkInterfaceAllocateRAMToBuffers.

Is it possible for me to create BufferWithDescriptors for video UDP separately? The pucEthernetBuffer field would be assigned to video memory, sent using zero copy FreeRTOS_sendto(), identified in xNetworkInterfaceOutput by memory address to return it to the video task and not call vReleaseNetworkBufferAndDescriptor.
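The "identify by memory address" step could be sketched roughly as below. Everything here is hypothetical: ucVideoRAM stands in for the FPGA-mapped region (the real base and size would come from the memory map), and xIsVideoBuffer() and eClassifyAfterSend() are made-up names. In a real driver, a check like this would sit at the point where xNetworkInterfaceOutput() would otherwise release the buffer.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical video window; stands in for the FPGA-mapped region. */
#define VIDEO_POOL_SIZE    4096u
uint8_t ucVideoRAM[ VIDEO_POOL_SIZE ];

typedef enum
{
    eReleaseToStack,   /* normal buffer: hand it back to the buffer manager */
    eReturnToVideoTask /* video buffer: give it back to the video task */
} eBufferOwner_t;

/* True when the address lies inside the video window. */
bool xIsVideoBuffer( const uint8_t * pucBuffer )
{
    uintptr_t uxAddress = ( uintptr_t ) pucBuffer;
    uintptr_t uxBase = ( uintptr_t ) ucVideoRAM;

    return ( uxAddress >= uxBase ) && ( ( uxAddress - uxBase ) < VIDEO_POOL_SIZE );
}

/* Decide what to do with a buffer once the EMAC has finished with it. */
eBufferOwner_t eClassifyAfterSend( const uint8_t * pucEthernetBuffer )
{
    return xIsVideoBuffer( pucEthernetBuffer ) ? eReturnToVideoTask : eReleaseToStack;
}
```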

Does that sound workable?

Gordon, there is an additional problem: the video pixels are stored in special blocks of RAM. FreeRTOS+TCP wants to build up each UDP or TCP packet in a contiguous block of RAM. That means that you will need to use the video memory just before the pixel data, which is probably not available.
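To make the contiguity requirement concrete, here is a small sketch of the arithmetic, assuming IPv4 without options (14 + 20 + 8 = 42 header bytes) and an assumed 10 bytes of padding for the back-pointer. pucFrameStart() is a hypothetical helper, not part of FreeRTOS+TCP.

```c
#include <stdint.h>
#include <stddef.h>

#define ETH_HEADER_SIZE       14u
#define IPV4_HEADER_SIZE      20u /* no IP options */
#define UDP_HEADER_SIZE       8u
#define UDP_PAYLOAD_OFFSET    ( ETH_HEADER_SIZE + IPV4_HEADER_SIZE + UDP_HEADER_SIZE )
#define BUFFER_PADDING        10u /* assumed; stands in for ipBUFFER_PADDING */

/* Hypothetical helper: given a video block and the number of bytes reserved
 * in front of the pixel data, return where the Ethernet frame would start,
 * or NULL when the reserved space cannot hold headers plus padding. */
uint8_t * pucFrameStart( uint8_t * pucBlock,
                         size_t xReservedBeforePixels )
{
    if( xReservedBeforePixels < ( UDP_PAYLOAD_OFFSET + BUFFER_PADDING ) )
    {
        return NULL;
    }

    /* The headers must end exactly where the pixel data begins. */
    return pucBlock + xReservedBeforePixels - UDP_PAYLOAD_OFFSET;
}
```

With these assumed values, each video block needs at least 52 free bytes immediately before its pixel data.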

lwIP has pbufs: chains of memory blocks connected with pointers. An lwIP driver can send multiple pbufs to the EMAC. It will also need some hacking, but you might realise it more quickly with lwIP.

It is also possible using FreeRTOS+TCP, but you would need to create a fork with a special feature to use two blocks of memory: one for the headers, and one for the payload.

Have you already studied the timing? How much time would be lost by calling memcpy() for every UDP packet? I found that the time needed for memcpy() varies greatly among CPUs. Some of them use 128-bit move instructions from the FPU.
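One way to get a number for this is a small timing harness like the sketch below. The names ulCyclesPerCopy() and ulHostCounter() are hypothetical. The counter is passed in as a function pointer so the same loop can run on a PC with clock() or on the target with whatever free-running cycle counter the part provides (on a Cortex-M3 that is often the DWT cycle counter, if it is implemented).

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical: any function returning a free-running counter value. */
typedef uint32_t ( * CounterRead_t )( void );

/* Host stand-in counter; on the target, read the hardware counter instead. */
uint32_t ulHostCounter( void )
{
    return ( uint32_t ) clock();
}

/* Average counter ticks spent per memcpy() of xLen bytes. */
uint32_t ulCyclesPerCopy( CounterRead_t pxRead,
                          uint8_t * pucDst,
                          const uint8_t * pucSrc,
                          size_t xLen,
                          uint32_t ulIterations )
{
    volatile uint8_t ucSink = 0; /* discourages the compiler from dropping the copies */
    uint32_t ulStart, ulEnd;

    if( ulIterations == 0u )
    {
        return 0u;
    }

    ulStart = pxRead();

    for( uint32_t i = 0; i < ulIterations; i++ )
    {
        memcpy( pucDst, pucSrc, xLen );
        ucSink = pucDst[ 0 ];
    }

    ulEnd = pxRead();
    ( void ) ucSink;

    /* Unsigned subtraction stays correct across a single counter wrap. */
    return ( ulEnd - ulStart ) / ulIterations;
}
```

Multiplying the result by the number of packets per second gives a rough estimate of the CPU time that copying would cost.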

Regarding RAM layout, we have control of that and already made space for UDP headers. We transmitted the packets directly via the EMAC as an initial test and bandwidth was sufficient.

I did use lwIP for a bit, but was hoping to get better results using FreeRTOS, which for sure has better support :)

Right now I’m sending video directly to the driver, going around +TCP for the moment.

Using memcpy() + sendto, we were sending about 1/3 of the needed packets. I haven’t examined the memcpy() instructions. We are running SmartFusion2. It has a Cortex M3.

I considered trying to add a new buffer type to FreeRTOS, but it was over my head how to do that.

“Regarding RAM layout, we have control of that and already made space for UDP headers. We transmitted the packets directly via the EMAC as an initial test and bandwidth was sufficient.”

Ah, very good.

“I considered trying to add a new buffer type to FreeRTOS, but it was over my head how to do that.”

Nobody will stop you from writing BufferAllocation_3.c :)

The network buffer type NetworkBufferDescriptor_t is defined in FreeRTOS_IP.h. The access functions are defined in NetworkBufferManagement.h.

You could also have two sets of network buffers:

  • Use BufferAllocation_1 or 2 to create and release the normal network buffers
  • Create a special BufferAllocation_3 for the video UDP packets.

You could add an ID field to each NetworkBufferDescriptor_t, to distinguish between normal and video buffers.
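As a rough sketch of that idea, the descriptor below is a simplified stand-in with an extra ePool field, which is not part of the stock NetworkBufferDescriptor_t. The two release functions are placeholders that only count calls: in a real port the normal one would hand the buffer back to BufferAllocation_1 or 2, and the video one would return the block to the video task.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical pool identifier stored in each descriptor. */
typedef enum
{
    ePoolNormal = 0,
    ePoolVideo = 1
} eBufferPool_t;

typedef struct
{
    uint8_t * pucEthernetBuffer;
    size_t xDataLength;
    eBufferPool_t ePool; /* the added ID field (an assumption, not stock) */
} NetworkBufferDescriptor_t; /* simplified stand-in for the real type */

/* Placeholder release paths; they only count calls in this sketch. */
unsigned int uxNormalReleased = 0;
unsigned int uxVideoReleased = 0;

void vReleaseNormal( NetworkBufferDescriptor_t * pxBuffer )
{
    ( void ) pxBuffer;
    uxNormalReleased++;
}

void vReleaseVideo( NetworkBufferDescriptor_t * pxBuffer )
{
    ( void ) pxBuffer;
    uxVideoReleased++;
}

/* Single release entry point that dispatches on the ID field. */
void vReleaseByPool( NetworkBufferDescriptor_t * pxBuffer )
{
    switch( pxBuffer->ePool )
    {
        case ePoolVideo:
            vReleaseVideo( pxBuffer );
            break;

        case ePoolNormal:
        default:
            vReleaseNormal( pxBuffer );
            break;
    }
}
```

Compared with checking the buffer address, an explicit ID keeps working if the video memory is ever moved or split into several regions.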

I didn’t think to make BufferAllocation_3.c, that’s a good idea.