I am transitioning a project to the SMP kernel, but I noticed that the TCP library does not function when using BufferAllocation_1. BufferAllocation_2 seems to work properly from what I’ve seen so far, but there is a noticeable drop in performance compared to scheme 1. The port being used is Xilinx Zynq.
I attached debug messages and a Wireshark capture of me simply trying to ping the device. When using BufferAllocation_1, the device does reply to ARP, but ping replies are very delayed, if they come at all. Attempting a TCP connection always fails.
I did try to set the affinity of the IP task and the EMAC task to only run on core 0, but it did not seem to improve anything.
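For reference, the pinning I tried looked roughly like the sketch below. Only the wrapper function and xEMACTaskHandle are placeholders of mine; FreeRTOS_GetIPTaskHandle() and vTaskCoreAffinitySet() are the stock FreeRTOS+TCP and SMP-kernel calls.

#include "FreeRTOS.h"
#include "task.h"
#include "FreeRTOS_IP.h"

#if ( configUSE_CORE_AFFINITY == 1 )
    /* Sketch: restrict the two networking tasks to core 0. */
    void vPinNetworkTasksToCore0( TaskHandle_t xEMACTaskHandle )
    {
        /* Pin the IP task to core 0 only. */
        vTaskCoreAffinitySet( FreeRTOS_GetIPTaskHandle(), ( 1U << 0 ) );

        /* Pin the EMAC deferred-handler task as well.  xEMACTaskHandle is a
         * placeholder for wherever the network interface keeps that handle. */
        vTaskCoreAffinitySet( xEMACTaskHandle, ( 1U << 0 ) );
    }
#endif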
So some questions this raises:
Has BufferAllocation_1 been tested with SMP on any other ports? Is the problem in the Zynq port or in the allocation scheme?
Maybe a dumb question, but has the TCP library been tested with SMP at all? I only ask because I have searched and found no mention of it. So far BufferAllocation_2 (with no affinities set) seems to be working.
Any other thoughts on what the problem could be? If it were a threading problem, I would have thought pinning the two tasks to core 0 would have worked around the issue.
Is it crashing, or are you not able to send or receive packets? If it’s crashing, are you able to get the call stack/logs?
No, neither buffer allocation scheme has been tested with SMP enabled yet.
Is the problem in the Zynq port or the allocation scheme?
Since the major difference between the two allocation schemes is that scheme 1 uses statically allocated memory, I would start by checking where ucNetworkPackets/pucNetworkPackets gets allocated and whether that placement aligns with the SMP/hardware configuration.
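As a quick sanity check (a sketch only, assuming FreeRTOS_debug_printf() is enabled in the IP config), something like this prints where the payload memory handed out by scheme 1 actually lives, so it can be compared against the linker map and the MMU section that is supposed to be uncached:

#include "FreeRTOS.h"
#include "FreeRTOS_IP.h"
#include "NetworkBufferManagement.h"

/* Sketch: borrow one network buffer and report where its payload lives. */
void vCheckBufferPlacement( void )
{
    NetworkBufferDescriptor_t * pxBuffer =
        pxGetNetworkBufferWithDescriptor( ipconfigNETWORK_MTU, pdMS_TO_TICKS( 100U ) );

    if( pxBuffer != NULL )
    {
        FreeRTOS_debug_printf( ( "Descriptor %p, payload %p\n",
                                 ( void * ) pxBuffer,
                                 ( void * ) pxBuffer->pucEthernetBuffer ) );
        vReleaseNetworkBufferAndDescriptor( pxBuffer );
    }
}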
I did not observe any crashes or asserts. I included the debug messages in the zip, but I will also paste them in this post.
FreeRTOS_AddEndPoint: MAC: 00-11 IPv4: c0a8150aip
prvIPTask started
XEmacPs detect_phy: PHY detected at address 0.
Start PHY autonegotiation
Waiting for PHY to complete autonegotiation.
autonegotiation complete
link speed: 1000
prvEMACHandlerTask[ 0 ] started running
Network buffers: 64 lowest 64
Socket 6000 -> [0.0.0.0]:0 State eCLOSED->eTCP_LISTEN
Network buffers: 61 lowest 61
Network buffers: 60 lowest 60
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
Network buffers: 30 lowest 30
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
emacps_handle_error: Receive buffer not available
emacps_handle_error: Receive buffer not available
pxEasyFit: ARP c0a81505ip -> c0a8150aip
emacps_handle_error: Receive buffer not available
Network buffers: 26 lowest 26
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
emacps_handle_error: Receive buffer not available
Network buffers: 24 lowest 24
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a8150aip
Network buffers: 22 lowest 22
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a8150aip end-point c0a8150aip
pxEasyFit: ARP c0a81505ip -> c0a81501ip
pxEasyFit: ARP c0a81505ip -> c0a81501ip
pxEasyFit: ARP c0a81505ip -> c0a81501ip
Network buffers: 9 lowest 9
ipARP_REQUEST from c0a81505ip to c0a81501ip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a81501ip end-point c0a8150aip
ipARP_REQUEST from c0a81505ip to c0a81501ip end-point c0a8150aip
emacps_send_message: Time-out waiting for TX buffer
OK, this comment is making me think that the problem might be that BufferAllocation_1 expects to use uncached memory, and maybe that conflicts with the SMP cache-coherent memory setup. I’m hardly an expert on this topic, but I’ll see if I can investigate the ARM memory settings.
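For anyone following along: on the Cortex-A9 the per-section MMU attributes are changed with Xil_SetTlbAttributes() from the Xilinx standalone BSP. Below is a minimal sketch; the helper name, the flush-first step, and the 0x0C02 attribute value follow common Xilinx examples and are not taken from the actual Zynq port code.

#include <stdint.h>
#include "xil_types.h"
#include "xil_mmu.h"
#include "xil_cache.h"

/* Sketch: mark one 1 MB MMU section as non-cacheable so the GEM DMA and
 * both CPUs see the same data. */
void vMarkSectionUncached( uint32_t ulSectionAddress )
{
    /* Write back anything still sitting in the data cache before the
     * attributes of the section change. */
    Xil_DCacheFlush();

    /* 0x0C02: section descriptor, full access, TEX/C/B = 0, i.e.
     * strongly-ordered (uncached) memory. */
    Xil_SetTlbAttributes( ( INTPTR ) ulSectionAddress, 0x0C02U );
}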
These logs indicate that the network driver is having trouble with the RX buffers when receiving packets, probably losing them and eventually running out of network buffers.
Can you step through init_dma and emacps_check_rx to check whether the buffers allocated by pxGetNetworkBufferWithDescriptor are properly initialized and set up for the DMA?
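One way to see whether buffers are really leaking, rather than just coming back slowly, is to log the counters the stack already keeps (the same ones behind the 'Network buffers: x lowest y' lines). A rough sketch:

#include "FreeRTOS.h"
#include "task.h"
#include "FreeRTOS_IP.h"
#include "NetworkBufferManagement.h"

/* Sketch: report the current and lowest-ever number of free network
 * buffers once a second.  A "lowest" value that keeps falling while the
 * traffic has stopped points at buffers that are never returned. */
static void prvBufferMonitorTask( void * pvParameters )
{
    ( void ) pvParameters;

    for( ; ; )
    {
        FreeRTOS_debug_printf( ( "Free buffers: %u, lowest: %u\n",
                                 ( unsigned ) uxGetNumberOfFreeNetworkBuffers(),
                                 ( unsigned ) uxGetMinimumFreeNetworkBuffers() ) );
        vTaskDelay( pdMS_TO_TICKS( 1000U ) );
    }
}

Creating it with xTaskCreate() at a low priority once the network is up is enough for this purpose.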
I’m still thinking that this is the most likely cause. BufferAllocation_1 modifies the attributes of its memory buffer here. I tinkered with the memory attribute values and task affinities, and sometimes the network initializes to a stable, working state, but the init is unreliable and sometimes fails (which makes me think the memory is misconfigured).
Meanwhile, I have not seen any issues with BufferAllocation_2 so far, so I am going to move forward with testing my application using that scheme. I think it’s likely that the Zynq port’s BufferAllocation_1 is not compatible with SMP operation without some modifications.
In the Zynq port, BufferAllocation_1 takes its packet buffers from pucUncachedMemory[], which will become non-cached memory. It is shared between the DMA and the CPU.
I just compiled and ran a Zynq/Zybo project using BufferAllocation_1.c with caching enabled for pucUncachedMemory[]. It has no Ethernet connectivity.
I added a new testing macro:
#define uncZYNQ_FORCE_USE_CACHED_MEMORY 1
So before we conclude that caching is the problem, some repair work is needed.
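To be clear about the kind of repair meant here: if the packet buffers stay cacheable, the driver has to do explicit cache maintenance around every DMA transfer. A rough sketch using the standalone BSP cache calls (not the actual driver code, and the two helper names are mine):

#include "FreeRTOS.h"
#include "xil_types.h"
#include "xil_cache.h"
#include "FreeRTOS_IP.h"
#include "NetworkBufferManagement.h"

/* Sketch: before handing a frame to the GEM DMA for transmission, write any
 * dirty cache lines back to DDR so the DMA reads the real data. */
void vFlushTxBuffer( const NetworkBufferDescriptor_t * pxDescriptor )
{
    Xil_DCacheFlushRange( ( INTPTR ) pxDescriptor->pucEthernetBuffer,
                          pxDescriptor->xDataLength );
}

/* Sketch: after the DMA has written a received frame, discard any stale
 * cache lines so the CPU reads what the DMA wrote. */
void vInvalidateRxBuffer( const NetworkBufferDescriptor_t * pxDescriptor )
{
    Xil_DCacheInvalidateRange( ( INTPTR ) pxDescriptor->pucEthernetBuffer,
                               pxDescriptor->xDataLength );
}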
Strange, though, that @mike919192 reports that his demo does work when using BufferAllocation_2.c. That module uses the default pvPortMalloc(), I assume?
EDIT: For me, only BufferAllocation_1.c works, not _2.c.