FreeRTOS PlusTCP hard fault with Atmel Cortex SAM4E

I have integrated FreeRTOS PlusTCP 10.2.0 with Atmel Cortex SAM4E Xplained Pro board.

I made a simple test program to send 400 bytes via UDP every 10ms (FreeRTOS tick rate is 1ms).

After a random amount of time I get a CPU hard fault.

While debugging, the problem seems to be related to network traffic and the release of the network buffers.

Monitoring how many times pvPortMalloc / vPortFree are called, the hard fault seems to occur after pvPortMalloc has been called more often than vPortFree.

For example, in this test program, during normal operation the difference between the number of pvPortMalloc calls and the number of vPortFree calls is 16, but when the hard fault occurs this difference has increased to 17 or higher.
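One way to obtain such counts is through the traceMALLOC() and traceFREE() hook macros that heap_4.c invokes. The following is only a minimal sketch: the counter names and the helper function are hypothetical, and in a real build the macros would be defined in FreeRTOSConfig.h rather than in a single file.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counters; the names are illustrative, not from the project. */
static volatile uint32_t ulMallocCalls = 0;
static volatile uint32_t ulFreeCalls = 0;

/* heap_4.c calls traceMALLOC() at the end of pvPortMalloc() with the
 * returned pointer (which may be NULL), and traceFREE() in vPortFree().
 * Only successful allocations are counted here. */
#define traceMALLOC( pvAddress, uiSize ) \
    do { if( ( pvAddress ) != NULL ) { ulMallocCalls++; } } while( 0 )
#define traceFREE( pvAddress, uiSize ) \
    do { ulFreeCalls++; } while( 0 )

/* Number of blocks currently allocated and not yet freed. */
static uint32_t ulOutstandingAllocations( void )
{
    return ulMallocCalls - ulFreeCalls;
}
```

A value that keeps growing over time would point at a leak; a value that stays around a fixed number (16 in the test above) would not.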

I can't trace the root cause of the hard fault by following https://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html, even when trying the “Handling Imprecise Faults” method.

I have read many threads in this forum about hard faults under high network traffic, but could not solve the problem after trying some of the suggestions.

I have tried many different values for the #defines below but the hard fault continues, for example:
#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS 10
#define ipconfigEVENT_QUEUE_LENGTH ( ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS + 10 )
#define ipconfigUSE_TCP_WIN ( 0 )
#define ipconfigNETWORK_MTU 586
#define ipconfigUSE_LINKED_RX_MESSAGES 1
#define ipconfigIGNORE_UNKNOWN_PACKETS ( 1 )

I am using the BufferAllocation_2 scheme with heap_4.c in the SAM4E internal SRAM (128KB), with the following definitions:
#define configMINIMAL_STACK_SIZE ( ( unsigned portSHORT ) 256 )
#define configTOTAL_HEAP_SIZE ( ( size_t ) ( 0x8000 ) )

The source code is on https://drive.google.com/open?id=1cUsO-uTJ5eKB4igAG8iQGBKVWVfvK34q

I would really appreciate any suggestion on how to track down and solve this problem.

Thank you very much in advance!
Best regards,
Marc

At any given time it would be expected that there are more allocations than frees - but the two should not diverge greatly over time as that would indicate a memory leak.

Is the network driver something you created yourself?
Do you have a malloc failed hook implemented? Malloc failures should be handled gracefully, so the hook being called is not a critical error, but would at least let you know if at any time you were running out of memory to help you narrow down the cause.

I would also strongly recommend using a malloc failed hook, and besides that of course configASSERT(). Stack checking can also be useful.
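As a minimal sketch of such a hook, assuming configUSE_MALLOC_FAILED_HOOK is set to 1 in FreeRTOSConfig.h: the counter name is hypothetical, and counting (rather than halting) is just one choice that fits the point that a failed allocation need not be a critical error.

```c
#include <stdint.h>

/* Hypothetical counter, meant to be inspected from a debugger or a
 * logging task. */
static volatile uint32_t ulMallocFailures = 0;

/* Called by pvPortMalloc() when an allocation fails and
 * configUSE_MALLOC_FAILED_HOOK is 1. */
void vApplicationMallocFailedHook( void )
{
    ulMallocFailures++;
}
```

If the counter is non-zero shortly before a fault, the heap size or the number of network buffers is a reasonable place to look first.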

The SAM4E driver has been upgraded several times since your version. You might want to have a look at the latest version, attached to this reply: DriverSAM_2020_mar_3.zip (24.6 KB)

This driver combines SAM4E and SAME70. Please define the macro SAM4E as 1.

As AWS/FreeRTOS is not actively supporting the SAM4E, the drivers on github are not yet updated.

Can you show (attach) some code that is sending the UDP messages? Does it use zero-copy?

Hi guys!

Thank you very much for your support!!!

Regarding your comments above, the malloc failed hook is implemented and configASSERT() is enabled, but neither of them reports a problem when the hard fault happens.

Thank you also for the new driver DriverSAM_2020_mar_3.zip. My code does not have the phyHandling.c file or the typedefs PhyProperties_t and EthernetPhy_t that this new driver requires (my code has ethernet_phy.c/h and gmac_phy.c). Would you have these files or some example project so I can try to merge them into my code?

I didn’t create the network driver from scratch, but I did have to make some modifications because the Atmel SAM4E Xplained Pro board uses the KSZ8081MNXIA and not the ksz8851mnl. I also had to merge parts of the Atmel ASF network CMSIS driver with the FreeRTOS driver, so maybe these changes are causing the hard fault.

Regarding the question: Can you show (attach) some code that is sending the UDP messages? Does it use zero-copy?

As additional information, to increase the likelihood of the fault, I turn off the UDP server (netcat); in Wireshark I can then see an ICMP packet sent to my board for every UDP packet I send. If I then change the interval between UDP packets, for example from 10ms to 5ms, I see hard faults sooner than if I change it from 10ms to 100ms.

So I think the problem may be related to sending and receiving packets “at the same time”. Maybe “my driver” is not properly managing the TX and RX buffers and their release.

Thank you very much again for your support!

@marc Any chance you got this working in the end? If so, would it be possible for you to post the source again? It would appear the files linked above have been moved to trash. I’m working on a SAME70, but a SAM4E-based ASF project would make a great reference.

Hello,

Yes, I was able to fix the hard fault issue. The problem was related to configMAC_INTERRUPT_PRIORITY, which was defined below configMAX_SYSCALL_INTERRUPT_PRIORITY.

But I was not able to use the new driver above, DriverSAM_2020_mar_3.zip, which needs ipconfigZERO_COPY_ enabled. When I use it, after a while I get several “Network buffers:” messages (which means new buffers are being allocated) and the ISR gmac_handler stops working.

Since I was not able to debug and solve it, I decided to use my “old” driver with ipconfigZERO_COPY_ disabled, but I would appreciate any help on what could cause the several “Network buffers:” messages and make the ISR gmac_handler stop working with the new driver DriverSAM_2020_mar_3.zip.

"Actually the problem was related to the configMAC_INTERRUPT_PRIORITY which was defined below configMAX_SYSCALL_INTERRUPT_PRIORITY."

It is very important to have configASSERT() defined properly while developing. It will also warn about wrong interrupt priorities. When you don’t enable configASSERT(), you would get a hard-to-understand exception.

"But I was not able to use the new driver above DriverSAM_2020_mar_3.zip which needs ipconfigZERO_COPY_"

I wrote about the memory configuration in this post. With zero-copy, the driver will reserve GMAC_RX_BUFFERS network buffers in advance. The post describes how to configure these network- and DMA-buffers.

"When I use it, after a while I got several “Network buffers:” messages"

The number of free network buffers is logged whenever the minimum number of free buffers has changed. These messages look like: Network buffers: 6 lowest 5.
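The log-on-change behaviour described above can be sketched as follows. This is a host-side illustration, not the library's code: the function and variable names are hypothetical, and on the target the two values would come from uxGetNumberOfFreeNetworkBuffers() and uxGetMinimumFreeNetworkBuffers().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Last low-water mark that was printed; (size_t) -1 means "none yet". */
static size_t uxLastLoggedMinimum = ( size_t ) -1;

/* Print a line only when the low-water mark differs from the one that
 * was printed last. Returns true when a line was printed. */
static bool xLogBufferStats( size_t uxFreeNow, size_t uxLowest )
{
    if( uxLowest == uxLastLoggedMinimum )
    {
        return false;
    }

    uxLastLoggedMinimum = uxLowest;
    printf( "Network buffers: %u lowest %u\n",
            ( unsigned ) uxFreeNow, ( unsigned ) uxLowest );
    return true;
}
```

Seen this way, a burst of these messages means the low-water mark is dropping quickly, that is, buffers are being taken faster than they are returned.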

If you see this number going down to 0, please check the configuration.

Also note that UDP sockets store incoming packets in a queue of network buffers. Unless you call FreeRTOS_recvfrom() regularly, the buffers may get exhausted.
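To illustrate why an unread UDP socket can drain the pool, here is a small host-side model. It is not the stack's code: the struct, the function names and the pool size are hypothetical, and the optional queue limit is only loosely modelled on what ipconfigUDP_MAX_RX_PACKETS is understood to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one UDP socket sharing a buffer pool. */
typedef struct
{
    size_t uxFree;      /* buffers left in the shared pool */
    size_t uxQueued;    /* packets waiting on the socket */
    size_t uxMaxQueued; /* queue limit, 0 means unlimited */
} SocketModel_t;

/* A packet arrives: it takes one buffer and is queued on the socket,
 * unless the pool is empty or the queue limit is reached. */
static bool xModelPacketArrives( SocketModel_t * pxModel )
{
    assert( pxModel != NULL );

    if( pxModel->uxFree == 0 )
    {
        return false; /* pool exhausted, packet dropped */
    }

    if( ( pxModel->uxMaxQueued != 0 ) &&
        ( pxModel->uxQueued >= pxModel->uxMaxQueued ) )
    {
        return false; /* queue full, packet dropped, no buffer kept */
    }

    pxModel->uxFree--;
    pxModel->uxQueued++;
    return true;
}

/* The application reads one packet: its buffer returns to the pool. */
static bool xModelRecvfrom( SocketModel_t * pxModel )
{
    assert( pxModel != NULL );

    if( pxModel->uxQueued == 0 )
    {
        return false;
    }

    pxModel->uxQueued--;
    pxModel->uxFree++;
    return true;
}
```

In this model, a socket without a limit that is never read ends up holding every buffer, while either a regular read or a queue limit keeps part of the pool available.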

"It is very important to have configASSERT() defined properly while developing."

Yes, I have had it defined since the beginning of the project, but I didn't receive any warning about wrong interrupt priorities.

"If you see this number going down to 0, please check the configuration."

Actually the system works fine for a while, for example 10 to 20 minutes, sometimes up to 1 hour, but then I suddenly get a bunch of “Network buffers:” messages (in less than one second) and the number of free network buffers goes to 0.

I suppose the cause is that the gmac_handler ISR stops working. I don't know why this happens, but I think it is triggered by the network traffic (RX packets).

My defines are:
#define GMAC_RX_BUFFERS 24
#define GMAC_TX_BUFFERS 8
#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS 60

This problem happens with both UDP and TCP.

As additional information, I also have a PIO interrupt that fires periodically, every 1ms.

And my ISR priorities are:

  • configMAX_SYSCALL_INTERRUPT_PRIORITY 0x05
  • configMAC_INTERRUPT_PRIORITY 6
  • NVIC_SetPriority(PIOD_IRQn,7);

Please check vPortValidateInterruptPriority() in port.c, which is only defined when configASSERT_DEFINED is defined as 1.
That checks the priority of the active ISR at the moment an API is being called.
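The comparison behind that check can be sketched on the host as follows. On Cortex-M a numerically lower value means a more urgent priority, so an ISR that calls a FromISR API function must have a numeric priority greater than or equal to configMAX_SYSCALL_INTERRUPT_PRIORITY. The sketch assumes 4 implemented priority bits (an assumption for the SAM4E, to be verified against __NVIC_PRIO_BITS), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the device implements 4 priority bits. */
#define MODEL_PRIO_BITS    4u

/* Convert a logical priority (0..15) to the raw 8-bit form held in the
 * NVIC priority register, where the implemented bits are the top bits. */
static uint8_t ucRawPriority( uint8_t ucLogical )
{
    assert( ucLogical < ( 1u << MODEL_PRIO_BITS ) );
    return ( uint8_t ) ( ucLogical << ( 8u - MODEL_PRIO_BITS ) );
}

/* true when an ISR at ucIsrLogical is allowed to call FromISR API
 * functions, given the logical max-syscall priority. */
static bool xMayCallFromISRApi( uint8_t ucIsrLogical,
                                uint8_t ucMaxSyscallLogical )
{
    return ucRawPriority( ucIsrLogical ) >= ucRawPriority( ucMaxSyscallLogical );
}
```

With a logical maximum of 5, priorities 6 and 7 pass and priority 4 fails. Because the port compares values in their shifted form, it may be worth confirming whether the 0x05 listed above is the logical or the raw register value.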

#define GMAC_RX_BUFFERS                        24
#define GMAC_TX_BUFFERS                         8
#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS 60

That looks perfect, although you could do with less without losing any performance.

But first, I would be curious what happens with:

#define GMAC_RX_BUFFERS                         8
#define GMAC_TX_BUFFERS                         2
#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS 60

With these settings there is a big pool of spare network buffers (50). Will it still run out of buffers?
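The figure of 50 follows from simple arithmetic, sketched below. The helper name is hypothetical, and treating both the RX and the TX buffers as reserved is an assumption taken from the 60 - 8 - 2 = 50 figure above.

```c
#include <assert.h>
#include <stddef.h>

/* Descriptors left over after the driver has reserved its RX and TX
 * DMA buffers (assumption: both kinds are taken from the same pool). */
static size_t uxSpareNetworkBuffers( size_t uxDescriptors,
                                     size_t uxRxBuffers,
                                     size_t uxTxBuffers )
{
    assert( uxDescriptors >= uxRxBuffers + uxTxBuffers );
    return uxDescriptors - uxRxBuffers - uxTxBuffers;
}
```

For comparison, uxSpareNetworkBuffers( 60, 24, 8 ) gives 28 for the earlier settings, against 50 for uxSpareNetworkBuffers( 60, 8, 2 ).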