I’ve been struggling for a while now with a HardFault occurring with FreeRTOS+TCP v2.3.2-LTS.Patch.1 in use.
In our project we are using a Renesas RA6E1 MCU, and most of the source code used to interface with FreeRTOS+TCP on a low level originates from the Renesas FSP (version 3.6.0). The NetworkInterface.c we’re using can be found at
https://github.com/renesas/fsp/tree/master/ra/fsp/src/rm_freertos_plus_tcp. For buffer management, I’ve chosen to use BufferAllocation_1.c for stability.
So this is the issue in a nutshell: when using DHCP, our device works fine, but after switching to a static IP, the device occasionally crashes into the HardFault handler after a few seconds of operation.
I have DHCP enabled with the following preprocessor macros:
#define ipconfigUSE_DHCP 1
#define ipconfigDHCP_REGISTER_HOSTNAME 0
#define ipconfigDHCP_USES_UNICAST 1
#define ipconfigDHCP_SEND_DISCOVER_AFTER_AUTO_IP 1
#define ipconfigUSE_DHCP_HOOK 1
When using static IP, I pass the IP address to the FreeRTOS_Init() function and return from the DHCP hook in the eDHCPPhasePreDiscover phase, as suggested in a couple of posts on these forums.
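For reference, here is a minimal sketch of the kind of DHCP hook described above. It assumes the hook answers eDHCPUseDefaults in the eDHCPPhasePreDiscover phase, which is the answer FreeRTOS+TCP documents for stopping DHCP and using the addresses passed to FreeRTOS_Init(). The two enums are stand-ins so the snippet compiles on its own, and g_use_static_ip is a hypothetical application flag; this is not necessarily the exact code in our project.

```c
#include <stdint.h>

/* Stand-ins mirroring the FreeRTOS+TCP declarations in FreeRTOS_DHCP.h so
 * this sketch compiles on its own. In a real project, include the
 * FreeRTOS+TCP headers instead and delete these two typedefs. */
typedef enum { eDHCPPhasePreDiscover, eDHCPPhasePreRequest } eDHCPCallbackPhase_t;
typedef enum { eDHCPContinue, eDHCPUseDefaults, eDHCPStopNoChanges } eDHCPCallbackAnswer_t;

/* Hypothetical application flag: non-zero selects the static configuration. */
int g_use_static_ip = 1;

/* Called by the IP task when ipconfigUSE_DHCP_HOOK is 1. */
eDHCPCallbackAnswer_t xApplicationDHCPHook(eDHCPCallbackPhase_t eDHCPPhase,
                                           uint32_t ulIPAddress)
{
    (void) ulIPAddress; /* Not needed for this decision. */

    if ((g_use_static_ip != 0) && (eDHCPPhase == eDHCPPhasePreDiscover))
    {
        /* Stop DHCP before the first discover packet is sent and fall back
         * to the defaults given to FreeRTOS_Init(). */
        return eDHCPUseDefaults;
    }

    /* Otherwise let the DHCP state machine run normally. */
    return eDHCPContinue;
}
```

With this shape, the only difference between the DHCP and static IP builds is the value of the flag, which makes it easier to compare the two start-up sequences.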
Most of the time the device works just fine with static IP.
It seems that this issue is related to some sort of memory violation, and the reason we’ve only seen it happen with static IP could very well be related to the timing differences in the initialization.
Here are some of our findings so far:
The crash doesn’t seem to occur when:
- Using DHCP
- Ethernet cable is removed
- Disabling FreeRTOS+TCP init altogether
- Disabling only FreeRTOS+TCP RX thread
When the crash occurs:
- The scheduler is in the process of switching tasks:
  - If LR points to an invalid memory area, PC points to PendSV_Handler, OR
  - If PC points to an invalid memory area, LR points to our custom xTaskCallApplicationTaskHookParam, while it is dereferencing pxCurrentTCB
- The task being switched out is the IP RX task (its implementation can be found in NetworkInterface.c)
So basically, the task control block of the task we’re trying to return to has been corrupted by something.
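The PC and LR values listed above are read from the exception frame that the core stacks on fault entry. A minimal sketch of capturing that frame is below, assuming the basic eight-word Cortex-M frame (no floating-point or additional security state). The names fault_frame_t, g_last_fault, fault_frame_capture and fault_used_psp are hypothetical, and the real HardFault handler would still need a small assembly wrapper that passes the active stack pointer to fault_frame_capture().

```c
#include <stdint.h>

/* Registers pushed by the core on exception entry (basic frame only). */
typedef struct
{
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;   /* Return address of the interrupted function. */
    uint32_t pc;   /* Instruction that was executing when the fault hit. */
    uint32_t xpsr;
} fault_frame_t;

/* Hypothetical global copy, kept so the values can be inspected in a
 * debugger or logged after the fault. */
volatile fault_frame_t g_last_fault;

/* Bit 2 of EXC_RETURN (the LR value inside the handler) selects the stack
 * that holds the frame: set means PSP, clear means MSP. */
int fault_used_psp(uint32_t exc_return)
{
    return ((exc_return & (1u << 2)) != 0u) ? 1 : 0;
}

/* Copy the stacked registers. 'stacked' must point at the frame, i.e. the
 * MSP or PSP value sampled at handler entry. */
void fault_frame_capture(const uint32_t *stacked)
{
    g_last_fault.r0   = stacked[0];
    g_last_fault.r1   = stacked[1];
    g_last_fault.r2   = stacked[2];
    g_last_fault.r3   = stacked[3];
    g_last_fault.r12  = stacked[4];
    g_last_fault.lr   = stacked[5];
    g_last_fault.pc   = stacked[6];
    g_last_fault.xpsr = stacked[7];
}
```

Since tasks run on the process stack, a fault taken during a context switch can leave the frame on either stack, so checking EXC_RETURN first avoids decoding the wrong one.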
We’ve also suspected that this issue might happen because of some network buffer mismanagement, but I haven’t been able to find any clear issues with the buffers and how they’re initialized. And it’s worthwhile to remember that most of the time everything works just fine, and we haven’t been able to reproduce this issue when DHCP is enabled.
There. Quite a long intro, and I have more data and logs regarding this issue, but perhaps this is enough for the first post. I’ll happily provide more info if needed.
Any suggestions on how to catch this issue?