FreeRTOS+TCP crashing on Cortex M4F and possible bug (MSP432E401Y)

I’m working with an MSP432E401Y processor development kit (MSP-EXP432E401Y) from Texas Instruments. The processor is an Arm Cortex M4F. The core FreeRTOSv202012.00 works well adapted from a TI sample project, and I’ve written an Ethernet driver for the board that can send and receive packets. The Ethernet interrupts work and I’ve verified with a debugger that none of the FreeRTOS Assert traps are causing entry into an infinite debugging loop. The processor speed is 120 MHz and is clocked from the default 25 MHz crystal. The compiler is the CCS TI compiler, not gcc.

However, once the prvIPTask is started, and a few seconds after the xSendEventStructToIPTask() is called, the processor crashes. After commenting out the xSendEventStructToIPTask(), the processor still continues to run. I’ve also set up another task to blink an LED and the LED stops blinking when the processor crashes. The debugger loses contact with the processor. I think that the stack size is adequate for this processor and to ensure that the task can run. Here are the defines being used. heap_4.c is being used along with BufferAllocation_2.c.

#define ipconfigIP_TASK_STACK_SIZE_WORDS ( configMINIMAL_STACK_SIZE * 100 )
#define configCPU_CLOCK_HZ ( ( unsigned long ) 120000000 )
#define configMINIMAL_STACK_SIZE ( ( unsigned short ) 256 )
#define configTOTAL_HEAP_SIZE ((size_t) 0x30000)

#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS 15

#define configUSE_PORT_OPTIMISED_TASK_SELECTION 0
#define configTICK_RATE_HZ ( ( TickType_t ) 1000 )
#define configUSE_PREEMPTION 1
#define configUSE_TIME_SLICING 0
#define configUSE_TICKLESS_IDLE 0

Note that to debug the problem, I’ve turned off tickless idle (since tickless idle never worked well with interrupts) and I’ve turned off configUSE_PORT_OPTIMISED_TASK_SELECTION to ensure that the C code is used rather than the optimized task selection code for the port. This does not change the crash behavior.

Switching to BufferAllocation_1.c, the crash occurs after roughly 30 seconds, leading me to believe that this is a memory issue.

In the vReleaseNetworkBufferAndDescriptor() function in the BufferAllocation_1.c file, I had to add the following line:

if(pxNetworkBuffer == NULL) return;

After adding this line, the crashes stop. However, the crashes still occur when using BufferAllocation_2.c and I am still uncertain what could cause this. I am thinking that this is an issue with a NULL pointer, and it could be compiler-related as well.

Hi – could you please attach your whole FreeRTOSConfig.h and the Ethernet driver files. As a new poster you can’t normally attach files but I just raised your level so hopefully you will be able to now.


Just comparing the TI and GCC ports, I see the TI port doesn’t have the same memory barriers and, in a couple of places, is missing “volatile” qualifiers, although I don’t suspect this to be the cause. Can you confirm you have tried with the compiler optimisation off?


Hi Richard, thanks for your response. Yes, I can confirm that the optimization level for the compiler is set to “off”.

Sure, I can upload my Ethernet driver file and the config files (see attached). The files that I’ve attached work well with the BufferAllocation_1.c scheme, but only when pxNetworkBuffer is checked for a null pointer as described above.

FreeRTOSIPConfig.h (16.2 KB)
ethernet.c (20.4 KB)
FreeRTOSConfig.h (8.2 KB)

Hello Mr Nikar, thanks for reporting this.
Through the years, more and more sanity checks have been added to the source code.
While developing FreeRTOS+TCP, BufferAllocation_1.c was mostly used. If you define ipconfigTCP_IP_SANITY as 1, most errors will be caught and reported in the logging.

Your report makes me think that buffer descriptors are not returned (released) in a proper way.
Where BufferAllocation_1.c always has a packet buffer of the maximum size, BufferAllocation_2.c mallocs the packet buffer with a size that is just enough.
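
To make that difference concrete, here is a minimal sketch in plain C. It is not the FreeRTOS+TCP source: MAX_FRAME_BYTES, NUM_BUFFERS, get_buffer_fixed() and get_buffer_exact() are hypothetical names chosen for illustration, and the sizes are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes, for illustration only. */
#define MAX_FRAME_BYTES  1536u
#define NUM_BUFFERS      4u

/* Scheme-1 style: storage is reserved up front and every slot has the
 * maximum size, whatever length the caller asks for. */
static uint8_t fixed_pool[ NUM_BUFFERS ][ MAX_FRAME_BYTES ];
static int     fixed_in_use[ NUM_BUFFERS ];

/* Returns a full-size slot, or NULL if the request is too large or the
 * pool is exhausted. */
uint8_t * get_buffer_fixed( size_t requested )
{
    if( requested > MAX_FRAME_BYTES )
    {
        return NULL;
    }

    for( size_t i = 0; i < NUM_BUFFERS; i++ )
    {
        if( fixed_in_use[ i ] == 0 )
        {
            fixed_in_use[ i ] = 1;
            return fixed_pool[ i ];
        }
    }

    return NULL;
}

/* Scheme-2 style: storage is allocated on demand and is only as large
 * as the requested length. */
uint8_t * get_buffer_exact( size_t requested )
{
    return ( uint8_t * ) malloc( requested );
}
```

In this sketch, a driver that asks for 64 bytes and then writes a longer frame stays inside its slot with get_buffer_fixed() but overruns the allocation from get_buffer_exact(). That may be one reason the same driver mistake shows different symptoms under the two schemes.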

So what if you use BufferAllocation_1.c and define ipconfigTCP_IP_SANITY? Could you catch any irregularities?


Hello Hein, thanks for your response. That is a good suggestion. I defined ipconfigTCP_IP_SANITY as 1 in the config file, and now I sporadically receive the following on the debug UART:

vReleaseNetworkBufferAndDescriptor: Invalid buffer 0

Adding back in the null pointer check at the beginning of the vReleaseNetworkBufferAndDescriptor() function fixes things and the error appears to go away.

if(pxNetworkBuffer == NULL) return;

I updated my Ethernet driver code to perform more null pointer checks and timeout checks, but I think that something goes wrong over time in the network stack. The null pointer check in the vReleaseNetworkBufferAndDescriptor() function might be good to add, since this function is also used in driver code.

ethernet.c (22.9 KB)

Once in a while I am also receiving some warnings

vDHCPProcess: reply c0a801aaip
Send failed during eSendDHCPRequest.
*** Warning *** only 2 buffers left

So I increased the number of network buffer descriptors to 30. Now the warnings no longer appear when an eSendDHCPRequest occurs. The default of 60 descriptors does not fit into available memory.

#define ipconfigNUM_NETWORK_BUFFER_DESCRIPTORS 30

To be honest, I doubt that this is caused by a bug in the network stack itself, especially with such fatal errors as releasing NULL (corrupted?) buffers.
Problems like this are mostly caused by (custom) Ethernet drivers, issues with the FreeRTOS+TCP configuration, or simply stack overflows.


Can you put a break point on the line that catches the null buffer and see the call stack to how you arrived at that point?


Hi Richard, thanks, I put a breakpoint in the following code and here is a screenshot of the callstack. The pxNetworkBuffer is null.


I am also seeing “vReleaseNetworkBufferAndDescriptor: Invalid buffer 0” as expected from the code.

More output with send failed during eSendDHCPRequest (and this will sometimes continue until a hard reset).

Tracing the error back further, it appears that the pointer gets set to null in the prvPacketBuffer_to_NetworkBuffer() function in FreeRTOS_IP.c, where pxResult = NULL because a check in this function that the pointer is “well aligned” fails.
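
For readers following along, this is roughly the kind of check involved, as I understand it: the stack keeps a pointer back to the owning descriptor just in front of the packet bytes, and refuses to read it if the computed address is not pointer-aligned. The sketch below is a simplified stand-in, not the FreeRTOS_IP.c code. Descriptor, BACK_POINTER_BYTES, attach_storage() and payload_to_descriptor() are hypothetical names, and the real padding size differs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the real descriptor type. */
typedef struct Descriptor
{
    uint8_t * payload;   /* points at the packet bytes */
    size_t    length;
} Descriptor;

/* Assumed storage layout: [ back-pointer | packet bytes ... ] */
#define BACK_POINTER_BYTES  sizeof( Descriptor * )

/* Attach pointer-aligned storage to a descriptor and record, in front of
 * the packet bytes, which descriptor owns the storage. */
void attach_storage( Descriptor * d, void * aligned_storage )
{
    memcpy( aligned_storage, &d, sizeof( d ) );
    d->payload = ( uint8_t * ) aligned_storage + BACK_POINTER_BYTES;
    d->length = 0;
}

/* Recover the owning descriptor from a packet pointer. Returns NULL for a
 * NULL pointer, or when the computed address is not pointer-aligned, which
 * suggests the pointer did not come from attach_storage(). */
Descriptor * payload_to_descriptor( const uint8_t * payload )
{
    Descriptor * result = NULL;

    if( payload != NULL )
    {
        uintptr_t address = ( uintptr_t ) payload - BACK_POINTER_BYTES;

        if( ( address % sizeof( Descriptor * ) ) == 0u )
        {
            memcpy( &result, ( const void * ) address, sizeof( result ) );
        }
    }

    return result;
}
```

Under these assumptions, a packet pointer that was not handed out by the stack (for example, one pointing into the driver’s own receive memory) either fails the alignment test and yields NULL, or passes it by chance and yields whatever bytes happen to sit in front of it.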

Thanks, Richard, Hein, and Hartmut so much for your help!!

OK, after more debugging, it appears that the error is due to the driver. Starting with a simple copy-only driver, the issue was that the received frame is not being copied into

pxDescriptor->pucEthernetBuffer

The pxGetNetworkBufferWithDescriptor() function in the driver allocates memory and the driver was not copying the data into the pucEthernetBuffer and only a pointer was allocated. After copying the data directly into the pucEthernetBuffer, the crashes and the invalid buffer problem went away. The length of the data passed into the stack does not include the length of the CRC.
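
For anyone hitting the same problem, the fix amounts to something like the sketch below. It is a simplified illustration, not the attached driver: RxDescriptor is a hypothetical stand-in with only the two fields used here, copy_frame_to_descriptor() is a made-up helper, and it assumes the hardware reports a frame length that includes the 4-byte CRC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_CRC_BYTES  4u   /* length of the Ethernet frame check sequence */

/* Hypothetical stand-in for the stack's descriptor type. */
typedef struct
{
    uint8_t * pucEthernetBuffer;   /* storage allocated by the stack */
    size_t    xDataLength;         /* number of valid bytes in it */
} RxDescriptor;

/* Copy a received frame from the driver's receive memory into the buffer
 * the stack allocated, dropping the trailing CRC. Returns 1 on success and
 * 0 if the frame is rejected. */
int copy_frame_to_descriptor( RxDescriptor * desc,
                              size_t capacity,
                              const uint8_t * dma_frame,
                              size_t dma_length_with_crc )
{
    if( ( desc == NULL ) || ( desc->pucEthernetBuffer == NULL ) || ( dma_frame == NULL ) )
    {
        return 0;
    }

    if( dma_length_with_crc <= ETH_CRC_BYTES )
    {
        return 0;   /* nothing left once the CRC is removed */
    }

    size_t payload_length = dma_length_with_crc - ETH_CRC_BYTES;

    if( payload_length > capacity )
    {
        return 0;   /* would overrun the allocated buffer */
    }

    /* Copy the bytes; do not repoint pucEthernetBuffer at driver memory. */
    memcpy( desc->pucEthernetBuffer, dma_frame, payload_length );
    desc->xDataLength = payload_length;
    return 1;
}
```

The important part is that pucEthernetBuffer keeps pointing at the storage the stack handed out, so the stack can later find and release the descriptor that owns it.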

I’ve attached the driver code for reference, also indicating detection for link up or down. I can confirm that the code works with BufferAllocation_1.c and BufferAllocation_2.c with no crashes.

Again, thanks so much for your help!! I really appreciate this.

ethernet.c (24.7 KB)


Great that you solved the issue, and thanks for reporting back!


One could argue that a function that releases memory should test for NULL. The standard free() routine also accepts a NULL pointer, and the C++ delete operator doesn’t complain either.

That would be an easy change in BufferAllocation_1.c, but I’m afraid it would hide a different problem in the code.

So I would rather propose to add an assert:

	configASSERT( pxNetworkBuffer != NULL );

which at least makes debugging easier.
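
A small sketch of the trade-off, using stand-ins rather than the real code: Buffer, release_silent(), release_checked() and SKETCH_ASSERT are hypothetical, and SKETCH_ASSERT only counts failures so the example can run on a host, whereas configASSERT() in a real project is whatever FreeRTOSConfig.h defines and usually halts.

```c
#include <stddef.h>

/* Hypothetical assert that counts failures instead of halting. */
static unsigned assert_failures = 0;
#define SKETCH_ASSERT( x )  do { if( !( x ) ) { assert_failures++; } } while( 0 )

/* Hypothetical stand-in for a network buffer descriptor. */
typedef struct
{
    int in_use;
} Buffer;

/* Variant A: a NULL pointer is silently ignored, so the caller's bug
 * leaves no trace. */
void release_silent( Buffer * b )
{
    if( b == NULL )
    {
        return;
    }

    b->in_use = 0;
}

/* Variant B: a NULL pointer is reported first, then ignored, so the
 * caller's bug becomes visible during development. */
void release_checked( Buffer * b )
{
    SKETCH_ASSERT( b != NULL );

    if( b != NULL )
    {
        b->in_use = 0;
    }
}

/* Number of failed checks so far. */
unsigned get_assert_failures( void )
{
    return assert_failures;
}
```

Both variants survive a NULL argument, but only the second one leaves evidence that a caller passed it.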


Would appreciate it if you could create a pull request to upstream your driver to https://github.com/FreeRTOS/FreeRTOS-Plus-TCP/tree/main/portable/NetworkInterface/ThirdParty/MSP432 - at the time of writing the ThirdParty directory doesn’t exist.


Hein, I second that as a good idea! The assert would definitely help with driver debugging.

Thanks, Richard - that sounds good. I will prepare the driver code and then create a pull request.

Here is a pull request for the driver. My apologies for the delay: I’ve now extensively tested the driver over the past few weeks and it appears to work well. Some changes were also made to the code to ensure robustness. Thanks again for everyone’s help.