xQueueGiveFromISR Assertion

justinh109 wrote on Saturday, September 07, 2019:

Hello all,

I am new to the forum, and fairly new to FreeRTOS as well. I am working on a project that uses FreeRTOS 8.2.3 with heap_4 paired with the lwIP socket API, and I have run into an issue that has completely stumped me.

I’m running a simple TCP echo test where I issue the processor a prompt and it generates a response, in this case a 251-byte message, and sends it back to my Python server. This test runs well for a while, but then it fails!
The GMAC_Handler calls xQueueGiveFromISR, and after a few hours this function throws an assert. If I disable the assert, a hard fault results. I also noticed that I can force the error sooner if I spam UDP packets at the device.

/* xQueueGenericSendFromISR() should be used instead of xQueueGiveFromISR()
if the item size is not 0. */
configASSERT( pxQueue->uxItemSize == 0 );

I’ve tried using xQueueGenericSendFromISR instead of xQueueGiveFromISR, but this also resulted in a hard fault. The two things I think I am doing correctly are heap allocation and IRQ settings. I’ve made sure all the tasks have a high water mark over 1000, and I double-checked my IRQ priorities:

#define configPRIO_BITS __NVIC_PRIO_BITS
#define configLIBRARY_LOWEST_INTERRUPT_PRIORITY 0x07
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 4
#define configKERNEL_INTERRUPT_PRIORITY (configLIBRARY_LOWEST_INTERRUPT_PRIORITY << (8 - configPRIO_BITS))
#define configMAX_SYSCALL_INTERRUPT_PRIORITY (configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - configPRIO_BITS))
#define GMACIRQNPRIO configMAX_SYSCALL_INTERRUPT_PRIORITY
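For anyone checking similar settings, here is a minimal sketch of how these priority values end up in the NVIC. It assumes 3 implemented priority bits (which I believe is what __NVIC_PRIO_BITS is on the SAM E70), and it assumes the GMAC priority is applied with the CMSIS NVIC_SetPriority() call, which shifts its argument itself. The post does not show how GMACIRQNPRIO is actually used, so treat this as something to verify rather than a diagnosis. The helper names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 3 implemented priority bits, as on the SAM E70. */
#define PRIO_BITS 3u

/* Value from the post (configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY). */
#define LIBRARY_MAX_SYSCALL_PRIORITY 4u

/* Mirrors the shift that CMSIS NVIC_SetPriority() applies before writing
   the 8-bit priority register: it expects the UNSHIFTED priority. */
static uint8_t nvic_register_value(uint32_t priority_argument)
{
    return (uint8_t)((priority_argument << (8u - PRIO_BITS)) & 0xFFu);
}

/* An ISR may call FreeRTOS ...FromISR() functions only if its priority is
   numerically equal to or greater than the max syscall priority
   (numerically greater means logically lower on Cortex-M). */
static bool from_isr_api_allowed(uint8_t register_value)
{
    return register_value >= nvic_register_value(LIBRARY_MAX_SYSCALL_PRIORITY);
}
```

With these numbers, nvic_register_value(4) gives 0x80, which is allowed to use the FromISR API. Passing the already shifted 0x80 shifts it a second time and gives 0x00, the highest possible priority, which is not allowed. If the shifted macro is passed to NVIC_SetPriority(), the unshifted configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY is probably what was intended.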

The code that uses the semaphore is below. For the sake of space I attached a snippet of the tcpip thread; in addition, I am running a thread to manage the socket connection.

// GMAC task
static void gmac_task(void *pvParameters)
{
    struct gmac_device *ps_gmac_dev = pvParameters;

    while (1) {
        // Get the high water mark for this task.
        rtosStats.gmacHighWaterMark = uxTaskGetStackHighWaterMark( NULL );
        // Wait for the RX notification semaphore.
        sys_arch_sem_wait(&ps_gmac_dev->rx_sem, 0);
        /* Process the incoming packet. */
        ethernetif_input(ps_gmac_dev->netif);
    }
}

Unfortunately I’m all alone on this project and I’ve run out of ideas here. I’m totally stumped!
Why is pxQueue->uxItemSize zero only some of the time, and how do I prevent this?
Any suggestions or insights would be tremendously appreciated.

In case it is relevant, I’m using an ATSAME70N19 with Atmel Studio and ASF.
Thanks in advance!

justinh109 wrote on Sunday, September 08, 2019:

Update:
After reworking some of the drivers I found an error in the ethernet_phy config for the Atmel GMAC drivers. The PHY IC indicated it was operating in full duplex, but the drivers configured the PHY for half duplex. It is amazing this worked at all.

Now the test will run for about an hour before the following assert triggers in the GMAC_Handler when xQueueGiveFromISR is called (see handler in previous post for call):

BaseType_t xQueueGiveFromISR( QueueHandle_t xQueue, BaseType_t * const pxHigherPriorityTaskWoken )
{
    configASSERT( pxQueue ); /* <==== This is triggering!!! */

}

Also in my GMAC status registers I am seeing the following errors.

TX errors:
HRESP - HRESP Not OK: set when the DMA block sees HRESP not OK.
TFC - Transmit Frame Corruption Due to AHB Error

RX errors:
BNA - Buffer Not Available

Does anybody know how the binary semaphore here could be set to NULL? Is there any correlation to the GMAC errors?

rtel wrote on Sunday, September 08, 2019:

First, from your first post: the call stack image shows a dummy handler being called. In all probability that is because an interrupt executed for which no handler has been installed. I would recommend working out which interrupt that is, as it could be the root cause of all the issues. See the section “Determining Which Exception Handler is Executing” on this page: https://www.freertos.org/Debugging-Hard-Faults-On-Cortex-M-Microcontrollers.html

Regarding the interrupt priority configuration from your first post - I would recommend you update the source files to the latest V10.2.1 code, as that contains many more configASSERT() statements to try and catch misconfigurations. I think you should find it's a drop-in replacement provided you are not using mutexes from interrupts (using binary semaphores from interrupts is fine).

Ref the binary semaphore being NULL on entry to xQueueGiveFromISR() - I expect that will be a symptom of an error that has already occurred. What is the call stack when that happens? The call stack may give you a clue, but if the root cause is a memory corruption then it will be harder to track down.

justinh109 wrote on Thursday, September 12, 2019:

Thanks for the tips,

It turns out this problem was a symptom of bad GMAC error handling. In my case the BNA bit in my GMAC receive status register was being set and I did not have a handler in place for this! This is a serious low-level error because it breaks the data path between the GMAC and my program memory, and I suspect it was also introducing all sorts of chaos into my GMAC interrupt. I was able to fix this in gmac_task by flushing the RX buffers when this error occurs, and my problem went away.
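To make the idea concrete, here is a rough, hardware-independent sketch of that recovery step. It is not the actual fix from the project: the struct, the function name and the ring size are invented for illustration, the register is modelled as a plain variable, and the BNA bit position (bit 0) is an assumption that should be checked against the datasheet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RSR_BNA     (1u << 0)  /* assumption: Buffer Not Available flag position */
#define RX_DESC_CNT 4u         /* hypothetical ring size */

/* Hypothetical, simplified model of the receive side of a GMAC driver. */
struct rx_ring {
    uint32_t rsr;                       /* stand-in for the receive status register */
    bool     owned_by_sw[RX_DESC_CNT];  /* true: buffer is full and the MAC cannot reuse it */
    size_t   next;                      /* next descriptor the driver will read */
};

/* If BNA is set, hand every buffer back to the MAC, restart at descriptor 0
   and clear the flag. Returns true when a flush was performed. */
static bool rx_recover_from_bna(struct rx_ring *ring)
{
    if ((ring->rsr & RSR_BNA) == 0u) {
        return false;
    }
    for (size_t i = 0; i < RX_DESC_CNT; i++) {
        ring->owned_by_sw[i] = false;   /* any pending frames are dropped */
    }
    ring->next = 0;
    /* Status flags like this are typically write-one-to-clear on real
       hardware; this model simply clears the bit. */
    ring->rsr &= ~RSR_BNA;
    return true;
}
```

In a real driver the receive task would run a check like this each time it wakes, before processing frames. It would probably also need to restart reception so that the MAC and the driver agree on the current descriptor.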

As far as the Dummy_Handler goes, ARM devices, or at least the Cortex-M7, allow you to read the IPSR register to determine why the dummy handler was called. In my case it was a hard fault. Here is the handler in case anyone finds it useful in the future:

/* IF YOU GET HERE . . .

	CHECK YOUR VTOR REGISTER in project properties -> tools
	
	ISR_NUMBER
	This is the number of the current exception:
	0 = Thread mode
	1 = Reserved
	2 = NMI
	3 = Hard fault
	4 = Memory management fault
	5 = Bus fault
	6 = Usage fault
	7-10 = Reserved
	11 = SVCall
	12 = Reserved for Debug
	13 = Reserved
	14 = PendSV
	15 = SysTick
	16 = IRQ0
	45 = IRQ29

*/
U32 ISPR_ERR = 9999;

void Dummy_Handler(void)
{
    // CHECK YOUR VTOR REGISTER in project properties -> tools
    ISPR_ERR = __get_IPSR();
    if (ISPR_ERR == 3) {
        // Exception number 3 = hard fault.
        delay_ms(1);
    }
    while (1) {
        delay_ms(1);
    }
}
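If it helps, the table in the comment above can be turned into a small lookup so the value is readable in a debugger watch window or a log. This is only a sketch: exception_name() and irq_number() are made-up helper names, and they take the value returned by __get_IPSR() as a plain integer so they can be tried on a PC as well.

```c
#include <stdint.h>

/* Translate an IPSR value into a readable exception name.
   The exception number is held in the low 9 bits of IPSR. */
static const char *exception_name(uint32_t ipsr)
{
    uint32_t n = ipsr & 0x1FFu;

    switch (n) {
    case 0:  return "Thread mode";
    case 2:  return "NMI";
    case 3:  return "HardFault";
    case 4:  return "MemManage";
    case 5:  return "BusFault";
    case 6:  return "UsageFault";
    case 11: return "SVCall";
    case 12: return "DebugMonitor";
    case 14: return "PendSV";
    case 15: return "SysTick";
    default: return (n >= 16u) ? "External IRQ" : "Reserved";
    }
}

/* For external interrupts, return the IRQ number (exception number - 16),
   or -1 if the value is not an external interrupt. */
static int32_t irq_number(uint32_t ipsr)
{
    uint32_t n = ipsr & 0x1FFu;
    return (n >= 16u) ? (int32_t)n - 16 : -1;
}
```

For example, exception_name(3) returns "HardFault" and irq_number(45) returns 29, matching the last entry of the table. The IRQ number can then be looked up in the device header to find which peripheral has no handler installed.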