HardFault randomly in xEventGroupWaitBits

Hi there!
Since I started using the Cellular Interface library, I randomly get a HardFault, always at the same place (xEventGroupWaitBits), deep in the call chain from Cellular_Init().

I think the randomness is due to timing, but I’m not sure…

I tried to investigate, and I found that the fault happens when executing
listINSERT_END( pxEventList, &( pxCurrentTCB->xEventListItem ) );
inside the function vTaskPlaceOnUnorderedEventList.

Looking at the variables, I think the problem is that pxEventList->pxIndex is NULL?
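To illustrate why a NULL pxIndex would fault at exactly that line, here is a minimal host-side sketch of how a FreeRTOS-style list behaves. The types and functions (DemoList_t, demoListInitialise, demoListInsertEnd) are hypothetical stand-ins written for this example, not the kernel's list.h. The assumption is that, as in the kernel, initialisation points pxIndex at the list's own end marker, so a NULL pxIndex suggests the list was never initialised or was overwritten afterwards.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's list types (hypothetical names). */
typedef struct DemoItem
{
    struct DemoItem * pxNext;
    struct DemoItem * pxPrevious;
} DemoItem_t;

typedef struct DemoList
{
    size_t uxNumberOfItems;
    DemoItem_t * pxIndex;   /* Walks the list; not NULL once initialised. */
    DemoItem_t xListEnd;    /* Sentinel that closes the circular list. */
} DemoList_t;

/* Initialisation makes the index point at the sentinel, which points at itself. */
static void demoListInitialise( DemoList_t * pxList )
{
    pxList->pxIndex = &( pxList->xListEnd );
    pxList->xListEnd.pxNext = &( pxList->xListEnd );
    pxList->xListEnd.pxPrevious = &( pxList->xListEnd );
    pxList->uxNumberOfItems = 0;
}

/* Insert just before pxIndex, in the spirit of listINSERT_END.
 * Returns 0 instead of faulting when pxIndex is NULL; the real macro has
 * no such check and would dereference the NULL pointer. */
static int demoListInsertEnd( DemoList_t * pxList, DemoItem_t * pxNewItem )
{
    DemoItem_t * const pxIndex = pxList->pxIndex;

    if( pxIndex == NULL )
    {
        return 0;
    }

    pxNewItem->pxNext = pxIndex;
    pxNewItem->pxPrevious = pxIndex->pxPrevious;
    pxIndex->pxPrevious->pxNext = pxNewItem;
    pxIndex->pxPrevious = pxNewItem;
    pxList->uxNumberOfItems++;
    return 1;
}
```

In this model a zero-filled list (for example, memory that was cleared or never passed through the initialiser) is the only way to reach the NULL branch, which is why the state of the event group right after creation is worth inspecting.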

The disassembly view stops here, but no source is shown in detail because of the macro.

So I do not understand what could be wrong… most of the time, it works like a charm…

PS: Sorry for the multiple posts, but I’m not able to put 3 screenshots in 1 post… (bad rule)

Memory corruption due to stack overflow ? Did you enable stack checking and define configASSERT ?

I don’t think so, this happens very early after reset.
I added the stack overflow hook function, but I need to check whether I enabled stack checking.
Yes, configASSERT is defined.

How do I enable stack checking? I can’t find it.
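For reference, stack overflow checking is switched on with a single setting in FreeRTOSConfig.h. The fragment below is a sketch: the value 2 selects the method that also checks a fill pattern at the end of the stack, while 1 only checks the stack pointer at context switch. Which method suits this project is an assumption.

```c
/* FreeRTOSConfig.h fragment: enable stack overflow checking (method 2). */
#define configCHECK_FOR_STACK_OVERFLOW    2
```

With this enabled, the application has to provide vApplicationStackOverflowHook(), which the kernel calls when an overflow is detected. Note that the check runs at context-switch time, so a corruption that faults before the next switch will not reach the hook.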

:+1: thank you

I just added it to the config.
Compiled, ran, and the random crash showed up right away!
But the code still halts at the same place…
No ASSERT, no call to vApplicationStackOverflowHook…

When running, the task that crashes has about 30% free stack (initial size 4 kB).

Strange … if it’s a Cortex-M MCU a wrong interrupt priority might be a source of corruptions, too.
Maybe see Running the RTOS on a ARM Cortex-M Core - FreeRTOS™
Or just a normal programming bug :wink:

Yes, it is a Cortex-M33. I already read the article, and configASSERT is defined, so if I understand correctly, the kernel should catch a misconfigured NVIC…

I have no doubt the problem comes from a mistake somewhere, but because it does not happen every time, it is difficult to find…

Yes - recent FreeRTOS versions catch many common problems including interrupt priority issues.
Those bugs can be nasty … good luck :+1:

I think you’re right to point at an interrupt misconfiguration, because the crash comes a few lines after enabling the interrupt (while waiting for it to fire).
I reread the article…
It is hard to understand fully…

Could you help me check whether my interrupt config is good?

Or do you enable interrupts before initializing the resources used in an ISR?
E.g. a queue might not have been created yet, but the ISR tries to push something into it?
As mentioned, wrong interrupt priorities are caught by configASSERT in recent FreeRTOS versions…

Can you first remove this interrupt code and see if the crash goes away? This page, which @hs2 already shared, provides good details about interrupt priority. What is the priority of the interrupt that you enable? Also, can you share your FreeRTOSConfig.h?

@hs2, I only use the event group inside the ISR:

void _HALUART_TxCpltCallback( UART_HandleTypeDef * hUart )
{
	BaseType_t xHigherPriorityTaskWoken = pdFALSE, xResult = pdPASS;
	CellularCommInterfaceContext * pIotCommIntfCtx = & _iotCommIntfCtx;

	if( hUart != NULL )
	{
		xResult = xEventGroupSetBitsFromISR( pIotCommIntfCtx->pEventGroup,
											 COMM_EVT_MASK_TX_DONE,
											 & xHigherPriorityTaskWoken );
		if( xResult == pdPASS )
		{
			portYIELD_FROM_ISR( xHigherPriorityTaskWoken );
		}
	}
}

The event group is created earlier, before the first transmission (and before the UART is initialized).
The interrupt is enabled after the UART is initialized.

    /* Setup the event group. */
    if( ret == IOT_COMM_INTERFACE_SUCCESS )
    {
        pIotCommIntfCtx->pEventGroup = xEventGroupCreate();
        if( pIotCommIntfCtx->pEventGroup == NULL )
        {
            LogError(( "EventGroup create failed" ));
            helpers_RingBufferFree(pIotCommIntfCtx->s_RxBuffer);
            CellularDevice_PowerOff();
            ret = IOT_COMM_INTERFACE_NO_MEMORY;
        }
    }

    /* Setup the phy device. */
    if( ret == IOT_COMM_INTERFACE_SUCCESS )
    {
        pIotCommIntfCtx->pPhy = & _t_CommIntfPhy;
        ( void ) memset( pIotCommIntfCtx->pPhy, 0, sizeof( UART_HandleTypeDef ) );
        if( prvCellularUartInit( pIotCommIntfCtx->pPhy ) != HAL_OK )
        {
            LogError(( "UART init failed" ));
            vEventGroupDelete( pIotCommIntfCtx->pEventGroup );
            helpers_RingBufferFree(pIotCommIntfCtx->s_RxBuffer);
            CellularDevice_PowerOff();
            ret = IOT_COMM_INTERFACE_DRIVER_ERROR;
        }
    }

static void _HALUART_MspInitCallback(UART_HandleTypeDef *hUart)
{
	RCC_PeriphCLKInitTypeDef PeriphClkInit = { 0 };
	HAL_StatusTypeDef xHalStatus = HAL_OK;

	if( hUart != NULL )
	{
		PeriphClkInit.PeriphClockSelection = RCC_PERIPHCLK_USART6;
		PeriphClkInit.Usart6ClockSelection = RCC_USART6CLKSOURCE_SYSCLK;
		xHalStatus = HAL_RCCEx_PeriphCLKConfig( &PeriphClkInit );

		if( xHalStatus != HAL_OK )
		{
			LogError(( "Error while configuring peripheral clock for UART6." ));
		}

		/* Peripheral clock enable */
		__HAL_RCC_USART6_CLK_ENABLE();

		__HAL_RCC_GPIOC_CLK_ENABLE();
		__HAL_RCC_GPIOE_CLK_ENABLE();

		/* UART interrupt init */
		HAL_NVIC_SetPriority (USART6_IRQn, 5, 0);
		HAL_NVIC_EnableIRQ(USART6_IRQn);
	}
}

@aggarg, I know that if I do not call the cellular interface setup (so none of these functions are called), everything seems to work fine…
Because of this, and because the crash always happens at the same line, I’m pretty sure it is linked to the interrupt…
But I also use interrupts elsewhere in the code…

Here is my config file.
FreeRTOSConfig.h (8.5 KB)

The MCU is the STM32U5A5, and ST provides __NVIC_PRIO_BITS set to 4 bits.

configKERNEL_INTERRUPT_PRIORITY = 0xf0 (computed value)
configMAX_SYSCALL_INTERRUPT_PRIORITY = 0x50 (computed value)

Honestly, I do not remember where I got this config file (maybe from an ST example).
configLIBRARY_LOWEST_INTERRUPT_PRIORITY and configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY are not used anywhere :man_shrugging:
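The two computed values quoted above are consistent with the usual pattern in which the configLIBRARY_* numbers are shifted into the top bits of the 8-bit NVIC priority field. The sketch below reproduces that arithmetic on a host machine. The DEMO_* names and demoToRawPriority are hypothetical, and the library values 15 and 5 are assumptions inferred from 0xf0 and 0x50 with 4 priority bits, since the config file itself is not quoted in the thread.

```c
#include <stdint.h>

#define DEMO_PRIO_BITS                       4U   /* __NVIC_PRIO_BITS, per the post */
#define DEMO_LIBRARY_LOWEST_PRIORITY         15U  /* assumed configLIBRARY_LOWEST_INTERRUPT_PRIORITY */
#define DEMO_LIBRARY_MAX_SYSCALL_PRIORITY    5U   /* assumed configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY */

/* Shift a library-level priority (0..15 here) into the implemented top bits
 * of the 8-bit hardware priority field. */
static uint8_t demoToRawPriority( uint8_t libraryPriority )
{
    return ( uint8_t ) ( libraryPriority << ( 8U - DEMO_PRIO_BITS ) );
}
```

Under these assumptions, 15 maps to 0xF0 (configKERNEL_INTERRUPT_PRIORITY) and 5 maps to 0x50 (configMAX_SYSCALL_INTERRUPT_PRIORITY). If that is how the config file is written, the configLIBRARY_* macros are used after all, but only inside these two definitions.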

the ST HAL global init:

  /* PendSV_IRQn interrupt configuration */
  HAL_NVIC_SetPriority(PendSV_IRQn, 15, 0);

the ST HAL Tick timer IRQ setup:

        HAL_NVIC_SetPriority(TIM6_IRQn, 0U, 0U);

the USB IRQ setup:

  HAL_NVIC_SetPriority(OTG_HS_IRQn, 6, 0);

the 2 OSPI IRQ setup:

    HAL_NVIC_SetPriority( OCTOSPI1_IRQn, 5, 0 );
    HAL_NVIC_SetPriority( OCTOSPI2_IRQn, 5, 0 );

an external IRQ setup: (for Ethernet controller)

HAL_NVIC_SetPriority( EXTI2_IRQn, 5, 0 );

and the last one, for the cellular UART: (the one where the crash seems to happen)

HAL_NVIC_SetPriority (USART6_IRQn, 5, 0);

:scream: Yes, clearly the IRQ priorities were set, hum… using copy-paste… (except for USB). At this stage it is a simple prototype, so it is good enough…

I have only seen the crash with the code linked to the cellular modem, a few lines after the IRQ is used:

            err = HAL_UART_Transmit_IT( pIotCommIntfCtx->pPhy, ( uint8_t * ) pData, ( uint16_t ) dataLength );
            if( err != HAL_BUSY )
            {
// The crash happens randomly, always at the next line, a few (milli)seconds after reset.
// The code initializes all peripherals and then starts the cellular interface, all inside a startup task.
                uxBits = xEventGroupWaitBits( pIotCommIntfCtx->pEventGroup,
                                              COMM_EVT_MASK_TX_DONE |
                                              COMM_EVT_MASK_TX_ERROR |
                                              COMM_EVT_MASK_TX_ABORTED,
                                              pdTRUE,
                                              pdFALSE,
                                              pdMS_TO_TICKS( timeoutMilliseconds ) );
                if( ( uxBits & ( COMM_EVT_MASK_TX_ERROR | COMM_EVT_MASK_TX_ABORTED ) ) != 0U )
                {

Thanks for your help!

Is TIM6 used as the FreeRTOS tick (instead of SysTick, which the STM HAL often uses for other purposes)? NVIC prio 0 could be a problem b/c it’s the highest prio possible. The FreeRTOS tick is usually set to the lowest prio.
In your case that is prio 15.

Have you set INCLUDE_xEventGroupSetBitFromISR in your FreeRTOSConfig.h? The one you’ve posted does not include this required config option.

What does your EventGroup xTasksWaitingForBits field look like after creation?

@kstribrn I searched for INCLUDE_xEventGroupSetBitFromISR in the project, but I found no reference to it… (not even in the FreeRTOS source…)
:roll_eyes:

Regarding the event group, I think everything is OK.

@hs2
TIM6 is the HAL timer.
FreeRTOS uses SysTick.

In addition, in CMSIS_OS2, I found

  NVIC_SetPriority (SVCall_IRQn, 0U);

So, to summarize, I have some IRQs set to priorities 0, 5 & 6
configKERNEL_INTERRUPT_PRIORITY is set to 15
configMAX_SYSCALL_INTERRUPT_PRIORITY is set to 5 (I don’t know why)

Does this mean my IRQ with priority 6 could be an issue?
Or could the IRQs with priority 5 be a problem?
Or am I misunderstanding something…

Do I need to change the TIM6 priority?

I have not found the priority of SysTick.

NVIC interrupt priorities 5…15 (a higher value means a lower prio on Cortex-M) are covered by FreeRTOS, which means you can use the interrupt-safe (FromISR) FreeRTOS API calls in the corresponding ISRs.
The TIM6 interrupt handler should not use the FreeRTOS API, and probably doesn’t.
If you’re using the default SysTick for the FreeRTOS tick you don’t need to care. FreeRTOS handles this for you (and sets the prio to configKERNEL_INTERRUPT_PRIORITY).
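The rule above can be written as a small check. This is only a sketch: demoMayCallFromIsrApi and the DEMO_* constants are hypothetical names, the value 5 is the library-level max-syscall priority reported in this thread, and it assumes all four priority bits are used as preemption priority (NVIC_PRIORITYGROUP_4).

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_SYSCALL_PRIORITY    5U   /* library-level value from the thread */
#define DEMO_LOWEST_PRIORITY         15U  /* 4 priority bits give the range 0..15 */

/* Returns true when an ISR running at the given NVIC preemption priority is
 * inside the range in which FromISR API calls are permitted. Numerically
 * lower values are logically higher priorities on Cortex-M. */
static bool demoMayCallFromIsrApi( uint32_t nvicPriority )
{
    return ( nvicPriority >= DEMO_MAX_SYSCALL_PRIORITY ) &&
           ( nvicPriority <= DEMO_LOWEST_PRIORITY );
}
```

Applied to the priorities listed earlier in the thread, USART6, OCTOSPI1/2 and EXTI2 (5) and OTG_HS (6) fall inside the permitted range, while TIM6 and SVCall (0) fall outside it. That is fine for TIM6 as long as its handler makes no FreeRTOS API calls.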
See also this posting for a pretty good explanation: