Context Switching Hangs - Issue

Hi everybody,
I’m using FreeRTOS V10.2.1 on an STM32WL microcontroller running at 48 MHz (ARM Cortex-M4) with the GCC/ARM_CM3 port, as packaged by ST.
My application consists of several tasks with different priorities; some of them wait for a direct notification sent by other tasks, one task waits for a queued element, and others are notified from ISRs.
After a non-deterministic amount of time, context switching stops. Because an internal watchdog is refreshed periodically by a low-priority task, this problem results in a reset. I can also reproduce the problem by replacing the watchdog with a software timer, which avoids the reset and lets me break in and get some information about the running state.

The stack size of each task is sufficient: it is set to twice the usage reported by the RTOS analyzer during debug sessions. I do not have access to Tracealyzer; I just use the free rtos-views plugin for VS Code.

All enabled interrupts external to FreeRTOS have the same priority, equal to LIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY.

I have the feeling that the problem comes from a task notification sent inside a critical section (not the FreeRTOS one, but the one that “Disables IRQ interrupts by setting the I-bit in the CPSR”) that is executed inside an ISR running at LIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY.

Also, different optimization levels increase or decrease how often the problem occurs at run time.

Here is the FreeRTOS config file:

/* USER CODE BEGIN Header */
/*
 * FreeRTOS Kernel V10.2.1
 * Portion Copyright (C) 2017 Amazon.com, Inc. or its affiliates.  All Rights Reserved.
 * Portion Copyright (C) 2019 StMicroelectronics, Inc.  All Rights Reserved.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy of
 * this software and associated documentation files (the "Software"), to deal in
 * the Software without restriction, including without limitation the rights to
 * use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
 * the Software, and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
 * FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
 * COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
 * IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 * http://www.FreeRTOS.org
 * http://aws.amazon.com/freertos
 *
 * 1 tab == 4 spaces!
 */
/* USER CODE END Header */

#ifndef FREERTOS_CONFIG_H
#define FREERTOS_CONFIG_H

/*-----------------------------------------------------------
 * Application specific definitions.
 *
 * These definitions should be adjusted for your particular hardware and
 * application requirements.
 *
 * These parameters and more are described within the 'configuration' section of the
 * FreeRTOS API documentation available on the FreeRTOS.org web site.
 *
 * See http://www.freertos.org/a00110.html
 *----------------------------------------------------------*/

/* USER CODE BEGIN Includes */
/* Section where include file can be added */
/* USER CODE END Includes */

/* Ensure definitions are only used by the compiler, and not by the assembler. */
#if defined(__ICCARM__) || defined(__CC_ARM) || defined(__GNUC__)
  #include <stdint.h>
  extern uint32_t SystemCoreClock;
#endif
#define configENABLE_FPU                         0
#define configENABLE_MPU                         0

#define configUSE_PREEMPTION                     1
#define configSUPPORT_STATIC_ALLOCATION          1
#define configSUPPORT_DYNAMIC_ALLOCATION         1
#define configUSE_IDLE_HOOK                      1
#define configUSE_TICK_HOOK                      0
#define configCPU_CLOCK_HZ                       ( SystemCoreClock )
#define configTICK_RATE_HZ                       ((TickType_t)1000)
#define configMAX_PRIORITIES                     ( 56 )
#define configMINIMAL_STACK_SIZE                 ((uint16_t)256)
#define configTOTAL_HEAP_SIZE                    ((size_t)32768)
#define configMAX_TASK_NAME_LEN                  ( 16 )
#define configUSE_TRACE_FACILITY                 1
#define configUSE_16_BIT_TICKS                   0
#define configUSE_MUTEXES                        1
#define configQUEUE_REGISTRY_SIZE                8
#define configCHECK_FOR_STACK_OVERFLOW           1
#define configUSE_RECURSIVE_MUTEXES              1
#define configUSE_COUNTING_SEMAPHORES            1
#define configENABLE_BACKWARD_COMPATIBILITY      0
#define configUSE_PORT_OPTIMISED_TASK_SELECTION  0
#define configUSE_TICKLESS_IDLE                  1
/* USER CODE BEGIN MESSAGE_BUFFER_LENGTH_TYPE */
/* Defaults to size_t for backward compatibility, but can be changed
   if lengths will always be less than the number of bytes in a size_t. */
#define configMESSAGE_BUFFER_LENGTH_TYPE         size_t
/* USER CODE END MESSAGE_BUFFER_LENGTH_TYPE */

/* Co-routine definitions. */
#define configUSE_CO_ROUTINES                    0
#define configMAX_CO_ROUTINE_PRIORITIES          ( 2 )

/* Software timer definitions. */
#define configUSE_TIMERS                         1
#define configTIMER_TASK_PRIORITY                ( 2 )
#define configTIMER_QUEUE_LENGTH                 10
#define configTIMER_TASK_STACK_DEPTH             256

/* Set the following definitions to 1 to include the API function, or zero
to exclude the API function. */
#define INCLUDE_vTaskPrioritySet             1
#define INCLUDE_uxTaskPriorityGet            1
#define INCLUDE_vTaskDelete                  1
#define INCLUDE_vTaskCleanUpResources        1
#define INCLUDE_vTaskSuspend                 1
#define INCLUDE_vTaskDelayUntil              1
#define INCLUDE_vTaskDelay                   1
#define INCLUDE_xTaskGetSchedulerState       1
#define INCLUDE_xTaskResumeFromISR           0
#define INCLUDE_xTimerPendFunctionCall       1
#define INCLUDE_xQueueGetMutexHolder         1
#define INCLUDE_uxTaskGetStackHighWaterMark  1
#define INCLUDE_eTaskGetState                1

/*
 * The CMSIS-RTOS V2 FreeRTOS wrapper is dependent on the heap implementation used
 * by the application thus the correct define need to be enabled below
 */
#define USE_FreeRTOS_HEAP_4

/* Cortex-M specific definitions. */
#ifdef __NVIC_PRIO_BITS
 /* __NVIC_PRIO_BITS will be specified when CMSIS is being used. */
 #define configPRIO_BITS         __NVIC_PRIO_BITS
#else
 #define configPRIO_BITS         4
#endif

/* The lowest interrupt priority that can be used in a call to a "set priority"
function. */
#define configLIBRARY_LOWEST_INTERRUPT_PRIORITY   15

/* The highest interrupt priority that can be used by any interrupt service
routine that makes calls to interrupt safe FreeRTOS API functions.  DO NOT CALL
INTERRUPT SAFE FREERTOS API FUNCTIONS FROM ANY INTERRUPT THAT HAS A HIGHER
PRIORITY THAN THIS! (higher priorities are lower numeric values.) */
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 5

/* Interrupt priorities used by the kernel port layer itself.  These are generic
to all Cortex-M ports, and do not rely on any particular library functions. */
#define configKERNEL_INTERRUPT_PRIORITY 		( configLIBRARY_LOWEST_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )
/* !!!! configMAX_SYSCALL_INTERRUPT_PRIORITY must not be set to zero !!!!
See http://www.FreeRTOS.org/RTOS-Cortex-M3-M4.html. */
#define configMAX_SYSCALL_INTERRUPT_PRIORITY 	( configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )

/* Normal assert() semantics without relying on the provision of an assert.h
header file. */
/* USER CODE BEGIN 1 */
#define configASSERT( x ) if ((x) == 0) {extern void _ResetWithError(char *err_string, uint32_t err_code, char *file, uint32_t line); taskDISABLE_INTERRUPTS(); _ResetWithError("configASSERT", 1, __FILE__, __LINE__);}
#define traceTASK_SWITCHED_IN() do{extern char* __last_executed_taskname; __last_executed_taskname = (char*)(&(pxCurrentTCB->pcTaskName[0]));}while(0)
/* USER CODE END 1 */

/* Definitions that map the FreeRTOS port interrupt handlers to their CMSIS
standard names. */
#define vPortSVCHandler    SVC_Handler
#define xPortPendSVHandler PendSV_Handler

/* IMPORTANT: This define is commented when used with STM32Cube firmware, when the timebase source is SysTick,
              to prevent overwriting SysTick_Handler defined within STM32Cube HAL */

#define xPortSysTickHandler SysTick_Handler

/* USER CODE BEGIN Defines */
/* Section where parameter definitions can be added (for instance, to override default ones in FreeRTOS.h) */

#define configAPPLICATION_ALLOCATED_HEAP 0

#if defined(configAPPLICATION_ALLOCATED_HEAP) && (configAPPLICATION_ALLOCATED_HEAP > 1)
  #define _REMOVE_CAST_PRE_PRE_(x)
  #define _REMOVE_CAST_PRE_(x)     _REMOVE_CAST_PRE_PRE_ x
  #define _REMOVE_CAST_(x)         _REMOVE_CAST_PRE_ x
  #if !defined(configTOTAL_HEAP_SIZE) || _REMOVE_CAST_(configTOTAL_HEAP_SIZE) > 16384
    #error "ERROR: STM32WLE5JBx has only 16KB of RAM1 allocable. Change implementation of FreeRTOS heap management, or change values according to the specific microcontroller!" 
  #endif
#endif

/* USER CODE END Defines */

#if defined(__ICCARM__) || defined(__CC_ARM) || defined(__GNUC__)
void PreSleepProcessing(uint32_t *ulExpectedIdleTime);
void PostSleepProcessing(uint32_t *ulExpectedIdleTime);
#endif /* defined(__ICCARM__) || defined(__CC_ARM) || defined(__GNUC__) */

/* The configPRE_SLEEP_PROCESSING() and configPOST_SLEEP_PROCESSING() macros
allow the application writer to add additional code before and after the MCU is
placed into the low power state respectively. */
#if configUSE_TICKLESS_IDLE == 1
#define configPRE_SLEEP_PROCESSING                        PreSleepProcessing
#define configPOST_SLEEP_PROCESSING                       PostSleepProcessing
#endif /* configUSE_TICKLESS_IDLE == 1 */

#endif /* FREERTOS_CONFIG_H */

Let me explain the application in more detail:
A middle-priority task is continuously notified by an ISR (for an EXTI interrupt) and fills a FreeRTOS queue with values read from the SPI peripheral (the rate ranges from 2.5 Hz to 40 Hz). A low-priority task consumes this data, waiting on the queue. A higher-priority task is notified by the ISR inside the critical section described before, with a period of 10 minutes.
The rest of the time the microcontroller goes into low-power mode (Stop 2 specifically).

What can I do to debug this problem?

(Among the tests done by varying the conditions, I noticed that if I try to change the priority of the systick, for example, by changing it to LIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY and other interrupts to a lower priority, the firmware does not work and automatically reboots for some unknown reason.)

Also, I tried installing the latest version of the FreeRTOS kernel, but the problem persists.

How are you setting interrupt priority? The reason I ask is that you need to ensure that the priority is shifted correctly before it is programmed into the register.
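
As a rough illustration of the shifting in question, the sketch below mirrors the arithmetic that the posted FreeRTOSConfig.h uses for configKERNEL_INTERRUPT_PRIORITY and configMAX_SYSCALL_INTERRUPT_PRIORITY. It assumes 4 implemented priority bits (the fallback value in that file); the helper name prio_to_register is hypothetical and exists only for this example.

```c
#include <stdint.h>

/* Assumption: 4 implemented priority bits, the fallback in the posted config. */
#define EXAMPLE_PRIO_BITS 4u

/* Hypothetical helper: convert a "library" priority (0..15) into the value
   held in the top bits of the 8-bit NVIC priority field. */
uint8_t prio_to_register(uint8_t library_priority)
{
    return (uint8_t)(library_priority << (8u - EXAMPLE_PRIO_BITS));
}
```

Under these assumptions, library priority 5 maps to 0x50 and library priority 15 maps to 0xF0. The CMSIS function NVIC_SetPriority() expects the unshifted value and does the shift itself, so passing an already shifted value there would be a mistake.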

Try disabling tickless idle and see if the problem still persists.

Did you create this project using STM32Cube? If yes, did you change the HAL Tick to use a timer other than SysTick?

I set NVIC interrupt priorities using STM32CubeMX

I can try and let you know, but tickless mode is very important for me.

Yes, I used STM32CubeMX but the project is based on a LoRaWAN project. HAL tick is handled with RTC. I have not modified this.

You can check that by going to System Core --> Sys --> Timebase Source in the STM32Cube GUI.

When that happens, is tick count incrementing? You can check that by checking the variable xTickCount.

At the moment it seems that with tickless mode disabled everything works just fine. I will confirm this tomorrow.

The timebase source is set to None because the HAL tick is derived from reading the RTC subseconds register, as shown here:

static inline uint32_t GetTimerTicks(void)
{
  uint32_t ssr = LL_RTC_TIME_GetSubSecond(RTC);
  /* read twice to make sure the value is valid */
  while (ssr != LL_RTC_TIME_GetSubSecond(RTC))
  {
    ssr = LL_RTC_TIME_GetSubSecond(RTC);
  }
  return UINT32_MAX - ssr;
}

I will check it

Thank you

It turned out that xTickCount stops being incremented because the uxSchedulerSuspended variable is ‘1’. (The tick hook function, in which I toggle a GPIO, continues to be executed as expected.)
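
To make that observation easier to reason about, here is a simplified host-side model of tick handling while the scheduler is suspended. It is only a sketch: the struct and function names are hypothetical and merely imitate the kernel variables shown in this thread (xTickCount, uxPendedTicks, uxSchedulerSuspended); it is not the FreeRTOS source and it ignores details such as unblocking tasks.

```c
#include <stdint.h>

/* Simplified model; field names only imitate the kernel variables. */
typedef struct {
    uint32_t tick_count;          /* models xTickCount */
    uint32_t pended_ticks;        /* models uxPendedTicks */
    uint32_t scheduler_suspended; /* models uxSchedulerSuspended */
    uint32_t hook_calls;          /* counts tick hook invocations */
} tick_model_t;

/* Models one tick interrupt. */
void model_tick_isr(tick_model_t *m)
{
    if (m->scheduler_suspended == 0u) {
        m->tick_count++;   /* normal case: kernel time advances */
    } else {
        m->pended_ticks++; /* suspended: the tick is only remembered */
    }
    m->hook_calls++;       /* in this model the hook runs in both cases */
}

/* Runs n tick interrupts back to back. */
void model_run_ticks(tick_model_t *m, uint32_t n)
{
    for (uint32_t i = 0u; i < n; i++) {
        model_tick_isr(m);
    }
}
```

In this model, running ticks with scheduler_suspended set to 1 leaves tick_count frozen while pended_ticks and hook_calls keep growing, which is consistent with a hook that still toggles a GPIO while xTickCount no longer moves.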

Good find. So we need to find how the scheduler got suspended and not resumed. Are you calling vTaskSuspendAll somewhere in your application?

I confirm that with tickless mode disabled there is no problem

No, I don’t suspend the scheduler.

Both are good findings. Are you using vPortSuppressTicksAndSleep from port.c or do you provide your own?

Please share the definition of PreSleepProcessing and PostSleepProcessing functions.

I’m using the one from port.c

void PreSleepProcessing(uint32_t *ulExpectedIdleTime)
{
  uint32_t WakeUpTimer_timeOut_ms = app_freertos_tick_to_ms(*ulExpectedIdleTime);
  
  UTIL_TIMER_SetPeriod(&WakeUpTimer, WakeUpTimer_timeOut_ms);
  UTIL_TIMER_Start(&WakeUpTimer);
  Time_BeforeSleep = UTIL_TIMER_GetCurrentTime();

  /*Stop the systick here so that it stops even in sleep mode*/
  portNVIC_SYSTICK_CTRL_REG &= ~portNVIC_SYSTICK_ENABLE_BIT;

  UTIL_LPM_EnterLowPower();

  /*
    (*ulExpectedIdleTime) is set to 0 to indicate that PreSleepProcessing contains
    its own wait for interrupt or wait for event instruction and so the kernel vPortSuppressTicksAndSleep
    function does not need to execute the wfi instruction
  */
  *ulExpectedIdleTime = 0;
}
void PostSleepProcessing(uint32_t *ulExpectedIdleTime)
{
  uint32_t SleepDuration = UTIL_TIMER_GetElapsedTime(Time_BeforeSleep);

  /* Avoid compiler warnings about the unused parameter. */
  UNUSED(ulExpectedIdleTime);

  UTIL_TIMER_Stop(&WakeUpTimer);

  /* Set the new reload value. */
  if (portNVIC_SYSTICK_CURRENT_VALUE_REG > (SleepDuration * CORE_TICK_RATE))
  {
    /*what remains to sleep*/
    portNVIC_SYSTICK_LOAD_REG = portNVIC_SYSTICK_CURRENT_VALUE_REG - (app_freertos_ms_to_tick(SleepDuration) * CORE_TICK_RATE);
  }
  else
  {
    portNVIC_SYSTICK_LOAD_REG = CORE_TICK_RATE;
  }

  /* Clear the SysTick count flag and set the count value back to zero. */
  portNVIC_SYSTICK_CURRENT_VALUE_REG = 0UL;

  /* Restart SysTick. */
  portNVIC_SYSTICK_CTRL_REG |= portNVIC_SYSTICK_ENABLE_BIT;
}

Also, just to clarify, the “UTIL_LPM_EnterLowPower()” call executes, inside a critical section, either PWR_EnterSleepMode()/PWR_ExitSleepMode() or PWR_EnterStopMode()/PWR_ExitStopMode(). PWR_Exit…() is called at wakeup, right after PWR_Enter…() returns; then, once the critical section mentioned above is exited, PostSleepProcessing() is executed.
These are the PWR…() functions:

void PWR_EnterSleepMode(void)
{
  /* Suspend sysTick */
  HAL_SuspendTick();

  HAL_PWR_EnterSLEEPMode(PWR_MAINREGULATOR_ON, PWR_SLEEPENTRY_WFI);
}

void PWR_ExitSleepMode(void)
{
  /* USER CODE END ExitSleepMode_1 */
  /* Resume sysTick */
  HAL_ResumeTick();
}
void PWR_EnterStopMode(void)
{
  myEnterLowPower();

  HAL_SuspendTick();

  /* Clear Status Flag before entering STOP/STANDBY Mode */
  LL_PWR_ClearFlag_C1STOP_C1STB();

  HAL_PWREx_EnterSTOP2Mode(PWR_STOPENTRY_WFI);
}

void PWR_ExitStopMode(void)
{
  myExitLowPower();

  /* Resume sysTick : work around for debugger problem in dual core */
  HAL_ResumeTick();

  vcom_Resume();
}

In myEnterLowPower()/myExitLowPower() I deinit/reinit some non-retained peripherals such as SPI/I2C/USART, while in vcom_Resume() I call the DMA/USART init functions to restore logging prints.

I think we need to find out which portion is leaving the scheduler suspended. Can you try disabling parts of your code to narrow down the problematic part? You can probably start by minimizing the code in PreSleepProcessing and PostSleepProcessing. Another possibility is memory corruption - does the data next to uxSchedulerSuspended seem corrupted?

Before I minimize the code, I should point out that the library that handles logging prints (“stm32_adv_trace”) influences the low-power mode, in that it disables STOP2 mode until all characters are transferred. This is done inside critical sections (PRIMASK) and inside the DMA tx_complete ISR (in particular, STOP2 is re-enabled in the ISR).
The print functions that disable low power are also called from an RTC timer callback (from an ISR), the one that is called every 10 minutes.

EDIT: I removed the test result from this message because it was a false result.

Below are some dump values from the non-minimized firmware, taken at the moment the software watchdog blocks execution:

-----------------------------------------------------------------------------------
RAM Address|                 Object                       |         Value         |
-----------------------------------------------------------------------------------
0x200022e4 | volatile UBaseType_t  uxCurrentNumberOfTasks | 0x11                  |
0x20002370 | volatile TickType_t   xTickCount             | 0x278f6b              |
0x200022f8 | volatile UBaseType_t  uxTopReadyPriority     | 0x0                   |
0x20002344 | volatile BaseType_t   xSchedulerRunning      | 0x1                   |
0x200022ec | volatile UBaseType_t  uxPendedTicks          | 0x934f02              |
0x20002374 | volatile BaseType_t   xYieldPending          | 0x1                   |
0x2000232c | volatile BaseType_t   xNumOfOverflows        | 0x0                   |
0x200022f4 |          UBaseType_t  uxTaskNumber           | 0x11                  |
0x20002328 | volatile TickType_t   xNextTaskUnblockTime   | 0x278f6e              |
0x20002324 |          TaskHandle_t xIdleTaskHandle        | 0x200028a4 (Idle_TCB) |
0x200022f0 | volatile UBaseType_t  uxSchedulerSuspended   | 0x1                   |
-----------------------------------------------------------------------------------

I don’t know if something is corrupted. What do you think?

If needed, I also have a full RAM dump.

Hi Igor - your PreSleepProcessing() and PostSleepProcessing() functions seem to be the cause of the issue. You are not supposed to manipulate the SysTick timer in those functions. You might break the careful logic in vPortSuppressTicksAndSleep().

If you use tickless idle without your PreSleepProcessing() and PostSleepProcessing() functions, does the system work properly – aside from using more power?

Yes, system seems to work properly.
The Pre/Post Sleep functions were written by STMicroelectronics. I thought they were okay since this is my first time approaching FreeRTOS.

Anyway, the tick hook function is still called when the system runs into my problem. If SysTick were wrongly manipulated, shouldn’t I expect the hook not to be called at all?

Not necessarily. Most likely you are enabling SysTick too early, leaving the system in an inconsistent state. Can you try removing the SysTick manipulation code from your pre and post sleep processing functions?

It seems your post sleep function is writing a very small value to the SysTick load register. That sets SysTick for an interrupt period shorter than the tick ISR. Then you get stuck endlessly repeating the tick ISR in the short window provided by vPortSuppressTicksAndSleep() to handle the interrupt that woke the system. And the value of uxPendedTicks explodes quickly because the tick ISR increments it. I wonder if ST can provide support to you on this function since they provided it?
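
As a rough sketch of how that small value can arise, the function below reproduces only the reload arithmetic from the posted PostSleepProcessing(). Several things are assumptions: the name example_post_sleep_reload is hypothetical, CORE_TICK_RATE is not shown in this thread and is taken to be 48000 counts per tick (48 MHz core, 1 kHz tick), and the millisecond-to-tick conversion is treated as the identity.

```c
#include <stdint.h>

/* Hypothetical stand-in for the reload computation in the posted
   PostSleepProcessing(); assumes a 1 kHz tick so ms == ticks. */
uint32_t example_post_sleep_reload(uint32_t systick_current,
                                   uint32_t sleep_ms,
                                   uint32_t counts_per_tick)
{
    uint32_t slept_counts = sleep_ms * counts_per_tick;

    if (systick_current > slept_counts) {
        /* "What remains to sleep": can be arbitrarily close to zero. */
        return systick_current - slept_counts;
    }
    /* Otherwise fall back to one full tick period. */
    return counts_per_tick;
}
```

With made-up inputs such as a current SysTick value of 480010 counts and a measured sleep of 10 ms, the result is a reload of 10 counts, which would request a SysTick interrupt roughly every ten core clock cycles. That is far less time than the tick ISR needs to run.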

I tried removing the SysTick manipulation (the lines in the Pre/Post functions and the HAL_SuspendTick/HAL_ResumeTick calls inside UTIL_LPM_EnterLowPower()), but then the system doesn’t enter deep sleep and it gets stuck shortly after reboot.

Thanks for the explanation. I can try getting support from ST if needed.

Hi all,

I ran into a similar issue. To correct it, I replaced the xTaskNotifyWait() in the task with a queue, and now it works fine. Note that the task was also using a semaphore while active. I suspected that xTaskNotifyWait and the semaphore could not coexist.

You may want to try this, as it is straightforward.
Good luck.

Just as a clarification: task notifications and semaphores are separate features and can coexist.