Scheduler not starting: stm32+HAL+FreeRTOS

Dear all,
I’d like to use HAL and FreeRTOS in the same application. However, I got stuck at starting the scheduler.
What I do:

  1. Create new project in stm32cubeIDE for my board (NUCLEO-L152RE)
  2. Set timebase source as TIM11 (least feature-rich timer)
  3. In the middleware section, choose FreeRTOSv1 with all settings left at their defaults (I edit them later in FreeRTOSConfig.h)
  4. Switch to native FreeRTOS as described here, then replace the files in the FreeRTOS subfolder of the project with the latest release from GitHub
  5. Redirect printf to UART2 (connected to the on-board debugger) as described here (maybe it should be done some other way, but redirecting to the UART seems to be the most popular approach)

What I get: something simple such as blinky works fine, but as soon as I try something more realistic I get strange errors:

  1. A malloc error before the scheduler starts, even if I increase the heap from the default 3072 to 4096 bytes (however, when I decrease it to 2048 the scheduler is reached, but the idle task is never entered);
  2. A hang in the middle of a string being sent with printf.

Could you please tell me where I should start? At the moment I don’t even understand what is happening.

Here is my FreeRTOS config:

#if defined(__ICCARM__) || defined(__CC_ARM) || defined(__GNUC__)
  #include <stdint.h>
  extern uint32_t SystemCoreClock;
#endif
#define configUSE_PREEMPTION                     1
#define configSUPPORT_STATIC_ALLOCATION          0
#define configSUPPORT_DYNAMIC_ALLOCATION         1
#define configUSE_IDLE_HOOK                      0
#define configUSE_TICK_HOOK                      0
#define configCPU_CLOCK_HZ                       ( SystemCoreClock )
#define configTICK_RATE_HZ                       ((TickType_t)1000)
#define configMAX_PRIORITIES                     ( 7 )
#define configMINIMAL_STACK_SIZE                 ((uint16_t)64)
#define configTOTAL_HEAP_SIZE                    ((size_t)3072)
#define configMAX_TASK_NAME_LEN                  ( 16 )
#define configUSE_16_BIT_TICKS                   0
#define configUSE_MUTEXES                        1
#define configQUEUE_REGISTRY_SIZE                8
#define configUSE_PORT_OPTIMISED_TASK_SELECTION  1

#define configUSE_IDLE_HOOK                     1
#define configUSE_TICK_HOOK                     0
#define configCHECK_FOR_STACK_OVERFLOW          2
#define configUSE_MALLOC_FAILED_HOOK            1
#define configUSE_DAEMON_TASK_STARTUP_HOOK      1

#define configUSE_TIMERS                        1
#define configTIMER_TASK_PRIORITY               3
#define configTIMER_QUEUE_LENGTH                8
#define configTIMER_TASK_STACK_DEPTH            (configMINIMAL_STACK_SIZE*3)

/* Co-routine definitions. */
#define configUSE_CO_ROUTINES                    0
#define configMAX_CO_ROUTINE_PRIORITIES          ( 2 )

/* The following flag must be enabled only when using newlib */
#define configUSE_NEWLIB_REENTRANT          1

/* Set the following definitions to 1 to include the API function, or zero
to exclude the API function. */
#define INCLUDE_vTaskPrioritySet            1
#define INCLUDE_uxTaskPriorityGet           1
#define INCLUDE_vTaskDelete                 1
#define INCLUDE_vTaskCleanUpResources       0
#define INCLUDE_vTaskSuspend                1
#define INCLUDE_vTaskDelayUntil             0
#define INCLUDE_vTaskDelay                  1
#define INCLUDE_xTaskGetSchedulerState      1

/* Cortex-M specific definitions. */
#ifdef __NVIC_PRIO_BITS
 /* __NVIC_PRIO_BITS will be specified when CMSIS is being used. */
 #define configPRIO_BITS         __NVIC_PRIO_BITS
#else
 #define configPRIO_BITS         4
#endif

/* The lowest interrupt priority that can be used in a call to a "set priority"
function. */
#define configLIBRARY_LOWEST_INTERRUPT_PRIORITY   15

/* The highest interrupt priority that can be used by any interrupt service
routine that makes calls to interrupt safe FreeRTOS API functions.  DO NOT CALL
INTERRUPT SAFE FREERTOS API FUNCTIONS FROM ANY INTERRUPT THAT HAS A HIGHER
PRIORITY THAN THIS! (higher priorities are lower numeric values.) */
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 5

/* Interrupt priorities used by the kernel port layer itself.  These are generic
to all Cortex-M ports, and do not rely on any particular library functions. */
#define configKERNEL_INTERRUPT_PRIORITY 		( configLIBRARY_LOWEST_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )
/* !!!! configMAX_SYSCALL_INTERRUPT_PRIORITY must not be set to zero !!!!
See http://www.FreeRTOS.org/RTOS-Cortex-M3-M4.html. */
#define configMAX_SYSCALL_INTERRUPT_PRIORITY 	( configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - configPRIO_BITS) )

/* Normal assert() semantics without relying on the provision of an assert.h
header file. */
/* USER CODE BEGIN 1 */
#define configASSERT( x ) if ((x) == 0) {taskDISABLE_INTERRUPTS(); for( ;; );}
/* USER CODE END 1 */

/* Definitions that map the FreeRTOS port interrupt handlers to their CMSIS
standard names. */
#define vPortSVCHandler    SVC_Handler
#define xPortPendSVHandler PendSV_Handler

/* IMPORTANT: This define is commented when used with STM32Cube firmware, when the timebase source is SysTick,
              to prevent overwriting SysTick_Handler defined within STM32Cube HAL */

#define xPortSysTickHandler SysTick_Handler

Here is my main.c excluding autogenerated parts and comments:

UART_HandleTypeDef huart2;

extern stateMachine_t simple_led;
SemaphoreHandle_t wake_sem = NULL;
HeapStats_t xHeapStats = {0};
uint8_t ticks = 0;

void SystemClock_Config(void);
static void MX_GPIO_Init(void);
static void MX_USART2_UART_Init(void);

void vTask1(void *pvParameters);
void vDispatcher( void *pvParameters );
void vEmitter(void *pvParameters);
void print_heap_stats();

void print_heap_stats(){
	vPortGetHeapStats(&xHeapStats);
	printf("Available heap space (bytes) %lu \n"
			"Largest block (bytes) %lu\n"
			"Smallest block (bytes) %lu\n"
			"Free blocks %lu\n"
			"Minimum ever bytes remaining %lu\n"
			"Succesful allocations %lu\n"
			"Succesful frees %lu\n",
			(uint32_t)(xHeapStats.xAvailableHeapSpaceInBytes),
			(uint32_t)(xHeapStats.xSizeOfLargestFreeBlockInBytes),
			(uint32_t)(xHeapStats.xSizeOfSmallestFreeBlockInBytes),
			(uint32_t)(xHeapStats.xNumberOfFreeBlocks),
			(uint32_t)(xHeapStats.xMinimumEverFreeBytesRemaining),
			(uint32_t)(xHeapStats.xNumberOfSuccessfulAllocations),
			(uint32_t)(xHeapStats.xNumberOfSuccessfulFrees));
	size_t bytesHeap =  xPortGetFreeHeapSize();
	printf("Free bytes in heap u32: %lu\r\n\r\n", (uint32_t)bytesHeap);
}


void vApplicationStackOverflowHook( TaskHandle_t pxTask, char *pcTaskName ){

	printf("Stack overflow triggered by %s\r\n", pcTaskName);
	print_heap_stats();
	( void ) pxTask;

	for( ;; );

}
void vApplicationIdleHook(){
	printf("IDLE\r\n");
	print_heap_stats();
}
/*
void vApplicationTickHook(){
	printf("Ticks %d\r\n", ticks++);
}
*/
void vApplicationDaemonTaskStartupHook( void ){
	printf("Scheduler started\r\n");
	print_heap_stats();
  	
	uint8_t ret;
	ret = InitEvtQueue();
	if (ret != ERR_OK){
		printf("Error creating evt queue\r\n");
	} else {
		printf("Evt queue created\r\n");
		print_heap_stats();
	}

	//InitDelayedEvtQueue
	ret = InitDelayedEvtQueue();
	if (ret != ERR_OK){
		printf("Error creating delayed evt queue\r\n");
	} else {
		printf("Delayed evt queue created\r\n");
		print_heap_stats();
	}
}

void vApplicationMallocFailedHook( void ){

	printf("Malloc error\r\n");
	print_heap_stats();

	for (;;){

	}

}
/* USER CODE END 0 */

/**
  * @brief  The application entry point.
  * @retval int
  */
int main(void)
{
  HAL_Init();
  SystemClock_Config();

  MX_GPIO_Init();
  MX_USART2_UART_Init();

  	RetargetInit(&huart2);

  	printf("Init finished\r\n");
  	print_heap_stats();
	BaseType_t  ret1 = xTaskCreate(vDispatcher, "dispatcher", configMINIMAL_STACK_SIZE, NULL, tskIDLE_PRIORITY+2, NULL );
	BaseType_t  ret2 = xTaskCreate(vEmitter, "event_emitter", configMINIMAL_STACK_SIZE, NULL, tskIDLE_PRIORITY+3, NULL );
	if (ret1 != pdPASS || ret2 != pdPASS){
		printf("Error creating tasks %lu %lu\r\n", ret1, ret2);
	} else {
		printf("Tasks created\r\n");
	}
	print_heap_stats();

	ret1 = xTaskCreate(vTask1,"Task1",configMINIMAL_STACK_SIZE,NULL,tskIDLE_PRIORITY+1,NULL);
	if ( ret1 != pdPASS){
		printf("Blinky error %lu\r\n", ret1);
	}

	vTaskStartScheduler();

Current output with the code above:

Init finished
19:30:54.910 -> Available heap space (bytes) 0 
19:30:54.910 -> Largest block (bytes) 0
19:30:54.910 -> Smallest block (bytes) 4294967295
19:30:54.910 -> Free blocks 0
19:30:54.910 -> Minimum ever bytes remaining 0
19:30:54.910 -> Succesful allocations 0
19:30:54.910 -> Succesful frees 0
19:30:54.910 -> Free bytes in heap u32: 0
19:30:54.910 -> 
19:30:54.910 -> Tasks created
19:30:54.910 -> Available heap space (bytes) 2144 
19:30:54.910 -> Largest block (bytes) 2144
19:30:54.910 -> Smallest block (bytes) 2144
19:30:54.956 -> Free blocks 1
19:30:54.956 -> Minimum ever bytes remaining 2144
19:30:54.956 -> Succesful allocations 4
19:30:54.956 -> Succesful frees 0
19:30:54.956 -> Free bytes in heap u32: 2144
19:30:54.956 -> 
19:30:54.956 -> Scheduler

This is where it hangs :(

A couple of things stand out –

  1. Tasks that call printf() probably need more than 64 words of stack.
  2. You are probably multithreading calls to the UART driver (from the ST HAL?) and the driver likely doesn’t support that. You may need to use a mutex or change the design to share the UART output.
  3. Remember that FreeRTOS allocates each task’s TCB and stack from the heap. And, note that configUSE_NEWLIB_REENTRANT 1 increases the TCB size quite a lot. All of that can quickly take up a few KB of heap. A bigger heap (8KB+) will be helpful.
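To put rough numbers on point 3, here is a small host-side C sketch that estimates how much heap one dynamically created task takes: its stack plus its TCB, each with some allocator bookkeeping. The function name is hypothetical, and the TCB size (192 bytes) and per-allocation overhead (8 bytes) used in the comments are assumptions for illustration, not measured values; the real TCB size depends on the config options, configUSE_NEWLIB_REENTRANT in particular. If the run in the log used the 3072-byte heap from the posted config, about 928 bytes were gone after the first two tasks were created, so roughly 460 bytes per 64-word task.

```c
#include <stddef.h>

/*
 * Estimate the heap bytes consumed by one task created with xTaskCreate().
 *
 * stack_words    : stack depth passed to xTaskCreate() (words, not bytes)
 * word_bytes     : sizeof(StackType_t), 4 on a Cortex-M3
 * tcb_bytes      : size of the task control block (ASSUMED, config dependent)
 * block_overhead : allocator bookkeeping per allocation (ASSUMED)
 */
size_t task_heap_bytes(size_t stack_words, size_t word_bytes,
                       size_t tcb_bytes, size_t block_overhead)
{
    /* Stack and TCB are two separate allocations from the FreeRTOS heap. */
    size_t stack_alloc = stack_words * word_bytes + block_overhead;
    size_t tcb_alloc   = tcb_bytes + block_overhead;

    return stack_alloc + tcb_alloc;
}

/*
 * With the assumed figures (tcb_bytes = 192, block_overhead = 8):
 *   task_heap_bytes(64, 4, 192, 8)  ->  464 bytes
 *   task_heap_bytes(256, 4, 192, 8) -> 1232 bytes
 */
```

Keep in mind that vTaskStartScheduler() also allocates the idle task (configMINIMAL_STACK_SIZE words of stack) and, with configUSE_TIMERS set to 1, the timer task (configTIMER_TASK_STACK_DEPTH words) and its queue from the same heap, so the budget has to cover more than the three application tasks.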

1, 3: thank you, I’ll try that tomorrow. The IDE asked me so many times whether I was sure I did not want the newlib reentrant option that I gave up and enabled it :)
2. Is printf-stdarg the recommended printf to use both before and after the scheduler starts? I seem to run into trouble somewhere between vTaskStartScheduler and the first or second context switch.

Yes, it’s safe to use that printf() implementation before and after the scheduler starts. You do have to be careful in your implementation of vOutputChar() not to interact with the scheduler.

The issue for #2 is that you might be calling the UART driver from multiple threads of execution which can cause re-entrant calls to driver functions that are not re-entrant. A good diagnostic test might be to make sure only one task calls printf() and see what happens.

After some googling I found these printf implementations:
Stdarg-printf from TCP lab freertos demo
Xprinfc
Mpaland/printf
As far as I understand, each of them only needs a putchar implementation, which does not have to be thread safe or reentrant. Personally, I would rather write directly to the UART TX register than go through HAL.
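A register-level character output along those lines might look like the sketch below. It assumes the STM32L1 CMSIS device header (USART2, its SR and DR registers, and the USART_SR_TXE flag) and assumes USART2 has already been configured, for example by MX_USART2_UART_Init(). The name uart2_putc is hypothetical; each printf library has its own hook to route through it (mpaland/printf, for example, expects the application to supply a _putchar(char) function).

```c
#include <stdint.h>
#include "stm32l1xx.h"   /* CMSIS device header: USART2, USART_SR_TXE */

/*
 * Hypothetical helper: blocking single-character output on USART2.
 * It touches no HAL state and makes no FreeRTOS calls, so it can be used
 * both before and after the scheduler starts.
 */
void uart2_putc(char c)
{
    /* Busy-wait until the transmit data register is empty. */
    while ((USART2->SR & USART_SR_TXE) == 0U)
    {
    }

    /* Writing DR starts the transmission of this character. */
    USART2->DR = (uint8_t)c;
}
```

This avoids re-entering the HAL driver, but characters printed by different tasks can still interleave in the output unless the prints themselves are serialized.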

What matters more is serializing the UART communication so that the output is not corrupted. You can still use HAL as long as you make sure only one task calls into it at a time. There are a couple of ways to do that:

  1. Protect UART HAL calls by a mutex.
  2. Call UART HAL functions only from one task. Here is an example that serializes all prints by sending them to one task: https://github.com/aws/amazon-freertos/blob/main/libraries/logging/iot_logging_task_dynamic_buffers.c
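The first option can be sketched as a thin wrapper around printf, as below. The names safe_print_init and safe_printf are hypothetical, and the sketch assumes printf is already retargeted to the UART and that INCLUDE_xTaskGetSchedulerState and configUSE_MUTEXES are set to 1, as in the config posted above. It is one possible shape, not a drop-in implementation.

```c
#include <stdarg.h>
#include <stdio.h>

#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"

/* Hypothetical mutex guarding every print that ends up in the UART driver. */
static SemaphoreHandle_t print_mutex = NULL;

/* Call once from main() before any task is created. */
void safe_print_init(void)
{
    print_mutex = xSemaphoreCreateMutex();
    configASSERT(print_mutex != NULL);
}

/* printf-style wrapper; must not be called from an ISR. */
int safe_printf(const char *fmt, ...)
{
    va_list args;
    int written;
    BaseType_t locked = pdFALSE;

    /* Before the scheduler runs there is only one thread of execution,
       so the lock is skipped (blocking is not possible there anyway). */
    if (xTaskGetSchedulerState() == taskSCHEDULER_RUNNING)
    {
        locked = xSemaphoreTake(print_mutex, portMAX_DELAY);
    }

    va_start(args, fmt);
    written = vprintf(fmt, args);
    va_end(args);

    if (locked == pdTRUE)
    {
        xSemaphoreGive(print_mutex);
    }

    return written;
}
```

Because the wrapper can block on the mutex, it should not be called from the idle hook, which must never block. The second option avoids that restriction for the callers, since they only post a message to a queue and a single task owns the UART.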

Thanks.