Scheduler hangs after many create/delete tasks

viatorus wrote on Wednesday, April 17, 2019:

Hey everybody.

We are currently running many stability tests and found this issue:

The test below stops working after a few seconds. Checking with the debugger, FreeRTOS no longer schedules and hangs inside the idle task. We are using an STM32F439xx with the CM3 port. GCC 7.2.

What the code does:

  1. Starts a main task (prio 8).
  2. Starts a watchdog task (prio 10) to provoke task switches.
  3. In an infinite loop: starts a test task (prio 8) and deletes it.

What seems to happen before the crash (not 100% sure):

  1. The test task actually gets to run (happens really rarely, see below).
  2. It calls Suspend.
  3. The main task calls Suspend/Delete.
  4. The idle task runs forever without context switches.

vTask1 is called if the counter is:
0x93, 0xAE and y.

y varies, but after y + 0xE the scheduler hangs.

Our FreeRTOS config:

#define configUSE_PREEMPTION                    1
#define configUSE_PORT_OPTIMISED_TASK_SELECTION 1
#define configUSE_IDLE_HOOK                     0
#define configUSE_TICK_HOOK                     0
#define configUSE_TIME_SLICING                  0
#define STACK_SIZE configMINIMAL_STACK_SIZE * 10

/* Structure that will hold the TCB of the task being created. */
StaticTask_t xTaskBufferMain;
StaticTask_t xTaskWD;
StaticTask_t xTaskBuffer1;

StackType_t xStackMain[ STACK_SIZE ];
StackType_t xStackWD[ STACK_SIZE ];
StackType_t xStack1[ STACK_SIZE ];

void vTask1(void *) {
  vTaskSuspend(nullptr);
}

void vWD(void *) {
  while (true) {
    vTaskDelay(pdMS_TO_TICKS(1));
  }
}

void vMain(void *) {
  xTaskCreateStatic(vWD, "wd", STACK_SIZE, NULL, 10, xStackWD, &xTaskWD);

  unsigned int i = 0;
  while (true) {
    TaskHandle_t handle1 = xTaskCreateStatic(vTask1, "task1", STACK_SIZE, NULL, 8, xStack1, &xTaskBuffer1);

    vTaskSuspend(handle1);
    vTaskDelete(handle1);

    ++i;
  }
}

extern "C" int main(void) {
  __enable_irq();

  // 4 bits for pre-emption priority, 0 bits for subpriority.
  HAL_NVIC_SetPriorityGrouping(NVIC_PRIORITYGROUP_4);

  xTaskCreateStatic(vMain, "main", STACK_SIZE, NULL, 8, xStackMain,
      &xTaskBufferMain);

  vTaskStartScheduler();

  return 0;
}

I tried to find the issue but I am out of ideas. Can someone please help?

rtel wrote on Wednesday, April 17, 2019:

Probably not related to your issue, but I’m not sure why you are
enabling interrupts. Once you start calling FreeRTOS API functions
interrupts will remain masked up to the max system call interrupt
priority anyway. That is to stop interrupts from trying to use the kernel
before the kernel is started.

I will try replicating the issue using your code then report back.

rtel wrote on Wednesday, April 17, 2019:

I’m running the code now, although using Keil rather than GCC. No
issues so far. While I leave that running some other questions:

Is the code you posted all the code there is? Or do you have interrupts
executing too?

What happens if you build the code as C code rather than C++?

Which version of FreeRTOS are you using? Best to have the latest (or
at least 10.1.1) with configASSERT() defined, as the newer the version
the more asserts there are to catch configuration errors.

[the value of ‘i’ is over 10 million in my test]

rtel wrote on Wednesday, April 17, 2019:

Another question - what are the compiler options you are using?
Especially, are you using link time optimisation (where there are known
issues)?

viatorus wrote on Thursday, April 18, 2019:

First of all thank you Richard for your fast reply.

  • FreeRTOS 10.2.
  • No other interrupts happen.
  • There exists other code (but only clock init).

GCC:
arm-none-eabi-gcc -mthumb -ffunction-sections -fdata-sections -mcpu=cortex-m4 -g -fexceptions -Wall -Wextra

Linker:
arm-none-eabi-g++ -mthumb -ffunction-sections -fdata-sections -mcpu=cortex-m4 -g -T/home/…/cmake-build-debug/gcc.ld -specs=nosys.specs -Wl,--start-group -lgcc -lc -lc -lm -Wl,--end-group -Wl,-Map=…map -Wl,--gc-sections -Wl,-Ttext=0x08000000 -fno-lto

I built this code in C, and without lto, same result.

viatorus wrote on Thursday, April 18, 2019:

It seems to happen only if I link freeRTOS…, if I build it inside my executable it seems to work…

viatorus wrote on Thursday, April 18, 2019:

Okay, I think I found the issue.

In our config we have this defined:

#define configMAX_SYSCALL_INTERRUPT_PRIORITY	5
#define configKERNEL_INTERRUPT_PRIORITY 		15

Which is totally wrong because it has to be shifted!
https://www.freertos.org/RTOS-Cortex-M3-M4.html
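
A minimal sketch of the shift that page describes, assuming the part implements 4 priority bits (the usual value on STM32F4 devices). The names kPrioBits, kLibraryMaxSyscallPriority, kLibraryLowestPriority and shift_priority are made up for illustration; only the two config... names come from the config quoted above, and here they are modelled as constants rather than macros.

```cpp
#include <cstdint>

// Assumption: the MCU implements 4 priority bits, held in the top bits
// of each 8-bit priority field.
constexpr unsigned kPrioBits = 4;

// Logical (unshifted) priorities, i.e. the numbers from the old config.
constexpr std::uint32_t kLibraryMaxSyscallPriority = 5;
constexpr std::uint32_t kLibraryLowestPriority = 15;

// Move a logical priority into the implemented (most significant) bits.
constexpr std::uint32_t shift_priority(std::uint32_t logical) {
  return (logical << (8u - kPrioBits)) & 0xFFu;
}

// The same two settings in raw, shifted form.
constexpr std::uint32_t configMAX_SYSCALL_INTERRUPT_PRIORITY =
    shift_priority(kLibraryMaxSyscallPriority);  // 0x50
constexpr std::uint32_t configKERNEL_INTERRUPT_PRIORITY =
    shift_priority(kLibraryLowestPriority);      // 0xF0
```

With 4 priority bits, the logical values 5 and 15 become 0x50 and 0xF0.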

Damn man… this config we have is old and never touched really. This means it is inside many products…

Thank you for your help!

richard_damon wrote on Thursday, April 18, 2019:

Whether you need to shift the number or not depends on which function you use to write the value into the registers. Many of the CMSIS functions do the shift for you, so you can work with the smaller, unshifted numbers. If you are writing directly into the registers, then you need to do the shifting yourself.
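
The difference can be sketched on the host without touching real registers. This is only a model, assuming 4 implemented priority bits; encode_like_cmsis and stored_after_raw_write are hypothetical helper names, not CMSIS or FreeRTOS functions. The first mimics the shift a CMSIS-style setter applies, the second mimics the hardware ignoring the unimplemented low bits of a raw write.

```cpp
#include <cstdint>

// Assumption: 4 implemented priority bits, as on STM32F4 parts.
constexpr unsigned kPrioBits = 4;

// Mask of the bits the hardware actually stores (0xF0 for 4 bits).
constexpr std::uint32_t kImplementedMask = (0xFFu << (8u - kPrioBits)) & 0xFFu;

// What a CMSIS-style setter stores: it shifts the logical priority for you.
constexpr std::uint8_t encode_like_cmsis(std::uint32_t logical) {
  return static_cast<std::uint8_t>((logical << (8u - kPrioBits)) & 0xFFu);
}

// What survives a raw register write: the low bits are not implemented
// and are dropped, so an unshifted value loses its meaning.
constexpr std::uint8_t stored_after_raw_write(std::uint8_t raw) {
  return static_cast<std::uint8_t>(raw & kImplementedMask);
}
```

In this model, encode_like_cmsis(5) yields 0x50, while writing the unshifted 5 or 15 directly is stored as 0x00, which is the most urgent priority rather than the intended one.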

viatorus wrote on Thursday, April 18, 2019:

Hmm… how do I know which function is used? I can’t find it.

But it seems to be right. If I shift them (ARM_CM3) it works.
Or could it be a false positive?

rtel wrote on Thursday, April 18, 2019:

Can you explain what you mean by this:

“It seems to happen only if I link freeRTOS…, if I build it inside my
executable it seems to work…”

rtel wrote on Thursday, April 18, 2019:

They do have to be right regardless, and as it seems to make a
difference, then there is some probability that was the issue (not
proven though). As you are using FreeRTOS V10.2.x I hope that
configuration error would be caught if you have configASSERT() defined -
I would be very interested if not.

richard_damon wrote on Thursday, April 18, 2019:

Toni, xPortStartScheduler() writes directly to the registers, and uses the shifted values to set up the SysTick and PendSV interrupts. For any other interrupts, YOU need to make the call to set their priority.

Richard Barry, small point, but I am a bit surprised that the SysTick priority is set up in xPortStartScheduler() instead of vPortSetupTimerInterrupt(): if you have overridden vPortSetupTimerInterrupt() to use a different timer, you might want to use the SysTick for other purposes, and might want a different priority for it.

viatorus wrote on Friday, April 19, 2019:

This was a false assumption. I tried it in our project structure and in a demo project. I forgot to compare the FreeRTOSConfig files.

viatorus wrote on Friday, April 19, 2019:

There was no assert (I use 10.2).

viatorus wrote on Friday, April 19, 2019:

Thank you for the clarification. So they have to be shifted inside the config.

An assert would be good if only the least significant bits were accidentally used.
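
One way such a check could look, as a project-side sketch only: it is not taken from FreeRTOS, and it assumes 4 implemented priority bits. kPrioBits, kUnimplementedMask, uses_only_implemented_bits, kMaxSyscallRaw and kKernelRaw are hypothetical names; the last two stand in for the two config macros.

```cpp
#include <cstdint>

// Assumption: 4 implemented priority bits; adjust for other parts.
constexpr unsigned kPrioBits = 4;

// Low bits that the hardware does not implement (0x0F for 4 bits).
constexpr std::uint32_t kUnimplementedMask = (1u << (8u - kPrioBits)) - 1u;

// True when a raw priority fits in 8 bits and leaves the low bits clear.
constexpr bool uses_only_implemented_bits(std::uint32_t raw) {
  return raw <= 0xFFu && (raw & kUnimplementedMask) == 0u;
}

// Hypothetical stand-ins for the values from FreeRTOSConfig.h.
constexpr std::uint32_t kMaxSyscallRaw = 5u << 4;  // 0x50
constexpr std::uint32_t kKernelRaw = 15u << 4;     // 0xF0

// Fails the build if someone puts an unshifted value such as 5 or 15 here.
static_assert(uses_only_implemented_bits(kMaxSyscallRaw),
              "max syscall priority must be given in shifted form");
static_assert(uses_only_implemented_bits(kKernelRaw),
              "kernel priority must be given in shifted form");
```

A compile-time check like this costs nothing at run time, but it only works if the number of priority bits is known when the config is compiled.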

richard_damon wrote on Friday, April 19, 2019:

Toni, the port.c assumes that the values are in the lower bits of the value, and computes the bits to shift into the control register.

What the assert would have caught is that you used those values somewhere that needed them shifted up: it would see the priority as 0, which is too low.

rtel wrote on Friday, April 19, 2019:

Both configMAX_SYSCALL_INTERRUPT_PRIORITY and
configKERNEL_INTERRUPT_PRIORITY should be raw (therefore shifted)
values, assuming you are using an unedited version of FreeRTOS.

The asserts will not check you got the values as you intended them, as
it doesn’t know that, it will check that the values are consistent with
each other and consistent with the hardware.

viatorus wrote on Saturday, April 20, 2019:

Sorry, but if I read this right, the two Richards have different opinions.

R. Damon suggests not shifted.
R. Barry suggests shifted.

Checking a demo project, the values have to be shifted.