Task stuck in ready state on cortex-m4

Hi, I use FreeRTOS kernel version 10.0.1 with IAR 7.2 on a Cortex-M4.

My test case just receives data in the UART IRQ and sends it to a queue for processing in another task named “UartDataProcess”, which runs at priority 6.
Everything works fine when the system starts transmitting data. After 10+ hours of transmitting, the task “UartDataProcess” somehow gets stuck in the ready state and never runs again. When this happens, the other tasks, including the idle task, are still scheduled normally.

The task resumes if I manually modify the bit for priority 6 in “uxTopReadyPriority”; after that everything runs OK.

It looks very much like the topic [Task stuck in ready state on cortex-M33].

But I have no IRQ with priority 0.

Below are my macros:
configUSE_PREEMPTION 1
configUSE_TIME_SLICING 1
configUSE_PORT_OPTIMISED_TASK_SELECTION 1
#define __NVIC_PRIO_BITS 3U
#define configLIBRARY_LOWEST_INTERRUPT_PRIORITY 0xf

Does anyone have any idea why the stuck issue happens?

Thanks,
Rob

We need to see code; everything else is crystal ball guessing.

Thanks for the feedback.

Yes, I know, but I am not sure what code to show.
It seems there is some problem with “uxTopReadyPriority” in the OS code.

Not necessarily.

Show us at least the code for the task that you believe to be stuck.

And this task is still in pxReadyTasksLists, but the corresponding bit of “uxTopReadyPriority” is not set.
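To illustrate why a missing bit matters: with configUSE_PORT_OPTIMISED_TASK_SELECTION set to 1, the scheduler picks the next task from the highest set bit of a ready-priority bitmap, so a task that sits in its ready list while its bit is clear is never selected. The sketch below is a simplified model, not the FreeRTOS source. All names in it (`top_ready_priority`, `record_ready`, `highest_ready`, `simulate_lost_update`) are hypothetical. It also shows one possible way such a state could arise, assuming a read-modify-write of the bitmap is interrupted by an ISR that readies the priority 6 task, for example an ISR that is not masked by the kernel's critical sections. Whether that is what happens in this system is not established.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's ready-priority bitmap:
 * bit N set means "at least one task of priority N is ready". */
static uint32_t top_ready_priority;

/* Mark a priority as having at least one ready task. */
static void record_ready(uint32_t prio)
{
    assert(prio < 32u);
    top_ready_priority |= (1u << prio);
}

/* Index of the highest set bit, which is what a CLZ-based selector
 * would pick. Returns -1 when no bit is set. */
static int highest_ready(uint32_t bitmap)
{
    int prio = -1;
    while (bitmap != 0u) {
        bitmap >>= 1;
        prio++;
    }
    return prio;
}

/* Deterministic replay of a lost update (an assumed failure mode):
 * 1. kernel code loads the bitmap,
 * 2. an ISR readies a task at isr_prio and sets its bit,
 * 3. kernel code stores its stale copy plus task_prio, wiping the ISR's bit. */
static uint32_t simulate_lost_update(uint32_t start, uint32_t task_prio,
                                     uint32_t isr_prio)
{
    uint32_t stale;

    top_ready_priority = start;
    stale = top_ready_priority;                      /* step 1: load          */
    record_ready(isr_prio);                          /* step 2: ISR sets bit  */
    top_ready_priority = stale | (1u << task_prio);  /* step 3: stale store   */
    return top_ready_priority;
}
```

Calling `simulate_lost_update(0x1u, 2u, 6u)` leaves bit 6 clear, so `highest_ready` never returns 6 until something sets the bit again, which would match the observation that manually modifying the bit makes the task run.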

Why do you claim the critical section? Try without it. When the system is in the problematic state, do you still get sys tick interrupts? I suspect that your task blocks while it still holds the critical section, which may explain the behavior you see.

According to the logs, the task has left the critical section,
and the problem can also be reproduced without claiming the critical section beforehand.

Sys tick interrupts are OK, and every other task is scheduled OK except this one.
When I modify the bit for priority 6 in “uxTopReadyPriority”, this task resumes.

What is your configMAX_SYSCALL_INTERRUPT_PRIORITY/configMAX_API_CALL_INTERRUPT_PRIORITY value? What is your configKERNEL_INTERRUPT_PRIORITY value?

Reference link

Thanks for your feedback.
I checked these with printf, as below:
configMAX_SYSCALL_INTERRUPT_PRIORITY = 160
configKERNEL_INTERRUPT_PRIORITY = 480
configPRIO_BITS = 3

And my UART IRQ is 8.
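The printed values are consistent with the usual FreeRTOSConfig.h pattern of shifting a logical priority left by (8 - configPRIO_BITS): 0xf << 5 is 480 and 5 << 5 is 160. The 5 is inferred from the 160, not stated in the thread. Note that 480 does not fit in an 8-bit priority field. The sketch below works through the arithmetic. The helper names are hypothetical, and the mask mirrors the shift-and-mask that a CMSIS-style `NVIC_SetPriority` applies. If the 8 above is a logical priority passed through such a call on a part with 3 priority bits (an assumption), it would wrap around to 0.

```c
#include <assert.h>
#include <stdint.h>

/* Shift a logical priority into the top bits of an 8-bit field,
 * without truncating, as the config macros do. */
static uint32_t shifted_priority(uint32_t logical, uint32_t prio_bits)
{
    assert(prio_bits >= 1u && prio_bits <= 8u);
    return logical << (8u - prio_bits);
}

/* Value that actually fits into an 8-bit priority register
 * (everything above bit 7 is lost). */
static uint8_t register_priority(uint32_t logical, uint32_t prio_bits)
{
    return (uint8_t)(shifted_priority(logical, prio_bits) & 0xFFu);
}

/* Logical priority the hardware ends up with after truncation. */
static uint32_t effective_priority(uint32_t logical, uint32_t prio_bits)
{
    return (uint32_t)register_priority(logical, prio_bits) >> (8u - prio_bits);
}
```

With 3 bits, `register_priority(0xFu, 3u)` gives 0xE0 (logical 7, still the lowest urgency), while `register_priority(8u, 3u)` gives 0, which is the highest urgency and numerically below the 160 threshold for interrupts that may call FreeRTOS API functions.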

two more things to try:

  • Application stack overflow. Try running the task with a larger stack.
  • There may be a race condition where your queue wait timeout expires just before the queue is refilled. Check your log for an entry that indicates a timeout shortly before communication ceases.

Other than that, the configuration looks OK, but almost all of your task code calls non-native functions that may be subject to all kinds of problems.

Note that the observation you make about the bit you can manipulate to get the task back to running is just a symptom. Concurrency problems may manifest themselves in literally a bazillion ways.

Thanks for your feedback again.
The IAR debugger shows no stack overflow.

And I found that __NVIC_PRIO_BITS is defined as 3U in another header file, so configLIBRARY_LOWEST_INTERRUPT_PRIORITY 0xf is not appropriate, right?

But I found that in many other M4 projects __NVIC_PRIO_BITS is defined as 4U,
so I changed it to revalidate.

But I’m not sure if this is a good change.
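A quick way to reason about the mismatch: with N implemented priority bits, logical priorities run from 0 to 2^N - 1, so the lowest priority is 7 for 3 bits and 15 for 4 bits. The sketch below checks a configured value against that range. The function names are hypothetical, and the number of bits has to come from the device itself (normally its vendor header), not from other projects.

```c
#include <assert.h>
#include <stdint.h>

/* Numerically largest (least urgent) logical priority that exists
 * when prio_bits priority bits are implemented. */
static uint32_t lowest_logical_priority(uint32_t prio_bits)
{
    assert(prio_bits >= 1u && prio_bits <= 8u);
    return (1u << prio_bits) - 1u;
}

/* Returns 1 when the configured lowest priority fits into the
 * implemented bits, 0 when it would be truncated. */
static int lowest_priority_fits(uint32_t configured_lowest, uint32_t prio_bits)
{
    return configured_lowest <= lowest_logical_priority(prio_bits);
}
```

Here `lowest_priority_fits(0xFu, 3u)` returns 0 and `lowest_priority_fits(0xFu, 4u)` returns 1, so 0xf only fits if the chip really implements 4 bits. If it implements 3, the alternative is to keep __NVIC_PRIO_BITS at 3U and lower configLIBRARY_LOWEST_INTERRUPT_PRIORITY to 7.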

Well, M4-based PODs are explicitly allowed to support different numbers of IRQs, so you must tailor that to your POD.

Have you looked at the timeout race issue?

Yes, the queue receive does time out while everything is OK, but it is never entered once the problem arises.

Well, could the root cause be that “__NVIC_PRIO_BITS and configLIBRARY_LOWEST_INTERRUPT_PRIORITY do not match”?