Task stuck in READY state

gremond wrote on Monday, September 25, 2017:

I observed a task that gets stuck in the READY state and is then never scheduled again.

The task is as follows:

	for(;;)
	{
		_lastrun_ms = HAL_GetTick();

		ulTaskNotifyTake(pdTRUE, pdMS_TO_TICKS(100));

		while(_buffer_size() > 0){
			if(_buffer_read(&c) == 0){
				_handlePacket(c);
			}
		}
	}

I have a manager that checks that every task is running properly, which is why I can detect the problem quickly.

I see that the task goes from BLOCKED to READY, but it is never scheduled again and stays stuck in ulTaskNotifyTake.

I am on an STM32L4 with the latest FreeRTOS 0.9.1, SVN r2518.

Have you seen this issue, or do you have an idea of when it can happen?

Thanks

rtel wrote on Monday, September 25, 2017:

You say both that the task is inside ulTaskNotifyTake(), and that it is
in the READY state. When it enters ulTaskNotifyTake(), if its
notification count is zero, it will enter the Blocked state. When
something sends it a notification value it will leave the Blocked state
and enter the Ready state - that is what it sounds like has happened in
your system. Once it is in the Ready state it will run again the next
time it is the highest priority task that is ready to run…so in
your system is it the highest priority task that is ready to run or is
there a higher priority task that is using all the CPU time without ever
blocking?
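To make the scheduling rule above concrete, here is a minimal host-side sketch. It is a toy model, not FreeRTOS code: the `toy_task_t` type and the `toy_pick_next()` function are hypothetical names invented for illustration, and the real kernel uses per-priority ready lists rather than a linear scan. It only shows that a Ready task runs once no higher-priority task is Ready.

```c
#include <stddef.h>

/* Hypothetical, simplified task states (illustration only). */
typedef enum { TOY_BLOCKED, TOY_READY } toy_state_t;

typedef struct {
    const char *name;
    unsigned    priority;  /* higher number = higher priority, as for FreeRTOS tasks */
    toy_state_t state;
} toy_task_t;

/* Return the index of the highest-priority READY task, or -1 if none is ready. */
int toy_pick_next(const toy_task_t *tasks, size_t count)
{
    int best = -1;

    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state != TOY_READY) {
            continue;  /* blocked tasks are never candidates */
        }
        if (best < 0 || tasks[i].priority > tasks[(size_t)best].priority) {
            best = (int)i;
        }
    }
    return best;
}
```

In this model, a priority-5 task that is READY is only picked while the priority-6 task is BLOCKED; if the priority-6 task never blocks, the priority-5 task stays READY indefinitely.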

gremond wrote on Monday, September 25, 2017:

This task has a priority of 5.
There is one higher-priority task, at 6.
All other tasks are at 4 or lower.
When the issue happens, all tasks are running properly except this one, which is stuck in READY.

I can check that this task is stuck by setting a timestamp each time it runs and checking that timestamp from another task.
When the issue is detected, I check the whole system status with vTaskList and see the task in READY but no longer scheduled.
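The timestamp check described above could look roughly like the following sketch. The function name `task_is_stale()` and the timeout parameter are assumptions made for illustration; the thread does not show the actual manager code. The only point demonstrated is that the elapsed time should be computed with unsigned subtraction so the comparison stays correct when a 32-bit millisecond tick counter wraps around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when more than timeout_ms milliseconds
   have elapsed since the monitored task last recorded last_run_ms.
   Unsigned subtraction gives the right elapsed time across counter wrap-around. */
bool task_is_stale(uint32_t now_ms, uint32_t last_run_ms, uint32_t timeout_ms)
{
    uint32_t elapsed_ms = now_ms - last_run_ms;

    return elapsed_ms > timeout_ms;
}
```

A monitoring task would call this with the current tick value and the monitored task's last recorded timestamp. Since that timestamp is written by one task and read by another, declaring it `volatile` is a reasonable precaution.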

rtel wrote on Monday, September 25, 2017:

You don’t say which architecture you are using. If it is a Cortex-M,
are you sure you have all interrupt priorities set correctly?

http://www.freertos.org/RTOS-Cortex-M3-M4.html

Also make sure you have configASSERT() defined, ideally in the latest
head revision in SVN as that has lots more assert points to help detect
interrupt priority configuration issues.

gremond wrote on Monday, September 25, 2017:

It is a Cortex-M4, and yes, I checked the priorities against the documentation several times.
What is strange is that all other tasks, even lower-priority ones, keep running properly when it happens.
Yes, configASSERT is enabled, and no assert logs appear.

I also sometimes see an issue where the number of tasks is inconsistent, i.e. uxTaskGetSystemState() != uxTaskGetNumberOfTasks(), as described in this topic: https://sourceforge.net/p/freertos/discussion/382005/thread/24f87bc5/
So maybe the two issues are related through the task lists.

rtel wrote on Monday, September 25, 2017:

Both problems are odd, and would need a bit of a detailed study of the
data structures when they occur. Yes, they definitely could be related.

gremond wrote on Monday, September 25, 2017:

Let me know if you need more details of the scheduler and tasks when it happens.

rtel wrote on Monday, September 25, 2017:

Have you stepped into the uxTaskGetSystemState() and uxTaskGetNumberOfTasks() functions to see why they return a different number of tasks?

gremond wrote on Monday, September 25, 2017:

I created these two functions to check the ready lists (since that is where my problem started):

UBaseType_t uxTaskGetNbReady( TaskStatus_t * const pxTaskStatusArray, const UBaseType_t uxArraySize )
{
UBaseType_t uxTask = 0, uxQueue = configMAX_PRIORITIES;

	vTaskSuspendAll();
	{
		/* Is there a space in the array for each task in the system? */
		if( uxArraySize >= uxCurrentNumberOfTasks )
		{
			/* Fill in an TaskStatus_t structure with information on each
			task in the Ready state. */
			do
			{
				uxQueue--;
				uxTask += prvListTasksWithinSingleList( &( pxTaskStatusArray[ uxTask ] ), &( pxReadyTasksLists[ uxQueue ] ), eReady );

			} while( uxQueue > ( UBaseType_t ) tskIDLE_PRIORITY ); /*lint !e961 MISRA exception as the casts are only redundant for some ports. */
		}
		else
		{
			mtCOVERAGE_TEST_MARKER();
		}
	}
	( void ) xTaskResumeAll();

	return uxTask;
}

UBaseType_t uxTaskGetNbReady2( void )
{
	UBaseType_t uxTask = 0, uxQueue = configMAX_PRIORITIES;

	vTaskSuspendAll();
	{
		/* Sum the lengths of the ready lists, one per priority, without
		filling in any TaskStatus_t structures. */
		do
		{
			uxQueue--;
			uxTask += listCURRENT_LIST_LENGTH(&( pxReadyTasksLists[ uxQueue ] ));

		} while( uxQueue > ( UBaseType_t ) tskIDLE_PRIORITY ); /*lint !e961 MISRA exception as the casts are only redundant for some ports. */
	}
	( void ) xTaskResumeAll();

	return uxTask;
}

Both return the correct number here.
I will continue stepping through the other lists to check the number of tasks and to find out why a task is stuck in READY.

Many years later, I am observing the same phenomenon in FreeRTOS 10.4.

Task 1: priority 3, gets stuck in Ready State following block via ulTaskNotifyTake()
Task 2: priority 1, will not yield to Task 1 via xTaskNotifyGive()

I’m curious if any resolution was reached on this. It seems like a stack problem, but I’m giving the tasks way more RAM than they need.

Which processor core are you using? Do you have configASSERT() defined?

The issue might not be a “stack” issue but the use of a bad pointer that clobbers some FreeRTOS resource needed for scheduling. Incorrect interrupt priorities can also corrupt that memory (an ISR with too high a priority, that is, one that is numerically too low), but recent versions of FreeRTOS will check for that if configASSERT is defined.
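As a rough illustration of the "numerically too low" point on Cortex-M: an ISR that calls FreeRTOS `...FromISR()` functions must use a priority value numerically equal to or greater than the configured maximum syscall priority. The sketch below assumes 4 implemented priority bits and a maximum syscall priority of 5; both are example values, and the real ones come from your own FreeRTOSConfig.h. The macro and function names are hypothetical and are not part of FreeRTOS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed example values; take the real ones from your FreeRTOSConfig.h. */
#define EXAMPLE_PRIO_BITS             4u
#define EXAMPLE_MAX_SYSCALL_PRIORITY  5u
#define EXAMPLE_LOWEST_PRIORITY       ((1u << EXAMPLE_PRIO_BITS) - 1u)  /* 15 */

/* Hypothetical check: may an ISR configured with this (unshifted) priority
   value call FreeRTOS ...FromISR() functions? On Cortex-M a numerically
   lower value is a logically higher priority, so values below the
   maximum syscall priority are not allowed to use the API. */
bool isr_priority_allows_freertos_api(uint32_t priority)
{
    return priority >= EXAMPLE_MAX_SYSCALL_PRIORITY
        && priority <= EXAMPLE_LOWEST_PRIORITY;
}

/* The hardware keeps the priority in the most significant implemented bits
   of an 8-bit field, so the stored value is the priority shifted left. */
uint8_t raw_priority_register_value(uint32_t priority)
{
    return (uint8_t)(priority << (8u - EXAMPLE_PRIO_BITS));
}
```

With these assumed values, priorities 5 to 15 may use the API, while 0 to 4 may not. An ISR left at the reset default of 0 falls in the forbidden range, which is one common way this kind of corruption occurs.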