Task waiting on xSemaphoreTake fails to unblock before timeout

remboooo wrote on Friday, May 31, 2019:

I have a pretty simple setup on a Cortex-M0+ (Atmel SAML21) with a task that’s waiting for an interrupt to happen. I use a binary semaphore for this purpose. The task is waiting for the semaphore using xSemaphoreTake with a timeout of 1 second (= 1000 ticks in my setup). The interrupt does not do much more than:

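void interrupt() {
	/* Set to pdTRUE by the give if it unblocks a task with a higher
	   priority than the task that was interrupted. */
	portBASE_TYPE highPrioTaskWoken = pdFALSE;
	debug_printf("give sem");	/* debug output only, added after the problem was first seen */
	xSemaphoreGiveFromISR(tx_complete_semaphore, &highPrioTaskWoken);
	/* Request a context switch on ISR exit if one is needed. */
	portEND_SWITCHING_ISR(highPrioTaskWoken);
}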

The function is actually called by a DMA ISR from Atmel Software Framework, but that should not matter; it’s an ISR.

The weird thing is: the interrupt happens almost immediately after the call to xSemaphoreTake, but the task calling xSemaphoreTake is only resumed when the timeout expires! The call to xSemaphoreTake does return true, so it’s definitely not legitimately waiting for the timeout to expire.
If I increase the timeout to 20 seconds, the call blocks for 20 seconds (still returning true).

If I delay the call to xSemaphoreTake by placing a debug printf before it (which takes some time because it outputs over a UART), the semaphore is given before xSemaphoreTake is called, and it returns immediately (like it should). If I replace the delay with a vTaskDelay(1), the call blocks until the timeout again.
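
One way to put numbers on the behaviour described above is to sample the tick count before and after the call and compare the difference with the timeout. The following is only a minimal sketch: elapsed_ticks and tick_count_t are hypothetical names introduced for illustration, and 32-bit ticks (configUSE_16_BIT_TICKS set to 0) are an assumption. The FreeRTOS calls appear only in the comment, so the helper itself builds without the kernel headers.

```c
#include <stdint.h>

/* Stand-in for FreeRTOS's TickType_t, assuming 32-bit ticks
 * (configUSE_16_BIT_TICKS == 0). Hypothetical name, not a FreeRTOS type. */
typedef uint32_t tick_count_t;

/* Ticks elapsed between two tick-count samples. Unsigned subtraction
 * stays correct when the counter wraps once between the samples. */
static tick_count_t elapsed_ticks(tick_count_t start, tick_count_t end)
{
    return (tick_count_t)(end - start);
}

/* Intended use inside the waiting task (needs the FreeRTOS headers):
 *
 *   TickType_t t0 = xTaskGetTickCount();
 *   BaseType_t ok = xSemaphoreTake(tx_complete_semaphore, pdMS_TO_TICKS(1000));
 *   TickType_t waited = elapsed_ticks(t0, xTaskGetTickCount());
 *
 * ok == pdTRUE together with waited close to the full timeout matches
 * the symptom reported in this post.
 */
```

With tickless idle enabled, the measured value depends on the port correcting the tick count after each sleep, so it is best read as an estimate.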

I used the FreeRTOS Viewer in Atmel Studio to see what was going on. During the ISR, before the semaphore is given, this is the state of the tasks:

NA	IDLE"		0	0	536878384	536878072	NA	NA	Running
NA	CfgIF"		1	1	536881320	536880540	NA	NA	Blocked
NA	MeasCtr"	2	2	536877664	536876836	NA	NA	Blocked
NA	WDog"		5	5	536875712	536875340	NA	NA	Blocked
NA	Tmr Svc"	1	1	536878664	536878472	NA	NA	Suspended
NA	Vengabu"	2	2	536874136	536873248	NA	NA	Suspended

The “CfgIF” task is the one blocking on the semaphore.

After the xSemaphoreGiveFromISR, this becomes:

NA	IDLE"		0	0	536878384	536878072	NA	NA	Running
NA	CfgIF"		1	1	536881320	536880540	NA	NA	Blocked
NA	MeasCtr"	2	2	536877664	536876836	NA	NA	Blocked
NA	WDog"		5	5	536875712	536875340	NA	NA	Blocked
NA	CfgIF"		1	1	536881320	536880540	NA	NA	Ready
NA	Tmr Svc"	1	1	536878664	536878472	NA	NA	Suspended
NA	Vengabu"	2	2	536874136	536873248	NA	NA	Suspended

So the task waiting for the semaphore is now in the list twice? Once “ready”, once “blocked”.

When I step through the ISR I can see that portEND_SWITCHING_ISR() is called with “true” and sets the PendSV bit. However, in the subsequent PendSV handler no task switching is happening because uxSchedulerSuspended is true. I know too little about the FreeRTOS internals to know if this is supposed to be this way; it’s true that before the ISR happens, all tasks are either blocked or suspended (apart from the idle task, but I do use tickless idle).

Any clues on what might be going on here? I was running FreeRTOS 9.0.0, and because I suspected a bug upgraded to 10.2.1, but this made no difference whatsoever.

rtel wrote on Friday, May 31, 2019:

Could always be a memory corruption caused by a stack overflow or something like that (do you have stack overflow detection turned on?), but my first suggestion would be to take the print statement out of the ISR: you said yourself it takes time to execute, and it will either not be thread safe or use a thread safety mechanism that could mess up the ISR. Does the behaviour change without the print statement?
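
If some trace of the interrupt is still wanted once the print statement is gone, one common pattern is to have the ISR bump a counter and let a task do the printing later. This is a rough sketch, not code from the thread: give_count, note_give_from_isr and gives_since_last_check are hypothetical names, and it assumes a single ISR writes the counter, a single task reads it, and aligned 32-bit loads and stores are atomic on the target (which should hold on a Cortex-M0+).

```c
#include <stdint.h>

/* Written only by the ISR, read only by the task. */
static volatile uint32_t give_count;

/* Call from the ISR in place of debug_printf(): a single increment,
 * with no UART access and no locking. */
static void note_give_from_isr(void)
{
    give_count++;
}

/* Call from task context: returns how many gives were recorded since
 * the previous call, so the task can print the number when convenient. */
static uint32_t gives_since_last_check(void)
{
    static uint32_t last_seen;
    uint32_t now = give_count;          /* one aligned 32-bit read */
    uint32_t delta = now - last_seen;   /* wrap-safe unsigned difference */
    last_seen = now;
    return delta;
}
```

The ISR then costs a handful of instructions instead of a blocking UART write, which keeps its timing close to what it would be with no instrumentation at all.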

remboooo wrote on Saturday, June 01, 2019:

Thanks for thinking along. I do have configCHECK_FOR_STACK_OVERFLOW set to 2 and the application keeps chugging along just fine, so that doesn’t seem to be the problem. I only added the debug_printf in the ISR after I noticed this happening, so it’s not the cause of the problem.