Interrupt Handling and a Task


Trying to get a simple task done …
Can someone point out, what I am doing wrong ?


xQueueHandle xq;
long task_woken = pdFALSE;


void tmr_task(void *args)
	char cp;

#if 0
	TickType_t ticks;
	const portTickType xFrequency = 50;
	ticks = xTaskGetTickCount();

	printf("(%d) %s: Started\r\n", __LINE__, __FUNCTION__);
	xq = xQueueCreate(10, sizeof(char));
	if (!xq) {
		printf("(%d) %s: Queue create Failed!\r\n", __LINE__, __FUNCTION__);
	printf("(%d) %s: Queue Created\r\n", __LINE__, __FUNCTION__);
	LL_TIM_EnableCounter(TIM2);				/* enable timer counter */
	printf("(%d) %s: Timer Enabled\r\n", __LINE__, __FUNCTION__);

#if 0
	while (1) {
		xQueueReceiveFromISR(xq, &cp, &task_woken);
		printf("%c", cp);
		vTaskDelayUntil(&ticks, xFrequency);

	while (xQueueReceiveFromISR(xq, &cp, &task_woken))
		printf("%c", cp);

	if (task_woken != pdFALSE)

void TIM2_IRQHandler(void)
	const TickType_t ticks = 500;
	char pc = 'H';

	if (LL_TIM_IsActiveFlag_UPDATE(TIM2)) {
		printf("(%d) %s: Send!\r\n",__LINE__, __FUNCTION__);
		xQueueSend(xq, &pc, ticks);

int main(void)

	printf(" -------------------\r\n");
	printf(" STM32H743 TIM2 test\r\n");
	printf(" -------------------\r\n");

	xTaskCreate(tmr_task, (const char *) "TMR", TASK_STACKSIZ, NULL, TASK_PRIORITY, NULL);

	while (1) { }

This is what I receive…

 STM32H743 TIM2 test
(228) tmr_task: Started
(233) tmr_task: Queue Created
(23(260) TIM2_IRQHandler: Send!

FreeRTOS functions ending in “FromISR” are expected to be called from ISRs. Use xQueueReceive in your task:

for( ;; )
    if( xQueueReceive(xq, &cp, 100) == pdPASS )
        printf("%c", cp);
        printf("Timed out!");

And use xQueueSendFromISR in ISR:

void TIM2_IRQHandler(void)
    char pc = 'H';
    BaseType_t xHigherPriorityTaskWoken;

	if (LL_TIM_IsActiveFlag_UPDATE(TIM2)) {
		xQueueSendFromISR(xq, &pc, &xHigherPriorityTaskWoken);

Also, unless you are sure that your printf is thread safe and can be called from multiple ISRs, do not call it from multiple tasks.


Thanks for the quick reply.

One thing that makes me think, any advantage in using xQueue*, instead of having the data global ?

I see that you are always sending same data to your task from the TIM2 Handler. What are you trying to achieve?

If you just want a task waking periodically, you can use vTaskDelay or vTaskDelayUntil:

If you want to signal your task from the TIM2 handler, you can use direct to task notification:



The TIM2 ISR has been used as a test example to test the functionality/behaviour.
What I am trying to achieve, from the DMARQ events, trying to push out some data to the ethernet interface, continuously.

I would like to have the smallest possible latency, hence the thought of a global data buffer would be the better choice.

The logic being, the xQueue* fn’s would eventually end up using CPU cycles.

My thought was to update the buffer, then just wake the thread/task.
That was my thought about getting it done the fastest possible way.
mutex_lock()/unlock() ?

Different versions of documentation here and there and ST providing a completely different version and with CMSIS OS wrappers v1 and 2 makes things quite a bit messy. The long camel case variable naming topology adds on to the pain, in getting things started up in the first place. Sorry about the rant, have been pulling my hair out a bit.

Another confusing aspect in the ST Cube examples are that they launch a task from main(). From within that task, they launch multiple tasks again. I dont understand the advantage of starting multiple tasks/threads from another task viz, StartThread(), rather than from main() itself.

Thanks for the help!

As was mentioned, your using the FromISR not from within an ISR, and in the ISR your not using the FromISR versions. Inside an ISR, you should only call the …FromISR versions, and those (generally) should only be called from within an ISR.

Also, be VERY careful about using printf, ESPECIALLY in an ISR, this is likely unsafe.

Lastly, you are creating the queue in a task, I try to create ALL my tasks, queues, semaphores, etc in the main line before starting the scheduler so I don’t need to worry about races.

Hi Richard,

Gaurav was gracious and pointed out the problem. I understand what you meant, use xxxISR from within the ISR context.

Any thoughts as to why ST does that threaded kind of tasks ?
Please do take a look in here:

around line#124, static void Netif_Config(void)


I’ve never understood why STM does some things the way they do. Their code is often not a good example of the right way to do things.

Sometimes it does make sense to create network threads dynamically so that you make more handlers when there is more work to do (and if multiple tasks could actually help get things done). But it is less important on the sort of machines FreeRTOS runs on as there is less parallelism available.

There is 1 reason why configuring and starting up an application from an initial or main task with a dynamically assigned task stack can make sense. The static stack size in main can be minimized usually configured in the linker script. For a number of MCUs e.g. Cortex-M this main stack is later on re-used as ISR stack which is usually pretty small (depending on the ISR code, of course).
So you don’t need to statically reserve e.g. a 2k main stack needed by complex setup code with printf-calls where just a fraction of it is later really needed by your ISRs.
I’m using a similar scheme having a main/parent task launching the child tasks of an application except that this main task is not self-deleted but keeps running doing useful things…

Thanks for the info! That sounds like a nifty trick, where you do not need to switch the context then ? Sounds interesting…

Which probably explains why ST does not do:

portEND_SWITCHING_ISR( xHigherPriorityTaskWoken );

In their example code base ? I could be mistaken though.

Thanks for the ptr! :slight_smile:

No, portEND_SWITCHING_ISR is always useful and is not related to the things I mentioned. No idea though, why ST example code doesn’t make use of it.
I’m not using cube/HAL but from what I’ve seen don’t get confused by their ISR callbacks and the real ISRs, which might make use of portEND_SWITCHING_ISR

Doing a seacrch, the only place where portEND_SWITCHING_ISR() appears to be used is in cmsis_os.c osMutex*

osStatus osMutexWait(osMutexId mutex_id, uint32_t millisec)
	TickType_t ticks;
	portBASE_TYPE taskWoken = pdFALSE;  
	if (!mutex_id)
		return osErrorParameter;
	ticks = 0;

	if (millisec == osWaitForever) {
		ticks = portMAX_DELAY;
	} else if (millisec != 0) {
		ticks = millisec / portTICK_PERIOD_MS;
		if (!ticks)
			ticks = 1;
	if (inHandlerMode()) {
		if (xSemaphoreTakeFromISR(mutex_id, &taskWoken) != pdTRUE)
			return osErrorOS;


	} else if (xSemaphoreTake(mutex_id, ticks) != pdTRUE) {
			return osErrorOS;
	return osOK;

But then, the osMutex* fns are hardly used, maybe in 1 or 2 examples, not in the plethora of other samples.

I dont use Cube/HAL, but like to peek in when handwritten code does not work as expected. Cube/HAL appears to be messy as far as readability is concerned.

I agree with your rating concerning Cube/HAL quality and also that it can help figuring out issues with own drivers.
Although the portEND_SWITCHING_ISR helper is used in the CMSIS mutex wrapper it‘s not limited to this use case. It‘s a FreeRTOS native/low level mechanism to help/speed up context switching from an ISR to the right task if possible. So I‘d recommend to make use of it.


I am trying to understand in the FreeRTOS context, when xHigherPriorityTaskWoken should be set to True.

/* If xHigherPriorityTaskWoken was set to true you
we should yield.  The actual macro used here is
port specific. */
portYIELD_FROM_ISR( xHigherPriorityTaskWoken );

So, looked at the implementation of the macro:

/* Constants used with memory barrier intrinsics. */
#define portSY_FULL_READ_WRITE		( 15 )

    #define portYIELD()																\
    {																				\
    	/* Set a PendSV to request a context switch. */								\
    	portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;								\
    	/* Barriers are normally not required but do ensure the code is completely	\
    	within the specified behaviour for the architecture. */						\
    	__dsb( portSY_FULL_READ_WRITE );											\
    	__isb( portSY_FULL_READ_WRITE );											\

    #define portEND_SWITCHING_ISR(xSwitchRequired)	if (xSwitchRequired != pdFALSE) portYIELD()

So, the Macro basically Syncs the cache depending on xSwitchRequired.

In my situation, the STM32H7 appears to have a broken DMA implementation ?

In a simple test application that I use, without any OS, using the workaround (3)
Use Cache maintenance Functions (SCBCleanDCache()) in the following way for USART data to be output

#define __ISB() do {\
                } while (0U)

  \brief   Data Synchronization Barrier
  \details Acts as a special kind of Data Memory Barrier.
           It completes when all explicit memory accesses before this instruction complete.
#define __DSB() do {\
                } while (0U)

__STATIC_INLINE void SCB_CleanDCache(void)
#if defined (__DCACHE_PRESENT) && (__DCACHE_PRESENT == 1U)
	uint32_t ccsidr;
	uint32_t sets;
	uint32_t ways;

	SCB->CSSELR = 0U; /*(0U << 1U) | 0U;*/  /* Level 1 data cache */
	ccsidr = SCB->CCSIDR;
                                            /* clean D-Cache */
	sets = (uint32_t)(CCSIDR_SETS(ccsidr));
	do {
		ways = (uint32_t)(CCSIDR_WAYS(ccsidr));
		do {
			SCB->DCCSW = (((sets << SCB_DCCSW_SET_Pos) & SCB_DCCSW_SET_Msk) |
				      ((ways << SCB_DCCSW_WAY_Pos) & SCB_DCCSW_WAY_Msk));
	#if defined ( __CC_ARM )
		} while (ways-- != 0U);
	} while (sets-- != 0U);


void start_dma(uint8_t *buf, uint8_t txlen)
	tx_done = 0;
	LL_GPIO_ResetOutputPin(GPIOB, LL_GPIO_PIN_0);		/* Done: LED OFF */

	LL_DMA_ClearFlag_TC0(DMA1);				/* Transfer complete */
	LL_DMA_ClearFlag_HT0(DMA1);				/* Half Transfer done */
	LL_DMA_ClearFlag_TE0(DMA1);				/* Transfer Error */
	LL_DMA_ClearFlag_DME0(DMA1);				/* Direct MODE Error */
	LL_DMA_ClearFlag_FE0(DMA1);				/* FIFO Error */

	LL_DMA_SetDataLength(DMA1, LL_DMA_STREAM_0, txlen);	/* DMA Stream length */
			       LL_DMA_STREAM_0,			/* USART3 DMA Stream 0 */
			       (uint32_t)buf,			/* Transmit string */
			       LL_DMA_GetDataTransferDirection(DMA1, LL_DMA_STREAM_0));

	LL_DMA_EnablePeriphDMA(DMA1, LL_DMA_STREAM_0);		/* Enable USART3 Stream DMA */
	LL_DMA_EnableStream(DMA1, LL_DMA_STREAM_0);		/* Enable DMA Channel Tx */

In this scenario, since the Cache maintenance functions are deployed, I guess that’s probably why portEND_SWITCHING_ISR() Macro is not used in ST examples ?

That said, keeping ST examples aside, given the current issue, how should the situation be handled (If there exists a better/proper way of handling it) ?

There is no cache management involved in portYIELD_FROM_ISR. But it takes care about speculative/out-of-order execution done by some hi-performance MCUs like yours.
That‘s the purpose of __DSB/__ISB.
There is nothing wrong with the STM32H7 DMA hardware but it‘s easy to write broken drivers using DMA.
I‘m unsure what workaround 3 means and which problem you want to solve exactly.
But I‘m pretty sure it has nothing to do with portYIELD_FROM_ISR which just works :wink:

I think you mean Solution example 3
Well, it might be useful and could provide the best performance. But you have to do it right.
That basically means to fully understand DMA and cache operation.
In short you have to manually and carefully maintain the CPU data cache for every piece of cachable memory either written to or read from DMA (buffers and descriptors).
Not just once, but for every DMA operation.
It‘s on you to decide but it might be sufficient and much simpler to use one of the other solutions avoiding cache management for your UART driver.

The DMA FAQ was issued by ST themselves. Perhaps, you missed the URL ?

What was visible to me was the out of order Data issue which was fixed by a Cache Sync, for a bare metal test application.

Let me please rephrase.
does __dsb, __isb

Since the Cache Sync was handled by the SCB_* functions themselves doing the same, ST found no joy in having a Cache sync twice ? (I cant find any other explanation and hence)

Again __DSB/__ISB are not cache related. They‘re dealing with processor internal data and instruction pipelines.
Cache management as required by DMA powered drivers using cachable memory is a different thing.

It’s easy to describe the bare metal situation. You disable DCache and the problem vanishes. The Cache is not in sync with the Physical memory.

The FAQ also mentions the problem.

Disable D-Cache globally. It is the most simple solution, but not effective one, since you can loose great part of performance. Can be useful for debugging, to analyze if the problem is related to D-Cache.

I agree with what you are saying. But ST considers it a bug, I guess based on their FAQ.
Long back, writing a Linux driver, Andrew Morton (kernel maintainer) said, if you need to do a dma_sync_*, then likely the DMA implementation on the hardware is broken.

For completion sake, the mentioned USART DMA issue (fixed)


Lets say there are two tasks T1 and T2 with proprieties 1 and 2 respectively. Task T2 is currently blocked and task T1 is running. An interrupt occurres and the interrupt handler performs some tasks (calls FreeRTOS APIs) which result in task T2 becoming ready to run. portEND_SWITCHING_ISR ensures that when the interrupt handler returns, the task T2 starts executing as opposed to task T1.

Regarding your other issue of filling a global buffer in ISR and signaling another task to start processing it, how do you plan to avoid concurrent access i.e. ISR overwriting buffer while it is still being processed by the task? You certainly do not want to block in ISR.