STM32F407 Task stops executing after some minutes

ryan87 wrote on Monday, April 21, 2014:

Dear FreeRTOS community,

I have a problem with FreeRTOS that has been driving me mad for the last few days/weeks. My simplified situation is now:
I have a custom board with multiple STM32s that works fine (please rule out the custom board as a cause of my problem). Two of them communicate via USART. The faulting STM32 usually just receives USART messages and writes an event into a queue. A handler task (priority 4) then processes the event. configMAX_PRIORITIES is set to 5. I disabled all the other processing on that uC.
This concept runs fine and smoothly most of the time.

After some random time, between 5 and 30 minutes, the handler task is no longer executed.
Debugging shows that:
uxTopReadyPriority = 0
and
pxReadyTasksLists[4]->uxNumberOfItems = 1
pxIndex shows that this is my handler task, ready and waiting to be executed.

As there is no other task running at priority 4, my handler task is never executed.

I tried this with FreeRTOS v7.2.0 and v8.0.0, configASSERT defined (not hit though).

Is this a bug in FreeRTOS? How can uxTopReadyPriority become 0 when there is a ready task?
I can investigate this issue further if somebody can tell me what I should be looking for. I am not very familiar with the FreeRTOS source code yet.

My config is as follows:

#define configUSE_PREEMPTION		1
#define configUSE_IDLE_HOOK		0
#define configUSE_TICK_HOOK		0
#define configTICK_RATE_HZ		( ( portTickType ) 1000 )
#define configMAX_PRIORITIES		( ( unsigned portBASE_TYPE ) 5 )
#define configMINIMAL_STACK_SIZE	( ( unsigned short ) 128 )
#define configCPU_CLOCK_HZ		( ( unsigned long ) 168000000 )
#define configTOTAL_HEAP_SIZE		( ( size_t ) ( 54 * 1024 ) )
#define configMAX_TASK_NAME_LEN		( 16 )
#define configUSE_TRACE_FACILITY	1
#define configUSE_16_BIT_TICKS		0
#define configIDLE_SHOULD_YIELD		0
#define configUSE_MUTEXES		1
#define configUSE_RECURSIVE_MUTEXES	1
#define configUSE_COUNTING_SEMAPHORES    1
#define configUSE_STATS_FORMATTING_FUNCTIONS 1

/* Co-routine definitions. */
#define configUSE_CO_ROUTINES 		0
#define configMAX_CO_ROUTINE_PRIORITIES ( 2 )

/* Set the following definitions to 1 to include the API function, or zero
to exclude the API function. */

#define INCLUDE_vTaskPrioritySet		1
#define INCLUDE_uxTaskPriorityGet		1
#define INCLUDE_vTaskDelete			1
#define INCLUDE_vTaskCleanUpResources	        0
#define INCLUDE_vTaskSuspend			1
#define INCLUDE_vTaskDelayUntil			1
#define INCLUDE_vTaskDelay			1
#define INCLUDE_xQueueGetMutexHolder	        1

/* This is the raw value as per the Cortex-M3 NVIC.  Values can be 255
(lowest) to 0 (1?) (highest). */
#define configKERNEL_INTERRUPT_PRIORITY 	255
#define configMAX_SYSCALL_INTERRUPT_PRIORITY 	0x40 /* equivalent to priority 4. */


/* This is the value being used as per the ST library which permits 16
priority values, 0 to 15.  This must correspond to the
configKERNEL_INTERRUPT_PRIORITY setting.  Here 15 corresponds to the lowest
NVIC value of 255. */
#define configLIBRARY_KERNEL_INTERRUPT_PRIORITY	15

// Software Timers
#define configUSE_TIMERS		1
#define configTIMER_TASK_PRIORITY	1
#define configTIMER_QUEUE_LENGTH	10
#define configTIMER_TASK_STACK_DEPTH 512

#ifdef DEBUG
#include "util/log.h"
#define configASSERT(x)     if( ( x ) == 0 ) { Log(0); while(1); }
#define configCHECK_FOR_STACK_OVERFLOW 2
#endif

richard_damon wrote on Monday, April 21, 2014:

This is almost always a problem with interrupts: either an interrupt handler calling a non-FromISR routine, or an interrupt of the wrong priority using a FromISR routine.

I am not sure how much of that is actually caught by configASSERT().

The other possible problem could be a stack overflow (configCHECK_FOR_STACK_OVERFLOW is good, but not perfect).
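
As a rough illustration of why the overflow check is "good, but not perfect": to my understanding, the kernel fills each task stack with a known byte when the task is created, and the check looks at whether the bytes at the stack limit still hold that byte. The sketch below models this on a plain array on a host machine. The fill value 0xA5 matches what the kernel uses, but GUARD_BYTES, the function names and the array-as-stack model are assumptions made for illustration, not FreeRTOS code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u  /* fill pattern; FreeRTOS uses 0xa5 for this */
#define GUARD_BYTES     16u    /* assumed size of the checked region */

/* Fill a simulated stack with the known pattern, as done at task creation. */
void stack_prepare(uint8_t *stack, size_t size)
{
    assert(stack != NULL && size >= GUARD_BYTES);
    memset(stack, STACK_FILL_BYTE, size);
}

/* Return 1 if the guard region at the stack limit (lowest addresses, since
   the Cortex-M stack grows downwards) still holds the fill pattern. */
int stack_guard_intact(const uint8_t *stack, size_t size)
{
    size_t i;
    assert(stack != NULL && size >= GUARD_BYTES);
    for (i = 0; i < GUARD_BYTES; i++) {
        if (stack[i] != STACK_FILL_BYTE) {
            return 0;
        }
    }
    return 1;
}

/* Count untouched bytes from the stack limit upwards (a high-water mark). */
size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    assert(stack != NULL);
    while (n < size && stack[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}
```

An overflow that writes into the guard region is caught by stack_guard_intact(). One that jumps over it, for example a large local buffer that is only partly written, can leave the guard untouched while memory below the stack is already corrupted, which is why a passing check does not fully rule out an overflow.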

ryan87 wrote on Wednesday, April 23, 2014:

I was expecting this answer. I triple-checked my code; it is rather minimal now, with all unneeded stuff commented out, but the problem persists.
The highest IRQ priority I use is level 12, and FreeRTOS is safe up to 4 in my configuration. I even checked the NVIC in the debugger after the task stopped: no priority above 12. All IRQs use xQueueSendFromISR. The strange thing is that my program runs for quite some time until one task stops.
I have an xHeap size of 60k and also increased the stack size to 2048 (resulting in 8 kB), which is quite a lot. The problem persists.
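
For readers checking the same thing on their own board, the priority comparison described above can be sketched as follows. It assumes 4 implemented priority bits (as on the STM32F4) and that numerically lower values mean more urgent interrupts. The function names are made up for this example and are not part of FreeRTOS or the ST library.

```c
#include <assert.h>
#include <stdint.h>

#define PRIO_BITS 4u  /* implemented NVIC priority bits on the STM32F4 */

/* Convert a library-style priority (0..15) to the raw 8-bit NVIC value,
   which keeps the implemented bits at the top of the byte. */
uint8_t raw_priority(uint8_t lib_priority)
{
    assert(lib_priority < (1u << PRIO_BITS));
    return (uint8_t)(lib_priority << (8u - PRIO_BITS));
}

/* An ISR may use FromISR calls only if its raw priority is numerically
   greater than or equal to configMAX_SYSCALL_INTERRUPT_PRIORITY. */
int may_call_from_isr(uint8_t isr_raw, uint8_t max_syscall_raw)
{
    return isr_raw >= max_syscall_raw;
}
```

With the posted configuration (configMAX_SYSCALL_INTERRUPT_PRIORITY 0x40), level 12 maps to 0xC0 and passes the check, while a hypothetical ISR at level 3 (0x30) would not.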

If you say “almost always”, what are the other reasons I might encounter?

BTW: I use GCC version 4.8-2014q1-update from linaro.

rtel wrote on Wednesday, April 23, 2014:

I would agree your symptom is of an interrupt priority configuration error. It would seem that if your ISR is genuinely running at priority 12, then you should be ok with the FreeRTOSConfig.h settings you already posted. However, have you called:

NVIC_PriorityGroupConfig( NVIC_PriorityGroup_4 );

Anywhere in your code? If not, then you will not really be running at priority 12.

If you are calling the above function then please post your UART interrupt handler code.

Regards.
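
On the priority grouping point above: the PRIGROUP field decides how many of the priority bits form the preemption priority and how many form the sub-priority. The host-side sketch below follows the ARMv7-M rule that bits [7:PRIGROUP+1] of the raw priority are the preemption priority and the remaining low bits are the sub-priority. The struct and function names are hypothetical. PRIGROUP values 3 and 5 correspond to what the ST library calls NVIC_PriorityGroup_4 and NVIC_PriorityGroup_2.

```c
#include <assert.h>
#include <stdint.h>

struct split_priority {
    uint8_t preempt;  /* decides whether one ISR can interrupt another */
    uint8_t sub;      /* only orders ISRs that are pending at the same time */
};

/* Split a raw 8-bit NVIC priority according to PRIGROUP (0..7). */
struct split_priority split_raw_priority(uint8_t raw, uint8_t prigroup)
{
    struct split_priority p;
    unsigned sub_bits = prigroup + 1u;  /* bits [PRIGROUP:0] are sub-priority */
    assert(prigroup <= 7u);
    p.preempt = (uint8_t)((unsigned)raw >> sub_bits);
    p.sub = (uint8_t)(raw & ((1u << sub_bits) - 1u));
    return p;
}
```

With PRIGROUP 3, the raw value 0xC0 decodes to preemption priority 12, as intended. With PRIGROUP 5, the raw values 0xC0 and 0xD0 both decode to preemption priority 3 and differ only in sub-priority, so the interrupt is not really running at the level the number 12 suggests.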

ryan87 wrote on Wednesday, April 23, 2014:

Yes, I have the NVIC_PriorityGroupConfig(NVIC_PriorityGroup_4) call directly in my main(), after I set up the clocks and before I use any RTOS function.

My IRQ code is:

void USARTIRQ(USART* pUSART)
{
    uint16_t sr = pUSART->pRegister->SR;
    uint16_t event = 0;
    int32_t yield = pdFALSE;

    if((sr & USART_SR_TC) && USARTGetTCIE(pUSART->pRegister)) // transfer complete
    {
        USARTClearTCIE(pUSART->pRegister); // prevent IRQ refiring due to TC
        event = EVENT_TX_DONE;
        if(xQueueSendFromISR(pUSART->hEventQueue, &event, &yield) != pdTRUE)
            Log(0x6000);
    }

    if((sr & (USART_SR_ORE | USART_SR_FE | USART_SR_NE)) && USARTGetEIE(pUSART->pRegister))
    {
        // clear interrupt enable
        USARTClearEIE(pUSART->pRegister);
        if(sr & USART_SR_ORE)
            event |= EVENT_RX_ERROR_ORE;
        if(sr & USART_SR_FE)
            event |= EVENT_RX_FE;
        if(sr & USART_SR_NE)
            event |= EVENT_RX_ERROR_NE;
        if(xQueueSendFromISR(pUSART->hEventQueue, &event, &yield) != pdTRUE)
            Log(0x7000);
    }

    portEND_SWITCHING_ISR(yield);
}
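
One detail in the handler above may be worth a second look, although it is not necessarily related to the stall: `event` is shared by both branches, so if TC and an error flag are handled in the same invocation, the second queued message also carries EVENT_TX_DONE. A host-testable sketch that builds the two events separately is below. The USART_SR bit positions are those of the STM32F4 status register; the EVENT_* values and the function names are placeholders, since the real definitions were not posted.

```c
#include <assert.h>
#include <stdint.h>

/* STM32F4 USART status register bits. */
#define USART_SR_FE   (1u << 1)  /* framing error */
#define USART_SR_NE   (1u << 2)  /* noise error */
#define USART_SR_ORE  (1u << 3)  /* overrun error */
#define USART_SR_TC   (1u << 6)  /* transmission complete */

/* Placeholder event codes; the real values are not shown in the thread. */
#define EVENT_TX_DONE       0x0001u
#define EVENT_RX_ERROR_ORE  0x0010u
#define EVENT_RX_FE         0x0020u
#define EVENT_RX_ERROR_NE   0x0040u

/* Event for the transmit branch: EVENT_TX_DONE if TC is set, else 0. */
uint16_t tx_event_from_sr(uint16_t sr)
{
    return (uint16_t)((sr & USART_SR_TC) ? EVENT_TX_DONE : 0u);
}

/* Event for the error branch: starts from 0, so it holds error flags only. */
uint16_t error_event_from_sr(uint16_t sr)
{
    uint16_t event = 0;
    if (sr & USART_SR_ORE) {
        event |= EVENT_RX_ERROR_ORE;
    }
    if (sr & USART_SR_FE) {
        event |= EVENT_RX_FE;
    }
    if (sr & USART_SR_NE) {
        event |= EVENT_RX_ERROR_NE;
    }
    /* By construction the error event never includes EVENT_TX_DONE. */
    assert((event & EVENT_TX_DONE) == 0u);
    return event;
}
```

In the ISR, each branch would then send its own value, so the handler task can tell a transmit completion apart from a receive error even when both are reported in the same interrupt.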

rtel wrote on Wednesday, April 23, 2014:

Could you strip down your project so it contains the least amount of code and functionality required to demonstrate the problem, make sure it builds without any absolute paths, zip it up, then send it to the “business contact” email found on http://www.freertos.org/contact

Regards.

fadillrezha wrote on Friday, May 30, 2014:

Hi, are there any updates on this issue? I think I might have the same problem.

Thanks!