uxCriticalNesting has incorrect count in PortExitCritical, in 2 different threads

xygblarbin wrote on Monday, July 25, 2016:

Hi all,

I have an application I’m writing for an Atmel sam4sd32a, with FreeRTOS. I modified an existing FreeRTOS sample project for the Sam4s to get up and running.

As my code got more complex, I started noticing some of the threads freezing, and tracked them down on the call stack.

Any time a thread freezes, it is because the variable uxCriticalNesting is already 0 when vPortExitCritical is called.

My code has binary semaphores, mutexes, and queue messaging implemented in it.

The thread that freezes most often is an LCD thread that uses the USART in SPI mode to write to the screen. The thread actually freezes when attempting to return the mutex that it has checked out. It freezes in vPortExitCritical because uxCriticalNesting is already zero. Here is some pseudo-code:

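void LcdSendBuffer(void)
{
    /* Keep retrying until the mutex is obtained; each attempt waits up to 1000 ticks. */
    while(xSemaphoreTake(m_LcdMutex, 1000) != pdTRUE)
    {
        vTaskDelay(1);
    }

    WriteData();

    /* Return the mutex. This is the call in which the thread freezes. */
    xSemaphoreGive(m_LcdMutex);
}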

I cannot see why uxCriticalNesting would already be zero here. It’s pretty obvious this code SHOULD have the mutex.

The other thread that freezes up is a comm thread that sends messages out of the ACTUAL SPI peripheral. However, every time it freezes, it is attempting to check for a new queue message. When checking for the new queue message, the thread freezes in vPortExitCritical as well, again because uxCriticalNesting is already 0.

I’ve looked at my code extensively, and cannot see a reason why this would be happening. I can’t post the actual code, since it is for a commercial product, but I will answer any questions that will help uncover a solution.

Thanks,

- Mike

davedoors wrote on Tuesday, July 26, 2016:

Are you using the mutex from an interrupt too? You can’t do that and might be hitting an assert when it seems to freeze.

It’s worth reading through the FAQ page at http://www.freertos.org/FAQHelp.html. Does anything there help?

xygblarbin wrote on Tuesday, July 26, 2016:

The mutex is not being taken or returned from an interrupt

xygblarbin wrote on Tuesday, July 26, 2016:

The application works fine for a while before failing (sometimes 5 minutes, sometimes over an hour), so I don’t think this is as simple as an interrupt priority issue or a misconfiguration.

rtel wrote on Tuesday, July 26, 2016:

You have already been given a link to the FAQ that has a list of items
to go through: http://www.freertos.org/FAQHelp.html but haven’t commented
on that yet.

Which FreeRTOS version are you using, and do you have configASSERT()
defined correctly (as per the FAQ above)? From experience of similar
problems I suspect that if you are using a recent version of FreeRTOS
then the assert will most likely find the issue.
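
For reference, one common way to define configASSERT() so that a failure is easy to locate is to have it record the file and line before halting. The following is only a minimal sketch that compiles on a host: vAssertCalled() is an application-provided function (the name is a convention, not part of the kernel API), the three assert_* variables are hypothetical, and the halt is left as a comment because it only makes sense on the target.

```c
#include <stdint.h>

/* Hypothetical bookkeeping variables, inspected from the debugger. */
static const char * volatile assert_file;  /* file of the last failed assert */
static volatile uint32_t     assert_line;  /* line of the last failed assert */
static volatile uint32_t     assert_hits;  /* number of failed asserts seen  */

/* Application-provided handler; the name is an assumption. */
void vAssertCalled(const char *file, uint32_t line)
{
    assert_file = file;
    assert_line = line;
    assert_hits++;
    /* On the target this is where the application would typically call
       taskDISABLE_INTERRUPTS() and spin in for( ;; ) so the debugger can
       be attached. Omitted here so the sketch runs on a host. */
}

/* Assumed to live in FreeRTOSConfig.h on a real project. */
#define configASSERT( x ) \
    do { if( ( x ) == 0 ) vAssertCalled( __FILE__, ( uint32_t ) __LINE__ ); } while( 0 )
```

With this shape, a halted system shows which assert fired by reading assert_file and assert_line, rather than by walking the call stack.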

xygblarbin wrote on Tuesday, July 26, 2016:

Ok… I’ve already set up the interrupt priority for my chip. It doesn’t seem like a priority problem since that would usually fail pretty immediately. configASSERT is defined so that if the argument is 0, it disables interrupts for the task and puts the task in an infinite loop.

FreeRTOS 8.2.3.

As for commenting on going through the provided FAQ:

“The application I created compiles, but does not run” does not apply.

“Stacks” does not apply. I am not getting an overflow.

“Interrupts are not executing” does not apply.

"I added a simple task to a demo… " does not apply.

“Using API function within interrupts” does not apply, and I have answered that.

“The RTOS scheduler crashes when attempting to start the first task” does not apply.

“The interrupt enable flag gets set incorrectly” does not apply.

“My application crashed before the RTOS scheduler is even started” does not apply.

“Suspending the RTOS scheduler causes me problems” does not apply.

“I have created a new application but it will not compile” does not apply.

"I get stuck on the line " does not apply.

If it had been any of these things I would have mentioned them.

davedoors wrote on Tuesday, July 26, 2016:

What does it mean that a thread ‘freezes’? Your description makes it sound like an assert is triggered because uxCriticalNesting is already 0, so if that is the case the whole app should freeze.

It sounds like taskEXIT_CRITICAL() is called before taskENTER_CRITICAL(), so uxCriticalNesting is 0. Can you trace calls to the enter and exit critical macros? Or count their calls? Could it be the macros were called in an interrupt?
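
The counting and tracing suggested above could be sketched roughly as follows. This is a host-side model, not kernel code: model_nesting stands in for uxCriticalNesting, and every name here (crit_event_t, model_enter_critical, TRACE_ENTER, and so on) is hypothetical. A small ring buffer keeps only the most recent events, so the trace stays bounded even if the failure takes an hour to appear.

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_DEPTH 16u /* ring size, must be a power of two */

typedef struct {
    const char *file;    /* call site file                      */
    int         line;    /* call site line                      */
    int         delta;   /* +1 = enter, -1 = exit               */
    uint32_t    nesting; /* model nesting count after the call  */
} crit_event_t;

static crit_event_t trace_buf[TRACE_DEPTH];
static uint32_t     trace_count;   /* total events recorded so far      */
static uint32_t     model_nesting; /* stands in for uxCriticalNesting   */
static uint32_t     underflows;    /* exits seen while the count was 0  */

/* Store one event, overwriting the oldest entry once the ring is full. */
static void trace_record(const char *file, int line, int delta)
{
    crit_event_t *e = &trace_buf[trace_count & (TRACE_DEPTH - 1u)];
    e->file    = file;
    e->line    = line;
    e->delta   = delta;
    e->nesting = model_nesting;
    trace_count++;
}

void model_enter_critical(const char *file, int line)
{
    model_nesting++;
    trace_record(file, line, +1);
}

/* Returns 0 normally, -1 if called while the count is already 0. */
int model_exit_critical(const char *file, int line)
{
    int ok = (model_nesting != 0u);
    if (ok) {
        model_nesting--;
    } else {
        underflows++; /* the situation the assert reports */
    }
    trace_record(file, line, -1);
    return ok ? 0 : -1;
}

uint32_t crit_underflows(void) { return underflows; }

/* Most recent event, or NULL if nothing has been recorded yet. */
const crit_event_t *crit_last_event(void)
{
    if (trace_count == 0u) {
        return NULL;
    }
    return &trace_buf[(trace_count - 1u) & (TRACE_DEPTH - 1u)];
}

#define TRACE_ENTER() model_enter_critical(__FILE__, __LINE__)
#define TRACE_EXIT()  model_exit_critical(__FILE__, __LINE__)
```

On a target, the recording would have to run while interrupts are masked (after entering and before exiting the real critical section), since the ring buffer itself is not interrupt safe. When the assert fires, the last few entries show which call sites were unbalanced.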

xygblarbin wrote on Tuesday, July 26, 2016:

The stack trace shows a normal flow of operation. No FreeRTOS routines are being called from an interrupt. When I say that a thread “freezes”, I mean that the thread in question gets caught in the infinite loop here:

#define configASSERT( x ) if( ( x ) == 0 ) { taskDISABLE_INTERRUPTS(); for( ;; ); }

Tracing every call to the enter and exit macros will be incredibly difficult, since they occur probably 50 times a second, and the system might not fail for an hour.

However, I do have the stack trace of the failing taskEXIT_CRITICAL:

  • vPortExitCritical
  • xQueueGenericSend
  • xSemaphoreGive
  • LcdSendBuffer

rtel wrote on Tuesday, July 26, 2016:

  • vPortExitCritical
  • xQueueGenericSend
  • xSemaphoreGive
  • LcdSendBuffer

This is confusing as it should not be a possible stack frame, unless
vPortEnterCritical() was a macro or otherwise inlined so it didn’t
appear as a function call on the stack frame.

  • Which compiler are you using?
  • Which optimisation level is the compiler set to?
  • Is the code being built as standard C code?
  • Have you modified the code in any way at all, even if you think the
    modification is irrelevant?

xygblarbin wrote on Tuesday, July 26, 2016:

vPortEnterCritical won’t be on the stack trace because it doesn’t call vPortExitCritical. xQueueGenericSend calls vPortEnterCritical, then does some work, and then calls vPortExitCritical. Maybe a “stack frame” is different from a stack trace, and if so, I’m not sure how to view it in Atmel Studio.

I’m using GCC.
optimisation at 0
standard c
no modification to freeRTOS outside of freeRTOSConfig

xygblarbin wrote on Tuesday, July 26, 2016:

See attached stack trace.

rtel wrote on Tuesday, July 26, 2016:

Sorry. My mistake. Of course the function will not appear in the stack frame as the function has already returned.

That leaves us no closer to tracking down the issue, which would appear to be a data corruption.

From the previous posts in this thread can I take it you are sure there are no stack overflows? Overflow checking only works in tasks though. You could try increasing the stack used by main() as that is also the stack used by interrupts.
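
One way to check the main()/interrupt stack, which the task overflow check does not cover, is to paint it with a known byte at startup and later see how much of the pattern survives. The sketch below models this on a host with an ordinary array. The function names and the choice of 0xA5 are assumptions; on a real target the region would be the linker-defined main stack, painted before it is in deep use, and the stack is assumed to grow downward as on Cortex-M.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u /* assumed fill pattern */

/* Fill the whole stack region with the known pattern. */
void stack_paint(uint8_t *base, size_t size)
{
    memset(base, STACK_FILL_BYTE, size);
}

/* The stack grows down from base + size toward base, so untouched
   bytes sit at the low addresses. Count how many are still painted. */
size_t stack_unused_bytes(const uint8_t *base, size_t size)
{
    size_t n = 0;
    while (n < size && base[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}
```

If the returned value ever reaches 0, the stack has been used all the way down to its lowest address and has most likely overflowed into whatever sits below it.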

xygblarbin wrote on Tuesday, July 26, 2016:

I’ve set breakpoints in both the application stack overflow hook and the malloc failed hook, and neither is being hit. (I failed to mention this before, but I’m also using heap_4 to do some rapid allocation and deallocation, to make things more complicated.)

Did you ever resolve this? I am having a similar issue; see “uxCriticalNesting has incorrect count in PortExitCritical called from xQueueGenericSend”.