Does anyone have experience with encountering intermittent UART Rx overrun on FreeRTOS?
We are trying to find out whether this is a known FreeRTOS issue and if so, if there is a possible fix for it.
We are running FreeRTOS on a 600 MHz NXP iMXRT1052 processor with a handful of FreeRTOS tasks running simultaneously, including the IP task and the MQTT pool tasks.
Any help/advice would be much appreciated.
Regards,
James
How fast are characters coming in on the UART? I presume the Rx triggers an interrupt, and the characters are read in that ISR and stored somewhere (internal buffer, Queue or StreamBuffer).
UART’s running at 57600 bps.
Yeah, upon receiving a character, the peripheral triggers an interrupt and the ISR makes a callback for the characters to be processed. We are using the UART driver from the NXP SDK, and NXP implements a second-level “ring” buffer, copying characters from the hardware buffer to its “software” ring buffer.
NXP advises me that Rx Overrun is caused by receiving a new character while the UART hardware buffer has NOT been read yet.
We are speculating that perhaps interrupts are disabled “too long” by a higher-priority interrupt or a critical section.
I am reaching out to the FreeRTOS community because I’ve spotted a lot of EnterCriticalSection() calls within FreeRTOS.
And I understand EnterCriticalSection() disables interrupts, so when it’s misused, some interrupts will be missed.
Any insight to share?
iMXRT1052 is a Cortex-M7 core. Critical sections within FreeRTOS block interrupts up to configMAX_SYSCALL_INTERRUPT_PRIORITY. Higher-priority interrupts are not blocked by critical sections, but ISRs for those higher-priority interrupts cannot call FreeRTOS APIs. I am not familiar with the NXP SDK, but if your UART interrupt handler does not call any FreeRTOS API, then you can increase the UART interrupt’s priority.
Another thing to keep in mind is that on ARM Cortex-M cores, numerically low priority values specify logically high interrupt priorities. This page describes it in detail: https://www.freertos.org/RTOS-Cortex-M3-M4.html
Thanks.
57600 bps gives a byte period of around 170 microseconds. FreeRTOS won’t disable interrupts for that long. Either you are disabling interrupts for that sort of period yourself, or you have a higher-priority interrupt blocking the system for a long time (which it really shouldn’t).
Thanks Richard. Your input confirms my speculation.
As far as I understand, we have only one interrupt with a higher priority than my UARTs; it’s the ethernet interrupt.
I will try to do some timing measurements using a logic analyzer to see if the ethernet interrupt is the culprit.
Our application code does NOT disable interrupts AFAIK; the only code that disables interrupts is the critical section calls made by FreeRTOS (kernel, TCP/IP stack, MQTT stack, IoT stack).
Do you also have experience with these FreeRTOS libraries and their use of critical section?
I use FreeRTOS a lot. I don’t use many of those libraries, but I wouldn’t expect them to be an issue (the FreeRTOS internal guidelines for critical sections seem good). I often use higher baud rates on significantly slower processors without a problem.
The ethernet interrupt could be slow if it copies full frames, but I wouldn’t expect it to be that slow.
I’ve had similar experiences to Richard, running a system with ethernet and multiple UARTs at a much slower MCU clock but at much higher UART baud rates without any problems. Well, it’s a zero-copy ethernet driver…
The critical sections used by FreeRTOS itself are very short and usually don’t cause any problems.
I don’t think that those high-level libs you mentioned are using taskENTER/EXIT_CRITICAL at all.
For task-level critical sections, suspending the scheduler with vTaskSuspendAll()/xTaskResumeAll() is the better way if needed.
Thank you, Richard & Hartmut for your comments.
I have two follow-up questions if you don’t mind.
I set up a logic analyzer to monitor critical section timing within FreeRTOS and I am seeing something unexpected. In most cases, Enter/Exit is short and sweet, but sometimes I get critical sections that are 500 µs to 1000 µs long.
Some worst cases are as bad as 1.5 ms. I’ve attached a screenshot I got.
As I mentioned, I am running not only the kernel but also FreeRTOS TCP/IP, MQTT, IoT.
Have you seen anything like that?
I see that vPortEnterCritical() eventually calls vPortRaiseBASEPRI() in our software built for the NXP iMXRT1052 MCU.
portFORCE_INLINE static void vPortRaiseBASEPRI( void )
{
    uint32_t ulNewBASEPRI;

    __asm volatile
    (
        " mov %0, %1 \n" \
        " msr basepri, %0 \n" \
        " isb \n" \
        " dsb \n" \
        :"=r" (ulNewBASEPRI) : "i" ( configMAX_SYSCALL_INTERRUPT_PRIORITY ) : "memory"
    );
}
I understand that a critical section does not disable all interrupts but still allows some high-priority interrupts to run. And it seems like the last line in the code above achieves that?
#define configPRIO_BITS 4 /* 15 priority levels */
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 2
#define configMAX_SYSCALL_INTERRUPT_PRIORITY (configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY << (8 - configPRIO_BITS))
For us the ethernet interrupt priority is set to 4 and the UART to 5.
As far as I can tell from the logic analyzer capture, the critical section allows both the Ethernet and UART interrupts to happen.
Does this make sense to you?
No, this doesn’t seem right. The critical sections should not last this long, and the UART interrupt should not interrupt a critical section with your configuration.
Can you post the code that generates the signals for input 1 on your logic analyzer?
Can you post the code that configures interrupt priority for the UART?
1>
Here is the code that sets the GPIO high and low upon entering and exiting the critical section call:
void vPortEnterCritical( void )
{
    SetGpio(2, 29, 1);
    portDISABLE_INTERRUPTS();
    uxCriticalNesting++;

    /* This is not the interrupt safe version of the enter critical function so
    assert() if it is being called from an interrupt context. Only API
    functions that end in "FromISR" can be used in an interrupt. Only assert if
    the critical nesting count is 1 to protect against recursive calls if the
    assert function also uses a critical section. */
    if( uxCriticalNesting == 1 )
    {
        configASSERT( ( portNVIC_INT_CTRL_REG & portVECTACTIVE_MASK ) == 0 );
    }
}
/*-----------------------------------------------------------*/
void vPortExitCritical( void )
{
    configASSERT( uxCriticalNesting );
    uxCriticalNesting--;
    if( uxCriticalNesting == 0 )
    {
        portENABLE_INTERRUPTS();
    }
    SetGpio(2, 29, 0);
}
2>
Here is how I set the interrupt priority. It is simply taken from the NXP SDK examples.
NVIC_SetPriority(uart_irq, 5);
NXP provides a header that includes a macro that maps this to an implementation for ARM Cortex M7:
__STATIC_INLINE void __NVIC_SetPriority(IRQn_Type IRQn, uint32_t priority)
{
    if ((int32_t)(IRQn) >= 0)
    {
        NVIC->IPR[_IP_IDX(IRQn)] = ((uint32_t)(NVIC->IPR[_IP_IDX(IRQn)] & ~(0xFFUL << _BIT_SHIFT(IRQn))) |
            (((priority << (8U - __NVIC_PRIO_BITS)) & (uint32_t)0xFFUL) << _BIT_SHIFT(IRQn)));
    }
    else
    {
        SCB->SHPR[_SHP_IDX(IRQn)] = ((uint32_t)(SCB->SHPR[_SHP_IDX(IRQn)] & ~(0xFFUL << _BIT_SHIFT(IRQn))) |
            (((priority << (8U - __NVIC_PRIO_BITS)) & (uint32_t)0xFFUL) << _BIT_SHIFT(IRQn)));
    }
}
I think the calls to SetGpio() should be moved to inside the critical section itself: just after the call to portDISABLE_INTERRUPTS() and just before the call to portENABLE_INTERRUPTS(). That should give you a better view of what’s really happening.
That’s a good suggestion.
The above 2 lines mean that FreeRTOS critical sections block interrupts at priority 2 and everything logically lower. You mention that your Ethernet and UART interrupts have priorities 4 and 5, which are logically lower than 2 because on Cortex-M numerically lower values mean logically higher priorities. So why do you think the UART and Ethernet interrupts won’t be blocked by FreeRTOS critical sections?
Would you please change configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY to 6:
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 6
Thanks.
Finally picking up on this task again…
@aggarg I changed the syscall interrupt priority to 6 per your suggestion but I still see the issue.
Thanks @richard-damon, @hs2, @jefftenney, @aggarg for taking an interest in the issue we are having with UART Rx overrun.
Unfortunately we still have not been able to figure out the cause of the issue.
But we now at least know that the issue is reproducible when the FreeRTOS IP stack is enabled (via FreeRTOS_IPInit).
Our FW does NOTHING application specific but it lets the IP stack do whatever IP stack is supposed to do.
We hired SoMLabs to write the network driver for us and it’s integrated with the IP stack and we haven’t encountered any network related issues.
It seems to me that it’s reasonable to think that the culprit is either in FreeRTOS IP stack or the network driver.
I wanted to reach out to the FreeRTOS community to see if anyone can share any information regarding this issue.
Below is what our test application is made up of
FreeRTOS idle task
FreeRTOS time service task
application task1 (wakes up every 1 sec, does nothing, and goes back to sleep)
application task2 (reads one byte at a time from UART using LPUART_RTOS_Receive())
FreeRTOS IP task
Ethernet driver task
When FreeRTOS_IPInit is called, the last two tasks are created and run and the Rx overruns occur.
When FreeRTOS_IPInit is NOT called, Rx overruns do NOT happen.
Any suggestion is much appreciated.
What are the priorities of these tasks? Is it possible that application task2 is not getting enough CPU time (because the IP task or Ethernet task has a higher priority) and is therefore not able to read the UART fast enough?
Thanks.
Priorities are as follows:
FreeRTOS idle task: 0
FreeRTOS time service task: 9
application task1 (wakes up every 1 sec, does nothing, and goes back to sleep): 5
application task2 (reads one byte at a time from UART using LPUART_RTOS_Receive()): 6
FreeRTOS IP task: 8
Ethernet driver task: 9
Can you try increasing the priority of the application task2 to 9?
Thanks.
This blows my mind.
UART Rx overrun appears to no longer happen now that I have changed my UART task priority to 9.
By the way, I also had a few other things changed from their original settings:
Ethernet driver interrupt priority: 6 (originally 4)
configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY: 6 (originally 2)
UART interrupt priority: 4 (originally 5)
I am going to try to revert all these changes back to original values and change the priority of my UART application task from 5 to 9.
@aggarg Would you help me understand how this changes the behavior of the UART interrupt?
Based on the logic analyzer captures reviewed initially, I had to conclude that something was disabling interrupts long enough that the UART interrupt did not get triggered even after 3 bytes of data had been received. I still could not pinpoint what could be doing that.
But based on the experiment I just did with your guidance, somehow increasing the priority of the task that reads from the UART resolves the issue. How does the application task priority affect the ISR behavior?
As you pointed out, I now understand that interrupts can be blocked based on SYSCALL_INTERRUPT_PRIORITY, but I still can’t seem to connect the dots.
Would you mind enlightening my soul?