hi,
We have a problem with strange FreeRTOS behavior on the AM243x (Cortex-R5).
The TI port uses “cpsid i” to block interrupts in critical sections.
We eventually found that disabling interrupts from a task, either with “cpsid i” or by setting the I bit in CPSR, does not block interrupts.
hi
I just disabled interrupts in a task (by calling enter_critical) and put a breakpoint in the interrupt handler. This should mask the interrupt and leave the critical nesting unbalanced, which means a deadlock.
But the debugger keeps stopping in the interrupt handler while showing that the I bit in CPSR is 1, which means interrupts are masked. I do not expect to stop in the interrupt handler while the I bit in CPSR is 1.
What am I missing?
thanks
I would not trust debugger breakpoints. Instead, I would do something like
volatile int glbDebugBreak = 0; // global scope
if (glbDebugBreak) __asm("bkpt #0");
and set glbDebugBreak to 1 when I would want the bp to be hit, or alternatively, increment a global variable in the isr and monitor it in a real time watch window.
It is very very unlikely that something as fundamental to mcu operation as interrupt masking does not work.
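The counter variant of this suggestion could look roughly like the sketch below. It is only an illustration: `g_isr_count`, `timer_isr_hook`, `isr_count_snapshot` and `isr_ran_since` are made-up names, and the hook is assumed to be called from the real timer ISR.

```c
#include <stdint.h>
#include <stdbool.h>

/* Incremented only by the ISR; volatile so every read goes to memory. */
volatile uint32_t g_isr_count = 0;

/* Hypothetical hook: call this from the real timer interrupt handler. */
void timer_isr_hook(void)
{
    g_isr_count++;
}

/* Take a snapshot, e.g. right after interrupts have been masked. */
uint32_t isr_count_snapshot(void)
{
    return g_isr_count;
}

/* Returns true if the ISR ran since the snapshot was taken.
 * Unsigned subtraction keeps the check valid across counter wraparound. */
bool isr_ran_since(uint32_t snapshot)
{
    return (uint32_t)(g_isr_count - snapshot) != 0u;
}
```

Taking the snapshot just after entering the critical section and checking `isr_ran_since()` just before leaving it shows whether the handler really ran while masked, without relying on where the debugger decides to halt.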
I cannot explain this. How did the program end up in the ISR while I and F are 1 (masked) and the mode is System? Is the debugger lying? I think something must be wrong in the port to this architecture. Maybe the mode of operation is not set correctly and writes to CPSR are silently ignored?
void vTaskEnterCritical( void )
{
    portDISABLE_INTERRUPTS();
    if( xSchedulerRunning != pdFALSE )
    {
        ( pxCurrentTCB->uxCriticalNesting )++;
        /* This is not the interrupt safe version of the enter critical
         * function so assert() if it is being called from an interrupt
         * context. Only API functions that end in "FromISR" can be used in an
         * interrupt. Only assert if the critical nesting count is 1 to
         * protect against recursive calls if the assert function also uses a
         * critical section. */
        if( pxCurrentTCB->uxCriticalNesting == 1 )
        {
            portASSERT_IF_IN_ISR();
        }
    }
    else
    {
        mtCOVERAGE_TEST_MARKER();
    }
}
#define portDISABLE_INTERRUPTS() __asm__ volatile ( "CPSID i" ::: "memory" )
The debugger does not “lie,” but optimizations may make it impossible for the debugger to correctly match source and assembler code. The comment in the screen shot suggests that the location you break at does nothing but return to the caller.
Which POD are you using exactly? Judging from the data sheets, it looks as if this could be a multicore MCU. Could it be that you disable interrupts on one core but the interrupt gets asserted on another core?
Ok, can you try to set up your system such that the problem shows without FreeRTOS? For example, in your main(), prior to starting the scheduler, set up your interrupt source, then add a cpsid instruction manually and see if the interrupt still fires. I agree with you that the ISR should not be reached with the I flag set, but this should really be independent of FreeRTOS.
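A rough sketch of such a FreeRTOS-free experiment is below. All names (`irq_mask`, `cpsr_read`, `run_mask_test`, `g_tick_isr_count`) are hypothetical, the timer ISR is assumed to increment `g_tick_isr_count`, and the code is assumed to run in a privileged mode, as main() normally does. The `#else` branch is only a host stand-in so the logic can be compiled off target; it proves nothing about the real core.

```c
#include <stdint.h>
#include <stdbool.h>

#define CPSR_I_BIT (1u << 7)   /* IRQ mask bit in CPSR */

#if defined(__arm__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'R')
/* AArch32 R-profile target: real instructions. */
static inline void irq_mask(void)
{
    __asm__ volatile ("cpsid i" ::: "memory");
}
static inline uint32_t cpsr_read(void)
{
    uint32_t v;
    __asm__ volatile ("mrs %0, cpsr" : "=r" (v));
    return v;
}
#else
/* Host stand-in (assumption): simulated CPSR, System mode, IRQ unmasked. */
static uint32_t sim_cpsr = 0x1Fu;
static inline void irq_mask(void)      { sim_cpsr |= CPSR_I_BIT; }
static inline uint32_t cpsr_read(void) { return sim_cpsr; }
#endif

/* Incremented by the (hypothetical) timer ISR on the target. */
volatile uint32_t g_tick_isr_count;

typedef struct {
    bool     i_bit_set;  /* CPSR.I read back as 1 after cpsid */
    uint32_t isr_hits;   /* ISR entries observed while masked */
} mask_test_result_t;

/* Mask IRQs, read CPSR back, then spin and count ISR entries. */
mask_test_result_t run_mask_test(uint32_t spin_iterations)
{
    mask_test_result_t r;

    irq_mask();
    r.i_bit_set = (cpsr_read() & CPSR_I_BIT) != 0u;

    uint32_t before = g_tick_isr_count;
    for (volatile uint32_t i = 0; i < spin_iterations; i++) {
        /* busy wait; choose a count that spans several timer periods */
    }
    r.isr_hits = g_tick_isr_count - before;
    return r;
}
```

If `i_bit_set` comes back false, the write to CPSR did not take effect. If it is true and `isr_hits` is still nonzero, the handler is being entered by some path other than a normal IRQ.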
The problem may not be directly related to FreeRTOS, but rather to the CPU mode of operation (System, SVC, other) that FreeRTOS uses.
The port to R5 is pretty straightforward, even simpler than M3. But maybe some specific instructions required to gain access to CPSR are missing.
That is why I am searching for similar complaints that may shed some light.
Interestingly, we already made a POC 2 years ago with an IC from the same line, but with A cores in addition to R. We did not see anything like this.
I hope this is not an error in silicon.
Just a shot in the dark: Is it possible that some other control path dispatches to the isr outside of an interrupt context? I do not know the R series too well, but on an M, I would expect something like 0xfffffffd in the lr if we are truly in an interrupt context. And wouldn’t the M field hold an 0x12 if we are in IRQ (judging from your screen shot)?
It’s a good point. b1111 in the low bits of CPSR while in the IRQ handler looks suspicious.
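To make the register values easier to compare, here is a small decoder for the CPSR fields discussed above. The bit positions (I = bit 7, F = bit 6, T = bit 5, M = bits [4:0]) and the mode encodings are the standard ARMv7-A/R ones; the type and function names are made up for this sketch.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    bool     irq_masked;  /* CPSR.I, bit 7 */
    bool     fiq_masked;  /* CPSR.F, bit 6 */
    bool     thumb;       /* CPSR.T, bit 5 */
    uint32_t mode;        /* CPSR.M, bits [4:0] */
} cpsr_fields_t;

/* Split a raw CPSR/SPSR value into the fields of interest. */
cpsr_fields_t cpsr_decode(uint32_t cpsr)
{
    cpsr_fields_t f;
    f.irq_masked = (cpsr & (1u << 7)) != 0u;
    f.fiq_masked = (cpsr & (1u << 6)) != 0u;
    f.thumb      = (cpsr & (1u << 5)) != 0u;
    f.mode       = cpsr & 0x1Fu;
    return f;
}

/* Map the M field to a mode name (modes present on ARMv7-R). */
const char *cpsr_mode_name(uint32_t mode)
{
    switch (mode & 0x1Fu) {
    case 0x10u: return "User";
    case 0x11u: return "FIQ";
    case 0x12u: return "IRQ";
    case 0x13u: return "Supervisor";
    case 0x17u: return "Abort";
    case 0x1Bu: return "Undefined";
    case 0x1Fu: return "System";
    default:    return "other";
    }
}
```

For example, a value whose low byte is 0xDF decodes to System mode with both I and F set, while a core that really took an IRQ would be expected to show 0x12 in the M field.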
I need to explain some background.
We have a timer interrupt that fires every 125 µs.
As a first step we confirmed that interrupt latency and jitter are good, first with the CPU timestamp counter and then with a digital output. This confirms that there are no spurious or missing interrupts.
Then we intended to confirm that task latency (give/take of a binary semaphore) is acceptable (a few µs).
Here we started to scratch our heads, because the picture we see is beyond our understanding. The number of events counted by the ISR and the task is pretty consistent, but the context switch time from ISR to task is pretty random.
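For reference, an ISR-to-task latency measurement of this kind can be collected with something like the sketch below. It assumes a free-running, up-counting 32-bit timestamp counter sampled once in the ISR (just before the semaphore give) and once in the task (right after the take returns). The names are invented for illustration and this is not the code actually used here.

```c
#include <stdint.h>

typedef struct {
    uint32_t min_ticks;
    uint32_t max_ticks;
    uint64_t sum_ticks;
    uint32_t samples;
} latency_stats_t;

void latency_init(latency_stats_t *s)
{
    s->min_ticks = UINT32_MAX;
    s->max_ticks = 0u;
    s->sum_ticks = 0u;
    s->samples   = 0u;
}

/* t_isr:  counter value captured in the ISR before giving the semaphore.
 * t_task: counter value captured in the task after the take returns. */
void latency_record(latency_stats_t *s, uint32_t t_isr, uint32_t t_task)
{
    /* Unsigned subtraction is wraparound-safe for an up-counting counter. */
    uint32_t delta = t_task - t_isr;

    if (delta < s->min_ticks) { s->min_ticks = delta; }
    if (delta > s->max_ticks) { s->max_ticks = delta; }
    s->sum_ticks += delta;
    s->samples++;
}

/* Mean latency in ticks, or 0 if nothing has been recorded yet. */
uint32_t latency_mean(const latency_stats_t *s)
{
    return (s->samples != 0u) ? (uint32_t)(s->sum_ticks / s->samples) : 0u;
}
```

Keeping min, max and mean separately makes it easier to tell a constant overhead (all three shift together) from real jitter (max runs away from min).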
We traced code execution from the return from interrupt back to the task switch and found that we had forgotten to remove “trace”, which has a lot of overhead.
We removed “trace” and confirmed that the return from the ISR resumes the right task, as expected.
But it did not help us understand the latencies.
We started to search for anomalies, such as interrupt priorities (max app priority vs. kernel priority) as in the M3 ports, but the R5 port is much simpler: it uses the global interrupt mask, which does not behave as expected, as I described in the first message.
I am starting to think that maybe CPSR.I is ignored in this implementation and replaced by the interrupt controller. Frankly, I would expect the TI guys to read the ARM/SoC documentation, not me.
I do not think that an ARM core licensee has a chance to modify that behavior as that should be in the microcode provided by ARM, but I could be wrong. In any case, contacting TI sounds like a good idea.
One more thing that comes to mind as a possibility is some kind of stack popping back to the wrong context. Once you are at the breakpoint, can you skip to the end of the ISR and do a single step to see where the code returns to?
"… Once you are at the breakpoint, can you skip to the end of the isr and do a single step to see where the code returns to?"
As expected: it restores the context and returns from the semaphore take. But that is a single run out of many.
Unfortunately, I have not found other R5 ports for reference.
Why would you expect this? Interrupts are asynchronous, so unless the code that waits on the semaphore immediately prior to that enforces the interrupt condition, the ISR could return to anywhere. In particular, it cannot return to “return from semaphore take” because FreeRTOS must execute some code to manage that (e.g., enforce a context switch, normally in the context of a service interrupt). So it can at best return to the beginning of the semaphore wait call sequence.
Can you share the relevant code snippets of the task that interacts with the isr and the isr itself?
The R5 CPU has a separate SPSR for each exception mode, per ARM’s documentation. The SPSR that you’ve shown is for SVC mode. Have you ensured that your User/System mode CPSR has the IRQ/FIQ disable bits set when you’re seeing this issue?