Task resumes during ISR?!?

dspaude wrote on Tuesday, October 16, 2007:

This all could possibly be related to my thread where I described task/semaphore problems. Details: FreeRTOS 4.5.0 running on an Atmel AT91SAM7SE512, code compiled with GCC 4.2.1 (YAGARTO).

What I am seeing is that my IRQ1 interrupt (XHFC) is being interrupted by FreeRTOS, which resumes some task that ends up signaling the semaphore of the very task the ISR is servicing and is about to signal itself (confused?). Take a look at a normal sequence of events:

[00, 00:00:29.690]  xhfc_ISR 1
[00, 00:00:29.690]  xhfc_ISR +
(XHFC task semaphore signaled here using xTaskResumeFromISR())
[00, 00:00:29.690]  xhfc_ISR -
[00, 00:00:29.690]  xhfc_ISR 2
(XHFC task receives semaphore and resumes)
[00, 00:00:29.690]  ** ENTER XHFC **
[00, 00:00:29.690]  ** EXIT XHFC **
(XHFC task suspends with xSemaphoreTake())

However, after some time (and this is under load similar to my other thread), I see an interruption and then an Undefined Instruction Exception:

[00, 00:00:29.705]  xhfc_ISR 1
[00, 00:00:29.705]  xhfc_XmtStart(1)=3
[00, 00:00:29.705]  ** ENTER XHFC **
[00, 00:00:29.705]  ** EXIT XHFC **
[00, 00:00:29.705]  xhfc_ISR +
(XHFC task semaphore signaled here using xTaskResumeFromISR())
[00, 00:00:29.705]  xhfc_ISR -
[00, 00:00:29.705]  xhfc_ISR 2
(XHFC task receives semaphore and resumes)
[00, 00:00:29.710]  ** ENTER XHFC **
[00, 00:00:29.710]  ** EXIT XHFC **
(XHFC task suspends with xSemaphoreTake())
[00, 00:00:29.710]  main_HandleFatalError:
[00, 00:00:29.710]  Undef: 20006D6A 60000012

R14=0x20006D6A, which translates to an address within xQueueGenericReceive() (the function called by xSemaphoreTake()), and SPSR_und=0x60000012 indicates the failure occurred while in the ARM’s IRQ mode (which is the mode I would expect for the XHFC interrupt, as it is connected to IRQ1):
.text          0x20006b3c      0x3ec appbuild/FreeRTOS/Source/queue.o
                0x20006c34                xQueueGenericSendFromISR
                0x20006ec0                xQueueCreate
                0x20006b3c                vQueueDelete
                0x20006dd8                xQueueGenericSend
                0x20006b98                xQueueReceiveFromISR
                0x20006b54                uxQueueMessagesWaiting
                0x20006ce0                xQueueGenericReceive
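
As a rough illustration of the decoding described above, the following C sketch extracts the mode field from the saved status word and looks the return address up in the queue.o symbols from the map excerpt. The names psr_mode(), symbol_for_address(), and map_symbol_t are hypothetical, and the sketch assumes the 0x3ec section size marks the end of the last symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ARM processor mode field: the low five bits of CPSR/SPSR. */
#define ARM_MODE_MASK 0x1Fu
#define ARM_MODE_IRQ  0x12u

typedef struct {
    uint32_t start;    /* first address of the symbol */
    const char *name;
} map_symbol_t;

/* queue.o symbols copied from the map excerpt, sorted by address. */
static const map_symbol_t queue_symbols[] = {
    { 0x20006b3cu, "vQueueDelete" },
    { 0x20006b54u, "uxQueueMessagesWaiting" },
    { 0x20006b98u, "xQueueReceiveFromISR" },
    { 0x20006c34u, "xQueueGenericSendFromISR" },
    { 0x20006ce0u, "xQueueGenericReceive" },
    { 0x20006dd8u, "xQueueGenericSend" },
    { 0x20006ec0u, "xQueueCreate" },
};

/* End of the .text contribution of queue.o: base 0x20006b3c + size 0x3ec. */
#define QUEUE_TEXT_END (0x20006b3cu + 0x3ecu)

/* Returns the mode field of a saved program status register. */
uint32_t psr_mode(uint32_t psr)
{
    return psr & ARM_MODE_MASK;
}

/* Returns the symbol containing addr, or NULL if addr is outside queue.o. */
const map_symbol_t *symbol_for_address(uint32_t addr)
{
    const size_t count = sizeof queue_symbols / sizeof queue_symbols[0];
    const map_symbol_t *found = NULL;

    if (addr >= QUEUE_TEXT_END) {
        return NULL;
    }
    for (size_t i = 0; i < count && queue_symbols[i].start <= addr; i++) {
        /* The scan relies on ascending start addresses. */
        assert(i == 0 || queue_symbols[i - 1].start < queue_symbols[i].start);
        found = &queue_symbols[i];
    }
    return found;
}
```

With the values from the log, psr_mode(0x60000012) yields 0x12 (IRQ mode), and both 0x20006D6A and 0x20006D76 resolve to xQueueGenericReceive.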

So, I am guessing that during the IRQ1 interrupt (XHFC set to Level 7 in Atmel’s PRIOR bits) I get a PIT interrupt which would actually be a higher priority interrupt than IRQ1. I am then guessing that the PIT starts executing and in the PIT ISR FreeRTOS decides to switch to a different task (since I am using preemptive mode). The task then resumes executing and happens to be a task that sends data to my XHFC task and signals the XHFC task to resume (xhfc_XmtStart(1)=3). The XHFC task then resumes since it is the highest priority task. When it finishes it waits for a semaphore. The RTOS then switches back to the IRQ1 (XHFC) ISR which then signals the XHFC task semaphore and the ISR exits. The ISR then allows the task switch to the XHFC task which resumes. However, when the XHFC task finishes and then goes to wait on the semaphore I get the Undefined exception.

So, should FreeRTOS be allowed to switch to a task while an ISR is executing??? I don’t think this should be allowed, but I don’t see anything that prevents FreeRTOS from doing so. In my ISR I use portENTER_SWITCHING_ISR() and its associated portEXIT_SWITCHING_ISR( xTaskWokenByPost ). However, in this case it is true that the ISR never had anything woken by itself (it doesn’t loop nor signal the semaphore more than once). Since a task executed that signaled the same semaphore, xTaskWokenByPost should arguably be pdTRUE, but how is the ISR supposed to know it got interrupted by something unrelated?

Anyway, how can I resolve this problem? How can I make FreeRTOS smart and not resume tasks during interrupts?

Thanks for your help,

dspaude wrote on Wednesday, October 17, 2007:

Well, it doesn’t LOOK like the tick (PIT) interrupts the IRQ1 ISR, so how is it possible for a task to resume?

[00, 00:01:56.195]  Tick
[00, 00:01:56.200]  Tick
[00, 00:01:56.205]  Tick
[00, 00:01:56.205]  xhfc_ISR 1
[00, 00:01:56.205]  xhfc_ISR +
[00, 00:01:56.205]  xhfc_ISR -
[00, 00:01:56.205]  xhfc_ISR 2
[00, 00:01:56.205]  ** ENTER XHFC **
[00, 00:01:56.205]  ** EXIT XHFC **
[00, 00:01:56.210]  Tick
[00, 00:01:56.215]  Tick
[00, 00:01:56.220]  Tick
[00, 00:01:56.220]  xhfc_ISR 1
[00, 00:01:56.220]  xhfc_XmtStart(1)=3
[00, 00:01:56.220]  ** ENTER XHFC **
[00, 00:01:56.220]  ** EXIT XHFC **
[00, 00:01:56.220]  xhfc_ISR +
[00, 00:01:56.220]  xhfc_ISR -
[00, 00:01:56.220]  xhfc_ISR 2
[00, 00:01:56.225]  Tick
[00, 00:01:56.225]  ** ENTER XHFC **
[00, 00:01:56.225]  ** EXIT XHFC **
[00, 00:01:56.225]  main_HandleFatalError:
[00, 00:01:56.225]  Undef: 20006D76 60000012

rtel wrote on Wednesday, October 17, 2007:

I cannot say I follow all this, but here are some comments which may help (hopefully).

I don’t understand the line “(XHFC task semaphore signaled here using xTaskResumeFromISR())”.  Are you using a semaphore to wake a task that is blocked on the semaphore, or are you using xTaskResumeFromISR() to unsuspend a task that is in the suspended state?  The former is the preferred method as the latter is susceptible to missed interrupts.  There is some discussion on this within this forum, and I think within the documentation for the function xTaskResumeFromISR() on the FreeRTOS.org WEB site.

Also, am I right from your explanation that you are permitting the PIT interrupt to interrupt the IRQ interrupt, and that both the IRQ and PIT interrupts are making use of the FreeRTOS.org API?  If so then this is definitely a dangerous scenario if you are using the code without modification.  Take a look at the interrupt nesting FAQ on the FreeRTOS.org WEB site.

The portEXIT_SWITCHING_ISR() macro must be placed at the end of a naked interrupt function.  This selects another task to execute, but does not start the task executing until the interrupt epilogue code executes.


dspaude wrote on Wednesday, October 17, 2007:

My error. I meant xSemaphoreGiveFromISR() is used from the XHFC ISR, not xTaskResumeFromISR() as I had written. A long time ago I was using the resume method and found that it didn’t work reliably for interrupts. That line was copied from other ISRs I wrote where I ran into the resume failure described by the FAQ, which is why I switched. I am really using xSemaphoreGiveFromISR().

I don’t have configKERNEL_INTERRUPT_PRIORITY #defined anywhere and I also don’t see it anywhere in the code that I have. I don’t know how I could modify the IRQ1 versus PIT interrupt priorities, so they are whatever the default state would be. From my post earlier this morning it appears that the IRQ1 and PIT do not interrupt one another.

I had a thought as to the failure. Since a task starts running while the ISR is active, and since that task signals a semaphore, the semaphore is NOT signaled with xSemaphoreGiveFromISR(), so that may be part of the problem??? But why did that task start running while in the ISR in the first place? I quadrupled the IRQ stack size (thinking that maybe the IRQ stack was getting corrupted) and it still fails the same way.

Here is basically what my ISR looks like:
void xhfc_ISR(void)
{
#if defined(scfg_kRTOS_FreeRTOS)
    /* Must come first, before any variable declarations. */
    portENTER_SWITCHING_ISR();
#endif

#if defined(scfg_kRTOS_FreeRTOS)
    portBASE_TYPE xTaskWokenByPost = pdFALSE;
#endif

    /* Clear the edge-triggered interrupt */
    AT91C_BASE_AIC->AIC_ICCR = 0x1 << AT91C_ID_IRQ1;

    log_msg("xhfc_ISR 1");

    (read from and write to memory location of XHFC)

    (task switches in here for some as yet unknown reason)

    if(task should be signaled)
    {
        log_msg("xhfc_ISR +");

        xTaskWokenByPost = xSemaphoreGiveFromISR(xOSSemaphoreHandle[S_XHFC], xTaskWokenByPost);

        log_msg("xhfc_ISR -");
    }

    log_msg("xhfc_ISR 2");

    /* End the interrupt in the AIC. */
    AT91C_BASE_AIC->AIC_EOICR = 0;

#if defined(scfg_kRTOS_FreeRTOS)
    portEXIT_SWITCHING_ISR( xTaskWokenByPost );
#endif
}

davedoors wrote on Wednesday, October 17, 2007:

It might be just a detail missing from the pseudo code you posted, but is your isr function declared using the naked attribute?

I cannot see how a task could execute at the point you indicate unless there is already some screw up somewhere.  Even if interrupts were nesting (which they are not by your account), then the task would not start executing until after all interrupts had completed.

Could it be that your log messages are being displayed or collected in the wrong order, and this is just making it look like the task is running?  Is your log message implementation thread and interrupt safe and non blocking?

You would not expect to have the kernel interrupt priority implemented on the ARM7 port because you can just use the FIQ to achieve the same effect.

Don’t have any other ideas just now.

adarkar9 wrote on Wednesday, October 17, 2007:

The fact that outside code is executing during your ISR means you have nested interrupts.  As Richard said, that is a dangerous scenario.  Can you try disabling interrupts during your ISR?

dspaude wrote on Wednesday, October 17, 2007:

The logging mechanism disables interrupts and it doesn’t allow re-entrancy. As such I don’t believe it will allow messages out of order.

Here’s my function prototype (declared in a header file, so that is why I didn’t include it above):
void xhfc_ISR( void ) __attribute__((naked));

adarkar9 (David), when I was trying to interpret my log messages I was drawing conclusions that seemed to be the only explanation for how a task could resume during an ISR. So I assumed that it must have been the preemptive tick which I assumed could interrupt the IRQ1 ISR (figuring it had a higher priority) and that tick might be what caused the task to execute. That was my guess. However, this morning I logged the preemptive tick and it did NOT happen during the IRQ1 ISR. Therefore I am at a loss as to what is causing the switch.

I checked the AIC interrupt priorities and the tick and IRQ1 are configured for 7, although I’ve never been able to tell if that really means anything. I tried setting my IRQ1 AIC priority level to 6 (preemptive tick is 7), but that made no difference.

The failure is highly reproducible, so I’ll eventually track it down.

rtel wrote on Wednesday, October 17, 2007:

>The logging mechanism disables interrupts and it doesn’t allow re-entrancy

Does it then indiscriminately re-enable interrupts at the end of the logging function?  If so I suspect this is the next place to investigate as enabling interrupts could result in unwanted nesting.

If you are going to log from interrupts then you will need to only re-enable interrupts if interrupts were found to be already enabled on entry to the function.
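
The save-and-restore pattern described here can be sketched in C as follows. This is a host-runnable model only: the variable sim_cpsr stands in for the real CPSR, and irq_save_and_disable(), irq_restore(), and log_msg_safe() are hypothetical names, not part of FreeRTOS or of the poster's code. On the target, the CPSR would be read and written with MRS/MSR instructions instead.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt mask bits in the ARM CPSR: I masks IRQ, F masks FIQ. */
#define ARM_I_BIT 0x80u
#define ARM_F_BIT 0x40u
#define ARM_IF_BITS (ARM_I_BIT | ARM_F_BIT)

/* Stand-in for the real CPSR so the sketch runs on a host.
   0x1F = system mode with IRQ and FIQ enabled. */
uint32_t sim_cpsr = 0x1Fu;

/* Number of messages "written" by the model logger. */
int sim_log_count = 0;

/* Masks IRQ and FIQ and returns the mask bits as they were on entry. */
uint32_t irq_save_and_disable(void)
{
    uint32_t previous = sim_cpsr & ARM_IF_BITS;
    sim_cpsr |= ARM_IF_BITS;
    return previous;
}

/* Puts the mask bits back exactly as they were, instead of
   unconditionally enabling interrupts. */
void irq_restore(uint32_t previous)
{
    sim_cpsr = (sim_cpsr & ~ARM_IF_BITS) | (previous & ARM_IF_BITS);
}

/* Logger that is safe to call from both task and ISR context. */
void log_msg_safe(const char *text)
{
    uint32_t saved = irq_save_and_disable();

    /* Inside the critical section both interrupt sources are masked. */
    assert((sim_cpsr & ARM_IF_BITS) == ARM_IF_BITS);
    (void)text;          /* a real logger would copy text to its buffer */
    sim_log_count++;

    irq_restore(saved);
}
```

Called from a task (interrupts enabled on entry), the logger leaves interrupts enabled on exit. Called from an ISR where IRQ is already masked, it leaves IRQ masked, so logging cannot open a window for unwanted nesting.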


dspaude wrote on Wednesday, October 17, 2007:

From what I can tell, the cause is an infrequent event that gets triggered by memory handling. When that event happens it signals a different semaphore, but that semaphore is signaled using xSemaphoreGive() (because that code normally isn’t called from within an ISR). This then causes the task execution which signals the XHFC task, which then runs because of its priority. After control returns and the XHFC task executes, it all blows up.

Is there something that can be used to signal a semaphore that works in non-ISR code but may also work if an ISR happens to swing its way?
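
One possible approach, which the thread itself does not confirm, is a wrapper that inspects the processor mode and picks the matching give call. The sketch below is a host-runnable model under stated assumptions: tasks run in User or System mode, any other mode is treated as interrupt context, and the two stub functions are hypothetical stand-ins for xSemaphoreGive() and xSemaphoreGiveFromISR(). The name give_any_context() is also hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ARM processor mode field: the low five bits of the CPSR. */
#define ARM_MODE_MASK 0x1Fu
#define ARM_MODE_USR  0x10u
#define ARM_MODE_SYS  0x1Fu

/* Call counters so the host-side stubs leave a visible trace. */
int task_path_calls = 0;
int isr_path_calls = 0;

/* Hypothetical stand-in for xSemaphoreGive(). */
static void stub_give_from_task(void)
{
    task_path_calls++;
}

/* Hypothetical stand-in for xSemaphoreGiveFromISR(); it passes the
   "task already woken" flag through unchanged in this model. */
static int stub_give_from_isr(int woken_so_far)
{
    isr_path_calls++;
    return woken_so_far;
}

/* Returns nonzero when the mode field is an exception mode
   (assumption: tasks only ever run in User or System mode). */
int psr_is_exception_mode(uint32_t psr)
{
    uint32_t mode = psr & ARM_MODE_MASK;
    return mode != ARM_MODE_USR && mode != ARM_MODE_SYS;
}

/* Picks the give variant that matches the current context.
   task_woken must be non-NULL when called from interrupt context,
   because the ISR still has to hand the flag to its exit macro. */
void give_any_context(uint32_t cpsr, int *task_woken)
{
    if (psr_is_exception_mode(cpsr)) {
        assert(task_woken != NULL);
        *task_woken = stub_give_from_isr(*task_woken);
    } else {
        stub_give_from_task();
    }
}
```

On the target, the cpsr argument would come from an MRS read of the CPSR. The caveat with this design is that the woken flag still has to reach portEXIT_SWITCHING_ISR() in the enclosing ISR, so moving the work out of the ISR and into a task may be the simpler fix.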

rtel wrote on Wednesday, October 17, 2007:

Ah yes - very good.  I didn’t think of that scenario.


dspaude wrote on Wednesday, October 17, 2007:

Thanks for the help, all, even though it eventually was an error in my code. It always helps to trace it all out via communication…

I modified the code to not do what it was doing in the ISR and moved that to the task (it was just more work that I didn’t want to do a few days ago because I was trying to get it working roughly–and it was working roughly).

Richard, the logging mechanism checks the FIQ/IRQ bits in the CPSR and will not enable interrupts if they were not enabled when the routine was entered. I use this mechanism all over in the code and it seems to work pretty well.

Thanks again,