Possible bug in the CCS ARM Cortex R4 port

berni8k wrote on Sunday, September 15, 2013:

I have been porting FreeRTOS to the ARM9 core of a Texas Instruments OMAP-L138, and the closest official port to what I wanted was the ARM Cortex-R4 port for Code Composer Studio running on an RM48 chip.

After a day of hunting down all the problems I finally got my port working. Then, as I was playing around with it, I noticed some strange behaviour where it stopped task switching permanently. The problem kept happening when I tried to run two tasks at the same priority level where one contained nothing but an infinite loop with no delays. I also found out later that the infinite-loop task had to be the last one created in main. I suspected my interrupts had started misbehaving again, so I spent an afternoon tracking down what it could have been.

It turns out the problem is in the way the ulCriticalNesting variable is handled. I did not realize the kernel boots itself inside a permanent critical section (it would be silly not to, now that I see it). What happens in xPortStartScheduler(void) is that ulCriticalNesting gets set to 0 to make sure the first task has no critical nesting as it is entered, BUT this is not enough! The check for the count being 0 is performed in vPortExitCritical( void ), so the first task is entered still in critical mode, which means preemptive ticks can’t happen! The problem clears itself as soon as you call any RTOS function, such as a delay, that contains a vPortExitCritical( void ) call.
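A minimal host-side sketch of the behaviour described above, not the actual port code. All sim_ names are hypothetical stand-ins for the port's interrupt mask and ulCriticalNesting, and the non-zero boot value is an assumption chosen only for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model state: the kernel boots with interrupts masked and a
   non-zero nesting count (the value 1 here is an assumption). */
static bool     sim_irq_masked       = true;
static uint32_t sim_critical_nesting = 1u;

static void sim_enter_critical(void)
{
    sim_irq_masked = true;
    sim_critical_nesting++;
}

static void sim_exit_critical(void)
{
    assert(sim_critical_nesting > 0u);
    sim_critical_nesting--;
    /* Interrupts are only unmasked here, when the count reaches zero. */
    if (sim_critical_nesting == 0u) {
        sim_irq_masked = false;
    }
}

/* Models the start-up path as described in the post: the count is zeroed,
   but nothing in this model unmasks interrupts for the first task. */
static void sim_start_scheduler(void)
{
    sim_critical_nesting = 0u;
}
```

In this model the first task runs with sim_irq_masked still true until it makes one balanced enter/exit pair, which matches the observation that calling any RTOS function un-sticks the scheduler.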

I have fixed it by setting ulCriticalNesting to 1 instead and then calling vPortExitCritical( void ). I know very few applications will trigger this bug, but it left me running in circles for quite a while because I was looking in all the wrong places. I have already seen one other forum user report a very similar thread-switching lockup on a PIC.

You guys might want to do something about it if it really is a bug. (I have not tested the original port, as I only have the hardware to run my own port.) The port I found this in came with the official FreeRTOS V7.5.2 release; I have not checked any other ports for this, though.

rtel wrote on Sunday, September 15, 2013:

Hmm, I was just looking at the code and can see why this is.  The structure of the context switch comes from the ARMv4 code (ARM7, ARM9, etc.), but those ports store the critical nesting count as part of the task context.  The R4 port does not store the critical section nesting count as part of the task context because context switches are performed using a pendable interrupt, like the Cortex-M ports (and RX port, etc.).  Therefore the critical nesting does not return to zero until it is actually used.
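To make the distinction concrete, here is a rough host-side sketch of the two approaches. It is a model only: the sim_ names and the context struct are hypothetical and do not appear in any port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task context; a real context also holds the registers. */
typedef struct {
    uint32_t saved_nesting;
} sim_task_context_t;

/* Stands in for the global ulCriticalNesting. */
static uint32_t sim_nesting;

/* ARMv4-style switch: the nesting count is saved and restored with the task. */
static void sim_switch_saving_nesting(sim_task_context_t *from,
                                      const sim_task_context_t *to)
{
    assert(from != NULL && to != NULL);
    from->saved_nesting = sim_nesting;
    sim_nesting = to->saved_nesting;
}

/* Pendable-interrupt style switch: the global count is left untouched. */
static void sim_switch_without_nesting(sim_task_context_t *from,
                                       const sim_task_context_t *to)
{
    assert(from != NULL && to != NULL);
    (void)from;
    (void)to;
}
```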

"I have fixed it by setting ulCriticalNesting to 1 instead and then calling vPortExitCritical( void )."

The key thing here is that interrupts must not actually get enabled until *after* the scheduler has started.  If they get enabled before that time then an interrupt service routine might try and perform a context switch before the scheduler is even running, which would result in a crash.

I will have a ponder on how to get this done correctly.  Thanks for pointing it out.


rtel wrote on Sunday, September 15, 2013:

Hang on, scrub my last post, I don’t see what the problem is.

When a task is created the initial context for that task is placed onto the task’s stack.  The initial context has interrupts enabled (see the value of portINITIAL_SPSR in port.c).  When the scheduler is started the critical nesting count is set to 0, and when the context of the first task is popped off that task’s stack the CPSR value popped has the I bit clear, and therefore interrupts automatically become enabled.
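As a small host-side illustration of that check: the I bit is bit 7 of the ARM program status register, and a clear I bit means IRQs are enabled. The sim_ names are hypothetical, and the value 0x1F used for the initial SPSR (System mode, I and F clear) is an assumption here; check portINITIAL_SPSR in port.c for the real value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_PSR_I_BIT      (1u << 7)  /* set = IRQ masked */
#define SIM_PSR_MODE_MASK  0x1Fu      /* mode field, bits 4..0 */
#define SIM_MODE_SYSTEM    0x1Fu
#define SIM_INITIAL_SPSR   0x1Fu      /* assumed value of portINITIAL_SPSR */

/* Models the status word saved in the task context being written to the
   CPSR when the first task's context is restored. */
static uint32_t sim_restore_cpsr(uint32_t saved_spsr)
{
    /* This sketch only models tasks that run in System mode. */
    assert((saved_spsr & SIM_PSR_MODE_MASK) == SIM_MODE_SYSTEM);
    return saved_spsr;
}

static bool sim_irq_enabled(uint32_t psr)
{
    return (psr & SIM_PSR_I_BIT) == 0u;
}
```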

So, as far as I can see, when a task starts, it has a critical nesting count of 0, and interrupts are enabled, which is the correct system state.

Do you agree?


berni8k wrote on Sunday, September 15, 2013:

Ah, now I see that the portRESTORE_CONTEXT assembler macro writes the interrupt enable bits in the CPSR. I was under the impression that it was just making sure the flags in there don’t get lost. In my port I implemented the macros for enabling and disabling interrupts by writing to a register in the interrupt controller, since I could do that with a single line of C code that was likely translated to a single assembler instruction too. The only way I found of toggling those CPSR bits on an ARM9 required a read-modify-write sequence of three instructions, and because I used R0 for the modify operation I could not simply inline the assembler into C but instead had to call into an assembler file as a subroutine.
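For reference, the three-instruction sequence can be modelled on a host like this. It is only a sketch: sim_cpsr is a plain variable standing in for the real register, the sim_ names are hypothetical, and the instructions in the comments show the usual MRS/ORR or BIC/MSR pattern rather than the exact code from my assembler file.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PSR_I_BIT 0x80u           /* IRQ mask bit in the CPSR */

/* Stand-in for the CPSR: System mode with IRQs enabled. */
static uint32_t sim_cpsr = 0x1Fu;

static void sim_disable_irq(void)
{
    uint32_t r0 = sim_cpsr;           /* MRS r0, CPSR      */
    r0 |= SIM_PSR_I_BIT;              /* ORR r0, r0, #0x80 */
    sim_cpsr = r0;                    /* MSR CPSR_c, r0    */
    assert((sim_cpsr & SIM_PSR_I_BIT) != 0u);  /* IRQs are now masked */
}

static void sim_enable_irq(void)
{
    uint32_t r0 = sim_cpsr;           /* MRS r0, CPSR      */
    r0 &= ~SIM_PSR_I_BIT;             /* BIC r0, r0, #0x80 */
    sim_cpsr = r0;                    /* MSR CPSR_c, r0    */
    assert((sim_cpsr & SIM_PSR_I_BIT) == 0u);  /* IRQs are now enabled */
}
```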

I have adjusted the macros for switching interrupts on and off to use the CPSR method too, and now it seems to run fine without the crude fix! (Yes, that fix did have a chance of crashing if an interrupt arrived towards the end of xPortStartScheduler(void), since I had it enable interrupts just before the context restore.)

I would have loved to have some documentation describing what each function in the port code should be doing, as I have mostly been reverse engineering the C and assembler mash-up to figure out what it does.

By the way, is the speed gain from using configUSE_PORT_OPTIMISED_TASK_SELECTION very significant? It seems like it should help quite a bit, as the task selection gets called often, but I’m not sure how much overhead there is in everything else.
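For context, here is a rough host-side sketch of the difference as I understand it: the generic method walks down the priority levels until it finds a ready task, while the port-optimised method keeps a bitmap of ready priorities and uses a count-leading-zeros operation. The sim_ names are hypothetical, and the software loop in sim_clz32 only stands in for a hardware CLZ instruction.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_MAX_PRIORITIES 32u

/* Generic approach: step down from the top priority until a level with at
   least one ready task is found. Cost grows with the number of levels. */
static uint32_t sim_highest_generic(const uint8_t ready_count[SIM_MAX_PRIORITIES])
{
    uint32_t prio = SIM_MAX_PRIORITIES - 1u;
    while (prio > 0u && ready_count[prio] == 0u) {
        prio--;
    }
    return prio;
}

/* Portable stand-in for a single CLZ instruction. */
static uint32_t sim_clz32(uint32_t x)
{
    uint32_t n = 0u;
    if (x == 0u) {
        return 32u;
    }
    while ((x & 0x80000000u) == 0u) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Optimised approach: bit N set means priority N has a ready task. */
static uint32_t sim_highest_optimised(uint32_t ready_bitmap)
{
    assert(ready_bitmap != 0u);  /* at least one task must be ready */
    return 31u - sim_clz32(ready_bitmap);
}
```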

Thank you for your help and quick response.