FreeRTOS with Altera / Intel NIOS II, exception handling clash.

andrewparlane wrote on Friday, November 30, 2018:

I have a NIOS II softcore design running on a Cyclone V FPGA. I would like to run FreeRTOS but have been having some weird issues with spurious interrupts. Looking into it further, it appears that both FreeRTOS and the NIOS BSP contain interrupt handling code, which the linker script merges together into a horrendous mishmash of code.

Specifically the linker script contains:

.exceptions :
{
    PROVIDE (__ram_exceptions_start = ABSOLUTE(.));
    . = ALIGN(0x20);
    KEEP (*(.irq));
    KEEP (*(.exceptions.entry.label));
    KEEP (*(.exceptions.entry.user));
    KEEP (*(.exceptions.entry.ecc_fatal));
    KEEP (*(.exceptions.entry));
    KEEP (*(.exceptions.irqtest.user));
    KEEP (*(.exceptions.irqtest));
    KEEP (*(.exceptions.irqhandler.user));
    KEEP (*(.exceptions.irqhandler));
    KEEP (*(.exceptions.irqreturn.user));
    KEEP (*(.exceptions.irqreturn));
    KEEP (*(.exceptions.notirq.label));
    KEEP (*(.exceptions.notirq.user));
    KEEP (*(.exceptions.notirq));
    KEEP (*(.exceptions.soft.user));
    KEEP (*(.exceptions.soft));
    KEEP (*(.exceptions.unknown.user));
    KEEP (*(.exceptions.unknown));
    KEEP (*(.exceptions.exit.label));
    KEEP (*(.exceptions.exit.user));
    KEEP (*(.exceptions.exit));
    KEEP (*(.exceptions));
    PROVIDE (__ram_exceptions_end = ABSOLUTE(.));
} > foobar

Then in the BSP, alt_irq_entry.S contains code that is placed into some of those sections. Specifically:

Line 54: .section .exceptions.entry.label, "xa"
Line 62: .section .exceptions.irqtest, "xa"
Line 85: .section .exceptions.irqhandler, "xa"
Line 93: .section .exceptions.irqreturn, "xa"
Line 97: .section .exceptions.notirq.label, "xa"
Line 106: .section .exceptions.exit.label

Then FreeRTOS port_asm.S has other code which is put into these sections too:

Line 36: .section .exceptions.entry, "xa"
Line 78: .section .exceptions.irqtest, "xa"
Line 90: .section .exceptions.irqhandler, "xa"
Line 94: .section .exceptions.irqreturn, "xa"
Line 135: .section .exceptions.soft, "xa"

The map file created by the linker contains:

.exceptions 0x0000000000000020 0x2fc
[!provide] PROVIDE (__ram_exceptions_start, ABSOLUTE (.))
0x0000000000000020 . = ALIGN (0x20)
*(.irq)
*(.exceptions.entry.label)
.exceptions.entry.label
0x0000000000000020 0x0 …/bsp/\libhal_bsp.a(alt_irq_entry.o)
0x0000000000000020 alt_irq_entry
.exceptions.entry.label
0x0000000000000020 0x0 …/bsp/\libhal_bsp.a(alt_exception_entry.o)
0x0000000000000020 alt_exception
*(.exceptions.entry.user)
*(.exceptions.entry.ecc_fatal)
*(.exceptions.entry)
.exceptions.entry
0x0000000000000020 0x8c obj/default/FreeRTOSv10.1.1/Source/portable/GCC/NiosII/port_asm.o
.exceptions.entry
0x00000000000000ac 0x54 …/bsp/\libhal_bsp.a(alt_exception_entry.o)
*(.exceptions.irqtest.user)
*(.exceptions.irqtest)
.exceptions.irqtest
0x0000000000000100 0x14 obj/default/FreeRTOSv10.1.1/Source/portable/GCC/NiosII/port_asm.o
.exceptions.irqtest
0x0000000000000114 0x10 …/bsp/\libhal_bsp.a(alt_irq_entry.o)
*(.exceptions.irqhandler.user)
*(.exceptions.irqhandler)
.exceptions.irqhandler
0x0000000000000124 0x4 obj/default/FreeRTOSv10.1.1/Source/portable/GCC/NiosII/port_asm.o
.exceptions.irqhandler
0x0000000000000128 0x4 …/bsp/\libhal_bsp.a(alt_irq_entry.o)
*(.exceptions.irqreturn.user)
*(.exceptions.irqreturn)
.exceptions.irqreturn
0x000000000000012c 0x8c obj/default/FreeRTOSv10.1.1/Source/portable/GCC/NiosII/port_asm.o
0x000000000000012c restore_sp_from_pxCurrentTCB
.exceptions.irqreturn
0x00000000000001b8 0x4 …/bsp/\libhal_bsp.a(alt_irq_entry.o)
*(.exceptions.notirq.label)
.exceptions.notirq.label
0x00000000000001bc 0x0 …/bsp/\libhal_bsp.a(alt_irq_entry.o)
*(.exceptions.notirq.user)
*(.exceptions.notirq)
.exceptions.notirq
0x00000000000001bc 0x8 …/bsp/\libhal_bsp.a(alt_exception_entry.o)
*(.exceptions.soft.user)
*(.exceptions.soft)
.exceptions.soft
0x00000000000001c4 0x2c obj/default/FreeRTOSv10.1.1/Source/portable/GCC/NiosII/port_asm.o
*(.exceptions.unknown.user)
*(.exceptions.unknown)
.exceptions.unknown
0x00000000000001f0 0x4 …/bsp/\libhal_bsp.a(alt_exception_entry.o)
*(.exceptions.exit.label)
.exceptions.exit.label
0x00000000000001f4 0x0 …/bsp/\libhal_bsp.a(alt_irq_entry.o)
.exceptions.exit.label
0x00000000000001f4 0x0 …/bsp/\libhal_bsp.a(alt_exception_entry.o)
*(.exceptions.exit.user)
*(.exceptions.exit)
.exceptions.exit
0x00000000000001f4 0x54 …/bsp/\libhal_bsp.a(alt_exception_entry.o)
*(.exceptions)
.exceptions 0x0000000000000248 0xd4 …/bsp/\libhal_bsp.a(alt_irq_handler.o)
0x0000000000000248 alt_irq_handler
[!provide] PROVIDE (__ram_exceptions_end, ABSOLUTE (.))
[!provide] PROVIDE (__flash_exceptions_start, LOADADDR (.exceptions))

Which shows how the code gets mashed together.
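For anyone wanting to check their own build for the same overlap, here is a minimal sketch in Python that lists every input section receiving non-empty code from more than one object file. It assumes the usual GNU ld map layout, where a section name is followed (on the same line or the next one) by address, size and object file. The function names and the shortened sample paths are illustrative only, not taken from the BSP or FreeRTOS.

```python
import re
from collections import defaultdict

# "<address> <size> <object file>" as printed by GNU ld in a map file.
_CONTRIB = re.compile(r"^(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(\S.*)$")


def _entry(match):
    """Convert a regex match into (address, size, object_file)."""
    return int(match.group(1), 16), int(match.group(2), 16), match.group(3)


def section_contributors(map_text):
    """Map each input section name to its (address, size, object) entries."""
    found = defaultdict(list)
    pending = None  # section name seen on its own line, details expected next
    for raw in map_text.splitlines():
        line = raw.strip()
        if line.startswith("."):
            name, _, rest = line.partition(" ")
            match = _CONTRIB.match(rest.strip())
            if match:
                found[name].append(_entry(match))
            # Only a bare name can be completed by the following line.
            pending = name if not rest.strip() else None
            continue
        match = _CONTRIB.match(line)
        if match and pending:
            found[pending].append(_entry(match))
        pending = None
    return dict(found)


def shared_sections(contributors):
    """Keep sections where two or more different objects contribute code."""
    shared = {}
    for name, entries in contributors.items():
        objects = []
        for _addr, size, obj in entries:
            if size > 0 and obj not in objects:
                objects.append(obj)
        if len(objects) > 1:
            shared[name] = objects
    return shared


# Sample modelled on the map excerpt above, with object paths shortened.
SAMPLE = """\
 *(.exceptions.irqhandler)
 .exceptions.irqhandler
                0x0000000000000124        0x4 port_asm.o
 .exceptions.irqhandler
                0x0000000000000128        0x4 libhal_bsp.a(alt_irq_entry.o)
 *(.exceptions.soft)
 .exceptions.soft
                0x00000000000001c4       0x2c port_asm.o
"""

if __name__ == "__main__":
    for name, objects in shared_sections(section_contributors(SAMPLE)).items():
        print(name, "<-", ", ".join(objects))
```

Zero-sized contributions are ignored on purpose, since those are just labels (such as .exceptions.entry.label) and do not add instructions to the merged handler.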

Finally, looking at the objdump we can see the final code:

00000020 <alt_exception>:
20: ef7fff04 addi ea,ea,-4
24: deffe304 addi sp,sp,-116
28: dfc00015 stw ra,0(sp)
2c: d8400215 stw at,8(sp)
30: d8800315 stw r2,12(sp)
34: d8c00415 stw r3,16(sp)
38: d9000515 stw r4,20(sp)
3c: d9400615 stw r5,24(sp)
40: d9800715 stw r6,28(sp)
44: d9c00815 stw r7,32(sp)
48: da000915 stw r8,36(sp)
4c: da400a15 stw r9,40(sp)
50: da800b15 stw r10,44(sp)
54: dac00c15 stw r11,48(sp)
58: db000d15 stw r12,52(sp)
5c: db400e15 stw r13,56(sp)
60: db800f15 stw r14,60(sp)
64: dbc01015 stw r15,64(sp)
68: 000b307a rdctl r5,estatus
6c: d9401115 stw r5,68(sp)
70: df401215 stw ea,72(sp)
74: dc001315 stw r16,76(sp)
78: dc401415 stw r17,80(sp)
7c: dc801515 stw r18,84(sp)
80: dcc01615 stw r19,88(sp)
84: dd001715 stw r20,92(sp)
88: dd401815 stw r21,96(sp)
8c: dd801915 stw r22,100(sp)
90: ddc01a15 stw r23,104(sp)
94: de801b15 stw gp,108(sp)
98: df001c15 stw fp,112(sp)

0000009c <save_sp_to_pxCurrentTCB>:
9c: 06000074 movhi et,1
a0: c61ef004 addi et,et,31680
a4: c6000017 ldw et,0(et)
a8: c6c00015 stw sp,0(et)
ac: deffed04 addi sp,sp,-76
b0: dfc00015 stw ra,0(sp)
b4: d8400215 stw at,8(sp)
b8: d8800315 stw r2,12(sp)
bc: d8c00415 stw r3,16(sp)
c0: d9000515 stw r4,20(sp)
c4: d9400615 stw r5,24(sp)
c8: d9800715 stw r6,28(sp)
cc: d9c00815 stw r7,32(sp)
d0: 000b307a rdctl r5,estatus
d4: da000915 stw r8,36(sp)
d8: da400a15 stw r9,40(sp)
dc: da800b15 stw r10,44(sp)
e0: dac00c15 stw r11,48(sp)
e4: db000d15 stw r12,52(sp)
e8: db400e15 stw r13,56(sp)
ec: db800f15 stw r14,60(sp)
f0: dbc01015 stw r15,64(sp)
f4: d9401115 stw r5,68(sp)
f8: ebffff04 addi r15,ea,-4
fc: dbc01215 stw r15,72(sp)

Addresses 20 - a8 are from the FreeRTOS code, and ac - fc are from the BSP code. Note how the ea register is handled by both chunks of code.

An even more obvious example is:

00000124 <hw_irq_handler>:
124: 00002480 call 248 <alt_irq_handler>
128: 00002480 call 248 <alt_irq_handler>

alt_irq_handler is in the BSP and is written in such a way that it locks up if the ipending register is 0. That is guaranteed to happen the second time it’s called, because interrupts are disabled and the first call should already have handled all of the pending interrupts.

So that summarizes the issue pretty well (I think). The question is what to do about it. One option is to delete the exception handling code from the BSP, but that’s auto-generated and would get replaced every time you generate the BSP. I could write a patch that automatically deletes it on generation, but that seems a bit hacky. Any ideas?

rtel wrote on Sunday, December 02, 2018:

The NIOS demo in the FreeRTOS download is pretty old - but how does it
deal with this issue? I suspect it has done as you suggest.

andrewparlane wrote on Sunday, December 02, 2018:

I’ve been unable to build the demo app. It includes system.h which is part of the BSP but there isn’t a BSP project. There’s a syslib project, but when I import that into the NIOS build tools (eclipse) the standard NIOS2 option isn’t in the menu to generate the BSP.

I’ve been assuming that you have to have your own hardware config and then create your own BSP project from that, which I could do, but would end up having the same issues as I’m having.

rtel wrote on Sunday, December 02, 2018:

Sorry I can’t remember how the project was created - but can’t see any
exception handling code in the project that would have been
auto-generated so removing that from your project to see if it fixes the
problem would seem to be a good suggestion - if that works then the
question will be how to prevent it being auto generated again…

andrewparlane wrote on Sunday, December 02, 2018:

I have temporarily removed the alt_irq_entry.S and alt_exception_entry.S files, and after that everything works fine.

It doesn’t strike me as the best fix ever, but I can’t think of anything else that would work. Obviously those files will be replaced next time I generate the BSP, but I should be able to set up a patch to delete them each time.
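A minimal sketch of such a clean-up step, in Python, could look like the following. Only the two file names come from this thread; the function name, the idea of searching the whole BSP directory recursively, and running it by hand after each BSP generation are assumptions for illustration. It does not touch the generated Makefile, which may or may not need adjusting in a given setup.

```python
import sys
from pathlib import Path

# File names taken from the thread; everything else here is illustrative.
CLASHING_SOURCES = ("alt_irq_entry.S", "alt_exception_entry.S")


def remove_clashing_sources(bsp_dir, names=CLASHING_SOURCES):
    """Delete the named files anywhere under bsp_dir; return what was removed."""
    removed = []
    # sorted() materializes the listing before anything is deleted.
    for path in sorted(Path(bsp_dir).rglob("*")):
        if path.is_file() and path.name in names:
            path.unlink()
            removed.append(path)
    return removed


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python remove_clashing_sources.py <bsp_dir>")
    for path in remove_clashing_sources(sys.argv[1]):
        print("removed", path)
```

Matching on the file name rather than a fixed path avoids guessing where the generator places the files inside the BSP.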

The other option would be to remove the exception handling from freeRTOS, but I expect that’d cause other issues.

I’m a little late to the party, but I just spent about 3 weeks bashing my head on this issue. It’s very important to understand that the linker is going to merge in code from several files based on the ".section" declarations. Code in port_asm.S does not execute in straight-line fashion. Of note, the link map shows that the FreeRTOS port_asm.S portion declared in .exceptions.entry is mapped in ahead of the Altera BSP code. At the end of that section is this important chunk:

save_sp_to_pxCurrentTCB:
    movia   et, pxCurrentTCB    # Load the address of the pxCurrentTCB pointer
    ldw     et, (et)            # Load the value of the pxCurrentTCB pointer
    stw     sp, (et)            # Store the stack pointer into the top of the TCB

    br      irq_test_user       # skip the section .exceptions.entry

    .section .exceptions.irqtest, "xa"
irq_test_user:

That branch in the middle, which looks like a useless branch over nothing, is actually skipping over the BSP-supplied code. (The comment is vague.) You can chase through the BSP source tree with "grep -r <section_name> *" and it’ll give you some indication of what code is going to be merged in, and where. The linker’s merge order is defined in the linker.x file.

Toward the end of port_asm.S, the restore context portion is mapped into .exceptions.exit.user, which will place it ahead of the Altera BSP exit code. When you hit the local “eret” instruction, program control changes back into the current-context process, effectively skipping the Altera BSP restore code.

For reasons I can’t explain, the .exceptions.soft section is placed at the end of the port_asm.S file, but the linker maps it into the middle. My local copy has a bunch of block comments explaining the convoluted execution path caused by the not-immediately-obvious linker shenanigans (and by “shenanigans,” I mean “it’s doing exactly what it was asked to do.”)

Does anything need changing in the port code here?

No, no code changes are required (though I will mention that I had grabbed a .zip download of the baseline, and it was one rev old … and that didn’t work properly.) The current baseline works.

I wanted to share my experience, in the hopes that I could save someone else the pain of not completely understanding that the linker was doing things behind the scenes. Those section declarations are really important.

I get the same error, but I don’t see a way to fix it in this thread. My code already does not have alt_exception_entry.S or alt_irq_entry.S.
Does anyone have any ideas how to stop this?
Thanks,
Martin

Would you please elaborate on what error you are getting?

Hi, thanks for the reply. When I run the OS, it stops at the break instruction below:

    .section .exceptions.soft, "xa"
soft_exceptions:
    ldw     et, 0(ea)               # Load the instruction where the interrupt occurred.
    movhi   at, %hi(0x003B683A)     # Load the registers with the trap instruction code
    ori     at, at, %lo(0x003B683A)
    cmpne   et, et, at              # Compare the trap instruction code to the last executed instruction
    beq     et, r0, call_scheduler  # its a trap so switchcontext
    break                           # This is an un-implemented instruction or muldiv problem.
    br      restore_context         # its something else

I am also using FreeRTOSv202212.01.zip for the Nios II. None of these projects work well: the batch file for setting up folders is wrong, so I used a sample project found online that puts the OS in the BSP (board support package). That project has very different versions of the OS files, and the OS files in the demo are nothing like the files at the top level. Everything I try seems not to work. I am no longer sure whether the demo files for the Nios project are correct or whether I should update them to the top-level files, but most of those are incompatible for one reason or another. I have been working on this for months and it is a total struggle.
Thanks,
Martin

It seems like you are executing an unimplemented instruction. Can you try to find out which instruction it is?

This post seems like the exact same problem. Does the solution there work for you?

Hi, thank you. I have seen the post you refer to and did set hal.enable_runtime_stack_checking to false.
I don’t know which instruction is causing the problem, as there is no trace or stack info at this point.
I’m concerned that FreeRTOS is not stable enough for use.
Martin

You can examine the register ea for the address of the instruction that caused this exception, and then find the instruction at that address. One way to do that is to use the disassembly view in your IDE.
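If only the raw 32-bit word at that address is available (for example from a memory view), a small helper can give a first classification. This is a rough sketch in Python, assuming the classic Nios II instruction format in which the low 6 bits hold OP and R-type instructions (OP 0x3A) carry their real opcode in the OPX field at bits 16..11. The helper name and returned keys are made up for illustration; the trap constant is the one port_asm.S compares against, and the other two test words come from the objdump listing earlier in this thread.

```python
TRAP_WORD = 0x003B683A  # trap encoding that port_asm.S compares against
OP_RTYPE = 0x3A         # R-type: the real opcode lives in the OPX field
OP_CALL = 0x00          # J-type call


def decode(word):
    """Split a 32-bit instruction word into the fields useful for triage."""
    word &= 0xFFFFFFFF
    op = word & 0x3F
    info = {"op": op, "is_trap": word == TRAP_WORD}
    if op == OP_RTYPE:
        info["opx"] = (word >> 11) & 0x3F
    elif op == OP_CALL:
        # imm26 is a word index; the top 4 bits of the real target come
        # from the program counter, which this sketch ignores.
        info["call_target"] = (word >> 6) << 2
    return info


if __name__ == "__main__":
    for sample in (0x003B683A, 0x000B307A, 0x00002480):
        print(hex(sample), decode(sample))
```

If is_trap is false, the OP and OPX values can be looked up in the Nios II instruction set reference to see which instruction the core refused to execute.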