Shadow registers in PIC32MZ EF

pbeam wrote on Tuesday, April 12, 2016:

In trying to make things better and more efficient, I am totally unable to get shadow registers to work for an interrupt that does not use FreeRTOS services. The effect I get is that FreeRTOS “crashes” when malloc fails – which is really not the problem.

I have an ISR that interrupts at 200 kHz, so overhead is a problem. This does not work:

void __attribute__((interrupt(IPL7SRS), no_fpu, vector(_EXTERNAL_2_VECTOR))) Int2_ISR(void) {
//void Int2_ISR(void) {
unsigned flag = 0;
if (!SPI1STATbits.SPIRBE) {

This does work:

void __attribute__((interrupt(IPL7AUTO), no_fpu, vector(_EXTERNAL_2_VECTOR))) Int2_ISR(void) {
//void Int2_ISR(void) {
unsigned flag = 0;
if (!SPI1STATbits.SPIRBE) {

Obviously, IPL7SRS has lower overhead, so it is desirable. I have defined the PRISS bits so that interrupt priority 7 uses shadow register set 7. (I have also used the no_fpu attribute since FreeRTOS v8.2.3 does not handle the FPU properly, and I don’t use it anyway.)
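For readers checking their own PRISS setup, here is a minimal host-side sketch of the value described above. It assumes the PRISS layout from the PIC32MZ EF data sheet, with a 4-bit PRIxSS field for priority x at bit 4*x. `priss_assign` is a hypothetical helper, not part of XC32 or FreeRTOS, and the result should be checked against the data sheet for your part before it is written to the register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: PRIxSS is a 4-bit field at bit 4*x, for x = 1..7. */
static uint32_t priss_assign(uint32_t priss, unsigned priority, unsigned shadow_set)
{
    assert(priority >= 1u && priority <= 7u); /* priority 0 has no PRIxSS field */
    assert(shadow_set <= 0xFu);               /* field is 4 bits wide; the number of
                                                 sets implemented is device specific */
    unsigned shift = 4u * priority;

    priss &= ~(UINT32_C(0xF) << shift);       /* clear the old assignment */
    priss |= (uint32_t)shadow_set << shift;   /* select the new shadow set */
    return priss;
}
```

Under these assumptions, `priss_assign(0, 7, 7)` gives 0x70000000: only priority 7 is mapped to a shadow set, and every other priority stays on register set 0.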

Should this work? I have looked at the assembly with IPL7AUTO, and it does seem to recognize that the shadow registers are available, as it skips a number of PUSH instructions. What makes this more critical is that I need to service an A/D at 1 MHz, so the shadow registers would surely help.

rtel wrote on Wednesday, April 13, 2016:

The compiler-specific level of detail is outside the scope of my knowledge, but I have taken advice on this and…

It should be okay, the prolog and epilog for ISR entry is clean.
The XC32 compiler will set up the IRQs so that each one gets its own shadow set - but it will depend on the compiler version.

pbeam wrote on Wednesday, April 13, 2016:

This is what my XC32 v1.40 does. With IPL7SRS, there is essentially no prolog:

!//void Int2_ISR(void) {
! unsigned flag = 0;
! if (!SPI1STATbits.SPIRBE) {
0x9D00CEF8: LUI V0, -16510
0x9D00CEFC: LW V0, 4112(V0)
0x9D00CF00: ANDI V0, V0, 32
0x9D00CF04: BNE V0, ZERO, 0x9D00CFEC
0x9D00CF08: LUI V0, -16511
! ADC_Value.data32 = SPI1BUF;
0x9D00CF84: LUI V0, -16510
0x9D00CF88: LW V1, 4128(V0)
0x9D00CF8C: SW V1, -32580(GP)
! flag = 1;

This causes FreeRTOS to bomb. With IPL7AUTO, I get all this stuff:
!void __attribute__((interrupt(IPL7AUTO), no_fpu, vector(_EXTERNAL_2_VECTOR))) Int2_ISR(void) {

0x9D00A0F4: RDPGPR SP, SP
0x9D00A0F8: MFC0 K1, EPC
0x9D00A0FC: MFC0 K0, SRSCtl
0x9D00A100: ADDIU SP, SP, -136
0x9D00A104: MFC0 K1, Status
0x9D00A108: SW K0, 128(SP)
0x9D00A10C: SW K1, 132(SP)
0x9D00A110: INS K1, ZERO, 1, 15
0x9D00A114: INS K1, ZERO, 29, 1
0x9D00A118: ORI K1, K1, 7168
0x9D00A11C: INS K1, ZERO, 29, 1
0x9D00A120: MTC0 K1, Status
0x9D00A124: SW V1, 24(SP)
0x9D00A128: SW V0, 20(SP)
0x9D00A12C: LW V1, 128(SP)
0x9D00A130: ANDI V1, V1, 15
0x9D00A134: BNE V1, ZERO, 0x9D00A17C
0x9D00A138: NOP
0x9D00A13C: SW RA, 84(SP)
0x9D00A140: SW T9, 80(SP)
0x9D00A144: SW T8, 76(SP)
0x9D00A148: SW T7, 72(SP)
0x9D00A14C: SW T6, 68(SP)
0x9D00A150: SW T5, 64(SP)
0x9D00A154: SW T4, 60(SP)
0x9D00A158: SW T3, 56(SP)
0x9D00A15C: SW T2, 52(SP)
0x9D00A160: SW T1, 48(SP)
0x9D00A164: SW T0, 44(SP)
0x9D00A168: SW A3, 40(SP)
0x9D00A16C: SW A2, 36(SP)
0x9D00A170: SW A1, 32(SP)
0x9D00A174: SW A0, 28(SP)
0x9D00A178: SW AT, 16(SP)
0x9D00A17C: MFLO V0
0x9D00A180: SW V0, 116(SP)
0x9D00A184: MFHI V1
0x9D00A188: SW V1, 112(SP)
0x9D00A18C: MFLO V0
0x9D00A190: SW V0, 108(SP)
0x9D00A194: MFHI V1
0x9D00A198: SW V1, 104(SP)
0x9D00A19C: MFLO V0
0x9D00A1A0: SW V0, 100(SP)
0x9D00A1A4: MFHI V1
0x9D00A1A8: SW V1, 96(SP)
0x9D00A1AC: MFLO V0
0x9D00A1B0: SW V0, 92(SP)
0x9D00A1B4: MFHI V1
0x9D00A1B8: SW V1, 88(SP)
0x9D00A1BC: RDDSP V1, 0x3F
0x9D00A1C0: SW V1, 12(SP)

Things work, but there is, of course, the additional overhead. The processor does take the jump at 0x9D00A134 so it skips all the SW instructions. Apparently, however, the compiler is not saving something important.
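The run-time test in the IPL7AUTO prolog (the ANDI with 15 followed by the BNE at 0x9D00A134) can be modelled on a host PC as follows. This is only a sketch: it assumes the MIPS32 Release 2 SRSCtl layout, with the current shadow set (CSS) in bits 3:0 and the previous shadow set (PSS) in bits 9:6, and `make_srsctl` and `must_save_gprs` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MIPS32 Release 2 SRSCtl fields: CSS = bits 3:0, PSS = bits 9:6. */
#define SRSCTL_CSS_MASK  UINT32_C(0xF)
#define SRSCTL_PSS_SHIFT 6

/* Hypothetical helper: build an SRSCtl value for experiments on a host PC. */
static uint32_t make_srsctl(unsigned pss, unsigned css)
{
    assert(pss <= 15u && css <= 15u);   /* both fields are 4 bits wide */
    return ((uint32_t)pss << SRSCTL_PSS_SHIFT) | (uint32_t)css;
}

/* Mirrors "ANDI V1, V1, 15" followed by "BNE V1, ZERO, ...":
 * the block of SW instructions runs only when CSS is 0. */
static bool must_save_gprs(uint32_t srsctl)
{
    return (srsctl & SRSCTL_CSS_MASK) == 0u;
}
```

With the ISR running in shadow set 7, `must_save_gprs(make_srsctl(0, 7))` is false, which matches the observation that the 16 SW instructions from 0x9D00A13C to 0x9D00A178 are skipped.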

I’m probably just going to deal with the overhead.

rtel wrote on Wednesday, April 13, 2016:

I’m told:

“The lack of prolog is what might be expected… you don’t have any regs
that need saving and only gp needs copying over…but it is not clear
why FreeRTOS would have a problem if the kernel is indeed running down
at IPL1… it should not be possible to interrupt IPL7 and when the IPL7
ISR returns all should be clean again”

pbeam wrote on Wednesday, April 13, 2016:

I think I might have figured it out. For posterity, XC32 seems to have a bug with optimization > 0. As you can see from my last message, there is absolutely no prolog, which means the Hi and Lo registers are not saved amongst other things. Changing to optimization of 0 generates this prolog:

void __attribute__((interrupt(IPL7SRS), no_fpu, vector(_EXTERNAL_2_VECTOR))) Int2_ISR(void) {
0x9D00F82C: RDPGPR SP, SP
0x9D00F830: MFC0 K0, EPC
0x9D00F834: MFC0 K1, Status
0x9D00F838: ADDIU SP, SP, -72
0x9D00F83C: SW K0, 68(SP)
0x9D00F840: MFC0 K0, SRSCtl
0x9D00F844: SW K1, 64(SP)
0x9D00F848: SW K0, 60(SP)
0x9D00F84C: INS K1, ZERO, 1, 15
0x9D00F850: INS K1, ZERO, 29, 1
0x9D00F854: ORI K1, K1, 7168
0x9D00F858: MTC0 K1, Status
0x9D00F85C: MFLO V0
0x9D00F860: SW V0, 52(SP)
0x9D00F864: MFHI V1
0x9D00F868: SW V1, 48(SP)
0x9D00F86C: MFLO V0
0x9D00F870: SW V0, 44(SP)
0x9D00F874: MFHI V1
0x9D00F878: SW V1, 40(SP)
0x9D00F87C: MFLO V0
0x9D00F880: SW V0, 36(SP)
0x9D00F884: MFHI V1
0x9D00F888: SW V1, 32(SP)
0x9D00F88C: MFLO V0
0x9D00F890: SW V0, 28(SP)
0x9D00F894: MFHI V1
0x9D00F898: SW V1, 24(SP)
0x9D00F89C: RDDSP V1, 0x3F
0x9D00F8A0: SW V1, 56(SP)
0x9D00F8A4: ADDU S8, SP, ZERO
!//void Int2_ISR(void) {
! unsigned flag = 0;
0x9D00F8A8: SW ZERO, 16(S8)

This is a little better than AUTO.
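For anyone reading these listings, the Status handling in the prolog (the two INS instructions, the ORI with 7168, then MTC0) can be reproduced as plain bit arithmetic. The sketch below is a host-side model, not compiler output. It assumes the standard MIPS32 Status layout, where bit 0 is IE, bit 1 is EXL, the IPL field starts at bit 10 and bit 29 is CU1. `prolog_status` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the Status value the prolog writes back with MTC0. */
static uint32_t prolog_status(uint32_t status, unsigned ipl)
{
    assert(ipl <= 7u);                 /* PIC32 interrupt priorities are 0..7 */

    status &= ~UINT32_C(0x0000FFFE);   /* INS K1, ZERO, 1, 15: clear bits 15:1 */
    status &= ~(UINT32_C(1) << 29);    /* INS K1, ZERO, 29, 1: clear bit 29    */
    status |= (uint32_t)ipl << 10;     /* ORI K1, K1, 7168 when ipl is 7       */
    return status;
}
```

Note that 7168 is 0x1C00, which is 7 shifted left by 10. As a worked example, a Status value of 0x25000403 becomes 0x05001C01: IE is kept, EXL and the old IPL are cleared, bit 29 is cleared, and IPL is set to 7.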

pbeam wrote on Wednesday, April 13, 2016:

For posterity, from Microchip comes the following statement:

It looks like there is an issue in v1.40 that is addressed for the upcoming release.

The issue is related to a new optimization introduced in GCC 4.8. If the code in the ISR triggers the optimization, some necessary context-saving code can get optimized away.

A workaround in XC32 v1.40 for the issue reported in the web forum thread would be to add this attribute to the function:

extern volatile int foo;

void __attribute__((interrupt(IPL7SRS), optimize("-fno-shrink-wrap"))) IntHandlerChangeNotification(void)
{
    if ((volatile unsigned int)0xfeedface)
        (void)foo;
}

Hopefully, this will save someone else the same pain I just went through. Not being a MIPS guru, I’m not sure why all of this is needed, but with optimization, shadow register, and -fno-shrink-wrap, the following prolog is produced, which appears to work:

void __attribute__((interrupt(IPL7SRS), no_fpu, vector(_EXTERNAL_2_VECTOR), optimize("-fno-shrink-wrap"))) Int2_ISR(void) {
0x9D00CEF8: RDPGPR SP, SP
0x9D00CEFC: MFC0 K0, EPC
0x9D00CF00: MFC0 K1, Status
0x9D00CF04: ADDIU SP, SP, -64
0x9D00CF08: SW K0, 60(SP)
0x9D00CF0C: MFC0 K0, SRSCtl
0x9D00CF10: SW K1, 56(SP)
0x9D00CF14: SW K0, 52(SP)
0x9D00CF18: INS K1, ZERO, 1, 15
0x9D00CF1C: INS K1, ZERO, 29, 1
0x9D00CF20: ORI K1, K1, 7168
0x9D00CF24: MTC0 K1, Status
0x9D00CF28: MFLO V0
0x9D00CF2C: SW V0, 44(SP)
0x9D00CF30: MFHI V1
0x9D00CF34: SW V1, 40(SP)
0x9D00CF38: MFLO V0
0x9D00CF3C: SW V0, 36(SP)
0x9D00CF40: MFHI V1
0x9D00CF44: SW V1, 32(SP)
0x9D00CF48: MFLO V0
0x9D00CF4C: SW V0, 28(SP)
0x9D00CF50: MFHI V1
0x9D00CF54: SW V1, 24(SP)
0x9D00CF58: MFLO V0
0x9D00CF5C: SW V0, 20(SP)
0x9D00CF60: MFHI V1
0x9D00CF64: SW V1, 16(SP)
0x9D00CF68: RDDSP V1, 0x3F
0x9D00CF6C: SW V1, 48(SP)

I’ll leave it till another time (or another person) to figure out why so much has to be saved and restored.