Bug and fix: Microchip dsPIC30F4011 port can cause address fault when disabling interrupts

e-christenson wrote on Saturday, August 08, 2015:

I’d like to report a bug and a bugfix for certain Microchip dsPIC30F devices, including but not limited to the dsPIC30F4011/4013 and 5011/5013 – these are the processors I have worked with.

Specifically, the freeRTOS can crash when entering critical sections or disabling interrupts due to errata within the chip’s interrupt controller if the interrupt being disabled happens exactly at the same time it is disabled. The port, whether in assembler or C, should correct for this issue by using the disi instruction before and after changing the interrupt privelege level or disabling interrupts.

My tools: FreeRTOSv8.2.1 official and MPLAB-X with XC-16r1.24.
Reference: http://ww1.microchip.com/downloads/en/DeviceDoc/80453G.pdf (dsPIC30F5011/5013 Family Silicon Errata and Data Sheet Clarification) section 10, Interrupt Controller and section 11, disi instruction. see also XC-16 compiler switch -mErrata=all.

These sections state that the CPU can generate an address trap when an interrupt occurs exactly as it is disabled…whether through increasing the CPU IPL (interrupt privelege level) or by turning off its enable flag or its status bit…and that the disi xxx instruction (disable interrupts for xxx CPU cycles) will fail if it is executed exactly on the last disabled cycle.

The current (8.2.1) “official” portDISABLE_INTERRUPTS and portENTER_CRITICAL macros simply manipulate the IPL.

Microchip XC-16 compiler documentation recommends using a compiler-supplied macro,
SET_CPU_IPL( new privelege level)
instead of writing directly to the IPL bits in the CPU status register SR. This macro covers every conceivable case of setting the CPU IPL, including the case where the code changing the IPL is itself in a section with interrupts previously disabled.

In my own code, since I need very short critical sections only, typically for using buffers or testing and/or setting setting mutexes, I simply write:

__builtin_disi(0x3FF); // same as "asm disi 0x3ff", meaning
                    // disable interrupts for some large number of cycles.
__builtin_disi(4);  // required time when interrupt shouldn't happen

or, more often, something like:

    IECxbits.U2TXIE=0;  // disable UART 2 Transmitter interrupt...
                       // IECxbits because x is not important here..
Serial_Xmit_Buffer[head++] =character_to_transmit;
IECxbits.U2TXIE = 1;

And yes, I am a careful ordering of operations here can be used to avoid a critical section.

rtel wrote on Sunday, August 09, 2015:

Thanks for pointing this out. I will take a look at those functions to see how they are implemented before deciding the best approach for this.

e-christenson wrote on Sunday, August 09, 2015:

I gave you a bug ticket, too, and listed affected CPUs. As noted, there are multiple solutions…especially once you start asking the solutions to work in multiple build environments. That makes me think that assembler language is probably the best route, as it will be most portable. The CPU overhead for a function call isn’t much.


tlafleur wrote on Sunday, August 09, 2015:


As you make these changes, it would be nice to also incorporate the changes needed for the dsPIC EP series. As outline in FreeRTOS interactive site.


e-christenson wrote on Sunday, August 09, 2015:

Do go check the errata for your chip(s) and let us know which issues affect freeRTOS.

By the way, I am also up to an issue with the documentation for the Microchip XC16 v1.24 compiler library – specifically, it states that a heap from the linker is required when using stream output functions such as printf. This isn’t the whole story – the heap is required only when setvbuf, which has to be supplied by the project, allocates memory for the stream using malloc. There’s no problem if the call sets no buffering for the stream, using _ IONBF. I am pretty sure that a statically allocated buffer would also work to avoid a heap.

I will try to get that into Microchip on Monday.

rtel wrote on Monday, August 10, 2015:

It looks like the SET_CPU_IPL macro is defined individually for each chip, so is there any reason why the code could not be changed as follows, to ensure the additional code for the errata work around is not included when chips that don’t have the errata are used:


Also, is any errata workaround needed in the vPortYield function where interrupts are disabled?


e-christenson wrote on Tuesday, August 11, 2015:

SET_CPU_IPL is super-portable, at the cost of some extra instructions. I am doubtful that the time will matter at typical clock speeds of thousands of instructions per millisecond (1ms is the shortest supported freeRTOS tick for a dsPIC).

Although it looks like vportYIELD is not a critical section, I’m a bit concerned about popping the SR (status register, with the CPU interrupt priority level) and somehow masking some interrupts.

I hope the following is a pathological example: the yielding task raises its CPU priority level, carefully avoiding the erratum, and then yields. “Task X” is at the same freeRTOS priority level and is resumed at the lower CPU IPL. Being well-behaved and short, “Task X” also yields, before the timer tick, after having found out it had no work to do. So the ielding task is resumed, and, just as the Status Register is popped, the XYZZY interrupt, which is at the base IPL of “Task X” happens. Bingo…address trap per erratum – XYZZY interrupt is being disabled the cycle before it happens.

Note that I have to work at this…the system I am actually working on won’t be doing any yielding or being preempted with its CPU IPL at anything other than zero.

In my own code, I probably would surround that “POP SR” instruction with disi (disable interrupts for the next N cycles) instructions on defensive grounds:

disi 0x3FFF
pop SR
disi 0x04

But I am implicitly assuming (unlike the Microchip macros) that all sections under disi are extremely short and never nest.

It’s worth noting that the disi instruction has its own errata…that go beyond the specifically affected chips. (This is from a sampling of dsPIC 24 and dsPIC33F chips, which don’t have the original erratum). Within the dsPIC30F chips, the disi instruction tends not to work if it is executed exactly as the special function register with the remaining disable count reaches zero. The other families will often only update the count register if it is zero.

In more ordinary code, the taskYIELD can be interrupted by the scheduler…in the current port, all tasks run at IPL 0, and the scheduler interrupt runs at IPL 1, so reducing the IPL back to 0 will not be a problem. Nothing of a lower freeRTOS priority can run, so restoring the IPL should not be an issue.

Apologies for the equivocation…