dsPIC33CK, Address Error with high interrupt rates

Hello,
I tried to condense the issue into the title. A little background:
I have an application running on a dsPIC33CK64MP502 (dsPIC33CK core, 64k Flash, 8k RAM).
The FreeRTOS port is v9.0.0 from the Microchip GitHub repository MicrochipTech/freeRTOS-PIC24-dsPIC-PIC32MM (FreeRTOS demos for Microchip device families such as PIC24, dsPIC33E, dsPIC33F, dsPIC33C and PIC32MM).
I should mention that I do not use the heap: all kernel objects are allocated statically, and all interrupts that deal with kernel objects have the same priority as the kernel timer interrupt.

The application uses CAN bus for communication. To test its robustness, I began flooding the bus with frames whose ID is accepted by the application.

The sequence of operations is:
1) A CAN interrupt occurs.
2) If it is an RX interrupt, check which FIFO has data available, read the data into a CAN frame structure (declared as a local variable inside the interrupt), then put the frame into a queue (using the ISR version of the call).
3) End of the CAN interrupt.

There are multiple queues; the FIFO number decides which queue is filled.
Other tasks wait for data from the queues.

I’ve noticed that when the bus traffic is reasonable (occasional bursts, spaced some tens of ms apart), the CAN frame variable inside the ISR is always allocated at the same address (which seems to be inside the IDLE task stack).
When the bus is flooded by a constant stream of frames, one every 1 ms, the CAN frame variable is placed at a higher address.
This goes on up to some point (it doesn’t happen systematically), until an address error trap is generated.
Unfortunately I can’t investigate the trap, as the stack pointer is corrupted (it shows 0x04 instead of a valid address greater than 0x1000), and I haven’t been able to find the exact moment when the stack gets corrupted.

I assume the problem lies somewhere in the CAN interrupt and in the context switching.

Here are the questions:
1) Am I correct in my assumption that the interrupt uses the IDLE task stack?
2) If so, can I change the stack to that of the “CAN Task”?
3) I am wondering why the address of the CAN frame variable in the ISR changes depending on the interrupt rate. What could be the reason? I think I might be overflowing the IDLE task stack even though the kernel doesn’t detect the overflow.

My FreeRTOSConfig.h file:

#define configSUPPORT_DYNAMIC_ALLOCATION          0
#define configUSE_PREEMPTION                      1
#define configUSE_IDLE_HOOK                       1
#define configUSE_TICK_HOOK                       1
#define configTICK_RATE_HZ                        ((TickType_t)1000)
#define configCPU_CLOCK_HZ                        ((unsigned long)4000000) //Fosc/2
#define configMAX_PRIORITIES                      (4)
#define configMINIMAL_STACK_SIZE                  (300)
//#define configTOTAL_HEAP_SIZE                     ((size_t)5000)
#define configMAX_TASK_NAME_LEN                   (16)
#define configQUEUE_REGISTRY_SIZE                 (16)
#define configUSE_16_BIT_TICKS                    1
#define configIDLE_SHOULD_YIELD                   1
#define configUSE_MUTEXES                         1
#define configUSE_TIMERS                          1

#define configTIMER_TASK_PRIORITY                 13
#define configTIMER_QUEUE_LENGTH                  4
#define configTIMER_TASK_STACK_DEPTH              configMINIMAL_STACK_SIZE

#define configUSE_CO_ROUTINES                     0

/* Set the following definitions to 1 to include the API function, or zero
to exclude the API function. */

#define INCLUDE_vTaskPrioritySet                  0
#define INCLUDE_uxTaskPriorityGet                 0
#define INCLUDE_vTaskDelete                       0
#define INCLUDE_vTaskCleanUpResources             0
#define INCLUDE_vTaskSuspend                      1
#define INCLUDE_vTaskDelayUntil                   1
#define INCLUDE_vTaskDelay                        1
#define INCLUDE_xEventGroupSetBitFromISR          1
#define INCLUDE_xTimerPendFunctionCall            1
#define INCLUDE_xSemaphoreGetMutexHolder          1
#define INCLUDE_uxTaskGetStackHighWaterMark       1

#define configKERNEL_INTERRUPT_PRIORITY           0x01

#define pdMS_TO_TICKS(xTimeInMs)                  ((TickType_t)(((TickType_t)(xTimeInMs))) * (((TickType_t)configTICK_RATE_HZ)/(TickType_t)1000))

#define configCHECK_FOR_STACK_OVERFLOW            2

#define configASSERT( x )                         __conditional_software_breakpoint( x )

//Priorities
#define appPRIORITY_IDLE                          1
#define appPRIORITY_LOW                           2
#define appPRIORITY_MID                           2
#define appPRIORITY_HIGH                          3

Hello Jack,

In general, ISRs run using a “system” or ISR stack which is separate from the stack used by any particular task (including the IDLE task). My knowledge of the 16-bit PIC architecture is quite limited but I did notice a few things:

Your config does not mention configSUPPORT_STATIC_ALLOCATION (which defaults to 0) and also sets configSUPPORT_DYNAMIC_ALLOCATION to 0, which should result in a compilation error. When configSUPPORT_STATIC_ALLOCATION is enabled, the kernel calls vApplicationGetIdleTaskMemory to determine where to store the stack of the IDLE task.
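For reference, here is a minimal sketch of that callback, assuming the usual v9 signature. The typedefs and the 300-word size are stand-ins so the fragment compiles on its own; in a real project they come from FreeRTOS.h and FreeRTOSConfig.h, and the buffer names are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so this sketch compiles by itself. In a real project these come
   from FreeRTOS.h / FreeRTOSConfig.h and must not be redefined. */
typedef uint16_t StackType_t;                        /* assumption: 16-bit stack words */
typedef struct { uint8_t opaque[64]; } StaticTask_t; /* placeholder, not the real TCB  */
#define configMINIMAL_STACK_SIZE 300

/* Static storage for the IDLE task, owned by the application. */
static StaticTask_t xIdleTaskTCB;
static StackType_t  uxIdleTaskStack[configMINIMAL_STACK_SIZE];

/* Called by the kernel when configSUPPORT_STATIC_ALLOCATION is 1. */
void vApplicationGetIdleTaskMemory(StaticTask_t **ppxIdleTaskTCBBuffer,
                                   StackType_t **ppxIdleTaskStackBuffer,
                                   uint32_t *pulIdleTaskStackSize)
{
    *ppxIdleTaskTCBBuffer   = &xIdleTaskTCB;
    *ppxIdleTaskStackBuffer = uxIdleTaskStack;
    *pulIdleTaskStackSize   = configMINIMAL_STACK_SIZE; /* size in words, not bytes */
}
```

Since configUSE_TIMERS is also 1, the kernel expects a matching vApplicationGetTimerTaskMemory that supplies the timer task's buffers in the same way.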

In port.c, the xISRStack is declared to be of size configISR_STACK_SIZE. Perhaps this config item needs to be increased for your use case? It also seems to be missing from your FreeRTOSConfig.h.

To address your questions directly:

Usually this is not the case. From a brief look at the port code, I assume that ISRs run using the xISRStack. It depends on the implementation of your CAN bus ISR and whether it quickly hands off data to a high priority task for processing. Can you share the code used for CAN bus communications?

Can you double check that you provided the correct FreeRTOSConfig.h?

It’s hard to say without looking at the code in question. It could be due to interrupt nesting. The dsPIC33 reference manual has a helpful description on page 13.

Hello Paul,
thanks for answering.
Weird. It seems I copied the file starting from the second line.
The first line was the define of configSUPPORT_STATIC_ALLOCATION (which is defined as “1”).

I have to think about the separate ISR stack. AFAIU, when an interrupt happens, the return address (the instruction following the ISR call) is pushed onto the stack, then prologue, code, epilogue, return. This of course should mean that the ISR could corrupt whichever stack is in use at the moment, hence there must be some way for the kernel to switch the stack pointer from the task stack to another stack when an interrupt happens. I shall study port.c more carefully then, because I was not aware of the existence of the “ISRStack” and its obvious necessity until I began typing this reply. It’s strange that I missed something so fundamental when reading the docs (my main sources of learning are the free book and the online documentation).
But this sparks another question: what if my application also has parts of the code that live outside of the RTOS (a usual situation is protocol emulation, hard real time)? Don’t answer yet, I want to study the subject over the weekend.

Unfortunately I can’t share the CAN code at the moment, but I have already ruled out memory leaks from those functions. Two things happen there. First, the received frame is copied from the peripheral into another structure in which the data is formatted to be “platform agnostic”: the ID is reassembled and the flags are derived. There is no memcpy and no operation on pointers or arrays that could run away. Second, the frame is added to a FreeRTOS queue using the appropriate ISR function.

I can also safely rule out interrupt nesting in this case, as all interrupts share the same priority, and interrupts with the same priority can’t preempt each other. Although that would explain everything.

Thanks for the help, I will study some more and report back.

From my memory, and it’s been a while since I have done much with the dsPIC family, the processor doesn’t use a separate ISR stack (as the processor didn’t support it) and the port doesn’t support interrupt nesting.

The ISR will always use the stack of whatever task was running, so ALL tasks need enough slack space on their stack to handle the worst-case ISR.
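One way to see how much slack each task really has left is the fill-pattern scan that uxTaskGetStackHighWaterMark is based on. Below is a host-side sketch of the idea, not the kernel's code: the function names are made up, and it assumes 16-bit stack words, the 0xA5 fill byte, and a stack that grows toward higher addresses as on the dsPIC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 300u
#define FILL_WORD   0xA5A5u  /* two 0xA5 fill bytes per 16-bit word */

/* Paint the whole stack buffer, as is done when a task is created. */
void paint_stack(uint16_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = FILL_WORD;
    }
}

/* The stack grows upward, so untouched words sit at the high end.
   Count how many words, starting from the top, still hold the pattern.
   Anything that ran on this stack (task code or an ISR) lowers the count. */
size_t words_never_used(const uint16_t *stack, size_t words)
{
    size_t unused = 0;
    while (unused < words && stack[words - 1u - unused] == FILL_WORD) {
        unused++;
    }
    return unused;
}
```

On the target the equivalent check is uxTaskGetStackHighWaterMark() on every task handle, taken after the bus has been flooded for a while, since the ISR cost lands on whichever task happened to be running.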

If somehow the ISR re-enables the interrupts, and thus allows the interrupts to nest, that could cause the ISR stack usage to keep on growing. I seem to remember that, with the way the end-of-ISR task switch was done, doing it incorrectly (by not clearing the flag on entry to the ISR) could in extreme cases cause a growing stack.
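To put rough numbers on that, here is a small arithmetic model of where an ISR local ends up. Everything in it is hypothetical (the struct, the function names, the 40 bytes per entry); the only assumptions are that the stack grows upward and that each ISR entry adds a fixed amount on top of wherever the stack pointer was.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one ISR entry costs a fixed number of bytes
   (hardware-pushed return info + prologue saves + locals). */
typedef struct {
    uint16_t sp_at_interrupt;  /* stack pointer of the interrupted task */
    uint16_t bytes_per_entry;  /* stack cost of a single ISR entry      */
} isr_model_t;

/* Stack pointer after 'entries' ISR entries pile up without returning. */
uint16_t sp_after_entries(isr_model_t m, unsigned entries)
{
    return (uint16_t)(m.sp_at_interrupt + m.bytes_per_entry * entries);
}

/* Nonzero if the piled-up entries run past the end of the task's stack. */
int runs_past_limit(isr_model_t m, unsigned entries, uint16_t stack_limit)
{
    return sp_after_entries(m, entries) > stack_limit;
}
```

With a single entry the local's address only depends on which task was interrupted and how deep that task was at the time, which would already explain a changing address. If entries pile up, each one adds another bytes_per_entry until the stack limit is crossed.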
