Edit: I’ve narrowed this problem down further, and I’ll post a separate topic about it. Keeping this here for reference, and because I don’t see a way to delete it.
I don’t know the internals of FreeRTOS very well and I need some help understanding what I’m seeing. I’m running FreeRTOS 10 on a Kinetis K22F (Cortex-M4F core), and I’ve been chasing down an intermittent usage fault that tends to pop up after several hours.
I finally got some useful trace information by setting a pair of comparators to watch for pxCurrentTCB getting set to NULL. The problem seems to be triggered by my soft UART code that runs on a timer overflow interrupt.
The trace shows vPortPendSVHandler() executing, and then vTaskSwitchContext() (after xTaskGetTickCountFromISR() and vPortValidateInterruptPriority()) being interrupted by the timer IRQ (ftm0_irq) in the middle of a string of ldr and str instructions.
The IRQ ran, processed samples from the DMA buffer, and sent a message to a queue, immediately followed by a direct-to-task notification. Both were done with the interrupt-safe FromISR calls, and normally the whole thing works fine. The timer IRQ has a priority of 5, equal to the max syscall priority defined in the configuration.
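To spell out the priority relationship, here’s a minimal host-side sketch of the check as I understand it from the Cortex-M port. It assumes the K22F implements 4 NVIC priority bits and that 5 is the unshifted value (what the FreeRTOSConfig.h templates usually call configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY). PRIO_BITS, raw_priority() and may_call_from_isr_api() are names made up for illustration, not FreeRTOS APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 implemented NVIC priority bits (check __NVIC_PRIO_BITS
   in the device header). */
#define PRIO_BITS 4u

/* Assumption: the "5" in the configuration is the unshifted value. */
#define LIBRARY_MAX_SYSCALL_PRIORITY 5u

/* Convert a logical priority (0 = most urgent) into the raw 8-bit value
   that the NVIC priority registers and BASEPRI actually hold. */
static uint8_t raw_priority(uint8_t logical)
{
    assert(logical < (1u << PRIO_BITS));
    return (uint8_t)(logical << (8u - PRIO_BITS));
}

/* On Cortex-M a numerically LOWER value is MORE urgent. An ISR may use
   the ...FromISR() calls only if its raw priority is numerically greater
   than or equal to the raw max syscall priority, because only those
   interrupts are masked by the kernel's critical sections. */
static bool may_call_from_isr_api(uint8_t irq_logical_priority)
{
    return raw_priority(irq_logical_priority)
           >= raw_priority(LIBRARY_MAX_SYSCALL_PRIORITY);
}
```

Under those assumptions, may_call_from_isr_api(5) is true and may_call_from_isr_api(4) is false, so a priority equal to the max syscall priority should be acceptable. The thing worth double-checking is whether the value written to the NVIC for ftm0_irq is the logical 5 or an already-shifted number.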
After the IRQ returns to vTaskSwitchContext() (specifically somewhere in the taskSELECT_HIGHEST_PRIORITY_TASK macro), pxCurrentTCB gets set to 0. If I’m interpreting this right, the trace buffer skids past the trigger by a few instructions, so I can’t pin down the exact instruction. I think the write happens in listGET_OWNER_OF_NEXT_ENTRY, which references pxReadyTasksLists.
I’ve looked all over for stack overflows or memory corruption of any kind and I haven’t found anything. What grabs my attention is that the interrupt seems to be happening in the middle of a macro that’s checking a task list, and the task list is being modified by RTOS calls made by the IRQ.
I’m assuming the FreeRTOS kernel is correct and that the problem is something I’m doing, but I’m not sure what that might be. The main thing I’m not clear on is whether my usage (or lack thereof) of pxHigherPriorityTaskWoken is correct. Both the xTaskNotifyFromISR() and xQueueSendFromISR() calls pass an address for pxHigherPriorityTaskWoken, and previously they were followed by a portYIELD_FROM_ISR() call. I removed that call because I don’t need or want a context switch in the middle of the ISR. My understanding is that without portYIELD_FROM_ISR() the kernel should perform the context switch at the next tick, or when the current task blocks. Is that correct? And if I do use it, does it cause an immediate context switch on this platform, or does the switch happen upon exit from the ISR?
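To make the question concrete, here’s a small host-side model of how I understand the Cortex-M ports to behave: portYIELD_FROM_ISR(x) only sets the PendSV pending bit when x is not pdFALSE, and PendSV, being the lowest-priority exception, runs after the ISR exits. Everything below is a stand-in written for illustration (stub_send_from_isr(), model_yield_from_isr(), model_ftm0_irq(), model_isr_exit() are hypothetical names), not real FreeRTOS code, and the model may not match the actual port in every detail.

```c
#include <stdbool.h>

typedef int BaseType_t;   /* host stand-in for the FreeRTOS type */
#define pdFALSE 0
#define pdTRUE  1

static bool pendsv_pending = false;    /* models the PENDSVSET bit */
static unsigned context_switches = 0;  /* counts modelled PendSV runs */

/* Stand-in for xQueueSendFromISR() / xTaskNotifyFromISR(): reports that
   a higher-priority task was unblocked by writing pdTRUE to *woken. */
static void stub_send_from_isr(BaseType_t *woken)
{
    *woken = pdTRUE;
}

/* Models portYIELD_FROM_ISR(x): pends PendSV, does not switch here. */
static void model_yield_from_isr(BaseType_t woken)
{
    if (woken != pdFALSE) {
        pendsv_pending = true;
    }
}

/* Hypothetical timer ISR body, with or without the yield request. */
static void model_ftm0_irq(bool request_yield)
{
    BaseType_t woken = pdFALSE;

    stub_send_from_isr(&woken);   /* queue send */
    stub_send_from_isr(&woken);   /* direct-to-task notification */

    if (request_yield) {
        model_yield_from_isr(woken);
    }
    /* Any remaining ISR work would still run here, before any switch. */
}

/* Models exception return: a pending PendSV runs only at this point. */
static void model_isr_exit(void)
{
    if (pendsv_pending) {
        pendsv_pending = false;
        context_switches++;
    }
}
```

In this model, calling model_ftm0_irq(true) leaves context_switches at 0 until model_isr_exit() runs, and model_ftm0_irq(false) never pends a switch at all, which is the behavior I’m asking about.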
If that’s NOT likely to be the problem, does anyone have any suggestions on where to look next?
For the moment my goal is to tweak the code to try to create these conditions much more frequently, so I don’t have to wait 5-10 hours to see if a fix worked.