Hi,
I am having trouble with the xTimerQueue becoming full after several hours of runtime. Unfortunately, this leads to unhandled events in the application, which eventually crashes.
How can I prevent this queue from filling up?
Here is more information about the system:
configTIMER_QUEUE_LENGTH = 255
The “Timer Queue full” condition always occurs in the call to xEventGroupSetBitsFromISR() from the IRQ handler for CAN controller reception (the bus load does not exceed 26%).
Any idea about how to investigate this problem is very welcome.
Many thanks, in advance, for any support.
Regards
xEventGroupSetBitsFromISR uses the timer task to defer the actual action to task level.
Either bump the timer queue depth to cope with your interrupt / event processing rate, or maybe replace the event group with the leaner RTOS task notifications.
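A minimal sketch of the task-notification route, assuming a single task consumes the events. The reason bits (EVT_CAN_RX and friends), EventCounters and handle_events are hypothetical names, not taken from the original post. The FreeRTOS calls appear only in a comment so that the decoding logic can be built and checked off-target. xTaskNotifyFromISR() with eSetBits ORs bits into the task's 32-bit notification value directly from the ISR, so it does not go through the timer queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reason bits: one bit per notification reason. */
#define EVT_CAN_RX     (1UL << 0)
#define EVT_CAN_ERROR  (1UL << 1)
#define EVT_TICK_10MS  (1UL << 2)

/* Counters updated by the handler; for illustration only. */
typedef struct {
    uint32_t can_rx;
    uint32_t can_error;
    uint32_t tick_10ms;
} EventCounters;

/* Decode a notification value. Several reasons may be set at once
 * if more than one ISR fired before the task got to run. */
void handle_events(uint32_t bits, EventCounters *c)
{
    assert(c != NULL);
    if (bits & EVT_CAN_RX)    { c->can_rx++; }
    if (bits & EVT_CAN_ERROR) { c->can_error++; }
    if (bits & EVT_TICK_10MS) { c->tick_10ms++; }
}

/* Target-side usage (not compiled here; needs FreeRTOS.h and task.h):
 *
 *   // In the ISR:
 *   BaseType_t woken = pdFALSE;
 *   xTaskNotifyFromISR(xCanTask, EVT_CAN_RX, eSetBits, &woken);
 *   portYIELD_FROM_ISR(woken);
 *
 *   // In the task loop (clear all bits on exit):
 *   uint32_t bits;
 *   xTaskNotifyWait(0, UINT32_MAX, &bits, portMAX_DELAY);
 *   handle_events(bits, &counters);
 */
```

Note that a notification targets one task, whereas an event group can wake several waiters, so this only fits if each event has a single consumer.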
Out of curiosity: Why do you disable time slicing ?
Thanks for the answer.
Event group is used instead of notification in order to be able to distinguish multiple notification reasons.
Time slicing has been disabled to prevent the tasks from being interrupted in an uncontrolled way. We made this design choice so that every task finishes its job before yielding in a controlled way. Could it be a mistake? So far it has been rather handy.
My first thought is that one of your timer callbacks or pended functions is taking too long, which keeps the timer task from clearing the timer queue. Note that the normal guideline is that these functions should be short and fast, and never “block”, because while they run, no other action can happen in the timer task.
With your system, the Timer Task should probably be the highest priority task, and the sole task at that priority so emptying the queue is a priority event. You could make an exception for very short running tasks that quickly yield/block.
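As a sketch of that recommendation, the timer (daemon) task priority is set in FreeRTOSConfig.h. The fragment below is illustrative only; with a generated project (STM32CubeMX here) the values would presumably be changed in the tool's FreeRTOS parameters rather than by hand, which is an assumption about the workflow.

```c
/* FreeRTOSConfig.h (fragment) */

/* xEventGroupSetBitsFromISR() relies on the timer task and on
 * pended function calls, so both must be enabled. */
#define configUSE_TIMERS                1
#define INCLUDE_xTimerPendFunctionCall  1

/* Highest priority the kernel offers, so queued commands are drained first. */
#define configTIMER_TASK_PRIORITY       (configMAX_PRIORITIES - 1)

/* Value quoted in the original post. */
#define configTIMER_QUEUE_LENGTH        255
```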
No. Due to the building tool (STM32CubeMX). But you’re right: it could be higher.
But I’m afraid that increasing this value would just delay the trouble. The (tool-)default value was 16. I would prefer to find out why the Timer Queue size does not decrease.
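To illustrate why a deeper queue only postpones the failure when posts outpace the timer task on average, here is a small host-side calculation. The function name and the rates used below are hypothetical, not measurements from this system.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a queue of `depth` entries fills, given average post and
 * drain rates in entries per second. Returns UINT32_MAX when the queue
 * is drained at least as fast as it is filled. */
uint32_t seconds_until_full(uint32_t depth,
                            uint32_t posts_per_s,
                            uint32_t drains_per_s)
{
    assert(depth > 0);
    if (posts_per_s <= drains_per_s) {
        return UINT32_MAX;              /* no backlog builds up on average */
    }
    uint32_t net = posts_per_s - drains_per_s;
    /* Ceiling division without risking overflow. */
    return depth / net + ((depth % net) != 0 ? 1u : 0u);
}
```

With a net surplus of one entry per second, a depth of 16 fills in 16 s and a depth of 255 in 255 s: the failure moves later, it does not go away.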
Thanks richard-damon.
It was my first thought too. Every IRQ handler has been checked and respects the time constraint you mentioned. Each one just reads data from the controller interface (DMA buffer or interface registers) and posts it to a queue.
in your ISRs, right ? That would be important for optimal responsiveness.
Are there some (CAN) I/O bursts overloading your MCU or is it already almost fully occupied ?
Yes portYIELD_FROM_ISR() is used correctly, i.e. as described in the doc (actually this code from CMSIS v2.00).
There are some CAN I/O bursts, but they do not load the MCU at all (the system timing is observed with the SEGGER SystemView tool).
Ok - but it’s strange that your MCU is not hogged by I/O processing but you stumble into an event group backlog with the timer task having the highest prio (as strongly recommended).
That’s odd … do you also use real timers (somehow blocking timer task processing) ?
Yes. They generate IRQs whose sole purpose is to pace the system components by raising flags in event groups.
Actually the application does not use software timers.
Not IRQ handlers, but timer callbacks. If any function that is used as a timer callback takes significant time or blocks, it will delay the operation of the Timer Task, as it is just a single task, so while the callbacks are running, it can’t handle more messages.
If you don’t use software timers or pended functions this isn’t the problem.
The other case is that the Timer Task needs to be higher in priority than all the tasks you are starting with the events, since, as you say, your settings make them run to completion and thus block any other events from being processed.
I’m a bit confused. Since I do not use any software timer, I assume that the only timer callbacks were due to the posting actions in the IRQ handler (from all the XXXFromISR() functions).
Am I wrong ? Where can I find more info about the possible timer callbacks ?
If your only use of the Timer Task is the implicit one from xEventGroupSetBitsFromISR, then that isn’t your issue. But having the Timer Task not the highest could be.
Then the only way to fill the queue would be that some task at 32 or above is hogging the system, or your ISRs are firing fast enough that you can’t process the events.
The fact that it only occurs after a long time suggests it may be some rare occurrence that causes this state. You might see if you can trap the system by detecting the error of the queue being full, and then examine what happened at that point.
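One way to trap the condition, sketched under assumptions: xEventGroupSetBitsFromISR() returns pdPASS when the command was queued to the timer task and pdFAIL when the timer queue was full, so the ISR can record the first failure and halt. PostStats, post_stats_record and the names in the comment (xCanEvents, CAN_RX_BIT, g_can_stats) are hypothetical. The FreeRTOS calls are kept in a comment so that the bookkeeping can be compiled and checked off-target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t posts_ok;
    uint32_t posts_failed;
    uint32_t first_failure_tick;   /* valid only when posts_failed > 0 */
} PostStats;

/* Record the outcome of one post attempt made at time `now_ticks`. */
void post_stats_record(PostStats *s, bool ok, uint32_t now_ticks)
{
    assert(s != NULL);
    if (ok) {
        s->posts_ok++;
        return;
    }
    if (s->posts_failed == 0) {
        s->first_failure_tick = now_ticks;   /* keep the earliest failure */
    }
    s->posts_failed++;
}

/* Target-side usage in the CAN ISR (not compiled here):
 *
 *   BaseType_t woken = pdFALSE;
 *   BaseType_t ok = xEventGroupSetBitsFromISR(xCanEvents, CAN_RX_BIT, &woken);
 *   post_stats_record(&g_can_stats, ok == pdPASS, xTaskGetTickCountFromISR());
 *   configASSERT(ok == pdPASS);    // stops here on the first failure
 *   portYIELD_FROM_ISR(woken);
 */
```

Once halted, inspecting in the debugger which task is running, which tasks are ready, and what is waiting in the timer queue should show whether the timer task was starved or the ISR was simply firing too fast.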
Many thanks for your support @richard-damon.
While investigating, I discovered that another queue was corrupted. One thing led to another, and the corruption turned out to be the consequence of a stack overflow!
I was able to trap the system and check that everything now looks fine for the tasks and queues.
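For reference, the kernel's own stack checks can catch this class of corruption earlier. The fragment below is a sketch of the relevant FreeRTOSConfig.h switches, not the configuration used in this project. With method 2 the kernel verifies a fill pattern at the end of a task's stack when that task is switched out, and calls the application-supplied vApplicationStackOverflowHook() if the pattern is damaged.

```c
/* FreeRTOSConfig.h (fragment) */

/* 2 = check the fill pattern at the end of the stack on context switch. */
#define configCHECK_FOR_STACK_OVERFLOW        2

/* Makes uxTaskGetStackHighWaterMark() available. */
#define INCLUDE_uxTaskGetStackHighWaterMark   1
```

uxTaskGetStackHighWaterMark() then reports, per task, the smallest amount of free stack (in words) seen so far, which helps size the stacks with some margin.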
I have no idea how to investigate further. What should I examine ?
Any help is welcome !