Hi all,
Thanks again for all the comments - they helped me clarify things quite a bit.
I’ve also been forced to revise what I was expecting from the code in post /8: essentially, I was hoping to achieve a “cascade” of tasks: the ISR triggers `working_01_task` as soon as possible; `working_01_task` then triggers `working_02_task` or `working_03_task`, again ASAP; obviously, this is not what happens in post /8.
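Just to make the goal concrete, this is roughly the shape I’m after (a simplified sketch, not the actual gist code; the names and the direct-to-task notification used for the task-to-task signaling here are just for illustration):

#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"

static QueueHandle_t queue_01;                  // ISR -> working_01_task
static TaskHandle_t working_02_handle;          // filled in at xTaskCreate() time
static TaskHandle_t working_03_handle;

static void my_isr(void) {                      // hypothetical ISR name
    uint8_t item = 0;                           // whatever the ISR produced
    BaseType_t woken = pdFALSE;
    xQueueSendFromISR(queue_01, &item, &woken);
    portYIELD_FROM_ISR(woken);                  // so working_01_task runs ASAP
}

static void working_01_task(void *params) {
    uint8_t item;
    for (;;) {
        if (xQueueReceive(queue_01, &item, portMAX_DELAY) == pdTRUE) {
            // handle item, then wake the next task in the cascade ASAP:
            xTaskNotifyGive(working_02_handle); // or working_03_handle
        }
    }
}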
So, I’ve done three experiments, which I’ve added as revisions to the gist in post /8. And just to make sure, the correct gist link for post /8 is now:
I will try to provide a discussion of my experiments below; and while this thread is already getting kinda heavy, I really hope I can get some feedback, especially about the stuff I might still have misunderstood. Beyond that, all of my experiments still end up in some sort of deadlock after some 100 ms, and I would very much appreciate some hints related to that.
Thanks a ton; in my exercises so far, I was careful to copy-paste in that form, so I cannot tell why I failed to do so in that example. In any case, I corrected this.
Many thanks for noticing this! I did not think much about this, possibly because I might have encountered other (unrelated) code where `timeout == 0` meant “block indefinitely”, and due to this mix-up, failed to pay enough attention.
So, first, I set the `xQueueReceive` timeout to `portMAX_DELAY` … and nothing worked (I believe I’d just get a single transition for `led_task`, and that’s it).
Then, I set the `xQueueReceive` timeout to 1 (1 FreeRTOS tick, which in my case should mean 1 ms). Now things sort of started working, as I could see pulses for the tasks before the program reaches the “deadlock”.
However, what I found strange was that the pulses for `working_01_task` were mostly high, which I really did not expect. So I ended up in a bit of a confusion myself - until I decided to repurpose the GP5 pin to toggle after each and every command in `working_01_task`.
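That is, something along these lines (a simplified sketch of the idea, not the exact gist code; `gpio_xor_mask()` is the Pico SDK call used for the toggle):

#include "hardware/gpio.h"

#define DBG_PIN_MASK (1u << 5)       // GP5, repurposed as a debug toggle

// hypothetical fragment of working_01_task's loop body:
static void one_pass(void) {
    // ... command 1 ...
    gpio_xor_mask(DBG_PIN_MASK);     // edge on GP5 after command 1
    // ... command 2 ...
    gpio_xor_mask(DBG_PIN_MASK);     // edge on GP5 after command 2
}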
This is now in the gist revision:
And this is how the pulses look at start:
So basically, this is the trouble: I wanted `working_01_task` to react immediately to data from the ISR; but I also wanted to handle, a priori, the situation where `working_01_task` might have been prevented from handling the ISR, so that there is more than one item in the queue - which is why I wanted to “flush” the queue in `working_01_task`.
Now, the first, logical, problem is that in this program there is nothing else that could pre-empt `working_01_task` in reacting: `working_01_task` has the highest priority!
But leaving that logical problem aside, the immediate technical problem is that I tried to “flush the queue” the way I had historically done in non-FreeRTOS code when “flushing a buffer”: read bytes in a `while` loop until the buffer is empty; and basically I thought this `while` loop would do it:
// timeout of 1 tick: on an empty queue this call blocks for up to 1 ms
// instead of returning pdFALSE immediately
while (xQueueReceive(queue_01, (void *)&queuedataitem, 1) == pdTRUE) {
    ...
}
This, however, does not flush the queue per se in this case; what I think happens now is:
- We end up at the `while (xQueueReceive...)` of `working_01_task` - it starts blocking due to the timeout
- Since there has already been one completed ISR before the first entry into `while (xQueueReceive...)`, `xQueueReceive` returns having read the byte (note the first “thick” edge of “GP5 / (working_01_task)”)
- The rest of the code inside the `while (xQueueReceive...)` runs, and the loop body ends
- We go back to the `while (xQueueReceive...)` blocking
- Now, at first I would have expected this to return `pdFALSE`, as no ISR hit in the meantime, and the queue should be empty
- However, since our `xQueueReceive` now has a timeout, it keeps blocking
- And since this is already the highest-priority task, there is no other higher-or-equal-priority task to yield to while this `xQueueReceive` is blocking!
- Eventually a new ISR hits, the queue is populated again, `xQueueReceive` returns from blocking with `pdTRUE`, and the new item on the queue is handled
- … and eventually, we again go back to the `while (xQueueReceive...)` blocking due to the timeout!
So, looking at this interpretation, my first impression would be that this code, too, can in principle keep looping indefinitely.
So I am kinda puzzled why, at a certain point, `xQueueReceive` returns `pdFALSE` at all?
The image suggests that `working_01_task` holds high for approx. 1 ms (I’ve also seen longer durations, roughly around 2 ms) - basically a multiple of the tick.
So, even if I would at first assume that each time we hit `xQueueReceive`, the 1-tick/1-ms timeout runs from the start - it seems as if it starts only upon the first call? And then, as long as we get `pdTRUE`, we don’t really re-start the timeout?!
In any case, as this does not really look like a proper “flush”, or a proper “cascade” of tasks, I decided to revise.
So, in revision https://gist.github.com/sdbbs/652d4abeda027999be9245db0035c78f/384f208147c6ecf54a232469c8bebd91cbd667a0, in `working_01_task`:
- At start, the amount of items in the queue is found
- A `while` loop reads exactly that amount of items, using `xQueueReceive` with timeout 0, so it doesn’t block
- Items are handled, and the other tasks are signaled
- Then `working_01_task` should “yield” to other tasks - however, since `working_01_task` is the only task with the highest priority, calling `taskYIELD` would be useless
- So the next best thing is to have the task “sleep” for a tick with `vTaskDelay((TickType_t) 1);` (roughly as sketched below)
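In code, roughly (a simplified sketch of how I understand this revision; the real code is in the gist, and `queue_01` is assumed to be created elsewhere):

#include "FreeRTOS.h"
#include "task.h"
#include "queue.h"

static QueueHandle_t queue_01;   // created in main() with xQueueCreate()

static void working_01_task(void *params) {
    uint8_t queuedataitem;
    for (;;) {
        // how many items has the ISR queued up since the last pass?
        UBaseType_t reccount = uxQueueMessagesWaiting(queue_01);

        // read exactly that many items; timeout 0, so this never blocks
        while (reccount > 0 &&
               xQueueReceive(queue_01, (void *)&queuedataitem, 0) == pdTRUE) {
            // ... handle item, signal working_02_task / working_03_task ...
            reccount--;
        }

        // "yield": sleep for one tick, since taskYIELD() would be useless
        // for the single highest-priority task
        vTaskDelay((TickType_t) 1);
    }
}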
This is how the code pulses at start:
So, now indeed I have achieved a “cascade”:
- Either ISR → `working_01_task` → `working_02_task` ASAP;
- Or ISR → `working_01_task` → `working_03_task` ASAP
However, now `working_01_task` also explicitly obeys the delay, and indeed it only runs each tick (1 ms), allowing the ISR to fill the queue with approx. 3-4 items.
This is maybe not bad in itself - but I was still wondering whether I could achieve a “cascade” of tasks one after another ASAP, while still having `working_01_task` react immediately after every ISR.
Before I get to that, I should mention that this code also ends up in a “deadlock” - but here, when the deadlock happens, basically the ISR stops, while `working_01_task` keeps running indefinitely (except, since there is no ISR anymore, there is no data in the queue either, and so the other tasks never get called).
So, to make `working_01_task` answer ASAP, I first thought about “cancelling” the `vTaskDelay`; there is `xTaskAbortDelay` - however, I’d have to call it from the ISR, and there is no “*FromISR” version of this function.
So, I thought, the next best thing would be to:
- Have `working_01_task` “yield” by suspending itself via `vTaskSuspend`
- Have the ISR “wake up” `working_01_task` via `xTaskResumeFromISR` (roughly as sketched below)
This is in the gist revision: https://gist.github.com/sdbbs/652d4abeda027999be9245db0035c78f/ee51b8b64e107542dec91b1b68de21062205c565
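The pattern, roughly (a simplified sketch; the handle and ISR names are placeholders, the real code is in the gist):

#include "FreeRTOS.h"
#include "task.h"

static TaskHandle_t working_01_handle;   // stored when the task is created

static void working_01_task(void *params) {
    for (;;) {
        // ... drain queue, signal working_02_task / working_03_task ...
        vTaskSuspend(NULL);              // suspend ourselves, "yielding"
    }
}

static void my_isr(void) {               // hypothetical ISR name
    // ... push the new item onto the queue with xQueueSendFromISR() ...
    BaseType_t xYieldRequired = xTaskResumeFromISR(working_01_handle);
    portYIELD_FROM_ISR(xYieldRequired);  // wake working_01_task ASAP
}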
This is how the pulses behave at start:
Finally, I do have an immediate response to the ISR from `working_01_task`, and a “cascade” to the other task(s) (they run ASAP after `working_01_task`).
Except - and this I didn’t take into account at first - now that `working_01_task` can explicitly handle each ISR, the number of processed queue items in this task (`reccount`) is always 1, and therefore only `working_03_task` gets called.
And in this sense, my original expectation that I could achieve both an ASAP reaction of `working_01_task` to the ISR, and a “cascade”/ASAP reaction of both `working_02_task` and `working_03_task` alternately, was not thought through very well.
But at least, I think I’m better aware of the pitfalls in how FreeRTOS tasks are scheduled and when they run, so I can re-think this better.
However, now we come to the actual problem:
All three of these variants start up as shown in the respective screenshots - but eventually, after some time, there occurs a brief period where apparently all interrupts and tasks stop running; after that, there are a couple more runs of the ISR, and then the ISR stops. Depending on which variant it is, this also means that either all tasks stop running (or, in one of the examples, as mentioned, `working_01_task` can keep running indefinitely).
Here is how this looks for the final variant:
So, the code runs for about 90 ms, then for some reason the ISR and tasks stop for around 1.4 ms, then we have two more hits of the ISR, and then the ISR stops - and in this case, since the ISR “wakes” `working_01_task` (using `xTaskResumeFromISR`), no other tasks are running either.
I have no idea why this happens; the backtrace I get from `gdb`/`openocd` here is:
Remote debugging using localhost:3333
warning: multi-threaded target stopped without sending a thread-id, using first non-exited thread
vPortRecursiveLock (uxAcquire=1, pxSpinLock=0xd000013c, ulLockNum=1)
at C:/path/to/FreeRTOS-Kernel-SMP/portable/ThirdParty/GCC/RP2040/include/portmacro.h:192
192 while ( __builtin_expect( !*pxSpinLock, 0 ) );
(gdb) bt
#0 vPortRecursiveLock (uxAcquire=1, pxSpinLock=0xd000013c, ulLockNum=1)
at C:/path/to/FreeRTOS-Kernel-SMP/portable/ThirdParty/GCC/RP2040/include/portmacro.h:192
#1 vTaskSwitchContext (xCoreID=0) at C:/path/to/FreeRTOS-Kernel-SMP/tasks.c:3880
#2 0x10000816 in isr_pendsv () at C:/path/to/FreeRTOS-Kernel-SMP/portable/ThirdParty/GCC/RP2040/port.c:402
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.
I’m not sure how accurate this is, but it seems FreeRTOS ends up in a state waiting for a spinlock …
So, to summarize my questions for this:
- Can anyone see any obvious errors in my understanding of the above three examples, and in that case, help me get to the correct understanding?
- Does anyone have an explanation of why the code ends up in a “deadlock”/waiting for a spinlock after some milliseconds of running, and a suggestion on how I can prevent it?