xQueues are _slow_, and even slower between tasks?

Full repo here: Sparkles-New/Sparkles at master · hdsjulian/Sparkles-New · GitHub
with main.cpp being in Sparkles-New/Sparkles/src/Client-Device at master · hdsjulian/Sparkles-New · GitHub
and the libraries being in Sparkles-New/Sparkles/lib at master · hdsjulian/Sparkles-New · GitHub

I have an ESP32 hooked up to a MIDI keyboard; it sends an ESP-NOW message upon each MIDI command.
The code in the repo I linked does the following, with the important points in time marked with Greek letters:

  • I spin up two tasks: a message handler task and an LED handler task.
  • In the esp_now receive callback I place the incoming message in the MessageHandler queue (ALPHA). In the messaging task I read from this queue (BETA) and, upon receiving, pass the message on into the LedHandler’s queue.
  • In the LedHandler I read from that queue (GAMMA) and set up whatever needs to be set up for the LED to blink. Then I let the LED blink (DELTA). (A minimal sketch of this pipeline follows below.)
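
For context, the shape of the pipeline is roughly this. It is a minimal sketch, not the actual repo code: the struct, queue and task names are illustrative, and the ESP-NOW callback signature shown is the pre-IDF-5 one (newer cores pass an `esp_now_recv_info` struct instead of the MAC pointer):

```cpp
#include <Arduino.h>
#include <esp_now.h>
#include <esp_timer.h>

struct LedCommand {
  uint8_t note;
  int64_t receivedUs;  // timestamp taken at ALPHA
};

static QueueHandle_t messageQueue;
static QueueHandle_t ledQueue;

// ALPHA — the ESP-NOW receive callback runs in the Wi-Fi task (not an ISR),
// so the plain xQueueSend variant is fine here.
void onDataRecv(const uint8_t *mac, const uint8_t *data, int len) {
  if (len < 1) return;
  LedCommand cmd{data[0], esp_timer_get_time()};
  xQueueSend(messageQueue, &cmd, 0);  // never block the Wi-Fi task
}

void messageHandlerTask(void *arg) {
  LedCommand cmd;
  for (;;) {
    if (xQueueReceive(messageQueue, &cmd, portMAX_DELAY) == pdTRUE) {  // BETA
      xQueueSend(ledQueue, &cmd, 0);  // forward to the LED handler
    }
  }
}

void ledHandlerTask(void *arg) {
  LedCommand cmd;
  for (;;) {
    if (xQueueReceive(ledQueue, &cmd, portMAX_DELAY) == pdTRUE) {  // GAMMA
      // DELTA — set up and start the blink; the ALPHA->DELTA latency is
      // esp_timer_get_time() - cmd.receivedUs.
    }
  }
}

void setup() {
  messageQueue = xQueueCreate(8, sizeof(LedCommand));
  ledQueue     = xQueueCreate(8, sizeof(LedCommand));
  xTaskCreatePinnedToCore(messageHandlerTask, "msg", 4096, nullptr, 5, nullptr, 1);
  xTaskCreatePinnedToCore(ledHandlerTask,     "led", 4096, nullptr, 5, nullptr, 1);
  // WiFi/ESP-NOW setup omitted: esp_now_init(), esp_now_register_recv_cb(onDataRecv), ...
}

void loop() {}
```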

Now the problem is that around 85 ms lie between ALPHA and DELTA. Which is an insanely long time tbh, and also unacceptable.
Between ALPHA and BETA lie 13 ms; between BETA and GAMMA lie about 42 ms.

(I marked the points in the code with ALPHA, BETA, GAMMA and DELTA; they are:
ALPHA = messageHandler.cpp line 33
BETA = messageHandler.cpp line 53
GAMMA = ledHandler.cpp line 82 or 93 (depending on whether an animation is already running or not)
DELTA = ledHandler.cpp line 152)

I’d be happy to hear about any pointers. If the answer is "sorry, you’re outta luck, xQueues are just very very slow because that’s how FreeRTOS works", I’d be happy to hear about any solution that is quicker and still runs on separate tasks rather than one huge one.

Queues are not ‘slow’ by design, and the latencies you report are huge.
Did you try measuring without (serial?) logging? There must be something else going on that is slowing down your communication.
Also, is it intentional that both tasks have the same priority?
What’s the type and clock speed of your MCU?
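
One low-overhead way to check is to take raw timestamps at the marked points and only print the delta afterwards, outside the hot path. A sketch with hypothetical names, assuming ESP-IDF's `esp_timer_get_time()` is available:

```cpp
#include <stdio.h>
#include <esp_timer.h>

// Timestamps written at the marked points; volatile because they are
// written and read from different tasks.
static volatile int64_t tAlpha = 0;
static volatile int64_t tDelta = 0;

// At ALPHA (receive callback):   tAlpha = esp_timer_get_time();
// At DELTA (just before blink):  tDelta = esp_timer_get_time();

// Called later from loop() or a low-priority task, well after the event,
// so the printing itself cannot distort the measurement.
void reportLatency() {
  if (tDelta > tAlpha) {
    printf("ALPHA -> DELTA: %ld us\n", (long)(tDelta - tAlpha));
  }
}
```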

Again, your tool of choice is Tracealyzer or something compatible. It will show you exactly where every clock cycle goes.

So I had a lot of real-lifing to do today and only just had time for a really quick and dirty test: remove all ESP_LOGI instances and pass an unsigned long timestamp instead of a more complex object. Suddenly the whole time to push data through two queues in two different tasks was down to ~70 µs.
My assumption is (and I’d be interested if anyone can confirm) that the ESP_LOGI macro somehow does some really ungodly stuff that blocks everything and causes a lot of task switching.
I could also tell that a single ESP_LOGI within a task takes about 6.5 ms. So thanks for pointing this out; I guess I’ll just make sure I remove all ESP_LOGI calls before releasing this thing.
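
For anyone who wants to reproduce that number, a bracketing measurement along these lines works (a sketch, not the repo code). The magnitude is plausible: by default the log output goes synchronously over the UART, and at 115200 baud one character takes roughly 87 µs, so a ~75-character log line alone accounts for about 6.5 ms:

```cpp
#include <esp_log.h>
#include <esp_timer.h>

static const char *TAG = "bench";

// Measures the wall-clock cost of one ESP_LOGI call from inside a task.
void benchLogOnce(void) {
  int64_t t0 = esp_timer_get_time();
  ESP_LOGI(TAG, "a log line of representative length for this benchmark...");
  int64_t t1 = esp_timer_get_time();
  // Report after the measured region, so the report itself isn't measured.
  ESP_LOGI(TAG, "one ESP_LOGI took %ld us", (long)(t1 - t0));
}
```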

Yes, wrappers and HALs are notorious for all kinds of strange effects. Most people will sooner or later abandon them in favor of bare-bones RTOS coding.

Glad you figured it out!

Great that you found the reason and that it’s just the logging. Since ESP_LOGI is a macro, I guess it’s #defined to nothing for release builds. So no need to remove the calls from the code.
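
That is roughly how it works: in plain ESP-IDF the maximum log verbosity is a compile-time setting (CONFIG_LOG_MAXIMUM_LEVEL in menuconfig), and calls above it are compiled out. If the project is built with PlatformIO and the Arduino-ESP32 core (an assumption here), the usual knob is the CORE_DEBUG_LEVEL build flag, along these lines:

```ini
; platformio.ini (hypothetical env names), assuming Arduino-ESP32 under
; PlatformIO. Level 0 = no log output, 3 = info, 5 = verbose.
[env:release]
platform = espressif32
board = esp32dev
framework = arduino
build_flags = -DCORE_DEBUG_LEVEL=0

[env:debug]
platform = espressif32
board = esp32dev
framework = arduino
build_flags = -DCORE_DEBUG_LEVEL=5
```

Whether ESP_LOGx specifically honors that flag depends on how the core routes IDF logging, so it’s worth verifying the timing once on an actual release build.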