Crash after some hours when interrupts are enabled

No, you need to get this data from the Cortex-M RAM to your PC and then use the Tracealyzer PC application to visualize it.

How can I read the data from the Cortex-M RAM to my PC without a debug probe?

Ah, I see. Good point - I don’t think I have a definite answer, but is it possible to access the Cortex-M RAM from the Cortex-A and then transfer the data to the PC? If the Cortex-M core has network connectivity, you can use the TCP or UDP stream ports of Tracealyzer.

No, because the Cortex-M RAM is reserved to it and so Cortex-A doesn’t “see” it.
I would need either a mechanism of inter-processor communication or a mechanism to write data from Cortex-M to a hardware peripheral (UART, SPI, …).
This second approach is more or less what I do now with my poor man’s logging mechanism over UART.
Sometimes in the past I used rokath/trice to speed up logging (for other projects).

That should likely work!

One thought would be to put the trace buffer into shared memory instead of the internal private memory.

Sometimes there is a “back-door” method for the Cortex-A to access the “private” memory at some other address via a slower method.

An update.
I’ve just found one interesting log after days: a semaphore cannot be taken anymore.
And so I suppose that when I used portMAX_DELAY and waited forever, the application was blocked.
Now after a long timeout it goes on, but that specific semaphore cannot be taken anymore.

Based on your experience, can this happen only if I didn’t release it at the previous call?
Or is some kind of “semaphore overflow” possible, after having taken/released it successfully hundreds of millions of times?

Semaphores only store a limited amount of state, so don’t “overflow” from being used too much.

Note that binary semaphores only have two states, so if you give it twice, you can’t now take it twice (that would need a counting semaphore).
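To make the "two states" point concrete, here is a minimal toy model of a semaphore as a count with a ceiling. It is only a sketch for reasoning about the behaviour, not how FreeRTOS implements semaphores; ToySem, toy_sem_create, toy_give and toy_take are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a semaphore reduced to a count and a ceiling.
 * These names are hypothetical and are not FreeRTOS APIs. */
typedef struct {
    unsigned count;     /* tokens currently available */
    unsigned max_count; /* 1 for a binary semaphore, N for a counting one */
} ToySem;

ToySem toy_sem_create(unsigned max_count, unsigned initial)
{
    assert(max_count > 0 && initial <= max_count);
    ToySem s = { initial, max_count };
    return s;
}

/* Give fails (returns false) when the semaphore is already at its ceiling,
 * so the count can never grow past max_count. */
bool toy_give(ToySem *s)
{
    if (s->count >= s->max_count) {
        return false;
    }
    s->count++;
    return true;
}

/* Take fails (returns false) when no token is available.
 * A real RTOS would block or time out here instead of returning at once. */
bool toy_take(ToySem *s)
{
    if (s->count == 0) {
        return false;
    }
    s->count--;
    return true;
}
```

With max_count set to 1, the second give in a row is rejected and only one take succeeds afterwards. With max_count set to 2, both gives are remembered. In neither case does the count depend on how many times the semaphore was used before.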

It is also possible that the task that is supposed to give the semaphore is waiting on something else, and you hit a deadlock, or something corrupted the scheduler, and it has been dropped from the scheduling.

You may need to look at who was supposed to give the semaphore and see why it didn’t get there.

The phrasing “release it at the previous call” makes me wonder if you actually are using it as a mutex, and if so, if it IS a mutex, it will store what task owns it, which can help you figure out what might have gone wrong. Or, you may need to add some logging to get the recent history before the system gets stuck.
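One way to keep "recent history" without depending on the UART being alive is a small in-RAM ring buffer that always holds the last few events. The following is a rough sketch under assumptions: HIST_LEN, HistEntry, History and the hist_* functions are hypothetical names, the event and task codes are whatever the application defines, and the code is not interrupt-safe as written (on target, hist_record would need to run inside a critical section or be restricted to one context).

```c
#include <assert.h>
#include <stdint.h>

#define HIST_LEN 8u /* assumed size for illustration; must be a power of two */

typedef struct {
    uint32_t tick;  /* timestamp, e.g. the RTOS tick count */
    uint16_t event; /* application-defined event code */
    uint16_t task;  /* application-defined task id */
} HistEntry;

typedef struct {
    HistEntry entries[HIST_LEN];
    uint32_t  next; /* total number of entries written so far */
} History;

/* Store one event, overwriting the oldest entry once the buffer is full. */
void hist_record(History *h, uint32_t tick, uint16_t event, uint16_t task)
{
    HistEntry *e = &h->entries[h->next & (HIST_LEN - 1u)];
    e->tick  = tick;
    e->event = event;
    e->task  = task;
    h->next++;
}

/* Number of entries that can currently be read back.
 * (Briefly under-reports if 'next' ever wraps past 2^32.) */
uint32_t hist_count(const History *h)
{
    return (h->next < HIST_LEN) ? h->next : HIST_LEN;
}

/* Return the i-th surviving entry, where 0 is the oldest one. */
const HistEntry *hist_get(const History *h, uint32_t i)
{
    assert(i < hist_count(h));
    uint32_t first = h->next - hist_count(h);
    return &h->entries[(first + i) & (HIST_LEN - 1u)];
}
```

The idea would be to call hist_record just before and after each take/give, then dump the buffer from a watchdog handler or the timeout path, or read it out of RAM if a debugger is available.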

Hi @richard-damon

You’re right.
It’s a mutex, and I create it with xSemaphoreCreateMutex().
I use it to avoid concurrent access to a big buffer from different tasks.
And so I have two different tasks that take and give the mutex (just before and after having used the “shared” buffer).
Is this a supported scenario?

Is there a way to get the task that owns it (since you wrote it stores this info)?

In the Queue structure that backs the mutex, the xSemaphore member is part of a union and contains xMutexHolder, which is the task handle of the task that currently holds the mutex.

Yes, this is a perfectly valid use of mutex. As @richard-damon suggested, check the owner of mutex. Is it possible that you end up attempting to take mutex recursively?
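To illustrate why the owner matters, below is a small host-side toy model of a non-recursive mutex that remembers its holder. It is a sketch only, not FreeRTOS code: ToyMutex, TakeResult, toy_mutex_take and toy_mutex_give are hypothetical names, and task ids are plain integers chosen for the example. On the target, the real holder can be queried with xSemaphoreGetMutexHolder(), which is available when INCLUDE_xSemaphoreGetMutexHolder is set to 1 in FreeRTOSConfig.h.

```c
#include <assert.h>

/* Outcome of a take attempt in the toy model. */
typedef enum {
    TAKE_OK,            /* mutex was free and is now held by the caller */
    TAKE_BUSY,          /* held by another task: a real take would block */
    TAKE_SELF_DEADLOCK  /* caller already holds it: a real take would block
                           until its own timeout, since nobody else gives */
} TakeResult;

typedef struct {
    unsigned holder; /* id of the owning task, 0 when the mutex is free */
} ToyMutex;

TakeResult toy_mutex_take(ToyMutex *m, unsigned task_id)
{
    assert(task_id != 0u); /* 0 is reserved for "no holder" */
    if (m->holder == 0u) {
        m->holder = task_id;
        return TAKE_OK;
    }
    return (m->holder == task_id) ? TAKE_SELF_DEADLOCK : TAKE_BUSY;
}

/* In this model only the holder may give: returns 0 on success, -1 otherwise. */
int toy_mutex_give(ToyMutex *m, unsigned task_id)
{
    if (task_id == 0u || m->holder != task_id) {
        return -1;
    }
    m->holder = 0u;
    return 0;
}
```

If the holder reported at the moment of the failure is the same task that is trying to take the mutex, that points to a recursive take or a missing give on some code path. If it is the other task, the next question is what that task is blocked on.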

Hi @aggarg
Sorry for the late answer.
I added logging related to the mutex owner, but now I don’t see any log messages at all when the Cortex-M stops working.
Do you have any idea on what I could check after the issue happens (a kind of post-mortem debugging)?
I attach my FreeRTOS config files.
Do they seem OK to you?
FreeRTOSConfig.h (6.4 KB)
FreeRTOSConfigBoard.h (3.5 KB)

Yes, they look okay. You can try increasing the value of configMINIMAL_STACK_SIZE:

#define configMINIMAL_STACK_SIZE                  (256)

You can use the debugger to examine the mutex which becomes unusable (i.e. it cannot be taken anymore).