Core 1 on RP2040 SMP keeps running when Core 0 is stopped

danielecht · September 23, 2024, 9:39am

I’ve been observing strange behavior when trying to flash a new binary file to an RP2040 controller using a JLink debugger.

I am using FreeRTOS SMP and have nailed the issue down to the following behavior:
I have multiple tasks running on both cores.
Each task is pinned to a single core using core affinity.
When halting Core 0 through JLink, Core 1 keeps running.
This causes the debugger to fail to reserve the RAM it needs for the flashing operation, since Core 1 keeps executing code and using at least part of that memory.
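For readers unfamiliar with the pinning described above, here is a minimal sketch of how a single-core affinity mask can be built. The helper name `single_core_mask` and the task handles in the comment are hypothetical and not taken from my project. The mask convention (bit n set means the task may run on core n) is the one used by FreeRTOS SMP's `vTaskCoreAffinitySet()`.

```c
#include <stdint.h>

/* Build a core-affinity bitmask that allows exactly one core.
 * Bit n set means "may run on core n", matching the mask convention
 * of vTaskCoreAffinitySet() in FreeRTOS SMP. */
static inline uint32_t single_core_mask(unsigned core)
{
    return (uint32_t)1u << core;
}

/* In a FreeRTOS SMP build with configUSE_CORE_AFFINITY set to 1,
 * the mask would be applied roughly like this (handles are hypothetical):
 *
 *   vTaskCoreAffinitySet(xSensorTask, single_core_mask(0));
 *   vTaskCoreAffinitySet(xCommsTask,  single_core_mask(1));
 */
```

With this kind of pinning, a task assigned to core 1 is never migrated to core 0, which is why core 1 still has work to do after core 0 is halted.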

I believe this behavior is a bug and somewhat dangerous, since Core 1 keeps executing code without a scheduler. In my opinion, Core 1 continuing execution without a scheduler will result in undefined behavior, since there is no control over which tasks are executed.

Is there a way to configure this or is this just a bug?

richard-damon · September 23, 2024, 1:25pm

Personally, that sounds like a JLink (or processor) bug, or perhaps a JLink configuration error. I would expect that this sort of operation either needs to RESET all the processors, or the processor needs some “dedicated” (or at least documented) memory for this sort of operation.

This may be something to bring up to the JLink tech support.

danielecht · September 23, 2024, 1:44pm

I agree that this is an issue with the JLink. JLink treats the RP2040’s cores as two separate targets and halts them independently of each other. I can probably configure it in a way that both cores are halted before starting to debug.

However, in my opinion FreeRTOS should offer an (optional) mechanism that makes a core run into an assertion if the other cores are not running. If, for example, one core runs into a hard fault, the other core needs to ensure that it halts itself, since running independently in an SMP context is undefined behavior.
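To make the idea concrete, here is a rough sketch of what such a check could look like, written as plain C so it can be tried off-target. This is not an existing FreeRTOS feature. All names (`heartbeat_kick`, `peer_monitor_t`, `peer_alive`) are hypothetical. It assumes each core periodically bumps its own counter, that aligned 32-bit writes are atomic on the target, and that `now_us` comes from a time source that keeps counting in the failure case of interest. Whether a given hardware timer keeps running while a core is halted by a debugger depends on the hardware and its configuration and would need to be checked.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 2u

/* One free-running counter per core; each core increments only its own. */
static volatile uint32_t heartbeat[NUM_CORES];

typedef struct {
    uint32_t last_seen;      /* last heartbeat value observed for the peer */
    uint64_t last_change_us; /* time at which that value last changed */
} peer_monitor_t;

/* Called periodically by each core, e.g. from a per-core housekeeping task. */
void heartbeat_kick(unsigned core)
{
    heartbeat[core]++;
}

void peer_monitor_init(peer_monitor_t *m, unsigned peer, uint64_t now_us)
{
    m->last_seen = heartbeat[peer];
    m->last_change_us = now_us;
}

/* Returns true while the peer's counter has changed within timeout_us.
 * A real system would assert or halt when this returns false. */
bool peer_alive(peer_monitor_t *m, unsigned peer,
                uint64_t now_us, uint64_t timeout_us)
{
    uint32_t current = heartbeat[peer];

    if (current != m->last_seen) {
        m->last_seen = current;
        m->last_change_us = now_us;
        return true;
    }
    return (now_us - m->last_change_us) < timeout_us;
}
```

The detection latency is bounded by `timeout_us`, so this trades speed of detection against false alarms when the peer is merely busy.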

richard-damon · September 23, 2024, 2:14pm

The question becomes: how do you quickly determine that the other core isn’t running?

If the JLink treats the processor as just two separate processors, but they are two linked processors, then the JLink is just wrong.

Note, the JLink has the ability to have a significant configuration file, and it could be that something needs to be done to set that up correctly so that it halts the other processor too.

danielecht · September 26, 2024, 8:32am

I think you are right that JLink treating the cores separately is questionable. OpenOCD combines both cores into one virtual target. However, the RP2040 also supports running both cores independently, so JLink having the option to do so is not completely wrong.

The steps I’ve taken to verify how cores are running were as follows:

I have debug sessions for both cores, one for core 0 and one for core 1. If I understand JLink’s docs correctly, this is the way multicore debugging always works with JLink. I can then halt the cores separately by setting a breakpoint. However, the other core keeps running. I can verify that by pausing the other core and seeing that it continued execution, as well as by observing the serial output this core produces.
However, I have not yet understood whether the single running core just keeps executing the last running task or whether it somehow keeps scheduling.

richard-damon · September 26, 2024, 1:08pm

The scheduler runs in each core, so core 1 can switch from task to task. What won’t happen is the tick advancing time, since that is normally assigned to core 0.

Also, if core 1 sees that a task can be started in core 0, and signals it, that signal will be, of course, ignored.
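As a toy illustration of the tick point, the following sketch models a shared tick count that only core 0 advances. It is not FreeRTOS code. The names `sched_model_t`, `tick_period_elapsed`, and `delayed_task_ready` are made up for the example, and tick wraparound is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t tick_count;    /* shared tick, advanced only by core 0 in this model */
    bool     core0_running; /* false once core 0 has been halted */
} sched_model_t;

/* Models one tick-timer period elapsing. */
void tick_period_elapsed(sched_model_t *s)
{
    if (s->core0_running) {
        s->tick_count++;
    }
}

/* A task delayed until wake_tick becomes ready once the tick count reaches it. */
bool delayed_task_ready(const sched_model_t *s, uint32_t wake_tick)
{
    return s->tick_count >= wake_tick;
}
```

In this model, once `core0_running` is false, any task on core 1 that is waiting for a later tick never becomes ready, while tasks that are already ready can still be switched between.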