Deadlock of task lock

Hi, I observed a deadlock on TASK_LOCK on a RISC-V CPU. Is this a known issue?

  1. Task A calls xQueueSemaphoreTake().
  2. xQueueSemaphoreTake() calls vTaskSuspendAll().
  3. vTaskSuspendAll() takes TASK_LOCK and enables interrupts.
  4. A softirq is triggered and vTaskSwitchContext() ends up being called (still on the stack of the current Task A).
  5. vTaskSwitchContext() tries to take TASK_LOCK, and the deadlock is observed.
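
The sequence above can be modeled in a few lines of C. This is only a toy sketch, assuming the task lock behaves like a plain, non-recursive spinlock; the helper names are hypothetical and nothing here is taken from the kernel or from any port. It uses a try-style acquire so that the nested attempt can be observed instead of actually hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: a single flag, no notion of an owner. */
static bool plain_lock_try_get(atomic_flag *lock)
{
    /* test_and_set returns the previous value: false means we got the lock. */
    return !atomic_flag_test_and_set(lock);
}

static void plain_lock_release(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}

/* Returns true when the nested acquisition from the same core cannot succeed,
 * i.e. when a blocking spin at that point would never terminate. */
static bool nested_take_would_deadlock(void)
{
    atomic_flag task_lock = ATOMIC_FLAG_INIT;

    /* Step 3: the suspend call takes the lock, interrupts are enabled again. */
    bool outer = plain_lock_try_get(&task_lock);

    /* Steps 4-5: an interrupt on the same core reaches the context switch
     * code, which asks for the same lock while the outer holder is still
     * on the stack and cannot run to release it. */
    bool inner = plain_lock_try_get(&task_lock);

    plain_lock_release(&task_lock);
    return outer && !inner;
}
```

With a blocking acquire in place of the try-style one, the inner call would spin forever, because the only code that can release the lock is the interrupted code underneath it.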

Possible fix:
In vTaskSwitchContext(), before taking TASK_LOCK, check uxSchedulerSuspended; if it is nonzero, return immediately.
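
Roughly, the proposed check would look like the toy model below. It is a sketch under assumptions, not a kernel patch: all names are hypothetical stand-ins, and recording a pending yield instead of simply returning is an addition here, borrowed from the way the single-core kernel defers a switch while the scheduler is suspended.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the real kernel variables. */
static unsigned scheduler_suspended; /* nesting depth of scheduler suspension */
static bool yield_pending;           /* a switch was requested but deferred   */
static unsigned switch_count;        /* how many switches were performed      */

static void switch_context_model(void)
{
    if (scheduler_suspended != 0) {
        /* Scheduler is suspended: do not touch the task lock at all,
         * just remember that a switch is wanted once it resumes. */
        yield_pending = true;
        return;
    }

    /* The real function would take the task lock and pick the next task here. */
    yield_pending = false;
    switch_count++;
}
```

Whether such a check is safe on SMP, where another core may be the one that suspended the scheduler, is not something this model answers.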

Any ideas? Could this be a bug in the RISC-V port?

Are you using SMP? We do not have an SMP port for RISC-V. Where did you get the code from?

Yes, I am using an SMP code base, downloaded from the GitHub repository “FreeRTOS-Kernel.git”, branch “remotes/origin/smp”. We are still debugging why the softirq (which triggered vTaskSwitchContext()) was raised. I am not sure whether there is a bug in our RISC-V port code.

Did you write the port code yourself? In any case, can you try this latest code (which will be merged to main) - GitHub - chinglee-iot/FreeRTOS-Kernel at smp-dev-complete-merge-candidate-history

No, the port code was provided by a commercial company. OK, I will try it (although it is hard to rebase my code onto the latest). Thanks very much!

It’s hard to suggest debug steps for code we haven’t even seen. Can that commercial company provide support too?

Yes, that commercial company can provide support, but they are very slow, so I will keep going here.
I figured out this issue by analyzing the task stack. However, once the deadlock sequence was clear, I found that the issue can easily be seen just by code review of tasks.c.
I started this topic because I suspect this is a common issue. vTaskSuspendAll() takes TASK_LOCK and enables interrupts (which makes a call to vTaskSwitchContext() possible), so this deadlock is theoretically possible in the current FreeRTOS kernel. In effect, the FreeRTOS kernel may be relying on the CPU architecture port to avoid this possible deadlock (e.g. by disabling vPortYield), which I think is the wrong behavior. At the least, the FreeRTOS kernel needs to provide a guideline or hint about this.

It seems that you are observing a deadlock when calling portGET_TASK_LOCK from the same core. This points to an incorrect implementation of portGET_TASK_LOCK: it is expected not to block when called recursively from the same core. You can take a look at the following links for a reference -
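
As a rough illustration of that expectation, a lock that tolerates recursive acquisition from the same core could be sketched as below. This is a hypothetical model, not code from any FreeRTOS port: recursive_lock_t and its functions are invented names, the core ID is passed in by the caller, and a real port would read the core ID from hardware and also have to deal with interrupt masking.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)

/* Hypothetical recursive lock keyed by core ID. */
typedef struct {
    atomic_int owner; /* core ID currently holding the lock, or NO_OWNER */
    unsigned count;   /* recursion depth, only touched by the owning core */
} recursive_lock_t;

static void recursive_lock_init(recursive_lock_t *lock)
{
    atomic_init(&lock->owner, NO_OWNER);
    lock->count = 0;
}

static void recursive_lock_get(recursive_lock_t *lock, int core_id)
{
    /* Same core already holds the lock: bump the depth and do not spin. */
    if (atomic_load(&lock->owner) == core_id) {
        lock->count++;
        return;
    }

    /* Otherwise spin until the lock is free and we become the owner. */
    int expected = NO_OWNER;
    while (!atomic_compare_exchange_weak(&lock->owner, &expected, core_id)) {
        expected = NO_OWNER; /* the failed exchange overwrote it */
    }
    lock->count = 1;
}

static void recursive_lock_release(recursive_lock_t *lock, int core_id)
{
    assert(atomic_load(&lock->owner) == core_id);
    assert(lock->count > 0);

    /* Only the outermost release actually frees the lock. */
    if (--lock->count == 0) {
        atomic_store(&lock->owner, NO_OWNER);
    }
}
```

With a lock of this shape, a nested acquisition on the same core returns immediately, and the lock only becomes available to other cores after the matching number of releases.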

Great! It looks like this is the root cause. I will focus my debugging on this point. Again, thanks a lot.

@pfu Just an update - we merged the SMP branch to main today. Merge SMP feature to main (#716) · FreeRTOS/FreeRTOS-Kernel@ae3a498 · GitHub
