I opened a discussion about an error that keeps happening with my spinlocks even though their implementation has been verified independently. You can find it here.
Running main_full with all four A53 cores enabled will intermittently trigger the above assert.
Running main_full with only two cores on for 24h does not result in the above problem.
But when I stepped through it with the debugger, I realized that the interrupts are not missing; they are being serviced more slowly, and they are not being serviced at the moment the assert fires.
When you ran main_full, was configRUN_MULTIPLE_PRIORITIES set to 0 or 1?
configRUN_MULTIPLE_PRIORITIES was 0.
Thanks. Have you checked why the interrupt latency on ca? By the way, there is another possible cause of the bug: the pending bit may have been set after the breakpoint was hit.
Hi @Yafeng, I encountered this assert problem too. Did you figure out why this happens? I don't think adding a "while" check is the solution.
I think this is a bug in FreeRTOS.
The assertion reads the pxThisTCB->xTaskRunState variable, but if another core requests that the current core yield at that moment, the check can fail.
After many tests, I found that most of the time this bug occurs in the following cases:
1. The interrupt enable code is running on one core,
2. The code for the assertion check is scheduled to run on another core.
If the task is bound to a specific core, it might be possible to avoid that.
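The race described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: the type, field, and function names are hypothetical and are not the kernel's, `run_state` merely stands in for pxThisTCB->xTaskRunState, and the model assumes the second yield interrupt has not yet been taken when the state is read.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified run states; the kernel's real values differ. */
typedef enum { STATE_RUNNING, STATE_YIELD_REQUESTED } run_state_t;

/* Hypothetical model of one core and the task running on it. */
typedef struct {
    run_state_t run_state;    /* stand-in for pxThisTCB->xTaskRunState */
    bool irq_enabled;         /* are interrupts enabled on this core? */
    bool yield_irq_pending;   /* has another core raised a yield interrupt? */
} core_model_t;

/* Another core asks this core to yield: mark the state and raise an IRQ. */
void remote_request_yield(core_model_t *core)
{
    core->run_state = STATE_YIELD_REQUESTED;
    core->yield_irq_pending = true;
}

/* The core takes the pending yield interrupt, if interrupts are enabled. */
void service_yield_irq(core_model_t *core)
{
    if (core->irq_enabled && core->yield_irq_pending) {
        core->yield_irq_pending = false;
        core->run_state = STATE_RUNNING; /* yield handled, task runs again */
    }
}

/* Returns true if a check of the form "state is not YIELD_REQUESTED",
 * made right after enabling interrupts, would pass. */
bool check_after_enable(core_model_t *core, bool remote_request_races_in)
{
    core->irq_enabled = true;
    service_yield_irq(core);          /* the yield that was already pending */

    if (remote_request_races_in) {
        /* A new request lands before the state is read; its interrupt
         * has not been taken yet (assumed latency). */
        remote_request_yield(core);
    }
    return core->run_state != STATE_YIELD_REQUESTED;
}
```

In this model, `check_after_enable(&core, false)` passes for a core that started with a pending yield, while `check_after_enable(&core, true)` fails even though nothing is actually wrong: the new request is simply waiting to be serviced.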
PS: adding a "while" check is really not the right solution.
yafeng
@Yafeng
Thank you for your analysis. The assertion does not account for the possibility that the task run state could be modified by another task after its interrupts are enabled.
Can you remove the configASSERT statement and run the main_full demo again to verify this change? If the results are satisfactory, we will create a pull request to address this issue.
@sherry, @Yafeng Would you please try @Fresh's suggestion of removing this assert - FreeRTOS-Kernel/tasks.c at main · FreeRTOS/FreeRTOS-Kernel · GitHub?
After removing the configASSERT statement, I ran main_full for 5 hours and everything is OK!
Thank you for sharing! And thank you for spotting the bug. Would you like to raise a PR to fix this, or would you like us to raise one?
I suggest that you raise the PR to fix this.
@sherry @Yafeng
The PR that addresses this problem has been merged. Thank you for reporting and helping to verify this change.
Those tests were written for single-core systems, so I won't be surprised if some of them do not work on SMP. Can you try pinning all the tasks in those tests to one core?
Was anyone able to test whether the A53*4 port code works with the GIC (Generic Interrupt Controller) enabled?
Hi @aggarg @Yafeng,
I was looking at the portASM code of the "Adding support for ZynqMPSoc(A53*4) #13" merge request and was wondering where, in the case of interrupt nesting in the IRQ_Handler, we decrement this counter. I only see that we increment it and then push it to the stack. But after we call the C ISR handler, we restore this value from the stack (without decrementing) and compare it to 0. Shouldn't we be decrementing it?
Any interrupt that interrupted us will have increased the count and then restored it, so it MUST be the same as when we incremented it. There is no way to shuffle the order of the interrupt nesting that has happened, so each level only needs to restore the value it saw, which WILL be the same as a decrement.
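The save-and-restore argument above can be checked with a small C model. This is a sketch of the reasoning only, not the port's assembly: `nesting_count`, `irq_handler_model`, and the other names are hypothetical, and recursion stands in for a nested interrupt preempting the C-level handler.

```c
#include <assert.h>

/* Hypothetical stand-in for the port's interrupt nesting counter. */
static unsigned long nesting_count = 0;
static unsigned long max_depth_seen = 0;

unsigned long current_nesting(void) { return nesting_count; }
unsigned long deepest_nesting(void) { return max_depth_seen; }

/* Models one pass through an IRQ handler. `nested_levels` is how many
 * further interrupts preempt this handler while its C-level ISR runs. */
void irq_handler_model(unsigned nested_levels)
{
    /* Entry: remember the value this level found, then increment. */
    unsigned long saved = nesting_count;
    nesting_count = saved + 1;
    if (nesting_count > max_depth_seen) {
        max_depth_seen = nesting_count;
    }

    /* The C-level ISR runs here and may itself be interrupted. */
    if (nested_levels > 0) {
        irq_handler_model(nested_levels - 1);
    }

    /* Every nested level has already put the counter back to the value
     * it found, so the counter is exactly what this level left it at. */
    assert(nesting_count == saved + 1);

    /* Exit: restoring the saved value has the same effect as decrementing. */
    nesting_count = saved;
}
```

Calling `irq_handler_model(3)` nests four levels deep and leaves the counter at 0 afterwards, because interrupts unwind in strict last-in, first-out order.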
I have run the full test on SMP. Some test cases need additional operations, which are shown in the table below.
Test case | Note |
---|---|
PEEK_QUEUE_TESTS | Add affinity |
GENERIC_QUEUE_TESTS | Add affinity |
EVENT_GROUP_TESTS | Add affinity |
RECURSIVE_MUTEX_TESTS | Add affinity |
BLOCK_TIME_TESTS | Add affinity |
INTERRUPT_SEMAPHORE_TESTS | Add affinity |
TIMER_TESTS | Use TIMER_SERVICE_TASK_CORE_AFFINITY to bind the software timer service task to the same core as the test task |
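"Add affinity" in the table means restricting a test's tasks to one core. Core affinity is normally expressed as a bitmask with one bit per core, so a minimal sketch of building and checking such a mask might look like the following. The helper names and `NUM_CORES` are hypothetical; in a FreeRTOS SMP build the resulting mask would, as far as I know, be passed to vTaskCoreAffinitySet() with configUSE_CORE_AFFINITY set to 1, which is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CORES 4u /* assumed: four A53 cores, as in this thread */

/* Builds a mask that allows a task to run only on `core`.
 * Bit n set means "may run on core n". */
uint32_t affinity_mask_for_core(unsigned core)
{
    assert(core < NUM_CORES);
    return (uint32_t)1u << core;
}

/* True when the mask allows exactly one core, i.e. the task is pinned. */
bool pinned_to_single_core(uint32_t mask)
{
    return mask != 0u && (mask & (mask - 1u)) == 0u;
}

/* True when two tasks (for example a test task and the timer service
 * task) are both pinned, and pinned to the same core. */
bool pinned_to_same_core(uint32_t mask_a, uint32_t mask_b)
{
    return pinned_to_single_core(mask_a) && mask_a == mask_b;
}
```

For the timer tests, the same mask would be used for both the test task and the timer service task so that the two never run concurrently on different cores.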