Microblaze: pxCurrentTCB gets corrupted when running with interrupts enabled

richardd wrote on Monday, February 05, 2018:

Hi,

We are running a multi-threaded application on a Xilinx Microblaze soft core, and after approximately 15 hours of running we see that the pxCurrentTCB which is returned from taskSELECT_HIGHEST_PRIORITY_TASK is corrupt.
The system is running with 3 interrupts enabled:

  1. System Timer Interrupt (10ms)
  2. I2C Interrupt (random, but usually every second or so)
  3. Video Interrupt (approximately every 16 ms).

All the threads run at the same priority of “tskIDLE_PRIORITY + 1”, and two of the threads are woken up by the I2C and Video interrupts.

We have enabled the integrity checking on the task list, using configUSE_LIST_DATA_INTEGRITY_CHECK_BYTES, and when we do this we see pxCurrentTCB is being set to pdINTEGRITY_CHECK_VALUE (0x5a5a5a5a). So we are pretty sure that we are not overwriting memory and corrupting the lists.

We are also running with configCHECK_FOR_STACK_OVERFLOW=2 and portSTACK_GROWTH=-1, and the stack checking has not fired so again we don’t think we are corrupting memory.
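
For reference, a minimal sketch of how those two checks are typically switched on in FreeRTOSConfig.h. The values mirror the ones quoted above; the file layout is an assumption, and portSTACK_GROWTH is normally defined by the port layer rather than in this file.

```c
/* FreeRTOSConfig.h (fragment, assumed layout) */

/* Adds known-value marker words to the kernel's list structures so that
   an overwrite can be detected. The checks themselves rely on
   configASSERT() being defined. */
#define configUSE_LIST_DATA_INTEGRITY_CHECK_BYTES    1

/* Method 2: the kernel checks a pattern at the end of the task stack
   when the task is switched out, and calls the application's
   vApplicationStackOverflowHook() on a mismatch. */
#define configCHECK_FOR_STACK_OVERFLOW               2
```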

The Xilinx toolchain is XSDK 2017.3.

We have searched the forum and have seen that other people have reported similar issues on the Cortex-M3, but we don’t think this is relevant, as the Microblaze only has a single interrupt priority?

Any help in resolving this issue would be much appreciated!

Thanks in advance,

Richard

rtel wrote on Monday, February 05, 2018:

Are your interrupt priorities at or below configMAX_SYSCALL_INTERRUPT_PRIORITY? (It might have a different name in the Cortex-A port - Google FreeRTOS on Cortex-A to find it.)

Also do you have configASSERT() defined?

richardd wrote on Tuesday, February 06, 2018:

Hi Richard,

We have certainly been concerned about the documentation surrounding configMAX_SYSCALL_INTERRUPT_PRIORITY. Searching through the Xilinx BSP for freeRTOS 9.01 v1.1, we see no reference to this define. We also checked for the following, and none of them are defined or used in the BSP: configKERNEL_INTERRUPT_PRIORITY, configMAX_SYSCALL_INTERRUPT_PRIORITY and configMAX_API_CALL_INTERRUPT_PRIORITY.

So it seems that there are no interrupt priorities set up with the Microblaze port of FreeRTOS?

The microblaze has a single interrupt pin which is driven from an external interrupt controller which has 5 interrupts attached:
Video interrupt = 0 (We see the configASSERT on a thread which is woken by this interrupt)
I2C interrupt = 1 (Used)
UART Interrupt = 2 (Not used)
System Timer Interrupt = 3 (Used for thread switching)
SPI Interrupt = 4 (Not Used)

Looking at the Xilinx ISR handler, if multiple interrupts occur at the same time then they would be serviced starting at ID = 0.
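
As a rough illustration of that lowest-ID-first behaviour, here is a minimal sketch of a scan over a pending-interrupt bit mask. It is not the Xilinx driver's code; the function name and the bit-mask representation are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fills `order` with the IDs of the pending sources
   in the order a lowest-ID-first scan would service them, and returns
   how many IDs were written. Bit n of `pending` represents source ID n. */
size_t service_order(uint32_t pending, uint8_t *order, size_t capacity)
{
    assert(order != NULL);

    size_t count = 0;
    for (uint8_t id = 0; id < 32 && count < capacity; ++id) {
        if (pending & (UINT32_C(1) << id)) {
            order[count++] = id;
        }
    }
    return count;
}
```

With the IDs listed above, a mask with bits 0, 1 and 3 set yields the order 0, 1, 3: the Video interrupt first, then I2C, then the System Timer.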

Yes, configASSERT is defined. I have attached a stack trace captured at the error condition.

It seems that while attempting to wait on a semaphore the pxCurrentTCB gets corrupted. This happens after approximately 15 hours, with the semaphore wait occurring about every 16 ms; the semaphore is signalled by the Video Interrupt ISR.
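
To put that failure rate in perspective, here is a small calculation of how many events occur in a run of that length, using only the periods quoted in this thread. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of periodic events that fit into a run of
   `hours` hours when one event occurs every `period_ms` milliseconds. */
uint64_t events_in_run(uint64_t hours, uint64_t period_ms)
{
    assert(period_ms > 0);

    const uint64_t ms_per_hour = UINT64_C(3600) * UINT64_C(1000);
    return (hours * ms_per_hour) / period_ms;
}
```

events_in_run(15, 16) gives 3,375,000 semaphore waits and events_in_run(15, 10) gives 5,400,000 timer interrupts before the first failure, which would be consistent with a very narrow timing window rather than a systematic fault.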

Thanks for your support!

Richard

rtel wrote on Tuesday, February 06, 2018:

My apologies for getting muddled - the interrupt priority configuration
constants are for the Zynq, not Microblaze.

richardd wrote on Tuesday, February 06, 2018:

Hi Richard,
Understand getting muddled, do the same thing all the time :slight_smile:

Now that you understand that this is happening on a Microblaze, does this change your original thoughts on what might be going wrong?

Thanks

Richard

rtel wrote on Tuesday, February 06, 2018:

Difficult to say without being able to step through the code. Stating
the obvious, there is a corruption occurring somewhere, it's just a
matter of tracking down where.

Maybe a long shot, but are you using vPortEnableInterrupt() repeatedly?
There was a small fix there in the last few weeks whereby the call to
XIntc_Enable() was placed into a critical section. Most people only use
the function during initialisation though. See line 310 (at the time of
writing) on this link:
https://sourceforge.net/p/freertos/code/HEAD/tree/trunk/FreeRTOS/Source/portable/GCC/MicroBlazeV9/port.c
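
To illustrate why a read-modify-write on an interrupt enable register wants a critical section, here is a small host-side simulation. It is not the port's code and does not call XIntc_Enable(); the register, the simulated ISR and every function name in it are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t fake_ier;        /* stand-in for the enable register      */
static bool interrupts_masked;   /* true inside the simulated critical section */
static bool isr_pending;         /* interrupt arrived while masked        */

/* Simulated ISR that itself enables another source (bit 1). */
static void simulated_isr(void)
{
    fake_ier |= (UINT32_C(1) << 1);
}

/* Models an interrupt arriving at exactly this point in the code. */
static void interrupt_arrives(void)
{
    if (interrupts_masked) {
        isr_pending = true;      /* deferred until the section is left */
    } else {
        simulated_isr();
    }
}

static void enter_critical(void)
{
    interrupts_masked = true;
}

static void exit_critical(void)
{
    interrupts_masked = false;
    if (isr_pending) {
        isr_pending = false;
        simulated_isr();
    }
}

/* Read-modify-write with an interrupt landing between read and write. */
static void enable_bit(uint32_t bit)
{
    assert(bit < 32);
    uint32_t value = fake_ier;        /* read                              */
    interrupt_arrives();              /* worst-case timing                 */
    value |= (UINT32_C(1) << bit);    /* modify                            */
    fake_ier = value;                 /* write: discards any ISR change    */
}

static void reset_model(void)
{
    fake_ier = 0;
    interrupts_masked = false;
    isr_pending = false;
}

uint32_t run_unprotected(void)
{
    reset_model();
    enable_bit(0);
    return fake_ier;
}

uint32_t run_protected(void)
{
    reset_model();
    enter_critical();
    enable_bit(0);
    exit_critical();
    return fake_ier;
}
```

In this model run_unprotected() returns 0x1, because the bit set by the simulated ISR is overwritten by the stale write-back, while run_protected() returns 0x3.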

richardd wrote on Tuesday, February 06, 2018:

Hi,

Yes, we did see the fix for vPortEnableInterrupt(), but we only call this at start-up.

We can certainly get the code to stop on the error, we’d be happy to host a teamviewer session if you had the time :slight_smile:

This is getting critical for the product and at this point we are unable to proceed with shipment :frowning:

Thanks

Richard