Should the Microblaze FreeRTOS port define portMEMORY_BARRIER()?

I’m using FreeRTOS for my project. It gets stuck in an infinite loop up in the xTaskResumeAll method from time to time. I’m not calling this method directly, its being called internally from FreeRTOS when I am using semaphores or task notifications.

It appears that its happening when xTaskResumeAll is called when xPendingReadyList has more than two items in the list. The loop it gets stuck in is

    while( listLIST_IS_EMPTY( &xPendingReadyList ) == pdFALSE )
    {
       pxTCB = listGET_OWNER_OF_HEAD_ENTRY( ( &xPendingReadyList ) );
       listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
       portMEMORY_BARRIER();
       listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
       prvAddTaskToReadyList( pxTCB );
       ...
    }

Looking at the generated binary its possible to see that pxTCB is only read the first time through the loop. I guess the compiler is unable to work out that listREMOVE_ITEM( &( pxTCB->xEventListItem ) ) actually changes the result of listGET_OWNER_OF_HEAD_ENTRY( ( &xPendingReadyList ) ) and as such it caches pxTCB in a register the first time through the loop. This means that the second (and subsequent) times through the loop we continually try (and fail) to operate on the pxTCB cached in the register. This means that the second item in the list is never removed and we have an infinite loop.

It appears that this was a known problem and a portMEMORY_BARRIER() was added into FreeRTOS. It would force the pxTCB to be reloaded by the compiler for each loop. The problem is that that the portMEMORY_BARRIER is undefined for the Microblaze port. When I define portMEMORY_BARRIER and re-compile the problem disappears:

    #define portMEMORY_BARRIER()      asm volatile ("" ::: "memory")

This happens in my project when the interrupts line up such that vTaskSuspendAll has been called (e.g. when using semaphores) and two interrupts trigger pretty close together that awake tasks. It can take hours or days but it does eventually happen.

This can be reproduced reliably with a sample application. I have manually called vTaskSuspendAll and xTaskResumeAll but it does show the problem. I can’t upload because I am a new user.

If you look at the code, it seems to be an optimization error, as the compiler SHOULD be able to see that the list is changed, so it needs to read the new value, but if it doesn’t it has made an error.

The sort of memory barrier you are using shouldn’t be expensive, as it just limits optimizations over the line, and for that loop there shouldn’t be much of that sort available. But that is still just a band-aid for an erroneous compiler.

It might be useful to carefully document that code segment and the resulting output and report it to the compler’s support group.

The solution seems reasonable. Just to confirm, can you share the generated assembly before and after making this change?

I can’t upload files so I will just include it here inline. I trimmed it down to just the loop. Before:

               while( listLIST_IS_EMPTY( &xPendingReadyList ) == pdFALSE )
    4768:	b0000000 	imm	0
    476c:	31807b54 	addik	r12, r0, 31572	// 7b54 <xPendingReadyList>
    4770:	e86c0000 	lwi	r3, r12, 0
    4774:	be0301e0 	beqid	r3, 480		// 4954
    4778:	33000001 	addik	r24, r0, 1
                    pxTCB = listGET_OWNER_OF_HEAD_ENTRY( ( &xPendingReadyList ) ); /*lint !e9079 void * is used as this macro is used with timers and co-routines too.  Alignment is known to be fine as the type of the pointer stored and retrieved is the same. */
    477c:	e86c000c 	lwi	r3, r12, 12
    4780:	e863000c 	lwi	r3, r3, 12
                    prvAddTaskToReadyList( pxTCB );
    4784:	ea63002c 	lwi	r19, r3, 44
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    4788:	e8c30028 	lwi	r6, r3, 40
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    478c:	e8a30014 	lwi	r5, r3, 20
                    prvAddTaskToReadyList( pxTCB );
    4790:	61730014 	muli	r11, r19, 20
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    4794:	32e30018 	addik	r23, r3, 24
                    prvAddTaskToReadyList( pxTCB );
    4798:	46d89c00 	bsll	r22, r24, r19
    479c:	b0000000 	imm	0
    47a0:	30eb7b98 	addik	r7, r11, 31640
    47a4:	b0000000 	imm	0
    47a8:	316b7b9c 	addik	r11, r11, 31644
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47ac:	b81000b0 	brid	176		// 485c
    47b0:	31230004 	addik	r9, r3, 4
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    47b4:	e8860000 	lwi	r4, r6, 0
    47b8:	f9430028 	swi	r10, r3, 40
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47bc:	e9030008 	lwi	r8, r3, 8
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    47c0:	3084ffff 	addik	r4, r4, -1
    47c4:	f8860000 	swi	r4, r6, 0
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47c8:	e883000c 	lwi	r4, r3, 12
    47cc:	10ca0000 	addk	r6, r10, r0
    47d0:	f8880008 	swi	r4, r8, 8
    47d4:	eb23000c 	lwi	r25, r3, 12
    47d8:	e8850004 	lwi	r4, r5, 4
    47dc:	88844800 	xor	r4, r4, r9
    47e0:	be0400f8 	beqid	r4, 248		// 48d8
    47e4:	f9190004 	swi	r8, r25, 4
    47e8:	e9050000 	lwi	r8, r5, 0
                    prvAddTaskToReadyList( pxTCB );
    47ec:	e88b0000 	lwi	r4, r11, 0
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47f0:	3108ffff 	addik	r8, r8, -1
    47f4:	f9050000 	swi	r8, r5, 0
                    prvAddTaskToReadyList( pxTCB );
    47f8:	e9040008 	lwi	r8, r4, 8
    47fc:	b0000000 	imm	0
    4800:	e8a07b1c 	lwi	r5, r0, 31516	// 7b1c <uxTopReadyPriority>
    4804:	f8830008 	swi	r4, r3, 8
    4808:	f903000c 	swi	r8, r3, 12
    480c:	e9040008 	lwi	r8, r4, 8
    4810:	80b62800 	or	r5, r22, r5
    4814:	b0000000 	imm	0
    4818:	f8a07b1c 	swi	r5, r0, 31516	// 7b1c <uxTopReadyPriority>
    481c:	e8a70000 	lwi	r5, r7, 0
    4820:	f9280004 	swi	r9, r8, 4
    4824:	f9240008 	swi	r9, r4, 8
    4828:	f8e30014 	swi	r7, r3, 20
    482c:	30850001 	addik	r4, r5, 1
    4830:	f8870000 	swi	r4, r7, 0
                    if( pxTCB->uxPriority >= pxCurrentTCB->uxPriority )
    4834:	b0000000 	imm	0
    4838:	e8807c38 	lwi	r4, r0, 31800	// 7c38 <pxCurrentTCB>
    483c:	e884002c 	lwi	r4, r4, 44
    4840:	16449803 	cmpu	r18, r4, r19
    4844:	be520010 	bltid	r18, 16		// 4854
    4848:	10a70000 	addk	r5, r7, r0
                        xYieldPending = pdTRUE;
    484c:	b0000000 	imm	0
    4850:	fb007b10 	swi	r24, r0, 31504	// 7b10 <xYieldPending>
                while( listLIST_IS_EMPTY( &xPendingReadyList ) == pdFALSE )
    4854:	e88c0000 	lwi	r4, r12, 0
    4858:	bc0400d0 	beqi	r4, 208		// 4928
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    485c:	eb23001c 	lwi	r25, r3, 28
    4860:	e9030020 	lwi	r8, r3, 32
    4864:	e8860004 	lwi	r4, r6, 4
    4868:	f9190008 	swi	r8, r25, 8
    486c:	8884b800 	xor	r4, r4, r23
    4870:	be24ff44 	bneid	r4, -188		// 47b4
    4874:	fb280004 	swi	r25, r8, 4
    4878:	b810ff3c 	brid	-196		// 47b4

After:

             while( listLIST_IS_EMPTY( &xPendingReadyList ) == pdFALSE )
    475c:	b0000000 	imm	0
    4760:	31607b1c 	addik	r11, r0, 31516	// 7b1c <xPendingReadyList>
    4764:	e86b0000 	lwi	r3, r11, 0
    4768:	be0301b8 	beqid	r3, 440		// 4920
    476c:	31800001 	addik	r12, r0, 1
                    pxTCB = listGET_OWNER_OF_HEAD_ENTRY( ( &xPendingReadyList ) ); /*lint !e9079 void * is used as this macro is used with timers and co-routines too.  Alignment is known to be fine as the type of the pointer stored and retrieved is the same. */
    4770:	b81000c4 	brid	196		// 4834
    4774:	e86b000c 	lwi	r3, r11, 12
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    4778:	e8a40000 	lwi	r5, r4, 0
    477c:	f8030028 	swi	r0, r3, 40
    4780:	30a5ffff 	addik	r5, r5, -1
    4784:	f8a40000 	swi	r5, r4, 0
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    4788:	e8c30014 	lwi	r6, r3, 20
    478c:	e8e30008 	lwi	r7, r3, 8
    4790:	e8a3000c 	lwi	r5, r3, 12
    4794:	e8860004 	lwi	r4, r6, 4
    4798:	31030004 	addik	r8, r3, 4
    479c:	f8a70008 	swi	r5, r7, 8
    47a0:	88844000 	xor	r4, r4, r8
    47a4:	be04010c 	beqid	r4, 268		// 48b0
    47a8:	f8e50004 	swi	r7, r5, 4
                    prvAddTaskToReadyList( pxTCB );
    47ac:	e8e3002c 	lwi	r7, r3, 44
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47b0:	e9460000 	lwi	r10, r6, 0
                    prvAddTaskToReadyList( pxTCB );
    47b4:	60870014 	muli	r4, r7, 20
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47b8:	314affff 	addik	r10, r10, -1
                    prvAddTaskToReadyList( pxTCB );
    47bc:	452c3c00 	bsll	r9, r12, r7
    47c0:	b0000000 	imm	0
    47c4:	30847b60 	addik	r4, r4, 31584
    47c8:	e8a40004 	lwi	r5, r4, 4
                    listREMOVE_ITEM( &( pxTCB->xStateListItem ) );
    47cc:	f9460000 	swi	r10, r6, 0
                    prvAddTaskToReadyList( pxTCB );
    47d0:	b0000000 	imm	0
    47d4:	e8c07ae4 	lwi	r6, r0, 31460	// 7ae4 <uxTopReadyPriority>
    47d8:	e9450008 	lwi	r10, r5, 8
    47dc:	f8a30008 	swi	r5, r3, 8
    47e0:	80c93000 	or	r6, r9, r6
    47e4:	f943000c 	swi	r10, r3, 12
    47e8:	e9250008 	lwi	r9, r5, 8
    47ec:	b0000000 	imm	0
    47f0:	f8c07ae4 	swi	r6, r0, 31460	// 7ae4 <uxTopReadyPriority>
    47f4:	e8c40000 	lwi	r6, r4, 0
    47f8:	f9090004 	swi	r8, r9, 4
    47fc:	f9050008 	swi	r8, r5, 8
    4800:	f8830014 	swi	r4, r3, 20
    4804:	30660001 	addik	r3, r6, 1
    4808:	f8640000 	swi	r3, r4, 0
                    if( pxTCB->uxPriority >= pxCurrentTCB->uxPriority )
    480c:	b0000000 	imm	0
    4810:	e8607c00 	lwi	r3, r0, 31744	// 7c00 <pxCurrentTCB>
    4814:	e863002c 	lwi	r3, r3, 44
    4818:	16433803 	cmpu	r18, r3, r7
    481c:	bc52000c 	blti	r18, 12		// 4828
                        xYieldPending = pdTRUE;
    4820:	b0000000 	imm	0
    4824:	f9807ad8 	swi	r12, r0, 31448	// 7ad8 <xYieldPending>
                while( listLIST_IS_EMPTY( &xPendingReadyList ) == pdFALSE )
    4828:	e86b0000 	lwi	r3, r11, 0
    482c:	bc0300c8 	beqi	r3, 200		// 48f4
                    pxTCB = listGET_OWNER_OF_HEAD_ENTRY( ( &xPendingReadyList ) ); /*lint !e9079 void * is used as this macro is used with timers and co-routines too.  Alignment is known to be fine as the type of the pointer stored and retrieved is the same. */
    4830:	e86b000c 	lwi	r3, r11, 12
    4834:	e863000c 	lwi	r3, r3, 12
                    listREMOVE_ITEM( &( pxTCB->xEventListItem ) );
    4838:	e8830028 	lwi	r4, r3, 40
    483c:	e8e3001c 	lwi	r7, r3, 28
    4840:	e8c30020 	lwi	r6, r3, 32
    4844:	e8a40004 	lwi	r5, r4, 4
    4848:	31030018 	addik	r8, r3, 24
    484c:	f8c70008 	swi	r6, r7, 8
    4850:	88a54000 	xor	r5, r5, r8
    4854:	be25ff24 	bneid	r5, -220		// 4778
    4858:	f8e60004 	swi	r7, r6, 4
    485c:	b810ff1c 	brid	-228		// 4778

Just to be clear the memory barrier was already in the code so I am not adding that. I am just defining the memory barrier. I assume this means its also a problem for other platforms/compilers.

Fair enough, that’s what I thought. I do wonder if it is a bug though. The compiler would have to somehow know that xPendingReadyList and xPendingReadyList->xListEnd->pxNext->pvOwner->xEventListItem->pxContainer could be related (in this case they are the same object). I wonder what assumptions the compiler can make about memory. If it couldn’t make any then it would never be able cache values in registers. I wonder if the pvOwner (which is a void *) lets the compiler relax some of its assumptions. Anyways, like you mention I’ll try post to the compler’s support group.

Thank you for sharing these assemblies. Your assessment and solution looks good to me. Would you be willing to raise a PR?

Sure gave it a go https://github.com/FreeRTOS/FreeRTOS-Kernel/pull/621.

I actually worked out this is not a problem with the compiler. It is actually allowed to make this optimization according to the ISO C specification. The code FreeRTOS is technically “incorrect C”. When the item from the list is removed it operates on List_t * and ListItem_t *. This means that the compiler can assume that any MiniListItem_t will remain unchanged by the loop. As pxTCB is obtained via a MiniListItem_t it is able to assume that pxTCB wont change in the loop (it also knows that pvOwner is not changed by the loop).

In this case this is incorrect because the ListItem_t * may actually alias a MiniListItem_t (i.e. may actually change the MiniListItem_t in the loop) which is technically not allowed by ISO C (see https://gcc.gnu.org/bugs/#casting_and_optimization, https://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html and https://gist.github.com/shafik/848ae25ee209f698763cffee272a58f8).

This happens even with my desktop compiler. I have a main.c that takes the list.c and triggers a failure (i.e. SEGFAULT as the loop tries to read from 0x0000) on my Linux x86 desktop in both gcc and clang. This failure goes away if you do any of the following:

  1. Define a memory barrier.
  2. Change List_t.xListEnd to be a ListItem_t. As the loop body changes a ListItem_t.pxNext value the compiler forced to re-read pxTCB as it must now assume that the loop might have changed List_t.xListEnd.pxNext.
  3. Compile with -fno-strict-aliasing (i.e. Turn of the optimization causing problems).

I guess the MiniListItem_t is to save memory in the TCB. However, it does mean that the compiler can do strange undefined things if it is aliased via ListItem_t* and changed. I wonder if there are other problems in the code. Maybe this has all been thought of and fixed already?

1 Like

You can stop the use of MiniListItem_t by definging configUSE_MINI_LIST_ITEM to 0 in your FreeRTOSConfig.h. Would you please check if your problem is solved by that?

Defining configUSE_MINI_LIST_ITEM to 0 also fixes the problem.

Thank you for reporting back.