STM32H Dual Core AMP Demo

dookei · November 9, 2022, 2:19pm

Hi there,
I am currently porting the example "Simple Multicore Core to Core Communication Using FreeRTOS Message Buffers" and adapting it in order to create a message dispatcher on each core.
The main idea is to allow the cores to communicate with each other and forward the messages to the appropriate tasks. Please see the idea below.

In my scenario I was considering using a queue to store the received messages. The queue is then read by the “Dispatcher task”.
Now, in the provided example (I know it's now finished) the sending task might get blocked if the receiving task does not read the buffer. The last paragraph therefore states, and I quote:

So far we have only considered the cases where the sending task must unblock the receiving task. If it is possible for a message buffer used for core to core communication to get full, causing the sending task to block, then it is also necessary to consider how the receiving task unblocks the sending task. That can be done by overriding the default implementation of the sbRECEIVE_COMPLETED() in exactly the same way as already described for sbSEND_COMPLETED().

What should I add in order to #define sbRECEIVE_COMPLETED( pxStreamBuffer )?
It took me some time to understand this as I am a little rusty at the moment, but I am getting there.

rtel · November 9, 2022, 5:56pm

First, I want to check that you have seen the stream and message buffer docs, of which this is one page: FreeRTOS stream buffers - circular buffers. You almost certainly will have, as the blog you refer to probably links to them.

The callback mechanism has been updated since the blog was written, so each stream buffer can have its own callback if that is preferred, which is normally the case if stream buffers are used both internally and for core to core communication within the same application. Quoting from that doc page:

Stream and message buffers created using the xStreamBufferCreate() and xMessageBufferCreate() API functions (and their statically allocated equivalents) share the same callback functions, which are defined using the sbSEND_COMPLETED() and sbRECEIVE_COMPLETED() macros. The following sections provide more information on these macros.
Stream and message buffers created using the xStreamBufferCreateWithCallback() and xMessageBufferCreateWithCallback() API functions (and their statically allocated equivalents) can each have their own unique callback function.

Your actual ask is not clear to me - do you want to know what to add in order to override the default implementation of sbRECEIVE_COMPLETED()? If so, #define it in FreeRTOSConfig.h. Alternatively, if you are asking what you need to add into the definition of that macro, then you need it to generate an interrupt in the other core (details of which are somewhat dependent on your hardware).
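
A minimal sketch of such an override, assuming a hypothetical application function vGenerateReceiveCompletedInterrupt() that raises the hardware-specific interrupt in the other core (the name and its implementation are not part of FreeRTOS):

```c
/* FreeRTOSConfig.h (fragment) */

/* Hypothetical application function, implemented elsewhere: raises an
 * interrupt in the other core to signal that data was read from the buffer.
 * How the interrupt is generated depends on the hardware. */
void vGenerateReceiveCompletedInterrupt( void * pxStreamBuffer );

/* Replace the kernel's default sbRECEIVE_COMPLETED() for every stream and
 * message buffer that has no per-instance callback. */
#define sbRECEIVE_COMPLETED( pxStreamBuffer ) \
    vGenerateReceiveCompletedInterrupt( pxStreamBuffer )
```

On the sending core, the handler for that interrupt would then typically call xMessageBufferReceiveCompletedFromISR() to unblock the sending task.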

You can look at this heavily commented file for examples: FreeRTOS/MessageBufferAMP.c at main · FreeRTOS/FreeRTOS · GitHub. The Windows or Linux simulator examples are a convenient way to see that file in use.

dookei · November 11, 2022, 9:06am

Hi @rtel, thanks for your prompt and clear response.
I didn't notice that there was already an updated implementation of that demo. I was using the one provided by ST.
Yes, what I was also looking for was where and how to implement sbRECEIVE_COMPLETED().
In the updated demo (link below) this is already implemented.
So here is a quick overview of how it would look (in my IPC message dispatcher stack).

I believe this way the IPC communication stack would perform well.

For future reference, here is the updated demo for the dual-core STM32H7 family devices: CORTEX_M7_M4_AMP_STM32H745I_Discovery_IAR

dookei · November 22, 2022, 10:02am

Hi @rtel, just a quick question related to overriding the default implementations of sbSEND_COMPLETED() and sbRECEIVE_COMPLETED().
I am using them both for the IPC/AMP, but I would also like to use message buffers within FreeRTOS “the standard way”. Is there a clean way to fall back to the default implementation when the buffer is not IPC related?

So this is the implementation from the AMP demo that I modified in order to cover both IPC and standard message buffer usage:

void vGenerate_ISR_TxSendComp( void * xUpdatedMessageBuffer )
{
	MessageBufferHandle_t xUpdatedBuffer = ( MessageBufferHandle_t ) xUpdatedMessageBuffer;

	/* IPC message buffer (either the control buffer or the data buffer). */
	if( ( xUpdatedBuffer == xControlMessageBuffer ) || ( xUpdatedBuffer == xDataMessageBuffers ) )
	{
		if( xUpdatedBuffer != xControlMessageBuffer ) /* This is a data message buffer. */
		{
			/* Write the handle of the data message buffer that was just written to
			into the control message buffer. */
			while( xMessageBufferSend( xControlMessageBuffer, &xUpdatedBuffer, sizeof( xUpdatedBuffer ), mbaDONT_BLOCK ) != sizeof( xUpdatedBuffer ) )
			{
				/* Nothing to do here. */
			}
			GENERATE_EXTI( EXTI_LINE_TX_COMP );	/* Generate interrupt in the other core. */
		}
		else
		{
			/* Writing to the control buffer also triggers sbSEND_COMPLETED();
			do nothing for it so that the send above does not recurse. */
		}
	}
	/* Not an IPC message buffer. */
	else
	{
		sb_STANDARD_SEND_COMPLETED( xUpdatedMessageBuffer );
	}
}

sb_STANDARD_SEND_COMPLETED() is nothing more than a copy of the default implementation in FreeRTOS-Kernel/stream_buffer.c.
Would this be the best way to support both?
Thanks once again for your fantastic support.

aggarg · November 22, 2022, 11:27am

The sbSEND_COMPLETED is called if the per-instance callback is not supplied - FreeRTOS-Kernel/stream_buffer.c at 5f7ca3a55fa9ccd6137c9c557ecba629c0b96e2a · FreeRTOS/FreeRTOS-Kernel · GitHub

If you are creating the stream buffer using xStreamBufferCreate, the per-instance callback will be NULL and sbSEND_COMPLETED will get called. There should be no need to make a copy (sb_STANDARD_SEND_COMPLETED) and call it explicitly. Did I miss anything?

dookei · November 22, 2022, 11:46am

Yes, you are right if it is not supplied, but here it is supplied.

This is what I have in my FreeRTOSConfig.h:

/* Override the default implementation of sbSEND_COMPLETED so the macro creates an interrupt in the M7 core*/
#define sbSEND_COMPLETED( pxStreamBuffer ) vGenerate_ISR_TxSendComp( pxStreamBuffer )

aggarg · November 22, 2022, 11:53am

Let's say you have 2 stream buffers:

Used for AMP (core to core) - Create this buffer using xStreamBufferCreateWithCallback and supply vGenerate_ISR_TxSendComp as the pxSendCompletedCallback parameter.
Used within the same core (regular task to task communication) - Create this buffer using xStreamBufferCreate.

If you do that, you do not need to override sbSEND_COMPLETED in your FreeRTOSConfig.h.
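
The selection between the two cases can be illustrated with a small host-side model. This is only a sketch of the idea, not the kernel source: ModelBuffer_t, the counters and all function names are hypothetical stand-ins, and the real callbacks take additional parameters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's stream buffer structure; only the
 * field needed for this illustration is modeled. */
struct ModelBuffer;
typedef void ( * ModelCallback_t )( struct ModelBuffer * pxBuffer );

typedef struct ModelBuffer
{
    ModelCallback_t pxSendCompletedCallback; /* NULL if created without a callback. */
} ModelBuffer_t;

int xDefaultNotifications = 0; /* Counts default (same-core) notifications. */
int xInterCoreInterrupts = 0;  /* Counts simulated inter-core interrupts. */

/* Stand-in for the default sbSEND_COMPLETED(): notify a task on this core. */
void vModelDefaultSendCompleted( ModelBuffer_t * pxBuffer )
{
    ( void ) pxBuffer;
    xDefaultNotifications++;
}

/* Stand-in for a per-instance callback that interrupts the other core. */
void vModelInterCoreSendCompleted( ModelBuffer_t * pxBuffer )
{
    ( void ) pxBuffer;
    xInterCoreInterrupts++;
}

/* Models what happens after a send: use the per-instance callback when one
 * was supplied at creation, otherwise fall back to the default behaviour. */
void vModelSendCompleted( ModelBuffer_t * pxBuffer )
{
    if( pxBuffer->pxSendCompletedCallback != NULL )
    {
        pxBuffer->pxSendCompletedCallback( pxBuffer );
    }
    else
    {
        vModelDefaultSendCompleted( pxBuffer );
    }
}
```

A buffer whose callback field is set models one created with xStreamBufferCreateWithCallback; a buffer with a NULL field models one created with xStreamBufferCreate, which keeps the default behaviour without any override in FreeRTOSConfig.h.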

dookei · November 22, 2022, 12:57pm

@aggarg Thank you very much for your prompt reply.
If I understood correctly, the provided AMP demo CORTEX_M7_M4_AMP_STM32H745I_Discovery_IAR won't work together with message buffers used for communication within the same RTOS instance?

Now I understand why it is written in RTOS-stream-buffer-example:

Create the stream buffer using the xStreamBufferCreateWithCallback() or xStreamBufferCreateStaticWithCallback() API functions if you need each stream buffer to have its own “send completed” behaviour.

I also noticed that this was implemented in 10.5.0 and is not yet supported for MPU enabled ports. Is there a plan to support this feature on MPU enabled ports?

FreeRTOS/FreeRTOS-Kernel/blob/5f7ca3a55fa9ccd6137c9c557ecba629c0b96e2a/History.txt#L58-L72

	+ Add the ability to override send and receive completed callbacks for each
	  instance of a stream buffer or message buffer. Earlier there could be
	  one send and one receive callback for all instances of stream and message
	  buffers. Having separate callbacks per instance allows different message
	  and stream buffers to be used differently - for example, some for inter core
	  communication and others for same core communication.
	  The feature can be controlled by setting the configuration option
	  configUSE_SB_COMPLETED_CALLBACK in FreeRTOSConfig.h. When the option is set to 1,
	  APIs xStreamBufferCreateWithCallback() or xStreamBufferCreateStaticWithCallback()
	  (and likewise APIs for message buffer) can be used to create a stream buffer
	  or message buffer instance with application provided callback overrides. When
	  the option is set to 0, then the default callbacks as defined by
	  sbSEND_COMPLETED() and sbRECEIVE_COMPLETED() macros are invoked. To maintain
	  backwards compatibility, configUSE_SB_COMPLETED_CALLBACK defaults to 0. The
	  functionality is currently not supported for MPU enabled ports.
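
Going by the History.txt text above, enabling the per-instance callbacks would be a one-line change in the configuration header. A minimal fragment, assuming a non-MPU port and kernel V10.5.0 or later:

```c
/* FreeRTOSConfig.h (fragment) */

/* Enable per-instance send/receive completed callbacks so that the
 * ...WithCallback() creation functions become available. The option
 * defaults to 0 when it is left undefined. */
#define configUSE_SB_COMPLETED_CALLBACK    1
```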

aggarg · November 22, 2022, 1:05pm

For that you will need something like you suggested above.

Not sure if it is currently planned. Do you have a use case where you need it?

dookei · November 22, 2022, 1:27pm

Not yet, but I might have a use case next year for a custom request.

I will now adapt the code to use both kinds of message buffer simultaneously: xMessageBufferCreateStaticWithCallback() for the IPC communication and xMessageBufferCreateStatic() for communication within the same RTOS instance (the default usage).

aggarg · November 22, 2022, 2:52pm

Sure - let us know the details when you have them.

dookei · December 1, 2022, 11:02am

@aggarg “quick” question (I am getting a bit confused),
based on this page: FreeRTOS xMessageBufferCreateStatic() API documentation

For the AMP implementation, when using xMessageBufferCreateStaticWithCallback
I can only define and implement the callbacks on the core that is transmitting the message. Right?

So on the core that is transmitting the message I have

xMessageBufferWithCallback = xMessageBufferCreateStaticWithCallback( 
                                     sizeof( ucMessageBufferWithCallbackStorage ),
                                     ucMessageBufferWithCallbackStorage,
                                     &xMessageBufferWithCallbackStruct,
                                     vSendCompletedCallback,
                                     vReceiveCompletedCallback );

and then the implementation of vSendCompletedCallback, which is going to generate the interrupt on the partner core.

void vSendCompletedCallback( MessageBufferHandle_t xUpdatedBuffer, BaseType_t xIsInsideISR, BaseType_t * const pxHigherPriorityTaskWoken )
{
	/* xIsInsideISR and pxHigherPriorityTaskWoken are not used here, so this
	only works as long as the buffer is written from task context. */

	/* Write the handle of the data message buffer that was just written to
	into the control message buffer. */
	while( xMessageBufferSend( xControlMsgBuff_Tx, &xUpdatedBuffer, sizeof( xUpdatedBuffer ), mbaDONT_BLOCK ) != sizeof( xUpdatedBuffer ) )
	{
		/* Nothing to do here. */
	}
	GENERATE_EXTI( EXTI_LINE_TX_DISP_SEND_COMP );	/* Generate interrupt on the other core. */
}


void vReceiveCompletedCallback( MessageBufferHandle_t xMessageBuffer, BaseType_t xIsInsideISR, BaseType_t * const pxHigherPriorityTaskWoken )
{
	/* Do nothing: the receive completed signal is handled by the ISR below. */
}

/* Interrupt generated by the partner core when the message was read from the buffer */
void IPC_COM_ISR_Tx( void )
{
	BaseType_t xHigherPriorityTaskWoken = pdFALSE;
	//TODO configASSERT( xM7AMPTask );
	CLEARFLAG_EXTI(EXTI_LINE_TX_DISP_REC_COMP);
	xMessageBufferReceiveCompletedFromISR( xDataMsgBuff_Tx, &xHigherPriorityTaskWoken ); 
	/* Normal FreeRTOS "yield from interrupt" semantics, where
	xHigherPriorityTaskWoken is initialized to pdFALSE and will then get set to
	pdTRUE if the interrupt unblocks a task that has a priority above that of
	the currently executing task. */
	portYIELD_FROM_ISR( xHigherPriorityTaskWoken );
}

That means that the core that is receiving the message (and not creating the message buffer) won't have a callback, since it didn't create the message buffer. Or am I missing something here (I must be)?

On the receiving core I have an ISR that signals that a message was sent to the buffer and is available to be received.

void IPC_COM_ISR_Rx( void ) /* Interrupt generated by the partner core to indicate that a message was sent to the buffer. */
{
	MessageBufferHandle_t xUpdatedMessageBuffer;
	BaseType_t xHigherPriorityTaskWoken = pdFALSE;

	/* The control message buffer contains the handle of the message buffer that contains data. */
	if( xMessageBufferReceiveFromISR( xControlMsgBuff_Rx, &xUpdatedMessageBuffer, sizeof( xUpdatedMessageBuffer ), &xHigherPriorityTaskWoken )
			== sizeof( xUpdatedMessageBuffer ) )
	{
		/* Call the API function that sends a notification to any task that is
		blocked on the xUpdatedMessageBuffer message buffer waiting for data to
		arrive. */
		xMessageBufferSendCompletedFromISR( xUpdatedMessageBuffer, &xHigherPriorityTaskWoken );
	}

	/* Did the notification unblock a higher priority task? */
	if( xHigherPriorityTaskWoken )
	{
		portYIELD_FROM_ISR( xHigherPriorityTaskWoken );
	}
}

After that the blocked task will read the message buffer and (I assume) execute the default implementation of sbRECEIVE_COMPLETED( pxStreamBuffer ). Or am I misunderstanding something?

	xReceivedBytes = xMessageBufferReceive(
						xDataMsgBuff_Rx,		/* Handle of message buffer. */
						&xReceivedMsg,			/* Buffer into which received data is placed. */
						sizeof( Message_IPC ),	/* Size of the receive buffer. */
						IPC_COM_WAIT_RX );		/* Time to wait for data to arrive. */

aggarg · December 6, 2022, 10:08am

Did you specify a receive callback during creation of the message buffer? If yes, that should be called. It should be easy enough to verify by putting a breakpoint in the callback.

dookei · December 6, 2022, 12:01pm

Hi @aggarg, thanks for your feedback.
How can I define it if the buffer was created by the partner core?

My question is: in the AMP example without xMessageBufferCreateStaticWithCallback, the default implementations of sbRECEIVE_COMPLETED() and sbSEND_COMPLETED() could be overridden on each core.
But now, by using xMessageBufferCreateStaticWithCallback, I can only have these callbacks overridden on the core that creates the message buffer. Correct?

On Core CM7

Create the buffer with xMessageBufferCreateStaticWithCallback, passing vSendCompletedCallback() and vReceiveCompletedCallback() accordingly.
vSendCompletedCallback() basically generates an interrupt on the CM4 core.
vReceiveCompletedCallback() is empty, since the receive completed signal is sent from the CM4 core in the form of an EXTI interrupt.
In my CM4toCM7_EXTI_ISR, I then call xMessageBufferReceiveCompletedFromISR().

On Core CM4

Receives an EXTI interrupt and executes the corresponding ISR, which calls xMessageBufferSendCompletedFromISR().
Now the task that was blocked will receive from the buffer (xMessageBufferReceive()).
And here is where my question arises: after calling xMessageBufferReceive(), the CM4 core will execute the default sbRECEIVE_COMPLETED(), since the message buffer was created by CM7.
I took a deeper dive into xMessageBufferCreateStaticWithCallback and this leads me to this:

FreeRTOS/FreeRTOS-Kernel/blob/91927abc0b630e9a499c4dbf34e9b1009eadaff5/stream_buffer.c#L1385-L1389

#if ( configUSE_SB_COMPLETED_CALLBACK == 1 )
{
    pxStreamBuffer->pxSendCompletedCallback = pxSendCompletedCallback;
    pxStreamBuffer->pxReceiveCompletedCallback = pxReceiveCompletedCallback;
}

Now these callbacks can only be executed by the CM7 core and not by the CM4, since they are in the CM7 memory area.

Question
Is there a way to define the callback vReceiveCompletedCallback() on the CM4 without overriding the default sbRECEIVE_COMPLETED() or losing the default stream buffer implementation?
Many thanks once again.

aggarg · December 7, 2022, 5:12am

Thank you for the detailed explanation. I understand your issue.

Assuming you are building both the instances of FreeRTOS with configUSE_SB_COMPLETED_CALLBACK set to 1, this line will try to invoke the callback in CM7. Is it possible for you to verify that?

I do not think there is a straightforward way as of now. Can you try to add the following functions and then set the receive completed callback after creation from CM4?

#if ( configUSE_SB_COMPLETED_CALLBACK == 1 )

void vStreamBufferSetReceiveCompletedCallback( StreamBufferHandle_t xStreamBuffer,
                                               StreamBufferCallbackFunction_t pxReceiveCompletedCallback )
{
    xStreamBuffer->pxReceiveCompletedCallback = pxReceiveCompletedCallback;
}

#endif /* configUSE_SB_COMPLETED_CALLBACK */
/*-----------------------------------------------------------*/

#if ( configUSE_SB_COMPLETED_CALLBACK == 1 )

void vStreamBufferSetSendCompletedCallback( StreamBufferHandle_t xStreamBuffer,
                                            StreamBufferCallbackFunction_t pxSendCompletedCallback )
{
    xStreamBuffer->pxSendCompletedCallback = pxSendCompletedCallback;
}

#endif /* configUSE_SB_COMPLETED_CALLBACK */
/*-----------------------------------------------------------*/

maxdd · July 30, 2024, 7:23pm

@aggarg
Is the change still needed on the other core?
Also is there an up-to-date example to use as a reference or is CORTEX_M7_M4_AMP_STM32H745I_Discovery_IAR + Callback API the best shot currently?

aggarg · July 31, 2024, 5:52am

Are you trying to solve the same problem? If yes, you’ll need to add the two functions mentioned above. Let us know if this solves your use case and we will be happy to add them (or you can raise a PR too ;)).

Yes, that is the best shot. Let us know if you face any issue.

maxdd · July 31, 2024, 7:02am

In general I’m trying to find a consistent way to share data between cores without the need for an external library.
Are the two functions eventually supposed to be added to stream_buffer.c?
My understanding is that I would need to be sure that the CM4 calls the function “vStreamBufferSetReceiveCompletedCallback” after the CM7 has initialized the xDataMessageBuffers, hence after prvWaitForOtherCoreToStart.
“vStreamBufferSetSendCompletedCallback” is in fact not needed because we can “create” the buffer “withCallback()”, right?

aggarg · July 31, 2024, 8:24am

Yes, that is correct. To be precise, CM4 needs to call vStreamBufferSetReceiveCompletedCallback for the stream buffer after CM7 has created the stream buffer.

Right - you do not need to call this one. I proposed to add both for consistency.

If this solves your problem, we are happy to add these. Please let us know if you are able to make it work.
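
The ordering discussed above can be illustrated with a small host-side model. It is only a sketch under stated assumptions: ModelSharedBuffer_t, the xCreated flag and all function names are hypothetical stand-ins rather than kernel code, the model runs on a single thread, and a real port would rely on its own start-up handshake between the cores (prvWaitForOtherCoreToStart in the demo).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void ( * ModelCallback_t )( void * pvBuffer );

/* Hypothetical stand-in for a stream buffer structure placed in memory that
 * both cores can access. */
typedef struct
{
    bool xCreated;                              /* Set once CM7 has created the buffer. */
    ModelCallback_t pxSendCompletedCallback;    /* Runs on CM7 after a send. */
    ModelCallback_t pxReceiveCompletedCallback; /* Runs on CM4 after a receive. */
} ModelSharedBuffer_t;

int xSendSignals = 0;    /* Simulated CM7 -> CM4 interrupts. */
int xReceiveSignals = 0; /* Simulated CM4 -> CM7 interrupts. */

void vModelCM7SendCompleted( void * pvBuffer )
{
    ( void ) pvBuffer;
    xSendSignals++;
}

void vModelCM4ReceiveCompleted( void * pvBuffer )
{
    ( void ) pvBuffer;
    xReceiveSignals++;
}

/* CM7 side: create the buffer and supply only the send completed callback,
 * as the "WithCallback" creation functions allow. */
void vModelCM7CreateBuffer( ModelSharedBuffer_t * pxBuffer )
{
    pxBuffer->pxSendCompletedCallback = vModelCM7SendCompleted;
    pxBuffer->pxReceiveCompletedCallback = NULL;
    pxBuffer->xCreated = true;
}

/* CM4 side: install the receive completed callback, but only once the buffer
 * exists. Returns false if called too early, because creation on CM7 would
 * otherwise overwrite the callback again. */
bool xModelCM4InstallReceiveCallback( ModelSharedBuffer_t * pxBuffer )
{
    if( !pxBuffer->xCreated )
    {
        return false;
    }

    pxBuffer->pxReceiveCompletedCallback = vModelCM4ReceiveCompleted;
    return true;
}
```

In the model, installing the callback before creation is rejected; after creation each core ends up with a callback that lives in its own code, which is the situation the two proposed setter functions are meant to enable.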