xSemaphoreGive results in BusFault on STM32F777

Hello,

I’m developing a piece of code that records some measurements. After a successful measurement, an existing thread (let’s call it MeasThread) notifies my new thread (let’s call it RecorderActuatorThread) via a binary semaphore. RecorderActuatorThread then writes the measurement to the buffer, guarding the access with a mutex. After two successful writes, the third one results in a hard fault in xSemaphoreGive(_recorder.buffer.mutex). RecorderActuatorThread runs in a loop where it checks for a message from RecorderThread that can change its state, and writes to the buffer if the state is recording.

The application already consists of multiple threads (GUI, Meas, etc.). Before adding the recorder thread it worked like a charm, so (I guess) we can assume that the problem is somewhere in the recorder thread.

For now, _recorder.buffer.mutex is used in only one place in the recorder thread and nowhere else.

Here is the code of recorder thread and its initialization:

static recorder_init_res_t _recorder_init(void)
{
	_recorder.queue = xQueueCreate(RECORDER_QUEUE_SIZE, sizeof(recorder_msg_t));
	if(_recorder.queue == NULL) return recorder_init_queue_error;

	_recorder.buffer.mutex = xSemaphoreCreateMutex();
	if(_recorder.buffer.mutex == NULL) return recorder_init_mutex_error;

	_recorder.new_measurement = xSemaphoreCreateBinary();
	if(_recorder.new_measurement == NULL) return recorder_init_semaphore_error;

	// Create recorder actuator task
	osThreadDef(recorder_acturator_thread, RecorderActuatorThread, osPriorityNormal, 0, 1024);
	osThreadCreate(osThread(recorder_acturator_thread), NULL);

	return recorder_init_ok;
}

void RecorderActuatorThread(void const * argument)
{
	osEvent event;

	while(1)
	{
		event = osMessageGet(_recorder.queue, 0);

		if(event.status == osEventMessage)
		{
			/* .... (state machine - has nothing to do with FreeRTOS) */
		}

		if(_recorder.recording)
		{
			if(xSemaphoreTake(_recorder.new_measurement, 10) == pdTRUE)
			{
				_write_frame_to_buffer(recorder_frame_type_measurement);
			}
		}
		else
		{
			 osThreadYield();
		}
	}
}

static void _write_frame_to_buffer(recorder_frame_type_t type)
{
	recorder_frame_settings_welding_t frame_settings_welding;
	recorder_frame_program_settings_t frame_program_settings;
	recorder_frame_finnish_t frame_finnish;
	recorder_frame_header_t header = {.type = type, .version = RECORDER_FRAME_VERSION};
	recorder_frame_t frame = {.header = header};
	uint8_t *data_ptr = NULL;
	uint16_t frame_size;

	_recorder.buffer.frames_nr++;

	switch(type)
	{
		case recorder_frame_type_welding_settings:
			frame.header.data_size = sizeof(recorder_frame_settings_welding_t);
			// fill frame here
			frame_settings_welding.temp = 1;
			data_ptr = (uint8_t *)&frame_settings_welding;
			break;

		case recorder_frame_type_program_settings:
			frame.header.data_size = sizeof(recorder_frame_program_settings_t);
			// fill frame here
			frame_settings_welding.temp = 2;
			data_ptr = (uint8_t *)&frame_program_settings;
			break;

		case recorder_frame_type_scan_results:
			frame.header.data_size = sizeof(ScanResults_t);
			data_ptr = (uint8_t *)&GenInfo.ScanResults;
			break;

		case recorder_frame_type_measurement:
			frame.header.data_size = sizeof(MeasList2CH_t);
			data_ptr = (uint8_t *)&GenInfo.Meas;
			break;

		case recorder_frame_type_finish:
			frame.header.data_size = sizeof(recorder_frame_finnish_t);
			frame_finnish.frames_nr = _recorder.buffer.frames_nr;
			frame_finnish.error = GenInfo.error;
			data_ptr = (uint8_t *)&frame_finnish;
			break;
	}

	if(data_ptr != NULL && frame.header.data_size > 0)
	{
		memcpy(frame.data, data_ptr, frame.header.data_size);
	}

	// NOTE: make sure that frame size is aligned to 4 bytes because if not - HardFault can occur
	frame_size = sizeof(recorder_frame_header_t) + frame.header.data_size;
	_write_to_buffer((uint8_t *)&frame, frame_size);
}

static void _write_to_buffer(uint8_t *ptr, uint32_t size)
{
	static uint64_t bytes_written = 0;
	if(_recorder.buffer.write_ptr + size > _recorder.buffer.end_ptr)
	{
		_error_handler(recorder_error_write_buffer_overflow);
		return;
	}

	xSemaphoreTake(_recorder.buffer.mutex, portMAX_DELAY);
	memcpy(_recorder.buffer.write_ptr, ptr, size);
	_recorder.buffer.write_ptr += size;
	bytes_written += size;
	xSemaphoreGive(_recorder.buffer.mutex);
}

And here is the function used for notifying about new measurement:

void recorder_new_measurement_notify(void)
{
	if(_recorder.recording)
	{
		xSemaphoreGive(_recorder.new_measurement);
	}
}

Now, I understand the concepts of the mechanisms I use in the code, but I am aware that I don’t know all the specifics of FreeRTOS. I am struggling to find the problem: when I debug the application, the workflow seems fine and all the pointers look OK, yet after two successful writes, xSemaphoreGive(_recorder.buffer.mutex) at the end of _write_to_buffer results in a hard fault and I have no idea why. The debugger is struggling to unwind the call stack properly when the hard fault occurs, so I am not posting it. I hope I have provided enough information for you to spot what I’m missing and help me out with it.

Have you checked for stack overflow? What are the stack sizes of the involved tasks?

The RecorderActuatorThread stack size is 1024 bytes. When I doubled it, nothing changed, so I assumed the problem is not there.
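
One way to check this beyond doubling the stack is the high-water-mark technique: FreeRTOS can fill a task stack with a known byte (0xA5) at creation, and uxTaskGetStackHighWaterMark() reports how much of that pattern has never been overwritten, in words. The following is only a host-side sketch of that idea, not FreeRTOS code. stack_paint and stack_unused_bytes are hypothetical helper names, and the sketch assumes a stack that grows downwards, as on Cortex-M.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same fill value FreeRTOS uses for task stacks (tskSTACK_FILL_BYTE). */
#define STACK_FILL_BYTE 0xA5u

/* Hypothetical helper: pre-fill a stack region with the known pattern. */
static void stack_paint(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL_BYTE, size);
}

/* Hypothetical helper: count the bytes at the low end of the region that
 * still hold the pattern. With a downward-growing stack the low addresses
 * are used last, so this is the amount of stack that was never touched. */
static size_t stack_unused_bytes(const uint8_t *stack, size_t size)
{
    size_t unused = 0;

    assert(stack != NULL);
    while (unused < size && stack[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

A result close to zero would suggest that the task has used almost all of its stack at some point, even if it has not visibly overflowed yet.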

What is the definition of your _recorder structure?

When you set a breakpoint on your xSemaphoreTake() function call, what are the contents of r0 the first and second times through vs. the third invocation?

static const recorder_buffer_t _buffer_init = {
		.ptr = (uint8_t *)RECORDER_BUFFER_ADDRESS,
		.end_ptr = (uint8_t *)(RECORDER_BUFFER_ADDRESS + RECORDER_BUFFER_SIZE),
		.write_ptr = (uint8_t *)RECORDER_BUFFER_ADDRESS,
		.read_ptr = (uint8_t *)RECORDER_BUFFER_ADDRESS,
		.frames_nr = 0,
};

static recorder_t _recorder = {.recording = 0, .buffer = _buffer_init};

RECORDER_BUFFER_ADDRESS is 0xC2E00000, which is a valid address in the external SDRAM on my board. RECORDER_BUFFER_SIZE is 1 MB. I double-checked that RecorderActuatorThread is the only thread accessing this region of memory.

As for the R0:

  1. R0 = 8
  2. R0 = 16
  3. R0 = 76

Sorry, I was asking for the structure definition, not the initialization.

Also, I was looking for the value of R0 at the time of the

bl [xSemaphoreGive]

instruction (you might want to relocate the breakpoint to the assembly window, not the code window). This is just to ensure that a valid pointer is being passed to the system call.

Just to make sure about the usage of the stack, are the following variables big in size?

recorder_frame_settings_welding_t frame_settings_welding;
recorder_frame_program_settings_t frame_program_settings;
recorder_frame_finnish_t frame_finnish;
recorder_frame_header_t header = {.type = type, .version = RECORDER_FRAME_VERSION};
recorder_frame_t frame = {.header = header};

Or are there any other big variables declared on the stack?

The RecorderActuatorThread stack size is 1024 bytes

Normally the size of the stack is expressed in units of words, also when calling xTaskCreate(). So do you mean 1024 bytes or 1024 words?
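
To make the difference concrete, here is a minimal sketch of the conversion. It assumes 32-bit stack words, which is the case on Cortex-M ports where StackType_t is a uint32_t. stack_word_t and stack_depth_to_bytes are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one stack word is 32 bits wide, as on Cortex-M. */
typedef uint32_t stack_word_t;

static_assert(sizeof(stack_word_t) == 4, "sketch assumes 4-byte stack words");

/* Convert a stack depth given in words to the number of bytes reserved. */
static size_t stack_depth_to_bytes(size_t depth_in_words)
{
    return depth_in_words * sizeof(stack_word_t);
}
```

Under that assumption a depth of 1024 means 4096 bytes if it is counted in words, so it matters which unit the wrapper you create the task with actually uses.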

Actually there is no bl [xSemaphoreGive].

Here is the disassembly part:

336       	xSemaphoreTake(_recorder.buffer.mutex, portMAX_DELAY);
08015742:   mov.w   r1, #4294967295 ; 0xffffffff
08015746:   ldr     r0, [r4, #32]
08015748:   bl      0x8009efc <xQueueSemaphoreTake>
337       	memcpy(_recorder.buffer.write_ptr, ptr, size);
0801574c:   mov     r2, r5
0801574e:   add     r1, sp, #12
08015750:   ldr     r0, [r4, #20]
08015752:   bl      0x80006a8 <memcpy>
338       	_recorder.buffer.write_ptr += size;
08015756:   ldr     r3, [r4, #20]
08015758:   add     r3, r5
0801575a:   str     r3, [r4, #20]
339       	bytes_written += size;
0801575c:   ldr     r3, [pc, #48]   ; (0x8015790 <_write_frame_to_buffer+232>)
0801575e:   ldrd    r0, r1, [r3]
08015762:   adds    r0, r0, r5
08015764:   adc.w   r1, r1, #0
08015768:   strd    r0, r1, [r3]
340       	xSemaphoreGive(_recorder.buffer.mutex);
0801576c:   movs    r3, #0
0801576e:   ldr     r0, [r4, #32]
08015770:   mov     r2, r3
08015772:   mov     r1, r3
08015774:   add     sp, #36 ; 0x24
08015776:   ldmia.w sp!, {r4, r5, lr}
0801577a:   b.w     0x8009a7c <xQueueGenericSend>
0801577e:   nop     

Where exactly should I locate the breakpoint?

To this location, please…

I didn’t know that. It seems it is 1024 words then.

As for the structures, the biggest one is around 20 bytes

Every time, the value of R0 is 0x2003D208.

ok, that looks right. So you are saying that after you hit that break point for the third time and then try to single step OVER the b.w instruction, you end up in a fault?

Something odd is happening… When I step through the disassembly, execution goes over that instruction and I can’t tell where exactly the hard fault occurs. It now looks like some asynchronous event, but I’m not sure.

well ok, so you may have been barking up the wrong tree. Your IDE may have trouble correctly decoding the fault stack frame, so you need to do that manually. Look at R13 at fault time and dump the topmost 32 bytes (the eight-word basic exception frame). Before that, check whether R13 matches R13_p or R13_m, which will tell you whether you came from an ISR or user mode. There are docs on this site that tell you how to decode the stack frame.
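
As a rough illustration of what decoding manually means: on exception entry a Cortex-M core pushes eight 32-bit words (32 bytes, without FPU state) onto the active stack, in the order r0, r1, r2, r3, r12, lr, pc, xPSR, lowest address first. The sketch below just maps a memory dump onto those names. fault_frame_t and decode_fault_frame are hypothetical names, and it assumes the basic frame without the extended floating-point context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic Cortex-M exception frame (no FPU state), lowest address first. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* LR of the interrupted code, not EXC_RETURN */
    uint32_t pc;    /* address of the instruction that was interrupted */
    uint32_t xpsr;
} fault_frame_t;

/* Hypothetical helper: interpret eight words dumped from the stack
 * pointer that was active when the fault was taken. */
static fault_frame_t decode_fault_frame(const uint32_t *sp_words)
{
    assert(sp_words != NULL);

    fault_frame_t frame = {
        .r0   = sp_words[0],
        .r1   = sp_words[1],
        .r2   = sp_words[2],
        .r3   = sp_words[3],
        .r12  = sp_words[4],
        .lr   = sp_words[5],
        .pc   = sp_words[6],
        .xpsr = sp_words[7],
    };
    return frame;
}
```

The stacked pc is usually the most useful field, since looking it up in the map file or disassembly shows which instruction was executing when the fault was raised.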

You MAY have too small an interrupt stack.

What R13 do you mean? From general purpose registers?

If you have access to R0, you should also see R13 in the same window in the IDE. On Cortex M Series, R13 maps to either R13_m or R13_p, so the “general purpose” R13 will always have the same value as one of the two. What IDE are you using?

I am using Eclipse. I only see R0 - R12 in the Registers view, that’s why I’m asking.

Is there something called SP or similar in the list?

Anyways, you should first study the Cortex fault stack frame documentation before proceeding.

The other thing to try before further fault analysis is to extend the interrupt stack. There are several pieces of documentation that explain how it’s done (normally through the linker command file).

There is sp and it has the same value as msp
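
A note on reading that: if the core is halted inside the fault handler, sp being equal to msp is expected by itself, because handler mode always runs on the main stack. Which stack holds the fault frame is indicated by bit 2 of the EXC_RETURN value that is in LR on handler entry. A minimal sketch of that check follows; EXC_RETURN_STACK_PSP_BIT and frame_is_on_psp are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of EXC_RETURN: 0 = frame was stacked on MSP, 1 = on PSP. */
#define EXC_RETURN_STACK_PSP_BIT (1u << 2)

/* Hypothetical helper: returns true if the exception frame was pushed
 * on the process stack, i.e. the fault interrupted task code using PSP. */
static bool frame_is_on_psp(uint32_t exc_return)
{
    return (exc_return & EXC_RETURN_STACK_PSP_BIT) != 0u;
}
```

For example, an LR of 0xFFFFFFFD in the handler means the frame should be read from psp, while 0xFFFFFFF9 or 0xFFFFFFF1 means it is on msp.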

ok, so your next step is to investigate the fault by looking at the fault stack. Again, there is a lot of documentation on the internet and in this forum about how to do that. I checked superficially and found this one:

Seems to be reasonable enough.

Good luck!
