Unbalanced uxSchedulerSuspended in x86 simulator, FreeRTOS version 10.0.0

rasty1 wrote on Thursday, July 25, 2019:

Hi
I get an assertion in vTaskSleep because uxSchedulerSuspended equals 1.

It happens rarely in native execution, but pretty fast when the application runs under drmemory.exe.

Has anyone faced something similar?

Thanks
Rasty

rasty1 wrote on Thursday, July 25, 2019:

It appears that ++uxSchedulerSuspended and --uxSchedulerSuspended are not atomic with the MinGW compiler!
0047a76c: call 0x47faeb
2121 --uxSchedulerSuspended;
0047a771: mov 0x539524,%eax
0047a776: dec %eax
0047a777: mov %eax,0x539524

I’ll try __sync_fetch_and_add and __sync_fetch_and_sub
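
For reference, here is a minimal sketch of what those GCC builtins do. It assumes a stand-in counter named uxSuspendedDemo (hypothetical; the real uxSchedulerSuspended is a file-scope variable in tasks.c), and the function names are made up for illustration. The plain ++ form may compile to the separate load, modify, store sequence shown in the disassembly above, while __sync_fetch_and_add and __sync_fetch_and_sub perform the read-modify-write as one atomic operation and return the value held before the update.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel counter, for illustration only. */
volatile uint32_t uxSuspendedDemo = 0;

/* Plain increment: the compiler is free to emit load / inc / store,
   so another thread can slip in between the load and the store. */
void suspend_plain(void)
{
    ++uxSuspendedDemo;
}

/* Atomic increment: one indivisible read-modify-write.
   Returns the value the counter held BEFORE the increment. */
uint32_t suspend_atomic(void)
{
    return __sync_fetch_and_add(&uxSuspendedDemo, 1);
}

/* Atomic decrement: returns the value held BEFORE the decrement. */
uint32_t resume_atomic(void)
{
    return __sync_fetch_and_sub(&uxSuspendedDemo, 1);
}
```

An atomic counter update only protects the counter itself; any other state that must change together with it still needs its own protection.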

rasty1 wrote on Thursday, July 25, 2019:

I read the article http://www.freertos.org/FreeRTOS_Support_Forum_Archive/September_2013/freertos_Concerns_about_the_atomicity_of_vTaskSuspendAll_d165e9c3j.html

I understand the idea, but the fact is that uxSchedulerSuspended is left at “1”.
Wrapping ++uxSchedulerSuspended in taskENTER_CRITICAL()/taskEXIT_CRITICAL() helps.

Maybe this is because, unlike on an MCU, the simulated critical section does not really disable interrupts?

rtel wrote on Thursday, July 25, 2019:

Are you using any advanced compiler settings, such as link time
optimisation?

rtel wrote on Thursday, July 25, 2019:

Should have asked - are you using Visual Studio or GCC (the Windows port builds with both). My question related to GCC.

Also, which FreeRTOS version are you using? The latest port files in SVN (not yet released) have a few modifications to the WinSim port to try to correct some reports we get of misbehaviour, although we are not able to replicate the behaviour ourselves.

Finally - make sure you are not making any Windows system calls from FreeRTOS tasks, as that can accidentally put a thread into a blocked state when the FreeRTOS scheduler thinks it is still running - which in turn can cause logic errors.

rasty1 wrote on Thursday, July 25, 2019:

I use MinGW GCC with debug settings. The FreeRTOS version is 10.0.0.
It happens pretty fast if I feed the application to https://drmemory.org/

Nothing special.

  1. A high-priority task gives several semaphores to several low-priority tasks and calls TaskSleep(1).
  2. The low-priority tasks wake up, do some work, and pend on a semaphore again.
  3. One low-priority task uses getch()/putc() for FreeRTOS CLI (maybe not quite suitable for the RTOS port because getch() is blocking in Win32, but I did not find a better way to do console I/O).

rasty1 wrote on Thursday, July 25, 2019:

I opened another thread a while ago. I need some way to tell the RTOS that a Windows call is going to block (enter/leave blocking call). Otherwise I cannot implement sockets, serial or console I/O.

rtel wrote on Thursday, July 25, 2019:

I opened another thread while ago. Need some way to tell RTOS that
Windows call is going to block (enter/leave blocking call). Otherwise I
cannot implement sockets, serial or console I/O.

Ah yes, there is another open thread about this, partially implemented
as per the head revision in SVN at the moment. I have not been able to
get the other changes to work yet - I was using Visual Studio rather
than GCC.

Ref the Dr Memory thing - does it happen when you don’t use Dr Memory,
and just run natively?

Ref memory use, note there is a known issue with a resource not being
freed with the way tasks are killed in the Windows simulator at the
moment. That is documented, and will only impact if you create and
delete tasks quickly over an extended period.

rasty1 wrote on Thursday, July 25, 2019:

I had random problems in native runs as well, mainly semaphore timeouts or just hanging.
Initially I blamed Win32 API calls that change heuristics, or memory corruption. A few days ago I started using drmemory to clean up the code. Once the major defects were fixed I started to get the assertion in TaskSleep, and found that uxSchedulerSuspended was left at “1”.

It happens pretty fast.

Memory leak does not bother me.

rasty1 wrote on Friday, July 26, 2019:

I think there is a weakness in the simulator, because taskENTER_CRITICAL does not prevent a task switch.

Please consider the following scenario:

BaseType_t xTaskResumeAll( void )
{
TCB_t *pxTCB = NULL;
BaseType_t xAlreadyYielded = pdFALSE;

/* If uxSchedulerSuspended is zero then this function does not match a
previous call to vTaskSuspendAll(). */
configASSERT( uxSchedulerSuspended );

/* It is possible that an ISR caused a task to be removed from an event
list while the scheduler was suspended.  If this was the case then the
removed task will have been added to the xPendingReadyList.  Once the
scheduler has been resumed it is safe to move all the pending ready
tasks from this list into their appropriate ready list. */
taskENTER_CRITICAL();
{
    Task A                                      Task B
    --uxSchedulerSuspended;
    load A:  reg <-- uxSchedulerSuspended (1)
                    -- task switch -->
                                                load B:      reg <-- uxSchedulerSuspended (1)
                                                increment B: reg = 2
                    <-- task switch --
    decrement A: reg = 0
    store A: uxSchedulerSuspended <-- 0
                    -- task switch -->
                                                store B: uxSchedulerSuspended <-- 2
    Result: uxSchedulerSuspended = 2
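
The interleaving above can be replayed deterministically in plain C. This is a sketch, not kernel code: the names task_ctx, replay_lost_update and replay_serialized are hypothetical, and each task's CPU register is modelled as a struct field so that the load, modify and store steps can be ordered by hand.

```c
/* Each simulated task has its own private copy of the CPU register. */
typedef struct {
    unsigned reg;
} task_ctx;

/* Stand-in for uxSchedulerSuspended (hypothetical, illustration only). */
unsigned shared_counter;

void load_counter(task_ctx *t)        { t->reg = shared_counter; }
void store_counter(const task_ctx *t) { shared_counter = t->reg; }

/* Task A decrements while Task B increments, with the steps interleaved
   exactly as in the scenario: B's store overwrites A's store. */
unsigned replay_lost_update(void)
{
    task_ctx a, b;

    shared_counter = 1;   /* Task A suspended the scheduler earlier */
    load_counter(&a);     /* A: reg = 1 */
    load_counter(&b);     /* B: reg = 1 (after a task switch) */
    b.reg++;              /* B: reg = 2 */
    a.reg--;              /* A: reg = 0 (after switching back) */
    store_counter(&a);    /* counter = 0 */
    store_counter(&b);    /* counter = 2, A's decrement is lost */
    return shared_counter;
}

/* The same two operations with no interleaving inside either one. */
unsigned replay_serialized(void)
{
    task_ctx a, b;

    shared_counter = 1;
    load_counter(&a); a.reg--; store_counter(&a);   /* counter = 0 */
    load_counter(&b); b.reg++; store_counter(&b);   /* counter = 1 */
    return shared_counter;
}
```

In the interleaved case the counter ends at 2 instead of 1, so after Task B's matching resume it stays at 1, which is the stuck value reported earlier in the thread.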

rtel wrote on Friday, July 26, 2019:

It is possible you are seeing an issue where a task that is swapped out
continues to run for a little while - that is the issue I was referring
to that other people see but we are not able to duplicate and can occur
because swapping a task out in the simulator is just a matter of
suspending its associated Windows thread but suspending a thread in
Windows does not necessarily occur immediately. That is why the head
revision in SVN adds additional synchronisation objects into the context
switch - after a thread is suspended the next thing the thread does is
wait on a synchronisation object in case it continued to run - the
synchronisation object being intended to stop it in its tracks. Done
here (only in head revision):
https://sourceforge.net/p/freertos/code/HEAD/tree/trunk/FreeRTOS/Source/portable/MSVC-MingW/port.c#l661

In this case though I’m not sure that can be the case so I would like to
understand better why you think a critical section does not prevent a
task switch - critical sections use synchronisation objects too, and
only one thread can hold the object at a time:

https://sourceforge.net/p/freertos/code/HEAD/tree/trunk/FreeRTOS/Source/portable/MSVC-MingW/port.c#l605

…is where the object is taken, which effectively acts like disabling
interrupts where this running on an MCU:

https://sourceforge.net/p/freertos/code/HEAD/tree/trunk/FreeRTOS/Source/portable/MSVC-MingW/port.c#l382

rasty1 wrote on Saturday, July 27, 2019:

A critical section prevents races, but in this case the increment is not protected by a critical section; only the decrement is guarded. The CPU instruction sequence is not atomic either.
I do not fully understand the mapping of the RTOS to Windows. On an MCU, a critical section disables interrupts and ensures that nothing else preempts it; that is clear to me. Under Windows, a critical section is a semaphore. Since the Windows scheduler is not strictly preemptive, it is possible that a task is swapped in and out in the middle of a critical section and the result of the decrement is overwritten.
Again, I’m not sure that I fully understand the mapping of FreeRTOS to Win32.

rtel wrote on Saturday, July 27, 2019:

I will look at the asm code you posted again, but not at my computer at the moment.

rasty1 wrote on Sunday, July 28, 2019:

I did a few more tests.
I found that:

  1. The simulator hangs due to a loop with putchar().
    Scenario: a task (priority 1, one above idle) pends in getchar(), then prints a long string after Enter.
    TaskDelay(0) after each putchar() prevents the hang.
    It can be reproduced when CLI I/O maps to Win32 putchar()/getchar().
  2. The modification to vTaskSuspendAll() (below) solved the unbalanced uxSchedulerSuspended under Dr. Memory.
  3. Just __sync_fetch_and_add(&uxSchedulerSuspended,1) without taskENTER_CRITICAL() is not sufficient.

I believe that it is somehow related to calls to blocking Win32 APIs.
I admit that this is not recommended, but unfortunately I did not find anything better for console I/O and sockets.

void vTaskSuspendAll( void )
{
/* A critical section is not required as the variable is of type
BaseType_t. Please read Richard Barry’s reply in the following link to a
post in the FreeRTOS support forum before reporting this as a bug! -
http://goo.gl/wu4acr */
taskENTER_CRITICAL();
__sync_fetch_and_add(&uxSchedulerSuspended,1);
taskEXIT_CRITICAL();
}

rtel wrote on Sunday, July 28, 2019:

If I understand correctly you are making Windows system calls from FreeRTOS tasks, which is something that is known not to be supported for this reason. If you want to print characters to the console then look at how vLoggingPrintf() is implemented in the FreeRTOS+TCP demos that run in the Windows simulator.
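
One general way to keep blocking console calls out of FreeRTOS tasks is to make the blocking call on a native thread that the FreeRTOS scheduler does not manage, and hand characters over through a buffer that the task only polls. The sketch below shows just the buffer part, as a single-producer, single-consumer ring using C11 atomics. It is an assumption-laden illustration, not a description of how vLoggingPrintf() is actually implemented: char_ring, ring_put, ring_get and console_rx are hypothetical names, and creating the native thread is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 256u   /* power of two, so index wrap-around stays consistent */

typedef struct {
    char data[RING_SIZE];
    atomic_size_t head;   /* advanced only by the producer (native thread) */
    atomic_size_t tail;   /* advanced only by the consumer (FreeRTOS task) */
} char_ring;

/* Zero-initialised: empty ring. Hypothetical receive buffer for the console. */
char_ring console_rx;

/* Producer side: called from the thread that makes the blocking call.
   Never blocks; returns false if the ring is full. */
bool ring_put(char_ring *r, char c)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE) {
        return false;
    }
    r->data[head % RING_SIZE] = c;
    /* Publish the byte before making the new head visible. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: polled from the FreeRTOS task.
   Never blocks; returns false if the ring is empty. */
bool ring_get(char_ring *r, char *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail) {
        return false;
    }
    *out = r->data[tail % RING_SIZE];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

With this split, the task would call ring_get() and, when it returns false, delay with the normal FreeRTOS API instead of sitting inside a Win32 call, so the scheduler's view of which task is running stays accurate.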

rasty1 wrote on Sunday, July 28, 2019:

Yes, I know that it is not supported. I assume that is because the RTOS does not know that the API call is blocking. I thought it would work out if I made all Windows calls from low-priority tasks at the same priority (allowing time-slicing).
Everything worked just fine at the beginning, but when the application got bigger (more tasks and more CPU load) I found the strange effects that I describe in this thread.