Bug in the Win32 Port

davidefer wrote on Tuesday, April 09, 2019:

Hi Richard
I'm opening a new topic to highlight the problem.

The vPortExitCritical function in the Win32 port has a major bug:

void vPortExitCritical( void )
{
int32_t lMutexNeedsReleasing;

/* The interrupt event mutex should already be held by this thread as it was
obtained on entry to the critical section. */

lMutexNeedsReleasing = pdTRUE;

if( ulCriticalNesting > portNO_CRITICAL_NESTING )
{
	if( ulCriticalNesting == ( portNO_CRITICAL_NESTING + 1 ) )
	{
		ulCriticalNesting--;

		/* Were any interrupts set to pending while interrupts were
		(simulated) disabled? */
		if( ulPendingInterrupts != 0UL )
		{
			configASSERT( xPortRunning );
			SetEvent( pvInterruptEvent );

			/* Mutex will be released now, so does not require releasing
			on function exit. */
			lMutexNeedsReleasing = pdFALSE;
			ReleaseMutex( pvInterruptEventMutex );
		}
	}
	else
	{
		/* Tick interrupts will still not be processed as the critical
		nesting depth will not be zero. */
		ulCriticalNesting--;
       >>>>> lMutexNeedsReleasing = pdFALSE;  // this operation is missing!!!
	}
}

if( pvInterruptEventMutex != NULL )
{
	if( lMutexNeedsReleasing == pdTRUE )
	{
		configASSERT( xPortRunning );
		ReleaseMutex( pvInterruptEventMutex );
	}
}

}

The assignment lMutexNeedsReleasing = pdFALSE; is missing in the branch taken when ulCriticalNesting is greater than one.
As a result, the mutex is released while nested vPortEnterCritical calls are still outstanding, allowing other tasks to preempt even though they should not be able to.

davidefer wrote on Tuesday, April 09, 2019:

And here is another problem:

vPortGenerateSimulatedInterrupt is used to generate a context switch (yield) from within vPortEnterCritical/vPortExitCritical.
The original version of vPortGenerateSimulatedInterrupt released the mutex unconditionally, allowing other tasks to preempt while the original task was still in a critical section.
So the ReleaseMutex call has to move inside the if.

The change also forces a check of the critical nesting level before acquiring the mutex.
Here is the fixed version (I’ve replaced zero with portNO_CRITICAL_NESTING):

void vPortGenerateSimulatedInterrupt( uint32_t ulInterruptNumber )
{
configASSERT( xPortRunning );

if( ( ulInterruptNumber < portMAX_INTERRUPTS ) && ( pvInterruptEventMutex != NULL ) )
{
    // we do need a test on critical nesting!
	if (ulCriticalNesting == portNO_CRITICAL_NESTING)
	{
		WaitForSingleObject(pvInterruptEventMutex, INFINITE);
	}
	ulPendingInterrupts |= ( 1 << ulInterruptNumber );

	/* The simulated interrupt is now held pending, but don't actually
	process it yet if this call is within a critical section.  It is
	possible for this to be in a critical section as calls to wait for
	mutexes are accumulative. */
	if( ulCriticalNesting == portNO_CRITICAL_NESTING )
	{
		SetEvent( pvInterruptEvent );
   		>>> ReleaseMutex( pvInterruptEventMutex ); // this must go here
	}
}

}

davidefer wrote on Wednesday, April 10, 2019:

We have discovered another problem in the port that causes task deadlocks, particularly in the function ulTaskNotifyTake:

				/* All ports are written to allow a yield in a critical
				section (some will yield immediately, others wait until the
				critical section exits) - but it is not something that
				application code should ever do. */
				portYIELD_WITHIN_API();
			}
			else
			{
				mtCOVERAGE_TEST_MARKER();
			}
		}
		else
		{
			mtCOVERAGE_TEST_MARKER();
		}
	}
	taskEXIT_CRITICAL();

	taskENTER_CRITICAL();
	{

The original taskEXIT_CRITICAL was assuming that Windows performed a task switch, allowing the FreeRTOS interrupt management to take place and swap out the current task.
Well, at least in our simulation, sometimes it didn’t happen!
It seems that Windows sometimes delays the context switch.
The taskENTER_CRITICAL was then performed immediately, blocking the context switch and deadlocking the current task.

The solution is to wait until Windows has performed the context switch. I tried Sleep(0), but it didn’t work. Sleep(1) worked, but it would mess up the FreeRTOS scheduler.
The only solution I’ve found is to wait for the interrupt management to occur.
(See also this thread: https://stackoverflow.com/questions/5848448/forcing-context-switch-in-windows.)

I attach our current port implementation.
I’ve fixed all the problems mentioned above and marked the changes with #PORTCHANGE

I hope that these changes will be revised and if eligible integrated in the FreeRTOS official release.

port.c (23.9 KB)

dukb wrote on Wednesday, April 10, 2019:

Which version of the file did you show here?
The original source of the 10.2.0 port.c file for MSVC-MingW is copied below. It differs from the code you posted.
Are you going to include all the changes in the next FreeRTOS version?
OR
Can you gather all the changes you are currently working on into a patch (or a branch from the repo)?
Sorry, I’ve just found out that the whole port.c is in the next post :slight_smile:

void vPortGenerateSimulatedInterrupt( uint32_t ulInterruptNumber )
{
configASSERT( xPortRunning );

if( ( ulInterruptNumber < portMAX_INTERRUPTS ) && ( pvInterruptEventMutex != NULL ) )
{
	/* Yield interrupts are processed even when critical nesting is
	non-zero. */
	WaitForSingleObject( pvInterruptEventMutex, INFINITE );
	ulPendingInterrupts |= ( 1 << ulInterruptNumber );

	/* The simulated interrupt is now held pending, but don't actually
	process it yet if this call is within a critical section.  It is
	possible for this to be in a critical section as calls to wait for
	mutexes are accumulative. */
	if( ulCriticalNesting == 0 )
	{
		SetEvent( pvInterruptEvent );
	}

	ReleaseMutex( pvInterruptEventMutex );
}

}

davidefer wrote on Wednesday, April 10, 2019:

I took the port from version V10.2.0 and made my modifications. We are actually still working with V10.0.1, but the port is compatible.

dukb wrote on Tuesday, April 16, 2019:

I found one bug in the port.c davidefer posted:
In vPortExitCritical(), pvInterruptEventMutex is now not released if ulCriticalNesting is above 1 (nesting is allowed, so interrupts are only re-enabled when it reaches 0). It’s at line number 659:
lMutexNeedsReleasing = pdFALSE; // #PORTCHANGE mutex cannot be released in critical section
BUT
in vPortEnterCritical this mutex is locked more than once (in the case of nested critical sections). Line number 605:

if( xPortRunning == pdTRUE )
{
	/* The interrupt event mutex is held for the entire critical section,
	effectively disabling (simulated) interrupts. */
	WaitForSingleObject( pvInterruptEventMutex, INFINITE );
	ulCriticalNesting++;
}

So whenever nested critical sections occur, there is a deadlock.
The solution is:

if( ulCriticalNesting == portNO_CRITICAL_NESTING )
{
	WaitForSingleObject( pvInterruptEventMutex, INFINITE );
}
ulCriticalNesting++;

The other question is whether the change in the exit-critical section is needed at all. Based on the ReleaseMutex documentation https://docs.microsoft.com/en-us/windows/desktop/api/synchapi/nf-synchapi-releasemutex :
“A thread can specify a mutex that it already owns in a call to one of the wait functions without blocking its execution. This prevents a thread from deadlocking itself while waiting for a mutex that it already owns. However, to release its ownership, the thread must call ReleaseMutex one time for each time that it obtained ownership (either through CreateMutex or a wait function).”
It looks like the nesting mechanism is already implemented inside the mutex itself.
So the previous version could work correctly, since every call of vPortEnterCritical makes a lock attempt on the mutex and every vPortExitCritical makes an unlock attempt.

Does anyone know if the changes will be included in next official FreeRTOS release?

davidefer wrote on Tuesday, April 16, 2019:

So, according to what you say, the whole ulCriticalNesting management could be deleted? Or at least in vPortGenerateSimulatedInterrupt.
Anyway, it is quite strange, because I’ve let my app run for hours without deadlocks.

dukb wrote on Tuesday, April 16, 2019:

@davidefer I think ulCriticalNesting is still needed, to inform vPortGenerateSimulatedInterrupt that interrupts are disabled so it does not wait for the mutex in that case. I think incrementing it on every vPortEnterCritical is not necessary, though; it could be a bool.
But this needs to be tested, since it’s not described clearly in the MS documentation.
Look for example at https://docs.microsoft.com/en-us/windows/desktop/api/processthreadsapi/nf-processthreadsapi-resumethread
It clearly says that ResumeThread decreases the suspend count and returns the value from before the change. The thread runs again once the suspend count reaches 0.

You didn’t observe it… hmmm.
Are you sure you called vPortEnterCritical at least twice in a row (without calling vPortExitCritical)?
Could you place a breakpoint in the else branch of the
if( ulCriticalNesting == ( portNO_CRITICAL_NESTING + 1 ) ) statement?

davidefer wrote on Tuesday, April 16, 2019:

It seems you are right: I have no nesting. I’ll let a test run overnight with your mod. Thank you!

davidefer wrote on Wednesday, April 17, 2019:

It seems that most of the changes I made concerning the mutex management are unnecessary, thanks to @dukb.

Attached is my current port version. I’ve removed the unnecessary changes but kept my bug fixes.

port.c (23.2 KB)

rtel wrote on Wednesday, April 17, 2019:

Thanks - there are quite a few changes to look through here.

wtodd wrote on Thursday, April 18, 2019:

I also noticed the problem of portYIELD not yielding immediately, causing the scheduler to lock up, but my fix takes a different approach from davidefer's.

First, I made each task store its thread state in a thread-local storage slot. This allows the task to access its own thread state and lets you check if a thread is a FreeRTOS task or just a normal Windows thread.

I then added a Windows Event object to xThreadState. When a task exits a critical section with interrupts pending, it waits on this event object until prvProcessSimulatedInterrupts signals that it is done processing the interrupts. If an interrupt triggers a context switch, the task will continue to wait on this event until it is scheduled to run again.

Then, I modified vPortGenerateSimulatedInterrupt so it just enters a critical section, sets the interrupt flag and event, then exits the critical section. Now when a task calls vPortGenerateSimulatedInterrupt, it will wait for interrupts to finish before returning from the final call to portEXIT_CRITICAL.

I also added a tickless idle implementation that reduces the cpu usage of the windows port.

port.c (24.4 KB)

rtel wrote on Saturday, July 06, 2019:

Sorry it's been a while. Looking at the code now, I’m not sure I see the issues. There is, however, an issue in the port.c you posted: the loop waiting for the interrupt counter to change in the exit-critical function causes a hang when the trace recorder is used from the tick interrupt, as the recorder uses critical sections inside the tick.

There is a call to GetThreadContext() in prvProcessSimulatedInterrupts() designed to wait until the thread that was suspended is actually in the suspended state. I have tried setting that to a loop to ensure it does indeed wait until the subject thread is suspended but the only time I’ve seen GetThreadContext() fail is when the subject thread has deleted itself - so the handle passed to GetThreadContext() is NULL.

I will continue looking at this.

[EDIT] See https://sourceforge.net/p/freertos/discussion/382005/thread/d8a09bc895/#4231

rtel wrote on Sunday, July 07, 2019:

Ok, I think I can replicate it now. Curiously, it was having the trace recorder code included in the build that was masking the issue, presumably because that has its own critical sections.

I’m pretty confident that the latest simple update (which will be checked in soon, but at the time of writing SVN is not allowing a secure connection) fixes the issue. The update just allocates an event for each task; then, each time the task leaves the blocked state by means other than the tick interrupt, the first thing it does next is block on its own event. If the task continues running for a bit (because SuspendThread() is asynchronous), it won’t get past blocking on the event. When the task is resumed, the event is signalled, freeing the task again.

davidefer wrote on Monday, July 08, 2019:

Ok, thank you, we will check your solution as soon as it is checked in.

rtel wrote on Monday, July 08, 2019:

So it turns out that when I thought I had managed to replicate the issue reported here it was actually a bug in the application code I was testing - once that was fixed I’m back not being able to replicate the issue. Can you please tell me which version of Windows you are using, and also confirm that you are not making ANY windows system calls from inside a FreeRTOS task (including writes to disk and writes to the console - although you can get away with these if they are infrequent and only done from one task as per the demo app).

However, the changes I made to the port.c file are probably valuable to keep just in case. I’m not going to check them in yet because the soak test failed last night. I’m almost certain the failure was a false positive, but nonetheless I will continue to test before checking in. I have attached the updated port.c file to this post; you can compare it with the head revision in SourceForge to see the changes. Please let me know if it solves your problem (assuming the problem is not caused by using Windows system calls).

port.c (23.7 KB)

davidefer wrote on Tuesday, July 09, 2019:

Hi Richard, concerning the Win32 calls, as stated here we do actually perform them but protected by the taskENTER_CRITICAL(); Win32call; taskEXIT_CRITICAL(); sequence.

Your port seems good until now, it has run for some hours without problems. I’ll check it with a different project and let you know tomorrow the results.

wtodd wrote on Tuesday, July 09, 2019:

Richard, I have some concerns about the solution you posted.

  1. It only fixes the problem for the yield interrupt. If any other simulated interrupt causes a context switch, the suspended thread could continue to run for a while after the call to vPortGenerateSimulatedInterrupt.
  2. You’re using ulPendingInterrupts to check if you should wait on the yield event object AFTER releasing the mutex. If the simulated interrupt handler preempts the thread and clears the flags before you reach the check, the task will never actually wait on the event. You should either make a local copy before releasing the mutex, or create a boolean indicating that the thread should wait on the event.
  3. Why do you wait on the event in a loop with a 0 timeout? You can just use ResetEvent, right?
  4. The fix assumes that the thread calling vPortGenerateSimulatedInterrupt or vPortExitCritical is the currently scheduled FreeRTOS task. If it’s a normal Windows thread instead, it will access the yield event of the currently scheduled FreeRTOS task, which could cause it to unblock that FreeRTOS task early. This is why I used a thread-local storage slot in my solution. It allows you to access the thread state of the currently executing thread (not just the currently scheduled thread), which ensures you’re using the correct yield event object. And, in the case of a normal Windows thread, it will be NULL, so you know not to wait on an event. It also provides an easy way to check if a thread is running when it shouldn’t be (the thread state is different than pxCurrentTCB).
  5. There is no need to duplicate the yield event waiting code in vPortGenerateSimulatedInterrupt and vPortExitCritical. Just wrap the code inside vPortGenerateSimulatedInterrupt with vPortEnterCritical/vPortExitCritical instead of manually taking the mutex.
  6. You don’t call CloseHandle on the yield event when deleting the thread.

In regards to calling Windows system calls from a FreeRTOS task: I know your official position is to just not do that, but at the same time you break this rule in your own demos. From what I can tell, it’s not safe to call any external function (even standard C library functions) that could make a system call without first entering a critical section. Otherwise, the FreeRTOS scheduler could try to suspend the thread during the syscall, which can cause SuspendThread to fail (or not suspend immediately). If it does suspend the thread successfully, it can cause a deadlock if another thread calls a system function before the first has a chance to finish (if the first had acquired a mutex before being suspended).

I suspect that SuspendThread can fail even when the program does not call system functions from FreeRTOS tasks, because the Windows port itself does so. Namely, when entering/exiting a critical section, it acquires and releases the interrupt mutex. If the scheduler tries to suspend the thread at the wrong point inside WaitForSingleObject or ReleaseMutex, the call to SuspendThread can either fail to suspend the thread immediately or, in some cases, fail to suspend the thread at all. That’s why it is important to block the FreeRTOS task on an event inside vPortExitCritical if any interrupt is pending, in case the interrupt causes a context switch and SuspendThread fails.

From my testing, with my fix in place and with all system calls wrapped in a critical section, we can safely interact with standard Windows threads without relying on lockless data structures or using the Windows thread priority as a synchronization mechanism. We can safely call taskENTER_CRITICAL/taskEXIT_CRITICAL from Windows threads to synchronize access to memory that is shared with a FreeRTOS task. We can interact with Windows threads from FreeRTOS tasks by calling Windows API functions from a critical section. We can interact with FreeRTOS tasks from Windows threads using the FreeRTOS ISR API functions (provided we enter a critical section first and manually call portYIELD if it is requested).

rtel wrote on Tuesday, July 09, 2019:

Richard, I have some concerns about the solution you posted.

Thanks for your feedback.

  1. It only fixes the problem for the yield interrupt. If any other
    simulated interrupt causes a context switch, the suspended thread
    could continue to run for a while after the call to
    vPortGenerateSimulatedInterrupt.

Skipping this one for now, as it will need more thought.

[edit1] This is basically the same case as a tick interrupt resulting in a context switch. When making the changes posted above, I decided not to treat the tick as a critical case: a tick interrupt can occur at any time, so if the context switch (the task being suspended) occurred a little late, it would not cause a fatal error, because the task being interrupted is unaware of it anyway. This is a very different scenario from a task yielding because it is entering the blocked state. If a task fails to enter the blocked state when it yields, there will be critical logic errors in the program. For example, if a task is blocking on a queue but continues past the block point, the queue logic won’t work and anything might happen. The Win32 port is not ‘real time’ in any case, but an approximation that should behave logically the same, though not necessarily temporally the same.

  2. You’re using ulPendingInterrupts to check if you should wait on the
    yield event object AFTER releasing the mutex. If the simulated
    interrupt handler preempts the thread and clears the flags before
    you reach the check, the task will never actually wait on the event.
    You should either make a local copy before releasing the mutex, or
    create a boolean indicating that the thread should wait on the event.

Can you please give me a line number (from the file I posted).

[edit 1] Fixed.

  3. Why do you wait on the event in a loop with a 0 timeout? You can
    just use ResetEvent, right?

…because I didn’t know about ResetEvent(), will change.

  4. The fix assumes that the thread calling
    vPortGenerateSimulatedInterrupt or vPortExitCritical is the
    currently scheduled FreeRTOS task. If it’s a normal Windows thread
    instead, it will access the yield event of the currently scheduled
    FreeRTOS task,

If I understand you correctly, you are describing something that is not
valid to do. Normally scheduled Windows threads cannot access FreeRTOS
scheduler operations, and Windows threads that are running FreeRTOS
tasks cannot make Windows system calls; the two are just not logically
compatible.

which could cause it to unblock that FreeRTOS task
early. This is why I used a thread-local storage slot in my
solution. It allows you to access the thread state of the currently
executing thread (not just the currently scheduled thread), which
ensures you’re using the correct yield event object. And, in the
case of a normal Windows thread, it will be NULL, so you know not to
wait on an event. It also provides an easy way to check if a thread
is running when it shouldn’t be (the thread state is different than
pxCurrentTCB).

Maybe I’m just not following this properly, but doing this kind of thing
could be the root cause of your issues, and be why I can’t replicate them.

  5. There is no need to duplicate the yield event waiting code in
    vPortGenerateSimulatedInterrupt and vPortExitCritical. Just wrap the
    code inside vPortGenerateSimulatedInterrupt with
    vPortEnterCritical/vPortExitCritical instead of manually taking the
    mutex.
  6. You don’t call CloseHandle on the yield event when deleting the thread.

Will fix.

In regards to calling Windows system calls from a FreeRTOS task. I know
your official position is to just don’t do that, but at the same time
you break this rule in your own demos.

In the standard demo we get away with this because we write to stdout
very infrequently and from only one task. You will note that the call
to kbhit() is normally commented out, because that causes issues too.

The TCP/IP examples on the other hand write to the console rapidly, so
we don’t use printf() at all as it will soon crash if we do. Instead we
send the strings to print to a Windows thread that is not under the
control of the FreeRTOS kernel and print them out from there. In that
case the strings are sent in a circular RAM buffer to avoid sharing any
FreeRTOS primitives with the Windows thread. I think the thread running
kernel code does signal the Windows thread somehow (I forget how), but
in a non-blocking way.

From what I can tell, it’s not
safe to call any external function (even standard c library functions)
that could make a system call, without first entering a critical
section.

Entering a critical section may work, but I don’t know enough about the
inner workings of Windows to be sure.

Otherwise, the FreeRTOS scheduler could try to suspend the
thread during the syscall, which can cause SuspendThread to fail (or not
suspend immediately). If it actually does suspend the thread
successfully, it can cause a deadlock if another thread calls an system
function before the first has a chance to finish (if the first had
acquired a mutex before being suspended).

I suspect that SuspendThread can fail even when the program does not
call system functions from FreeRTOS threads, because the Windows port
itself does so. Namely, when entering/exiting a critical section, it
acquires and releases the interrupt mutex. If the scheduler tries to
suspend the thread at the wrong point inside WaitForSingleObject or
ReleaseMutex, it can cause the call to SuspendThread to either fail to
suspend the thread immediately, or in some cases, fail to suspend the
thread at all. That’s why it is important to block the FreeRTOS task on
an event inside vPortExitCritical if any interrupt is pending, in case
the interrupt causes a context switch and SuspendThread fails.

Again need to consider this point more.

From my testing, with my fix in place and with all system calls wrapped
in a critical section, we can safely interact with standard Windows
threads, without relying on lockless data structures or using the
Windows thread priority as synchronization mechanism. We can safely call
taskENTER_CRITICAL/taskEXIT_CRITICAL from Windows threads to synchronize
access to memory that is shared with a FreeRTOS task. We can interact
with Windows threads from FreeRTOS tasks by calling Windows API
functions from a critical section. We can interact with FreeRTOS tasks
from Windows threads using the ISR FreeRTOS api functions.

I only tried the code posted by davidefer, which for me locked up right
away, I think because the trace recorder code was using critical
sections from inside the interrupt. I need to study your code in more
detail as it sounds like it has some goodness - I did diff your file
with the released code over the weekend but there were too many changes
to look through.

Good discussion!

wtodd wrote on Tuesday, July 09, 2019:

I have also found a related bug in the Windows port. The scheduler can cause a deadlock if it suspends a thread before it has fully initialized (i.e. before it calls into the specified thread function). I suspect that the Windows thread initialization code acquires some mutex, which other system functions also try to acquire. If the thread is suspended after acquiring the mutex but before releasing it, and then another thread tries to acquire the same mutex, the program will deadlock.

I’ll attach my solution, which is just to wait for the thread to fully initialize before returning from pxPortInitialiseStack. It also includes my previous changes and some changes to fix the tick accuracy, which you may find useful.

port.c (25.7 KB)