ulTaskNotifyTake timeout accuracy

Since your code never suspends the scheduler, your code should never see a nonzero value in uxPendedTicks. (Well, your ISRs might see it if they happen to interrupt FreeRTOS code while it has the scheduler suspended.) So if you noticed uxPendedTicks is 1 when ulTaskNotifyTake() returns, something is wrong. But perhaps you mean that uxPendedTicks had been 1 at some point during the timeout. That would be normal.

Anyway I still feel confident that you can’t get a timeout < 4 ticks if you request 5 ticks.

If you end up using a scope to watch the ISR that is supposed to execute every 4ms, please let us know what you find.

Yes, uxPendedTicks had been 1 at some point during the timeout. since you agreed that was normal, then if that tick was deferred at the beginning of the ulTaskNotifyTake() long enough, the first and second tick could be counted shortly after the call, and made the timeout close to 2 ticks less than what was specified.

See the captured picture using the IAR I-Jet probe and its ITM_EVENT feature, on the 4th line, 0x0000 - when the external interrupt asserted, 0x0001 - before calling ulTaskNotifyTake(), 0x0002 - ulTaskNotifyTake() returned in 3.1ms. I also verified that using the TIM2 timer. I think those measurements are equivalent to the scope measurement.

Here’s a “normal” case for uxPendedTicks == 1:

  • Task A calls ulTaskNotifyTake() with xTickCount = 0 and xPendedTicks = 0 (and no tick ISR pending either)
  • ulTaskNotifyTake() decides to wake Task A at xTickCount == 5
  • Task B calls FreeRTOS API function that suspends scheduler
  • Tick occurs (xTickCount = 0, xPendedTicks = 1)
  • FreeRTOS function for Task B resumes scheduler (xTickCount = 1, xPendedTicks = 0)
  • Tick occurs (xTickCount = 2, xPendedTicks = 0)
  • ulTaskNotifyTake() finally returns (xTickCount >= 5, xPendedTicks = 0)

It just can’t make your timeout end early. You can play with the order of things to make sure.

Is there any chance you’re calling ulTaskNotifyTake() with interrupts masked? eg, portDISABLE_INTERRUPTS(), taskDISABLE_INTERRUPTS(), *PortRaiseBASEPRI(), etc. That would be one way that the timeout would be too short but the xTickCount delta would still be right.

I added more traces to the code, and now have better explanation of the issue.

ITM1 traces the xTickCount in the xTaskIncrementTick() before calling vApplicationTickHook(); if scheduler is suspended, the highest byte is the uxPendedTicks, and lower 16-bit of xTickCount; when scheduler is resumed, the 2nd highest byte is the uxPendedTicks.
ITM2 traces the adjusted xTickCount when Idle function exits the low power sleep mode inside vTaskStepTick(); the reload register value of systick before idle enters sleep; and the current counter after idle task exited sleep.
ITM3 traces all interrupts of my own code
ITM4 traces external interrupt (0x00000001); ulTaskNotifyTake(0x00000002); and ulTaskNotifyTake exits prematurely

Here is a capture of two interrupt cycles, the first one is normal, and the second is the one has the issue:


After my work task finished its job, it called ulTaskNotifyTake(timeout=5) to wait for the next interrupt. The xTickCount was 0x1B2F and the timeout should be 0x1B34. The SysTick reload register was 48000*4 + remainder of previous systick register value. The next interrupt woke up the idle task before the SysTick timed out, the idle task adjusted the xTickCount to 0x1B32, also set the systick timeout to the fraction of a single tick based on the systick current counter value. Everything was good so far. After finished the job, the work task called ulTaskNotifyTake() to wait again. In this case, right before idle task entered sleep, the Systick interrupt asserted.

int_trigger
The fraction of a single tick out of 0x143B7 was 34871, about 720us.


Because idle task suspended the scheduler, the systick interrupt ISR (0x01001B32) was not able to increment the xTickCount, instead it incremented the xPendedTicks.
The idle task entered sleep mode, and set the systick timeout to 0x23253, about 3 ticks after.
Why it was 3? Could be another task which needs wakeup earlier than the work task.

Here is the snippet from the portTASK_FUNCTION, scheduler was suspended before entering sleep mode:

 static portTASK_FUNCTION( prvIdleTask, pvParameters )
…
	xExpectedIdleTime = prvGetExpectedIdleTime();
	if( xExpectedIdleTime >= configEXPECTED_IDLE_TIME_BEFORE_SLEEP )
	{
		vTaskSuspendAll();
		{
			/* Now the scheduler is suspended, the expected idle
			time can be sampled again, and this time its value can
			be used. */
			configASSERT( xNextTaskUnblockTime >= xTickCount );
			xExpectedIdleTime = prvGetExpectedIdleTime();

			if( xExpectedIdleTime >= configEXPECTED_IDLE_TIME_BEFORE_SLEEP )
			{
				traceLOW_POWER_IDLE_BEGIN();
				portSUPPRESS_TICKS_AND_SLEEP( xExpectedIdleTime );
				traceLOW_POWER_IDLE_END();
			}
			else
			{
				mtCOVERAGE_TEST_MARKER();
			}
		}
		( void ) xTaskResumeAll();
	}


After 3 ticks, the systick interrupt asserted, since the idle task suspended the scheduler, the systick ISR had to increment the xPendedTicks again and xPendedTicks became 2.

HERE IS the issue:
Then the idle task exited the sleep, and saw systick elapsed 3 ticks and called vTaskStepTick() to adjust the xTickCount by 3, but the xPendedTicks was still 2 after the adjustment. After xTaskResumeAll(), the 2 xPendedTicks was added to xTickCount AGAIN.

This explains why the xTickCount increased by 5 while the true elapsed time was about 3ms.

A great bit of debugging here! The fact you are using tickless mode was a BIG piece of information missing in the thread - at least for me.

If you are using the default sleep mode, then the call to eTaskConfirmSleepModeStatus() here https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/master/portable/GCC/ARM_CM4F/port.c#L544 should abort the sleep if there is already a tick pending. Do you see why that is not happening?

Yes, tickless idle certainly changes things. Note @rtel the abort sleep for pending ticks is not yet released. See here: https://github.com/FreeRTOS/FreeRTOS-Kernel/commit/e1b98f0b4be153cf6590c5b5b2f76b001af71c51.

The case of xPendedTicks = 2 has always been possible but (again) only causes a late wakeup. Let me take a closer look at this.

@rtel, thanks! @jefftenney, I agree with your comment in the link, the unreleased fix in the link did not address the case when the systick interrupt asserted while the idle task was in sleep mode.

Just for clarity, the yet-to-be-released fix linked above addresses the late-wake-up problem correctly. My comment on the commit was only referring to the commit comment – not the code. The code is perfect.

@jefftenney, thanks for the clarification. I agree the yet-to-be-released fix did address the late-wake-up problem. I thought your comment was referring to the case when the systick interrupt asserted while the idle task was in sleep mode, my misunderstanding.

There seems to be a corner case not handled properly in vPortSuppressTicksAndSleep(). If you are using version 8.2.3, on line 526 of port.c you should see this:

portNVIC_SYSTICK_CTRL_REG &= ~portNVIC_SYSTICK_ENABLE_BIT;

Will you please put an assert statement after that line like this:

configASSERT(portNVIC_SYSTICK_CURRENT_VALUE_REG != 0);

You could use any equivalent check – doesn’t have to be an assert statement. You could add a trace event if portNVIC_SYSTICK_CURRENT_VALUE_REG is zero there.

The line number was messed with my tracing code, I assume you meant the line below:

__weak void vPortSuppressTicksAndSleep( TickType_t xExpectedIdleTime )
{
uint32_t ulReloadValue, ulCompleteTickPeriods, ulCompletedSysTickDecrements, ulSysTickCTRL;
TickType_t xModifiableIdleTime;

	/* Make sure the SysTick reload value does not overflow the counter. */
	if( xExpectedIdleTime > xMaximumPossibleSuppressedTicks )
	{
		xExpectedIdleTime = xMaximumPossibleSuppressedTicks;
	}

	/* Stop the SysTick momentarily.  The time the SysTick is stopped for
	is accounted for as best it can be, but using the tickless mode will
	inevitably result in some tiny drift of the time maintained by the
	kernel with respect to calendar time. */
	portNVIC_SYSTICK_CTRL_REG &= ~portNVIC_SYSTICK_ENABLE_BIT;

	/* Calculate the reload value required to wait xExpectedIdleTime
	tick periods.  -1 is used because this code will execute part way
	through one of the tick periods. */
	ulReloadValue = portNVIC_SYSTICK_CURRENT_VALUE_REG + ( ulTimerCountsForOneTick * ( xExpectedIdleTime - 1UL ) );
	if( ulReloadValue > ulStoppedTimerCompensation )
	{
		ulReloadValue -= ulStoppedTimerCompensation;
	}
            ITM_EVENT32(2, ulReloadValue);

See my trace line ITM_EVENT32(2, ulReloadValue), my previous trace actually captured a case which implied the portNVIC_SYSTICK_CURRENT_VALUE_REG was zero:

ulReloadValue was 0x23253 = 143955, which was exactly (3 * 48000 - portMISSED_COUNTS_FACTOR), here ulStoppedTimerCompensation = portMISSED_COUNTS_FACTOR = 45.

I agree, considering the reload value.

Here’s a patch:

		/* Calculate the reload value required to wait xExpectedIdleTime
		tick periods.  -1 is used because this code will execute part way
		through one of the tick periods. */
		ulReloadValue = portNVIC_SYSTICK_CURRENT_VALUE_REG + ( ulTimerCountsForOneTick * ( xExpectedIdleTime - 1UL ) );
		if( portNVIC_SYSTICK_CURRENT_VALUE_REG == 0 )
		{
			ulReloadValue += ulTimerCountsForOneTick;
		}
		if( ulReloadValue > ulStoppedTimerCompensation )
		{
			ulReloadValue -= ulStoppedTimerCompensation;
		}

I’ll write up the explanation and post it here. Maybe also the right opportunity for another optimization I had considered too. We’ll see what Richard thinks.

That is a very old version. If there is an issue in it, does the issue exist in the latest version?

Yes, the bug affects the current version of FreeRTOS too. But your recent fix (pending release) actually fixes this bug too.

If vPortSuppressTicksAndSleep() happens to stop systick at zero, then that function causes xTickCount to “jump ahead” by 1 tick. That’s what Bin is experiencing. But your fix induces an abort if there are any pended ticks, which solves the problem.

@Thatseasy, can you integrate the not-yet-released fix linked above (and dump my patch) and verify it solves your issue?

Yes, I tried the more recent build 10.3.1, it had the same problem.

@jefftenney, I have incorporated the not-yet-released fix and running the test now. I’m afraid it only addresses the issue when systick asserted right before the idle task entering the sleep, and does not fix the problem when systick interrupted when idle task is already in sleep mode, and the xPendedTicks will be counted twice. Will see how the test goes.

As I suspected, the issue is still there:


after calling ulTaskNotifyTake, the idle task set the systick timeout to 0x002CD5B, a little less than 4ms. Almost 1ms later, at the cursor point, there was an external interrupt woke up the idle task.

See the expended the cursor area:


the systick current counter was 0x232AF = 144047, which was only (47/48000)us away from a single tick, and the following code had set to the systick load register:

			/* The reload value is set to whatever fraction of a single tick
			period remains. */
			portNVIC_SYSTICK_LOAD_REG = ( ( ulCompleteTickPeriods + 1UL ) * ulTimerCountsForOneTick ) - ulCompletedSysTickDecrements;

and this explained the immediate systick interrupts. Since the timeout was so short, scheduler was still suspended at that time, the xPendedTicks was incremented. However, I cannot explain why the xTaskIncrementTick() was called twice unless the second one was due to the PendSV:
if( xTaskIncrementTick() != pdFALSE )
{
/* A context switch is required. Context switching is performed in
the PendSV interrupt. Pend the PendSV interrupt. */
portNVIC_INT_CTRL_REG = portNVIC_PENDSVSET_BIT;
}

xTaskIncrementTick() was called 4 times in total, half was called in xTaskResumeAll() to process the xPendedTicks.

Here’s what seems to be happening. There is another edge case not handled properly in vPortSuppressTicksAndSleep(). If something other than systick wakes the CPU, and if the time remaining until the next tick is very small, then a very tiny value is loaded into the reload register here. Then the systick expires while vTaskStepTick() executes here, which pends the systick ISR and starts another very small systick period. While this second period is underway, we finally load the standard value into the reload register here. Then we unmask interrupts, at which point the pending systick ISR executes, and then very soon thereafter another systick interrupt occurs from the second very small systick period ending.

What’s interesting about this edge case is that there is a small window of reload values that can cause the error. Only values larger than e.g. 2 and smaller than e.g. 10 would ever cause the error. But in this case the ITM usage, or traceINCREASE_TICK_COUNT() usage, in vTaskStepTick() would expand the error window a bit because those things delay our loading the correct value into the reload register.

For a quick experiment and fix, can you swap the last two statements:

Current Code:

portNVIC_SYSTICK_CTRL_REG |= portNVIC_SYSTICK_ENABLE_BIT;
vTaskStepTick( ulCompleteTickPeriods );
portNVIC_SYSTICK_LOAD_REG = ulTimerCountsForOneTick - 1UL;

Proposed (testing only):

portNVIC_SYSTICK_CTRL_REG |= portNVIC_SYSTICK_ENABLE_BIT;
portNVIC_SYSTICK_LOAD_REG = ulTimerCountsForOneTick - 1UL;
vTaskStepTick( ulCompleteTickPeriods );

@jefftenney, swapping those two lines did the trick. no issue was seeing after about 4 hours running. I also tried omitting one external interrupt and letting the systick to expire, the code appears handling that well.

I can let the test run a bit longer and see if there is anything coming up.

Summarizing this “twofer”, there are two edge-case bugs in the default implementations of tickless idle for Cortex M. Both bugs are in vPortSuppressTicksAndSleep().

Bug 1: Stopping systick at zero
This bug causes kernel time (xTickCount) to “jump ahead” one full OS tick. Because this problem is a special case of the more general pending-tick problem fixed here, this problem will go away for free in the next release of FreeRTOS. However, it might be even better to update the comments in the code to reflect that in both cases of stopping systick in vPortSuppressTicksAndSleep(), stopping on zero has been considered and addressed.

Bug 2: Rescheduling an immediate tick after early wakeup
This bug causes kernel time (xTickCount) to “jump ahead” by nearly one full tick. See the post above for the description. The fix proposed above needs additional review – specifically regarding the systick option to use a divided clock – would need to check all ports and build options etc.

Possible Optimization
Given that vPortSuppressTicksAndSleep() is likely to be modified in all the various Cortex M ports soon, it may be the right time to add the optimization not to stop systick at all for eAbortSleep.

2 Likes

I reviewed FreeRTOS support for systick using the divided clock. It appears the fix proposed for “Bug 2” will need a small tweak for the case of systick using the divided clock.

I’d be happy to prepare a PR or create an issue on github or whatever would be most helpful.