FreeRTOS and power consumption

znatok wrote on Thursday, June 19, 2014:

Hi,
I have a project running on an STM32F2 ARM Cortex-M3 CPU. We are trying to tune the power consumption of the hardware, and I am personally trying to reduce the CPU's consumption. I have arrived at the following situation:

  • CPU is running at 120MHz. ART Accelerator is enabled.
  • I have created NO TASKS
  • Before calling vTaskStartScheduler() the CPU consumes about 20mA
  • After starting the scheduler the CPU current rises to 55mA

I know that FreeRTOS will create the idle task and the timer task (I have timers enabled). But I can’t understand what specific CPU hardware is activated after starting the scheduler that would consume 35mA. I do not believe the SysTick interrupt can do it.
According to the ST datasheet, a CPU running at this clock at 3.3V with ART enabled and ALL PERIPHERALS ENABLED should consume up to 49mA. But in my case I have no peripherals enabled (except clocks and NVIC), and before the scheduler is running power consumption is at 20mA.
Can someone shed some light on what could cause the extra 35mA?
Thanks.

rtel wrote on Thursday, June 19, 2014:

The scheduler is just executable code running on the CPU. In a like for like situation it does not have the ability to increase the power consumed by the CPU. By in a like for like situation I mean:

  • CPU running at the same frequency.
  • The same peripherals, clocks and memories are enabled.
  • The same amount of time is spent sleeping as opposed to running at full tilt.

Starting the scheduler will enable the SysTick timer - but nothing else. So that would have the potential to use extra power, but not 35mA worth.

When you run the system without FreeRTOS what is it doing? Is it running code all the time, or spending some time in sleep mode?

Using an RTOS can allow you to do more work for the same amount of power consumed simply by freeing up CPU time. That is done by ensuring everything is event driven, and absolutely no CPU cycles are wasted by polling anything.

In addition all RTOSs (that I am aware of) allow crude but very effective power saving by simply using the idle task to put the CPU into a low power state. Assuming your application is not going to use 100% of the CPU time this will give you massive power savings.

FreeRTOS goes further still, further than most conventional systems, by also allowing you to turn the tick interrupt off, enabling you to go into almost the deepest of the sleep modes available (not quite the deepest as the RAM and CPU register values must be retained). There is a generic Cortex-M tickless mode built into FreeRTOS, but the generic implementation is limited by the resolution and speed of the SysTick timer and the fact that the SysTick timer must remain on while in sleep mode. However the generic implementation can be overridden with a chip specific implementation that allows you to generate ticks from other time sources - low power 32-bit timers being the best. Here is an example running on an STM32L:

http://www.freertos.org/STM32L-discovery-low-power-tickless-RTOS-demo.html

It is possible to get power consumption down to the uA level.

Regards.

heinbali01 wrote on Thursday, June 19, 2014:

Hi,

I would also think that the extra 35mA is used because the CPU doesn’t sleep anymore.
The prvIdleTask() function in tasks.c will be active 100% of the time, waiting for some event which never occurs.

The simplest way of reducing power would be to include a vApplicationIdleHook() handler which issues a sleep instruction.
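
A minimal sketch of such an idle hook, assuming a GCC-style toolchain and that configUSE_IDLE_HOOK is set to 1 in FreeRTOSConfig.h. The call counter ulIdleHookCalls is a hypothetical addition for illustration only, and the architecture guard exists just so the file also compiles on a non-Cortex-M host:

```c
#include <stdint.h>

/* Hypothetical counter, for illustration: how many times the hook has run. */
static volatile uint32_t ulIdleHookCalls = 0U;

/* FreeRTOS calls this from the idle task on every pass of its loop
 * when configUSE_IDLE_HOOK is 1. It must never block. */
void vApplicationIdleHook( void )
{
    ulIdleHookCalls++;

#if defined( __ARM_ARCH_7M__ ) || defined( __ARM_ARCH_7EM__ )
    /* Let outstanding memory accesses complete, then halt the core
     * until the next interrupt (for example the tick) arrives. */
    __asm volatile( "dsb" ::: "memory" );
    __asm volatile( "wfi" );
    __asm volatile( "isb" );
#endif
}
```

With the normal tick still running, the core wakes on every tick interrupt, so this only saves the power spent spinning between interrupts.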

The power consumption can be further reduced by using the tickless mode (configUSE_TICKLESS_IDLE). In this mode the kernel will calculate each time how long it may enter sleep mode. After waking up, it will calculate how long it has actually slept and update the tick counter accordingly.
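
The two calculations described above can be sketched roughly as follows. This is a simplified illustration, not the port's actual code: the function names are hypothetical, and the real Cortex-M port layer also compensates for the time the timer is stopped and for a partially elapsed tick.

```c
#include <assert.h>
#include <stdint.h>

/* Before sleeping: limit the requested sleep (in ticks) to what the
 * wake-up timer can count before it wraps. */
uint32_t ulClampSleepTicks( uint32_t ulExpectedIdleTicks,
                            uint32_t ulTimerMaxCount,
                            uint32_t ulCountsPerTick )
{
    uint32_t ulMaxTicks;

    assert( ulCountsPerTick != 0U );
    ulMaxTicks = ulTimerMaxCount / ulCountsPerTick;

    return ( ulExpectedIdleTicks > ulMaxTicks ) ? ulMaxTicks : ulExpectedIdleTicks;
}

/* After waking: convert the timer counts that elapsed during sleep
 * into whole ticks, by which the tick counter is then advanced. */
uint32_t ulTicksSleptFor( uint32_t ulElapsedCounts, uint32_t ulCountsPerTick )
{
    assert( ulCountsPerTick != 0U );

    return ulElapsedCounts / ulCountsPerTick;
}
```

For a 24-bit timer at 120,000 counts per tick, a request to sleep for 500 ticks is clamped to 139 ticks. An early wake-up after 300,000 counts corresponds to 2 whole ticks.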

You can go further, as Richard suggests, by using a low power 32-bit timer (a timer/counter). I used this method on an M4 and had the device sleep for up to 35 seconds.

Note that by using tickless mode, there is no loss of performance, as long as the sleep will be ended by any interrupt source. The result of xTaskGetTickCount() may become a bit less reliable, because the actual sleep times will be estimated.


/* An example when using SYSTICK for tickless mode
 * on a CPU running at 120 MHz with a tick rate of 1000 Hz */

/* 120,000,000 / 1,000 = 120,000 */
ulTimerCountsForOneTick = ( configSYSTICK_CLOCK_HZ / configTICK_RATE_HZ );
/* The maximum sleeping time is 139 ticks, 0.139 sec in this example */
/* 16,777,215 / 120,000 = 139 */
xMaximumPossibleSuppressedTicks = portMAX_24_BIT_NUMBER /
    ulTimerCountsForOneTick;

/* An example when using a 32-bit timer to wake-up from sleep */
/* 120,000,000 / 1,000 = 120,000 */
ulTimerCountsForOneTick = ( configSYSTICK_CLOCK_HZ / configTICK_RATE_HZ );
/* 4,294,967,295 / 120,000 = 35,791, or 35.8 seconds */
xMaximumPossibleSuppressedTicks = portMAX_32_BIT_NUMBER /
    ulTimerCountsForOneTick;
        

Regards.

znatok wrote on Thursday, June 19, 2014:

Hi.
Thanks for your help. You don’t need to convince me of the advantages of using an OS. I understand them and am not going to do without FreeRTOS.

Before starting the scheduler my code initializes some peripherals: a few timers, a UART and some GPIO. If I put while(1) before starting the scheduler I get 35mA less consumption.
I’m trying to figure out what code may start running after the scheduler has started that could cause the extra power consumption.

What about interrupts? Am I right that the scheduler should configure and enable the NVIC?

Another concern is program flash usage. If I stop the program with while(1), optimization strips out a lot of code. Can longer jumps and prefetch cache misses account for 35mA?

rtel wrote on Thursday, June 19, 2014:

Another concern is program flash usage. If I stop program
with while(1) optimization strips off a lot of code. Can
longer jumps and prefetch cache miss source 35mA?

I suppose it would be possible, but I would doubt it.

Yes - when the scheduler is running the tick interrupt will be running, but I can’t imagine that would consume much more than just sitting in a while(1) loop using 100% CPU - other than your thoughts about longer jumps making use of more flash loads, etc.

Also, yes it configures the NVIC, but it does not enable it, it will be enabled already, all it does is set the priority of some of the system interrupts.

I would be interested to find the answer to this too. You could experiment by starting the scheduler without first creating any tasks, so only the Idle task runs, then measure the current. After that add the following code to the idle task hook function:

taskENTER_CRITICAL();
for( ;; );

That will leave it spinning in the idle task without the tick interrupt firing - does it make a difference to the current consumption?

Finally try adding a for( ;; ); loop before the call to vTaskStartScheduler() to see what the difference is.

How are you measuring the current? Externally, or getting the chip to measure it itself?

Regards,
Richard Barry.

znatok wrote on Thursday, June 19, 2014:

Thanks for your interest in this problem.
I can continue the experiments after the weekend.

How are you measuring the current? Externally, or getting the chip to
measure it itself?
We have a DVM connected in the CPU Vcc circuit. We lifted one resistor from
the power supply and put the DVM in series in ammeter mode.

Interestingly, my own bootloader, running on the same board and also using
FreeRTOS, draws the expected 20mA. The bootloader fits in 32K and sits in the first
two 16K Flash segments.
I will do what you have suggested and let you know.

znatok wrote on Tuesday, June 24, 2014:

Hi,
The situation is absolutely awful. I’ve spent a whole day on this today, and here
is what I have:
My main looks like
int main( void )
{
    vTaskStartScheduler();
    while(1);
}

This code draws 30mA with FreeRTOS 5.3.0
But it draws 60mA with FreeRTOS 5.7.2
If I add while(1) to TICK_HOOK I get the desired 30mA
But even with 5.3.0 I can’t keep it at 30mA after I add a few more
lines of code before starting the scheduler. That code is not HW related.
At the same time, my own bootloader running on the same board
stays stable at 30mA with both 5.3.0 and 5.7.2

I’m lost…

rtel wrote on Tuesday, June 24, 2014:

This really is a mystery.

V5.3.0 is really (really) old, so a lot has changed since then, but there is no V5.7.2 so I don’t think that is what you meant. Can you give me the real version number so I can see what has changed between the two. (http://www.freertos.org/History.txt)

Are you using the same FreeRTOSConfig.h in each case?

Regards.

znatok wrote on Tuesday, June 24, 2014:

Sorry, the versions are 7.3.0 and 7.5.2.
I do not think the problem is related to FreeRTOS, though. I would rather hear
any idea about what could cause the STM32F2 CPU to consume an extra 30mA.

rtel wrote on Tuesday, June 24, 2014:

So between those two versions some ‘improvements’ were made to the tickless idle functionality. Can you confirm that you have configUSE_TICKLESS_IDLE either undefined, or defined to 0 in FreeRTOSConfig.h? If so, then that will make no difference between the two versions.

There were also a few additional barrier instructions added to the Cortex-M port, but that is not going to make such a huge difference, if any at all.

Regards.

znatok wrote on Tuesday, June 24, 2014:

Confirmed: I do not touch configUSE_TICKLESS_IDLE (it is not defined).

znatok wrote on Thursday, June 26, 2014:

Hi,
I’ve found a way to reduce the current to the desired 30mA. Once I disable
vTaskDelete with

#define INCLUDE_vTaskDelete 0

power consumption goes down.

In my code I have calls to vTaskDelete, but only upon critical errors
that do not actually happen. So I’m sure the vTaskDelete function is not
invoked.
I’ve looked at all the places where INCLUDE_vTaskDelete is #ifdef’ed and
do not see any logical explanation of how this can influence power
consumption. As a reminder, I’m using v7.5.2. Do you have any idea?

rtel wrote on Thursday, June 26, 2014:

I can’t see why that would make a difference.

If INCLUDE_vTaskDelete is 0 then the function prvCheckTasksWaitingTermination(), which is called repeatedly from the idle task, would not do anything, but most of the time all it would do is one comparison anyway, so I can’t see how it would affect power consumption. Maybe as an experiment you could set INCLUDE_vTaskDelete back to 1, and simply comment out the call to prvCheckTasksWaitingTermination() - which you will find easily enough by searching in tasks.c.

Note if you do that, and you are actually using vTaskDelete(), then you will eventually run out of heap as that is the function that frees memory allocated by the kernel to tasks that have since been deleted.

Regards.

znatok wrote on Friday, June 27, 2014:

Well…, I think we are pretty close.
Commenting out prvCheckTasksWaitingTermination() makes the difference
(-20mA). I went on and looked at the implementation of
prvCheckTasksWaitingTermination(). It has a loop:

while( uxTasksDeleted > ( unsigned portBASE_TYPE ) 0U )
{

}

It’s clear that uxTasksDeleted is always 0, so the code inside the loop never
gets executed. But if I comment out the code inside the loop, power consumption
goes down. If I leave something inside, it goes up. Actually, the call to
prvCheckTasksWaitingTermination() is the only thing FreeRTOS does
in idle, and most of the time my application is in the idle loop as it is fully
event driven.

If the code inside the loop is commented out we get the following assembly code:
prvIdleTask:
0000 014B       ldr r3, .L3          @ tmp137,
.L2:
0002 1A68       ldr r2, [r3, #0]     @ uxTasksDeleted.74, uxTasksDeleted
0004 FDE7       b   .L2              @

Once I uncomment the first line in the loop (which is vTaskSuspendAll()) we
get +20mA and the following code:
.L109:
0006 FFF7FEFF   bl  vTaskSuspendAll  @
.L111:
000a 2368       ldr r3, [r4, #0]     @ uxTasksDeleted.74, uxTasksDeleted
000c 002B       cmp r3, #0           @ uxTasksDeleted.74
000e FAD1       bne .L109            @
0010 FBE7       b   .L111            @

This is the same busy loop, as vTaskSuspendAll() never gets executed. I played
a bit with the code inside the loop, and the only conclusion I can come to is
that the STM32 pipeline and/or prefetch (or whatever else they have inside)
gets flushed on some conditional branches, which causes accesses to program
flash and thus increases power consumption.

Really, without understanding the VHDL of the CPU core it’s impossible to
predict the behavior of the pipeline, and even if I could find a sequence that
would not disturb the pipeline I do not want to depend on it. It seems
like TICKLESS is the only reasonable solution? Where can I get more
information about TICKLESS? I’m a bit afraid of putting the system into sleep
mode. How would it react to events (a UART interrupt, for example)?
Should I configure each event to wake up from sleep?

Thanks.

rtel wrote on Friday, June 27, 2014:

Hmm. Very interesting. We will leave that one for the hardware guys to figure out.

If you want to use tickless idle in its most basic form, using just the lightest core sleep rather than any special low power modes, then all you need to do is set configUSE_TICKLESS_IDLE to 1 in FreeRTOSConfig.h. In that light sleep mode any interrupt will bring the CPU back out of sleep - and the tick count will get automatically adjusted to account for any ticks that would have occurred between going to sleep and the interrupt bringing it out of sleep again. Tasks that are blocked with a time out will also bring the CPU out of sleep when their timeout expires.
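
As a sketch, the FreeRTOSConfig.h change described above might look like the fragment below. The second setting is optional and its value here is only an assumption (it is the kernel's default); check that your kernel version supports it.

```c
/* FreeRTOSConfig.h (fragment) */

/* Use the port's built-in tickless idle, which sleeps with the
 * lightest core sleep mode while no task is ready to run. */
#define configUSE_TICKLESS_IDLE                  1

/* Assumed value: only enter tickless sleep when at least this many
 * ticks of idle time are expected. */
#define configEXPECTED_IDLE_TIME_BEFORE_SLEEP    2
```

No per-interrupt wake-up configuration is needed in this light sleep mode, since any interrupt brings the core out of sleep.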

Please report back your findings, including the effect on power consumption, when you have tried turning this on.

Regards,
Richard Barry

markwrichardson wrote on Friday, June 27, 2014:

I don’t know how long your pipeline is on that processor. Try adding a series of ‘NOP’ instructions inside the loop at the start and see if that makes a difference. Start with a lot and then trim them back.

znatok wrote on Friday, June 27, 2014:

Hi,
TICKLESS definitely helped:
TICKLESS=1 INCLUDE_vTaskDelete=1 45mA
TICKLESS=0 INCLUDE_vTaskDelete=0 48mA
TICKLESS=0 INCLUDE_vTaskDelete=1 67mA

I tried playing with NOPs before and managed to get lower consumption with
3 NOPs before vTaskSuspendAll()
inside the loop:
while( uxTasksDeleted > ( unsigned portBASE_TYPE ) 0U )
{
    __asm volatile( "nop" );
    __asm volatile( "nop" );
    __asm volatile( "nop" );
    vTaskSuspendAll();
#if 0

#endif
}

But once I uncommented the rest of the code in the loop the current went back up.
If you want me to run tests I can definitely do it, but I do not think this
is a good solution because: first, we are not sure what exactly influences
STM32 core operation, and second, you cannot fully control the assembler output
from the “C” code, as the optimizer can have its own tricks and the linker can
also move code (and this can be vital as well).

The STM32F2 has a so-called “Adaptive real-time memory accelerator (ART
Accelerator™)”, which is turned on. But all you can find about it in the
documentation is this (they do not even explain how to turn it on/off, I
had to ask in a forum):

2.3.5 Adaptive real-time memory accelerator (ART Accelerator™)
The ART Accelerator™ is a memory accelerator which is optimized for STM32
industry-standard ARM® Cortex™-M3 processors. It balances the inherent
performance advantage of the ARM Cortex-M3 over Flash memory technologies,
which normally requires the processor to wait for the Flash memory at
higher operating frequencies. Thanks to the ART Accelerator™, the CPU can
operate up to 120 MHz without wait states, thereby increasing the overall
system speed and efficiency. To release the processor full 150 DMIPS
performance at this frequency the accelerator implements an instruction
prefetch queue and branch cache, which enables program execution from Flash
memory at up to 120 MHz without wait states.

According to the datasheet, ART reduces power consumption by 10mA (at 120MHz,
3.3V).
Getting more info from ST is a waste of time. Based on my experience of
working with other HW modules of this CPU, I can tell that ST engineers
often do not know themselves how the modules work.
