FreeRTOS size

Hi All,
May I know the code size that FreeRTOS (with all of its features enabled) occupies in memory?
I need this information to assess the memory requirements for my project.
Looking for a quick response.

Just try it yourself?
It depends on various important things, starting with your MCU type, toolchain, optimization/code-generation settings, etc., which only YOU know about, right?
Set up your own or a matching demo project and build it. Then you can get a good estimate.

You can compile down to about 4.5K if you remove everything useful, so I normally quote about 9K to 18K for usable configurations. The plus/minus 100% in those numbers demonstrates Hartmut's point: there are so many variables that impact the code size. Most compilers remove dead code by default (not GCC though; you need to tell GCC to do that, and even then it's a bit clumsy), so including/excluding FreeRTOS features in the configuration file sometimes doesn't make that much difference.

If code size is critical then inspect the generated map file. I've seen seemingly innocuous things like a call to sprintf() more than double the code size when it pulls in the entire floating-point library.

When using a GCC toolchain, it is also recommended to compile with:

    -ffunction-sections
    -fdata-sections

and link with:

    -Wl,--gc-sections

This prevents the linker from including unused functions in the final image.
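A minimal sketch of how these flags work together, using a host gcc for illustration (on a Cortex-M target you would call arm-none-eabi-gcc with the same options):

```shell
# Two functions, only one of which is ever called:
cat > demo.c <<'EOF'
int used(void)   { return 1; }
int unused(void) { return 2; }
int main(void)   { return used(); }
EOF

# Compile with every function (and data object) in its own section:
gcc -O0 -ffunction-sections -fdata-sections -c demo.c -o demo.o

# Let the linker discard the sections nothing references:
gcc -Wl,--gc-sections demo.o -o demo

# Check which symbols survived in the final binary:
nm demo | grep " used"
nm demo | grep " unused" || echo "unused() was garbage-collected"
```

Adding -Wl,--print-gc-sections to the link step makes the linker report every section it removed, which is handy when hunting for footprint.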

@htibosch Surprisingly, I found some time ago that -fdata-sections may or may not help minimize image size and might even increase it a bit. I don't know why …
At least this applies to my applications, built with the GCC toolchain and -Os.
Hence I'm omitting -fdata-sections despite its documented purpose.

Doesn't -fdata-sections do for RAM sections what -ffunction-sections does for code sections?
I think it creates a separate section for each object in the .data space.

Yes, but FreeRTOS won't have many variables that can be dropped from the final image. Also, because multiple variables no longer have a known relationship between their addresses, processors that do better putting a "base" value in a register and addressing with a small offset can end up needing more memory to store the additional address bases.

-fdata-sections helps when you have some large tables that only sometimes need to be loaded depending on which functions you are using from that module.


Ah yes, I remember now that I've read the docs regarding the dedicated (smaller) data sections Hein mentioned. Thanks for the explanation, Richard!
That explains why it doesn't help with my apps: for various reasons I don't have to deal with many or large static tables.
So I wouldn't recommend always using -fdata-sections and hoping it helps; your mileage may vary :wink: and the opposite effect might occur.
OTOH the difference in image size isn't that large.

Thanks all for your replies. As I am new to FreeRTOS, I want to know the footprint of FreeRTOS with all features enabled for a Cortex-M4F processor core and the Arm toolchain.

Again, I'd really recommend just downloading FreeRTOS including the demos, selecting an appropriate demo, and building it. Depending on your application constraints, optimizing for size (e.g. GCC -Os) vs. optimizing for speed (e.g. -O2) also produces quite different image sizes. In addition, and depending on the toolchain (I'm not familiar with the ARM compiler; I'm using the GNU toolchain), link-time optimization can be used, which might also significantly reduce image size.
Even though you're using a Cortex-M4F, you may or may not use its FPU and floating-point code. This has an impact, too.
Finally, just enabling all features (I'm sure almost no one does that) isn't enough. The application needs to contain code that actually uses those features; otherwise it's dead code that gets optimized out, and you won't get any useful number for estimating the size of your application.
You've been warned :wink:
However, if you don't want to bother, just take the reasonable 9K to 18K @rtel mentioned.

Thank you. I will try and update my findings here.

This is a good point which is frequently underrated. The Cortex-M architecture allows for spectacular speed improvements (we are talking factors around 1000 or more) when code executes entirely within the core and does not need to access external buses. Also, code that fully exploits the pipeline and lookahead cache gains tremendously in speed. In order to accomplish that, compilers go through a myriad of optimization tricks. For example, a loop like

    for (i = 0; i < 10; i++) { <do something> }

may be unrolled as long as processor registers are available, yielding in the compiled code typically 4 to 5, sometimes more, verbatim repetitions of the code labeled do something. So the code may execute very much faster when optimized for speed, but may in return take up a flash/ROM footprint several times larger.