Use of switch/case in FreeRTOS code

system · April 14, 2016, 9:20am

fubo wrote on Thursday, April 14, 2016:

Hi all,

Some C constructs are converted into C runtime library calls by most compilers. For example, a switch/case may be implemented as a call into a C runtime library helper. I would suggest keeping the FreeRTOS code as free as possible from toolchain-specific C library code.
For switch/case, I would suggest converting all of them to if/then/else.

Any feedback?

system · April 14, 2016, 9:29am

davedoors wrote on Thursday, April 14, 2016:

A compiler can do more to optimize a switch statement, so switch is better IMHO.

system · April 14, 2016, 10:49am

fubo wrote on Thursday, April 14, 2016:

For ARM cores, the compiler builds a table and jumps to a C runtime helper that then jumps to the proper location. Although performance could be better, the result is less controllable because it depends on the toolchain. I develop with both ARM RVCT and GCC for ARM embedded, and the implementation differs between them. I would prefer the switch to be compiled into plain ARM opcodes rather than jumps into the C runtime library.

rtel · April 14, 2016, 11:49am

rtel wrote on Thursday, April 14, 2016:

My preference is for the code to be such that the compiler can select
the most efficient run time - which means using a switch statement
[where appropriate] rather than chains of if-then-elses.

system · April 14, 2016, 11:55am

davidbrown wrote on Thursday, April 14, 2016:

You have two issues seriously wrong or misunderstood here.

First, FreeRTOS is an operating system. Switches and other C constructs are handled by the compiler, not the OS software. There is nothing that FreeRTOS can do to influence how the compiler generates code for a switch statement. If you don’t understand what compilers do and how an RTOS fits in the development process, then you really need to go on some courses or read an introductory book - this is very basic stuff.

Secondly, the way a compiler translates a switch statement will vary greatly depending on the details of the source code, the compiler in question, and the choice of optimisation flags. I think it would be very rare for it to involve a runtime call, at least on an ARM processor. Typical implementations can include jump tables, calculated jumps, if-then-else jumps, and if-then-else conditional execution instructions, but compilers are free to be more “imaginative”. You should write your code in the clearest manner possible, ensure that you have appropriate optimisation flags enabled, and let the compiler figure out the best way to generate the code. It is rare that it makes sense to try to influence the details of code generation - usually you will do more harm than good with ideas like “if-then-else is faster than switch”, as human programmers are notoriously poor at spotting the real bottlenecks.

system · April 14, 2016, 12:34pm

fubo wrote on Thursday, April 14, 2016:

Let’s start by saying that there is no misunderstanding. I am not confusing the OS with the toolchain. So rather than being offensive, please do not tell me which courses I should attend.

I am not interested in performance. I am not saying if/then/else is faster/better than switch/case. I am arguing that it could be more portable between toolchains.

What I am trying to explain is this. If you look at the disassembly in the attachment, the switch in xTaskGenericNotify is translated into a call to __ARM_common_switch8 (with RVCT) or __gnu_thumb1_case_uqi (with GCC) in the C runtime library. I suppose the compiler found that effective.
My main problem is this: if I have to relocate part of the OS into a small execution region, because the load region is in a single-bank flash in which I have to erase other subsectors, I also have to relocate the C runtime library code it uses. Using if/then/else instead should stop the compiler from calling the C runtime helper, and I could save some space.

I know that my problem could be really platform specific. I am just sharing a scenario.
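To make the proposal above concrete, here is a minimal sketch of the same dispatch written both ways. The enum and function names are hypothetical stand-ins, not the actual FreeRTOS source, and nothing in the C standard stops a compiler from lowering the if/else chain the same way it lowers the switch, so the disassembly or map file still has to be checked for each toolchain and optimisation level.

```c
#include <stdint.h>

/* Hypothetical action codes, loosely modelled on a notify-style API. */
typedef enum {
    ACT_NONE = 0,
    ACT_SET_BITS,
    ACT_INCREMENT,
    ACT_OVERWRITE
} action_t;

/* switch form: the compiler may emit a jump table, a branch chain,
 * or (on some toolchains) a call to a switch helper routine. */
uint32_t apply_switch(action_t action, uint32_t current, uint32_t value)
{
    switch (action) {
    case ACT_SET_BITS:
        return current | value;
    case ACT_INCREMENT:
        return current + 1u;
    case ACT_OVERWRITE:
        return value;
    case ACT_NONE:
    default:
        return current;
    }
}

/* if/else form: same behaviour, written as explicit comparisons. */
uint32_t apply_if_chain(action_t action, uint32_t current, uint32_t value)
{
    if (action == ACT_SET_BITS) {
        return current | value;
    } else if (action == ACT_INCREMENT) {
        return current + 1u;
    } else if (action == ACT_OVERWRITE) {
        return value;
    }
    return current; /* ACT_NONE and any unknown code */
}
```

Both functions return the same result for every input, so the choice between them only affects how the compiler is likely to generate code, not what the code does.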

richard-damon · April 14, 2016, 12:55pm

richard_damon wrote on Thursday, April 14, 2016:

Since FreeRTOS is designed to be platform independent, its goal is code that works best across platforms in general rather than for one particular toolchain.

In general, compilers are much better able to optimize switch statements than a chain of if/else if statements, as a switch states more clearly to the compiler what is being done. It is, in fact, quite common for the compiler to internally convert a switch statement with sparse cases into the equivalent of if/else if.

In your case, if the compiler is generating code for the switch that is not optimal for your situation, it is likely an issue with the options you gave the compiler. Perhaps you told the compiler to optimize for space, so some common constructs are implemented as library calls rather than inline.

system · April 14, 2016, 1:16pm

davidbrown wrote on Thursday, April 14, 2016:

I think I misunderstood your earlier post - you are referring to the use of “switch” within the FreeRTOS code rather than how the compiler implements switch statements. With that cleared up, we can move on.

Compilers will generate code using if-then-else for switch if it is more efficient, but they can use other methods. Sometimes this will mean a function call (it surprises me in this case, but there is always something new to learn). You may also see function calls in other cases - perhaps function prologues or epilogues will use a call. This is the way compilers work - they will use functions from the C language support library as needed. You can perhaps influence it in some ways - in particular, optimising with -Os is more likely to use such calls than optimising with -O2. But you will still have to cope with other such helper functions, and make sure they are linked correctly. And while avoiding the function call might make the code more compact for a single switch, perhaps with many switches the overall size will be smaller when using the function call.

system · April 14, 2016, 2:20pm

fubo wrote on Thursday, April 14, 2016:

I must report one more finding.

As David reported, the way switch/case is compiled depends on the compiler. With RVCT those functions are not really C runtime APIs; rather, they are code fragments that the compiler adds and that every other function needing them refers to. The problem is that in principle you cannot control where they are placed, because it depends on the link order: the first module that refers to one defines and includes it, and the other modules refer to that copy. With GCC, instead, they are C runtime APIs (__gnu_thumb1_case_uqi .../lib/gcc/arm-none-eabi/4.7.4/thumb\libgcc.a(_thumb1_case_uqi.o)).
So it is even more difficult to control where they will be placed.

So my suggestion was to implement hand-made tables to handle switch/case constructs and avoid this uncertainty.
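As an illustration of what such a hand-made table could look like, here is a minimal sketch using an array of function pointers indexed by the case value. All names are hypothetical and it is not taken from FreeRTOS. The idea is that the table and its handlers are ordinary objects and functions, so their placement can be controlled through the linker script or scatter file like any other symbol.

```c
#include <stddef.h>
#include <stdint.h>

/* Each case becomes an ordinary function with a common signature. */
typedef uint32_t (*handler_t)(uint32_t current, uint32_t value);

static uint32_t do_none(uint32_t current, uint32_t value)
{
    (void)value;
    return current;
}

static uint32_t do_set_bits(uint32_t current, uint32_t value)
{
    return current | value;
}

static uint32_t do_increment(uint32_t current, uint32_t value)
{
    (void)value;
    return current + 1u;
}

static uint32_t do_overwrite(uint32_t current, uint32_t value)
{
    (void)current;
    return value;
}

/* The hand-made table: index 0..3 selects the handler. */
static const handler_t handlers[] = {
    do_none,
    do_set_bits,
    do_increment,
    do_overwrite
};

/* Bounds-checked dispatch; unknown codes leave the value unchanged,
 * which plays the role of the switch's default case. */
uint32_t dispatch(unsigned action, uint32_t current, uint32_t value)
{
    if (action >= sizeof handlers / sizeof handlers[0]) {
        return current;
    }
    return handlers[action](current, value);
}
```

The trade-off is an indirect call per dispatch and the loss of any cross-case optimisation, in exchange for code whose location does not depend on a compiler-supplied helper.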

htibosch · April 14, 2016, 6:22pm

heinbali01 wrote on Thursday, April 14, 2016:

I’m breaking my head when trying to understand what you are pointing at.
Why would it be a problem if the compiler decides in some cases to use a runtime library call?

I just found some clear text about __gnu_thumb1_case_uqi that you referred to.

I have not yet encountered a situation in which these optimisations would have disadvantages for my project.
In the non-kernel software (like FreeRTOS+TCP) you will sometimes see “hand-made switch tables”, think of the CLI modules (Command Line Interface: each command can be seen as a case). Then it’s an excellent way to structure code.
But in the kernel code, I would be glad to leave optimisations up to the compiler. It has smart ways of calculating either the fastest (-O3) or the most economic way (-Os).

system · April 14, 2016, 8:38pm

fubo wrote on Thursday, April 14, 2016:

My case is an ARM946 with ITCM and DTCM (internal RAMs) connected to an external flash. While a flash sector is being erased, code cannot be fetched from the flash. So during an erase I stop the scheduler to keep tasks from running; during the erase phase only ISRs can be served. I therefore set up the map file to place the ISR code in xTCM. All my peripheral drivers are already written to avoid any reference to the RVCT/GCC C runtime library. As xTCM is limited, I want to avoid moving pieces of the C runtime library there. And as I have part of FreeRTOS moved to TCM (mainly to improve performance), I’d like it not to bring pieces of the C runtime library along.

richard-damon · April 15, 2016, 2:43am

richard_damon wrote on Friday, April 15, 2016:

When I have had to do updates like that, I have generally used a small separate program, loaded into RAM as a ‘bootloader’, that is in charge of updating the flash memory. This simple program is compiled to run from the desired memory, and sometimes doesn’t use FreeRTOS at all, as it is simple and focused enough (it depends on the I/O requirements for getting the data).

system · April 15, 2016, 6:55am

davidbrown wrote on Friday, April 15, 2016:

Usually when doing something like that, I use an absolute minimal function in ram. For chips that have only one flash plane, you can’t really do much during write or erase. So your code in your normal program looks something like:

// Declared first with the section attribute: GCC rejects an attribute
// placed after the declarator of a function definition.
void ram_writeFlash(int page, int size, int * p) __attribute__((section("ram_code")));

void writeFlash(void) {
    for (int page = first; page < last; page++) {
        disable_interrupts_and_pause_scheduler();
        ram_writeFlash(page, pageSize, data_pointer);
        enable_interrupts_and_resume_scheduler();
        data_pointer += pageSize;
    }
}

void ram_writeFlash(int page, int size, int * p) {
   // Actual write process, using busy waiting
}

You don’t need much inside the functions that run from ram - it should not call any other functions. You have complete control here - if your compiler generates function calls for switch statements, don’t use switch statements. You certainly don’t use any RTOS features. (If you need other tasks to keep running during the flash programming, you have the wrong choice of microcontroller or hardware design.)

system · April 15, 2016, 7:57am

fubo wrote on Friday, April 15, 2016:

@Richard: in my case I have to write and erase at runtime, in a sort of flash file system.

@David: the problem is not the write, which is quite atomic, but the erase. I am using the erase-suspend feature of the flash, but I cannot erase in µs-long intervals, otherwise a complete operation would take a very long time. Instead, I do it in chunks of a few ms, but during each chunk I must keep at least interrupts enabled. During an erase, the OS should be able to run even though the scheduler is stopped. This is why I need to move part of the OS into TCM.
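A rough sketch of the chunked erase described above, run against a simulated flash so that it can be executed on a host. Every function name here is hypothetical, and a real driver would issue the device's own erase, suspend and resume commands instead of updating a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* --- Simulated flash state (stand-in for a real driver) --- */
static uint32_t sim_remaining_ms;       /* erase time still needed */
static bool     sim_suspended = true;   /* true: flash is readable */

static void flash_erase_begin(uint32_t total_ms)
{
    sim_remaining_ms = total_ms;
    sim_suspended = false;
}

static void flash_erase_resume(void)  { sim_suspended = false; }
static void flash_erase_suspend(void) { sim_suspended = true; }
static bool flash_erase_done(void)    { return sim_remaining_ms == 0u; }

/* Simulated busy-wait: the erase only progresses while not suspended. */
static void busy_wait_ms(uint32_t ms)
{
    if (!sim_suspended) {
        sim_remaining_ms = (ms >= sim_remaining_ms) ? 0u
                                                    : sim_remaining_ms - ms;
    }
}

/* Erase in slices of chunk_ms and return the number of slices used.
 * While a slice runs the flash cannot be read, so on real hardware only
 * code placed in TCM (ISRs and whatever they call) may execute. */
uint32_t erase_in_chunks(uint32_t total_ms, uint32_t chunk_ms)
{
    uint32_t slices = 0u;

    if (chunk_ms == 0u) {
        return 0u;                   /* avoid looping forever */
    }

    flash_erase_begin(total_ms);
    while (!flash_erase_done()) {
        flash_erase_resume();        /* scheduler stopped, interrupts on */
        busy_wait_ms(chunk_ms);
        flash_erase_suspend();       /* flash readable again */
        slices++;
        /* A real system would let flash-resident tasks run here. */
    }
    return slices;
}
```

With a 10 ms erase and 3 ms chunks the loop runs four slices (3 + 3 + 3 + 1). The shorter the chunk, the more suspend and resume overhead a real device adds, which is the reason given above for not using µs-long intervals.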