Question about FreeRTOS stack usage

jcwren wrote on Tuesday, July 24, 2007:

I have a question about memory allocation and stack usage that I need to confirm:

System is ARM7, GCC, and newlib.  Prior to the scheduler starting, any stack usage is on the stack set up by newlib.  Calls to xTaskCreate, etc. each use a small amount of that stack.  Once the scheduler is started, NO stack space outside ucHeap is used, since all context changes take place on the task stack that exists inside ucHeap, correct?  Ignore the case for interrupts, since the ARM7 has separate stacks for IRQs and SWIs.

The issue I’m trying to resolve is that the newlib _sbrk() always returns -1 (error), because of a check that it does to see if the heap space overlaps the stack.  Since the stack for the current task is inside ucHeap, it appears to be less than the top of the heap, and _sbrk() thinks there’s a collision.  In reality, it’s a spotty check at best, since the ARM7 doesn’t have an MMU or a stack fence, but it’s at least something.

My solution will be (assuming my assumption is correct.  And you know what they say about an assumption: It makes an ass out of you and umption (Samuel L. Jackson, “Long Kiss Goodnight”)) to use the existing stack overlap check if the scheduler is not running, otherwise, once the scheduler is running (and since it will never exit in an ARM7), permit the heap to grow to the beginning of the IRQ stack (which is above the supervisor stack (which is no longer used), which is above the system stack (which is never used, since if we’re in user mode, the user stack is in the ucHeap area)).

If this is not the case, and the context switching is using the supervisor stack, I should be able to at least reduce the size of the supervisor stack to 128 bytes or so to permit context switches to take place.
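A rough sketch of the scheduler-aware check described above, written so it runs on a host rather than on the target. Everything named `sketch_*` is hypothetical, the memory map is passed in as plain numbers instead of coming from linker symbols, and the `scheduler_running` flag is assumed to be supplied by the caller. It keeps the usual "don't pass the SP" test before the scheduler starts, and afterwards lets the heap grow up to the initial supervisor SP minus a small reserve.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory map; on a real target these values would come
 * from linker symbols and the startup code. */
typedef struct {
    uintptr_t heap_end;      /* current break, initially the end of .bss    */
    uintptr_t svc_stack_top; /* initial supervisor SP (IRQ stack is above)  */
    uintptr_t svc_reserve;   /* bytes kept back for the supervisor stack    */
} sketch_map_t;

/* Returns the previous break on success, or (uintptr_t)-1 with errno set
 * to ENOMEM when the request would cross the applicable limit. */
uintptr_t sketch_sbrk(sketch_map_t *map, ptrdiff_t incr,
                      int scheduler_running, uintptr_t current_sp)
{
    /* Before the scheduler starts, the caller's SP is the limit.
     * Afterwards the task SP lives inside ucHeap, so it is ignored and
     * the limit becomes the reserved part of the supervisor stack. */
    uintptr_t limit = scheduler_running
                          ? (map->svc_stack_top - map->svc_reserve)
                          : current_sp;
    uintptr_t old_end = map->heap_end;

    if (incr > 0 &&
        (old_end > limit || (uintptr_t)incr > limit - old_end)) {
        errno = ENOMEM;
        return (uintptr_t)-1;
    }

    /* A negative incr shrinks the heap; unsigned wraparound makes the
     * addition behave as a subtraction. No lower bound is checked here. */
    map->heap_end = old_end + (uintptr_t)incr;
    return old_end;
}
```

With a task SP below the heap (as it would be for a stack inside ucHeap), a request still succeeds once `scheduler_running` is non-zero, and fails only when it would run into the reserved supervisor stack area.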

Any insight, advice, etc would be greatly appreciated.

–jc

dspaude wrote on Tuesday, July 24, 2007:

Hi JC,

Which toolchain are you using (GCC, but which version, what other libraries, and I assume your newlib is NOT reentrant since you mentioned _sbrk rather than _sbrk_r, etc.)? I’ve been using YAGARTO, but I don’t see a ucHeap in my memory map file. I’d also like to know how all this stack handling works (between ARM’s 7 modes and stacks and FreeRTOS’s heap partitioned into stacks). I’ve had a Data Abort within _sbrk_r() while using FreeRTOS and I have no idea why that happened other than to assume FreeRTOS didn’t have enough heap allocated. I don’t fully understand how the toolchain sets up heaps and stacks for use on the ARM (if anyone can point me to a doc for understanding that would be great).

Regards,
Darrik

davedoors wrote on Tuesday, July 24, 2007:

Have you read http://www.freertos.org/a00111.html ?

How the heap is allocated depends on the memory management scheme you include in your makefile. If you are using heap_1 or heap_2 then the "FreeRTOS heap" is just an array from which data is grabbed when FreeRTOS wants to create a task or queue. It is not the heap as your compiler knows it. heap_3 is just a wrapper for malloc and free.

The stacks for the tasks come from the "FreeRTOS heap". Task context is saved into this space.
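To make the "just an array" point concrete, here is a minimal sketch in the spirit of heap_1: a static array plus an index that only ever moves forward. It is not the FreeRTOS source. The `sketch_*` names, the 1024-byte size (standing in for configTOTAL_HEAP_SIZE) and the 8-byte alignment are assumptions for illustration, and the real allocator also guards the allocation against task switches, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HEAP_SIZE 1024u /* stands in for configTOTAL_HEAP_SIZE   */
#define SKETCH_ALIGNMENT 8u    /* assumed alignment, must be power of 2 */

/* The "FreeRTOS heap": an ordinary static array that the linker places
 * alongside the other zero-initialised data. The union forces alignment. */
static union {
    uint64_t force_alignment;
    uint8_t bytes[SKETCH_HEAP_SIZE];
} sketch_heap;

static size_t sketch_next_free = 0; /* index of the first unused byte */

/* Hand out the next aligned chunk. Nothing is ever given back. */
void *sketch_malloc(size_t wanted)
{
    size_t rounded;
    void *block = NULL;

    if (wanted == 0 || wanted > SKETCH_HEAP_SIZE) {
        return NULL;
    }

    /* Round the request up to the alignment boundary. */
    rounded = (wanted + (SKETCH_ALIGNMENT - 1u)) &
              ~(size_t)(SKETCH_ALIGNMENT - 1u);

    if (rounded <= SKETCH_HEAP_SIZE - sketch_next_free) {
        block = &sketch_heap.bytes[sketch_next_free];
        sketch_next_free += rounded;
    }
    return block;
}

/* Bytes still available in the array. */
size_t sketch_free_bytes(void)
{
    return SKETCH_HEAP_SIZE - sketch_next_free;
}
```

A task's stack and control block would both be carved out of this array, which is why task stack pointers end up at addresses inside it rather than in the region the compiler's malloc manages.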

The scheduler uses Supervisor mode stack, but not much. Interrupts use IRQ or FIQ stacks. Nothing uses User mode stack unless you call main in User mode.

Dave.

jcwren wrote on Wednesday, July 25, 2007:

Yes, I’ve read the a00111.html page.  I was already aware of most of that.

Mostly what I was trying to confirm was that once the scheduler is started, no FreeRTOS memory accesses will be beyond the end of .bss, with the exception of the minor bit of supervisor stack space used by the scheduler.

This allows me to permit tasks to do initial malloc()'s in their startup code, and whatever malloc()/free()'s the various newlib routines do (I don’t fundamentally approve of the functions doing that, but the reality is that they do, and for newlib to work properly, the memory allocator needs to work).

I think the heap_2.c memory allocator is ideal.  I don’t know how often, if at all, FreeRTOS allocates and releases memory from inside itself, but it’s a known fact that malloc()/free() can result in fragmented memory, and is generally considered to be a poor choice for embedded systems, especially those with a tighter budget.

BTW, in the next release, would you mind adding a function that exposes the state of xSchedulerRunning?  It’s helpful for FreeRTOS aware libraries to know if the scheduler is currently running or not, and that would seem to be the most effective way.  If you don’t think so, I’m open to other suggestions.

–jc

jcwren wrote on Wednesday, July 25, 2007:

Darrik,

  I’m using GCC 4.1.2, newlib 1.15 (modified syscalls.c), FreeRTOS 4.3.1, FatFS 0.04b, and lpcusb 20060903 (modified).

  The stacks and such are pretty straightforward, although at first glance they do seem hard to wrap your head around.  Basically, there are 6 stacks, one for each mode.  The processor uses the stack that corresponds to the mode.  When an IRQ occurs, the IRQ stack is used.  When a FIQ occurs, the FIQ stack is used, etc.  System/user mode share a stack.

  FreeRTOS runs the scheduler in supervisor mode (as noted in the a00111.html docs), while tasks run in system mode.  In a normal newlib environment (not running a scheduler), the stacks are (in order, from highest RAM address going lower) UNDEF (undefined instruction execution), ABORT (basically non-word aligned access to memory), FIQ (fast interrupt), IRQ (vectored and non-vectored interrupt), SVC (supervisor), then system/user.  FreeRTOS requires that the scheduler be started while in supervisor mode, and runs tasks in system mode.  However, when a task is created with xTaskCreate, one of the parameters is the size of the stack for that task.

  One of the configuration variables in FreeRTOSConfig.h is configTOTAL_HEAP_SIZE, which dictates how much total space will be reserved for tasks, queues, semaphores, etc.  When the task is created, it allocates a task control block (TCB) from this total space, and also the amount of stack required.  When the task is running, its stack pointer is actually pointing to the stack space allocated by the xTaskCreate call.  When the scheduler itself is running, it uses a little bit of the supervisor stack (safe to say less than 64 bytes, based on prefilling memory and watching what gets used), and a little bit of the IRQ stack (when Timer 0 fires into the interrupt handler).  Once the scheduler is running, your application can use any memory from the end of .bss up to the initial supervisor SP, minus 64 bytes.
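The "prefill memory and watch what gets used" measurement can be sketched roughly as below. The fill byte 0xA5 and the `sketch_*` names are assumptions, and the region is passed in as an ordinary buffer so the code runs on a host. On the target the region would be the supervisor stack area, filled before anything runs on it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_FILL_BYTE 0xA5u /* assumed fill pattern */

/* Fill a stack region with the pattern before anything runs on it. */
void sketch_stack_prefill(uint8_t *region, size_t size)
{
    memset(region, SKETCH_FILL_BYTE, size);
}

/* Stacks grow downward here, so untouched bytes sit at the low end of
 * the region. Count them from the lowest address upward; everything
 * above the first overwritten byte is treated as used. */
size_t sketch_stack_bytes_used(const uint8_t *region, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && region[untouched] == SKETCH_FILL_BYTE) {
        untouched++;
    }
    return size - untouched;
}
```

The result is a high-water mark, not the current depth, and it can under-report slightly if the deepest bytes written happen to equal the fill pattern.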

  This is where _sbrk() comes in (probably telling you something you already know here).  The heap, as GCC/newlib knows it, starts at the end of .bss, and grows up towards the top of memory.  In a non-FreeRTOS system, _sbrk() checks to see if the request to increase the heap size (as a result of a malloc/calloc/etc) would push the end of the heap past the current stack pointer (the system/user SP grows towards heap space).  If the current end of heap plus the requested space is greater than the current SP value, ENOMEM is returned.

  Of course, this fails miserably when trying to use this scheme under FreeRTOS, because the process that’s requesting memory has its SP below the start of the heap, so it looks like a collision occurred, and ENOMEM is returned (not all functions check the return of malloc, since we all know that real computers have gigs of memory, and we only know that something went wrong when it locks up hard).  Your ABORT condition is quite likely a pointer getting overwritten by stack, or your stack getting overwritten by data, so that when something goes to fetch some memory, it’s pointing to nowheresville (like referencing non-existent space, say address 0x40008000, where there’s no RAM, no FLASH, no peripheral blocks, etc).
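The false collision is easy to see with numbers. The predicate below is only the comparison described above, pulled out into a hypothetical function; the addresses in the usage note are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Classic single-stack test: report a collision when growing the heap
 * by incr bytes would move the break past the current stack pointer. */
bool sketch_classic_collision(uintptr_t heap_end, size_t incr, uintptr_t sp)
{
    return heap_end + incr > sp;
}
```

With the heap ending at 0x40002000 and the startup stack at 0x40007000, a 256-byte request passes. With the same heap and a task SP of 0x40001000 (inside the FreeRTOS array, below the heap), the same request is reported as a collision even though the two regions cannot meet.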

  Back to _sbrk_r.  The newlib, as compiled by the crossdev utility on my machine, has reentrancy enabled.  _sbrk_r is not in newlib/syscalls.c, but is a library function which takes a _reent structure pointer.  It calls _sbrk, which I believe it expects to be a system call, therefore non-interruptible.  On return, it copies the system errno to the local copy of errno in the _reent structure for that thread.  I’m not clear how _sbrk_r guarantees that errno isn’t overwritten between the exit of the system call and another interrupt.
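The general shape of such a wrapper can be sketched as follows. This is a host-runnable model, not newlib's source: `sketch_reent_t` stands in for the real _reent structure, and `sketch_sbrk_syscall` is a hypothetical stub that always fails so the errno copy can be observed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread reentrancy structure. */
typedef struct {
    int thread_errno; /* this thread's private copy of errno */
} sketch_reent_t;

/* Hypothetical stand-in for the _sbrk system call; always fails here. */
static void *sketch_sbrk_syscall(ptrdiff_t incr)
{
    (void)incr;
    errno = ENOMEM;
    return (void *)-1;
}

/* Reentrant wrapper: clear the global errno, make the call, then copy
 * any error into the caller's own structure. */
void *sketch_sbrk_r(sketch_reent_t *reent, ptrdiff_t incr)
{
    void *ret;

    errno = 0;
    ret = sketch_sbrk_syscall(incr);

    /* Window: if another task ran here and touched the global errno,
     * the value copied below would be that task's, not ours. */
    if (ret == (void *)-1 && errno != 0) {
        reent->thread_errno = errno;
    }
    return ret;
}
```

As far as this sketch goes, nothing in the wrapper itself closes that window. It relies on the underlying call and the copy not being interleaved with another user of the global errno, which is the concern raised above.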

  Currently, my solution is to permit only one of the tasks to really use the newlib calls.  Most of the tasks I have do things like monitor devices or parse GPS input, then update a global structure protected by semaphores.  The main worker task uses things like sprintf() for formatting, handles the file I/O, etc.  So from newlib’s point of view, the system is single threaded.

  On the ucHeap not being present in your memory map, it’s because ucHeap is inside the statically allocated structure xHeap inside of heap_2.c.  The map file will show where a large chunk of memory the size of configTOTAL_HEAP_SIZE exists in the heap_2.o file.  That’s ucHeap, and where FreeRTOS does all its memory management magic.

  I tend to ramble in my explanations somewhat, but maybe this helped you out.  Take a look at the LPC2148_Demo code at http://jcwren.com/arm.  I’ll have an updated version up in the next couple days (ver 120) that has a command you’ll likely find useful in understanding all this (‘mem map’).  You can also allocate and free memory from the command line to see where it’s actually returning pointers to, how it affects the heap end, etc.

  (Any errors contained within this explanation are purely mine, and based on my understanding of FreeRTOS, newlib, GCC, etc, so far.  I’ve a lot of experience with micros (AVR, MSP430, Z80, etc), and x86 systems, but the ARM7 is my first major foray into 32 bit micros.  It’s likely I’m doing some really stupid things here and there, and people should feel free to (politely) point them out).

  --jc

jcwren wrote on Wednesday, July 25, 2007:

  Oh, I forgot to mention there’s a really good data sheet that shows the stacks and registers as they relate to the various modes.  I downloaded it, then forgot where I found it.  I’ve put it in the http://jcwren.com/arm directory as ARM7TDMI-S.pdf.  If someone knows of a link to the original source, please post here and I’ll remove this copy.

  --jc

rtel wrote on Wednesday, July 25, 2007:

FreeRTOS.org only ever allocates memory when a task or queue is created.  This always comes from the pvPortMalloc() function - which is equivalent to malloc when using heap_3, otherwise the heap as defined by the linker is not used.  Take care about using library functions that call malloc as malloc is not generally reentrant.  Also many GCC library functions use masses of stack.

Regards.

rtel wrote on Wednesday, July 25, 2007:

Thanks for your great explanation.  Also for providing what looks like a great example application.  Is it ok if I link to this from the FreeRTOS.org site?

Regards.

jcwren wrote on Wednesday, July 25, 2007:

  Yes, by all means, please feel free to link to it.

  I learn best by having a process that basically looks like a C loop: do { edit (); compile (); crash (); Google (); } while (1);

  I got started on this because I have a particular application in mind that a LPC2148 is perfect for and it uses a good number of the elements found in the demo code. 

  Many of the answers to the problems I had were targeted at the non-GCC compilers.  I imagine those are great for many people, but I prefer to do my development under Linux.  So besides the fact that GCC runs under Linux, GCC is also just a flat-out excellent compiler, with lots of documentation (albeit some of it is rather cryptic…), and it’s free.  Also, many of the answers I found were “Oh yea, I did that” and no code, or “oh, I found the problem” and no indication of what the solution was.  This started me thinking that maybe a practical working example that other people could use might be useful.

  At some point, I decided to suspend work on my project, and build out the demo code.  As I become familiar with a new package or peripheral, I try to find a way to integrate it, particularly when I found a recurring question along the lines of "how do I do this?"

  All my code is mostly nothing more than an aggregate of great work like FreeRTOS, lpcusb, FatFS, newlib, etc, with a little glue.  I may whine about how a few things are done here and there, but the reality is that without the work done by everyone else, I’d still be flipping switches on the front of my IMSAI 8080 (which I still have!).  So for the FreeRTOS code, Richard, thanks for writing it, and for making it publicly available (of course, there’s still time to change the license, close the source, write a book, and make everyone buy a new copy every time you revise it…)

  --jc

  (Future enhancements will include an FIQ demo, VIC software interrupts, and hopefully a decent undef/abort handler that actually indicates what caused the abort… while (1); makes a lousy debugging aid.)

dspaude wrote on Wednesday, July 25, 2007:

Thanks, JC. I like your main loop (C loop) example. I think it is pretty common for embedded developers!

Regarding the document you have, it’s one of the ARM core documents which can be found at:
http://www.arm.com/documentation/ARMProcessor_Cores/index.html

For example, the Programmer’s Model chapter of the following Technical Reference Manual (TRM) is about the same as the one you posted:
http://www.arm.com/pdfs/DDI0210C_7tdmi_r4p1_trm.pdf

The doc you reference (DDI 0084D) is also on that ARM core docs page, but at Rev. F:
http://www.arm.com/pdfs/DDI0084F_7TDMIS_R3.pdf

Best Regards,
Darrik