Thread-local storage?

rwilder wrote on Thursday, April 14, 2011:

Does FreeRTOS support thread-local storage on the AVR32 platform using GCC 4.3.2?

I’m testing a small sample project that seems to compile and link using the “__thread” attribute. However, I’m not certain it is actually working when I run it in the debugger.

rwilder wrote on Thursday, April 14, 2011:

Here are a few more details.

I’m using Atmel’s AVR32 Studio version 2.6.0 (Eclipse-based IDE), which comes with
gcc compiler version 4.3.2 (atmel-1.2.0-(mingw32_special)). Atmel provides
a software framework including a port of FreeRTOS V6.0.0.

The test code below simply creates two instances of a thread. If thread-local
storage works as anticipated, then the loop would not hit the breakpoint.

I find that it always breaks, and that the pointer psTlsThreadParms refers to
the same address regardless of which instance is running.

Any ideas ???

//--------------------------------------------------
// start of file
//--------------------------------------------------
// TlsTest.c
// Test of thread-local storage using FreeRTOS on Atmel AVR32
// Richard Wilder - GrayBox Technologies LLC
//

#include "FreeRTOS.h"
#include "task.h"

#define TLSTEST_STACK_SIZE         ( configMINIMAL_STACK_SIZE + 192 )
#define TLSTEST_TASK_PRIORITY   ( tskIDLE_PRIORITY + 2 )
void vTlsTestTask_init(void);
static portTASK_FUNCTION_PROTO( vTlsTestTask, pvParameters );

// Here we define a structure to hold thread data
typedef struct tag_STlsTestParameters
{
    int iInstance;
    int iCounter;
    struct tag_STlsTestParameters* pSelf;
} STlsTestParameters;

// Here we make two instances
STlsTestParameters g_sThread1Parms;
STlsTestParameters g_sThread2Parms;

// Here we declare another instance using thread-local storage
__thread STlsTestParameters gTLS_sThreadParms;

//--------------------------------------------------
// vTlsTestTask_init()
//--------------------------------------------------
// This is the public API that you call from main()

void vTlsTestTask_init(void)
{
    // initialize the first instance of parameter structure
    g_sThread1Parms.iInstance = 1;
    g_sThread1Parms.iCounter = 0;
    g_sThread1Parms.pSelf = &g_sThread1Parms;

    // initialize the second instance of parameter structure
    g_sThread2Parms.iInstance = 2;
    g_sThread2Parms.iCounter = 0;
    g_sThread2Parms.pSelf = &g_sThread2Parms;

    // launch the first instance of the thread
    xTaskCreate( vTlsTestTask,
                 ( const signed portCHAR * )"TlsTest1",
                 TLSTEST_STACK_SIZE,
                 &g_sThread1Parms,
                 TLSTEST_TASK_PRIORITY,
                 ( xTaskHandle * )NULL );

    // launch the second instance of the thread
    xTaskCreate( vTlsTestTask,
                 ( const signed portCHAR * )"TlsTest2",
                 TLSTEST_STACK_SIZE,
                 &g_sThread2Parms,
                 TLSTEST_TASK_PRIORITY,
                 ( xTaskHandle * )NULL );
}

//--------------------------------------------------
// vTlsTestTask()
//--------------------------------------------------
// This is the thread/task function

static portTASK_FUNCTION( vTlsTestTask, pvParameters )
{
    int iInstance;
    int iCounter;
    int iTlsCounter;

    // cast our void-pointer argument back to our structure-pointer
    STlsTestParameters* psThreadParms = (STlsTestParameters*) pvParameters;

    // make a local copy (for viewing in debugger)
    iInstance = psThreadParms->iInstance;

    // take a snapshot of the TLS instance address (for viewing in debugger)
    STlsTestParameters* psTlsThreadParms = &gTLS_sThreadParms;

    // initialize the thread-local instance of the structure
    gTLS_sThreadParms.iInstance = psThreadParms->iInstance;
    gTLS_sThreadParms.iCounter = psThreadParms->iCounter;
    gTLS_sThreadParms.pSelf = psTlsThreadParms;

    // main loop
    for(;;)
    {
        // make some changes
        iTlsCounter = ++gTLS_sThreadParms.iCounter;
        iCounter = ++psThreadParms->iCounter;

        // if there is truly a thread-local instance of gTLS_sThreadParms
        // then the counter values should always be equal
        if (iTlsCounter != iCounter)
        {
            // >>> SET DEBUGGER BREAKPOINT HERE <<<
            iCounter = iTlsCounter = 0;
        }

        // sleep an odd number of ticks to keep the two threads out-of-sync
        vTaskDelay(47 + iInstance);
    }
}

//--------------------------------------------------
// end of file
//--------------------------------------------------

mikehitekp wrote on Friday, April 15, 2011:

As far as I can see, although GCC doesn’t barf about ‘__thread’, it may well just ignore it, as it is not supported everywhere.

At a conceptual level, I cannot see how just declaring a variable as __thread helps the compiler know which thread the variable is attached to, nor which variable is intended to be read.

The only way I have found is to pass a pointer to the task, and by sufficient dereferencing, make it point to some data local to the task. You need to ensure it is dereferenced before the task finishes, and that no-one is using it when this happens.
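
For illustration, the pointer-passing approach described above might look roughly like the sketch below. It is plain C with no FreeRTOS dependency so it can be compiled anywhere; the names TaskContext, bump_counter and task_step are hypothetical and not part of FreeRTOS. In a real project the same pointer would be handed to xTaskCreate as the task parameter, and task_step would be the body of the task's loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-task data; each task owns exactly one instance. */
typedef struct {
    int instance;
    int counter;
} TaskContext;

/* A helper has to receive the context explicitly, because nothing
   else tells it which task it is running on behalf of. */
static int bump_counter(TaskContext *ctx)
{
    assert(ctx != NULL);
    return ++ctx->counter;
}

/* One iteration of a task body. pvParameters mirrors the void-pointer
   argument that a FreeRTOS task function receives. */
int task_step(void *pvParameters)
{
    TaskContext *ctx = (TaskContext *)pvParameters;
    return bump_counter(ctx);
}
```

Two contexts stepped independently keep separate counters, which is the behaviour the test project was hoping to get from __thread.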

rwilder wrote on Friday, April 15, 2011:

Thanks for your reply.

It isn’t ignored. GCC adds a bunch of code for variables declared with __thread (I looked at the disassembler listing). But I think you’re right, FreeRTOS would have to cooperate somehow - to tell the GCC libraries that it has performed a context switch. It would also have to cooperate on task creation, to create a new instance of thread-local storage.
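
For contrast, here is a minimal sketch of what __thread does on a hosted GCC toolchain where the threading library does cooperate (for example Linux with POSIX threads). It is not AVR32 or FreeRTOS code, and the names tls_counter, tls_demo_worker and tls_demo_run are made up for the example. Each worker increments its own copy of the variable, so neither sees the other's increments and the calling thread's copy stays at zero.

```c
#include <assert.h>
#include <pthread.h>

#define TLS_DEMO_ITERATIONS 1000

/* With a cooperating runtime, every thread gets its own copy. */
static __thread int tls_counter = 0;

/* Thread body: bump the thread-local counter and report its final value. */
static void *tls_demo_worker(void *arg)
{
    int *result = (int *)arg;
    for (int i = 0; i < TLS_DEMO_ITERATIONS; ++i) {
        ++tls_counter;
    }
    *result = tls_counter;
    return NULL;
}

/* Runs two workers. Returns 1 if each worker saw only its own
   increments and the caller's copy was left untouched, else 0. */
int tls_demo_run(void)
{
    pthread_t t1, t2;
    int r1 = -1, r2 = -1;
    int rc;

    rc = pthread_create(&t1, NULL, tls_demo_worker, &r1);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, tls_demo_worker, &r2);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    return r1 == TLS_DEMO_ITERATIONS &&
           r2 == TLS_DEMO_ITERATIONS &&
           tls_counter == 0;
}
```

Depending on the toolchain this may need to be built with -pthread.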

BTW: I know about casting the void-pointer argument to the task. It’s just ugly if you want the task to call any helper functions. You have to pass the pointer to each helper. The helpers become dependent on the task, and that takes away abstraction and re-use. I wish it were hidden, like the “this” pointer in C++.

richard_damon wrote on Saturday, April 16, 2011:

The way the compiler would handle Thread Local Storage is to create a structure with an entry for each separate variable. A pointer to the “current” thread’s structure is kept some place handy (either a dedicated register, or a global variable). The task system (like FreeRTOS) then needs to create a copy of this for each task created, and then change the pointer with each task swap.

To implement this in FreeRTOS would probably require adding a pointer in the TCB (or maybe just extra space at the base of the stack frame), code to create the memory (which would need to be in the portable layer, as determining how much is needed is implementation specific) and code to change the pointer (which would also need to be in the portable layer, but the state save/restore code that would use it is already there).
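
A rough simulation of that scheme, in plain C with no FreeRTOS code involved, might look like the sketch below. Everything in it is hypothetical: TlsBlock stands in for the compiler-generated structure, SimTcb for a TCB carrying the extra pointer, and sim_context_switch for the hook in the port's save/restore code. It only shows the bookkeeping, not how GCC on AVR32 actually locates its thread-local data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry per thread-local variable (hypothetical layout). */
typedef struct {
    int iInstance;
    int iCounter;
} TlsBlock;

/* Initial image that is copied into every new task's block. */
static const TlsBlock g_tlsInitImage = { 0, 0 };

/* Points at the running task's block. A real port might keep this
   in a dedicated register instead of a global variable. */
static TlsBlock *g_pCurrentTls = NULL;

/* Stand-in for a TCB: only the field this sketch would add. */
typedef struct {
    TlsBlock *pTls;
} SimTcb;

/* Task creation: allocate and initialise a private copy of the block.
   Returns 0 on success, -1 if the allocation fails. */
int sim_task_create(SimTcb *tcb, int instance)
{
    tcb->pTls = malloc(sizeof *tcb->pTls);
    if (tcb->pTls == NULL) {
        return -1;
    }
    memcpy(tcb->pTls, &g_tlsInitImage, sizeof *tcb->pTls);
    tcb->pTls->iInstance = instance;
    return 0;
}

/* Task swap: repoint the "current" pointer at the incoming task. */
void sim_context_switch(SimTcb *next)
{
    assert(next != NULL && next->pTls != NULL);
    g_pCurrentTls = next->pTls;
}

/* What an access to a thread-local variable boils down to:
   an indirection through the current pointer. */
int sim_tls_increment(void)
{
    assert(g_pCurrentTls != NULL);
    return ++g_pCurrentTls->iCounter;
}

/* Task deletion: release the block and avoid a dangling pointer. */
void sim_task_delete(SimTcb *tcb)
{
    if (g_pCurrentTls == tcb->pTls) {
        g_pCurrentTls = NULL;
    }
    free(tcb->pTls);
    tcb->pTls = NULL;
}
```

Switching between two simulated tasks and calling sim_tls_increment shows each task continuing from its own counter value, without any helper needing a context argument.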

davedoors wrote on Saturday, April 16, 2011:

There is a structure used by newlib that does this. I can’t remember the name of the structure, but if you look it up and search this forum for it, you will find code where somebody has already done what richard_damon is talking about for the newlib structure. I think it was quite a long time ago and this forum is not very search friendly, but it is there somewhere.