Is accessing different elements of an array thread-safe?

char my_array[10];

Let’s say taskA changes the value of my_array[0] and taskB changes the value of my_array[1]. Is this thread-safe?

My instinct tells me that it is not thread-safe: arrays cannot have padding bytes between elements, and CPUs usually access (read/write) multiple bytes at a time, so changing the value of my_array[0] could mean writing back adjacent array elements, which may have changed in the meantime.

More specifically, is this guaranteed to be thread-safe with single core MCUs without cache?

And if this is thread-safe, is this guaranteed by the CPU or by FreeRTOS?

It depends on whether the compiler writes each byte with a single byte store, or has to read and then write back a full “word” to change just one byte.

Most processors have individual byte write instructions, so at the single-core level you should be safe. Even a cache won’t get in the way on a single core, as all threads are on the same core and so use the same cache. If the processor needs to read a bigger chunk to update it, it does so “under the hood” and the program doesn’t see the problem.

If the processor doesn’t have byte writes (or the compiler doesn’t use them), and the code instead reads a word with one instruction, changes a piece of it, and then writes the word back, you will have problems.

Where you get problems is with multi-core, as cores tend to share memory at the cache-line level, so that would not be safe if the two tasks might run on different cores.
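
To make the failure case concrete, here is a minimal single-threaded sketch that replays the problematic interleaving by hand. It assumes a machine (or compiler) that can only emulate a byte write as "load word, patch byte, store word". The helper names (load_word, patch_byte, store_word, taskB_update_survives) are made up for this illustration and are not part of FreeRTOS.

```c
#include <stdint.h>
#include <string.h>

uint8_t my_array[4];

/* Step 1 of an emulated byte write: read the whole containing word. */
uint32_t load_word(void)
{
    uint32_t w;
    memcpy(&w, my_array, sizeof w);
    return w;
}

/* Step 2: change one byte inside the register copy of the word. */
uint32_t patch_byte(uint32_t w, size_t index, uint8_t value)
{
    uint8_t bytes[sizeof w];
    memcpy(bytes, &w, sizeof w);
    bytes[index] = value;
    memcpy(&w, bytes, sizeof w);
    return w;
}

/* Step 3: write the whole word back, including the bytes we did not change. */
void store_word(uint32_t w)
{
    memcpy(my_array, &w, sizeof w);
}

/* Replays: taskA loads, taskB does a full update, taskA patches and stores.
 * Returns 1 if taskB's write to my_array[1] is still there at the end. */
int taskB_update_survives(void)
{
    uint32_t a, b;

    memset(my_array, 0, sizeof my_array);

    a = load_word();               /* taskA: load, then gets preempted */

    b = load_word();               /* taskB: complete read-modify-write */
    b = patch_byte(b, 1, 0xBB);
    store_word(b);

    a = patch_byte(a, 0, 0xAA);    /* taskA resumes with a stale copy */
    store_word(a);                 /* overwrites my_array[1] with old data */

    return my_array[1] == 0xBB;
}
```

In this replay taskB's update is lost, because taskA writes back the stale byte it loaded earlier. With real byte stores (my_array[0] = 0xAA in one task, my_array[1] = 0xBB in the other) neither task ever writes the other's byte, so the interleaving does not matter.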

What about members of a packed struct?

typedef struct __attribute__((packed)) {
    char c; // 1 byte
    int i;  // 4 bytes
} my_struct_t;

my_struct_t arr[2];

int *a = &(arr[0].i);
int *b = &(arr[1].i);

In this case the i members are not aligned in memory. The following could be an in-memory representation of arr:

  --- arr[0].c        --- arr[1].c 
 |                   | 
\ /  arr[0].i       \ /  arr[1].i
|-| |-------------| |-| |-------------|
 _   _   _   _   _   _   _   _   _   _
 0   1   2   3   4   5   6   7   8   9

a points to address 0x01 and b to address 0x06. In this case, to access *a with word accesses, the processor would have to access the memory blocks [0x00 - 0x03] and [0x04 - 0x07], ending up touching arr[0].c, arr[1].c and part of arr[1].i as well.
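
The offsets above can be double-checked with a small sketch. It assumes GCC or Clang (for __attribute__((packed))), a 4-byte int, and 4-byte memory words; the helper names offset_of_i, first_word and last_word are made up for illustration.

```c
#include <stddef.h>

typedef struct __attribute__((packed)) {
    char c; /* 1 byte */
    int i;  /* 4 bytes (assumed) */
} my_struct_t;

/* Byte offset of arr[index].i, measured from the start of arr. */
size_t offset_of_i(size_t index)
{
    return index * sizeof(my_struct_t) + offsetof(my_struct_t, i);
}

/* Index of the first 4-byte word touched by an int stored at byte_offset. */
size_t first_word(size_t byte_offset)
{
    return byte_offset / 4u;
}

/* Index of the last 4-byte word touched by an int stored at byte_offset. */
size_t last_word(size_t byte_offset)
{
    return (byte_offset + sizeof(int) - 1u) / 4u;
}
```

Under those assumptions, arr[0].i starts at offset 1 and spans words 0 and 1, while arr[1].i starts at offset 6 and spans words 1 and 2. Word 1, i.e. [0x04 - 0x07], is the one that both accesses would share if they were done word by word.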

So if in this case we had taskA writing to *a and taskB writing to *b, would that be thread-safe?

For a “packed” struct, it depends on how the compiler generates the code (particularly the writes). If it uses byte writes for the 4 bytes of the packed int, which is the most likely case, then there is no problem. If it makes two word-size writes and fills the rest of each word with the old data it loaded, it will have problems.

The key is that byte access IS available on most processors, so for misaligned accesses the simplest approach is 4 separate byte reads or writes of the data at +1, +2, +3, and +4 (thus not touching the other bytes). This also gets around the problem that word-only accesses would need different processing for b than for a, as they are at different offsets within the word.

Word-access-only machines did have to do this to “fake” byte access, but most modern machines have byte access instructions where the processor handles these issues, and may even recognize multiple writes done the “right” way and merge them together.
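
As an illustration of the byte-by-byte approach, here is a minimal sketch that writes the packed int using four explicit byte stores. It assumes GCC or Clang for the packed attribute; write_i_bytewise and read_i are hypothetical helpers, not something the compiler or FreeRTOS provides, and the volatile qualifier is only there to discourage the compiler from merging the stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct __attribute__((packed)) {
    char c;
    int32_t i;
} my_struct_t;

/* Hypothetical helper: store value into s->i one byte at a time. */
void write_i_bytewise(my_struct_t *s, int32_t value)
{
    volatile unsigned char *dst =
        (volatile unsigned char *)s + offsetof(my_struct_t, i);
    unsigned char src[sizeof value];
    size_t k;

    memcpy(src, &value, sizeof value);
    for (k = 0; k < sizeof value; k++) {
        dst[k] = src[k]; /* byte store: bytes outside i are never touched */
    }
}

/* Hypothetical helper: read s->i back through memcpy instead of an int pointer. */
int32_t read_i(const my_struct_t *s)
{
    int32_t value;
    memcpy(&value, (const unsigned char *)s + offsetof(my_struct_t, i),
           sizeof value);
    return value;
}
```

Note that this only keeps the neighbouring members untouched. The 4-byte value itself is not written atomically, so a task reading the same member at the wrong moment could still see a half-updated int.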

If you are sure that different tasks will always access different indexes of the array, why not define them separately? Something like the following:

my_struct_t core0Def;
my_struct_t core1Def;

When I saw your first question, I thought: stay away from there. Maybe you get it working, but it is fragile: things may go wrong when the optimisation level changes, or when the compiler (or its flags) gets upgraded.

FreeRTOS+TCP uses packed structs to represent IP-packets. I liked it because structs are easy while debugging code: you can see all member values together in a nice table.

It must be said that in most cases the packed attribute (“pack_struct_start.h”) was not necessary, because even without packing the fields would be nicely aligned.

The only notable exception is:

    #include "pack_struct_start.h"
    struct xARP_HEADER
    {
        // 32-bit aligned
        MACAddress_t xSenderHardwareAddress;  // 6 bytes, aligned
        uint8_t ucSenderProtocolAddress[ 4 ]; // 4 bytes, unaligned
        MACAddress_t xTargetHardwareAddress;  // 6 bytes, aligned
        uint32_t ulTargetProtocolAddress;     // 4 bytes, aligned
    }
    #include "pack_struct_end.h"
    typedef struct xARP_HEADER ARPHeader_t;

A MAC address is 6 bytes long, and therefore the next field is only 2-byte aligned. We turned uint32_t uxSenderProtocolAddress into an array of 4 bytes to avoid alignment errors.

FreeRTOS+TCP uses safe methods to access variables bitwise, for instance in BitConfig.h. BitConfig is used by the DHCPv6 client. It parses arrays of bytes, reading them as either char, int16 or int32. Here is one example:

    vBitConfig_write_16( &( xMessage ), pxDHCPMessage->xClientID.usHardwareType );
    vBitConfig_write_32( &( xMessage ), pxDHCPMessage->ulTimeStamp );
    vBitConfig_write_uc( &( xMessage ), pxEndPoint->xMACAddress.ucBytes, ipMAC_ADDRESS_LENGTH_BYTES );

When using this method, you don’t have to worry about alignment, because all data objects are handled as character arrays.
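
To show the general idea (this is not the actual BitConfig implementation), here is a minimal sketch of a cursor that appends 16-bit and 32-bit values to a byte buffer in network (big-endian) byte order using only byte stores. ByteWriter_t, byte_writer_put16 and byte_writer_put32 are names invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cursor over a byte buffer; not the real BitConfig_t. */
typedef struct {
    uint8_t *pucData; /* start of the buffer */
    size_t uxIndex;   /* next byte to write */
    size_t uxSize;    /* total buffer size in bytes */
} ByteWriter_t;

/* Append a 16-bit value, most significant byte first. Returns 0 if it does not fit. */
int byte_writer_put16(ByteWriter_t *w, uint16_t v)
{
    if (w->uxSize - w->uxIndex < 2u) {
        return 0;
    }
    w->pucData[w->uxIndex++] = (uint8_t)(v >> 8);
    w->pucData[w->uxIndex++] = (uint8_t)(v & 0xFFu);
    return 1;
}

/* Append a 32-bit value, most significant byte first. Returns 0 if it does not fit. */
int byte_writer_put32(ByteWriter_t *w, uint32_t v)
{
    if (w->uxSize - w->uxIndex < 4u) {
        return 0;
    }
    w->pucData[w->uxIndex++] = (uint8_t)((v >> 24) & 0xFFu);
    w->pucData[w->uxIndex++] = (uint8_t)((v >> 16) & 0xFFu);
    w->pucData[w->uxIndex++] = (uint8_t)((v >> 8) & 0xFFu);
    w->pucData[w->uxIndex++] = (uint8_t)(v & 0xFFu);
    return 1;
}
```

Because every store goes through a uint8_t pointer, the source code never asks for a 16-bit or 32-bit access at an odd address, whatever the current value of uxIndex.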

No worries? Hm, be alert when changing the optimisation level. I have seen that GCC would replace the following (big endian) code:

    pucData[ 0 ] = ( uint8_t ) ( ( ulValue >> 24 ) & 0xFFU );
    pucData[ 1 ] = ( uint8_t ) ( ( ulValue >> 16 ) & 0xFFU );
    pucData[ 2 ] = ( uint8_t ) ( ( ulValue >> 8 ) & 0xFFU );
    pucData[ 3 ] = ( uint8_t ) ( ulValue & 0xFFU );

with either:

    ( ( uint16_t *)pucData )[ 0 ] = ( uint16_t ) ( ( ulValue >> 16 ) & 0xFFFFU );
    ( ( uint16_t *)pucData )[ 1 ] = ( uint16_t ) ( ulValue & 0xFFFFU );

or with:

    *( ( uint32_t *)pucData ) = ulValue;

Both caused a fatal exception.

EDIT

The following function from a gzip library was broken by -Os (optimisation for size):

#include <stdint.h>
#include <string.h>

uint32_t LG(uint8_t *ptr)
{
    uint32_t result;
    /* memcpy() makes no alignment assumption about ptr */
    memcpy(&result, ptr, 4);
/*
    // Broken version:
    uint32_t result;
    uint8_t *target = (uint8_t *)&result;
    *(target + 0) = *(ptr + 0);
    *(target + 1) = *(ptr + 1);
    *(target + 2) = *(ptr + 2);
    *(target + 3) = *(ptr + 3);
*/
    return result;
}

I switched to memcpy(), hoping that it is aware of alignment.

My question was mostly out of curiosity, but I guess there may be cases where a developer is forced to use an array this way.

@htibosch
Thanks for the further clarification. I was aware of the issue you mentioned, but my question was about thread safety, and from what I understand, different threads can access different members of a (packed) struct without any problems.