FreeRTOS+FAT FIFO

zetter wrote on Friday, September 28, 2018:

Hi,

we’re using FreeRTOS+FAT for storing sensor data. One FreeRTOS task is collecting data and writing it to a file, while another task is reading from it and uploading the data to a server. The data is chunked, and we’re currently writing one line in the file for each chunk. We keep a separate read pointer into the file so we can switch between read and write mode for the two processes (they often run in parallel for “live streaming”). We’re basically using the file as a large FIFO with a binary semaphore.

The above solution works well, but eventually the file will become too large, as no data is ever removed. Also, if the MCU restarts for some reason (power outage, WDT reset, etc.), it will re-upload all data ever collected. We could of course empty the file each time the read pointer reaches EOF, but that would not solve the problem where the MCU restarts when only half the data has been uploaded; everything would then be re-uploaded.

So, we need some way of removing data that has already been uploaded. The ideal solution would be to “truncate” with an offset: if the file is N bytes large and we upload the first X bytes, it would be perfect to truncate it to (N-X) bytes, dropping the first X bytes. I don’t see any way this is possible with the current API.

Another solution would be to somehow prepend data to the file, and then use the regular ff_truncate function to remove uploaded data from the end.

We initially tried using one file per data chunk, but the FS became extremely slow after a few thousand files.

Does anyone have a good solution to this problem? Maybe there’s a clever/obvious solution I haven’t thought of.

Best regards,
Fredrik

heinbali01 wrote on Friday, September 28, 2018:

Hi Fredrik,

what medium are you using for FreeRTOS+FAT? Is it an SD-card?

As for sensor data that is collected at a constant speed, I would not really think of using FAT. I’d rather use some SPI/NAND memory chip.

But if you insist on using FAT, I would propose creating a big fixed-size binary file with a 512-byte header that stores a read and a write pointer ( among other things ).
When a data record is written, the write pointer advances. When a pointer reaches the maximum size of the file, it wraps back to sector 1 at offset 512 ( sector 0 is reserved for storing the metadata ).

When the file is not present at start-up, create it and write zeros until it is completely filled. After that, you will find that access is very fast, simply because all sectors have been pre-allocated and no FAT changes are needed.
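
For what it’s worth, here is a minimal sketch of such a header and the pre-allocation step. All names and sizes below are assumptions for illustration, not part of the FreeRTOS+FAT API:

/* Sketch only: a 512-byte header for a fixed-size log file used as a
 * circular buffer.  All names and sizes are assumptions. */
#include <stdint.h>
#include "ff_stdio.h"

#define LOG_HEADER_SIZE   512UL                  /* sector 0 holds metadata only */
#define LOG_DATA_SIZE     ( 1024UL * 1024UL )    /* fixed data area, for example 1 MB */

typedef struct
{
    uint32_t ulReadOffset;   /* next byte to upload, relative to the data area */
    uint32_t ulWriteOffset;  /* next byte to write, relative to the data area  */
    uint8_t  ucPadding[ LOG_HEADER_SIZE - 2 * sizeof( uint32_t ) ];
} LogHeader_t;

/* Create the file once, pre-allocating every sector so that later writes
 * never have to extend the FAT chain or the directory entry. */
static FF_FILE *prvCreateLogFile( const char *pcName )
{
    FF_FILE *pxFile = ff_fopen( pcName, "w+" );

    if( pxFile != NULL )
    {
        uint8_t ucZeros[ 512 ] = { 0 };
        LogHeader_t xHeader = { 0 };    /* both offsets start at zero */
        uint32_t ulFilled;

        /* Pre-allocate every sector so later writes never grow the file. */
        for( ulFilled = 0; ulFilled < LOG_HEADER_SIZE + LOG_DATA_SIZE; ulFilled += sizeof( ucZeros ) )
        {
            ff_fwrite( ucZeros, sizeof( ucZeros ), 1, pxFile );
        }

        /* Write an initial header with both pointers at the start of the data area. */
        ff_fseek( pxFile, 0, FF_SEEK_SET );
        ff_fwrite( &xHeader, sizeof( xHeader ), 1, pxFile );
    }

    return pxFile;
}

The read and write offsets stored in the header would wrap back to zero when they reach LOG_DATA_SIZE, turning the data area into the circular buffer described above.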

we’re currently writing one line in the file for each chunk

It would be handy if you can give the logging records a fixed size. If not, you’d have to look for tokens like LF/CR, which makes it slower.

We’re basically using the file as a large FIFO with a binary semaphore

In the solution that I’m proposing, the file will not become a FIFO, but a circular buffer :slight_smile:

Note that FreeRTOS+FAT already has a semaphore to protect against concurrent access. But… as long as you have a file opened with WRITE access, opening it again for READ access will not be allowed.
You can, however, both read and write when using the same file handle ( while protecting the handle with a semaphore :slight_smile: ).

Data written to a file will be flushed to the medium immediately when you call the +FAT flush routine FF_FlushCache().
The directory entry of a file with a growing length is different: it is not updated by FF_FlushCache(). Flushing data in +FAT means emptying and freeing the sector cache buffers; it does not mean that the directory entry is updated and flushed to disk. That only happens when the output file is closed.
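
A minimal sketch of the shared-handle idea, with assumed names for the mutex, file and function; only FF_FlushCache() and the handle’s pxIOManager member come from +FAT itself:

/* Sketch only: one shared FF_FILE handle used by both the writer and the
 * reader task, protected by a FreeRTOS mutex.  Names are illustrative. */
#include "FreeRTOS.h"
#include "semphr.h"
#include "ff_stdio.h"

static FF_FILE *pxLogFile;            /* opened once, e.g. with ff_fopen( "/log.bin", "a+" ) */
static SemaphoreHandle_t xFileMutex;  /* created with xSemaphoreCreateMutex() */

static void prvWriteChunk( const void *pvData, size_t xLength )
{
    if( xSemaphoreTake( xFileMutex, portMAX_DELAY ) == pdTRUE )
    {
        ff_fwrite( pvData, xLength, 1, pxLogFile );

        /* Push the cached sectors to the medium.  Note: this does NOT
         * update the directory entry ( file length ); that only happens
         * when the file is closed. */
        FF_FlushCache( pxLogFile->pxIOManager );

        xSemaphoreGive( xFileMutex );
    }
}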

richard_damon wrote on Friday, September 28, 2018:

“Files” do not support operations like inserting or deleting at the beginning (or middle), because that would effectively require reading in the whole file, modifying it, and writing it back out, due to the block nature of the file system. Hein has pointed out one way of using the file as a circular buffer. Another option would be to roll over automatically from one file to the next when a certain size is reached, and then delete the old files once they have been fully uploaded (the read and write positions may then be in different files at times).
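
A minimal sketch of that roll-over idea, using only ff_stdio calls; the naming scheme and size limit are assumptions:

/* Sketch only: roll the log over to a new file once it reaches a size
 * limit, so fully-uploaded files can simply be deleted. */
#include <stdio.h>
#include "ff_stdio.h"

#define ROLL_LIMIT_BYTES    ( 512UL * 1024UL )   /* roll after 512 KB, for example */

static FF_FILE *pxWriteFile;
static unsigned uxFileIndex;

static void prvMaybeRollFile( void )
{
    if( ( pxWriteFile != NULL ) && ( ff_ftell( pxWriteFile ) >= ( long ) ROLL_LIMIT_BYTES ) )
    {
        char pcName[ 32 ];

        ff_fclose( pxWriteFile );                    /* flushes and updates the directory entry */
        uxFileIndex++;
        snprintf( pcName, sizeof( pcName ), "/log_%05u.bin", uxFileIndex );
        pxWriteFile = ff_fopen( pcName, "a" );       /* start the next file */
    }
}

/* Once the uploader has sent and verified a whole file: */
static void prvDeleteUploadedFile( const char *pcName )
{
    ff_remove( pcName );
}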

heinbali01 wrote on Saturday, September 29, 2018:

Richard’s idea is also good, and it allows you to open one file in read mode and another file in write mode independently.
Just one remark: when there is a power failure or an exception, the length of the file being written won’t get updated in the directory entry. That only happens when the write handle is closed with FF_Close().
If it is a new file, its length will appear as zero until it is closed.
When closing a file, all sector buffers are flushed to disk.

A file can be opened in update mode ( allowing read & write ) with, for instance, FF_Open(pcName, "+").
With "a+", the file will be created if it does not exist yet.
Please see FF_GetModeBits() in ff_file.c.
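
Here is a minimal sketch of using one handle for both reading and writing; it assumes the handle was opened with update access as described above, and the record size and function name are illustrative only:

/* Sketch only: one shared handle in update mode, with the read and write
 * positions tracked by the application. */
#include "ff_stdio.h"

#define RECORD_SIZE    64

static void prvReadOneRecord( FF_FILE *pxFile, long lReadOffset, void *pvRecord )
{
    /* Remember where the writer left off. */
    long lWriteOffset = ff_ftell( pxFile );

    /* Jump to the read position and fetch one record... */
    ff_fseek( pxFile, lReadOffset, FF_SEEK_SET );
    ff_fread( pvRecord, RECORD_SIZE, 1, pxFile );

    /* ...then restore the write position before the next write. */
    ff_fseek( pxFile, lWriteOffset, FF_SEEK_SET );
}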

zetter wrote on Monday, October 01, 2018:

Hein and Richard,

thank you for your suggestions.

Hein, yes, we’re currently using an SD-card, but will switch to an eMMC as soon as our new board arrives in about two weeks. We’ve used SD-cards for a few years, but wanted something soldered to the board for this generation of boards. eMMC seemed like a nice option. We’ll see how it works out.

Hein, the solution you’re proposing is the exact one we’ve been using for the last few years. We’ve used raw access to a SD-card without a FS, storing read/write pointers that wrap to create a circular buffer. It has worked very well, but we figured there might be a simpler setup now that we have a FS and all :slight_smile: Also, we try to use open source/well-tested code as much as possible instead of reinventing the wheel over and over as we have done in the past :slight_smile:

We don’t really need to use FAT to store the data; it just seems convenient for debugging purposes. In the past, when something went wrong with the storage, we had to upload the whole storage and go through the binary format. We figured a file would be more convenient, as we’ve added a small web server to the code so that we can view the FS and the files on it.

But maybe going back to that approach, with the circular buffer inside a file, is a good solution.

So, regarding the fixed size file. Is the only added benefit write speeds? Or would a crash with a file that grows mean that data is lost even though we flush?

Best regards,
Fredrik

heinbali01 wrote on Monday, October 01, 2018:

Hi Fredrik,

We’ve used SD-cards for a few years, but wanted something soldered to the board

A wise decision. The contacts of an SD-card are vulnerable to corrosion, spiders, and oily substances.

We’ve used raw access to a SD-card without a FS storing read/write pointers
that wraps to create a circular buffer

Well: it is useful to have a file system. For instance it is nice to store firmware updates ( HEX files ) and create ( human readable ) configuration files or web pages. Or statistics, and debugging output :slight_smile:

The access speed should be about the same if you compare raw access with a fixed-size binary file.

So, regarding the fixed size file. Is the only added benefit write speeds?

When the file has a fixed size, the directory entry never has to change again.
It will also be considerably faster than a growing file, because ff_fwrite() doesn’t have to allocate new sectors. Especially if your information blocks are sector-aligned ( a multiple of 512 bytes ), it will be the fastest solution.

Or would a crash with a file that grows mean that data is lost even though we flush?

When you call FF_FlushCache(), all sector buffers will be written to disk, so your data has been stored. But that does not update directory entries.

When you want to use a growing file: just call FF_Close(), and re-open it in append mode the next time you want to write. FF_Close() will also call FF_FlushCache() when it is ready.
Here is another thread about this subject.
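
A minimal sketch of that close-after-every-chunk pattern, with an assumed file name and function name:

/* Sketch only: open in append mode, write one complete chunk, and close
 * straight away so the directory entry ( file length ) is brought up to
 * date on the medium. */
#include "ff_stdio.h"

static int prvAppendChunk( const void *pvChunk, size_t xLength )
{
    int iResult = -1;
    FF_FILE *pxFile = ff_fopen( "/sensor.log", "a" );   /* created if it does not exist */

    if( pxFile != NULL )
    {
        if( ff_fwrite( pvChunk, xLength, 1, pxFile ) == 1 )
        {
            iResult = 0;
        }

        /* ff_fclose() flushes the cache and updates the directory entry. */
        ff_fclose( pxFile );
    }

    return iResult;
}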

zetter wrote on Monday, October 01, 2018:

Thanks again Hein. We’ll go with a growing file and close/open after each write of a complete chunk (this will happen at most once per second during intense periods). Once the file has grown to over a few GB, we’ll try to delete it and start over, after asserting that all data has been uploaded etc.

The next challenge will be how to store the information on what data has been uploaded and acked (with CRCs to the server). That read pointer is currently only held in memory, which makes the system vulnerable to resets (WDT, power outage etc.). The obvious solution would be to write it to the file (at the beginning, as you suggest) or to a separate file. The problem I see is that it will be written quite often (sometimes a few thousand times/hour), which will wear out the flash sectors where it is stored.

Best regards,
Fredrik

richarddamon wrote on Monday, October 01, 2018:

One advantage of rolling the file over is that when the data in one file has been uploaded, that file can be deleted so you don’t try to upload it again. That means you will re-upload at most one file’s worth of data. That might argue for making the roll-over file size smaller.

zetter wrote on Tuesday, October 02, 2018:

Richard, true, thanks.

This old discussion is now of great interest to me. I’m seeing some interesting phenomena. Here is the scenario: I’m logging data to an SD card. For each day, there are two files containing very different records. Type 1 records are small, 38 bytes, written once per second. Type 2 records are large, 20,006 bytes, written once per minute. For each update, I open a file for append, write one record, and then close.

What I’m finding is that the updates get slower and slower over the 24-hour period. The problem seems to be much worse with the smallest card I have: 8 GB. When I’m running a double-speed test, I eventually get a failure:

FF_GetBuffer[0x124]: failed mode 0x1

Here is some speculation, backed by some analysis:

  • This pattern of updates is causing the files to become interleaved, resulting in fragmentation.
  • Normally, for a solid-state drive, fragmentation is not considered a problem. However, in this case, when I open the file for append, +FAT has to traverse a long chain to find the end of the file.
  • [Pure speculation:] Traversing all the fragments uses a lot of buffer space, and eventually it gets exhausted, leading to the FF_GetBuffer failure.
  • The problem is worse with the 8 GB card because it uses 4k clusters, whereas the larger cards use 32k. Maybe the smaller clusters allow more fragmentation.

So, I’m thinking about what I can do about this. Here are some ideas I’ve come up with:

  • Would keeping the files open help? I mean, at least +FAT shouldn’t have to follow a chain to find the end of the file when I make an update; the “cursor” is already there. I wish there were a way to cause the directory entry to be updated and flushed to disk without closing the file.
  • It seems unlikely that putting the files in separate directories would help, but I don’t know that.
  • I’m thinking putting the files in separate partitions would probably result in nice, contiguous files even if I’m still opening and closing. Of course, it is more complexity.

Thoughts?

Edit: According to this tool: https://www.partitionwizard.com/help/align-partition.html the partition on my 8 GB disk is misaligned (and it won’t fix it unless I pay). Maybe that is at least part of the problem? Could this have anything to do with the parameters passed to prvPartitionAndFormatDisk?

Indeed it helps to make the cluster size as big as possible: less access to the FAT.

In projects where an SD-card is being used, I always let the FreeRTOS application partition and format the disk. That is part of the installation. It makes sure that big clusters will be used.

What also helps performance is to read and write in blocks that are a multiple of 512 bytes.
So if a data block is 20,006 bytes, I would, if possible, allocate 20,480 bytes in the file for a single data block. Writing 20,480 aligned bytes happens within a single command; writing 20,006 bytes might typically need 3 writes.

38 bytes is very small and inefficient for disk access. You can put 26 small data blocks in 1 KB (2 sectors), and writing 2 sectors is very quick.
26 * 38 = 988 bytes, so you would lose 1024 - 988 = 36 bytes in every KB.

I had a similar project: constant logging to a file (dB measurements). I asked my customer: in case of an unexpected power-cut, how much data may be lost? The measurements of the last 10 minutes? The last hour?
Can you answer that question for your project? I hope that the answer is not: the last second!

Now if, for example, at most 15 minutes of logging may be lost, you could write to disk 4 times an hour, provided that you have enough RAM to store the data in the meantime. I would only write multiples of sectors and keep the remaining data for later.
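
As an illustration of the “only write multiples of sectors” advice, here is a minimal sketch, assuming 38-byte records; the buffer size and function name are not from +FAT, just example application code:

/* Sketch only: collect the small records in RAM and write whole 512-byte
 * sectors, keeping any incomplete tail for the next call. */
#include <stdint.h>
#include <string.h>
#include "ff_stdio.h"

#define SECTOR_SIZE     512U
#define RECORD_SIZE     38U

static uint8_t ucStaging[ 2U * SECTOR_SIZE ];
static size_t xStaged = 0;

static void prvAddRecord( FF_FILE *pxFile, const uint8_t *pucRecord )
{
    size_t xWholeSectors;

    memcpy( &ucStaging[ xStaged ], pucRecord, RECORD_SIZE );
    xStaged += RECORD_SIZE;

    /* Write out only the whole sectors collected so far. */
    xWholeSectors = ( xStaged / SECTOR_SIZE ) * SECTOR_SIZE;

    if( xWholeSectors > 0U )
    {
        ff_fwrite( ucStaging, xWholeSectors, 1, pxFile );

        /* Keep the incomplete tail for the next call. */
        memmove( ucStaging, &ucStaging[ xWholeSectors ], xStaged - xWholeSectors );
        xStaged -= xWholeSectors;
    }
}

A real implementation would also flush the remaining tail on a timer, so that at most the chosen amount of history can be lost.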

Keep files open: the directory might not be updated ( with the latest file size ), until you close the file.

If opening a file in append mode is slow, I would look at the +FAT caching ( see xParameters.ulMemorySize ). Does it help to increase the cache that you give to the I/O manager?

About alignment: bigger SD-cards seem to have a block of 4MB ( at an offset of 4MB ) that is optimised to contain a FAT. A FAT has frequent access at random locations, whereas the data area has infrequent linear access.
When FreeRTOS+FAT partitions a card, it will try to put the FAT in that optimised area ( see MX_LBA_TO_MOVE_FAT ).

There is an interesting article about formatting SD cards and speed.

I assume that the start address of the data clusters must be a multiple of the cluster size. And I think that FreeRTOS+FAT follows that rule.

Make sure that FAT directories do not become too long. The FAT file system is very inefficient when handling long directories. It is much better to make directories per year, month, week and/or per day, so that each directory remains small ( but that is not the problem that you are reporting ).

For each update, I open a file for append, write one record, and then close.

I wonder how it would work if you created the entire file in advance at 0:00 am, giving it the full length that it needs. Then, for every data record, you just write into the file at the right offset. Would that be a lot faster?
In that case, the directory doesn’t need updating, so you can keep the file open all day, provided that FF_SDDiskFlush() is called after writing.
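
A minimal sketch of writing into such a pre-allocated file and flushing without closing it; the slot layout and names are assumptions, and the file is assumed to be opened with update access ( e.g. "r+" ) rather than append:

/* Sketch only: a day file that is pre-allocated once, then updated in
 * place and flushed without closing. */
#include <stdint.h>
#include <stddef.h>
#include "ff_stdio.h"
#include "ff_sddisk.h"

#define SLOT_SIZE    512UL   /* one pre-allocated, sector-aligned slot per record */

static void prvWriteSlot( FF_Disk_t *pxDisk, FF_FILE *pxDayFile,
                          uint32_t ulSlotIndex, const void *pvRecord, size_t xLength )
{
    /* Overwrite the pre-allocated slot; no new clusters are needed,
     * so the directory entry never changes. */
    ff_fseek( pxDayFile, ( long ) ( ulSlotIndex * SLOT_SIZE ), FF_SEEK_SET );
    ff_fwrite( pvRecord, xLength, 1, pxDayFile );

    /* Push the cached sectors out to the card. */
    FF_SDDiskFlush( pxDisk );
}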

I hope that any of this will help in finding a solution. Please tell how it goes.
Thanks.

Keeping the files open is incredibly, hugely more efficient. Barely any “disk” utilization. What are the consequences of not updating the directory with the latest file size?

Suppose there is a crash of some kind (e.g., power failure, whatever) while a file is open. Will a subsequent ff_fopen(name, "a") (open file for append) start where the previous write left off, or will it start at the file size from the directory?

Thanks,
Carl

Suppose there is a crash of some kind (e.g.,
power failure, whatever) while a file is open.

The latest clusters will have been allocated in the FAT and the data will have been written, but the directory information, i.e. the file size, may lag behind.
When reopening the file later on, +FAT will trust the file-size information in the directory entry.

My idea of keeping the file constantly open would require pre-allocating a file at 0:00 am. After allocating the file, close and re-open the handle. Then you only need to remember the actual length of the data.

One way to find the head of the data is to read the file and look for non-zero data. Or you could write an index into the first sector of the file. If you are the only one reading the file, that could be a possibility.
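
A minimal sketch of the “look for non-zero data” idea, assuming a fixed-slot layout as discussed earlier; the slot size and names are illustrative:

/* Sketch only: after a restart, find the first still-empty ( all-zero )
 * slot in the pre-allocated file to recover the write position. */
#include <stdint.h>
#include <string.h>
#include "ff_stdio.h"

#define SLOT_SIZE   512UL

static uint32_t prvFindFirstEmptySlot( FF_FILE *pxFile, uint32_t ulSlotCount )
{
    uint8_t ucSlot[ SLOT_SIZE ];
    static const uint8_t ucZeros[ SLOT_SIZE ] = { 0 };
    uint32_t ulIndex;

    for( ulIndex = 0; ulIndex < ulSlotCount; ulIndex++ )
    {
        ff_fseek( pxFile, ( long ) ( ulIndex * SLOT_SIZE ), FF_SEEK_SET );

        if( ff_fread( ucSlot, SLOT_SIZE, 1, pxFile ) != 1 )
        {
            break;
        }

        if( memcmp( ucSlot, ucZeros, SLOT_SIZE ) == 0 )
        {
            break;   /* first all-zero slot: resume writing here */
        }
    }

    return ulIndex;
}

If a valid record can never be all zeros, a binary search over the slots would find the head much faster than this linear scan.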

Have you also played with increasing the +FAT cache?

Regards, H

After some digging in the code, I found ff_seteof(), which looks like it does about what I want: updates the directory information but keeps the file open. So, now I am calling that once every 15 minutes, which, I hope, means that I will lose at most that much history.* I need to do some testing. Certainly, the efficiency is better than it has to be.

* I like to make the analogy that our project is a “Flight Data Recorder” (i.e., “black box”) of sorts, but it’s not really that important, outside of my imagination.

Hmmm, maybe I misunderstand what ff_seteof() is supposed to do. Apparently, it only updates the directory entry if the file is being truncated to 0 length?

I decided to make my own function to update the file size in the directory:

#include "ff_headers.h"   /* FF_GetEntry(), FF_PutEntry(), FF_FILE internals */
#include "ff_stdio.h"     /* stdioSET_ERRNO() */

/* Bring the file size stored in the directory entry up to date with pxFile->ulFileSize. */
FF_Error_t FF_UpdateDirEnt( FF_FILE *pxFile )
{
FF_DirEnt_t xOriginalEntry;
FF_Error_t xError;

    /* Get the directory entry and update it to show the new file size. */
    xError = FF_GetEntry( pxFile->pxIOManager, pxFile->usDirEntry, pxFile->ulDirCluster, &xOriginalEntry );

    /* Now update the directory entry. */
    if( ( FF_isERR( xError ) == pdFALSE ) &&
        ( ( pxFile->ulFileSize != xOriginalEntry.ulFileSize ) || ( pxFile->ulFileSize == 0UL ) ) )
    {
        if( pxFile->ulFileSize == 0UL )
        {
            xOriginalEntry.ulObjectCluster = 0;
        }

        xOriginalEntry.ulFileSize = pxFile->ulFileSize;
        xError = FF_PutEntry( pxFile->pxIOManager, pxFile->usDirEntry, pxFile->ulDirCluster, &xOriginalEntry, NULL );
    }

    return xError;
}

int prvFFErrorToErrno( FF_Error_t xError ); /* In ff_stdio.c ( declared static there, so it may need exporting ). */

/* Returns 0 on success, -1 on failure, and stores the errno like the other ff_stdio functions. */
int ff_set_fsize( FF_FILE *pxStream )
{
FF_Error_t xResult;
int iReturn, ff_errno;

    xResult = FF_UpdateDirEnt( pxStream );

    ff_errno = prvFFErrorToErrno( xResult );

    if( ff_errno == 0 )
    {
        iReturn = 0;
    }
    else
    {
        iReturn = -1;
    }

    /* Store the errno to thread local storage. */
    stdioSET_ERRNO( ff_errno );

    return iReturn;
}

It seems to work OK. I ran this test:

void vSampleFunction( char *pcFileName )
{
#define N_ELEMENTS 311

FF_FILE *pxFile;
uint16_t usBuffer[ N_ELEMENTS ];
unsigned count = 0;

    /* Open the file specified by the pcFileName parameter. */
    pxFile = ff_fopen( pcFileName, "w" );
    configASSERT( pxFile );

    /* First write: N_ELEMENTS 16-bit values, 622 bytes in total. */
    for( size_t i = 0; i < N_ELEMENTS; ++i )
    {
        usBuffer[ i ] = count++;
    }
    size_t rc = ff_fwrite( usBuffer, sizeof( usBuffer ), 1, pxFile );
    configASSERT( 1 == rc );

    /* Bring the directory entry up to date without closing the file. */
    if( ff_set_fsize( pxFile ) != 0 )
    {
        /* The directory update failed. */
        int error = stdioGET_ERRNO();
        printf( "ff_set_fsize error: %s (%d)\n", strerror( error ), error );
    }

    /* Second write: another 622 bytes, made after the directory update. */
    for( size_t i = 0; i < N_ELEMENTS; ++i )
    {
        usBuffer[ i ] = count++;
    }
    rc = ff_fwrite( usBuffer, sizeof( usBuffer ), 1, pxFile );
    configASSERT( 1 == rc );

    /* The file is deliberately left open, so the directory listing below
     * only reflects the size recorded by ff_set_fsize(). */
}

and got this in the directory:

dir
test.dat [writable file] [size=622]

which makes sense; the first write is accounted for, but not the second. Windows CHKDSK is happy with it.
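
For completeness, a minimal sketch of how the helper above might be called periodically from a logging task; the task structure and the 15-minute interval are assumptions based on the earlier post:

/* Sketch only: periodically update the directory entry of a file that
 * stays open all day. */
#include "FreeRTOS.h"
#include "task.h"
#include "ff_stdio.h"

extern int ff_set_fsize( FF_FILE *pxStream );   /* helper defined earlier in this thread */

void vLoggingTask( void *pvParameters )
{
    FF_FILE *pxFile = ( FF_FILE * ) pvParameters;     /* already opened for writing */
    TickType_t xLastSync = xTaskGetTickCount();

    for( ;; )
    {
        /* ... write the next record with ff_fwrite() ... */

        if( ( xTaskGetTickCount() - xLastSync ) >= pdMS_TO_TICKS( 15UL * 60UL * 1000UL ) )
        {
            ff_set_fsize( pxFile );                   /* update the directory entry in place */
            xLastSync = xTaskGetTickCount();
        }

        vTaskDelay( pdMS_TO_TICKS( 1000 ) );
    }
}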