FreeRTOS Plus FAT extremely slow (compared to FatFS)

dc42 · October 7, 2025, 8:09am

@htibosch I was referring to FatFs, not FreeRTOS+FAT.

carlk3 · October 7, 2025, 1:43pm

Also, you can go further and put a buffered Standard Input/Output (stdio) wrapper around the FreeRTOS-Plus-FAT Standard API. Then, the application would use fprintf instead of ff_fprintf, or fwrite instead of ff_fwrite, for example. This can give a 4X speedup for some applications. See:

P51D · October 15, 2025, 1:27pm

I’ve done some measurements focused on FreeRTOS+FAT.

All tests were made with a 12.5 MHz SPI clock and a line length of 100 bytes…

Reading 100 lines with a 512-byte chunk size

ff_fopen = 1.2ms
do-while ff_fread with own ‘\n’ search and line separation = 49.3ms
ff_fclose = 0.2ms

Reading 200 lines with a 512-byte chunk size

ff_fopen = 0.94ms
do-while ff_fread with own ‘\n’ search and line separation = 98.2ms
ff_fclose = 0.2ms

Reading 100 lines with a 2048-byte chunk size

ff_fopen = 1.2ms
do-while ff_fread with own ‘\n’ search and line separation = 44.3ms
ff_fclose = 0.2ms

Reading 200 lines with a 2048-byte chunk size

ff_fopen = 0.94ms
do-while ff_fread with own ‘\n’ search and line separation = 86.2ms
ff_fclose = 0.2ms

compared to reading the entire file, without line separation:

ff_fread = 12ms for 100 lines
ff_fread = 20ms for 200 lines

or using ff_fgets

100 lines = 152.4ms
200 lines = 304.7ms

So reading as much as possible and processing the data in RAM would be the fastest approach.
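The "read big chunks, split lines in RAM" approach can be sketched roughly as below. This is not P51D's code. It uses standard C fread() as a stand-in so that it runs on a host; on the target the read call would be swapped for ff_fread(). The names read_lines_chunked, line_cb and sum_len_cb, and the limits CHUNK_SIZE and MAX_LINE, are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK_SIZE 2048u  /* a multiple of the 512-byte sector size */
#define MAX_LINE   256u   /* assumption: longer lines are truncated */

/* Hypothetical callback: receives one line without its '\n'.
 * The line is NOT NUL-terminated; use len. */
typedef void (*line_cb)(const char *line, size_t len, void *ctx);

/* Example callback: adds up the lengths of all lines seen. */
static void sum_len_cb(const char *line, size_t len, void *ctx)
{
    (void)line;
    *(size_t *)ctx += len;
}

/* Reads fp in CHUNK_SIZE blocks and splits the data into lines in RAM.
 * Returns the number of lines delivered. Not reentrant (static buffer). */
static size_t read_lines_chunked(FILE *fp, line_cb cb, void *ctx)
{
    static char chunk[CHUNK_SIZE];
    char carry[MAX_LINE];   /* partial line left over from the previous chunk */
    size_t carry_len = 0, lines = 0, n;

    assert(fp != NULL && cb != NULL);

    while ((n = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        char *p = chunk;
        char *end = chunk + n;
        char *nl;

        while ((nl = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            size_t len = (size_t)(nl - p);
            if (carry_len > 0) {
                /* This line started in the previous chunk: join both parts. */
                if (len > MAX_LINE - carry_len)
                    len = MAX_LINE - carry_len;
                memcpy(carry + carry_len, p, len);
                cb(carry, carry_len + len, ctx);
                carry_len = 0;
            } else {
                cb(p, len, ctx);   /* line lies completely inside this chunk */
            }
            lines++;
            p = nl + 1;
        }

        /* Keep the unterminated tail for the next chunk. */
        size_t tail = (size_t)(end - p);
        if (tail > MAX_LINE - carry_len)
            tail = MAX_LINE - carry_len;
        memcpy(carry + carry_len, p, tail);
        carry_len += tail;
    }

    if (carry_len > 0) {   /* last line without a trailing '\n' */
        cb(carry, carry_len, ctx);
        lines++;
    }
    return lines;
}
```

Only lines that straddle a chunk boundary are copied; all other lines are handed to the callback directly from the chunk buffer.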

Summary

ff_fgets is >3x slower than own (quick&dirty) implementation. I’m surprised by this, and I think there is some space for improvements in the +FAT library! I migrated from the FatFS fgets to ff_fgets and struggled with the new performance, so I think they do something different.
Reading chunks larger than the sector size gives a further improvement of roughly 10-12% in the measurements above.
ff_fread shows that I get ~1 MB/s, which is 50% of what you, @carlk3, measured, but you had a 31.25 MHz SPI clock. So I think the theory fits.
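The arithmetic behind the ~1 MB/s figure can be checked with two small helpers. This is only a sketch: the helper names are made up, and it assumes that each line is exactly 100 bytes including the line ending and that the SPI bus moves one bit per clock with no command or protocol overhead.

```c
#include <assert.h>

/* Payload throughput in bytes per second, from a byte count and a
 * duration in milliseconds. */
static double throughput_bps(double bytes, double ms)
{
    assert(ms > 0.0);
    return bytes / (ms / 1000.0);
}

/* Raw upper bound of a 1-bit SPI bus in bytes per second:
 * one bit per clock, eight bits per byte, overhead ignored. */
static double spi_raw_bps(double clock_hz)
{
    return clock_hz / 8.0;
}
```

With the numbers above, throughput_bps(10000, 12) is about 0.83 MB/s and throughput_bps(20000, 20) is 1.0 MB/s, while spi_raw_bps(12.5e6) is about 1.56 MB/s. Under these assumptions the whole-file read uses roughly 53% to 64% of the raw bus bandwidth. A 31.25 MHz clock has a raw limit of about 3.9 MB/s, so a result near 2 MB/s there would be a similar fraction.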

htibosch · October 16, 2025, 3:51am

Thank you for these details.

@P51D wrote:

I think there is some space for improvements in the +FAT library!

Always! And we are grateful for the feedback.

Maybe we should put a #warning in the declaration of these byte-oriented read/write functions.

We could add a new read/write object that has its own i/o buffers.

ff_fgets is >3x slower than own (quick&dirty) implementation

As you may have seen, FF_GetLine() is much slower, because it is 100% flexible. You can seek to any position and call FF_GetLine() or FF_GetC() from there.

This is how it is implemented:

FF_GetLine() calls FF_GetC() for every single byte.
FF_GetC() calls FF_getMinorBlockEntry()
FF_GetC() calls FF_SetCluster()
Depending on FilePointer, calculate CurrentCluster
and traverse the FAT to find the right ulAddrCurrentCluster
FF_SetCluster() may call FF_TraverseFAT()
If it succeeds:
FF_ReadPartial() is called to finally read a single byte.

The functions FF_GetC() and ff_fputc() were added for “academic completeness”, but they’re too slow for a real-life application.

It is preferred to use ff_fread() and ff_fwrite() only, preferably with a multiple of 512 bytes.

In my projects, I often have a C++ object that puts a buffer between the application and the +FAT functions ff_fread() and ff_fwrite().
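A buffer layer of that kind could look roughly like the following for the write side. It is a plain C sketch in the same spirit, not htibosch's C++ object. The names wbuf_t, sink_fn, counting_sink and the 4 KB buffer size are assumptions made for the illustration. The sink is a function pointer so the code runs on a host; on the target it would wrap ff_fwrite().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512u
#define WBUF_SIZE   (8u * SECTOR_SIZE)  /* assumption: 4 KB, a multiple of 512 */

/* Hypothetical sink: writes len bytes and returns the number written. */
typedef size_t (*sink_fn)(const void *data, size_t len, void *ctx);

typedef struct {
    unsigned char data[WBUF_SIZE];
    size_t used;
    sink_fn sink;
    void *ctx;
} wbuf_t;

/* Example sink for host-side checks: only counts what it receives. */
typedef struct { size_t calls, bytes, unaligned_calls; } sink_stats_t;

static size_t counting_sink(const void *data, size_t len, void *ctx)
{
    sink_stats_t *st = ctx;
    (void)data;
    st->calls++;
    st->bytes += len;
    if (len % SECTOR_SIZE != 0u)
        st->unaligned_calls++;
    return len;
}

static void wbuf_init(wbuf_t *b, sink_fn sink, void *ctx)
{
    assert(b != NULL && sink != NULL);
    b->used = 0;
    b->sink = sink;
    b->ctx = ctx;
}

/* Appends len bytes. The sink is only called with a completely full
 * buffer, so every write it sees is a multiple of 512 bytes.
 * Returns 0 on success, -1 if the sink wrote fewer bytes than asked. */
static int wbuf_write(wbuf_t *b, const void *src, size_t len)
{
    const unsigned char *p = src;
    while (len > 0) {
        size_t room = WBUF_SIZE - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);
        b->used += n;
        p += n;
        len -= n;
        if (b->used == WBUF_SIZE) {
            if (b->sink(b->data, WBUF_SIZE, b->ctx) != WBUF_SIZE)
                return -1;
            b->used = 0;
        }
    }
    return 0;
}

/* Writes out the remainder. Call once before closing the file; this
 * final write is the only one that may not be a multiple of 512. */
static int wbuf_flush(wbuf_t *b)
{
    if (b->used > 0) {
        if (b->sink(b->data, b->used, b->ctx) != b->used)
            return -1;
        b->used = 0;
    }
    return 0;
}
```

Writing 100 lines of 100 bytes through this buffer produces two 4096-byte writes plus one final 1808-byte write on flush, instead of 100 small writes.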

carlk3 · December 4, 2025, 5:29pm

As I mentioned above, if you’re using a Standard C Library implementation like glibc or newlib you can easily put a buffered Standard Input/Output (stdio) wrapper around the FreeRTOS-Plus-FAT Standard API and get a vast speedup for operations like these.

Will_Robertson · February 3, 2026, 10:37am

Hi Carl @carlk3 , David @dc42 and Hein

Thank you very much - particularly for Carl’s performance tuning tips (sorry I can’t include a link to the performance tuning tips - I got an error message here when I included links so had to strip all the links out).

We’re working on writing the stream of data from the h.264 video encoder on the STM32N6 to a buffer then to SD storage for a new open source wildlife camera so this has been a big help to us.

At the moment we’re using FileX in ST’s venc_sdcard_ThreadX example but FreeRTOS+FAT would also be an option. We’re encountering long and very volatile write times from FileX (c. 100x slower on average than the STM32N6 and a V30 SD card hardware would be expected to achieve in theory). These look characteristic of what we’d expect from incomplete clusters (or even incomplete 512-byte sectors) being written to the SD card, but we’re finding it very difficult to debug how the file system is buffering the data and interacting with the HAL to write data to the SD. We suspect that at some layer something may be buffering inappropriately, and/or trying to write data one sector at a time instead of one cluster at a time, and/or not correctly handling the DAT0 low signal from the SD card (indicating when it can’t accept data), but we’re having difficulty diagnosing the exact causes of the problem.

We suspect that in the example code some of the performance could be improved by pre-allocating space for the file, enabling HardwareFlowControl and trying to use Transceiver communication with the SD card, but we’re not sure.

This is our conservation work that the open source wildlife camera is to support (Quick Summary: New Homes for Old Friends Switzerland – New Homes For Old Friends - a “An error occurred: Sorry, new users can’t put links in posts.“ error message forced me to remove the URL but a Google search should find it.)

These are the potential ideas we’re looking at at the moment to improve performance - I tried to include a link to them but was forced to strip that: SD_Card.md (Googling doesn’t find that but hopefully it should be possible to go through the directory structure on my github account)

We’re working on a fork of the ST venc_sdcard_ThreadX example.

I put together a spreadsheet with some of the very volatile write times we’re seeing, but the link to that had to be stripped as well to post this. We got write speeds ranging from 1 Megabyte per second down to 20 Kilobytes per second.

Any help or suggestions that anyone could give would be wonderful!

My background is as a climbing arborist and in server side software so I don’t have any experience of debugging the interaction between filesystems and the underlying HAL.

Will

aggarg · February 5, 2026, 6:35am

@Will_Robertson You should be able to post links now.

htibosch · February 13, 2026, 3:24pm

Interesting, the article about trees and animals. In an earlier project, FreeRTOS+FAT was used on the ocean, for filming fish under the ship.

When using FreeRTOS+FAT for mass storage, the following will be nothing new for you:

Make sure that every write contains a multiple of 512 bytes, and preferably 10 KB or more. You can do a simple experiment: fill a file in blocks of 1 KB, 10 KB or e.g. 100 KB.
Realise that if you always write data in blocks of “N x 512” bytes, the user pointer will be passed to the DMA. In other words, the driver is “zero-copy” whenever possible.
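The block-size experiment suggested above could be set up along these lines. This is a host-side sketch using standard C fwrite() and clock(), so the timings it gives on a PC say nothing about an SD card. The function name time_block_writes is made up. On the target, the write call would be replaced by ff_fwrite() and the timing by something like xTaskGetTickCount().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Writes total_bytes to fp in blocks of block_size bytes.
 * Returns the elapsed CPU time in seconds, or -1.0 on a write error.
 * The last block is shorter if total_bytes is not a multiple of
 * block_size. Not reentrant (static buffer). */
static double time_block_writes(FILE *fp, size_t block_size, size_t total_bytes)
{
    static unsigned char block[100u * 1024u];  /* largest block tested: 100 KB */

    assert(fp != NULL);
    assert(block_size > 0 && block_size <= sizeof block);
    memset(block, 0xA5, block_size);           /* arbitrary fill pattern */

    clock_t t0 = clock();
    size_t left = total_bytes;
    while (left > 0) {
        size_t n = left < block_size ? left : block_size;
        if (fwrite(block, 1, n, fp) != n)
            return -1.0;
        left -= n;
    }
    if (fflush(fp) != 0)
        return -1.0;
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling it three times with block sizes of 1024, 10 * 1024 and 100 * 1024 bytes and the same total size gives the three data points to compare.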

Yes please, share your spreadsheet with measurements. I think that you can attach ZIP files now.

file system is buffering the data and interacting with the HAL to write data to the SD

When following the above rules, there won’t be much buffering in the SD-card driver.

There is an old article about formatting SD-cards. We found that in some cases, the SD-card is considerably faster if it is formatted by FreeRTOS+FAT, while using large clusters.

carlk3 · February 13, 2026, 4:36pm

32 Bit (4 Byte) Buffer Alignment is probably crucial if you’re using SD mode and DMA. Depending on your hardware and driver, if a read or write buffer is not 4-byte aligned, it will have to be copied to one that is. That means writing or reading one block at a time, which is much slower than streaming multiple blocks. It’s easy to ensure your buffers are 4-byte aligned if you use buffered Standard Input/Output (stdio). (See stdio_buffering example).
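One way to get an aligned stdio buffer is to hand the stream a buffer of your own with setvbuf(). The sketch below uses only standard C11. It is not the stdio_buffering example mentioned above, and the names io_buf and use_aligned_buffer are made up. Whether the C library then passes this buffer to the driver in one large transfer depends on the library and on the retargeting layer, so treat that part as an assumption to verify on your platform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define IO_BUF_SIZE 4096u  /* assumption: a multiple of the 512-byte sector */

/* C11 alignment specifier: guarantees the 4-byte alignment discussed above.
 * The buffer is static because it must outlive the stream that uses it. */
static _Alignas(4) char io_buf[IO_BUF_SIZE];

/* Makes fp fully buffered through io_buf. Must be called after the stream
 * is opened and before any other operation on it. One buffer serves one
 * stream at a time. Returns 0 on success, non-zero on failure. */
static int use_aligned_buffer(FILE *fp)
{
    assert(fp != NULL);
    assert(((uintptr_t)io_buf % 4u) == 0u);
    return setvbuf(fp, io_buf, _IOFBF, sizeof io_buf);
}
```

After this call, fprintf() and fwrite() on that stream collect data in io_buf, and the library flushes it when the buffer fills or the stream is closed.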
