FreeRTOS+FAT example required for SD card using SPI interface

Acutetech · July 21, 2022, 6:28am

I would like to implement FAT with FreeRTOS for an SD card using the SPI interface. Can anyone point me to a working example?

The examples at Lab-Project-FreeRTOS-FAT seem to be for chips which have SDIO interfaces, so are not applicable.

The porting documentation at FreeRTOS_Plus_FAT/Creating_a_file_system_media_driver says "if the media is an SD card then it might be necessary to access the card through an SPI peripheral", which is exactly what I am looking for.

My hardware has a FreeRTOS port and a working FatFs (elm-chan) port that talks to an SD card through an SPI interface. But the FatFs low-level disk I/O module diskio.c does not look like it will map easily to FreeRTOS+FAT.

Someone must have cracked this. Help appreciated.

htibosch · July 26, 2022, 12:42am

Hello Charles, I’m sorry that your message stayed unanswered for five days.

If you have FatFs working you can use its driver for FreeRTOS+FAT as well. It is well described on this page.

It might be useful to look at an existing driver for comparison, for instance the STM32F4x driver.

For what platform are you developing?

Acutetech · July 26, 2022, 1:10am

Thanks Hein. I had seen the links you included. The STM32F4xx (and indeed the other reference implementations) support SD cards using a dedicated SDIO interface, rather than the SPI interface. So what is needed is a reference implementation that uses an SPI interface. My guess is that there must be plenty, but so far I have only been able to find one, which I am attempting to adapt now.

This design is for the Cypress CY8C6347BZI-BLD53 by Carl Kugler, available here:

That’s the good news. The bad news is that it is an all-singing, all-dancing implementation supporting multiple SD cards on multiple SPI ports, and re-entrancy as well.

I am using the Maxim MAX78000 which has SPI ports but not SDIO ports. There is a FAT SD card implementation available for this chip using an SPI interface (which I have working) but once again, it is not straight-forward to move from that to FreeRTOS+FAT.

Hence the request: are there any reference implementations using SPI bus?

htibosch · July 26, 2022, 4:01am

The STM32F4xx (and indeed the other reference implementations) support SD cards using a dedicated SDIO interface

Yes that is true, the current drivers use a dedicated interface.

I once developed an SD-card driver using SPI, and I still have bad dreams about that adventure.

Carl Kugler is really good, he is very experienced with SD/MMC cards and he knows all about FreeRTOS+FAT. He also helped with testing and reported on it.

are there any reference implementations using SPI bus?

I don’t know of any.

Can you share the library so I can look at how it is structured?

All your driver needs is:

Initialisation function
Read function
Write function

Each of the functions should be “blocking”: it should only return when all has been done.

The functions do not need to be “re-entrant” because they are protected with a mutex ( see ff_locking.c ).
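The three blocking entry points listed above can be sketched against a RAM-backed stand-in for the card. All names here (`media_init`, `media_read`, `media_write`, `SECTOR_SIZE`, `SECTOR_COUNT`) are hypothetical and are not part of FreeRTOS+FAT or of any driver mentioned in this thread. A real port would replace the `memcpy` calls with SPI transfers and would still return only once the transfer has completed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u  /* bytes per sector */
#define SECTOR_COUNT 16u   /* tiny hypothetical medium, for illustration only */

static uint8_t media_storage[SECTOR_COUNT * SECTOR_SIZE];
static int media_ready = 0;

/* Initialisation: bring the medium into a known state. Returns 0 on success. */
int media_init(void)
{
    memset(media_storage, 0, sizeof media_storage);
    media_ready = 1;
    return 0;
}

/* Range check shared by read and write. Returns 1 if the request fits. */
static int media_range_ok(uint32_t sector, uint32_t count)
{
    return media_ready && sector < SECTOR_COUNT && count <= SECTOR_COUNT - sector;
}

/* Blocking read: returns only after all requested sectors are in 'buffer'. */
int media_read(uint8_t *buffer, uint32_t sector, uint32_t count)
{
    assert(buffer != NULL);
    if (!media_range_ok(sector, count)) return -1;
    memcpy(buffer, &media_storage[(size_t)sector * SECTOR_SIZE],
           (size_t)count * SECTOR_SIZE);
    return 0;
}

/* Blocking write: returns only after all sectors have been stored. */
int media_write(const uint8_t *buffer, uint32_t sector, uint32_t count)
{
    assert(buffer != NULL);
    if (!media_range_ok(sector, count)) return -1;
    memcpy(&media_storage[(size_t)sector * SECTOR_SIZE], buffer,
           (size_t)count * SECTOR_SIZE);
    return 0;
}
```

There is no locking inside these functions, on the assumption stated above that the caller already serialises access.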

Can you try to attach the SPI-driver as it is now?

Acutetech · July 26, 2022, 4:40am

Hi Hein. I note your observation about the 3 functions that my driver needs, and I see how these are implemented in your reference implementations (with SDIO interfaces). In those cases the read and write functions quickly call HAL_SD_ReadBlocks() and HAL_SD_WriteBlocks(), and I guess all the messy work is done there.

In the case of Carl’s work, and the Maxim implementation of FatFS here: http://elm-chan.org/fsw/ff/00index_e.html the read and write functions become sequences of low-level commands sent through the SPI bus. The FatFS code uses blocking SPI transfers. Carl is using DMA, with the attendant need to manage interrupts and callbacks. Carl’s portable folder contains 6 .c files (partly due to the added complexity of his implementation):
FreeRTOS-FAT-CLI-for-PSoC-63/portable/MCU_PSOC6_M4 at master · carlk3/FreeRTOS-FAT-CLI-for-PSoC-63 · GitHub

I can see what Carl has done for the Cypress chip but the Maxim API calls for SPI and DMA are different so I am having to substitute functionality carefully. I have no “driver” at this stage as I am still hacking it together.

Maxim’s implementations (not FreeRTOS) are here:
(1) A test of the FatFs:

(2) This uses the FatFs driver from here (hardware-specific code is in diskio.c but uses synchronous, blocking SPI transfers):

(3) An example of using non-blocking async DMA and callbacks to transfer blocks of data through the SPI interface:

htibosch · July 26, 2022, 6:23am

Carl’s implementation looks very good to me. These are the 3 functions that you want to call:

int sd_init(sd_card_t *this);
int sd_read_blocks(sd_card_t *this, uint8_t *buffer, uint64_t ulSectorNumber, uint32_t ulSectorCount);
int sd_write_blocks(sd_card_t *this, const uint8_t *buffer, uint64_t ulSectorNumber, uint32_t blockCnt);

The parameters ‘ulSectorCount’ and ‘blockCnt’ have the same meaning: the number of sectors to be read or written.

You will need to provide SPI functions mentioned in sd_spi.h.

Carl also provides a module ff_sddisk.c that uses the 3 functions.

You do not need the following functions:

	sd_lock( sd_card_t *this );
	sd_unlock( sd_card_t *this );

because FreeRTOS+FAT already makes sure that no re-entrant calls will be made.

SPI and DMA: you can start with a simple polling version, and later on sort out how to use DMA.

Acutetech · July 27, 2022, 8:57am

Any advice on control of the SPI chip select pin? My guess is that in Carl’s implementation CS pulses 7 times while sending 6 command bytes and reading one response byte, whereas Maxim’s FatFS driver seems to hold CS low across all 7 bytes. Are both approaches OK or am I misunderstanding something?

htibosch · July 27, 2022, 2:15pm

Any advice on control of the SPI chip select pin?

There are two ways of controlling the CS pin:

Manual: make it low before a transaction, and let it get pulled up (or make it high) after the transaction.
Automatic: some SPI peripherals have a dedicated CS pin. As long as the peripheral is exchanging data, CS will be held low.
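The manual variant can be sketched as a transaction wrapper that drives CS low once, exchanges all bytes, and releases CS afterwards. The hooks `cs_set` and `spi_exchange_byte` are hypothetical and are mocked here so that the sketch runs on a PC; on real hardware they would write a GPIO and the SPI data register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mocked board hooks (hypothetical names). */
static int cs_level = 1;           /* CS idles high (inactive) */
static unsigned cs_falling_edges;  /* number of times CS was asserted */

static void cs_set(int level)
{
    if (cs_level == 1 && level == 0) cs_falling_edges++;
    cs_level = level;
}

/* Mock full-duplex exchange: one byte out, one byte in.
 * The "peer" simply returns the inverted byte. */
static uint8_t spi_exchange_byte(uint8_t tx)
{
    assert(cs_level == 0);  /* the peer only listens while CS is low */
    return (uint8_t)~tx;
}

/* Manual CS: low before the first byte, high after the last one. */
void spi_transaction(const uint8_t *tx, uint8_t *rx, size_t len)
{
    cs_set(0);
    for (size_t i = 0; i < len; i++) {
        rx[i] = spi_exchange_byte(tx[i]);
    }
    cs_set(1);
}
```

With this shape a 7-byte exchange produces a single falling edge on CS, regardless of how many bytes are clocked in between.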

My guess is that in Carl’s implementation CS pulses 7 times while sending 6 command bytes and reading one response byte.

I can not find the code that makes you think this.

My own driver shows:

	pucBuf[0] = '\xff';               // Filler bits
	pucBuf[1] = 0x40 | (cmd & 0x7F);  // Send the command
	pucBuf[2] = (arg>>24) & 0xFF;     // send command arguments
	pucBuf[3] = (arg>>16) & 0xFF;
	pucBuf[4] = (arg>>8) & 0xFF;
	pucBuf[5] = arg & 0xFF;           // Send LSB
	uint8_t crc7 = getCrc7 ( pucBuf + 1, 5 );
	pucBuf[6] = crc7;	// correct CRC for first command in SPI (CMD0)

When 7 bytes are exchanged, the CS should go low before the first bit, and it should go high after the last response bit was clocked-in.
The number of response bytes is variable (16, 20, 24, 64). The driver sends out dummy bytes to the SD-card just to keep the clock running.

As for SPI, there is no difference between command bytes and responses. When you clock out 7 bytes, the peer will return 7 bytes. It depends on the protocol which return bytes can be considered “response bytes”. In most situations, the first response byte is useless, because the SPI slave has not yet received a single command.
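For reference, a self-contained sketch of building such a 7-byte frame follows. The body of `getCrc7` is not shown in the post, so `sd_crc7_byte` below is a stand-in written for this example: it computes the CRC7 with polynomial x^7 + x^3 + 1 and returns it shifted left by one with the end bit set, which is the form that goes on the wire. The name `sd_build_cmd_frame` is also hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC7 over 'len' bytes, returned as the final command byte:
 * the 7 CRC bits in bits 7..1 and the end bit (1) in bit 0. */
uint8_t sd_crc7_byte(const uint8_t *data, size_t len)
{
    uint8_t crc = 0;
    assert(data != NULL);
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (uint8_t)(crc << 1);
            if ((d ^ crc) & 0x80u) crc ^= 0x09u;  /* x^3 + 1 */
            d = (uint8_t)(d << 1);
        }
    }
    return (uint8_t)((crc << 1) | 1u);
}

/* Fill a 7-byte buffer: one filler byte, then the 6-byte command. */
void sd_build_cmd_frame(uint8_t frame[7], uint8_t cmd, uint32_t arg)
{
    frame[0] = 0xFF;                             /* filler byte */
    frame[1] = (uint8_t)(0x40u | (cmd & 0x3Fu)); /* start bits + 6-bit index */
    frame[2] = (uint8_t)(arg >> 24);             /* argument, MSB first */
    frame[3] = (uint8_t)(arg >> 16);
    frame[4] = (uint8_t)(arg >> 8);
    frame[5] = (uint8_t)arg;
    frame[6] = sd_crc7_byte(&frame[1], 5);       /* CRC7 + end bit */
}
```

For CMD0 with a zero argument the last byte comes out as 0x95, and for CMD8 with argument 0x000001AA it comes out as 0x87, which are the values commonly hard-coded in SPI-mode drivers.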

htibosch · July 27, 2022, 2:54pm

Not sure if it was all clear what I wrote in my previous reply?

A CMD exchange between SD-driver and SD-card consists of sending a command with parameters and a checksum. Those are 8 bytes, and the reply that is received during these 8 bytes is ignored.
After the 8th byte, CS is still kept low, and the response will be clocked in.

The following statement needs some modification:

Carl’s implementation CS pulses 7 times while sending 6 command bytes and reading one response byte

The Chip Select line ( CS or SS, Slave Select ) does not send pulses. The clock line (CK or SCLK) sends pulses. The CS line goes low for as long as the exchange lasts.

Carl’s implementation will send 7 command bytes ( including the last CRC byte ). After the 7th byte, the SPI master keeps on clocking in order to read the multi-byte result.
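The sequence described here (send the command, keep CS low, keep clocking until the response appears) can be sketched as below. Everything is mocked: `cs_set`, `spi_exchange_byte` and `sd_command_r1` are hypothetical names, the fake card answers two bytes after the command, and the limit of 8 filler bytes is an assumption made for the sketch. A response byte is recognised by its top bit being 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static int cs_level = 1;           /* CS idles high */
static unsigned cs_falling_edges;  /* how often CS was asserted */
static unsigned card_bytes_seen;   /* bytes clocked since CS went low */

static void cs_set(int level)
{
    if (cs_level == 1 && level == 0) {
        cs_falling_edges++;
        card_bytes_seen = 0;
    }
    cs_level = level;
}

/* Mock card: returns 0xFF during the 6 command bytes and for two more
 * bytes, then answers 0x01 (idle state) on the 9th byte. */
static uint8_t spi_exchange_byte(uint8_t tx)
{
    (void)tx;
    assert(cs_level == 0);
    card_bytes_seen++;
    return (card_bytes_seen == 9) ? 0x01 : 0xFF;
}

/* Send a 6-byte command and poll for the one-byte response.
 * Returns the response, or 0xFF if none arrived in time. */
uint8_t sd_command_r1(const uint8_t cmd_frame[6])
{
    uint8_t r1 = 0xFF;

    cs_set(0);                               /* one assertion for the whole exchange */
    for (size_t i = 0; i < 6; i++) {
        (void)spi_exchange_byte(cmd_frame[i]);  /* bytes received here are ignored */
    }
    for (int i = 0; i < 8 && (r1 & 0x80u); i++) {
        r1 = spi_exchange_byte(0xFF);        /* filler keeps the clock running */
    }
    cs_set(1);
    return r1;
}
```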

Acutetech · July 27, 2022, 9:33pm

I understand this and since I can’t see the “manual” implementation in Carl’s code I assume it must be the “automatic” implementation. But then when I look at sd_cmd_spi() in his sd_card.c I can see 6 calls to sd_spi_write() sending one byte, then a loop that calls sd_spi_write() until the single response byte is returned.

sd_spi_write() calls Cy_SCB_SPI_Write(), waits for an interrupt, then calls Cy_SCB_SPI_Read() to read a byte (no wait for interrupt, so it must be delivering the byte that arrived in the previous Cy_SCB_SPI_Write()). In the “automatic” approach CS will be asserted and negated 7 times during the sd_cmd_spi() sequence, with 7 interrupts, which I find surprising.

However I cannot see any “manual” code that asserts and negates CS - bracketing all 7 bytes of the sd_cmd_spi() sequence. It is not in sd_select() and sd_deselect(). So I am forced to conclude that CS is asserted and negated for each byte (automatically).

In contrast, the same command sequence in Maxim’s FatFS is implemented by send_cmd() which brackets the whole sequence with a single manual CS assertion and negation.

Because these approaches seem quite different I was/am seeking assurance that I am not overlooking something.

htibosch · July 28, 2022, 6:21am

I understand this and since I can’t see the “manual” implementation in Carl’s code I assume it must be the “automatic” implementation
So I am forced to conclude that CS is asserted and negated for each byte (automatically).

That is also my conclusion.

Because these approaches seem quite different I was/am seeking assurance that I am not overlooking something.

I think that if you looked at the signals on the wires, you would see the same behaviour.

One difficulty in the protocol is that the response time can be variable: the protocol says that in some cases the SPI master must send filler bytes ( 0xff ) until a valid response is found.
An example of this can be found in sd_wait_ready(): it sends 0xff until a timeout is reached, or until the response is not equal to 0x00.
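A bounded version of that polling loop might look like the sketch below. It is not Carl's `sd_wait_ready()`: the name `wait_not_busy` is hypothetical, the SPI exchange is mocked, and an attempt counter stands in for a real tick-based timeout.

```c
#include <assert.h>
#include <stdint.h>

/* Mock card: reports busy (0x00) for a number of bytes, then 0xFF. */
static unsigned busy_bytes_left = 5;

static uint8_t spi_exchange_byte(uint8_t tx)
{
    (void)tx;
    if (busy_bytes_left > 0) {
        busy_bytes_left--;
        return 0x00;
    }
    return 0xFF;
}

/* Clock out 0xFF until the card returns something other than 0x00.
 * Returns 1 when the card is ready, 0 if 'max_attempts' ran out. */
int wait_not_busy(unsigned max_attempts)
{
    assert(max_attempts > 0);
    for (unsigned i = 0; i < max_attempts; i++) {
        if (spi_exchange_byte(0xFF) != 0x00) {
            return 1;
        }
    }
    return 0;
}
```

In an RTOS build the attempt counter would normally be replaced by a comparison against the tick count, so that the timeout does not depend on the SPI clock rate.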

If I were you I would just start playing with it: create a spi_exchange function, and also a select/deselect couple, and send a simple command. Little by little you can make it more advanced.

Acutetech · August 2, 2022, 9:35pm

Progress report: I have my port to the MAX78000 working. Commands etc. use interrupts and sector data transfer uses DMA. The chip select is asserted and negated for each byte of the command transfers (as with Carl’s). This seems inelegant and inefficient, but it is worth recording here that it does work.

I will now tidy up my code - including addressing the over-active chip select - and perhaps it could then be of use to others.

carlk3 · August 13, 2022, 2:07am

That doesn’t seem right, but it has been a long time since I’ve looked at the Cypress implementation. Over the years I’ve experimented with different strategies for CS. Usually I have kept CS asserted for entire transactions. At one point, I even extended this to keeping CS asserted between CMD24_WRITE_BLOCK or CMD25_WRITE_MULTIPLE_BLOCK and the following CMD13_SEND_STATUS. However, an Integral SD card seemed to object to that and would give no response to the CMD13_SEND_STATUS. The specification says only that “The host starts every bus transaction by asserting the CS signal low.” To me, that doesn’t say that it must be deasserted high first. But others might have a different interpretation. I have tried hardware CS control at times, but now I prefer to manage it in software. One reason for that is the “Cosideration on Multi-slave Configuration” that ChaN describes at How to Use MMC/SDC.

I’ve ported that Cypress implementation to the Raspberry Pi Pico (carlk3/FreeRTOS-FAT-CLI-for-RPi-Pico). That port is much more recent (and, I like to think, better) than the Cypress one. Since then, I’ve also ported it to STM32, but that project is currently only in a private repository.

Acutetech · August 13, 2022, 4:52am

Thanks Carl.
I am pretty sure that the Cypress port asserts and negates CS for every byte transferred.

Certainly, in my first port for the MAX78000 I did this, and I was able to access SD cards successfully. In Maxim’s port of the FatFs - Generic FAT Filesystem Module it looks like CS is negated at the start of each command sequence then immediately asserted again, so the CS stays low until the start of the subsequent command (as an approximation).

My current code takes an intermediate approach: where I can see a group of transfers (e.g. the 6-byte command, 2-byte CRC or 16-byte read of CSD data) I assert CS at the start of the block and negate it at the end. Of course this also applies to 512-byte sector transfers.

I have also had to control the CS manually: when I ask the chip to control CS it fails. I assume some setup or hold time violation, but I am not currently able to check with an oscilloscope. There is little time penalty in this, or with my somewhat sub-optimal approach to CS (above).

vinayak · January 23, 2024, 12:09pm

Hi @htibosch,
I need to implement a FAT32 file system IP inside an FPGA, without a CPU involved.
An SD host controller IP is already implemented inside the FPGA, but it has no FAT32 file system, so I need to implement a FAT32 IP that lets me read and write files on the card. Could you please provide a reference for that?

richard-damon · January 23, 2024, 3:51pm

Implementing the actual FAT file system operation in the FPGA (and not via a program running in a CPU, perhaps implemented in the FPGA) would be out of scope for this forum, which talks about the use of the FreeRTOS system, which runs on CPUs.

The FAT32 File system has a set of low level interface functions that are called to read or write a sector from the file system. If you have a SD Host Controller implemented, you just need to define these functions to perform those operations on the device on the SD Host Controller. That will be very dependent on the API of the IP, and there is likely some base software provided with that IP to help you write that code. It may need some changes to make it operate better under FreeRTOS, as it may have busy-loops waiting for slow operations that should be converted to waiting for an operation complete interrupt (if available in the IP).