FreeRTOS + FatFs : Works only with taskENTER_CRITICAL

tanffn wrote on Monday, December 01, 2014:

Hello,
I am new to FreeRTOS and FatFs so the following might involve some silly mistakes…
In order to get FatFs to work with 2 active tasks I had to wrap its calls with taskENTER_CRITICAL.
I was unable to find anyone facing this problem which leads me to think there is a different problem.

Overview:
I have 2 tasks, Process() and SDCardStream(), plus SPI DMA that fills a DBuffer every 100ms.

Process() is in High priority and always takes ~50ms to complete.
SDCardStream() is in Normal priority and usually takes ~6ms (but sometimes 80ms, uSD BusyState) to complete.

I MUST complete one Process() task per 100ms tick.
The buffer allows me to “miss” an SD write and complete it later, this is why I thought FreeRTOS is the solution.

Problem:
If I don’t wrap the functions with a critical section I get fatal errors (FR_DISK_ERR, …).

    taskENTER_CRITICAL();
    res = f_open(&MyFile, filename, FA_CREATE_ALWAYS | FA_WRITE);
    taskEXIT_CRITICAL();

If I disable Process() it also works.

If I use a critical section and the function takes 80ms, Process() won’t be called on time (i.e. I won’t have enough time before the next tick).
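A rough sketch of the arithmetic behind that statement, using only the figures quoted in this post. It assumes the worst case, in which the critical section starts just as Process() becomes ready, so Process() is held off for the full duration of the SD call. The function names are made up for illustration.

```c
#include <stdbool.h>

/* Figures quoted in the post, in milliseconds. */
enum {
    TICK_PERIOD_MS      = 100, /* one Process() run is due per tick   */
    PROCESS_MS          = 50,  /* approximate Process() run time      */
    SD_WRITE_TYPICAL_MS = 6,   /* usual SDCardStream() run time       */
    SD_WRITE_WORST_MS   = 80   /* SDCardStream() with uSD busy state  */
};

/* ASSUMPTION: Process() cannot start until the critical section ends,
 * so it finishes blocked_ms + PROCESS_MS after it became ready. */
int worst_case_finish_ms(int blocked_ms)
{
    return blocked_ms + PROCESS_MS;
}

/* True if Process() still completes within its 100 ms period. */
bool process_meets_deadline(int blocked_ms)
{
    return worst_case_finish_ms(blocked_ms) <= TICK_PERIOD_MS;
}
```

With the typical 6ms write the budget holds (56ms), but with the 80ms busy-state write it does not (130ms), which matches the missed tick described above.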

** Is this a known issue?
** Is there a way to break FatFs in the “idle” wait periods (e.g. while it waits for the uSD to get out of the busy state)?
** Can you suggest a lead / idea?

edwards3 wrote on Monday, December 01, 2014:

Which FAT code are you using? Chan?

heinbali01 wrote on Tuesday, December 02, 2014:

ChaN and most FS drivers will have some kind of protection.

    /* The _FS_REENTRANT option switches the reentrancy of the FatFs module.
    /
    /   0: Disable reentrancy. _SYNC_t and _FS_TIMEOUT have no effect.
    /   1: Enable reentrancy. Also user provided synchronization handlers,
    /      ff_req_grant, ff_rel_grant, ff_del_syncobj and ff_cre_syncobj
    /      function must be added to the project. */

If you are sure that SDCardStream() is the only user, there is no need for protection at all.

If you have multiple users, you might want to enable _FS_REENTRANT and provide sync routines. taskENTER_CRITICAL() is too heavy for this type of protection.

But: is your SD-card accessed through SPI? And does that interfere with Process() ?

What else is Process() doing that could disturb access to the SD-card?

richard_damon wrote on Tuesday, December 02, 2014:

You should not be using taskENTER_CRITICAL to protect something that can take 6ms, let alone 80ms. A mutex is a much better option here, as that will only block the other tasks that try to use the file system. (The code may even support a locking method inside itself which you can connect to such a mutex).
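A minimal sketch of the mutex approach, not code from this project. The names fs_mutex, fs_lock_init(), fs_locked_call() and demo_fs_op() are hypothetical. So that the sketch compiles off-target, the FreeRTOS mutex calls are replaced by small host-side stand-ins that never block; on the target that section would be dropped in favour of "FreeRTOS.h" and "semphr.h".

```c
#include <stdbool.h>
#include <stddef.h>

/* ---- Host-side stand-ins (ASSUMPTION, not FreeRTOS code) ----
 * They mimic just enough of the mutex API to compile on a PC.
 * On the target, remove this section and include "FreeRTOS.h"
 * and "semphr.h" instead. */
typedef struct { bool held; } HostMutex;
typedef HostMutex *SemaphoreHandle_t;
typedef unsigned long TickType_t;
#define pdTRUE        1
#define pdFALSE       0
#define portMAX_DELAY ((TickType_t)0xFFFFFFFFUL)

static HostMutex host_mutex;

static SemaphoreHandle_t xSemaphoreCreateMutex(void)
{
    host_mutex.held = false;
    return &host_mutex;
}

static int xSemaphoreTake(SemaphoreHandle_t m, TickType_t wait)
{
    (void)wait;                  /* the stand-in never blocks */
    if (m->held) return pdFALSE;
    m->held = true;
    return pdTRUE;
}

static int xSemaphoreGive(SemaphoreHandle_t m)
{
    if (!m->held) return pdFALSE;
    m->held = false;
    return pdTRUE;
}
/* ---- end of stand-ins ---- */

static SemaphoreHandle_t fs_mutex;   /* hypothetical: guards the file system */

/* Create the mutex once, before any task uses the file system. */
bool fs_lock_init(void)
{
    fs_mutex = xSemaphoreCreateMutex();
    return fs_mutex != NULL;
}

/* Run one file-system operation while holding the mutex.
 * Returns false if the mutex could not be taken. */
bool fs_locked_call(int (*op)(void *ctx), void *ctx, int *result)
{
    if (xSemaphoreTake(fs_mutex, portMAX_DELAY) != pdTRUE)
        return false;
    *result = op(ctx);           /* e.g. a wrapper around f_open() */
    xSemaphoreGive(fs_mutex);
    return true;
}

/* Hypothetical operation used only to exercise the wrapper. */
int demo_fs_op(void *ctx)
{
    int *calls = ctx;
    return ++*calls;
}
```

The point of the design is that only tasks going through fs_locked_call() can block each other. A task that never touches the file system, such as Process(), keeps running and interrupts stay enabled throughout.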

tanffn wrote on Tuesday, December 02, 2014:

Yes, which came with stm32cubef4. I am using the STM3240G_Eval board.

tanffn wrote on Tuesday, December 02, 2014:

Thanks Hein,
I looked at that flag and it was configured correctly (1).

There is only one user for the SDcard which uses SDIO. SPI is currently not even enabled and I just wanted to test FreeRTOS + FatFs.

Process() is simply a loop…

    for (int i = 0; i < 13186; i++)
        res += sqrt((i*i) / 37);

tanffn wrote on Tuesday, December 02, 2014:

I know I shouldn’t use critical section, this is why I posted it as a problem :wink:

No one accesses the file system other than that one task, SDCardStream().

tanffn wrote on Tuesday, December 02, 2014:

Some more information: the last few hundred bytes written (the number varies) are corrupt! The same code without FreeRTOS works correctly.

I have no idea what can cause this :frowning:
I know it’s borderline rude but I am desperately looking for ideas… I’ve attached the code of both projects (with/out FreeRTOS) and some test results.

heinbali01 wrote on Wednesday, December 03, 2014:

Hi Ariel,

You know that if you post code like this, you have to promise something in return? At least a chardonnay, or a home-made crumble apple pie :slight_smile:

You’re giving configMINIMAL_STACK_SIZE to uSDThread(). Disabling interrupts prevents the stack from being overwritten, and thus hides the problem.

Especially when opening/closing files, a lot of stack space may be needed.

These ChaN defines influence the stack usage:

    #define _USE_LFN    2       /* 0 to 3 */
    #define _MAX_LFN    128     /* Maximum LFN length to handle (12 to 255) */
    /* The _USE_LFN option switches the LFN feature.
    /
    /   0: Disable LFN feature. _MAX_LFN has no effect.
    /   1: Enable LFN with static working buffer on the BSS.
    /   2: Enable LFN with dynamic working buffer on the STACK.
    /   3: Enable LFN with dynamic working buffer on the HEAP. */

Tip: use a big stack (>= 1024 words, 4KB), and look at the actual stack usage.

(see http://www.freertos.org/Stacks-and-stack-overflow-checking.html)
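To put rough numbers on this, here is a small sketch of the sizes involved. Two things in it are assumptions rather than facts from the thread: that a stack word is 4 bytes (32-bit Cortex-M) and that the LFN working buffer holds (_MAX_LFN + 1) 16-bit characters. The function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORD_BYTES 4u    /* ASSUMPTION: 32-bit stack words (Cortex-M) */
#define MAX_LFN          128u  /* _MAX_LFN as quoted above */

/* FreeRTOS stack depths are given in words, not bytes. */
size_t stack_words_to_bytes(size_t words)
{
    return words * STACK_WORD_BYTES;
}

/* ASSUMPTION: with _USE_LFN == 2 the working buffer placed on the
 * caller's stack holds (_MAX_LFN + 1) 16-bit characters. */
size_t lfn_buffer_bytes(void)
{
    return (MAX_LFN + 1u) * sizeof(uint16_t);
}
```

Under those assumptions the LFN buffer alone takes 258 bytes, which would be about half of a hypothetical 128-word (512-byte) minimal stack, while the suggested 1024 words give 4096 bytes. On the target, uxTaskGetStackHighWaterMark() reports how much of a task's stack has never been used, which is the actual measurement to rely on.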

PS. defining _FS_REENTRANT is not enough, you’ll also have to define ff_req_grant()/ff_rel_grant().
You don’t need it as long as uSDThread() is the only FS user.

Regards,
Hein

tanffn wrote on Wednesday, December 03, 2014:

You rock Hein! (you’ve earned home-made crumble apple pie by proxy *)
Increasing the stack size did fix the inconsistent writing errors!

…YET, still, without the critical section I get FR_DISK_ERR. :confused: (only when Process() is active)
It can happen after one file or after creating 13 files, but it’s gonna fail.
The files that had been created had been written correctly.

ST did take care of the semaphore: everything is defined correctly (pointing to osSemaphore* in cmsis_os.c)

(*, proxy) I am willing to pay! :slight_smile:
I really don’t like to cross the vague “wow, that’s way too much” line but if I do I am fully willing to pay for the assistance.

tanffn wrote on Saturday, December 06, 2014:

No one experienced this error?
I have no idea how to approach the debugging of such an error.

heinbali01 wrote on Saturday, December 06, 2014:

Hi Ariel,

ChaN’s error: FR_DISK_ERR

Many developers will have seen this error, but there is not enough information available to say anything about the cause.

ChaN’s FS is a great driver, but when it encounters an error, you’ll only see a code like: ‘FR_DISK_ERR’. This may have 54 different reasons :slight_smile:

What I would do is create a new version ff_debug.c with some more logging:

    #define ABORT(fs, res) { fp->err = (BYTE)(res); LEAVE_FF(fs, res); }

In ABORT() you could log at least __LINE__.

But FR_DISK_ERR may be thrown in many more cases, like here:

    if (nxt == 0xFFFFFFFF) { res = FR_DISK_ERR; break; } /* Disk error? */
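One way the __LINE__ logging idea could look, as a standalone sketch rather than a patch to ff.c. The FRESULT enum here is a two-value stand-in for the FatFs one, and FF_DBG_FAIL, ff_dbg_last_line(), ff_dbg_last_result() and demo_follow_chain() are hypothetical names.

```c
#include <stdint.h>

/* Stand-in for the FatFs result type (only the two codes used here). */
typedef enum { FR_OK = 0, FR_DISK_ERR = 1 } FRESULT;

/* Most recent failure site, recorded by FF_DBG_FAIL (hypothetical). */
static volatile int     ff_dbg_line;
static volatile FRESULT ff_dbg_res;

/* Record where and why an error was raised. Storing into variables
 * instead of printing keeps the stack cost of the logging small. */
#define FF_DBG_FAIL(res) \
    do { ff_dbg_line = __LINE__; ff_dbg_res = (res); } while (0)

/* Hypothetical stand-in for a spot inside the driver that detects
 * an invalid FAT entry and reports a disk error. */
FRESULT demo_follow_chain(uint32_t nxt)
{
    if (nxt == 0xFFFFFFFFu) {
        FF_DBG_FAIL(FR_DISK_ERR);
        return FR_DISK_ERR;
    }
    return FR_OK;
}

int ff_dbg_last_line(void)       { return ff_dbg_line; }
FRESULT ff_dbg_last_result(void) { return ff_dbg_res; }
```

After a failure, the recorded line number can be read in the debugger or printed from the task, which narrows FR_DISK_ERR down to a single place in the source.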

Are you sure your SD-card is still 100% sane? It must have got corrupted earlier when you had the stack problems. Did you try this again after formatting the card?

You could also test your low-level SD-card driver: write and read-back many sectors at different locations (after which it’ll need formatting again).
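A sketch of such a write and read-back test. The two function-pointer types are hypothetical hooks that would wrap the board's block write and read calls on the target; the RAM-backed fake card is included only so the sketch can be exercised on a PC.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u

/* Hypothetical driver hooks; both return 0 on success. */
typedef int (*sector_write_fn)(const uint8_t *buf, uint32_t sector);
typedef int (*sector_read_fn)(uint8_t *buf, uint32_t sector);

/* Pattern depends on the sector number, so data that lands in the
 * wrong sector is detected as well as data that is corrupted. */
static void fill_pattern(uint8_t *buf, uint32_t sector)
{
    for (uint32_t i = 0; i < SECTOR_SIZE; i++)
        buf[i] = (uint8_t)(sector * 31u + i);
}

/* Writes and reads back `count` sectors, `stride` sectors apart.
 * Returns the number of sectors that failed. This overwrites the
 * card contents, so the card needs formatting afterwards. */
uint32_t sd_readback_test(sector_write_fn wr, sector_read_fn rd,
                          uint32_t first, uint32_t count, uint32_t stride)
{
    /* static: keeps 1 KB of buffers off the calling task's stack */
    static uint8_t out[SECTOR_SIZE], in[SECTOR_SIZE];
    uint32_t failures = 0;

    for (uint32_t n = 0; n < count; n++) {
        uint32_t sector = first + n * stride;
        fill_pattern(out, sector);
        memset(in, 0, sizeof in);
        if (wr(out, sector) != 0 || rd(in, sector) != 0 ||
            memcmp(out, in, SECTOR_SIZE) != 0)
            failures++;
    }
    return failures;
}

/* RAM-backed fake card, for off-target use only. */
#define FAKE_SECTORS 64u
static uint8_t fake_card[FAKE_SECTORS][SECTOR_SIZE];

int fake_write(const uint8_t *buf, uint32_t sector)
{
    if (sector >= FAKE_SECTORS) return -1;
    memcpy(fake_card[sector], buf, SECTOR_SIZE);
    return 0;
}

int fake_read(uint8_t *buf, uint32_t sector)
{
    if (sector >= FAKE_SECTORS) return -1;
    memcpy(buf, fake_card[sector], SECTOR_SIZE);
    return 0;
}
```

Running the same test once from main() before the scheduler starts and once from a task would show whether the low-level driver itself behaves differently under the RTOS.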

Think, think: you’re writing that:

without the critical section I get FR_DISK_ERR
only when Process() is active

Would the other task ProcessingLoop need more stack, maybe? You also gave it a minimum stack while it does use the math library?

Good luck!
Hein

tanffn wrote on Saturday, December 06, 2014:

Hi Hein, thank you again for helping me out!

To simplify it ThreadProcessing is now reduced to:

    static void ThreadProcessing(void const *argument) { for( ;; ) vTaskDelay(1); }

I format the uSD before every run (f_mkfs) and it is able to write several files before it fails (from single digit to hundreds of files, the same code with the same configuration)

After doing many more tests I saw that even running the SDCardStream() task without any other task, it will fail, it just takes more time.

Been tweaking all of the configurations you mentioned and running the test in the same config several times (as it can even complete successfully (1024 files) in one test and then fail after 10 files in another run)

***The same code, with the same driver, works without the RTOS. I ran several tests of 32k files each.
I am sure it’s something trivial that I am missing (like the stack overflow that you suggested)…

heinbali01 wrote on Sunday, December 07, 2014:

Simplification is always good to solve a problem.

When using a single task, everything runs well. When accessing the disk from a FreeRTOS task, you get rare and random errors.

Did you check the Cortex M priorities of all peripheral drivers?

I suppose that your SD-card driver uses interrupts.

Here is all information: http://www.freertos.org/RTOS-Cortex-M3-M4.html

Here’s maybe another useful comment:

    /* The highest interrupt priority that can be used by any interrupt service
    routine that makes calls to interrupt safe FreeRTOS API functions.  DO NOT CALL
    INTERRUPT SAFE FREERTOS API FUNCTIONS FROM ANY INTERRUPT THAT HAS A HIGHER
    PRIORITY THAN THIS! (higher priorities are lower numeric values. */
    #define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY  10

Regards,
Hein

tanffn wrote on Sunday, December 07, 2014:

Sadly it’s defined correctly under HAL_Init():

    HAL_NVIC_SetPriorityGrouping(NVIC_PRIORITYGROUP_4);

I’ll simplify SDCardStream(), roll up my sleeves and start debugging STM’s SDIO.

heinbali01 wrote on Sunday, December 21, 2014:

That’s not exactly what I was asking for:

Does your project define any interrupt and are the priorities not too high?

Literally: are the priorities used numerically not lower (reverse logic) than the value of: configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY ?

Especially, somewhere in portable/xx/ARM_CM3/port.c,
make sure that these two configASSERT’s will do their job:

    configASSERT( ucCurrentPriority >= ucMaxSysCallPriority );
    configASSERT( ( portAIRCR_REG & portPRIORITY_GROUP_MASK ) <= ulMaxPRIGROUPValue );
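The rule behind the first assert can be sketched in a few lines. The value 10 is the one quoted earlier in this thread; the 4 priority bits are what the STM32F4 implements; the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value quoted earlier in the thread. */
#define configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY 10u

/* Number of priority bits the device implements (4 on STM32F4). */
#define PRIO_BITS 4u

/* On Cortex-M a numerically LOWER value is a logically HIGHER priority.
 * An ISR may call the interrupt-safe FreeRTOS API functions only if its
 * priority value is numerically >= the max syscall priority. */
bool isr_may_call_freertos_api(uint8_t irq_priority)
{
    return irq_priority >= configLIBRARY_MAX_SYSCALL_INTERRUPT_PRIORITY;
}

/* The hardware keeps the priority in the top bits of an 8-bit field,
 * so the library-level value is shifted left before it is compared. */
uint8_t priority_to_register_value(uint8_t library_priority)
{
    return (uint8_t)(library_priority << (8u - PRIO_BITS));
}
```

With these numbers, an interrupt at priority 10 to 15 may use the interrupt-safe API, while one at priority 0 to 9 must not, and library priority 10 corresponds to the register value 0xA0.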

A second question: Did you define:

    void vApplicationTickHook( void );

As you probably know, the freedom to write code inside this call-back function is very limited, because it is called from within the tick interrupt.

Regards.

tanffn wrote on Monday, December 22, 2014:

I’ve been able to track down the problem to “Transmit FIFO underrun” (in BSP_SD_WriteBlocks).

Apparently it’s a known issue that “Non-DMA mode” results in random underrun errors when writing.
In addition, trying to resolve it by enabling “HW flow control” results in CRC errors (which are also a known issue and are documented in the errata).

I ended up simply configuring the SDIO driver to use DMA, which works well.
(that might lead to another undocumented DMA conflict issue (DMA Maximum Transactions), but that’s a problem for another day :))