How to avoid corruption of an external memory

didi71 wrote on Monday, November 20, 2017:

Hello everybody.

On an LPC1768 system I periodically write to an external SPI FRAM to store counters.
Rarely, during the write procedure (a few tens of microseconds), the FreeRTOS scheduler executes a preemptive task switch that seems to corrupt the data inside the FRAM.

What is the best practice to protect the write procedure?
I suppose I should wrap the writing code inside one of:

  1. taskENTER_CRITICAL & taskEXIT_CRITICAL
  2. vTaskSuspendAll & xTaskResumeAll
  3. Other strategies?

Which is the best?

Thanks in advance.

richard_damon wrote on Monday, November 20, 2017:

If you have something like an SPI transaction that MUST run on tight timing, personally I find the best option is to make it run on interrupts, or maybe even better, use DMA.

If for some reason you can’t do that, then I wouldn’t like making a critical section (taskENTER/EXIT_CRITICAL) that long. (If an ISR can run long enough to cause problems, then you really want to do the above, or you need the critical section.)

One option would be to make the operation have the highest priority (so there isn’t another task that could preempt it), or use the vTaskSuspend/ResumeAll type of block.
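
As an illustration of the suspend/resume option mentioned above, here is a minimal sketch, not the poster's actual code. `fram_spi_write`, `fram_shadow`, and the `FRAM_ON_TARGET` build flag are hypothetical names introduced for the example; only `vTaskSuspendAll()` and `xTaskResumeAll()` are real FreeRTOS calls. The host-side stand-ins exist only so the sketch compiles without FreeRTOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef FRAM_ON_TARGET   /* hypothetical flag: set it when building for the real target */
#include "FreeRTOS.h"
#include "task.h"
#else
/* Host-side stand-ins (assumption): they only track nesting depth. */
typedef long BaseType_t;
static int scheduler_suspend_depth = 0;
static void vTaskSuspendAll(void) { scheduler_suspend_depth++; }
static BaseType_t xTaskResumeAll(void) { scheduler_suspend_depth--; return 0; }
#endif

/* Hypothetical low-level driver. On the target this would be the project's
   existing SPI routine; here it copies into a RAM buffer standing in for the FRAM. */
static uint8_t fram_shadow[64];
static void fram_spi_write(uint16_t addr, const uint8_t *data, size_t len)
{
    assert(addr + len <= sizeof fram_shadow);
    for (size_t i = 0; i < len; i++) {
        fram_shadow[addr + i] = data[i];
    }
}

/* Write to the FRAM with task switching disabled. Interrupts stay enabled,
   so a long-running ISR can still stretch the transfer. */
void fram_write_no_preempt(uint16_t addr, const uint8_t *data, size_t len)
{
    vTaskSuspendAll();               /* no task switch from here on */
    fram_spi_write(addr, data, len); /* must not call blocking FreeRTOS APIs */
    (void)xTaskResumeAll();          /* any pending context switch happens here */
}
```

Note that FreeRTOS API functions that could cause a context switch (for example `vTaskDelay()` or a blocking queue receive) must not be called while the scheduler is suspended, so the SPI routine inside the block has to be a plain polling loop.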

didi71 wrote on Tuesday, November 21, 2017:

Hi Richard.

Thank you so much for your reply.

Unfortunately my firmware is inherited and I cannot change the architecture.
The only option is to wrap the FRAM write code inside a taskENTER/EXIT_CRITICAL block.
I also have to check the required write timing with a digital oscilloscope.


richard_damon wrote on Tuesday, November 21, 2017:

One quick comment: it is very rare to NOT be able to change the software. Mainly that happens because you don’t have the source, in which case you can’t make the simple change either. It can be impractical, but it is rarely impossible. The other case is that you have been given the task with instructions not to make any big changes, at which point the response sometimes is to go to those giving the instructions, present the options, and show why the change is needed.

You have a program that, as written, has a corruption bug, and there are several ways to fix it, each of which can also introduce other, maybe harder to find, problems. Using taskENTER/EXIT_CRITICAL has the issue that it will delay all interrupts for your tens of microseconds. One design issue is to see if this might cause problems with any other device. If there is such a device, the resulting fault will very likely be intermittent and hard to figure out.
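
For reference, the critical-section option could look roughly like the sketch below. It is an assumption-laden illustration: `counters_t`, `put_le32`, `fram_spi_write`, `fram_shadow`, and the `FRAM_ON_TARGET` flag are hypothetical; only `taskENTER_CRITICAL()` and `taskEXIT_CRITICAL()` come from FreeRTOS. The point it tries to show is keeping the masked window as short as possible by serialising the data before entering the critical section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef FRAM_ON_TARGET   /* hypothetical flag: set it when building for the real target */
#include "FreeRTOS.h"
#include "task.h"
#else
/* Host-side stand-ins (assumption): they only track nesting depth. */
static int critical_nesting = 0;
#define taskENTER_CRITICAL() do { critical_nesting++; } while (0)
#define taskEXIT_CRITICAL()  do { critical_nesting--; } while (0)
#endif

/* Hypothetical layout of the stored counters. */
typedef struct {
    uint32_t cycles;
    uint32_t errors;
} counters_t;

/* Hypothetical low-level driver; a RAM buffer stands in for the FRAM. */
static uint8_t fram_shadow[64];
static void fram_spi_write(uint16_t addr, const uint8_t *data, size_t len)
{
    assert(addr + len <= sizeof fram_shadow);
    for (size_t i = 0; i < len; i++) {
        fram_shadow[addr + i] = data[i];
    }
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        dst[i] = (uint8_t)(v >> (8 * i));
    }
}

void fram_store_counters(uint16_t addr, const counters_t *c)
{
    uint8_t frame[8];

    /* Serialise outside the critical section to keep the masked window short. */
    put_le32(&frame[0], c->cycles);
    put_le32(&frame[4], c->errors);

    taskENTER_CRITICAL();                      /* maskable interrupts are held off from here */
    fram_spi_write(addr, frame, sizeof frame); /* only the bus transaction is protected */
    taskEXIT_CRITICAL();
}
```

On the Cortex-M3 ports, the critical section masks interrupts up to `configMAX_SYSCALL_INTERRUPT_PRIORITY`; interrupts configured above that level keep running and can still stretch the transfer, so the interrupt priority assignment is part of the review.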

It is also possible that the suspend/resume option (which would be less invasive) may not be good enough, if there is an interrupt that takes long enough to process that it delays this operation enough to cause the issue. It takes looking at the whole system to be sure.

This sort of operation, the need to write a packet of data where the data must go out as an uninterrupted whole, is one of the critical timing situations that requires careful design to avoid problems. The ideal solution is a DMA transfer, but depending on the processor and other design decisions, that might not be available (if it is, and you can add the critical section, you probably have enough access to implement it; it is more work, but isn’t correct operation worth it?). The next best solution is making it interrupt driven with a high priority interrupt.
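
To make the interrupt-driven idea concrete, here is a hardware-agnostic sketch of the usual pattern: the task sends the first byte, and the "byte shifted out" interrupt feeds the rest. Everything here is hypothetical (`spi_hw_send_byte`, `spi_hw_select`, `fram_write_start`, `fram_spi_isr`); on an LPC1768 the two hardware hooks would touch the SPI/SSP data register and the chip-select GPIO, while in this sketch they just log to a buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware hooks, replaced here by a log of what went "on the wire". */
static uint8_t wire_log[32];
static size_t  wire_len = 0;
static bool    cs_active = false;

static void spi_hw_send_byte(uint8_t b)
{
    assert(wire_len < sizeof wire_log);
    wire_log[wire_len++] = b;
}

static void spi_hw_select(bool active) { cs_active = active; }

/* State of the transfer in flight, shared between the task and the ISR. */
typedef struct {
    const uint8_t *buf;   /* frame to send; must stay valid until done */
    size_t         len;   /* total number of bytes */
    size_t         next;  /* index of the next byte to send */
    volatile bool  busy;  /* true while a transfer is running */
} spi_xfer_t;

static spi_xfer_t xfer;

/* Called from a task: starts the transfer and returns immediately. */
bool fram_write_start(const uint8_t *frame, size_t len)
{
    if (xfer.busy || len == 0) {
        return false;                 /* previous transfer still running */
    }
    xfer.buf  = frame;
    xfer.len  = len;
    xfer.next = 1;
    xfer.busy = true;
    spi_hw_select(true);
    spi_hw_send_byte(frame[0]);       /* the remaining bytes are sent from the ISR */
    return true;
}

/* Handler for the "byte shifted out" interrupt (assumed to exist on the target). */
void fram_spi_isr(void)
{
    if (!xfer.busy) {
        return;
    }
    if (xfer.next < xfer.len) {
        spi_hw_send_byte(xfer.buf[xfer.next++]);
    } else {
        spi_hw_select(false);         /* last byte is out: release chip select */
        xfer.busy = false;
        /* On the target, wake the waiting task here,
           for example with vTaskNotifyGiveFromISR(). */
    }
}
```

With this structure a task switch in the middle of the write no longer matters, because the pacing of the bytes depends only on the SPI interrupt and its priority, not on which task is running.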

Ultimately, someone NEEDS to look at the system as a whole to see what is really needed. A quick solution has just too much chance of introducing another hard-to-find intermittent bug. If, due to management issues, the quick fix is really needed, do it, but make sure the possible issue is documented so that if (when) the intermittent fault shows up, it can be solved more quickly.