Hello,
I am running FreeRTOS on an Atmel SAME54. I have 11 tasks running (including IDLE and TmrSvc).
One task is giving me trouble.
void MyTask()
{
    const TickType_t xFrequency = 250 / portTICK_RATE_MS;
    xLastWakeTime = xTaskGetTickCount();
    while (1)
    {
        vTaskDelayUntil(&xLastWakeTime, xFrequency);
        /* set SPI chip select low */
        /* write to SPI bus (this includes waiting portMAX_DELAY on a semaphore) */
        /* set SPI chip select high */
    }
}
This task takes 8 or 9 msec to complete the loop.
I checked each task’s stack pointer, and none have overflowed. Assert is enabled, as is stack overflow checking.
After a while (hours), MyTask stops running. After poking around with the debugger, I find:
xTickCount = 10652969
MyTask is on the SuspendedTaskList,
ItemValue = 9075774
pxTopOfStack has changed from 0x20003224 (when running normally) to 0x200031fc
The semaphore that the SPI driver uses points to a Queue which includes a member, xTasksWaitingToReceive = 0x20001bd8
fields include:
uxNumberOfItems = 0xFFFFFFFF
pIndex = 0x20001bd8
xListEnd = 0x20001bd8
The chip select for SPI is high (as reported by the debugger).
It seems to me that MyTask is not waiting on a semaphore, and its delay time has expired. The SPI chip select is high, so the SPI transfer must have completed.
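To put a number on "its delay time has expired", the gap between the two debugger values can be worked out directly. The sketch below is only arithmetic on the figures quoted above; the helper names `ticks_since` and `ticks_to_ms` are made up for illustration, and the 1 kHz tick rate in the usage note is an assumption, since the thread does not state configTICK_RATE_HZ. One possible reading, not confirmed in this thread, is that ItemValue still holds the last wake time set by vTaskDelayUntil(), because a task that blocks with portMAX_DELAY is moved to the suspended list without its item value being updated.

```c
#include <stdint.h>

/* Hypothetical helper: ticks elapsed between two 32-bit tick counts.
 * Unsigned subtraction gives the right answer even if the tick
 * counter wrapped around between 'then' and 'now'. */
static uint32_t ticks_since(uint32_t now, uint32_t then)
{
    return (uint32_t)(now - then);
}

/* Hypothetical helper: convert ticks to milliseconds for a given
 * tick rate. 64-bit math avoids overflow in the multiplication. */
static uint64_t ticks_to_ms(uint32_t ticks, uint32_t tick_rate_hz)
{
    return ((uint64_t)ticks * 1000u) / tick_rate_hz;
}
```

With the values above, ticks_since(10652969, 9075774) is 1577195 ticks. If the tick rate were 1 kHz (an assumption), that is about 1577 seconds, or roughly 26 minutes, far longer than the 250 ms period.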
How did it get here?
Why did the pxTopOfStack change?
How can I get out of this?
Any thoughts on what else to look at?
I guess xLastWakeTime is also a local variable in the real code, right?
When you stop the target, what is the call stack, i.e. where exactly is the task blocked?
Is the SPI bus also used by other tasks?
Yes, xLastWakeTime is a local variable.
Unfortunately, my IDE (Atmel Studio 7) doesn’t show the call stack for each task, only for the one currently running (usually IDLE).
The SPI bus is only used by this task.
Is it possible to verify that the SPI Tx Complete ISR that is supposed to give the semaphore did fire? You can put a breakpoint in the ISR or set a GPIO and monitor that?
Shouldn’t this code be protected by a critical section? What value is the Tx semaphore initialized with (where is the call to sem_init on the xfer sem)? It had better be 1. Generally, using a counting semaphore for this kind of driver implementation is not good practice imho, although I can see that the library developers probably only wanted one centralized point for syncing.
aggarg,
I am making an assumption that it fired, based on the fact that the chip select is high. The call to spi_m_os_transfer() must have returned for that to happen. There is no timeout on the xSemaphoreTake() call.
A breakpoint is not useful for finding this error, because the code runs correctly for hours. I have set a breakpoint there, just to see what happens in the OS, and it does fire. After MyTask hangs, is there a way to look at the semaphore to tell what state it is in?
RAc,
I don’t think it’s necessary in this case. The calls to spi_m_os_transfer() come from only one task. The buffers (rxbuf, txbuf) are static char buffers declared in the calling module and are initialized just before calling spi_m_os_transfer().
RAc,
Thank you for your suggestion. I put counters all over the place and eventually isolated the problem to a message queue that had an infinite timeout. I changed the timeout to 0, and now the task never hangs.
As a general rule, I try to avoid “infinite” timeouts, ALWAYS check the return value of the operation, and handle a timeout error appropriately (which might be to just ignore it).
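That rule can be sketched in host-testable C. Everything named here (`receive_fn`, `rx_stats_t`, `poll_once`, `fake_queue_t`, `fake_receive`) is a hypothetical stand-in so the pattern can run off-target; it is not the poster's code. In a real FreeRTOS task, the receive call would be something like xQueueReceive() with a finite tick count, with the result compared against pdTRUE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a blocking receive with a finite timeout.
 * Returns true if a message was copied into *msg, false on timeout. */
typedef bool (*receive_fn)(void *ctx, uint32_t *msg, uint32_t timeout_ticks);

typedef struct {
    uint32_t received;  /* messages handled */
    uint32_t timeouts;  /* receive calls that timed out */
} rx_stats_t;

/* One pass of a task loop: waits at most timeout_ticks, checks the
 * result, and counts timeouts instead of blocking forever. */
static bool poll_once(receive_fn receive, void *ctx, uint32_t timeout_ticks,
                      rx_stats_t *stats, uint32_t *last_msg)
{
    uint32_t msg;

    if (receive(ctx, &msg, timeout_ticks)) {
        *last_msg = msg;      /* handle the message */
        stats->received++;
        return true;
    }
    stats->timeouts++;        /* timeout: record it and carry on */
    return false;
}

/* Host-side fake queue, used only to exercise the sketch:
 * delivers 'pending' messages, then reports a timeout. */
typedef struct {
    uint32_t pending;
    uint32_t next_value;
} fake_queue_t;

static bool fake_receive(void *ctx, uint32_t *msg, uint32_t timeout_ticks)
{
    fake_queue_t *q = ctx;

    (void)timeout_ticks;      /* the fake never actually waits */
    if (q->pending == 0u) {
        return false;
    }
    q->pending--;
    *msg = q->next_value++;
    return true;
}
```

The timeout counter follows the "counters all over the place" approach mentioned above: a task that keeps timing out stays alive and leaves a visible trace, rather than silently landing on the suspended list.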