tomwbarclay wrote on Thursday, March 19, 2015:
Hi, I need ideas on how to further investigate a suspected FreeRTOS failure. The problem is that a semaphore is fired from an ISR but is not caught by the background thread. A code snippet follows:
// init the control block
rblk = reg_block;
success = TRUE;
state = CMX7164_SEQ_INT0;
#ifdef DEBUG_SDR_THRU_CBUS
debug_blip_debug_line(DEBUG_LINE5);
#endif
// set the SPI and CMX7164 interrupt call back routines
spi_io_set_interrupt_callback_function(sdr_cbus_wr_thru_txfr_callback);
cmx7164_set_interrupt_callback_function(sdr_cbus_interrupt_callback);
timer_io_set_hw_alarm_time(SDR_CBUS_TIMER_CHANNEL, tout_us, sdr_io_cbus_wr_timeout);
// create a software interrupt as if the SPI txfr has happened
// this will start the entire sequence
NVIC->STIR = SPI_IRQn;
// wait till the program sequence completed event arrives
event_wait(txfr_cbus_wr_event);
#ifdef DEBUG_SDR_THRU_CBUS
debug_blip_debug_line(DEBUG_LINE6);
#endif
// reset the Timer, SPI and CMX7164 interrupt call back routines
timer_io_clear_hw_alarm(SDR_CBUS_TIMER_CHANNEL);
spi_io_set_interrupt_callback_function(NULL);
cmx7164_set_interrupt_callback_function(NULL);
The code enters at the top, sets up some interrupt callbacks, and kicks off the interrupt processes with the NVIC->STIR write (the interrupt processes take from 150 to 500 µs).
It then waits on a semaphore from the final interrupt process (event_wait is my FreeRTOS wrapper for xSemaphoreTake).
Mostly this works. The interrupt-driven state machine does its thing, and once it's finished it fires the semaphore, which is caught by the event wait so the thread can continue and exit the routine.
If I break execution just before the ISR throws the semaphore, I can view the semaphore state in the FreeRTOS Viewer. It shows that its value is 0 (not yet thrown) and that it has a task waiting on it, just as you would expect.
When it goes wrong, the hardware timeout timer triggers first. This fires the semaphore, but when I look at the FreeRTOS Viewer it shows that there is NO waiting task. Hmmm!
To verify this I added the debug_blip… lines. These toggle I/O pins so I can capture real events on my analyzer.
When it works, the entry toggle (DEBUG_LINE5) comes first, the ISR-completed toggle (not shown here) comes next, and the exit toggle (DEBUG_LINE6) comes last. No timeout toggle shows … all as anticipated.
When it goes wrong, the entry toggle is first and the ISR completion toggle is next, followed by a long delay until the timer timeout toggle. No exit toggle is seen.
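To make the two analyzer traces concrete, they could be encoded and checked on a PC roughly as below. This is only a host-side sketch: the event labels, `same_sequence` and `classify_transaction` are hypothetical names invented for illustration, and it assumes each captured transaction has already been reduced to an ordered list of toggles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the analyzer toggles described above. */
typedef enum {
    EV_ENTRY,     /* DEBUG_LINE5 blip on entering the routine */
    EV_ISR_DONE,  /* ISR-completed blip (not shown in the snippet) */
    EV_TIMEOUT,   /* timer timeout blip */
    EV_EXIT       /* DEBUG_LINE6 blip after event_wait() returns */
} trace_event_t;

typedef enum { TRACE_GOOD, TRACE_STALLED, TRACE_OTHER } trace_verdict_t;

/* Returns 1 when the two event sequences are identical, else 0. */
int same_sequence(const trace_event_t *a, size_t na,
                  const trace_event_t *b, size_t nb)
{
    size_t i;

    if (na != nb) {
        return 0;
    }
    for (i = 0; i < na; i++) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}

/* Classifies one transaction against the good and failing patterns. */
trace_verdict_t classify_transaction(const trace_event_t *ev, size_t n)
{
    static const trace_event_t good[]    = { EV_ENTRY, EV_ISR_DONE, EV_EXIT };
    static const trace_event_t stalled[] = { EV_ENTRY, EV_ISR_DONE, EV_TIMEOUT };

    assert(ev != NULL || n == 0);

    if (same_sequence(ev, n, good, 3)) {
        return TRACE_GOOD;      /* entry, ISR done, exit; no timeout */
    }
    if (same_sequence(ev, n, stalled, 3)) {
        return TRACE_STALLED;   /* entry, ISR done, timeout; no exit */
    }
    return TRACE_OTHER;         /* anything else deserves a closer look */
}
```

Running every captured transaction through `classify_transaction` would flag any ordering that matches neither pattern, which might hint at a third failure mode.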
My conclusion (so far) is that one of the following happened:
a) The thread never executes the event_wait() … but I can see all the ISR traffic on my analyzer, so I know that it must have executed the line before (the NVIC->STIR write).
b) event_wait() (the wrapped xSemaphoreTake) somehow lost my request.
c) Another process was executed in between the NVIC->STIR line and the event_wait() line.
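Option (c) can be reasoned about with a small host-side model. Assuming event_wait wraps a binary semaphore that starts out empty (an assumption; the text above only says "semaphore"), a give that lands before the take is latched rather than lost, so a delay between the NVIC->STIR write and event_wait() should not by itself cause a missed event. The `model_*` names below are hypothetical, and none of this is FreeRTOS code; it only mimics the latch behaviour on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Host-side model of a binary semaphore: a single latched token.
 * Hypothetical type and function names, not the FreeRTOS API. */
typedef struct {
    bool available;   /* true once given and not yet taken */
} model_sem_t;

/* Assumption: the semaphore starts out empty. */
void model_sem_init(model_sem_t *s)
{
    assert(s != NULL);
    s->available = false;
}

/* Models a give from the ISR. Returns false when the token is
 * already present, i.e. the extra give is dropped. */
bool model_give(model_sem_t *s)
{
    assert(s != NULL);
    if (s->available) {
        return false;
    }
    s->available = true;
    return true;
}

/* Models a take with zero block time. Returns true when the token
 * was consumed; false means a real task would have to block. */
bool model_take(model_sem_t *s)
{
    assert(s != NULL);
    if (!s->available) {
        return false;
    }
    s->available = false;
    return true;
}
```

In this model, calling `model_give()` and then `model_take()` succeeds in both cases, whereas a further `model_take()` returns false, which corresponds to a task that would block until the next give.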
The processor is a SAM4N @ 100 MHz. I am using Atmel Studio 6.2. I am running about 20 to 50 transactions a second, and the fault occurs after between 1 and 30 seconds of execution time.
Also, another independent thread stops at the same point in time … I can see no connection between them. All of the other threads, as far as I can see, run unperturbed, so the whole RTOS has not crashed or gone crazy.
FYI … setting configUSE_PORT_OPTIMISED_TASK_SELECTION in FreeRTOSConfig.h fails to compile cleanly.
Any suggestions are welcome … I am running out of ideas.