xQueueReceive blocks indefinitely if xQueueSendToBackFromISR() is called in a loop exceeding 127 times

Hello,

I’m developing a packet processing application. I have written a FreeRTOS-based packet DMA driver that uses FreeRTOS queues to share packet buffers between the application thread and the driver. I have encountered a situation where the packet reader thread blocks indefinitely, even though the queue is filled by the ISR. This happens randomly after 2-4 hours of operation. My application halts all of a sudden, and eventually the DMA engine halts as well because nothing is processing packets.

I’ve been analyzing what was going on for the last few months and believe I have found a bug in the FreeRTOS queue implementation.

App Thread Pseudocode

while(1) {
	xQueueReceive(out_queue, &buffer, portMAX_DELAY);
	process_packet(buffer);
	xQueueSendToBack(in_queue, &buffer, portMAX_DELAY);
}

PacketDMA ISR Pseudocode

{ 
	packet_count = read_available_packet_count_register();
	for(int i = 0; i < packet_count; i++) {
		/* Read from InQueue */
		xQueueReceiveFromISR(in_queue, &buffer, NULL);
		/* Write to OutQueue */
		xQueueSendToBackFromISR(out_queue, &buffer, NULL);
	}
}
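For context, here is a minimal sketch of how the two queues and the buffer pool could be set up for this pattern. The setup code is not part of my driver as posted; the names (packet_buf_t, NUM_DMA_BUFFERS, packet_queues_init) are illustrative assumptions. Both queues carry pointers to the 256 DMA buffers, and in_queue starts out holding every free buffer.

/* Hypothetical setup sketch - illustrative names, not the actual driver code. */
#include "FreeRTOS.h"
#include "queue.h"

#define NUM_DMA_BUFFERS    256     /* matches the 256 packet DMA buffers */

typedef struct { uint8_t data[ 2048 ]; } packet_buf_t;   /* assumed buffer layout */

static packet_buf_t  xBufferPool[ NUM_DMA_BUFFERS ];
static QueueHandle_t in_queue;     /* free buffers, consumed by the ISR */
static QueueHandle_t out_queue;    /* filled buffers, consumed by the app thread */

void packet_queues_init( void )
{
	/* Both queues hold pointers to packet buffers. */
	in_queue  = xQueueCreate( NUM_DMA_BUFFERS, sizeof( packet_buf_t * ) );
	out_queue = xQueueCreate( NUM_DMA_BUFFERS, sizeof( packet_buf_t * ) );

	/* Hand every buffer to the driver to start with. */
	for( int i = 0; i < NUM_DMA_BUFFERS; i++ )
	{
		packet_buf_t *p = &xBufferPool[ i ];
		xQueueSendToBack( in_queue, &p, 0 );
	}
}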

Diagnosis

The issue occurs when an interrupt arrives just after out_queue has become empty, while xQueueReceive() is about to add the task to the waiting list using

  /* xQueueReceive() */
  vTaskPlaceOnEventList( &( pxQueue->xTasksWaitingToReceive ), xTicksToWait );

More specifically, the interrupt arrives inside the vListInsert( pxEventList, &( pxCurrentTCB->xEventListItem ) ); call made from vTaskPlaceOnEventList().

I am using 256 packet DMA buffers. At the point this call is made, the queue has already been locked using prvLockQueue( pxQueue );

When the queue gets locked, pxQueue->cTxLock is moved from queueUNLOCKED (-1) to queueLOCKED_UNMODIFIED (0).
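For reference, prvLockQueue() is roughly the following (a simplified paraphrase of the macro in queue.c, where queueUNLOCKED is -1 and queueLOCKED_UNMODIFIED is 0; the real macro locks both directions, as shown):

/* Simplified paraphrase of the prvLockQueue() macro in queue.c. */
taskENTER_CRITICAL();
{
	if( pxQueue->cRxLock == queueUNLOCKED )        /* -1 -> 0 */
	{
		pxQueue->cRxLock = queueLOCKED_UNMODIFIED;
	}
	if( pxQueue->cTxLock == queueUNLOCKED )        /* -1 -> 0 */
	{
		pxQueue->cTxLock = queueLOCKED_UNMODIFIED;
	}
}
taskEXIT_CRITICAL();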

So when the ISR writes to the queue while it is locked, it simply increments the lock count:

	/* Increment the lock count so the task that unlocks the queue
	knows that data was posted while it was locked. */
	pxQueue->cTxLock = ( int8_t ) ( cTxLock + 1 );
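For context, that increment sits in xQueueGenericSendFromISR(), which xQueueSendToBackFromISR() maps to. Below is a simplified sketch of the locked/unlocked branch, with the pxHigherPriorityTaskWoken handling and error paths left out:

/* Simplified sketch of the locked/unlocked branch in xQueueGenericSendFromISR(). */
const int8_t cTxLock = pxQueue->cTxLock;

prvCopyDataToQueue( pxQueue, pvItemToQueue, xCopyPosition );

if( cTxLock == queueUNLOCKED )
{
	/* Queue not locked: a task blocked in xQueueReceive() can be
	removed from the event list and readied right here. */
	if( listLIST_IS_EMPTY( &( pxQueue->xTasksWaitingToReceive ) ) == pdFALSE )
	{
		( void ) xTaskRemoveFromEventList( &( pxQueue->xTasksWaitingToReceive ) );
	}
}
else
{
	/* Queue locked by a task: only record that data was posted; the
	task holding the lock wakes waiters later when it unlocks the queue. */
	pxQueue->cTxLock = ( int8_t ) ( cTxLock + 1 );
}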

Since I’m using 256 buffers, there are conditions in which a burst of packets causes xQueueSendToBackFromISR() to be called in the ISR loop more than 127 times while the queue is still locked.

Since pxQueue->cTxLock is an 8-bit signed integer, the count overflows and wraps to a negative value (-128 … -1).
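The wrap-around is easy to reproduce in isolation. The standalone program below is plain C, not kernel code; it just repeats the same increment-and-cast arithmetic, and on typical two’s-complement targets the count goes negative once it passes 127:

/* Standalone illustration of the wrap-around (plain C, not kernel code). */
#include <stdint.h>
#include <stdio.h>

int main( void )
{
	int8_t cTxLock = 0;                       /* queueLOCKED_UNMODIFIED */

	for( int i = 0; i < 200; i++ )            /* burst of more than 127 ISR posts */
	{
		cTxLock = ( int8_t ) ( cTxLock + 1 ); /* same arithmetic as the kernel */
	}

	/* Prints a negative value, which the unlock path treats as "nothing posted". */
	printf( "cTxLock after 200 increments: %d\n", cTxLock );
	return 0;
}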

Once the ISR completes, vTaskPlaceOnEventList() resumes and finishes putting the task on the waiting list.

Then prvUnlockQueue() checks whether cTxLock was modified (incremented) while the queue was locked:

	/* See if data was added to the queue while it was locked. */
	while( cTxLock > 0 /*queueLOCKED_UNMODIFIED*/ )

Now cTxLock is negative, which prvUnlockQueue() treats as if nothing was posted while the queue was locked, so it never checks whether the queue is non-empty and therefore never wakes the waiting task.
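In outline, the unlock path looks roughly like the following. This is a simplified sketch of the cTxLock half of prvUnlockQueue() in queue.c; the real function also handles the cRxLock side and other details omitted here:

/* Simplified sketch of the cTxLock half of prvUnlockQueue() in queue.c. */
taskENTER_CRITICAL();
{
	int8_t cTxLock = pxQueue->cTxLock;

	/* A negative cTxLock fails this test, so no waiting task is woken. */
	while( cTxLock > queueLOCKED_UNMODIFIED )
	{
		if( listLIST_IS_EMPTY( &( pxQueue->xTasksWaitingToReceive ) ) == pdFALSE )
		{
			/* Wake a task that was blocked in xQueueReceive(). */
			( void ) xTaskRemoveFromEventList( &( pxQueue->xTasksWaitingToReceive ) );
		}
		else
		{
			break;
		}

		--cTxLock;
	}

	pxQueue->cTxLock = queueUNLOCKED;
}
taskEXIT_CRITICAL();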

The out_queue is now full but the reader task is never woken, so my application is blocked and unable to process any packets, and the DMA driver eventually halts as there are no free buffers available.

Solution

By declaring pxQueue->cTxLock and pxQueue->cRxLock as int16_t I am able to fix this issue. I have not seen the problem since.
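Concretely, the workaround is a change to the lock counters in the queue structure definition in queue.c, roughly as sketched below (field and type names as in the kernel source):

/* Sketch of the workaround: widen the lock counters in queue.c.  The local
   copies and ( int8_t ) casts that touch these fields must be widened to
   int16_t as well for the change to take effect. */
typedef struct QueueDefinition
{
	/* ... other members unchanged ... */
	volatile int16_t cRxLock;   /* was: volatile int8_t cRxLock; */
	volatile int16_t cTxLock;   /* was: volatile int8_t cTxLock; */
	/* ... other members unchanged ... */
} xQUEUE;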

So, could you please analyze the issue from your side and fix it if my analysis makes sense? Please correct me if I’m wrong.

Regards,
Renjith

Please see this thread, which is unfortunately on GitHub rather than the support forum: https://github.com/FreeRTOS/FreeRTOS-Kernel/issues/419

Thanks Richard. It is exactly the same problem reported in the GitHub thread. Unlike the UART byte streaming use case, we are running a packet processor capable of processing 700K packets per second.

I forgot to mention our FreeRTOS version: it is 10.2.1, which doesn’t have the assert that catches the lock count overflow.

Likewise, even if we add a check inside the interrupt handler so we never queue more than 127 entries while the queue is locked, the interrupt will fire again immediately, before the packet processor thread gets a chance to run. We are doing interrupt coalescing to get maximum performance.

So, it would be better to fix it in such a way that more than 127 items can be posted while the queue is locked. If memory usage is a concern, you can make it a config macro so the feature is enabled at build time.

Regards,
Renjith

I don’t want to add any additional bytes to the structure members, so most likely we will sacrifice a little run time to cap the count, either to the number of tasks in the system or to 128. Assuming there are no more than 128 tasks, that should be ok; even with more than 128 tasks it will be ok in all but theoretical scenarios.
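For illustration only, a sketch of that capping idea on the ISR send path might look like the following; this is an assumption about the shape of the fix, not an actual kernel patch. uxTaskGetNumberOfTasks() is the existing API that returns the number of tasks in the system, and at most that many tasks could ever need waking when the queue is unlocked:

/* Illustrative sketch of the capping idea - not the actual kernel change. */
const UBaseType_t uxTasks = uxTaskGetNumberOfTasks();

/* cTxLock is never negative on this path because the queue is locked. */
if( ( UBaseType_t ) cTxLock < uxTasks )
{
	/* Increment the lock count so the task that unlocks the queue
	knows that data was posted while it was locked. */
	pxQueue->cTxLock = ( int8_t ) ( cTxLock + 1 );
}
/* else: the count already covers every task in the system, so a further
increment adds no information, and skipping it avoids the int8_t overflow. */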