FreeRTOS + lwIP TCP cannot receive large packets at high rates

I might be wrong, but I believe the socket is blocking. However, I am using select with a 50 us timeout, so the task does not block. With select I am checking multiple sockets, not just one. Once I detect some activity, I check each of them to find which one was triggered. That is the reason for the first if in the code:

if(FD_ISSET(subscriber->publisher_socket_to_receive_fd, fd_set))

Regarding the vTaskDelay(0), you are right, it does not yield. I put a breakpoint there and, going step by step, it always comes back to that while loop. I changed it to vTaskDelay(1 / portTICK_PERIOD_MS) but the result is still the same. The issue here is that I don't want to wait on the order of ms but of us, or come back to the task as soon as possible. My reasoning is that the task can check whether there is activity in the socket, copy the data from the TCP input buffer if there is, or otherwise yield and come back to check as soon as possible.
However, I kept going step by step. From what I can figure out, since I receive 30 fps but only read one frame each time I land on the breakpoint, the input buffer filled up. At that point, the select function didn't even time out, and that might be where it blocks. Could that be?
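To make the "check for activity, then copy everything out" idea concrete, here is a minimal sketch of what I mean. It is not my actual code: drain_socket is a hypothetical helper, and it is written against the POSIX headers so it can be tried on a PC. On the target I assume lwIP's socket layer (lwip/sockets.h) offers the same select, recv and MSG_DONTWAIT behaviour, which I have not verified for every lwIP version.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_us for fd to become readable,
 * then keep reading until the socket reports that nothing is pending.
 * Returns the total number of bytes consumed, or -1 on a socket error. */
static long drain_socket(int fd, char *buf, size_t buf_len, long timeout_us)
{
    fd_set read_set;
    struct timeval tv;
    long total = 0;

    FD_ZERO(&read_set);
    FD_SET(fd, &read_set);
    tv.tv_sec = timeout_us / 1000000L;
    tv.tv_usec = timeout_us % 1000000L;

    int ready = select(fd + 1, &read_set, NULL, NULL, &tv);
    if (ready < 0) {
        return -1;                      /* select itself failed */
    }
    if (ready == 0 || !FD_ISSET(fd, &read_set)) {
        return 0;                       /* timed out, nothing to read */
    }

    for (;;) {
        /* MSG_DONTWAIT makes this single call non-blocking. */
        ssize_t n = recv(fd, buf, buf_len, MSG_DONTWAIT);
        if (n > 0) {
            total += n;                 /* process buf[0..n) here */
            continue;
        }
        if (n == 0) {
            break;                      /* peer closed the connection */
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                      /* receive buffer is empty now */
        }
        if (errno == EINTR) {
            continue;                   /* interrupted, try again */
        }
        return -1;                      /* real socket error */
    }
    return total;
}
```

The point of the inner loop is that one wake-up empties the whole receive buffer instead of consuming a single frame, so the buffer should not fill up even if the task is scheduled less often than frames arrive.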

For the tasks it looks like this:
Startup task: I started from Xilinx's example (like this thread), which acquires an IP and handles the initialization of the connection. Again following the example, this task has priority DEFAULT_THREAD_PRIO=2. This is where the "new tasks" (mine) are created. After this, task 1 is deleted and the "custom" tasks remain.

I create four tasks: one to receive from sockets, one to send to sockets, one to pass data received from the sockets to my hardware, and one to receive data from my hardware to be sent out to the sockets. They all have DEFAULT_THREAD_PRIO priority. The reasoning is the same as explained before: each of them checks whether there is activity (sockets or hw) and, if there is, processes it. If there isn't, it yields (with vTaskDelay(1ms)) and loops back on the next context switch. That is why I want to come back to each task as soon as possible, just to check whether there is something to process. Therefore, round-robin scheduling without fixed priorities, so that the tasks execute in order (T1, T2, T3, T4, T1, T2, T3, T4…), should be enough. Could that be the problem?
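As a side note on the 1 ms delay, the tick arithmetic can be checked on a PC. The sketch below uses hypothetical stand-in functions (not FreeRTOS code) that mirror how portTICK_PERIOD_MS (1000 / configTICK_RATE_HZ) and pdMS_TO_TICKS (ms * configTICK_RATE_HZ / 1000) are computed, assuming a tick rate of at most 1000 Hz.

```c
#include <stdint.h>

/* Hypothetical stand-in for portTICK_PERIOD_MS: milliseconds per tick.
 * Integer division, so rates above 1000 Hz would give 0. */
static uint32_t tick_period_ms(uint32_t tick_rate_hz)
{
    return 1000u / tick_rate_hz;
}

/* The "ms / portTICK_PERIOD_MS" idiom used in the delay call. */
static uint32_t delay_ticks_by_division(uint32_t ms, uint32_t tick_rate_hz)
{
    uint32_t period = tick_period_ms(tick_rate_hz);
    if (period == 0u) {
        return 0u;                      /* guard: rate above 1000 Hz */
    }
    return ms / period;
}

/* Hypothetical stand-in for pdMS_TO_TICKS. */
static uint32_t delay_ticks_by_macro(uint32_t ms, uint32_t tick_rate_hz)
{
    return (uint32_t)(((uint64_t)ms * tick_rate_hz) / 1000u);
}
```

With a 1000 Hz tick, a 1 ms delay is exactly one tick. With a 100 Hz tick, both forms truncate 1 ms to zero ticks, so the requested delay never actually reaches one tick. Either way the smallest delay the tick-based API can express is one tick, so waits on the order of us cannot be requested through vTaskDelay.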

Just to point out that, to create tasks, I use (following the example) sys_thread_new, which is defined as:

/*---------------------------------------------------------------------------*
 * Routine:  sys_thread_new
 *---------------------------------------------------------------------------*
 * Description:
 *      Starts a new thread with priority "prio" that will begin its
 *      execution in the function "thread()". The "arg" argument will be
 *      passed as an argument to the thread() function. The id of the new
 *      thread is returned. Both the id and the priority are system
 *      dependent.
 * Inputs:
 *      char *name              -- Name of thread
 *      void (* thread)(void *arg) -- Pointer to function to run.
 *      void *arg               -- Argument passed into function
 *      int stacksize           -- Required stack amount in bytes
 *      int prio                -- Thread priority
 * Outputs:
 *      sys_thread_t            -- Pointer to per-thread timeouts.
 *---------------------------------------------------------------------------*/
sys_thread_t sys_thread_new( const char *pcName, void( *pxThread )( void *pvParameters ), void *pvArg, int iStackSize, int iPriority )
{
	xTaskHandle xCreatedTask;
	portBASE_TYPE xResult;
	sys_thread_t xReturn;

	xResult = xTaskCreate( pxThread, ( const char * const) pcName, iStackSize, pvArg, iPriority, &xCreatedTask );

	if( xResult == pdPASS )
	{
		xReturn = xCreatedTask;
	}
	else
	{
		xReturn = NULL;
	}

	return xReturn;
}

I hope it was detailed enough for a better understanding.
Thanks again for the help.