ulTaskNotifyTake/xTaskNotifyGive blocks in an unexpected way (with lwip_recvfrom)

Hi,

I am new to FreeRTOS and trying to get a fairly simple setup working. I have two custom tasks, one that listens for UDP packets and one that transmits UDP packets. When I get a UDP packet, I do some trivial processing and then notify the transmission task to send something back. For now it just sends three constant bytes of data.

The RX task basically does this:

for (;;) {
  count = lwip_recvfrom(sock, recv_buf, sizeof(recv_buf), 0,
                        (struct sockaddr *)&from, &socklen);
  /* Some processing here... */
  xTaskNotifyGive(g_notify_task);
}

And the TX task:

for (;;) {
  ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
  count = lwip_sendto(sock, send_buf, sizeof(send_buf), 0,
                      (struct sockaddr *)&addr, len);
}

Both tasks are created with the same priority. This works just fine for the first UDP-packet received, the TX-task is awoken and sends a single UDP packet out. But as the next UDP-packet arrives, the RX-task seems to be blocking on lwip_recvfrom(), and the TX-task is blocking on ulTaskNotifyTake(), neither waking up again. The packets are coming in at a rate of about 1 per second. However if I comment out lwip_sendto() from the TX-task (or simply use some invalid input argument as I did by mistake so that it fails to send), the TX-task wakes up for every UDP-packet received. I have a feeling that I am missing something obvious here.

To me this sounds like a symptom of data corruption, probably caused by a stack overflow in the TX task while it runs through lwip_sendto.
Did you define configASSERT and also enable stack overflow checking for development/debugging to catch possible fatal errors or data corruption?
What are the stack sizes of your tasks?

configASSERT is defined in FreeRTOSConfig.h. I suspected that a stack overflow might have something to do with it, so I created a third task that periodically runs uxTaskGetSystemState() and prints some statistics for each task. Here is the info for my two tasks:

Name           State           Prio    Stack (usStackHighWaterMark)
udp_rx_th      Blocked         2       860
udp_tx_th      Suspend         2       864

If I understood the docs correctly, the high water mark should approach zero as a task runs out of stack, so it seems I am in the clear. The stack size is 1024 bytes each. Stack overflow checking is enabled (#define configCHECK_FOR_STACK_OVERFLOW 2).

Hi @iceaway ,

As someone who has started out ‘attempting to use lwip with FreeRTOS’ on two different platforms for two different projects, I found it very difficult to get lwip to play nicely with a multi-threaded environment. It was a number of years ago and I can’t remember much detail, but I do remember spending a considerable amount of time trying to get a satisfactory result. I don’t know which hardware platform or port you are using, but ultimately I ended up switching to FreeRTOS+TCP for both projects. If there isn’t any reason why you can’t use FreeRTOS+TCP, I would definitely do so. It takes care of all the issues associated with the multi-threaded environment for you, and the performance was as much as a factor of 10 better for the scenarios I was using it in.

If you must continue to use lwip, my guess is that some kind of mutual exclusion is required when accessing the IP stack, and its absence may be breaking your tasks. Just a thought…

I hope that is helpful :slight_smile:
Kind Regards,

Pete

Thanks Pete, appreciate your insight! Unfortunately I think I am stuck with FreeRTOS + LwIP at the moment, as that is what my SoC provider (Xilinx) ships with their Vitis development environment.

Hi @iceaway ,

One of my projects is using Vitis 2022.1 on the Zynq7000, what is your hardware?

I might be able to share some code with you. I basically don’t use the Xilinx-generated RTOS code at all anymore; I have my own builds for everything.

Kind Regards,

Pete

We are also using a Zynq 7000 SoC. That would be cool (not a huge fan of Vitis so far…)! I was trying to figure out a way to message you privately, so as to avoid spamming this thread with irrelevant info, but I can’t seem to find any way to do that. Surely it must be possible?

Are you on LinkedIn?
If so, send a connection request on there. My details are in the info on my profile in this forum.

Cheers,

Pete.

I just made an interesting discovery, but not really sure what it means. I currently use the same destination port on both sides, i.e. I receive packets on port 20106, and then when I send packets back to the host PC I set the destination port to 20106 as well. But I tried changing the destination port on the PC to 20107 instead, and now it works fine. Isn’t that strange? I suppose I should be able to use the same port, as they are on different hosts?

I had been using a separate socket for each direction (one for TX and one for RX). If I instead create a single socket shared by both transmission and reception, I am able to use the same destination port.

Thanks for reporting back! Nice that the issue was that simple to solve. That’s rarely the case :wink:

Hi Pelle, (@iceaway )

Glad you’re making progress here :slight_smile:

I have messaged you on LinkedIn if you want to continue our conversation offline.

Kind Regards,

Pete

Well, after even more digging I discovered that the problem was a lot less mysterious than I initially thought. The PC application that sends the data (not written by me, so I did not know about this detail) automatically updates the destination port for its packets to the SOURCE port of the last UDP packet it received. So when I was sending UDP replies from the MCU on the separate TX socket, the source port was some random number picked by lwIP, and the next packet from the PC was sent to that port instead. That explains why my receive thread never woke up, and why sharing one socket for RX and TX fixed it.
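
For anyone landing here later, here is a minimal sketch of that effect. It uses plain POSIX sockets on loopback rather than lwIP, on the assumption that lwIP's BSD-style calls (lwip_bind, lwip_sendto, lwip_recvfrom) behave the same way in this respect. The helper names udp_socket_bound, local_port and run_exchange are made up for the illustration, and the ports are chosen by the OS instead of being 20106.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Open a UDP socket bound to 127.0.0.1:port (port 0 = let the stack choose).
 * A 1 s receive timeout keeps the demo from hanging. Returns fd or -1. */
static int udp_socket_bound(uint16_t port)
{
    struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };
    struct sockaddr_in a;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&a, sizeof(a)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Local port a socket is bound to, or 0 on error. */
static uint16_t local_port(int fd)
{
    struct sockaddr_in a;
    socklen_t len = sizeof(a);

    if (getsockname(fd, (struct sockaddr *)&a, &len) < 0)
        return 0;
    return ntohs(a.sin_port);
}

/* One request/reply round trip between a "host" and a "device".
 * shared != 0: the device replies on its listening socket.
 * shared == 0: the device replies on a second, unbound socket.
 * Outputs the device's listening port and the source port the host sees
 * on the reply. Returns 0 on success, -1 on any socket error. */
int run_exchange(int shared, uint16_t *listen_port, uint16_t *reply_port)
{
    char buf[16];
    struct sockaddr_in dev, from, src;
    socklen_t fromlen = sizeof(from), srclen = sizeof(src);
    int rc = -1;
    int host = udp_socket_bound(0);
    int rx = udp_socket_bound(0);
    int tx = shared ? rx : socket(AF_INET, SOCK_DGRAM, 0);

    assert(listen_port != NULL && reply_port != NULL);
    if (host < 0 || rx < 0 || tx < 0)
        goto out;

    *listen_port = local_port(rx);
    memset(&dev, 0, sizeof(dev));
    dev.sin_family = AF_INET;
    dev.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dev.sin_port = htons(*listen_port);

    /* Host -> device request. */
    if (sendto(host, "req", 3, 0, (struct sockaddr *)&dev, sizeof(dev)) != 3)
        goto out;
    /* Device receives, then replies to whoever sent the request. */
    if (recvfrom(rx, buf, sizeof(buf), 0, (struct sockaddr *)&from, &fromlen) != 3)
        goto out;
    if (sendto(tx, "ack", 3, 0, (struct sockaddr *)&from, fromlen) != 3)
        goto out;
    /* Host receives the reply and notes its source port. */
    if (recvfrom(host, buf, sizeof(buf), 0, (struct sockaddr *)&src, &srclen) != 3)
        goto out;
    *reply_port = ntohs(src.sin_port);
    rc = 0;
out:
    if (tx >= 0 && tx != rx)
        close(tx);
    if (rx >= 0)
        close(rx);
    if (host >= 0)
        close(host);
    return rc;
}
```

Calling run_exchange(1, ...) should report a reply port equal to the listening port, while run_exchange(0, ...) should report a different, stack-chosen port. A host that retargets its packets to the last source port it saw would then be sending to a port nobody is listening on, which matches what I observed.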

Thank you for taking time to report your solution.