FreeRTOS_send() failure to return

laughlinmbari wrote on Monday, September 09, 2019:

I’ve stumbled across the following behavior where I was able to hang FreeRTOS_send (it would not return, and the thread responsible for doing IP comms would hang as a result). The undesireable behavior appers to be easily mitigated by setting an appropriate FREERTOS_SO_RCVTIMEO (which hadn’t been set in my system I was working on). Digging into the source, it seems to me that there is logic within FreeRTOS_send is intended to catch a situation like this, which is why I’m posting this.

Setup: My FreeRTOS MCU is hosting a few listening sockets, and upon receipt of a connection, creats a child socket which it hands off to a seperate comms application. The seperate comms application parses incoming UI commands, and if the appropriate command is received sends back the appropriate data.

The listening sockets do not have a FREERTOS_SO_SNDTIMEO set (so their children sockets don’t inherit one).

It appears to be sensitive to timing, but I can reproduce it ~1/10 with the following script on my desktop:

import socket
import time

def sleep_ms(sec):
    time.sleep(sec/1000.0)


HOST = "10.0.0.2"
PORT = 10003

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

print("opening socket...\n")

try:
    s.connect((HOST, PORT))
    data = s.recv(1024)
    time.sleep(1)

except OSError as msg:
    print(msg)

cmd = b'b_req\r\n'
print("sending command: {} and exiting".format(cmd.decode('ascii')))

s.sendall(cmd)
sleep_ms(1)
exit()

The basic flow of the comms app is as follows:

while(1){
    sres = FreeRTOS_select(task->active_fds, xSelectTimeout);
		if (sres  < 0) {
			panic ("select");
		} else if (sres == 0) {
			// nothing happening on any socket
			continue;
		}
		
		/* cycle through configured apps and check for incoming requests */ 
		for (i = 0; i < task->appnum; i++) {
			app = task->apps[i];

			/* check for requests on established connections */
			for (j = 0; j < MAX_APP_CLIENTS; j++) {
				if (!app->clientsocks[j] || !FreeRTOS_FD_ISSET(app->clientsocks[j],
						task->active_fds))
					continue;

				/* data arriving on an already-connected socket. */
				res = FreeRTOS_recv(app->clientsocks[j], app->tcp_buffer,
				  TCP_BUFSIZE, 0);
				if (res <= 0 || app->request_handler(app, j, res) < 0) {
					/* Read error (-1) or EOF (0) */
					utils_socket_close(app->clientsocks[j]);
					FreeRTOS_FD_CLR(app->clientsocks[j], task->active_fds, eSELECT_READ | eSELECT_EXCEPT);
					app->clientsocks[j] = 0;
					continue;
				}

			if (app->app_type != SERIAL_APP) taskYIELD();
			}

			/* check for incoming new connections on master sock */
			if (!app->master_sock || !FreeRTOS_FD_ISSET(app->master_sock, task->active_fds))
				continue;

			len = sizeof (client);
			newsock = FreeRTOS_accept(app->master_sock, &client, &len);
			if (newsock < 0) {
				perror ("accept");
				panic ("EXIT_FAILURE");
			}

			for (j = 0; j < MAX_APP_CLIENTS; j++) {
				if (!app->clientsocks[j]) {
					FreeRTOS_FD_SET(newsock, task->active_fds, eSELECT_READ | eSELECT_EXCEPT);
					app->clientsocks[j] = newsock;
					break;
				}
			}
			if (j >= MAX_APP_CLIENTS) {
				tcp_srv_respond_term(newsock,
				  "\r\nToo many active connections, closing!\r\n\n");
				utils_socket_close(newsock);
				continue;

			}
			if (xPortGetFreeHeapSize() < TCP_MIN_FREE_HEAP) {
				tcp_srv_respond_term(newsock,
				  "\r\nRunning low on memory, closing!\r\nConsider closing unused connections.\r\n\n");
				utils_socket_close(newsock);
				continue;
			}
			if (app->app_type != SERIAL_APP) {
				sprintf(response, "\xff\xfd\x22\xff\xfa\x22\x01\x01\xff\xf0\r\n\nConnected to %s \r\n%s> ",
				  app->name, app->name);
				tcp_srv_respond_term(newsock, response);
			}
			if (app->app_type != SERIAL_APP) taskYIELD();
		}

and the basic structure of the UI respond task is

parse_cmd(APP_CFG *app, int sockidx, int len)
{
	Socket_t sock = app->clientsocks[sockidx];
	char *srv_inbuf = app->tcp_buffer;
	int battery;
	float fval1, fval2, fval3;
	uint16_t *voltages;
	uint8_t types[2];

	srv_inbuf[len] = 0;
    
    //some error checking and sanitizing stuff here
    else if (!strncmp(srv_inbuf,"b_req", 5)) {
			char *strPtr = str_data;
			//... query some ADCs and GPIO, copy some structs, and then send back
            // data with calls to tcp_srv_respond() see below.
            
      }
}

the FreeRTOS_send wrapper for reference:

int tcp_srv_respond(Socket_t sock, char *bufp, int bytestosend)
{   
   int bytessent;

   while (bytestosend) {
	  bytessent = FreeRTOS_send(sock, bufp, bytestosend, 0);

      if (bytessent < 0) {
         return (-1);
      } else if (bytessent == 0) {
            return (0);
      } else if (bytessent < bytestosend) {
         /* Only part of the data was written */
         bufp += bytessent;
         bytestosend -= bytessent;
         taskYIELD();       /* Let other tasks run before we try again */
         continue;
      } else {
         break;     /* All bytes sent  */
	  }
   }
   return (0);
}

To reiterate, the problem can be resolved by introducing a SEND timeout, so my particular issue has been resolved, but I’ve posted here because my reading of the FreeRTOS_send leads me to beleive this may be an unintended behavior.

Cheers.

rtel wrote on Monday, September 09, 2019:

Thanks for the detailed report. Can you please confirm which version
you are using so we can investigate - the version number is at the top
of each source file.

laughlinmbari wrote on Tuesday, September 10, 2019:

Behavior observed on both FreeRTOS V10.0.1 (FreeRTOS+TCP V2.0.1), and the most recent version, V10.2.1 which I pulled last Friday (Sep. 6, 2019).

heinbali01 wrote on Tuesday, September 10, 2019:

Hi Laughlin, thanks for letting this know.

As you must have seen, the default send time-out is:

#define ipconfigSOCK_DEFAULT_SEND_BLOCK_TIME    portMAX_DELAY

unless you overwrite this macro in your FreeRTOSIPConfig.h. The socket option FREERTOS_SO_RCVTIMEO can change the send time-out for an individual socket.

By the way, your respond function could as well look like this:

int tcp_srv_respond(Socket_t sock, char *bufp, int bytestosend)
{
	return FreeRTOS_send(sock, bufp, bytestosend, 0);
}

FreeRTOS_send() will try to send all data, and it may do this in smaller chunks, depending on the buffer space.
When all data has been sent, it will return. When a time-out occurs, it stops trying. It returns the actual number of bytes sent, or a negative error code (see FreeRTOS_errno_TCP.h).

Your code probably doesn’t have to yield(), because the API’s already yield when necessary.

I have not observed that FreeRTOS_send() hangs when the time-out is unlimited (portMAX_DELAY), but I will try that out.

A few questions:

How much data do you normally send in your function tcp_srv_respond()?
Does it always happen, or only sometimes?
Have you observed what happened on the LAN when FreeRTOS_send() starts hanging? How much data has been sent? Was there a problem with the connection? Do you have a PCAP of it?
Have you left ipconfigTCP_HANG_PROTECTION and ipconfigTCP_KEEP_ALIVE enabled?