Mqtt slow response

I have a device subscribed to a certain topic.
The messages posted to that topic is based on publish subscribe model.

Device -------------> Publishes --------> TOPIC A-------->SUBSCRIBES TO TOPIC B
Lambda Function gets invoked based on topic A and publishes value to TOPIC B
Devices-----------> Receives Data on the topic B----->Controls Peripherals based on it.

When transactions are immediate and consecutive, the receive of messages on topic B is almost real time.

During a period of inactivity, say a few minutes of publish messages to topic A.
The first receive after inactivity is quiet a delayed on.

  1. I’ve configured the lambas so that they don’t have a cold start.
  2. What happens when there is no payload between the device and the broker for a long time?

There is no re-connection as the device is mostly online via ethernet.

Should persistent connection could be of any help? Or any changes to #define CONNACK_RECV_TIMEOUT_MS ( 2000U )

Any advice is appreciated to avoid occasional long delays after minutes of inactivity.

Hi @Surender

Could you specifiy which libraries (i.e coreMQTT, coreMQTT-Agent, etc) you are using and their versions?

Could you also elaborate, or show code if possible, of the mqtt processing code/loops? Immediately, it’s hard to single out the root of the issue.

  • coreMQTT v1.1.0
  • coreMQTT Agent v1.0.0
  • FreeRTOS Kernel V10.4.3
  • FreeRTOS V202107.00
  • FreeRTOS Transport Secure Sockets V1.0.1

Just a little update when we tried to investigate using the wireshark.

The slow mqtt responses were due to occasional TCP Spurious Retransmissions even with a QOS 0.

Any thoughts to configurations to avoid this or delay the retranmission of tcp packets?

Note : The setup is running on Physical Ethernet.

Regards,
Surender


FYI

Looking at the Wireshark logs you shared, it seems that the TCP stack on your device:

  1. is not seeing the ACK sent by the broker and therefore, attempting to the send the data again.
  2. is not responding with ACK in a timely manner causing the broker to send the data again.

Which TCP stack are you using? Can you enable debug logs in that stack and may be that can provide a clue?

Thanks.

Hi @aggarg,
I work with @Surender on this issue. Let me share more logs.
I added some comments to it.
Our target is to have a consistent response time for this MQTT request-response query.
Currently, we experience delays because of the TCP retransmission.

=====================================================================   Normal TCP example ================================================
[14:25:28]  [Info] Publishing message to request topic
[14:25:28]  lwip_send(54, data=0x3ffb6124, size=78, flags=0x0)
[14:25:28]  tcpip_thread: API message 0x3ffba290
[14:25:28]  tcp_write(pcb=0x3ffb4c2c, data=0x3ffb6124, len=78, apiflags=1)
[14:25:28]  pbuf_alloc(length=1440)
[14:25:28]  pbuf_alloc(length=1440) == 0x3ffdf42c
[14:25:28]  pbuf_add_header: old 0x3ffdf490 new 0x3ffdf47c (20)
[14:25:28]  tcp_write: queueing 1010700106:1010700184
[14:25:28]  tcp_output_segment: rtseq 1010700106
[14:25:28]  tcp_output_segment: 1010700106:1010700184
[14:25:28]  pbuf_add_header: old 0x3ffdf47c new 0x3ffdf468 (20)
[14:25:28]  ip4_output_if: en1
[14:25:28]  IP header:
[14:25:28]  +-------------------------------+
[14:25:28]  | 4 | 5 |  0x00 |       118     | (v, hl, tos, len)
[14:25:28]  +-------------------------------+
[14:25:28]  |       73      |000|       0   | (id, flags, offset)
[14:25:28]  +-------------------------------+
[14:25:28]  |  255  |    6  |    0xadcd     | (ttl, proto, chksum)
[14:25:28]  +-------------------------------+
[14:25:28]  |  iot device  | (src)
[14:25:28]  +-------------------------------+
[14:25:28]  |   aws iot core  | (dest)
[14:25:28]  +-------------------------------+
[14:25:28]  ip4_output_if: call netif->output()
[14:25:28]  pbuf_add_header: old 0x3ffdf468 new 0x3ffdf45a (14)
[14:25:28]  ethernet_output: sending packet 0x3ffdf42c
[14:25:28]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:28]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:28]  pbuf_alloc(length=60)
[14:25:28]  pbuf_alloc(length=60) == 0x3ffaf514
[14:25:28]  tcpip_thread: PACKET 0x3ffaecd4
[14:25:28]  ethernet_input: dest:iot_device_mac, src:gateway_mac, type:800
[14:25:28]  pbuf_remove_header: old 0x3fff5328 new 0x3fff5336 (14)
[14:25:28]  ip_input: iphdr->dest 0xb489a8c0 netif->ip_addr 0xb489a8c0 (0x89a8c0, 0x89a8c0, 0xb4000000)
[14:25:28]  ip4_input: packet accepted on interface en
[14:25:28]  ip4_input: 
[14:25:28]  IP header:
[14:25:28]  +-------------------------------+
[14:25:28]  | 4 | 5 |  0x00 |        40     | (v, hl, tos, len)
[14:25:28]  +-------------------------------+
[14:25:28]  |    29313      |010|       0   | (id, flags, offset)
[14:25:28]  +-------------------------------+
[14:25:28]  |  229  |    6  |    0x15e3     | (ttl, proto, chksum)
[14:25:28]  +-------------------------------+
[14:25:28]  |   aws iot core  | (src)
[14:25:28]  +-------------------------------+
[14:25:28]  |  iot device  | (dest)
[14:25:28]  +-------------------------------+
[14:25:28]  ip4_input: p->len 40 p->tot_len 40
[14:25:28]  pbuf_remove_header: old 0x3fff5336 new 0x3fff534a (20)
[14:25:28]  pbuf_remove_header: old 0x3fff534a new 0x3fff535e (20)
[14:25:28]  +-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
[14:25:28]  tcp_receive: ACK for 1010700184, unacked->seqno 1010700106:1010700184
[14:25:28]  tcp_receive: removing 1010700106:1010700184 from pcb->unacked
[14:25:28]  pbuf_free(0x3ffdf42c)
[14:25:28]  pbuf_free: deallocating 0x3ffdf42c
[14:25:28]  tcp_receive: pcb->rttest 367 rtseq 1010700106 ackno 1010700184
[14:25:28]  tcp_receive: experienced rtt 0 ticks (0 msec).
[14:25:28]  tcp_receive: RTO 3 (1500 milliseconds)
[14:25:28]  tcp_output: nothing to send (0x0)
[14:25:28]  pbuf_free(0x3ffaf514)
[14:25:28]  pbuf_free: deallocating 0x3ffaf514
[14:25:28]  tcpip_thread: API message 0x3ffb7e40
[14:25:28]  lwip_send(54) err=0 written=78
[14:25:28]  lwip_send(54, data=0x3ffb6124, size=150, flags=0x0)
[14:25:28]  tcpip_thread: API message 0x3ffba290
[14:25:28]  tcp_write(pcb=0x3ffb4c2c, data=0x3ffb6124, len=150, apiflags=1)
[14:25:28]  pbuf_alloc(length=1440)
[14:25:28]  pbuf_alloc(length=1440) == 0x3ffdf42c
[14:25:28]  pbuf_add_header: old 0x3ffdf490 new 0x3ffdf47c (20)
[14:25:28]  tcp_write: queueing 1010700184:1010700334
[14:25:28]  tcp_output_segment: rtseq 1010700184
[14:25:28]  tcp_output_segment: 1010700184:1010700334
[14:25:28]  pbuf_add_header: old 0x3ffdf47c new 0x3ffdf468 (20)
[14:25:28]  ip4_output_if: en1
[14:25:28]  IP header:
[14:25:28]  +-------------------------------+
[14:25:28]  | 4 | 5 |  0x00 |       190     | (v, hl, tos, len)
[14:25:28]  +-------------------------------+
[14:25:28]  |       74      |000|       0   | (id, flags, offset)
[14:25:28]  +-------------------------------+
[14:25:28]  |  255  |    6  |    0xad84     | (ttl, proto, chksum)
[14:25:28]  +-------------------------------+
[14:25:28]  |  iot device  | (src)
[14:25:28]  +-------------------------------+
[14:25:28]  |   aws iot core  | (dest)
[14:25:28]  +-------------------------------+
[14:25:28]  ip4_output_if: call netif->output()
[14:25:28]  pbuf_add_header: old 0x3ffdf468 new 0x3ffdf45a (14)
[14:25:28]  ethernet_output: sending packet 0x3ffdf42c
[14:25:28]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:28]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:28]  pbuf_alloc(length=60)
[14:25:28]  pbuf_alloc(length=60) == 0x3ffaf514
[14:25:28]  tcpip_thread: PACKET 0x3ffaecd4
[14:25:28]  ethernet_input: dest:iot_device_mac, src:gateway_mac, type:800
[14:25:28]  pbuf_remove_header: old 0x3fff5328 new 0x3fff5336 (14)
[14:25:28]  ip_input: iphdr->dest 0xb489a8c0 netif->ip_addr 0xb489a8c0 (0x89a8c0, 0x89a8c0, 0xb4000000)
[14:25:28]  ip4_input: packet accepted on interface en
[14:25:28]  ip4_input: 
[14:25:28]  IP header:
[14:25:28]  +-------------------------------+
[14:25:28]  | 4 | 5 |  0x00 |        40     | (v, hl, tos, len)
[14:25:28]  +-------------------------------+
[14:25:28]  |    29314      |010|       0   | (id, flags, offset)
[14:25:28]  +-------------------------------+
[14:25:28]  |  229  |    6  |    0x15e2     | (ttl, proto, chksum)
[14:25:28]  +-------------------------------+
[14:25:28]  |   aws iot core  | (src)
[14:25:28]  +-------------------------------+
[14:25:28]  |  iot device  | (dest)
[14:25:28]  +-------------------------------+
[14:25:28]  ip4_input: p->len 40 p->tot_len 40
[14:25:28]  pbuf_remove_header: old 0x3fff5336 new 0x3fff534a (20)
[14:25:28]  pbuf_remove_header: old 0x3fff534a new 0x3fff535e (20)
[14:25:28]  +-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
[14:25:28]  tcp_receive: ACK for 1010700334, unacked->seqno 1010700184:1010700334
[14:25:28]  tcp_receive: removing 1010700184:1010700334 from pcb->unacked
[14:25:28]  pbuf_free(0x3ffdf42c)
[14:25:28]  pbuf_free: deallocating 0x3ffdf42c
[14:25:28]  tcp_receive: pcb->rttest 368 rtseq 1010700184 ackno 1010700334
[14:25:28]  tcp_receive: experienced rtt 0 ticks (0 msec).
[14:25:28]  tcp_receive: RTO 3 (1500 milliseconds)
[14:25:28]  tcp_output: nothing to send (0x0)
[14:25:28]  pbuf_free(0x3ffaf514)
[14:25:28]  pbuf_free: deallocating 0x3ffaf514
[14:25:28]  etharp_timer
[14:25:28]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:28]  lwip_send(54) err=0 written=150
[14:25:28]  [Info] Time elapsed: 490 ms between publish and publish ACK
[14:25:28]  pbuf_alloc(length=183)
[14:25:29]  pbuf_alloc(length=183) == 0x3ffaf514
[14:25:29]  tcpip_thread: PACKET 0x3ffb976c
[14:25:29]  ethernet_input: dest:iot_device_mac, src:gateway_mac, type:800
[14:25:29]  pbuf_remove_header: old 0x3ffdf42c new 0x3ffdf43a (14)
[14:25:29]  ip_input: iphdr->dest 0xb489a8c0 netif->ip_addr 0xb489a8c0 (0x89a8c0, 0x89a8c0, 0xb4000000)
[14:25:29]  ip4_input: packet accepted on interface en
[14:25:29]  ip4_input: 
[14:25:29]  IP header:
[14:25:29]  +-------------------------------+
[14:25:29]  | 4 | 5 |  0x00 |       169     | (v, hl, tos, len)
[14:25:29]  +-------------------------------+
[14:25:29]  |    29315      |010|       0   | (id, flags, offset)
[14:25:29]  +-------------------------------+
[14:25:29]  |  229  |    6  |    0x1560     | (ttl, proto, chksum)
[14:25:29]  +-------------------------------+
[14:25:29]  |   aws iot core  | (src)
[14:25:29]  +-------------------------------+
[14:25:29]  |  iot device  | (dest)
[14:25:29]  +-------------------------------+
[14:25:29]  ip4_input: p->len 169 p->tot_len 169
[14:25:29]  pbuf_remove_header: old 0x3ffdf43a new 0x3ffdf44e (20)
[14:25:29]  pbuf_remove_header: old 0x3ffdf44e new 0x3ffdf462 (20)
[14:25:29]  +-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
[14:25:29]  tcp_receive: pcb->rttest 0 rtseq 1010700184 ackno 1010700334
[14:25:29]  tcp_output: nothing to send (0x0)
[14:25:29]  netconn_recv_data: received 0x3ffaf514, len=129
[14:25:29]  lwip_recv_tcp: netconn_recv err=0, pbuf=0x3ffaf514
[14:25:29]  lwip_recv_tcp: buflen=129 recv_left=5 off=0
[14:25:29]  pbuf_remove_header: old 0x3ffdf462 new 0x3ffdf467 (5)
[14:25:29]  tcp_output: nothing to send (0x0)
[14:25:29]  pbuf_alloc(length=20)
[14:25:29]  pbuf_alloc(length=20) == 0x3ffaeba8
[14:25:29]  tcp_output: sending ACK for -1485235822
[14:25:29]  pbuf_add_header: old 0x3ffaebf8 new 0x3ffaebe4 (20)
[14:25:29]  ip4_output_if: en1
[14:25:29]  IP header:
[14:25:29]  +-------------------------------+
[14:25:29]  | 4 | 5 |  0x00 |        40     | (v, hl, tos, len)
[14:25:29]  +-------------------------------+
[14:25:29]  |       75      |000|       0   | (id, flags, offset)
[14:25:29]  +-------------------------------+
[14:25:29]  |  255  |    6  |    0xae19     | (ttl, proto, chksum)
[14:25:29]  +-------------------------------+
[14:25:29]  |  iot device  | (src)
[14:25:29]  +-------------------------------+
[14:25:29]  |   aws iot core  | (dest)
[14:25:29]  +-------------------------------+
[14:25:29]  ip4_output_if: call netif->output()
[14:25:29]  pbuf_add_header: old 0x3ffaebe4 new 0x3ffaebd6 (14)
[14:25:29]  ethernet_output: sending packet 0x3ffaeba8
[14:25:29]  pbuf_free(0x3ffaeba8)
[14:25:29]  pbuf_free: deallocating 0x3ffaeba8
[14:25:29]  lwip_recv_tcp: lastdata now pbuf=0x3ffaf514
[14:25:29]  tcpip_thread: API message 0x3ffb7e00
[14:25:29]  tcpip_thread: API message 0x3ffb7e40
[14:25:29]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=5
[14:25:29]  lwip_recvfrom(54, 0x3ffdb34d, 124, 0x0, ..)
[14:25:29]  lwip_recv_tcp: top while sock->lastdata=0x3ffaf514
[14:25:29]  lwip_recv_tcp: buflen=124 recv_left=124 off=0
[14:25:29]  lwip_recv_tcp: deleting pbuf=0x3ffaf514
[14:25:29]  pbuf_free(0x3ffaf514)
[14:25:29]  pbuf_free: deallocating 0x3ffaf514
[14:25:29]  tcpip_thread: API message 0x3ffb7e00
[14:25:29]  tcpip_thread: API message 0x3ffb7e40
[14:25:29]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=124
//
//@commnet: Note that req-response happens in 380 ms without retransmission
//
[14:25:29]  [Info] Time elapsed: 380 ms between publish ACK and receive response                   
[14:25:29]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:29]  lwip_recv_tcp: top while sock->lastdata=0x0

=====================================================================   IDLE sample ================================================

[14:25:51]  etharp_timer
[14:25:51]  tcp_output: nothing to send (0x0)
[14:25:52]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:52]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:52]  tcpip_thread: API message 0x3ffb7e40
[14:25:52]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:52]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:52]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:52]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:52]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:52]  tcpip_thread: API message 0x3ffb7e40
[14:25:52]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:52]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:52]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:52]  etharp_timer
[14:25:52]  tcp_output: nothing to send (0x0)
[14:25:53]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:53]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:53]  tcpip_thread: API message 0x3ffb7e40
[14:25:53]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:53]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:53]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:53]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:53]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:53]  tcpip_thread: API message 0x3ffb7e40
[14:25:53]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:53]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:53]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:53]  etharp_timer
[14:25:53]  tcp_output: nothing to send (0x0)
[14:25:54]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:54]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:54]  tcpip_thread: API message 0x3ffb7e40
[14:25:54]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:54]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:54]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:54]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:54]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:54]  tcpip_thread: API message 0x3ffb7e40
[14:25:54]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:54]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:54]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:54]  etharp_timer
[14:25:54]  tcp_output: nothing to send (0x0)
[14:25:55]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:55]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:55]  tcpip_thread: API message 0x3ffb7e40
[14:25:55]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:55]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:55]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:55]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:55]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:55]  tcpip_thread: API message 0x3ffb7e40
[14:25:55]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:55]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:55]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:55]  etharp_timer
[14:25:55]  tcp_output: nothing to send (0x0)
[14:25:56]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:56]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:56]  tcpip_thread: API message 0x3ffb7e40
[14:25:56]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:56]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:56]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:56]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:56]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:56]  tcpip_thread: API message 0x3ffb7e40
[14:25:56]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:56]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:56]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:56]  etharp_timer
[14:25:56]  tcp_output: nothing to send (0x0)
[14:25:57]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:57]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:57]  tcpip_thread: API message 0x3ffb7e40
[14:25:57]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:57]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:57]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:57]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:57]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:57]  tcpip_thread: API message 0x3ffb7e40
[14:25:57]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:57]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:57]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:57]  etharp_timer
[14:25:57]  tcp_output: nothing to send (0x0)
[14:25:58]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:58]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:58]  tcpip_thread: API message 0x3ffb7e40
[14:25:58]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:58]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:58]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:58]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:58]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:58]  tcpip_thread: API message 0x3ffb7e40
[14:25:58]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:58]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:58]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:58]  etharp_timer
[14:25:58]  tcp_output: nothing to send (0x0)
[14:25:59]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:59]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:59]  tcpip_thread: API message 0x3ffb7e40
[14:25:59]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:59]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:59]  lwip_recv_tcp: top while sock->lastdata=0x0

=====================================================================   TCP retransmission example ================================================

[14:25:59]  [Info] Publishing message to request topic
[14:25:59]  lwip_send(54, data=0x3ffb6124, size=78, flags=0x0)
[14:25:59]  tcpip_thread: API message 0x3ffba290
[14:25:59]  tcp_write(pcb=0x3ffb4c2c, data=0x3ffb6124, len=78, apiflags=1)
[14:25:59]  pbuf_alloc(length=1440)
[14:25:59]  pbuf_alloc(length=1440) == 0x3ffdf42c
[14:25:59]  pbuf_add_header: old 0x3ffdf490 new 0x3ffdf47c (20)
[14:25:59]  tcp_write: queueing 1010700562:1010700640
[14:25:59]  tcp_output_segment: rtseq 1010700562
[14:25:59]  tcp_output_segment: 1010700562:1010700640
[14:25:59]  pbuf_add_header: old 0x3ffdf47c new 0x3ffdf468 (20)
[14:25:59]  ip4_output_if: en1
[14:25:59]  IP header:
[14:25:59]  +-------------------------------+
[14:25:59]  | 4 | 5 |  0x00 |       118     | (v, hl, tos, len)
[14:25:59]  +-------------------------------+
[14:25:59]  |       79      |000|       0   | (id, flags, offset)
[14:25:59]  +-------------------------------+
[14:25:59]  |  255  |    6  |    0xadc7     | (ttl, proto, chksum)
[14:25:59]  +-------------------------------+
[14:25:59]  |  iot device  | (src)
[14:25:59]  +-------------------------------+
[14:25:59]  |   aws iot core  | (dest)
[14:25:59]  +-------------------------------+
[14:25:59]  ip4_output_if: call netif->output()
[14:25:59]  pbuf_add_header: old 0x3ffdf468 new 0x3ffdf45a (14)
[14:25:59]  ethernet_output: sending packet 0x3ffdf42c
[14:25:59]  lwip_send(54) err=0 written=78
[14:25:59]  lwip_send(54, data=0x3ffb6124, size=150, flags=0x0)
[14:25:59]  tcpip_thread: API message 0x3ffba290
[14:25:59]  tcp_write(pcb=0x3ffb4c2c, data=0x3ffb6124, len=150, apiflags=1)
[14:25:59]  pbuf_alloc(length=1440)
[14:25:59]  pbuf_alloc(length=1440) == 0x3fff5328
[14:25:59]  pbuf_add_header: old 0x3fff538c new 0x3fff5378 (20)
[14:25:59]  tcp_write: queueing 1010700640:1010700790
[14:25:59]  lwip_send(54) err=0 written=150
[14:25:59]  [Info] Time elapsed: 130 ms between publish and publish ACK
[14:25:59]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:25:59]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:25:59]  tcpip_thread: API message 0x3ffb7e40
[14:25:59]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:25:59]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:25:59]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:25:59]  etharp_timer
[14:25:59]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:26:00]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:26:00]  tcpip_thread: API message 0x3ffb7e40
[14:26:00]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:26:00]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:26:00]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:26:00]  tcp_slowtmr: rtime 3 pcb->rto 3
[14:26:00]  tcp_output_segment: rtseq 1010700562
[14:26:00]  tcp_output_segment: 1010700562:1010700640
[14:26:00]  pbuf_add_header: old 0x3ffdf47c new 0x3ffdf468 (20)
[14:26:00]  ip4_output_if: en1
[14:26:00]  IP header:
[14:26:00]  +-------------------------------+
[14:26:00]  | 4 | 5 |  0x00 |       118     | (v, hl, tos, len)
[14:26:00]  +-------------------------------+
[14:26:00]  |       80      |000|       0   | (id, flags, offset)
[14:26:00]  +-------------------------------+
[14:26:00]  |  255  |    6  |    0xadc6     | (ttl, proto, chksum)
[14:26:00]  +-------------------------------+
[14:26:00]  |  iot device  | (src)
[14:26:00]  +-------------------------------+
[14:26:00]  |   aws iot core  | (dest)
[14:26:00]  +-------------------------------+
[14:26:00]  ip4_output_if: call netif->output()
[14:26:00]  pbuf_add_header: old 0x3ffdf468 new 0x3ffdf45a (14)
[14:26:00]  ethernet_output: sending packet 0x3ffdf42c
[14:26:00]  pbuf_alloc(length=60)
[14:26:00]  pbuf_alloc(length=60) == 0x3ffaecd4
[14:26:00]  tcpip_thread: PACKET 0x3ffb59c8
[14:26:00]  ethernet_input: dest:iot_device_mac, src:gateway_mac, type:800
[14:26:00]  pbuf_remove_header: old 0x3fff803c new 0x3fff804a (14)
[14:26:00]  ip_input: iphdr->dest 0xb489a8c0 netif->ip_addr 0xb489a8c0 (0x89a8c0, 0x89a8c0, 0xb4000000)
[14:26:00]  ip4_input: packet accepted on interface en
[14:26:00]  ip4_input: 
[14:26:00]  IP header:
[14:26:00]  +-------------------------------+
[14:26:00]  | 4 | 5 |  0x00 |        40     | (v, hl, tos, len)
[14:26:00]  +-------------------------------+
[14:26:00]  |    29321      |010|       0   | (id, flags, offset)
[14:26:00]  +-------------------------------+
[14:26:00]  |  229  |    6  |    0x15db     | (ttl, proto, chksum)
[14:26:00]  +-------------------------------+
[14:26:00]  |   aws iot core  | (src)
[14:26:00]  +-------------------------------+
[14:26:00]  |  iot device  | (dest)
[14:26:00]  +-------------------------------+
[14:26:00]  ip4_input: p->len 40 p->tot_len 40
[14:26:00]  pbuf_remove_header: old 0x3fff804a new 0x3fff805e (20)
[14:26:00]  pbuf_remove_header: old 0x3fff805e new 0x3fff8072 (20)
[14:26:00]  +-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
[14:26:00]  tcp_receive: ACK for 1010700640, unacked->seqno 1010700562:1010700640
[14:26:00]  tcp_receive: removing 1010700562:1010700640 from pcb->unacked
[14:26:00]  pbuf_free(0x3ffdf42c)
[14:26:00]  pbuf_free: deallocating 0x3ffdf42c
[14:26:00]  tcp_receive: pcb->rttest 431 rtseq 1010700562 ackno 1010700640
[14:26:00]  tcp_receive: experienced rtt 0 ticks (0 msec).
[14:26:00]  tcp_receive: RTO 3 (1500 milliseconds)
[14:26:00]  tcp_output_segment: rtseq 1010700640
[14:26:00]  tcp_output_segment: 1010700640:1010700790
[14:26:00]  pbuf_add_header: old 0x3fff5378 new 0x3fff5364 (20)
[14:26:00]  ip4_output_if: en1
[14:26:00]  IP header:
[14:26:00]  +-------------------------------+
[14:26:00]  | 4 | 5 |  0x00 |       190     | (v, hl, tos, len)
[14:26:00]  +-------------------------------+
[14:26:00]  |       81      |000|       0   | (id, flags, offset)
[14:26:00]  +-------------------------------+
[14:26:00]  |  255  |    6  |    0xad7d     | (ttl, proto, chksum)
[14:26:00]  +-------------------------------+
[14:26:00]  |  iot device  | (src)
[14:26:00]  +-------------------------------+
[14:26:00]  |   aws iot core  | (dest)
[14:26:00]  +-------------------------------+
[14:26:00]  ip4_output_if: call netif->output()
[14:26:00]  pbuf_add_header: old 0x3fff5364 new 0x3fff5356 (14)
[14:26:00]  ethernet_output: sending packet 0x3fff5328
[14:26:00]  pbuf_free(0x3ffaecd4)
[14:26:00]  pbuf_free: deallocating 0x3ffaecd4
[14:26:00]  lwip_recv_tcp: netconn_recv err=-3, pbuf=0x0
[14:26:00]  pbuf_alloc(length=60)
[14:26:00]  pbuf_alloc(length=60) == 0x3ffaecd4
[14:26:00]  tcpip_thread: PACKET 0x3ffb976c
[14:26:00]  ethernet_input: dest:iot_device_mac, src:gateway_mac, type:800
[14:26:00]  pbuf_remove_header: old 0x3ffdf42c new 0x3ffdf43a (14)
[14:26:00]  ip_input: iphdr->dest 0xb489a8c0 netif->ip_addr 0xb489a8c0 (0x89a8c0, 0x89a8c0, 0xb4000000)
[14:26:00]  ip4_input: packet accepted on interface en
[14:26:00]  ip4_input: 
[14:26:00]  IP header:
[14:26:00]  +-------------------------------+
[14:26:00]  | 4 | 5 |  0x00 |        40     | (v, hl, tos, len)
[14:26:00]  +-------------------------------+
[14:26:00]  |    29322      |010|       0   | (id, flags, offset)
[14:26:00]  +-------------------------------+
[14:26:00]  |  229  |    6  |    0x15da     | (ttl, proto, chksum)
[14:26:00]  +-------------------------------+
[14:26:00]  |   aws iot core  | (src)
[14:26:00]  +-------------------------------+
[14:26:00]  |  iot device  | (dest)
[14:26:00]  +-------------------------------+
[14:26:00]  ip4_input: p->len 40 p->tot_len 40
[14:26:00]  pbuf_remove_header: old 0x3ffdf43a new 0x3ffdf44e (20)
[14:26:00]  pbuf_remove_header: old 0x3ffdf44e new 0x3ffdf462 (20)
[14:26:00]  +-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
[14:26:00]  tcp_receive: ACK for 1010700790, unacked->seqno 1010700640:1010700790
[14:26:00]  tcp_receive: removing 1010700640:1010700790 from pcb->unacked
[14:26:00]  pbuf_free(0x3fff5328)
[14:26:00]  pbuf_free: deallocating 0x3fff5328
[14:26:00]  tcp_receive: pcb->rttest 431 rtseq 1010700640 ackno 1010700790
[14:26:00]  tcp_receive: experienced rtt 0 ticks (0 msec).
[14:26:00]  tcp_receive: RTO 3 (1500 milliseconds)
[14:26:00]  tcp_output: nothing to send (0x0)
[14:26:00]  pbuf_free(0x3ffaecd4)
[14:26:00]  pbuf_free: deallocating 0x3ffaecd4
[14:26:00]  etharp_timer
[14:26:00]  lwip_recv_tcp: p == NULL, error is "Timeout."!
[14:26:00]  tcpip_thread: API message 0x3ffb7e40
[14:26:00]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=-1
[14:26:00]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:26:00]  lwip_recv_tcp: top while sock->lastdata=0x0
[14:26:00]  pbuf_alloc(length=183)
[14:26:01]  pbuf_alloc(length=183) == 0x3ffaf514
[14:26:01]  tcpip_thread: PACKET 0x3ffb976c
[14:26:01]  ethernet_input: dest:iot_device_mac, src:gateway_mac, type:800
[14:26:01]  pbuf_remove_header: old 0x3ffdf42c new 0x3ffdf43a (14)
[14:26:01]  ip_input: iphdr->dest 0xb489a8c0 netif->ip_addr 0xb489a8c0 (0x89a8c0, 0x89a8c0, 0xb4000000)
[14:26:01]  ip4_input: packet accepted on interface en
[14:26:01]  ip4_input: 
[14:26:01]  IP header:
[14:26:01]  +-------------------------------+
[14:26:01]  | 4 | 5 |  0x00 |       169     | (v, hl, tos, len)
[14:26:01]  +-------------------------------+
[14:26:01]  |    29323      |010|       0   | (id, flags, offset)
[14:26:01]  +-------------------------------+
[14:26:01]  |  229  |    6  |    0x1558     | (ttl, proto, chksum)
[14:26:01]  +-------------------------------+
[14:26:01]  |   aws iot core  | (src)
[14:26:01]  +-------------------------------+
[14:26:01]  |  iot device  | (dest)
[14:26:01]  +-------------------------------+
[14:26:01]  ip4_input: p->len 169 p->tot_len 169
[14:26:01]  pbuf_remove_header: old 0x3ffdf43a new 0x3ffdf44e (20)
[14:26:01]  pbuf_remove_header: old 0x3ffdf44e new 0x3ffdf462 (20)
[14:26:01]  +-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
[14:26:01]  tcp_receive: pcb->rttest 0 rtseq 1010700640 ackno 1010700790
[14:26:01]  tcp_output: nothing to send (0x0)
[14:26:01]  netconn_recv_data: received 0x3ffaf514, len=129
[14:26:01]  lwip_recv_tcp: netconn_recv err=0, pbuf=0x3ffaf514
[14:26:01]  lwip_recv_tcp: buflen=129 recv_left=5 off=0
[14:26:01]  pbuf_remove_header: old 0x3ffdf462 new 0x3ffdf467 (5)
[14:26:01]  lwip_recv_tcp: lastdata now pbuf=0x3ffaf514
[14:26:01]  tcpip_thread: API message 0x3ffb7e00
[14:26:01]  tcpip_thread: API message 0x3ffb7e40
[14:26:01]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=5
[14:26:01]  lwip_recvfrom(54, 0x3ffdb34d, 124, 0x0, ..)
[14:26:01]  tcp_output: nothing to send (0x0)
[14:26:01]  pbuf_alloc(length=20)
[14:26:01]  pbuf_alloc(length=20) == 0x3ffaeba8
[14:26:01]  tcp_output: sending ACK for -1485235564
[14:26:01]  pbuf_add_header: old 0x3ffaebf8 new 0x3ffaebe4 (20)
[14:26:01]  ip4_output_if: en1
[14:26:01]  IP header:
[14:26:01]  +-------------------------------+
[14:26:01]  | 4 | 5 |  0x00 |        40     | (v, hl, tos, len)
[14:26:01]  +-------------------------------+
[14:26:01]  |       82      |000|       0   | (id, flags, offset)
[14:26:01]  +-------------------------------+
[14:26:01]  |  255  |    6  |    0xae12     | (ttl, proto, chksum)
[14:26:01]  +-------------------------------+
[14:26:01]  |  iot device  | (src)
[14:26:01]  +-------------------------------+
[14:26:01]  |   aws iot core  | (dest)
[14:26:01]  +-------------------------------+
[14:26:01]  ip4_output_if: call netif->output()
[14:26:01]  pbuf_add_header: old 0x3ffaebe4 new 0x3ffaebd6 (14)
[14:26:01]  ethernet_output: sending packet 0x3ffaeba8
[14:26:01]  pbuf_free(0x3ffaeba8)
[14:26:01]  pbuf_free: deallocating 0x3ffaeba8
[14:26:01]  lwip_recv_tcp: top while sock->lastdata=0x3ffaf514
[14:26:01]  lwip_recv_tcp: buflen=124 recv_left=124 off=0
[14:26:01]  lwip_recv_tcp: deleting pbuf=0x3ffaf514
[14:26:01]  pbuf_free(0x3ffaf514)
[14:26:01]  pbuf_free: deallocating 0x3ffaf514
[14:26:01]  tcpip_thread: API message 0x3ffb7e00
[14:26:01]  tcpip_thread: API message 0x3ffb7e40
[14:26:01]  lwip_recvfrom(54):  addr=aws_iot_ip port=8883 len=124
//
//@commnet: Note that req-response happens in 1710 ms including TCP retransmission
//
[14:26:01]  [Info] Time elapsed: 1710 ms between publish ACK and receive response
[14:26:01]  lwip_recvfrom(54, 0x3ffdb348, 5, 0x0, ..)
[14:26:01]  lwip_recv_tcp: top while sock->lastdata=0x0

Caution - I am not LWIP expert and below are my observations from your logs. You may get better support on LWIP or Espressif forums.

In the failure case, ethernet_output is not followed by ethernet_input which indicates that the Ethernet driver is not sending the ACK packet to the TCP stack. Can you try to trace the ethernet_input path?

Thank you @aggarg .
That’s a very good point there. We’ll try and narrow down this issue to the layers below.