pxcallback invoking eMQTTAgentDisconnect in endless loop

nickm2018 wrote on January 15, 2019:

On one device recently I noticed that I would receive the eMQTTAgentDisconnect in the event callback (pxCallback of the connectionParams) after successfully connecting and successfully subscribing to the broker. My firmware is structured such that there is a 4-6 second delay before reconnecting if the connection was closed.

Basic pseudo code
mqtt agent create
while(1)
{
mqtt connect
mqtt subscribe
while(!error)
{
mqtt publish
if error or disconnect callback invoked
disconnect
}
delay 4 seconds
}

I am not sure why I was receiving the disconnect message from the broker. I power cycled the device and it connected fine. I do not know if there was something lingering in the MQTT agent instance or what.

The mqtt broker log showed the connect and disconnect occurring but without any errors. There were a couple of throttling errors reported but the delays were > than the 1 second limit. Right after resetting the device it looks like there was a duplicate client id error and then it recovered. My client ID’s are the device MAC address so it should be unique to my device.

I have had problems with the duplicate client ID issue due to device resets and the disconnect callback seems like the only way to handle it.

What happens if the broker sees a duplicate id scenario but there is only one real client and that is my board? Could there be a possible looping scenario as described above?

abhijit-AWS wrote on January 16, 2019:

Hello nickm2018 !
Do you see the disconnect happening every time or is it intermittent? Are you connecting to AWS broker? Can you please enable MQTT debug logs on your device if the problem is reproducible?
Thanks.

nickm2018 wrote on January 16, 2019:

When it enters this “mode”, the connect and disconnect are constant. This issue is not easily reproduced and I do not have a handle on what triggers it. I am using a AWS broker connection.
In the below log file, i believe the DUPLICATE_CLIENT_ID issue happens after i power cycle the board. I have seen this before with resets, either unexpected or post OTA updates. I use the MAC address as the clientID.

Here is a small capture of the broker log.
2019-01-14 22:56:21.513 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 22:56:16.597 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:26:37.945 logLevel:“ERROR” status:“Failure” eventType:“Disconnect” protocol:“MQTT” reason:“DUPLICATE_CLIENT_ID” details:“A new connection was established with the same client ID”} |
2019-01-14 14:26:35.910 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:25:23.608 logLevel:“ERROR” status:“Failure” eventType:“Disconnect” protocol:“MQTT” reason:“DUPLICATE_CLIENT_ID” details:“A new connection was established with the same client ID”} |
2019-01-14 14:25:21.542 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:23:56.123 logLevel:“ERROR” status:“Failure” eventType:“Disconnect” protocol:“MQTT” reason:“DUPLICATE_CLIENT_ID” details:“A new connection was established with the same client ID”} |
2019-01-14 14:23:54.084 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:54.219 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:49.055 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:47.805 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:41.646 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:39.769 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:34.564 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:32.456 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:26.945 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:25.031 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:19.141 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:17.557 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:12.915 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:11.218 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:22:05.509 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:22:04.017 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:21:58.584 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:21:56.893 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:21:51.419 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:21:49.425 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:21:44.359 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:21:42.640 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”
2019-01-14 14:21:37.038 logLevel:“INFO” status:“Success” eventType:“Disconnect” protocol:“MQTT”
2019-01-14 14:21:35.317 logLevel:“INFO” status:“Success” eventType:“Connect” protocol:“MQTT”

Here is a small capture of my device’s debug log
1257670 221645508 MQTT Received message d6c30000 from queue.
1257671 221645818 MQTT data sent 35 time=221645818
1257672 221645883 worker thread Socket sending wakeup to MQTT task.
1257673 221645884 MQTT MQTT Connect was accepted. Connection established.
1257674 221645884 MQTT Notifying task.
1257675 221645884 MQTT Received message 0 from queue.
1257676 221645884 App Command sent to MQTT task passed.
1257677 221645884 App Connected
1257678 221646414 App Sending command to MQTT task.
1257679 221646414 MQTT Received message d6c40000 from queue.
1257680 221646414 MQTT data sent 32 time=221646414
1257681 221646465 worker thread Socket sending wakeup to MQTT task.
1257682 221646465 MQTT MQTT Subscribe was accepted. Subscribed.
1257683 221646465 MQTT Notifying task.
1257684 221646466 MQTT Received message 0 from queue.
1257685 221646466 App Command sent to MQTT task passed.
1257686 221646466 App Sync Subscription = Successful
1257687 221646466 App Sending command to MQTT task.
1257688 221646566 MQTT Received message d6c50000 from queue.
1257689 221646566 MQTT data sent 48 time=221646566
1257690 221646617 worker thread Socket sending wakeup to MQTT task.
1257691 221646617 MQTT MQTT Subscribe was accepted. Subscribed.
1257692 221646617 MQTT Notifying task.
1257693 221646618 MQTT Received message 0 from queue.
1257694 221646618 App Command sent to MQTT task passed.
1257695 221646618 App Command Subscription = Successful
1257696 221647105 App Transition to Active
1257697 221647437 App ACTIVE to MQTT_AGENT_DISCONNECT_EVENT
1257698 221647698 App SERVICE SHUTTING DOWN STATE
1257699 221647698 App Sending command to MQTT task.
1257700 221647718 MQTT Received message d6c60000 from queue.
1257701 221647718 MQTT data sent 2 time=221647718
1257702 221647718 MQTT About to close socket.
1257703 221647719 MQTT Socket closed.
1257704 221647719 MQTT Notifying task.
1257705 221647719 App Command sent to MQTT task passed.
1257706 221647719 App Disconnected from MQTT broker
1257707 221652227 app_thread Heap is healthy 91 perc free (1065976 of 1169056)
1257708 221653604 App MQTT clientID CC:C0:79:DF:45:46
1257711 221653605 App Sending command to MQTT task.
1257712 221653619 MQTT Received message d6c70000 from queue.
1257713 221653873 MQTT data sent 35 time=221653873
1257714 221653921 worker thread Socket sending wakeup to MQTT task.
1257715 221653923 MQTT MQTT Connect was accepted. Connection established.
1257716 221653923 MQTT Notifying task.
1257717 221653923 MQTT Received message 0 from queue.
1257718 221653923 App Command sent to MQTT task passed.
1257719 221653923 App Connected
1257720 221654233 App Sending command to MQTT task.
1257721 221654233 MQTT Received message d6c80000 from queue.
1257722 221654233 MQTT data sent 32 time=221654233
1257723 221654344 worker thread Socket sending wakeup to MQTT task.
1257724 221654344 MQTT Received message 0 from queue.
1257725 221654394 MQTT MQTT Subscribe was accepted. Subscribed.
1257726 221654394 MQTT Notifying task.
1257727 221654394 App Command sent to MQTT task passed.
1257728 221654394 App Sync Subscription = Successful
1257729 221654394 App Sending command to MQTT task.
1257730 221654394 MQTT Received message d6c90000 from queue.
1257731 221654395 MQTT data sent 48 time=221654395
1257732 221654436 worker thread Socket sending wakeup to MQTT task.
1257733 221654445 MQTT MQTT Subscribe was accepted. Subscribed.
1257734 221654445 MQTT Notifying task.
1257735 221654445 MQTT Received message 0 from queue.
1257736 221654445 App Command sent to MQTT task passed.
1257737 221654445 App Command Subscription = Successful
1257738 221654829 App Transition to Active
1257739 221654994 App ACTIVE to MQTT_AGENT_DISCONNECT_EVENT
1257740 221655096 App SERVICE SHUTTING DOWN STATE
1257741 221655096 App Sending command to MQTT task.
1257742 221655145 MQTT Received message d6ca0000 from queue.
1257743 221655145 MQTT data sent 2 time=221655145
1257744 221655145 MQTT About to close socket.
1257745 221655146 MQTT Socket closed.
1257746 221655146 MQTT Notifying task.
1257747 221655146 App Command sent to MQTT task passed.
1257748 221655146 App Disconnected from MQTT broker

The ACTIVE to MQTT_AGENT_DISCONNECT_EVENT statement is printed when my application receives the disconnect callback and considers the mqtt connection as closed. From the debug output, it looks like the callback is triggered without the mqtt library receiving any messages.
Unfortunately i do not have the entire serial log file to know what happened prior to the device entering this more.

nickm2018 wrote on January 16, 2019:

I was able to witness this on another device.

The device was communicating correctly with the server and then all of a sudden it appears to have been disconnected by the server. After the disconnect, the device enters the loop of successful connect, successful subscribe, then disconnect callback.

“1/14/2019 2:02:24 PM”,86998 237063369 MQTT Publish message
“1/14/2019 2:02:24 PM”,86999 237063372 MQTT data sent 523, time=237063372
“1/14/2019 2:02:24 PM”,87000 237063402 worker thread Socket sending wakeup to MQTT task.
“1/14/2019 2:02:24 PM”,87001 237063403 MQTT MQTT Publish was successful.
“1/14/2019 2:02:24 PM”,87002 237063403 MQTT Notifying task.
“1/14/2019 2:02:24 PM”,87003 237063403 MQTT Received message 0 from queue.
“1/14/2019 2:02:24 PM”,87004 237063403 App Command sent to MQTT task passed.
“1/14/2019 2:02:24 PM”,87005 237063403 App Successfully published Message
“1/14/2019 2:02:43 PM”,87006 237082817 worker thread Socket sending wakeup to MQTT task.
“1/14/2019 2:02:43 PM”,87007 237082817 MQTT Received message 0 from queue.
“1/14/2019 2:02:43 PM”,87008 237082818 MQTT data sent 2, time=237082818
“1/14/2019 2:02:43 PM”,87009 237082818 MQTT About to close socket.
“1/14/2019 2:02:43 PM”,87010 237082818 MQTT Socket closed.
“1/14/2019 2:02:43 PM”,87011 237082818 App ACTIVE to MQTT_AGENT_DISCONNECT_EVENT
“1/14/2019 2:02:44 PM”,87012 237083103 App SERVICE SHUTTING DOWN STATE
“1/14/2019 2:02:44 PM”,87013 237083103 App Sending command to MQTT task.
“1/14/2019 2:02:44 PM”,87014 237083103 MQTT Received message f770000 from queue.
“1/14/2019 2:02:44 PM”,87015 237083103 MQTT Notifying task.
“1/14/2019 2:02:44 PM”,87016 237083103 App Command sent to MQTT task passed.
“1/14/2019 2:02:44 PM”,87017 237083103 App Disconnected from MQTT broker

SarenaAtAws wrote on March 14, 2019:

Hello nickm2018,

Is this issue still on your radar? As you may already know, a eMQTTClientDisconnected happens for many reasons:

  1. When the MQTT headers in a received packet is malformed.
  2. When we receive a packet that is not a PUBLISH, CONNACK, PUBACK, SUBACK, UNSUBAK, or PINGRESP type. We do not currently support the other MQTT message types.
  3. If the server refuses our MQTT connection.
  4. If the packet is malformed from not having enough bytes or the headers are not matching as expected.
    In the prvProcessReceivedDisconnect() function in aws_mqtt_agent.c could you print out the error?

mqttconfigDEBUG_LOG("Disconnect type: %d", pxParam->xDisconnectData.xDisconnectData);

Also, it appears we can’t extract which packet type caused the error and if it was the header or the number of bytes. Could you also put the following code in aws_mqtt_lib.c before every line containing xMalformedPacket = eMQTTTrue;:


// Define fileLine
configPRINTF("Disconnect in: %s, line: %d", __FUNCTION__, fileLine);