Merge OTA and Greengrass demo together

TJirsch wrote on September 24, 2019:

I have tried to share an MQTT connection via Greengrass with the OTA Agent.
From what I see, you get the same errors as I do:


"438 3328 [OTA Task] [OTA_CheckForUpdate] Failed to publish MQTT message."

This message corresponds to a message on Greengrass (from my logs):

"[2019-09-25T00:26:08.19+02:00][ERROR]-MQTT QoS Error: Greengrass does not support QoS 1."

The message in the Greengrass logs always occurs close to the message from the OTA Task on the ESP32.

The OTA Task wants to use QoS 1, but Greengrass only supports QoS 0, so OTA over Greengrass (GG) does not work.
Look in the Greengrass logs for yourself with:
tail -f /greengrass/ggc/var/log/system/*.log /greengrass/ggc/var/log/system/localwatch/localwatch.log

If you supply the OTA Agent with an MQTT connection directly to the IoT backend, all is fine; you can use the same client ID, certs, etc. in this case.
The client ID must be unique per broker, so you can connect to GG and the cloud with the same ID because the two are different brokers.

What seems to work now is sharing one MQTT connection for Thing Shadow and normal MQTT traffic to the same broker, both for Greengrass and for the IoT backend. This is an aw(e|s)some piece of work and removes the need for a third MQTT connection if you want to do OTA, shadow and MQTT to AWS at the same time.

I am using a timer task to switch between “normal operations” and “go look for an OTA update” mode to keep memory usage down.
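
A minimal sketch of how such a timer-driven mode switch could look, independent of any RTOS. All names here (ExampleModeSwitch_t, exampleModeInit, exampleModeOnTimer) are made up for illustration and are not taken from the poster's code. It is assumed that the real timer callback would start or stop the OTA agent whenever the returned mode changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operating modes for this example. */
typedef enum { MODE_NORMAL, MODE_OTA_CHECK } ExampleMode_t;

typedef struct
{
    ExampleMode_t mode;
    uint32_t ticksInMode;   /* timer periods spent in the current mode */
    uint32_t normalTicks;   /* periods to stay in normal operations */
    uint32_t otaCheckTicks; /* periods to spend looking for an OTA update */
} ExampleModeSwitch_t;

static void exampleModeInit( ExampleModeSwitch_t * s,
                             uint32_t normalTicks,
                             uint32_t otaCheckTicks )
{
    assert( s != NULL );
    assert( ( normalTicks > 0u ) && ( otaCheckTicks > 0u ) );
    s->mode = MODE_NORMAL;
    s->ticksInMode = 0u;
    s->normalTicks = normalTicks;
    s->otaCheckTicks = otaCheckTicks;
}

/* Call once per timer period. Returns the mode to run during the next period. */
static ExampleMode_t exampleModeOnTimer( ExampleModeSwitch_t * s )
{
    uint32_t limit = ( s->mode == MODE_NORMAL ) ? s->normalTicks : s->otaCheckTicks;

    s->ticksInMode++;

    if( s->ticksInMode >= limit )
    {
        /* Time budget for this mode is used up: flip to the other mode. */
        s->mode = ( s->mode == MODE_NORMAL ) ? MODE_OTA_CHECK : MODE_NORMAL;
        s->ticksInMode = 0u;
    }

    return s->mode;
}
```

With normalTicks set to 3 and otaCheckTicks set to 1, the third timer call switches to the OTA check mode and the fourth call switches back to normal operations.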

Gaurav-Aggarwal-AWS wrote on September 24, 2019:

What I mentioned in the above post is that after changing the OTA agent to use QoS 0 instead of QoS 1 and establishing the above-mentioned subscriptions in the Greengrass group, sharing the MQTT connection to the Greengrass core for OTA works for me.

Thanks.

TJirsch wrote on September 25, 2019:

I have changed the QoS settings in the ota_agent.c file, and now OTA updates work in parallel with shadow updates on one MQTT connection over Greengrass, and also when the MQTT connection goes directly to the cloud.
That is really cool! Thank you.

Gaurav-Aggarwal-AWS wrote on September 25, 2019:

I did not create a separate OTA task, but looking at the code you shared at the beginning of this post, Greengrass publishes should work. Here are a few things to check:

  1. Do you have the required subscription in your Greengrass group i.e. from device to cloud for topic “freertos/demos/ggd”?
  2. Is MQTT_AGENT_Publish in the function prvSendMessageToGGC being called? You can check that by putting a log message before calling MQTT_AGENT_Publish.
  3. Subscribe to the topic “freertos/demos/ggd” on the AWS IoT console to see if those messages are being published. Note that the demo only publishes 3 messages which is configured using the ggdDEMO_MAX_MQTT_MESSAGES macro.
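
One way to do the logging check from item 2 is to route the publish through a small wrapper that logs before and after the call. This is only a sketch: ExamplePublishFn_t, exampleLoggedPublish and exampleFakePublish are hypothetical names, and the function pointer stands in for the real publish call, whose signature is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the real publish call; returns 0 on success. */
typedef int ( * ExamplePublishFn_t )( const char * topic, const char * payload );

/* Number of publish attempts seen so far, to compare against the demo's message limit. */
static unsigned int examplePublishAttempts = 0u;

/* Log before and after the publish so it is visible whether the call is reached
 * and what it returned. */
static int exampleLoggedPublish( ExamplePublishFn_t publish,
                                 const char * topic,
                                 const char * payload )
{
    int result;

    assert( ( publish != NULL ) && ( topic != NULL ) && ( payload != NULL ) );

    examplePublishAttempts++;
    printf( "[publish #%u] topic=%s payload=%s\n", examplePublishAttempts, topic, payload );

    result = publish( topic, payload );

    printf( "[publish #%u] result=%d\n", examplePublishAttempts, result );
    return result;
}

/* Fake publish used only to exercise the wrapper; always reports success. */
static int exampleFakePublish( const char * topic, const char * payload )
{
    ( void ) topic;
    ( void ) payload;
    return 0;
}
```

If the "before" line never shows up in the device log, the publish path is not being reached at all, which points at task creation or scheduling rather than at the Greengrass subscriptions.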

Thanks.

embeddedx wrote on September 25, 2019:

Thanks for your efforts.
When do you create the OTA task?
It seems Greengrass stopped publishing after the OTA task was created!

Thanks

TJirsch wrote on September 26, 2019:

I would like to create a PR for the QoS change. What would be the correct way to do so?
I can fork the repo and create a PR from that.

Gaurav-Aggarwal-AWS wrote on September 27, 2019:

Yes, you can create a PR that way. However, we would probably want to make the change in a more configurable way rather than hard-coding QoS 0. I have asked my colleagues to take a look and see if they can address it.
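
As an illustration of what a configurable approach might look like, here is a minimal sketch. The macro otaexampleCONTROL_QOS, the type ExampleQos_t and the function exampleSelectQos are hypothetical names invented for this example; they are not part of ota_agent.c or of any FreeRTOS library.

```c
#include <assert.h>

/* Hypothetical build-time option (not a real FreeRTOS setting):
 * define it as 0 when the OTA agent talks to a Greengrass core. */
#ifndef otaexampleCONTROL_QOS
    #define otaexampleCONTROL_QOS    1
#endif

typedef enum { EXAMPLE_QOS_0 = 0, EXAMPLE_QOS_1 = 1 } ExampleQos_t;

/* Choose the QoS for a publish: use the requested level,
 * but never exceed what the broker supports. */
static ExampleQos_t exampleSelectQos( int requestedQos, int brokerMaxQos )
{
    assert( ( requestedQos == 0 ) || ( requestedQos == 1 ) );
    assert( ( brokerMaxQos == 0 ) || ( brokerMaxQos == 1 ) );

    return ( requestedQos <= brokerMaxQos ) ? ( ExampleQos_t ) requestedQos
                                            : ( ExampleQos_t ) brokerMaxQos;
}
```

With a broker limit of 0 (the Greengrass case from the logs above) the function always yields QoS 0, while a direct cloud connection with a limit of 1 keeps the configured value.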

Thanks.

embeddedx wrote on October 02, 2019:

Hi,
After applying the updates mentioned above and creating the new subscriptions, requesting OTA job information is now possible.
However, I got an error while receiving one block, and this causes the connection to be closed!

https://gist.github.com/ahmedwahdan/606bc99e9b2a26b3f07f8ff1ee2bef6e


670 15713 [iot_thread] State: Active  Received: 200   Queued: 200   Processed: 200   Dropped: 0
671 15807 [OTA Task] [prvIngestDataBlock] Received file block 202, size 1024
672 15808 [OTA Task] [prvIngestDataBlock] Remaining: 670
673 15808 [OTA Task] [prvIngestDataBlock] Received file block 203, size 1024
674 15808 [OTA Task] [prvIngestDataBlock] Remaining: 669
675 15833 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
676 15833 [iot_thread] State: Active  Received: 202   Queued: 202   Processed: 202   Dropped: 0
677 15953 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
678 15953 [iot_thread] State: Active  Received: 202   Queued: 202   Processed: 202   Dropped: 0
679 16058 [OTA Task] [INFO ][MQTT][160580] (MQTT connection 0x3fff2190) MQTT PUBLISH operation queued.
680 16058 [OTA Task] [prvPublishGetStreamMessage] OK: $aws/things/HelloWorld_Publisher/streams/AFR_OTA-2374f622-5de3-4930-8cbe-4e9d5d508b0d/get/cbor
681 16060 [OTA Task] [prvIngestDataBlock] Received file block 204, size 1024
682 16060 [OTA Task] [prvIngestDataBlock] Remaining: 668
683 16073 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
684 16073 [iot_thread] State: Active  Received: 203   Queued: 203   Processed: 203   Dropped: 0
685 16193 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
686 16193 [iot_thread] State: Active  Received: 203   Queued: 203   Processed: 203   Dropped: 0
687 16260 [NetRecv] [ERROR][NET][162600] Error 0 while receiving data.
688 16260 [NetRecv] [WARN ][NET][162600] Receive requested 1140 bytes, but 97 bytes received instead.
689 16260 [NetRecv] [ERROR][MQTT][162600] (MQTT connection 0x3fff2190) Error processing incoming data. Closing connection.

690 16261 [NetRecv] [INFO ][MQTT][162610] (MQTT connection 0x3fff2190) Network connection closed.
691 16309 [OTA Task] [WARN ][MQTT][163090] (MQTT connection 0x3fff2190) Attempt to use closed connection.
692 16309 [OTA Task] [ERROR][MQTT][163090] (MQTT connection 0x3fff2190) New operation record cannot be created for a closed connection
693 16309 [OTA Task] [prvPublishGetStreamMessage] Failed: $aws/things/HelloWorld_Publisher/streams/AFR_OTA-2374f622-5de3-4930-8cbe-4e9d5d508b0d/get/cbor
694 16313 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
695 16313 [iot_thread] State: Active  Received: 203   Queued: 203   Processed: 203   Dropped: 0
696 16433 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
697 16433 [iot_thread] State: Active  Received: 203   Queued: 203   Processed: 203   Dropped: 0
698 16553 [iot_thread] WAHDAN : vRunOTAUpdateDemo : Inside the while loop
699 16553 [iot_thread] State: Active  Received: 203   Queued: 203   Processed: 203   Dropped: 0
700 16559 [OTA Task] [WARN ][MQTT][165590] (MQTT connection 0x3fff2190) Attempt to use closed connection.
701 16559 [OTA Task] [ERROR][MQTT][165590] (MQTT connection 0x3fff2190) New operation record cannot be created for a closed connection
702 16559 [OTA Task] [prvPublishGetStreamMessage] Failed: $aws/things/HelloWorld_Publisher/streams/AFR_OTA-2374f622-5de3-4930-8cbe-4e9d5d508b0d/get/cbor


Edit:
This also occurs with the normal OTA demo. It might be related to the change to QoS 0 that we made, I don’t know!
Edit2:
I now have a successful OTA update through Greengrass.
I think this may be some kind of race condition!
Shouldn’t this case be handled, for example by dropping the packet and requesting it again?

Edited by: embeddedx on Oct 2, 2019 1:15 AM
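
The drop-and-re-request idea from the question above can be sketched at the file-block level as follows. This is an illustration only: the names ExampleTransfer_t, exampleIngestBlock and exampleNextMissingBlock are hypothetical, and the sketch does not address the network-layer receive error in the log, which closes the MQTT connection before any block reaches the OTA task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_BLOCKS    1024u

/* Hypothetical transfer state: one bit per file block, set once the block is stored. */
typedef struct
{
    uint8_t received[ EXAMPLE_MAX_BLOCKS / 8u ];
    uint32_t blockCount;
    uint32_t blockSize;
} ExampleTransfer_t;

static void exampleTransferInit( ExampleTransfer_t * t, uint32_t blockCount, uint32_t blockSize )
{
    assert( ( t != NULL ) && ( blockCount <= EXAMPLE_MAX_BLOCKS ) && ( blockSize > 0u ) );
    memset( t->received, 0, sizeof( t->received ) );
    t->blockCount = blockCount;
    t->blockSize = blockSize;
}

/* Accept a block only if its index and length are plausible; otherwise drop it.
 * Returns true when the block was recorded as received. */
static bool exampleIngestBlock( ExampleTransfer_t * t, uint32_t index, size_t length )
{
    if( index >= t->blockCount )
    {
        return false; /* out of range: drop */
    }

    if( ( length == 0u ) || ( length > t->blockSize ) )
    {
        return false; /* empty or oversized: drop */
    }

    if( ( index + 1u != t->blockCount ) && ( length != t->blockSize ) )
    {
        return false; /* only the last block may be short: drop */
    }

    t->received[ index / 8u ] |= ( uint8_t ) ( 1u << ( index % 8u ) );
    return true;
}

/* Index of the first block still missing, or blockCount when the file is complete.
 * The caller would request this block again. */
static uint32_t exampleNextMissingBlock( const ExampleTransfer_t * t )
{
    uint32_t i;

    for( i = 0u; i < t->blockCount; i++ )
    {
        if( ( t->received[ i / 8u ] & ( 1u << ( i % 8u ) ) ) == 0u )
        {
            return i;
        }
    }

    return t->blockCount;
}
```

A truncated block is simply not marked as received, so the next scan for missing blocks returns its index and the request can be repeated without restarting the whole transfer.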

embeddedx wrote on October 08, 2019:

Hi,
Another issue: the first connection to IoT Core to retrieve the Greengrass core information works, but the second attempt fails with network_error and I need to restart the router (in my case I use a mobile phone as the access point). It does not matter whether I closed the connection or not. Any ideas?