OTA stuck on an old/previous job

nickm2018 wrote on February 05, 2019:

Are there any timeouts once subscribed to an OTA job? As I deploy more and more units, I am running into situations where the library is still associated with a job that has been canceled, deleted, or dropped for some reason. This leaves the device stuck in a queued state for the next job.

Amazon FreeRTOS OTA Agent V1.0.0
Amazon FreeRTOS release 1.4.6

Here are two types of errors that I have captured from multiple attempts to get a device to start the download.
The first error says the library is busy with an existing job, but there are no active jobs in the IoT console.

179776 347670946 [OTA Task] [prvParseJobDoc] Busy with existing job. Ignoring.                  
179777 347670946 [OTA Task] [prvParseJobDoc] Error 1 parsing job document.
179778 347670946 [OTA Task] [prvParseJobDoc] Rejecting job due to OTA_JobParseErr_t 1           
179796 347671098 [OTA Task] [prvOTA_Close] Context->0x005599a4

This is the next attempt to send an image. The transfer is about to start, and then the MQTT library reports that the subscription limit has been reached. The library allows a maximum of 8 subscriptions. My application uses 2, so 6 should be left for OTA (which seems to use only 3).

179951 347887000 [OTA Task] [prvParseJobDoc] Job was accepted. Attempting to start transfer.
179952 347887000 [OTA Task] [OTA]	cnewImageStatus = 0
179953 347887092 [OTA Task] [prvPAL_GetPlatformImageState]	currentImageStatus = 0
179954 347887092 [OTA Task] Sending command to MQTT task.
179955 347887092 [MQTT] Received message 16fa0000from queue.
179956 347887092 [MQTT] Initiating MQTT subscribe.
179957 347887093 [MQTT] WARN: Subscription Manager full! No space left to store new subscriptions.
179958 347887093 [MQTT] MQTT_Subscribe failed!

Edited by: nickm2018 on Feb 5, 2019 3:08 PM

nickm2018 wrote on February 06, 2019:

What happens if the OTA library loses its connection, disconnects, or loses sync with the job while there is an active subscription to $aws/things/%s/streams/%s/data/cbor?
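To make the topic format above concrete, here is a minimal sketch of how the two `%s` fields could be filled in. The helper name `build_stream_topic` and the thing and stream names are hypothetical and are not taken from the OTA library; only the topic template comes from this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Topic template quoted in the post above. */
#define STREAM_DATA_TOPIC_TEMPLATE "$aws/things/%s/streams/%s/data/cbor"

/* Hypothetical helper (not part of the OTA library): formats the stream data
 * topic into buf. Returns the topic length, or -1 if it does not fit. */
static int build_stream_topic(char *buf, size_t buf_len,
                              const char *thing_name, const char *stream_name)
{
    int n = snprintf(buf, buf_len, STREAM_DATA_TOPIC_TEMPLATE,
                     thing_name, stream_name);

    if (n < 0 || (size_t)n >= buf_len) {
        return -1; /* encoding error or truncated output */
    }
    return n;
}
```

With this template, two different stream names for the same thing produce two different topic strings, so a lookup that compares topics exactly would not treat them as the same subscription.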

In the MQTT library, when a new subscribe is started, it checks the subscription manager list and removes any matching topics. The stream topic will be different for each job / portion of file, correct? Could this be the reason why my subscription manager ran out of room?

Is there a way to purge all subscriptions associated with an MQTT connection from the subscription manager list?
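The suspected failure mode can be modeled with a small standalone sketch. This is not the Amazon FreeRTOS MQTT code: the table type, the function names, and `MAX_TOPIC_LEN` are all hypothetical, and only the limit of 8 entries and the exact-topic matching behavior come from this thread. It shows how entries stored under per-stream topics are never matched by a later subscribe with a different stream name, and what a prefix-based purge might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SUBSCRIPTIONS 8   /* limit mentioned in this thread */
#define MAX_TOPIC_LEN     128 /* assumption, for illustration only */

/* Hypothetical fixed-size subscription table. */
typedef struct {
    bool in_use;
    char topic[MAX_TOPIC_LEN];
} sub_entry_t;

typedef struct {
    sub_entry_t entries[MAX_SUBSCRIPTIONS];
} sub_table_t;

static void sub_table_init(sub_table_t *t)
{
    memset(t, 0, sizeof *t);
}

static size_t sub_table_count(const sub_table_t *t)
{
    size_t count = 0;
    for (size_t i = 0; i < MAX_SUBSCRIPTIONS; i++) {
        if (t->entries[i].in_use) {
            count++;
        }
    }
    return count;
}

/* Stores a subscription. An entry with exactly the same topic is reused;
 * otherwise the first free slot is taken. Returns false when the topic is
 * too long or the table is full. */
static bool sub_table_add(sub_table_t *t, const char *topic)
{
    sub_entry_t *free_slot = NULL;

    if (strlen(topic) >= MAX_TOPIC_LEN) {
        return false;
    }
    for (size_t i = 0; i < MAX_SUBSCRIPTIONS; i++) {
        sub_entry_t *e = &t->entries[i];
        if (e->in_use && strcmp(e->topic, topic) == 0) {
            return true; /* same topic: existing slot is reused */
        }
        if (!e->in_use && free_slot == NULL) {
            free_slot = e;
        }
    }
    if (free_slot == NULL) {
        return false; /* table full */
    }
    free_slot->in_use = true;
    strcpy(free_slot->topic, topic); /* length checked above */
    return true;
}

/* Hypothetical purge: frees every entry whose topic starts with prefix.
 * Returns the number of entries removed. */
static size_t sub_table_purge_prefix(sub_table_t *t, const char *prefix)
{
    size_t removed = 0;
    size_t prefix_len = strlen(prefix);

    for (size_t i = 0; i < MAX_SUBSCRIPTIONS; i++) {
        sub_entry_t *e = &t->entries[i];
        if (e->in_use && strncmp(e->topic, prefix, prefix_len) == 0) {
            memset(e, 0, sizeof *e);
            removed++;
        }
    }
    return removed;
}
```

In this model, two application topics plus six leftover stream topics with different stream names fill all 8 slots, and the next new topic is rejected. Purging by the `$aws/things/<thing>/streams/` prefix frees the stale slots again. Note that clearing a local table like this would not, by itself, unsubscribe anything on the broker.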

mradula-aws wrote on February 06, 2019:

Yes, it looks like the MQTT subscription manager should remove the jobs that are not currently active. We are debugging the code to find out why it is behaving incorrectly.