Node stuck when OTA process fails

leandropg wrote on August 20, 2019:

I am developing with AWS FreeRTOS 201906.00_Major on a Microchip Curiosity PIC32MZ EF board.

I have a problem with OTA updates. When the node completes the OTA process, the firmware is updated to the new version, and on the next internet connection the node informs AWS that it was updated. This is the normal scenario.

But when the node cannot report that the update worked and reboots in this state, it should restart in the previous version. Instead, the node gets stuck and does nothing. Pressing the Reset button does not help. Only a hard reset (power off and on) lets the node restart in the previous version.

How can I solve this? I cannot allow a node to get stuck if the OTA process fails.

Thanks.

hbouvier-AWS wrote on August 26, 2019:

Hello,

Sorry for the delay. When an OTA update completes successfully, the node reboots in “self-test” mode and tries to notify AWS that the OTA succeeded. If it does not manage that, then it should definitely reboot on the new version.
Something is wrong. Would you mind providing us with a log file to help us troubleshoot?
Thank you!

leandropg wrote on August 28, 2019:

The problem is simple. The OTA update works fine. When the process finishes, the node starts the new version, but in the new version the node does not obtain an Internet connection at first. In our implementation the watchdog timer then reboots the application, and at that moment the node gets stuck. In that first start the node never reaches the AWS MQTT server, so it reboots and gets stuck. I cannot send logs because they contain private information, but the sequence is simple:

  1. OTA PROCESS FINISHES - OK
  2. RESTART IN NEW VERSION - OK
  3. OBTAIN INTERNET CONNECTION IN NEW VERSION - FAIL
  4. REBOOT THE NODE FROM CODE - OK
  5. THE NODE GETS STUCK - FAIL
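
For readers following along, the sequence above can be modeled as a small state machine. This is only a toy sketch of the behavior expected in the first post (a reboot before the update is confirmed falls back to the previous version). It is not the real bootloader or OTA agent: the `Node` class, its method names, and the version strings are all hypothetical.

```python
from enum import Enum


class ImageState(Enum):
    TESTING = "testing"    # new image booted, update not yet confirmed
    ACCEPTED = "accepted"  # image confirmed and committed


class Node:
    """Toy model of a device with a running and a previous image (hypothetical)."""

    def __init__(self, running_version):
        self.previous_version = None
        self.running_version = running_version
        self.state = ImageState.ACCEPTED

    def apply_ota(self, new_version):
        # OTA finishes and the node restarts on the new image, unconfirmed.
        self.previous_version = self.running_version
        self.running_version = new_version
        self.state = ImageState.TESTING

    def report_success(self, connected):
        # The image is committed only if the node can reach the cloud.
        if connected and self.state is ImageState.TESTING:
            self.state = ImageState.ACCEPTED
        return self.state is ImageState.ACCEPTED

    def reboot(self):
        # Assumed behavior: rebooting while still unconfirmed rolls back.
        if self.state is ImageState.TESTING:
            self.running_version = self.previous_version
            self.previous_version = None
            self.state = ImageState.ACCEPTED
        return self.running_version


# The failing scenario: no connection after the update, then a reboot.
node = Node("1.0.0")               # hypothetical version numbers
node.apply_ota("1.1.0")
node.report_success(connected=False)
print(node.reboot())               # prints 1.0.0
```

In this model the reboot always lands on a bootable image. The reported bug is that the real node hangs at that point instead.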

The MCLR Reset button does not work. The only way to recover is to power the node off and on, after which it restarts in the previous version. I need to fix this because when the OTA fails the node needs physical intervention, and the nodes are installed on the roof of a building, so that is not an easy task.

Edited by: leandropg on Aug 28, 2019 4:05 PM

PrasadV-AWS wrote on August 29, 2019:

Hello,

I think you are missing an important step in setting up the Microchip platform: building a factory image and flashing it before deploying. The platform has a bootloader that performs cryptographic signature verification on the initially flashed image as well as on any new OTA images received. Flashing the image from the MPLAB IDE is useful for debugging but not for deployment, because that image is not signed.
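
As a rough illustration of why an unsigned image matters, the sketch below shows a verifier that rejects any blob without a valid trailing signature. It does not reproduce the actual bootloader or its image format. HMAC-SHA256 from Python's standard library is used purely as a stand-in for the real signature scheme, and the key, function names, and image bytes are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key, for illustration only.
SIGNING_KEY = b"example-signing-key"
TAG_LEN = 32  # length of an HMAC-SHA256 digest in bytes


def sign_image(image: bytes, key: bytes = SIGNING_KEY) -> bytes:
    """Return the image bytes followed by a 32-byte HMAC-SHA256 tag."""
    tag = hmac.new(key, image, hashlib.sha256).digest()
    return image + tag


def bootloader_accepts(blob: bytes, key: bytes = SIGNING_KEY) -> bool:
    """Accept only blobs whose trailing tag matches the image contents."""
    if len(blob) < TAG_LEN:
        return False
    image, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(key, image, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)


raw_image = b"\x00" * 64                              # stands in for an IDE-flashed, unsigned image
print(bootloader_accepts(raw_image))                  # prints False
print(bootloader_accepts(sign_image(raw_image)))      # prints True
```

The point of the sketch is only that a verifier of this kind has no valid image to fall back to if the initial image was never signed.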

To create this initial signed image, a script is provided, factory_image_generator.py; its documentation is at [https://docs.aws.amazon.com/freertos/latest/userguide/burn-initial-firmware-microchip.html].

I was able to reproduce what you are observing when I did not create a signed factory image. Please let me know if following these steps solves your problem.

Edited by: PrasadV-AWS on Aug 29, 2019 2:11 AM

leandropg wrote on September 02, 2019:

Thanks very much. This was my problem. I suggest adding the words Development and Production to the documentation to make it easier to understand. That way it is clearer:

To burn the demo application onto your board --> Development
To build and flash a factory image --> Production

Thank you very much.