OTA update, ota agent, ota_pal

I am trying to understand how AWS OTA update works. Specifically, I am reading ota.c and ota_pal.c and trying to understand the code.

Can somebody explain:

  • What is a file context, i.e. OtaFileContext_t * pFileContext = &( otaAgent.fileContext );
  • How does processJobHandler() work?
  • How is the OTA update initiated? How does OTA_Init() work?

A lot of questions about OTA agent… but I cannot find useful materials explaining the flow

Reading ota_pal.c is not so difficult to understand

Please help

examples of ota.c which I don’t understand:

  • In parseJobDoc(), what does it do? What are the inputs? What does it mean to return an OtaFileContext_t type?
  • What does getFileContextFromJob() do? What does the “job document” look like? What does “getting the file context” mean?

Is there any document to explain all these? Reading the code does not help!!

Hi @leesp

There is rendered doxygen available here.

What is a file context, ie. OtaFileContext_t * pFileContext = &( otaAgent.fileContext );

It is a data structure used for the duration of an OTA update. It stores the necessary data parsed from the OTA job document, such as the stream source ID, and also holds other data for tracking the state of the download.

How does processJobHandler() work?

How is the OTA update initiated? How does OTA_Init() work?

There’s a state machine table defined here. Per the FSM, when the OtaAgentEventReceivedJobDocument event occurs, processJobHandler is called. It attempts to populate a file context with the aforementioned data, as presented in the OTA job JSON. Internally, it eventually calls parseJobDoc, which may call validateAndStartJob to kick off the download. OTA_Init initializes the required interfaces and memory for the OTA agent.
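To make the table-driven dispatch concrete, here is a minimal sketch. It is not the table from ota.c: the state and event names, the handler signature, and sketchProcessJobHandler are all hypothetical stand-ins, reduced to two transitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced sets of states and events (not the library's enums). */
typedef enum { StateReady, StateWaitingForJob, StateCreatingFile } State_t;
typedef enum { EventStart, EventReceivedJobDocument } Event_t;

typedef int ( * Handler_t )( const char * pEventData );

/* One row of the table: in this state, on this event, run this handler and,
 * if it succeeds, move to the next state. */
typedef struct
{
    State_t currentState;
    Event_t event;
    Handler_t handler;
    State_t nextState;
} TransitionEntry_t;

static int jobsProcessed = 0;

static int sketchStartHandler( const char * pEventData )
{
    ( void ) pEventData;
    return 0;
}

/* Stand-in for processJobHandler: the real one parses the job document. */
static int sketchProcessJobHandler( const char * pEventData )
{
    if( pEventData == NULL )
    {
        return -1;
    }

    jobsProcessed++;
    return 0;
}

static const TransitionEntry_t transitionTable[] =
{
    { StateReady,         EventStart,               sketchStartHandler,      StateWaitingForJob },
    { StateWaitingForJob, EventReceivedJobDocument, sketchProcessJobHandler, StateCreatingFile  }
};

/* Look up ( state, event ) in the table. The state is left unchanged when no
 * row matches or when the handler reports an error. */
static State_t dispatchEvent( State_t current, Event_t event, const char * pEventData )
{
    size_t i;
    const size_t count = sizeof( transitionTable ) / sizeof( transitionTable[ 0 ] );

    for( i = 0; i < count; i++ )
    {
        const TransitionEntry_t * pEntry = &transitionTable[ i ];

        if( ( pEntry->currentState == current ) && ( pEntry->event == event ) )
        {
            assert( pEntry->handler != NULL );
            return ( pEntry->handler( pEventData ) == 0 ) ? pEntry->nextState : current;
        }
    }

    return current;
}
```

Calling dispatchEvent( StateWaitingForJob, EventReceivedJobDocument, doc ) runs the job handler and, on success, returns StateCreatingFile. The same event in a state with no matching row is simply ignored.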

In parseJobDoc(), what does it do? What are the inputs? What does it mean to return an OtaFileContext_t type?

Parameters and even code examples are documented in the doxygen. They’re also present in the code such as here. It attempts to parse a minimum set of values from a JSON “job document”, required for OTA. These values are stored in a file context.

What does getFileContextFromJob() do? What does the “job document” look like? What does “getting the file context” mean?

It’s similar to parseJobDoc in that it tries to obtain a valid file context for the OTA, but it does more than the JSON parsing. It also handles higher-level tasks such as memory allocation and creating a file buffer for storing the downloaded contents on the device.
OtaFileContext_t is defined here. The job document structure isn’t documented anywhere to my knowledge, but I have one handy:

{
  "afr_ota": {
    "protocols": [
      "MQTT"
    ],
    "streamname": "AFR_OTA-Demo",
    "files": [
      {
        "filepath": "/device/updates",
        "filesize": 40960,
        "fileid": 0,
        "certfile": "Code Verify Key",
        "update_data_url": null,
        "auth_scheme": null,
        "sig-sha256-ecdsa": "<redacted>"
      }
    ]
  }
}
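As a rough illustration of "parse a minimum set of values into a file context", the sketch below pulls four fields out of a document shaped like the one above. It is not how ota.c parses the job (the library uses a proper JSON parser). SketchFileContext_t and all function names are hypothetical, and the key lookup is a naive text search that ignores nesting and escaped quotes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, reduced stand-in for OtaFileContext_t. */
typedef struct
{
    char streamName[ 64 ];
    char filePath[ 64 ];
    unsigned long fileSize;
    unsigned long fileId;
} SketchFileContext_t;

static const char * skipSpace( const char * p )
{
    while( ( *p == ' ' ) || ( *p == '\t' ) || ( *p == '\r' ) || ( *p == '\n' ) )
    {
        p++;
    }

    return p;
}

/* Return a pointer to the first character of the value following "key":,
 * or NULL if the key is not found. Takes the first textual match only. */
static const char * findValue( const char * pJson, const char * pKey )
{
    char pattern[ 64 ];
    const char * p;
    int n;

    assert( ( pJson != NULL ) && ( pKey != NULL ) );
    n = snprintf( pattern, sizeof( pattern ), "\"%s\"", pKey );

    if( ( n < 0 ) || ( ( size_t ) n >= sizeof( pattern ) ) )
    {
        return NULL;
    }

    p = strstr( pJson, pattern );

    if( p == NULL )
    {
        return NULL;
    }

    p = skipSpace( p + n );
    return ( *p == ':' ) ? skipSpace( p + 1 ) : NULL;
}

/* Copy a quoted string value into pOut. Fails on null or non-string values. */
static int readString( const char * pJson, const char * pKey, char * pOut, size_t outSize )
{
    const char * p = findValue( pJson, pKey );
    const char * pEnd;
    size_t len;

    if( ( p == NULL ) || ( *p != '"' ) )
    {
        return -1;
    }

    p++;
    pEnd = strchr( p, '"' );

    if( ( pEnd == NULL ) || ( ( size_t ) ( pEnd - p ) >= outSize ) )
    {
        return -1;
    }

    len = ( size_t ) ( pEnd - p );
    memcpy( pOut, p, len );
    pOut[ len ] = '\0';
    return 0;
}

/* Read a non-negative integer value. Fails on null or non-numeric values. */
static int readUnsigned( const char * pJson, const char * pKey, unsigned long * pOut )
{
    const char * p = findValue( pJson, pKey );

    if( ( p == NULL ) || ( *p < '0' ) || ( *p > '9' ) )
    {
        return -1;
    }

    return ( sscanf( p, "%lu", pOut ) == 1 ) ? 0 : -1;
}

/* Fill the context from the job document; -1 if any required field is missing. */
static int sketchParseJobDoc( const char * pJson, SketchFileContext_t * pCtx )
{
    assert( pCtx != NULL );

    if( ( readString( pJson, "streamname", pCtx->streamName, sizeof( pCtx->streamName ) ) != 0 ) ||
        ( readString( pJson, "filepath", pCtx->filePath, sizeof( pCtx->filePath ) ) != 0 ) ||
        ( readUnsigned( pJson, "filesize", &pCtx->fileSize ) != 0 ) ||
        ( readUnsigned( pJson, "fileid", &pCtx->fileId ) != 0 ) )
    {
        return -1;
    }

    return 0;
}
```

The point is only the shape of the operation: JSON text goes in, a populated context (or an error) comes out.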

Would highly recommend the doxygen documents linked here, and these supplemental AWS dev docs here

Dear David,

In getFileContextFromJob:

getFileContextFromJob(…)

pUpdateFile = parseJobDoc(…);

palStatus = otaAgent.pOtaInterface->pal.createFile( pUpdateFile );

return pUpdateFile;

A file context pUpdateFile has already been “created” (returned by the call of parseJobDoc). Then what is the purpose of passing this file context into pal.createFile()?
Would the file context “created” by pal.createFile(), be stored in some global variable or structures?

What is the actual file context? Is it otaAgent.fileContext? Or pUpdateFile (can’t be this, since it is just a local structure variable)?

It’s so confusing. I’ve read the code many times.

OtaContext_t is defined here.

pal.createFile deals separately with creating the destination “file” on the device that will house the download contents. It is passed the OtaFileContext_t so it can reference its prepared data, such as its use of OtaFileContext_t::pFilePath in this demo posix implementation.

parseJobDoc generally returns a pointer to otaAgent.fileContext, as shown here, which is assigned to pUpdateFile so there is just the one OtaFileContext_t that is housing the in flight OTA data.

There are not multiple instances of the context, just pointers to it. Ditto for other static variables in ota.c, such as otaAgent, which encapsulates the control flow data for an OTA.
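The "one instance, several pointers" idea can be sketched as follows. All types and functions here are hypothetical stand-ins for the ones in ota.c, and recording a storage handle in the pFile member is an assumption about what a port's createFile might do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for OtaFileContext_t and the agent context. */
typedef struct
{
    const char * pFilePath;
    unsigned long fileSize;
    void * pFile; /* Handle to the destination, filled in by the PAL. */
} SketchFileContext_t;

typedef struct
{
    SketchFileContext_t fileContext;
} SketchAgent_t;

/* The single static instance, playing the role of otaAgent. */
static SketchAgent_t sketchAgent;

/* Pretend destination storage owned by the port. */
static unsigned char sketchStorage[ 16 ];

/* Fills the agent's one file context and returns a pointer to it.
 * Nothing is allocated and nothing is copied. */
static SketchFileContext_t * sketchParseJobDoc( const char * pFilePath, unsigned long fileSize )
{
    SketchFileContext_t * pCtx = &( sketchAgent.fileContext );

    pCtx->pFilePath = pFilePath;
    pCtx->fileSize = fileSize;
    pCtx->pFile = NULL;
    return pCtx;
}

/* Stand-in for pal.createFile: it reads the prepared fields and records the
 * destination handle in the same context it was handed. */
static int sketchCreateFile( SketchFileContext_t * pCtx )
{
    assert( pCtx != NULL );

    if( pCtx->pFilePath == NULL )
    {
        return -1;
    }

    pCtx->pFile = sketchStorage;
    return 0;
}

static SketchFileContext_t * sketchGetFileContextFromJob( const char * pFilePath, unsigned long fileSize )
{
    /* pUpdateFile is a local pointer, but it points at sketchAgent.fileContext. */
    SketchFileContext_t * pUpdateFile = sketchParseJobDoc( pFilePath, fileSize );

    if( sketchCreateFile( pUpdateFile ) != 0 )
    {
        pUpdateFile = NULL;
    }

    return pUpdateFile;
}
```

After sketchGetFileContextFromJob returns, the returned pointer compares equal to &( sketchAgent.fileContext ), and the handle written by sketchCreateFile is visible through either name.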

Dear David,

pal.createFile deals separately with creating the destination “file” on the device…

The “destination file” which you are referring to, is it load_firmware_control_block as in here?

Can you explain the flow of incoming data packet? Am I right that, once data comes in:

  1. pal.writeBlock() will be called to write the data packet into a queue
  2. ota_flashing_task() would run and get the data packet from the queue, and flash it to device’s ROM
    In the above, where is the “destination file” used? I don’t see how it comes into the picture.

I might have figured out what “destination file” pal.createFile is trying to create.

In the context of the Renesas RX65N MCU, it creates a FreeRTOS queue for housing the download contents, unlike the POSIX example you gave, which creates a C file with fopen().

Correct me if I am wrong

That is correct. The Amazon-FreeRTOS Renesas RX65N port uses a queue as the file destination, which is consumed by the flashing task, whereas for the POSIX port it’s a POSIX file. The “file destination” is simply an abstract concept to assist in explaining how the OTA library works. The writeBlock function is a PAL abstraction with no awareness of how that data is actually stored, just that it can store blocks based on index and size. Whether more intricacies occur below this boundary is up to the user’s OTA port.

The complexity of the OTA port varies across platforms. As a simpler example, the nrf52 port’s “file destination” is simply a designated area in flash, and its writeBlock literally writes blocks directly to this flash.
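A minimal sketch of that boundary, assuming a hypothetical two-function interface (the real PAL interface has more members and different signatures). The backend here is a RAM array standing in for a designated flash area; a queue-based port would supply different functions behind the same pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_REGION_SIZE    64U

/* Hypothetical, reduced file descriptor seen by both the library and the port. */
typedef struct
{
    uint8_t * pDestination; /* Set by the port in createFile. */
    uint32_t size;          /* Expected image size from the job document. */
} SketchFile_t;

/* Hypothetical PAL interface: the library only ever calls through these. */
typedef struct
{
    int ( * createFile )( SketchFile_t * pFile );
    int32_t ( * writeBlock )( SketchFile_t * pFile,
                              uint32_t offset,
                              const uint8_t * pData,
                              uint32_t blockSize );
} SketchPal_t;

/* RAM array standing in for a designated flash area. */
static uint8_t sketchRegion[ SKETCH_REGION_SIZE ];

static int regionCreateFile( SketchFile_t * pFile )
{
    assert( pFile != NULL );

    if( pFile->size > SKETCH_REGION_SIZE )
    {
        return -1;
    }

    /* Erased flash typically reads as 0xFF. */
    memset( sketchRegion, 0xFF, sizeof( sketchRegion ) );
    pFile->pDestination = sketchRegion;
    return 0;
}

/* Returns the number of bytes written, or -1 if the block does not fit. */
static int32_t regionWriteBlock( SketchFile_t * pFile,
                                 uint32_t offset,
                                 const uint8_t * pData,
                                 uint32_t blockSize )
{
    assert( ( pFile != NULL ) && ( pData != NULL ) );

    if( ( pFile->pDestination == NULL ) ||
        ( offset > pFile->size ) ||
        ( blockSize > ( pFile->size - offset ) ) )
    {
        return -1;
    }

    memcpy( pFile->pDestination + offset, pData, blockSize );
    return ( int32_t ) blockSize;
}

static const SketchPal_t regionPal = { regionCreateFile, regionWriteBlock };

/* Library side: converts a block index to an offset and hands the data to
 * whatever port is plugged in, without knowing how it is stored. */
static int32_t sketchIngestBlock( const SketchPal_t * pPal,
                                  SketchFile_t * pFile,
                                  uint32_t blockIndex,
                                  uint32_t blockSize,
                                  const uint8_t * pData,
                                  uint32_t dataLength )
{
    return pPal->writeBlock( pFile, blockIndex * blockSize, pData, dataLength );
}
```

Swapping regionPal for a table whose writeBlock pushes to a queue would not change sketchIngestBlock at all, which is the sense in which the "file" is abstract.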

Thanks David. Now I understand what a file is in the context of the RX65N port.

More questions:

  1. pal.writeBlock() stores the data block in the queue, and the task “ota_flashing_task” gets the data block from the queue and writes it to the flash memory. In otapal_WriteBlock(), if the incoming data block’s Offset and BlockSize are multiples of FLASH_CF_MIN_PGM_SIZE, the data block is stored in the queue directly. Otherwise, the “fragmented block” handling is performed before the modified fragmented block is stored in the queue.

Is my understanding above correct?

  2. In otapal_CloseFile(), otapal_CheckFileSignature() is called. When pal.CloseFile() is called, have all the data blocks already been written to the flash memory, or are they still in the file/queue?
    It looks like otapal_CheckFileSignature() flashes the data in assembled_flash_buffer to the flash memory. I thought this was the task of ota_flashing_task()? Also, is the subsequent verification performed on the buffer or on the flash memory?

Hello @DavidGC-FreeRTOS , any revelation is much appreciated!

Hi @leesp

Both 1 and 2 are largely dependent on the OTA port. The pal.* and other interface functions require some minimal functionality, outlined in the porting guide, and can serve as hooks into the OTA library if a user wants more complex behavior in their port.

#1
I’m not an expert on our RX65N port/architecture, but from what I’ve glanced, it seems they are accommodating for partial block writes – not just for the last block.
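For what it’s worth, the alignment test described in the question can be sketched like this. It is not taken from the RX65N port: SKETCH_MIN_PGM_SIZE is an assumed value standing in for FLASH_CF_MIN_PGM_SIZE, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed minimum flash programming unit in bytes, for illustration only. */
#define SKETCH_MIN_PGM_SIZE    128U

/* True when the block starts and ends on programming-unit boundaries, so it
 * could be programmed without any extra assembly. */
static bool isDirectlyProgrammable( uint32_t offset, uint32_t blockSize )
{
    return ( blockSize > 0U ) &&
           ( ( offset % SKETCH_MIN_PGM_SIZE ) == 0U ) &&
           ( ( blockSize % SKETCH_MIN_PGM_SIZE ) == 0U );
}

/* Start of the programming unit that contains the first byte of the block. */
static uint32_t alignedStart( uint32_t offset )
{
    return offset - ( offset % SKETCH_MIN_PGM_SIZE );
}

/* End (exclusive) of the programming unit that contains the last byte. */
static uint32_t alignedEnd( uint32_t offset, uint32_t blockSize )
{
    uint32_t end;
    uint32_t remainder;

    assert( blockSize > 0U );
    end = offset + blockSize;
    remainder = end % SKETCH_MIN_PGM_SIZE;
    return ( remainder == 0U ) ? end : ( end + ( SKETCH_MIN_PGM_SIZE - remainder ) );
}
```

A block that fails isDirectlyProgrammable only partly covers the units between alignedStart and alignedEnd, so a port has to combine it with neighbouring data before those units can be programmed. That is one plausible reading of the "fragmented block" handling.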

#2
As mentioned, under the PAL and other interface layers, users have substantial liberty to modify the OTA process through its interface hooks. In this case, it seems the image is buffered locally on the device, but not actually written to the effective image location until after it has been verified.

This and other port-specific features aren’t required by the OTA library, but can be implemented by a port if that is beneficial. It all depends on the port’s architecture and the security, performance, and other characteristics desired for it. Irrespective of that, the library and how it works, in and above its layer, should remain the same, assuming the port is done correctly. The PAL and other interfaces offer outlets for customization.

Diagram from the doxygen link I posted before. It shows a nice layer breakdown.

@DavidGC-FreeRTOS

Would you please explain what is the purpose of inselftesthandler()?

Also what does it mean by “job in self test but platform image state is not”?

How is selftest being done?

Thank you

Hi @leesp

Please review the documentation that I’ve linked throughout this thread as it should give you a framework to dive deeper and understand various parts of the library. There you’ll learn about the various image states and other library notions.

This thread has diverged considerably from its title and problem statement. If, after going through the documentation, you are still confused, please open a new ticket.

Hi @DavidGC-FreeRTOS,

I am also confused about which state needs to be handled for a successful download, but this thread stopped abruptly without a conclusion.

Could you please tell us which state is expected for the successful-write callback to be triggered? Please clarify, ideally with a state diagram or state transition information, for the following APIs:

otaPal_SetPlatformImageState
otaPal_GetPlatformImageState

Thanks,
Ganesh

I assume your problem is resolved in this one. Feel free to open a new issue if you still face any problem.