Zynq Ultrascale MPSoC task floating point corruption

wat · February 9, 2022, 3:54pm

I added code to save and restore the floating point registers in FreeRTOS_IRQ_Handler. I just copied it from portSAVE_CONTEXT in the same file. With that change, I no longer see any floating point corruption. Thank you for your advice!

But there is still one point I don’t understand.
At first, I added code only for registers v16-v31, but the floating point corruption still occurred. Saving v8-v31 gave the same result. Finally, I added code for all registers v0-v31, and the corruption disappeared.
I am not sure why v0-v15 need to be saved and restored, and I am not sure whether this is allowed by the ABI and FreeRTOS. Could you give me some hints or advice?

16 large registers is quite an overhead.

Do you mean time overhead, stack space overhead, or both? I cannot measure the actual time overhead. In that case, can the time overhead be estimated from the CPU clock and the assembly code?

are you performing a complex floating point operation, or just simple arithmetic?

There are not many floating point operations in my interrupts, and all of them are simple enough. I could replace them with integer operations if I tried.
But I would like to make all interrupts “FPU safe” because of the risk of corruption in the future. A new floating point operation may be introduced in an interrupt without anyone noticing.
Now I understand that I should weigh this risk against the overhead of FPU safe handling. Do you think the overhead generally outweighs the future risk?
Thank you for your support.

rtel · February 9, 2022, 7:03pm

The overhead is both in time and stack usage - whether that is an issue for you is very much dependent on the application you are writing. These devices can have a lot of RAM.

It is unexpected that you need to save all the floating point registers on interrupt entry - I would expect it on a context switch. I need to compile some code to see what it is doing. We have a Zynq project that uses Vitis and runs in QEMU now too - although it's not merged into the mainline yet (still in the PR). That will make it easier to see what is going on.

wat · February 13, 2022, 1:51pm

I’m sorry for the late response. I couldn’t reply in this topic, so I started the following new topic.

Zynq Ultrascale MPSoC task floating point corruption 2

I’m sorry for splitting the topics. Thank you for your support.

[RTEL EDIT: Pasting the post from the other thread below to combine threads. I also upped your privileges so you can continue to post in this thread:]

The overhead is both in time and stack usage - whether that is an issue for you is very much dependent on the application you are writing. These devices can have a lot of RAM.

Yes, my device has enough RAM, so stack usage is not a problem. I will estimate the time overhead.

It is unexpected that you need to save all the floating point registers on interrupt entry

I analyzed the disassembled code for the floating point operations in my interrupt. I found that the s0, s1 and s2 registers are used in those operations. Does that mean v0-v2 need to be saved on IRQ entry?

I need to compile some code to see what it is doing. We have a Zynq project that uses Vitis and runs in QEMU now too

Thank you very much for investigating on your side. I am using Xilinx SDK 2019.1 instead of Vitis. The compiler is the GCC that is bundled with Xilinx SDK. It is probably almost the same as the compiler in Vitis. Thank you for your support!

rtel · February 13, 2022, 4:30pm

Could you also look at the assembly code where the function that uses those registers is called. Ideally, please post the assembly here if you can.

wat · February 15, 2022, 2:15pm

Thank you for combining threads and updating my account.

Could you also look at the assembly code where the function that uses those registers is called.

I checked all the assembly code that is executed in the interrupt.
FreeRTOS_IRQ_Handler → vApplicationIRQHandler → my interrupt function → … → back to FreeRTOS_IRQ_Handler.
In this sequence, floating point registers appear only in the steps for my floating point calculation, which correspond to my C code. No floating point register (v, q, d, s, h or b) is saved or restored anywhere in the sequence.
Unfortunately, I cannot post the assembly code directly because of rules on my side. I’m sorry.

In the Microsoft document “Overview of ARM64 ABI conventions”, v0-v7 are described not only as “parameters” but also as “volatile” and “scratch registers”. v8-v15 are non-volatile only in their low 64 bits. So I guess this means most of the registers should be saved on IRQ entry. Could you give me your advice?
Thank you for your support.

wat · February 28, 2022, 2:25pm

I have one question.
If the overhead of saving “all” floating point registers is acceptable, is there any other problem with saving “all” of them?

I have measured the overhead of saving all floating point registers. It is 50ns, which I think is acceptable for my system. Using an oscilloscope, I measured the time between the external interrupt signal and a GPIO signal generated by firmware in the interrupt function.
The results are:
Time when saving “no” floating point registers: 450ns
Time when saving “all” floating point registers: 500ns

I would like to choose this solution even if some registers don’t actually need to be saved. Could you tell me your opinion?

rtel · February 28, 2022, 3:02pm

The only impact would be the additional RAM used and interrupt entry time.

wat · March 1, 2022, 2:04pm

Thank you for the quick response. The impact on RAM usage and entry time is acceptable for my system. My problem is solved thanks to your help. Thank you again!

dykeagdrs · October 30, 2023, 8:07pm

We believe that we are having the same problem on our Ultrascale A53. Rather than re-inventing the wheel, would you mind sharing the code changes?

wat · February 3, 2024, 4:18pm

Sorry for the late reply.
I found that this problem has been addressed in the newest version of the FreeRTOS A53 port.
I have not tested it, but it seems to be almost the same modification as the one in this topic.
You can find “savefloatregisters” in “FreeRTOS_IRQ_Handler”.

portASM.S

Xilinx/embeddedsw/blob/xlnx_rel_v2023.2/ThirdParty/bsp/freertos10_xilinx/src/Source/portable/GCC/ARM_CA53/portASM.S

StefanBa · May 2, 2024, 5:25pm

Sorry for another late reply.
I opened a pull request that will partially resolve this issue by adding configUSE_TASK_FPU_SUPPORT to the AARCH64 port:
https://github.com/FreeRTOS/FreeRTOS-Kernel/pull/1048

For a full fix, the FPU registers also need to be saved and restored for interrupts, as in the Xilinx portASM.S posted by @wat above.

aggarg · May 8, 2024, 3:08pm

Thank you for your contribution!