Xilinx Zynq7 Interrupt latency

Dear community,
I am using a Xilinx Zynq7 with an ARM Cortex-A9 CPU together with FreeRTOS.

In my application I need to react pretty fast (<2 microseconds) to a GPIO interrupt.
I need to give a semaphore in the ISR that unblocks a task which sends an SPI command.

Currently it takes nearly 15 us before the SPI starts sending.
The CPU is clocked at 666 MHz and the GPIO interrupt priority in the ARM GIC is set to 160. The interrupts captured on the GPIO pin are edge-sensitive, active high.
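To put those numbers in terms of CPU cycles, here is a rough sketch of the arithmetic. The helper name us_to_cycles is made up for illustration, and it assumes the core really runs at the stated 666 MHz with no frequency scaling.

```c
#include <stdint.h>

/* Hypothetical helper: number of CPU cycles that elapse in a given
 * number of microseconds at a given core clock in MHz.
 * 1 MHz is one cycle per microsecond, so this is a plain multiply. */
static uint64_t us_to_cycles(uint64_t microseconds, uint64_t clock_mhz)
{
    return microseconds * clock_mhz;
}
```

With the figures above, us_to_cycles(2, 666) gives 1332 cycles for the 2 us target and us_to_cycles(15, 666) gives 9990 cycles for the latency currently measured.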

In order to avoid possible latency-increasing factors, I ran a test in a separate test project, with a single task waiting for the semaphore to be given & no TCP/IP stack running.

The latency is still more or less the same as in my original application (which uses TCP/IP stack and has several tasks running in background).

Does anybody know anything I could try to decrease the latency further, or is this already the best it gets?

I enabled configUSE_TASK_FPU_SUPPORT for every task, which means that the FPU registers must be saved on every interrupt. Could this be a meaningful contributor to the latency?
The thing is, I need the FPU support enabled as I use several FreeRTOS resources in the ISR(s).
Maybe I could avoid using FPU registers in the one ISR that matters for low latency, but not in all of them. Is it even possible to call only some ISRs without saving the FPU registers first?

Thanks for your time!
Best Regards,

Unlike a more predictable Cortex-M, there is going to be variance in Cortex-A interrupt latencies - depending on cache state, memory speed, etc. This article provides some experimental results. https://www.jblopen.com/arm-cortex-a-interrupt-latency/

Using a direct to task notification in place of the semaphore will shave a little off: https://www.freertos.org/2020/09/decrease-ram-footprint-and-accelerate-execution-with-freertos-notifications.html
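A minimal sketch of what that could look like, assuming a FreeRTOS project with the usual headers available. The names vGpioIsr, prvSpiTask, vSendSpiCommand and xSpiTaskHandle are placeholders made up for illustration, and clearing the GPIO interrupt source is device-specific and left out.

```c
#include "FreeRTOS.h"
#include "task.h"

/* Hypothetical: stands in for whatever sends the SPI command. */
extern void vSendSpiCommand(void);

/* Handle of the task to wake; filled in by xTaskCreate() below. */
static TaskHandle_t xSpiTaskHandle = NULL;

/* Hypothetical GPIO interrupt handler (signature is an assumption). */
void vGpioIsr(void *pvContext)
{
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;

    (void) pvContext;

    /* Clear the GPIO interrupt source here (device-specific, omitted). */

    /* Replaces xSemaphoreGiveFromISR(): increments the task's
     * notification value and unblocks it if it is waiting. */
    vTaskNotifyGiveFromISR(xSpiTaskHandle, &xHigherPriorityTaskWoken);

    /* Request a context switch on ISR exit if the woken task has a
     * higher priority than the task that was interrupted. */
    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}

static void prvSpiTask(void *pvParameters)
{
    (void) pvParameters;

    for (;;)
    {
        /* Replaces xSemaphoreTake(): block until notified, then clear
         * the notification value back to zero. */
        ulTaskNotifyTake(pdTRUE, portMAX_DELAY);
        vSendSpiCommand();
    }
}

/* Create the task before the GPIO interrupt is enabled, so the handle
 * is valid by the time the ISR first runs. */
void vStartSpiTask(void)
{
    xTaskCreate(prvSpiTask, "spi", configMINIMAL_STACK_SIZE, NULL,
                configMAX_PRIORITIES - 1, &xSpiTaskHandle);
}
```

The task is given the highest priority here only so that it runs as soon as the ISR exits; the stack size and priority are guesses and would need adjusting for a real application.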

Are you able to start the SPI from the interrupt handler itself? If so, you can set the interrupt's priority above configMAX_API_CALL_INTERRUPT_PRIORITY (the Cortex-A port's name for what other ports call configMAX_SYSCALL_INTERRUPT_PRIORITY; on the GIC, a higher priority is a numerically lower value). Critical sections in the kernel then don’t add jitter and interrupt entry is faster, but at the cost of not being able to use the FreeRTOS API from within the ISR.

Thanks for the hint with the task notifications, I did not know of that mechanism.

Regarding your question, no, the SPI driver uses FreeRTOS mechanisms and I don’t want to change that if I can avoid it somehow.

I did a bare-metal test without FreeRTOS and discovered that the latency is still too high (around 9 us, if I recall correctly), so I guess it wouldn’t get better than that anyway.

Thanks for your fast reply.