Printf debugging is a useful tool for software development, and it is also a simple way of providing logging using a terminal program like TeraTerm. However, using printf with a blocking UART driver is very slow at ordinary baud rates such as 115200. For example, a message of 29 characters including a CR and LF takes about 2.5 ms of UART time, and if any floating point variables are output the sprintf time can be anywhere from 20 to 150 us.
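As a rough check on the transmit figure, the time on the wire follows directly from the baud rate and the frame size. The sketch below assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits per character); the function name uart_tx_time_us is made up for this illustration and is not part of the printfdma code.

```c
#include <stdint.h>

/* Time in microseconds to clock out n_chars characters at the given baud rate.
 * bits_per_char is the full frame size on the wire: 10 for 8N1 (assumed). */
static uint32_t uart_tx_time_us(uint32_t n_chars, uint32_t baud, uint32_t bits_per_char)
{
    uint64_t bits = (uint64_t)n_chars * bits_per_char;
    /* Scale to microseconds and round to the nearest whole microsecond. */
    return (uint32_t)((bits * 1000000u + baud / 2u) / baud);
}
```

For a 29-character message at 115200 baud, uart_tx_time_us(29, 115200, 10) gives 2517 us, which agrees with the transmit time of about 2.5 ms measured on the scope later in this post.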
If you want to debug a loop that runs at 50 Hz with a slack time of 10% (2 ms per 20 ms cycle), the standard blocking printf-to-UART function is too slow. To get around this problem I wrote a non-blocking DMA function called printfdma that takes the format string and variables, does the formatting and puts the resulting string on a FreeRTOS queue. A separate lower priority task sends the characters to the UART using DMA.
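The format-then-queue pattern described above can be modelled on a host PC with a few lines of C. This is only a sketch of the idea, not the actual printfdma source: a fixed ring of buffers stands in for the FreeRTOS queue, the standard vsnprintf stands in for the formatting call, and the names printf_nb_sketch, print_task_poll_sketch, MSG_LEN and QUEUE_LEN are all hypothetical. Dropping a message when the queue is full is my assumption here; the real function may behave differently. The model is not thread safe on its own, since on the target the FreeRTOS queue provides that protection.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MSG_LEN   64   /* assumed maximum formatted message length */
#define QUEUE_LEN 8    /* assumed queue depth */

/* Stand-in for the FreeRTOS queue: a fixed ring of message buffers. */
static char   queue[QUEUE_LEN][MSG_LEN];
static size_t head, tail, count;

/* Producer: format the message and enqueue it without waiting.
 * Returns 0 on success, -1 if the queue is full (message dropped). */
int printf_nb_sketch(const char *fmt, ...)
{
    if (count == QUEUE_LEN)
        return -1;
    va_list args;
    va_start(args, fmt);
    vsnprintf(queue[head], MSG_LEN, fmt, args);  /* truncates long messages */
    va_end(args);
    head = (head + 1) % QUEUE_LEN;
    count++;
    return 0;
}

/* Consumer: called from the low-priority print task. Copies the oldest
 * message into out and returns its length, or 0 if nothing is queued.
 * On the target this is where the DMA transfer to the UART would start. */
size_t print_task_poll_sketch(char *out, size_t out_size)
{
    if (count == 0 || out_size == 0)
        return 0;
    strncpy(out, queue[tail], out_size - 1);
    out[out_size - 1] = '\0';
    tail = (tail + 1) % QUEUE_LEN;
    count--;
    return strlen(out);
}
```

The point of the split is that the caller only pays for the formatting and the enqueue, while the slow part (clocking the characters out at the baud rate) happens later in a task that runs only when nothing more urgent is ready.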
Instead of using the Berkeley stdio.h for the printf and sprintf functions I used the one developed by
Eyal Rozenberg (eyalroz1@gmx.com), 2021-2024, Haifa, Palestine/Israel. You can get it on GitHub here:
The advantage of this version is that it is thread safe (re-entrant), it doesn’t use the heap and it has some options that make it more suitable for
embedded systems. It is not fully compliant with the standard C library, but it works for me.
I did some comparative testing against the Berkeley printf/sprintf functions and found the following:
Three different number formats were used: scientific, float to 6 decimals and float to 3 decimals. I tried the Rozenberg version with double format on and off,
so there are two data sets for this sprintf function. Of course the maximum time to do an sprintf is the limiting factor in terms of speed, and it shows that the Rozenberg version
is considerably faster than the stdio one for the float with three decimals case. Scientific notation is very slow, with a maximum time of 173 us for the stdio version and 149 us for Rozenberg.
If you need speed, go for floating point and keep the precision as small as practical.
The printfdma function is self-contained in two files, a code file and a header. These should be added to your project.
They contain all of the functions necessary, including the DMA interrupt config and interrupt request handlers.
Depending on the processor and board you use, you will have to set up the UART you intend to use. In this example I have ported it to
UART 2 on an STM32G431 Nucleo board. You will also have to set the DMA channels, pins and other paraphernalia for other boards.
The specific board I used was the NUCLEO-G431KB.
The system also has a function to retrieve date/time info from the RTC, but there is a lot of overhead with this, so if you need
order-of-occurrence data for each printfdma call, use a suitably configured counter and output that as an integer.
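A minimal sketch of the counter approach, assuming a free-running 32-bit counter that is incremented on every log call. The names log_seq and format_with_seq are hypothetical, and snprintf stands in for printfdma so the example runs on a host. If several tasks log at once, the increment would need protecting (for example with a critical section), which is left out here.

```c
#include <stdint.h>
#include <stdio.h>

static uint32_t log_seq;  /* hypothetical free-running sequence counter */

/* Prefix a message with its sequence number so the order of the calls can
 * be reconstructed from the log. Returns the number of characters the full
 * line needs (as snprintf does), or a negative value on error. */
int format_with_seq(char *buf, size_t buf_size, const char *msg)
{
    return snprintf(buf, buf_size, "%lu: %s\r\n", (unsigned long)log_seq++, msg);
}
```

On the target the same prefix can be produced by passing the counter to printfdma as an ordinary integer argument, which avoids the RTC read entirely.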
Because the system requires a FreeRTOS queue, you cannot use the printfdma function until the printf task is running under the scheduler.
Usage:
#if (PRINTF_DMA_UART1_ON == 1)
#include "printfUart1_DMA_Driver.h"
#else
#include "usart.h"
#endif

int main(void)
{
    /* Reset of all peripherals, initializes the Flash interface and the SysTick. */
    HAL_Init();
    /* Configure the system clock */
    SystemClock_Config();
    /* Initialize all configured peripherals */
    MX_GPIO_Init();
#if (PRINTF_DMA_UART1_ON == 1)
    MX_USART2_UART_Init();
    /* Initialisation of printfdma system: uart, task, mutex etc. */
    initPrintf_DMA_Uart2(PRIO_PRINTF_DMA);
#else
    /* Use the ordinary blocking function */
    printf("Initialisation Done Standard Blocking printf\r\n");
    printVersion();
#endif
    /* Init scheduler */
    osKernelInitialize();
    /* Call init function for freertos objects (in cmsis_os2.c) */
    MX_FREERTOS_Init();
    /* Start scheduler */
    osKernelStart();
    /* Infinite loop; not reached once the scheduler is running */
    /* USER CODE BEGIN WHILE */
    while (1)
    {
    }
}
To call the function you simply write:
printfdma("Float Num Value = %01.6f, Other float = %01.6f\r\n", pi, 2 * pi);
printfdma("Int Num Value = %8d\r\n", count++);
printfdma("Int Num Value = 2560779.345\r\n", count++);
printfdma("Literal String no conversions\r\n", NULL);
The printfdma function is not fully compliant with the stdio version. To speed up processing, if you pass NULL as the single parameter after the format string, the function skips the vsprintf_ formatting function and
simply outputs the string literally to the queue for printing to the UART.
A shot of my scope is shown below.
The top two traces are the output of the UART with a decode. The UART is set to 115200 baud.
The bottom trace is the time it takes for the printfdma function to run the sprintf formatting and place the
data in the queue.
PrintfTest_V2.0.11.zip (1.6 MB)
As you can see, the printfdma function takes a maximum time of about 75 us, with the transmit time
being about 2.5 ms. If you print integers or simple short literal strings the time is much shorter, probably less than 35 us. The limiting factor is the xQueueSend function, which takes about 9 us to execute with a 100 MHz CPU clock.
I have used printfdma on a number of my projects that have time-critical sections and it has been very useful. There are probably some optimisations that could make the system faster, such as double buffering the string so that the xQueueSend call is outside of the printfdma call, or offloading the sprintf conversion to the print task. However, this would be complex and require a lot more memory.
Hope someone finds this useful.