Recommendations on posix port setup and cleanup

cmorganBE · October 11, 2023, 3:00pm

We are using the posix port to test our system code, based on esp-idf, and had some questions on best practices.

We’d like to spin up a new set of our system (threads etc) for each catch2 test case without having to build separate executables for each test. Also, we run valgrind against our tests and have found a number of leaks related to threads being left behind.
How does one cause the main code to exit cleanly? The posix example shows:


    /* Start the scheduler itself. */
    vTaskStartScheduler();

    /* Should never get here unless there was not enough heap space to create
     * the idle and other system tasks. */
    return 0;
}

but in our case we’d like to be able to shut down and clean everything up without calling ‘exit’.

ActoryOu · October 19, 2023, 6:36am

Hi @cmorganBE,
Welcome to FreeRTOS community!

Sorry for the late response. According to the vTaskStartScheduler() documentation, it only returns if there is insufficient RTOS heap available to create the idle or timer daemon tasks. So return 0 is never reached here unless there is not enough heap memory.

Could you elaborate more on your use case? In my experience, I have never needed vTaskStartScheduler() to return.

Thanks.

cmorganBE · October 19, 2023, 12:45pm

Hi @ActoryOu!

I’m working on catch2 tests for the logic of a system that runs on esp-idf (FreeRTOS). Each of these tests is mostly standalone and does something like:

Create objects and start threads
Run test and check for states/values
Shut down and return an exit code based on whether tests passed or failed
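The three steps above can be sketched with plain pthreads, which is what the posix port builds its tasks on. This is only an illustration under stated assumptions: `worker_t`, `worker_start`, `worker_stop` and `run_one_test_case` are hypothetical names, not FreeRTOS or esp-idf APIs, and a real test would create FreeRTOS tasks rather than raw threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical worker state; none of these names come from FreeRTOS. */
typedef struct {
    pthread_t thread;
    atomic_bool stop_requested;
    atomic_int ticks;
} worker_t;

static void *worker_main(void *arg)
{
    worker_t *w = arg;
    const struct timespec period = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

    /* Poll the stop flag so the thread can leave its loop and return. */
    while (!atomic_load(&w->stop_requested)) {
        atomic_fetch_add(&w->ticks, 1);
        nanosleep(&period, NULL);
    }
    return NULL;
}

/* Step 1: create objects and start threads. */
static void worker_start(worker_t *w)
{
    atomic_init(&w->stop_requested, false);
    atomic_init(&w->ticks, 0);
    int rc = pthread_create(&w->thread, NULL, worker_main, w);
    assert(rc == 0);
    (void)rc;
}

/* Step 3: request shutdown, then join so the thread's resources are released. */
static void worker_stop(worker_t *w)
{
    atomic_store(&w->stop_requested, true);
    int rc = pthread_join(w->thread, NULL);
    assert(rc == 0);
    (void)rc;
}

/* One standalone "test case": returns 0 on pass, 1 on fail. */
int run_one_test_case(void)
{
    const struct timespec poll = { .tv_sec = 0, .tv_nsec = 1000000 };
    worker_t w;

    worker_start(&w);

    /* Step 2: wait for the worker to make progress, then check its state. */
    while (atomic_load(&w.ticks) < 3) {
        nanosleep(&poll, NULL);
    }
    int observed = atomic_load(&w.ticks);

    worker_stop(&w);
    return observed >= 3 ? 0 : 1;
}
```

Because every thread is joined before the function returns, `run_one_test_case()` can be called repeatedly from the same executable without leaving threads behind.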

We run two passes: one with the test suite running standalone, and one under valgrind to look for memory errors, leaks, etc.

Even when calling vTaskDelete(), the pthreads that underlie the FreeRTOS tasks appear to be left in a suspended state and not properly shut down.

At the global level, when all tests are done, we end up having to call exit() to shut the application down, again, leaving resources un-freed in the system and tripping a bunch of valgrind warnings.

We recently switched some threads from xTaskCreate to xTaskCreateStatic and that broke all of the tests after the first one. It’s unclear what the issue might be there at the moment.

So I’m here looking for input on how one is supposed to gracefully start up and shut down the FreeRTOS posix sim to ensure resources are freed, both so that subsequent tests run and to cut down on valgrind errors.

ActoryOu · October 20, 2023, 8:21am

Hi @cmorganBE,
Thanks for your clarification! It’s a lot clearer now!

Is it possible to measure the memory usage “after” starting the scheduler, and then compare memory usage before and after running each test case? That would ignore the kernel’s own usage and the task stacks, leaving only the allocations made by the code you actually want to test.
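A minimal sketch of that before/after comparison, assuming the code under test allocates through a wrapper. `tracked_malloc`, `tracked_free` and `leaked_by` are hypothetical names, and the counter is not thread-safe as written. FreeRTOS heap schemes that implement xPortGetFreeHeapSize() could play a similar role, but that depends on which heap implementation the build uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header placed in front of each block to remember its size. */
typedef union {
    size_t size;
    max_align_t align; /* keeps the user pointer suitably aligned */
} alloc_header_t;

static size_t live_bytes = 0; /* bytes currently allocated via the wrapper */

void *tracked_malloc(size_t size)
{
    alloc_header_t *h = malloc(sizeof(*h) + size);
    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    live_bytes += size;
    return h + 1; /* user memory starts right after the header */
}

void tracked_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    alloc_header_t *h = (alloc_header_t *)ptr - 1;
    live_bytes -= h->size;
    free(h);
}

size_t tracked_live_bytes(void)
{
    return live_bytes;
}

/* Runs a test body and returns how many bytes it left allocated. */
ptrdiff_t leaked_by(void (*test_body)(void))
{
    size_t before = tracked_live_bytes();
    test_body();
    return (ptrdiff_t)tracked_live_bytes() - (ptrdiff_t)before;
}

/* Example bodies: one balanced, one that forgets to free 32 bytes. */
void balanced_body(void)
{
    void *p = tracked_malloc(64);
    tracked_free(p);
}

static void *forgotten; /* keeps the deliberately leaked pointer reachable */

void leaky_body(void)
{
    forgotten = tracked_malloc(32);
}
```

Taking the baseline after the scheduler and system tasks are up means their allocations never appear in the delta, so a nonzero result points at the test body itself.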

BTW, task deletion cleanup is done in the idle task here. If you want the resources of deleted tasks to be released, give the idle task some time to run. But in your case that might not be enough, because the FreeRTOS kernel also creates tasks, such as the idle task, that are not deleted by you.
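The deferred cleanup described here can be illustrated with a small model: deleting only queues the task, and a later idle pass does the actual freeing. This is a simplified sketch and not kernel code; `task_t`, `task_delete_self` and `idle_reap` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical task record: a stand-in for a control block plus its stack. */
typedef struct task {
    struct task *next;
    void *stack;
} task_t;

static task_t *waiting_termination = NULL; /* deleted but not yet freed */
static int pending_count = 0;

task_t *task_create(size_t stack_bytes)
{
    task_t *t = malloc(sizeof(*t));
    if (t == NULL) {
        return NULL;
    }
    t->stack = malloc(stack_bytes);
    if (t->stack == NULL) {
        free(t);
        return NULL;
    }
    t->next = NULL;
    return t;
}

/* "Delete" only queues the task: a task cannot free the stack it is
 * still running on, so the memory is handed to the idle pass. */
void task_delete_self(task_t *t)
{
    t->next = waiting_termination;
    waiting_termination = t;
    pending_count++;
}

/* One idle pass: frees every queued task and returns how many were freed. */
int idle_reap(void)
{
    int freed = 0;
    while (waiting_termination != NULL) {
        task_t *t = waiting_termination;
        waiting_termination = t->next;
        free(t->stack);
        free(t);
        pending_count--;
        freed++;
    }
    return freed;
}

int tasks_pending_cleanup(void)
{
    return pending_count;
}
```

In this model, memory stays allocated between `task_delete_self()` and the next `idle_reap()`. The equivalent in a real test would be to block briefly (for example with vTaskDelay()) after deleting tasks, so the idle task gets a chance to run before the leak check.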

Thanks.

cmorganBE · October 30, 2023, 3:17pm

The lack of graceful shutdown is an issue for us because it makes it difficult to run several independent tests within the same executable and to get clean valgrind runs. It also feels sloppy that there aren’t clean ways to shut down, and that the existing freertos posix tests aren’t being run under valgrind to confirm they behave correctly.

The posix sim is complex enough that I’m not excited about trying to fix it either. I’m not sure what we should do.

Should I open some PRs to introduce running some of the existing tests in the test suite under valgrind to start to make the issue visible to other devs?

richard-damon · October 30, 2023, 3:43pm

My impression (I don’t use the sim frameworks) is that they are primarily a testing framework, and as such, it is desirable not to change the core files much to support them. In actual microprocessor implementations, “shutting down” FreeRTOS and returning to main is NOT a commonly needed feature, so the code to support that isn’t present, and getting a “clean” shutdown would likely need a moderate amount of added code.

If you want something like valgrind to be satisfied, then my expectation is that a kernel shutdown would only happen after ALL tasks (except Idle) have terminated, and then Idle could detect this state and shut down the kernel, perhaps as part of the user-supplied Idle hook.

This would require a change to any kernel supplied tasks (like TimerSVC) to have a way to tell them to shut down. It also means all user tasks need to accept a shutdown command.
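A rough pthread model of that scheme, with tasks that block until a shutdown command arrives and an idle-style loop that stops only once no tasks remain. All names here (`shutdown_cmd`, `live_tasks`, `idle_may_stop_kernel`, `run_and_shutdown`) are hypothetical. In FreeRTOS itself the check would presumably live in vApplicationIdleHook() (enabled via configUSE_IDLE_HOOK) and end with vTaskEndScheduler() on a port that implements it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_TASKS 3

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t changed = PTHREAD_COND_INITIALIZER;
static bool shutdown_cmd = false; /* the command every task must accept */
static int live_tasks = 0;        /* tasks that have not yet terminated */

/* A task that blocks until it receives the shutdown command, then exits. */
static void *task_main(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!shutdown_cmd) {
        pthread_cond_wait(&changed, &lock);
    }
    live_tasks--;                     /* report termination */
    pthread_cond_broadcast(&changed); /* wake the idle-style loop */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Stand-in for the idle hook's check. The caller must hold `lock`. */
static bool idle_may_stop_kernel(void)
{
    return shutdown_cmd && live_tasks == 0;
}

/* Starts the tasks, sends the shutdown command, waits for the idle
 * condition, and returns the live-task count observed at that point. */
int run_and_shutdown(void)
{
    pthread_t tasks[NUM_TASKS];

    pthread_mutex_lock(&lock);
    shutdown_cmd = false;
    live_tasks = NUM_TASKS;
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < NUM_TASKS; i++) {
        int rc = pthread_create(&tasks[i], NULL, task_main, NULL);
        assert(rc == 0);
        (void)rc;
    }

    pthread_mutex_lock(&lock);
    shutdown_cmd = true;
    pthread_cond_broadcast(&changed);
    while (!idle_may_stop_kernel()) {
        pthread_cond_wait(&changed, &lock);
    }
    int remaining = live_tasks;
    pthread_mutex_unlock(&lock);

    /* Joining is what actually releases each thread's resources. */
    for (int i = 0; i < NUM_TASKS; i++) {
        pthread_join(tasks[i], NULL);
    }
    return remaining;
}
```

The point of the counter is ordering: the shutdown of the "kernel" is gated on every task having acknowledged the command, which is also why kernel-created tasks would need a way to be told to stop.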

cmorganBE · November 1, 2023, 7:38pm

We’ve seen a huge amount of value in being able to run under the sim and use tools like valgrind to check our work. It’s caught maybe a dozen system memory corruption issues just on the most recent esp-idf/FreeRTOS project we developed, and those issues were mostly latent, with failure modes that were difficult to catch and debug.

I can appreciate that the OS wouldn’t necessarily be adjusted to support this use case.

In our case yes, we added a way to shut down each of our threads gracefully and simply don’t use that during the production usage of the system.

It’s hard to overstate the value of being able to run this stuff on linux and run valgrind on it. We’d probably switch RTOSes to get that support, depending on platform support of doing so, given the amount of time we’ve saved on debug from being able to run these advanced tools against our logic.

cobusve · November 2, 2023, 6:16pm

Hi Chris,

We are glad to hear that the posix port is assisting your testing, that is of course precisely why we had created it. In the FreeRTOS world this implementation is brand new and we appreciate any feedback on how to make it more useful. It still has a lot of room for improvement!

We would love to see some PRs with improvements, and we can have some more specific discussions around those. We have internally had similar problems with cleanup and shutdown and, as richard-damon points out, our goal is indeed to avoid any changes to the actual production code just to support these test cases. We have a couple of ideas and a couple of challenges to achieve this, and with a little bit of help from the community I think this is something we can make really great.

We are using the posix port more and more in our internal testing as well, so it is seeing a lot of use!

cmorganBE · November 30, 2023, 8:33pm

PR open here to improve the posix port functionality: “Improve POSIX port functionality” by cmorganBE, Pull Request #914, FreeRTOS/FreeRTOS-Kernel on GitHub.