C++ Exceptions in the Posix Port

eslattery · July 26, 2024, 4:05am

I have run into a possible issue with some unit tests. While I am not testing FreeRTOS itself, some modules do rely on the RTOS, so I updated our unit tests to run in the FreeRTOS Posix simulator. We also use the simulator extensively to provide a functional device simulator that non-engineers can play with on their desktops.

The module I am testing here is an abstraction around a Task. It works perfectly on the MSVC/MinGW simulator, but our tests run in CI on Linux and therefore use the Posix simulator.

When calling vTaskDelete on a running task in the posix simulator, with C++ exceptions enabled, I get the error:

terminate called without an active exception
Signal: SIGABRT (Aborted)

We do not use exceptions in the device firmware, but would like to have them enabled in the simulator for testing. This is so that our asserts can throw instead of abort, which lets us test the asserts themselves. This is sometimes called the “Lakos Rule”.
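A minimal sketch of what "asserts throw instead of abort" could look like. The names SIM_ASSERT, SIMULATOR_BUILD, AssertionFailure, and checkStackSize are hypothetical and not taken from the project in this thread; the real macro may differ.

```cpp
#include <cstdlib>
#include <stdexcept>

// Hypothetical exception type thrown by a failed assert in simulator builds.
struct AssertionFailure : std::logic_error {
    using std::logic_error::logic_error;
};

// Assumption: the simulator/test build defines this; the device build would not.
#define SIMULATOR_BUILD 1

#if SIMULATOR_BUILD
    // Simulator/test build: a failed assert throws so a test can observe it.
    #define SIM_ASSERT(cond) \
        do { if (!(cond)) throw AssertionFailure("assert failed: " #cond); } while (0)
#else
    // Device build: exceptions are disabled, so a failed assert aborts.
    #define SIM_ASSERT(cond) \
        do { if (!(cond)) std::abort(); } while (0)
#endif

// Example precondition similar to the one described in the post:
// the provided memory must be big enough.
inline void checkStackSize(unsigned provided, unsigned required) {
    SIM_ASSERT(provided >= required);
}

// Returns true if the precondition check throws the assertion exception.
inline bool assertFires(unsigned provided, unsigned required) {
    try {
        checkStackSize(provided, required);
    } catch (const AssertionFailure&) {
        return true;
    }
    return false;
}
```

A test framework macro such as REQUIRE_THROWS would play the role of assertFires here: it passes only when the violated precondition actually throws.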

I have an example on GitHub, with one run that passes without exceptions and one that crashes with them. In the example, the Task constructor asserts that the provided memory is big enough, and one of the tests uses the test framework's REQUIRE_THROWS to check that assert.

I have traced the issue to vPortCancelThread in the Posix port.c file:

void vPortCancelThread( void * pxTaskToDelete )
{
    Thread_t * pxThreadToCancel = prvGetThreadFromTask( pxTaskToDelete );

    /*
     * The thread has already been suspended so it can be safely cancelled.
     */
    pthread_cancel( pxThreadToCancel->pthread );
    event_signal( pxThreadToCancel->ev );
    pthread_join( pxThreadToCancel->pthread, NULL );
    event_delete( pxThreadToCancel->ev );
}

I also found some references on Stack Overflow about pthread cancellation and C++ exceptions not mixing well. That seems to match what I am seeing, but placing try/catch blocks in the tasks that I am creating doesn’t seem to help.

Any suggestions on how I could modify the test code or test framework to make this work? Or confirmation that it is the exceptions/pthread cancellation issue? I can compile without exceptions, of course, but that limits what we can test and rules out some of the reporting features I have planned for the functional simulator.

aggarg · July 26, 2024, 6:00am

Is it correct to say that the primary purpose of throwing exceptions is to test asserts? If so, would it be possible for you to use something like what our tests do? See FreeRTOS/FreeRTOS/Test/CMock/queue/generic/queue_create_dynamic_utest.c at main · FreeRTOS/FreeRTOS · GitHub.

eslattery · July 26, 2024, 7:21pm

I had thought of that: using a handler version of our asserts and mocking it out with a flag or counter. But I don’t think that works in all situations. Often an assert is checking an invariant or a pointer, and if execution continues past the assert, that could lead to a null pointer dereference or some other undefined behavior. In this specific case the assert is checking that the task state struct isn’t already in use, so if the code goes past that point it will corrupt an already running task.
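The difference between the two approaches can be sketched as follows. All names here (TaskState, claimWithCountingAssert, claimWithThrowingAssert, g_assertCount) are hypothetical and only illustrate the argument; they are not from FreeRTOS or the project in this thread.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a task state struct that must not be reused while in use.
struct TaskState {
    bool inUse = false;
    int owner = 0;
};

static int g_assertCount = 0;  // counter a mocked assert handler would increment

// Variant 1: the assert is mocked with a counter. The failure is recorded,
// but execution continues and the existing owner is overwritten.
static void claimWithCountingAssert(TaskState& state, int newOwner) {
    if (state.inUse) {
        ++g_assertCount;  // "assert" only records the failure
    }
    state.inUse = true;
    state.owner = newOwner;  // still runs after a failed assert
}

// Variant 2: the assert throws. Control leaves the function before
// the state that is already in use can be touched.
static void claimWithThrowingAssert(TaskState& state, int newOwner) {
    if (state.inUse) {
        throw std::logic_error("task state already in use");
    }
    state.inUse = true;
    state.owner = newOwner;
}
```

With the counting variant a test can see that the assert fired, but the second claim has already replaced the first owner. With the throwing variant the first owner is left intact.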

Is it correct to say that the primary purpose of throwing exceptions is to test asserts?

That is the problem I am currently working on, but it isn’t the only reason. Using exceptions in the functional simulator would also be highly desirable.

One thing I noticed this morning is that this only happens if the thread being deleted has called taskYIELD().

eslattery · July 26, 2024, 7:26pm

OK, I hadn’t even posted that when I dug deeper and figured out the issue. I had to learn a few new things along the way, mainly that pthreads on Linux (glibc with GCC) implements cancellation by unwinding the stack with a special exception. That exception can be caught like so:

#include <cxxabi.h>  // declares abi::__forced_unwind (GCC-specific)

try {
    taskYIELD();
}
catch (const abi::__forced_unwind& err) {
    // this is the exception that gcc uses for pthread cancellation, must re-throw
    throw;
}

With that I was able to catch the cancellation, but when I put the same try/catch around my top-level task function, the re-thrown exception never made it there. Then I finally realized why: my function that wraps taskYIELD() is marked noexcept! An exception that tries to leave a noexcept function calls std::terminate, which is where the abort came from.

Now that I know pthreads uses exceptions to implement cancellation, I had to go through all my functions and reevaluate which ones can and cannot throw. That also explains why this issue isn’t seen on other OSes. I am glad there is a simple and non-intrusive solution for this, and I am always happy when I learn something along the way.
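For anyone who wants to reproduce this outside FreeRTOS, here is a minimal standalone sketch, assuming Linux with glibc and GCC (libstdc++). It uses plain pthreads in place of the Posix port, and the names blockingWrapper, taskBody, and runCancellationDemo are hypothetical. The thread blocks in sleep(), which is a cancellation point, and is then cancelled with pthread_cancel followed by pthread_join, as in the vPortCancelThread snippet above.

```cpp
#include <cxxabi.h>   // abi::__forced_unwind (GCC/libstdc++ specific)
#include <pthread.h>
#include <unistd.h>
#include <atomic>

static std::atomic<bool> g_started{false};
static std::atomic<bool> g_sawForcedUnwind{false};
static std::atomic<bool> g_cleanupRan{false};

struct Cleanup {
    // Runs while the cancelled thread's stack is being unwound.
    ~Cleanup() { g_cleanupRan = true; }
};

// Stand-in for a wrapper around taskYIELD(). It must NOT be noexcept:
// the forced unwind has to be able to pass through it.
static void blockingWrapper() {
    g_started = true;
    for (;;) {
        sleep(1);  // sleep() is a cancellation point
    }
}

static void* taskBody(void*) {
    Cleanup cleanup;
    try {
        blockingWrapper();
    } catch (const abi::__forced_unwind&) {
        g_sawForcedUnwind = true;
        throw;  // must re-throw, otherwise glibc aborts the process
    }
    return nullptr;
}

// Starts the thread, waits until it is running, cancels it, and joins it.
// Returns true if the thread ended because of the cancellation.
static bool runCancellationDemo() {
    pthread_t thread;
    if (pthread_create(&thread, nullptr, taskBody, nullptr) != 0) {
        return false;
    }
    while (!g_started.load()) {
        usleep(1000);
    }
    pthread_cancel(thread);
    void* result = nullptr;
    pthread_join(thread, &result);
    return result == PTHREAD_CANCELED;
}
```

Marking blockingWrapper as noexcept in this sketch should reproduce the std::terminate abort described above, because the unwind can no longer leave the function. Depending on the toolchain, the program may need to be built with -pthread.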

aggarg · July 28, 2024, 6:53am

Thank you for sharing your solution!