Unit Test Strategy

Hi!

I’ve spent two years on an embedded C++ project using FreeRTOS, and it is getting crucial to have a regression test suite to avoid creating new bugs when fixing others.
It is a fairly big code base with 20+ tasks and a few more units to be tested. I’ve chosen Google Test/Mock framework to be run on a Linux computer, so I’m using the FreeRTOS Posix Linux port.
I’ve spent a few days just to be able to compile the code with Linux (Ubuntu) gcc and to stub/port HAL functions and macros. Some units which are very close to the HAL will not be possible to test (I think). I’ve also spent a few days getting rid of all the singletons I created (I will never create singletons again!).
Now when I’m starting to set up the tests, I get a bit hesitant about which way to go. I think these are my options:

  1. Call vTaskStartScheduler() from a pthread-created thread in google test main() function, and then let all tasks run for the duration of all tests.

  2. Call vTaskStartScheduler() at each unit test setup (pthread-created thread), and only run the tasks required for the particular unit to be tested.

  3. Break out the task function (everything inside the task loop) so that I can execute the task function “manually” from a unit test.

  4. Call vTaskStartScheduler() from a pthread-created thread in google test main() function, and then create/delete tasks from the units being involved in the particular unit test.

Option (1) seems nice because I imagine I can simulate a lot of things, as the entire project is running. But it does not seem suitable for mocking/substituting neighbouring objects for a particular unit to be tested.

Option (2) seems nice because I can instantiate the unit to be tested in each unit test and provide adjacent mock objects, and I imagine I will have better possibilities to create exhaustive tests. However, this option requires that I can restart FreeRTOS at each unit test without memory losses, and from what I’ve read, stopping FreeRTOS may not be fully tested (in which case I need to hack into the FreeRTOS code, which I would like to avoid). I’m using the FreeRTOS heap manager also in the Linux port, but I guess I could use the Linux heap instead, because then losing a few kB at every test run would probably not be noticed.

Option (3) seems nice because I have full control of the task execution, and I will be able to test one unit at a time. The downside is that I need to re-write all tasks and expose the task function for the unit tests. Also it’s unclear what more pitfalls I can expect if I don’t really run FreeRTOS tasks.

Option (4) seems nice as I only need to add a destructor to my units to delete their resources and tasks.

I’m very interested to hear experiences from anyone who has done a similar job! Perhaps there is a better option? It’s not an option to run unit tests on target though. At the moment I favor option (4), and will probably try this and see where it leads.

BR /Martin

Hi Martin,

Great decision! :clap:

Technically anything is testable given enough changes and modification.
To test low-level C functions with gtest/gmock, you will have to wrap them in classes and call them through the object throughout your code, including FreeRTOS API functions.
1- Unit tests should only test the function you are testing, and indirectly its static helper functions.
2- all FreeRTOS calls/Hardware calls should be mocked (including vTaskStartScheduler)
so I would vote for your number 2 solution here.
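
To make the wrapping idea concrete, here is a minimal sketch, assuming a hypothetical `IQueue` interface placed in front of the FreeRTOS queue calls. `IQueue`, `FakeQueue`, `WorkPackage` and `Worker` are illustrative names only, not FreeRTOS or gmock APIs. On target, a second implementation of the interface would forward to `xQueueSend()`/`xQueueReceive()`, and in a real suite a gmock class could take the place of the hand-written fake.

```cpp
#include <cassert>
#include <deque>

// Hypothetical work package type (assumption, not from the original project).
struct WorkPackage {
    int command;
    int value;
};

// Hypothetical interface hiding the RTOS queue. On target, an implementation
// would forward to xQueueSend()/xQueueReceive().
class IQueue {
public:
    virtual ~IQueue() = default;
    virtual bool send(const WorkPackage& wp) = 0;
    virtual bool receive(WorkPackage& wp) = 0;
};

// Host-side fake: a plain FIFO, no scheduler involved.
class FakeQueue : public IQueue {
public:
    bool send(const WorkPackage& wp) override {
        items_.push_back(wp);
        return true;
    }
    bool receive(WorkPackage& wp) override {
        if (items_.empty()) {
            return false;  // a real queue would block; the fake reports "empty"
        }
        wp = items_.front();
        items_.pop_front();
        return true;
    }
private:
    std::deque<WorkPackage> items_;
};

// Unit under test: depends only on the interface, never on FreeRTOS itself.
class Worker {
public:
    explicit Worker(IQueue& queue) : queue_(queue) {}

    // Handles at most one pending work package; returns true if one was handled.
    bool poll_once() {
        WorkPackage wp{};
        if (!queue_.receive(wp)) {
            return false;
        }
        last_value_ = wp.value;
        ++handled_;
        return true;
    }
    int handled() const { return handled_; }
    int last_value() const { return last_value_; }
private:
    IQueue& queue_;
    int handled_ = 0;
    int last_value_ = 0;
};
```

A test then constructs a `FakeQueue`, pushes a work package into it and calls `Worker::poll_once()`, so the outcome depends only on the unit's own logic.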

it might be beneficial to do that, but it should not be under unit testing.

you should just ignore FreeRTOS by mocking it, and just test your own code
what you are describing sounds more like integration testing.

true.
Also you won’t be able to add coverage information with gcov or profiling with gprof, or even benchmarking your code with Google Benchmark.
Linux provides lots of tools that are nice/beneficial to take advantage of.

Hope this helps!

Hi Martin,

From my understanding, this won’t work well for typical FreeRTOS applications because by definition, there is a lot of inter-task communication and mutual dependencies between tasks which are asynchronous, thus non-deterministic, and thus typically elude unit testing. In other words, a task that needs a notification/mailbox from another task (let alone an ISR) to do what it is supposed to do can’t really be treated as a standalone testable unit.

I’m curious about how you address those issues?

Thanks!

Thank you for your input!
I’m a bit hesitant to mock out FreeRTOS, because in a previous project some people chose to mock everything but their own code. It became more of a check that some functions were called a certain number of times rather than a functional test. It also made it tedious to make structural changes which didn’t really affect the functionality, because one needed to update the unit test accordingly.
Also, my code has many units with a function API, which are forwarded to the unit’s task as work packages. If possible I’d rather test the whole chain including FreeRTOS.
I’ve been struggling with the Linux port of FreeRTOS, because it didn’t compile with the preemptive option enabled, but occasionally crashed without it because the vTaskSuspendAll isn’t really forwarded to the port code (I think, because the crashes are due to calls from a task while the scheduler is suspended).
I fixed it so that I can set the preemptive option on again (I copied a few lines for “optimized task switch” from the ARM port), and I’m running Google Test from a FreeRTOS task (and the scheduler from another task).
If I can’t get that working next week, I’ll probably go for mocking everything.
BR, /Martin

I want to add that I now aim for only running the unit under test’s task(s). Some adjacent units which are used by many units, like the settings/configuration unit, may work as on target, but instead of reading from e.g. EEPROM or SD card, I can port these low-level accesses to file accesses on Linux to be able to provide different configurations for different unit tests.
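
A minimal sketch of that file-backed port, assuming a hypothetical `INvmStorage` interface between the settings unit and the storage driver. The interface and class names are invented for illustration and are not part of the project or of FreeRTOS. On target, the interface would be implemented on top of the EEPROM or SD card driver.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical low-level storage interface (assumption). On target, an
// implementation would talk to the EEPROM or SD card driver.
class INvmStorage {
public:
    virtual ~INvmStorage() = default;
    virtual bool read(std::vector<std::uint8_t>& out) = 0;
    virtual bool write(const std::vector<std::uint8_t>& data) = 0;
};

// Host-side implementation backed by an ordinary file, so each test can
// point the settings unit at its own configuration file.
class FileNvmStorage : public INvmStorage {
public:
    explicit FileNvmStorage(std::string path) : path_(std::move(path)) {}

    // Reads the whole file; returns false if it cannot be opened.
    bool read(std::vector<std::uint8_t>& out) override {
        std::ifstream in(path_, std::ios::binary);
        if (!in) {
            return false;
        }
        out.assign(std::istreambuf_iterator<char>(in),
                   std::istreambuf_iterator<char>());
        return true;
    }

    // Replaces the file contents; returns false on any stream error.
    bool write(const std::vector<std::uint8_t>& data) override {
        std::ofstream out(path_, std::ios::binary | std::ios::trunc);
        if (!out) {
            return false;
        }
        out.write(reinterpret_cast<const char*>(data.data()),
                  static_cast<std::streamsize>(data.size()));
        out.flush();
        return static_cast<bool>(out);
    }

private:
    std::string path_;
};
```

Each test fixture can then create a `FileNvmStorage` for its own settings file and hand it to the unit under test.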

At the end of the day you may test the way you see fit for your project!

But from a literature and nomenclature point of view, Unit Testing is when you test a single unit and mock/stub/fake all external calls to that unit.
Testing multiple units together is called Integration testing, even if you are using gtest with gmock, and you might have to do both (unit and integration), again depending on your product, how complex it is, how it is using other components etc…
Testing a unit with the OS(FreeRTOS) is Integration Testing.

Let us know how it goes, and if you need help as well.

Unfortunately I’m putting this unit test effort on hold for the time being, because I’m not really getting anywhere with writing test cases. It basically works to create a unit with a task from a unit test, and it works fine 5 times out of 6. But occasionally I get crashes from FreeRTOS when creating or deleting the unit with the task. I know too little to really point the finger at the Linux port, but I suspect there are time gaps where a pthread accesses FreeRTOS (like during a task switch, where I often get a failing assert) when it’s not supposed to.
For my project’s sake I need at least some kind of integration, unit or functional test running pretty soon. So I’m switching to creating an integration test (for now). I know it does not replace a unit test, as it will mostly test things that are supposed to work and will not find flaws in each unit due to missing or bad error handling.
To respond to RAc, I think I could mimic/replace a unit’s ISR’s from e.g. a UART pretty easily. Or I could have called a higher level function to mimic an incoming string from e.g. a UART. By simply redefining “private” as “public” just above the include of the header file to be tested, I could from the unit test access any private function in the unit under test.
When I get back to doing unit tests I would probably do what gedeonag suggests: mock out FreeRTOS entirely and rework my units with tasks a bit so I could make function calls to the task’s loop. Like:


void MyUnit::init()
{
    rtos_res = xTaskCreate(my_unit_task_entry,
                           "my_unit_task",
                           2U * TASK_STACK_SIZE_DEFAULT,
                           this,
                           TASK_PRIO_DEFAULT,
                           NULL);
}
void MyUnit::my_unit_task_entry(void *param)
{
    MyUnit* my_unit_task_p = static_cast<MyUnit*>(param);
    my_unit_task_p->my_unit_task();
}
void MyUnit::my_unit_task()
{
    for ( ; ; )
    {
        // Wait for a work package.
        my_unit_wp_t wp;
        BaseType_t rtos_res = xQueueReceive(m_work_queue, &wp, portMAX_DELAY);
        my_unit_task_event(&wp);
    }
}
void MyUnit::my_unit_task_event(const my_unit_wp_t* const wp)
{
    // Handle work package
}

Then I could call my_unit_task_event() with anything from a unit test. Well, it’s a plan :slight_smile:
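
A minimal sketch of how such a direct call could look on the host, with no FreeRTOS involved. The contents of `my_unit_wp_t`, the frequency limits and the accessor functions are assumptions made up for this example, since the thread does not show them; `init()` and the task loop are left out on purpose.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical work package; the real my_unit_wp_t is not shown in the thread.
struct my_unit_wp_t {
    std::uint32_t requested_frequency_hz;
};

// Host-only stand-in for MyUnit: only the event handler is kept, so no
// FreeRTOS call is needed to exercise it.
class MyUnit {
public:
    // Same role as my_unit_task_event() above: handle one work package.
    void my_unit_task_event(const my_unit_wp_t* const wp)
    {
        if (wp == nullptr ||
            wp->requested_frequency_hz < kMinHz ||
            wp->requested_frequency_hz > kMaxHz) {
            ++m_rejected;  // bad input is counted, state is left untouched
            return;
        }
        m_frequency_hz = wp->requested_frequency_hz;
    }

    std::uint32_t frequency_hz() const { return m_frequency_hz; }
    unsigned rejected() const { return m_rejected; }

private:
    // Illustrative limits only (assumption).
    static constexpr std::uint32_t kMinHz = 10000U;
    static constexpr std::uint32_t kMaxHz = 100000U;
    std::uint32_t m_frequency_hz = 0U;
    unsigned m_rejected = 0U;
};

// What a test body could look like. With Google Test, the final comparison
// would become EXPECT_EQ() calls inside a TEST().
bool handler_accepts_valid_and_rejects_invalid()
{
    MyUnit unit;
    const my_unit_wp_t valid{25000U};
    const my_unit_wp_t too_high{500000U};
    unit.my_unit_task_event(&valid);
    unit.my_unit_task_event(&too_high);
    unit.my_unit_task_event(nullptr);
    return unit.frequency_hz() == 25000U && unit.rejected() == 2U;
}
```

The error paths (null pointer, out-of-range request) are exactly the kind of case that is hard to provoke through the running task but trivial to feed in this way.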

If anyone is interested, this is the contents of my “main_gtest.cpp” which is executed by the Google Test (one can use gdb to debug the executable in Linux):


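#include <atomic>     // std::atomic
#include <cstdlib>    // abort()
#include <pthread.h>  // pthread_create(), pthread_cancel(), pthread_join()
#include <unistd.h>   // sleep()
#include "FreeRTOSConfig.h"
#include "FreeRTOS.h"
extern "C"
{
#include "task.h"
}
#include "gtest/gtest.h"
/**
 * Handle to the FreeRTOS scheduler thread id.
 */
static pthread_t m_freertos_thread_id;
static int m_result = 0;
// Written by the gtest task and polled from main(), hence atomic.
static std::atomic<bool> m_test_is_done{false};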
/**
 * FreeRTOS task which runs all my unit tests.
 */
static void gtest_task(void *param)
{
    (void)param;
    // Run Google Test from here!
    m_result = RUN_ALL_TESTS();
    m_test_is_done = true;
    vTaskSuspend(nullptr);
}
/**
 * Creates the unit test FreeRTOS task.
 */
static void start_free_rtos()
{
    BaseType_t rtos_res = xTaskCreate(gtest_task,
                                      "gtest_task",
                                      TASK_STACK_SIZE_DEFAULT * 10U,
                                      nullptr,
                                      TASK_PRIO_DEFAULT,
                                      nullptr);
    if (rtos_res != pdPASS)
    {
        abort();
    }
}
/**
 * FreeRTOS scheduler thread.
 */
static void* free_rtos_thread(void* data)
{
    (void)data;

    // Make this thread cancellable
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL);
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
    // Start tests
    start_free_rtos();
    // Start FreeRTOS scheduler (it never returns)
    vTaskStartScheduler();
    return nullptr;
}

static void end_free_rtos()
{
    pthread_cancel(m_freertos_thread_id);
    pthread_join(m_freertos_thread_id, nullptr);
    // Note: A call to vTaskEndScheduler() never returns.
}

int main(int argc, char **argv)
{
    testing::InitGoogleTest(&argc, argv);
    // Create the FreeRTOS scheduler thread and wait until tests are done
    pthread_create(&m_freertos_thread_id, nullptr, &free_rtos_thread, nullptr);
    while (!m_test_is_done)
    {
        sleep(1);
    }
    end_free_rtos();
    return m_result;
}

Well yes, but that won’t help you a lot due to a) the asynchronous nature of events and b) the non-deterministic nature of what is communicated. Assuming, for example, that the events do not come from a USART but are key strokes from a key pad ISR: if you only feed known key stroke sequences at predetermined intervals into your task, you won’t catch very many use cases.

My point, I believe, is that unit testing by definition is static and deterministic; you feed your function values from a known domain and expect deterministic answers. Most RTOS applications (tasks) can’t be expressed in static and deterministic control flow sequences. For example, you won’t catch very many deadlocks using unit testing.

@RAc @lundholm,
Totally agree with RAc, unit testing can’t catch deadlocks (in most cases at least); it is used to confirm that a certain function adheres to the requirements, for example.
Thus the need for different types of testing in a project (unit, feature, e2e, integration, functional, stress, fuzzing, verifast, cbmc, valgrind, benchmarking, profiling … too many to count them all). The job of a software engineer is to determine the needed ones (could be 1, could be all) depending on the nature of the project.
Unfortunately or fortunately :slight_smile: we do not know the type of project you are working on.

For the crash you are observing, could you enable core-dumps (ulimit -c unlimited), then make the crash happen and then

$ gdb -c <core_file> <executable_name>
gdb$ thread apply all bt

and post the output? (make sure you compile/link with debugging symbols and no optimization enabled -ggdb3 -O0)

Now I get an assert from line 4167 in ‘tasks.c’ function xTaskPriorityDisinherit() [V10.4.3] which is line ‘configASSERT( pxTCB == pxCurrentTCB )’. I’d say it now crashes one time out of four attempts with the same assert. The other times the test case is run successfully.
The assert is triggered by my CicMutex class, which is a standard mutex wrapper class. It is declared at the top of CicSettings::config() and destructed (and the mutex is released) when the object goes out of scope. Actually this type of crash smells like a stack overwrite, but I’ve tested increasing the task stack size with no difference. All other threads/tasks seem to sleep nicely, judging from the crash printout. I tested removing the mutex for this scope, but then it failed on another mutex release with the same assert. I suspect it’s my code causing the crash, but it’s weird that it often runs fine.

From test run:


[----------] 2 tests from CicFreqManagerTest
[ RUN      ] CicFreqManagerTest.SetGetFrequency
[cic_freq_mgr_test.cpp:99] Settings file: 'cic_freq_mgr_settings.txt'
[cic_settings.cpp:250] CIC Settings read 8064 [B] from NVM in 0 [ms].
ASSERT-os/FreeRTOS/tasks.c-4167
Aborted (core dumped)

From core dump:

Reading symbols from CIC_GTEST.exe...
[New LWP 5444]
[New LWP 5445]
[New LWP 5447]
[New LWP 5443]
[New LWP 5446]
[New LWP 5442]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./CIC_GTEST.exe'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f97f4a2e700 (LWP 5444))]
(gdb) thread apply all bt

Thread 6 (Thread 0x7f97f5230740 (LWP 5442)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f97f0003494) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f97f0003440, cond=0x7f97f0003468) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x7f97f0003468, mutex=0x7f97f0003440) at pthread_cond_wait.c:638
#3  0x000055dc2e38d7d7 in event_wait (ev=0x7f97f0003440) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/utils/wait_for_event.c:41
#4  0x000055dc2e38d4e9 in prvSuspendSelf (thread=0x7f97f0003338) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:496
#5  0x000055dc2e38d4bd in prvSwitchThread (pxThreadToResume=0x7f97e800e2b8, pxThreadToSuspend=0x7f97f0003338) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:474
#6  0x000055dc2e38d36f in vPortSystemTickHandler (sig=14) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:400
#7  <signal handler called>
#8  0x00007f97f57f73bf in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fff343e38f0, rem=rem@entry=0x7fff343e38f0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#9  0x00007f97f57fd047 in __GI___nanosleep (requested_time=requested_time@entry=0x7fff343e38f0, remaining=remaining@entry=0x7fff343e38f0) at nanosleep.c:27
#10 0x00007f97f57fcf7e in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#11 0x000055dc2e3fe359 in main (argc=1, argv=0x7fff343e3a38) at gtest/main_gtest.cpp:72

Thread 5 (Thread 0x7f97ef7fe700 (LWP 5446)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f97f0005190) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f97f0005140, cond=0x7f97f0005168) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x7f97f0005168, mutex=0x7f97f0005140) at pthread_cond_wait.c:638
#3  0x000055dc2e38d7d7 in event_wait (ev=0x7f97f0005140) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/utils/wait_for_event.c:41
#4  0x000055dc2e38d4e9 in prvSuspendSelf (thread=0x7f97f0005038) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:496
#5  0x000055dc2e38d4bd in prvSwitchThread (pxThreadToResume=0x7f97f0003338, pxThreadToSuspend=0x7f97f0005038) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:474
#6  0x000055dc2e38d181 in vPortYieldFromISR () at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:279
#7  0x000055dc2e38d196 in vPortYield () at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:287
#8  0x000055dc2e38c918 in prvProcessTimerOrBlockTask (xNextExpireTime=0, xListWasEmpty=1) at os/FreeRTOS/timers.c:633
#9  0x000055dc2e38c86d in prvTimerTask (pvParameters=0x0) at os/FreeRTOS/timers.c:579
#10 0x000055dc2e38d451 in prvWaitForStart (pvParams=0x7f97f0005038) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:440
#11 0x00007f97f5355609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#12 0x00007f97f5839293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 4 (Thread 0x7f97f522f700 (LWP 5443)):
#0  0x00007f97f575e322 in __GI___sigtimedwait (set=set@entry=0x7f97f522ee20, info=info@entry=0x7f97f522ed60, timeout=timeout@entry=0x0) at ../sysdeps/unix/sysv/linux/sigtimedwait.c:29
#1  0x00007f97f575d6ac in __GI___sigwait (set=0x7f97f522ee20, sig=0x7f97f522ee1c) at ../sysdeps/unix/sysv/linux/sigwait.c:28
#2  0x000055dc2e38d094 in xPortStartScheduler () at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:197
#3  0x000055dc2e38a688 in vTaskStartScheduler () at os/FreeRTOS/tasks.c:2089
#4  0x000055dc2e3fe2cc in free_rtos_thread (data=0x0) at gtest/main_gtest.cpp:50
#5  0x00007f97f5355609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#6  0x00007f97f5839293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 3 (Thread 0x7f97eeffd700 (LWP 5447)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f97e8002630) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f97e80025e0, cond=0x7f97e8002608) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x7f97e8002608, mutex=0x7f97e80025e0) at pthread_cond_wait.c:638
#3  0x000055dc2e38d7d7 in event_wait (ev=0x7f97e80025e0) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/utils/wait_for_event.c:41
#4  0x000055dc2e38d4e9 in prvSuspendSelf (thread=0x7f97e800e2b8) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:496
#5  0x000055dc2e38d42d in prvWaitForStart (pvParams=0x7f97e800e2b8) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:433
#6  0x00007f97f5355609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#7  0x00007f97f5839293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7f97effff700 (LWP 5445)):
#0  futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7f97f0003f10) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7f97f0003ec0, cond=0x7f97f0003ee8) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=0x7f97f0003ee8, mutex=0x7f97f0003ec0) at pthread_cond_wait.c:638
#3  0x000055dc2e38d7d7 in event_wait (ev=0x7f97f0003ec0) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/utils/wait_for_event.c:41
#4  0x000055dc2e38d4e9 in prvSuspendSelf (thread=0x7f97f0003db8) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:496
#5  0x000055dc2e38d42d in prvWaitForStart (pvParams=0x7f97f0003db8) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:433
#6  0x00007f97f5355609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#7  0x00007f97f5839293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f97f4a2e700 (LWP 5444)):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f97f573c859 in __GI_abort () at abort.c:79
#2  0x000055dc2e38e895 in cic_assert_failed (file=0x55dc2e450018 "os/FreeRTOS/tasks.c", line=4167) at gtest/ports/src/cic_common_port.c:29
#3  0x000055dc2e38b7d7 in xTaskPriorityDisinherit (pxMutexHolder=0x7f97f0003370) at os/FreeRTOS/tasks.c:4167
#4  0x000055dc2e38975d in prvCopyDataToQueue (pxQueue=0x7f97e800ce50, pvItemToQueue=0x0, xPosition=0) at os/FreeRTOS/queue.c:2144
#5  0x000055dc2e388f06 in xQueueGenericSend (xQueue=0x7f97e800ce50, pvItemToQueue=0x0, xTicksToWait=0, xCopyPosition=0) at os/FreeRTOS/queue.c:866
#6  0x000055dc2e39217c in CicMutex::~CicMutex (this=0x7f97f4a2db40, __in_chrg=<optimized out>) at cic/inc/cic_mutex.hpp:42
#7  0x000055dc2e3bd4a1 in CicSettings::config (this=0x7f97e800c670) at cic/src/cic_settings.cpp:187
#8  0x000055dc2e3fe0cc in CicSettingsMock::config (this=0x7f97e800c670, settings_file_name_p=0x55dc2e46c09b "cic_freq_mgr_settings.txt") at gtest/mocks/src/cic_settings_mock.cpp:45
#9  0x000055dc2e41e1c1 in CicFreqManagerTest::SetUp (this=0x7f97e800a530) at gtest/tests/src/cic_freq_mgr_test.cpp:100
#10 0x000055dc2e44edd1 in testing::internal::HandleSehExceptionsInMethodIfSupported (location=0x55dc2e46d997 "SetUp()", method=, object=0x7f97e800a530) at ./googletest/src/gtest.cc:2414
#11 testing::internal::HandleExceptionsInMethodIfSupported (object=object@entry=0x7f97e800a530, method=, location=location@entry=0x55dc2e46d997 "SetUp()") at ./googletest/src/gtest.cc:2469
#12 0x000055dc2e442c11 in testing::Test::Run (this=0x7f97e800a530) at ./googletest/src/gtest.cc:2503
#13 testing::Test::Run (this=0x7f97e800a530) at ./googletest/src/gtest.cc:2498
#14 0x000055dc2e442de5 in testing::TestInfo::Run (this=0x55dc2eff91b0) at ./googletest/src/gtest.cc:2684
#15 testing::TestInfo::Run (this=0x55dc2eff91b0) at ./googletest/src/gtest.cc:2657
#16 0x000055dc2e442ecd in testing::TestSuite::Run (this=0x55dc2eff9380) at ./googletest/src/gtest.cc:2816
#17 testing::TestSuite::Run (this=0x55dc2eff9380) at ./googletest/src/gtest.cc:2795
#18 0x000055dc2e4433ec in testing::internal::UnitTestImpl::RunAllTests (this=0x55dc2eff40a0) at /usr/include/c++/9/bits/stl_vector.h:1040
#19 0x000055dc2e44f341 in testing::internal::HandleSehExceptionsInMethodIfSupported (location=0x55dc2e46edf8 "auxiliary test code (environments or event listeners)", method=, object=0x55dc2eff40a0) at ./googletest/src/gtest.cc:2414
#20 testing::internal::HandleExceptionsInMethodIfSupported (object=0x55dc2eff40a0, method=, location=location@entry=0x55dc2e46edf8 "auxiliary test code (environments or event listeners)") at ./googletest/src/gtest.cc:2469
#21 0x000055dc2e44361c in testing::UnitTest::Run (this=0x55dc2e4c8980 ) at ./googletest/include/gtest/gtest.h:1412
#22 0x000055dc2e3fe200 in RUN_ALL_TESTS () at /usr/include/gtest/gtest.h:2490
#23 0x000055dc2e3fe217 in gtest_task (param=0x0) at gtest/main_gtest.cpp:19
#24 0x000055dc2e38d451 in prvWaitForStart (pvParams=0x7f97f0003338) at os/FreeRTOS/portable/ThirdParty/GCC/Posix/port.c:440
#25 0x00007f97f5355609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#26 0x00007f97f5839293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

well, that’s the very nature of asynchronous concurrent execution. Again, due to multiple possible orderings of task preemptions, you’ll get non deterministic behavior.

The assert seems to imply that a different task is releasing the mutex than the one that claimed it. Could it be that you do not use your mutex properly and/or really want semaphore instead of mutex behavior?

And btw I’m creating software for a 15 kW induction heating device, with a temperature regulator controlling a current regulator which in turn controls a frequency regulator :slight_smile: The closer the frequency is to the resonance frequency of the induction coil, the higher the heating power. Supporting the regulators there is a system control block (init, config, operating or error state), a central power manager which has a hw/sw monitor and control function, an NVM (EEPROM) unit, a FAT-SD unit (FreeRTOS port; works very nicely, thank you!) and peripheral device support for temperature sensors and remote devices. FreeRTOS works really nicely on target!
As it is intended for industrial usage, I really need all the tests I can come up with, but since I’m the only SW resource I have to (try to) spend my time wisely.

There is quite a lot of inter-task communication and I find no problem with that (at the moment) on target. SW is running for days with no issues.
In this case it is “impossible” that the task releasing the mutex is different from the one that took it, as the mutex is wrapped in a mutex object (the mutex object is just declared at the beginning of a function), so I’d rather think the mutex data is being overwritten (by me). But why would it not crash every time? Fishy.
Anyhow, to really verify whether FreeRTOS can be run the way I’ve set it up in Linux with Google Test, I would probably need to downscope the test a bit to limit the sources of possible errors.

Oh I see. Something like this?

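#include "FreeRTOS.h"
#include "semphr.h"

class MutexAbstraction
{
private:
    SemaphoreHandle_t itsMutex;
public:
    // Claim the existing mutex on construction.
    explicit MutexAbstraction(SemaphoreHandle_t existingMutex) : itsMutex(existingMutex)
    {
        xSemaphoreTake(itsMutex, portMAX_DELAY);
    }
    // Release it when the object goes out of scope.
    ~MutexAbstraction() { xSemaphoreGive(itsMutex); }
};

// A mutex created at global scope elsewhere, e.g. with xSemaphoreCreateMutex().
extern SemaphoreHandle_t aGlobalMutex;

void SomeCode()
{
    MutexAbstraction aMutex(aGlobalMutex);
    // protected code
}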

Perfectly legitimate and almost fool proof. If that kind of code crashes, it could only be if your stack gets overtrampled (more precisely, the itsMutex element of the dummy class instance on your application stack would), but I would expect it to crash in a different position then.

Usual question: Are all your app stacks big enough?

Exactly :slight_smile: It’s a really neat way to wrap mutexes in C++.

At least on target :slight_smile: But this mutex would be created from the heap and not from the stack actually (as configSUPPORT_DYNAMIC_ALLOCATION=1 in FreeRTOS config). I’ve ported FreeRTOS pvPortMalloc() accesses to Linux stdlib malloc() btw in the Linux port.
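
For reference, a minimal sketch of such a port of `pvPortMalloc()`/`vPortFree()` onto the C library heap, assuming a host-only build where the FreeRTOS heap_x.c files are not linked. The allocation counter and `live_allocations()` are hypothetical additions for spotting leaks between tests and are not part of FreeRTOS. Note that the stock heap_3.c additionally suspends the scheduler around `malloc()`/`free()`, which this sketch leaves out.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical counter of live allocations (not part of FreeRTOS). A test
// fixture can compare it before and after a test to spot leaked objects.
static std::atomic<long> g_live_allocations{0};

// Same signatures as the FreeRTOS heap API, forwarded to the C library heap.
extern "C" void* pvPortMalloc(std::size_t xWantedSize)
{
    void* p = std::malloc(xWantedSize);
    if (p != nullptr) {
        ++g_live_allocations;
    }
    return p;
}

extern "C" void vPortFree(void* pv)
{
    if (pv != nullptr) {
        --g_live_allocations;
        std::free(pv);
    }
}

// Hypothetical helper for test fixtures.
long live_allocations()
{
    return g_live_allocations.load();
}
```

Checking `live_allocations()` in a fixture's teardown gives a cheap indication of whether creating and deleting a unit with its task returns everything it allocated.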

Yes, I wouldn’t expect the mutex itself to be corrupted, just the pointer to it in the object allocated on the application stack. My suspicion is that whatever the overtrampled pointer refers to, in the location where itsMutex is expected, looks (for whatever reason) close enough to a valid object that FreeRTOS tries to treat it as a mutex, then throws the assert.

It’s basically the only scenario I can think of that could throw the assert, and yes, it can happen intermittently. If the random asserts happen reliably in some known time frame, I’d bump the stacks in your test environment temporarily and see if you still see the asserts. It should be a test that doesn’t cost you much but can spare you a good deal of headaches if it succeeds.

Thank you for involving yourself :slight_smile: All inputs are welcome!
Yep, I agree that the stack is the suspect but I’ve already bumped the stack size up and down, with exactly the same assert. And I tried to remove this particular mutex (I don’t really need them in the unit tests) to get hit by an assert in another mutex release in a different unit. The mutex pointer is owned by the C++ class instance, so I believe that’s also in the heap. So I could have a problem with heap being overwritten, but then I’d not expect an assert from FreeRTOS to fail, but fatal access crashes in Linux when accessing the mutex pointer.

Well, I looked through the code again, and it seems very unlikely that there is a problem with the mutex object itself (among other things because in order to reach the assert, the queue type must be queueQUEUE_IS_MUTEX, which is unlikely for an overtrampled pointer).

Thus the only possible scenario for the assert to catch is when pxCurrentTCB gets overwritten. I would probably dump it periodically to see if that looks like a valid pointer.
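
To illustrate the condition behind that assert, here is a small host-only model. It is not FreeRTOS code: `FakeTcb`, `FakeMutex` and `g_current_tcb` are stand-ins invented for this sketch, with `g_current_tcb` playing the role of pxCurrentTCB. It only shows that a release fails whenever the kernel's idea of the current task differs from the recorded mutex holder, regardless of which thread is actually executing.

```cpp
#include <cassert>

// Stand-in for a task control block (the real TCB is private to tasks.c).
struct FakeTcb {
    const char* name;
};

// Stand-in for the kernel's pxCurrentTCB.
static FakeTcb* g_current_tcb = nullptr;

class FakeMutex {
public:
    // Records the current task as holder; fails if the mutex is already held.
    bool take()
    {
        if (holder_ != nullptr) {
            return false;
        }
        holder_ = g_current_tcb;
        return true;
    }

    // Mirrors the check behind the assert: only the recorded holder may give.
    bool give()
    {
        if (holder_ == nullptr || holder_ != g_current_tcb) {
            return false;  // the point where the kernel would assert
        }
        holder_ = nullptr;
        return true;
    }

private:
    FakeTcb* holder_ = nullptr;
};

// Returns true if the release succeeds in the given scenario.
bool release_succeeds(bool current_tcb_changes_in_between)
{
    FakeTcb gtest_task{"gtest_task"};
    FakeTcb unit_task{"my_unit_task"};
    FakeMutex mutex;

    g_current_tcb = &gtest_task;
    mutex.take();
    if (current_tcb_changes_in_between) {
        // The "current task" pointer changes while the taker still holds the mutex.
        g_current_tcb = &unit_task;
    }
    const bool ok = mutex.give();
    g_current_tcb = nullptr;
    return ok;
}
```

In this model `release_succeeds(false)` is true and `release_succeeds(true)` is false, which is why printing both pointers right before the assert is a useful check.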


Actually it was interesting to print out the pxTCB and the pxCurrentTCB before the mutex assert! There are two very similar tests which go like this:

  1. Create instance of a class with a FreeRTOS task
  2. Initialize instance to create the task
  3. Sleep short periods until the task reports it is operating (from the task itself)
  4. Call a function of the class which creates a work package which is put in the task’s queue
  5. Verify that the work package was handled by the task and work completed

In these particular tests, the first requests a frequency, which ends up in a stub class which loops the requested frequency back as a measured frequency.
The second test is similar but sends a duty cycle ratio request instead.


[----------] 2 tests from CicFreqManagerTest
[ RUN      ] CicFreqManagerTest.SetGetFrequency
[os/FreeRTOS/tasks.c:773] 7FA19C003BD0          <---- Created task
[cic_freq_mgr_test.cpp:96] Settings file: 'cic_freq_mgr_settings.txt'
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70         <---- gtest task
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA1A4019B70         <---- pxTCB
[os/FreeRTOS/tasks.c:4174] 7FA1A4019B70         <---- pxCurrentTCB
[os/FreeRTOS/tasks.c:4175] 1
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4712] 1
...
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4174] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4175] 1
[os/FreeRTOS/tasks.c:4711] 7FA19C003BD0
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA19C003BD0
[os/FreeRTOS/tasks.c:4174] 7FA19C003BD0
[os/FreeRTOS/tasks.c:4175] 1
[       OK ] CicFreqManagerTest.SetGetFrequency (1170 ms)
[ RUN      ] CicFreqManagerTest.SetGetDuty
[os/FreeRTOS/tasks.c:773] 7FA19C003BD0          <---- Created task
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4174] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4175] 1
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4174] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4175] 1
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4174] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4175] 1
[os/FreeRTOS/tasks.c:4711] 7FA1A4019B70
[os/FreeRTOS/tasks.c:4712] 1
[os/FreeRTOS/tasks.c:4173] 7FA1A4019B70     <---- pxTCB 
[os/FreeRTOS/tasks.c:4174] 7FA19C003BD0     <---- pxCurrentTCB
[os/FreeRTOS/tasks.c:4175] 1
ASSERT-os/FreeRTOS/tasks.c-4176
Aborted (core dumped)