uxTaskGetStackHighWaterMark is too optimistic?

jankok wrote on Wednesday, March 21, 2018:

I have an application where the high water mark is 16 after many minutes of using the app. But if I reduce the allocated stack size by just 1 byte, the app hangs within seconds of starting the app. Why?

I would think the “16” means that there are 16 bytes of stack space that have never been used since the task started, therefore I should be able to reduce the stack size by 15 and still have 1 byte unused. But any attempt to get the high water mark below 16 causes the app to hang within seconds or at most a few minutes.

I’m playing with this app based on an Arduino Uno:
https://learn.adafruit.com/arduin-o-phone-arduino-powered-diy-cellphone/parts-and-prep . I’ve added FreeRTOS and some code to display the current time. There are three tasks:

The idle task, which does nothing except call uxTaskGetStackHighWaterMark(NULL) and save the value in a global, volatile variable.

The “GPS” task (priority 0), which reads the time from a GPS module and displays it on the LCD display. It also calls uxTaskGetStackHighWaterMark(NULL) and saves the value in a global, volatile variable.

The “Phone” task (priority 1), which monitors the touchscreen, handles touches, and updates the display as the user enters a phone number. It also keeps track of its high water mark, and displays all the high water marks whenever the user touches the screen.

The GPS task communicates with the GPS module using SoftwareSerial, which can’t tolerate interruptions from the Phone task, so I’ve enclosed all communications with the GPS module with vTaskSuspend(PhoneTaskHandle) and vTaskResume(PhoneTaskHandle).

Both the GPS and the Phone task access the display, so I have DisplaySemaphore = xSemaphoreCreateMutex(); and I enclose all communication with the display with xSemaphoreTake(DisplaySemaphore, portMAX_DELAY) and xSemaphoreGive(DisplaySemaphore).

I’ve changed these configurations from the Amazon FreeRTOSConfig.h V10.0.0:
configMAX_PRIORITIES 2
configMINIMAL_STACK_SIZE 74
configCHECK_FOR_STACK_OVERFLOW 2
configUSE_TIMERS 0 // The Arduino port uses the watchdog timer for scheduling.

I currently have the GPS task stack size set at 236 and the Phone task at 251. With these settings, the high water marks for the tasks, after many minutes of running and using the app, are Idle: 31, GPS: 32, Phone: 32.

If I decrease any of those stack sizes by 17 (which should leave 14 or 15 bytes of headroom), the app hangs within seconds (usually) or a few minutes of use. Why is that?

Edit: When adjusting stack space, sometimes the app will hang and the Arduino LED will blink, indicating insufficient stack or heap space. But usually when the app hangs, the LED does NOT blink.

rtel wrote on Wednesday, March 21, 2018:

The stack is not necessarily consumed one byte at a time though,
especially on 32-bit devices, where a value of 16 would mean you only
have four spaces left, although I note you are using an 8-bit device.

jankok wrote on Wednesday, March 21, 2018:

Well, according to https://www.freertos.org/uxTaskGetStackHighWaterMark.html, “The value returned is the high water mark in words (for example, on a 32 bit machine a return value of 1 would indicate that 4 bytes of stack were unused).”

So on a 32-bit device, a high water value of 16 would mean there are 64 bytes of stack space that haven’t been used, right?
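
The unit conversion in the quoted documentation can be sketched in plain C. The helper name `unused_stack_bytes` and its explicit word-size parameter are illustrative assumptions, not part of the FreeRTOS API; in a real build the word size would be `sizeof(StackType_t)` for the port in use (one byte on the 8-bit AVR port, to the best of my knowledge).

```c
#include <stddef.h>

/* Hypothetical helper: convert a high water mark reported in stack
 * words into bytes. word_size stands in for sizeof(StackType_t). */
static size_t unused_stack_bytes(size_t high_water_mark_words, size_t word_size)
{
    return high_water_mark_words * word_size;
}
```

Under these assumptions, `unused_stack_bytes(16, 4)` gives 64 for a 32-bit port, while `unused_stack_bytes(16, 1)` gives 16 for an 8-bit port, so on the Uno the reported value and the byte count coincide.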

But this doesn’t answer my original question.

rtel wrote on Wednesday, March 21, 2018:

Ah, ok, perhaps I was making assumptions.

jankok wrote on Wednesday, March 21, 2018:

Ok, I see why trying to get the high water mark to be 15 causes a hang. I have configCHECK_FOR_STACK_OVERFLOW set to 2. So as soon as one of the last 16 bytes on the stack gets clobbered, a stack overflow is detected at the next task switch event.
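
The mechanism described above can be sketched as a plain C simulation that runs on a desktop; it is not FreeRTOS source. It assumes a descending stack pre-filled with 0xA5 and a check that inspects the lowest 16 bytes, matching the 16-byte figure in this post. All function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_BYTE   0xA5u  /* assumed fill pattern written at task creation */
#define GUARD_BYTES 16u    /* assumed size of the region the check inspects */

/* Fill a simulated stack with the known pattern. */
static void stack_init(uint8_t *stack, size_t size)
{
    memset(stack, FILL_BYTE, size);
}

/* Simulate a descending stack: the deepest excursion overwrites the top
 * `used` bytes and leaves the lowest (size - used) bytes untouched. */
static void stack_use(uint8_t *stack, size_t size, size_t used)
{
    assert(used <= size);
    memset(stack + (size - used), 0x00, used);
}

/* Count untouched fill bytes from the low end: the high water mark in bytes. */
static size_t high_water_mark(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == FILL_BYTE) {
        n++;
    }
    return n;
}

/* Method-2 style check: report overflow if any of the lowest GUARD_BYTES
 * bytes no longer holds the fill pattern. */
static bool guard_region_clobbered(const uint8_t *stack, size_t size)
{
    for (size_t i = 0; i < GUARD_BYTES && i < size; i++) {
        if (stack[i] != FILL_BYTE) {
            return true;
        }
    }
    return false;
}
```

In this model, a 236-byte stack with 220 bytes used leaves a high water mark of 16 and passes the check, while 221 bytes used leaves 15 and trips it, even though the stack has not actually run past its end.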

But that doesn’t explain why the LED doesn’t blink when that happens.

It also doesn’t explain another observation: the app seems to run reliably when the stack sizes are tuned so that the high water marks are 17 for all 3 tasks, but if I then set configCHECK_FOR_STACK_OVERFLOW to 1 and decrease the stack sizes for the GPS and Phone tasks by 16 (so there should be 1 byte of headroom for those two tasks), the app hangs soon after it starts.
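
For comparison, method 1 (going by the FreeRTOS documentation) does not look at the fill pattern at all; it only inspects the saved stack pointer at each context switch. A minimal sketch for a descending stack follows, in plain C with hypothetical function and parameter names. It illustrates what the check compares and is not offered as an explanation of the hang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Method-1 style check for a descending stack: report overflow only if
 * the stack pointer saved at the context switch has reached or passed
 * the lowest address of the task's stack. */
static bool sp_out_of_bounds(const uint8_t *stack_base, const uint8_t *saved_sp)
{
    return saved_sp <= stack_base;
}
```

Because only the saved stack pointer is compared, a deep excursion that has already unwound by the time of the switch would not be reported by a check of this kind.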