Hard fault after loading task stacks from storage

Sorry, my English is limited.

I am running FreeRTOS on an MSP432 Launchpad, which features a Cortex M4F MCU. For a specific purpose, I need to save all task stacks and TCBs to secondary storage and then load them back. Below is the entire process:

Upon completion of the saving process, the program will enter a forever loop, allowing me to halt the IDE. After rebuilding and restarting the program, the operating system will load the task stacks and TCBs from storage and add them to the ready list.
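The save and restore steps described above can be sketched on a host machine as follows. This is only an illustration under stated assumptions: every name and size here (`sketchStack`, `sketchTcb`, `sketchStorage`, `SKETCH_STACK_BYTES`, `SKETCH_TCB_BYTES`) is hypothetical, and a plain RAM array stands in for the secondary storage. It is not the original poster's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define SKETCH_STACK_BYTES 256u
#define SKETCH_TCB_BYTES   64u

/* RAM regions holding task state (stand-ins for the stack and TCB arrays). */
static uint8_t sketchStack[SKETCH_STACK_BYTES];
static uint8_t sketchTcb[SKETCH_TCB_BYTES];

/* Stand-in for secondary storage; a real system would write to flash, FRAM, etc. */
static uint8_t sketchStorage[SKETCH_STACK_BYTES + SKETCH_TCB_BYTES];

static_assert(sizeof sketchStorage == sizeof sketchStack + sizeof sketchTcb,
              "storage image must hold both regions");

/* Copy the stacks and TCBs out to storage. */
void sketchSaveState(void)
{
    memcpy(&sketchStorage[0], sketchStack, sizeof sketchStack);
    memcpy(&sketchStorage[sizeof sketchStack], sketchTcb, sizeof sketchTcb);
}

/* Copy them back into the same RAM regions. */
void sketchRestoreState(void)
{
    memcpy(sketchStack, &sketchStorage[0], sizeof sketchStack);
    memcpy(sketchTcb, &sketchStorage[sizeof sketchStack], sizeof sketchTcb);
}

/* Fill, save, wipe (simulating the restart), restore, and verify. Returns 1 on success. */
int sketchRoundTripOk(void)
{
    memset(sketchStack, 0xA5, sizeof sketchStack);
    memset(sketchTcb, 0x5A, sizeof sketchTcb);
    sketchSaveState();
    memset(sketchStack, 0, sizeof sketchStack);
    memset(sketchTcb, 0, sizeof sketchTcb);
    sketchRestoreState();
    return sketchStack[0] == 0xA5 && sketchStack[SKETCH_STACK_BYTES - 1u] == 0xA5
        && sketchTcb[0] == 0x5A && sketchTcb[SKETCH_TCB_BYTES - 1u] == 0x5A;
}
```

A byte-for-byte copy like this can only work if the regions sit at the same addresses after the rebuild, because the saved stacks and TCBs contain absolute pointers. It also captures only what is in memory at the moment of the save; anything still held in CPU or FPU registers is not part of the image.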

Initially, I added three tasks to the OS, and everything worked fine. However, upon adding another task that includes a float array, the program encountered a hard fault upon restarting.

All four tasks are from MiBench. The three tasks that worked fine are bitcnt, basicmath, and strsch. Strsch performs string search with a predefined string, thus involving a character array.
The fourth task is FFT, which performs Fourier transform with specific input, building 6 float arrays with these inputs. However, the FFT task enters a hard fault.

I also ran a test task that utilized an integer array for simple math operations, and it worked without any issues. However, when I changed the integer array to a float array, a hard fault occurred.

#include <stdlib.h>   /* rand() */

void testArrrayTask(void *pvParameters)
{
    /* Changing the array type to int makes this task work fine. */
    float a[100];
    float b[100];
    float c[100];
    int i;
    int progress = 0;

    (void) pvParameters;   /* unused */

    while(1)
    {
        for(i=0;i<100;i++)
        {
            a[i] = rand()%1000;
            b[i] = rand()%1000;
            c[i] = a[i]*b[i] + a[i]/b[i] + (a[i]-b[i]);
        }
        progress+=1;
    }
}

So I think the issue lies with the float type array. Although basicmath involves float operations, it doesn’t store the results in an array.

To keep saving and loading the task stacks and TCBs simple, I opted not to use the built-in heap_4 for memory management; instead, I used large static byte arrays. Task stack and TCB memory is taken from these arrays during task creation.

static unsigned char stackMEM[STACK_SIZE];
static unsigned char tcbMEM[sizeof( tskTCB ) * TASK_NUM];

void * getStackAddr(int taskID)
{
    // add switch case to allocate memory for task
    switch(taskID)
    {
        case IDLE:
            return &idleMEM[0];
        case BASICMATH:
            return &stackMEM[0];
        case BITCNTS:
            return &stackMEM[BASICMATHMEM];
        case STRSRCH:
            return &stackMEM[BASICMATHMEM+BITCNTSMEM];
        case FFT:
            return &stackMEM[BASICMATHMEM+BITCNTSMEM+STRSRCHMEM];
    }
    return &stackMEM[0];
}
void * getTCBAddr(int taskID)
{
    return &tcbMEM[sizeof( tskTCB ) * taskID];
}

I am considering reverting to heap_4 in the hope of resolving this issue, but that may entail considerable work to implement the saving and loading process. Apart from this approach, are there any other solutions to the float array issue?

Thank you for your attention.

Are you trying to bring the system to the state it was at the time of saving? If yes, are you also saving and restoring the registers?

Assuming that you are doing everything else correctly, one possible reason could be that your stack is not large enough to hold these large arrays. Try reducing the size of the arrays or increasing the stack size.
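To put a number on the stack suggestion, here is a minimal sketch of the arithmetic for the test task's three `float[100]` locals. The helper names are hypothetical, and it assumes 4-byte floats and 32-bit stack words, as on a Cortex-M port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stack depth is counted in 32-bit words, as on Cortex-M ports. */
typedef uint32_t SketchStackWord;

static_assert(sizeof(float) == 4u, "sketch assumes a 4-byte float");

/* Bytes occupied by `arrays` local arrays of `count` floats each. */
size_t sketchFloatArrayBytes(size_t arrays, size_t count)
{
    return arrays * count * sizeof(float);
}

/* Round a byte count up to whole stack words. */
size_t sketchBytesToWords(size_t bytes)
{
    return (bytes + sizeof(SketchStackWord) - 1u) / sizeof(SketchStackWord);
}
```

For the test task this gives 3 * 100 * 4 = 1200 bytes, or 300 words, before counting the saved context and any call frames. Note that the stack depth passed to `xTaskCreate()` is in words, not bytes, and `uxTaskGetStackHighWaterMark()` can report how close a running task has come to overflowing.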

Thank you for your reply!

Yes, I am trying to bring the system back to the state it was in at the time of saving.
I save the state when the system enters vTaskSwitchContext(), before the next task is switched in,
so the registers should already be on top of the task stack.

I have already tried this with the three tasks that have no float array, and the state is restored successfully. So I think the point at which I save the state is correct.

I have already made the stack as large as possible, but the problem remains.

You may need to check if you need to do something to force the Float Registers to save, as the M4F has the ability to hold off the saving of the floating point registers in ISRs unless there is a need to do so. I would think that the port would handle that for switching tasks, but you might be intercepting before that happens.

That is a very good suggestion from @richard-damon. For a quick test, you can disable lazy stacking by changing the definition of portASPEN_AND_LSPEN_BITS to the following:

#define portASPEN_AND_LSPEN_BITS              ( 0x1UL << 31UL )
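For reference, a small sketch of how the two FPCCR bits involved here can be decoded. ASPEN is bit 31 and LSPEN is bit 30 of FPCCR in the ARMv7-M architecture; the `SKETCH_` names, the enum, and the function are hypothetical and exist only for this illustration. The function takes the register value as an argument so it can be tried on a host.

```c
#include <assert.h>
#include <stdint.h>

/* FPCCR bit positions (ARMv7-M). Names are local to this sketch. */
#define SKETCH_FPCCR_ASPEN (0x1UL << 31UL) /* automatic FP state preservation */
#define SKETCH_FPCCR_LSPEN (0x1UL << 30UL) /* lazy FP state preservation */

static_assert((SKETCH_FPCCR_ASPEN & SKETCH_FPCCR_LSPEN) == 0,
              "the two control bits are distinct");

typedef enum {
    SKETCH_FP_NONE,  /* ASPEN clear: no automatic FP stacking */
    SKETCH_FP_EAGER, /* ASPEN set, LSPEN clear: FP registers stacked on exception entry */
    SKETCH_FP_LAZY   /* ASPEN and LSPEN set: space reserved, the save is deferred */
} SketchFpStacking;

/* Classify an FPCCR value read from the register viewer or from the register itself. */
SketchFpStacking sketchDecodeFpccr(uint32_t fpccr)
{
    if ((fpccr & SKETCH_FPCCR_ASPEN) == 0u) {
        return SKETCH_FP_NONE;
    }
    return ((fpccr & SKETCH_FPCCR_LSPEN) != 0u) ? SKETCH_FP_LAZY : SKETCH_FP_EAGER;
}
```

With the mask above, only ASPEN is named. Because OR-ing a mask into a register can only set bits, reading FPCCR back (for example in the register viewer) is the reliable way to confirm which mode is actually active.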

Thank you all for your replies.
I will check the FPU registers and post an update once that is done.

Update.

So I just changed the line.

#define portASPEN_AND_LSPEN_BITS              ( 0x1UL << 31UL )

After making this adjustment, I ran my project and used the register viewer to check the registers. Lazy Stacking is disabled, but somehow the same problem persists.

I think perhaps the data in registers s0-s15 is not being saved in the task stack? In port.asm, it only deals with registers s16-s31 and the FPU driver deals with s0-s15.

That is because the hardware stores s0-s15 itself when an interrupt happens, so we only need to store s16-s31. I am not familiar with your state restore code, but make sure that you are restoring all the FPU registers.
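The split between hardware and software saving can be made concrete with a short sketch that counts the words of context on a task stack, based on bit 4 of EXC_RETURN (the same bit the port tests with `tst r14, #0x10`). The function and macro names are hypothetical. The counts assume the stock Cortex-M4F port behaviour (software saves r4-r11 and r14, plus s16-s31 when the task has an FP context) and ignore the optional alignment padding word.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BASIC_FRAME_WORDS 8u    /* r0-r3, r12, lr, pc, xPSR */
#define SKETCH_FP_EXTRA_WORDS    18u   /* s0-s15, FPSCR, one reserved word */
#define SKETCH_EXC_RETURN_NO_FP  0x10u /* bit 4 set: basic frame, no FP state */

static_assert(SKETCH_BASIC_FRAME_WORDS + SKETCH_FP_EXTRA_WORDS == 26u,
              "extended hardware frame is 26 words");

/* Words pushed by the hardware on exception entry. */
uint32_t sketchHwFrameWords(uint32_t excReturn)
{
    if ((excReturn & SKETCH_EXC_RETURN_NO_FP) != 0u) {
        return SKETCH_BASIC_FRAME_WORDS;
    }
    return SKETCH_BASIC_FRAME_WORDS + SKETCH_FP_EXTRA_WORDS;
}

/* Total words of saved context: hardware frame plus the software-saved part. */
uint32_t sketchSavedContextWords(uint32_t excReturn)
{
    uint32_t words = sketchHwFrameWords(excReturn) + 9u; /* r4-r11 and r14 */
    if ((excReturn & SKETCH_EXC_RETURN_NO_FP) == 0u) {
        words += 16u;                                     /* s16-s31 */
    }
    return words;
}
```

For a task without FP context (EXC_RETURN 0xFFFFFFFD) this gives 17 words; for a task with FP context (0xFFFFFFED) it gives 51 words. A restore has to reproduce a stack whose layout matches the EXC_RETURN value saved with it, otherwise the exception return unstacks the wrong number of words.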

Update and maybe the final update.
The problem has been resolved, and I’d like to provide a detailed explanation of the solution.

I need to save the system state before I stop the system and then restore that state when I restart it.
I save the system state by simply writing all the TCBs and all the task stacks to secondary storage. In the restore phase, I just load this data back into memory.

The problem was the tasks that perform floating point operations. These tasks could not be restored successfully, and the system jumped to a hard fault.
I checked the ARM documentation and discovered that the hardware automatically saves "half" of the FPU registers (s0-s15; this feature is called stacking, with lazy stacking as a variant), and the programmer has to save the rest (s16-s31) themselves.

The FreeRTOS source code actually does this in port.asm. The issue seemed to be in the automatic saving part, so I disabled stacking and lazy stacking and let port.asm save all the FPU registers.

;/* Is the task using the FPU context?  If so, push all the vfp registers. */
	tst r14, #0x10
	it eq
	vstmdbeq r0!, {s0-s31}
.
.
.
;/* Is the task using the FPU context?  If so, pop all the vfp registers
	;too. */
	tst r14, #0x10
	it eq
	vldmiaeq r0!, {s0-s31}

With this change the problem was solved, and the system restores successfully even with the floating point task.

I also commented out the FPU enable code in port.c and did the equivalent with the original board driver:

/* Ensure the VFP is enabled - it should be anyway. */
//vPortEnableVFP();

/* Lazy save always. */
//*( portFPCCR ) |= portASPEN_AND_LSPEN_BITS;

// These two functions come from the board driver.
FPU_enableModule();
FPU_disableStacking();

Thanks again to aggarg and richard-damon for their help and advice!

Thank you for sharing your solution!