vTaskDelay causes system halt

vicui wrote on Saturday, August 17, 2013:

Hi:

I use an STM32 chip to receive data via a 485 bus, and my receive routine is as follows:
static void rs485_plus_rx_routine(void)
{
    rs485_plus_rx_flag = 0;
    rs485_plus_rx_count = 0;
    rxtimeout_plus = 0;
    cmdrsppacketsize_plus = 0;

    while (!rs485_plus_rx_flag)
    {
        if (rxtimeout_plus++ > SLIP_RX_TIMEOUT_PLUS)
        {
            break;
        }
        vTaskDelay(1);
    }
}
After running for some hours, the system halts. I printed some debug info and found that the system halts inside the vTaskDelay(1) call, which never returns. Does anyone know why?

Vincent

rtel wrote on Saturday, August 17, 2013:

I’m not sure that simple debug printf()'ing, which will radically change the behaviour of the code you are testing, is accurate enough to know exactly where the problem lies.  For example, vTaskDelay() will cause a switch to another task - so you don’t know if the problem occurs when you call the function, in the function, or the task it switches to.

Are you using FreeRTOS V7.5.2?  If not I would recommend switching to it as it has some extra diagnostic functionality, some of which was put in specifically for users of the STM32 peripheral driver library - which has some unique characteristics.

When you are using V7.5.2 ensure configASSERT() is defined, along with the normal stack overflow checking, etc.

Also look at http://www.freertos.org/FAQHelp.html

Regards.

vicui wrote on Saturday, August 17, 2013:

Hi Richard:

I am using 7.4.2. The total stack size of all tasks is 22K, and I set the system heap to 32K. I don’t know whether that is enough or not.

vincent

davedoors wrote on Saturday, August 17, 2013:

That would seem like a lot of stack, so it is probably ok, but we don’t know how your application is using the stack. If you followed Richard’s suggestions you would know, rather than still have to guess.

vicui wrote on Sunday, August 18, 2013:

Hi:

I am now running my application with FreeRTOS v7.5.2 and have enabled configASSERT, but I don’t see any assert when the system halts.
Does it mean it is my application problem?
xTaskCreate(dhcp_task, "DHCPDOG", configMINIMAL_STACK_SIZE*3, NULL, DHCP_TASK_PRIO, &dhcphandle);
xTaskCreate(oled_task, "OLED", configMINIMAL_STACK_SIZE*2, NULL, OLED_TASK_PRIO, &oledhandle);
xTaskCreate(log_task, "LOG", configMINIMAL_STACK_SIZE*2, NULL, LOG_TASK_PRIO, &loghandle);
xTaskCreate(serial_task, "COM", configMINIMAL_STACK_SIZE*4, NULL, COM_TASK_PRIO, NULL);
xTaskCreate(relay_task, "RELAY", configMINIMAL_STACK_SIZE*3, NULL, RS485_TASK_PRIO, &pduhandle);
xTaskCreate(daisyqna_task, "DSYQNA", configMINIMAL_STACK_SIZE*3, NULL, DAISYCHAIN_TASK_PRIO, NULL);
xTaskCreate(sensor_task, "SENSOR", configMINIMAL_STACK_SIZE*3, NULL, SENSOR_TASK_PRIO, &sensorhandle);

xTaskCreate(svrrch_task, "SVRRCH", configMINIMAL_STACK_SIZE*2, NULL, SVRRCH_TASK_PRIO, &svrhandle);
xTaskCreate(ftp_task, "FTP", configMINIMAL_STACK_SIZE*3, NULL, FTP_THREAD_PRIO, &ftphandle);
xTaskCreate(telnet_task, "TELNET", configMINIMAL_STACK_SIZE*4, NULL, TELNET_THREAD_PRIO, &telnethandle);
xTaskCreate(ssh_task, "SSH", configMINIMAL_STACK_SIZE*5, NULL, SSHSERVER_THREAD_PRIO, &sshhandle);
xTaskCreate(http_task, "HTTP", configMINIMAL_STACK_SIZE*6, NULL, HTTPSERVER_THREAD_PRIO, &httphandle);
Above are all my tasks except the TCPIP and ETHERNET tasks.

davedoors wrote on Sunday, August 18, 2013:

Does it mean it is my application problem?

Not definitely, but to be blunt, most likely. I can’t remember the last time a support request originated from a bug in the kernel code. Most problems arise from bad configuration options or just simple application coding errors. If you think about it that is logical though. The kernel is a small piece of code used by thousands of people over a long period of time. Application code is normally much larger and brand new. All code has bugs in it though.

I would recommend cutting your application down to its bones. Run just a few tasks, test, then incrementally add more until the problem recurs, then back track a little and debug.

vicui wrote on Sunday, August 18, 2013:

Yes, my tasks are fairly complex. Before I added the DAISYCHAIN task, all tasks worked well on a single device.
Once I added the DAISYCHAIN task, the system halts started. The DAISYCHAIN task uses a second 485 channel to poll data, and another task polls data on the first 485 channel. I keep their data buffers separate, each in an array that is large enough. I traced the DAISYCHAIN task to find where the system halts, and the result is what I posted previously.
I am starting to suspect the FATFS configuration. There is a file REENTRANT feature, which I enabled… I don’t know if it is the root cause.

vincent

richard_damon wrote on Sunday, August 18, 2013:

First suggestion, make sure you have turned on stack overflow checking to see if every task has enough stack as stack overflows can cause all sorts of strange problems.

Related to this, you are defining all your stack sizes as configMINIMAL_STACK_SIZE*n, which is normally a bad idea. configMINIMAL_STACK_SIZE is set to be enough to cover the system overhead and some minimal amount of base stack for the tasks (enough for the idle task). If your task needs additional space, that space is almost assuredly not based on this value, but an absolute number of bytes, so your stack sizes should be of the form configMINIMAL_STACK_SIZE+n (with a different n).
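
A small host-side sketch of why the additive form is more robust. Both helper names are hypothetical, and the figures in the note below (128 or 70 words for the minimum, 400 extra words for the task) are made up for illustration. Keep in mind that xTaskCreate() takes its stack depth in words, not bytes.

```c
/* Hypothetical helper: the "minimum * n" style used in the xTaskCreate()
 * calls earlier in the thread. The task's share scales with the minimum. */
unsigned stack_words_scaled(unsigned minimal_words, unsigned n)
{
    return minimal_words * n;
}

/* Hypothetical helper: the "minimum + n" style. extra_words is what the
 * task itself needs and does not change when the minimum is retuned. */
unsigned stack_words_offset(unsigned minimal_words, unsigned extra_words)
{
    return minimal_words + extra_words;
}
```

With these helpers, retuning the minimum from 128 to 70 words shrinks a `*4` task from 512 to 280 words even though the task's own needs did not change, while the `+400` form only moves from 528 to 470 words.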

The second likely source of the problem is an interrupt with the wrong priority or using a non FromISR API routine, which can cause data corruption inside FreeRTOS.

Lastly, looking at the code for the function, it looks like it would be better for the ISR to use a semaphore to signal when data was available, rather than polling on a data flag. (Unless there is some reason the ISR can not have an interrupt priority compatible with this). By polling you will not start to process the data until the time tick after the data arrives, and are using CPU time to run the loop while waiting.

And finally, debugging through routines like vTaskDelay can be tricky as these routines do context switches inside of themselves, so you can easily end up inside another task’s operation.
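
A minimal sketch of the semaphore approach suggested above, written against the V7.x API names (xSemaphoreHandle, vSemaphoreCreateBinary). The semaphore name, the two helper functions and the placeholder timeout value are assumptions, not code from the original application, and the byte handling inside the ISR is left out because it is application specific. The USART3 interrupt priority must be one from which FromISR calls are allowed.

```c
#include "FreeRTOS.h"
#include "task.h"
#include "semphr.h"

/* Placeholder only: the real value is defined in the poster's application. */
#ifndef SLIP_RX_TIMEOUT_PLUS
#define SLIP_RX_TIMEOUT_PLUS 100
#endif

/* Hypothetical name for the semaphore that signals "packet received". */
static xSemaphoreHandle rs485_plus_rx_sem = NULL;

/* Call once before the scheduler is started. */
void rs485_plus_rx_init( void )
{
    vSemaphoreCreateBinary( rs485_plus_rx_sem );

    /* vSemaphoreCreateBinary() leaves the semaphore available, so take it
     * once here; otherwise the first wait would return immediately. */
    if( rs485_plus_rx_sem != NULL )
    {
        ( void ) xSemaphoreTake( rs485_plus_rx_sem, 0 );
    }
}

/* Blocks until the ISR signals or the timeout expires.
 * Returns pdTRUE if data arrived, pdFALSE on timeout. */
portBASE_TYPE rs485_plus_rx_wait( void )
{
    return xSemaphoreTake( rs485_plus_rx_sem,
                           ( portTickType ) SLIP_RX_TIMEOUT_PLUS );
}

void USART3_IRQHandler( void )
{
    portBASE_TYPE xHigherPriorityTaskWoken = pdFALSE;

    /* ... read the received byte and decide whether the packet is
     * complete (application specific, omitted here) ... */

    xSemaphoreGiveFromISR( rs485_plus_rx_sem, &xHigherPriorityTaskWoken );

    /* Request a context switch if the waiting task should run now. */
    portEND_SWITCHING_ISR( xHigherPriorityTaskWoken );
}
```

The waiting task then uses no CPU time while blocked and wakes as soon as the ISR gives the semaphore, instead of on the next tick.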

vicui wrote on Sunday, August 18, 2013:

I set up the NVIC for 485, 485plus and Ethernet as follows. Would you help check whether this is a problem or not?
static void RS485_NVIC_Config(void)
{
    NVIC_InitTypeDef NVIC_InitStructure;

    NVIC_PriorityGroupConfig(NVIC_PriorityGroup_2);
    NVIC_InitStructure.NVIC_IRQChannel = USART1_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);
}
static void RS485_plus_NVIC_Config(void)
{
    NVIC_InitTypeDef NVIC_InitStructure;

    NVIC_PriorityGroupConfig(NVIC_PriorityGroup_2);
    NVIC_InitStructure.NVIC_IRQChannel = USART3_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);
}
void ETH_NVIC_Config(void)
{
  NVIC_InitTypeDef   NVIC_InitStructure;

  /* 2 bit for pre-emption priority, 2 bits for subpriority */
  NVIC_PriorityGroupConfig(NVIC_PriorityGroup_2);
 
  /* Enable the Ethernet global Interrupt */
  NVIC_InitStructure.NVIC_IRQChannel = ETH_IRQn;
  NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 2;
  NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
  NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
  NVIC_Init(&NVIC_InitStructure);   
}

rtel wrote on Monday, August 19, 2013:

Your code is wrong, and FreeRTOS V7.5.2 has traps specifically to catch this error (if you use it with configASSERT() defined).

Go to the link I posted in my first reply.  At the top of that page you will see “a special note for Cortex-M3 users”, with a link to “A page dedicated to explaining the ARM Cortex-M interrupt behaviour”.  On that page you will find some bold red text that says “A special note for STM32 users” — that note has the answer you are looking for.

Regards.

vicui wrote on Monday, August 19, 2013:

I saw that note before. Before the RTOS scheduler starts, NVIC_PriorityGroupConfig( NVIC_PriorityGroup_4 ); must be called.

But when can I set NVIC_PriorityGroupConfig( NVIC_PriorityGroup_2 ), after the OS is running? Or should I only use NVIC_PriorityGroup_4 as the default value?

I enabled configASSERT, but I don’t catch the trap.
If the interrupt configuration is wrong, why does it run for so long before it halts?

rtel wrote on Monday, August 19, 2013:

when can I set NVIC_PriorityGroupConfig( NVIC_PriorityGroup_2 ), after the OS is running?

You can’t use that setting very easily.  All priority bits must be set to preemption priority if you want a simple system.  If any other setting is used then the interrupt masking to correctly implement interrupt nesting will  be very complex.
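
To make the numbers concrete, here is a host-side sketch of how a preemption priority and subpriority pack into the 8-bit priority value on a part that implements 4 priority bits, as STM32 devices do. The function names are hypothetical, and EXAMPLE_MAX_SYSCALL_PRIORITY (0xB0) is an assumed stand-in for configMAX_SYSCALL_INTERRUPT_PRIORITY; the real value comes from the project's FreeRTOSConfig.h. On Cortex-M a numerically lower value means a logically higher priority, and an ISR that calls FromISR functions must not be above the max syscall priority.

```c
#include <stdint.h>
#include <stdbool.h>

#define PRIO_BITS                    4u    /* priority bits implemented on STM32 */
#define EXAMPLE_MAX_SYSCALL_PRIORITY 0xB0u /* assumption, see FreeRTOSConfig.h */

/* Pack preemption priority and subpriority into the 8-bit priority value.
 * preempt_bits is 2 for NVIC_PriorityGroup_2 and 4 for NVIC_PriorityGroup_4. */
uint8_t nvic_priority_byte(unsigned preempt_bits, unsigned preempt, unsigned sub)
{
    unsigned sub_bits     = PRIO_BITS - preempt_bits;
    unsigned preempt_mask = (1u << preempt_bits) - 1u;
    unsigned sub_mask     = (1u << sub_bits) - 1u;
    unsigned field        = ((preempt & preempt_mask) << sub_bits) | (sub & sub_mask);

    /* Implemented bits sit in the top of the byte. */
    return (uint8_t)(field << (8u - PRIO_BITS));
}

/* True when an ISR at this priority is allowed to call FromISR functions:
 * its value must be numerically >= the max syscall priority. */
bool isr_may_use_from_isr_api(uint8_t priority_byte)
{
    return priority_byte >= EXAMPLE_MAX_SYSCALL_PRIORITY;
}
```

Under these assumptions, preemption priority 0 in group 2 packs to 0x00, the highest possible priority, which is above any max syscall setting, whereas priority 11 in group 4 packs to 0xB0 and is allowed.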

I enabled configASSERT, but I don’t catch the trap.

Please post your configASSERT() implementation.

Regards.

vicui wrote on Monday, August 19, 2013:

Understood.
I only defined configASSERT as printf("%s %d \n", __FILE__, __LINE__) in FreeRTOSConfig.h

vicui wrote on Monday, August 19, 2013:

I set NVIC_PriorityGroupConfig( NVIC_PriorityGroup_4); before the system runs, but the system still halts after running for a while.

vincent

davedoors wrote on Monday, August 19, 2013:

I only defined configASSERT as printf("%s %d \n", __FILE__, __LINE__) in FreeRTOSConfig.h

And are you sure printf() is working? Besides which, that particular test is done in an interrupt, so printf() is probably not going to work as you expect and may even overflow the interrupt stack.

Normally configASSERT() will not return because if it is triggered there is an error that needs attention. If you add a null loop to your definition (for(;;);) then you will know when it is called because your code will stop in the loop.
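
A minimal sketch of such a definition for FreeRTOSConfig.h, assuming a debugger is attached so the halted code can be inspected. It masks interrupts and then spins, so it does not rely on printf() working from interrupt context.

```c
/* In FreeRTOSConfig.h: stop in a null loop when an assertion fails. */
#define configASSERT( x ) if( ( x ) == 0 ) { taskDISABLE_INTERRUPTS(); for( ;; ); }
```

When the application stops responding, halting it in the debugger and looking at the call stack shows which configASSERT() was hit.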

vicui wrote on Monday, August 19, 2013:

I changed my NVIC code as follows:
static void RS485_NVIC_Config(void)
{
    NVIC_InitTypeDef NVIC_InitStructure;

    NVIC_PriorityGroupConfig(NVIC_PriorityGroup_2);
    NVIC_InitStructure.NVIC_IRQChannel = USART1_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 1;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);
}
static void RS485_plus_NVIC_Config(void)
{
    NVIC_InitTypeDef NVIC_InitStructure;

    NVIC_PriorityGroupConfig(NVIC_PriorityGroup_2);
    NVIC_InitStructure.NVIC_IRQChannel = USART3_IRQn;
    NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 3;
    NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
    NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
    NVIC_Init(&NVIC_InitStructure);
}
void ETH_NVIC_Config(void)
{
  NVIC_InitTypeDef   NVIC_InitStructure;

  /* 2 bit for pre-emption priority, 2 bits for subpriority */
  NVIC_PriorityGroupConfig(NVIC_PriorityGroup_2);
 
  /* Enable the Ethernet global Interrupt */
  NVIC_InitStructure.NVIC_IRQChannel = ETH_IRQn;
  NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 2;
  NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
  NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
  NVIC_Init(&NVIC_InitStructure);   
}
The system now halts at …\USER\FreeRTOS_v7.5.2\queue.c, line 560.
Could you help check this?

vincent

vicui wrote on Tuesday, August 20, 2013:

Hi:

Is there a newer port.c for MDK? I found that I have been using an old port.c all along, so I never caught the trap.

vicui wrote on Tuesday, August 20, 2013:

I tried compiling with port.c and portmacro.h from FreeRTOSV7.5.2\FreeRTOS\Source\portable\GCC\ARM_CM3, and the build fails with many errors and warnings.

vicui wrote on Tuesday, August 20, 2013:

Do I need to update to MDK v4.72a?

vicui wrote on Tuesday, August 20, 2013:

Here are the compiler errors and warnings:
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(135): warning:  #1207-D: attribute "naked" ignored
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(137): warning:  #1207-D: attribute "naked" ignored
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(142): warning:  #1207-D: attribute "naked" ignored
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(177): warning:  #191-D: type qualifier is meaningless on cast type
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(218): warning:  #1267-D: Implicit physical register R3 should be defined as a variable
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(218): error:  #1086: Operand is wrong type
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(218): error:  #114: label "pxCurrentTCBConst2" was referenced but not defined
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(232): warning:  #1267-D: Implicit physical register R0 should be defined as a variable
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(232): error:  #29: expected an expression
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(248): warning:  #191-D: type qualifier is meaningless on cast type
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(346): warning:  #1207-D: attribute "naked" ignored
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(354): error:  #18: expected a ")"
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(363): warning:  #1207-D: attribute "naked" ignored
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(369): error:  #18: expected a ")"
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(407): error:  #18: expected a ")"
…\USER\FreeRTOS_v7.5.2\portable\MDK-ARM\ARM_CM3\port.c(592): error:  #18: expected a ")"

The assembly code fails to compile.