FreeRTOS 8.2.1 mutex

rzingler wrote on Saturday, November 19, 2016:

Hi,

I’m having problems with a mutex, used to protect the access to an external Flash memory. Randomly, a task that uses the mutex (I don’t know if only one or more than one), for an unknown reason, accesses the external Flash memory without first taking the mutex, and, consequently, corrupts the memory data. I discover that by turning on a LED immediately after taking the mutex, and turning the LED off again before giving the mutex. When any task, after taking the mutex, detect the LED already ON, I show an error message, indicating the error (I made it on all tasks that access the mutex/external Flash).
I have already activated stack overflow detection, an all the tasks are OK, even after the problem occurs.
The problem sometimes appears in less than 1 hour, sometimes after almost a day.

The project characteristics are as follows:

  • FreeRTOS 8.2.1
  • KEIL uVision4 IDE
  • LPC2138 (ARM7) microcontroller
  • 768 bytes of stack in Supervisor and IRQ modes (all other modes have 8 bytes stack sizes).
  • Level 0 optimization (-O0)
  • ARM/Thumb Interworking enabled
  • Thumb-mode code generation
  • Macros taskENTER_CRITICAL() and taskEXIT_CRITICAL() are called on the application code (can they corrupt/alter the stack?).
  • Memory management with heap_1.c.

Here is the task used to take the mutex and store the data on external Flash.

static void taskStoreReadings( void *pvParameters ){	
	unsigned char		
		*ptr;		
	unsigned short	
        indice,	
		sectorNumber,
		vectorIndex;
	unsigned int
		memAddr;
	struct elecSensorData
		elecSensor;	
	struct rainWindLightData
		mecSensor;
	struct sensorLogData	
		sensorLog;	 		
		
	(void) pvParameters;	

	for(;;){
	
		//
		while( xQueueReceive(elecSensorQueue, &elecSensor, portMAX_DELAY) == pdFALSE );  		
		while( xQueueReceive(rainWindLightQueue, &mecSensor, portMAX_DELAY) == pdFALSE );	

		//
		while( xSemaphoreTake(spiSemaphore, portMAX_DELAY) == pdFALSE );
		
		sensorLog.elecSensor= elecSensor;
		sensorLog.mecLightSensor= mecSensor;
		sensorLog.rtc= rtcData;		
		
		//calcula endereco de inicio do log na flash externa	
		memAddr= (unsigned int)ptrProd * SENSOR_LOG_SIZE;
		memAddr += SENSOR_LOG_ADDR_INI;  			
	
		//calcula o indice de inicio do log dentro do array
		vectorIndex= memAddr % SPIMEM_SECTOR_SIZE;	
	
		//calcula a qual setor o endereco pertence
		sectorNumber= memAddr / SPIMEM_SECTOR_SIZE;	
        
        //
        spiMemReadSector( sectorNumber, extMemBuf );	

        //copia os dados do registro p/ o vetor 
		ptr= (unsigned char *)&sensorLog;
		for( indice= 0; indice < sizeof(sensorLog); indice++ ){
			extMemBuf[vectorIndex++]= *ptr;
			ptr++;
		}

        //
        spiMemEraseSector( sectorNumber );
	    spiMemWriteSector( sectorNumber, extMemBuf );		

		//atualiza ponteiro de fim de fila
		ptrProd++;
		if( ptrProd == MAX_LOGS_SENSORES )
			ptrProd= 0;	        
				
		//
		xSemaphoreGive( spiSemaphore );	  				
		
	}

}  

Any help would be appreciated.

Thanks.

rtel wrote on Sunday, November 20, 2016:

Thanks for providing the detailed information - I wish more posters
would do that!

The cost you posted does not show where the LED is set and cleared. Am
I correct in assuming you set it immediately after the following line:

while( xSemaphoreTake(spiSemaphore, portMAX_DELAY) == pdFALSE );

and clear it again immediately before the line:

xSemaphoreGive( spiSemaphore );

Which other tasks use the mutex? Is it more instances of
taskStoreReadings(), or different implementations. If the latter could
you show the other tasks too (cut down if necessary).

Is the mutex used from an interrupt at all?

rzingler wrote on Sunday, November 20, 2016:

Thanks for the quick response.

I’ve already removed the lines that sets/clears the microcontroller’s LED pin (it was just a test to indicate me something goes wrong). But, yes, they were placed as you presume (setting immediatley after taking the mutex and clearing immediately before giving the mutex back).

There are not more instances of taskStoreReadings().

The mutex is not used from any interrupt.

Below are the other tasks that uses the mutex.

static void taskBlinkLed( void *pvParameters ){	
    unsigned short
        seconds= 0;	
    unsigned int
        opTime;
	
	for(;;){ 			
		
        //descomentar depois de terminar o debug
		prvToggleOnBoardLED();	

        if( BIT_TEST(RUNLED_VAL_REG, RUNLED_BIT) ){
            seconds++;
            if( seconds == 3600 ){
                seconds= 0;
                while( xSemaphoreTake(spiSemaphore, portMAX_DELAY) == pdFALSE );
                           
                spiMemReadSector( 0, extMemBuf );
                
                //extrai o tempo atual do buffer lido da memoria externa
                opTime= (unsigned int)extMemBuf[ADDR_OP_TIME + 0] << 24;
                opTime |= (unsigned int)extMemBuf[ADDR_OP_TIME + 1] << 16;
                opTime |= (unsigned int)extMemBuf[ADDR_OP_TIME + 2] << 8;
                opTime |= (unsigned int)extMemBuf[ADDR_OP_TIME + 3] << 0;

                //incrementa em 60 minutos
                opTime += 60;

                //
                extMemBuf[ADDR_OP_TIME + 0]= opTime >> 24;
                extMemBuf[ADDR_OP_TIME + 1]= opTime >> 16;
                extMemBuf[ADDR_OP_TIME + 2]= opTime >> 8;
                extMemBuf[ADDR_OP_TIME + 3]= opTime >> 0;

                spiMemEraseSector( 0 );

                spiMemWriteSector( 0, extMemBuf );
                
				
                xSemaphoreGive( spiSemaphore );

            }

        }
		
		vTaskDelay( (TickType_t)500 / portTICK_PERIOD_MS );		
		
	}

}
static void taskSendLog( void *pvParameters ){
	unsigned char
		mac[6],
		index, 		
		*ptr,		
		count;        
	char
		strMac[12 + 1];		
	unsigned short  		
		regLength,
		logsToSend,
		id,
        ptrConsumidor;
	unsigned int
		regAddr;
	signed int
		lcdValue; 	
	struct httpPacket
		packet,
		*httpPacketPtr;
	struct sensorLogData	
		sensorLog;
		
		
	(void) pvParameters;

	//
	httpPacketPtr= &packet;

	//
	ds2411GetSerialNum( mac );
	regLength= 0;
	for( index= 0; index < 6; index++ )
		regLength += sprintf( strMac + regLength, "%02X", mac[index] );            	

	//
	packet.method= HTTP_POST;  	
	packet.port= 80;  	
	packet.data= httpData; 	
	
	//
	for(;;){

		//
		while( xSemaphoreTake(spiSemaphore, portMAX_DELAY) == pdFALSE );
       
        
    	index= 0;
    	do{
    		packet.url[index]= spiMemReadByte( ADDR_URL + index );
    		index++;
    	}
    	while( packet.url[index - 1] );   
				
		//verifica quantos logs irá enviar desta vez 		
		logsToSend= ptrProd >= ptrCons? ptrProd - ptrCons: (MAX_LOGS_SENSORES - ptrCons) + ptrProd;
		if( logsToSend > 5 )
			logsToSend= 5;	                  	  	

		//usa o valor do TC1 para gerar o ID da requisicao
		id= T1TC % 1000;				

		//		        
        regLength= sprintf( httpData, "{\"method\":\"gravaLogWN\",\"params\":[\"%s\",{\"ident\":\"%s\",\"opTime\":%u,\"reg\":[", chaveLogin, strMac, spiMemRead32Bits(ADDR_OP_TIME) / 60 );

        //		
        ptrConsumidor= ptrCons;   

		//loop p/ montar o pacote com os logs
		for( count= 1; count <= logsToSend; count++ ){           
                
			//le o registro da Flash externa
			ptr= (unsigned char *)&sensorLog;
			regAddr= (unsigned int) SENSOR_LOG_ADDR_INI + (ptrConsumidor * SENSOR_LOG_SIZE);
			for( index= 0; index < sizeof(sensorLog); index++ ){
				*ptr= spiMemReadByte( regAddr );                
				regAddr++;
				ptr++;
			}            
			
			ptrConsumidor++;
			if( ptrConsumidor == MAX_LOGS_SENSORES )
				ptrConsumidor= 0;	
	
			
			regLength += sprintf( httpData + regLength, "{\"dthr\":\"%u-%02u-%02u %02u:%02u:%02u\",", 
			sensorLog.rtc.year + 2000, sensorLog.rtc.month, sensorLog.rtc.date, sensorLog.rtc.hour, sensorLog.rtc.minute, sensorLog.rtc.second );
					
			
			lcdValue= sensorLog.elecSensor.humidity * 10;					
			regLength += sprintf( httpData + regLength, "\"hum\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
			
			lcdValue= sensorLog.elecSensor.pressure * 10;						
			regLength += sprintf( httpData + regLength, "\"press\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
			
			lcdValue= sensorLog.elecSensor.radiation * 10;					
			regLength += sprintf( httpData + regLength, "\"rad\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
			
			lcdValue= sensorLog.elecSensor.temp * 10;	
			if( lcdValue < 0 ){
				lcdValue *= -1;
				regLength += sprintf( httpData + regLength, "\"temp\":-%u.%u,", lcdValue / 10, lcdValue % 10 );		
			}
			else
				regLength += sprintf( httpData + regLength, "\"temp\":%u.%u,", lcdValue / 10, lcdValue % 10 );		
			
			
			lcdValue= sensorLog.elecSensor.co2Level * 10;		
			regLength += sprintf( httpData + regLength, "\"CO2\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
			
			lcdValue= sensorLog.elecSensor.tvocLevel * 10;	
			regLength += sprintf( httpData + regLength, "\"TVOC\":%u.%u,", lcdValue / 10, lcdValue % 10 );	
	
			
			lcdValue= sensorLog.elecSensor.batVoltage * 10;	
			regLength += sprintf( httpData + regLength, "\"bat\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
			
				
			lcdValue= sensorLog.mecLightSensor.rainFall * 10;				
			regLength += sprintf( httpData + regLength, "\"rainFall\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
					
			regLength += sprintf( httpData + regLength, "\"lightDist\":%u,", sensorLog.mecLightSensor.lightDist );
			
			
			regLength += sprintf( httpData + regLength, "\"lightCount\":%u,", sensorLog.mecLightSensor.lightCount );
			
			
			lcdValue= sensorLog.mecLightSensor.windAvgSpeed * 10;				
			regLength += sprintf( httpData + regLength, "\"wAvgSpeed\":%u.%u,", lcdValue / 10, lcdValue % 10 );
			
			
			lcdValue= sensorLog.mecLightSensor.windPeakSpeed * 10;				
			regLength += sprintf( httpData + regLength, "\"wPeakSpeed\":%u.%u,", lcdValue / 10, lcdValue % 10 );		
			
			
			regLength += sprintf( httpData + regLength, "%s", "\"wDir\":[" );			
			for( index= 0; index < 8; index++ ){
				if( index != 7 )
					regLength += sprintf( httpData + regLength, "%u,", sensorLog.mecLightSensor.windDirectionCount[index] );
				else
					regLength += sprintf( httpData + regLength, "%u", sensorLog.mecLightSensor.windDirectionCount[index] );
			}
			regLength += sprintf( httpData + regLength, "%s", "]}" );
			
			if( count != logsToSend )
				regLength += sprintf( httpData + regLength, "%s", "," );
			else
				regLength += sprintf( httpData + regLength, "]}],\"id\":%u}\n", id );			 	
			
		}         
    
		//devolve o mutex de acesso à Flash externa
		xSemaphoreGive( spiSemaphore );		
		
		//somente executa se existe(m) log(s) pendente(s)
		if( logsToSend ){ 	    
            
            while( xQueueReceive(serverRxQueue, &ptr, 0) == pdTRUE );

            //fica em loop para ter certeza de que o item foi postado na fila com sucesso.
            //O processo que gerencia o modem irá processar este item, pois fica "escutando" a fila.
            while( xQueueSend(sendHttpQueue, &httpPacketPtr, portMAX_DELAY) != pdTRUE );
            
            //salva numa variavel global o indice de consumo.
            //Este indice será usado na funcao que processa a resposta. A variavel global "ptrCons" irá assumir o valor de "logConsumeIndex" caso resposta com sucesso.
            /*
            rzingler: 
                - talvez passar a variavel local "ptrConsumidor" como parametro para a funcao httpProcessServerResponse().
                - Evitaria a necessidade de usar uma variavel global p/ manter o valor.
            */
            logConsumeIndex= ptrConsumidor; 

            //funcao que processa a resposta do servidor
            httpProcessServerResponse();        

        }

        //intervalo entre uma e outra verificacao por logs pendentes
        vTaskDelay( 10000 / portTICK_PERIOD_MS ); 		        
		 	
	}

}

The last two tasks are defined in a different file, so the mutex created in main.c is passed as a parameter, as below (in the third line inside the function, the local variable spiSemaphore points to the same place as the mutex created in main.c):

void httpRequestInit( QueueHandle_t *serverResponseQueue, QueueHandle_t *lcdQueue, SemaphoreHandle_t *spiSem ){	 	
	
	ptrModemConf= &modemConf;

	serverRxQueue= *serverResponseQueue;	

	spiSemaphore= *spiSem;    
	
	//obtem a fila de envio criada pelo arquivo do modem
	sendHttpQueue= modemGetSendDataQueue();	  				
	
	//
	ptrMsg= &msg;
	lcdMsgQueue= *lcdQueue;

	//	
    xTaskCreate( taskLogin, "login", configMINIMAL_STACK_SIZE * 2, NULL, 1, &loginHandle );

}

davedoors wrote on Sunday, November 20, 2016:

A lot of data is being written into httpData. Could it overflow?

Which TCP/IP stack are you using? Does it try and free memory (noting you use heap_1 which cant free memory).

Maybe the mutex is getting corrupted. Try adding a new global variable:

TaskHandle_t Holder = NULL;

After every call to take the mutex (xSemaphoreTake) add the following:

// No other task should have the mutex
configASSERT(Holder==NULL);

// Set Holder to show a task has the mutex
Holder = xTaskGetCurrentTaskHandle();

// Check the mutex agrees who the holder is
configASSERT(Holder==xSemaphoreGetMutexHolder(spiSemaphore));

// The mutex should be empty
configASSERT(uxSemaphoreGetCount(spiSemaphore)==0);

// Should not be able to take the mutex again (unless its a recursive mutex)
configASSERT(xSemaphoreTake(spiSemaphore, 0)==pdFAIL);

Clear Holder again immediately before giving the mutex back:

Holder = NULL;

Do any of the configASSERT() calls fail? If so, look at the spiSemaphore structure in the debugger. Does it look sane? You might have to cast spiSemaphore to:

((xQUEUE*)spiSemaphore)

to view it.

rzingler wrote on Saturday, November 26, 2016:

Hi, Dave.

I included the configASSERT macro calls as you suggested, and they haven’t failed when the problem appears. Could it be perhaps a stack overflow in any of the ARM7 modes? The task stacks are not overflowing (I’ve the overflow detection enabled).

Thanks.

davedoors wrote on Saturday, November 26, 2016:

If the asserts dont fail then its probably not the mutex and something else is happening. It could be one of the other stacks, IRQ or SVC, its easy enough to increase those to see if the behavior changes.

rzingler wrote on Monday, December 12, 2016:

Hi, Dave.

It´s working now. The problem was generated by some macros used to set/clear variables and registers bits. These macros did read-modify-write operations, and sometimes, due to the multi-task environment, a call was made after the current call has ending its work, and so corrupted some GPIO´s (perhaps the pin used to control the CS signal of the external memory). This behavior deceived me also in the case of the debug LED I mentioned.

Thanks for the help.