Cellular Library - Socket Receive Prefix Parsing function treats URC as a valid response

Hello All,
I am back with yet another parsing question for custom PLS63 modem implementation of the Cellular Library.

I started running into some race conditions after my latest fixes on the communication interface.

While receiving data if the URC indicating socket is ready for a read comes just before a read operation were to be performed but before URC is parsed then the URC is getting treated as a response by the prefix parsing function. I am not able to wait until the data ready callback is called because mbedTLS and coreMQTT-Agent libraries poll the read function.

Here is a capture that shows the issue.

I don’t have cellular debug logs because enabling logs masks this issue because of extra timing spent on the string manipulation functions and when data is copied to the logging thread.

The format of the URC is:
“\r\n^SISR: 0,1\r\n” where 0 is the socket index and 1 indicate there are 1 or more chars to read.
The format of the read response is:
“\r\n^SISR: 0,n\r\n\r\n” where n is between 0 and requested size.
So the only difference between the two is the fact that second line has no data for the URC.
1 is a valid number that can be received while reading data so I am not able to use that to differentiate between a URC and a response. I wish the modem used any non numeric character to indicate the URC :slight_smile:

My prefix parsing function is the following:

static CellularPktStatus_t _Cellular_SocketRecvDataPrefix( void * pCallbackContext,
                                                 char * pLine,
                                                 uint32_t lineLength,
                                                 char ** ppDataStart,
                                                 uint32_t * pDataLength )
{
    (void) pCallbackContext;
    *pDataLength = 0;
    *ppDataStart = NULL;
    if( ( pLine == NULL ) || 
        ( ppDataStart == NULL ) || 
        ( pDataLength == NULL ) ) {
        LogDebug(("_Cellular_SocketRecvDataPrefix: bad parameter"));
        return CELLULAR_PKT_STATUS_BAD_PARAM;
    }
    
    const size_t kMaxResponseLength = 16; // strlen("^SISR:10,15000\r\n") needs to be longer than strlen("^SISR:0,1\r\n5\r\n")
    char localLine[kMaxResponseLength + 1];
    // Cellular Library Parsing Functions call Validate String function 
    // which checks if the string is shorter than CELLULAR_AT_MAX_STRING_SIZE.
    // The data response can be much longer than this because of MTU size. 
    // Copy the prefix part of the response and perform parsing on that.
    memset(localLine, 0, kMaxResponseLength + 1);
    strncpy(localLine, pLine, kMaxResponseLength);

    char * pCursor = localLine; 
    bool bHasPrefix = false;
    CellularATError_t atCoreStatus = Cellular_ATStrStartWith(pCursor,"^SISR:",&bHasPrefix);
    if( atCoreStatus != CELLULAR_AT_SUCCESS ){
        LogError(("_Cellular_SocketRecvDataPrefix: Cellular_ATStrStartWith error %d", atCoreStatus));
        return  _Cellular_TranslateAtCoreStatus( atCoreStatus );
    } else if ( bHasPrefix != true) {
        return CELLULAR_PKT_STATUS_OK; // Haven't received enough characters or received URC
    }


    // "^SISR: <socketIndex>,<dataLength>\r\n<data>..." -> "<socketIndex>,<dataLength>\r\n<data>..." 
    atCoreStatus = Cellular_ATRemovePrefix( &pCursor );
    if( atCoreStatus != CELLULAR_AT_SUCCESS ) {
        return  _Cellular_TranslateAtCoreStatus( atCoreStatus );
    } else if( *pCursor  == '\0'){
        return CELLULAR_PKT_STATUS_SIZE_MISMATCH;
    }

    // "<socketIndex>,<dataLength>\r\n<data>..." -> "<dataLength>\r\n<data>..." 
    char * pSocketIndexToken; // Unused token, skip
    atCoreStatus = Cellular_ATGetNextTok( &pCursor, &pSocketIndexToken );
    if( atCoreStatus != CELLULAR_AT_SUCCESS ) {
        LogError(("_Cellular_SocketRecvDataPrefix: Cellular_ATGetNextTok error %d", atCoreStatus));
        return  _Cellular_TranslateAtCoreStatus( atCoreStatus );
    } else if((pSocketIndexToken == NULL) ||
              ( *pCursor  == '\0') ){
        return CELLULAR_PKT_STATUS_SIZE_MISMATCH;
    }

    // "<dataLength>\r\n<data>..." -> "\r\n<data>..."
    char * pSocketDataLengthToken;
    const char * pDelimiter = "\r";
    atCoreStatus = Cellular_ATGetSpecificNextTok( &pCursor, pDelimiter, &pSocketDataLengthToken );
    if( atCoreStatus != CELLULAR_AT_SUCCESS ) {
        LogError(("_Cellular_SocketRecvDataPrefix: Cellular_ATGetSpecificNextTok error %d", atCoreStatus));
        return  _Cellular_TranslateAtCoreStatus( atCoreStatus );
    } else if((pSocketDataLengthToken == NULL) || // \r is missing, no token found
       ( *pCursor != '\n')){ // \n is missing, data has not started yet
        return CELLULAR_PKT_STATUS_SIZE_MISMATCH;
    }

    // Convert data length from string to integer
    int32_t socketDataLength = 0; 
    atCoreStatus = Cellular_ATStrtoi( pSocketDataLengthToken, 10, &socketDataLength);
    if( atCoreStatus != CELLULAR_AT_SUCCESS ) {
        LogError(("_Cellular_SocketRecvDataPrefix: failed to convert data length token to integer"));
        return  _Cellular_TranslateAtCoreStatus( atCoreStatus );
    } else if( ( socketDataLength > ( int32_t ) CELLULAR_MAX_RECV_DATA_LEN) ) {
        LogError(("_Cellular_SocketRecvDataPrefix: receive data length is larger than MTU"));
        return CELLULAR_PKT_STATUS_BAD_RESPONSE;
    } else if( socketDataLength < 0 ){  // Negative response means socket has closed
        LogDebug(("_Cellular_SocketRecvDataPrefix: socket has closed"));
        return CELLULAR_PKT_STATUS_SEND_ERROR;
    }
    pCursor++; // Skip \n, points to start of data
    *pDataLength = (uint32_t) socketDataLength;
    // Find the prefix length and jump forward
    *ppDataStart = &pLine[pCursor-localLine]; 

    return CELLULAR_PKT_STATUS_OK;

However, this can’t differentiate between the URC and the response so I tried to add the following snippet after the cursor increment at the end of the function:

    bool possibleURC = socketDataLength == 1;
    bool hasNoData = *(pCursor) != '\0';
    bool dataLineFinished = (*(pCursor+1) != '\r') && (*(pCursor+2) != '\n');
    if(possibleURC && ( hasNoData || !dataLineFinished)) {
        return CELLULAR_PKT_STATUS_OK;
    }

My goal was to detect if there is no data.
However, we can receive another URC just after the first so I also added a check for the next two characters being \r\n which would indicate there were data and this was a proper read response.
Currently this fix is not working and I am debugging it.

I also wanted to post here to get some ideas.

Thank you,
Tuna

Hi Tuna,

Thank you for your information.
I will look into this problem and reply in this thread later.

Thanks Ching-Hsin,

What Telit recommends is waiting 100 ms between commands in section 1.6 of the AT command manual

However, since a URC can come in at any time this would require a countdown to restart whenever a URC would be parsed. Does the active thread responsible for sending the command or is the thread spawned by CellularInterface sending the command?

Thanks again for looking into this.

Tuna

EDIT: I think I found where the problem is.
getMsgType marks the URC line as solicited because it matches the currently enqueued message.

This is essentially a race condition caused by the previous write to the socket making modem receive new data and the application not waiting any time before a read after a write.

I added a 10 ms delay in Socket Wrapper and it seems to prevent the race condition but this is definitely a missed edge case in the cellular library.

Hi Tuna,

It is difficult to differentiate URC and AT command response when they are of the same prefix. From you information, there are two problems with the cellular interface:

  1. The URC is of the same format with data receive prefix.

We can use the following AT command sequence to discuss the first problem:

AT^SISR=0,5
^SISR: 0,1             // URC is of the same format as AT^SISR response
^SISW: 0,1             // Another URC
^SISR: 0,5             // The real "AT^SISR=0,5" response
"5 bytes raw data"
OK

This sequence may confuse the AT command response parsing since the URC and AT command response has exactly the same format.

We can consider the following scenario:
“^SISR: 0,1” is real AT command response.

AT^SISR=0,5\n
^SISR: 0, <1 or 2>
"1 or 2 bytes raw data"
OK

In that case, we can expect 1 or 2 bytes of raw data received and later success token “OK” is received. If not, this is an URC response.

The data received function can be udpated with the following logic:

static CellularPktStatus_t _Cellular_SocketRecvDataPrefix( ... )
{
    ...

    /* Check if the input line contains valid format "^SISR: <srvProfileId>, <cnfReadLength>. */
    if( prvCheckPacketFormat( pLine ) == pdTRUE )
    {
        /* If the format is correct and the cnfReadLength is 1 or 2 which may be
         * the URC response, check the raw data and succes token is received or not. */
        if( ( cnfReadLength == 1 ) || ( cnfReadLength == 2 ) )
        {
            if( prvCheckCompletePacketReceived( pLine ) == pdTRUE )
            {
                /* This is AT command response. Handle the AT command response here. */
            }
            else
            {
                /* This is a URC. Return to cellular interface for further processing. */
            }
        }
        else
        {
            /* urcCauseId can only be 1 or 2. This is AT command response. */
        }
    }
    ...
}
  1. The URC is regarded as AT command response and URC callback is not called.

We can use the following AT command sequence to discuss this problem:

AT^SISR=0,5
^SISR: 0,1             // This line is still regarded as AT command response and not handled by URC callback
^SISR: 0,5   
"5 bytes raw data"
OK

The AT command is expecting response with prefix “^SISR” which is exactly the same as URC prefix. Just like you point out in the code snippet. It happens with the following conditions :

  • Cellular interface is waiting for AT command response
  • The response has the same format as URC

In inevitably, the URC will be regarded as AT command response in the _getMsgType.

In this case, the following response linked list will be returned:

“^SISR: 0,1” → “^SISR: 0,5” → “5 bytes Raw data”

We can handle the URC in the receive callback directly since the callback is called in the same thread in which URC callback is called. The commit for network registration can be referenced.

The AT command is sent in active thread instead of the cellular interface thread. It is possible that URC will be regarded as AT command response if they have the same format even when the AT command is sent in cellular interface thread. Cellular interface relies on port implementation to differentiate between them.