Cellular Interface - Receiving unexpected/unhandled response after a command response makes next command fail

Hello Everyone,

I am porting the cellular interface to the Thales Cinterion PLS63 modem and have been consistently observing an issue in the following case:

  1. Device is in airplane mode.
  2. The AT+CFUN=1 command is sent.
  3. The modem responds with “\r\nOK\r\n\r\n^SYSSTART\r\n”.
  4. The OK response is handled in the loop inside the _handleAllReceived function in cellular_pktio.c.
    NOTE: pContext->pktRespQueue gets a message about the response value here.
  5. The line containing ^SYSSTART is handled next, but pContext->PktioAtCmdType still holds the old value because the caller thread hasn’t run yet to receive the pktRespQueue message and clean up PktioAtCmdType, the prefixes, the callbacks, etc.
    Because of this, if ^SYSSTART is not found in the URC tables (CellularUrcTokenWoPrefixTable or CellularUrcHandlerTable), _getMsgType marks it as AT_SOLICITED. Since it doesn’t match any of the expected tokens, _processIntermediateResponse falls through to its default case and saves the line.
    NOTE: I am not sure why the default case here saves the string to pResp. I think that case should log and break instead of trying to save data.
  6. pResp now contains a non-null line and pktStatus indicates a pending buffer.
  7. A command with a single-line response (AT+COPS? etc.) is called while the first line of pResp still contains the stale data. The response fails to parse properly and the command times out after the maximum wait time.
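
The classification step in the sequence above can be reduced to a small sketch. This is not the library's _getMsgType; the types, the token table, and the function names below are hypothetical stand-ins, and the only assumption carried over from the steps above is that a line which is not a known URC is treated as solicited whenever a command type is still set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the library's message and command types. */
typedef enum { MSG_URC, MSG_SOLICITED, MSG_UNDEFINED } MsgType;
typedef enum { CMD_NONE, CMD_NO_RESULT } CmdType;

/* Hypothetical list of URC tokens that the port has registered. */
static const char * const knownUrcTokens[] = { "RDY", "NORMAL POWER DOWN" };

static bool isKnownUrc( const char * pLine )
{
    size_t i;

    for( i = 0; i < sizeof( knownUrcTokens ) / sizeof( knownUrcTokens[ 0 ] ); i++ )
    {
        if( strcmp( pLine, knownUrcTokens[ i ] ) == 0 )
        {
            return true;
        }
    }

    return false;
}

/* Simplified classifier: a line that is not a known URC counts as solicited
 * for as long as a command type is still recorded as pending. */
MsgType classifyLine( const char * pLine, CmdType pendingCmd )
{
    if( isKnownUrc( pLine ) )
    {
        return MSG_URC;
    }

    if( pendingCmd != CMD_NONE )
    {
        return MSG_SOLICITED;
    }

    return MSG_UNDEFINED;
}
```

With a stale command type, classifyLine( "^SYSSTART", CMD_NO_RESULT ) yields MSG_SOLICITED, which is the path that ends with the line being saved. Once the type has been cleared, the same line yields MSG_UNDEFINED.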

I hope this explanation is clear enough.
It took me a couple of hours of debugging to understand what is going on.

Basically there are two problems in my opinion.

  1. PktioAtCmdType has a data race because it is accessed by both the receiver thread and the caller thread. Even when pResp has been queued to be sent back to the caller thread, the cmd type is not necessarily cleared, since the receiver thread might be running at a higher priority. (I didn’t see any note on priorities, but please direct me if I missed anything.)
    I believe the receiver thread should clear the cmd type variable on the context as soon as the error or OK response is received.
  2. _processIntermediateResponse saves AT data even when the cmd type calls for a single line or no command. I believe this can result in erroneous saves. I know there are cases where URCs span multiple lines and need to be processed when they end, but I am not sure how the library handles this case yet (I haven’t tested any SMS commands that work like this).
    But I don’t understand why a line that doesn’t match anything gets saved at all instead of being dropped.

I definitely understand that having synchronous and asynchronous responses makes things more complicated. However, the library seems to make a lot of incomplete or incorrect assumptions when determining the nature of the responses and how to handle them. Is there documentation, such as a sequence diagram, that goes over the implementation details of how the receive logic is handled? Debugging gives information, but more cohesive documentation would be a valuable asset.

Thank you,
Tuna Bicim

Hi Tuna,

I hope I understand your problem correctly; if not, please point it out.
The following is the sequence between the modem and the cellular interface:

> : Cellular interface to modem
< : Modem to cellular interface

> AT+CFUN=1
<
< OK
<
< ^SYSSTART
<
  1. “AT+CFUN=1” in Cellular_CommonRfOn sets the atCmdType to CELLULAR_AT_NO_RESULT. This means the only expected response is a token in either CellularSrcTokenErrorTable or CellularSrcTokenSuccessTable.

  2. Later, “^SYSSTART” is received by the pktio thread, which fails to recognize it as a URC. As you mentioned, pContext->PktioAtCmdType still holds the old value, so the pktio thread appends the line to pResp, which causes the errors that follow.

The problem is “^SYSSTART”. This input line doesn’t contain “:”, so it should be treated as a URC token without prefix, listed in CellularUrcTokenWoPrefixTable. That way, it doesn’t matter what value pContext->PktioAtCmdType holds.

The urcParseToken function splits the input line into the following tokens. “SYSSTART” is then used to search for the URC handler.

"^SYSSTART" = "^" + ( pTokenPtr = "SYSSTART" ) + ( pSavePtr = NULL )

We can simply add an entry to CellularUrcHandlerTable:

CellularAtParseTokenMap_t CellularUrcHandlerTable[] =
{
    ...
    { "SYSSTART", Cellular_ProcessSYSSTART },
    ...
};

A better solution is to add the “^” back in urcParseToken if pSavePtr is NULL.

"^SYSSTART" = ( pTokenPtr = "^SYSSTART" ) + ( pSavePtr = NULL )

Then add an entry to CellularUrcHandlerTable like this:

CellularAtParseTokenMap_t CellularUrcHandlerTable[] =
{
    ...
    { "^SYSSTART", Cellular_ProcessSYSSTART },
    ...
};
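
As a rough illustration of the second approach, the sketch below splits a line at “:” and keeps the leading character on the token when there is no payload. It is not the library's urcParseToken: splitUrcLine and its parameters are hypothetical names, and it uses strchr on a writable buffer as a simplifying assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the library's urcParseToken.
 * Splits a writable input line in place:
 *   "+CREG: 1,5" -> token "CREG",      payload " 1,5"
 *   "^SYSSTART"  -> token "^SYSSTART", payload NULL
 * The leading character is dropped only when a ':' separates a prefix
 * from a payload; otherwise the whole line is kept as the token. */
void splitUrcLine( char * pLine, char ** ppToken, char ** ppPayload )
{
    char * pColon = strchr( pLine, ':' );

    if( pColon != NULL )
    {
        /* Terminate the prefix and point the payload just past the ':'. */
        *pColon = '\0';
        *ppToken = ( pColon == pLine ) ? pLine : ( pLine + 1 );
        *ppPayload = pColon + 1;
    }
    else
    {
        /* No payload: keep the leading character so that "^SYSSTART"
         * can be matched against a table entry spelled the same way. */
        *ppToken = pLine;
        *ppPayload = NULL;
    }
}
```

Keeping the leading character means table entries for prefix-less URCs are written exactly as the modem sends them, so a “^” token and a “+” token with the same name cannot collide.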

You also pointed out the corner case where an unrecognized string is received after the success or error token of the current command. The library should log a warning and drop the line. In this case, “^SYSSTART” is neither expected by the AT command nor recognized as a URC, so the cellular interface should drop it and log a warning.

We would like to update the code with the following changes:

  1. Handle URCs without a prefix that start with “+” or a modem-specific prefix character.
  2. Drop unknown messages received after the success or error token, and log a warning.

This is in addition to the changes we discussed in the other thread. The porting document is also something we should improve.

Indeed, the “^SYSSTART” URC token was not considered in our initial design. We would like to work with the community to support various cellular modems. Once we update the code, we will reply in this thread. Thank you for your valuable feedback.

Hello Ching-Hsin,

Thanks for the detailed breakdown.
I think we are on the same page.
Obviously I can add ^SYSSTART as a URC handler. That is one way to solve this issue, but my concern was about receiving gibberish that doesn’t match anything instead of a proper response.

I mostly talked about PktioAtCmdType because it is used to determine the message type in _getMsgType, which resulted in the line being saved by _processIntermediateResponse. Number two in your list of code changes would be a really good change for sure. That is the part I was most worried about, because it makes the next response time out, and some timeouts are quite long for this modem (up to minutes).

Thanks again for looking into this issue.
If you have anything else that I need to clarify I would be happy to do so.

Thank you,
Tuna

Hi Tuna,

Thank you for your feedback. I will take this issue back for further discussion.
I will also post updates in this thread so we can continue the discussion.
I hope we can resolve this concern together. Again, thank you for pointing it out.

Hello Ching-Hsin,

I noticed that a pull request for unhandled AT responses was merged. Does it cover the fixes we have been discussing, or will there be more PRs to address everything? I have the same question about custom prefix characters.

Thank you,
Tuna

Hi Tuna,

This PR fixes the second problem.

  1. Update the code to drop and warn unknown message after receiving success or error token.

The first problem mentioned in this thread is a feature request. It will be updated in another PR.

I would like to take the chance to explain this PR and the undefined message type. The AT response type is decided by the _getMsgType function.

An AT response is considered AT_UNDEFINED only when all of the following hold:

  • It doesn’t have a prefix.
    An input line with a prefix, such as the following, is never regarded as an AT_UNDEFINED message.
    +<prefix>:<payload>
  • It is not declared in the URC-without-prefix table.
    urcTokenWoPrefix is the logic that checks for URCs without a prefix.
  • It is not expected in the AT command response.
    _processIntermediateResponse is the logic that checks the expected input lines for the current AT command.

The default behavior of the cellular interface when it receives an AT_UNDEFINED message is to drop the message and clear the receive context.

A cellular module port can also register its own handler for AT_UNDEFINED messages with the _Cellular_RegisterUndefinedRespCallback API. If the undefined response handler returns CELLULAR_PKT_STATUS_OK, the cellular interface keeps processing without clearing the receive context.
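
The dispatch rule described above can be sketched as follows. To stay self-contained, the sketch does not use the library headers: PktStatus, UndefinedRespHandler, RxState, and both functions are hypothetical stand-ins, so the real callback signature should be taken from the cellular interface headers rather than from this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the library's status and callback types. */
typedef enum { PKT_STATUS_OK, PKT_STATUS_FAILURE } PktStatus;
typedef PktStatus ( * UndefinedRespHandler )( void * pHandlerCtx, const char * pLine );

typedef struct
{
    UndefinedRespHandler handler;   /* NULL means no handler is registered. */
    void * pHandlerCtx;
    bool receiveContextCleared;     /* Set when the default action was taken. */
} RxState;

/* Example port-side handler: accepts "^SYSSTART" and counts it,
 * and rejects every other undefined line. */
PktStatus handleSysStart( void * pHandlerCtx, const char * pLine )
{
    unsigned int * pCount = ( unsigned int * ) pHandlerCtx;

    if( strcmp( pLine, "^SYSSTART" ) == 0 )
    {
        if( pCount != NULL )
        {
            ( *pCount )++;
        }

        return PKT_STATUS_OK;
    }

    return PKT_STATUS_FAILURE;
}

/* Dispatch rule: without a handler, or when the handler does not return OK,
 * the line is dropped and the receive context is cleared. */
void processUndefinedLine( RxState * pState, const char * pLine )
{
    PktStatus status = PKT_STATUS_FAILURE;

    if( pState->handler != NULL )
    {
        status = pState->handler( pState->pHandlerCtx, pLine );
    }

    if( status != PKT_STATUS_OK )
    {
        pState->receiveContextCleared = true;
    }
}
```

A port that registers a handler like handleSysStart keeps the receive context intact for “^SYSSTART”, while any other undefined line still takes the default drop-and-clear path.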

For the problem mentioned in this thread, the comments in the following sequence describe the change.

> AT+CFUN=1
<
< OK // The success token is received. The AT command type is cleared in the receive thread. The following lines won't be appended to the response.
<
< ^SYSSTART // This line will be regarded as AT_UNDEFINED. It can be handled by the undefined response handler if one is registered.

The atCmdType is now cleared in the pktio thread. The pktRequestMutex ensures that only one command can be sent to the cellular modem at a time, so we don’t have to worry about the response to the next AT command. If the current AT command times out, there may be a problem with the cellular modem; in that case, resetting the cellular modem is recommended.
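
The effect of clearing the command type in the receive thread can be shown with a small single-threaded model. This is only a sketch under simplifying assumptions: Receiver, receiveLine, and the clearOnFinalToken switch are hypothetical, “OK” and “ERROR” stand in for the success and error token tables, and the real code runs across two threads.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the pending AT command type. */
typedef enum { CMD_NONE, CMD_NO_RESULT } CmdType;

typedef struct
{
    CmdType pendingCmd;       /* Command type the receiver believes is in flight. */
    bool clearOnFinalToken;   /* true models the fixed behavior. */
    int completedCommands;    /* Final tokens seen for a pending command. */
    int appendedLines;        /* Lines saved to the response. */
    int droppedLines;         /* Lines dropped as undefined. */
} Receiver;

/* Handles one received line in the receiver's own context. */
void receiveLine( Receiver * pRx, const char * pLine )
{
    bool isFinalToken = ( strcmp( pLine, "OK" ) == 0 ) ||
                        ( strcmp( pLine, "ERROR" ) == 0 );

    if( ( pRx->pendingCmd != CMD_NONE ) && isFinalToken )
    {
        pRx->completedCommands++;

        if( pRx->clearOnFinalToken )
        {
            /* The receiver clears the type itself instead of waiting
             * for the caller thread to do it. */
            pRx->pendingCmd = CMD_NONE;
        }
    }
    else if( pRx->pendingCmd != CMD_NONE )
    {
        /* A stale command type makes the line look like response data. */
        pRx->appendedLines++;
    }
    else
    {
        pRx->droppedLines++;
    }
}
```

Feeding “OK” followed by “^SYSSTART” into a receiver with clearOnFinalToken set to false leaves one appended line, which is the stale data that broke the next command. With the flag set to true, the same input leaves no appended lines and one dropped line.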