FreeRTOS+TCP does not reply to ping after unplugging/re-plugging the Ethernet cable while sending data

@michaelv

Can you put a breakpoint on this line and check whether reads and writes to the PHY still succeed after the ETH cable is disconnected, by checking the return value of HAL_ETH_ReadPHYRegister?

Where is this log: xSTM32F_NetworkInterfaceInitialise returns 1 coming from?
Also, can you share your copy of the STM32 network interface file, so that the logs you have shared can be matched with the control flow?

The return value indicates success, but I had to alter xSTM32_PhyRead() and xSTM32_PhyWrite() because of an access issue with my PHY.

It comes from @htibosch’s suggestion:

Here is the log again, and I’ve attached my NetworkInterface file.

FreeRTOS_AddEndPoint: MAC: 48-5c IPv4: ac1601e9ip
FreeRTOS_NetworkDown is called
xPhyReset: phyBMCR_RESET 0 ready
+TCP: advertise: 0101 config 3100
prvEthernetUpdateConfig: LS mask 00 Force 1
xSTM32F_NetworkInterfaceInitialise returns 0
xPhyCheckLinkStatus: PHY LS now 01
prvEMACHandlerTask LS has changed
prvEthernetUpdateConfig: LS mask 01 Force 0
Network buffers: 57 lowest 57
Heap: current 3360 lowest 3360
FreeRTOS_NetworkDown is called
Link Status is high
xSTM32F_NetworkInterfaceInitialise returns 1
Heap: current 1728 lowest 1728
xPhyCheckLinkStatus: PHY LS now 00
prvEMACHandlerTask LS has changed
prvEthernetUpdateConfig: LS mask 00 Force 0
FreeRTOS_NetworkDown is called
xSTM32F_NetworkInterfaceInitialise returns 0
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called
FreeRTOS_NetworkDown is called

NetworkInterface.c (55.5 KB)

Not sure if MV88E6071_WaitNotBusy is the right way to wait for the writes to get reflected as the MV88E6071_SMI_CMD is continuously polled. Did you take a look at how the Linux kernel does this [mv88e6xxx_smi_direct_wait] for similar (mv88e6xxx) PHYs?

I can’t find the string smi_direct_wait in the file you’re referring to.
The last bit of the MV88E6071_SMI_CMD register indicates whether the SMI read/write command has completed, so polling this register is the way to detect completion (as I understand it).
While debugging MV88E6071_WaitNotBusy, I saw that the SMI operation completes by the first or second iteration of the polling loop.
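For readers following along, the polling described above could look roughly like the sketch below. It is not the code from the attached NetworkInterface.c: the names `smi_wait_not_busy`, `SmiReadCmd_t` and `fake_read_cmd` are hypothetical, and the busy flag is assumed to be bit 15 of a 16-bit command register, which should be checked against the datasheet. The retry limit keeps the loop from spinning forever if the bit never clears.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the busy flag is the top bit (bit 15) of the 16-bit SMI
 * command register. Check the datasheet for the real position. */
#define SMI_CMD_BUSY_BIT    ( ( uint16_t ) 0x8000U )

/* Hypothetical callback that reads the SMI command register. */
typedef uint16_t ( * SmiReadCmd_t )( void );

/* Poll until the busy bit clears or the retry budget is used up.
 * Returns true when the SMI unit is idle, false on timeout. */
static bool smi_wait_not_busy( SmiReadCmd_t read_cmd,
                               unsigned max_retries )
{
    for( unsigned attempt = 0U; attempt < max_retries; attempt++ )
    {
        if( ( read_cmd() & SMI_CMD_BUSY_BIT ) == 0U )
        {
            return true;
        }

        /* A real driver would typically insert a short delay here. */
    }

    return false;
}

/* Simulated register for demonstration: busy for the first N reads. */
static unsigned fake_busy_reads_left = 0U;

static uint16_t fake_read_cmd( void )
{
    if( fake_busy_reads_left > 0U )
    {
        fake_busy_reads_left--;
        return SMI_CMD_BUSY_BIT;
    }

    return 0U;
}
```

Returning a timeout status instead of looping indefinitely lets the caller report a failed PHY access, which is the kind of failure the breakpoint test earlier in the thread was looking for.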

Any other suggestions?

I can’t find the string smi_direct_wait in the file you’re referring to.

mv88e6xxx_smi_direct_wait is the function I was referring to.

Any other suggestions?

It’s tricky to debug this without a datasheet or a similar custom board. I would refer to the Linux kernel drivers to see if the PHY reads and writes are handled properly.

As I understand this function, it behaves pretty much the same as my MV88E6071_WaitNotBusy (plus it has a delay between retries).

Before unplugging the cable everything works fine, so I assume that PHY access works correctly.

As I responded here, the same STM32F4 network interface with a LAN8742A PHY is able to detect a link status change and bring the network back up once the network cable is reconnected.

Before unplugging the cable everything works fine, so I assume that PHY access works correctly.

Since you are doing the PHY access indirectly, you might have to reach out to the PHY vendor to see how the link status change can be captured.

@tony-josi-aws which function in +TCP is responsible for probing the link status once the link is gone?

There is a link-status per interface, so there is this generic function:

    BaseType_t xGetPhyLinkStatus( struct xNetworkInterface * pxInterface );

This function only looks up a flag; it must not use the PHY to poll the connection.
The network interface task is responsible for updating the link-status flag.

EDIT

Here the PHY status is monitored:

static void prvEMACHandlerTask( void * pvParameters )
{
    /* Sleep and handle events.
     * It is important that the sleep doesn't last too long,
     * because the PHY must be polled to monitor the Link Status. */
    for( ; ; )
    {
        /* xPhyCheckLinkStatus() will return pdTRUE if the Link Status
         * has changed.
         * The Link Status is stored in the boolean field
         * xPhyObject.ulLinkStatusMask */
        if( xPhyCheckLinkStatus( &xPhyObject, xResult ) != 0 )
        {
            /* Something has changed to a Link Status, need re-check. */
            prvEthernetUpdateConfig( pdFALSE );

            #if ( ipconfigSUPPORT_NETWORK_DOWN_EVENT != 0 )
            {
                if( xGetPhyLinkStatus( pxMyInterface ) == pdFALSE )
                {
                    FreeRTOS_NetworkDown( pxMyInterface );
                }
            }
            #endif /* ( ipconfigSUPPORT_NETWORK_DOWN_EVENT != 0 ) */
        }
    }
}

The IP-task calls pxInterface->pfInitialise() regularly. Once the Link Status (LS) is found high, the initialise function returns pdPASS, and the interface “goes up” again.

I’ve played around with the xPhyCheckLinkStatus function and saw that the PHY always returns the phyBMSR_LINK_STATUS bit as 0 (link is down), even after I re-plug the cable.
My workaround is to hard-code the phyBMSR_LINK_STATUS bit in this function:

for( xPhyIndex = 0; xPhyIndex < pxPhyObject->xPortCount; xPhyIndex++, ulBitMask <<= 1 )
{
    BaseType_t xPhyAddress = pxPhyObject->ucPhyIndexes[ xPhyIndex ];

    if( pxPhyObject->fnPhyRead( xPhyAddress, phyREG_01_BMSR, &ulStatus ) == 0 )
    {
        if( pxPhyObject->ulPhyIDs[ xPhyIndex ] == PHY_ID_MV88E6071 )
        {
            ulStatus |= phyBMSR_LINK_STATUS; /* Link Status workaround for MV88E6071 */
        }

        if( !!( pxPhyObject->ulLinkStatusMask & ulBitMask ) != !!( ulStatus & phyBMSR_LINK_STATUS ) )
        {
            if( ( ulStatus & phyBMSR_LINK_STATUS ) != 0 )
            {
                pxPhyObject->ulLinkStatusMask |= ulBitMask;
            }
            else
            {
                pxPhyObject->ulLinkStatusMask &= ~( ulBitMask );
            }

            FreeRTOS_printf( ( "xPhyCheckLinkStatus: PHY LS now %02X\n", ( unsigned int ) pxPhyObject->ulLinkStatusMask ) );
            xNeedCheck = pdTRUE;
        }
    }
}

With this change, the ping replies return immediately when I re-plug the cable.

Wow, well found!

So is this the only change you made?

 for( xPhyIndex = 0; xPhyIndex < pxPhyObject->xPortCount; xPhyIndex++, ulBitMask <<= 1 )
 {
     BaseType_t xPhyAddress = pxPhyObject->ucPhyIndexes[ xPhyIndex ];
 
     if( pxPhyObject->fnPhyRead( xPhyAddress, phyREG_01_BMSR, &ulStatus ) == 0 )
     {
+        if( pxPhyObject->ulPhyIDs[ xPhyIndex ] == PHY_ID_MV88E6071 )
+        {
+            ulStatus |= phyBMSR_LINK_STATUS; /* Link Status workaround for MV88E6071 */
+        }
         if( !!( pxPhyObject->ulLinkStatusMask & ulBitMask ) != !!( ulStatus & phyBMSR_LINK_STATUS ) )
         {
         ...

In that case we can turn it into a PR. Thanks for reporting back.

Maybe I’m totally wrong, but is manually setting the link status bit really a solution?

Thanks :slight_smile:
This is the only change I made to solve the link status problem. Before that I had a PHY access problem, which I solved by rewriting xSTM32_PhyRead() and xSTM32_PhyWrite().

I’ll let more experienced people than me answer this question. :slight_smile:
I’ll just add that the MV88E6071 is not a PHY like the LAN8742A; it’s a switch with 5 ports, and it’s connected to the MCU via an MII interface.

A switch usually also provides an interface (a set of status registers) to query the link status, speed, etc. of each port. It’s non-standard and a bit more complicated, but it should be possible. I did that in the past for another switch chip.
Sure, the local link to the MCU is usually always up after power-up, but I think that, for example, to properly re-initiate DHCP you need to check the link status of the external port(s) to know whether they are connected to the ‘real’ LAN.
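To make the idea of querying each port concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the port count, the register number, the link bit, and the accessor type `PortRead_t` are placeholders, not values from the 88E6071 documentation. It builds a bit-mask of ports that report link, optionally skipping the port wired to the MCU, which is expected to be up all the time.

```c
#include <stdint.h>

/* Hypothetical values for illustration only; take the real ones
 * from the switch datasheet. */
#define SWITCH_PORT_COUNT       5U
#define PORT_STATUS_REG         0x00U
#define PORT_STATUS_LINK_BIT    ( ( uint16_t ) 0x0800U )

/* Hypothetical accessor: reads one register of one switch port.
 * Returns 0 on success, non-zero on failure. */
typedef int ( * PortRead_t )( unsigned port,
                              unsigned reg,
                              uint16_t * value );

/* Build a bit-mask with one bit per port that reports link.
 * Ports whose bit is set in skip_mask (e.g. the MCU port) are ignored. */
static uint32_t switch_link_mask( PortRead_t read_port,
                                  uint32_t skip_mask )
{
    uint32_t mask = 0U;

    for( unsigned port = 0U; port < SWITCH_PORT_COUNT; port++ )
    {
        uint16_t status = 0U;
        uint32_t bit = ( uint32_t ) 1U << port;

        if( ( skip_mask & bit ) != 0U )
        {
            continue;
        }

        if( ( read_port( port, PORT_STATUS_REG, &status ) == 0 ) &&
            ( ( status & PORT_STATUS_LINK_BIT ) != 0U ) )
        {
            mask |= bit;
        }
    }

    return mask;
}

/* Simulated port status registers for demonstration. */
static uint16_t fake_port_status[ SWITCH_PORT_COUNT ];

static int fake_read_port( unsigned port,
                           unsigned reg,
                           uint16_t * value )
{
    if( ( port >= SWITCH_PORT_COUNT ) || ( reg != PORT_STATUS_REG ) )
    {
        return -1;
    }

    *value = fake_port_status[ port ];
    return 0;
}
```

A non-zero result with the MCU port skipped would mean that at least one external port is connected, which is the condition one would want before restarting DHCP.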

It is difficult to find a programming manual for Marvell 88E6071. If you know of one, please post a link.

But I see that the part supports 1 Gbit, and therefore it is not supported by phyHandling.c.

The module phyHandling can only be used for PHYs with a speed of 10/100 Mbps. It is designed to give a universal interface to all 10/100 PHYs. The detection is automatic: it finds the ID and then it knows how to treat the PHY.

1000 Mbps PHYs are much more complex and less uniform. Have a look at, for instance, the 1Gb PHY driver for Ultrascale.

Note that your application should also work if you just assume that the link is up. The default settings are usually OK to work with, also when more ports are connected.

PS how many of the 8 ports are connected on your board?

As for switches: phyHandling will check all 32 “ports” that are connected. The IP-task will treat a network interface as being “up” as soon as 1 or more “ports” have a high Link Status.
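The “up as soon as 1 or more ports has a high Link Status” rule can be illustrated with a small stand-alone sketch. The helper names `set_port_link` and `interface_is_up` are hypothetical and only mimic the role of ulLinkStatusMask, with one bit per port; this is not the phyHandling.c implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the mask with the bit for 'port' set or cleared.
 * Ports outside 0..31 leave the mask unchanged. */
static uint32_t set_port_link( uint32_t mask,
                               unsigned port,
                               bool link_up )
{
    if( port < 32U )
    {
        uint32_t bit = ( uint32_t ) 1U << port;

        if( link_up )
        {
            mask |= bit;
        }
        else
        {
            mask &= ~bit;
        }
    }

    return mask;
}

/* The interface counts as "up" when at least one port has link. */
static bool interface_is_up( uint32_t link_status_mask )
{
    return link_status_mask != 0U;
}
```

With a switch this means the interface only “goes down” when every monitored port has lost its link.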

Note that when you still attempt to send a packet, while the link status is low, things get messy: the IP-task will wait for a time-out and that is very undesirable.

I couldn’t find any info on this chip. Maybe because it’s rather old.
I do have its datasheet, but it’s under NDA.

Only the MCU is directly connected to the switch. Other ports are pluggable with RJ45 cables.

88E6071 uses indirect access to its PHY registers. This is why I had to rewrite xSTM32_PhyRead() and xSTM32_PhyWrite(). Because of this, in my implementation I had to comment out xPhyDiscover() and hard-code the pxPhyObject properties of the PHY.
According to the datasheet, the phyREG_01_BMSR register has the phyBMSR_LINK_STATUS bit (same location), but as I’ve written before, it always remains low.

I couldn’t find any info on this chip. Maybe because it’s rather old.
I do have its datasheet, but it’s under NDA.

It turns out that I worked with the PHY about 4 years ago! I still have the “Functional Specification”. And indeed I see that PHY access is “indirect”, as implemented here.

And now I also see that this PHY has a speed of 10/100 Mbps.

And it has the usual “standard” registers:

Register 0: PHY Control Register
Register 1: PHY Status Register: bit 2 = phyBMSR_LINK_STATUS 0x04
Register 2: PHY Identifier
Register 3: PHY Identifier

Were you able to read the correct identifier?

Other ports are pluggable with RJ45 cables.

That is what I wanted to know, can you test it as a switch?

in my implementation I had to comment out xPhyDiscover() and hard-code pxPhyObject properties of the PHY

You can still do something like this:

BaseType_t xSTM32_PhyRead( BaseType_t xAddress,
                           BaseType_t xRegister,
                           uint32_t * pulValue )
{
    BaseType_t xResult = 0; /* To be set by the access code below; 0 means success. */

    if( xPhyObject.ulPhyIDs[ 0 ] == PHY_ID_MV88E6071 )
    {
        // Indirect access to the basic registers
    }
    else
    {
        // Normal direct access to the basic registers
    }

    return xResult;
}

Earlier in this logging:

    prvEthernetUpdateConfig: LS mask 00 Force 1
    xSTM32F_NetworkInterfaceInitialise returns 0
    xPhyCheckLinkStatus: PHY LS now 01
    prvEMACHandlerTask LS has changed
    prvEthernetUpdateConfig: LS mask 01 Force 0

did phyBMSR_LINK_STATUS become high on its own? Or did you force it by setting ulLinkStatusMask?
PS. I am missing the logging line with PHY ID 0x......

I don’t probe for identifiers. I changed vMACBProbePhy() to support only one PHY:

void vMACBProbePhy( void )
{
    vPhyInitialise( &xPhyObject, xSTM32_PhyRead, xSTM32_PhyWrite );
    /* xPhyDiscover( &xPhyObject ) is skipped; the PHY is hard-coded below. */
    xPhyObject.ucPhyIndexes[ 0 ] = MV88E6071_ADDR;
    xPhyObject.ulPhyIDs[ 0 ] = PHY_ID_MV88E6071;
    xPhyObject.xPortCount = 1;

    xPhyConfigure( &xPhyObject, &xPHYProperties );
}

It seems to work fine.

That would be a smarter way to do it :slight_smile:

This log is from before I forced the bit.
I’ve checked again now. In xPhyCheckLinkStatus() there are two places where ulLinkStatusMask is set, depending on the xHadReception input parameter. When I first connect the eth cable, the function is called with xHadReception set to non-zero, so the function does not probe the PHY’s register.
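The two paths described above can be summarised in a simplified, stand-alone sketch. It is not the phyHandling.c source: `LinkState_t` and `check_link` are hypothetical names, only a single port is modelled, the PHY register is simulated by a struct field, and any timer that decides when the PHY is polled is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as phyBMSR_LINK_STATUS (0x04) mentioned in this thread. */
#define BMSR_LINK_STATUS    ( ( uint32_t ) 0x04U )

/* Hypothetical state for a single port. */
typedef struct
{
    uint32_t link_mask; /* Bit 0 set: link is considered up. */
    uint32_t bmsr;      /* Simulated status register contents. */
    unsigned phy_reads; /* Counts how often the "PHY" was read. */
} LinkState_t;

/* Returns true when the stored link status changed. */
static bool check_link( LinkState_t * state,
                        bool had_reception )
{
    bool changed = false;

    if( had_reception )
    {
        /* Path 1: traffic was received, so the link is assumed to be
         * up and the PHY register is not read at all. */
        if( ( state->link_mask & 1U ) == 0U )
        {
            state->link_mask |= 1U;
            changed = true;
        }
    }
    else
    {
        /* Path 2: no traffic, so the status register is consulted. */
        bool link_up = ( state->bmsr & BMSR_LINK_STATUS ) != 0U;
        bool was_up = ( state->link_mask & 1U ) != 0U;

        state->phy_reads++;

        if( link_up != was_up )
        {
            state->link_mask = link_up ? 1U : 0U;
            changed = true;
        }
    }

    return changed;
}
```

In this model, a link bit that is stuck at 0 makes the first poll without traffic report the link as down, and it then stays down because no packets arrive to trigger the first path. That matches the log above, where “PHY LS now 01” is followed by “PHY LS now 00”.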

Hello again :slight_smile:
After making the workaround for the phyBMSR_LINK_STATUS bit in xPhyCheckLinkStatus() and making it work, I remembered that my board also has an external fiber connection to the MV88E6071 switch.
And guess what? The problem of unplugging and re-plugging the fiber cable is not solved by this workaround. That is, unplugging/plugging the RJ45 cable works fine, but after doing the same thing with the fiber cable, the ping replies do not come back.
I went through the MV88E6071 datasheet and didn’t find any special status bits for the fiber link.
What drives me nuts is that the same board running Keil’s RL-TCPnet library doesn’t have this problem, so this is not a hardware issue.
What else can I check/change?
What is RL-TCPnet doing that FreeRTOS+TCP doesn’t? (It is a closed library so this is a hypothetical question)
If I don’t solve this issue I will be stuck with the old version of the software for the board (the one with RL-TCPnet, without FreeRTOS).
Thanks.
@htibosch