LwIP: Ethernet Interrupt stops working after a while

I run LwIP in socket mode on a Zynq 7000. The Ethernet speed setting is 100 Mbit/s.
(The problem below does not occur with a speed setting of Auto Detect.)

I can ping successfully from a workstation to the board. After a while, the ping response becomes “Destination Host Unreachable”.

When I debugged the problem, I found that the Ethernet interrupt is not generated anymore. The handler below (which dispatches through pxVectorTable) is only ever called for the Timer Interrupt with ID 29.

 void FreeRTOS_ApplicationIRQHandler( uint32_t ulICCIAR )
{
extern const XScuGic_Config XScuGic_ConfigTable[];
static const XScuGic_VectorTableEntry *pxVectorTable = XScuGic_ConfigTable[ XPAR_SCUGIC_SINGLE_DEVICE_ID ].HandlerTable;
uint32_t ulInterruptID;
const XScuGic_VectorTableEntry *pxVectorEntry;

	/* The ID of the interrupt is obtained by bitwise anding the ICCIAR value
	with 0x3FF. */
	ulInterruptID = ulICCIAR & 0x3FFUL;
	if( ulInterruptID < XSCUGIC_MAX_NUM_INTR_INPUTS )
	{
		/* Call the function installed in the array of installed handler functions. */
		pxVectorEntry = &( pxVectorTable[ ulInterruptID ] );
		pxVectorEntry->Handler( pxVectorEntry->CallBackRef );
	}
}
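One way to confirm which interrupt IDs still arrive is to count them per ID and inspect the counters in the debugger. The following is only a minimal sketch: `DEMO_MAX_INTR_INPUTS`, `ulIrqCount`, `ulRecordInterrupt()` and `ulGetInterruptCount()` are hypothetical names, and the bound of 95 merely stands in for `XSCUGIC_MAX_NUM_INTR_INPUTS` from the BSP.

```c
#include <stdint.h>

/* Assumption: stand-in for XSCUGIC_MAX_NUM_INTR_INPUTS from the Xilinx BSP. */
#define DEMO_MAX_INTR_INPUTS    95U

/* Hypothetical diagnostic counters, one per interrupt ID. */
static volatile uint32_t ulIrqCount[ DEMO_MAX_INTR_INPUTS ];

/* Decode the interrupt ID from an ICCIAR value (low 10 bits), count it,
and return the ID. Out-of-range IDs (such as the GIC spurious ID 1023)
are not counted and UINT32_MAX is returned instead. */
uint32_t ulRecordInterrupt( uint32_t ulICCIAR )
{
uint32_t ulInterruptID = ulICCIAR & 0x3FFUL;

	if( ulInterruptID < DEMO_MAX_INTR_INPUTS )
	{
		ulIrqCount[ ulInterruptID ]++;
		return ulInterruptID;
	}

	return UINT32_MAX;
}

/* Read back how often a given ID has been seen so far. */
uint32_t ulGetInterruptCount( uint32_t ulInterruptID )
{
	return ( ulInterruptID < DEMO_MAX_INTR_INPUTS ) ? ulIrqCount[ ulInterruptID ] : 0UL;
}
```

Calling `ulRecordInterrupt( ulICCIAR )` at the top of the handler and comparing the counter of the Ethernet ID with the counter of ID 29 over time would show exactly when the Ethernet interrupt stops.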

Any ideas?
I am grateful for any help!

Hi XuanTran,

welcome to the forum!

Do you get any interrupts at all when the system is in that state? What is the MCU's interrupt mask then? Could it be that a critical section has been entered and not released?

Thanks for your quick reply.

When the Zynq is in this state, only the Timer Interrupt with ID 29 is seen. The Ethernet interrupt, which worked well for a while (a couple of minutes or a few hours), is not generated anymore.

(There are only two interrupts used in this application: Timer Interrupt and Ethernet Interrupt )

Could you clarify this question? I don’t really get it.

So what are the PRIORITIES of those interrupts? I’m not familiar with the platform, so I don’t know what “ID 29” means or what relevance it has for the priorities.

Anyways, if your system is well configured, the priorities must be as follows:

Ethernet - highest
Sys Tick - medium
Service Handler - lowest.

(all relative with respect to one another, of course).

If the Sys Tick handler still works in this configuration while the Ethernet IRQ does not, it’s not a problem with your FreeRTOS setup but with your target logic (e.g. you do not clear the interrupt request in the ISR, you disable the IRQ at the device level, or something similar).

We’d need to see your priority setup as well as the relevant entries in your FreeRTOSConfig.h to diagnose whether the problem is FreeRTOS related or not.

The Ethernet interrupt has a higher priority than the Timer Interrupt.

The contents of FreeRTOSConfig.h are as follows.

#define configUSE_PREEMPTION 1

#define configUSE_MUTEXES 1

#define configUSE_RECURSIVE_MUTEXES 1

#define configUSE_COUNTING_SEMAPHORES 1

#define configUSE_TIMERS 1

#define configUSE_IDLE_HOOK 0

#define configUSE_TICK_HOOK 0

#define configUSE_MALLOC_FAILED_HOOK 1

#define configUSE_TRACE_FACILITY 1

#define configUSE_16_BIT_TICKS 0

#define configUSE_APPLICATION_TASK_TAG 0

#define configUSE_CO_ROUTINES 0

#define configTICK_RATE_HZ (1000)

#define configMAX_PRIORITIES (8)

#define configMAX_CO_ROUTINE_PRIORITIES 2

#define configMINIMAL_STACK_SIZE ( ( unsigned short ) 2048)

#define configTOTAL_HEAP_SIZE ( ( size_t ) ( 524288 ) )

#define configMAX_TASK_NAME_LEN 24

#define configIDLE_SHOULD_YIELD 1

#define configTIMER_TASK_PRIORITY (configMAX_PRIORITIES - 1)

#define configTIMER_QUEUE_LENGTH 10

#define configTIMER_TASK_STACK_DEPTH ((configMINIMAL_STACK_SIZE) * 2)

#define configASSERT( x ) if( ( x ) == 0 ) vApplicationAssert( __FILE__, __LINE__ )

#define configUSE_QUEUE_SETS 1

#define configCHECK_FOR_STACK_OVERFLOW 2

#define configQUEUE_REGISTRY_SIZE 10

#define configUSE_STATS_FORMATTING_FUNCTIONS 1

#define configNUM_THREAD_LOCAL_STORAGE_POINTERS 0

#define configUSE_TICKLESS_IDLE	0
#define configTASK_RETURN_ADDRESS    NULL
#define INCLUDE_vTaskPrioritySet             1
#define INCLUDE_uxTaskPriorityGet            1
#define INCLUDE_vTaskDelete                  1
#define INCLUDE_vTaskCleanUpResources        1
#define INCLUDE_vTaskSuspend                 1
#define INCLUDE_vTaskDelayUntil              1
#define INCLUDE_vTaskDelay                   1
#define INCLUDE_eTaskGetState                1
#define INCLUDE_xTimerPendFunctionCall       1
#define INCLUDE_pcTaskGetTaskName            1
#define configMAX_API_CALL_INTERRUPT_PRIORITY (18)

#define configUSE_PORT_OPTIMISED_TASK_SELECTION 1

#define configINTERRUPT_CONTROLLER_BASE_ADDRESS         ( XPAR_PS7_SCUGIC_0_DIST_BASEADDR )
#define configINTERRUPT_CONTROLLER_CPU_INTERFACE_OFFSET ( -0xf00 )
#define configUNIQUE_INTERRUPT_PRIORITIES                32
void vApplicationAssert( const char *pcFile, uint32_t ulLine );
void FreeRTOS_SetupTickInterrupt( void );
#define configSETUP_TICK_INTERRUPT() FreeRTOS_SetupTickInterrupt()

void FreeRTOS_ClearTickInterrupt( void );
#define configCLEAR_TICK_INTERRUPT()	FreeRTOS_ClearTickInterrupt()
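For reference, the two priority-related settings above can be checked with a small helper. This is a sketch under stated assumptions: on the GIC a numerically lower value means a higher priority, interrupts that call FreeRTOS "FromISR" API functions must have a numeric priority at or above configMAX_API_CALL_INTERRUPT_PRIORITY, and with 32 unique priorities the logical priority is assumed to occupy the top 5 bits of the 8-bit priority field (shift of 3). `DEMO_PRIORITY_SHIFT`, `xMayCallFromISRApi()` and `ucToGicPriorityField()` are hypothetical names.

```c
#include <stdint.h>
#include <stdbool.h>

/* Values copied from the FreeRTOSConfig.h shown above. */
#define configMAX_API_CALL_INTERRUPT_PRIORITY    ( 18 )
#define configUNIQUE_INTERRUPT_PRIORITIES        32

/* Assumption: 32 unique priorities use the top 5 bits of the 8-bit field. */
#define DEMO_PRIORITY_SHIFT                      3

/* Returns true if an ISR running at this logical priority (0 = highest)
is allowed to call interrupt-safe FreeRTOS API functions. */
bool xMayCallFromISRApi( uint32_t ulLogicalPriority )
{
	return ( ulLogicalPriority >= configMAX_API_CALL_INTERRUPT_PRIORITY ) &&
		   ( ulLogicalPriority < configUNIQUE_INTERRUPT_PRIORITIES );
}

/* Converts a logical priority into the 8-bit value written to the GIC. */
uint8_t ucToGicPriorityField( uint32_t ulLogicalPriority )
{
	return ( uint8_t ) ( ulLogicalPriority << DEMO_PRIORITY_SHIFT );
}
```

If the Ethernet ISR uses FreeRTOS API calls, its logical priority would have to pass `xMayCallFromISRApi()`, even though it is meant to be "higher" than the tick.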

The weird thing is that this problem only happens with a speed setting of 100 Mbit/s.
The speed setting only affects the Ethernet PHY chip.

Looks like this is not a FreeRTOS related problem then. Maybe the PHY gets out of sync. Unless somebody here on the forum is familiar with that hardware configuration, you are unlikely to get an answer here. Maybe ask on the lwIP forum or Zynq support?

Thanks for your support!

@XuanTran, intuitively it makes me think of an unhandled error condition. Something went wrong and the peripheral is waiting for an acknowledgment from the application.
I am familiar with the Zynq 7000 network, although I only use FreeRTOS-Plus-TCP.
I used both 100 Mb and 1 Gb without problems.

In the FreeRTOS+TCP Xilinx driver, 3 interrupt handlers are being used: send, receive, and error:

	XEmacPs_SetHandler( &xemacpsif->emacps, XEMACPS_HANDLER_DMASEND,
						( void * ) emacps_send_handler,
						( void * ) xemacpsif );

	XEmacPs_SetHandler( &xemacpsif->emacps, XEMACPS_HANDLER_DMARECV,
						( void * ) emacps_recv_handler,
						( void * ) xemacpsif );

	XEmacPs_SetHandler( &xemacpsif->emacps, XEMACPS_HANDLER_ERROR,
						( void * ) emacps_error_handler,
						( void * ) xemacpsif );

Can you put a breakpoint in your emacps error handler? Or set some variable to remember what events have occurred?
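Remembering events in a variable could look roughly like this. It is a sketch only: `DemoEmacErrorLog_t`, `xErrorLog` and `vRecordEmacError()` are hypothetical names, and the two parameters are assumed to be the direction and error word that the driver's error callback receives.

```c
#include <stdint.h>

/* Hypothetical log of everything the error handler has reported so far. */
typedef struct
{
	uint32_t ulErrorCount;    /* Number of times the handler was entered. */
	uint32_t ulErrorBitsSeen; /* Bitwise OR of all error words seen. */
	uint32_t ulLastErrorWord; /* Most recent error word. */
	uint8_t ucLastDirection;  /* Most recent direction (send or receive). */
} DemoEmacErrorLog_t;

static volatile DemoEmacErrorLog_t xErrorLog;

/* Meant to be called first thing inside the error handler. It only stores
values, so it is cheap enough to leave in place during long test runs. */
void vRecordEmacError( uint8_t ucDirection, uint32_t ulErrorWord )
{
	xErrorLog.ulErrorCount++;
	xErrorLog.ulErrorBitsSeen |= ulErrorWord;
	xErrorLog.ulLastErrorWord = ulErrorWord;
	xErrorLog.ucLastDirection = ucDirection;
}
```

Unlike a breakpoint, the log can be inspected after the failure has already happened, without stopping the target at the moment of the event.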

@htibosch Thanks for your hints.

I put a breakpoint in the function emacps_error_handler(). However, the breakpoint isn’t hit.

When the ping response is “Destination Host Unreachable”, I unplug the cable and plug it in again. I can then ping the board again.

I unplug a cable and plug it again

It seems to me that the issue is not with the TCP stack nor with FreeRTOS. It seems from this statement that something gets reset when you unplug/plug the cable. It might as well be an issue of the PHY. Maybe it is not configured correctly or it goes into a state from which it cannot recover without the reset…

@kanherea

Thanks for your answer.
I think you’re right.
I also think it isn’t a problem with FreeRTOS or the LwIP stack.

I connected a cable to the board, but didn’t ping.
After a while, the LED on the RJ45 connector that indicates link status went off.

obviously…

What you can do is read the (R)MII registers at any time. The blueprint code is in the PHY initialization, which uses the management interface. The status register can tell you whether your PHY is synced and functional. That’ll probably give you enough hints to further pinpoint the issue. One possible problem could be an external quartz for the PHY that is too imprecise, or a non-compliant layout of the lines between the PHY and the Ethernet jack on your PCB.
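Once the status register (register 1 in the standard IEEE 802.3 register set) has been read over the management interface, decoding it is straightforward. Here is a minimal sketch; the register read itself is left out because it depends on the driver, and `PhyStatus` and `decode_phy_status()` are hypothetical names.

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit positions in the standard IEEE 802.3 basic status register (reg 1). */
#define PHY_BMSR_LINK_STATUS         ( 1U << 2 )
#define PHY_BMSR_REMOTE_FAULT        ( 1U << 4 )
#define PHY_BMSR_AUTONEG_COMPLETE    ( 1U << 5 )

typedef struct
{
	bool link_up;          /* Link is currently established. */
	bool autoneg_complete; /* Auto-negotiation has finished. */
	bool remote_fault;     /* Link partner reported a fault. */
} PhyStatus;

/* Decode a raw status register value. Note that the link status bit
latches low, so the register is usually read twice and the second
value is the one passed in here. */
PhyStatus decode_phy_status( uint16_t bmsr )
{
	PhyStatus status;

	status.link_up = ( bmsr & PHY_BMSR_LINK_STATUS ) != 0U;
	status.autoneg_complete = ( bmsr & PHY_BMSR_AUTONEG_COMPLETE ) != 0U;
	status.remote_fault = ( bmsr & PHY_BMSR_REMOTE_FAULT ) != 0U;

	return status;
}
```

Polling this periodically and logging the moment `link_up` goes false would show whether the PHY drops the link before the interrupts stop.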

@RAc : Thanks for your hints!

Please help with this issue

The Ethernet issue is as follows:

  • The Ethernet software stack is built on LWIP;
  • For this LWIP, Enclustra provides (on GitHub, in the Enclustra Reference Design) some files dedicated to the ZX1 System on Module (Enclustra), which uses a KSZ9031RNX circuit (made by Micrel/Microchip) as PHY0 for Ethernet. This PHY supports 10/100/1000 Mbps Ethernet links;
  • When the Enclustra LWIP solution is used on our Mainboard, whose hardware supports no more than 100Mbps, the Ethernet link does not work (auto-negotiation always selects 1Gbps, which is not available on our hardware);
  • The same Enclustra LWIP solution on the PE1 development board from Enclustra does not work at 100Mbps to a Laptop either (forced by using an Ethernet cable not capable of 1Gbps), but works well at 1Gbps with a cable that allows 1Gbps.

Note: Tests were made using the Packet Sender application to send test messages from a PC to the test board or PE development board. Wireshark was used to record and inspect traffic on the link. A test message sent from Packet Sender was expected to trigger an answer from the test board. When the link does not work at 100Mbps, Wireshark shows that the PC does not send any message to the Mainboard or PE development system. It seems to be blocked in auto-negotiation.

  • LWIP with the Enclustra files was modified in accordance with the KSZ9031RNX data sheet, chapter 3.8, to force operation at 100Mbps.

  • The relevant modifications of LWIP, recommended by Microchip for using the PHY at 100Mbps, are as follows:
    • Register 0 bit 6 set to 0 at line “control &=~(1<<6)”;
    • Register 9 bit 8 set to 0 at line “control &=~(1<<8)”;
    • Register 9 bit 9 set to 0 at line “control &=~(1<<9)”;
    • Register 0 bit 9 set to 1 at line “control |=IEEE_STAT_AUTONEGOCIATE_RESTART”;

  • With these settings, 100Mbps auto-negotiation is established between our board and the Laptop and the link is up. However, Ethernet communication between the PC and the board still does not work.

  • With these modifications, reading the relevant registers that describe the current PHY state gives the following values:
    • Reg 0x0 = 0x1140
    • Reg 0x1 = 0x796D
    • Reg 0x5 = 0x4DE1
    • Reg 0x1F = 0x0328

Note: Even though Reg 0 bit 6 is set to 0, after a software reset bit 6 reads back as 1 (0x1140) (?). The reported values are not consistent with the values we set and expected, which should be a marker of the cause of the problem.

  • When LWIP with these modifications (registers modified for 100Mbps according to the KSZ9031RNX data sheet, chapter 3.8) is used with the PE board connected to the PC, the same behavior is observed at 100Mbps as with our Mainboard.
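For readers following along, the register changes listed above can be written as plain bit operations on the values read from the PHY. This is a sketch only: the bit meanings are those of the standard IEEE 802.3 registers 0 and 9, the PHY read/write calls are omitted, and `bmcr_for_100m()` and `gbcr_without_1000()` are hypothetical helper names, not part of the Enclustra or Xilinx code.

```c
#include <stdint.h>

/* Register 0 (basic control), standard IEEE 802.3 bit positions. */
#define PHY_BMCR_SPEED_MSB        ( 1U << 6 ) /* Speed select, MSB. */
#define PHY_BMCR_RESTART_AN       ( 1U << 9 ) /* Restart auto-negotiation. */

/* Register 9 (1000BASE-T control), standard IEEE 802.3 bit positions. */
#define PHY_GBCR_ADV_1000_HALF    ( 1U << 8 ) /* Advertise 1000 half duplex. */
#define PHY_GBCR_ADV_1000_FULL    ( 1U << 9 ) /* Advertise 1000 full duplex. */

/* Clear the speed MSB and request an auto-negotiation restart. */
uint16_t bmcr_for_100m( uint16_t bmcr )
{
	bmcr &= ( uint16_t ) ~PHY_BMCR_SPEED_MSB;
	bmcr |= PHY_BMCR_RESTART_AN;
	return bmcr;
}

/* Stop advertising both 1000BASE-T modes; other bits are preserved. */
uint16_t gbcr_without_1000( uint16_t gbcr )
{
	gbcr &= ( uint16_t ) ~( PHY_GBCR_ADV_1000_HALF | PHY_GBCR_ADV_1000_FULL );
	return gbcr;
}
```

Starting from the reported value 0x1140, `bmcr_for_100m()` yields 0x1300. In the standard register set the restart bit is self-clearing, so it would not normally be seen when the register is read back later.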

Enclustra technical support indicated that there may be small omissions or errors in our LWIP modifications, without identifying them clearly. Enclustra technical support also states that the LWIP solution from GitHub was tested only at 1Gbps, not at 100Mbps. Their answers are slow and seem to be intentionally unclear, for commercial reasons.

We need your advice and technical support to overcome this issue. If necessary, it is possible to talk directly with my colleagues working on this subject, should this information be unclear.

This seems like an LWIP issue, so you are likely to get better responses on the LWIP forums or mailing list.