TCP connection resetting

wacko_eddie wrote on Tuesday, April 04, 2006:

Hi,

I’m using the lwIP demo for Rowley from the command line. I integrated a FAT filesystem and read from an SD card on the SPI bus. I’m using the AT91SAM7X-EK board with the AT91SAM7X256.

The problem I encounter is that after receiving a couple of packets the controller starts sending RST,ACK packets, sometimes just a few and sometimes blocks of 5 or 6. Eventually the webserver task dies. I’m still able to ping the webserver and the error check task stays alive. I removed the USB and flash tasks.

Any suggestions?

nobody wrote on Tuesday, April 04, 2006:

> I’m using the lwIP demo for Rowley from the command line. I integrated a FAT
> filesystem and read from an SD card on the SPI bus. Using the AT91SAM7X-EK board
> with the AT91SAM7X256.

Cool.  Which file system are you using?

> The problem I encounter is that after receiving a couple of packets the controller
> starts sending RST,ACK packets, sometimes just a few and sometimes blocks of
> 5 or 6 RST,ACK packets.

I don’t know too much about TCP/IP, but there is an lwIP mailing list that may be able to point to some possible sources of this.

> And eventually the webserver task dies. I’m still able
> to ping the webserver and the error check task stays alive. I removed the USB
> and flash tasks.
>
> Any suggestions?

Normally the first answer on this forum is "check the stack depths", but I think in this case the lwIP demo allocates quite large stacks to the lwIP related tasks.  It may be worth a look anyway, especially if your file system calls C routines such as string handlers, etc., that use a lot of stack.
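
As a rough illustration of what "checking the stack depth" means: a common technique is to fill a task's stack with a known byte before the task runs, then later count how many bytes still hold that value. The sketch below is plain host-side C, not FreeRTOS code. The names `stack_paint` and `stack_high_water_mark` and the 0xA5 fill value are assumptions made for the example, and it assumes a stack that grows downward from the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fill pattern for this sketch. */
#define STACK_FILL_BYTE 0xA5u

/* Paint the whole stack buffer with the fill pattern before the task starts. */
void stack_paint(uint8_t *stack, size_t size)
{
    assert(stack != NULL);
    memset(stack, STACK_FILL_BYTE, size);
}

/* Return how many bytes at the low end of the buffer were never overwritten.
   With a downward-growing stack, the low addresses are used last, so this is
   the smallest amount of free stack seen so far. A result near 0 suggests
   the stack is too small. */
size_t stack_high_water_mark(const uint8_t *stack, size_t size)
{
    size_t unused = 0;

    assert(stack != NULL);
    while (unused < size && stack[unused] == STACK_FILL_BYTE) {
        unused++;
    }
    return unused;
}
```

A value that happens to equal the fill byte can make the result slightly optimistic, so treat the number as an estimate. Later FreeRTOS releases provide uxTaskGetStackHighWaterMark() for this kind of measurement; whether your version has it is worth checking.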

I think the pings are handled by the lwIP stack task directly, whereas new connections require allocation of memory for various features that persist with the connection.  If the pings are working but the TCP/IP connections are not then maybe there is a problem with the allocation of memory for new connections or freeing of memory for old connections?

It might be worth getting an updated version of the lwIP code?

A few posts ago there was a thread about the function that creates the buffer of task information not being ideal for real applications also.  Maybe try taking this out of the demo.

wacko_eddie wrote on Tuesday, April 04, 2006:

Thanks for your response. I’m using EFSL from www.efsl.be. Martin Thomas ported it to the SAM7S and from there I ported it to the SAM7X.
Martin has a site with a few projects.
http://www.siwawi.arubi.uni-kl.de/avr_projects/arm_projects/

I was looking to integrate lwIP 1.1.1 but found FreeRTOS 4.0.0, so I started with that. I’ll check the stacks, but I’m also trying to figure out how to create a task for each GET, because analyzing the data further it seems it can’t handle the number of requests I’m making.

wacko_eddie wrote on Friday, April 07, 2006:

Ok, I’ve been through the source trying to locate why it’s resetting. I’ve been through the stack’s mailboxes, ending up in:
signed portBASE_TYPE xQueueReceive( xQueueHandle pxQueue, void *pvBuffer, portTickType xTicksToWait )

which calls:
  taskYIELD();

at which point my webserver dies.

nobody wrote on Monday, April 17, 2006:

Did you find a solution for this behavior? I have had the same experience. After an indeterminate number of connections the HTTP server dies. Ping still works.

I did a port with lwIP 1.1.1 but this one crashed the whole RTOS. The stack of the receiver task grew without bound. Seconds before the RTOS dies I can see that the stack size is 0, no matter how much room I give it.

nobody wrote on Monday, April 17, 2006:

I thought there was a message saying it was a stack issue after all - having looked at the stack usage via a serial port print.  Can’t find this message now.  Maybe it was another thread, and I’m confused?

wacko_eddie wrote on Tuesday, April 18, 2006:

Hmm, that’s difficult to say.
I increased TCP_WND in lwipopts.h because I was also seeing “TCP ZeroWindow” messages; this solved a lot of the resets as well. But I’m still struggling with the issue… There are still some RST issues.
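
For anyone following along, the change described here lives in lwipopts.h. The fragment below is only a sketch: the numbers are assumptions, not the values used in this project, and the right window depends on how much RAM the buffers can spare. lwIP's own option documentation recommends a TCP_WND of at least 2 * TCP_MSS.

```c
/* lwipopts.h fragment (illustrative values, not the poster's settings) */

/* Assumed: maximum segment size for Ethernet, i.e. a 1500-byte MTU minus
   20 bytes of IP header and 20 bytes of TCP header. */
#define TCP_MSS   1460

/* Receive window advertised to the peer. If the window is small and the
   application drains data slowly, it can drop to zero, and the peer then
   has to stop sending until it reopens. */
#define TCP_WND   (2 * TCP_MSS)

/* Compile-time guard added for this sketch. */
#if TCP_WND < (2 * TCP_MSS)
#error "TCP_WND should be at least 2 * TCP_MSS"
#endif
```

A larger window needs enough receive buffers behind it, so the related buffer options may have to grow together with TCP_WND.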

molnarzs wrote on Tuesday, April 18, 2006:

I had similar problems: after a while the EMAC stopped responding. I don’t know if it applies in your case, but anyhow, I described it in my blog with the solution that made it work:

http://www.zsoltmolnar.hu/blog/embedded/2006/04/freertos-emac-bug.html

nobody wrote on Tuesday, April 18, 2006:

While I appreciate your blog is freely viewable, it would have been good if you had shared that information here.  That is after all one of the points of open source software.  It is made available to you for nothing, you contribute back.

To add to your points: first, all interrupts should be handled if creating a commercial grade product, and second, there seems to be a known bug in the EMAC hardware itself, alluded to by the change history for FreeRTOS between versions:

quote, "    + Updated the SAM7X EMAC drivers to take into account the hardware errata
      regarding lost packets.
" from http://www.freertos.org/History.txt

Could this be something to do with your problem?

molnarzs wrote on Tuesday, April 18, 2006:

Hello

I started a new thread with the copy.