At first sight everything seemed fine: ping works, the telnet console works, unplugging and replugging works, …
But if I put it under some load, either by running an iperf test or by doing a port scan, the CPU gets stuck in HardFault. What could be the problem? Could it be related to stack sizes or to some buffer size being configured wrong?
The PC value saved before the HardFault loop points to this line (which is one line after a memcpy):
(gdb) l *pc
0x800e620 is in vIPerfTask (PWD/freertos_plus_projects/plus/Common/Utilities/iperf_task_v3_0g.c:503).
503 pxDataClient->ullAmount = pxControlClient->ullAmount;
And if I do a port scan with nmap, it gets stuck pointing to this line:
(gdb) l *pc
0x801876c is in prvTCPReturnPacket_IPV4 (PWD/freertos_plus_projects/plus/Framework/FreeRTOS-Plus-TCP-main/source/FreeRTOS_TCP_Transmission_IPv4.c:147).
147 pxIPHeader = ( ( IPHeader_t * ) &( pxNetworkBuffer->pucEthernetBuffer[ ipSIZE_OF_ETH_HEADER ] ) );
I was looking forward to using this as a reference for my project, so it would be great to understand this issue and ultimately get rid of it.
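For reference, the saved PC mentioned above comes from the exception frame that a Cortex-M core pushes on fault entry. The following is only a minimal sketch of how those eight stacked words can be decoded; `fault_frame_t` and `decode_fault_frame` are hypothetical names, and the code that obtains the stack pointer inside the actual HardFault handler (target-specific assembly) is left out.

```c
#include <stdint.h>

/* The eight words a Cortex-M core stacks on exception entry,
   lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} fault_frame_t;

/* Hypothetical helper: copy the stacked words into a struct so that
   the faulting PC and LR can be inspected from a debugger or logged.
   'sp' must point at the stack pointer value (MSP or PSP) that was
   active when the fault was taken. */
static fault_frame_t decode_fault_frame(const uint32_t *sp)
{
    fault_frame_t f;
    f.r0   = sp[0];
    f.r1   = sp[1];
    f.r2   = sp[2];
    f.r3   = sp[3];
    f.r12  = sp[4];
    f.lr   = sp[5];   /* return address of the interrupted function */
    f.pc   = sp[6];   /* instruction that was executing at the fault */
    f.xpsr = sp[7];
    return f;
}
```

The `pc` field is the value that can then be passed to `l *pc` in gdb, as in the listings above.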
Can you examine pxDataClient and pxControlClient and see if they seem corrupted? Also, check the dst of the memcpy and see whether a buffer overrun or stack overflow is possible.
Thanks for the replies. After looking into this problem a few times (and then getting distracted), I finally looked at the disassembly, and the solution turned out to be quite simple.
The problem was gcc generating a udf opcode; this can be disabled with -fno-delete-null-pointer-checks. See this ST forum post.
I don't really understand the underlying mechanics of this problem, or why hping --fast does not reach this code but hping --faster does, but this solved it.
With vldr d16 the compiler tried to use a nonexistent FPU register (the core should only have d0-d15). And indeed the makefile specified the wrong FPU type:
- -mfpu=vfpv4 \
+ -mfpu=fpv4-sp-d16 \
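A mismatch like this could possibly be caught at build time. The sketch below relies on the ACLE macro `__ARM_FP`, in which bit 3 (0x8) indicates double-precision hardware support; a single-precision-only FPU such as the one `fpv4-sp-d16` describes should leave that bit clear. `EXPECT_SINGLE_PRECISION_FPU` and `fp_mask_has_double` are hypothetical names, and whether the macro separates the two `-mfpu` values on a given toolchain is an assumption worth checking by dumping the predefined macros with `-dM -E`.

```c
/* Opt-in guard: define EXPECT_SINGLE_PRECISION_FPU (hypothetical name)
   in builds for a single-precision-only FPU. The build then stops if
   the compiler flags select double-precision hardware support. */
#if defined(EXPECT_SINGLE_PRECISION_FPU) && defined(__ARM_FP) && (__ARM_FP & 0x8)
#error "Compiler targets a double-precision FPU; check the -mfpu setting"
#endif

/* The same bit test as a plain function, so it can be exercised on a
   host machine. 'arm_fp_mask' is a value of the __ARM_FP bitmap. */
static int fp_mask_has_double(unsigned arm_fp_mask)
{
    return (arm_fp_mask & 0x8u) != 0;
}
```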
I hope I will get to upload and share a cleaned-up version of the example project at some point when I have time…
I suspect some things get written by the DMA, so the compiler is not aware of them. If it really still is a null pointer at execution time, shouldn't the CPU also go into a fault handler? The code seems to run stably now, but I get what you mean about treating symptoms…
Not necessarily. On many MCUs, address 0 is mapped to the beginning of internal flash, which is always readable (and may also be writeable when in flash programming mode), so dereferencing a null pointer is by itself neither illegal nor technically forbidden in such scenarios. Yet it is one very, very common cause of runtime problems, so I suspect that this compiler setting is an optional aid for developers to trap this case and distinguish it from more generic access errors, even in those cases where there is no memory mapped to 0.
Again, without this setting in action, it may be the case that your code still attempts to access network buffers at address 0, but coincidentally, the memory behind it (probably the IVT) causes the network driver to bail out in a benign manner. According to Murphy, this WILL at some point (e.g. after a firmware update that leaves the IVT different) make the software fail, and most probably at the most critical and inaccessible site of your most important customer. BTDT. I would leave the check in, wait until it gets hit, and then inspect your data structures. Alternatively, you could set a hardware breakpoint to catch null pointer dereferences.
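One way to make such an access visible in software, rather than waiting for the trap, is an explicit check before the buffer is used. This is only a sketch: `net_buf_t` is a hypothetical stand-in for the real descriptor type, its `pucEthernetBuffer` field is named after the one in the trace above, `xDataLength` is assumed, and `net_buf_is_usable` is not a FreeRTOS-Plus-TCP function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a network buffer descriptor. */
typedef struct {
    uint8_t *pucEthernetBuffer;  /* start of the frame data */
    size_t   xDataLength;        /* assumed: number of valid bytes */
} net_buf_t;

/* Returns 1 only if the descriptor and its data pointer are non-NULL
   and at least 'min_len' bytes are available; returns 0 otherwise. */
static int net_buf_is_usable(const net_buf_t *buf, size_t min_len)
{
    if (buf == NULL) {
        return 0;
    }
    if (buf->pucEthernetBuffer == NULL) {
        return 0;
    }
    return buf->xDataLength >= min_len;
}
```

Placing a breakpoint or a log statement on the failing branch of such a check would show which caller hands over the bad descriptor, which is the information needed to inspect the data structures.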