Three latent bugs in MPS2_AN385 ether_lan9118 (smsc9220) driver — benign on QEMU, would bite real silicon

Posting this for the record as a heads-up to the community rather than as something I’m able to follow up on. While integrating the FreeRTOS-Plus-TCP MPS2_AN385 network interface against QEMU, I went through the vendored Arm SMSC9220 / LAN9118 low-level driver (source/portable/NetworkInterface/MPS2_AN385/ether_lan9118/smsc9220_eth_drv.c) and spotted three defects.

Important caveat: all three are invisible on QEMU’s MPS2 LAN9118 model — it’s deterministic and never enters the divergent path, so anyone running only under QEMU (as I am) will see everything working fine. They’re a concern only on real LAN9118 silicon, which I have no way to test. So I can’t offer a hardware-validated patch; I just wanted the analysis on record so it’s findable if it ever bites someone. I checked the usual places (FreeRTOS-Plus-TCP / TF-M / mbed-os / CMSIS / Zephyr issues, CVEs) and found no prior report. They’re all present and unchanged in current main.

1. (Critical) smsc9220_mac_regwrite busy-wait tests bit 0, not the BUSY bit (~line 466)

} while( time_out &&
         ( register_map->mac_csr_cmd &
           GET_BIT( register_map->mac_csr_cmd, MAC_CSR_CMD_BUSY_INDEX ) ) );

GET_BIT already returns 0/1, so this ANDs the whole register with that 0/1 value instead of testing BUSY on its own: the loop keeps waiting only while BUSY and bit 0 (the LSB of the MAC CSR address field) are both set. For any command whose bit 0 is clear, the wait ends immediately, so the write can return before completion and race the next MAC/PHY access. The sibling smsc9220_mac_regread (~line 416) has the correct form: GET_BIT( ... ) alone. Fix: drop the leading register_map->mac_csr_cmd &. Why QEMU hides it: MAC-CSR commands complete synchronously in the model, so BUSY is never seen set on readback.
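For anyone who wants to see the difference without hardware, here is a minimal host-side sketch of the two loop conditions. It is not driver code: the GET_BIT macro is my own stand-in that evaluates to 0 or 1, the BUSY index of 31 is an assumption about the register layout, and busy_as_written, busy_fixed and check_busy_conditions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for the driver's GET_BIT: evaluates to 0 or 1. */
#define GET_BIT( word, bit_index )    ( ( uint32_t ) ( ( ( word ) >> ( bit_index ) ) & 1u ) )

/* Assumption: BUSY is bit 31 of MAC_CSR_CMD; the address field starts at bit 0. */
#define MAC_CSR_CMD_BUSY_INDEX    31u

/* Loop condition in the form quoted above from smsc9220_mac_regwrite. */
static bool busy_as_written( uint32_t mac_csr_cmd )
{
    return ( mac_csr_cmd & GET_BIT( mac_csr_cmd, MAC_CSR_CMD_BUSY_INDEX ) ) != 0u;
}

/* Loop condition with the leading AND dropped, as in smsc9220_mac_regread. */
static bool busy_fixed( uint32_t mac_csr_cmd )
{
    return GET_BIT( mac_csr_cmd, MAC_CSR_CMD_BUSY_INDEX ) != 0u;
}

/* Hypothetical self-check over the interesting register values. */
static void check_busy_conditions( void )
{
    /* BUSY set, even CSR address: the quoted form stops waiting too early. */
    assert( busy_fixed( 0x80000002u ) );
    assert( !busy_as_written( 0x80000002u ) );

    /* BUSY set, odd CSR address: both forms happen to agree. */
    assert( busy_fixed( 0x80000001u ) );
    assert( busy_as_written( 0x80000001u ) );

    /* BUSY clear: both forms stop waiting. */
    assert( !busy_fixed( 0x00000001u ) );
    assert( !busy_as_written( 0x00000001u ) );
}
```

Calling check_busy_conditions() from a small main() on a host compiler is enough to exercise it. The two forms only diverge when BUSY is set and bit 0 is clear.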

2. (Major) TX path prepends a wasted filler DWORD for word-aligned chunks (~lines 320 / 1150)

fill_tx_fifo computes filler_bytes = ( 4 - remainder_bytes ) with no modulo, so a word-aligned chunk (remainder_bytes == 0) prepends a full zero DWORD, and smsc9220_send_by_chunks correspondingly sets data_start_offset = 4. The packet still transmits correctly (the offset skips the filler), but a TX-FIFO DWORD is wasted per aligned chunk, and the FIFO free-space check (~line 1126) doesn’t account for it — so a chunk within 4 bytes of capacity can overrun. Fix: use ( 4 - remainder_bytes ) % 4 at both sites so aligned chunks use zero filler / offset 0. Why QEMU hides it: it honours data_start_offset, and the FIFO is never driven near capacity in typical use.
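The arithmetic is easy to check in isolation. The sketch below is a standalone illustration, not the driver's code: it assumes remainder_bytes is the chunk size modulo 4, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: remainder_bytes is the chunk size modulo 4. */
static uint32_t filler_bytes_as_written( uint32_t size_bytes )
{
    uint32_t remainder_bytes = size_bytes % 4u;

    return 4u - remainder_bytes; /* yields 4 when the chunk is word-aligned */
}

static uint32_t filler_bytes_fixed( uint32_t size_bytes )
{
    uint32_t remainder_bytes = size_bytes % 4u;

    return ( 4u - remainder_bytes ) % 4u; /* yields 0 when the chunk is word-aligned */
}

/* Hypothetical self-check: the two forms differ only for aligned sizes. */
static void check_filler_bytes( void )
{
    /* Word-aligned chunk: one full DWORD of filler versus none. */
    assert( filler_bytes_as_written( 64u ) == 4u );
    assert( filler_bytes_fixed( 64u ) == 0u );

    /* Unaligned chunks: identical results. */
    assert( filler_bytes_as_written( 61u ) == 3u );
    assert( filler_bytes_fixed( 61u ) == 3u );
    assert( filler_bytes_as_written( 63u ) == 1u );
    assert( filler_bytes_fixed( 63u ) == 1u );
}
```

Since the modulo changes the result only when the remainder is 0, unaligned chunks should behave exactly as before under the proposed fix.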

3. (Major) smsc9220_check_id returns int into an enum, misreporting mismatch as TIMEOUT (~lines 859 / 996)

smsc9220_check_id is declared int and returns 1 on a chip-ID mismatch; its only caller stores that into enum smsc9220_error_t, where 1 == SMSC9220_ERROR_TIMEOUT. So a real ID mismatch is reported as a timeout (init still aborts, so it’s a misleading diagnostic, not a functional break). Fix: return enum smsc9220_error_t (SMSC9220_ERROR_NONE / SMSC9220_ERROR_INTERNAL). Why QEMU hides it: the model’s chip ID always matches.
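A small sketch of the mismatch, again host-side only. The enum here is a trimmed, hypothetical copy: only the fact that TIMEOUT equals 1 comes from the analysis above, and the numeric value of SMSC9220_ERROR_INTERNAL is a placeholder. EXPECTED_CHIP_ID and both check_id_* functions are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed, hypothetical copy of the error enum. */
enum smsc9220_error_t
{
    SMSC9220_ERROR_NONE = 0,
    SMSC9220_ERROR_TIMEOUT = 1, /* value stated in the analysis above */
    SMSC9220_ERROR_INTERNAL     /* placeholder value, not taken from the driver */
};

/* Hypothetical expected ID, used only by this sketch. */
#define EXPECTED_CHIP_ID    0x9220u

/* Shape of the current function: a plain int, 1 on mismatch. */
static int check_id_as_written( uint32_t chip_id )
{
    return ( chip_id == EXPECTED_CHIP_ID ) ? 0 : 1;
}

/* Shape of the proposed fix: return the enum directly. */
static enum smsc9220_error_t check_id_fixed( uint32_t chip_id )
{
    return ( chip_id == EXPECTED_CHIP_ID ) ? SMSC9220_ERROR_NONE
                                           : SMSC9220_ERROR_INTERNAL;
}

/* Hypothetical self-check showing what the caller ends up seeing. */
static void check_id_reporting( void )
{
    /* The caller stores the int into the enum, so a mismatch reads as TIMEOUT. */
    enum smsc9220_error_t seen = ( enum smsc9220_error_t ) check_id_as_written( 0x1234u );

    assert( seen == SMSC9220_ERROR_TIMEOUT );

    /* With the enum return type, the mismatch is reported as such. */
    assert( check_id_fixed( 0x1234u ) == SMSC9220_ERROR_INTERNAL );
    assert( check_id_fixed( EXPECTED_CHIP_ID ) == SMSC9220_ERROR_NONE );
}
```

The success path is unaffected either way, since 0 maps to SMSC9220_ERROR_NONE in both forms.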

I’m not set up to drive these to resolution myself, but happy for anyone with real LAN9118 hardware to take them forward. Hope it’s useful.