[net/ee16] Added shared mem support and fixes to driver#64

Merged
Mellvik merged 2 commits into master from eexpress
Jun 20, 2024

Mellvik commented Jun 18, 2024

With this PR the EtherExpress16 driver enters a completed state. Several serious bugs have been weeded out, shared memory support has been added and the code has been significantly cleaned up. The new version is fast, reliable and supports important bootopts configuration flags. 8-bit bus support is still missing - a relatively easy addition when/if the need arises, because the scaffolding is already in place.

Summary of key enhancements since the previous PR:

  • Full shared memory and pio support. If a valid (nonzero) shared memory address is provided in /bootopts, shared memory is used; otherwise pio is used. With more practical experience it may be desirable to force pio mode when the buffer is 16k, like other drivers do. However, at this point there is little evidence to support the claim that the 16k/shmem combination is unstable.
  • The performance difference between shmem and pio is minimal on fast machines: shmem has a 4% advantage on my 40MHz 386sx. The advantage is probably larger on slower machines.
  • The io address in /bootopts MUST match the card setting. Other parameters are taken from bootopts if available, regardless of card settings. In verbose mode, the card settings are reported at boot time. The card's shared memory settings are read but currently ignored.
  • The card/driver runs at approx. 80k bytes per second (ftp), ktcp permitting. A real kludge, a 600ms delay, has been added to the readselect routine in the driver. The delay speeds up outgoing file transfers by 300% on average by avoiding a read select_wait call. A similar method may speed up other drivers as well. Looking into whether ktcp can be adjusted to avoid this kludge is on the list. See the discussion below.
  • Bootopts flags supported are (forced) 16k buffer size, verbose mode, cable type selection.
  • The driver allocates transmit buffers like this: ((bufsize in k)>>3)&7, which means 2@16k, 4@32/64k. The current implementation of ktcp does not enable practical use of more than 2 tx-buffers, so the latter case is a waste - for now. Total number of NIC packet buffers is 10/20/40 @ 16/32/64k.
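The mode selection and the transmit buffer expression from the list above can be sketched in C as follows. This is a minimal illustration, not the driver's code: the names `ee16_mode`, `pick_mode` and `tx_bufs_expr` are hypothetical, and the expression is evaluated exactly as written in the description. Note that, taken literally, it yields 2 for 16k and 4 for 32k but 0 for 64k, so the driver presumably treats the 64k case separately or encodes the size differently; the sketch does not guess how.

```c
/* Hypothetical names throughout; only the rules come from the PR text. */
enum ee16_mode { EE16_PIO, EE16_SHMEM };

/* A nonzero shared memory address from /bootopts selects shmem, else pio. */
static enum ee16_mode pick_mode(unsigned int shmem_addr)
{
    return shmem_addr != 0 ? EE16_SHMEM : EE16_PIO;
}

/* The expression ((bufsize in k)>>3)&7, evaluated literally.
 * 16 -> 2, 32 -> 4, 64 -> 0 (8 & 7), so 64k needs separate handling
 * to reach the stated 4 tx buffers. */
static unsigned int tx_bufs_expr(unsigned int bufsize_k)
{
    return (bufsize_k >> 3) & 7;
}
```

Calling `tx_bufs_expr(16)` and `tx_bufs_expr(32)` reproduces the 2@16k and 4@32k figures given above.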

The driver will (still) happily and silently overwrite unprocessed packets in the rx-queue under heavy load. Changing the code to handle this differently has not been deemed worthwhile thus far - possibly because most testing has been done on a 32k buffer NIC (iow ample buffer space). The only case in which this issue may (!) become visible is when launching a flood ping with packets of significant size while running a listing in an outgoing telnet connection. Flood pings and ftp transfers work well. AAMOF, ftp transfers into TLVC are just barely affected by a running flood ping, indicating that there is more performance to be gained from tuning/optimizing ktcp.

Which brings us back to the

TCP speed kludge

This 'trick' was accidentally discovered when removing printks (actually kputchars) turned out to kill the performance of outgoing FTPs completely, from 70+k to ~25k. Keeping the kputchars in lasttxstatus(), which is part of tx interrupt processing, brought the performance back. The kputchars were eventually replaced by udelay() calls and moved to the _select function, where quite a bit of experimenting led to what seems to be a reasonable delay value.

This value will be different on a different speed machine, so - while in use - it should be calculated using a speed index, like the machine's BOGOMIPS value - which currently does not exist, but may be coming.
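One possible way to derive the delay from a speed index is simple proportional scaling, sketched below. Everything here is an assumption for illustration: the helper `scaled_delay` is hypothetical, the delay is treated as a loop count that runs faster on faster machines, and the speed index is whatever BOGOMIPS-like figure eventually becomes available. The right scaling rule would still have to be found by measurement.

```c
/* Hypothetical helper: scale a delay count tuned on a reference machine
 * in proportion to a speed index (assumption: delay is a loop count,
 * so a faster machine needs more iterations for the same wall time). */
static unsigned int scaled_delay(unsigned int ref_delay,
                                 unsigned int ref_speed,
                                 unsigned int speed)
{
    unsigned long long v;

    if (ref_speed == 0)          /* no reference index: keep tuned value */
        return ref_delay;
    v = (unsigned long long)ref_delay * speed / ref_speed;
    return v ? (unsigned int)v : 1;  /* never scale down to zero */
}
```

With a value tuned on a machine whose index is 40, `scaled_delay(600, 40, 20)` gives 300 for a machine with half that index.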

As to why this delay helps: What seems to be the case is that in the course of pushing packets and ticking off incoming ACKs, the delay is just enough to avoid a select_wait()/wake_up() cycle when receiving the ACK immediately following a transmitted data packet. The delay optimizes the rhythm of the exchange so to speak. It is not obvious that there is an easy fix for this in ktcp, but given the size of the improvement, it seems worth a discussion.
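The effect described above can be modelled with a small, self-contained C sketch. It is not the driver's select routine: `fake_nic`, `rx_pending`, `delay_step`, `register_wait` and `readselect_sketch` are hypothetical stand-ins, and the "ACK arrival" is simulated by a countdown. The point is only the ordering: wait briefly first, then check for data, and register for a wakeup only if nothing has arrived.

```c
#include <stdbool.h>

/* Hypothetical model of the NIC/driver state, for illustration only. */
struct fake_nic {
    int steps_until_rx;  /* delay steps until the ACK "arrives" */
    int sleeps;          /* times a select wait had to be registered */
};

static bool rx_pending(const struct fake_nic *nic)
{
    return nic->steps_until_rx <= 0;
}

/* Stand-in for one unit of udelay(): time passes, the ACK gets closer. */
static void delay_step(struct fake_nic *nic)
{
    if (nic->steps_until_rx > 0)
        nic->steps_until_rx--;
}

/* Stand-in for select_wait(): the caller will sleep until wake_up(). */
static void register_wait(struct fake_nic *nic)
{
    nic->sleeps++;
}

/* Returns 1 if data is readable now, 0 if the caller must sleep. */
static int readselect_sketch(struct fake_nic *nic, int delay_steps)
{
    for (int i = 0; i < delay_steps; i++)
        delay_step(nic);
    if (rx_pending(nic))
        return 1;            /* ACK arrived during the delay: no sleep */
    register_wait(nic);      /* otherwise pay for the wait/wake-up cycle */
    return 0;
}
```

In this model, a delay slightly longer than the ACK turnaround makes the routine return 1 directly, while a zero delay always falls through to the wait registration.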

Mellvik commented Jun 20, 2024

More testing with 16k NIC buffer size revealed that

  • shmem is indeed somewhat unreliable with this memory size, as indicated by comments in other drivers
  • in pio mode, there will be overrun errors under heavy load (10 rx buffers instead of 16). Instead of being caught by the driver, these errors manifest themselves as ktcp: eth_process errors -1 (11), in practice no more dramatic than lost ping packets.

Mellvik merged commit e7291dd into master Jun 20, 2024
Mellvik mentioned this pull request Jun 29, 2024